This file contains release notes for major and minor releases of xpar.
For a complete list of source-level changes, consult the ChangeLog file.

===============================================================================
v1.0 (16-04-2026)
- File-format bump to v1.0. v0.x archives are rejected; re-encode any data
  you still need. Future v1.x minors stay decodable by v1.0+ tools.
- Joint header now records total input size and each lace carries a
  sequence number, so a .xpa truncated at a lace boundary or with laces
  reordered is detected and refused rather than silently decoding to
  garbage.
- Sharded header CRC32C now covers the version, shard count, shard
  number, and total size fields in addition to the body, so a
  shard_number bit-flip or a header-field swap between shards is
  detected.
- New '-H / --integrity' flag selects the per-lace / per-shard tag
  algorithm: 'crc32c' (default, 32-bit hardware-accelerated) or
  'blake2b' (128-bit BLAKE2b, 2^64 birthday-bound collision resistance
  instead of 2^16). Supported in joint, systematic, and sharded
  (Vandermonde + Leopard) modes.
- New '--auth=<keyfile>' flag reads a 1-64 byte key and switches the
  tag to a keyed BLAKE2b-128 MAC, implying '-H blake2b'. Decoders
  require the same key; wrong / missing / spurious keys are all
  rejected.
- New '-s / --systematic' joint mode: encoder writes a parity-only .xpa
  (about 18% of input) and the original file stays on disk unchanged.
  Decoder reads the (possibly corrupted) original plus the parity file
  and emits the corrected data to stdout. Incompatible with
  --interlacing.
- New '-t / --test' integrity-check mode for '-J', '-Js', '-W', and
  '-L'. Runs the full correction pipeline but writes nothing; exits
  non-zero on any unrecoverable block, shard shortfall, or tag
  mismatch, and also flags archives that decode correctly only after
  correction.
- BLAKE2b has hand-vectorised SSE4.1 (also used on AArch64 via
  sse2neon) and AVX2 kernels with runtime CPUID dispatch; a portable
  reference C implementation is the fallback. '--disable-blake2b' at
  configure time strips BLAKE2b and MAC support entirely.
- Leopard-sharded decode is faster on aarch64 (Linux and Apple Silicon)
  via sse2neon. Fix a missing return in the SSSE3 xor_mem4 path that
  caused a latent heap overflow (surfaced as a segfault on Apple
  Silicon, caught by ASan in CI) when reconstructing from missing
  shards.
- Self-check now covers the new features, including truncation-,
  reorder-, and shard-number-swap rejection; the error-injection
  helper is rewritten in C so Python is no longer required to run the
  suite.
- CI builds additionally run under Address and Undefined-Behaviour
  sanitizers on Linux and macOS, catching regressions in the hot
  decoder paths. Release artifacts now include a Win95-target i686
  binary (xpar-i686-w95.exe, built with --with-windows-target=win95,
  imports only KERNEL32.dll) and a DJGPP/MS-DOS binary (xpar-dos.exe,
  CWSDPMI bundled via CWSDSTUB) alongside the existing i686/x86_64
  Windows binaries.
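
Sketch for the sharded-header CRC32C entry above: the tag is computed over
the header fields first and the body second, so changing any covered field
changes the tag. This is a minimal illustration, not xpar's code. The
struct shard_header layout, its field widths, the byte order, and the
helper names crc32c_update and shard_tag are all assumptions made for the
example. The CRC itself is the standard bitwise CRC32C (Castagnoli).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78).
   Portable and slow. Pass 0 to start, or a previous result to continue. */
static uint32_t crc32c_update(uint32_t crc, const void *data, size_t len) {
  const uint8_t *p = data;
  crc = ~crc;
  while (len--) {
    crc ^= *p++;
    for (int k = 0; k < 8; k++)
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return ~crc;
}

/* Hypothetical shard header. Field names and widths are assumptions for
   this sketch and do not describe xpar's on-disk layout. */
struct shard_header {
  uint8_t  version;
  uint8_t  shard_count;
  uint8_t  shard_number;
  uint64_t total_size;
};

/* Tag over the serialised header fields followed by the body. The fields
   are copied into a byte buffer explicitly (little-endian total_size) so
   struct padding never enters the checksum. */
static uint32_t shard_tag(const struct shard_header *h,
                          const uint8_t *body, size_t n) {
  uint8_t buf[11];
  assert(h != NULL && (body != NULL || n == 0));
  buf[0] = h->version;
  buf[1] = h->shard_count;
  buf[2] = h->shard_number;
  for (int i = 0; i < 8; i++)
    buf[3 + i] = (uint8_t)(h->total_size >> (8 * i));
  return crc32c_update(crc32c_update(0, buf, sizeof buf), body, n);
}
```

Because a CRC detects every single-bit error, a one-bit flip in
shard_number always yields a different tag in this sketch.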

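Sketch for the joint-header entry in v1.0: with a total input size in the
header and a sequence number on each lace, a decoder can refuse reordered
or truncated input before emitting anything. The struct lace_info record,
the status codes, and the check_laces helper are hypothetical names
introduced for this example. They are not taken from the xpar sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-lace record (an assumption for this sketch). */
struct lace_info {
  uint32_t seq;         /* sequence number stored with the lace */
  uint64_t payload_len; /* input bytes carried by the lace */
};

enum lace_status { LACES_OK = 0, LACES_REORDERED, LACES_SIZE_MISMATCH };

/* Accept the laces only if they appear in order 0, 1, 2, ... and their
   payload lengths add up to the total size recorded in the header. */
static enum lace_status check_laces(const struct lace_info *laces, size_t n,
                                    uint64_t total_size) {
  uint64_t seen = 0;
  assert(laces != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (laces[i].seq != i)
      return LACES_REORDERED;
    seen += laces[i].payload_len;
  }
  return seen == total_size ? LACES_OK : LACES_SIZE_MISMATCH;
}
```

A file cut at a lace boundary passes the ordering check in this sketch but
fails the size comparison, which is why both fields are needed.
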
===============================================================================
v0.7 (19-09-2025)
- Rename the sharded mode to Vandermonde-sharded mode (-S => -W).
- Update build instructions for specific architecture and operating system
  combinations.
- Include benchmarks in the repository.
- Introduce minimum shard size.
- Improve stability without --no-mmap.
- Add FFT-based Reed-Solomon encoders and decoders that operate in
  linearithmic time (@catid).
- Fix a memory leak in gf256mat_inv used by the Vandermonde-sharded mode.
- Update and reflow the man-page.

===============================================================================
v0.6 (10-09-2025)
- Move the project page to iczelia/xpar.
- Minor style changes and fixes to command parsing.

===============================================================================
v0.5 (17-10-2024)
- OpenMP support for sharded mode (which unfortunately appears to be
  bottlenecked by I/O).
- Switch to yarg for command-line parsing, remove dependency on Rich Felker's
  `getopt_long`.
- Hopefully the last v0.x release. Feedback on this release will help shape
  future improvements and v1.0. The file format will not change from now on
  unless a bug or another major misfeature needs to be fixed.

===============================================================================
v0.4 (16-10-2024)
- x86_64 static Linux binaries are no longer provided.
- OpenMP support has been added to improve encoding and decoding performance
  in joint mode with high interlacing factors on multi-core machines.
- 3-way saturating CRC32C implementation has been added to improve performance
  on x86_64 machines that support SSE4.2.
- Slightly improve the performance of the sharded mode.
- Fix undefined behaviour caused by integer shifts in sharded mode.

===============================================================================
v0.3 (16-10-2024)
- Improve joint encoding performance on x86_64 machines.
- Support aarch64 Linux.
- Improve cross-platform compatibility of the sharded mode.

===============================================================================
v0.2 (15-10-2024)
- Provides a manual page for the xpar command.
- Provides platform-specific code for aarch64, which can be enabled via
  the --enable-aarch64 configure option.

===============================================================================
v0.1 (14-10-2024)
- Initial release.
- Supports joint mode and sharded mode for error and erasure correction.
- Provides platform-specific code for x86_64, which can be enabled via
  the --enable-x86_64 configure option.
- Tested on x86_64 Linux (Ubuntu), x86_64 and aarch64 macOS, and x86_64 and
  i686 Windows.
