- Test edge cases: corrupted files, unsupported formats, empty files, mismatched extensions.
- Improve resilience of MIME detection; consider forcing regeneration of the magic database on first use.
- Normalize MIME detection and extension mapping.
- Implement a UriProcessor to detect and process embedded data URIs (e.g. data:image/*;base64) in HTML, CSS, JS, XML, SVG. Extract, decode, optimize via existing processors, and reinsert re-encoded content.
- Review and clean up unused or redundant CMake variables.
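
The detection half of the UriProcessor item above could start from something like the following. This is only a minimal sketch: `DataUri`, `decode_base64`, and `find_data_uris` are hypothetical names, not existing project code, and the scanner assumes the simple `data:<mime>;base64,<payload>` form without extra parameters such as `charset=`.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical record for one embedded payload; not an existing project type.
struct DataUri {
    std::size_t payload_begin;        // offset of the first base64 character
    std::size_t payload_end;          // one past the last base64 character
    std::string mime;                 // media type, e.g. "image/png"
    std::vector<std::uint8_t> bytes;  // decoded payload
};

// Decode standard-alphabet base64; returns nullopt on malformed input.
inline std::optional<std::vector<std::uint8_t>> decode_base64(std::string_view in) {
    auto value = [](char c) -> int {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1;
    };
    while (!in.empty() && in.back() == '=') in.remove_suffix(1);  // drop padding
    if (in.size() % 4 == 1) return std::nullopt;  // no valid encoding has this length
    std::vector<std::uint8_t> out;
    std::uint32_t acc = 0;
    int bits = 0;
    for (char c : in) {
        const int v = value(c);
        if (v < 0) return std::nullopt;
        acc = (acc << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}

inline bool is_mime_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '/' || c == '+' ||
           c == '-' || c == '.';
}

inline bool is_base64_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '+' || c == '/' || c == '=';
}

// Scan text for "data:<mime>;base64,<payload>" and decode each payload.
inline std::vector<DataUri> find_data_uris(std::string_view text) {
    constexpr std::string_view kPrefix = "data:";
    constexpr std::string_view kMarker = ";base64,";
    std::vector<DataUri> found;
    std::size_t pos = 0;
    while ((pos = text.find(kPrefix, pos)) != std::string_view::npos) {
        const std::size_t mime_begin = pos + kPrefix.size();
        std::size_t mime_end = mime_begin;
        while (mime_end < text.size() && is_mime_char(text[mime_end])) ++mime_end;
        pos = mime_end;  // always advance past what was already examined
        if (text.substr(mime_end, kMarker.size()) != kMarker) continue;
        const std::size_t begin = mime_end + kMarker.size();
        std::size_t end = begin;
        while (end < text.size() && is_base64_char(text[end])) ++end;
        pos = end;
        auto bytes = decode_base64(text.substr(begin, end - begin));
        if (!bytes || bytes->empty()) continue;  // skip malformed or empty payloads
        found.push_back({begin, end,
                         std::string(text.substr(mime_begin, mime_end - mime_begin)),
                         std::move(*bytes)});
    }
    return found;
}
```

The recorded offsets are meant to let a later step splice the re-encoded payload back into the text from the last match to the first, so that earlier offsets stay valid.
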
- Complete refactoring of third-party library integrations (libFLAC, libwavpack) to use `FILE*` or callback-based APIs instead of filenames, ensuring full Unicode support on Windows.
- Implement brute-force recompression across compression modes and select the smallest output.
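
The brute-force item above boils down to running every candidate encoder and keeping the smallest result. A rough sketch, assuming each compression mode can be wrapped as a callable; `Candidate`, `Choice`, and `pick_smallest` are illustrative names, not existing project code:

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// One compression mode to try (hypothetical wrapper type).
struct Candidate {
    std::string name;                           // label for logging
    std::function<Bytes(const Bytes&)> encode;  // returns empty Bytes on failure
};

struct Choice {
    std::string name;  // "original" when no candidate beat the input
    Bytes data;
};

// Run every candidate and keep the smallest non-empty output.
// The input itself is the baseline, so the result is never larger than it.
inline Choice pick_smallest(const Bytes& original, const std::vector<Candidate>& candidates) {
    Choice best{"original", original};
    for (const auto& c : candidates) {
        Bytes out = c.encode(original);
        if (!out.empty() && out.size() < best.data.size()) {
            best = Choice{c.name, std::move(out)};
        }
    }
    return best;
}
```

A real pipeline would presumably also verify that each candidate decodes back to identical content before accepting it; that check is left out here.
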
- Support optional stripping of EXIF/ICC metadata.
- Benchmark compression ratio vs. `jpegtran`.
- Integrate multiple JPEG optimizers (`jpegtran`, `jpegoptim`, `jpeg-recompress`, `guetzli`, `cjpegli`) and select the best result.
- Improve `WebpEncoder` with advanced lossless options (`-m 6`, `-q 100`).
- Support removal of non-essential chunks (XMP, ICC).
- Investigate pdfsizeopt techniques (image recompression, font unification, metadata stripping)
↳ https://github.com/pts/pdfsizeopt
- Add support for 7Z recompression using 7zip SDK.
- Investigate integration of BestCFBF (https://papas-best.com/downloads/bestcfbf/stable/bestcfbf.cpp) for optimizing MSI, DOC, PPT, XLS:
  - On Windows: adapt and integrate directly with COM Structured Storage APIs.
  - On Linux/macOS: research alternative libraries (e.g. libgsf, olefile) to replicate functionality.
- Explore Leanify-style handling of container formats that are essentially ZIP/LZMA/Deflate (APK, XPS, XPInstall, EPUB, DOCX, ODT, etc.) and integrate similar recursive optimization.
- Investigate advmng for MNG recompression (delta compression, ancillary chunk removal)
↳ https://www.advancemame.it/doc-advmng
- Rewrite hardlink handling in archive_processor with a cross-platform approach, since current implementation is not available on Windows.
- Preserve chapters, tags, and attachments (e.g. fonts, cover art).
- Finish Matroska container support (currently unfinished).
- ALAC – investigate integration via libavcodec or standalone decoder.
- TAK – closed source, not feasible (note).
- LA (Lossless Audio) – abandoned, not feasible (note).
- TTA (The True Audio) – integrate open source library.
↳ https://github.com/stseelig/libttaR
- MPEG‑4 ALS – investigate reference implementation.
↳ https://www.iso.org/standard/43345.html
↳ https://github.com/OptiVorbis/OptiVorbis
- Lepton (Rust JPEG recompressor) – consider FFI integration.
↳ https://github.com/dropbox/lepton (original C++), https://github.com/microsoft/lepton_jpeg_rust
- WOFF/WOFF2 – recompression via zlib/Brotli.
↳ https://www.w3.org/TR/WOFF2/
- SWF – recompress embedded zlib/LZMA streams (legacy, low priority).
↳ https://en.wikipedia.org/wiki/SWF
- STL – ASCII recompression, binary deduplication of triangles.
↳ https://en.wikipedia.org/wiki/STL_(file_format)
- PCX – palette optimization and RLE recompression.
↳ https://en.wikipedia.org/wiki/PCX
- ICO – optimize embedded PNG/BMP.
↳ https://en.wikipedia.org/wiki/ICO_(file_format)
- SVG/XML/HTML/CSS/JS – minification and metadata stripping.
↳ https://www.w3.org/Graphics/SVG/
- FB2 – FictionBook container optimization.
↳ https://en.wikipedia.org/wiki/FictionBook
- MPEG1/2 – investigate Ocarina recompression approach.
↳ https://encode.su/threads/1111-Ocarina-s-MPEG1-and-MPEG2-video-compressor
- H.264 – investigate lossless recompression (Pied Piper / losslessh264).
↳ https://encode.su/threads/2285-H264-Lossless-recompression-Pied-Piper-(losslessh264)
- Sound Slimmer – investigate MP3/AAC archival recompression concepts.
↳ https://audiophilesoft.com/load/junk/sound_slimmer_v1_04_001/9-1-0-61
- Executables (PE/EXE/DLL) – optional Leanify-style recompression.
↳ https://en.wikipedia.org/wiki/Portable_Executable
- Lua bytecode – optional Leanify-style recompression.
↳ https://www.lua.org/manual/5.4/manual.html#6.4
- RDB, GFT – niche formats, investigate feasibility.
- WebAssembly – integrate wasm-opt (Binaryen).
↳ https://github.com/WebAssembly/binaryen
- HTML/XML – integrate tidy-html5 for cleanup/minification.
↳ https://github.com/htacg/tidy-html5
- HDR (Radiance RGBE) – add support for HDR file compression using stb_image/stb_image_write.
↳ https://github.com/nothings/stb
- HALAC (High Availability Lossless Audio Codec) – add support if and when the source code gets updated.
↳ https://github.com/Hakan-Abbas/HALAC-High-Availability-Lossless-Audio-Compression
- OpenEXR – integrate openexr/imath for PIZ/ZIP lossless recompression.
- FITS – integrate cfitsio for scientific data compression.
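
As a starting point for the ICO item in the list above, the sketch below walks the icon directory and reports which embedded images are PNG, since those could be handed to the existing PNG processors. It assumes the standard layout (6-byte ICONDIR header followed by 16-byte ICONDIRENTRY records, little-endian) and handles icons only, not cursors; `IcoImage` and `list_ico_images` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Location of one embedded image inside an ICO file (hypothetical type).
struct IcoImage {
    std::uint32_t offset;  // byte offset of the image data within the file
    std::uint32_t size;    // size of the image data in bytes
    bool is_png;           // true if the data starts with the PNG signature
};

// Read an n-byte little-endian unsigned integer (n <= 4).
inline std::uint32_t read_le(const std::uint8_t* p, int n) {
    std::uint32_t v = 0;
    for (int i = n - 1; i >= 0; --i) v = (v << 8) | p[i];
    return v;
}

// Parse the icon directory; returns nullopt if the file is not a well-formed ICO.
inline std::optional<std::vector<IcoImage>> list_ico_images(
        const std::vector<std::uint8_t>& file) {
    static const std::uint8_t kPngSig[8] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    if (file.size() < 6) return std::nullopt;
    // reserved must be 0, type must be 1 (icon)
    if (read_le(&file[0], 2) != 0 || read_le(&file[2], 2) != 1) return std::nullopt;
    const std::size_t count = read_le(&file[4], 2);
    if (file.size() < 6 + count * 16) return std::nullopt;

    std::vector<IcoImage> images;
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint8_t* entry = &file[6 + i * 16];
        IcoImage img{read_le(entry + 12, 4), read_le(entry + 8, 4), false};
        const std::uint64_t end = std::uint64_t{img.offset} + img.size;
        if (end > file.size()) return std::nullopt;  // entry points outside the file
        img.is_png = img.size >= 8 && std::memcmp(&file[img.offset], kPngSig, 8) == 0;
        images.push_back(img);
    }
    return images;
}
```

Entries that are not PNG are BMP-style DIB data without a file header, so they would need separate handling.
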

| Processor | Lossless | Metadata | Container | Notes |
|---|---|---|---|---|
| FlacProcessor | ✅ | ✅ | ✅ | Works. Recompresses audio & optimizes cover art. |
| WavPackProcessor | ✅ | ✅ | ✅ | Works. Consider additional compression methods. |
| ApeProcessor | ✅ | ✅ | ✅ | Recompresses audio (MACLib) & optimizes cover art (TagLib). |
| OggProcessor | ✅ | ✅ | ✅ | Recompresses Ogg FLAC (`libFLAC`) and Ogg Vorbis (OptiVorbis). Direct copy for Opus. Extracts/optimizes cover art securely, avoiding memory leaks. |
| MpegProcessor | ✅ | ✅ | ✅ | Recompresses MP3 audio using `mp3packer` (except on Windows). Extracts/optimizes ID3v2 cover art. |
| WavProcessor | ❌ | ✅ | ✅ | Container-only mode: extracts/optimizes ID3v2 cover art inside RIFF. |
| AiffProcessor | ❌ | ✅ | ✅ | Container-only: extracts/optimizes ID3v2 cover art inside AIFF. |
| JpegProcessor | ✅ | 🟡 | N.A. | Copies APP/COM markers. Add optional metadata stripping. Integrate other optimizers. `raw_equal` implemented (pixel compare). |
| PngProcessor | ✅ | 🟡 | N.A. | Works. Needs formal verification for lossless & metadata (iCCP, sRGB, text chunks...). |
| ZopfliPngProcessor | ✅ | 🟡 | N.A. | `raw_equal` implemented (pixel compare). Copies standard chunks via `zopflipng_lib`. Needs ability to parameterize iterations. |
| WebpProcessor | ✅ | 🟡 | N.A. | Copies EXIF/XMP/ICCP chunks. Improve lossless options (`-m 6`, `-q 100`). Add optional chunk removal. `raw_equal` implemented (pixel compare). |
| GifProcessor | ✅ | ✅ | N.A. | Works. Could use a better fork. |
| FlexiGifProcessor | ✅ | ❌ | N.A. | Needs ability to parameterize iterations/settings (like Zopfli). |
| TiffProcessor | ✅ | 🟡 | N.A. | Copies standard metadata tags (XMP, EXIF, ICC). Uses Deflate compression. Needs verification. |
| JxlProcessor | ✅ | 🟡 | N.A. | Re-encode loop implemented. Metadata preservation (JXL box) implemented, but needs verification. `raw_equal` implemented (pixel compare). |
| TgaProcessor | ✅ | ❌ | N.A. | Uses `stb_image` to re-apply RLE. `raw_equal` implemented (pixel compare). Metadata not preserved. |
| BmpProcessor | ✅ | ✅ | N.A. | Uses bmplib. Supports RLE4, RLE8, RLE24 (OS/2), and Huffman 1D compression. Preserves DPI and ICC profiles. |
| PnmProcessor | ✅ | N.A. | N.A. | Uses `stb_image` to read and internal writer. Optimizes by converting ASCII formats (P1-P3) to Binary (P4-P6). Needs verification. |
| SqliteProcessor | ✅ | N.A. | N.A. | `VACUUM` + `ANALYZE` are standard, safe operations. Considered verified. |
| MseedProcessor | ✅ | ✅ | N.A. | Metadata is part of header structure. Considered complete. May be extended for JSON header metadata. |
| MkvProcessor | 🟡 | 🟡 | ❌ | Uses mkclean. Container extraction/finalization is TODO. Verify chapter/tag/attachment preservation. |
| ArchiveProcessor | ✅ | N.A. | 🟡 | Core extractor/rebuilder using libarchive. Needs extensive testing for archive types (ZIP, TAR, RAR...). Rewrite hardlink handling. Add 7z SDK support. |
| PdfProcessor | ✅ | N.A. | 🟡 | Extracts streams, recompresses Flate streams with Zopfli using qpdf. Complex format, needs verification. Investigate `pdfsizeopt` techniques. `raw_equal` implemented (raw stream compare). |
| OOXMLProcessor | ✅ | N.A. | 🟡 | Extracts ZIP, recompresses embedded PNG/JPG with Zopfli. Needs verification. Explore Leanify-style recursive optimization. |
| OdfProcessor | ✅ | N.A. | 🟡 | Extracts ZIP, recompresses embedded XML with Zopfli. Stores `mimetype` uncompressed. Needs verification. Explore Leanify-style recursive optimization. |

(Legend: ✅ = Verified, 🟡 = Partially implemented/Needs verification, ❌ = Not implemented/Missing, N.A. = Not Applicable)
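
For reference, the "pixel compare" check mentioned in several rows amounts to decoding both files and comparing the raw buffers. The sketch below illustrates the idea only; it is not the project's `raw_equal`, and `DecodedImage` and `pixels_equal` are hypothetical names that assume both images were decoded to the same 8-bit interleaved layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decoded image in a common layout (hypothetical type).
struct DecodedImage {
    std::uint32_t width;
    std::uint32_t height;
    std::uint8_t channels;             // e.g. 3 for RGB, 4 for RGBA
    std::vector<std::uint8_t> pixels;  // width * height * channels bytes
};

// True only if both images have the same geometry and identical pixel data.
inline bool pixels_equal(const DecodedImage& a, const DecodedImage& b) {
    if (a.width != b.width || a.height != b.height || a.channels != b.channels) {
        return false;
    }
    const std::size_t expected =
        static_cast<std::size_t>(a.width) * a.height * a.channels;
    if (a.pixels.size() != expected || b.pixels.size() != expected) {
        return false;  // a truncated or oversized buffer is never "equal"
    }
    return a.pixels == b.pixels;
}
```

A check like this says nothing about metadata, which is why the Metadata column is tracked separately.
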