zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression
Saibo Geng, Nathan Ranchin, Yunzhen Yao, Maxime Peyrard, Chris Wendler, Michael Gastpar, Robert West
Paper: https://arxiv.org/abs/2506.01084
This package provides a high-performance LZW compression library with Python bindings. It is the compression component of the zip2zip project.
We developed a variant of the Lempel-Ziv-Welch (LZW) compression algorithm that can decode input that was not perfectly encoded. This lets the algorithm decode (decompress) sequences generated by a Large Language Model (LLM) without storing the entire compression codebook.
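For readers unfamiliar with LZW, here is a minimal sketch of the classic textbook algorithm over integer token ids. It is not this package's variant and not its API: the function names `lzw_encode` and `lzw_decode` and the `base_size` parameter are hypothetical, chosen only for illustration. It shows how the decoder rebuilds the codebook from the code stream alone, which is why no codebook has to be stored or transmitted.

```python
def lzw_encode(tokens, base_size):
    """Classic LZW. Base tokens are ids in [0, base_size); new codes start at base_size."""
    codebook = {}          # tuple of base tokens -> merged code (>= base_size)
    next_code = base_size
    out = []
    current = ()
    for t in tokens:
        candidate = current + (t,)
        # Single base tokens are always "known"; longer runs must be in the codebook.
        if len(candidate) == 1 or candidate in codebook:
            current = candidate
        else:
            out.append(current[0] if len(current) == 1 else codebook[current])
            codebook[candidate] = next_code
            next_code += 1
            current = (t,)
    if current:
        out.append(current[0] if len(current) == 1 else codebook[current])
    return out


def lzw_decode(codes, base_size):
    """Inverse of lzw_encode. Rebuilds the codebook on the fly from the codes."""
    entries = {}           # merged code -> tuple of base tokens
    next_code = base_size
    out = []
    prev = None
    for c in codes:
        if c < base_size:
            entry = (c,)
        elif c in entries:
            entry = entries[c]
        elif c == next_code and prev is not None:
            # The code refers to the entry that is about to be created.
            entry = prev + (prev[0],)
        else:
            raise ValueError(f"invalid code {c}")
        out.extend(entry)
        if prev is not None:
            entries[next_code] = prev + (entry[0],)
            next_code += 1
        prev = entry
    return out


tokens = [1, 2, 1, 2, 1, 2, 1, 2]
codes = lzw_encode(tokens, base_size=4)
print(codes, lzw_decode(codes, base_size=4) == tokens)
```

Note that this classic decoder raises an error on any code it cannot resolve, so it only accepts well-formed encoder output. The variant described above relaxes that requirement.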
pip install zip2zip-compression
To build from source, make sure you have the Rust toolchain installed, then run:
pip install maturin
maturin build --release
See the docs folder for more information.
You can find usage examples in the example folder.