Fujitsu Launches One Compression: Open-Source LLM Quantization Library with Novel QEP Method
Key Takeaways
- Fujitsu's OneComp library brings NeurIPS 2025 research into production with the novel Quantization Error Propagation (QEP) method for improved LLM quantization accuracy
- The library offers both ease of use (single-line quantization) and advanced features (AutoBit, JointQ, rotation preprocessing) for researchers and practitioners
- Integration with vLLM and support for LoRA fine-tuning enable practical deployment and customization of quantized models in production environments
Summary
Fujitsu has released Fujitsu One Compression (OneComp), an open-source Python library for post-training quantization of large language models (LLMs). The library implements several state-of-the-art quantization algorithms, including GPTQ, DBF, and RTN, and introduces Quantization Error Propagation (QEP), a novel method presented at NeurIPS 2025 that corrects quantization errors by propagating them to subsequent layers, improving the accuracy of quantized models.
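The error-propagation idea behind QEP can be illustrated with a toy two-layer example: instead of quantizing each layer against full-precision activations, the next layer is re-fit against the *quantized* outputs of the previous one, so accumulated error is compensated rather than ignored. The NumPy sketch below is a simplification of that principle, not Fujitsu's implementation or OneComp's API; it uses plain round-to-nearest (RTN) quantization and a least-squares re-fit as stand-ins.

```python
import numpy as np

def rtn_quantize(w, bits=4):
    """Symmetric round-to-nearest quantization with per-column scales (toy RTN)."""
    qmax = 2 ** (bits - 1) - 1           # 7 for signed 4-bit
    scale = np.abs(w).max(axis=0) / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 64))           # calibration activations
W1 = rng.normal(size=(64, 64))
W2 = rng.normal(size=(64, 32))
y_ref = (X @ W1) @ W2                    # full-precision reference output

# Naive: quantize each layer independently against full-precision inputs.
W1q, W2q = rtn_quantize(W1), rtn_quantize(W2)
err_naive = np.linalg.norm((X @ W1q) @ W2q - y_ref)

# Error propagation: re-fit layer 2 against the *quantized* layer-1 output
# before quantizing it, so layer 2 absorbs layer 1's quantization error.
H_q = X @ W1q
W2_fit, *_ = np.linalg.lstsq(H_q, y_ref, rcond=None)
err_prop = np.linalg.norm(H_q @ rtn_quantize(W2_fit) - y_ref)

print("naive:", err_naive, " propagated:", err_prop)
```

The least-squares step guarantees that, before layer 2 is itself quantized, the re-fit weights reproduce the reference output at least as well as the original weights do on the quantized activations; QEP applies a more principled correction, but the compensation intuition is the same.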
OneComp offers practical features for developers and researchers: a vLLM plugin for serving quantized models; AutoBit, which performs mixed-precision quantization by assigning bitwidths automatically based on available VRAM; JointQ, which optimizes weight assignments and scale parameters simultaneously; and LoRA SFT post-processing for fine-tuning quantized models with adapters. The library also includes rotation preprocessing based on SpinQuant/OstQuant, which reduces quantization error by learning optimal rotation matrices before quantization.
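Rotation preprocessing rests on a simple identity: for any orthogonal matrix R, x·W = (x·R)(Rᵀ·W), so weights can be rotated before quantization without changing the full-precision computation, while the rotation smears outlier channels across many dimensions. The sketch below illustrates that invariance in the spirit of SpinQuant/OstQuant; the random orthogonal matrix stands in for the learned one, and none of this reflects OneComp's actual code.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 64))             # a small batch of activations
W = rng.normal(size=(64, 64))
W[0, :] *= 50.0                          # one input channel with outlier weights

# Random orthogonal matrix via QR; SpinQuant-style methods *learn* this rotation.
R, _ = np.linalg.qr(rng.normal(size=(64, 64)))

# Invariance: rotating activations and counter-rotating weights is a no-op
# in full precision, so it can be applied freely before quantization.
y_plain = x @ W
y_rot = (x @ R) @ (R.T @ W)
assert np.allclose(y_plain, y_rot)

# The rotation spreads the outlier row across all rows, shrinking the
# dynamic range a low-bit quantizer has to cover.
print("max |W|       :", np.abs(W).max())
print("max |R.T @ W| :", np.abs(R.T @ W).max())
```

Because the rotated weights have a flatter dynamic range, the same bit budget covers them with finer granularity, which is where the quantization-error reduction comes from.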
The tool is designed for ease of use, allowing users to quantize any Hugging Face-compatible model with a single line of code, handling QEP 4-bit quantization, evaluation, and model saving automatically. Currently, OneComp has been verified with specific model architectures including Llama and Qwen3, with plans to expand support for additional architectures. The library is released under an open-source license with copyright held by Fujitsu Ltd.
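The benefit of the LoRA SFT post-processing mentioned above can also be shown in miniature: a low-rank correction fitted on calibration data recovers part of the output error that quantization introduced. The sketch below is a toy stand-in using a closed-form SVD fit; real LoRA adapters are trained with gradient descent on task data, and the quantizer, rank, and shapes here are illustrative assumptions, not OneComp behavior.

```python
import numpy as np

def rtn_quantize(w, bits=4):
    """Symmetric round-to-nearest quantization with per-column scales (toy RTN)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=0) / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

rng = np.random.default_rng(2)
X = rng.normal(size=(256, 64))          # calibration activations
W = rng.normal(size=(64, 64))
Wq = rtn_quantize(W)

# Fit a rank-8 correction (the "adapter") that cancels as much of the
# quantized layer's output error on X as a rank-8 update can.
r = 8
M = X @ (W - Wq)                         # quantization error in output space
U, s, Vt = np.linalg.svd(M, full_matrices=False)
AB = np.linalg.pinv(X) @ (U[:, :r] * s[:r]) @ Vt[:r]   # rank-r weight update

err_no_adapter = np.linalg.norm(X @ Wq - X @ W)
err_adapter = np.linalg.norm(X @ (Wq + AB) - X @ W)
print("error without adapter:", err_no_adapter)
print("error with adapter   :", err_adapter)
```

The base weights stay quantized; only the small low-rank factors are stored and trained, which is what makes adapter fine-tuning cheap on top of a compressed model.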
- Open-source release democratizes access to state-of-the-art quantization techniques, reducing barriers for model compression and efficient inference
Editorial Opinion
Fujitsu's release of OneComp represents a timely contribution to the LLM community, bridging the gap between cutting-edge research (NeurIPS 2025) and practical deployment tools. The combination of novel QEP methodology with mature quantization algorithms and pragmatic features like vLLM integration positions this library as a valuable resource for organizations seeking to reduce model inference costs. The open-source approach and ease-of-use design could significantly accelerate adoption of efficient LLM inference across industry and research.



