Fujitsu Launches One Compression: Open-Source LLM Quantization Library with Novel QEP Method
Key Takeaways
- Fujitsu's OneComp library brings NeurIPS 2025 research into production with the novel Quantization Error Propagation (QEP) method for improved LLM quantization accuracy
- The library offers both ease of use (single-line quantization) and advanced features (AutoBit, JointQ, rotation preprocessing) for researchers and practitioners
- Integration with vLLM and support for LoRA fine-tuning enable practical deployment and customization of quantized models in production environments
Summary
Fujitsu has released Fujitsu One Compression (OneComp), an open-source Python library for post-training quantization of large language models (LLMs). The library implements several state-of-the-art quantization algorithms, including GPTQ, DBF, and RTN, and introduces Quantization Error Propagation (QEP), a novel method presented at NeurIPS 2025 that corrects quantization errors by propagating them to subsequent layers, improving the accuracy of quantized models.
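The error-propagation idea behind QEP can be illustrated with a toy two-layer example: instead of quantizing each layer against full-precision activations, the next layer is re-fit against the *quantized* outputs of the previous one, so accumulated error is compensated rather than ignored. The NumPy sketch below is a simplification of that principle, not Fujitsu's implementation or OneComp's API; it uses plain round-to-nearest (RTN) quantization and a least-squares re-fit as stand-ins.

```python
import numpy as np

def rtn_quantize(w, bits=4):
    """Symmetric round-to-nearest quantization with per-column scales (toy RTN)."""
    qmax = 2 ** (bits - 1) - 1           # 7 for signed 4-bit
    scale = np.abs(w).max(axis=0) / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 64))           # calibration activations
W1 = rng.normal(size=(64, 64))
W2 = rng.normal(size=(64, 32))
y_ref = (X @ W1) @ W2                    # full-precision reference output

# Naive: quantize each layer independently against full-precision inputs.
W1q, W2q = rtn_quantize(W1), rtn_quantize(W2)
err_naive = np.linalg.norm((X @ W1q) @ W2q - y_ref)

# Error propagation: re-fit layer 2 against the *quantized* layer-1 output
# before quantizing it, so layer 2 absorbs layer 1's quantization error.
H_q = X @ W1q
W2_fit, *_ = np.linalg.lstsq(H_q, y_ref, rcond=None)
err_prop = np.linalg.norm(H_q @ rtn_quantize(W2_fit) - y_ref)

print("naive:", err_naive, " propagated:", err_prop)
```

The least-squares step guarantees that, before layer 2 is itself quantized, the re-fit weights reproduce the reference output at least as well as the original weights do on the quantized activations; QEP applies a more principled correction, but the compensation intuition is the same.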
OneComp offers practical features for developers and researchers: a vLLM plugin for serving quantized models; AutoBit, which performs mixed-precision quantization by assigning bitwidths automatically based on available VRAM; JointQ, which optimizes weight assignments and scale parameters simultaneously; and LoRA SFT post-processing for fine-tuning quantized models with adapters. The library also includes rotation preprocessing based on SpinQuant/OstQuant, which reduces quantization error by learning optimal rotation matrices before quantization.
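Rotation preprocessing rests on a simple identity: for any orthogonal matrix R, x·W = (x·R)(Rᵀ·W), so weights can be rotated before quantization without changing the full-precision computation, while the rotation smears outlier channels across many dimensions. The sketch below illustrates that invariance in the spirit of SpinQuant/OstQuant; the random orthogonal matrix stands in for the learned one, and none of this reflects OneComp's actual code.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 64))             # a small batch of activations
W = rng.normal(size=(64, 64))
W[0, :] *= 50.0                          # one input channel with outlier weights

# Random orthogonal matrix via QR; SpinQuant-style methods *learn* this rotation.
R, _ = np.linalg.qr(rng.normal(size=(64, 64)))

# Invariance: rotating activations and counter-rotating weights is a no-op
# in full precision, so it can be applied freely before quantization.
y_plain = x @ W
y_rot = (x @ R) @ (R.T @ W)
assert np.allclose(y_plain, y_rot)

# The rotation spreads the outlier row across all rows, shrinking the
# dynamic range a low-bit quantizer has to cover.
print("max |W|       :", np.abs(W).max())
print("max |R.T @ W| :", np.abs(R.T @ W).max())
```

Because the rotated weights have a flatter dynamic range, the same bit budget covers them with finer granularity, which is where the quantization-error reduction comes from.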
The tool is designed for ease of use, allowing users to quantize any Hugging Face-compatible model with a single line of code, handling QEP 4-bit quantization, evaluation, and model saving automatically. Currently, OneComp has been verified with specific model architectures including Llama and Qwen3, with plans to expand support for additional architectures. The library is released under an open-source license with copyright held by Fujitsu Ltd.
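The benefit of the LoRA SFT post-processing mentioned above can also be shown in miniature: a low-rank correction fitted on calibration data recovers part of the output error that quantization introduced. The sketch below is a toy stand-in using a closed-form SVD fit; real LoRA adapters are trained with gradient descent on task data, and the quantizer, rank, and shapes here are illustrative assumptions, not OneComp behavior.

```python
import numpy as np

def rtn_quantize(w, bits=4):
    """Symmetric round-to-nearest quantization with per-column scales (toy RTN)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=0) / qmax
    return np.round(w / scale).clip(-qmax - 1, qmax) * scale

rng = np.random.default_rng(2)
X = rng.normal(size=(256, 64))          # calibration activations
W = rng.normal(size=(64, 64))
Wq = rtn_quantize(W)

# Fit a rank-8 correction (the "adapter") that cancels as much of the
# quantized layer's output error on X as a rank-8 update can.
r = 8
M = X @ (W - Wq)                         # quantization error in output space
U, s, Vt = np.linalg.svd(M, full_matrices=False)
AB = np.linalg.pinv(X) @ (U[:, :r] * s[:r]) @ Vt[:r]   # rank-r weight update

err_no_adapter = np.linalg.norm(X @ Wq - X @ W)
err_adapter = np.linalg.norm(X @ (Wq + AB) - X @ W)
print("error without adapter:", err_no_adapter)
print("error with adapter   :", err_adapter)
```

The base weights stay quantized; only the small low-rank factors are stored and trained, which is what makes adapter fine-tuning cheap on top of a compressed model.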
- Open-source release democratizes access to state-of-the-art quantization techniques, reducing barriers for model compression and efficient inference
Editorial Opinion
Fujitsu's release of OneComp represents a timely contribution to the LLM community, bridging the gap between cutting-edge research (NeurIPS 2025) and practical deployment tools. The combination of novel QEP methodology with mature quantization algorithms and pragmatic features like vLLM integration positions this library as a valuable resource for organizations seeking to reduce model inference costs. The open-source approach and ease-of-use design could significantly accelerate adoption of efficient LLM inference across industry and research.



