Meituan's 1.6T-Parameter LongCat-2.0: Trained and Deployed Without Nvidia GPUs
Key Takeaways
- LongCat-2.0 is the first trillion-parameter model to be trained and deployed entirely on domestic Chinese AI ASIC hardware (reportedly Huawei Ascend 910C chips), with no reliance on Nvidia components
- The full training pipeline reportedly used about 50,000 ASIC cards to process more than 35 trillion tokens with stable convergence, demonstrating that large-scale LLM training is viable on non-Nvidia infrastructure
- LongCat-2.0 is not the strongest model available, but its performance approaches that of Claude Opus 4.6, and its agentic capabilities are particularly strong
Summary
Meituan released LongCat-2.0, a 1.6-trillion-parameter foundation model using a Mixture-of-Experts architecture with approximately 48 billion activated parameters per token. The model demonstrates competitive performance, approaching Claude Opus 4.6 capabilities and briefly ranking in the top three models by usage on OpenRouter following its release.
The defining achievement is that Meituan trained and deployed LongCat-2.0 entirely on domestic Chinese AI ASIC chips, reportedly about 50,000 Huawei Ascend 910C cards, without relying on any Nvidia GPUs. The full training run covered more than 35 trillion tokens with stable convergence and no significant rollbacks or irrecoverable loss spikes.
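To put the reported figures in proportion, the short sketch below works out the share of parameters active per token and a rough order-of-magnitude estimate of training compute. The parameter, token, and card counts are the reported ones; the `6 * N * D` approximation (about six FLOPs per active parameter per training token) is a common rule of thumb assumed here, not a figure from Meituan, and the function names are illustrative.

```python
# Back-of-envelope arithmetic on the reported LongCat-2.0 figures.
# ASSUMPTION: training compute ~ 6 * active_params * tokens, a common
# rule of thumb for transformer training; Meituan has not published this.

TOTAL_PARAMS = 1.6e12    # reported total parameters
ACTIVE_PARAMS = 48e9     # reported activated parameters per token (approx.)
TRAIN_TOKENS = 35e12     # reported lower bound ("more than 35 trillion")
NUM_CARDS = 50_000       # reported approximate card count


def activation_ratio(active: float, total: float) -> float:
    """Fraction of the model's parameters used for each token."""
    return active / total


def approx_training_flops(active: float, tokens: float) -> float:
    """Rough training compute under the assumed 6 * N * D rule."""
    return 6 * active * tokens


ratio = activation_ratio(ACTIVE_PARAMS, TOTAL_PARAMS)
flops = approx_training_flops(ACTIVE_PARAMS, TRAIN_TOKENS)
flops_per_card = flops / NUM_CARDS

print(f"Active share per token: {ratio:.1%}")
print(f"Approx. training compute: {flops:.3e} FLOPs")
print(f"Approx. compute per card: {flops_per_card:.3e} FLOPs")
```

Under these assumptions, roughly 3% of the parameters are active for any given token, and the run works out to on the order of 10^25 FLOPs. Because the token count is a lower bound and the rule of thumb ignores attention and routing overhead, this is a floor-level estimate, not a measured figure.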
This marks a turning point for the independence of China's AI industry. Earlier domestic-compute milestones addressed only inference or post-training optimization of existing models; LongCat-2.0 is the first to complete the entire pipeline, from large-scale pretraining through production serving, on Chinese hardware. The result suggests that US semiconductor export controls are no longer an absolute barrier to developing competitive large language models.
Editorial Opinion
LongCat-2.0's true significance lies not in its benchmark scores but in demonstrating that competitive trillion-parameter LLMs can be trained and deployed entirely on Chinese ASIC hardware, independent of US semiconductor supply chains. Earlier domestic-compute efforts covered only parts of the pipeline; this full-pipeline success signals that China's AI infrastructure is maturing outside Nvidia's ecosystem. The achievement will likely accelerate industry-wide interest in hardware diversification for large-scale model training.



