DeepSeek Completes Full-Parameter Post-Training of V4-Pro on Huawei's Ascend 910C Chips
Key Takeaways
- Chinese AI accelerators have reportedly achieved full-parameter post-training of a 1.6-trillion-parameter model, marking potential progress toward reducing dependence on Nvidia GPUs for AI training workloads
- This represents a reversal from August 2024, when DeepSeek reportedly abandoned Ascend chips for training the R2 model due to performance instability and software limitations
- The Ascend 910C reportedly delivers approximately 60% of an Nvidia H100's inference performance; this is the first claimed success at production-scale post-training, as distinct from pre-training from scratch
Summary
A research consortium involving DeepSeek and Huawei Technologies announced the successful completion of full-parameter post-training for DeepSeek's V4-Pro model, a 1.6-trillion-parameter language model, using a cluster of at least 1,000 Huawei Ascend 910C accelerators. The achievement is a potential milestone in China's efforts to reduce dependence on Nvidia GPUs for AI model training, an area where Chinese silicon, developed under U.S. export controls, has previously struggled to match Nvidia's performance.
Post-training, the critical phase where model weights are fine-tuned for instruction-following, safety alignment, and task-specific performance, was previously seen as beyond the technical capabilities of Ascend hardware. The consortium included Huawei, Shenzhen Loop Area Institute, Harbin Institute of Technology, and the Shenzhen Research Institute of Big Data. This contrasts sharply with August 2024, when DeepSeek reportedly abandoned Ascend chips for the R2 model due to unstable performance and software gaps, ultimately relying on Nvidia GPUs for training.
However, the announcement lacks crucial supporting evidence: no benchmarks, timing data, efficiency metrics, or independent verification were disclosed. Tom's Hardware notes that DeepSeek itself has not publicly commented on the achievement, and the article questions the claim's veracity, citing a pattern of unsubstantiated Chinese state announcements. The result, while potentially significant if validated, remains unproven at scale.
Editorial Opinion
If verified, this milestone would carry genuine geopolitical significance, demonstrating China's ability to conduct full-parameter post-training on domestic silicon beyond the reach of U.S. export controls. However, the lack of supporting evidence, the absence of public confirmation from DeepSeek, and a history of unsubstantiated claims from Chinese state actors warrant significant skepticism. Without independent validation, this announcement fits a familiar pattern of technology claims that sound impressive in headlines but lack real-world proof.



