Huawei-Led Team Claims Full-Parameter Post-Training of DeepSeek's V4-Pro on Ascend 910C Chips

Key Takeaways

▸A Huawei-led team reports completing full-parameter post-training of a 1.6-trillion-parameter model on Chinese AI accelerators, a potential step toward reducing dependence on Nvidia GPUs for AI training workloads
▸The claim would mark a reversal from August 2025, when DeepSeek reportedly abandoned Ascend chips for training its R2 model because of performance instability and software limitations
▸The Ascend 910C reportedly delivers about 60% of an Nvidia H100's inference performance; this is the first claimed success at production-scale post-training, which is a different task from pre-training a model from scratch

Source: Hacker News, https://www.tomshardware.com/tech-industry/artificial-intelligence/huawei-led-team-claims-it-post-trained-deepseeks-1-6-trillion-parameter-models-on-ascend-910c-chips

Summary

A Huawei-led research consortium has announced that it completed full-parameter post-training of DeepSeek's V4-Pro, a 1.6-trillion-parameter language model, on a cluster of at least 1,000 Huawei Ascend 910C accelerators. If confirmed, the result would be a milestone in China's efforts to reduce dependence on Nvidia GPUs for model training, an area where domestic silicon has struggled to match Nvidia hardware that U.S. export controls restrict.
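To put the reported scale in perspective, the following is a rough back-of-envelope sketch of the memory that full-parameter training state could require. Only the parameter count and the lower bound on cluster size come from the report. The byte counts per parameter assume a common mixed-precision Adam setup with state sharded evenly across devices; the announcement does not say what precision, optimizer, or parallelism scheme the team actually used, so the figures are illustrative only.

```python
# Back-of-envelope estimate of model state for full-parameter training.
# ASSUMPTIONS (not from the announcement): mixed-precision Adam with
# 2-byte weights, 2-byte gradients, and 12 bytes of FP32 optimizer state
# per parameter, all sharded evenly across every device.

PARAMS = 1.6e12      # model size reported in the article
DEVICES = 1_000      # lower bound on cluster size reported in the article

BYTES_PER_PARAM = {
    "weights_bf16": 2,
    "gradients_bf16": 2,
    "optimizer_fp32": 12,  # FP32 master weights + Adam first and second moments
}


def training_state_bytes(params: float, bytes_per_param: dict[str, int]) -> float:
    """Total bytes of model state, excluding activations and buffers."""
    return params * sum(bytes_per_param.values())


def per_device_gb(total_bytes: float, devices: int) -> float:
    """Per-device share in GB (10^9 bytes), assuming perfectly even sharding."""
    return total_bytes / devices / 1e9


total = training_state_bytes(PARAMS, BYTES_PER_PARAM)
print(f"total model state: {total / 1e12:.1f} TB")
print(f"per device: {per_device_gb(total, DEVICES):.1f} GB")
```

Under these assumptions the model state alone comes to tens of terabytes, which is why updating every weight of a model this size requires a large cluster. Activations, communication buffers, and any auxiliary models used during post-training are excluded, so real requirements would be higher.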

Post-training is the phase in which a pre-trained model's weights are fine-tuned for instruction-following, safety alignment, and task-specific performance. Doing this across all parameters of a model this large was previously seen as beyond the capabilities of Ascend hardware. The consortium comprised Huawei, Shenzhen Loop Area Institute, Harbin Institute of Technology, and the Shenzhen Research Institute of Big Data. The claim contrasts sharply with August 2025, when DeepSeek reportedly abandoned Ascend chips for its R2 model because of unstable performance and software gaps and relied on Nvidia GPUs for training instead.

However, the announcement lacks crucial supporting evidence: no benchmarks, timing data, or efficiency metrics were disclosed, and the result has not been independently verified. Tom's Hardware notes that DeepSeek itself has not publicly commented on the achievement and questions the claim's veracity, citing a pattern of unsubstantiated announcements from Chinese state-linked sources. The result, while potentially significant if validated, remains unproven.

Editorial Opinion

If verified, this milestone would carry genuine geopolitical significance, demonstrating that China can conduct full-parameter post-training on domestic silicon beyond the reach of U.S. export controls. However, the lack of supporting evidence, the absence of public confirmation from DeepSeek, and a history of unsubstantiated claims from Chinese state actors warrant significant skepticism. Without independent validation, the announcement fits a familiar pattern of technology claims that sound impressive in a headline but lack real-world proof.
