
Ant Group Uses Chinese Chips to Train AI Models

IMAGE CREDITS: WSJ

In a strategic shift that reflects China's growing push for self-reliance in artificial intelligence, Ant Group is increasingly using domestic semiconductors to train large language models (LLMs), aiming to reduce costs and lessen its dependence on U.S.-restricted technology.

The fintech giant has adopted chips from Chinese suppliers, including those linked to its affiliate Alibaba and to Huawei, as it builds out next-gen AI systems using the Mixture of Experts (MoE) method. Sources familiar with the matter say the results achieved using Chinese-made chips have been comparable to those trained on Nvidia's H800 GPUs, a chip still used in parts of Ant's AI development.

But as chip supply restrictions tighten and training costs balloon, Ant Group is reportedly increasing its use of hardware from AMD and local chipmakers for newer models, a move that could signal a deeper strategy shift as China's tech firms navigate AI development under U.S. export controls.

An Ant Group internal research paper outlines how the company trained its MoE-based models without relying on high-end GPUs like Nvidia's, significantly reducing compute costs in the process. While traditional training on one trillion tokens cost around 6.35 million yuan ($880,000), Ant claims its hardware-efficient strategy lowered the cost to 5.1 million yuan, nearly a 20% reduction.
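The roughly 20% figure follows directly from the two reported numbers; a quick check of the arithmetic:

```python
# Reported training costs per one trillion tokens (figures from the article)
baseline_yuan = 6_350_000   # conventional training on high-end GPUs
optimized_yuan = 5_100_000  # Ant's hardware-efficient approach

savings = baseline_yuan - optimized_yuan
reduction_pct = savings / baseline_yuan * 100

print(f"Savings: {savings:,} yuan ({reduction_pct:.1f}% reduction)")
# → Savings: 1,250,000 yuan (19.7% reduction)
```

A 19.7% reduction is consistent with the "nearly 20%" characterization in the paper.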

The paper, which includes the phrase “scaling models without premium GPUs” in the title, suggests that Ant is exploring more cost-effective ways to scale AI training by combining MoE architecture with accessible, lower-spec chips.

MoE models function like a team of specialised AI modules, each activated only when needed. This setup offers greater efficiency and scalability compared to traditional monolithic models. The technique has gained traction among global AI leaders like Google, as well as Chinese startups like DeepSeek, which recently unveiled the cost-efficient R1 model.
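As an illustrative sketch only (not Ant's actual architecture, and with made-up dimensions), a toy MoE layer shows why the approach saves compute: a router scores all experts for each input, but only the top-scoring few actually run, so most of the model's parameters sit idle on any given token:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Mixture-of-Experts layer: 4 experts, but only the top 2 run per input.
n_experts, d_model, top_k = 4, 8, 2
expert_weights = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router = rng.normal(size=(d_model, n_experts))

def moe_forward(x):
    scores = x @ router                        # router scores every expert
    chosen = np.argsort(scores)[-top_k:]       # keep only the top-k experts
    gates = np.exp(scores[chosen])
    gates /= gates.sum()                       # softmax over the chosen experts
    # Only the selected experts do any work; the rest are skipped entirely.
    return sum(g * (x @ expert_weights[i]) for g, i in zip(gates, chosen))

y = moe_forward(rng.normal(size=d_model))
print(y.shape)  # → (8,)
```

With top_k fixed, the per-token compute stays roughly constant even as more experts (and thus parameters) are added, which is the scaling property that makes MoE attractive on cheaper hardware.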

Ant's approach stands in contrast to Nvidia's trajectory, which remains focused on creating ever-more powerful GPUs to match growing demand. Nvidia CEO Jensen Huang has argued that businesses will continue to pursue performance at any cost rather than shift toward smaller, cheaper chips.

But Ant Group is betting on a different future, one that values real-world application and cost-effective training as keys to AI adoption. According to sources, the company's models, named Ling-Plus and Ling-Lite, are designed for industry use cases in finance, healthcare, and enterprise automation.

Earlier this year, Ant acquired Haodf.com, a Chinese online medical platform, hinting at plans to integrate AI in clinical settings. It also continues to operate services such as the Zhixiaobao virtual assistant and its financial advisory platform Maxiaocai, expanding its footprint in applied AI.

“We believe performance in the real world — not just model size — is what truly matters,” said Robin Yu, CTO of Beijing-based Shengshang Tech. “If you find a weakness in the world’s top fighter and win, it’s still a win.”

In a further nod to collaboration and transparency, Ant has open-sourced both Ling-Lite and Ling-Plus. The former contains 16.8 billion parameters, while the larger Ling-Plus boasts 290 billion. For comparison, estimates suggest GPT-4.5, developed by OpenAI, has roughly 1.8 trillion parameters.

Despite the progress, Ant acknowledges in its paper that training remains a fragile process, especially when working with non-premium hardware. Minor adjustments to hardware setups or model architectures reportedly caused spikes in error rates during experimentation, revealing the ongoing difficulty of optimising performance under constrained conditions.

Still, the research could represent a significant step toward reducing China’s reliance on U.S. technology in one of the most strategically important tech domains of the decade.

If Ant's approach proves scalable, it may encourage more Chinese companies to embrace homegrown chips, open-source AI architectures, and cost-conscious model training, shifting the dynamics of the global AI race in the process.
