Ant Group Reduces Costs by Training AI Models with Local Chips
Friday, Apr 4, 2025

Ant Group is shifting towards Chinese-produced semiconductors for training AI models to cut expenses and reduce reliance on limited US technology, as per sources familiar with the situation.
The company, an affiliate of Alibaba, has used chips from local suppliers, including affiliates of Alibaba and Huawei Technologies, to train large language models with the Mixture of Experts (MoE) approach. These efforts reportedly yielded results on par with Nvidia's H800 chips, insiders noted. Although Ant continues to employ Nvidia chips for some AI projects, a source indicated the company is increasingly opting for alternatives from AMD and Chinese chip producers for its newer models.
This move underscores Ant's deeper participation in the escalating AI competition between Chinese and American tech companies, particularly as firms seek cost-efficient solutions for model training. The use of domestic hardware signifies a broader strategy among Chinese firms to circumvent export limits that prevent access to high-end chips like Nvidia's H800, which, while not the latest, remains one of the strongest GPUs available to Chinese companies.
Ant released a research paper outlining its work, claiming its models, in certain tests, outperformed those developed by Meta. While these claims have not been independently verified, if validated, Ant's initiatives could mark progress in China's mission to reduce AI application costs and reliance on overseas hardware.
MoE models segment tasks into smaller datasets managed by different components and have gained traction among AI experts. This technique is employed by Google and the Hangzhou-based startup, DeepSeek. MoE functions like a team of experts, each handling a portion of a task to enhance model production efficiency. Ant has not commented on its hardware-related endeavors.
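The routing idea behind MoE can be illustrated with a minimal sketch. This is not Ant's or DeepSeek's implementation; it is a toy example assuming scalar inputs, hard-coded experts, and fixed gate weights, showing only the core mechanism: a gate scores each expert per input, and only the top-scoring expert runs, so compute per input stays roughly constant even as more experts are added.

```python
import math

def softmax(scores):
    """Normalize raw gate scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy experts: each is a simple function standing in for a sub-network.
experts = [
    lambda x: x * 2.0,   # expert 0
    lambda x: x + 10.0,  # expert 1
    lambda x: x ** 2,    # expert 2
]

# Toy gate: fixed per-expert weights scoring relevance to the input x.
gate_weights = [0.5, -0.2, 0.1]

def moe_forward(x):
    """Route the input to the single highest-scoring expert (top-1 routing)."""
    scores = softmax([w * x for w in gate_weights])
    top = max(range(len(experts)), key=lambda i: scores[i])
    return experts[top](x), top
```

In real MoE language models the gate is itself a learned layer and the experts are feed-forward networks inside each transformer block, but the cost argument is the same: only a small subset of parameters is active for any one token.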
Training MoE models requires high-performance GPUs, often too costly for smaller firms to afford. Ant's research aimed to lower that cost barrier, a goal epitomized in a paper emphasizing "scaling models without premium GPUs" [our quotation marks].
Ant's strategy in adopting MoE for cost-reduction purposes contrasts with an approach like Nvidia's. CEO Jensen Huang believes the demand for processing power will rise, even as more efficient models like DeepSeek's R1 emerge. His perspective is that firms will seek more potent chips to bolster revenue, rather than slashing costs with less expensive alternatives. Nvidia's focus remains on developing GPUs with increased cores, transistors, and memory.
According to the Ant Group paper, training one trillion tokens (the basic data units for AI learning) costs around 6.35 million yuan (approximately $880,000) with traditional high-performance hardware. The company's optimized method reduced this expense to about 5.1 million yuan by utilizing lower-specification chips.
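The saving implied by the paper's figures works out to roughly one fifth, restated here as a quick calculation using only the two cost numbers reported above:

```python
# Reported per-trillion-token training costs, in yuan.
baseline_yuan = 6_350_000   # traditional high-performance hardware
optimized_yuan = 5_100_000  # Ant's method on lower-specification chips

saving_yuan = baseline_yuan - optimized_yuan
saving_pct = saving_yuan / baseline_yuan * 100
print(f"Saving: {saving_yuan:,} yuan (~{saving_pct:.0f}%)")
```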
Ant intends to apply the models developed this way, Ling-Plus and Ling-Lite, to industrial AI contexts such as healthcare and finance. Earlier this year, Ant acquired Haodf.com, a Chinese online medical platform, to advance its goal of deploying AI solutions in healthcare. It also provides other AI services, including the Zhixiaobao virtual assistant app and the Maxiaocai financial advisory platform.
"If you find one point of attack to beat the world's best kung fu master, you can still say you beat them, which is why real-world application is important," stated Robin Yu, CTO of Beijing-based AI company Shengshang Tech.
Ant has made its models open source. Ling-Lite consists of 16.8 billion parameters (settings that influence model function), while Ling-Plus boasts 290 billion. In comparison, estimates suggest the closed-source GPT-4.5 has around 1.8 trillion parameters, as reported by MIT Technology Review.