End of the Training "Arms Race": The Key to AI Victory Shifts to Inference Chips

Edited by Betty from Gasgoo
Gasgoo Munich - A tiny chip is emerging as the key lever pushing artificial intelligence from the "training" phase into real-world "application," unlocking a market projected to exceed 310 billion yuan over the next three years.

As inference demand overtakes training, the question is: who will seize the initiative in the computing power race?

Recently, OpenAI signed a deal with wafer-scale AI chip firm Cerebras. Between 2026 and 2028, the AI giant plans to integrate 750 MW of Cerebras chips into its inference computing infrastructure.

This massive deployment isn't for model training; it is dedicated entirely to AI inference services.

Inference — the stage where trained models actually "go to work" — is fast becoming the new focal point of the AI chip market.

Zhou Zhengang, vice president at IDC China, points out that inference cards accounted for 57.6% of China's data center accelerator shipments in 2024, versus 33% for training cards. Since the emergence of DeepSeek, smaller companies have been shifting compute power from training to inference. The integration of DeepSeek by major platforms such as Tencent and Baidu has accelerated this demand further, so the share of inference chips is expected to climb significantly in 2025.

Inference Demand Surpasses Training, Industry Hits Inflection Point

For the past two years, training large models was the undisputed focus of the AI chip market.

Now, that landscape is quietly shifting.

According to the "2025-2030 Artificial Intelligence Chip Industry Market Research and Investment Prospect Forecast Report," China's market for AI inference chips and related services is expanding at a staggering compound annual growth rate of 94.9% — surging from 11.3 billion yuan in 2020 to 162.6 billion yuan in 2024. By 2025, that figure is projected to hit 310.6 billion yuan.
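
The reported growth rate is easy to verify from the endpoints above. Below is a minimal Python sketch of the standard CAGR formula using the report's 2020 and 2024 figures; note that the constant-rate extrapolation it prints for 2025 lands near, but not exactly on, the report's 310.6 billion yuan projection, which presumably reflects the report's own modeling rather than a pure constant-rate extension.

```python
# Sanity check on the reported 94.9% CAGR, using the report's endpoint figures.
start = 11.3          # billion yuan, 2020
end = 162.6           # billion yuan, 2024
years = 2024 - 2020   # four compounding periods

cagr = (end / start) ** (1 / years) - 1
print(f"Implied 2020-2024 CAGR: {cagr:.1%}")  # -> 94.8%, matching the reported 94.9% to rounding

# Extending one more year at the same rate:
print(f"Constant-rate 2025 estimate: {end * (1 + cagr):.1f} billion yuan")  # -> ~316.7
```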

The primary driver behind this explosive growth is the rapid deployment of AI large models.

The widespread adoption of generative AI has fundamentally reshaped demand structures. With the arrival of open-source, high-performance models like DeepSeek-R1, more users are actually putting large models to work, driving a massive surge in inference demand.

Tencent management made it clear during a recent earnings call that the company sees stronger demand for GPUs on the inference side. As user inference needs grow, AI demand has outstripped the computing power available from existing GPU resources.

This shift is also reflected in the planning of intelligent computing centers.

In December 2024, Hong Kong's largest AI supercomputing center went live at Cyberport, with the government offering subsidies to users. Zheng Songyan, CEO of Cyberport, noted that the center's second phase is planned to add 1,700 PFLOPS, boosting total capacity to 3,000 PFLOPS by October 2025.

The rapid expansion of China's AI inference chip market is also closely tied to policy. State policies are accelerating the development of smart cities and digital government, driving demand for chips and related services.

These massive projects require extensive computational infrastructure, making operational cost control a critical priority. AI inference chips and services with superior energy efficiency are perfectly positioned to meet that need.

Domestic Chips Enter the Arena; Cost-Performance Becomes the Key to Victory

As inference demand becomes the market mainstream, the competitive landscape for AI chips is undergoing a profound shift.

The most obvious trend: domestic chips now face an unprecedented opportunity.

Data from Zhou Zhengang shows that domestic computing power claimed 34.6% of China's data center accelerator market in 2024. That marks a significant shift from 2022 and 2023, when Nvidia commanded an 85% to 90% market share.

The nature of inference tasks gives domestic chips a foothold. Unlike training, inference has looser compatibility requirements, allowing workloads from different applications to be distributed across different chips.

Image source: Screenshot from Nvidia's official website

This means domestic chips can carve out their own application scenarios without going head-to-head with Nvidia.

Market acceptance of domestic chips is rising fast.

Some observers note that while customers used to question how domestic chips stacked up against the Nvidia H100 in terms of advantages, value, and ecosystem compatibility, "those concerns are fading."

Cloud and AI vendors now need domestic alternatives to Nvidia to supplement their resources and provide backup.

In this market, extreme cost-performance has become the winning weapon for domestic chips.

One industry insider put it bluntly: "Through extreme performance optimization, domestic chips can deliver value that surpasses the Nvidia RTX 4090 in specific fields."

These specific fields include transportation, energy, and telecommunications.

As the large models running on edge devices, from smartwatches to industrial robots, grow smarter, the boundaries of the inference chip market keep expanding.

This market, once overshadowed by the glare of training chips, is entering its golden age.

This tiny inference chip is no longer just a technological accessory; it is the critical variable determining whether AI can truly integrate into every industry.
