This sector surge is no accident.
Surging Token usage on the demand side, chip shortages on the supply side, and a fundamental shift in business models have together pushed computing power leasing into an unprecedented boom cycle.
Severe Supply-Demand Mismatch Keeps "Scarcity" Premium High
The core driver behind this rally lies in the massive gap between surging global demand for computing power and rigidly constrained supplies of high-end chips.
On the demand side, AI applications have shifted from pure large-model training to a widespread surge in inference scenarios.
A recent report from CITIC Securities notes that the surge of Agent and multimodal applications has driven a rapid increase in Token usage. By April 2026, weekly cumulative Token consumption on OpenRouter, the world's largest API aggregation platform, had jumped roughly 7 to 8 times compared to a year earlier. Domestic large models have become the primary driver of this growth.
Mao Shengyong, deputy director of the National Bureau of Statistics, revealed at a State Council Information Office press briefing on April 16 that daily Token usage had already surpassed 140 trillion in March this year — a rise of over 40% since the end of 2025.

Image Source: LETTALL Electronic
The supply side, meanwhile, faces multiple bottlenecks.
Data from SemiAnalysis shows that as of April 2026, overseas rental prices for H100 chips have climbed 40% in just five months.
Dongwu Securities explicitly stated in a report that only top-tier cloud providers can secure relatively sufficient high-end computing power. Demand from second-tier cloud providers and large-model companies remains far from satisfied, leaving a massive gap.
Amid this mismatch, computing power has emerged as the scarcest strategic resource of the AI era.
Industry-Wide Price Hikes as Business Models Shift from "Selling Compute" to "Selling Tokens"
The direct manifestation of this supply-demand imbalance is a sharp rise in computing power prices across the industry. But behind this lies an even more profound shift: a fundamental upgrade of business models.
Recently, cloud vendors both at home and abroad have initiated a wave of price increases.
Tencent Cloud raised prices for input tokens of its Hunyuan HY2.0 Instruct model by 463% in March, followed by an announcement on April 9 that it would uniformly raise prices for AI computing power and container services by another 5%. Alibaba Cloud announced on March 18 that it would hike prices for AI computing power and storage products by up to 34%, while Baidu Intelligent Cloud also increased prices for AI computing-related products by 5% to 30%.
Overseas, Amazon AWS raised prices for EC2 instances used in large-model training by 15%, and Google Cloud hiked prices for some data transfer services in North America by 100%.
More critically, the business model of the computing power leasing industry is undergoing a profound transformation from "selling computing power" to "selling Tokens."
Dongwu Securities points out that given the current tight supply, computing power leasing companies are gaining bargaining power. Their business models are upgrading from simply renting raw computing power to model services or Token-sharing models. This shift is expected to significantly boost profitability and drive a re-rating of their valuation system from P/E to P/S.
This means computing power leasing firms are no longer just hardware renters; they are deeply participating in the value distribution of AI applications.
Taken together, the sustained rally in the computing power leasing sector is the result of a confluence of three factors: surging demand, constrained supply, and business model upgrades. In the short term, the compute crunch is unlikely to ease quickly, so the trend of rising prices should continue. In the long term, as business models evolve from hardware leasing to value distribution, the profit ceiling of the computing power leasing industry is being redefined.
Of course, investors must also keep an eye on potential risks such as slower-than-expected technological development, geopolitical tensions, and intensifying industry competition — maintaining rational judgment amidst the fever.
Domestic Large Models Rise, Accelerating Demand for Localized Compute Leasing
Another key engine driving this rally is the better-than-expected rise of domestic large models.
Domestic large models have delivered a particularly strong performance in the global arena.
According to calculations based on the latest data from OpenRouter, global total usage of AI large models reached 27 trillion Tokens between March 30 and April 5, an 18.9% week-on-week increase.
Among the ranked AI large models, weekly usage of Chinese models climbed to 12.96 trillion Tokens, up 31.48% from the previous week. Weekly usage of US AI models stood at 3.03 trillion Tokens, a modest 0.76% increase.
Weekly usage of Chinese AI models has grown for five consecutive weeks and has exceeded that of US models over the same five-week stretch.
During the same period, the top six models globally by usage were all Chinese. Among the top three, two were from Alibaba's Qwen3.6 series: Qwen3.6 Plus (free) ranked first with 4.6 trillion Tokens in weekly usage, while Qwen3.6 Plus Preview ranked third with 1.64 trillion Tokens.
Looking back further, data shows that in the week of February 9 to 15, usage of Chinese models hit 4.12 trillion Tokens, surpassing US models (2.94 trillion Tokens) for the first time. By the week of February 16, weekly usage of Chinese models surged to 5.16 trillion Tokens, further widening the lead.
Among the top five models globally by usage, Chinese models occupied four slots: MiniMax's M2.5, Moonshot AI's Kimi K2.5, Zhipu's GLM-5, and DeepSeek's V3.2.

Image Source: DeepSeek
On the commercialization front, Zhipu CEO Zhang Peng stated at the 2025 annual report performance briefing on the evening of March 31 that API pricing for Zhipu would increase by 83% in the first quarter of 2026. Even so, demand still outstrips supply, with usage growing 400%. This data indicates that domestic large models have moved from the free trial phase into genuine paid commercialization, keeping underlying demand for computing power extremely tight.
The computing power needs of domestic large models have their own unique characteristics.
Models represented by DeepSeek widely adopt a Mixture-of-Experts (MoE) architecture. For instance, DeepSeek-V3 has a total of 671 billion parameters, with about 37 billion activated per Token. This creates a strong demand for continuous, high-bandwidth computing power during inference.
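The MoE mechanism described above can be sketched in a few lines: a router scores the experts for each token and only the top-k experts' parameters actually run, which is why a model can hold hundreds of billions of parameters while activating only a small fraction per Token. The expert counts, dimensions, and names below are purely illustrative, not DeepSeek's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts = 8   # illustrative; real MoE models route over far more experts
top_k = 2       # experts activated per token
d_model = 16
n_tokens = 4

# Router: one gating score per expert for each token.
x = rng.standard_normal((n_tokens, d_model))
w_gate = rng.standard_normal((d_model, n_experts))
logits = x @ w_gate

# Pick the top-k experts per token; only those experts' weights run.
top_experts = np.argsort(logits, axis=-1)[:, -top_k:]

# Fraction of routed-expert parameters touched per token.
active_fraction = top_k / n_experts

print(top_experts.shape)   # (4, 2): 2 experts chosen for each of 4 tokens
print(active_fraction)     # 0.25
```

The same ratio logic explains the DeepSeek-V3 figures cited above: activating roughly 37B of 671B parameters per Token keeps compute per inference step low, but the full parameter set must still sit in memory, which is what drives the demand for continuous, high-bandwidth compute.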
Meanwhile, ByteDance's Doubao large model saw daily Token usage surpass 120 trillion by March 2026, a 1,000-fold increase since its launch. Driven by OpenClaw, MiniMax saw its Token consumption grow several-fold in two months. API quotas across vendors are tightening, and usage continues to hit record highs.
For most small and medium-sized AI enterprises, the cost of building a thousand-card cluster is prohibitively high, forcing them to turn to computing power leasing platforms.
Policy support is also intensifying. The Ministry of Industry and Information Technology issued a notice on a special action to empower SMEs with inclusive computing power, encouraging the creation of dedicated compute pools for small businesses. It promotes flexible payment models based on "card-hours," "core-hours," or Tokens, and explores innovative businesses like "compute banks" and "compute supermarkets" to lower barriers. The previously launched "Millisecond Compute" action has already been deployed in 50 regions across 31 provinces, aiming to bridge the "last mile" for efficient compute resource circulation — providing infrastructure guarantees for the widespread adoption of computing power leasing.
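The flexible payment models the notice encourages differ only in the metered unit: reserved hardware time ("card-hours"), CPU time ("core-hours"), or work actually delivered (Tokens). A minimal sketch of the first and last modes, with all prices hypothetical and chosen purely for illustration:

```python
# Hypothetical unit prices -- illustration only, not real market rates.
PRICE_PER_CARD_HOUR = 12.0       # currency units per GPU-hour reserved
PRICE_PER_MILLION_TOKENS = 2.0   # currency units per 1M tokens served


def cost_card_hours(gpus: int, hours: float) -> float:
    """Traditional leasing: pay for reserved hardware time."""
    return gpus * hours * PRICE_PER_CARD_HOUR


def cost_tokens(tokens_served: int) -> float:
    """Token-based billing: pay only for inference work delivered."""
    return tokens_served / 1_000_000 * PRICE_PER_MILLION_TOKENS


print(cost_card_hours(8, 24))     # 2304.0 -- 8 GPUs reserved for a day
print(cost_tokens(500_000_000))   # 1000.0 -- 500M tokens served
```

For a small business with bursty inference traffic, token-based billing avoids paying for idle reserved hardware, which is precisely the barrier-lowering effect the policy targets.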
In short, as domestic large models move from "catching up" to "competing with" their global peers, combined with the exponential expansion of inference scenarios and sustained policy support, the computing power leasing market is gaining solid incremental demand. The sector's prosperity is poised to shift from short-term thematic hype to long-term performance realization.