Introduction
The heat sink industry, long overlooked, is gradually moving from behind the scenes to center stage because of the explosive growth in data and computation brought about by AI.
China’s Data Center Power Density Increases
Data source: CDCC, National Securities Research Institute
The heat sink industry faces the pain point of technological transformation and upgrading
Converting electrical energy into other forms of energy involves doing work, and doing work releases heat. If that heat is not removed, the GPU overheats and can burn out.
Previously, air cooling was the usual way to remove heat, but once chip power approaches the upper limit that air cooling can handle, its cost-effectiveness declines.
China's requirements for data centers are explicit: data centers must include liquid cooling. What is at least certain is that liquid cooling penetration will climb from less than 10% today to 20% by 2025.
Considering the overall planning and layout of the "East Data, West Computing" project, more new cabinets will be added at the hub nodes in the future. Air-cooled solutions may not be able to strictly meet the requirements in some areas, and the penetration rate of liquid-cooled solutions is expected to increase at an accelerated pace.
If heat dissipation engineering only fine-tunes or optimizes existing solutions, progress will be slow, and the gap between the cooling capacity available and the demands of high performance and high computing power will keep widening.
Only creative, disruptive heat dissipation technologies that fundamentally improve capability by several times, or even by orders of magnitude, can close the ever-widening gap between what chip performance demands and what traditional cooling technology can supply.
Because AI computing power demand is rising rapidly, CPU/GPU power is increasing at an accelerating pace, requiring more powerful and effective cooling solutions to keep the devices running properly.
Source: Zheshang Securities Co., Ltd.
Large manufacturers take the lead in promoting heat dissipation technology
In the post-Moore's Law era, AI chip performance and power consumption have risen sharply together. The ceiling for air-cooled chip-level heat dissipation is about 800 W; beyond that limit, the cost-effectiveness of air cooling declines.
As the most advanced mass-produced GPU in the world today, the NVIDIA H100 draws as much as 700-800 W. That is the power required by a single GPU, and it already exceeds that of an ordinary one-horsepower air conditioner.
According to Taiwan’s Economic Daily News, NVIDIA and TSMC are working with hardware manufacturers to advance heat dissipation technology.
According to AI supply chain sources, heat treatment technology provider Golik is actively working with TSMC and NVIDIA to develop AI GPU immersion liquid cooling systems.
As computing power continues to increase, chip performance must improve greatly to support it, and this raises another major challenge: the chip's thermal design power (TDP).
Recently, it was also revealed that Golik, a heat treatment technology provider from Taiwan Province of China, has secured an order for 150 liquid-cooled distributors from TSMC and is working with TSMC and NVIDIA to develop immersion liquid-cooling systems for AI GPUs.
Intel is also a supporter of immersion liquid cooling technology, and in 2022, Intel said that “the time for immersion liquid cooling is now.”
Immersion liquid cooling technology will become the mainstream cooling technology
Currently, the first mainstream liquid cooling solution relies on water circulation: a pump and piping carry the liquid into the system to take heat away. The other is immersion technology, in which the heat source (such as a chip) is placed in a non-conductive liquid that carries the heat away.
To increase the power density of a single cabinet, data centers have in recent years begun to adopt liquid cooling widely. The solutions can be broadly divided into two technology paths: cold plate and immersion.
The former uses a cold plate to transfer heat from the heat-generating device indirectly to coolant enclosed in a circulation pipeline; the latter submerges the heat-generating device, together with its circuit board, directly in the liquid.
Compared with air, liquid has higher thermal conductivity, a greater specific heat capacity, and a stronger capacity to absorb heat. Liquid cooling also has a clear advantage in operating costs.
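The specific heat comparison above can be made concrete with a rough estimate. The sketch below uses the standard relation Q = density x flow x specific heat x temperature rise to compare how much air and how much water must flow past a chip to carry away the same heat. The fluid properties are approximate room-temperature textbook values, and the 700 W load and 10 K coolant temperature rise are illustrative assumptions rather than figures from a specific system.

```python
# Approximate room-temperature properties (textbook values, assumed here).
AIR = {"density": 1.2, "cp": 1005.0}      # kg/m^3, J/(kg*K)
WATER = {"density": 997.0, "cp": 4182.0}  # kg/m^3, J/(kg*K)


def volumetric_flow(heat_w, delta_t_k, fluid):
    """Volume flow in m^3/s needed to absorb heat_w watts
    while the coolant warms by delta_t_k kelvin."""
    if delta_t_k <= 0:
        raise ValueError("temperature rise must be positive")
    return heat_w / (fluid["density"] * fluid["cp"] * delta_t_k)


heat_w = 700.0   # assumed chip power, in the range quoted for the H100
delta_t = 10.0   # assumed coolant temperature rise in kelvin

air_flow = volumetric_flow(heat_w, delta_t, AIR)
water_flow = volumetric_flow(heat_w, delta_t, WATER)
ratio = air_flow / water_flow

print(f"air:   {air_flow * 1000:.1f} L/s")
print(f"water: {water_flow * 1000:.4f} L/s")
print(f"air needs about {ratio:.0f} times the volume flow of water")
```

Under these assumptions, air needs a volume flow a few thousand times larger than water for the same heat load and temperature rise, which is one way to see why liquid becomes attractive as chip power climbs.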
This heat dissipation packaging technology uses a server design with a triple liquid cooling cycle. It can eliminate all secondary and tertiary cooling systems in a data center: simply insert the self-immersed server into the cabinet, connect the water pipe and the non-conductive coolant pipe, and it is ready for use.
Heat Dissipation Method | Principle of Heat Dissipation | Advantages | Disadvantages |
Air Cooling | Heat generated by the CPU/GPU during operation is transferred to the heat sink and then exchanged with the surrounding air under the action of the fan. | Low price (the top dual-tower air cooler Leemin IBE, for example, currently costs only 600 yuan), high safety | Noisy, large size, limited cooling efficiency, dependent on chassis ventilation |
Water Cooling | CPU heat is absorbed by the liquid in the pipeline under the action of the water pump; the liquid is cooled and then recirculated. | Low noise, higher ultimate cooling capacity than air cooling, not dependent on chassis air ducts | High price (a 360 mm unit costs 800-1000 dollars), fully enclosed design makes maintenance difficult, risk of water leakage |
In terms of heat dissipation technology, current heat dissipation modules are based on hybrid active-passive heat dissipation built around heat pipe technology.
Currently, thermal modules are categorized into “air cooling” and “liquid cooling”:
Air-cooled heat dissipation uses air as the medium: heat passes through intermediate materials such as thermal interface materials, vapor chambers (VC), or heat pipes, and is then dissipated by convection between the heat sink or fan and the air.
Liquid-cooled heat dissipation, whether cold plate or immersion, relies mainly on liquid convection to cool the chip. As chips generate more heat while shrinking in size, their thermal design power (TDP) rises and air cooling gradually becomes insufficient.
Item | 2023E | 2024E | 2025E | 2026E | 2027E |
Total Cooling Requirement (kW) | 3.9E+05 | 6.0E+05 | 9.5E+05 | 1.5E+06 | 2.4E+06 |
Percentage of cold plate liquid cooling | 80% | 70% | 60% | 50% | 40% |
Percentage of immersion liquid cooling | 20% | 30% | 40% | 50% | 60% |
Cold Plate Liquid Cooling Unit Price ($/kW) | 967 | 870 | 783 | 705 | 644 |
Immersion Liquid Cooling Unit Price ($/kW) | 3455 | 3109 | 2798 | 2519 | 2268 |
Liquid Cooling Market Size (100 million $) | 6 | 9 | 15 | 23 | 37 |
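The market size row appears to follow from the rows above it. The source does not state the formula, so the sketch below assumes the simplest reading: total cooling requirement multiplied by the share-weighted unit price, with the 2025E requirement read as 9.5E+05. The products come out in the hundreds of millions, which suggests the last row is expressed in units of 100 million; that reading is also an assumption.

```python
years = ["2023E", "2024E", "2025E", "2026E", "2027E"]
total_kw = [3.9e5, 6.0e5, 9.5e5, 1.5e6, 2.4e6]   # total cooling requirement, kW
cold_plate_share = [0.80, 0.70, 0.60, 0.50, 0.40]
immersion_share = [0.20, 0.30, 0.40, 0.50, 0.60]
cold_plate_price = [967, 870, 783, 705, 644]      # per kW
immersion_price = [3455, 3109, 2798, 2519, 2268]  # per kW


def market_size(kw, cp_share, im_share, cp_price, im_price):
    """Assumed formula: requirement times the share-weighted unit price."""
    return kw * (cp_share * cp_price + im_share * im_price)


sizes = [
    market_size(*row)
    for row in zip(total_kw, cold_plate_share, immersion_share,
                   cold_plate_price, immersion_price)
]

for year, size in zip(years, sizes):
    # Divide by 1e8 to express the result in units of 100 million.
    print(f"{year}: {size / 1e8:.1f}")
```

Under this formula the first three years round to the table's values, while 2026E and 2027E come out slightly higher than the table, so the original estimate presumably used unrounded inputs or a slightly different method.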
Heat dissipation market evolves to liquid cooling + chip level
The core of a chip-level cooling system is a heat sink module composed of heat pipes and vapor chambers.
The chip cooling module works by conducting heat away from the chip through heat pipes, vapor chambers, and other thermally conductive materials, and then along the thermal path to the cooling fins.
The heat sink fins are made of pure copper, with a multi-fold structure and a large contact area with the air. Once heat reaches the fins, a fan is started for active heat dissipation.
The fan speed is adjusted automatically according to the amount of heat to be dissipated, completing the path from heat conduction to heat dissipation.
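Automatic fan speed adjustment is commonly implemented as a fan curve. The sketch below is a minimal illustration, assuming the controller uses chip temperature as a proxy for the heat to be dissipated and interpolates linearly between a few set points. The function name `fan_duty` and the set points are hypothetical and do not come from any particular product.

```python
# Hypothetical set points: (temperature in deg C, fan duty cycle in %).
DEFAULT_CURVE = ((40.0, 20.0), (60.0, 50.0), (80.0, 100.0))


def fan_duty(temp_c, curve=DEFAULT_CURVE):
    """Return the fan duty cycle (%) for a temperature by linear
    interpolation between the curve's set points."""
    # Below the first set point, hold the minimum duty.
    if temp_c <= curve[0][0]:
        return curve[0][1]
    # Above the last set point, hold the maximum duty.
    if temp_c >= curve[-1][0]:
        return curve[-1][1]
    # Otherwise interpolate within the segment that contains temp_c.
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


for temp in (30, 50, 70, 90):
    print(f"{temp} C -> {fan_duty(temp):.0f}% duty")
```

A piecewise-linear curve keeps the fan quiet at low load and ramps it smoothly as the chip warms, instead of switching abruptly between fixed speeds.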
Cold plate liquid cooling is currently highly mature and is the mainstream liquid cooling route; its current share is assumed to be 80%. By comprehensive estimate, AI large-model training and inference will bring 4 billion yuan of liquid cooling market space. As model parameters grow and usage spreads, the liquid cooling market is expected to see a compound annual growth rate above 60% over the next four years.
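The growth rate can be checked against the market size table. The sketch below applies the standard compound annual growth rate formula to the table's 2023E and 2027E market size entries (6 and 37, four years apart). Treating those two entries as the basis of the quoted rate is an assumption.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Market size entries from the table: 2023E = 6, 2027E = 37.
growth = cagr(6, 37, 4)
print(f"CAGR 2023E-2027E: {growth:.1%}")
```

With the rounded table values the rate comes out slightly under 60%, in the same range as the figure quoted in the text.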
According to the calculations, it is expected that the scale of the server cooling module can maintain a compound growth rate of nearly 30% until 2026.
Three main lines of investment opportunity in liquid cooling technology
① Professional temperature control manufacturers of the Huawei Electric-Emerson lineage: among the earliest to engage in precision air conditioning R&D and design, with many years of industry insight and forward-looking technology R&D, they have built a platform-based cooling portfolio that serves applications across multiple industries;
② Server manufacturers investing in liquid cooling technology: as cooling technology extends from the room level to the row level and even to the chip level inside the server, server manufacturers able to take part in liquid cooling solutions are expected to benefit sooner from computing power upgrades and to strengthen their product competitiveness;
③ Vendors that provide complete solutions including chip-level heat dissipation: the chip is the core heat source of the server, and as chip power increases, heat dissipation solutions are being upgraded down to the chip level inside the server.
Conclusion:
Behind the capital market's enthusiasm is the fact that heat dissipation is increasingly becoming the Achilles' heel that constrains performance upgrades in chips and other electronic products.
As an industry that grew up alongside computer science, heat dissipation module manufacturers have lived through many electronic information revolutions, but it is the current AI boom that seems to be truly driving the industry's development.
For this reason, the industry needs thermal management and heat dissipation technology to improve rapidly in order to keep pace with the continued iteration and upgrading of chips and other electronic information products.