The escalation of the U.S. ban puts comprehensive pressure on the development of China's AI chips; accelerating the construction of a domestic production and supply chain is now imperative.
On October 17, the Bureau of Industry and Security (BIS) of the US Department of Commerce updated its export control rules on advanced computing chips and semiconductor manufacturing equipment, revising and tightening the rules issued on October 7, 2022. The updated rule is still in its public comment period and will take effect in 30 days.
The new regulations restrict NVIDIA's chip sales to the Chinese market: stricter controls will be imposed on the NVIDIA A800 and H800, with a review within 25 days to determine whether a license is required to sell such chips to China. At the same time, 13 Chinese GPU companies, including Moore Threads and Biren Technology, were added to the Entity List.
NVIDIA responded that the new rules will not have an immediate, material impact on its revenue, but may harm its long-term development. NVIDIA had previously disclosed on its second-quarter earnings call that China accounted for 20-25% of its data center revenue.
For Biren Technology and Moore Threads, being placed on the Entity List means that, without a license, they can neither import American technology or products nor have their chips manufactured at fabs that rely on American technology or equipment.
01
Chip export control measures upgraded
Under the 2022 regulations, the United States prohibited the export of chips exceeding two thresholds: one for raw computing power and one for "interconnect bandwidth", that is, the chip-to-chip communication speed. Under the new regulations, the communication-speed criterion is replaced by "performance density", i.e. floating-point operations per square millimeter of die area, which prevents companies from engineering around the limits.
For example, to stay within US export restrictions, NVIDIA launched special-edition chips for the Chinese market, the A800/H800, which cap the interconnect speed (processing speed is roughly 70% of the A100/H100). Although they retain strong computing power, they take longer to train large AI models. Even so, the A800/H800 were still regarded as the best data center chips available in China for AI training and inference. Intel likewise launched Gaudi2, an AI processor tailored for the Chinese market.
The revised export controls will prohibit American companies from selling data center chips running at 300 TFLOPS or above to China. Chips running at 150-300 TFLOPS will also be banned if their performance density reaches or exceeds 370 GFLOPS per square millimeter. Chips in that speed range but with lower performance density fall into a "grey zone", meaning companies must notify the US government of their sales to China.
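To make the interaction of the two thresholds concrete, here is a minimal Python sketch based on the figures quoted in this article (not the regulation's actual text); the helper function and the example chips are hypothetical and purely illustrative.

```python
def classify_export_status(tflops: float, density_gflops_per_mm2: float) -> str:
    """Rough classification under the thresholds quoted in this article.

    tflops                 -- total processing performance, in TFLOPS
    density_gflops_per_mm2 -- performance density, in GFLOPS per mm^2
    """
    if tflops >= 300:
        return "prohibited (>= 300 TFLOPS)"
    if 150 <= tflops < 300 and density_gflops_per_mm2 >= 370:
        return "prohibited (150-300 TFLOPS and density >= 370 GFLOPS/mm^2)"
    if 150 <= tflops < 300:
        return "grey zone: sales to China must be reported to the US government"
    return "below the quoted thresholds"

# Hypothetical chips, invented only for illustration
print(classify_export_status(tflops=320, density_gflops_per_mm2=400))
print(classify_export_status(tflops=200, density_gflops_per_mm2=390))
print(classify_export_status(tflops=200, density_gflops_per_mm2=250))
```

The point of the density criterion is visible in the second case: a chip that stays under the raw 300 TFLOPS ceiling is still caught if its compute is packed densely enough onto the die.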
Some analysts noted that the new measures may also cover NVIDIA's flagship gaming graphics card, the RTX 4090. The latest news is that the US Department of Commerce has clarified the policy: the ban on the 4090 exempts consumer applications, meaning 4090 cards can still be sold at retail in the Chinese consumer market (including Hong Kong and Macau) but may not be used for commercial or production purposes.
To prevent companies from circumventing the restrictions through chiplet-based die stacking, the new measures also extend licensing requirements for advanced chips to more than 40 additional countries, so that AI chips such as the A100 and H100 series cannot reach China by re-export from other regions. In addition, licensing requirements for chip manufacturing equipment now apply to 21 countries beyond China, and the list of equipment barred from those countries has been expanded, limiting China's capacity to manufacture advanced chips below 14nm.
The new export control regulations also add controlled items in many sub-fields, including ASICs (application-specific integrated circuits), FPGAs (field-programmable gate arrays), SRAM-based compute-in-memory chips, chiplets, multiple-exposure (multi-patterning) lithography, and NPUs (neural network processors), all of which now fall within the US control scope.
If only peak performance were limited, chip companies could still launch lower-performance, lower-power parts for the Chinese market and link them into clusters over high-speed interconnects to reach the desired aggregate performance. The performance-density clause blocks this route. The new regulation closes the "loopholes" in the earlier restrictions, pulls in some AI chips that previously fell within the permitted parameters, and prevents Chinese companies from buying American AI chips through overseas subsidiaries.
While controlling chip exports, the United States will also hurt the performance of its own companies. China has long been the largest market for American chip companies outside the United States: in 2022, revenue from Chinese mainland (including Hong Kong, China) accounted for 21.4% of NVIDIA's, 27.2% of Intel's and 22.1% of AMD's totals.
It is worth noting that companies in Taiwan, China, such as ASUS, Gigabyte and MSI, still have large numbers of end customers in Chinese mainland. Counting Chinese mainland and Taiwan, China together, NVIDIA, Intel and AMD derived 47.3%, 40.3% and 32.1% of their revenue, respectively, from the Chinese market.
02
The demand for computing power in the AI era has grown exponentially
Since its release at the end of November 2022, ChatGPT, the artificial intelligence chatbot from American startup OpenAI, has taken off rapidly: it passed one million registered users in just five days and 100 million within two months, becoming the fastest-growing consumer application in history.
Large AI models, exemplified by ChatGPT, have opened a new wave of productivity innovation. Human-machine interaction is no longer limited to simple commands; machines can understand complex intent, which upends many of the formats of past internet development and has a profound impact on the real economy and industrial development. Microsoft founder Bill Gates rated GPT as the most important technological advance since the graphical user interface, and NVIDIA founder Jensen Huang called it the "iPhone moment" of artificial intelligence.
While everyone marvels at the astonishing "vitality" of super-large language models, AI, a concept that has existed for decades, has become the core variable in the development of human society over the coming decades. Behind AI lies the organic combination of computing power, data and algorithms.
In essence, the explosion of ChatGPT reflects a dramatic leap in the combined capabilities of software (data, algorithms) and hardware (computing power) in the AI field. With the wide adoption of cloud computing, and especially with deep learning becoming the mainstream approach to AI research and application, AI's demand for computing power has risen rapidly.
When we talk about the performance of AI chips, the first indicator that comes to mind is computing power. Computing power is the infrastructure underpinning algorithms and data: it refers to the amount of computation a computer system can complete, usually expressed in FLOPS (floating-point operations per second), the number of floating-point operations that can be performed each second. In the early decades of AI development, the required computing power grew roughly in line with Moore's Law, doubling about every 20 months.
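To make the FLOPS metric concrete, the hedged sketch below estimates the floating-point operations in a single matrix multiplication (roughly 2·m·k·n multiply-adds) and how long it would take at a hypothetical sustained throughput; the matrix sizes and the 100 TFLOPS figure are illustrative assumptions, not measurements of any real chip.

```python
def matmul_flops(m: int, k: int, n: int) -> float:
    """Approximate FLOPs for multiplying an (m x k) matrix by a (k x n) matrix:
    each output element needs k multiplies and k-1 adds, ~2*m*k*n in total."""
    return 2.0 * m * k * n

# One large layer-sized multiplication; sizes chosen only for illustration
flops = matmul_flops(4096, 4096, 4096)

# Time at a hypothetical sustained throughput of 100 TFLOPS (1e14 FLOP/s)
sustained_flops_per_s = 100e12
print(f"{flops:.3e} FLOPs, ~{flops / sustained_flops_per_s * 1e3:.3f} ms at 100 TFLOPS")
```

Training a large model chains billions of such multiplications, which is why the metric scales so quickly from gigaflops to exaflops.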
In the 1950s, the American mathematician Claude Shannon trained a robotic mouse named Theseus to navigate a maze and remember its path; Theseus was built on roughly 40 FLOPS of computation. In 2012, AlexNet (an image-recognition AI) marked the beginning of the deep learning era, and the doubling time for required computing power shortened dramatically to about six months. In 2015, the appearance of AlphaGo ushered in the era of large-scale AI models, whose computing needs exceeded those of all previous AI systems.
Compared with traditional AI algorithms, large models represent a huge jump in parameter scale, with parameter counts commonly reaching the hundreds of billions or even trillions. Take OpenAI's GPT series: the original GPT-1 had 117 million parameters, while GPT-3 reached 175 billion, with a correspondingly dramatic improvement in capability.
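Parameter counts of this size translate directly into hardware requirements. As a rough, hedged illustration, the sketch below estimates how much memory is needed just to hold a model's weights; the choice of 16-bit precision is an assumption, and training adds optimizer state and activations on top of this.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to store model weights, in GB.
    bytes_per_param=2 assumes FP16/BF16 storage; optimizer state and
    activations during training are not counted here."""
    return num_params * bytes_per_param / 1e9

# Parameter counts quoted in the article; the precision choice is an assumption
print(f"GPT-1 (117M params): ~{weight_memory_gb(117e6):.2f} GB of weights")
print(f"GPT-3 (175B params): ~{weight_memory_gb(175e9):.0f} GB of weights")
```

Even before any computation happens, a 175-billion-parameter model's weights alone far exceed the memory of any single accelerator, which is one reason large-model training demands clusters of high-end chips.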
The huge computing demand of AI models has driven the chip industry to where it is today. According to OpenAI, the amount of computation used in global AI training has grown exponentially since 2012, doubling every 3.43 months on average; to date it has expanded by a factor of roughly 300,000, far outpacing the growth of hardware computing power.
Narrowly defined, AI chips, also called AI accelerators or compute cards, are cores designed specifically to accelerate AI algorithms (other, non-compute tasks are still handled by the CPU); broadly defined, any chip used for AI computing can be called an AI chip. This has brought the hardware giants of the computing-power world back into the spotlight, and the computing value embedded in underlying hardware such as CPUs, GPUs, FPGAs and ASICs is being reassessed.
At present, the GPU is the main choice for AI computing power. The GPU was originally designed for graphics rendering, whose computations are highly parallel, and that parallelism makes GPUs well suited to large-scale data-parallel workloads such as machine learning and deep learning. GPU parallelism greatly improves computing efficiency and shortens the training and inference time of AI algorithms, making the GPU the core of computing power in the AI era.
Specifically, rendering boils down to computing the position and color of geometric points, both of which mathematically amount to multiplying four-dimensional vectors by transformation matrices. More than 80% of a GPU is therefore compute units, including tensor cores and matrix-multiply units, whereas only about 20% of a CPU is arithmetic units. As a result, GPUs can execute common machine learning and deep learning operations, such as convolution and matrix multiplication, faster and more efficiently than general-purpose compute units.
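The hedged NumPy sketch below shows why the rendering math described here and deep-learning math land on the same hardware: transforming a batch of vertices in homogeneous (4-D) coordinates is a matrix multiplication, the same primitive that fully connected layers (and, after lowering, convolutions) reduce to. The transformation matrix, vertices and weights are made up purely for illustration.

```python
import numpy as np

# A 4x4 homogeneous transformation: translate by (1, 2, 3); values are illustrative
transform = np.array([
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
])

# A small batch of vertices in homogeneous coordinates (x, y, z, 1), one per column
vertices = np.array([
    [0.0, 1.0, 2.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
])

# Vertex position calculation in rendering is a matrix-matrix product ...
transformed = transform @ vertices

# ... and so is a fully connected neural-network layer (weights @ activations),
# which is why hardware built for the former accelerates the latter.
weights = np.random.default_rng(0).standard_normal((4, 4))
layer_out = weights @ transformed

print(transformed)
print(layer_out.shape)
```

The only real difference is scale: a rendering pass multiplies millions of small vectors, while a training step multiplies a few enormous matrices, and the GPU's array of compute units handles both patterns well.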
Deep learning calculations also require large amounts of memory and high memory bandwidth to store and process massive data. Compared with other hardware such as CPUs, GPUs have higher memory bandwidth and larger on-board memory capacity, so data can be stored and processed more effectively, further improving computation speed.
Today, with the release of models such as the NVIDIA A100 and H100, GPUs hold a decisive computing-power advantage over other hardware. The GPU's role has gradually shifted from graphics processing to general computation; its performance in the deep learning training stage is unmatched, making it the hardware best suited to supporting AI training, applied in data center acceleration and some intelligent-terminal fields.
According to JPR statistics, NVIDIA's share of the GPU market reached 84% in the first quarter of 2023, making it the market leader. NVIDIA first proposed the GPU concept in 1999, launched the CUDA computing platform in 2006, and in May 2023 released the DGX GH200 supercomputer, whose 1 exaFLOPS of computing power supports the training of trillion-parameter AI models and provides linear scalability for giant AI models. On the strength of its GPUs and the CUDA ecosystem it has built, NVIDIA has become the key supplier of AI computing power.
- As for the CPU: a GPU cannot work alone and must be directed by a CPU. A CPU can also serve as an AI chip by itself, handling complex logic and varied data types, but although this is flexible it is wasteful; when large volumes of uniformly typed data need processing, the CPU calls on the GPU for parallel computation (a minimal sketch of this division of labour appears after this comparison).
- FPGAs, by contrast, offer low power consumption, high performance and programmability. Compared with CPUs and GPUs they have clear performance or energy-efficiency advantages, but they demand more of the developer. FPGAs can effectively cut R&D and debugging costs, improve responsiveness to the market and enable differentiated products. Technology giants have built cloud-computing-plus-FPGA platforms, and as the FPGA developer ecosystem matures and more programming languages become usable, FPGA applications will broaden further.
- ASICs can be optimized at the hardware level for a specific workload, yielding a better performance-to-power ratio. However, designing and manufacturing an ASIC requires heavy investment and long R&D and engineering cycles, while deep learning algorithms are still evolving rapidly; if an algorithm changes significantly, an FPGA can quickly adapt its architecture, whereas an ASIC, once taped out, is hard to modify.
- AI chips can also take the form of NPUs, which have risen rapidly in recent years. For the same die area, an NPU can deliver dozens of times the AI performance of a GPU. NVIDIA has packed many Tensor Cores into its recent GPU generations, but using an NPU means a vendor need not depend entirely on NVIDIA's CUDA ecosystem, so companies from Intel and AMD to Chinese players have taken a share.
In terms of raw AI performance, Huawei's NPUs are not inferior to NVIDIA's products; what they currently lack is the "ecosystem". An ecosystem only forms when enough developers participate, and the migration is "painful", meaning large amounts of code must be refactored.
Overall, the GPU is currently the most mature and widely used general-purpose AI computing chip on the market and will retain its lead in the short term. While algorithms and applications remain at a relatively early stage, the GPU's strong computing power, low development cost and general applicability will keep it holding the main share of the AI chip market.
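As a minimal sketch of the CPU-GPU division of labour referred to in the comparison above, the Python snippet below uses PyTorch (assumed to be installed; matrix sizes are illustrative) to show the CPU-side program orchestrating the work while the heavy parallel multiplication is dispatched to a GPU when one is present.

```python
import torch

# The CPU-side Python process orchestrates the work; the massively parallel
# matrix multiplication itself is dispatched to the GPU when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

c = a @ b                # executed by thousands of parallel threads on "cuda"
checksum = c.sum().item()  # a small scalar copied back for CPU-side logic

print(f"ran on {device}, checksum {checksum:.3f}")
```

The same pattern, serial control logic on the CPU and bulk data-parallel math on the accelerator, applies whether the accelerator is a GPU, an FPGA card or an NPU; what differs is the software ecosystem that makes the dispatch this easy, which is where CUDA's advantage lies.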
From the "all-round no-dead-angle" upgrade ban in the United States, we can see that behind the contemporary chip war is a broader and more far-reaching strategic AI war, and the competition for computing power will become more and more fierce, which will determine the map of scientific and technological strength in the next few decades. A seemingly impenetrable network seems to not only kill the domestic high-end computing chip industry, but also block China’s advanced road to AI revolution.
In the short term, domestic AI chips cannot use advanced foreign EDA tools and IP at the design stage, nor be manufactured on advanced process nodes, so they face rising power consumption and falling energy efficiency, and their competitiveness has declined. But the decline is only temporary. In the long run, the only way out for domestic AI chip companies is to face the difficulties head-on, unite, work closely with the supply chain, and build a sound, complete supply chain that is not constrained by the United States.
Experts suggest that, on the one hand, effort should go into EDA tools and IP to shore up weaknesses, build strength and accelerate localization; on the other hand, effort should go into establishing unrestricted advanced-process foundry lines, tackling key problems from tape-out through manufacturing, and building a sound, complete industrial chain. At the same time, the barriers between industrial application, chip development, system development and university research should be broken down to form cross-enterprise, cross-field and cross-industry cooperation, promoting all-round development of the chip industry.
In recent years China's semiconductor industry has crawled forward under fire, and today the "gunfire" has escalated to the level of an "atomic bomb". Fortunately, against the backdrop of escalating sanctions and a computing-power shortage, domestic AI and GPU companies are pressing ahead. Huawei, Alibaba, Baidu and Tencent have all been developing their own AI chips. AI chip companies include the listed Cambricon, Jingjia Micro and Hygon Information, as well as firms such as Suiyuan (Enflame), Hanbo, Muxi, Biren, Moore Threads and Tianshu Zhixin (Iluvatar CoreX), which have achieved fruitful results in architecture innovation, design, foundry cooperation and application. The data show that China's local AI chip makers are in a stage of rapid growth: in the first half of 2023, shipments of AI accelerator chips in China exceeded 500,000 units. By technology, GPU cards held roughly 90% of the market; by brand, China's local AI chip makers shipped more than 50,000 units, about 10% of the total.
After the string of bans, some American companies worried about "stimulating the development of an ecosystem led by China". Those "worries" are likely to come true before long: the reaction will be as strong as the sanctions are intense, China's AI chip industrial chain will break through across the board, and its supply chain will become self-reliant and controllable.
Source: EEPW