Tengyan Interview | Academician Wu Hequan: The advantages, challenges and innovation paths of China's large-scale model development
As ChatGPT set off a global craze, attention to the large AI models behind it surged. Everyone wants to know: by what dimensions and standards should the level of a large model be judged?
The launch of ChatGPT has revealed the gap between China and the United States in AIGC. So what is the current state of China's large-model development? What opportunities and challenges lie ahead?
We are in a critical period for the development of general artificial intelligence. With many institutions independently developing large models, how can computing power be used more efficiently and low-level duplication be avoided?
Some in the industry worry that AI will destroy humanity. Is this concern unfounded? How can we prevent problems before they arise and make AI's results predictable and its behavior controllable?
With these questions about AIGC, **Tencent Research Institute exclusively interviewed Wu Hequan, an academician of the Chinese Academy of Engineering and a leading expert in China's communications field.**
[Interviewers]
Niu Fulian, Senior Researcher, Tencent Research Institute
Wu Chunling, Senior Researcher, Tencent Research Institute
Wang Qiang, Senior Expert, Tencent Research Institute
(hereinafter referred to as T)
China's total computing power versus the United States: a gap, but not a large one
**T: Some say China's large-model development is 1-2 years behind other countries. How do you view the current state of China's large models?**
**Wu Hequan:** China started later than the United States in developing large models. After ChatGPT came out, many domestic organizations announced that they were developing generative large models. In the United States, only a few companies such as Microsoft and Google are engaged in comparable large-model research, so China has more organizations developing large models than the United States does; but a large number of research teams does not mean a high level of large-model R&D. One domestic large model is said to have as many as 1.75 trillion parameters, surpassing GPT-4, yet there are no reports of its application. **Although some Chinese companies claim to have launched ChatGPT-like chatbots, they currently fall short of ChatGPT in multilingual support, and even in Chinese dialogue there is still a gap in response speed.**
**We have only now noticed ChatGPT, which targets generative tasks and mainly performs language generation such as chatting and writing. Google's BERT model focuses more on judgment and decision-making, emphasizing language-understanding tasks such as question answering and semantic relationship extraction; BERT's technology also deserves our attention.** Evaluation of large models should be multi-dimensional, covering comprehensiveness, rationality, ease of use, response speed, cost, energy efficiency, and so on. **Overall, the basis for the claim that China's large models lag foreign ones by 1-2 years is still unclear, and it is not meaningful to draw that conclusion now.**
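The generative-versus-understanding distinction mentioned here is, at bottom, architectural. A toy sketch (plain Python, no ML libraries; the function names are mine) of the attention-mask difference between GPT-style and BERT-style models:

```python
# GPT-style (generative) models use a causal attention mask: each token
# may attend only to itself and earlier tokens, which is what makes
# left-to-right text generation possible.  BERT-style (understanding)
# models use a full, bidirectional mask: every token sees the whole
# sentence, which suits judgment tasks such as question answering and
# semantic relationship extraction.

def causal_mask(n: int) -> list[list[int]]:
    """1 where token i may attend to token j (GPT-style: j <= i)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def bidirectional_mask(n: int) -> list[list[int]]:
    """BERT-style: every token may attend to every token."""
    return [[1] * n for _ in range(n)]

print(causal_mask(3))        # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(bidirectional_mask(3)) # [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

The lower-triangular shape of the causal mask is exactly why a GPT-style model can only extend text forward, while the all-ones mask lets a BERT-style model weigh the full context of a sentence when making a judgment.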
Chinese companies have natural advantages over foreign companies in obtaining Chinese corpora and in understanding Chinese culture. **China has the most complete range of manufacturing categories and favorable conditions for training AIGC for real industries. In computing power, China already has a good foundation.** According to OpenAI, training the GPT-3 model required as much as 3.64 EFlops-days of computing, equivalent to 3-4 Pengcheng Cloud Brain II systems running for a day (Pengcheng Cloud Brain II delivers 1 EFlops, i.e., 10^18 floating-point operations per second). **By end-2022 data, the United States accounts for 36% of global computing power and China for 31%; in intelligent computing power, China is significantly ahead of the United States (by end-2021 data, the United States held 15% of global intelligent computing capacity and China 26%). In China, not only do large Internet companies possess considerable computing power, but national laboratories and laboratories supported by some city governments also hold large-scale computing resources. It is fair to say that China can provide the computing power needed to train large models.** It is understood that Pengcheng Lab is designing Pengcheng Cloud Brain III with 16 EFlops of computing power, more than three times the computing power used to train GPT-3; it is expected to cost 6 billion yuan and will continue to provide strong computing-power support for AI training.
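The compute comparison above can be sanity-checked with simple arithmetic. A minimal sketch (the 3.64 EFlops-day training cost for GPT-3 and the 1 EFlops rating of Pengcheng Cloud Brain II are the figures quoted in the interview, not independently verified here):

```python
import math

EFLOPS = 1e18            # 10**18 floating-point operations per second
SECONDS_PER_DAY = 86_400

# Total training compute for GPT-3 as cited in the interview:
# 3.64 EFlops-days, i.e. a 1-EFlops machine running for 3.64 days.
gpt3_train_flop = 3.64 * EFLOPS * SECONDS_PER_DAY

# Pengcheng Cloud Brain II is rated at 1 EFlops sustained.
cloud_brain_ii = 1.0 * EFLOPS

# Wall-clock days needed on a single Cloud Brain II...
days_needed = gpt3_train_flop / (cloud_brain_ii * SECONDS_PER_DAY)

# ...or, equivalently, the number of such machines needed to finish
# within one day -- recovering the "3-4 Cloud Brain II" figure.
machines_for_one_day = math.ceil(days_needed)

print(f"{days_needed:.2f} days on one machine")  # 3.64 days on one machine
print(f"{machines_for_one_day} machines to finish in a day")
```

The same unit-analysis explains the later C2NET figure: 3 EFlops of aggregated capacity is nearly the compute budget of a GPT-3 training run delivered every day.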
China's AIGC research and development: recognize the gap, face the challenges, and innovate
**T: Besides the good foundation in computing power, what challenges do you think China faces in building large models?**
Wu Hequan: Computing power alone is not enough. We still face many challenges in the following aspects:
**First, the foundation of large models is the deep-learning framework. TensorFlow and PyTorch in the United States have been cultivating their framework ecosystems for many years. Domestic companies have developed their own deep-learning frameworks, but these have not been sufficiently tested by the market, and their ecosystems still need to be built.**
**Second, extending AIGC to industrial applications may require more than one large model, and efficiently integrating multiple large models poses challenges in standardization and data fusion.**
Third, large models require massive training data. China has thousands of years of civilization, but most of its rich cultural heritage has not been digitized; Chinese accounts for less than 0.1% of the corpus used to train ChatGPT. Although Chinese Internet companies hold large amounts of e-commerce, social, and search data, each company's data types are incomplete, and the credibility of online knowledge is not strictly guaranteed. A great deal of mining work remains before Chinese corpora can be used for training.
Fourth, large-model training relies on GPU chips, typified by Nvidia's A100, but the United States has restricted exports of such chips to China, and the performance of domestic GPUs still needs further testing; a gap in efficiency remains.
Fifth, China has no shortage of technicians engaged in AI research, but it still lacks talent with architecture-design capability and prompt engineers for AIGC data training. Before ChatGPT appeared, some believed that China's AI papers and patents were on par with the United States'. **The launch of ChatGPT showed us the gap between China and the United States in AIGC. We now need to clearly understand and confront the challenges we face, pursue real innovation, turn challenges into opportunities, and make China's contribution in the new round of the AI race.**
Recommendation: open national computing-power platforms to support the training of various large models
**T: ChatGPT is undoubtedly a huge innovation. How should China encourage innovations like this in the future, and in what areas should it act?**
**Wu Hequan:** The move of artificial intelligence from discriminative to generative is a milestone innovation, and it has begun to enter the track of general artificial intelligence. From GPT-3 to GPT-4, input has expanded from text alone to partial graphics; that is, the ability to understand images has been added. On that basis, a deep-learning architecture and a general model supporting multimodal data input are not far off, but the task generalization of large models and the refinement of their on-demand invocation still require greater investment and innovation. Unlabeled, unsupervised learning on images and video is much harder than on language and text.
We are now in a critical period of development toward general artificial intelligence. For China, this is a rare opportunity for leapfrog development and also a severe challenge. Computing power, models, and data were the necessary conditions for ChatGPT's success and will likewise be essential for general artificial intelligence; beyond them, an innovation ecosystem, mechanisms, and talent are key. China is comparable to the United States in total computing power, but coordinating computing power across data centers still faces institutional obstacles, and utilization and efficiency at many intelligent computing centers are low. **With many organizations researching large models independently, low-level duplication is inevitable.** It is recommended that a joint effort with a reasonable division of labor be formed under the coordination of national science-and-technology and industrial programs, and that the computing platforms of national laboratories be opened to support the training of various large models. **For example, Pengcheng Cloud Brain has made three quarters of its capacity available, enough to support an open-source Chinese pretrained large language model at the 200-billion-parameter scale, comparable to GPT-3. It is also recommended to form a computing-power alliance that pools existing high-end GPU resources to provide the computing power needed for large-model data training.** At present, the "China Computing Power Network (C2NET)", built mainly by Pengcheng Laboratory, has connected more than 20 large intelligent-computing, supercomputing, and data centers, aggregating 3 EFlops of heterogeneous computing power, of which self-developed AI computing power exceeds 1.8 EFlops.
In addition, chatbots are only an intuitive way to train and test AIGC; chatting itself is not an essential need. **Various industry-facing models should be developed on top of large models so that large models can deliver value in industry as soon as possible, and more application talent should be cultivated across all walks of life.**
Industrial application of large models requires versatile talent who understand both industry technology and AI training
**T: So far we have seen ChatGPT applied in areas such as chatbots, text generation, and speech recognition. Will there be opportunities in physical industries in the future? What obstacles remain for applying large models in the physical industry?**
**Wu Hequan:** With supplementary training on industry and enterprise knowledge, existing ChatGPT-style chatbots can take on intelligent customer service in enterprises, replacing workers in pre-sales and after-sales service. In design and manufacturing processes that require software programming, ChatGPT can take over programming tasks and check for software bugs, and it can collect, translate, and organize the documents and materials needed in design and production. After professional training, AIGC-style large models can be used to design EDA software, such as tool software for IC design. In animation and game companies, robots trained on AIGC-style large models can write scripts, create and program game scenarios from prompts, and render 3D animation.
However, ChatGPT is not a universal model and is difficult to apply directly to manufacturing processes in the real economy. Still, by following the principles used to train ChatGPT and training in depth on industry and enterprise knowledge graphs, it is possible to develop enterprise-specific large models for such work. The challenge is the need for talent who are familiar both with an enterprise's production processes and key-link technologies and with the big-data training techniques of artificial intelligence.
From focusing on results to focusing on the process: let technology and the legal system together govern AIGC's reasoning
**T: ChatGPT will also make various mistakes and bring ethical, security, and privacy issues. When applying large models in the future, how can we create an inclusive, safe environment for development?**
**Wu Hequan:** The emergence of generative AI has pushed society's attention to artificial intelligence to an unprecedented height. While it has triggered a surge of AI research in scientific and industrial circles, many experts worry that artificial intelligence will destroy humanity and have called for a halt to research on GPT-5. These concerns are not unfounded, because the thinking process of ChatGPT-style robots is currently opaque. Humans created ChatGPT, but humans do not yet fully grasp its reasoning process. What is unknowable may be uncontrollable, with risks of robot malfunction, ethical anomie, and behavior running out of control.
**The solution is not to stop research on artificial intelligence, but to shift AIGC research from focusing on results to designing and guiding its reasoning process, so that results are predictable and behavior is controllable.** Future deployment of large models should require a safety-and-trustworthiness evaluation by a qualified institution, with the model's reasoning process traceable and auditable. At the same time, corresponding AI governance laws and regulations must be established to prevent misleading AIGC training, to hold AIGC training entities accountable, and to severely punish those who use AI to abet crime. Through the complementarity of technology and the legal system, artificial intelligence can become a truly loyal assistant to humanity.