A 6,000-word deep dive into the Pangu large model: can it become the other pole of the world's AI?

Huawei has shown off its "muscles" in the field of large models.

On July 7, the Huawei Developer Conference 2023 (HDC 2023) opened. In an afternoon keynote lasting more than two hours, Huawei Cloud disclosed the progress of the Pangu large model in detail for the first time. It not only released the industry-oriented Pangu Large Model 3.0, but also described in detail the underlying technical capabilities Huawei relies on to develop large models.

Pangu Large Model 3.0 has a "5+N+X" three-tier structure: five foundation models at the L0 layer, N industry models at the L1 layer, and, at the L2 layer, finer-grained scenario models that users can train on their own. It adopts a fully layered, decoupled design, so enterprise users can choose to develop, upgrade, or fine-tune a suitable large model according to their own business needs, adapting to the changing requirements of thousands of industries.

Hu Houkun, Huawei's rotating chairman, said at the recent World Artificial Intelligence Conference (WAIC) that the core of Huawei's large model development is a focus on computing power and applications. On the one hand, Huawei is investing deeply in computing power to build a strong computing base that supports the development of China's artificial intelligence industry. On the other hand, it is moving from general-purpose large models to industry large models, so that artificial intelligence can truly serve thousands of industries and scientific research.

Huawei was one of the earliest cloud service providers in China to invest in large models, releasing the Pangu large model as early as 2021. Along the way, Huawei has built, from the bottom layer up, an AI computing cloud platform based on Kunpeng and Ascend, along with technical capabilities such as the heterogeneous computing architecture CANN, the all-scenario AI framework MindSpore, and the AI development pipeline ModelArts.

In addition to the large models and the computing base, Huawei Cloud also highlighted typical cases that combine the Pangu large model with specific industries, including government affairs, meteorology, railways, manufacturing, and finance, as well as cases in which Pangu has upgraded and reshaped a number of Huawei Cloud software products and services.

Whether in basic technical capabilities, its AI + cloud product and service system, or application cases in specific industries, Huawei Cloud demonstrated highly mature and systematic business capabilities, which came as a real surprise to the industry. While everyone is still arguing about who will be China's OpenAI, Huawei Cloud has already opened up a fairly mature development path for large models.

Huawei is using its own practice to show that large models matter, but that what matters more is using them to solve the pain points of industries and products, to build products and services that enterprises and users are willing to pay for, and to create real value for thousands of industries.

01 Pangu Large Model 3.0: Layered Decoupling Architecture

Decoupling is the keyword of the Pangu Large Model 3.0 released today. It is also a common demand among industry customers who have actually been calling large models over the past few months.

When a leading SaaS vendor released its own application upgraded with large models, it said: "We do not develop large models ourselves. Instead, in each business scenario, we use whichever large model is best at that task." To be able to switch between different large models, "our own product architecture must be independent of the underlying large model, or only loosely coupled to it."
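The loose coupling the vendor describes can be pictured with a small sketch. The code below is illustrative only: `TextModel`, `EchoModel`, `UppercaseModel`, and `ModelRouter` are hypothetical names invented for this example, not part of any vendor's or Huawei's API. It assumes the product code depends only on a narrow interface, so the model behind each business scenario can be swapped without touching the rest of the product.

```python
from typing import Dict, Protocol


class TextModel(Protocol):
    """Hypothetical minimal interface the product code depends on."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for one provider's large model (hypothetical)."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UppercaseModel:
    """Stand-in for a different provider's large model (hypothetical)."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


class ModelRouter:
    """Maps each business scenario to whichever model is registered for it."""

    def __init__(self) -> None:
        self._models: Dict[str, TextModel] = {}

    def register(self, scenario: str, model: TextModel) -> None:
        # Re-registering a scenario swaps the model without changing callers.
        self._models[scenario] = model

    def run(self, scenario: str, prompt: str) -> str:
        return self._models[scenario].generate(prompt)


router = ModelRouter()
router.register("support", EchoModel())
router.register("headline", UppercaseModel())
```

Calling `router.register("support", UppercaseModel())` later replaces the model behind the "support" scenario, while every caller of `router.run` stays unchanged.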

"The decoupling design of Pangu's large model is for the sake of the industry." At the Huawei Developer Conference, Zhang Pingan, Huawei's executive director and CEO of Huawei Cloud, gave the differentiated route of Pangu's large model. Its core is to decouple the various layers and capabilities of the Pangu model, allowing industry users to develop according to their own needs.

Specifically, Pangu Large Model 3.0 is an industry-oriented series of large models with a "5+N+X" three-tier architecture:

The "5" represents the five foundation models of the L0 layer: natural language, vision, multimodal, forecasting, and scientific computing large models, which provide the various skills needed in industry scenarios.

Pangu 3.0 offers customers a series of foundation models with 10 billion, 38 billion, 71 billion, and 100 billion parameters, matching customers' diverse needs across different scenarios, latency requirements, and response speeds. It also provides a new set of capabilities, including knowledge Q&A, copy generation, and code generation for the NLP large model, and image generation and image understanding for the multimodal large model. These skills can be used directly by customers and partner companies. Whatever the size of the model, Pangu provides a consistent set of capabilities.

The "N" in the "5+N+X" three-tier structure represents the N industry large models at the L1 layer. Industry large models are provided in two ways. On the one hand, Huawei Cloud offers general industry large models trained on public industry data, including models for government affairs, finance, manufacturing, mining, and meteorology. On the other hand, it can use an industry customer's own data to train a proprietary large model for that customer on top of the L0 and L1 layers of the Pangu large model.

Zhang Pingan said: "Pangu was born to serve industries and offers a variety of forms for deploying, developing, and running inference on large models. Customers only need to feed in their own private data to produce their own industry large model, just as Huawei does with Pangu." Moreover, the training data is also decoupled from the large model.

The "X" in "5+N+X" refers to the L2 layer, which provides customers with more fine-grained scenario models. It focuses on specific industry applications or business scenarios, such as government hotlines, network assistants, lead drug screening, foreign object detection on conveyor belts, and typhoon path forecasting, and offers customers "out-of-the-box" model services.

Through the three-tier "5+N+X" architecture, Huawei Cloud has built its own large model base.
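As a rough way to picture the three tiers, the sketch below models them as plain data. It is an assumption-laden illustration, not Huawei's implementation: `IndustryModel` and `shape` are hypothetical names, and the pairing of scenarios with industries is chosen for the example because the article lists both but does not say which scenario belongs to which industry.

```python
from dataclasses import dataclass, field
from typing import List

# L0: the five foundation models named in the article.
L0_MODELS = [
    "natural language",
    "vision",
    "multimodal",
    "forecasting",
    "scientific computing",
]


@dataclass
class IndustryModel:
    """L1: one of the N industry models, holding its L2 scenario models."""

    name: str
    scenes: List[str] = field(default_factory=list)


def shape(industries: List[IndustryModel]) -> str:
    """Return the layer counts in the article's '5+N+X' notation."""
    n = len(industries)
    x = sum(len(industry.scenes) for industry in industries)
    return f"{len(L0_MODELS)}+{n}+{x}"


# Illustrative grouping only (an assumption, see the note above).
industries = [
    IndustryModel("government affairs", ["government hotline"]),
    IndustryModel("meteorology", ["typhoon path forecasting"]),
]
summary = shape(industries)  # "5+2+2"
```

Because each layer is a separate piece of data, adding a scenario model changes only X and leaves the L0 and L1 entries untouched, which is the point of the layered, decoupled design.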

At yesterday's World Artificial Intelligence Conference, Hu Houkun, Huawei's rotating chairman, put it vividly: "The most basic layer corresponds to the general-purpose large model, which we call the foundation model. We describe this layer as 'reading thousands of books': learning a large amount of basic knowledge well. On top of this layer we build industry models and scenario models, which we describe as 'traveling thousands of miles.' There are still many challenges to overcome in going from reading thousands of books to traveling thousands of miles. The key is that Huawei is working with partners from various industries to fully match and integrate industry knowledge with large models."

In addition, innovation in large models is not only innovation in the model itself; it also depends on innovation in the various root technologies of AI. At the meeting, Yao Jun, director of Huawei's Noah's Ark Lab, introduced the technical foundation of the Pangu model.

At the bottom layer, Huawei has built an AI computing cloud platform based on Kunpeng and Ascend, together with the heterogeneous computing architecture CANN, the all-scenario AI framework MindSpore, and the AI development pipeline ModelArts. Together they provide key capabilities for developing and running large models, such as distributed parallel acceleration, operator and compilation optimization, and cluster-level communication optimization. Based on Huawei's AI root technologies, large model training performance can be tuned to 1.1 times that of mainstream GPUs in the industry.

Computing power is the basis for training large models. At the conference, Zhang Pingan announced that the Ascend AI cloud service, with 2,000 PFLOPS of computing power in a single cluster, would launch simultaneously in Huawei Cloud's Ulanqab and Gui'an AI computing centers. In addition to Huawei's all-scenario AI framework MindSpore, the Ascend AI cloud service also supports mainstream AI frameworks such as PyTorch and TensorFlow.

In addition, 90% of the operators in these frameworks can be migrated smoothly to the Ascend platform with Huawei's end-to-end migration tools. For example, Meitu migrated 70 models to Ascend in just 30 days. Huawei Cloud and the Meitu team also jointly optimized more than 30 operators and added parallel acceleration to the pipeline, improving AI performance by 30% over the original solution.

In addition, GPU failures are common during large model training, forcing developers to restart training frequently, which is time-consuming and costly. The Ascend AI cloud service provides a more stable AI computing service: the long-term stability rate for 30-day training on a thousand-card cluster reaches 90%, and recovery from a checkpoint takes no more than 10 minutes.
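The idea behind checkpoint recovery can be shown with a toy example. This is a generic sketch, not the mechanism the Ascend AI cloud service uses: the `train` function, the JSON checkpoint file, and the running sum that stands in for model weights are all assumptions made for illustration. It shows why a job that saves its progress periodically can resume from the last checkpoint instead of starting over.

```python
import json
import os
import tempfile


def train(total_steps, ckpt_path, save_every=100, fail_at=None):
    """Toy training loop; 'state' is a running sum standing in for weights."""
    step, state = 0, 0
    # Resume from the last checkpoint if one exists.
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            saved = json.load(f)
        step, state = saved["step"], saved["state"]
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated hardware failure")
        state += step  # one "training step"
        step += 1
        if step % save_every == 0:
            with open(ckpt_path, "w") as f:
                json.dump({"step": step, "state": state}, f)
    return state


ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
try:
    train(1000, ckpt, fail_at=250)  # crashes after the step-200 checkpoint
except RuntimeError:
    pass
resumed = train(1000, ckpt)  # restarts from step 200, not from step 0
uninterrupted = sum(range(1000))
```

In this toy run the resumed job repeats only the 50 steps done after the last checkpoint, and it ends in the same state as a run that never failed.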

02 Empower thousands of industries

Ren Zhengfei previously said: "The direct contribution of artificial intelligence software platform companies to human society may be less than 2%; the other 98% comes from advancing industrial and agricultural society. But the application platform is not the option we have chosen. We will build the underlying AI computing platform."

Bringing large models into thousands of industries has become the focus of Huawei's large model development. At the meeting, Huawei Cloud presented application cases of the Pangu large model in seven fields, including government affairs, railways, meteorology, and finance.

Government Affairs

In government affairs, Huawei Cloud and the Government Service Data Management Bureau of Shenzhen's Futian District have launched Xiaofu, a smart government affairs assistant for Futian built on the Pangu government affairs large model. It can accurately understand the intent behind residents' inquiries and changes the traditional one-stop service model. Fine-tuned on more than 200,000 pieces of government affairs data, including 12345 hotline records, policy documents, and government affairs encyclopedia entries, the assistant has mastered a wealth of industry knowledge such as laws, regulations, and handling procedures.

According to Huawei Cloud, the core of the Pangu government affairs large model is cognitive ability: it lets the urban public system both see and understand, closing the loop from perception to cognition to handling. Depending on the scenario, it provides different capabilities such as question answering, copy generation, video perception, and multimodal understanding.

Huawei Cloud presented two typical scenarios. The first is a consultation scenario: an enterprise user asks the government affairs assistant about relevant investment support policies, and the assistant introduces the applicable regulations and policies and offers appropriate suggestions. The second, shown in the figure above, is a case-handling scenario based on dialogue and multimodal capabilities: working from pictures taken by cameras, staff can have the violations shown in them analyzed intelligently.

Railway

In the railway field, Huawei demonstrated a freight car inspection assistant.

Traditionally, train inspectors have to examine millions of train pictures every day to check whether the freight cars running on the railway network have faults. With the Pangu large model, the system can accurately identify 67 types of freight cars running on the live network and more than 430 types of faults, and it screens out as many as 95% of the fault-free pictures. In other words, train inspectors only need to examine 1/20 of the pictures they used to, which is equivalent to a 20-fold increase in work efficiency.
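The step from a 95% screening rate to a 20-fold gain is simple arithmetic, worked through in the sketch below. It follows the article's own simplification that the 95% applies to essentially all pictures and that the time spent per remaining picture is unchanged; both are assumptions, and the function names are made up for this example.

```python
from fractions import Fraction


def remaining_share(screened_percent: int) -> Fraction:
    """Share of pictures still left for human inspectors."""
    return Fraction(100 - screened_percent, 100)


def efficiency_gain(screened_percent: int) -> Fraction:
    """How many times fewer pictures each inspector has to examine."""
    return 1 / remaining_share(screened_percent)


share = remaining_share(95)  # Fraction(1, 20)
gain = efficiency_gain(95)   # Fraction(20, 1)
```

Exact fractions are used instead of floats so that the result is exactly 20 rather than a rounding artifact such as 19.99999999999998.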

Coal Mine

In coal mining, the Pangu mining large model has been put to use in 8 mines across the country. A single large model can cover more than 1,000 fine-grained scenarios in coal mine business processes such as mining, excavation, machinery, transportation, ventilation, and coal washing, allowing more coal miners to work above ground. This not only makes miners' working environment more comfortable but also greatly reduces safety accidents.

Meteorology

Meteorology was the highlight of the Huawei Cloud event. Just a few days earlier, the research results of the Pangu meteorological large model were published in the leading international journal Nature, and the reviewers commented that the work prompts a re-examination of the future of weather forecasting.

Previously, predicting a typhoon's path over the next 10 days required 5 hours of simulation on a high-performance computing cluster of 3,000 servers. Now, with the pre-trained Pangu meteorological large model and AI inference, researchers need only a single card on a single server to obtain more accurate predictions within 10 seconds.
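To put those figures in proportion, the sketch below works out the implied ratios. It uses only the numbers quoted above; because "within 10 seconds" is an upper bound, the results are lower bounds, and comparing 3,000 servers against one card on one server is a rough assumption rather than a like-for-like hardware comparison.

```python
TRADITIONAL_HOURS = 5        # simulation time on the HPC cluster
TRADITIONAL_SERVERS = 3000   # servers in that cluster
AI_SECONDS = 10              # upper bound quoted for AI inference
AI_SERVERS = 1               # a single card on a single server

traditional_seconds = TRADITIONAL_HOURS * 3600  # 18000

# How many times faster the forecast arrives (wall-clock time).
wall_clock_speedup = traditional_seconds // AI_SECONDS  # 1800

# Ratio of total server-seconds consumed by each approach.
server_seconds_ratio = (traditional_seconds * TRADITIONAL_SERVERS) // (
    AI_SECONDS * AI_SERVERS
)  # 5400000
```

On these numbers the forecast arrives at least 1,800 times sooner, while using several million times fewer server-seconds.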

At present, the Pangu meteorological large model can be used to forecast ocean waves, high temperatures, typhoons, cold waves, and other weather. Compared with traditional weather forecasting, it is faster and more accurate. Previously, Pangu worked with the Meteorological Bureau to predict the path of Typhoon "Mawar" 10 days in advance. Pangu also predicted the arrival of the cold wave in Finland two days in advance; compared with the forecast of the European Meteorological Agency, Pangu's forecast was also closer to the actual temperature.

Finance

In finance, the Pangu large model has worked with ICBC on a series of exploratory applications.

One typical scenario is improving the efficiency of bank tellers. ICBC has tens of thousands of branches across the country and 200,000 branch tellers, who must switch between many different services, which wastes a lot of time.

The Pangu financial large model is pre-trained on a wide range of bank operations, policies, and case documents. It can automatically generate procedures and operating guidance for counter staff based on a customer's question, reducing a workflow that originally took an average of 5 operations to 1 and shortening the average handling time by more than 5 minutes.

And this is only the most basic application. Huawei is exploring with the financial industry how to apply the large model to more financial scenarios, such as credit analysis, in the future.

Manufacturing

Huawei is itself a manufacturing company. Its hardware products span communication base stations, mobile phones, automobiles, chips, and other fields. Drawing on this accumulated experience, Huawei has brought the Pangu large model into production and manufacturing.

In the past, drawing up a one-day device allocation plan for a single production line often took more than 3 hours. After learning the device data, business processes, and rules of Huawei's production lines, the Pangu manufacturing large model can accurately understand business needs and call the Tianchou AI solver plug-in to produce a production plan for the next 3 days within one minute.

Drug Discovery

In drug R&D, developing a new drug has traditionally taken an average of 10 years and cost 1 billion US dollars. The Pangu drug molecule large model helped the team of Professor Liu Bing at the First Affiliated Hospital of Xi'an Jiaotong University discover the world's first new-target, new-class antibiotic in 40 years, shortening the lead drug development cycle to one month and cutting development costs by 70%.

03 Large model integrated into Huawei Cloud product system

Beyond its use across thousands of industries, the Pangu large model has also been deeply integrated into Huawei Cloud's own products and services, reshaping them.

Pangu Large Model + Huawei Cloud Service

Powered by the Pangu model, a series of Huawei Cloud's enterprise-facing (B-end) products and services have been upgraded and rebuilt. At the meeting, Huawei Cloud detailed upgrades to four services: data services, cloud customer service, BI, and cloud search.

  • In data services, Pangu's copy and code generation improve the efficiency of writing materials and front-end code, greatly shortening the launch cycle for new products.
  • In cloud customer service, dialogue Q&A backed by an embedded industry knowledge base, together with intent mining, lets AI answer first throughout the whole process, improving customer service efficiency by 30%.
  • In BI, NL2SQL and AutoGraph intelligent routing enable automatic recommendation from SQL to visual charts, and multi-turn natural language interaction lets anyone easily gain insight into business details from data.
  • In cloud search, multimodal embedding and NL2API technology enable search across a wide range of content, including video, text, and images. With strong semantic understanding and generalization, search accuracy increases by 15%.

Pangu large model + CodeArts code tool

Huawei Cloud has combined its CodeArts R&D tools with the Pangu large model and officially released CodeArts Snap, an intelligent programming assistant for developers.

The tool was trained on 76 billion lines of selected code and 13 million technical documents. It has three core functions: intelligent generation, intelligent Q&A, and intelligent collaboration. It can generate code from a single sentence of dialogue, add comments and generate test cases with one click, and deploy with a single command, giving every software developer a programming assistant of their own.

Pangu Large Model + Digital Human

Huawei Cloud uses the Pangu foundation model to power the MetaStudio digital content production line and has built the Pangu digital human large model, which provides two major services, model generation and model driving, and was pre-trained on 200,000 hours of audio and video data.

Based on these two services, developers can quickly generate and drive digital human models for applications in online education, live entertainment streaming, corporate meetings, and other industries, so that every enterprise employee can achieve "digital human freedom." For example, a user only needs to upload a 20-second personal video on the Huawei Cloud MetaStudio service page to quickly generate a personalized digital human presentation video. Work that used to take three R&D staff three days can now be completed in just three minutes.

Pangu Large Model + Embodied Intelligence

At the meeting, Huawei Cloud also discussed the application of the Pangu model in robotics and showed a video.

In the past, giving commands to robots required developers to write programs. With the natural language understanding of the Pangu model, robots can recognize natural language, execute commands, and act with autonomous intelligence guided by global perception. In the video, users do not need to enter program commands; they simply give the robot instructions in natural language, and the robot completes tasks such as picking up items, making autonomous judgments based on its environment along the way (such as moving aside clutter that blocks the target object).

According to Huawei, the above demonstration is not a concept video, but a real product, which was exhibited at the venue during the HDC conference.

04 Summary and thinking: Can Huawei become the other pole of AI?

Zhang Pingan said: "To help global customers, partners, and developers train and use large models, we are committed to building another pole of the world's AI for customers everywhere and giving AI developers a new choice."

Earlier, in March this year, Ren Zhengfei expressed a similar view inside the company. He said that AI large models would spring up in large numbers, and not just from Microsoft. Ren Zhengfei's reasoning is in fact the direction Huawei Cloud is pursuing today: the direct contribution of artificial intelligence software platform companies to human society may be less than 2%, while the other 98% comes from advancing industrial and agricultural society.

For example, factories in China and Germany are bringing artificial intelligence into industry in order to achieve unmanned production. The wharf at Tianjin Port has also trialed unmanned cargo handling: once the code is entered, containers are automatically unloaded from the ship, carried over, and then driven away by vehicle. And in coal mines in Shanxi, after 5G plus artificial intelligence was adopted underground, headcount fell by 60-70%, and most people now work in suits in control rooms above ground.

These are examples of AI being applied at scale on the industrial side over the past few years. What these industries have in common is their huge scale and output value: even a small improvement in efficiency can bring huge benefits.

The emergence of large models essentially provides a more efficient productivity tool. For industries that are already embracing AI, it means higher efficiency and a faster transformation. Higher efficiency also makes it easier for more industries to make the "economic case" add up, so AI has the potential to go from transforming a few so-called major industries to transforming thousands of industries.

This is why Huawei is resolutely moving into industries. In fact, major domestic cloud providers such as Alibaba Cloud, Tencent Cloud, Volcano Cloud, and Baidu Cloud have similar ideas. With everyone heading in the same direction from similar starting points, who runs fastest in this race will come down to full-chain capability, from computing power, the large model base, and platforms to products and specific solutions.

For well-known reasons, Huawei cannot obtain what are currently recognized as the world's most advanced computing chips, which would seem to leave it at an inherent disadvantage in this race. But judging from today's event, there is no sign that Huawei is lagging behind because of upstream constraints. It has delivered mature products and cases across the key links of the large model chain, and the decoupled Pangu large model architecture is especially eye-catching. In fact, given today's demand for localization, Huawei's computing power, which does not lag behind, is likely to turn into an advantage because it is independent and controllable.

Large models have become a new opportunity for Huawei, and that opportunity looks like it is becoming a reality.
