WO2021232958A1 - Method and apparatus for executing operation, electronic device, and storage medium - Google Patents



Publication number
WO2021232958A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
information
network
operator
processing chip
Prior art date
Application number
PCT/CN2021/085028
Other languages
French (fr)
Chinese (zh)
Inventor
谭志鹏
Original Assignee
Guangdong OPPO Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong OPPO Mobile Telecommunications Corp., Ltd.
Publication of WO2021232958A1 publication Critical patent/WO2021232958A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular, to a method and apparatus for performing operations, an electronic device, and a storage medium.
  • the electronic device can process the specified data based on the neural network, and perform the specified operation according to the processing result.
  • after receiving the specified data, the electronic device needs to perform network inference for the neural network on the central processing unit (CPU).
  • the operator calculation part of the neural network needs to be executed on a dedicated processing chip. Because the central processing unit must call the dedicated processing chip once for each operator calculation in the neural network, and data is transferred back and forth between the central processing unit and the dedicated processing chip each time, the time overhead of the electronic device in this scenario is large and the processing efficiency is low.
  • the embodiments of the present application provide a method and apparatus for performing operations, an electronic device, and a storage medium.
  • the technical solution is as follows:
  • a method for performing operations, applied to an electronic device, where the electronic device includes a dedicated processing chip, and the method includes:
  • the second neural network is a neural network established according to network information, and the network information is used to indicate the network structure of the first neural network;
  • a device for performing operations which is applied to an electronic device, the electronic device includes a dedicated processing chip, and the device includes:
  • the information receiving module is used to instruct the central processing unit to receive target information, where the target information is information to be processed by the first neural network;
  • the data acquisition module is used to instruct the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data.
  • the second neural network is a neural network established according to network information, and the network information is used to indicate the network structure of the first neural network;
  • a data transmission module for transmitting the target result data back to the central processing unit
  • the operation execution module is used to execute the corresponding operation according to the target result data.
  • an electronic device includes a processor and a memory, and at least one instruction is stored in the memory.
  • the instruction is loaded and executed by the processor to implement the method for performing operations provided by the embodiments of the present application.
  • a computer-readable storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method for performing operations provided by the embodiments of the present application.
  • a computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device implements the method for performing operations provided in the above-mentioned aspects.
  • Fig. 1 is a structural block diagram of an electronic device provided by an exemplary embodiment of the present application
  • Fig. 2 is a flowchart of a method for performing operations provided by an exemplary embodiment of the present application
  • FIG. 3 is a schematic diagram of an operation method in a related technology involved in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of an operation method involved in an embodiment of the present application.
  • FIG. 5 is a flowchart of a method for performing operations according to another exemplary embodiment of the present application.
  • FIG. 6 is a graph structure of a first neural network based on the embodiment shown in FIG. 5;
  • Fig. 7 is a structural block diagram of a device for performing operations according to an exemplary embodiment of the present application.
  • the deep learning computing framework includes two main modules, namely a network inference module and an operator implementation module.
  • the network inference module is used to implement network inference.
  • the operator implementation module is used to implement operator calculation.
  • electronic devices usually implement network inference on the CPU and operator calculation on the GPU side.
  • for each operator, the GPU needs to be called once. Since data is frequently moved and copied between the memory of the CPU and the memory of the GPU, the efficiency of processing data through the neural network is poor and the time cost is high.
  • the stage that consumes more time is the operator calculation stage executed by the operator realization module.
  • a deep learning network includes dozens or hundreds of layers of operators. Based on this application scenario, the embodiment of the present application provides a method for improving the calculation efficiency of an operator.
  • the embodiments of the present application provide a deep learning computing framework that can reduce the number of calls and the frequency of memory movement between the CPU and the dedicated processing chip when the two cooperate to perform heterogeneous calculations.
  • This application can parse the network model of the first neural network on the CPU side and continuously transmit operator information to the dedicated processing chip side. After the entire network has been parsed, the dedicated processing chip will have constructed a second neural network.
  • the dedicated processing chip side can also fuse operators that can be merged with each other in the second neural network. Then, in the operation process, the CPU side only needs to send one instruction to complete the inference of the entire second neural network on the dedicated processing chip side, obtain the final calculation result and return it to the CPU side, thereby completing the entire processing flow.
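As a rough sketch of the flow just described (the class and function names here are invented for illustration and do not come from the patent), the chip receives operator information while the CPU parses the first network, and a single run call then executes the whole second network:

```python
# Illustrative sketch only: the CPU streams operator info to the chip while
# parsing the first neural network; the chip builds the second network once,
# and a single run instruction then completes the whole inference.

class DedicatedChip:
    def __init__(self):
        self.operators = []          # the second neural network under construction

    def receive_operator(self, op):
        # called repeatedly during the CPU-side parsing phase
        self.operators.append(op)

    def run(self, target_info):
        # one call executes every operator of the second network on-chip
        result = target_info
        for op in self.operators:
            result = op(result)
        return result

def cpu_side(first_network_ops, target_info, chip):
    # phase 1 (initialization): parse the first network, transmit it once
    for op in first_network_ops:
        chip.receive_operator(op)
    # phase 2 (run): a single call replaces per-operator dispatch
    return chip.run(target_info)

chip = DedicatedChip()
out = cpu_side([lambda x: x + 1, lambda x: x * 2], 3, chip)  # (3 + 1) * 2 = 8
```

The key point of the design is that the per-operator round trips disappear: the loop over operators runs entirely on the chip side.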
  • Dedicated processing chip used to construct the second neural network, and use the second neural network to perform the network inference process according to the target information forwarded by the central processing unit.
  • the dedicated processing chip may be one or more of a graphics processing unit (GPU), a digital signal processor (DSP), a neural network processing unit (NPU), a tensor processing unit (TPU), a deep learning processor, or a brain processing unit (BPU).
  • the GPU is designed to accelerate the processing of operations in the image field.
  • the GPU needs to work under the call of the CPU. Therefore, in actual applications, when a neural network processes data, the calculation is always performed through a heterogeneous computing framework combining the CPU and the GPU.
  • For example, suppose the first neural network includes 200 operators. The CPU parses the first neural network, and each operator calculation requires the GPU to be called once.
  • In this case, the number of times the CPU calls the GPU is 201: 200 calls are used to perform operator calculations, and one call is used to initialize the GPU.
  • After the CPU receives the target information, it can parse out the network information of the first neural network in the initialization phase and send it to the GPU in the form of an initialization instruction.
  • the GPU establishes a second neural network according to the network information; the next time the CPU sends a running instruction that includes the target information, the GPU will automatically complete the processing of the entire target information and feed the target result data back to the CPU.
  • the CPU only needs to send an initialization instruction and a run instruction once throughout the entire process, which greatly reduces the time overhead and memory relocation operations caused by calling the GPU.
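The call-count saving described above can be stated directly; this trivial sketch (illustrative only) contrasts the two schemes for an n-operator network, matching the 200-operator example earlier in the text:

```python
def calls_per_operator_dispatch(n_operators: int) -> int:
    # related art: one GPU initialization call plus one call per operator
    return 1 + n_operators

def calls_with_second_network(n_operators: int) -> int:
    # this scheme: one initialization instruction plus one run instruction,
    # regardless of how many operators the network contains
    return 2

# the 200-operator network from the example: 201 calls versus 2
assert calls_per_operator_dispatch(200) == 201
assert calls_with_second_network(200) == 2
```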
  • the NPU simulates human neurons and synapses at the circuit layer, and uses deep learning instruction sets to directly process large-scale neurons and synapses. NPU can realize the integration of storage and calculation through synaptic weights, thereby improving operating efficiency.
  • the TPU can provide high-throughput and low-precision calculations for forward operations of neural networks.
  • the operation of the TPU needs to be controlled by the CPU.
  • the BPU can implement a chip structure through multiple architectures.
  • the architecture supported by the BPU includes at least one of Gaussian architecture, Bernoulli architecture, or Bayesian architecture.
  • specific operations need to be performed in accordance with the control instructions of the CPU.
  • Target information: the information to be processed by the first neural network.
  • the target information may be image information or audio information. It should be noted that the target information is the data processed in the inference stage after the first neural network completes the training.
  • the first neural network is used to classify the image information, or the first neural network is used to identify the type of objects in the image information.
  • the target information is audio information
  • the first neural network is used to identify the speaker of the audio information, or the first neural network is used to identify text information in the audio information. It should be noted that the above-mentioned relationship between the target information and the first neural network is only exemplary, and does not substantially limit the form of the relationship between the target information and the first neural network.
  • the target information is information sent by an application layer in the terminal.
  • the terminal can be equipped with a corresponding operating system, and the application program in the operating system interacts with the underlying hardware controlled by the operating system through the application layer.
  • the target information is information forwarded by the terminal to the server.
  • the target information may be a picture to be detected.
  • a method for performing operations, applied to an electronic device, where the electronic device includes a dedicated processing chip, and the method includes: instructing a central processing unit to receive target information, the target information being information to be processed by a first neural network; instructing the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, where the second neural network is a neural network established according to network information, and the network information includes the network structure and weight information of the first neural network; transmitting the target result data back to the central processing unit; and performing a corresponding operation according to the target result data.
  • the method further includes: when a network construction instruction is received, parsing the first neural network to obtain the network information, where the network information includes graph structure information and weight information of the first neural network; and instructing the dedicated processing chip to establish the second neural network according to the network information.
  • the instructing the dedicated processing chip to establish the second neural network according to the network information includes: obtaining a global memory of a predetermined storage space in the dedicated processing chip; storing the network information in the global memory according to the data specification of the dedicated processing chip; and instructing the dedicated processing chip to establish the second neural network according to the network information.
  • the instructing the dedicated processing chip to establish the second neural network according to the network information includes: obtaining graph structure information and weight information in the network information; determining the input tensor and the output tensor of each operator according to the graph structure information; completing the concatenation of the operators in the second neural network according to the identification of the input tensor and the identification of the output tensor of each operator; and determining the corresponding convolution kernel according to the weight information, the convolution kernel being an input tensor of the corresponding operator.
  • the completing of the concatenation of the operators in the second neural network according to the identification of the input tensor and the identification of the output tensor of each operator includes: when a first operator is constructed, detecting whether the first operator meets the conditions for fusion with a second operator, the second operator being an operator that has already been constructed in the second neural network; when the first operator meets the conditions for fusion with the second operator, fusing the first operator and the second operator; and updating the network information in the global memory according to the fused operator.
  • the method further includes: determining the number of operator layers of the first neural network according to the graph structure information; and when the number of operator layers is greater than or equal to a layer-number threshold, executing the step of instructing the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data.
  • the method further includes: deleting the information of the second neural network in the dedicated processing chip in response to the logout of the process invoking the first neural network.
  • the method further includes: in response to the logout of the process calling the first neural network, storing the information of the second neural network in the dedicated processing chip locally, and deleting the information of the second neural network in the dedicated processing chip.
  • the method further includes: when the target result data is image recognition data, displaying a frame-selected result area in the recognized target image; or, when the target result data is voice recognition data, playing the synthesized artificial intelligence voice or displaying the recognized text.
  • the dedicated processing chip includes at least one of the following: a graphics processor, a digital signal processor, a neural network processor, a tensor processor, a deep learning processor, or a brain processor.
  • the embodiments of the present application can, in an electronic device including a dedicated processing chip, instruct the central processing unit to receive the information to be processed by the first neural network, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data, and send the target result data back to the central processing unit, so that the electronic device performs the corresponding operation according to the target result data.
  • the second neural network is a neural network established based on network information, and the network information indicates the network structure of the first neural network. This overcomes the problems of large time overhead and low computing efficiency caused by using the central processing unit to establish and parse the neural network while frequently calling the dedicated processing chip to execute the operator calculations in the neural network, and improves the computing efficiency of the neural network without changing the hardware architecture.
  • Electronic devices may include mobile phones, tablets, laptops, desktop computers, all-in-one computers, servers, workstations, TVs, set-top boxes, smart glasses, smart watches, digital cameras, MP4 players, MP5 players, learning machines, point-reading machines, electronic paper books, electronic dictionaries, in-vehicle electronic devices, and the like.
  • FIG. 1 is a structural block diagram of an electronic device provided by an exemplary embodiment of the present application.
  • the electronic device includes a processor 120, a memory 140, and a bus 160. The memory 140 stores at least one instruction, and the instruction is loaded and executed by the processor 120 to implement the method for performing operations described in each method embodiment of the present application.
  • the processor 120 includes a central processing unit 121 and a dedicated processing chip 122. It should be noted that the central processing unit 121 includes memory, and the dedicated processing chip 122 also includes memory.
  • the processor 120 may include one or more processing cores.
  • the processor 120 uses various interfaces and lines to connect various parts of the entire electronic device 100, and performs various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 140 and calling data stored in the memory 140. The processor 120 may be implemented in at least one hardware form of a digital signal processor, a field-programmable gate array (FPGA), and a programmable logic array (PLA).
  • the processor 120 may be integrated with one or a combination of a central processing unit, an image processor, and a modem.
  • the CPU mainly processes the operating system, user interface, and application programs; the GPU is used to render and draw the content that needs to be displayed on the display screen; the modem is used to process wireless communication. It can be understood that the above-mentioned modem may not be integrated into the processor 120, but may be implemented by a chip alone.
  • the memory 140 may include random access memory (RAM), or read-only memory (ROM).
  • the memory 140 includes a non-transitory computer-readable storage medium.
  • the memory 140 may be used to store instructions, programs, codes, code sets or instruction sets.
  • the memory 140 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for at least one function (such as a touch function, a sound playback function, or an image playback function), and instructions for implementing the following method embodiments; the data storage area may store the data involved in the following method embodiments.
  • the bus 160 is used to connect various hardware components in the electronic device to facilitate data interaction between various hardware components.
  • the bus 160 is used to connect the processor 120 and the memory 140, so that the above two pieces of hardware can exchange data.
  • FIG. 2 is a flowchart of a method for performing operations according to an exemplary embodiment of the present application. This method of performing operations can be applied to the electronic device shown in FIG. 1 above.
  • the method of performing operations includes:
  • Step 210 Instruct the central processing unit to receive target information, where the target information is information to be processed by the first neural network.
  • the electronic device is used as the execution subject of the entire solution.
  • the system service or third-party application in the electronic device will call the first neural network to process the target information.
  • the electronic device will initialize the first neural network.
  • the electronic device can instruct the central processing unit to receive target information.
  • the target information may be information sent by the application or service to the CPU, and the information is the information to be processed by the first neural network.
  • the target information may be an image collected by a camera application.
  • Step 220 Instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data.
  • the second neural network is a neural network established according to the network information, and the network information includes the network structure and weight information of the first neural network.
  • the dedicated processing chip in the embodiment of the present application will process the target information according to the second neural network established in advance.
  • the dedicated processing chip can establish a second neural network on the side of the dedicated processing chip when receiving the network information sent by the CPU.
  • For example, five different first neural networks are built into the electronic device, namely neural network A, neural network B, neural network C, neural network D, and neural network E.
  • Each neural network has a specific function, please refer to Table 1 for details.
  • the processor when the electronic device executes the corresponding function, the processor will initialize the corresponding first neural network and process the data.
  • the start of the camera application serves as the trigger event for the initialization of neural network A.
  • the electronic device instructs the CPU to load the information of neural network A.
  • the CPU sends the network information of neural network A to the dedicated processing chip, and the dedicated processing chip establishes a second neural network corresponding to neural network A based on the network information.
  • FIG. 3 is a schematic diagram of an operation method in a related technology involved in an embodiment of the present application.
  • the central processing unit 121 constructs a first neural network including n operators, and initializes the dedicated processing chip 122 in the calling operation 310.
  • each operator needs to call the dedicated processing chip 122 once for calculation. That is, in the operator calculation operation 320, the central processing unit 121 calls the dedicated processing chip 122 a total of n times.
  • that is, the electronic device causes the central processing unit 121 to call the dedicated processing chip 122 a total of (n+1) times.
  • FIG. 4 is a schematic diagram of an operation method involved in an embodiment of the present application.
  • the central processing unit 121 may execute step 410 and step 420.
  • In step 410, when the central processing unit 121 receives the network construction instruction, it parses the first neural network to obtain network information, where the network information is used to indicate the graph structure information and weight information of the first neural network.
  • step 420 the central processing unit 121 sends network information to the dedicated processing chip 122.
  • the dedicated processing chip 122 establishes a second neural network according to the network information.
  • step 440 when the central processing unit 121 processes the target information, it only needs to send the target information and the running instruction to the dedicated processing chip 122 to complete one running call.
  • the central processing unit 121 can obtain the target result data.
  • that is, to obtain the result data, the electronic device causes the central processing unit 121 to call the dedicated processing chip 122 only twice.
  • In step 230, the target result data is transmitted back to the central processing unit.
  • the electronic device can transmit the target result data back to the central processing unit after the dedicated processing chip calculates the target result data.
  • Step 240 Perform a corresponding operation according to the target result data.
  • the electronic device can also execute the corresponding application operation according to the target result data.
  • the corresponding application operation may be a visible graphic display operation, or a data processing flow that is not visible in the background, which is not limited in the embodiment of the present application.
  • the electronic device may display the recognized face area in the image in the face recognition scene.
  • the electronic device can also play the synthesized artificial intelligence speech or display the recognized text.
  • the method for performing operations provided in this embodiment can, in an electronic device that includes a dedicated processing chip, instruct the central processing unit to receive the information to be processed by the first neural network, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data, and return the target result data to the central processing unit.
  • the second neural network is a neural network established based on network information, and the network information indicates the network structure of the first neural network. This overcomes the problems of large time overhead and low computing efficiency caused by using the central processing unit to establish and parse the neural network while frequently calling the dedicated processing chip to execute the operator calculations in the neural network, and improves the computing efficiency of the neural network without changing the hardware architecture.
  • the electronic device can also establish a second neural network in the dedicated processing chip, thereby reducing the number of calls between the CPU and the dedicated processing chip. Please refer to the following embodiment.
  • FIG. 5 is a flowchart of a method for performing operations according to another exemplary embodiment of the present application. This method of performing operations can be applied to the electronic device shown in FIG. 1 above.
  • the method of performing operations includes:
  • Step 511: When a network construction instruction is received, parse the first neural network to obtain network information, where the network information is used to indicate the graph structure information and weight information of the first neural network.
  • the electronic device can trigger the network construction instruction when an application is started or when a function is called. At this time, the electronic device instructs the CPU to parse the designated first neural network to obtain the network information, which includes the graph structure information and the weight information.
  • Step 512 Instruct the dedicated processing chip to establish a second neural network according to the network information.
  • the electronic device can instruct the dedicated processing chip to establish the second neural network based on the network information.
  • the electronic device may also perform step (a1), step (a2), and step (a3) to implement the process of instructing the dedicated processing chip to establish the second neural network based on the network information.
  • Step (a1) Obtain a global memory of a predetermined storage space in a dedicated processing chip.
  • the electronic device can obtain a predetermined storage space of a predetermined size in a dedicated processing chip.
  • the predetermined storage space is a global memory, so that each component in the dedicated processing chip can smoothly access network information.
  • Step (a2): Store the network information in the global memory according to the data specification of the dedicated processing chip.
  • when storing the network information, the electronic device may store it in accordance with the data specification of the dedicated processing chip.
  • the data specification is the definition of the second neural network in the dedicated processing chip.
  • the second neural network includes the definition of operators and the definition of tensors. Details are as follows:
  • NetDef represents the entire second neural network, which includes several tensors and operators.
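A minimal sketch of such a data specification (the field names below are assumptions; the patent does not publish the exact schema) might define tensors and operators that reference tensors by identifier, with NetDef holding both:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TensorDef:
    name: str                    # identifier referenced by operators
    shape: tuple = ()
    data: Optional[list] = None  # weight tensors carry data; activations do not

@dataclass
class OperatorDef:
    op_type: str                 # e.g. "Conv", "Relu"
    inputs: list = field(default_factory=list)   # input tensor identifiers
    outputs: list = field(default_factory=list)  # output tensor identifiers

@dataclass
class NetDef:
    # NetDef represents the entire second neural network:
    # several tensors plus the operators connecting them
    tensors: dict = field(default_factory=dict)
    operators: list = field(default_factory=list)

net = NetDef()
net.tensors["w1"] = TensorDef("w1", shape=(3, 3), data=[0.0] * 9)
net.operators.append(OperatorDef("Conv", inputs=["x", "w1"], outputs=["t1"]))
```

Referencing tensors by name rather than by position is what later allows operators to be concatenated and fused without rewriting the rest of the network definition.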
  • from the network information, the dedicated processing chip reconstructs the above-mentioned first neural network as the second neural network, in the format defined by the data specification.
  • the electronic device obtains the graph structure information and weight information from the network information; determines the input tensors and output tensors of each operator according to the graph structure information; completes the concatenation of the operators in the second neural network according to the identifications of each operator's input tensors and output tensors; and determines the corresponding convolution kernel according to the weight information, the convolution kernel being an input tensor of the corresponding operator.
  • the graph structure information can indicate which operator's output tensor is another operator's input tensor.
  • the electronic device can determine the position of the input tensor and the position of the output tensor of each operator according to the graph structure information. The graph structure information therefore enables the operators to be concatenated.
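  • The concatenation step above can be illustrated with a minimal sketch: operators are linked wherever one operator's output-tensor identifier appears among another operator's input-tensor identifiers. The dictionary-based representation below is an assumption about one possible implementation, not the patent's own data layout:

```python
def concatenate_operators(operators):
    """Link operators by matching output-tensor ids to input-tensor ids.

    Each operator is a dict with 'name', 'inputs' (tensor ids) and
    'output' (a tensor id). Returns, for each operator, the names of
    the operators that consume its output.
    """
    producers = {op["output"]: op["name"] for op in operators}
    successors = {op["name"]: [] for op in operators}
    for op in operators:
        for tensor_id in op["inputs"]:
            if tensor_id in producers:  # this input is another op's output
                successors[producers[tensor_id]].append(op["name"])
    return successors

ops = [
    {"name": "conv1", "inputs": ["image"], "output": "t1"},
    {"name": "relu1", "inputs": ["t1"], "output": "t2"},
    {"name": "pool1", "inputs": ["t2"], "output": "t3"},
]
print(concatenate_operators(ops))
# → {'conv1': ['relu1'], 'relu1': ['pool1'], 'pool1': []}
```

  • This mirrors the statement above: except for the final operator, each operator's output tensor is found among some other operator's input tensors.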
  • the dedicated processing chip can also fuse a first operator and a second operator in the second neural network when the operators satisfy the fusion condition.
  • the electronic device can change the name of the output tensor of the first operator to the name of the output tensor of the second operator and, at the same time, update the definition of the first operator in the second neural network according to the information of the fused operator.
  • otherwise, the dedicated processing chip retains each operator as it is.
  • when the first operator is constructed by the dedicated processing chip, it is checked whether the first operator meets the conditions for fusion with the second operator.
  • the second operator is an operator that has already been constructed in the second neural network.
  • when the condition is met, the first operator and the second operator are fused, and the network information in the global memory is updated according to the fused operator.
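  • The fusion step can be sketched as below. The concrete fusibility rule (an elementwise activation directly consuming a producer's output) and the producer/consumer roles assigned to the two operators are illustrative assumptions; the patent does not fix a specific condition:

```python
def can_fuse(producer, consumer):
    # Illustrative condition: an elementwise activation that directly
    # consumes the producer's output tensor can be folded into it.
    return consumer["op_type"] == "Relu" and producer["output"] in consumer["inputs"]

def fuse(producer, consumer):
    """Merge the pair into one operator.

    As described above, the fused operator's output tensor takes the
    consumer's output-tensor name, so downstream operators still find
    their input, and the operator definition is updated in place.
    """
    fused = dict(producer)
    fused["op_type"] = producer["op_type"] + "+" + consumer["op_type"]
    fused["output"] = consumer["output"]  # rename the output tensor
    return fused

conv = {"op_type": "Conv", "inputs": ["image", "w"], "output": "t1"}
relu = {"op_type": "Relu", "inputs": ["t1"], "output": "t2"}
fused = fuse(conv, relu) if can_fuse(conv, relu) else conv
print(fused["op_type"], fused["output"])  # → Conv+Relu t2
```

  • After such a merge, the network information held in global memory would be updated so that later operators connect to the fused operator's output.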
  • Step (a3) instruct the dedicated processing chip to establish a second neural network according to the network information.
  • the electronic device may also perform step (b1), step (b2), and step (b3) to implement the process of instructing the dedicated processing chip to establish the second neural network according to the network information.
  • Step (b1) Obtain the graph structure information and weight information from the network information.
  • Step (b2) Complete the concatenation of the operators in the second neural network according to the graph structure information.
  • Step (b3) Determine the weights between the operators according to the weight information.
  • in this way, the embodiment of the present application improves the operating efficiency of the dedicated processing chip.
  • correspondingly, the storage space of the dedicated processing chip occupied by the second neural network is released.
  • the electronic device deletes the information of the second neural network in the dedicated processing chip in response to the logout of the process that called the first neural network.
  • alternatively, the electronic device stores the information of the second neural network in the dedicated processing chip locally in response to the logout of the process that called the first neural network.
  • the local storage area may be a non-volatile storage medium, including an external memory of the electronic device.
  • the electronic device then deletes the information of the second neural network in the dedicated processing chip.
  • the information of the second neural network includes the network information and all the information in the second neural network.
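  • The two release strategies above (delete outright, or persist locally and then delete) can be sketched as follows; the JSON-file persistence and the dict-based chip state are illustrative assumptions:

```python
import json
import os
import tempfile

def release_second_network(chip_state, persist_path=None):
    """Free the second network's chip-side storage on process logout.

    If persist_path is given, the network information is first stored
    in local non-volatile storage (the second strategy above);
    otherwise it is simply deleted (the first strategy).
    """
    if persist_path is not None:
        with open(persist_path, "w") as f:
            json.dump(chip_state["second_network"], f)
    del chip_state["second_network"]  # release the chip-side storage

chip = {"second_network": {"operators": 3, "tensors": 7}}
path = os.path.join(tempfile.gettempdir(), "second_net.json")
release_second_network(chip, persist_path=path)
print("second_network" in chip)  # → False
```

  • Persisting before deletion would let a later process rebuild the second neural network without re-parsing the first one.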
  • Step 520 Receive target information.
  • the execution process of step 520 is the same as that of step 210 and will not be repeated here.
  • Step 531 Determine the number of operator layers of the first neural network according to the graph structure information.
  • Step 532 When the number of operator layers is greater than or equal to the threshold value of the number of layers, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data.
  • the electronic device can choose whether to enable the operation execution method shown in the present application according to the number of operator layers; that is, the method is enabled when the number of operator layers is greater than or equal to the layer-number threshold.
  • the layer number threshold may be 10, 15, 20, 50, etc., which is not limited in the embodiment of the present application.
  • in this way, the first neural network adopts the operation execution method provided in this application only when it actually needs to be optimized.
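  • The gating decision in steps 531–532 reduces to a single comparison. The threshold of 20 below is just one of the example values listed above:

```python
LAYER_THRESHOLD = 20  # example value; the text also suggests 10, 15, 50

def should_offload(num_operator_layers, threshold=LAYER_THRESHOLD):
    """Enable the dedicated-chip execution path only for networks deep
    enough that per-operator CPU/chip calls would dominate the cost."""
    return num_operator_layers >= threshold

print(should_offload(200))  # → True
print(should_offload(5))    # → False
```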
  • Step 541 When the target result data is image recognition data, display the frame-selected result area in the recognized target image.
  • Step 542 When the target result data is speech recognition data, play the synthesized artificial intelligence speech or display the recognized text.
  • the electronic device can implement the method for performing operations introduced in the embodiment of the present application by executing step (c1), step (c2), step (c3), and step (c4).
  • the introduction is as follows:
  • the electronic device can parse the first neural network on the CPU side.
  • the model is first parsed on the CPU side; the parsed content includes the graph structure of the model and the weight data of the model.
  • FIG. 6 is a graph structure of a first neural network provided based on the embodiment shown in FIG. 5.
  • the first neural network 600 includes an operator 610, an operator 620, and an operator 630.
  • the operator 610 includes an input tensor 611 and an output tensor 612.
  • the operator 620 includes an input tensor 621 and an output tensor 622.
  • the operator 630 includes an input tensor 631 and an output tensor 632.
  • the first neural network 600 is formed by concatenating several operators, and each operator has several inputs and one output. It should be noted that, except for the final output operator, the output of each operator must be the input of other specified operators.
  • the weight data of the second neural network is the data saved when the first neural network completes the training. In a possible manner, the weight data may be a convolution kernel. Illustratively, the weight data can be used as an input of the operator.
  • Step (c2) the CPU side transfers network information to the dedicated processing chip.
  • the CPU transfers the parsed graph structure and weight data of the first neural network to the dedicated processing chip in one pass, because the way the first neural network is expressed on the dedicated processing chip side differs from that on the CPU side.
  • when the dedicated processing chip constructs the second neural network, the fusion of an operator needs to be completed each time an operator is constructed. In other words, the embodiment of the present application can complete the concatenation between operators in a manner that the computation framework of the dedicated processing chip can understand, until all operators are integrated, thereby constructing the second neural network.
  • step (c3) the CPU side sends a running instruction to the dedicated processing chip, so that the dedicated processing chip side completes network inference.
  • the electronic device causes the CPU to send a run instruction to the dedicated processing chip, and the dedicated processing chip (for example, a GPU) directly computes the target result data through the second neural network.
  • Step (c4) the dedicated processing chip side returns the target result data of the second neural network to the CPU side. In this step, after the dedicated processing chip obtains the target result data, it only needs to pass the calculation result to the CPU side once.
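  • Steps (c1)–(c4) describe a protocol with exactly two CPU-side transfers: one initialization carrying the parsed network information, and one run instruction that returns the final result. A minimal sketch of that control flow, with the chip side simulated by plain functions (all names here are illustrative assumptions):

```python
def cpu_side(parse, chip_init, chip_run, model, target_info):
    # (c1) parse the first neural network on the CPU side
    network_info = parse(model)
    # (c2) transfer graph structure and weights to the chip in one pass
    chip_init(network_info)
    # (c3) send a single run instruction; the chip completes all inference
    result = chip_run(target_info)
    # (c4) the chip returns the target result data to the CPU once
    return result

# Simulated chip: builds the "second network" once, then runs it.
state = {}
def chip_init(info):
    state["net"] = info

def chip_run(x):
    # stand-in for full network inference on the chip side
    return sum(x) if state.get("net") == "parsed-model" else None

out = cpu_side(lambda m: "parsed-" + m, chip_init, chip_run, "model", [1, 2, 3])
print(out)  # → 6
```

  • The point of the design is visible in the sketch: `chip_init` and `chip_run` are each called once, regardless of how many operators the network contains.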
  • with the above design, a second neural network is constructed on the dedicated processing chip side based on the network information, and the constructed second neural network can be recognized by the dedicated processing chip.
  • the dedicated processing chip can place the network information in the global memory of the predetermined storage space, so that the embodiment of the present application can effectively construct, in the dedicated processing chip, a second neural network that the dedicated processing chip can recognize.
  • this improves the stability with which electronic equipment operates on the basis of the neural network.
  • the method for performing operations provided in this embodiment can also effectively reduce frequent calls between the CPU and the dedicated processing chip and memory migration operations in a scenario with a large number of neural network layers.
  • FIG. 7 is a structural block diagram of a device for performing operations according to an exemplary embodiment of the present application.
  • the device for performing operations can be implemented as all or a part of the electronic device through software, hardware, or a combination of the two.
  • the device includes:
  • the information receiving module 710 is configured to instruct the central processing unit to receive target information, where the target information is information to be processed by the first neural network;
  • the data obtaining module 720 is used to instruct the dedicated processing chip to process the target information according to a second neural network established in advance to obtain target result data.
  • the second neural network is a neural network established according to network information. Information indicates the network structure of the first neural network;
  • a data transmission module 730 configured to transmit the target result data back to the central processing unit
  • the operation execution module 740 is configured to execute corresponding operations according to the target result data.
  • the device further includes a network analysis module and a network construction instruction module.
  • the network analysis module is configured to analyze the first neural network to obtain the network information when a network construction instruction is received, and the network information includes graph structure information and weight information of the first neural network;
  • the network construction instruction module is used to instruct the dedicated processing chip to establish the second neural network according to the network information.
  • the network construction instruction module is configured to obtain a global memory of a predetermined storage space in the dedicated processing chip; according to the data specification of the dedicated processing chip, store the network information in In the global memory; instruct the dedicated processing chip to establish the second neural network according to the network information.
  • the network construction instruction module is used to obtain the graph structure information and weight information from the network information; determine the input tensors and output tensors of each operator according to the graph structure information; complete the concatenation of the operators in the second neural network according to the identifications of each operator's input tensors and output tensors; and determine the corresponding convolution kernel according to the weight information, the convolution kernel being an input tensor of the corresponding operator.
  • the network construction instruction module is configured to detect, when the first operator is constructed by the dedicated processing chip, whether the first operator meets the conditions for fusion with the second operator,
  • where the second operator is an operator that has already been constructed in the second neural network; when the first operator meets the conditions for fusion with the second operator, fuse the first operator and the second operator; and update the network information in the global memory according to the fused operator.
  • the device further includes a layer number determining module.
  • the layer number determining module is configured to determine the number of operator layers of the first neural network according to the graph structure information; the data obtaining module 720 is configured to, when the number of operator layers is greater than or equal to the layer-number threshold, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data.
  • the device further includes a first deletion module, configured to delete the information of the second neural network in the dedicated processing chip in response to the logout of the process that invoked the first neural network.
  • the device further includes a second deletion module, configured to store the information of the second neural network in the dedicated processing chip locally in response to the logout of the process that invoked the first neural network, and to delete the information of the second neural network in the dedicated processing chip.
  • the data return module 730 is configured to display the frame-selected result area in the recognized target image when the target result data is image recognition data; or to play the synthesized artificial intelligence voice or display the recognized text when the target result data is voice recognition data.
  • the dedicated processing chip involved in the device includes at least one of the following: an image processor, a digital signal processor, a neural network processor, a tensor processor, a deep learning processor, or a brain processor.
  • with the above design, a second neural network is constructed on the dedicated processing chip side based on the network information, and the constructed second neural network can be recognized by the dedicated processing chip.
  • the dedicated processing chip can place the network information in the global memory of the predetermined storage space, so that the embodiment of the present application can effectively construct, in the dedicated processing chip, a second neural network that the dedicated processing chip can recognize.
  • this improves the stability with which electronic equipment operates on the basis of the neural network.
  • the device for performing operations provided in this embodiment can also effectively reduce frequent calls and memory migration between the CPU and the dedicated processing chip in a scenario with a large number of neural network layers.
  • the embodiments of the present application also provide a computer-readable medium storing at least one instruction, where the at least one instruction is loaded and executed by the processor to implement the method for performing operations described in each of the above embodiments.
  • when the device for performing operations provided in the above embodiments executes the method for performing operations provided in this application, the division into the above functional modules is merely illustrative. In actual applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
  • the apparatus for performing operations provided in the above-mentioned embodiments belongs to the same concept as the embodiments of the methods for performing operations. For the specific implementation process, refer to the method embodiments, which will not be repeated here.
  • the program can be stored in a computer-readable storage medium.
  • the storage medium mentioned can be a read-only memory, a magnetic disk or an optical disk, etc.


Abstract

A method and apparatus for executing an operation, an electronic device, and a storage medium, belonging to the technical field of computers. The method comprises: in an electronic device with a dedicated processing chip, instructing a central processing unit to receive information to be processed by a first neural network (210); instructing the dedicated processing chip to process target information according to a pre-established second neural network, so as to obtain target result data (220); and transmitting the target result data back to the central processing unit (230), wherein the second neural network is a neural network established according to network information, and the network information indicates a network structure of the first neural network. This solves the prior-art problems of large time overhead and low calculation efficiency caused by establishing and parsing a neural network on a central processing unit and frequently calling a dedicated processing chip to execute the operator calculations in the neural network, and improves the operating efficiency of the neural network without changing the hardware architecture.

Description

Method, electronic device, apparatus and storage medium for performing an operation

This application claims priority to Chinese patent application No. 202010419847.9, entitled "Method, electronic device, apparatus and storage medium for performing an operation", filed on May 18, 2020, the entire contents of which are incorporated herein by reference.
Technical Field

The embodiments of the present application relate to the field of computer technology, and in particular to a method, electronic device, apparatus and storage medium for performing an operation.
Background

With the rapid development of neural network technology, applications based on neural networks are rapidly being adopted in everyday life.

In the related art, an electronic device can process specified data based on a neural network and perform a specified operation according to the processing result. After receiving the specified data, the electronic device first needs to perform network inference for the neural network in the central processing unit (CPU), while the operator computation part of the neural network must be executed in a dedicated processing chip. Because the CPU must call the dedicated processing chip once, and move the data back and forth between the CPU and the dedicated processing chip once, for every single operator it computes in the neural network, the time overhead of the electronic device in this scenario is large and the processing efficiency is low.
Summary of the Invention

The embodiments of the present application provide a method, electronic device, apparatus and storage medium for performing an operation. The technical solution is as follows:

According to one aspect of the present application, there is provided a method for performing an operation, applied to an electronic device, where the electronic device includes a dedicated processing chip, and the method includes:

instructing the central processing unit to receive target information, where the target information is information to be processed by a first neural network;

instructing the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, where the second neural network is a neural network established according to network information, and the network information indicates the network structure of the first neural network;

transmitting the target result data back to the central processing unit;

performing a corresponding operation according to the target result data.
According to another aspect of the present application, there is provided an apparatus for performing an operation, applied to an electronic device, where the electronic device includes a dedicated processing chip, and the apparatus includes:

an information receiving module, configured to instruct the central processing unit to receive target information, where the target information is information to be processed by a first neural network;

a data obtaining module, configured to instruct the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, where the second neural network is a neural network established according to network information, and the network information indicates the network structure of the first neural network;

a data transmission module, configured to transmit the target result data back to the central processing unit;

an operation execution module, configured to perform a corresponding operation according to the target result data.
According to another aspect of the present application, there is provided an electronic device including a processor and a memory, where at least one instruction is stored in the memory, and the instruction is loaded and executed by the processor to implement the method for performing an operation provided by the embodiments of the present application.

According to another aspect of the present application, there is provided a computer-readable storage medium storing at least one instruction, where the instruction is loaded and executed by a processor to implement the method for performing an operation provided by the embodiments of the present application.

According to one aspect of the present application, there is provided a computer program product including computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device implements the methods provided in the above aspects.
Brief Description of the Drawings

In order to describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

FIG. 1 is a structural block diagram of an electronic device provided by an exemplary embodiment of the present application;

FIG. 2 is a flowchart of a method for performing an operation provided by an exemplary embodiment of the present application;

FIG. 3 is a schematic diagram of a computation approach in the related art involved in an embodiment of the present application;

FIG. 4 is a schematic diagram of a computation approach involved in an embodiment of the present application;

FIG. 5 is a flowchart of a method for performing an operation provided by another exemplary embodiment of the present application;

FIG. 6 is a graph structure of a first neural network provided based on the embodiment shown in FIG. 5;

FIG. 7 is a structural block diagram of an apparatus for performing an operation provided by an exemplary embodiment of the present application.
Detailed Description

In order to make the purpose, technical solutions, and advantages of the present application clearer, the implementations of the present application are described in further detail below in conjunction with the accompanying drawings.

When the following description refers to the accompanying drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; on the contrary, they are merely examples of apparatuses and methods consistent with some aspects of the application as detailed in the appended claims.

In the description of this application, it should be understood that the terms "first", "second", and the like are used for descriptive purposes only and cannot be understood as indicating or implying relative importance. It should also be noted that, unless otherwise clearly specified and limited, the terms "connected" and "coupled" should be understood in a broad sense: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium. For those of ordinary skill in the art, the specific meanings of the above terms in this application can be understood according to the specific circumstances. In addition, unless otherwise specified, "plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the associated objects before and after it.
The embodiments of the present application involve a deep learning computation framework. The deep learning computation framework includes two main modules: a network inference module and an operator implementation module. The network inference module is used to implement network inference, and the operator implementation module is used to implement operator computation.

In the related art, electronic devices usually implement network inference on the CPU and operator computation on the GPU side. In this approach, the CPU needs to call the GPU once for every operator it computes. Because data is frequently moved and copied between the CPU's memory and the GPU's memory, processing data through the neural network is inefficient and the time overhead is high.

Illustratively, in the computation of a deep learning neural network, the most time-consuming stage is the operator computation stage executed by the operator implementation module. In one possible case, a deep learning network includes dozens or hundreds of layers of operators. Based on this application scenario, the embodiments of the present application provide a method for improving the efficiency of operator computation.

In summary, the embodiments of the present application provide a deep learning computation framework that can reduce the number of calls and the number of memory moves between the CPU and the dedicated processing chip when the two cooperate in heterogeneous computation. The present application can parse the network model of the first neural network on the CPU side and then continuously pass operator information to the dedicated processing chip side. After the entire network has been parsed, the dedicated processing chip side will have successfully constructed the second neural network. The dedicated processing chip side can also fuse those operators in the second neural network that can be fused with each other. Then, during computation, the CPU side only needs to send one instruction; the entire inference of the second neural network is completed on the dedicated processing chip side, and the final computation result is returned to the CPU side, completing the whole processing flow.
为了本申请实施例所示方案易于理解,下面对本申请实施例中出现的若干名词进行介绍。In order to facilitate the understanding of the solutions shown in the embodiments of the present application, several terms appearing in the embodiments of the present application are introduced below.
专用处理芯片:用于构建第二神经网络,以及根据中央处理器转发的目标信息,使用第二神经网络进行网络推理过程。Dedicated processing chip: used to construct the second neural network, and use the second neural network to perform the network inference process according to the target information forwarded by the central processing unit.
可选地，专用处理芯片可以是图像处理器(英文：Graphics Processing Unit，缩写：GPU)、数字信号处理器(英文：Digital Signal Processing，缩写：DSP)、神经网络处理器(英文：Neural network Processing Unit，缩写：NPU)、张量处理器(英文：Tensor Processing Unit，缩写：TPU)、深度学习处理器或大脑处理器(英文：Brain Processing Unit，缩写：BPU)中的一项或多项。Optionally, the dedicated processing chip may be one or more of a graphics processing unit (GPU), a digital signal processor (DSP), a neural network processing unit (NPU), a tensor processing unit (TPU), a deep learning processor, or a brain processing unit (BPU).
示意性的,GPU被设计为加速处理图像领域的运算。在实现过程中,GPU需要在CPU的调用下进行工作。因此,在实际应用过程中,神经网络在处理数据时,总是通过CPU和GPU结合的异构计算框架进行计算。Schematically, the GPU is designed to accelerate the processing of operations in the image field. In the implementation process, the GPU needs to work under the call of the CPU. Therefore, in the actual application process, when the neural network processes data, it always performs calculations through a heterogeneous computing framework that combines CPU and GPU.
例如，若第一神经网络中的算子层级是200层(也即该第一神经网络包括200个算子)，则在现有的深度学习计算框架中，CPU对第一神经网络进行解析。在CPU需要对每一个算子进行计算时，调用一次GPU。在该场景中，CPU调用GPU的次数为201次。其中，200次调用用于调用GPU执行算子计算，1次调用用于初始化GPU。For example, if the first neural network has 200 operator layers (that is, the first neural network includes 200 operators), then in the existing deep learning computing framework, the CPU parses the first neural network and calls the GPU once each time an operator needs to be calculated. In this scenario, the CPU calls the GPU 201 times: 200 calls are used to have the GPU perform operator calculations, and 1 call is used to initialize the GPU.
若使用本申请实施例提供的方法,则CPU在接收目标信息后,能够在初始化阶段将第一神经网络的网络信息解析出,以初始化指令的方式发送至GPU。GPU根据该网络信息建立第二神经网络,并在下一次CPU发送包括目标信息的运行指令时,将自动在GPU完成整个目标信息的处理,并将目标结果数据反馈给CPU。在该场景中,CPU全程仅需要发送一次初始化指令和一次运行指令,大幅减少了调用GPU带来的时间开销和内存搬迁操作。If the method provided by the embodiment of the present application is used, after the CPU receives the target information, it can parse out the network information of the first neural network in the initialization phase, and send it to the GPU in the form of initialization instructions. The GPU establishes a second neural network according to the network information, and the next time the CPU sends a running instruction that includes the target information, it will automatically complete the processing of the entire target information on the GPU and feed back the target result data to the CPU. In this scenario, the CPU only needs to send an initialization instruction and a run instruction once throughout the entire process, which greatly reduces the time overhead and memory relocation operations caused by calling the GPU.
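以200层算子为例，两种方式的调用次数可以用如下简单计算示意（仅为示意性计算，非实际接口）。Taking 200 operator layers as an example, the call counts of the two approaches can be illustrated with the following simple calculation (for illustration only, not an actual interface):

```python
# 示意性计算：对比逐算子调用与整网一次调用的调用次数。
# Illustrative calculation: per-operator calls vs. one whole-network call.
n_ops = 200                      # 第一神经网络的算子层数

# 现有框架：1 次初始化 + 每个算子 1 次调用
calls_legacy = 1 + n_ops         # = 201

# 本申请方案：1 次初始化指令 + 1 次运行指令
calls_proposed = 2

print(calls_legacy, calls_proposed)   # 201 2
```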
示意性的，DSP的工作原理是通过接收模拟信号，将模拟信号转换为数字信号，对数字信号进行修改、删除或强化，并在其它系统芯片中把数字数据解译回模拟数据或实际环境格式。Schematically, the DSP works by receiving an analog signal, converting it into a digital signal, modifying, deleting, or enhancing the digital signal, and interpreting the digital data back into analog data or a real-world format in other system chips.
示意性的,NPU在电路层模拟人类神经元和突触,并且用深度学习指令集直接处理大规模的神经元和突触。NPU能够通过突触权重实现存储和计算一体化,从而提高了运行效率。Schematically, the NPU simulates human neurons and synapses at the circuit layer, and uses deep learning instruction sets to directly process large-scale neurons and synapses. NPU can realize the integration of storage and calculation through synaptic weights, thereby improving operating efficiency.
示意性的，TPU能够提供高吞吐量的低精度计算，用于神经网络的前向运算。在实际应用中，TPU的运行需要由CPU控制。Illustratively, the TPU can provide high-throughput, low-precision calculations for the forward operations of neural networks. In practical applications, the operation of the TPU needs to be controlled by the CPU.
示意性的,BPU能够通过多种架构实现芯片结构。其中,BPU支持的架构包括高斯架构、伯努利架构或贝叶斯架构中至少一种。BPU在当前的应用方式中,需要按照CPU的控制指令执行具体的操作。Illustratively, the BPU can implement a chip structure through multiple architectures. Among them, the architecture supported by the BPU includes at least one of Gaussian architecture, Bernoulli architecture, or Bayesian architecture. In the current application mode of the BPU, specific operations need to be performed in accordance with the control instructions of the CPU.
目标信息:是第一神经网络待处理的信息。例如,该目标信息可以是图像信息,也可以是音频信息。需要说明的是,目标信息是第一神经网络完成训练后在推理阶段处理的数据。Target information: the information to be processed by the first neural network. For example, the target information may be image information or audio information. It should be noted that the target information is the data processed in the inference stage after the first neural network completes the training.
当目标信息是图像信息时,第一神经网络用于对图像信息进行分类,或者,第一神经网络用于识别图像信息中的物体类型。当目标信息是音频信息时,第一神经网络用于识别音频信息的发声者,或,第一神经网路用于识别音频信息中的文字信息。需要说明的是,上述目标信息和第一神经网络之间的关系仅为示例性说明,不对目标信息与第一神经网络之间的关系形式实质限定。When the target information is image information, the first neural network is used to classify the image information, or the first neural network is used to identify the type of objects in the image information. When the target information is audio information, the first neural network is used to identify the speaker of the audio information, or the first neural network is used to identify text information in the audio information. It should be noted that the above-mentioned relationship between the target information and the first neural network is only exemplary, and does not substantially limit the form of the relationship between the target information and the first neural network.
可选地,当电子设备是终端时,目标信息是终端中的应用层发送的信息。需要说明的是,终端中可以搭载相应的操作系统,操作系统中的应用程序通过应用层与操作系统所控制的底层硬件进行交互。Optionally, when the electronic device is a terminal, the target information is information sent by an application layer in the terminal. It should be noted that the terminal can be equipped with a corresponding operating system, and the application program in the operating system interacts with the underlying hardware controlled by the operating system through the application layer.
可选地，当电子设备是服务器时，目标信息是由终端转发至服务器的信息。Optionally, when the electronic device is a server, the target information is information forwarded by the terminal to the server.
在一种可能的落地实现方案中,例如,当第一神经网络是一个人脸检测网络,则目标信息可以是待检测的图片。In a possible implementation solution, for example, when the first neural network is a face detection network, the target information may be a picture to be detected.
在本申请实施例中,提供有如下技术方案:In the embodiments of this application, the following technical solutions are provided:
一种执行操作的方法，其中，应用于电子设备中，所述电子设备包括专用处理芯片，所述方法包括：指示中央处理器接收目标信息，所述目标信息是第一神经网络待处理的信息；指示所述专用处理芯片根据预先建立的第二神经网络处理所述目标信息，获取目标结果数据，所述第二神经网络是根据网络信息建立的神经网络，所述网络信息包括所述第一神经网络的网络结构和权值信息；将所述目标结果数据传回所述中央处理器；根据所述目标结果数据，执行对应的操作。A method for performing operations, applied to an electronic device, where the electronic device includes a dedicated processing chip, and the method includes: instructing a central processing unit to receive target information, the target information being information to be processed by a first neural network; instructing the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, the second neural network being a neural network established according to network information, the network information including the network structure and weight information of the first neural network; transmitting the target result data back to the central processing unit; and performing a corresponding operation according to the target result data.
可选地，在所述指示所述专用处理芯片根据所述预先建立的所述第二神经网络处理所述目标信息，获得目标结果数据之前，所述方法还包括：当接收到网络构建指令时，解析所述第一神经网络，获得所述网络信息，所述网络信息包括所述第一神经网络的图形结构信息和权值信息；指示所述专用处理芯片根据所述网络信息建立所述第二神经网络。Optionally, before instructing the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data, the method further includes: when a network construction instruction is received, parsing the first neural network to obtain the network information, where the network information includes graph structure information and weight information of the first neural network; and instructing the dedicated processing chip to establish the second neural network according to the network information.
可选地，所述指示所述专用处理芯片根据所述网络信息建立所述第二神经网络，包括：在所述专用处理芯片中获取预定存储空间的全局内存；按照所述专用处理芯片的数据规范，将所述网络信息存储在所述全局内存中；指示所述专用处理芯片按照所述网络信息建立所述第二神经网络。Optionally, instructing the dedicated processing chip to establish the second neural network according to the network information includes: obtaining a global memory of a predetermined storage space in the dedicated processing chip; storing the network information in the global memory in accordance with the data specification of the dedicated processing chip; and instructing the dedicated processing chip to establish the second neural network according to the network information.
可选地，所述指示所述专用处理芯片按照所述网络信息建立所述第二神经网络，包括：获取所述网络信息中的图形结构信息和权值信息；根据所述图形结构信息，确定各个算子的输入张量和输出张量；根据各个算子的所述输入张量的标识和所述输出张量的标识，将所述第二神经网络中的算子完成串接；根据所述权值信息，确定对应的卷积核，所述卷积核是对应的算子的输入张量。Optionally, instructing the dedicated processing chip to establish the second neural network according to the network information includes: obtaining the graph structure information and the weight information from the network information; determining the input tensor and output tensor of each operator according to the graph structure information; concatenating the operators in the second neural network according to the identifier of the input tensor and the identifier of the output tensor of each operator; and determining the corresponding convolution kernel according to the weight information, where the convolution kernel is an input tensor of the corresponding operator.
可选地，所述根据各个算子的所述输入张量的标识和所述输出张量的标识，将所述第二神经网络中的算子完成串接，包括：当所述专用处理芯片构建完成第一算子时，检测所述第一算子是否具备与第二算子融合的条件，所述第二算子是所述第二神经网络中已经完成构建的算子；当所述第一算子具备与第二算子融合的条件时，将所述第一算子和所述第二算子融合；根据融合后的算子，更新所述全局内存中的网络信息。Optionally, concatenating the operators in the second neural network according to the identifier of the input tensor and the identifier of the output tensor of each operator includes: when the dedicated processing chip has finished constructing a first operator, detecting whether the first operator meets a condition for fusion with a second operator, where the second operator is an operator that has already been constructed in the second neural network; when the first operator meets the condition for fusion with the second operator, fusing the first operator and the second operator; and updating the network information in the global memory according to the fused operator.
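算子融合的检测与更新过程可以用如下示意代码说明。需要说明的是，这只是一个假设性示意，其中以“卷积后接激活”作为可融合算子对的例子，可融合条件FUSABLE与函数try_fuse均为本文虚构。The detection and update process of operator fusion can be illustrated with the following code. This is only a hypothetical sketch, using "convolution followed by activation" as an example of a fusable operator pair; the fusion condition FUSABLE and the function try_fuse are fictitious.

```python
# 假设性示意：在芯片侧逐个构建算子时，检测新算子能否与已构建的前驱算子融合。
# Hypothetical sketch: as operators are built one by one on the chip side,
# check whether the new operator can be fused with the last built operator.
FUSABLE = {("conv", "relu"), ("conv", "add")}   # 假设的可融合算子对

def try_fuse(built_ops, new_op):
    """若 new_op 与最近一个已构建算子构成可融合对，则融合为一个复合算子。"""
    if built_ops and (built_ops[-1]["type"], new_op["type"]) in FUSABLE:
        prev = built_ops.pop()
        fused = {"type": prev["type"] + "+" + new_op["type"],
                 "in": prev["in"], "out": new_op["out"]}
        built_ops.append(fused)       # 用融合后的算子更新网络信息
    else:
        built_ops.append(new_op)
    return built_ops

ops = []
ops = try_fuse(ops, {"type": "conv", "in": "t0", "out": "t1"})
ops = try_fuse(ops, {"type": "relu", "in": "t1", "out": "t2"})
print([o["type"] for o in ops])       # ['conv+relu']
```

融合后的复合算子保留了前驱算子的输入张量和后继算子的输出张量，因此全局内存中的网络信息可据此更新。The fused composite operator keeps the input tensor of the predecessor and the output tensor of the successor, so the network information in the global memory can be updated accordingly.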
可选地，所述方法还包括：根据所述图形结构信息，确定所述第一神经网络的算子层数；当所述算子层数大于等于层数阈值时，执行所述指示所述专用处理芯片根据所述预先建立的所述第二神经网络处理所述目标信息，获得目标结果数据的步骤。Optionally, the method further includes: determining the number of operator layers of the first neural network according to the graph structure information; and when the number of operator layers is greater than or equal to a layer-number threshold, performing the step of instructing the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data.
可选地,所述方法还包括:响应于调用所述第一神经网络的进程注销,将所述专用处理芯片中的所述第二神经网络的信息删除。Optionally, the method further includes: deleting the information of the second neural network in the dedicated processing chip in response to the logout of the process invoking the first neural network.
可选地，所述方法还包括：响应于调用所述第一神经网络的进程注销，将所述专用处理芯片中的所述第二神经网络的信息存储在本地，并删除所述专用处理芯片中的所述第二神经网络的信息。Optionally, the method further includes: in response to the logout of the process invoking the first neural network, storing the information of the second neural network in the dedicated processing chip locally, and deleting the information of the second neural network from the dedicated processing chip.
可选地，所述方法还包括：当所述目标结果数据是图像识别数据时，在被识别的目标图像中显示被框选的结果区域；或，当所述目标结果数据是语音识别数据时，播放合成后的人工智能语音或显示识别出的文本。Optionally, the method further includes: when the target result data is image recognition data, displaying a frame-selected result area in the recognized target image; or, when the target result data is voice recognition data, playing a synthesized artificial intelligence voice or displaying the recognized text.
可选地,所述专用处理芯片包括以下至少一项:图像处理器、数字信号处理器、神经网络处理器、张量处理器、深度学习处理器或大脑处理器。Optionally, the dedicated processing chip includes at least one of the following: an image processor, a digital signal processor, a neural network processor, a tensor processor, a deep learning processor, or a brain processor.
本申请实施例提供的技术方案带来的有益效果可以包括:The beneficial effects brought about by the technical solutions provided in the embodiments of the present application may include:
本申请实施例能够在包括专用处理芯片的电子设备中，指示中央处理器接收第一神经网络待处理的信息，指示专用处理芯片根据预先建立的第二神经网络处理目标信息，获得目标结果数据，并将该目标结果数据回传至中央处理器，令电子设备根据目标结果数据执行对应的操作。其中，第二神经网络是根据网络信息建立的神经网络，网络信息指示第一神经网络的网络结构，克服了本领域中通常通过中央处理器建立并解析神经网络、频繁调用专用处理芯片执行神经网络中的算子计算所带来的时间开销大和计算效率低的问题，在不改变硬件架构的基础上提高了神经网络的运算效率。In an electronic device including a dedicated processing chip, the embodiments of the present application can instruct the central processing unit to receive the information to be processed by the first neural network, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data, and send the target result data back to the central processing unit, so that the electronic device performs the corresponding operation according to the target result data. The second neural network is a neural network established according to network information, and the network information indicates the network structure of the first neural network. This overcomes the problems of large time overhead and low calculation efficiency caused by the common practice in the art of establishing and parsing the neural network on the central processing unit and frequently calling the dedicated processing chip to execute the operator calculations in the network, and improves the calculation efficiency of the neural network without changing the hardware architecture.
示例性地，本申请实施例所示的执行操作的方法，可以应用在电子设备中，该电子设备具备显示屏且具备运算功能。电子设备可以包括手机、平板电脑、膝上型电脑、台式电脑、电脑一体机、服务器、工作站、电视、机顶盒、智能眼镜、智能手表、数码相机、MP4播放电子设备、MP5播放电子设备、学习机、点读机、电纸书、电子词典或车载电子设备等。Exemplarily, the method for performing operations shown in the embodiments of the present application can be applied to an electronic device that has a display screen and a computing function. The electronic device may include a mobile phone, a tablet computer, a laptop computer, a desktop computer, an all-in-one computer, a server, a workstation, a television, a set-top box, smart glasses, a smart watch, a digital camera, an MP4 player device, an MP5 player device, a learning machine, a point-reading machine, an e-book reader, an electronic dictionary, or an in-vehicle electronic device, etc.
请参见图1，图1是本申请一个示例性实施例提供的一种电子设备的结构框图，如图1所示，该电子设备包括处理器120、存储器140和总线160，所述存储器140中存储有至少一条指令，所述指令由所述处理器120加载并执行以实现如本申请各个方法实施例所述的执行操作的方法。其中，处理器120包括中央处理器121和专用处理芯片122。需要说明的是，中央处理器121中包括内存，专用处理芯片122中同样包括内存。Please refer to FIG. 1, which is a structural block diagram of an electronic device provided by an exemplary embodiment of the present application. As shown in FIG. 1, the electronic device includes a processor 120, a memory 140, and a bus 160. The memory 140 stores at least one instruction, and the instruction is loaded and executed by the processor 120 to implement the method for performing operations described in the method embodiments of the present application. The processor 120 includes a central processing unit 121 and a dedicated processing chip 122. It should be noted that the central processing unit 121 includes memory, and the dedicated processing chip 122 also includes memory.
处理器120可以包括一个或者多个处理核心。处理器120利用各种接口和线路连接整个电子设备100内的各个部分，通过运行或执行存储在存储器140内的指令、程序、代码集或指令集，以及调用存储在存储器140内的数据，执行电子设备100的各种功能和处理数据。可选的，处理器120可以采用数字信号处理器、现场可编程门阵列(Field-Programmable Gate Array，FPGA)、可编程逻辑阵列(Programmable Logic Array，PLA)中的至少一种硬件形式来实现。处理器120可集成中央处理器、图像处理器和调制解调器等中的一种或几种的组合。其中，CPU主要处理操作系统、用户界面和应用程序等；GPU用于负责显示屏所需要显示的内容的渲染和绘制；调制解调器用于处理无线通信。可以理解的是，上述调制解调器也可以不集成到处理器120中，单独通过一块芯片进行实现。The processor 120 may include one or more processing cores. The processor 120 uses various interfaces and lines to connect various parts of the entire electronic device 100, and performs various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 140 and calling data stored in the memory 140. Optionally, the processor 120 may be implemented in at least one hardware form of a digital signal processor, a field-programmable gate array (FPGA), or a programmable logic array (PLA). The processor 120 may integrate one or a combination of a central processing unit, an image processor, a modem, and the like. Among them, the CPU mainly handles the operating system, user interface, and application programs; the GPU is responsible for rendering and drawing the content to be displayed on the display screen; and the modem is used to handle wireless communication. It can be understood that the above-mentioned modem may not be integrated into the processor 120, but may be implemented by a separate chip.
存储器140可以包括随机存储器(Random Access Memory，RAM)，也可以包括只读存储器(Read-Only Memory，ROM)。可选的，该存储器140包括非瞬时性计算机可读介质(non-transitory computer-readable storage medium)。存储器140可用于存储指令、程序、代码、代码集或指令集。存储器140可包括存储程序区和存储数据区，其中，存储程序区可存储用于实现操作系统的指令、用于至少一个功能的指令(比如触控功能、声音播放功能、图像播放功能等)、用于实现下述各个方法实施例的指令等；存储数据区可存储下面各个方法实施例中涉及到的数据等。The memory 140 may include random access memory (RAM) or read-only memory (ROM). Optionally, the memory 140 includes a non-transitory computer-readable storage medium. The memory 140 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 140 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for at least one function (such as a touch function, a sound playback function, an image playback function, etc.), and instructions for implementing the following method embodiments; the data storage area may store the data involved in the following method embodiments.
总线160用于连接电子设备中各个硬件组件,便于各个硬件组件进行数据交互。在本申请实施例中,总线160用于连接处理器120和存储器140,以便上述两个硬件进行数据交换。The bus 160 is used to connect various hardware components in the electronic device to facilitate data interaction between various hardware components. In the embodiment of the present application, the bus 160 is used to connect the processor 120 and the memory 140, so that the above two pieces of hardware can exchange data.
请参考图2,图2是本申请一个示例性实施例提供的一种执行操作的方法的流程图。该执行操作的方法可以应用在上述图1所示的电子设备中。在图2中,该执行操作的方法包括:Please refer to FIG. 2, which is a flowchart of a method for performing operations according to an exemplary embodiment of the present application. This method of performing operations can be applied to the electronic device shown in FIG. 1 above. In Figure 2, the method of performing operations includes:
步骤210,指示中央处理器接收目标信息,目标信息是第一神经网络待处理的信息。Step 210: Instruct the central processing unit to receive target information, where the target information is information to be processed by the first neural network.
在本申请实施例中,电子设备作为整个方案的执行主体。在一些应用场景中,电子设备中的系统服务或者第三方应用将调用第一神经网络处理目标信息。当上述应用或服务需要调用第一神经网络时,电子设备将初始化第一神经网络。其中,电子设备能够指示中央处理器接收目标信息。目标信息可以是应用或服务发送至CPU的信息,该信息是第一神经网络待处理的信息。In the embodiment of the present application, the electronic device is used as the execution subject of the entire solution. In some application scenarios, the system service or third-party application in the electronic device will call the first neural network to process the target information. When the above application or service needs to call the first neural network, the electronic device will initialize the first neural network. Among them, the electronic device can instruct the central processing unit to receive target information. The target information may be information sent by the application or service to the CPU, and the information is the information to be processed by the first neural network.
例如,当第一神经网络是用于识别人脸的模型时,目标信息可以是相机应用采集的图像。For example, when the first neural network is a model for recognizing a human face, the target information may be an image collected by a camera application.
步骤220，指示专用处理芯片根据预先建立的第二神经网络处理目标信息，获取目标结果数据，第二神经网络是根据网络信息建立的神经网络，网络信息包括第一神经网络的网络结构和权值信息。Step 220: Instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data, where the second neural network is a neural network established according to network information, and the network information includes the network structure and weight information of the first neural network.
可选的,本申请实施例中的专用处理芯片将根据预先建立的第二神经网络,处理目标信息。其中,专用处理芯片可以在接收到CPU发出的网络信息时,在专用处理芯片侧建立第二神经网络。Optionally, the dedicated processing chip in the embodiment of the present application will process the target information according to the second neural network established in advance. Among them, the dedicated processing chip can establish a second neural network on the side of the dedicated processing chip when receiving the network information sent by the CPU.
例如,电子设备中内置有5种各不相同的第一神经网络,分别为A神经网络、B神经网络、C神经网络、D神经网络和E神经网络。每种神经网络具有指定的功能,详情请参见表一。For example, there are five different first neural networks built into electronic devices, namely A neural network, B neural network, C neural network, D neural network, and E neural network. Each neural network has a specific function, please refer to Table 1 for details.
表一Table I
A神经网络A neural network B神经网络B neural network C神经网络C neural network D神经网络D neural network E神经网络E neural network
人脸检测Face Detection 车辆号牌识别Vehicle number plate recognition 知识问答Quiz 商品识图Commodity Recognition Map 终端模式判定Terminal mode determination
在表一所示的5种不同的第一神经网络中,当电子设备执行相应的功能时,处理器中将初始化相应的第一神经网络并处理数据。例如,电子设备中以相机应用开启作为A神经网络初始化的触发事件。当电子设备中的相机应用开启时,电子设备指示CPU加载A神经网络的信息。此时,CPU将A神经网络的网络信息发送至专用处理芯片,专用处理芯片将根据网络信息建立A神经网络对应的第二神经网络。In the five different first neural networks shown in Table 1, when the electronic device executes the corresponding function, the processor will initialize the corresponding first neural network and process the data. For example, in the electronic device, the camera application is started as the triggering event of A neural network initialization. When the camera application in the electronic device is started, the electronic device instructs the CPU to load the information of the A neural network. At this time, the CPU sends the network information of the A neural network to the special processing chip, and the special processing chip will establish a second neural network corresponding to the A neural network based on the network information.
请参见图3，图3是本申请实施例涉及的一种相关技术中运算方式的示意图。在图3中，包括中央处理器121和专用处理芯片122。中央处理器121构建了包括n个算子的第一神经网络，并在调用操作310中对专用处理芯片122进行初始化。在中央处理器121运用第一神经网络处理目标信息时，每一个算子进行运算时，需要调用一次专用处理芯片122。即在算子计算操作320中，中央处理器121调用专用处理芯片122一共n次。在该过程中，电子设备为得到结果数据，共令中央处理器121调用专用处理芯片122(n+1)次。Please refer to FIG. 3, which is a schematic diagram of a calculation method in a related technology involved in an embodiment of the present application. FIG. 3 includes a central processing unit 121 and a dedicated processing chip 122. The central processing unit 121 constructs a first neural network including n operators, and initializes the dedicated processing chip 122 in a calling operation 310. When the central processing unit 121 uses the first neural network to process the target information, the dedicated processing chip 122 needs to be called once for the calculation of each operator. That is, in the operator calculation operation 320, the central processing unit 121 calls the dedicated processing chip 122 a total of n times. In this process, in order to obtain the result data, the electronic device causes the central processing unit 121 to call the dedicated processing chip 122 a total of (n+1) times.
请参见图4，图4是本申请实施例涉及的一种运算方式的示意图。在图4中，包括中央处理器121和专用处理芯片122。中央处理器121可以执行步骤410和步骤420。在步骤410中，中央处理器121在接收到网络构建指令时，解析第一神经网络，获得网络信息。其中，网络信息用于指示第一神经网络的图形结构信息和权值信息。在步骤420中，中央处理器121向专用处理芯片122发送网络信息。在步骤430中，专用处理芯片122根据网络信息建立第二神经网络。在步骤440中，中央处理器121处理目标信息时，仅需将目标信息和运行指令一并发送专用处理芯片122，完成一次运行调用。中央处理器121即可获得目标结果数据。在该过程中，电子设备为得到结果数据，共令中央处理器121调用专用处理芯片122的次数是2次。Please refer to FIG. 4, which is a schematic diagram of a calculation method involved in an embodiment of the present application. FIG. 4 includes a central processing unit 121 and a dedicated processing chip 122. The central processing unit 121 may execute step 410 and step 420. In step 410, when the central processing unit 121 receives a network construction instruction, it parses the first neural network to obtain the network information, where the network information is used to indicate the graph structure information and weight information of the first neural network. In step 420, the central processing unit 121 sends the network information to the dedicated processing chip 122. In step 430, the dedicated processing chip 122 establishes the second neural network according to the network information. In step 440, when the central processing unit 121 processes the target information, it only needs to send the target information together with a running instruction to the dedicated processing chip 122 to complete one running call, after which the central processing unit 121 obtains the target result data. In this process, in order to obtain the result data, the electronic device causes the central processing unit 121 to call the dedicated processing chip 122 only twice.
由上述图3和图4运行情况的对比可知，本申请实施例提供的执行操作的方法，能够有效减少CPU调用专用处理芯片的次数，使得电子设备在运行相同的第一神经网络时，缩短得出目标结果数据的时长。It can be seen from the comparison between the operating conditions of FIG. 3 and FIG. 4 that the method for performing operations provided by the embodiments of the present application can effectively reduce the number of times the CPU calls the dedicated processing chip, so that when the electronic device runs the same first neural network, the time required to obtain the target result data is shortened.
步骤230,将目标结果数据传回中央处理器。In step 230, the target result data is transmitted back to the central processing unit.
在本申请实施例中,电子设备能够在专用处理芯片计算得到目标结果数据后,将目标结果数据传回到中央处理器中。In the embodiment of the present application, the electronic device can transmit the target result data back to the central processing unit after the dedicated processing chip calculates the target result data.
步骤240,根据目标结果数据,执行对应的操作。Step 240: Perform a corresponding operation according to the target result data.
在本申请实施例中,电子设备还能够根据目标结果数据,执行对应的应用操作。其中,对应的应用操作既可以是可视的图形显示操作,也可以是后台不可视的数据处理流程,本申请实施例对此不作限定。In the embodiment of the present application, the electronic device can also execute the corresponding application operation according to the target result data. Wherein, the corresponding application operation may be a visible graphic display operation, or a data processing flow that is not visible in the background, which is not limited in the embodiment of the present application.
在一种可能的实现方式中,电子设备可以在人脸识别场景中,在图像中显示被识别出来的人脸区域。In a possible implementation manner, the electronic device may display the recognized face area in the image in the face recognition scene.
在另一种可能的实现方式中,电子设备还可以播放合成后的人工智能语音或显示识别出的文本。In another possible implementation manner, the electronic device can also play the synthesized artificial intelligence speech or display the recognized text.
综上所述，本实施例提供的执行操作的方法，能够在包括专用处理芯片的电子设备中，指示中央处理器接收第一神经网络待处理的信息，指示专用处理芯片根据预先建立的第二神经网络处理目标信息，获得目标结果数据，并将该目标结果数据回传至中央处理器。其中，第二神经网络是根据网络信息建立的神经网络，网络信息指示第一神经网络的网络结构，克服了本领域中通常通过中央处理器建立并解析神经网络、频繁调用专用处理芯片执行神经网络中的算子计算所带来的时间开销大和计算效率低的问题，在不改变硬件架构的基础上提高了神经网络的运算效率。In summary, the method for performing operations provided in this embodiment can, in an electronic device including a dedicated processing chip, instruct the central processing unit to receive the information to be processed by the first neural network, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data, and send the target result data back to the central processing unit. The second neural network is a neural network established according to network information, and the network information indicates the network structure of the first neural network. This overcomes the problems of large time overhead and low calculation efficiency caused by the common practice in the art of establishing and parsing the neural network on the central processing unit and frequently calling the dedicated processing chip to execute the operator calculations in the network, and improves the calculation efficiency of the neural network without changing the hardware architecture.
基于上一个实施例所公开的方案，电子设备还能够在专用处理芯片中建立第二神经网络，从而减少CPU与专用处理芯片之间的调用次数，请参考如下实施例。Based on the solution disclosed in the previous embodiment, the electronic device can also establish the second neural network in the dedicated processing chip, thereby reducing the number of calls between the CPU and the dedicated processing chip. Please refer to the following embodiment.
请参见图5,图5是本申请另一个示例性实施例提供的一种执行操作的方法流程图。该执行操作的方法可以应用在上述图1所示的电子设备中。在图5中,该执行操作的方法包括:Please refer to FIG. 5, which is a flowchart of a method for performing operations according to another exemplary embodiment of the present application. This method of performing operations can be applied to the electronic device shown in FIG. 1 above. In Figure 5, the method of performing operations includes:
步骤511,当接收到网络构建指令时,解析第一神经网络,获得网络信息,网络信息用于指示第一神经网络的图形结构信息和权值信息。Step 511: When a network construction instruction is received, the first neural network is parsed to obtain network information, and the network information is used to indicate the graph structure information and weight information of the first neural network.
示意性的，电子设备能够在应用启动时或者功能被调用时，触发网络构建指令。此时，电子设备将指示CPU解析指定的第一神经网络，获得网络信息。其中，网络信息包括图形结构信息和权值信息。Illustratively, the electronic device can trigger the network construction instruction when an application is started or when a function is called. At this time, the electronic device will instruct the CPU to parse the designated first neural network to obtain the network information, where the network information includes graph structure information and weight information.
步骤512,指示专用处理芯片根据网络信息建立第二神经网络。Step 512: Instruct the dedicated processing chip to establish a second neural network according to the network information.
示意性的,电子设备能够指示专用处理芯片根据网络信息,建立第二神经网络。Illustratively, the electronic device can instruct the dedicated processing chip to establish the second neural network based on the network information.
在本申请实施例中,电子设备还可以通过执行步骤(a1)、步骤(a2)和步骤(a3)来实现指示专用处理芯片根据网络信息建立第二神经网络的流程。介绍如下:In the embodiment of the present application, the electronic device may also implement step (a1), step (a2), and step (a3) to implement the process of instructing the dedicated processing chip to establish the second neural network based on the network information. The introduction is as follows:
步骤(a1),在专用处理芯片中获取预定存储空间的全局内存。Step (a1): Obtain a global memory of a predetermined storage space in a dedicated processing chip.
可选地,电子设备能够在专用处理芯片中获得预定大小的预定存储空间。并且,该预定存储空间是全局内存,从而有利于专用处理芯片中的各个组件顺利访问到网络信息。Optionally, the electronic device can obtain a predetermined storage space of a predetermined size in a dedicated processing chip. In addition, the predetermined storage space is a global memory, so that each component in the dedicated processing chip can smoothly access network information.
Step (a2): Store the network information in the global memory according to the data specification of the dedicated processing chip.
Optionally, so that each component in the dedicated processing chip can access the network information smoothly, the electronic device may store the network information according to the data specification of the dedicated chip.
Optionally, the data specification is the definition of the second neural network in the dedicated processing chip. Under this definition, the second neural network comprises operator definitions and tensor definitions, as follows:
Figure PCTCN2021085028-appb-000001
Figure PCTCN2021085028-appb-000002
Here, NetDef denotes the entire second neural network, which comprises several tensors and operators. For the tensor dimensions dims, the values 1, 224, 224, 3 denote BATCH=1, HEIGHT=224, WIDTH=224, and CHANNEL=3 in the matrix dimensions.
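The NetDef/Tensor/Operator listing itself is only referenced above as figures. The following is a hypothetical reconstruction of such a specification from the surrounding description: a network is a set of tensors plus a set of operators, each operator has several inputs and one output, and a tensor's dims follow the order BATCH, HEIGHT, WIDTH, CHANNEL. All class and field names are assumptions for illustration.

```python
# Illustrative data specification, not the patent's actual chip format.
from dataclasses import dataclass, field

@dataclass
class Tensor:
    name: str
    dims: list   # e.g. [1, 224, 224, 3] -> BATCH=1, HEIGHT=224, WIDTH=224, CHANNEL=3

@dataclass
class Operator:
    name: str
    op_type: str   # e.g. "Conv2D"
    inputs: list   # names of input tensors (weight tensors are also inputs)
    output: str    # name of the single output tensor

@dataclass
class NetDef:
    tensors: dict = field(default_factory=dict)    # name -> Tensor
    operators: list = field(default_factory=list)  # ordered Operator list

net = NetDef()
net.tensors["input"] = Tensor("input", [1, 224, 224, 3])
net.operators.append(Operator("conv1", "Conv2D", ["input", "w1"], "t1"))
```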
In this embodiment of the present application, the dedicated processing chip rebuilds the second neural network from the network information of the first neural network, in the format of the data specification described above.
In one way of concatenating the operators of the second neural network, the electronic device obtains the graph structure information and the weight information from the network information; determines the input tensors and output tensors of each operator according to the graph structure information; concatenates the operators in the second neural network according to the identifiers of the input tensors and output tensors of each operator; and determines the corresponding convolution kernels according to the weight information, each convolution kernel being an input tensor of its corresponding operator.
It should be noted that the graph structure information can indicate which operator's output tensor is another operator's input tensor. The electronic device can determine the positions of each operator's input and output tensors according to this graph structure information; the graph structure information therefore helps the operators to be concatenated with one another.
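The concatenation step described above can be sketched as follows: each operator's output-tensor identifier is matched against other operators' input-tensor identifiers to wire the graph together. The dict-based operator representation is an assumption for illustration only.

```python
# Minimal sketch of concatenating operators by tensor identifiers.
def chain_operators(operators):
    """Return {producer_name: [consumer_name, ...]} edges of the graph."""
    # Which operator produces each tensor identifier.
    producers = {op["output"]: op["name"] for op in operators}
    edges = {op["name"]: [] for op in operators}
    for op in operators:
        for tensor_id in op["inputs"]:
            if tensor_id in producers:            # tensor produced upstream
                edges[producers[tensor_id]].append(op["name"])
    return edges

ops = [
    {"name": "conv1", "inputs": ["image"], "output": "t1"},
    {"name": "relu1", "inputs": ["t1"], "output": "t2"},
    {"name": "conv2", "inputs": ["t2"], "output": "t3"},
]
edges = chain_operators(ops)   # wires conv1 -> relu1 -> conv2
```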
In a possible manner, the dedicated processing chip can also fuse a first operator and a second operator in the second neural network when the operators satisfy a fusion condition. After the first operator and the second operator are fused, the electronic device can change the name of the output tensor of the first operator to the name of the output tensor of the second operator and, at the same time, update the definition of the first operator in the second neural network according to the information of the fused operator.
On the other hand, when operators in the second neural network cannot be fused with one another, the dedicated processing chip retains each operator separately.
In one execution mode, when the dedicated processing chip finishes building a first operator, it checks whether the first operator satisfies the condition for fusion with a second operator, the second operator being an operator whose construction in the second neural network has already been completed. When the first operator satisfies the condition for fusion with the second operator, the first operator and the second operator are fused, and the network information in the global memory is updated according to the fused operator.
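The fusion rule above can be sketched as follows. The fusibility test used here (a convolution followed by a ReLU that directly consumes its output) is an illustrative assumption; the patent does not specify which operator pairs are fusible.

```python
# Hypothetical sketch: fuse `second` into `first` when the pair is fusible,
# taking over the second operator's output-tensor name and updating the
# first operator's definition.

FUSIBLE = {("Conv2D", "ReLU")}   # assumed fusion table, for illustration

def try_fuse(first, second):
    """Return the fused operator, or None when fusion is not possible."""
    if (first["op_type"], second["op_type"]) not in FUSIBLE:
        return None
    if second["inputs"] != [first["output"]]:
        return None                           # not directly chained
    fused = dict(first)
    fused["op_type"] = first["op_type"] + "+" + second["op_type"]
    fused["output"] = second["output"]        # rename the output tensor
    return fused

conv = {"name": "conv1", "op_type": "Conv2D", "inputs": ["x", "w"], "output": "t1"}
relu = {"name": "relu1", "op_type": "ReLU", "inputs": ["t1"], "output": "t2"}
fused = try_fuse(conv, relu)
```

After a successful fusion, the fused definition would replace the first operator's entry in the global-memory network information.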
Step (a3): Instruct the dedicated processing chip to build the second neural network according to the network information.
In another possible implementation, the electronic device may also implement the process of instructing the dedicated processing chip to build the second neural network according to the network information by performing steps (b1), (b2), and (b3), described as follows:
Step (b1): Obtain the graph structure information and the weight information from the network information.
Step (b2): Concatenate the operators in the second neural network according to the graph structure information.
Step (b3): Determine the weights between the operators according to the weight information.
It should be noted that, since the second neural network occupies a certain amount of storage space, this embodiment of the present application releases the storage space of the dedicated processing chip occupied by the second neural network in a corresponding manner when the process that invokes the first neural network is logged off, in order to improve the operating efficiency of the dedicated processing chip.
In one possible implementation, in response to the logout of the process that invokes the first neural network, the electronic device deletes the information of the second neural network from the dedicated processing chip.
In another possible implementation, in response to the logout of the process that invokes the first neural network, the electronic device stores the information of the second neural network from the dedicated processing chip locally. The local storage area may be a non-volatile storage medium, including an external memory of the electronic device. Once the information of the second neural network has been stored locally on the electronic device, the electronic device deletes the information of the second neural network from the dedicated processing chip. It should be noted that the information of the second neural network includes the network information and all the information in the second neural network.
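The two release strategies above can be sketched together: on logout of the invoking process, either delete the second network from the chip outright, or persist it to local non-volatile storage first and then delete it. The class and parameter names (ChipCache, persist_dir) are illustrative assumptions.

```python
# Sketch of releasing the second neural network's chip storage on process logout.
import json
import os
import tempfile

class ChipCache:
    def __init__(self):
        self.networks = {}                    # network name -> info held on chip

    def on_process_logout(self, name, persist_dir=None):
        info = self.networks.get(name)
        if info is None:
            return
        if persist_dir is not None:           # strategy 2: store locally first
            path = os.path.join(persist_dir, name + ".json")
            with open(path, "w") as f:
                json.dump(info, f)
        del self.networks[name]               # both strategies end in deletion

cache = ChipCache()
cache.networks["second_net"] = {"operators": 3}
with tempfile.TemporaryDirectory() as d:
    cache.on_process_logout("second_net", persist_dir=d)
    saved = os.path.exists(os.path.join(d, "second_net.json"))
```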
Step 520: Receive target information.
In this embodiment of the present application, the execution of step 520 is the same as that of step 210 and is not repeated here.
Step 531: Determine the number of operator layers of the first neural network according to the graph structure information.
Step 532: When the number of operator layers is greater than or equal to a layer-count threshold, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain target result data.
In this embodiment of the present application, the electronic device can decide, according to the number of operator layers, whether to enable the operation execution method shown in this application: when the number of operator layers is greater than or equal to the layer-count threshold, the method is started. Illustratively, the layer-count threshold may be 10, 15, 20, 50, or the like, which is not limited in this embodiment of the present application.
It should be noted that when the number of operator layers of the first neural network exceeds the layer-count threshold, the structure of the first neural network is relatively complex, and using a traditional architecture would lead to a long computation time. By setting the layer-count threshold, this application ensures that the first neural network adopts the operation execution method provided herein when it needs to be optimized.
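The gate described above is a simple comparison. A minimal sketch, using 20 as the threshold (one of the example values given in the text):

```python
# Offload to the dedicated chip only when the operator layer count derived
# from the graph structure information reaches the threshold.
LAYER_THRESHOLD = 20   # example value from the text; 10, 15, or 50 also cited

def should_offload(num_operator_layers: int, threshold: int = LAYER_THRESHOLD) -> bool:
    """True when the second-network path on the dedicated chip should be used."""
    return num_operator_layers >= threshold
```

Networks below the threshold would simply continue on the traditional CPU path.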
Step 541: When the target result data is image recognition data, display the framed result region in the recognized target image.
Step 542: When the target result data is speech recognition data, play the synthesized artificial-intelligence speech or display the recognized text.
Optionally, in one possible application scenario, the electronic device can implement the operation execution method introduced in this embodiment of the present application by performing steps (c1), (c2), (c3), and (c4), described as follows:
Step (c1): The electronic device parses the first neural network on the CPU side. After the first neural network is input into the deep learning computation framework of the electronic device, the model is first parsed on the CPU side; the parsed content includes the graph structure of the model and the weight data of the model. Referring to FIG. 6, FIG. 6 is a graph structure of a first neural network provided based on the embodiment shown in FIG. 5. In FIG. 6, the first neural network 600 includes an operator 610, an operator 620, and an operator 630. The operator 610 includes an input tensor 611 and an output tensor 612; the operator 620 includes an input tensor 621 and an output tensor 622; the operator 630 includes an input tensor 631 and an output tensor 632. The first neural network 600 is formed by concatenating several operators, and each operator has several inputs and one output. It should be noted that, except for the operator that produces the final output, the output of every other operator must be an input of some other specified operator. The weight data is the data saved when the first neural network completed training. In a possible manner, the weight data may be convolution kernels. Illustratively, the weight data can serve as an input of an operator.
Step (c2): The CPU side transfers the network information to the dedicated processing chip. The CPU transfers the parsed graph structure and weight data of the first neural network to the dedicated-processing-chip side in a single pass, because the way the dedicated processing chip expresses the first neural network differs from the CPU's representation. In this embodiment of the present application, each time the dedicated processing chip builds an operator while constructing the second neural network, it completes the fusion for that operator. In other words, this embodiment completes the concatenation between operators in a manner that the computation framework of the dedicated processing chip can understand, until all operators have been fused, thereby constructing the second neural network.
Step (c3): The CPU side sends a run instruction to the dedicated processing chip so that the dedicated-processing-chip side completes the network inference. After the dedicated processing chip finishes constructing the network, the electronic device has the CPU send a single run instruction to the dedicated processing chip, and the chip (for example, a GPU) computes the target result data directly through the second neural network.
Step (c4): The dedicated-processing-chip side returns the target result data of the second neural network to the CPU side. In this step, after the dedicated processing chip obtains the target result data, it needs to transfer the computation result to the CPU side only once.
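The single-transfer flow of steps (c1) through (c4) can be sketched end to end. All classes below are stand-ins invented for illustration; no real chip or framework API is implied. The point is the call pattern: one parse, one network-information transfer, one run instruction, one result transfer.

```python
# Hypothetical end-to-end sketch of steps (c1)-(c4).
class CPU:
    def parse(self, model):
        # (c1) parse the first network once: graph structure + weight data
        return {"graph": model["graph"], "weights": model["weights"]}

class DedicatedChip:
    def build(self, net_info):
        # (c2) one transfer of the network information; build the second network
        self.net = net_info
        return self

    def run(self, target):
        # (c3) one run instruction triggers the whole inference on-chip
        return {"input": target, "layers_run": len(self.net["graph"])}

model = {"graph": ["conv", "relu", "fc"], "weights": {"conv": [0.5]}}
cpu, chip = CPU(), DedicatedChip()
chip.build(cpu.parse(model))
result = chip.run("image.png")   # (c4) the result crosses back to the CPU once
```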
In summary, in this embodiment, after the central processing unit parses out the network information of the first neural network, a second neural network is built on the dedicated-processing-chip side according to that network information, and the built second neural network can be recognized by the dedicated processing chip. During construction, the dedicated processing chip can place the network information in global memory of a predetermined storage space, so that this embodiment of the present application can effectively build, in the dedicated processing chip, a second neural network that the chip can recognize, improving the stability with which the electronic device runs on the basis of a neural network.
The operation execution method provided in this embodiment can also effectively reduce frequent calls between the CPU and the dedicated processing chip, as well as memory relocation operations, in scenarios where the neural network has many layers.
The following are apparatus embodiments of this application, which can be used to execute the method embodiments of this application. For details not disclosed in the apparatus embodiments, please refer to the method embodiments of this application.
Please refer to FIG. 7, which is a structural block diagram of an apparatus for executing operations according to an exemplary embodiment of this application. The apparatus can be implemented as all or part of an electronic device through software, hardware, or a combination of the two. The apparatus includes:
an information receiving module 710, configured to instruct the central processing unit to receive target information, the target information being information to be processed by a first neural network;
a data obtaining module 720, configured to instruct the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, the second neural network being a neural network built according to network information, and the network information indicating the network structure of the first neural network;
a data return module 730, configured to return the target result data to the central processing unit; and
an operation execution module 740, configured to execute a corresponding operation according to the target result data.
In an optional embodiment, the apparatus further includes a network parsing module and a network construction instruction module. The network parsing module is configured to parse the first neural network to obtain the network information when a network construction instruction is received, the network information including graph structure information and weight information of the first neural network. The network construction instruction module is configured to instruct the dedicated processing chip to build the second neural network according to the network information.
In an optional embodiment, the network construction instruction module is configured to: obtain global memory of a predetermined storage space in the dedicated processing chip; store the network information in the global memory according to the data specification of the dedicated processing chip; and instruct the dedicated processing chip to build the second neural network according to the network information.
In an optional embodiment, the network construction instruction module is configured to: obtain the graph structure information and the weight information from the network information; determine the input tensors and output tensors of each operator according to the graph structure information; concatenate the operators in the second neural network according to the identifiers of the input tensors and output tensors of each operator; and determine the corresponding convolution kernels according to the weight information, each convolution kernel being an input tensor of its corresponding operator.
In an optional embodiment, the network construction instruction module is configured to: when the dedicated processing chip finishes building a first operator, detect whether the first operator satisfies the condition for fusion with a second operator, the second operator being an operator whose construction in the second neural network has already been completed; when the first operator satisfies the condition for fusion with the second operator, fuse the first operator and the second operator; and update the network information in the global memory according to the fused operator.
In an optional embodiment, the apparatus further includes a layer-count determining module, configured to determine the number of operator layers of the first neural network according to the graph structure information; the data obtaining module 720 is configured to instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data when the number of operator layers is greater than or equal to the layer-count threshold.
In an optional embodiment, the apparatus further includes a first deletion module, configured to delete the information of the second neural network from the dedicated processing chip in response to the logout of the process that invokes the first neural network.
In an optional embodiment, the apparatus further includes a second deletion module, configured to, in response to the logout of the process that invokes the first neural network, store the information of the second neural network from the dedicated processing chip locally and delete the information of the second neural network from the dedicated processing chip.
In an optional embodiment, the data return module 730 is configured to display the framed result region in the recognized target image when the target result data is image recognition data; or the data return module 730 is configured to play the synthesized artificial-intelligence speech or display the recognized text when the target result data is speech recognition data.
In an optional embodiment, the dedicated processing chip involved in the apparatus includes at least one of the following: an image processor, a digital signal processor, a neural network processor, a tensor processor, a deep learning processor, or a brain processor.
In summary, in this embodiment, after the central processing unit parses out the network information of the first neural network, a second neural network is built on the dedicated-processing-chip side according to that network information, and the built second neural network can be recognized by the dedicated processing chip. During construction, the dedicated processing chip can place the network information in global memory of a predetermined storage space, so that this embodiment of the present application can effectively build, in the dedicated processing chip, a second neural network that the chip can recognize, improving the stability with which the electronic device runs on the basis of a neural network.
The apparatus for executing operations provided in this embodiment can also effectively reduce frequent calls and memory relocation between the CPU and the dedicated processing chip in scenarios where the neural network has many layers.
An embodiment of this application further provides a computer-readable medium storing at least one instruction, where the at least one instruction is loaded and executed by the processor to implement the operation execution method described in the above embodiments.
It should be noted that when the apparatus for executing operations provided in the above embodiments executes the operation execution method provided in this application, the division into the functional modules above is used only as an example. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments for executing operations provided above belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not repeated here.
The serial numbers of the above embodiments of this application are for description only and do not represent the superiority or inferiority of the embodiments.
A person of ordinary skill in the art can understand that all or part of the steps of the above embodiments can be completed by hardware, or by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only exemplary implementable embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall be included within the scope of protection of this application.

Claims (20)

  1. A method for executing operations, applied to an electronic device, the electronic device comprising a dedicated processing chip, the method comprising:
    instructing a central processing unit to receive target information, the target information being information to be processed by a first neural network;
    instructing the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, the second neural network being a neural network built according to network information, the network information comprising the network structure and weight information of the first neural network;
    returning the target result data to the central processing unit; and
    executing a corresponding operation according to the target result data.
  2. The method according to claim 1, wherein before the instructing the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data, the method further comprises:
    when a network construction instruction is received, parsing the first neural network to obtain the network information, the network information comprising graph structure information and weight information of the first neural network; and
    instructing the dedicated processing chip to build the second neural network according to the network information.
  3. The method according to claim 2, wherein the instructing the dedicated processing chip to build the second neural network according to the network information comprises:
    obtaining global memory of a predetermined storage space in the dedicated processing chip;
    storing the network information in the global memory according to the data specification of the dedicated processing chip; and
    instructing the dedicated processing chip to build the second neural network according to the network information.
  4. The method according to claim 3, wherein the instructing the dedicated processing chip to build the second neural network according to the network information comprises:
    obtaining the graph structure information and the weight information from the network information;
    determining the input tensors and output tensors of each operator according to the graph structure information;
    concatenating the operators in the second neural network according to the identifiers of the input tensors and the output tensors of each operator; and
    determining the corresponding convolution kernels according to the weight information, each convolution kernel being an input tensor of its corresponding operator.
  5. The method according to claim 4, wherein the concatenating the operators in the second neural network according to the identifiers of the input tensors and the output tensors of each operator comprises:
    when the dedicated processing chip finishes building a first operator, detecting whether the first operator satisfies a condition for fusion with a second operator, the second operator being an operator whose construction in the second neural network has already been completed;
    when the first operator satisfies the condition for fusion with the second operator, fusing the first operator and the second operator; and
    updating the network information in the global memory according to the fused operator.
  6. The method according to claim 4, further comprising:
    determining the number of operator layers of the first neural network according to the graph structure information; and
    when the number of operator layers is greater than or equal to a layer-count threshold, performing the step of instructing the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data.
  7. The method according to claim 4, further comprising:
    in response to the logout of the process that invokes the first neural network, deleting the information of the second neural network from the dedicated processing chip.
  8. The method according to claim 4, further comprising:
    in response to the logout of the process that invokes the first neural network, storing the information of the second neural network from the dedicated processing chip locally, and deleting the information of the second neural network from the dedicated processing chip.
  9. The method according to any one of claims 1 to 6, further comprising:
    when the target result data is image recognition data, displaying the framed result region in the recognized target image;
    or,
    when the target result data is speech recognition data, playing the synthesized artificial-intelligence speech or displaying the recognized text.
  10. The method according to any one of claims 1 to 6, wherein the dedicated processing chip comprises at least one of the following:
    an image processor, a digital signal processor, a neural network processor, a tensor processor, a deep learning processor, or a brain processor.
  11. An apparatus for executing an operation, applied to an electronic device, the electronic device comprising a dedicated processing chip, the apparatus comprising:
    an information receiving module, configured to instruct a central processing unit to receive target information, the target information being information to be processed by a first neural network;
    a data obtaining module, configured to instruct the dedicated processing chip to process the target information according to a pre-established second neural network to obtain target result data, the second neural network being a neural network established according to network information, the network information indicating the network structure of the first neural network;
    a data return module, configured to return the target result data to the central processing unit; and
    an operation execution module, configured to execute a corresponding operation according to the target result data.
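The four modules of claim 11 form a receive → offload → return → act pipeline. The sketch below illustrates that control flow only; the class and method names are hypothetical, since the claim does not specify an API, and the "chip network" is stood in for by an ordinary callable.

```python
# Illustrative sketch of the claim-11 pipeline. All names are invented for
# the example; the "dedicated processing chip" is modeled as a callable.

class OperationExecutor:
    def __init__(self, chip_network):
        self.chip_network = chip_network  # pre-established second neural network

    def receive_information(self, target_info):
        # Information receiving module: the CPU takes in the target information.
        return target_info

    def obtain_result(self, target_info):
        # Data obtaining module: the dedicated chip runs the second network.
        return self.chip_network(target_info)

    def execute(self, target_info):
        # Data return + operation execution: act on the returned result data.
        result = self.obtain_result(self.receive_information(target_info))
        return f"operation for {result}"

executor = OperationExecutor(chip_network=str.upper)
print(executor.execute("face"))  # → operation for FACE
```

Note that the CPU only brokers the input and the result; the model itself runs entirely on the chip-side callable.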
  12. The apparatus according to claim 11, further comprising a network parsing module and a network construction instruction module;
    the network parsing module is configured to, when a network construction instruction is received, parse the first neural network to obtain the network information, the network information comprising graph structure information and weight information of the first neural network;
    the network construction instruction module is configured to instruct the dedicated processing chip to establish the second neural network according to the network information.
  13. The apparatus according to claim 12, wherein the network construction instruction module is configured to:
    acquire global memory of a predetermined storage space in the dedicated processing chip;
    store the network information in the global memory according to the data specification of the dedicated processing chip; and
    instruct the dedicated processing chip to establish the second neural network according to the network information.
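Storing the network information "according to the data specification of the dedicated processing chip" (claim 13) amounts to serializing the graph and weight records into a fixed binary layout before copying them into the chip's global memory. The field layout below is purely an assumption for illustration; the patent does not define the actual specification.

```python
import struct

# Hypothetical packing of network information into a flat little-endian
# buffer, as a stand-in for a chip's data specification. The record layout
# (counts followed by records) is invented for this example.

def pack_network_info(op_ids, weights):
    buf = bytearray()
    buf += struct.pack("<I", len(op_ids))   # graph-structure record count
    for op in op_ids:
        buf += struct.pack("<I", op)        # one record per operator
    buf += struct.pack("<I", len(weights))  # weight record count
    for w in weights:
        buf += struct.pack("<f", w)         # 32-bit float weights
    return bytes(buf)

blob = pack_network_info([1, 2, 3], [0.5, -0.25])
# 4-byte count + 3*4 op records + 4-byte count + 2*4 floats = 28 bytes
assert len(blob) == 28
```

A real driver would then DMA this buffer into the reserved global-memory region and signal the chip to begin building the second network from it.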
  14. The apparatus according to claim 13, wherein the network construction instruction module is configured to:
    acquire the graph structure information and the weight information in the network information;
    determine the input tensor and the output tensor of each operator according to the graph structure information;
    concatenate the operators in the second neural network according to the identifier of the input tensor and the identifier of the output tensor of each operator; and
    determine a corresponding convolution kernel according to the weight information, the convolution kernel being an input tensor of the corresponding operator.
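The concatenation step of claim 14 matches each operator's input tensor identifiers against the output tensor identifiers of the operators already built, which yields the edges of the graph; weight tensors (convolution kernels) appear simply as extra inputs with no producer. A minimal sketch, assuming a dictionary-based operator description that the patent does not itself define:

```python
# Sketch of operator concatenation by tensor identifier (claim 14).
# Tensors produced by no operator (e.g. the image input or a weight/kernel
# tensor) are left unwired. Data shapes are invented for the example.

def link_operators(ops):
    """ops: list of dicts with 'name', 'inputs' (tensor ids), 'outputs'."""
    producers = {}
    for op in ops:
        for t in op["outputs"]:
            producers[t] = op["name"]       # who produces each tensor id
    edges = []
    for op in ops:
        for t in op["inputs"]:
            if t in producers:              # internal tensor: wire it up
                edges.append((producers[t], op["name"]))
    return edges

ops = [
    {"name": "conv1", "inputs": ["img", "w1"], "outputs": ["t1"]},  # w1: kernel
    {"name": "relu1", "inputs": ["t1"], "outputs": ["t2"]},
    {"name": "conv2", "inputs": ["t2", "w2"], "outputs": ["out"]},
]
assert link_operators(ops) == [("conv1", "relu1"), ("relu1", "conv2")]
```

Treating kernels as ordinary input tensors, as the claim does, lets the same wiring logic serve both activations and weights.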
  15. The apparatus according to claim 14, wherein the network construction instruction module is configured to:
    when the dedicated processing chip completes construction of a first operator, detect whether the first operator meets a condition for fusion with a second operator, the second operator being an operator whose construction has been completed in the second neural network; when the first operator meets the condition for fusion with the second operator, fuse the first operator and the second operator; and update the network information in the global memory according to the fused operator.
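Claim 15 describes an incremental fusion check: as each operator finishes building, it is tested against an already-built neighbor and merged when a fusion condition holds. A classic instance is folding an elementwise activation into the preceding convolution. The fusion rule table below is an assumption for illustration only:

```python
# Sketch of the claim-15 fusion pass. The FUSABLE pairs are a hypothetical
# fusion condition; real chips define their own legal fusions.

FUSABLE = {("conv", "relu"), ("conv", "add")}

def fuse_pass(ops):
    fused, i = [], 0
    while i < len(ops):
        if i + 1 < len(ops) and (ops[i], ops[i + 1]) in FUSABLE:
            fused.append(f"{ops[i]}+{ops[i + 1]}")  # emit one fused kernel
            i += 2
        else:
            fused.append(ops[i])
            i += 1
    return fused

assert fuse_pass(["conv", "relu", "pool", "conv", "add"]) == [
    "conv+relu", "pool", "conv+add",
]
```

After fusion, the network information in global memory would be rewritten so the chip schedules the fused kernel instead of the two original operators, saving an intermediate-tensor round trip.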
  16. The apparatus according to claim 14, further comprising a layer number determining module;
    the layer number determining module is configured to determine the number of operator layers of the first neural network according to the graph structure information;
    the data obtaining module is configured to, when the number of operator layers is greater than or equal to a layer number threshold, instruct the dedicated processing chip to process the target information according to the pre-established second neural network to obtain the target result data.
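The threshold in claim 16 gates the offload decision: shallow networks may not amortize the cost of transferring data to the dedicated chip, so they stay on the CPU. A minimal dispatch sketch; the threshold value and function names are assumptions, not taken from the patent:

```python
# Hypothetical backend dispatch based on operator layer count (claim 16).
LAYER_THRESHOLD = 8  # assumed value for illustration

def choose_backend(num_layers, threshold=LAYER_THRESHOLD):
    # "greater than or equal to" in the claim, so the threshold is inclusive.
    return "dedicated_chip" if num_layers >= threshold else "cpu"

assert choose_backend(12) == "dedicated_chip"
assert choose_backend(8) == "dedicated_chip"
assert choose_backend(3) == "cpu"
```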
  17. The apparatus according to claim 14, further comprising a first deletion module, configured to:
    in response to the logout of the process that invokes the first neural network, delete the information of the second neural network from the dedicated processing chip.
  18. The apparatus according to claim 14, further comprising a second deletion module, configured to:
    in response to the logout of the process that invokes the first neural network, store the information of the second neural network in the dedicated processing chip locally, and delete the information of the second neural network from the dedicated processing chip.
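Claims 17 and 18 differ only in whether the chip-side network information is cached locally before deletion, so a later session can rebuild the second network without re-parsing the first. A sketch under invented storage details (the cache format and class names are assumptions):

```python
import json
import os
import tempfile

# Hypothetical session lifecycle for claims 17/18: on process logout the
# chip-side copy is always released; keep_local_copy=True adds the
# claim-18 behaviour of persisting the network information locally first.

class ChipSession:
    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        self.chip_networks = {}                 # name -> network information

    def build(self, name, info):
        self.chip_networks[name] = info         # stand-in for chip memory

    def logout(self, name, keep_local_copy=True):
        info = self.chip_networks.pop(name)     # delete from the chip
        if keep_local_copy:                     # claim-18 local cache
            path = os.path.join(self.cache_dir, name + ".json")
            with open(path, "w") as f:
                json.dump(info, f)
            return path
        return None                             # claim-17: delete only

with tempfile.TemporaryDirectory() as d:
    s = ChipSession(d)
    s.build("net_a", {"layers": 12})
    path = s.logout("net_a")
    assert "net_a" not in s.chip_networks       # removed from chip
    with open(path) as f:
        assert json.load(f) == {"layers": 12}   # cached locally
```

The cached copy trades local storage for a faster warm start the next time the same first neural network is invoked.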
  19. An electronic device, comprising a processor, a memory connected to the processor, and program instructions stored on the memory, wherein the processor, when executing the program instructions, implements the method for executing an operation according to any one of claims 1 to 10.
  20. A computer-readable storage medium storing program instructions, wherein the program instructions, when executed by a processor, implement the method for executing an operation according to any one of claims 1 to 10.
PCT/CN2021/085028 2020-05-18 2021-04-01 Method and apparatus for executing operation, electronic device, and storage medium WO2021232958A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010419847.9 2020-05-18
CN202010419847.9A CN111582459B (en) 2020-05-18 2020-05-18 Method for executing operation, electronic equipment, device and storage medium

Publications (1)

Publication Number Publication Date
WO2021232958A1 2021-11-25

Family

ID=72126875

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085028 WO2021232958A1 (en) 2020-05-18 2021-04-01 Method and apparatus for executing operation, electronic device, and storage medium

Country Status (3)

Country Link
CN (1) CN111582459B (en)
TW (1) TW202145079A (en)
WO (1) WO2021232958A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582459B (en) * 2020-05-18 2023-10-20 Oppo广东移动通信有限公司 Method for executing operation, electronic equipment, device and storage medium
CN112465116B (en) * 2020-11-25 2022-12-09 安徽寒武纪信息科技有限公司 Compiling method, operation method, electronic device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766939A (en) * 2017-11-07 2018-03-06 维沃移动通信有限公司 A kind of data processing method, device and mobile terminal
CN108122031A (en) * 2017-12-20 2018-06-05 杭州国芯科技股份有限公司 A low-power-consumption neural network accelerator architecture
WO2018124707A1 (en) * 2016-12-27 2018-07-05 삼성전자 주식회사 Input processing method using neural network computation, and apparatus therefor
CN108734288A (en) * 2017-04-21 2018-11-02 上海寒武纪信息科技有限公司 A kind of operation method and device
US20190180183A1 (en) * 2017-12-12 2019-06-13 Amazon Technologies, Inc. On-chip computational network
CN111582459A (en) * 2020-05-18 2020-08-25 Oppo广东移动通信有限公司 Method, electronic device, apparatus and storage medium for executing operation

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11494582B2 (en) * 2018-02-08 2022-11-08 Western Digital Technologies, Inc. Configurable neural network engine of tensor arrays and memory cells
CN109446996B (en) * 2018-10-31 2021-01-22 智慧眼科技股份有限公司 Face recognition data processing device and method based on FPGA
CN110489344A (en) * 2019-08-02 2019-11-22 Oppo广东移动通信有限公司 Engine test method and Related product
CN110782029B (en) * 2019-10-25 2022-11-22 阿波罗智能技术(北京)有限公司 Neural network prediction method and device, electronic equipment and automatic driving system
CN110942138B (en) * 2019-11-13 2022-02-15 华中科技大学 Deep neural network training method and system in hybrid memory environment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114819084A (en) * 2022-04-26 2022-07-29 北京百度网讯科技有限公司 Model reasoning method, device, equipment and storage medium
CN114819084B (en) * 2022-04-26 2024-03-01 北京百度网讯科技有限公司 Model reasoning method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111582459A (en) 2020-08-25
CN111582459B (en) 2023-10-20
TW202145079A (en) 2021-12-01

Similar Documents

Publication Publication Date Title
WO2021232958A1 (en) Method and apparatus for executing operation, electronic device, and storage medium
CN111368893B (en) Image recognition method, device, electronic equipment and storage medium
CN106951926B (en) Deep learning method and device of hybrid architecture
JP7389840B2 (en) Image quality enhancement methods, devices, equipment and media
CN111273953B (en) Model processing method, device, terminal and storage medium
CN111210005A (en) Equipment operation method and device, storage medium and electronic equipment
US10733481B2 (en) Cloud device, terminal device, and method for classifying images
CN113050860B (en) Control identification method and related device
CN112163577A (en) Character recognition method and device in game picture, electronic equipment and storage medium
CN113673613A (en) Multi-modal data feature expression method, device and medium based on contrast learning
CN115358404A (en) Data processing method, device and equipment based on machine learning model reasoning
CN117540205A (en) Model training method, related device and storage medium
CN111598923B (en) Target tracking method and device, computer equipment and storage medium
CN111598924B (en) Target tracking method and device, computer equipment and storage medium
US20230260245A1 (en) Image segmentation model quantization method and apparatus, computer device, and storage medium
CN116975336A (en) Image processing method, device, equipment and storage medium based on artificial intelligence
CN111813529A (en) Data processing method and device, electronic equipment and storage medium
US20230083565A1 (en) Image data processing method and apparatus, storage medium, and electronic device
US20200285955A1 (en) Method for accelerating deep learning and user terminal
CN115858556A (en) Data processing method and device, storage medium and electronic equipment
DE102022126287A1 (en) STORAGE AND RETRIEVAL OF UNSTRUCTURED DATA IN CONVERSATIONAL ARTIFICIAL INTELLIGENCE APPLICATIONS
CN112749364B (en) Webpage generation method, device, equipment and storage medium based on artificial intelligence
CN113989559A (en) Method, device and equipment for determining probability threshold of classification model and storage medium
CN112053013A (en) Information prediction method, device, equipment and computer readable storage medium
CN118051652A (en) Data processing method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21809051

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21809051

Country of ref document: EP

Kind code of ref document: A1