WO2021068249A1 - Method and apparatus for hardware simulation and emulation during running, and device and storage medium - Google Patents

Method and apparatus for hardware simulation and emulation during running, and device and storage medium

Info

Publication number
WO2021068249A1
WO2021068249A1 PCT/CN2019/110840
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
simulation
data
result
simulated
Prior art date
Application number
PCT/CN2019/110840
Other languages
French (fr)
Chinese (zh)
Inventor
李金鹏
黄炯凯
蔡权雄
牛昕宇
Original Assignee
深圳鲲云信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳鲲云信息科技有限公司
Priority to PCT/CN2019/110840 priority Critical patent/WO2021068249A1/en
Priority to CN201980067042.8A priority patent/CN113228056B/en
Publication of WO2021068249A1 publication Critical patent/WO2021068249A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the invention belongs to the field of artificial intelligence technology, and in particular relates to a runtime hardware simulation and emulation method, apparatus, device, and storage medium.
  • An artificial neural network (ANN), abbreviated as neural network (NN), is a mathematical or computational model that mimics the structure and function of biological neural networks (an animal's central nervous system, in particular the brain) and is used to estimate or approximate functions.
  • a neural network mainly consists of an input layer, hidden layers, and an output layer.
  • when there is only one hidden layer, the network is a two-layer neural network; since the input layer performs no transformation, it need not be counted as a separate layer.
  • in practice, each neuron in the input layer represents one feature, and the number of output-layer neurons equals the number of classification labels (for binary classification, a sigmoid classifier uses one output neuron, while a softmax classifier uses two); the number of hidden layers and of hidden-layer neurons is set manually.
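The layer sizing described above can be sketched as a minimal two-layer network. This is a hedged illustration, not part of the patent; the feature count, hidden-layer width, random weights, and ReLU hidden activation are all arbitrary assumptions made only to show the one-neuron sigmoid versus two-neuron softmax output layers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
n_features, n_hidden = 4, 8          # input neurons = number of features (assumed sizes)

# Hidden layer followed by a sigmoid output: 1 output neuron for binary classification.
w1, b1 = rng.normal(size=(n_features, n_hidden)), np.zeros(n_hidden)
w2, b2 = rng.normal(size=(n_hidden, 1)), np.zeros(1)

x = rng.normal(size=n_features)
h = np.maximum(0.0, x @ w1 + b1)     # hidden layer (ReLU activation, assumed)
p_sigmoid = sigmoid(h @ w2 + b2)     # shape (1,): probability of the positive class

# A softmax output for the same binary task uses 2 output neurons instead.
w2s = rng.normal(size=(n_hidden, 2))
p_softmax = softmax(h @ w2s)         # shape (2,): class probabilities summing to 1
```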
  • a neural network must be simulated and tested repeatedly from development through delivery.
  • simulation testing generally takes place before the neural network product is delivered: it reproduces the real operating environment of the neural network, with the software configured to its real state of use.
  • the embodiments of the present invention provide a runtime hardware simulation and emulation method, which aims to solve the problem of inconsistency between pure floating-point calculation results and hardware calculation results in existing neural network simulation methods.
  • the embodiment of the present invention is implemented as follows: a runtime hardware simulation and emulation method is provided, which includes the steps:
  • the step of obtaining the data to be simulated and quantizing the data to be simulated according to the quantization information to obtain the simulation input data includes:
  • the data to be simulated is floating-point type data
  • the neural network parameters are neural network parameters with a unit length of 8 bits
  • the step of quantizing the data to be simulated according to the quantization information specifically further includes:
  • the step of inputting the neural network parameters and the simulation input data to the neural network for convolution calculation, and obtaining the convolution result specifically includes:
  • the neural network parameters include bias parameters
  • the step of obtaining a simulation result based on the convolution result and outputting it specifically includes:
  • the step of obtaining a simulation result for output based on the convolution result includes:
  • the quantization information includes a scaling value
  • the step of dequantizing the convolution result according to the quantization information and the offset value to obtain a simulation result includes:
  • the present invention also provides a runtime hardware simulation simulation device, the device includes:
  • An obtaining module, used to obtain a neural network structure diagram and neural network parameters, the neural network structure diagram including quantization information
  • the construction module is used to simulate and construct a corresponding neural network according to the neural network structure diagram
  • the quantization module is used to obtain the data to be simulated, and quantize the data to be simulated according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same data type;
  • a calculation module configured to input the neural network parameters and the simulation input data to the neural network for convolution calculation to obtain a convolution result
  • the output module is used to obtain a simulation result for output based on the convolution result.
  • the quantization module is also used to obtain data to be simulated, and according to the quantization information, convert the data to be simulated into data to be simulated with a unit length of 8 bits to obtain simulated input data with a unit length of 8 bits.
  • the present invention also provides a computer device, including a memory and a processor, wherein a computer program is stored in the memory; when the processor executes the computer program, the steps of the runtime hardware simulation and emulation method described in any one of the embodiments of the present invention are implemented.
  • the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the runtime hardware simulation and emulation method described in any one of the embodiments of the present invention are implemented.
  • because the present invention quantizes the data to be simulated into the same hardware data type as the neural network parameters, the software simulation calculation is closer to the result of the hardware calculation; and because the hardware data type requires less computation than the floating-point type, the calculation speed of neural network simulation can also be improved.
  • FIG. 1 is a schematic flowchart of a runtime hardware simulation method provided by an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of another runtime hardware simulation method provided by an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of another runtime hardware simulation method provided by an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a hardware simulation device at runtime according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a specific flow of an output module 405 according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a specific flow of another output module 405 according to an embodiment of the present invention.
  • Fig. 7 is a schematic structural diagram of an embodiment of a computer device according to an embodiment of the present invention.
  • existing neural networks are simulated mainly through hardware simulation or software simulation.
  • hardware simulation is closer to the calculation logic of the neural network at runtime, but its calculation speed is extremely slow and it is not suitable for large-scale simulation testing, so software simulation is used more often.
  • the present invention quantizes the data to be simulated into the same hardware data type as the neural network parameters.
  • the simulation calculation is therefore closer to the result of the hardware calculation, and since the hardware data type requires less computation than the floating-point type, the calculation speed of neural network simulation can also be improved.
  • as shown in FIG. 1, which is a flowchart of an embodiment of the runtime hardware simulation method according to the present application.
  • the foregoing method of hardware simulation at runtime includes the steps:
  • the above-mentioned neural network structure diagram includes quantization information, and the quantization information includes the bit length to which the data is to be quantized.
  • the simulation software can quantize the data to be simulated according to the above-mentioned quantization information, for example, quantize it into 8-bit data.
  • the above-mentioned neural network structure diagram can be a neural network structure diagram of recognition type, such as face recognition, vehicle recognition, etc., or a neural network structure diagram of detection type, such as object detection, vehicle detection, etc.
  • the above-mentioned neural network structure diagram may also be a single-layer network structure diagram, such as a convolutional neural network corresponding to a convolutional layer.
  • the above-mentioned neural network structure diagram can be understood as a neural network structure, and further, can be understood as a neural network structure used for various neural network models.
  • the above-mentioned neural network structure uses layers as computing units, including but not limited to: convolutional layer, pooling layer, ReLU, fully connected layer, and so on.
  • the aforementioned neural network parameters refer to the parameters corresponding to each layer in the neural network structure, and may be weight parameters, bias parameters, and so on.
  • the above-mentioned various neural network models may be pre-trained neural network models. Since a model is pre-trained, the attributes of its neural network parameters are already trained; therefore, the neural network configured in the simulation software can be used directly with the configured neural network parameters, and there is no need to train the neural network. From a pre-trained neural network model, the neural network structure diagram and parameters can be described uniformly.
  • the neural network structure diagram and neural network parameters described above can be acquired locally or from a cloud server.
  • they can be stored locally and loaded automatically, or selected by the user, when used; alternatively, the neural network structure diagram and neural network parameters can be uploaded to a cloud server and downloaded over the network when in use.
  • S102 Simulate and construct a corresponding neural network according to the neural network structure diagram.
  • the neural network structure diagram is the neural network structure diagram obtained in step S101, and the obtained neural network structure diagram is simulated in the simulation software, so as to construct the corresponding neural network in the software.
  • S103 Obtain the data to be simulated, and quantize the data to be simulated according to the quantization information to obtain simulation input data.
  • the aforementioned data to be simulated is data input by the user.
  • the neural network is an image processing type neural network
  • the data input by the user is image data.
  • the above-mentioned quantization can be done by a compiler.
  • in the quantization formula, r refers to the floating-point value, which is the data input by the user; q refers to the quantized data; z is the offset value; and s is the scaling value. The values s and z are generated by the compiler.
  • the input data to be simulated is quantized by the compiler to obtain the quantized simulation input data.
  • the obtained simulation input data and neural network parameters are the same data type.
  • before quantization, the neural network parameters differ from the data to be simulated.
  • the neural network parameters are of the hardware data type, while the data to be simulated is of the floating-point data type. After being quantized by the compiler, the data to be simulated becomes simulation input data of the same data type as the neural network parameters.
  • the above steps of acquiring the data to be simulated, and quantizing the data to be simulated according to the quantization information to obtain the simulation input data include:
  • the neural network parameters are also 8-bit unit length data, and the neural network parameters include weight parameters and bias parameters.
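The 8-bit quantization step above can be sketched as follows, using the symbols the patent defines (r: floating-point value, q: quantized data, s: scaling value, z: offset value) with the standard affine mapping q = round(r / s) + z. The exact formula, rounding mode, clipping range, and the example scale and zero point are assumptions; in the patent, s and z come from the compiler:

```python
import numpy as np

def quantize(r, s, z):
    """Quantize float32 data r to 8-bit unit length: q = round(r / s) + z (assumed mapping)."""
    q = np.round(r / s) + z
    return np.clip(q, 0, 255).astype(np.uint8)  # clip to the unsigned 8-bit range

r = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)  # data to be simulated (floating point)
s, z = 1.0 / 127.0, 128       # illustrative scale and offset (normally compiler-generated)
q = quantize(r, s, z)         # simulation input data, same 8-bit type as the weights
```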
  • S104 Input the neural network parameters and the simulation input data to the neural network for convolution calculation, and obtain the convolution result.
  • the aforementioned neural network includes a convolutional layer, which can perform convolution calculations on the neural network parameters and simulation input data.
  • the convolution calculation process in the neural network is consistent with the floating-point calculation process; that is, data of the hardware data type is calculated through the floating-point calculation flow, so that the simulation calculation result is closer to the hardware calculation result.
  • the aforementioned convolution result is the convolution result calculated in step S104, and the aforementioned convolution result can be output as a simulation result after being processed by the activation layer.
  • if, after the convolution result is obtained, there is no activation layer, the convolution result is input to the next layer of the network for calculation; for example, a pooling layer pools the convolution result.
  • in this embodiment, a neural network structure diagram and neural network parameters are obtained, the structure diagram including quantization information; the corresponding neural network is constructed in simulation according to the structure diagram; the data to be simulated is obtained and quantized according to the quantization information, so that the simulation input data and the neural network parameters are of the same hardware data type; the neural network parameters and the simulation input data are input to the neural network for convolution calculation to obtain the convolution result; and the simulation result is obtained from the convolution result and output.
  • the simulation calculation is thus closer to the result of the hardware calculation, and since the hardware data type requires less computation than the floating-point type, the calculation speed of neural network simulation can also be improved.
  • as shown in FIG. 2, which is a flowchart of an embodiment of another runtime hardware simulation method according to the present application.
  • S202 Simulate and construct a corresponding neural network according to the neural network structure diagram.
  • S203 Obtain data to be simulated, and according to the quantization information, convert the data to be simulated into data to be simulated with a unit length of 8 bits to obtain simulation input data with a unit length of 8 bits.
  • the neural network parameters are also 8-bit unit length data, and the neural network parameters include weight parameters and bias parameters.
  • the integer data is the calculation data of the hardware, which makes the simulation calculation of the software closer to the calculation logic of the hardware, so that the simulation calculation result of the software is closer to the calculation result of the hardware.
  • S204 Input the int32 neural network parameters and the int32 simulation input data into the neural network for convolution calculation, and obtain the int32 convolution result.
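Steps S203–S204 can be sketched as follows: the 8-bit weights and the 8-bit input each have their offset value subtracted, yielding int32 data, which is then convolved entirely in integer arithmetic. This is a hedged sketch; the 1-D correlation, the example values, and the shared zero point of 128 are assumptions standing in for the patent's convolutional layer:

```python
import numpy as np

def conv_int32(q_x, z_x, q_w, z_w):
    """Subtract each offset to get int32 data, then convolve in integer math."""
    x = q_x.astype(np.int32) - z_x   # int32 simulation input data
    w = q_w.astype(np.int32) - z_w   # int32 neural network weight parameters
    # 1-D 'valid' sliding dot product (correlation, as in most NN "convolutions")
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)], dtype=np.int32)

q_x = np.array([130, 120, 140, 128], dtype=np.uint8)  # 8-bit input (illustrative)
q_w = np.array([129, 127], dtype=np.uint8)            # 8-bit weights (illustrative)
acc = conv_int32(q_x, 128, q_w, 128)                  # int32 convolution result
```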
  • the above-mentioned bias parameters are the bias parameters in the neural network parameters, and the above-mentioned process of adding the bias parameters is to perform addition calculation in the bias layer.
  • y is the output
  • x is the input
  • w*x is the convolution of the weight parameter and the input value
  • b is the bias parameter.
  • the bias result is activated by the activation function of the activation layer and then output.
  • the bias result will be input to the next calculation node for calculation.
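The bias step and the activation that follows can be sketched from the relation y = w*x + b given above: the int32 bias parameter is added to the int32 convolution result, and the sum passes through the activation layer. The choice of ReLU as the activation function is an assumption, since the patent does not fix a particular activation:

```python
import numpy as np

def bias_and_activate(conv_result, bias):
    """Add the bias parameter to the int32 convolution result (y = w*x + b), then activate."""
    y = conv_result + bias        # bias result, still int32
    return np.maximum(y, 0)       # activation layer (ReLU, assumed)

conv_result = np.array([10, -20, 12], dtype=np.int32)  # illustrative int32 conv output
y = bias_and_activate(conv_result, np.int32(5))        # bias result after activation
```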
  • S207 Perform inverse quantization on the bias result that has passed through the activation layer to obtain a simulation result.
  • r refers to the floating point value, which is the data input by the user
  • q refers to the quantized data
  • z is the offset value
  • s is the scaling value
  • s and z are generated by the compiler.
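Using the same symbols, dequantization (step S207) inverts the quantization mapping; the standard form r = s * (q − z) is assumed here, since the extracted text lists the symbols but not the equation itself. How the int32 accumulator's effective scale is formed (e.g. by combining input and weight scales) is likewise not detailed in the patent:

```python
import numpy as np

def dequantize(q, s, z):
    """Recover a floating-point value from quantized data: r = s * (q - z) (assumed mapping)."""
    return s * (q.astype(np.float32) - z)

q = np.array([1, 128, 192, 255], dtype=np.uint8)  # illustrative quantized values
r = dequantize(q, 1.0 / 127.0, 128)               # approximately recovers the original floats
```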
  • the simulation result is output to the user so that the user knows the simulation result.
  • the simulation result is used to provide benchmark data as a hardware reference, to provide a calculation model for algorithm testing, and to provide guidance for hardware design.
  • in this embodiment, a neural network structure diagram and neural network parameters are obtained, the structure diagram including quantization information; the corresponding neural network is constructed in simulation according to the structure diagram; the data to be simulated is obtained and quantized according to the quantization information, so that the simulation input data and the neural network parameters are of the same hardware data type; the neural network parameters and the simulation input data are input to the neural network for convolution calculation to obtain the convolution result; and the simulation result is obtained from the convolution result and output.
  • the simulation calculation is thus closer to the result of the hardware calculation, and since the hardware data type requires less computation than the floating-point type, the calculation speed of neural network simulation can also be improved.
  • in addition, the entire calculation process is closer to the calculation mode of the hardware, reducing irrelevant content in floating-point calculation and facilitating the use of the hardware for output verification.
  • because the calculation mode and operation mode are consistent with the hardware, the final calculation result of the hardware can be simulated directly, which can be used for algorithm testing of neural networks.
  • the specific steps include:
  • S301 The user inputs picture data.
  • S302 Quantize the input picture data, converting the Float32 image data into picture data with a unit length of 8 bits.
  • the user inputs a neural network weight parameter
  • the neural network weight parameter is a neural network weight parameter with a unit length of 8 bits.
  • in step S305, the offset value is also subtracted from the neural network weight parameter, turning it into an int32 neural network weight parameter for input to the convolutional layer, and step S306 is entered.
  • S306 Perform convolution calculation on the processed image data and the processed neural network weight parameter to obtain a convolution result.
  • step S308 Determine whether there is an activation layer or an activation function. If there is, dequantize the feature map and output it to the user; if there is not, go to step S309.
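The flow of FIG. 3 can be strung together end to end. This is a hedged sketch under the same assumptions as the individual steps (affine quantization q = round(r/s) + z, a 1-D correlation as the convolutional layer, ReLU as the activation, and the accumulator dequantized with the product of input and weight scales); none of these specifics are fixed by the patent:

```python
import numpy as np

def simulate(r_x, q_w, s_x, z_x, s_w, z_w, bias):
    """End-to-end runtime simulation: quantize -> int32 conv -> bias -> activate -> dequantize."""
    # S302: quantize Float32 picture data into 8-bit picture data
    q_x = np.clip(np.round(r_x / s_x) + z_x, 0, 255).astype(np.uint8)
    # S305: subtract the offset values to obtain int32 input and int32 weights
    x = q_x.astype(np.int32) - z_x
    w = q_w.astype(np.int32) - z_w
    # S306: integer convolution (1-D sliding dot product as a stand-in)
    n = len(x) - len(w) + 1
    acc = np.array([np.dot(x[i:i + len(w)], w) for i in range(n)], dtype=np.int32)
    # S307: add the bias parameter, then apply the activation (ReLU, assumed)
    y = np.maximum(acc + bias, 0)
    # S308: dequantize the feature map for output (accumulator scale s_x * s_w, assumed)
    return s_x * s_w * y.astype(np.float32)

r_x = np.array([0.5, -0.25, 1.0, 0.0], dtype=np.float32)   # user picture data (illustrative)
q_w = np.array([129, 127], dtype=np.uint8)                  # 8-bit weights (illustrative)
out = simulate(r_x, q_w, s_x=1 / 127, z_x=128, s_w=1 / 127, z_w=128, bias=np.int32(0))
```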
  • in this embodiment, a neural network structure diagram and neural network parameters are obtained, the structure diagram including quantization information; the corresponding neural network is constructed in simulation according to the structure diagram; the data to be simulated is obtained and quantized according to the quantization information, so that the simulation input data and the neural network parameters are of the same hardware data type; the neural network parameters and the simulation input data are input to the neural network for convolution calculation to obtain the convolution result; and the simulation result is obtained from the convolution result and output.
  • the simulation calculation is thus closer to the result of the hardware calculation, and since the hardware data type requires less computation than the floating-point type, the calculation speed of neural network simulation can also be improved.
  • in addition, the entire calculation process is closer to the calculation mode of the hardware, reducing irrelevant content in floating-point calculation and facilitating the use of the hardware for output verification.
  • because the calculation mode and operation mode are consistent with the hardware, the final calculation result of the hardware can be simulated directly, which can be used for algorithm testing of neural networks.
  • the computer program can be stored in a computer-readable storage medium, and when the program is executed, it may include the procedures of the above-mentioned method embodiments.
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • the foregoing device 400 includes:
  • the obtaining module 401 is configured to obtain a neural network structure diagram and neural network parameters, where the neural network structure diagram includes quantization information;
  • the construction module 402 is used to simulate and construct a corresponding neural network according to the neural network structure diagram
  • the quantization module 403 is configured to obtain the data to be simulated, and quantize the data to be simulated according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same data type;
  • the calculation module 404 is configured to input the neural network parameters and the simulation input data into the neural network for convolution calculation, and obtain a convolution result;
  • the output module 405 is configured to obtain a simulation result for output based on the convolution result.
  • the quantization module is also used to obtain the data to be simulated and, according to the quantization information, convert the data to be simulated into data with a unit length of 8 bits, obtaining simulation input data with a unit length of 8 bits.
  • the data to be simulated is floating-point type data
  • the neural network parameter is a neural network parameter with a unit length of 8 bits;
  • the quantization module 403 is further configured to subtract the respective floating-point offset values from the 8-bit unit-length neural network data and the 8-bit unit-length simulation input data to obtain int32 integer data;
  • the calculation module 404 is also used to input the int32 neural network parameters and the int32 simulation input data into the neural network for convolution calculation to obtain the int32 convolution result.
  • the neural network parameters include bias parameters
  • the output module 405 includes:
  • the bias unit 4051 is configured to add a bias parameter to the convolution result of the int32 to obtain the bias result of the int32;
  • the first output unit 4052 is configured to obtain a simulation result for output based on the bias result.
  • the output module 405 includes:
  • the dequantization unit 4053 is configured to dequantize the convolution result according to the quantization information and the offset value to obtain a simulation result;
  • the second output unit 4054 is configured to output the simulation result.
  • the quantization information includes a scaling value
  • the inverse quantization unit 4053 is further configured to perform inverse quantization on the convolution result according to the scaling value and the offset value to obtain a simulation result.
  • the runtime hardware simulation device provided by the embodiment of the present application can realize the various implementation manners in the method embodiments of FIG. 1 to FIG. 3 and the corresponding beneficial effects. In order to avoid repetition, details are not described herein again.
  • FIG. 7 is a block diagram of the basic structure of the computer device in this embodiment.
  • the computer device 7 includes a memory 701, a processor 702, and a network interface 703 that are communicatively connected to each other through a system bus. It should be pointed out that only the computer device 7 with components 701-703 is shown in the figure, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing in accordance with preset or stored instructions.
  • its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), embedded devices, and so on.
  • the computer device can be a computing device such as a desktop computer, notebook, palmtop computer, or cloud server, and can interact with the user through a keyboard, mouse, remote control, touchpad, or voice-control device.
  • the memory 701 includes at least one type of readable storage medium.
  • the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, etc.
  • the memory 701 may be an internal storage unit of the computer device 7, such as a hard disk or a memory of the computer device 7.
  • the memory 701 may also be an external storage device of the computer device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, etc.
  • the memory 701 may also include both the internal storage unit of the computer device 7 and its external storage device.
  • the memory 701 is generally used to store an operating system and various application software installed in the computer device 7, such as the program code of a runtime hardware simulation method.
  • the memory 701 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 702 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 702 is generally used to control the overall operation of the computer device 7.
  • the processor 702 is configured to run the program code stored in the memory 701 or process data, for example, to run the program code of a runtime hardware simulation method.
  • the network interface 703 may include a wireless network interface or a wired network interface, and the network interface 703 is generally used to establish a communication connection between the computer device 7 and other electronic devices.
  • this application also provides another implementation manner, that is, a computer-readable storage medium is provided.
  • the computer-readable storage medium stores a runtime hardware simulation and emulation program, and the aforementioned program can be executed by at least one processor, so that the at least one processor executes the steps of the above-mentioned runtime hardware simulation method.
  • the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions to make a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the runtime hardware simulation method of each embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided are a method and apparatus for hardware simulation and emulation during running, and a computer device and a storage medium, wherein same are applied to the field of artificial intelligence. The method comprises: acquiring a neural network structure graph and a neural network parameter (S101); constructing a corresponding neural network in a simulated manner according to the neural network structure graph (S102); acquiring data to be emulated, and quantizing the data according to quantization information to obtain emulation input data (S103), wherein the emulation input data and the neural network parameter are of the same hardware data type; inputting the neural network parameter and the emulation input data into the neural network for convolution calculation to obtain a convolution result (S104); and on the basis of the convolution result, obtaining an emulation result and outputting same (S105). Since data to be emulated is quantized to be of the same hardware data type as a neural network parameter, when software is used for emulation, a result of emulation calculation is closer to that of hardware calculation; and the data calculation amount of the hardware data type is less than the calculation amount of a floating-point type, such that the calculation speed of neural network emulation can also be increased.

Description

Runtime hardware simulation method, apparatus, device, and storage medium
Technical Field
The present invention belongs to the technical field of artificial intelligence, and in particular relates to a runtime hardware simulation method, apparatus, device, and storage medium.
Background
An artificial neural network (ANN), often simply called a neural network (NN), is a mathematical or computational model that imitates the structure and function of a biological neural network (an animal's central nervous system, in particular the brain) and is used to estimate or approximate functions.
A neural network mainly consists of an input layer, hidden layers, and an output layer. When there is only one hidden layer, the network is a two-layer neural network: since the input layer performs no transformation, it need not be counted as a separate layer. In practice, each neuron of the input layer represents one feature, and the number of output-layer neurons corresponds to the number of classification labels (for binary classification, a sigmoid classifier uses one output neuron, whereas a softmax classifier uses two), while the number of hidden layers and of hidden-layer neurons is set manually.
From development to delivery, a neural network requires continuous simulation testing. Simulation testing generally takes place before the neural-network product is delivered for use: the real operating environment of the neural network is simulated, and the software is configured to its real in-use state for the test.
Existing simulation tests generally either link to the hardware through a verification tool for simulation, or use software to simulate the hardware calculation results. Simulating on hardware, however, is extremely slow, and large-scale testing requires more hardware resources and is not easy to realize, so software simulation is used more often. With software simulation, the pure floating-point calculation results differ numerically from the hardware calculation results to some degree, so complete numerical consistency cannot be achieved. Existing neural-network simulation methods therefore suffer from an inconsistency between pure floating-point calculation results and hardware calculation results, so the results obtained when the neural network runs on the delivered product do not match the simulation results.
Summary of the Invention
Embodiments of the present invention provide a runtime hardware simulation method, aiming to solve the problem of existing neural-network simulation methods that pure floating-point calculation results are inconsistent with hardware calculation results.
An embodiment of the present invention provides a runtime hardware simulation method, including the steps of:
acquiring a neural network structure graph and neural network parameters, the neural network structure graph including quantization information;
constructing a corresponding neural network in a simulated manner according to the neural network structure graph;
acquiring data to be simulated, and quantizing the data to be simulated according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same hardware data type;
inputting the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain a convolution result; and
obtaining and outputting a simulation result based on the convolution result.
Further, the step of acquiring the data to be simulated and quantizing it according to the quantization information to obtain the simulation input data includes:
acquiring the data to be simulated and, according to the quantization information, converting it into data to be simulated with an 8-bit unit length, to obtain simulation input data with an 8-bit unit length.
Further, the data to be simulated is floating-point data, the neural network parameters are neural network parameters with an 8-bit unit length, and the step of quantizing the data to be simulated according to the quantization information further includes:
subtracting the respective floating-point offset value from the 8-bit neural network data and from the 8-bit simulation input data to obtain int32 integer data;
and the step of inputting the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain the convolution result specifically includes:
inputting the int32 neural network parameters and simulation input data into the neural network for convolution calculation to obtain an int32 convolution result.
Further, the neural network parameters include a bias parameter, and the step of obtaining and outputting a simulation result based on the convolution result specifically includes:
adding the bias parameter to the int32 convolution result to obtain an int32 bias result; and
obtaining and outputting a simulation result based on the bias result.
Further, the step of obtaining and outputting a simulation result based on the convolution result includes:
dequantizing the convolution result according to the quantization information and the offset value to obtain a simulation result; and
outputting the simulation result.
Further, the quantization information includes a scaling value, and the step of dequantizing the convolution result according to the quantization information and the offset value to obtain the simulation result includes:
dequantizing the convolution result according to the scaling value and the offset value to obtain the simulation result.
The present invention further provides a runtime hardware simulation apparatus, the apparatus including:
an acquisition module, configured to acquire a neural network structure graph and neural network parameters, the neural network structure graph including quantization information;
a construction module, configured to construct a corresponding neural network in a simulated manner according to the neural network structure graph;
a quantization module, configured to acquire data to be simulated and quantize it according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same data type;
a calculation module, configured to input the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain a convolution result; and
an output module, configured to obtain and output a simulation result based on the convolution result.
Further, the quantization module is further configured to acquire the data to be simulated and, according to the quantization information, convert it into data to be simulated with an 8-bit unit length, to obtain simulation input data with an 8-bit unit length.
The present invention further provides a computer device including a memory and a processor, the memory storing a computer program, and the processor, when executing the computer program, implementing the steps of the runtime hardware simulation method of any one of the embodiments of the present invention.
The present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the runtime hardware simulation method of any one of the embodiments of the present invention.
Beneficial effects of the present invention: because the data to be simulated is quantized into the same hardware data type as the neural network parameters, the simulation calculation performed in software comes closer to the hardware calculation result; and since the computational cost of the hardware data type is lower than that of the floating-point type, the calculation speed of neural-network simulation can also be increased.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a runtime hardware simulation method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another runtime hardware simulation method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another runtime hardware simulation method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a runtime hardware simulation apparatus provided by an embodiment of the present invention;
Fig. 5 is a specific schematic flowchart of an output module 405 provided by an embodiment of the present invention;
Fig. 6 is a specific schematic flowchart of another output module 405 provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an embodiment of a computer device according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely explain the present invention and do not limit it.
Existing neural-network simulation is mainly performed through hardware simulation or software simulation. Hardware simulation is closer to the calculation logic of the neural network at run time, but its calculation speed is extremely slow and unsuited to large-scale simulation testing, so software simulation is used more often. With software simulation, however, the pure floating-point calculation results differ numerically from the hardware calculation results, and complete numerical consistency cannot be achieved. Because the present invention quantizes the data to be simulated into the same hardware data type as the neural network parameters, software simulation comes closer to the hardware calculation result; and since the computational cost of the hardware data type is lower than that of the floating-point type, the calculation speed of neural-network simulation can also be increased.
As shown in Fig. 1, a flowchart of an embodiment of a runtime hardware simulation method according to the present application, the method includes the steps:
S101: Acquire a neural network structure graph and neural network parameters.
The neural network structure graph includes quantization information, and the quantization information includes the unit length to which data is to be quantized.
The simulation software can quantize the data to be simulated according to the quantization information, for example into 8-bit data.
The neural network structure graph may be a recognition-type structure graph, such as for face recognition or vehicle recognition, or a detection-type structure graph, such as for object detection or vehicle detection.
The neural network structure graph may also be a single-layer network structure graph, for example a convolutional neural network corresponding to a convolutional layer.
The neural network structure graph can be understood as a neural network structure and, further, as a neural network structure used for various neural network models. The neural network structure takes layers as its computing units, including but not limited to convolutional layers, pooling layers, ReLU, and fully connected layers.
The neural network parameters are the parameters corresponding to each layer of the neural network structure, such as weight parameters and bias parameters. The various neural network models may be pre-trained models; because a model is pre-trained, its parameter attributes are already trained, so a neural network configured in the simulation software can be used directly with the configured parameters without further training, and the pre-trained model can be described uniformly by its structure graph and parameters.
The neural network structure graph and parameters may be acquired locally or from a cloud server. For example, they may be stored locally in matched sets and selected automatically or by the user at use time, or they may be uploaded to a cloud server and downloaded over the network when needed.
S102: Construct a corresponding neural network in a simulated manner according to the neural network structure graph.
The neural network structure graph here is the one acquired in step S101; it is simulated in the simulation software so that the corresponding neural network is constructed in software.
S103: Acquire the data to be simulated, and quantize it according to the quantization information to obtain simulation input data.
The data to be simulated is data input by the user; when the neural network is an image-processing network, the data input by the user is image data.
The quantization can be completed by a compiler.
Specifically, the calculation uses the formula r = s × (q − z), where r is the floating-point value, i.e., the data input by the user, q is the quantized data, z is the offset value, and s is the scaling value; s and z are produced by the compiler.
From r = s × (q − z), the quantized data is q = r/s + z.
Since s and z are produced by the compiler and r is the user-input data to be simulated, the compiler quantizes the input data to obtain the quantized simulation input data.
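As a concrete illustration of this quantization step, a minimal Python sketch follows. The function name, the use of an unsigned 8-bit range [0, 255], and the rounding and clipping policy are assumptions made for illustration; the disclosure itself only fixes the mapping q = r/s + z.

```python
def quantize(r, s, z):
    """Affine quantization q = round(r / s + z), clipped to an assumed
    8-bit range [0, 255]. r is the floating-point data input by the
    user; s is the scaling value and z the offset value, both of which
    would be produced by the compiler."""
    q = round(r / s + z)
    return max(0, min(255, q))

print([quantize(r, 1.0 / 255, 0) for r in (0.0, 1.0, 2.0)])
```

For example, with s = 1/255 and z = 0, an input of 1.0 maps to 255, and an out-of-range input such as 2.0 is clipped to 255.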
After quantization, the resulting simulation input data and the neural network parameters are of the same data type.
It should be noted that the neural network parameters differ from the data to be simulated: the parameters are of a hardware data type, while the data to be simulated is of a floating-point data type. After quantization by the compiler, the data to be simulated becomes simulation input data of the same data type as the neural network parameters.
Further, the step of acquiring the data to be simulated and quantizing it according to the quantization information to obtain the simulation input data includes:
acquiring the data to be simulated and, according to the quantization information, converting it into data to be simulated with an 8-bit unit length, to obtain simulation input data with an 8-bit unit length.
The neural network parameters are also 8-bit data and include weight parameters and bias parameters.
S104: Input the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain a convolution result.
The neural network includes a convolutional layer, which can perform convolution calculation on the neural network parameters and the simulation input data.
Because the neural network is simulated in software, the convolution calculation inside it follows the same process as the floating-point calculation; that is, hardware-data-type values are computed through the software calculation process, which brings the software calculation results closer to the hardware calculation results.
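A minimal sketch of how software can mimic the hardware's integer convolution is given below: each operand's offset is subtracted first, and the multiply-accumulate is then carried out entirely on integers, as on hardware. The 1-D valid convolution and all names are illustrative assumptions, not part of the disclosure.

```python
def int32_convolve(x_q, w_q, z_x, z_w):
    """Emulate a hardware-style convolution on quantized data.
    x_q and w_q are 8-bit activations and weights; z_x and z_w are
    their respective offset values. Subtracting the offsets yields
    int32-range integers, and the multiply-accumulate is then done
    entirely in integer arithmetic, as the hardware does."""
    x = [v - z_x for v in x_q]
    w = [v - z_w for v in w_q]
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k))
            for i in range(len(x) - k + 1)]

print(int32_convolve([10, 20, 30, 40], [1, 2], 0, 0))
```

With zero offsets, the example computes [10·1 + 20·2, 20·1 + 30·2, 30·1 + 40·2], i.e. [50, 80, 110].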
S105: Obtain and output a simulation result based on the convolution result.
The convolution result is the one calculated in step S104; after being processed by the activation layer, it can be output as the simulation result.
In a possible embodiment, if there is no activation layer after the convolution result is obtained, the convolution result is fed into the next layer of the network for calculation, for example into a pooling layer that pools the convolution result.
In this embodiment of the present invention, a neural network structure graph and neural network parameters are acquired, the structure graph including quantization information; a corresponding neural network is constructed in a simulated manner according to the structure graph; the data to be simulated is acquired and quantized according to the quantization information to obtain simulation input data of the same hardware data type as the neural network parameters; the parameters and the simulation input data are input into the neural network for convolution calculation to obtain a convolution result; and a simulation result is obtained and output based on the convolution result. Because the data to be simulated is quantized into the same hardware data type as the neural network parameters, software simulation comes closer to the hardware calculation result, and since the computational cost of the hardware data type is lower than that of the floating-point type, the calculation speed of neural-network simulation can also be increased.
Optionally, as shown in Fig. 2, a flowchart of an embodiment of another runtime hardware simulation method according to the present application:
S201: Acquire a neural network structure graph and neural network parameters.
S202: Construct a corresponding neural network in a simulated manner according to the neural network structure graph.
S203: Acquire the data to be simulated and, according to the quantization information, convert it into data to be simulated with an 8-bit unit length, obtaining simulation input data with an 8-bit unit length.
The neural network parameters are also 8-bit data and include weight parameters and bias parameters.
S203: Subtract the respective floating-point offset value from the 8-bit neural network data and from the 8-bit simulation input data to obtain int32 integer data.
Integer data is the hardware's calculation data, so the software simulation follows the hardware's calculation logic more closely, and its calculation results come closer to the hardware's.
S204: Input the int32 neural network parameters and simulation input data into the neural network for convolution calculation to obtain an int32 convolution result.
The int32 neural network parameters and simulation input data are input into the convolutional layer of the neural network for convolution calculation; the resulting convolution result is likewise int32 integer data, close to the hardware calculation result.
S205: Add the bias parameter to the int32 convolution result to obtain an int32 bias result.
The bias parameter is the one among the neural network parameters, and adding it is an addition performed in the bias layer.
Specifically, the calculation formula of the neural network is y = w*x + b,
where y is the output, x is the input, w*x is the convolution of the weight parameters with the input, and b is the bias parameter.
S206: Activate the bias result.
The bias result is activated through the activation function of the activation layer so that it can be output. If the bias result is not activated, it is instead fed into the next calculation node for calculation.
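Steps S205 and S206 can be sketched as follows. ReLU is used purely as an example activation; the disclosure does not mandate a specific function, and the names are illustrative.

```python
def bias_and_activate(conv_result, bias, activation=None):
    """Add the bias parameter to each element of the int32 convolution
    result (the '+ b' in y = w*x + b), then apply the activation
    function if one exists; otherwise return the bias result as-is
    for the next calculation node."""
    biased = [v + bias for v in conv_result]
    return [activation(v) for v in biased] if activation else biased

relu = lambda v: max(0, v)  # an assumed example activation
print(bias_and_activate([50, -80, 110], 10, relu))
```

With bias 10 and ReLU, [50, −80, 110] becomes [60, 0, 120]; without an activation, the biased values pass through unchanged.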
S207: Dequantize the bias result that has passed through the activation layer to obtain the simulation result.
The dequantization follows r = s × (q − z) and is performed by the compiler,
where r is the floating-point value, i.e., the data presented to the user, q is the quantized data, z is the offset value, and s is the scaling value; s and z are produced by the compiler.
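A minimal sketch of this dequantization step follows; only the mapping r = s × (q − z) is fixed by the disclosure, and the names are illustrative.

```python
def dequantize(q, s, z):
    """Inverse quantization r = s * (q - z): map the integer result
    that has passed through the activation layer back to a
    floating-point simulation result. s (scaling) and z (offset) are
    the compiler-produced values used during quantization."""
    return s * (q - z)

print([dequantize(q, 1.0 / 255, 0) for q in (0, 128, 255)])
```

With s = 1/255 and z = 0, the integer results 0 and 255 map back to 0.0 and 1.0 respectively.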
S208: Output the simulation result.
The simulation result is output to the user so that the user learns it. The simulation result provides benchmark data as a hardware reference and a calculation model for algorithm testing, and offers guidance for hardware design.
In this embodiment of the present invention, a neural network structure graph and neural network parameters are acquired, the structure graph including quantization information; a corresponding neural network is constructed in a simulated manner according to the structure graph; the data to be simulated is acquired and quantized according to the quantization information to obtain simulation input data of the same hardware data type as the neural network parameters; the parameters and the simulation input data are input into the neural network for convolution calculation to obtain a convolution result; and a simulation result is obtained and output based on the convolution result. Because the data to be simulated is quantized into the same hardware data type as the neural network parameters, software simulation comes closer to the hardware calculation result, and the lower computational cost of the hardware data type also increases the calculation speed of neural-network simulation. The entire calculation flow more closely matches the hardware's calculation mode, reducing irrelevant content in floating-point calculation and making the output convenient for hardware verification. At the same time, because the calculation mode and operation mode are consistent with the hardware, the final hardware calculation result can be simulated directly, which can be used for neural-network algorithm testing.
Optionally, after the neural network is constructed in software according to the neural network structure graph, as shown in Fig. 3, the specific steps include:
S301: The user inputs picture data.
S302: Quantize the input picture data, quantizing Float32 image data into picture data with an 8-bit unit length.
S303: Subtract the offset value from the quantized picture data to obtain int32 picture data for input to the convolutional layer, and go to step S306.
S304: The user inputs neural network weight parameters; the weight parameters have an 8-bit unit length.
S305: Likewise subtract the offset value from the neural network weight parameters to obtain int32 weight parameters for input to the convolutional layer, and go to step S306.
S306: Perform convolution calculation on the processed picture data and the processed neural network weight parameters to obtain a convolution result.
S307: Add the bias parameter to the convolution result to obtain a feature map.
S308: Determine whether an activation layer or activation function exists; if so, dequantize the feature map and output it to the user; if not, go to step S309.
S309: Continue to quantize the feature map as input to the next calculation layer.
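Under illustrative assumptions (1-D data, a ReLU activation, and a single shared scale s), the S301–S309 flow can be sketched end to end as follows; the disclosure itself only fixes the ordering quantize → subtract offsets → convolve → add bias → activate-and-dequantize or requantize.

```python
def simulate_layer(img, weights, s, z_x, z_w, bias, has_activation=True):
    """One pass of the Fig. 3 flow. S302: quantize the Float32 input to
    8 bits; S303/S305: subtract the offsets to get integer operands;
    S306: convolve; S307: add the bias to get the feature map; S308: if
    an activation exists, apply it (ReLU assumed) and dequantize for
    output; S309: otherwise requantize the feature map for the next
    layer. Shapes, ReLU, and the single scale s are assumptions."""
    x_q = [max(0, min(255, round(v / s + z_x))) for v in img]   # S302
    x = [v - z_x for v in x_q]                                  # S303
    w = [v - z_w for v in weights]                              # S305
    k = len(w)
    feat = [sum(x[i + j] * w[j] for j in range(k)) + bias       # S306-S307
            for i in range(len(x) - k + 1)]
    if has_activation:                                          # S308
        return [s * max(0, v) for v in feat]   # activate, then dequantize
    return [max(0, min(255, round(v / s + z_x))) for v in feat]  # S309

print(simulate_layer([0.0, 1.0, 2.0, 3.0], [1, 2], 1.0, 0, 0, 1))
```

With unit scale, zero offsets, and bias 1, the example yields the dequantized feature map [3.0, 6.0, 9.0].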
In this embodiment of the present invention, a neural network structure graph and neural network parameters are acquired, the structure graph including quantization information; a corresponding neural network is constructed in a simulated manner according to the structure graph; the data to be simulated is acquired and quantized according to the quantization information to obtain simulation input data of the same hardware data type as the neural network parameters; the parameters and the simulation input data are input into the neural network for convolution calculation to obtain a convolution result; and a simulation result is obtained and output based on the convolution result. Because the data to be simulated is quantized into the same hardware data type as the neural network parameters, software simulation comes closer to the hardware calculation result, and the lower computational cost of the hardware data type also increases the calculation speed of neural-network simulation. The entire calculation flow more closely matches the hardware's calculation mode, reducing irrelevant content in floating-point calculation and making the output convenient for hardware verification. At the same time, because the calculation mode and operation mode are consistent with the hardware, the final hardware calculation result can be simulated directly, which can be used for neural-network algorithm testing.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium, and when executed the program may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or it may be a random access memory (RAM) or the like.
It should be understood that, although the steps in the flowcharts of the drawings are displayed in the order indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on their execution, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may comprise multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times, and which need not be performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
As shown in FIG. 4, which is a schematic structural diagram of a runtime hardware simulation apparatus provided by this embodiment, the apparatus 400 includes:
an obtaining module 401, configured to obtain a neural network structure diagram and neural network parameters, the structure diagram including quantization information;
a construction module 402, configured to construct a corresponding neural network in simulation according to the neural network structure diagram;
a quantization module 403, configured to obtain data to be simulated and quantize it according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same data type;
a calculation module 404, configured to input the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain a convolution result; and
an output module 405, configured to obtain a simulation result based on the convolution result and output it.
Further, as shown in FIG. 4, the quantization module is further configured to obtain the data to be simulated and, according to the quantization information, convert it into data to be simulated with an 8-bit unit length, obtaining simulation input data with an 8-bit unit length.
Further, as shown in FIG. 4, the data to be simulated is floating-point data, and the neural network parameters are neural network parameters with an 8-bit unit length. The quantization module 403 is further configured to subtract, from the 8-bit neural network data and from the 8-bit simulation input data, their respective floating-point offset values, obtaining int32 integer data.
The calculation module 404 is further configured to input the int32 neural network parameters and simulation input data into the neural network for convolution calculation, obtaining an int32 convolution result.
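The two modules above amount to the following sketch. It is a hypothetical 1-D "valid" convolution with illustrative names; for simplicity the per-operand offset is treated here as an integer zero point rather than a floating-point value.

```python
import numpy as np

def int32_conv(x_u8, w_u8, x_offset, w_offset):
    # Subtract each operand's own offset, widening the 8-bit data to int32.
    x = x_u8.astype(np.int32) - x_offset
    w = w_u8.astype(np.int32) - w_offset
    # Accumulate the convolution entirely in int32, as the hardware does
    # (cross-correlation form, as is usual for neural networks).
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)],
                    dtype=np.int32)
```

Keeping the accumulator in int32 is the point of the design: no floating-point rounding enters the computation, so the software result can match the hardware bit for bit.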
Further, as shown in FIG. 5, the neural network parameters include a bias parameter, and the output module 405 includes:
a bias unit 4051, configured to add the bias parameter to the int32 convolution result to obtain an int32 bias result; and
a first output unit 4052, configured to obtain a simulation result based on the bias result and output it.
Further, as shown in FIG. 6, the output module 405 includes:
a dequantization unit 4053, configured to dequantize the convolution result according to the quantization information and the offset value to obtain a simulation result; and
a second output unit 4054, configured to output the simulation result.
Further, as shown in FIG. 6, the quantization information includes a scaling value, and the dequantization unit 4053 is further configured to dequantize the convolution result according to the scaling value and the offset value to obtain the simulation result.
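Taken together, the bias unit and the dequantization unit behave like the sketch below. The linear dequantization formula and the parameter names are assumptions for illustration, not taken from the patent.

```python
import numpy as np

def bias_and_dequantize(conv_int32, bias_int32, scale, offset):
    # Bias unit: add the bias parameter while still in int32.
    biased = conv_int32 + bias_int32
    # Dequantization unit: scale back to floating point using the scaling
    # value and the offset value from the quantization information.
    return biased.astype(np.float32) * scale + offset
```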
The runtime hardware simulation apparatus provided by this embodiment of the present application can implement each of the implementations in the method embodiments of FIG. 1 to FIG. 3, with the corresponding beneficial effects; to avoid repetition, details are not described here again.
To solve the above technical problem, an embodiment of the present application further provides a computer device. Referring to FIG. 7, FIG. 7 is a block diagram of the basic structure of the computer device of this embodiment.
The computer device 7 includes a memory 701, a processor 702, and a network interface 703 communicatively connected to one another through a system bus. It should be pointed out that the figure shows only the computer device 7 with the components 701-703, but it should be understood that implementing all of the illustrated components is not required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions, and that its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may perform human-computer interaction with a user through a keyboard, a mouse, a remote control, a touchpad, a voice-control device, or the like.
The memory 701 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 701 may be an internal storage unit of the computer device 7, such as its hard disk or internal memory. In other embodiments, the memory 701 may also be an external storage device of the computer device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 7. Of course, the memory 701 may also include both the internal storage unit of the computer device 7 and its external storage device. In this embodiment, the memory 701 is generally used to store the operating system and various application software installed on the computer device 7, such as the program code of a runtime hardware simulation method. In addition, the memory 701 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 702 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 702 is generally used to control the overall operation of the computer device 7. In this embodiment, the processor 702 is configured to run the program code stored in the memory 701 or to process data, for example, to run the program code of a runtime hardware simulation method.
The network interface 703 may include a wireless or wired network interface and is generally used to establish a communication connection between the computer device 7 and other electronic devices.
The present application further provides another implementation, namely a computer-readable storage medium storing a runtime hardware simulation program, where the runtime hardware simulation program is executable by at least one processor, so that the at least one processor performs the steps of the runtime hardware simulation method described above.
Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the runtime hardware simulation method of each embodiment of the present application.
The terms "including" and "having" in the specification and claims of the present application and in the above description of the drawings, and any variations thereof, are intended to cover non-exclusive inclusion. The terms "first", "second", and the like in the specification and claims of the present application or in the above drawings are used to distinguish different objects rather than to describe a specific order. Reference herein to an "embodiment" means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of the present application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

  1. A runtime hardware simulation method, comprising the steps of:
    obtaining a neural network structure diagram and neural network parameters, the neural network structure diagram including quantization information;
    constructing a corresponding neural network in simulation according to the neural network structure diagram;
    obtaining data to be simulated, and quantizing the data to be simulated according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same hardware data type;
    inputting the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain a convolution result; and
    obtaining a simulation result based on the convolution result and outputting it.
  2. The runtime hardware simulation method according to claim 1, wherein the step of obtaining the data to be simulated and quantizing it according to the quantization information to obtain the simulation input data comprises:
    obtaining the data to be simulated and, according to the quantization information, converting it into data to be simulated with an 8-bit unit length, obtaining simulation input data with an 8-bit unit length.
  3. The runtime hardware simulation method according to claim 2, wherein the data to be simulated is floating-point data and the neural network parameters are neural network parameters with an 8-bit unit length, and wherein the step of quantizing the data to be simulated according to the quantization information further comprises:
    subtracting, from the 8-bit neural network data and from the 8-bit simulation input data, their respective floating-point offset values to obtain int32 integer data;
    and the step of inputting the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain the convolution result comprises:
    inputting the int32 neural network parameters and simulation input data into the neural network for convolution calculation to obtain an int32 convolution result.
  4. The runtime hardware simulation method according to claim 3, wherein the neural network parameters include a bias parameter, and the step of obtaining a simulation result based on the convolution result and outputting it comprises:
    adding the bias parameter to the int32 convolution result to obtain an int32 bias result; and
    obtaining the simulation result based on the bias result and outputting it.
  5. The runtime hardware simulation method according to claim 1, wherein the step of obtaining a simulation result based on the convolution result and outputting it comprises:
    dequantizing the convolution result according to the quantization information and the offset value to obtain the simulation result; and
    outputting the simulation result.
  6. The runtime hardware simulation method according to claim 5, wherein the quantization information includes a scaling value, and the step of dequantizing the convolution result according to the quantization information and the offset value to obtain the simulation result comprises:
    dequantizing the convolution result according to the scaling value and the offset value to obtain the simulation result.
  7. A runtime hardware simulation apparatus, comprising:
    an obtaining module, configured to obtain a neural network structure diagram and neural network parameters, the neural network structure diagram including quantization information;
    a construction module, configured to construct a corresponding neural network in simulation according to the neural network structure diagram;
    a quantization module, configured to obtain data to be simulated and quantize it according to the quantization information to obtain simulation input data, the simulation input data and the neural network parameters being of the same data type;
    a calculation module, configured to input the neural network parameters and the simulation input data into the neural network for convolution calculation to obtain a convolution result; and
    an output module, configured to obtain a simulation result based on the convolution result and output it.
  8. The runtime hardware simulation apparatus according to claim 7, wherein the quantization module is further configured to obtain the data to be simulated and, according to the quantization information, convert it into data to be simulated with an 8-bit unit length, obtaining simulation input data with an 8-bit unit length.
  9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the runtime hardware simulation method according to any one of claims 1 to 6.
  10. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the runtime hardware simulation method according to any one of claims 1 to 6 are implemented.
PCT/CN2019/110840 2019-10-12 2019-10-12 Method and apparatus for hardware simulation and emulation during running, and device and storage medium WO2021068249A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/110840 WO2021068249A1 (en) 2019-10-12 2019-10-12 Method and apparatus for hardware simulation and emulation during running, and device and storage medium
CN201980067042.8A CN113228056B (en) 2019-10-12 2019-10-12 Runtime hardware simulation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/110840 WO2021068249A1 (en) 2019-10-12 2019-10-12 Method and apparatus for hardware simulation and emulation during running, and device and storage medium

Publications (1)

Publication Number Publication Date
WO2021068249A1 true WO2021068249A1 (en) 2021-04-15

Family

ID=75437629

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/110840 WO2021068249A1 (en) 2019-10-12 2019-10-12 Method and apparatus for hardware simulation and emulation during running, and device and storage medium

Country Status (2)

Country Link
CN (1) CN113228056B (en)
WO (1) WO2021068249A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114707650B (en) * 2021-12-31 2024-06-14 浙江芯劢微电子股份有限公司 Simulation implementation method for improving simulation efficiency
CN115656747A (en) * 2022-12-26 2023-01-31 南方电网数字电网研究院有限公司 Transformer defect diagnosis method and device based on heterogeneous data and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002881A (en) * 2018-06-28 2018-12-14 郑州云海信息技术有限公司 The fixed point calculation method and device of deep neural network based on FPGA
CN109284761A (en) * 2018-09-04 2019-01-29 苏州科达科技股份有限公司 A kind of image characteristic extracting method, device, equipment and readable storage medium storing program for executing
EP3438890A1 (en) * 2017-08-04 2019-02-06 Samsung Electronics Co., Ltd. Method and apparatus for generating fixed-point quantized neural network
CN109753903A (en) * 2019-02-27 2019-05-14 北航(四川)西部国际创新港科技有限公司 A kind of unmanned plane detection method based on deep learning
CN110245741A (en) * 2018-03-09 2019-09-17 佳能株式会社 Optimization and methods for using them, device and the storage medium of multilayer neural network model

Also Published As

Publication number Publication date
CN113228056A (en) 2021-08-06
CN113228056B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN111523640B (en) Training method and device for neural network model
CN102099798B (en) For the method and system using native code module to perform application
CN108280757B (en) User credit evaluation method and device
CN110023963A (en) Use Processing with Neural Network text sequence
CN114048331A (en) Knowledge graph recommendation method and system based on improved KGAT model
CN111145076B (en) Data parallelization processing method, system, equipment and storage medium
CN112418292B (en) Image quality evaluation method, device, computer equipment and storage medium
WO2024001806A1 (en) Data valuation method based on federated learning and related device therefor
CN112488183A (en) Model optimization method and device, computer equipment and storage medium
WO2020030052A1 (en) Animal count identification method, device, medium, and electronic apparatus
WO2021068249A1 (en) Method and apparatus for hardware simulation and emulation during running, and device and storage medium
CN107437111A (en) Data processing method, medium, device and computing device based on neutral net
WO2022126902A1 (en) Model compression method and apparatus, electronic device, and medium
CN114970357A (en) Energy-saving effect evaluation method, system, device and storage medium
CN111652282B (en) Big data-based user preference analysis method and device and electronic equipment
CN111090740B (en) Knowledge graph generation method for dialogue system
CN116431807B (en) Text classification method and device, storage medium and electronic device
CN116168403A (en) Medical data classification model training method, classification method, device and related medium
WO2021068253A1 (en) Customized data stream hardware simulation method and apparatus, device, and storage medium
CN116306879A (en) Data processing method, device, electronic equipment and storage medium
CN113469237B (en) User intention recognition method, device, electronic equipment and storage medium
CN114897161A (en) Mask-based graph classification backdoor attack defense method and system, electronic equipment and storage medium
CN113762061A (en) Quantitative perception training method and device for neural network and electronic equipment
WO2020054402A1 (en) Neural network processing device, computer program, neural network manufacturing method, neural network data manufacturing method, neural network use device, and neural network downscaling method
WO2024045175A1 (en) Optimization of executable graph for artificial intelligence model inference

Legal Events

Code  Description
121   EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19948842; Country of ref document: EP; Kind code of ref document: A1)
NENP  Non-entry into the national phase (Ref country code: DE)
32PN  EP: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.09.2022))
122   EP: PCT application non-entry in European phase (Ref document number: 19948842; Country of ref document: EP; Kind code of ref document: A1)