CN114399019A - Neural network compiling method, system, computer device and storage medium - Google Patents


Info

Publication number
CN114399019A
CN114399019A (application CN202111647960.3A)
Authority
CN
China
Prior art keywords
model
calculation
verification
hardware
neural network
Prior art date
Legal status
Pending
Application number
CN202111647960.3A
Other languages
Chinese (zh)
Inventor
吴晓
陶为
何国敏
Current Assignee
Nanjing Fengxing Technology Co ltd
Original Assignee
Nanjing Fengxing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Fengxing Technology Co ltd filed Critical Nanjing Fengxing Technology Co ltd
Priority to CN202111647960.3A priority Critical patent/CN114399019A/en
Publication of CN114399019A publication Critical patent/CN114399019A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Stored Programmes (AREA)

Abstract

The application discloses a neural network compiling method, a neural network compiling system, a computer device, and a storage medium. Standardized model data and quantization data of the network to be compiled, together with hardware parameters of the hardware platform to be deployed, are taken as input. The input is first parsed; a verification model is automatically generated from the parsing result; the parsed data are converted into network model configuration data for compiling; a calculation loop block-cutting scheme is obtained by combining the hardware parameters; the current compiling mode is then judged, and if it is the test mode, the generated verification model is run to obtain a software comparison result and generate a hardware-comparable file; finally, a hardware configuration file is generated by compiling according to the network model configuration data, the hardware parameters, and the calculation loop block-cutting scheme. By combining the compiling flow with the verification process, the generated verification model provides a software comparison result and a hardware-comparable file during the product research and development stage, which accelerates testing and development and shortens the development cycle.

Description

Neural network compiling method, system, computer device and storage medium
Technical Field
The present application relates to the field of neural network technologies, and in particular, to a neural network compiling method, a neural network compiling system, a computer device, and a storage medium.
Background
An FPGA (Field Programmable Gate Array) is a semi-custom circuit characterized by strong reconfigurability and low implementation latency, and is widely used as a hardware platform for accelerating the inference computation of Convolutional Neural Networks (CNNs). However, as the application fields of CNNs expand, new networks with ever more complex hierarchical structures keep being proposed; some deep CNN models have tens to hundreds of layers, with significant differences in size and configuration between layers. These trends in CNN model architecture increase the complexity of hardware design, making it more difficult to design a generic CNN hardware accelerator that maps different CNN algorithms efficiently. With the ever-increasing size and complexity of CNN models, custom design becomes increasingly impractical, and automated compilation methods become essential.
The existing compiling methods for deploying a CNN model onto an FPGA platform map, based on the CNN model structure, the computation of different layer types in the network model onto the computation units of the hardware accelerator to be deployed, and compile a hardware configuration file that controls the accelerator's behavior. After compiling, the generated hardware configuration file is deployed on the hardware accelerator offline, after which the implementation result of the CNN model on the hardware can be obtained.
Such a compiling method controls hardware behavior through the hardware configuration file; different configuration information in the file can be flexibly changed to map different CNN models onto the hardware accelerator. To verify the correctness of the configuration information in the compiled hardware configuration file, the conventional approach verifies the hardware inference result of the CNN hardware accelerator: a Python model or other software model of the network under test is usually built manually, a software inference result is generated, and it is then compared with the hardware inference result obtained after deployment. However, because this verification method is separated from the compiling process, it is difficult to locate the specific error position in a concrete implementation, and the hardware must be redeployed in each subsequent error-correction iteration. The verification process is therefore long, which in turn lengthens the development cycle of the CNN hardware accelerator.
Disclosure of Invention
In the existing compiling methods, the test-and-verification process is separated from the compiling process, the coupling between them is weak, error positions are difficult to locate, hardware must be redeployed during iterative error correction, and the verification cycle is long, which in turn lengthens the development cycle of a CNN hardware accelerator. To solve these problems, the present application provides a neural network compiling method, system, computer device, and storage medium through the following aspects.
A first aspect of the present application provides a neural network compiling method, including:
inputting standard model data and quantization data of a neural network to be compiled and hardware parameters of a hardware platform to be deployed;
parsing the standard model data and the quantization data and performing a software inference operation to obtain intermediate model structure information and intermediate parameter data; the intermediate model structure information comprises model structure description information and model structure analysis data;
automatically generating a corresponding verification model according to the model structure description information;
preprocessing the intermediate model structure information and the intermediate parameter data to obtain network model configuration data; the network model configuration data comprises compilable network hierarchy information, weight parameters of the neural network to be compiled, input test data of the neural network to be compiled, and a software inference result;
obtaining a calculation loop block-cutting scheme for the neural network to be compiled according to the network model configuration data in combination with the hardware parameters;
judging whether the current execution mode is a test mode or a user mode;
if it is the test mode, executing the following operations:
running the verification model to obtain a calculation result of the verification model, and comparing the calculation result with the software inference result to obtain a software comparison result;
generating a hardware-comparable file according to the calculation result of the verification model;
compiling according to the network model configuration data, the hardware parameters, and the calculation loop block-cutting scheme to generate a hardware configuration file;
if it is the user mode, executing the following operation:
compiling according to the network model configuration data, the hardware parameters, and the calculation loop block-cutting scheme to generate a hardware configuration file.
Optionally, automatically generating a corresponding verification model according to the model structure description information includes:
inputting model structure description information, and creating a calculation code document and a calling code document of a verification model;
traversing all layers in the model structure description information, and writing corresponding calculation operation codes into a calculation code document layer by layer according to the layer types; the computing operation in the corresponding computing operation code is realized by running a computing function of the corresponding operation in a preset verification model function library;
traversing all layers in the model structure description information, and writing the processing operation codes of the calculation results of the corresponding layers into a calling code document according to the types of the layers in the model structure description information;
writing, in the calling code document of the verification model, the corresponding codes for generating a software comparison result file and a hardware-comparable file;
obtaining the verification model.
Optionally, running the verification model to obtain a calculation result of the verification model and comparing the calculation result with the software inference result to obtain a software comparison result includes:
running the calling code document and loading the software inference result;
inputting the input test data of the neural network to be compiled and the hardware parameters, running the calculation code of a target layer in the verification model to obtain the calculation result of that target layer, and comparing it with the data of the corresponding layer in the software inference result to obtain the software comparison result of the target layer; wherein the target layer is any layer in the verification model;
traversing all layers in the verification model to obtain the calculation result of the verification model and the software comparison result.
A second aspect of the present application provides a neural network compiling system, configured to implement the steps of the neural network compiling method provided in the first aspect of the present application; the neural network compiling system includes:
a model parsing module, configured to parse the standard model data and the quantization data and perform a software inference operation to obtain intermediate model structure information and intermediate parameter data; the intermediate model structure information comprises model structure description information and model structure analysis data;
a model transformation module, configured to perform the following operations: preprocessing the intermediate model structure information and the intermediate parameter data to obtain network model configuration data, wherein the network model configuration data comprises compilable network hierarchy information, weight parameters of the neural network to be compiled, input test data of the neural network to be compiled, and a software inference result; and obtaining a calculation loop block-cutting scheme for the neural network to be compiled according to the network model configuration data in combination with the hardware parameters of the hardware platform to be deployed;
a judging module, configured to judge whether the current execution mode of the compiling system is a user mode or a test mode;
a model compiling module, configured to compile and generate a hardware configuration file according to the network model configuration data, the hardware parameters, and the calculation loop block-cutting scheme;
a model verification module, configured to perform the following operations: automatically generating a corresponding verification model according to the model structure description information; running the verification model to obtain a calculation result of the verification model, and comparing the calculation result with the software inference result to obtain a software comparison result; and generating a hardware-comparable file according to the calculation result of the verification model.
Optionally, the model verification module is further configured to perform the following operations:
inputting model structure description information, and creating a calculation code document and a calling code document of a verification model;
traversing all layers in the model structure description information, and writing corresponding calculation operation codes into a calculation code document layer by layer according to the layer types; the computing operation in the corresponding computing operation code is realized by running a computing function of the corresponding operation in a preset verification model function library;
traversing all layers in the model structure description information, and writing the processing operation codes of the calculation results of the corresponding layers into a calling code document according to the types of the layers in the model structure description information;
writing, in the calling code document of the verification model, the corresponding codes for generating a software comparison result file and a hardware-comparable file;
obtaining the verification model.
Optionally, the model verification module is further configured to perform the following operations:
running the calling code document and loading the software inference result;
inputting the input test data of the neural network to be compiled and the hardware parameters, running the calculation code of a target layer in the verification model to obtain the calculation result of that target layer, and comparing it with the data of the corresponding layer in the software inference result to obtain the software comparison result of the target layer; wherein the target layer is any layer in the verification model;
traversing all layers in the verification model to obtain the calculation result of the verification model and the software comparison result.
A third aspect of the present application discloses a computer device comprising:
a memory for storing a computer program;
a processor for implementing the steps of the neural network compiling method as disclosed in the first aspect of the present application when executing the computer program.
A fourth aspect of the present application discloses a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the neural network compiling method disclosed in the first aspect of the present application.
Through the above aspects, the present application discloses a neural network compiling method, system, computer device, and storage medium. Standardized model data and quantization data of the network to be compiled and the hardware parameters of the hardware platform to be deployed are input. The standardized model data and the quantization data are parsed to obtain intermediate model structure information, which includes model structure description information, and intermediate parameter data; a verification model is automatically generated according to the model structure description information; the intermediate model structure information and the intermediate parameter data are preprocessed to obtain network model configuration data usable for hardware compiling; a calculation loop block-cutting scheme for the neural network to be compiled is obtained in combination with the hardware parameters. The current compiling mode is then judged. If it is the test mode, the generated verification model is run to obtain its calculation result, which is compared with the software inference result in the network model configuration data to obtain a software comparison result and to generate a hardware-comparable file; finally, a hardware configuration file is generated by compiling according to the network model configuration data, the hardware parameters, and the calculation loop block-cutting scheme. If it is the user mode, the hardware configuration file is generated directly by compiling according to the network model configuration data, the hardware parameters, and the calculation loop block-cutting scheme.
The neural network compiling method combines the compiling flow with the verification process: the generated verification model provides a software comparison result and a hardware-comparable file, which accelerates testing and development during the product research and development stage and shortens the development cycle.
Furthermore, the verification model in the neural network compiling method disclosed in this embodiment can resolve errors at the software level by simulating hardware inference behavior before compiling and deployment. Iterative error correction and redeployment at the hardware level is thus converted into verification and error correction at the software level, greatly shortening the product's test and development cycle.
Drawings
Fig. 1 is a schematic workflow diagram of a neural network compiling method according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart illustrating a work flow of generating a verification model in a neural network compiling method according to an embodiment of the present application;
fig. 3 is a schematic workflow diagram of a verification model in a neural network compiling method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a neural network compiling system according to an embodiment of the present disclosure.
Detailed Description
In the existing compiling methods, the test-and-verification process is separated from the compiling process, the coupling between them is weak, error positions are difficult to locate, hardware must be redeployed during iterative error correction, and the verification cycle is long, which in turn lengthens the development cycle of a CNN hardware accelerator. To solve these problems, the following embodiments provide a neural network compiling method, system, computer device, and storage medium.
Referring to fig. 1, an embodiment of the present application discloses a neural network compiling method. According to the method, an automatic compiling process is realized according to a network model to be compiled and hardware parameters of a hardware platform to be deployed, a configuration file required by the hardware to be deployed is generated, a software comparison result and a hardware comparison file are generated by combining a corresponding verification model, the development period is shortened, and product iteration is accelerated. As shown in fig. 1, the neural network compiling method includes the following steps:
and step 10, inputting standard model data and quantitative data of the neural network to be compiled and hardware parameters of the hardware platform to be deployed.
In the present embodiment, the standard model data uses an ONNX (Open Neural Network Exchange) standardized model file. ONNX is an open format representing a deep neural network model for storing trained models. The ONNX defines a group of standard formats which are independent of environment and platform, and provides a foundation for the interoperability of the deep learning model, so that the deep learning model can be interactively used under different frameworks and environments. The ONNX file stores not only the weights of the neural network model, but also the structural information of the model, the input and output of each layer in the network and some other auxiliary information. Quantization refers to converting a floating point algorithm of a neural network to be compiled into a fixed point algorithm. In the present embodiment, the quantization data includes quantization processing information that operates differently for different layers. The hardware parameters of the hardware platform to be deployed include, but are not limited to, data bit width, computation parallelism, and on-chip storage resources.
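The quantization described above converts the network's floating-point arithmetic to fixed point. A minimal sketch follows; the signed fixed-point format, bit widths, rounding, and clipping policy are illustrative assumptions, not the patent's actual quantization scheme:

```python
def quantize_to_fixed_point(values, total_bits=8, frac_bits=4):
    """Map floats to signed fixed-point integers (illustrative scheme).

    Each value is scaled by 2**frac_bits, rounded, and clipped to the
    signed range representable in total_bits bits.
    """
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    return [max(q_min, min(q_max, round(v * scale))) for v in values]


def dequantize(quantized, frac_bits=4):
    """Recover approximate float values from the fixed-point integers."""
    scale = 1 << frac_bits
    return [q / scale for q in quantized]
```

For example, with 8 total bits and 4 fraction bits, 0.5 maps to the integer 8, and out-of-range values saturate at the clip bounds.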
The neural network to be compiled includes, but is not limited to, CNN, DNN (Deep Neural Networks), RNN (Recurrent Neural Networks), LSTM (Long Short-Term Memory), SNN (Spiking Neural Networks), and Transformer models.
Step 20, parsing the standard model data and the quantization data and performing a software inference operation to obtain intermediate model structure information and intermediate parameter data.
In this embodiment, a parsing tool is used to parse the ONNX model data of the network to be compiled and the corresponding quantization data, and a further software inference operation is then performed on the parsed data. Parsing extracts the model's structure information and network parameters by analyzing the deep learning model description file, and performs operations such as connection transformation, operator fusion, and operator splitting on the model computation graph according to the hardware constraints. Combined with an inference run of the neural network, the concrete software result of each layer's operation can be obtained. The intermediate model structure information and the intermediate parameter data are obtained through the above operations. The hardware constraints include computing resource constraints and storage resource constraints.
In practical application, the intermediate model structure information includes model structure description information (a prototxt document) and model structure analysis data; the intermediate parameter data includes weights, quantization parameters, biases, and other data related to network inference, network input data provided by a random-generation program or a document-loading program, and layer-by-layer software inference result data. The model structure description information is a document in a special format with the file suffix ".prototxt", referred to in this application as a prototxt document. The prototxt document stores information describing the neural network structure and can be viewed with the Caffe (convolutional neural network framework) tools.
In some examples, the ONNX model data and the corresponding quantization data are parsed by an ONNX parsing tool written in the Python language. The ONNX parsing tool may also be written using other languages or frameworks such as Caffe.
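The parsing step can be sketched as follows. The dict-based graph format and field names are hypothetical stand-ins for a parsed ONNX graph, not the actual tool's data structures:

```python
def parse_model_graph(graph):
    """Extract per-layer structure information from a graph description.

    `graph` is a plain dict standing in for a parsed ONNX GraphProto;
    the returned structure plays the role of the intermediate model
    structure information described in step 20 (illustrative only).
    """
    layers = []
    for node in graph["nodes"]:
        layers.append({
            "name": node["name"],
            "op_type": node["op_type"],          # e.g. "Conv", "Relu"
            "inputs": list(node.get("inputs", [])),
            "outputs": list(node.get("outputs", [])),
        })
    return {"model_name": graph.get("name", ""), "layers": layers}
```

A real implementation would additionally collect weights and quantization parameters and apply operator fusion/splitting under the hardware constraints.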
Step 30, automatically generating a corresponding verification model according to the model structure description information.
In this embodiment, a golden verification model is used as the verification model in the neural network compiling method. The generation result of the golden verification model is stored in matrix-element form and can be directly used as a software comparison result, verified element by element against the software inference result. Meanwhile, the golden verification model is a verification design based on the hardware architecture, so the generated calculation result can be flexibly converted into a hardware-comparable format. If a problem is found when comparing against the software inference result, the error position can be located quickly, and the problem can be solved through rapid iteration, verification, and updating. The verification model can thus resolve errors at the software level before compiling and deployment, converting complex error-correction deployment at the hardware level into rapid verification and error correction at the software level. In this embodiment, MATLAB is used to generate the golden verification model so that the verification model better fits the calculation process of the hardware platform. In other embodiments, the verification model may also be implemented in C or other programming languages.
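The element-by-element comparison that the golden verification model's matrix-form output enables can be sketched as below; the function name and return format are illustrative assumptions, not the patent's interface:

```python
def compare_elementwise(golden, reference, tolerance=0):
    """Compare golden-model output against software inference results
    element by element, returning (index, golden_value, reference_value)
    for every mismatch so the error position can be located directly.
    Illustrative sketch; a real flow would compare per layer and per
    tensor element, possibly with a fixed-point tolerance.
    """
    mismatches = []
    for i, (g, r) in enumerate(zip(golden, reference)):
        if abs(g - r) > tolerance:
            mismatches.append((i, g, r))
    return mismatches
```

An empty result means the layer passes; a non-empty result pinpoints exactly where the two computations diverge.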
Further, referring to fig. 2, step 30, automatically generating a corresponding verification model according to the model structure description information, including:
step 31, inputting model structure description information, and creating a calculation code document and a calling code document of the verification model.
In this embodiment, the prototxt document obtained in step 20 is input as the base document for generating the golden verification model, and a verification-model calculation code document corresponding to the neural network to be compiled and a corresponding calling code document are newly created.
Step 32, traversing all layers in the model structure description information, and writing the corresponding calculation operation codes into a calculation code document layer by layer according to the layer types; and the computing operation in the corresponding computing operation code is realized by running the computing function of the corresponding operation in the preset verification model function library.
Step 33, traversing all layers in the model structure description information, and writing the processing operation codes for the calculation results of the corresponding layers into the calling code document according to the layer types in the model structure description information.
Step 34, writing, in the calling code document of the verification model, the corresponding codes for generating a software comparison result file and a hardware-comparable file.
Step 35, obtaining a verification model corresponding to the neural network to be compiled.
In this embodiment, a golden verification model corresponding to the network to be compiled is automatically generated according to the network structure information in the prototxt document. The golden verification model includes a verification-model calculation code document and a corresponding calling code document.
In practical application, the prototxt document of the network to be compiled is loaded for reading, and a corresponding golden-verification-model calculation code document is newly created for writing. All layers in the prototxt document are traversed; the type of the current layer is judged, and the corresponding calculation operation code is automatically written to the corresponding position in the calculation code document according to the layer type, yielding the calculation code of the corresponding layer in the verification model. The computing operations in the calculation operation code are realized by running the computing functions of the corresponding operations in a preset verification-model function library. Layer types include, but are not limited to, batch normalization, ReLU, pooling, element-wise addition, and deconvolution. The preset verification-model function library contains pre-written calculation functions for the verification model's computing operations; according to the model types of the different layers, these include, but are not limited to, a convolution calculation function, a pooling-layer calculation function, and a ReLU calculation function. Writing the corresponding verification-model calculation codes into the calculation code document layer by layer yields the calculation code document of the verification model.
In practical application, after the calculation code document of the verification model has been written, the calling code document of the verification model (the gold_top document) is opened for writing, and the code of the preprocessing part of the gold_top document is written automatically. All layers in the prototxt document are traversed, an accumulation count is kept for each layer type, whether a layer of a given type exists is judged by whether its count is greater than 0, and the processing operation code for that layer type's calculation results is automatically written into the gold_top document. The layer types include, but are not limited to, batch normalization, ReLU, pooling, element-wise addition, and deconvolution; the processing operation code for a layer type's calculation results performs operations including saving and printing the data results. The corresponding codes for generating the software comparison result file and the hardware-comparable file are automatically written into the calling code document, completing the calling code document (gold_top document) of the verification model. Subsequently, by calling the gold_top document, the input test data of the neural network to be compiled is loaded automatically, the verification model is run, and the corresponding software comparison result file and hardware-comparable file are output.
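The layer-by-layer code generation described above (traverse the layers, dispatch on layer type, emit the corresponding calculation code) can be sketched as follows. The templates and function names (conv2d, relu, max_pool) are illustrative stand-ins for the preset verification-model function library, and the patent generates MATLAB rather than Python code:

```python
# Map from layer type to an emitted line of verification-model code.
# These templates are hypothetical; the real library covers batch
# normalization, pooling, element-wise addition, deconvolution, etc.
CODE_TEMPLATES = {
    "Conv": "x = conv2d(x, weights['{name}'])",
    "ReLU": "x = relu(x)",
    "Pool": "x = max_pool(x)",
}


def generate_calc_code(layers):
    """Emit one calculation line per layer description, dispatched on
    layer type, producing the body of a calculation code document."""
    lines = []
    for layer in layers:
        template = CODE_TEMPLATES.get(layer["op_type"])
        if template is None:
            raise ValueError(f"unsupported layer type: {layer['op_type']}")
        lines.append(template.format(name=layer["name"]))
    return "\n".join(lines)
```

The calling code document would be generated analogously, emitting result-saving and result-printing code for each layer type that appears at least once.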
It should be noted that, in practical applications, the same technical effects of this embodiment can be achieved as long as step 30 and the corresponding steps 31-35 are performed after step 20 and before step 70.
Step 40, preprocessing the intermediate model structure information and the intermediate parameter data to obtain network model configuration data; the network model configuration data comprises compilable network hierarchy information, weight parameters of the neural network to be compiled, input test data of the neural network to be compiled, and a software inference result.
In practical application, since hardware processing is a regular and orderly operation, the model structure obtained by the parsing tool and the corresponding parameter data need to be preprocessed to meet the requirements of subsequent computation. In step 40 of this embodiment, the above intermediate model structure information and intermediate parameter data are converted into network model configuration data that a hardware compiler can recognize and that is suitable for subsequent processing. Specifically, the network model configuration data includes compilable network hierarchy information, the weight parameters of the neural network to be compiled, the input test data of the neural network to be compiled, and the software inference result. The network model configuration data can be used directly for hardware compiling, and some of the parameters can also serve as input to the verification model for generating the golden verification model's test comparison results.
And step 50, obtaining a calculation loop block cutting scheme aiming at the neural network to be compiled according to the network model configuration data and by combining the hardware parameters of the hardware platform to be deployed.
In order to better deploy the neural network to be compiled onto the corresponding hardware platform and make full use of on-chip resources, the dimensions of the network model need to be partitioned. The partitioned network model better fits the storage and computing resources of the hardware accelerator to be deployed, so that on-chip resource utilization and data reuse are maximized while data communication is minimized. The calculation loop block-cutting (loop tiling) scheme is a common hardware acceleration optimization: the computation loops of the model's network structure are cut along certain dimensions to relieve the pressure of on-chip/off-chip data transmission, while making more effective use of limited on-chip resources.
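A minimal sketch of such loop block-cutting, assuming a simple channel-mixing computation tiled along the output-channel and output-width dimensions (tile sizes here are illustrative stand-ins for values that would be derived from on-chip buffer capacity):

```python
import numpy as np

def conv_output_tiled(ifm, weights, tile_oc=4, tile_ow=8):
    """Compute out = weights @ ifm in blocks.
    ifm: input feature map, shape (IC, W); weights: shape (OC, IC).
    Each inner block only needs its own slice of weights/activations
    resident on-chip, which is the point of the block-cutting scheme."""
    ic, w = ifm.shape
    oc = weights.shape[0]
    out = np.zeros((oc, w))
    for oc0 in range(0, oc, tile_oc):          # tile over output channels
        for w0 in range(0, w, tile_ow):        # tile over output width
            oc1 = min(oc0 + tile_oc, oc)
            w1 = min(w0 + tile_ow, w)
            out[oc0:oc1, w0:w1] = weights[oc0:oc1] @ ifm[:, w0:w1]
    return out
```

The tiled result is numerically identical to the untiled computation; only the order of data movement changes, which is what lets the compiler trade tile sizes against on-chip storage and parallelism.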
Step 60, determine whether the current execution mode is a test mode or a user mode.
The neural network compiling method disclosed in this embodiment provides two working modes: a test mode and a user mode. The test mode is used during development and testing of the corresponding compiling system to ensure the correctness of the model parsing and compiling process and thus shorten the development cycle. The user mode is used when a user operates the corresponding compiling system: only the neural network model to be compiled needs to be parsed and compiled to generate a hardware configuration file. In actual use, the compiling requirements need to be input when the corresponding compiling system is run. In some implementations, the compiling requirements include whether the mode currently running the compiling system is the user mode or the test mode, whether comparison of the golden verification model calculation results is required, whether the batch normalization layer needs to be parsed, whether the on-board test results need to be printed, whether the model needs to be automatically cut into blocks, whether an input test data document of the neural network to be compiled is loaded or the input data of the neural network to be compiled is randomly generated, whether channel-by-channel quantization is required, and the like.
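The compiling requirements listed above are naturally represented as a flat set of switches. The structure below is purely illustrative (the patent does not specify field names or a file format):

```python
# Hypothetical compile-requirements structure; every field name is assumed.
compile_requirements = {
    "mode": "test",                    # "test" or "user"
    "compare_golden_results": True,    # compare golden verification model output
    "parse_batch_norm": True,          # whether to parse batch normalization layers
    "print_onboard_results": False,    # print on-board test results
    "auto_tile_model": True,           # automatically cut the model into blocks
    "load_input_data": True,           # False -> randomly generate input data
    "per_channel_quantization": False, # channel-by-channel quantization
}
```

Driving both the test-mode and user-mode paths from one such structure keeps the branch at step 60 a single lookup rather than scattered flags.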
If the current mode is the test mode, steps 70 to 90 are performed.
And step 70, operating the verification model to obtain a calculation result of the verification model, and comparing the calculation result with a software reasoning result to obtain a software comparison result.
In this embodiment, according to the current compiling mode, if it is the test mode, the generated golden verification model is automatically run, and the verification model calculation result is obtained through a series of calculations. The obtained calculation result of the verification model is compared with the software inference result obtained by the preprocessing in step 40 to obtain the software comparison result.
Further, referring to fig. 3, step 70, running the verification model to obtain a calculation result of the verification model, and comparing the calculation result with the software inference result to obtain a software comparison result, including:
and step 71, operating the calling code document of the verification model and loading the software reasoning result.
In this embodiment, the code document corresponding to the main calling function of the verification model, i.e., the gold_top document, is called to run the golden verification model; meanwhile, the software inference result obtained in step 40 is loaded as the basis for the subsequent software comparison.
Step 72, inputting the input test data of the neural network to be compiled and the hardware parameters of the hardware platform to be deployed, running the calculation code of a target layer in the verification model to obtain the calculation result of the target layer in the verification model, and comparing the calculation result of the target layer in the verification model with the data of the corresponding layer in the software reasoning result to obtain the software comparison result of the target layer, wherein the target layer is any layer in the verification model.
The input test data of the neural network to be compiled in the network model configuration data obtained in step 40 and the hardware parameters of the hardware platform to be deployed are input. The relevant parameters in the golden verification model are configured according to the hardware parameters. The calculation codes are then called layer by layer according to the network structure of the neural network to be compiled to generate the verification model calculation result.
Because the calculation result of the verification model and the software inference result are both stored as matrix arrays, they can be compared directly; the comparison between the calculation result of the target layer in the verification model and the software inference result of the corresponding layer can therefore be completed layer by layer and element by element, yielding the software comparison result of the target layer.
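The element-by-element comparison of one layer can be sketched as follows (the function name and return convention are assumptions; reporting the mismatch count and first mismatching index is one reasonable way to make the software comparison result useful for error location):

```python
import numpy as np

def compare_layer(golden, reference, tol=0):
    """Element-wise comparison of one layer's verification-model result
    against the software inference result (both stored as matrix arrays).
    Returns (passed, mismatch_count, first_mismatch_index)."""
    diff = np.abs(np.asarray(golden) - np.asarray(reference))
    mismatches = np.argwhere(diff > tol)       # indices where values differ
    first = tuple(mismatches[0]) if len(mismatches) else None
    return len(mismatches) == 0, len(mismatches), first
```

With quantized fixed-point data the tolerance would normally be 0 (bit-exact match expected); a nonzero `tol` is shown only to cover floating-point reference data.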
And 73, traversing all layers in the verification model to obtain a calculation result of the verification model and a software comparison result.
In the actual application process, the obtained software comparison result of each layer is written into a software comparison result document, and the software comparison result is used as a reference basis for the quick verification error correction operation of the corresponding software layer.
And 80, generating a hardware contrastable file according to the calculation result of the verification model.
In this embodiment, the calling code document of the verification model (the gold_top document) runs the calculation codes of the functions corresponding to the hardware contrastable file, and generates the hardware contrastable data of each layer according to that layer's verification model calculation result. The hardware contrastable data of each layer are written into the hardware contrastable file, yielding a file that can be compared with the hardware implementation result layer by layer. The operations for generating the hardware contrastable file mainly comprise changing the dimensions of the data matrix to convert the data matrix in the verification model calculation result into the hardware storage form, and performing base conversion of the data, i.e., converting decimal data into binary and hexadecimal data. The generated hardware contrastable files optionally comprise a simulation contrastable data file and an on-board contrastable data file.
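A minimal sketch of the reshape-and-convert step, under the assumption of a 16-bit quantized data width and a simple row-major hardware storage order (the actual storage order and word grouping are hardware-specific and not specified by the patent):

```python
import numpy as np

def to_hardware_compare_lines(result, group=4):
    """Flatten a layer's result matrix into the (assumed) hardware storage
    order and render each fixed-point value as a 16-bit two's-complement
    hexadecimal word, `group` words per output line."""
    flat = np.asarray(result, dtype=np.int64).ravel()      # layout change
    words = [format(int(v) & 0xFFFF, "04x") for v in flat] # 16-bit hex
    return [" ".join(words[i:i + group]) for i in range(0, len(words), group)]
```

Masking with `0xFFFF` makes negative quantized values come out in two's-complement form, which is what a 16-bit hardware memory dump would contain.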
And step 90, compiling and generating a hardware configuration file according to the network model configuration data, the hardware parameters and the calculation loop block cutting scheme.
In this embodiment, the network model configuration data include the compilable network hierarchy information, the weight parameters of the neural network to be compiled, the input test data of the neural network to be compiled and the software inference result, and the hardware parameters include, but are not limited to, data bit width, computation parallelism and on-chip storage resources. According to the network model configuration data, the hardware parameters and the calculation loop block-cutting scheme, the data transmission configuration (including DMA (direct memory access) scheduling instructions and the like), the overall scheduling configuration, the internal configuration of the module units (including register parameter assignment and the like) and the storage configuration of the hardware platform are performed to generate the hardware configuration file. Based on the generated hardware configuration file, the hardware platform completes the configuration of the relevant registers according to the compiling requirements, and stores the calculation data, parameters and scheduling instructions in the specified order.
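The register-assignment portion of such a configuration file might be rendered as below. Every register name, field, and the textual format here are illustrative assumptions; the patent does not disclose the configuration file's actual layout:

```python
def emit_register_config(layers):
    """Render per-layer register assignments for a hardware configuration
    file. `layers` is a list of dicts with (assumed) keys 'kernel',
    'stride' and 'parallelism' taken from the network model configuration
    data and the block-cutting scheme."""
    lines = []
    for i, layer in enumerate(layers):
        lines.append(f"LAYER{i}_KERNEL = {layer['kernel']}")
        lines.append(f"LAYER{i}_STRIDE = {layer['stride']}")
        lines.append(f"LAYER{i}_PARALLEL = {layer['parallelism']}")
    return "\n".join(lines)
```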
If the current mode is the user mode, step 90 is performed directly.
And step 90, compiling and generating a hardware configuration file according to the network model configuration data, the hardware parameters and the calculation loop block cutting scheme.
In this embodiment, in the user mode, step 90 is executed directly without model verification to obtain the hardware configuration file. In practical application, whether in the test mode or the user mode, a hardware configuration file is finally generated, completing the compiling work of the network model for deploying it to the corresponding hardware platform.
The hardware configuration file generated by compiling can control the behavior of the hardware accelerator by changing the register configuration or the hardware parameters. Meanwhile, through the software comparison result generated by the verification model in the test mode, the location of a problem can be pinpointed, which solves the problems of conventional compilers in which operators must be forcibly adapted and errors can be located only through on-board operation.
This embodiment discloses a neural network compiling method, which comprises: inputting standard model data and quantized data of the network to be compiled and hardware parameters of the hardware platform to be deployed; parsing the standard model data and quantized data to obtain intermediate model structure information, including model structure description information, and intermediate parameter data; automatically generating a verification model according to the model structure description information; preprocessing the intermediate model structure information and intermediate parameter data to obtain network model configuration data usable for hardware compiling; combining the hardware parameters to obtain a calculation loop block-cutting scheme for the neural network to be compiled; judging the current compiling mode, and if it is the test mode, running the generated verification model to obtain the verification model calculation result, comparing it with the software inference result in the network model configuration data to obtain the software comparison result, generating the hardware contrastable file, and finally compiling according to the network model configuration data, the hardware parameters and the calculation loop block-cutting scheme to generate the hardware configuration file; and if it is the user mode, directly compiling according to the network model configuration data, the hardware parameters and the calculation loop block-cutting scheme to generate the hardware configuration file.
The neural network compiling method combines the compiling process with the verification process, and provides the software comparison result and the hardware contrastable file through the generated verification model, thereby ensuring the correctness of the network model parsing and compiling process, accelerating testing and development in the product development stage, and shortening the development cycle.
Furthermore, the verification model in the neural network compiling method disclosed in this embodiment can resolve errors at the software level by simulating the hardware inference behavior before compiling and deployment. Meanwhile, iterative error correction and redeployment at the hardware level is converted into verification and error correction at the software level, which greatly shortens the product test and development cycle.
A second embodiment of the present application provides a neural network compiling system, which is used for implementing the steps of the neural network compiling method disclosed in the first embodiment. Referring to fig. 2, the neural network compiling system provided in this embodiment includes a model parsing module, a model converting module, a determining module, a model compiling module, and a verifying module.
The model analysis module is used for analyzing the standard model data and the quantitative data and performing software reasoning operation to obtain intermediate model structure information and intermediate parameter data; the intermediate model structure information comprises model structure description information and model structure analysis data.
The model conversion module is used for executing the following operations: preprocessing the intermediate model structure information and the intermediate parameter data to obtain network model configuration data; the network model configuration data comprises network hierarchical structure information which can be compiled, weight parameters of the neural network to be compiled, input test data of the neural network to be compiled and a software reasoning result; and obtaining a calculation loop block cutting scheme aiming at the neural network to be compiled according to the network model configuration data and the hardware parameters of the hardware platform to be deployed.
The judging module is used for judging whether the current execution mode of the compiling system is a user mode or a test mode.
And the model compiling module is used for compiling and generating a hardware configuration file according to the network model configuration data, the hardware parameters and the calculation cycle block cutting scheme.
The model verification module is configured to perform the following operations: automatically generating a corresponding verification model according to the model structure description information; operating the verification model to obtain a calculation result of the verification model, and comparing the calculation result with a software reasoning result to obtain a software comparison result; and generating a hardware contrastable file according to the calculation result of the verification model.
Further, when the model verification module is configured to automatically generate a corresponding verification model, it is configured to:
step 31, inputting model structure description information, and creating a calculation code document and a calling code document of the verification model.
Step 32, traversing all layers in the model structure description information, and writing the corresponding calculation operation codes into a calculation code document layer by layer according to the layer types; and the computing operation in the corresponding computing operation code is realized by running the computing function of the corresponding operation in the preset verification model function library.
And step 33, traversing all layers in the model structure description information, and writing the processing operation codes of the calculation results of the corresponding layers into the calling code document according to the types of the layers in the model structure description information.
And step 34, writing corresponding codes for generating a software comparison result file and a hardware comparison file in a calling code document of the verification model.
And step 35, obtaining a verification model.
Further, the model verification module is further configured to:
and step 71, operating the calling code document of the verification model and loading the software reasoning result.
Step 72, inputting input test data of the neural network to be compiled and hardware parameters of the hardware platform to be deployed, operating calculation codes of a target layer in the verification model for calculation to obtain a calculation result of the target layer in the verification model, and comparing data of a corresponding layer in a software reasoning result according to the calculation result of the target layer in the verification model to obtain a software comparison result of the target layer; wherein the target layer is any layer in the verification model.
And 73, traversing all layers in the verification model to obtain a calculation result of the verification model and a software comparison result.
This embodiment gives two examples of the generation process and the verification process of the model verification module. The neural network to be compiled in this embodiment includes, but is not limited to, CNN, DNN, RNN, LSTM, SNN and Transformer models. The following two examples illustrate the implementation of this embodiment, taking two CNN models as the neural network models to be compiled.
Example one takes a resnet18 network model with a quantized data bit width of 16 bits as the model to be compiled to illustrate the generation process of the model verification module and the functions it performs. The prototxt document parsed from the resnet18 network model is loaded for reading, and the calculation code document and calling code document of the golden verification model are created. First, the calculation code document is opened for writing. The type of the current layer is judged; if it is a convolution operation, the code of the convolution operation is written into the verification model calculation code document, wherein the calculation code calls the convolution calculation function in the preset verification model function library. In the resnet18 model, the calculation types of the network model include convolution, max pooling, average pooling, element-wise addition, fully connected layer, and the like. All layers of the resnet18 network are traversed to complete the writing of the calculation code document of the golden verification model.
The calling code document of the golden verification model (the gold_top document) is then opened, and the code of its preprocessing part is automatically written according to the prototxt document of the network model. All layers in the prototxt document are traversed, an accumulation count is kept for each layer type, whether a layer of a given type exists is judged according to whether its accumulated count is greater than 0, and the processing operation code for the calculation results of that layer type is automatically written into the gold_top document; the layer types include, but are not limited to, batch normalization, ReLU, pooling, element-wise addition and deconvolution. The corresponding code for generating the software comparison result file and the hardware contrastable file (comprising the simulation contrastable data file and the on-board contrastable data file) is automatically written. After writing is completed, the code document corresponding to the main calling function of the verification model, the gold_top document, is obtained. Subsequently, by calling the gold_top document, the input test data of the neural network to be compiled is automatically loaded, the verification model is run, and the corresponding software comparison result file and hardware comparison result file are output.
When the test mode is started, the generated gold_top document is called; the input test data of the neural network to be compiled in the network model configuration data obtained by the data preprocessing part of the model conversion module, together with the corresponding comparison data obtained by software inference, are loaded, and the hardware parameters of the platform to be deployed are obtained. The calculation codes in the calculation code document of the golden verification model of the network to be verified are run, calculating layer by layer according to the structure of the resnet18 network model. The input test data of the verification model are consistent with the input test data of the software inference process in the model parsing module. During calculation, when specific operations such as convolution, pooling and batch normalization are involved, the corresponding calculation function in the preset golden verification model function library is called; each layer is calculated to obtain the calculation result of that layer in the verification model, which is compared with the data result of that layer in the loaded software inference data, until all layers of the network are calculated, after which the software comparison result file and hardware contrastable file of each layer are printed.
Example two takes the VGG16 network model with a quantized data bit width of 16 bits as the model to be compiled to illustrate the generation process of the model verification module and the functions it performs. First, the prototxt document parsed from the VGG16 network is loaded for reading; the corresponding calculation code document of the golden verification model is opened for writing, all layers are traversed to write the corresponding codes, and the writing of the calculation code document of the golden verification model is completed. The calling code document of the corresponding golden verification model is then opened for writing, the corresponding processing operation codes are written according to the layer types present in the VGG16 network, and the calling code document of the golden verification model is completed, thereby obtaining the verification model corresponding to the VGG16 network.
When the test mode is started, the calling code document corresponding to the VGG16 network verification model loads the network model configuration data and the hardware parameters and runs the verification model; different calculation functions are called to calculate and compare the results layer by layer, and after all layers are calculated, the software comparison result file and hardware contrastable file of each layer are printed.
In practical applications, the neural network compiling system may also be referred to as a neural network compiling tool chain, so as to visually represent a series of processing procedures implemented by the neural network compiling system.
With the above, the second embodiment of the present application discloses a neural network compiling system for implementing the steps of the neural network compiling method disclosed in the first embodiment. The system has two working modes: a user mode and a test mode. The user mode is started by the user, while the test mode is used during development and testing of the compiling system to verify the correctness of the model parsing and compiling process. The neural network compiling system provided by this embodiment combines the automatic verification process with the network compiling process, and provides the software comparison result and the hardware contrastable file through the verification model generated by the model verification module, thereby accelerating testing and development in the product development stage and shortening the development cycle.
It should be noted that, before the compiling system is run, in addition to inputting the standard model data and quantized data of the neural network to be compiled and the hardware parameters of the hardware platform to be deployed, the current compiling requirements need to be configured according to the actual situation, for example: whether the current mode of the compiling system is the user mode or the test mode, whether the verification results need to be compared, whether the batch normalization layer needs to be parsed, whether the on-board test results need to be printed, whether the model needs to be automatically cut into blocks, whether test data is loaded or randomly generated, whether channel-by-channel quantization is required, and the like.
A third embodiment of the present application discloses a computer device comprising a memory and a processor; wherein the memory is for storing a computer program; the processor is configured to implement the steps of the neural network compiling method according to the first embodiment of the present application when executing the computer program.
A fourth embodiment of the present application discloses a computer-readable storage medium. The storage medium stores a computer program that, when executed by a processor, implements the steps of the neural network compiling method according to the first embodiment of the present application.
The present application has been described in detail with reference to specific embodiments and illustrative examples, but the description is not intended to limit the application. Those skilled in the art will appreciate that various equivalent substitutions, modifications or improvements may be made to the presently disclosed embodiments and implementations thereof without departing from the spirit and scope of the present disclosure, and these fall within the scope of the present disclosure. The protection scope of this application is subject to the appended claims.
Similar parts in the above embodiments are referred to each other.

Claims (8)

1. A neural network compiling method, comprising:
inputting standard model data and quantitative data of a neural network to be compiled and hardware parameters of a hardware platform to be deployed;
analyzing the standard model data and the quantitative data and performing software reasoning operation to obtain intermediate model structure information and intermediate parameter data; the intermediate model structure information comprises model structure description information and model structure analysis data;
automatically generating a corresponding verification model according to the model structure description information;
preprocessing the intermediate model structure information and the intermediate parameter data to obtain network model configuration data; the network model configuration data comprises network hierarchical structure information which can be compiled, weight parameters of the neural network to be compiled, input test data of the neural network to be compiled and a software reasoning result;
according to the network model configuration data, combining the hardware parameters to obtain a calculation cycle block cutting scheme aiming at the neural network to be compiled;
judging whether the current execution mode is a test mode or a user mode;
if the test mode is adopted, the following operations are executed:
operating the verification model to obtain a calculation result of the verification model, and comparing the calculation result with the software reasoning result to obtain a software comparison result;
generating a hardware contrastable file according to the calculation result of the verification model;
compiling to generate a hardware configuration file according to the network model configuration data, the hardware parameters and the calculation cycle block cutting scheme;
if the mode is the user mode, the following operations are executed:
and compiling to generate a hardware configuration file according to the network model configuration data, the hardware parameters and the calculation cycle block cutting scheme.
2. The neural network compiling method of claim 1, wherein the automatically generating the corresponding verification model according to the model structure description information comprises:
inputting the model structure description information, and creating a calculation code document and a calling code document of the verification model;
traversing all layers in the model structure description information, and writing corresponding calculation operation codes into the calculation code document layer by layer according to layer types; the computing operation in the corresponding computing operation code is realized by running a computing function of the corresponding operation in a preset verification model function library;
traversing all layers in the model structure description information, and writing the processing operation codes of the calculation results of the corresponding layers into the calling code document according to the types of the layers in the model structure description information;
writing corresponding codes for generating a software comparison result file and a hardware contrastable file in the calling code document;
obtaining the verification model.
3. The neural network compiling method of claim 2, wherein the operating the verification model to obtain a computation result of the verification model, and comparing the computation result with the software reasoning result to obtain a software comparison result comprises:
operating the calling code document and loading the software reasoning result;
inputting input test data and the hardware parameters of the neural network to be compiled, operating a calculation code of a target layer in the verification model for calculation to obtain a calculation result of the target layer in the verification model, and comparing data of a corresponding layer in the software reasoning result according to the calculation result of the target layer in the verification model to obtain a software comparison result of the target layer; wherein the target layer is any layer in the verification model;
and traversing all layers in the verification model to obtain the calculation result of the verification model and the software comparison result.
4. A neural network compiling system, characterized by being used for implementing the steps of the neural network compiling method according to any one of claims 1 to 3; the neural network compiling system comprises:
the model analysis module is used for analyzing standard model data and quantitative data and performing software reasoning operation to obtain intermediate model structure information and intermediate parameter data; the intermediate model structure information comprises model structure description information and model structure analysis data;
a model transformation module to perform the following operations: preprocessing the intermediate model structure information and the intermediate parameter data to obtain network model configuration data; the network model configuration data comprises network hierarchical structure information which can be compiled, weight parameters of the neural network to be compiled, input test data of the neural network to be compiled and a software reasoning result; according to the network model configuration data, combining hardware parameters of a hardware platform to be deployed to obtain a calculation loop block cutting scheme aiming at the neural network to be compiled;
the judging module is used for judging whether the current execution mode of the compiling system is a user mode or a test mode;
the model compiling module is used for compiling and generating a hardware configuration file according to the network model configuration data, the hardware parameters and the calculation cycle block cutting scheme;
a model verification module to perform the following operations: automatically generating a corresponding verification model according to the model structure description information; operating the verification model to obtain a calculation result of the verification model, and comparing the calculation result with the software reasoning result to obtain a software comparison result; and generating a hardware contrastable file according to the calculation result of the verification model.
5. The neural network compiling system of claim 4 wherein the model verification module is further configured to:
inputting the model structure description information, and creating a calculation code document and a calling code document of the verification model;
traverse all layers in the model structure description information, and write the corresponding calculation operation code into the calculation code document layer by layer according to the layer type; wherein the calculation operations in the corresponding calculation operation code are implemented by running the calculation functions for the corresponding operations in a preset verification model function library;
traverse all layers in the model structure description information, and write the code that processes each layer's calculation result into the calling code document according to the layer type in the model structure description information;
write, into the calling code document, the corresponding code for generating the software comparison result file and the hardware-comparable file;
thereby obtaining the verification model.
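The generation flow of claim 5 can be sketched as a traversal that emits per-layer compute code into a calculation-code document and per-layer comparison code into a calling-code document. The layer types, the function-library mapping, and the emitted helper names (`compare`, `dump_software_report`, `dump_hardware_comparable_file`) are all hypothetical placeholders, not names from the patent.

```python
# Hedged sketch: generate the two code documents of claim 5 by traversing
# layer descriptions. VERIFY_LIB stands in for the preset verification
# model function library; every emitted name is an illustrative assumption.

VERIFY_LIB = {
    "conv": "verify_lib.conv2d",
    "relu": "verify_lib.relu",
    "fc": "verify_lib.fully_connected",
}

def generate_verification_model(layers):
    calc_lines, call_lines = [], []
    for i, layer in enumerate(layers):
        fn = VERIFY_LIB[layer["type"]]
        # Calculation code document: one compute function per layer.
        calc_lines.append(f"def layer_{i}(x): return {fn}(x, params[{i}])")
        # Calling code document: run the layer, compare with the reference.
        call_lines.append(f"x = layer_{i}(x); compare(x, reference[{i}])")
    # Code that writes the software comparison and hardware-comparable files.
    call_lines.append("dump_software_report(); dump_hardware_comparable_file()")
    return "\n".join(calc_lines), "\n".join(call_lines)

calc_doc, call_doc = generate_verification_model(
    [{"type": "conv"}, {"type": "relu"}, {"type": "fc"}])
print(calc_doc)
```

Running the resulting calling code document against the software inference result would then produce the software comparison result of claim 4.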
6. The neural network compiling system of claim 5, wherein the model verification module is further configured to:
run the calling code document and load the software inference result;
take as input the input test data of the neural network to be compiled and the hardware parameters, run the calculation code of a target layer in the verification model to obtain the calculation result of the target layer, and compare it with the data of the corresponding layer in the software inference result to obtain the software comparison result of the target layer; wherein the target layer is any layer in the verification model;
and traverse all layers in the verification model to obtain the calculation result of the verification model and the software comparison result.
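The layer-by-layer verification of claim 6 can be sketched as follows: feed the test input through each layer function and compare the output against the corresponding software inference result within a tolerance. The tolerance value and the flat-list data layout are assumptions for illustration.

```python
# Hedged sketch of claim-6 verification: run each layer of the verification
# model and compare its output element-wise against the software inference
# (reference) result for that layer. Data layout and tolerance are assumed.

def verify_layers(layer_fns, test_input, reference_outputs, tol=1e-5):
    results, report = [], []
    x = test_input
    for i, fn in enumerate(layer_fns):
        x = fn(x)                                   # compute this layer
        ok = all(abs(a - b) <= tol                  # element-wise comparison
                 for a, b in zip(x, reference_outputs[i]))
        results.append(x)
        report.append((i, "PASS" if ok else "FAIL"))
    return results, report

# Toy two-layer model: double, then add one.
layers = [lambda v: [e * 2 for e in v], lambda v: [e + 1 for e in v]]
ref = [[2.0, 4.0], [3.0, 5.0]]
_, report = verify_layers(layers, [1.0, 2.0], ref)
print(report)  # → [(0, 'PASS'), (1, 'PASS')]
```

In this sketch the per-layer results would feed the hardware-comparable file, and the PASS/FAIL report plays the role of the software comparison result.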
7. A computer device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the neural network compilation method as claimed in any one of claims 1 to 3 when executing the computer program.
8. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the steps of the neural network compiling method according to any one of claims 1 to 3.
CN202111647960.3A 2021-12-30 2021-12-30 Neural network compiling method, system, computer device and storage medium Pending CN114399019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111647960.3A CN114399019A (en) 2021-12-30 2021-12-30 Neural network compiling method, system, computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111647960.3A CN114399019A (en) 2021-12-30 2021-12-30 Neural network compiling method, system, computer device and storage medium

Publications (1)

Publication Number Publication Date
CN114399019A true CN114399019A (en) 2022-04-26

Family

ID=81228399

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111647960.3A Pending CN114399019A (en) 2021-12-30 2021-12-30 Neural network compiling method, system, computer device and storage medium

Country Status (1)

Country Link
CN (1) CN114399019A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024061287A1 (en) * 2022-09-23 2024-03-28 维沃移动通信有限公司 Artificial intelligence (ai) model transmission method and apparatus, and terminal and medium
CN115829035A (en) * 2022-12-29 2023-03-21 苏州市欧冶半导体有限公司 Distributed quantization method, system and terminal equipment
CN115829035B (en) * 2022-12-29 2023-12-08 苏州市欧冶半导体有限公司 Distributed quantization method, system and terminal equipment
CN115981666A (en) * 2023-03-21 2023-04-18 北京探境科技有限公司 Neural network information integration method, device, system and storage medium
CN115981666B (en) * 2023-03-21 2023-07-21 北京探境科技有限公司 Neural network information integration method, device, system and storage medium
CN116301920A (en) * 2023-03-23 2023-06-23 东北大学 Compiling system for deploying CNN model to high-performance accelerator based on FPGA
CN116301920B (en) * 2023-03-23 2023-11-07 东北大学 Compiling system for deploying CNN model to high-performance accelerator based on FPGA
CN116341633A (en) * 2023-05-29 2023-06-27 山东浪潮科学研究院有限公司 Model deployment method, device, equipment and storage medium
CN116341633B (en) * 2023-05-29 2023-09-01 山东浪潮科学研究院有限公司 Model deployment method, device, equipment and storage medium
CN117407299A (en) * 2023-10-18 2024-01-16 北京大学 Model test method and system
CN117407299B (en) * 2023-10-18 2024-05-07 北京大学 Model test method and system
CN118012468A (en) * 2024-04-08 2024-05-10 浙江深象智能科技有限公司 Model processing method, system and equipment

Similar Documents

Publication Publication Date Title
CN114399019A (en) Neural network compiling method, system, computer device and storage medium
US20210174214A1 (en) Systems and methods for quantizing a neural network
CN111126668B (en) Spark operation time prediction method and device based on graph convolution network
CN113032195A (en) Chip simulation verification method, system, equipment and storage medium
Bergmann Translating OCL to graph patterns
EP3846034B1 (en) Systems and methods for automated testing using artificial intelligence techniques
US20100275186A1 (en) Segmentation for static analysis
CN110780879B (en) Decision execution method, device, equipment and medium based on intelligent compiling technology
CN112989363B (en) Vulnerability positioning method and device, electronic equipment and storage medium
CN111178512A (en) Device operation neural network test method and device
Calinescu et al. Efficient parametric model checking using domain knowledge
CN113868120A (en) Industrial software debugging method and device, computer equipment and storage medium
EP4300377A1 (en) Quantum circuit compilation method and device, compilation framework and quantum operating system
CN110347588B (en) Software verification method, device, computer equipment and storage medium
CN116560666B (en) AI front end unified computing method, device and medium based on multi-level code generation
US11126408B2 (en) Incremental code generation method
CN113344218A (en) Deployment method and device of machine learning model, electronic equipment and storage medium
CN115033434B (en) Method and device for calculating kernel performance theoretical value and storage medium
CN108205596B (en) Method for realizing simulation function of serious accident analysis and calculation program of nuclear power plant
CN115033212A (en) Avionics system primitive model integrated construction method and device and computer equipment
CN110928761B (en) Demand chain and system and method for application thereof
CN109816097B (en) compression-YOLO model compression method based on YOLO
Zhang et al. A method of automatic code generation based on AADL model
US20230169240A1 (en) Computing device and method generating optimal input data
Xu et al. Predicting effectiveness of generate-and-validate patch generation systems using random forest

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination