CN114707650A - Simulation implementation method for improving simulation efficiency - Google Patents

Simulation implementation method for improving simulation efficiency

Info

Publication number
CN114707650A
Authority
CN
China
Prior art keywords
neural network
floating point
simulation
file
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210321357.4A
Other languages
Chinese (zh)
Inventor
朱旭东
吴春选
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Xiongmai Integrated Circuit Technology Co Ltd
Original Assignee
Hangzhou Xiongmai Integrated Circuit Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Xiongmai Integrated Circuit Technology Co Ltd filed Critical Hangzhou Xiongmai Integrated Circuit Technology Co Ltd
Priority to CN202210321357.4A priority Critical patent/CN114707650A/en
Publication of CN114707650A publication Critical patent/CN114707650A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N3/065 - Analogue means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Neurology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a simulation implementation method for improving simulation efficiency, relating to the technical field of deep learning. The method comprises the following steps: the quantization set pictures quantize the neural network model through the neural network compiler to generate an executable file; the ten-thousand-person test set generates first input data, a first fixed point feature file and a floating point feature file through the neural network compiler; and if the statistical result of the precision table falls within a preset precision range, the executable file and the first input data are read to simulate the neural network model. The method achieves batch simulation of multiple different types of neural network models, guarantees correctness when porting to a chip or an FPGA, simulates the different types of neural network models layer by layer so as to cover more simulation verification points, guards against tape-out risk, and at the same time performs comprehensive precision verification through the precision table gathering statistics on the neural network models.

Description

Simulation implementation method for improving simulation efficiency
This application is a divisional application of application No. 202111653883.2, filed on December 31, 2021, and entitled "A simulation implementation method based on a neural network compiler, a neural network compiler and a computer readable storage medium".
Technical Field
The application belongs to the technical field of deep learning, and particularly relates to a simulation implementation method for improving simulation efficiency.
Background
With the development of internet technology, the mass of collected data provides sufficient scenarios for deep learning training. The development of intelligent algorithms, chiefly convolutional neural networks, depends on this mass data, and their precision now exceeds human recognition precision in fields such as image classification and object recognition.
Neural network algorithms are expected to be deployed in the security field. An algorithm model trained on a server must be translated into a computer language that an embedded chip can recognize, so that security cameras can be conveniently installed and used for monitoring.
The convolutional neural network algorithm, originally run on a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU), is ported to a Field Programmable Gate Array (FPGA) or a chip so that it is easy to carry and install: the computing power of a CPU cannot meet current requirements, and a GPU cannot be placed in an embedded device. The forward process is a 32-bit floating point implementation in the Python or C++ language; to reduce chip area (and thus cost) without losing precision, it must be quantized into an 8-bit fixed point implementation, and the FPGA or chip is implemented in the Verilog hardware description language (HDL). The whole neural network model therefore needs to be simulated and its precision verified.
The current technical solution has the following drawbacks. First, only each intermediate layer of a neural network model can be simulated, the information of each layer must be manually retrieved and configured into a file beforehand, precision testing over a ten-thousand-person test set cannot be performed, and the range distribution of different data sets is not simulated. Second, when another type of neural network model or a test set from a different scene is substituted, the correctness of running on the chip or the FPGA cannot be guaranteed, which raises the tape-out cost; and because the neural network is not quantized, floating point multipliers are used, which degrades running performance.
Disclosure of Invention
The application aims to provide a simulation implementation method for improving simulation efficiency, so as to solve the technical problems in the prior art that only each intermediate layer of a neural network model can be simulated and that precision testing over a ten-thousand-person test set cannot be performed.
In order to achieve the technical purpose, the technical scheme adopted by the application is as follows:
a simulation realization method for improving simulation efficiency comprises the following steps:
the method comprises the steps that a neural network compiler is built and used for receiving a quantization set picture, a plurality of neural network models of different types and a test set of thousands of people, and after the neural network compiler carries out precision verification, the neural network models are simulated layer by layer;
the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-person test set generates first input data, a first floating point feature file and a floating point feature file through the neural network compiler;
comparing the first floating point feature file with the floating point feature file, and outputting a precision table for counting the neural network model;
and if the statistical result of the precision table accords with a preset precision range, reading the executable file and the first input data to simulate the neural network model.
Preferably, the method further comprises the steps of:
building an environment of the neural network compiler, installing the neural network compiler, and testing whether the neural network compiler is installed successfully;
and the building environment of the neural network compiler is set to be the same as the operating system of the simulation system.
Preferably, the quantization set pictures quantize the neural network model through the neural network compiler to generate an executable file, specifically comprising the following steps:
preparing neural network models of different types and quantization set pictures from different scenes;
running the neural network compiler, and quantizing the neural network model according to the quantization set pictures to generate the executable file;
the executable file comprises a neural network name identifier, layer identifiers of the input layer, the intermediate layers and the output layer, quantized weight values, quantized bias values, layer operation names, layer parameter information, layer association information and layer memory information.
Preferably, the method further comprises the steps of:
presetting the number of the neural network models, setting the initial cycle count to 0, and judging whether the cycle count equals the preset number of neural network models;
if the cycle count does not equal the preset number of neural network models, the quantization set pictures quantize the neural network models through the neural network compiler to generate the executable files, and the ten-thousand-person test set generates the first input data, the first fixed point feature files and the floating point feature files through the neural network compiler;
and if the cycle count equals the preset number of neural network models, ending the process.
Preferably, the ten-thousand-person test set generates the first input data, the first fixed point feature file and the floating point feature file through the neural network compiler, specifically comprising the following steps:
preparing different ten-thousand-person test sets according to the different neural network models;
the ten-thousand-person test set is scaled to the network input resolution through a scaling function to generate the first input data, and the ten-thousand-person test set is simulated to generate the first fixed point feature file and the floating point feature file.
Preferably, comparing the first fixed point feature file with the floating point feature file and outputting a precision table gathering statistics on the neural network model specifically comprises the following steps:
the floating point feature file comprises first floating point feature data; the fixed point feature data in the first fixed point feature file are converted into floating point, generating second floating point feature data;
comparing the similarity of the first floating point feature data and the second floating point feature data: if the similarity is within a preset variable, the precision requirement is met; if the similarity is not within the preset variable, the precision requirement is not met;
and outputting the similarity statistics of the first floating point feature data and the second floating point feature data in the form of a table.
Preferably, if the statistical result of the precision table falls within the preset precision range, reading the executable file and the first input data to simulate the neural network model specifically comprises the following steps:
gathering statistics on the precision table, wherein the statistical result must fall within the preset precision range;
reading the executable file, configuring the hardware according to the executable file, reading the first input data, starting the simulation of the neural network model according to the first input data, and generating a second fixed point feature file;
and comparing the first fixed point feature file with the second fixed point feature file, and if they differ, storing the error data in the second fixed point feature file.
Preferably, the method further comprises the steps of:
establishing a first folder, and automatically generating a first main folder under the first folder, wherein the first main folder is used for storing the executable file;
automatically generating a first auxiliary folder under the first folder, wherein the first auxiliary folder is used for storing the first fixed point feature file;
and automatically generating an input data folder under the first folder, wherein the input data folder is used for storing the first input data.
Preferably, preparing different types of neural network models and quantization set pictures specifically includes the following steps:
and establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing the neural network models of different types, the quantization set pictures and the floating point feature files.
Preferably, different ten-thousand people test sets are prepared according to different neural network models, and the method specifically comprises the following steps:
and establishing a second auxiliary folder under the second main folder, wherein the second auxiliary folder is used for storing the ten-thousand people test set.
A neural network compiler, applied to the above simulation implementation method for improving simulation efficiency, comprises: a network analysis module, a network quantization module, a network merging module, a network storage module and a network forward execution module which are connected in sequence;
the network analysis module is used for receiving the quantization set pictures, the plurality of neural network models of different types and the ten-thousand-person test set, analyzing and reconstructing the structure of the neural network model layer by layer, and acquiring at least one of the layer operation names, layer parameter information and layer association information of the input layer, the output layer and the intermediate layers of the neural network model;
the network quantization module is used for generating an offset value and a conversion value according to the reconstructed neural network model and converting the floating point weight values into fixed point weight values;
the network merging module is used for merging the pipeline operation instructions of the convolution layers, pooling layers and activation layers in the neural network model;
the network storage module is used for storing the data in the network analysis module, the network quantification module and the network merging module to generate an executable file;
and the network forward execution module is used for generating the first input data, the first fixed point feature file and the floating point feature file from the ten-thousand-person test set, comparing the first fixed point feature file with the floating point feature file, and outputting a precision table gathering statistics on the neural network model.
A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method described above.
The beneficial effects provided by the application are as follows:
1. The quantization set pictures quantize different neural network models through the neural network compiler to generate different executable files, and if the statistical result of the precision table falls within the preset precision range, the executable files and the first input data are read to simulate the neural network models. Batch simulation of multiple different types of neural network models is achieved, various edge cases are covered in the simulation, and the correctness of the neural network models ported to a chip or an FPGA is guaranteed. The hardware is configured through the executable file, and the different types of neural network models are simulated layer by layer, covering more simulation verification points, guarding against tape-out risk, saving cost and improving simulation efficiency; at the same time, comprehensive precision verification is performed through the precision table gathering statistics on the neural network models.
2. The number of neural network models is preset, the initial cycle count is set to 0, and whether the cycle count equals the preset number of neural network models is judged. Judging by the number of neural network models saves the time for generating the executable files, the first input data, the first fixed point feature files and the floating point feature files, avoiding the time consumed by re-quantizing the neural network models in the forward process. The different generated data are automatically stored in different folders through pre-stored paths, providing the corresponding data for simulating the various types of neural network models, simplifying the simulation flow and speeding up simulation.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of a simulation implementation method for improving simulation efficiency in embodiment 1.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, as generally described and illustrated in the figures herein, could be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Example 1:
As shown in fig. 1, this embodiment provides a simulation implementation method for improving simulation efficiency, comprising the following steps:
Constructing a neural network compiler for receiving the quantization set pictures, a plurality of neural network models of different types and a ten-thousand-person test set, and simulating the neural network models layer by layer after the neural network compiler performs precision verification.
The quantization set pictures quantize the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-person test set generates first input data, a first fixed point feature file and a floating point feature file through the neural network compiler.
The first fixed point feature file is compared with the floating point feature file, and a precision table gathering statistics on the neural network model is output; if the statistical result of the precision table falls within the preset precision range, the executable file and the first input data are read to simulate the neural network model.
As with the beneficial effects above, this achieves batch simulation of multiple different types of neural network models, covers various edge cases, guarantees the correctness of the models ported to a chip or an FPGA, configures the hardware through the executable file, simulates the different types of neural network models layer by layer to cover more verification points, guards against tape-out risk, saves cost, improves simulation efficiency, and performs comprehensive precision verification through the precision table.
The method further comprises the following steps: building the environment of the neural network compiler, installing the neural network compiler, and testing whether the installation succeeded, wherein the build environment of the neural network compiler uses the same operating system as the simulation system. Specifically, the neural network compiler is packaged in the whl (wheel) format, a compressed archive format, which makes it convenient to install and test under the operating system.
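As an illustration only, a minimal sketch of such an installation check follows; the wheel file name and the module name xm_compiler are assumptions, not the compiler's actual identifiers:

```python
import subprocess
import sys

def install_and_check(wheel_path: str = "xm_compiler-1.0-py3-none-any.whl") -> bool:
    """Install the compiler wheel, then test whether the installation succeeded.
    The wheel and module names are hypothetical placeholders."""
    subprocess.run([sys.executable, "-m", "pip", "install", wheel_path], check=True)
    try:
        import xm_compiler  # noqa: F401  (hypothetical module name)
        return True
    except ImportError:
        return False
```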
The quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, and the method specifically comprises the following steps: different types of neural network models and quantized set pictures in different scenes are prepared.
And operating a neural network compiler, and quantizing the neural network model according to the quantization set picture to generate an executable file. The executable file comprises a neural network name identifier, a layer identifier of an input layer, a layer identifier of a middle layer, a layer identifier of an output layer, a quantized weight value, a quantized deviation value, a layer operation name, layer parameter information, layer association information and layer internal storage information.
Specifically, the network analysis module of the neural network compiler analyzes and reconstructs the structure of the original neural network model layer by layer; an offset value and a conversion value are generated according to the reconstructed neural network model, and the floating point weight values are converted into fixed point weight values. The network merging module and the network quantization module run at the same time, merging the pipeline operation instructions of the convolution layers, pooling layers and activation layers in the neural network model. The network storage module generates an executable file from the data produced by the network analysis module, the network quantization module and the network merging module.
The offset value is generated by the following formula:

Formula I: x'_m = (x'_max - x'_min) * 2^bw

where x'_m denotes the offset value, x'_max denotes the maximum floating point weight value, x'_min denotes the minimum floating point weight value, and bw denotes the conversion bit width; in this embodiment a bit width of 12 bits is currently supported.
The formula for generating the conversion value is as follows:

Formula II: f = max(bw - (ceil(log2(x'_m)) + 1), bw)

where f denotes the conversion value, max denotes the maximum-value built-in function of the system library, bw denotes the conversion bit width, log2 denotes the base-2 logarithm built-in function of the system library, x'_m denotes the offset value, and ceil, a built-in function of the system library, denotes rounding up.
The floating point weight values are converted into fixed point weight values; the formula converting floating point feature data into fixed point feature data is expressed as follows:

Formula III: X = round(X_float * 2^f) + x'_m

where X denotes the fixed point feature data (in this embodiment, a fixed point weight value), X_float denotes the floating point feature data (in this embodiment, a floating point weight value), round denotes the rounding built-in function of the system library, f denotes the conversion value, and x'_m denotes the offset value.
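For illustration, formulas I through III can be sketched in Python as follows; this is a non-authoritative sketch assuming NumPy weight arrays, and the function names are illustrative, not the compiler's actual API:

```python
import math
import numpy as np

BW = 12  # conversion bit width supported in this embodiment

def offset_value(w_float: np.ndarray, bw: int = BW) -> float:
    # Formula I: offset value from the floating point weight range.
    return (float(w_float.max()) - float(w_float.min())) * 2 ** bw

def conversion_value(x_m: float, bw: int = BW) -> int:
    # Formula II, as written: conversion value from offset value and bit width.
    return max(bw - (math.ceil(math.log2(x_m)) + 1), bw)

def to_fixed(w_float: np.ndarray, f: int, x_m: float) -> np.ndarray:
    # Formula III: floating point feature data -> fixed point feature data.
    return np.round(w_float * 2 ** f) + x_m
```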
Specifically, the layer operation name comprises at least one of convolution, deconvolution, pooling, full connection, cropping, concatenation, point addition, point multiplication, normalization and activation. The layer parameter information comprises at least one of convolution kernel size, convolution kernel stride, grouping, padding value, whether an activation layer is present, quantized weight values and quantized bias values. The layer association information comprises at least one of the operation name and layer parameter information of the layer feeding the current layer, and the operation name and layer parameter information of the current layer itself. The layer memory information comprises at least the memory size of the current layer and whether the memory of other layers is reused.
Specifically, the different types of neural network models include detection networks, recognition networks, classification networks and the like, and at least 50 quantization set pictures are collected from the different scenes.
The method further comprises the following steps: presetting the number of neural network models, setting the initial cycle count to 0, and judging whether the cycle count equals the preset number of neural network models.
If the cycle count does not equal the preset number of neural network models, the quantization set pictures quantize the neural network models through the neural network compiler to generate executable files, and the ten-thousand-person test set generates the first input data, the first fixed point feature files and the floating point feature files through the neural network compiler.
If the cycle count equals the preset number of neural network models, the process ends. Each time an executable file is simulated, the cycle count is increased by 1.
Judging by the number of neural network models saves the time for generating the executable files, the first input data, the first fixed point feature files and the floating point feature files, avoiding the time consumed by re-quantizing the neural network models in the forward process. A sketch of this loop follows.
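The driver loop below is an illustrative sketch only; the quantize, forward and simulate callables are stand-ins for the compiler steps described above, not real APIs:

```python
from typing import Callable, Sequence

def batch_simulate(models: Sequence[str],
                   quantize: Callable[[str], object],
                   forward: Callable[[str], tuple],
                   simulate: Callable[[object, object], None]) -> None:
    # Preset the number of models; the cycle count starts at 0 and the
    # process ends once it equals the preset number, one executable per cycle.
    preset = len(models)
    cycles = 0
    while cycles != preset:
        model = models[cycles]
        executable = quantize(model)                       # executable file
        first_input, fixed_pt, float_pt = forward(model)   # feature files
        simulate(executable, first_input)
        cycles += 1
```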
The ten-thousand-person test set generates the first input data, the first fixed point feature file and the floating point feature file through the neural network compiler, specifically comprising the following steps:
Different ten-thousand-person test sets are prepared according to the different neural network models; each test set is scaled to the network input resolution through a scaling function to generate the first input data, and the test set is simulated to generate the first fixed point feature file and the floating point feature file.
Specifically, the ten-thousand-person test set is a picture set of ten thousand pictures, and it generates the first input data, the first fixed point feature file and the floating point feature file through the network forward execution module.
Further comprising the steps of: and establishing a first folder, and automatically generating a first main folder under the first folder, wherein the first main folder is used for storing executable files.
And automatically generating a first auxiliary folder under the first folder, wherein the first auxiliary folder is used for storing the first fixed point characteristic file. And automatically generating an input data folder under the first folder, wherein the input data folder is used for storing the first input data.
Preparing the different types of neural network models and the quantization set pictures specifically comprises the following steps: establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing the neural network models of different types, the quantization set pictures and the floating point feature files.
Preparing different ten-thousand-person test sets according to different neural network models, and specifically comprising the following steps of: and establishing a second auxiliary folder under the second main folder, wherein the second auxiliary folder is used for storing the ten-thousand-person test set.
Specifically, under the current path, a first folder and a second folder are established; the first folder is named SPE_PATH1 and the second SPE_PATH2. Under SPE_PATH2, a second main folder named after the neural network is established to store the neural network model trained on the GPU and the quantization set pictures, and a second auxiliary folder is established under the second main folder to store the ten-thousand-person test set.
When an executable file is generated, the neural network compiler creates, under SPE_PATH1, a first main folder named after the parsed neural network name, which stores the executable file generated by the neural network compiler.
An input data folder is automatically generated under SPE_PATH1. In this embodiment the neural network name parsed by the compiler is resnet, so the input data folder is SPE_PATH1/resnet/data_input; it stores the first input data, generated from the ten-thousand-person test set through the scaling function at the network input resolution. For ease of simulation, the data are stored in hexadecimal, one value per row.
A first auxiliary folder is automatically generated under SPE_PATH1. With the parsed network name resnet, the network layer name conv1_1 and the layer number 1, the first auxiliary folder is SPE_PATH1/resnet/conv1_1_1; it stores the first fixed point feature files generated by the intermediate layers and the output layer during simulation of the ten-thousand-person test set, so that the correctness of the simulation data can be verified. A sketch of this layout follows.
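Purely as an illustration of the layout just described (the second auxiliary folder name test_set is an assumption, as the text does not name it):

```python
import os

NET = "resnet"  # neural network name parsed by the compiler in this embodiment

# SPE_PATH1: executable files, input data, first fixed point feature files.
os.makedirs(os.path.join("SPE_PATH1", NET), exist_ok=True)                 # first main folder
os.makedirs(os.path.join("SPE_PATH1", NET, "data_input"), exist_ok=True)   # first input data
os.makedirs(os.path.join("SPE_PATH1", NET, "conv1_1_1"), exist_ok=True)    # first auxiliary folder

# SPE_PATH2: models, quantization set pictures, floating point feature files.
os.makedirs(os.path.join("SPE_PATH2", NET), exist_ok=True)                 # second main folder
os.makedirs(os.path.join("SPE_PATH2", NET, "test_set"), exist_ok=True)     # second auxiliary folder (name assumed)

def write_hex(path: str, values) -> None:
    # Input data are stored in hexadecimal, one value per row, to ease simulation.
    with open(path, "w") as fp:
        for v in values:
            fp.write(f"{int(v) & 0xFFF:03x}\n")  # assuming the 12-bit width of this embodiment
```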
The generated different data are automatically stored in different folders through prestored paths, corresponding data are provided for realizing simulation of various types of neural network models, the simulation flow is simplified, and the simulation efficiency is accelerated.
The method further comprises the following steps: presetting the number of executable files, and judging whether the number of executable files in the first main folder exceeds the preset number.
If the number of executable files in the first main folder does not exceed the preset number, the neural network compiler simulates the ten-thousand-person test set to generate the first fixed point feature file.
If the number of executable files in the first main folder exceeds the preset number, the process of simulating the ten-thousand-person test set with the neural network compiler ends.
Whether the simulation of the ten-thousand-person test set is finished is determined by judging the number of executable files in the first main folder; if it is finished, the simulation process ends, which improves simulation efficiency.
Comparing the first fixed point feature file with the floating point feature file and outputting a precision table gathering statistics on the neural network model specifically comprises the following steps:
the floating point feature file comprises first floating point feature data; the fixed point feature data in the first fixed point feature file are converted into floating point, generating second floating point feature data;
the similarity of the first floating point feature data and the second floating point feature data is compared: if the similarity is within a preset variable, the precision requirement is met; if not, the precision requirement is not met;
and the similarity statistics of the first floating point feature data and the second floating point feature data are output in the form of a table.
Specifically, the fixed point feature data in the first fixed point feature file are converted into floating point feature data through the following conversion formula:

Formula IV: x'_float = (X - x'_m) / 2^f

where x'_float denotes the converted floating point feature data (in this embodiment, the second floating point feature data), X denotes the fixed point feature data (in this embodiment, from the first fixed point feature file), x'_m denotes the offset value, and f denotes the conversion value.
Specifically, the similarity of the first floating point feature data and the second floating point feature data is compared using the following similarity distance formula:

Formula V: theta = sum_{i=1..n}(x_i * y_i) / ( sqrt(sum_{i=1..n} x_i^2) * sqrt(sum_{i=1..n} y_i^2) )

where n denotes the total number of floating point feature data, x_i denotes the first floating point feature data, y_i denotes the second floating point feature data (i.e. the value of x'_float in formula IV), and theta denotes the similarity distance; the closer theta is to 1, the higher the precision.
In this embodiment, the ten-thousand-person test set corresponding to each neural network model is tested, and the preset variable is set to a similarity distance of 0.8. The similarity of the first floating point feature data and the second floating point feature data is compared, that is, the similarity of each picture in the ten-thousand-person test set is computed; when the similarity distance is greater than or equal to 0.8, the precision requirement is met and the picture is counted. The percentage of counted pictures for each neural network model over its ten-thousand-person test set is then computed, and the precision table gathering statistics on the neural network models is output. The statistical result of the precision table shows at a glance whether the hardware design meets the precision requirement; a sketch follows.
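A minimal illustrative sketch of formulas IV and V together with the statistics just described, assuming NumPy arrays; the function names are illustrative:

```python
import numpy as np

def to_float(x_fixed: np.ndarray, x_m: float, f: int) -> np.ndarray:
    # Formula IV: fixed point feature data -> second floating point feature data.
    return (x_fixed - x_m) / 2 ** f

def similarity(x: np.ndarray, y: np.ndarray) -> float:
    # Formula V: similarity distance; closer to 1 means higher precision.
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def precision_percentage(pairs, threshold: float = 0.8) -> float:
    # Count the pictures whose similarity meets the preset variable (0.8 here)
    # and return the passing percentage over the ten-thousand-person test set.
    passed = sum(1 for x, y in pairs if similarity(x, y) >= threshold)
    return 100.0 * passed / len(pairs)
```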
If the statistical result of the precision table falls within the preset precision range, reading the executable file and the first input data to simulate the neural network model specifically comprises the following steps:
Statistics are gathered on the precision table, and the statistical result must fall within the preset precision range. The executable file is read, the hardware is configured according to the executable file, the first input data is read, the simulation of the neural network model is started according to the first input data, and a second fixed point feature file is generated.
The first fixed point feature file is compared with the second fixed point feature file; if they differ, the error data is stored in the second fixed point feature file, as in the sketch below.
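A minimal sketch of such a file comparison, assuming one hexadecimal value per row as in the input data format above; the file paths are illustrative:

```python
def compare_fixed_point(first_path: str, second_path: str, error_path: str) -> bool:
    # Compare the first and second fixed point feature files row by row;
    # when they differ, store the error data to help locate simulation problems.
    with open(first_path) as a, open(second_path) as b:
        mismatches = [(i, x.strip(), y.strip())
                      for i, (x, y) in enumerate(zip(a, b))
                      if x.strip() != y.strip()]
    if mismatches:
        with open(error_path, "w") as out:
            for i, expected, got in mismatches:
                out.write(f"row {i}: expected {expected}, got {got}\n")
    return not mismatches
```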
The error data in the second fixed point feature file makes it convenient to locate problems in the simulation, improving simulation efficiency and broadening simulation coverage.
Example 2:
This embodiment provides a neural network compiler applied to the simulation implementation method for improving simulation efficiency of embodiment 1, comprising: a network analysis module, a network quantization module, a network merging module, a network storage module and a network forward execution module which are connected in sequence.
And the network analysis module is used for receiving the quantization set picture, a plurality of different types of neural network models and a ten-thousand-person test set, analyzing and reconstructing the structure of the neural network model layer by layer, and at least acquiring one of the layer operation names, the layer parameter information and the layer associated information of the input layer, the output layer and the middle layer of the neural network model.
Specifically, the network analysis module analyzes the structure of the original neural network model layer by layer, obtaining at least one of the layer operation names, layer parameter information and layer association information of the input layer, the output layer and the intermediate layers; after analysis it reconstructs the internal sequential execution structure and redefines the data structures of the internal network layers (convolution layers, pooling layers and activation layers), filling the layer execution order, layer operation type, layer operation name, layer parameter information and layer association information into those data structures.
The network quantization module is used for generating the offset value and the conversion value according to the reconstructed neural network model and converting the floating point weight values into fixed point weight values.
Specifically, the floating point feature data stored in the address space are converted into a data format supported by the hardware, and the conversion value is calculated, which reduces the hardware computation and the number of multipliers.
The network merging module is used for merging the pipeline operation instructions of the convolution layers, pooling layers and activation layers in the neural network model.
Specifically, following the principle of reducing external memory bandwidth, the pipeline operation instructions of the convolution, pooling and activation layers are optimized: the layers undergo equivalent-transformation optimization, and the internal data structures are re-optimized and merged, reducing resource consumption and improving execution efficiency. Data interaction between internal and external memory is reduced, improving bandwidth utilization, and layers in the same pipeline stage are merged, chiefly the convolution and pooling layers.
And the network storage module is used for storing the data in the network analysis module, the network quantification module and the network merging module to generate an executable file.
The network forward execution module is used for generating the first input data, the first fixed point feature file and the floating point feature file from the ten-thousand-person test set, comparing the first fixed point feature file with the floating point feature file, and outputting the precision table gathering statistics on the neural network model.
Specifically, the reference part is executed with an open-source deep learning framework to guarantee a correct comparison baseline, while the simulation part keeps the network forward logic consistent with the hardware execution logic, guaranteeing that the data and the hardware simulation results agree. The module pipeline is sketched below.
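A skeletal sketch of the five sequentially connected modules; the class and method names are assumptions for illustration, and the bodies are placeholders rather than the actual compiler:

```python
class NeuralNetworkCompiler:
    # Sketch of embodiment 2: five modules connected in sequence.

    def parse(self, model):
        # Network analysis module: rebuild the layer-by-layer structure.
        return {"layers": model}

    def quantize(self, graph, quant_pics):
        # Network quantization module: offset/conversion values, fixed point weights.
        graph["quantized"] = True
        return graph

    def merge(self, graph):
        # Network merging module: fuse conv/pool/activation pipeline instructions.
        graph["merged"] = True
        return graph

    def store(self, graph):
        # Network storage module: serialize everything into the executable file.
        return ("executable", graph)

    def forward(self, graph, test_set):
        # Network forward execution module: first input data plus feature files.
        return ([], [], [])

    def run(self, model, quant_pics, test_set):
        graph = self.merge(self.quantize(self.parse(model), quant_pics))
        return self.store(graph), self.forward(graph, test_set)
```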
For relevant points, see the description of example 1.
Example 3:
A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of embodiment 1.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that:
reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the application. Thus, the appearances of the phrase "one embodiment" or "an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the scope of the present application.
In addition, it should be noted that the specific embodiments described in the present specification may be different in terms of the parts, the shapes of the components, the names of the components, and the like. All equivalent or simple changes in the structure, characteristics and principles as described in the patent concept are included in the protection scope of the present patent. Various modifications, additions and substitutions for the specific embodiments described herein may occur to those skilled in the art without departing from the scope and spirit of the invention as defined by the accompanying claims.

Claims (9)

1. A simulation implementation method for improving simulation efficiency, characterized by comprising the following steps:
constructing a neural network compiler, receiving quantization set pictures, a plurality of neural network models of different types and a ten-thousand-person test set, building an environment of the neural network compiler, and installing the neural network compiler;
after the neural network compiler performs precision verification, simulating the neural network model layer by layer;
the quantization set pictures quantize the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-person test set generates first input data, a first fixed point feature file and a floating point feature file through the neural network compiler;
comparing the first fixed point feature file with the floating point feature file, and outputting a precision table gathering statistics on the neural network model;
if the statistical result of the precision table falls within a preset precision range, reading the executable file and the first input data to simulate the neural network model;
presetting the number of the neural network models, setting the initial cycle count to 0, and judging whether the cycle count equals the preset number of neural network models;
if the cycle count does not equal the preset number of neural network models, the quantization set pictures quantize the neural network models through the neural network compiler to generate the executable files, and the ten-thousand-person test set generates the first input data, the first fixed point feature files and the floating point feature files through the neural network compiler;
if the cycle count equals the preset number of neural network models, ending the process;
the quantization set pictures are pictures collected in different scenes for the different types of neural network models, and the ten-thousand-person test set is a picture set.
2. The simulation implementation method for improving simulation efficiency according to claim 1, further comprising the steps of:
testing whether the neural network compiler is successfully installed;
and the build environment of the neural network compiler uses the same operating system as the simulation system.
3. The simulation implementation method for improving simulation efficiency according to claim 1, wherein the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, specifically comprising the following steps:
preparing neural network models of different types and quantization set pictures from different scenes;
operating the neural network compiler, and quantizing the neural network model according to the quantization set picture to generate the executable file;
the executable file comprises a neural network name identifier, layer identifiers of the input layer, the intermediate layers and the output layer, quantized weight values, quantized bias values, layer operation names, layer parameter information, layer association information and layer memory information.
4. The simulation implementation method for improving simulation efficiency according to claim 1, wherein the ten-thousand-person test set generates the first input data, the first fixed point feature file and the floating point feature file through the neural network compiler, specifically comprising the following steps:
preparing different ten-thousand-person test sets according to the different neural network models;
the ten-thousand-person test set is scaled to the network input resolution through a scaling function to generate the first input data, and the ten-thousand-person test set is simulated to generate the first fixed point feature file and the floating point feature file.
5. The simulation implementation method for improving simulation efficiency according to claim 1, wherein comparing the first fixed point feature file with the floating point feature file and outputting a precision table gathering statistics on the neural network model specifically comprises the following steps:
the floating point feature file comprises first floating point feature data; the fixed point feature data in the first fixed point feature file are converted into floating point, generating second floating point feature data;
comparing the similarity of the first floating point feature data and the second floating point feature data: if the similarity is within a preset variable, the precision requirement is met; if the similarity is not within the preset variable, the precision requirement is not met;
and outputting the similarity statistics of the first floating point feature data and the second floating point feature data in the form of a table.
6. The simulation implementation method for improving simulation efficiency according to claim 1, wherein if the statistical result of the precision table falls within a preset precision range, reading the executable file and the first input data to simulate the neural network model specifically comprises the following steps:
gathering statistics on the precision table, wherein the statistical result must fall within the preset precision range;
reading the executable file, configuring the hardware according to the executable file, reading the first input data, starting the simulation of the neural network model according to the first input data, and generating a second fixed point feature file;
and comparing the first fixed point feature file with the second fixed point feature file, and if they differ, storing the error data in the second fixed point feature file.
7. The simulation implementation method for improving simulation efficiency according to claim 1, further comprising the steps of:
establishing a first folder, and automatically generating a first main folder under the first folder, wherein the first main folder is used for storing the executable file;
automatically generating a first auxiliary folder under the first folder, wherein the first auxiliary folder is used for storing the first fixed point feature file;
and automatically generating an input data folder under the first folder, wherein the input data folder is used for storing the first input data.
8. The simulation implementation method for improving simulation efficiency according to claim 3, wherein the preparation of different types of neural network models and quantization set pictures specifically comprises the following steps:
and establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing the neural network models of different types, the quantization set pictures and the floating point feature files.
9. The simulation implementation method for improving simulation efficiency according to claim 4, wherein preparing different ten-thousand-person test sets according to the different neural network models specifically comprises the following steps:
establishing a second folder, generating a second main folder under the second folder, and establishing a second auxiliary folder under the second main folder, wherein the second auxiliary folder is used for storing the ten-thousand-person test set.
CN202210321357.4A 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency Pending CN114707650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210321357.4A CN114707650A (en) 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210321357.4A CN114707650A (en) 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency
CN202111653883.2A CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202111653883.2A Division CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114707650A true CN114707650A (en) 2022-07-05

Family

ID=79932421

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202210315323.4A Pending CN114676830A (en) 2021-12-31 2021-12-31 Simulation implementation method based on neural network compiler
CN202111653883.2A Active CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium
CN202210321357.4A Pending CN114707650A (en) 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN202210315323.4A Pending CN114676830A (en) 2021-12-31 2021-12-31 Simulation implementation method based on neural network compiler
CN202111653883.2A Active CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium

Country Status (1)

Country Link
CN (3) CN114676830A (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114386588B (en) * 2022-03-23 2022-07-29 杭州雄迈集成电路技术股份有限公司 Neural network reasoning method and system
CN116796674B (en) * 2023-08-24 2023-11-24 上海合见工业软件集团有限公司 Heterogeneous hardware simulation method and system
CN117034822B (en) * 2023-10-10 2023-12-15 北京云枢创新软件技术有限公司 Verification method based on three-step simulation, electronic equipment and medium


Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103929210B (en) * 2014-04-25 2017-01-11 重庆邮电大学 Hard decision decoding method based on genetic algorithm and neural network
US10643126B2 (en) * 2016-07-14 2020-05-05 Huawei Technologies Co., Ltd. Systems, methods and devices for data quantization
US20190265955A1 (en) * 2016-07-21 2019-08-29 Ramot At Tel-Aviv University Ltd. Method and system for comparing sequences
CN108510067B (en) * 2018-04-11 2021-11-09 西安电子科技大学 Convolutional neural network quantification method based on engineering realization
WO2019200548A1 (en) * 2018-04-17 2019-10-24 深圳鲲云信息科技有限公司 Network model compiler and related product
CN109102064B (en) * 2018-06-26 2020-11-13 杭州雄迈集成电路技术股份有限公司 High-precision neural network quantization compression method
CN109740302B (en) * 2019-04-02 2020-01-10 深兰人工智能芯片研究院(江苏)有限公司 Simulation method and device of neural network
CN110795165A (en) * 2019-10-12 2020-02-14 苏州浪潮智能科技有限公司 Neural network model data loading method and related device
CN113272813B (en) * 2019-10-12 2023-05-05 深圳鲲云信息科技有限公司 Custom data stream hardware simulation method, device, equipment and storage medium
CN110750945B (en) * 2019-12-25 2020-11-13 安徽寒武纪信息科技有限公司 Chip simulation method and device, simulation chip and related product
CN111178512B (en) * 2019-12-31 2023-04-18 中科南京人工智能创新研究院 Device operation neural network test method and device
CN113326930B (en) * 2020-02-29 2024-05-03 华为技术有限公司 Data processing method, neural network training method, related device and equipment
CN111401550A (en) * 2020-03-10 2020-07-10 北京迈格威科技有限公司 Neural network model quantification method and device and electronic equipment
CN112232497A (en) * 2020-10-12 2021-01-15 苏州浪潮智能科技有限公司 Method, system, device and medium for compiling AI chip
CN112446491B (en) * 2021-01-20 2024-03-15 上海齐感电子信息科技有限公司 Real-time automatic quantification method and real-time automatic quantification system for neural network model
CN113159276B (en) * 2021-03-09 2024-04-16 北京大学 Model optimization deployment method, system, equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190340492A1 (en) * 2018-05-04 2019-11-07 Microsoft Technology Licensing, Llc Design flow for quantized neural networks
US20200193273A1 (en) * 2018-12-14 2020-06-18 Microsoft Technology Licensing, Llc Residual quantization for neural networks
CN113228056A (en) * 2019-10-12 2021-08-06 深圳鲲云信息科技有限公司 Runtime hardware simulation method, device, equipment and storage medium
CN111523526A (en) * 2020-07-02 2020-08-11 杭州雄迈集成电路技术股份有限公司 Target detection method, computer equipment and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YU FEI; ZHAO JIE; WANG JINGXIA; WEN GUOZHONG; SONG RONG: "Matlab Modeling and Hardware Implementation of a BP Neural Network Character Recognition System", Journal of Shenzhen Polytechnic, no. 03, 20 May 2019 (2019-05-20) *

Also Published As

Publication number Publication date
CN114004352B (en) 2022-04-26
CN114676830A (en) 2022-06-28
CN114004352A (en) 2022-02-01

Similar Documents

Publication Publication Date Title
CN114004352B (en) Simulation implementation method, neural network compiler and computer readable storage medium
US20220283820A1 (en) Data parallelism in distributed training of artificial intelligence models
US11354579B2 (en) Dynamic multi-layer execution for artificial intelligence modeling
WO2021233069A1 (en) Quantization training and image processing methods and devices, and storage medium
US20220276871A1 (en) Executing large artificial intelligence models on memory-constrained devices
US8271252B2 (en) Automatic verification of device models
CN114548384A (en) Method and device for constructing impulse neural network model with abstract resource constraint
CN114429208A (en) Model compression method, device, equipment and medium based on residual structure pruning
CN114490065A (en) Load prediction method, device and equipment
US20220358269A1 (en) Simulation execution system, simulation execution method, and computer readable medium
CN112465141A (en) Model compression method, model compression device, electronic device and medium
CN113168552A (en) Artificial intelligence application development system, computer device and storage medium
CN110807428A (en) Coal sample identification method and device, server and storage medium
CN114492742A (en) Neural network structure searching method, model issuing method, electronic device, and storage medium
CN113228056B (en) Runtime hardware simulation method, device, equipment and storage medium
US20240161474A1 (en) Neural Network Inference Acceleration Method, Target Detection Method, Device, and Storage Medium
Anuradha et al. Efficient workload characterization technique for heterogeneous processors
Sumeet et al. Performance Evaluation of GraphCore IPU-M2000 Accelerator for Text Detection Application
CN113272813B (en) Custom data stream hardware simulation method, device, equipment and storage medium
CN114077884A (en) Model conversion optimization device and method of deep learning model and readable storage medium
CN115562969B (en) Simulation evaluation method, system, electronic device and medium for neural network processor
CN117931211A (en) Model deployment method, device, apparatus, chip and storage medium
US20240104363A1 (en) Method and apparatus for the joint optimization of a neural network and dedicated hardware for the neural network
US20220366267A1 (en) Performance Modeling and Analysis of Artificial Intelligence (AI) Accelerator Architectures
Westby FPGA Acceleration on Multilayer Perceptron (MLP) Neural Network for Handwritten Digit Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China
Address after: 311400 4th floor, building 9, Yinhu innovation center, No.9 Fuxian Road, Yinhu street, Fuyang District, Hangzhou City, Zhejiang Province
Applicant after: Zhejiang Xinmai Microelectronics Co.,Ltd.

Address before: 311400 4th floor, building 9, Yinhu innovation center, No.9 Fuxian Road, Yinhu street, Fuyang District, Hangzhou City, Zhejiang Province
Applicant before: Hangzhou xiongmai integrated circuit technology Co.,Ltd.
Country or region before: China