CN111860817A - Network model deployment method, device, equipment and readable storage medium - Google Patents


Info

  • Publication number: CN111860817A
  • Application number: CN202010663000.5A
  • Authority: CN (China)
  • Other languages: Chinese (zh)
  • Inventors: 赵红博, 阚宏伟, 赵谦谦
  • Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
  • Current assignee: Suzhou Inspur Intelligent Technology Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
  • Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
  • Prior art keywords: file, network model, deployment, target, acquiring
Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a network model deployment method, apparatus, device and computer-readable storage medium, wherein the method comprises the following steps: acquiring a network model file produced with a target framework; determining a target parsing manner according to the target framework, and parsing the network model file according to the target parsing manner to obtain an intermediate file; obtaining a structure file from the intermediate file, and performing format conversion on the structure file to obtain a standard structure file and a control file adapted to the FPGA device; acquiring a plurality of operator codes, and generating a bitstream file from the plurality of operator codes, the standard structure file and the control file; and storing the bitstream file in the FPGA device to complete the deployment of the network model on the FPGA device. Through these two conversions, the method enables rapid deployment of a network model of any framework on the FPGA device, improving deployment flexibility and deployment speed.

Description

Network model deployment method, device, equipment and readable storage medium
Technical Field
The present invention relates to the field of deep learning technologies, and in particular, to a network model deployment method, a network model deployment apparatus, a network model deployment device, and a computer-readable storage medium.
Background
To increase the data-processing speed of a neural network model, many manufacturers deploy the model on an FPGA device and use the device for high-speed computation. In the related art, after a neural network model under one framework has been deployed to an FPGA device, deploying a model under a different framework requires substantial extra work to make the FPGA device support the new framework; for example, the FPGA device itself may need to be adjusted. The related art is therefore inflexible and slow when deploying neural network models.
Therefore, how to overcome the poor deployment flexibility and slow deployment speed of the related art is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present invention provides a network model deployment method, a network model deployment apparatus, a network model deployment device, and a computer-readable storage medium, which solve the problems of poor deployment flexibility and slow deployment speed in the related art.
In order to solve the above technical problem, the present invention provides a network model deployment method, including:
acquiring a network model file produced with a target framework;
determining a target parsing manner according to the target framework, and parsing the network model file according to the target parsing manner to obtain an intermediate file;
obtaining a structure file from the intermediate file, and performing format conversion on the structure file to obtain a standard structure file and a control file adapted to the FPGA device;
acquiring a plurality of operator codes, and generating a bitstream file from the operator codes, the standard structure file and the control file;
and storing the bitstream file in the FPGA device to complete the deployment of the network model on the FPGA device.
Optionally, acquiring the plurality of operator codes comprises:
identifying a plurality of computation operators in the network model file;
and acquiring a communication operator, and encoding the communication operator and the plurality of computation operators to obtain the plurality of operator codes.
Optionally, determining the target parsing manner according to the target framework includes:
determining the storage mode of the network model file;
and determining the target parsing manner according to the target framework and the storage mode.
Optionally, obtaining the structure file from the intermediate file includes:
performing network-layer compression and optimization on the intermediate file to obtain an optimized file;
and obtaining the structure file from the optimized file.
Optionally, the method further comprises:
acquiring a weight file corresponding to the network model file by using the intermediate file;
and acquiring input data, and sending the input data and the weight file to the FPGA equipment so that the FPGA equipment can perform data processing on the input data according to the weight file.
Optionally, the acquiring input data includes:
acquiring original data;
and preprocessing the original data to obtain the input data.
Optionally, the method further comprises:
and acquiring a processing result obtained after the FPGA equipment performs data processing on the input data.
The invention also provides a network model deployment device, comprising:
an acquisition module, configured to acquire a network model file produced with a target framework;
a parsing module, configured to determine a target parsing manner according to the target framework, and parse the network model file according to the target parsing manner to obtain an intermediate file;
a conversion module, configured to obtain a structure file from the intermediate file, and perform format conversion on the structure file to obtain a standard structure file and a control file adapted to the FPGA device;
a generating module, configured to acquire a plurality of operator codes, and generate a bitstream file from the operator codes, the standard structure file and the control file;
and a deployment module, configured to store the bitstream file in the FPGA device to complete the deployment of the network model on the FPGA device.
The invention also provides a network model deployment device, comprising a memory and a processor, wherein:
the memory is used for storing a computer program;
the processor is configured to execute the computer program to implement the network model deployment method.
The present invention also provides a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the network model deployment method described above.
The network model deployment method provided by the invention comprises: obtaining a network model file produced with a target framework; determining a target parsing manner according to the target framework, and parsing the network model file according to the target parsing manner to obtain an intermediate file; obtaining a structure file from the intermediate file, and performing format conversion on the structure file to obtain a standard structure file and a control file adapted to the FPGA device; acquiring a plurality of operator codes, and generating a bitstream file from the plurality of operator codes, the standard structure file and the control file; and storing the bitstream file in the FPGA device to complete the deployment of the network model on the FPGA device.
Therefore, after obtaining the network model file, the method parses it in the target parsing manner corresponding to the target framework to obtain the intermediate file, so that a network model file under any framework can be converted into an intermediate file with the same format. The intermediate file is then used to obtain a structure file describing the structure of the network model. Through format conversion, a standard structure file and a control file adapted to the FPGA device are obtained: the standard structure file describes the structure of the network model, and the control file describes the special operations in the network structure. After the operator codes corresponding to the operators in the network model are obtained, the operator codes, the standard structure file and the control file are used to generate the bitstream file required by the FPGA device, and the deployment of the network model on the FPGA device is completed by storing the bitstream file in the FPGA device. Through two conversions (converting the network model file under the target framework into the intermediate file, then using the intermediate file to obtain the bitstream file), a network model file of any framework can be converted into a bitstream file, realizing rapid deployment of a network model of any framework on the FPGA device, improving deployment flexibility and deployment speed, and solving the problems of poor deployment flexibility and low deployment speed in the related art.
In addition, the invention also provides a network model deployment apparatus, a network model deployment device and a computer-readable storage medium, which have the same beneficial effects.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
Fig. 1 is a flowchart of a network model deployment method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a network model deployment apparatus according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a network model deployment device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a possible implementation manner, please refer to fig. 1, where fig. 1 is a flowchart of a network model deployment method according to an embodiment of the present invention. The method comprises the following steps:
s101: and acquiring the network model file obtained by adopting the target frame.
Some or all of the steps in this embodiment may be performed by a network model deployment device that supports an FPGA device; the network model deployment device may be referred to as a host. The network model deployment device may itself contain the FPGA device, or may provide an interface through which it can exchange data with the FPGA device. A network model file corresponds to a network model, and its specific contents differ with the structure, framework, contents and other characteristics of the network model. In this embodiment, the network model file is produced with a target framework; the target framework may be any architecture, selected according to actual needs, for example the TensorFlow framework or another framework such as PyTorch.
The specific manner of determining the target framework is not limited in this embodiment. For example, after the network model file is acquired, its architecture may be identified and determined to be the target framework; alternatively, framework information sent by the user may be obtained and used to determine the framework adopted by the network model file. The network model file may be generated by the user on the network model deployment device, may be received from another device or terminal, or may be read from a storage device through a preset port.
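As one way of making the framework identification above concrete, a host-side sketch might infer the framework from the model file's extension. The extension table and function name here are illustrative assumptions, not part of the patent:

```python
import os

# Hypothetical mapping from file extension to source framework.
FRAMEWORK_BY_EXTENSION = {
    ".pb": "tensorflow",
    ".h5": "tensorflow",      # Keras HDF5 checkpoints
    ".pth": "pytorch",
    ".pt": "pytorch",
    ".caffemodel": "caffe",
    ".onnx": "onnx",
}

def identify_framework(model_path: str) -> str:
    """Return the framework a model file was produced with, or raise ValueError."""
    _, ext = os.path.splitext(model_path)
    try:
        return FRAMEWORK_BY_EXTENSION[ext.lower()]
    except KeyError:
        raise ValueError(f"unrecognized model file extension: {ext!r}")
```

In practice the framework could equally be supplied by the user, as the embodiment notes; extension sniffing is only one possible policy.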
S102: and determining a target analysis mode according to the target frame, and analyzing the network model file according to the target analysis mode to obtain an intermediate file.
And after the network model file is obtained, determining a corresponding target analysis mode according to a target frame corresponding to the network model file. The target analysis mode is used for analyzing the network model file so as to obtain an intermediate file with a fixed format and complete the first conversion of the network model file. The network model file is analyzed according to the target analysis mode, all information corresponding to the network model file can be accurately obtained, the information is utilized to generate an intermediate file, all information of the network model file, including structure information, weight information and the like, is recorded in the intermediate file, and the network model files with different architectures are converted into the intermediate file in a unified format so as to be converted for the second time in the follow-up process.
In one embodiment, multiple analysis modes can be preset in advance, a corresponding target analysis mode is selected to analyze the network model file after the target architecture is determined, and the analysis modes adopted by various frames can be flexibly adjusted by changing the preset mode and the corresponding relation between the frames and the analysis modes. Or the target parsing mode can be integrated into a format conversion tool or a mature format conversion tool, such as a convert _ tools tool, can be called to parse the network model file. After the network model file is analyzed, corresponding information can be obtained, and the information can record data such as the structure, the weight, the special operation and the like of the network model. After the information of the network model is obtained, an intermediate file with a fixed format, such as txt format or param format, is generated according to the generation method of the intermediate file.
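The first conversion described above can be sketched as a parser that normalizes framework-specific model information into one fixed schema and serializes it as a plain-text intermediate file. The schema and text layout below are assumptions for illustration only:

```python
def parse_model(model_info: dict) -> dict:
    """Normalize framework-specific model info into a fixed intermediate schema."""
    return {
        "layers": model_info["layers"],            # ordered layer descriptions
        "weights": model_info.get("weights", {}),  # weight data, if present
        "special_ops": model_info.get("special_ops", []),
    }

def write_intermediate(inter: dict, path: str) -> None:
    """Serialize the intermediate representation as plain text (txt format)."""
    with open(path, "w") as f:
        for layer in inter["layers"]:
            f.write(f"layer {layer['name']} type={layer['type']}\n")
        for op in inter["special_ops"]:
            f.write(f"special {op}\n")
```

Whatever the source framework, the output file always has the same line-oriented layout, which is the point of the unified intermediate format.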
In a possible embodiment, network model files may be produced with different saving methods: network model files generated by different programming languages are not stored in the same way, i.e., their file formats differ, and even a network model written in one programming language can be saved in several ways. To ensure accurate parsing of the network model file, the step of determining the target parsing manner according to the target framework may include:
Step 11: determining the storage mode of the network model file.
Step 12: determining the target parsing manner according to the target framework and the storage mode.
To ensure the accuracy of parsing the network model file, different parsing manners can be determined for different storage modes; the target parsing manner is thus determined from two aspects, the target framework and the storage mode.
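Steps 11 and 12 can be sketched as a lookup keyed by both the framework and the storage mode; every table entry and parser name below is hypothetical:

```python
# Hypothetical registry: (framework, storage mode) -> parser name.
PARSERS = {
    ("tensorflow", "frozen_graph"): "parse_tf_frozen",
    ("tensorflow", "saved_model"):  "parse_tf_saved",
    ("pytorch",    "state_dict"):   "parse_torch_state",
    ("pytorch",    "full_model"):   "parse_torch_full",
}

def select_parsing_manner(framework: str, storage_mode: str) -> str:
    """Pick the target parsing manner from both the framework and the storage mode."""
    key = (framework, storage_mode)
    if key not in PARSERS:
        raise ValueError(f"no parser registered for {key}")
    return PARSERS[key]
```

Adjusting the correspondence between frameworks and parsing manners, as the embodiment describes, amounts to editing this registry.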
S103: and obtaining a structure file by using the intermediate file, and performing format conversion processing on the structure file to obtain a standard structure file and a control file which are adaptive to the FPGA equipment.
The structure file is obtained from the intermediate file and is used for recording the structure characteristics of the network model, such as the number of network layers, the specific processing mode performed by each layer and other information. After the structure file is obtained, in order to make the structure file be compatible with the FPGA device, the structure file needs to be subjected to format conversion processing, that is, data in the structure file is read out and written into a new file according to a format that can be adapted by the FPGA device. The new file comprises a standard structure file and a control file, the standard structure file is a model structure file which can be used on the FPGA device, the control file is used for controlling the execution flow of the neural network and recording some special operations in the network structure, and the special operations cannot be directly used on the FPGA device and are recorded in the control file.
Further, in a possible implementation manner, in order to improve the computation speed of the network model after deployment on the FPGA device, the step of obtaining the structure file from the intermediate file may include:
Step 21: performing network-layer compression and optimization on the intermediate file to obtain an optimized file.
Step 22: obtaining the structure file from the optimized file.
Network-layer compression merges two or more network layers into one, improving the computation speed of the model by reducing the number of network layers; for example, a convolution layer (conv layer) and a batch-normalization layer (bn layer) can be fused into a single layer. Optimization improves the computation speed of each network layer; its specific form is not limited and may, for example, be quantization. After network-layer compression and optimization, an optimized file is obtained, and the structure file derived from it describes the optimized network model.
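The conv/bn compression mentioned above is commonly realized by folding the batch-norm parameters into the convolution weights. A minimal per-channel sketch follows; this is the standard fusion algebra, not code from the patent:

```python
import math

def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold a batch-norm layer into the preceding convolution, per output channel.

    For each output channel c:
        scale_c = gamma_c / sqrt(var_c + eps)
        w'_c    = w_c * scale_c
        b'_c    = (b_c - mean_c) * scale_c + beta_c
    so conv followed by bn equals a single conv with weights (w', b').
    """
    fused_w, fused_b = [], []
    for c in range(len(weight)):
        scale = gamma[c] / math.sqrt(var[c] + eps)
        fused_w.append([w * scale for w in weight[c]])
        fused_b.append((bias[c] - mean[c]) * scale + beta[c])
    return fused_w, fused_b
```

The fused layer computes exactly what the two original layers computed, while halving the number of layers the FPGA must execute for that pair.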
S104: and acquiring a plurality of operator codes, and generating a bit stream file by using the plurality of operator codes, the standard structure file and the control file.
The operator codes are codes corresponding to all calculation operators in the network model, and the calculation operators are used for coding, so that the network model executes corresponding calculation on the FPGA equipment. The specific content of the calculation operator is not limited, and may be, for example, a linear rectification operator (ReLU operator), a convolution operator (conv operator), a pooling operator (pool operator), a normalization operator (softmax operator), an addition operator (add operator), or the like. The operator codes can be generated and stored and are acquired when the bit stream file is generated; or the corresponding calculation operator of the network model file can be determined temporarily and the corresponding operator code can be generated. After the operator codes are obtained, the operator codes, the standard structure file and the control file are used for generating a bit stream file, and the bit stream file (or bit stream file) is used for being written into the FPGA equipment. And the intermediate file is used for generating a bit stream file, so that the second conversion is completed, and the network model file with any architecture can be converted into a bit stream file which can be directly read by the FPGA equipment.
In a possible implementation manner, to avoid the storage-space cost of keeping operator codes for every conceivable operator, the step of acquiring the plurality of operator codes may include:
Step 31: identifying the plurality of computation operators in the network model file.
Step 32: acquiring a communication operator, and encoding the communication operator together with the computation operators to obtain the plurality of operator codes.
To enable each computation operator to run on the FPGA device, the communication operator must be obtained and encoded along with the computation operators themselves, finally yielding all of the operator codes. The communication operator is not responsible for processing or computing data; it forwards data of various kinds and handles the communication between computation operators. Using the communication operator, data exchange can be completed efficiently, improving computation speed. Meanwhile, by identifying and encoding only the computation operators actually present in the network model file, operator codes for every possible computation operator need not be stored in advance, avoiding wasted storage space.
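Steps 31 and 32 can be sketched as collecting the distinct computation operators actually used by the model, appending the communication operator, and assigning each a code; the integer-code scheme below is an assumption for illustration:

```python
def build_operator_codes(layers):
    """Return {operator_name: code} for the model's operators plus 'comm'.

    Only operators actually present in the model are encoded, so no codes
    for unused operators need to be stored in advance.
    """
    ops = []
    for layer in layers:                 # e.g. [{"type": "conv"}, ...]
        if layer["type"] not in ops:
            ops.append(layer["type"])    # keep first-seen order, skip duplicates
    ops.append("comm")                   # communication operator for data forwarding
    return {name: code for code, name in enumerate(ops)}
```

The resulting table, together with the standard structure file and control file, is what the bitstream generation step consumes.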
S105: and storing the bit stream file in the FPGA equipment to complete the deployment of the network model on the FPGA equipment.
After the bitstream file is compiled, the bitstream file is stored in the FPGA device. The specific storage mode is not limited, and for example, the bitstream file may be sent to the FPGA device, and stored in a programming mode. And storing the network model into the FPGA equipment to complete the deployment of the network model on the FPGA equipment. By the method of twice conversion, the network model of any architecture can be deployed on the FPGA equipment without adjusting the FPGA equipment or executing other prepositive work, and the rapid deployment of the network model of any architecture is completed.
In a possible implementation manner, after the deployment of the network model is completed, the FPGA device may further be controlled to perform data processing with the network model, and the method may further include:
step 41: and acquiring a weight file corresponding to the network model file by using the intermediate file.
Step 42: and acquiring input data, and sending the input data and the weight file to the FPGA equipment so that the FPGA equipment can process the input data according to the weight file.
Step 43: and acquiring a processing result obtained after the FPGA equipment performs data processing on the input data.
Step 44: and executing the target step by taking the processing result as an input value of the target step.
The weight file is used for recording weight parameters required by the calculation of the network model, and the specific content of the weight file is related to the structure of the network model and the training result. Since the intermediate file records all information such as structure information and weight information of the network model, the weight file can be obtained from the intermediate file. Specifically, the weight file may be obtained when the structure file is obtained, that is, step 41 may be executed simultaneously with the step of "obtaining the structure file using the intermediate file" in step S103, or may be executed when data processing using the FPGA device is required. The input data is input data of the network model, and specific content of the input data is not limited, and may be, for example, images, audio, and the like. After the input data are obtained, the input data and the weight file are sent to the FPGA device, so that the FPGA device can process the input data according to the weight in the weight file. The input data may be obtained in any manner, such as reading the input data from a preset path, or obtaining data sent by other devices or terminals as the input data.
After the FPGA device finishes processing the input data and obtains a processing result, the processing result can be sent to the network model deployment device to be used as an input value for executing a target step. It should be noted that the target step is whatever step needs to be executed according to the processing result; it may, for example, perform a further calculation with the processing result or forward it, or be another step entirely, which this embodiment does not limit.
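Steps 41 through 43 amount to a host-side loop of sending the weights and input data to the device and collecting the result. The device interface below is a stand-in stub used only to make the flow concrete, not a real FPGA API:

```python
class FpgaDeviceStub:
    """Placeholder for the real device; 'processes' input as a dot product with the weights."""
    def __init__(self, weights):
        self.weights = weights

    def process(self, inputs):
        return sum(w * x for w, x in zip(self.weights, inputs))

def run_inference(device, weights, input_data):
    device.weights = weights              # step 42: send weight file contents and input
    result = device.process(input_data)   # device-side data processing
    return result                         # step 43: retrieve the processing result
```

A real host would transfer the weight file and input over the device interface (e.g. PCIe DMA) rather than assigning an attribute; the stub only mirrors the control flow.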
In one embodiment, the input data is not acquired directly; the directly acquired raw data must be processed to obtain the input data. The step of acquiring input data may comprise:
Step 51: acquiring raw data.
Step 52: preprocessing the raw data to obtain the input data.
The preprocessing may comprise one or more operations, whose specific content depends on the type of the raw data. For example, when the raw data is an image, the preprocessing may be one or more of resizing (a resize process that changes the image dimensions), mean subtraction, filtering and the like; when the raw data is audio, the preprocessing may include noise reduction and the like.
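As a concrete sketch of such preprocessing, the following applies a nearest-neighbor resize followed by mean subtraction on plain nested lists; a real pipeline would use an image library, so this is illustrative only:

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize of a 2-D list to (out_h, out_w)."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

def subtract_mean(image):
    """Subtract the global mean from every pixel (simple mean processing)."""
    flat = [v for row in image for v in row]
    mean = sum(flat) / len(flat)
    return [[v - mean for v in row] for row in image]

def preprocess(image, out_h, out_w):
    """Resize then mean-subtract, yielding input data for the deployed model."""
    return subtract_mean(resize_nearest(image, out_h, out_w))
```

The output of `preprocess` is what would be sent to the FPGA device together with the weight file.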
By applying the network model deployment method provided by the embodiment of the present invention, the acquired network model file is parsed in the target parsing manner corresponding to the target framework to obtain an intermediate file, so that a network model file under any framework can be converted into an intermediate file with the same format. The intermediate file is used to obtain a structure file describing the structure of the network model. Through format conversion, a standard structure file and a control file adapted to the FPGA device are obtained: the standard structure file describes the structure of the network model, and the control file describes the special operations in the network structure. After the operator codes corresponding to the operators in the network model are obtained, the operator codes, the standard structure file and the control file are used to generate the bitstream file required by the FPGA device, and the deployment of the network model on the FPGA device is completed by storing the bitstream file in the FPGA device. Through two conversions (converting the network model file under the target framework into the intermediate file, then using the intermediate file to obtain the bitstream file), a network model file of any framework can be converted into a bitstream file, realizing rapid deployment of a network model of any framework on the FPGA device, improving deployment flexibility and deployment speed, and solving the problems of poor deployment flexibility and low deployment speed in the related art.
In the following, the network model deployment apparatus provided in the embodiment of the present invention is introduced, and the network model deployment apparatus described below and the network model deployment method described above may be referred to correspondingly.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a network model deployment apparatus according to an embodiment of the present invention, including:
an obtaining module 110, configured to obtain a network model file produced with a target framework;
a parsing module 120, configured to determine a target parsing manner according to the target framework, and parse the network model file according to the target parsing manner to obtain an intermediate file;
a conversion module 130, configured to obtain a structure file from the intermediate file, and perform format conversion on the structure file to obtain a standard structure file and a control file adapted to the FPGA device;
a generating module 140, configured to obtain a plurality of operator codes, and generate a bitstream file from the plurality of operator codes, the standard structure file and the control file;
and a deployment module 150, configured to store the bitstream file in the FPGA device, so as to complete the deployment of the network model on the FPGA device.
Optionally, the generating module 140 includes:
an identification unit, configured to identify the plurality of computation operators in the network model file;
and an encoding unit, configured to obtain a communication operator, and encode the communication operator and the plurality of computation operators to obtain the plurality of operator codes.
Optionally, the parsing module 120 includes:
a storage-mode determining unit, configured to determine the storage mode of the network model file;
and a parsing-manner determining unit, configured to determine the target parsing manner according to the target framework and the storage mode.
Optionally, the conversion module 130 includes:
the optimization unit is used for performing network layer compression processing and optimization processing on the intermediate file to obtain an optimized file;
and the structure file acquisition unit is used for acquiring the structure file according to the optimization file.
Optionally, the apparatus further includes:
a weight file obtaining module, configured to obtain, from the intermediate file, a weight file corresponding to the network model file;
and a sending module, configured to obtain input data, and send the input data and the weight file to the FPGA device, so that the FPGA device processes the input data according to the weight file.
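For the host to hand both payloads to the FPGA device, the two must be separable on the wire. The length-prefixed framing below is one simple convention; it is an assumption for illustration, not a format specified by the patent.

```python
import struct

def pack_for_fpga(input_data: bytes, weight_file: bytes) -> bytes:
    # Sending module sketch: length-prefix both payloads so the device
    # can split the input data from the weight file.
    return (struct.pack("<II", len(input_data), len(weight_file))
            + input_data + weight_file)

frame = pack_for_fpga(b"\x01\x02", b"\xaa\xbb\xcc")
print(len(frame))  # 13 (8-byte header + 2 + 3 payload bytes)
```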
Optionally, the sending module includes:
a raw data obtaining unit, configured to obtain raw data;
and a preprocessing unit, configured to preprocess the raw data to obtain the input data.
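Real preprocessing (resizing, mean subtraction, quantization, and so on) depends on the model; the normalization below is only an illustrative assumption of what the preprocessing unit might apply to raw 8-bit data.

```python
def preprocess(raw_data: list) -> list:
    # Preprocessing unit sketch: scale raw 8-bit values into [0, 1].
    return [value / 255.0 for value in raw_data]

print(preprocess([0, 255]))  # [0.0, 1.0]
```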
Optionally, the apparatus further includes:
a processing result obtaining module, configured to obtain a processing result produced after the FPGA device performs data processing on the input data;
and a subsequent execution module, configured to execute a target step with the processing result as an input value of the target step.
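The patent leaves the target step unspecified; as one hypothetical example, the FPGA's raw output scores could feed a host-side softmax and argmax that turn them into a class label.

```python
import math

def target_step(processing_result: list) -> int:
    # Subsequent execution sketch: softmax over the FPGA output, then
    # pick the index of the most likely class.
    exps = [math.exp(score) for score in processing_result]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs.index(max(probs))

print(target_step([0.1, 2.3, -1.0]))  # 1
```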
The network model deployment device provided by an embodiment of the present invention is introduced below; the network model deployment device described below and the network model deployment method described above may be referred to in correspondence with each other.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a network model deployment device according to an embodiment of the present invention. The network model deployment device 100 may include a processor 101 and a memory 102, and may further include one or more of a multimedia component 103, an information input/output (I/O) interface 104, and a communication component 105.
The processor 101 is configured to control the overall operation of the network model deployment device 100 so as to complete all or part of the steps of the above network model deployment method; the memory 102 is used to store various types of data to support operation of the network model deployment device 100, which may include, for example, instructions for any application or method operating on the network model deployment device 100, as well as application-related data. The memory 102 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The multimedia component 103 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. A received audio signal may further be stored in the memory 102 or transmitted through the communication component 105. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 104 provides an interface between the processor 101 and other interface modules, such as a keyboard, a mouse, or buttons; these buttons may be virtual or physical. The communication component 105 is used for wired or wireless communication between the network model deployment device 100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them, so the corresponding communication component 105 may include a Wi-Fi component, a Bluetooth component, and an NFC component.
The network model deployment device 100 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, and is configured to execute the network model deployment method according to the above embodiments.
The computer-readable storage medium provided by an embodiment of the present invention is introduced below; the computer-readable storage medium described below and the network model deployment method described above may be referred to in correspondence with each other.
The present invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the network model deployment method described above.
The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to mutually. Since the apparatus disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
Those skilled in the art will further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), flash memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprise", "include", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that includes a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The network model deployment method, apparatus, device, and computer-readable storage medium provided by the present invention are described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method and its core idea. Meanwhile, for a person of ordinary skill in the art, there may be variations in the specific embodiments and the scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A network model deployment method, comprising:
acquiring a network model file obtained using a target framework;
determining a target parsing mode according to the target framework, and parsing the network model file in the target parsing mode to obtain an intermediate file;
obtaining a structure file using the intermediate file, and performing format conversion on the structure file to obtain a standard structure file and a control file adapted to an FPGA device;
acquiring a plurality of operator codes, and generating a bitstream file using the plurality of operator codes, the standard structure file, and the control file;
and storing the bitstream file in the FPGA device to complete deployment of the network model on the FPGA device.
2. The network model deployment method according to claim 1, wherein the acquiring a plurality of operator codes comprises:
identifying a plurality of computational operators in the network model file;
and acquiring a communication operator, and encoding the communication operator and the plurality of computational operators to obtain the plurality of operator codes.
3. The network model deployment method according to claim 1, wherein the determining a target parsing mode according to the target framework comprises:
determining a storage mode of the network model file;
and determining the target parsing mode according to the target framework and the storage mode.
4. The network model deployment method according to claim 1, wherein the obtaining a structure file using the intermediate file comprises:
performing network layer compression and optimization on the intermediate file to obtain an optimized file;
and obtaining the structure file from the optimized file.
5. The network model deployment method according to claim 1, further comprising:
acquiring, using the intermediate file, a weight file corresponding to the network model file;
and acquiring input data, and sending the input data and the weight file to the FPGA device, so that the FPGA device performs data processing on the input data according to the weight file.
6. The network model deployment method according to claim 5, wherein the acquiring input data comprises:
acquiring raw data;
and preprocessing the raw data to obtain the input data.
7. The network model deployment method according to claim 5, further comprising:
acquiring a processing result obtained after the FPGA device performs data processing on the input data;
and executing a target step with the processing result as an input value of the target step.
8. A network model deployment apparatus, comprising:
an obtaining module, configured to obtain a network model file obtained using a target framework;
a parsing module, configured to determine a target parsing mode according to the target framework, and parse the network model file in the target parsing mode to obtain an intermediate file;
a conversion module, configured to obtain a structure file using the intermediate file, and perform format conversion on the structure file to obtain a standard structure file and a control file adapted to an FPGA device;
a generating module, configured to acquire a plurality of operator codes, and generate a bitstream file using the plurality of operator codes, the standard structure file, and the control file;
and a deployment module, configured to store the bitstream file in the FPGA device to complete deployment of the network model on the FPGA device.
9. A network model deployment device, comprising a memory and a processor, wherein:
the memory is configured to store a computer program;
and the processor is configured to execute the computer program to implement the network model deployment method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the network model deployment method according to any one of claims 1 to 7.
CN202010663000.5A 2020-07-10 2020-07-10 Network model deployment method, device, equipment and readable storage medium Withdrawn CN111860817A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010663000.5A CN111860817A (en) 2020-07-10 2020-07-10 Network model deployment method, device, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010663000.5A CN111860817A (en) 2020-07-10 2020-07-10 Network model deployment method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN111860817A true CN111860817A (en) 2020-10-30

Family

ID=73153758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010663000.5A Withdrawn CN111860817A (en) 2020-07-10 2020-07-10 Network model deployment method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN111860817A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541577A (en) * 2020-12-16 2021-03-23 上海商汤智能科技有限公司 Neural network generation method and device, electronic device and storage medium
CN112819153A (en) * 2020-12-31 2021-05-18 杭州海康威视数字技术股份有限公司 Model transformation method and device
CN112819153B (en) * 2020-12-31 2023-02-07 杭州海康威视数字技术股份有限公司 Model transformation method and device
CN115796284A (en) * 2023-02-08 2023-03-14 苏州浪潮智能科技有限公司 Inference method, inference device, storage medium and equipment based on TVM compiler

Similar Documents

Publication Publication Date Title
CN111860817A (en) Network model deployment method, device, equipment and readable storage medium
CN110136744B (en) Audio fingerprint generation method, equipment and storage medium
CN112819153B (en) Model transformation method and device
CN109074808B (en) Voice control method, central control device and storage medium
EP3564812B1 (en) Method and system for automated creation of graphical user interfaces
CN110176256B (en) Recording file format conversion method and device, computer equipment and storage medium
CN108415826B (en) Application testing method, terminal device and computer readable storage medium
CN114333852A (en) Multi-speaker voice and human voice separation method, terminal device and storage medium
KR102041772B1 (en) Program editing device, program editing method and program editing program stored in the storage medium
JP6778811B2 (en) Speech recognition method and equipment
CN114818695A (en) Text style migration method, device, equipment and storage medium
CN114138274A (en) High-level intermediate representation conversion method and related device of deep learning compiler
CN112735479B (en) Speech emotion recognition method and device, computer equipment and storage medium
CN113409803B (en) Voice signal processing method, device, storage medium and equipment
CN116343791A (en) Service execution method, device, computer equipment and storage medium thereof
CN109960590A (en) A method of optimization embedded system diagnostic printing
CN114283791A (en) Speech recognition method based on high-dimensional acoustic features and model training method
CN114187388A (en) Animation production method, device, equipment and storage medium
CN110556099B (en) Command word control method and device
CN110096266B (en) Feature processing method and device
CN114185657A (en) Task scheduling method and device of cloud platform, storage medium and electronic equipment
KR102572141B1 (en) Method and apparatus for reconstructing wave from bits steam of speech codec, computer-readable storage medium and computer program
CN111368523A (en) Method and device for converting layout format of movie and television script
EP4068141A1 (en) Method and system to enable print functionality in high-level synthesis (hls) design platforms
JP7367839B2 (en) Voice recognition device, control method, and program

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20201030