WO2023115776A1 - Neural network reasoning method and apparatus, and computer device, computer-readable storage medium and computer program product - Google Patents

Neural network reasoning method and apparatus, and computer device, computer-readable storage medium and computer program product

Info

Publication number
WO2023115776A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
network
neural network
network layer
deployment
Application number
PCT/CN2022/090030
Other languages
French (fr)
Chinese (zh)
Inventor
李天健
许思
Original Assignee
上海商汤智能科技有限公司
Application filed by 上海商汤智能科技有限公司
Publication of WO2023115776A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/40 Transformation of program code
    • G06F 8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models

Definitions

  • The embodiments of the present disclosure are based on the Chinese patent application with application number 202111595072.1, filed on December 24, 2021 and entitled "a neural network reasoning method, device, computer equipment and storage medium", and claim the priority of that Chinese patent application, the entire content of which is hereby incorporated into this disclosure by reference.
  • The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to a neural network reasoning method and apparatus, a computer device, a computer-readable storage medium, and a computer program product.
  • In neural network inference, an inference engine is usually used to optimize the configuration parameters of different calculations, which can improve the performance of neural network inference.
  • Embodiments of the present disclosure provide a neural network reasoning method and device, computer equipment, a computer-readable storage medium, and a computer program product.
  • An embodiment of the present disclosure provides a neural network reasoning method, including:
  • determining the target configuration parameters corresponding to a target network layer based on the network parameters corresponding to the target network layer and a predetermined correspondence between sample network layers and configuration parameters; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are the configuration information of the algorithm used when performing the operation corresponding to the sample network layer;
  • In the above manner, the target configuration parameters corresponding to the target network layer are automatically determined based on the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and the configuration parameters, and the target neural network is deployed based on the target configuration parameters, which saves the time for configuring parameters in the neural network deployment initialization phase, thereby improving the deployment efficiency of the neural network.
  • An embodiment of the present disclosure also provides a neural network reasoning device, including:
  • the analysis part is configured to obtain the target neural network to be deployed, analyze the target neural network, and determine the network parameters corresponding to each network layer of the target neural network;
  • the determining part is configured to determine the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and a predetermined correspondence between sample network layers and configuration parameters; wherein the sample network layer is of the same type as the target network layer, and the configuration parameter corresponding to the sample network layer is configuration information of an algorithm used when performing an operation corresponding to the sample network layer;
  • the reasoning part is configured to deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
  • An embodiment of the present disclosure also provides a computer device, including a processor, a memory, and a bus; the memory stores machine-readable instructions executable by the processor; the processor and the memory communicate with each other through the bus, and the machine-readable instructions are executed by the processor to perform the steps of the above neural network reasoning method.
  • An embodiment of the present disclosure also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the above neural network reasoning method are executed.
  • An embodiment of the present disclosure also provides a computer program product, where the computer program product includes a computer program or instructions, and when the computer program or instructions are run on an electronic device, the electronic device is caused to execute the steps of the above method.
  • FIG. 1 shows a flowchart of a neural network reasoning method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flowchart of a method for determining target configuration parameters corresponding to a target network layer in the neural network reasoning method provided by an embodiment of the present disclosure
  • FIG. 3 shows a flowchart of a method for determining at least one candidate sample network layer corresponding to a target network layer in the neural network reasoning method provided by an embodiment of the present disclosure
  • FIG. 4 shows a flowchart of a method for deploying a target neural network in the neural network reasoning method provided by an embodiment of the present disclosure
  • FIG. 5 shows a flowchart of a method for generating the target deployment code corresponding to the target neural network in the neural network reasoning method provided by an embodiment of the present disclosure
  • FIG. 6 shows a flowchart of a method for network reasoning in the neural network reasoning method provided by an embodiment of the present disclosure
  • FIG. 7 shows a schematic diagram of the architecture of a neural network reasoning device provided by an embodiment of the present disclosure
  • FIG. 8 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
  • In order to obtain better inference performance, the inference engine often needs to traverse a large number of configuration parameter combinations in the preprocessing stage, actually deploy the neural network according to the configuration parameter combinations obtained through the traversal, and select a better combination of configuration parameters according to the test results after actual deployment. This makes the preprocessing stage take a long time and reduces the deployment efficiency of the neural network.
  • the present disclosure provides a neural network reasoning method, device, computer equipment, and storage medium.
  • In the method, the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and configuration parameters are used to determine the target configuration parameters corresponding to the target network layer.
  • the execution subject of the neural network reasoning method provided in the embodiments of the present disclosure is generally a computer device with a certain computing power.
  • the computer device includes, for example: a terminal device or a server or other processing device, and the terminal device may be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, and the like.
  • the neural network reasoning method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • the method includes S101 to S103, wherein:
  • S101 Obtain a target neural network to be deployed, analyze the target neural network, and determine network parameters corresponding to each network layer of the target neural network.
  • S102 Determine the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and the configuration parameters; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of the algorithm used when performing the operation corresponding to the sample network layer.
  • S103 Deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
  • The network parameters corresponding to each network layer of the target neural network include weight parameters, bias parameters, convolution parameters of the convolution layer, activation parameters of the activation layer, etc. By determining the network parameters corresponding to the respective layers of the target neural network, the type of each network layer of the target neural network to be deployed, and the parameter values of the network parameters corresponding to each network layer, can be determined.
  • For example, the network parameters corresponding to the convolutional layer in the target neural network are convolution parameters such as the amount of convolution operations, the size of the convolution kernel, and the convolution stride, where the amount of convolution operations can be represented by the length and width of the feature map participating in the convolution operation; the target neural network can represent a specific neural network among multiple alternative neural networks.
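  As an illustration of the parsing step above, the following minimal Python sketch extracts the layer type and parameter values for each network layer from a toy network description; the description format and the name `parse_network` are assumptions made for this example and are not part of the disclosure:

```python
# Illustrative sketch: parse a toy network description into per-layer
# network parameters (layer type plus parameter values).
def parse_network(description):
    """Return {layer_name: {"type": ..., "params": {...}}}."""
    layers = {}
    for layer in description["layers"]:
        layers[layer["name"]] = {
            "type": layer["type"],
            "params": {k: v for k, v in layer.items()
                       if k not in ("name", "type")},
        }
    return layers

toy_net = {
    "layers": [
        {"name": "conv1", "type": "conv",
         "kernel": 3, "stride": 1, "feature_map": (224, 224)},
        {"name": "relu1", "type": "activation", "kind": "relu"},
    ]
}

parsed = parse_network(toy_net)
```

  Once the per-layer types and parameter values are available in this form, the later lookup against the sample-layer correspondence becomes a dictionary operation.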
  • When analyzing the target neural network, it is also possible to determine both the network parameters corresponding to the network layers of the target neural network and the hierarchical relationship between the network layers of the target neural network, and then perform network reasoning.
  • S102 Determine the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and the configuration parameters; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of the algorithm used when performing the operation corresponding to the sample network layer.
  • the target network layer can be a convolutional layer or a network layer performing matrix operations. Since these network layers have a large amount of computation, corresponding configuration parameters can be set to improve computing efficiency.
  • Here, the algorithm represents the method used to perform the operation corresponding to the network layer, including whether to use a specific computing mechanism, the amount of computation performed by each computing unit during the operation, etc.
  • the target network layer may represent a specific network layer among multiple network layers in the target neural network
  • the target configuration parameter may represent a configuration parameter corresponding to the target network layer.
  • The configuration parameters corresponding to the sample network layer are optimal configuration parameters for neural network deployment of the sample network layer under various network parameter settings; for example, they may be the optimal solution of the configuration parameters corresponding to the target network layer.
  • For example, the configuration parameters corresponding to the sample network layer include the manner in which a matrix product calculation is decomposed among the Compute Unified Device Architecture (CUDA) operation units in a Graphics Processing Unit (GPU), the iteration step size of each loop calculation of each CUDA operation unit, the iteration step size of each loop calculation of each minimum operation unit, etc.
  • A plurality of sample network layers with different network parameters can be determined in an exhaustive manner, and before the neural network is deployed, the neural network reasoning engine can pre-determine the correspondence between the sample network layers and the configuration parameters. For any sample network layer, the configuration parameters corresponding to that sample network layer may be the configuration parameters used in a deployment whose operation results meet a preset condition; the preset condition may be that the inference speed of the deployed neural network is greater than a preset threshold, where the preset threshold may be a positive number.
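  The pre-computation of this correspondence can be sketched as follows; the `benchmark` function, the parameter grids, and the speed model are stand-ins invented for illustration (a real implementation would measure inference speed by actually deploying on the test device):

```python
# Illustrative sketch: exhaustively enumerate sample network layers with
# different network parameters, benchmark each candidate configuration,
# and record, per sample layer, a configuration whose inference speed
# exceeds a preset threshold.
def benchmark(sample_params, config):
    # Toy proxy for measured inference speed (higher is better).
    return config["tile"] / (1 + abs(sample_params["kernel"] - config["tile"] // 8))

def build_correspondence(kernel_sizes, tile_sizes, speed_threshold):
    table = {}
    for kernel in kernel_sizes:
        sample = {"kernel": kernel}
        best = None
        for tile in tile_sizes:
            config = {"tile": tile}
            speed = benchmark(sample, config)
            if speed > speed_threshold and (best is None or speed > best[1]):
                best = (config, speed)
        if best is not None:
            # Key the table by layer type and sample network parameters.
            table[("conv", kernel)] = best[0]
    return table

table = build_correspondence([1, 3, 5], [16, 32, 64], speed_threshold=0.0)
```

  Because this table is built once, offline, the per-deployment traversal of configuration combinations described in the related art is avoided.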
  • It should be noted that the device used for neural network deployment with the sample network layer is a test device, not the target deployment device on which the target network layer is actually deployed. Due to hardware differences between devices, the same configuration parameters may produce different running results on different deployment devices; that is, when the target neural network is deployed according to configuration parameters with better running results on the test device, the final running results may not be better.
  • In this way, the optimal configuration parameters for neural network deployment of the sample network layer under each network parameter setting can be obtained. Since the type of the sample network layer is the same as that of the target network layer, when subsequently determining the target configuration parameters of the target network layer, the target configuration parameters corresponding to the target network layer may be determined based on the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and configuration parameters. Therefore, a better configuration-parameter solution suited to a specific convolutional layer or a network layer performing matrix operations (the target network layer) can be quickly selected, saving the time for configuring parameters in the initial stage of neural network deployment.
  • the target configuration parameters corresponding to the target network layer may be determined through the following steps:
  • S201 Based on the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layer, determine at least one candidate sample network layer corresponding to the target network layer.
  • the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layer may be a cosine similarity between network parameters.
  • one or more of the network parameters may be used to determine the similarity.
  • In practice, the similarity of each network parameter selected for determining the similarity can be computed separately, the similarities of these network parameters are weighted and summed, and the weighted sum is taken as the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layer.
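  The weighted-sum similarity described above can be sketched as follows; which network parameters are compared, how they are vectorized, and the weights are illustrative assumptions:

```python
import math

# Illustrative sketch: per-parameter cosine similarity, combined by a
# weighted sum into a single layer-to-layer similarity score.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def layer_similarity(target_params, sample_params, weights):
    total = 0.0
    for name, weight in weights.items():
        total += weight * cosine(target_params[name], sample_params[name])
    return total

target = {"conv": [3, 1, 224], "shape": [64, 64]}
sample = {"conv": [3, 1, 256], "shape": [64, 64]}
sim = layer_similarity(target, sample, weights={"conv": 0.6, "shape": 0.4})
```

  With weights summing to one, the score stays in [0, 1] for non-negative parameter vectors, so a single preset similarity threshold can be applied uniformly.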
  • At least one candidate sample network layer corresponding to the target network layer may be determined through the following steps:
  • S2011 Determine the sample network layer whose similarity is greater than a preset similarity as an initial sample network layer.
  • For example, if their similarities are greater than the preset similarity, sample network layer 2 and sample network layer 4 can be determined as initial sample network layers, so that a candidate set including multiple initial sample network layers can be obtained.
  • the optimal solution is selected and recorded by traversing the candidate set.
  • the candidate set can be reduced by pruning.
  • S2012 Determine the at least one candidate sample network layer from the initial sample network layers based on the configuration information screening condition matching the network parameters corresponding to the target network layer and the configuration parameters corresponding to each initial sample network layer.
  • Here, the configuration information screening conditions can be obtained by means of data analysis; for example, by performing data analysis on the network parameters corresponding to the target network layer, the maximum value and minimum value of each configuration parameter corresponding to those network parameters can be obtained, and the corresponding configuration information screening condition is that the parameter value of the configuration parameter must be no greater than the corresponding maximum value and no less than the corresponding minimum value.
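  A minimal sketch of this screening condition, assuming the per-parameter minimum and maximum values have already been derived by data analysis (the layer names and parameter names are invented for the example):

```python
# Illustrative sketch: keep only initial sample network layers whose
# configuration parameters fall inside per-parameter [min, max] bounds
# derived for the target network layer.
def screen_candidates(initial_layers, bounds):
    """initial_layers: {layer_id: {param: value}}; bounds: {param: (lo, hi)}."""
    kept = {}
    for layer_id, config in initial_layers.items():
        ok = all(lo <= config[p] <= hi
                 for p, (lo, hi) in bounds.items() if p in config)
        if ok:
            kept[layer_id] = config
    return kept

initial = {
    "sample2": {"tile": 32, "unroll": 4},
    "sample4": {"tile": 128, "unroll": 2},
}
candidates = screen_candidates(initial, bounds={"tile": (16, 64), "unroll": (1, 8)})
```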
  • In this way, each initial sample network layer is screened and the at least one candidate sample network layer is determined from the initial sample network layers, which can reduce the amount of calculation when subsequently determining the configuration parameters of the target network layer, thereby improving the deployment efficiency of the neural network.
  • S202 For any of the candidate sample network layers, deploy the target neural network based on the configuration parameters corresponding to the candidate sample network layer, and determine the operation result of the target neural network in the deployment mode corresponding to the candidate sample network layer.
  • Here, the neural network reasoning engine can be used to deploy the neural network; after the target neural network is deployed to the target deployment device, the running result of the target neural network in the deployment mode corresponding to the candidate sample network layer can be determined. The running result may be inference speed, inference accuracy, etc., and the deployment effect of the target neural network in this deployment mode can be determined through the running result.
  • S203 Based on the operation results of the target neural network in the deployment mode corresponding to each of the candidate sample network layers, determine the target candidate sample network layer, and use the configuration parameters corresponding to the target candidate sample network layer as the target configuration parameter.
  • When determining the target candidate sample network layer based on the operation results of the deployment modes corresponding to the candidate sample network layers, the target candidate sample network layer with a better operation result can be determined from the candidate sample network layers according to a preset operation result evaluation rule, and the configuration parameters corresponding to the target candidate sample network layer are used as the target configuration parameters.
  • the target candidate sample network layer may represent a specific candidate sample network layer in at least one candidate sample network layer.
  • For example, the operation result evaluation rule may be to select a candidate sample network layer whose inference speed is greater than a preset threshold as the target candidate sample network layer.
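  This evaluation rule can be sketched as follows; the run results are assumed to have already been measured on the target deployment device for each candidate's configuration parameters, and the candidate names are invented:

```python
# Illustrative sketch of the evaluation rule: among candidate deployments,
# keep those whose measured inference speed exceeds a preset threshold,
# then pick the fastest as the target candidate sample network layer.
def select_target_candidate(run_results, speed_threshold):
    """run_results: {candidate_id: inference_speed}."""
    eligible = {cid: s for cid, s in run_results.items() if s > speed_threshold}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)

results = {"sample2": 120.0, "sample4": 95.0, "sample7": 150.0}
target_candidate = select_target_candidate(results, speed_threshold=100.0)
```

  Returning None when no candidate clears the threshold leaves room for a fallback, e.g. relaxing the threshold or falling back to a default configuration.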
  • S103 Deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
  • the network reasoning is to perform data processing on the input data based on the target neural network, so as to obtain a data processing result corresponding to the input data.
  • Taking the target neural network as an image recognition network as an example, after the target neural network is deployed, when a picture containing a cat is input into the target neural network, the reasoning result "cat" can be obtained through the network reasoning of the target neural network.
  • the target neural network can be deployed through the following steps:
  • S401 Based on the target configuration parameters, determine the first deployment code corresponding to the target network layer; and, based on the network parameters of the network layers in the target neural network other than the target network layer, determine the second deployment code corresponding to those other network layers.
  • Here, the first deployment code and the second deployment code are codes that can be recognized by the central processing unit, and the deployment configuration of the target neural network can be recorded in the central processing unit, where the central processing unit is the device on which the neural network reasoning engine is deployed and is used for deploying the target neural network.
  • In practice, when determining the first deployment code corresponding to the target network layer, the target configuration parameters corresponding to the target network layer may be encapsulated based on a preset code encapsulation rule; when determining the second deployment code corresponding to the other network layers based on the network parameters of the network layers other than the target network layer, the second deployment code corresponding to those network layers may be obtained from the neural network reasoning engine according to their network parameters.
  • The other network layers may represent any network layer in the target neural network other than the target network layer, or two or more such network layers.
  • The code encapsulation rule may define a template for code encapsulation. When encapsulating the target configuration parameters corresponding to the target network layer based on the preset code encapsulation rule, the target configuration parameters may be added to the corresponding positions of the template according to the correspondence between the target configuration parameters and the template, thereby generating the first deployment code corresponding to the target network layer. The code encapsulation rule may indicate a correspondence between a configuration parameter and a code encapsulation template.
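  A minimal sketch of template-based code encapsulation; the template text, the `${...}` placeholder syntax, and the parameter names are invented for this example:

```python
# Illustrative sketch: fill a code-encapsulation template with the target
# configuration parameters to produce the first deployment code.
def encapsulate(template, config):
    code = template
    for name, value in config.items():
        # Replace each ${name} placeholder with the parameter value.
        code = code.replace("${" + name + "}", str(value))
    return code

KERNEL_TEMPLATE = (
    "__global__ void conv_kernel(/* ... */) {\n"
    "  const int TILE = ${tile};\n"
    "  const int UNROLL = ${unroll};\n"
    "  /* tiled convolution body elided */\n"
    "}\n"
)

first_deployment_code = encapsulate(KERNEL_TEMPLATE, {"tile": 32, "unroll": 4})
```

  Because the template is filled at deployment time, no pre-generated variant of the code has to be stored in the inference engine for every possible configuration.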
  • In practice, fusion information corresponding to the target network layer may also be obtained, and code encapsulation may be performed on the target network layer together with the other network layers that have a fusion relationship with the target network layer according to the fusion information.
  • Taking the target network layer as a convolutional layer as an example, if it is determined according to the fusion information that another network layer having a fusion relationship with the convolutional layer is an activation layer, then the configuration parameters corresponding to the convolutional layer and the network parameters corresponding to the activation layer can be code-encapsulated at the same time, which can improve the efficiency of initial deployment code generation.
  • the convolutional code may be split into multiple parts. In this way, these split code parts can be spliced to obtain the deployment code.
  • Code encapsulation through preset encapsulation rules can automatically generate the encapsulated code when deploying the neural network, without adding corresponding codes to the neural network inference engine in advance, thereby reducing the space occupied by the neural network inference engine and improving the deployment efficiency of the neural network.
  • S402 Based on the first deployment code and the second deployment code, generate a target deployment code corresponding to the target neural network, and add the target deployment code to a target deployment device.
  • The target deployment device may be a hardware device, such as a graphics processor, that can be used for neural network deployment. After the target deployment code is added to the target deployment device, the deployment of the target neural network is completed.
  • the target deployment code may refer to a specific deployment code generated by the first deployment code and the second deployment code.
  • the corresponding code can be generated according to the fusion effect of convolution/matrix operations, and the performance of neural network reasoning will be improved.
  • the target deployment code corresponding to the target neural network can be generated through the following steps:
  • S4021 Concatenate the first deployment code and the second deployment code to determine an initial deployment code.
  • Here, the first deployment code and the second deployment code can be spliced according to the connection relationship between the network layers in the target neural network, and the spliced deployment code is the initial deployment code.
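  The splicing of S4021 can be sketched as follows, assuming one code fragment per network layer and a connection order already derived from the network's layer relationships (the fragment contents are placeholders):

```python
# Illustrative sketch: splice per-layer deployment code fragments in the
# order given by the connection relationship between network layers to
# obtain the initial deployment code.
def splice_deployment_code(fragments, layer_order):
    """fragments: {layer_name: code}; layer_order: list of layer names."""
    return "\n".join(fragments[name] for name in layer_order)

fragments = {
    "conv1": "// first deployment code (conv1, generated from target config)",
    "relu1": "// second deployment code (relu1)",
    "fc1": "// second deployment code (fc1)",
}
initial_deployment_code = splice_deployment_code(fragments, ["conv1", "relu1", "fc1"])
```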
  • S4022 Call the target interface function of the target deployment device to compile the initial deployment code, and generate the target deployment code, where the target deployment code is code running on the target deployment device.
  • the target interface function may represent a specific interface function of the target deployment device.
  • For example, the target interface function of the NVRTC (NVIDIA Runtime Compilation) interface can be called to compile the initial deployment code and generate target deployment code that can run on the graphics processor.
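  The compile step can be illustrated with a stand-in; a real GPU implementation would go through the NVRTC API (for example `nvrtcCompileProgram` followed by retrieving the compiled PTX), while the registry and toy "compiler" below are invented so the control flow can be shown and tested without a GPU:

```python
# Illustrative stand-in for compiling the initial deployment code through
# the target device's interface function. The registry maps a device kind
# to its interface (compile) function.
COMPILERS = {}

def register_compiler(device_kind, fn):
    COMPILERS[device_kind] = fn

def compile_for_device(device_kind, source):
    try:
        interface_fn = COMPILERS[device_kind]
    except KeyError:
        raise ValueError(f"no interface function for {device_kind!r}")
    return interface_fn(source)

# Toy "compiler": merely tags the source as a compiled artifact.
register_compiler("gpu", lambda src: {"binary": src.encode(), "target": "gpu"})

artifact = compile_for_device("gpu", "// initial deployment code")
```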
  • In this way, the target deployment code of the target neural network can be generated in real time when deploying the neural network, which can improve the deployment efficiency of the neural network.
  • network reasoning can be performed through the following steps:
  • S601 Receive deployment information corresponding to the target deployment code sent by the target deployment device; wherein the deployment information is used to describe a deployment location of codes corresponding to each network layer of the target neural network.
  • Here, the target deployment device may deploy the target deployment code in the target deployment device and send the deployment information to the neural network reasoning engine, so that the target deployment device performs neural network inference after receiving an inference instruction.
  • S602 Perform neural network inference based on the deployment information and the hierarchical relationship between network layers obtained by parsing the target neural network.
  • the codes corresponding to each deployment information may be sequentially run according to the hierarchical relationship to perform neural network reasoning.
  • The hierarchical relationship between the network layers of the target neural network may be obtained by the same analysis that determines the network parameters corresponding to each network layer of the target neural network, or by a separate analysis of the target neural network.
  • In practice, the inference sequence of the network layers that need to be used when performing neural network inference can be determined according to the hierarchical relationship, and the neural network inference engine can, according to that sequence, send inference instructions to the target deployment device in turn to instruct the target deployment device to run the corresponding codes according to the inference sequence to perform neural network inference.
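  Deriving the inference sequence from the hierarchical relationship can be sketched as a topological sort over the layer graph; the layer names, the edge list, and the stand-in for sending inference instructions are illustrative assumptions:

```python
# Illustrative sketch: topologically sort the layers according to the
# hierarchical relationship (edges src -> dst), then issue per-layer
# inference instructions in that order.
def inference_order(layers, edges):
    incoming = {layer: 0 for layer in layers}
    for src, dst in edges:
        incoming[dst] += 1
    ready = [l for l in layers if incoming[l] == 0]
    order = []
    while ready:
        layer = ready.pop(0)
        order.append(layer)
        for src, dst in edges:
            if src == layer:
                incoming[dst] -= 1
                if incoming[dst] == 0:
                    ready.append(dst)
    return order

layers = ["conv1", "relu1", "fc1"]
edges = [("conv1", "relu1"), ("relu1", "fc1")]
executed = []
for layer in inference_order(layers, edges):
    # Stand-in for sending an inference instruction to the target device.
    executed.append(f"run {layer}")
```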
  • In summary, the neural network reasoning method provided by the embodiments of the present disclosure automatically determines, when deploying the target neural network, the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and the configuration parameters, and deploys the target neural network based on the target configuration parameters, which saves the time for configuring parameters in the neural network deployment initialization stage, thereby improving the deployment efficiency of the neural network.
  • The writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
  • The embodiment of the present disclosure also provides a neural network reasoning device corresponding to the neural network reasoning method. Since the problem-solving principle of the device in the embodiment of the present disclosure is similar to that of the above neural network reasoning method, the implementation of the device can refer to the implementation of the method.
  • FIG. 7 it is a schematic diagram of the architecture of a neural network inference device provided by an embodiment of the present disclosure.
  • The device includes various parts, which can be implemented by a processor in a computer device, or by a specific logic circuit. In the process of implementation, the processor can be a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), or a GPU, etc.
  • the neural network reasoning device includes: an analysis part 701, a determination part 702, and a reasoning part 703; wherein,
  • the parsing part 701 is configured to acquire the target neural network to be deployed, analyze the target neural network, and determine network parameters corresponding to each network layer of the target neural network;
  • the determining part 702 is configured to determine the target configuration parameter corresponding to the target network layer based on the network parameter corresponding to the target network layer and the predetermined correspondence between the sample network layer and the configuration parameter; wherein the sample network layer is of the same type as the target network layer, and the configuration parameter corresponding to the sample network layer is configuration information of an algorithm used when performing an operation corresponding to the sample network layer;
  • the reasoning part 703 is configured to deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
  • when determining the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters, the determining part 702 is configured to:
  • for any of the candidate sample network layers, deploy the target neural network based on the configuration parameters corresponding to that candidate sample network layer, and determine the operation result under the deployment mode corresponding to that candidate sample network layer;
  • based on the operation results of the target neural network under the deployment modes corresponding to the respective candidate sample network layers, determine a target candidate sample network layer, and use the configuration parameters corresponding to the target candidate sample network layer as the target configuration parameters.
  • when determining at least one candidate sample network layer, the determining part 702 is configured to:
  • determine the at least one candidate sample network layer from initial sample network layers, based on a screening condition matching the network parameters corresponding to the target network layer and on the configuration parameters corresponding to each initial sample network layer.
  • when deploying the target neural network based on the target configuration parameters, the reasoning part 703 is configured to:
  • when determining the first deployment code corresponding to the target network layer based on the target configuration parameters and the network parameters, the reasoning part 703 is configured to:
  • encapsulate the target configuration parameters corresponding to the target network layer to determine the first deployment code corresponding to the target network layer.
  • when generating the target deployment code corresponding to the target neural network based on the first deployment code and the second deployment code, the reasoning part 703 is configured to:
  • before performing network reasoning based on the target neural network, the reasoning part 703 is further configured to:
  • receive deployment information corresponding to the target deployment code sent by the target deployment device; wherein the deployment information is used to describe the deployment position of the code corresponding to each network layer of the target neural network;
  • when parsing the target neural network and determining the network parameters corresponding to each network layer of the target neural network, the parsing part 701 is configured to:
  • parse the target neural network to determine the network parameters corresponding to the respective network layers of the target neural network and the hierarchical relationship between the network layers of the target neural network;
  • when performing network reasoning based on the target neural network, the reasoning part 703 is configured to:
  • Neural network reasoning is performed based on the deployment information and the hierarchical relationship.
  • when performing neural network reasoning based on the deployment information and the hierarchical relationship, the reasoning part 703 is configured to:
  • the codes corresponding to the deployment information are sequentially run to perform neural network reasoning.
  • When deploying the target neural network, the neural network reasoning device automatically determines the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters, and deploys the target neural network based on the target configuration parameters. This saves the time spent configuring parameters in the neural network deployment initialization stage, thereby improving the deployment efficiency of the neural network.
  • A "part" may be a part of a circuit, a part of a processor, a part of a program or software, and so on; it may also be a unit, and may be modular or non-modular.
  • FIG. 8 is a schematic structural diagram of a computer device 800 provided by an embodiment of the present disclosure, which includes a processor 801, a memory 802, and a bus 803.
  • The memory 802 is configured to store execution instructions, and includes an internal memory 8021 and an external memory 8022; the internal memory 8021 is configured to temporarily store computation data for the processor 801 and to exchange data with the external memory 8022, such as a hard disk.
  • the processor 801 exchanges data with the external memory 8022 through the memory 8021.
  • the processor 801 communicates with the memory 802 through the bus 803, so that the processor 801 executes the following instructions:
  • determine the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing the operation corresponding to the sample network layer;
  • the processor 801 may also be called a CPU.
  • the processor 801 may be an integrated circuit chip with signal processing capability.
  • the processor 801 may also be a general processor, DSP, ASIC, FPGA, GPU or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the processor 801 may be jointly implemented by integrated circuit chips.
  • Embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is run by a processor, the steps of the neural network reasoning method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile computer-readable storage medium or a non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product, the computer program product carries program code, and the instructions included in the program code can be used to execute the steps of the neural network reasoning method described in the method embodiment above.
  • For details, please refer to the above-mentioned method embodiments.
  • the above-mentioned computer program product may be specifically implemented by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium, and in other embodiments, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) and the like.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are illustrative.
  • the division of the units is a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the functions are realized in the form of software function units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, or other media capable of storing program code.
  • Embodiments of the present disclosure provide a neural network reasoning method and apparatus, a computer device, a computer-readable storage medium, and a computer program product. The neural network reasoning method includes: acquiring a target neural network to be deployed, and parsing the target neural network to determine network parameters corresponding to each network layer of the target neural network; determining, based on the network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer, wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing the operation corresponding to the sample network layer; and deploying the target neural network based on the target configuration parameters, and performing network reasoning based on the target neural network.
  • the above scheme saves the time for configuring parameters in the initialization phase of neural network deployment, thereby improving the deployment efficiency of the neural network.
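The candidate-selection behaviour described in the parts above (deploying the target neural network under each candidate sample network layer's configuration and keeping the configuration with the best measured operation result) can be sketched as follows. All names and structures here are illustrative assumptions, not the actual inference-engine implementation:

```python
import time

def select_target_config(candidates, deploy_fn, run_fn, warmup=1, iters=5):
    """Deploy under each candidate configuration, time it, keep the fastest.

    deploy_fn(config) is assumed to return a deployed network; run_fn(net)
    performs one inference pass. Both are placeholders for the real engine.
    """
    best_config, best_latency = None, float("inf")
    for config in candidates:
        net = deploy_fn(config)            # deployment mode for this candidate
        for _ in range(warmup):
            run_fn(net)                    # discard warm-up runs
        start = time.perf_counter()
        for _ in range(iters):
            run_fn(net)
        latency = (time.perf_counter() - start) / iters
        if latency < best_latency:
            best_config, best_latency = config, latency
    return best_config
```

Here "operation result" is reduced to average latency; a real engine could equally rank candidates by throughput or memory footprint.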

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

Provided are a neural network reasoning method and apparatus, a computer device, and a computer-readable storage medium. The neural network reasoning method comprises: acquiring a target neural network to be deployed, and parsing the target neural network to determine network parameters corresponding to the network layers of the target neural network; determining, on the basis of a network parameter corresponding to a target network layer and a predetermined correspondence between a sample network layer and a configuration parameter, a target configuration parameter corresponding to the target network layer, wherein the sample network layer is of the same type as the target network layer, and the configuration parameter corresponding to the sample network layer is configuration information of an algorithm used when an operation corresponding to the sample network layer is executed; and deploying the target neural network on the basis of the target configuration parameter, and performing network reasoning on the basis of the target neural network.

Description

Neural network reasoning method and apparatus, computer device, computer-readable storage medium, and computer program product

Cross-Reference to Related Applications

The embodiments of the present disclosure are based on, and claim priority to, the Chinese patent application No. 202111595072.1, filed on December 24, 2021 and entitled "Neural network reasoning method, apparatus, computer device and storage medium"; the entire content of the Chinese patent application is hereby incorporated into the present disclosure by reference.

Technical Field

The present disclosure relates to, but is not limited to, the field of computer technology, and in particular to a neural network reasoning method and apparatus, a computer device, a computer-readable storage medium, and a computer program product.

Background

With the development of deep learning, the variety of neural networks keeps growing, and so does the number of configuration parameters for convolution computation in neural networks. When deploying a neural network, an inference engine is usually used to optimize the configuration parameters of different computations, which can improve the performance of neural network inference.
Summary

Embodiments of the present disclosure provide a neural network reasoning method and apparatus, a computer device, a computer-readable storage medium, and a computer program product.

An embodiment of the present disclosure provides a neural network reasoning method, including:

acquiring a target neural network to be deployed, and parsing the target neural network to determine network parameters corresponding to each network layer of the target neural network;

determining, based on the network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing the operation corresponding to the sample network layer; and

deploying the target neural network based on the target configuration parameters, and performing network reasoning based on the target neural network.

In this way, when deploying the target neural network, the target configuration parameters corresponding to the target network layer are determined automatically based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters, and the target neural network is deployed based on the target configuration parameters. This saves the time spent configuring parameters in the neural network deployment initialization stage, thereby improving the deployment efficiency of the neural network.
An embodiment of the present disclosure further provides a neural network reasoning apparatus, including:

a parsing part, configured to acquire a target neural network to be deployed, parse the target neural network, and determine network parameters corresponding to each network layer of the target neural network;

a determining part, configured to determine, based on the network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing the operation corresponding to the sample network layer; and

a reasoning part, configured to deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
An embodiment of the present disclosure further provides a computer device, including: a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the computer device runs, the processor communicates with the memory through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the above neural network reasoning method.

An embodiment of the present disclosure further provides a computer-readable storage medium, on which a computer program is stored; when the computer program is run by a processor, the steps of the above neural network reasoning method are performed.

An embodiment of the present disclosure further provides a computer program product, which includes a computer program or instructions; when the computer program or instructions run on an electronic device, the electronic device is caused to perform the steps of the above method.

For a description of the effects of the above neural network reasoning apparatus, computer device, computer-readable storage medium, and computer program product, refer to the description of the above neural network reasoning method.

To make the above objects, features and advantages of the present disclosure more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.

It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only, and do not limit the present disclosure.
Brief Description of the Drawings

To describe the technical solutions of the embodiments of the present disclosure more clearly, the following briefly introduces the accompanying drawings used in the embodiments. The accompanying drawings here are incorporated into and constitute a part of the specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings may be derived from these drawings without creative effort.
FIG. 1 is a flowchart of a neural network reasoning method provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of a method for determining target configuration parameters corresponding to a target network layer in the neural network reasoning method provided by an embodiment of the present disclosure;

FIG. 3 is a flowchart of a method for determining at least one candidate sample network layer corresponding to a target network layer in the neural network reasoning method provided by an embodiment of the present disclosure;

FIG. 4 is a flowchart of a method for deploying a target neural network in the neural network reasoning method provided by an embodiment of the present disclosure;

FIG. 5 is a flowchart of a method for generating target deployment code corresponding to a target neural network in the neural network reasoning method provided by an embodiment of the present disclosure;

FIG. 6 is a flowchart of a method for performing network reasoning in the neural network reasoning method provided by an embodiment of the present disclosure;

FIG. 7 is a schematic diagram of the architecture of a neural network reasoning apparatus provided by an embodiment of the present disclosure;

FIG. 8 is a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
Detailed Description

To make the objects, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the figures herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort shall fall within the protection scope of the present disclosure.

It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it does not require further definition or explanation in subsequent figures.

The term "and/or" herein merely describes an association relationship and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein indicates any one of multiple items, or any combination of at least two of multiple items; for example, including at least one of A, B and C may indicate including any one or more elements selected from the set consisting of A, B and C.

Research underlying the embodiments of the present disclosure has found that, in the related art, to obtain better inference performance, an inference engine often needs to traverse a large number of configuration parameter combinations in the preprocessing stage and actually deploy the neural network under each traversed configuration parameter combination, so as to select a better configuration parameter combination according to the test results after actual deployment. This makes the preprocessing stage time-consuming and reduces the deployment efficiency of the neural network.

Based on the above research, the present disclosure provides a neural network reasoning method and apparatus, a computer device, and a storage medium. When deploying a target neural network, the target configuration parameters corresponding to a target network layer are determined automatically based on the network parameters corresponding to the target network layer and a predetermined correspondence between sample network layers and configuration parameters, and the target neural network is deployed based on the target configuration parameters. This saves the time spent configuring parameters in the neural network deployment initialization stage, thereby improving the deployment efficiency of the neural network.

To facilitate understanding of the embodiments, a neural network reasoning method disclosed in the embodiments of the present disclosure is first introduced in detail. The execution subject of the neural network reasoning method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, such as a terminal device, a server, or another processing device; the terminal device may be User Equipment (UE), a mobile device, a user terminal, a terminal, or the like. In some possible implementations, the neural network reasoning method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to FIG. 1, which is a flowchart of a neural network reasoning method provided by an embodiment of the present disclosure, the method includes S101 to S103:

S101: Acquire a target neural network to be deployed, and parse the target neural network to determine network parameters corresponding to each network layer of the target neural network.

S102: Determine, based on the network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing the operation corresponding to the sample network layer.

S103: Deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
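Steps S101 to S103 can be sketched end to end as follows; the parser, correspondence table and deployment backend are placeholder assumptions, not the actual inference engine:

```python
def infer_with_auto_config(model, parse_fn, correspondence, deploy_fn, inputs):
    # S101: parse the target neural network into per-layer network parameters
    layer_params = parse_fn(model)          # e.g. {layer_name: params_key}
    # S102: look up the target configuration parameters for each target layer
    configs = {name: correspondence[key]
               for name, key in layer_params.items()
               if key in correspondence}
    # S103: deploy with the chosen configurations, then run inference
    deployed = deploy_fn(model, configs)
    return deployed(inputs)
```

Because S102 is a table lookup rather than an on-device search, the deployment initialization stage avoids benchmarking every configuration combination.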
The following is a detailed description of the above steps.
Regarding S101, the network parameters corresponding to each network layer of the target neural network include weight parameters, bias parameters, convolution parameters of a convolutional layer, activation parameters of an activation layer, and so on. By determining the network parameters corresponding to each network layer of the target neural network, the type of each network layer of the target neural network to be deployed and the parameter values of the network parameters corresponding to each network layer can be determined.

For example, by parsing the target neural network, it can be determined that the network parameters corresponding to a convolutional layer in the target neural network are convolution parameters such as the convolution operation volume, the convolution kernel size and the convolution stride, where the convolution operation volume can be represented by the height and width of the feature map participating in the convolution operation. The target neural network may denote one specific neural network among multiple alternative neural networks.
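As a concrete illustration of this parsing step for convolutional layers, the sketch below walks a hypothetical serialized model description and collects the convolution parameters named above (feature-map size as the operation volume, kernel size, stride). The layer-dict format is an assumption for illustration, not any real framework's format:

```python
def parse_conv_params(model_desc):
    """model_desc: a list of layer dicts, e.g. loaded from a serialized model."""
    params = {}
    for layer in model_desc:
        if layer["type"] == "conv":
            params[layer["name"]] = {
                # operation volume, represented by the feature map's height/width
                "feature_map": (layer["in_h"], layer["in_w"]),
                "kernel": layer["kernel"],
                "stride": layer["stride"],
            }
    return params
```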
In addition, when parsing the target neural network, the network parameters corresponding to each network layer of the target neural network and the hierarchical relationship between the network layers of the target neural network may also be determined, and network reasoning may then be performed based on the hierarchical relationship.
S102:基于目标网络层对应的网络参数、以及预先确定的样本网络层与配置参数之间的对应关系,确定与所述目标网络层对应的目标配置参数;其中,所述样本网络层与所述目标网络层的类型相同,所述样本网络层对应的配置参数为执行所述样本网络层对应的运算时的算法的配置信息。S102: Determine the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between the sample network layer and the configuration parameters; wherein, the sample network layer and the The types of the target network layers are the same, and the configuration parameters corresponding to the sample network layers are configuration information of algorithms when performing operations corresponding to the sample network layers.
这里,所述目标网络层可以是卷积层或者进行矩阵运算的网络层,由于这些网络层的运算量较大,因此可以设置相应的配置参数以提高运算效率,所述算法表示进行运算时的方法,包括是否使用特定的运算机制、运算时各运算单元所执行的运算量等。其中,目标网络层可以表示目标神经网络中的多个网络层中的特定的网络层,目标配置参数可以表示与目标网络层对应的配置参数。Here, the target network layer can be a convolutional layer or a network layer performing matrix operations. Since these network layers have a large amount of computation, corresponding configuration parameters can be set to improve computing efficiency. The algorithm represents the Methods, including whether to use a specific computing mechanism, the amount of computing performed by each computing unit during computing, etc. Wherein, the target network layer may represent a specific network layer among multiple network layers in the target neural network, and the target configuration parameter may represent a configuration parameter corresponding to the target network layer.
Here, the configuration parameters corresponding to the sample network layer are the preferred configuration parameters for neural network deployment of the sample network layer under each network parameter setting; for example, they may be the optimal solution for the configuration parameters corresponding to the target network layer.

Exemplarily, taking the target network layer and the sample network layer both being convolutional layers as an example, the configuration parameters corresponding to the sample network layer include: the amount of convolution computation performed by each Compute Unified Device Architecture (CUDA) computing unit in a Graphics Processing Unit (GPU), the amount of convolution computation performed by each minimum computing unit, whether a double-buffering mechanism is used for the convolution operation, whether a split mechanism is used to decompose the convolution computation, the iteration step size of each loop computation of each CUDA computing unit, the iteration step size of each loop computation of each minimum computing unit, and so on.

In some embodiments, multiple sample network layers with different network parameters can be determined exhaustively. Before the neural network is deployed, the correspondence between the sample network layers and the configuration parameters can be predetermined in the neural network reasoning engine. For any sample network layer, the configuration parameters corresponding to that sample network layer may be the configuration parameters used when deploying a neural network whose running result satisfies a preset condition; the preset condition may be that the inference speed of the deployed neural network is greater than a preset threshold, where the preset threshold may be a positive number.
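The predetermined correspondence described above can be pictured as a lookup table built offline by exhaustively trial-running each sample-layer setting. A minimal Python sketch, with entirely hypothetical parameter names and values:

```python
# Hypothetical table: a sample layer's network parameters -> its preferred
# configuration parameters, determined offline before deployment.
SAMPLE_CONFIG_TABLE = {
    # (feature_h, feature_w, kernel, stride) -> configuration parameters
    (56, 56, 3, 1): {"cuda_tile": 128, "unit_tile": 8, "double_buffer": True,  "split_k": False},
    (28, 28, 3, 2): {"cuda_tile": 64,  "unit_tile": 4, "double_buffer": True,  "split_k": True},
    (14, 14, 1, 1): {"cuda_tile": 32,  "unit_tile": 4, "double_buffer": False, "split_k": False},
}

def lookup_config(net_params):
    """Return the preferred configuration for an exactly matching sample layer,
    or None if no sample layer with these network parameters was recorded."""
    return SAMPLE_CONFIG_TABLE.get(tuple(net_params))

print(lookup_config((28, 28, 3, 2)))
```

In practice the table is queried by similarity rather than exact match, as the following steps describe.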
It should be noted that the device used when deploying a neural network with the sample network layers is a test device, not the target deployment device on which the target network layer is actually deployed. Because of hardware differences between devices, the same configuration parameters may produce different running results on different deployment devices; that is, if the target neural network is deployed with configuration parameters that produced better running results on the test device, the final running result is not necessarily better.

In this way, before the neural network is deployed, the preferred configuration parameters for deploying the sample network layers under each network parameter setting can be obtained. Since the sample network layer is of the same type as the target network layer, the target configuration parameters of the target network layer can subsequently be determined based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters. Therefore, a good solution for the configuration parameters of a specific convolutional layer or network layer performing matrix operations (the target network layer) can be selected quickly, saving the time spent configuring parameters during the initialization stage of neural network deployment.
In some implementations, as shown in FIG. 2, the target configuration parameters corresponding to the target network layer may be determined through the following steps:

S201: Determine at least one candidate sample network layer corresponding to the target network layer based on the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layers.
Here, the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to a sample network layer may be the cosine similarity between the network parameters.

In some embodiments, one or more of the network parameters may be used when determining the similarity. When multiple network parameters are used, the similarity of each network parameter selected for determining the similarity can be determined separately, a weighted sum of the per-parameter similarities can be computed, and the weighted sum can be taken as the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layer.
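The weighted-sum similarity described above can be sketched as follows, under the assumption that each network parameter is represented as a small numeric vector (e.g., feature-map dimensions); the parameter names and weights are illustrative, not from the source:

```python
import math

def cosine(u, v):
    """Cosine similarity between two numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def layer_similarity(target, sample, weights):
    """Per-parameter cosine similarities combined by a weighted sum."""
    return sum(w * cosine(target[name], sample[name])
               for name, w in weights.items())

target = {"fmap": (56, 56), "conv": (3, 3, 1)}   # kernel h, kernel w, stride
sample = {"fmap": (48, 64), "conv": (3, 3, 2)}
print(layer_similarity(target, sample, {"fmap": 0.6, "conv": 0.4}))
```

With weights summing to 1, an identical layer scores 1.0 and dissimilar layers score lower, which matches the thresholding step that follows.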
In some implementations, as shown in FIG. 3, the at least one candidate sample network layer corresponding to the target network layer may be determined through the following steps:

S2011: Determine the sample network layers whose similarity is greater than a preset similarity as initial sample network layers.

Exemplarily, suppose the preset similarity is 0.8 and the similarities between sample network layers 1 to 4 and the target network layer are 0.6, 0.9, 0.75, and 0.85, respectively. Then sample network layer 2 and sample network layer 4 can be determined as the initial sample network layers, yielding a candidate set containing multiple initial sample network layers.
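A minimal sketch of this thresholding step, reproducing the numbers from the example above:

```python
def initial_candidates(similarities, preset=0.8):
    """Keep the sample layers whose similarity to the target layer
    exceeds the preset similarity threshold."""
    return [layer for layer, s in similarities.items() if s > preset]

sims = {"sample_1": 0.6, "sample_2": 0.9, "sample_3": 0.75, "sample_4": 0.85}
print(initial_candidates(sims))  # sample layers 2 and 4 survive
```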
Exemplarily, the optimal solution is selected and recorded by traversing the candidate set. Here, to improve the accuracy of the solution matching the configuration parameters corresponding to the network layer, the candidate set can be reduced by pruning.

S2012: Determine the at least one candidate sample network layer from the initial sample network layers based on a configuration-information screening condition matching the network parameters corresponding to the target network layer and on the configuration parameters corresponding to each initial sample network layer.

Here, the configuration-information screening condition can be obtained through data analysis. For example, data analysis of the network parameters corresponding to the target network layer can yield the maximum and minimum values of each configuration parameter corresponding to those network parameters; the corresponding screening condition is then that the value of each configuration parameter must be no greater than the corresponding maximum and no less than the corresponding minimum.
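The min/max screening condition can be sketched as a pruning pass over the initial sample layers; the configuration parameter names and bounds below are hypothetical:

```python
def screen(candidates, bounds):
    """Keep only candidates whose every configuration parameter lies inside
    the [min, max] range derived for the target layer's network parameters."""
    kept = []
    for cand in candidates:
        ok = all(lo <= cand["config"][k] <= hi for k, (lo, hi) in bounds.items())
        if ok:
            kept.append(cand)
    return kept

bounds = {"cuda_tile": (32, 128), "unit_tile": (4, 8)}
cands = [
    {"name": "a", "config": {"cuda_tile": 64,  "unit_tile": 4}},
    {"name": "b", "config": {"cuda_tile": 256, "unit_tile": 8}},  # pruned: tile too large
]
print([c["name"] for c in screen(cands, bounds)])
```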
In the embodiments of S2011 to S2012, by setting a configuration-information screening condition matching the network parameters corresponding to the target network layer, each initial sample network layer is screened and the at least one candidate sample network layer is determined from the initial sample network layers, which reduces the amount of computation when subsequently determining the configuration parameters of the target network layer, thereby improving the deployment efficiency of the neural network.

S202: For any candidate sample network layer, deploy the target neural network based on the configuration parameters corresponding to that candidate sample network layer, and determine the running result of the target neural network under the deployment mode corresponding to that candidate sample network layer.

In this way, the neural network reasoning engine can be used when deploying the neural network. After the target neural network is deployed on the target deployment device, the running result of the target neural network under the deployment mode corresponding to the candidate sample network layer can be determined; the running result may be inference speed, inference accuracy, and so on, and the deployment effect of the target neural network under that deployment mode can be determined from the running result.
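The trial runs of S202 can be sketched as a timing loop over the candidate configurations; `deploy_fn` and `run_fn` are stand-ins for the reasoning engine's deployment and execution interfaces, which the source does not specify:

```python
import time

def trial_run(deploy_fn, run_fn, candidates, warmup=3, iters=10):
    """Deploy the network once per candidate configuration and record its
    running result (here: average inference latency in seconds)."""
    results = {}
    for cand in candidates:
        net = deploy_fn(cand)        # deploy with this candidate's config
        for _ in range(warmup):
            run_fn(net)              # warm-up runs are not timed
        start = time.perf_counter()
        for _ in range(iters):
            run_fn(net)
        results[cand] = (time.perf_counter() - start) / iters
    return results
```

Warm-up iterations are included because the first runs of a freshly deployed network typically pay one-time initialization costs that would distort the measured result.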
S203: Determine a target candidate sample network layer based on the running results of the target neural network under the deployment modes corresponding to the candidate sample network layers, and take the configuration parameters corresponding to the target candidate sample network layer as the target configuration parameters.

In some embodiments, when determining the target candidate sample network layer based on the running results under the deployment modes corresponding to the candidate sample network layers, a target candidate sample network layer with a better running result can be determined from the candidate sample network layers according to a preset running-result evaluation rule, and the configuration parameters corresponding to the target candidate sample network layer are then taken as the target configuration parameters. The target candidate sample network layer may be a specific candidate sample network layer among the at least one candidate sample network layer.

Exemplarily, taking the running result being inference speed as an example, the corresponding running-result evaluation rule may be to select a candidate sample network layer whose inference speed is greater than a preset threshold as the target candidate sample network layer.
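A sketch of this evaluation rule, assuming the running result recorded for each candidate is an inference speed in runs per second (the tie-breaking choice of the fastest qualifying candidate is an illustrative assumption):

```python
def pick_target_candidate(results, threshold):
    """Evaluation rule: among candidates whose inference speed exceeds the
    preset threshold, pick the fastest; None if no candidate qualifies."""
    fast = {c: v for c, v in results.items() if v > threshold}
    if not fast:
        return None
    return max(fast, key=fast.get)

speeds = {"cand_a": 120.0, "cand_b": 95.0, "cand_c": 140.0}
print(pick_target_candidate(speeds, threshold=100.0))  # -> cand_c
```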
In the embodiments of S201 to S203, a candidate set containing at least one candidate sample network layer is determined, and trial runs (i.e., neural network deployments using the configuration parameters of the sample network layers) and selection of the relevant configuration parameters are performed based on the candidate set. Compared with the related art, which performs trial runs exhaustively and then selects the preferred configuration parameters, fewer trial runs are needed during the initialization stage of neural network deployment, saving parameter configuration time in that stage; fast parameter configuration enables fast algorithm selection for the target network layer, thereby improving the deployment efficiency of the neural network.

S103: Deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.

Here, network reasoning means performing data processing on input data based on the target neural network, so as to obtain a data processing result corresponding to the input data.

Exemplarily, taking the target neural network being an image recognition network as an example, after the target neural network is deployed, a picture containing a cat is input into the target neural network, and through the network reasoning of the target neural network the inference result "cat" can be obtained.
In some implementations, as shown in FIG. 4, the target neural network can be deployed through the following steps:

S401: Determine, based on the target configuration parameters, first deployment code corresponding to the target network layer; and determine, based on the network parameters of the network layers of the target neural network other than the target network layer, second deployment code corresponding to those other network layers.

Here, the first deployment code and the second deployment code are code that the central processing unit can recognize. By generating the first deployment code and the second deployment code, the deployment configuration of the target neural network can be recorded in the central processing unit, where the central processing unit is the device on which the neural network reasoning engine is deployed and is used to deploy the target neural network.
In some embodiments, when determining the first deployment code corresponding to the target network layer based on the target configuration parameters, the target configuration parameters corresponding to the target network layer can be encapsulated based on a preset code encapsulation rule to determine the first deployment code corresponding to the target network layer. When determining the second deployment code corresponding to the other network layers based on the network parameters of the network layers of the target neural network other than the target network layer, the second deployment code corresponding to the other network layers can be obtained from the neural network reasoning engine according to the network parameters of those other network layers. Here, the other network layers may be any one network layer of the target neural network other than the target network layer, or two or more such network layers.

The code encapsulation rule may define a template used for code encapsulation. When encapsulating the target configuration parameters corresponding to the target network layer based on the preset code encapsulation rule, the target configuration parameters can be added at the corresponding positions in the template according to the correspondence between the target configuration parameters and the template, thereby generating the first deployment code corresponding to the target network layer. Here, the code encapsulation rule may represent the correspondence between the target configuration parameters and the code encapsulation template.
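A minimal sketch of template-based encapsulation, using Python string formatting as the stand-in encapsulation rule; the CUDA-like template text and slot names are purely illustrative:

```python
# Hypothetical code-encapsulation rule: a kernel template whose slots are
# filled from the target configuration parameters to form the first
# deployment code. Double braces render as literal braces in the output.
CONV_TEMPLATE = """
__global__ void conv_kernel(const float* in, const float* w, float* out) {{
    // tile sizes chosen by the target configuration parameters
    const int CUDA_TILE = {cuda_tile};
    const int UNIT_TILE = {unit_tile};
    // ... convolution body elided ...
}}
"""

def encapsulate(config):
    """Add each configuration parameter at its corresponding template slot."""
    return CONV_TEMPLATE.format(**config)

code = encapsulate({"cuda_tile": 128, "unit_tile": 8})
print("const int CUDA_TILE = 128;" in code)
```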
In some implementations, when performing code encapsulation, fusion information corresponding to the target network layer can also be obtained, and code encapsulation can be performed, according to the fusion information, on the target network layer together with the other network layers that have a fusion relationship with the target network layer.

Exemplarily, taking the target network layer being a convolutional layer as an example, if it can be determined from the fusion information that the other network layer having a fusion relationship with the convolutional layer is an activation layer, then the configuration parameters corresponding to the convolutional layer and the network parameters corresponding to the activation layer can be encapsulated together during code encapsulation, which can improve the generation efficiency of the initial deployment code.

Exemplarily, before code encapsulation, the convolution code can be split into multiple parts. In this way, these split code parts can be spliced together to obtain the deployment code.

In this way, by performing code encapsulation through preset encapsulation rules, the encapsulated code can be generated automatically when deploying the neural network, without adding the corresponding code to the neural network reasoning engine in advance, which can reduce the space occupied by the neural network reasoning engine and improve the deployment efficiency of the neural network.
S402: Generate, based on the first deployment code and the second deployment code, target deployment code corresponding to the target neural network, and add the target deployment code to a target deployment device.

Here, the target deployment device may be a hardware device that can be used for neural network deployment, such as a graphics processor; once the target deployment code has been added to the target deployment device, the deployment of the target neural network is complete.

The target deployment code may refer to the specific deployment code generated from the first deployment code and the second deployment code.

In the embodiments of S401 to S402, by automatically generating the deployment code and fusing the generated code, the storage space of the neural network reasoning engine can be saved compared with the related art in which the neural network reasoning engine stores all of the deployment code, thereby improving the deployment efficiency of the neural network.

Moreover, compared with fully static compilation, the corresponding code can be generated according to the effect of fusing convolution/matrix operations, so the performance of neural network reasoning is improved.
In some implementations, as shown in FIG. 5, the target deployment code corresponding to the target neural network can be generated through the following steps:

S4021: Splice the first deployment code and the second deployment code to determine initial deployment code.

Here, when splicing the first deployment code and the second deployment code, they can be spliced according to the connection relationships between the network layers of the target neural network; the spliced deployment code is the initial deployment code.
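The splicing step can be sketched as a traversal that follows the connection relationships between layers, emitting each layer's code after the code of the layers it depends on; the layer names and the predecessor-list encoding of the connections are assumptions:

```python
def splice(layer_codes, connections):
    """Concatenate per-layer deployment code following the connection
    relationships of the network (each layer lists its input layers)."""
    order, seen = [], set()

    def visit(layer):
        if layer in seen:
            return
        seen.add(layer)
        for pred in connections.get(layer, []):
            visit(pred)           # emit inputs before the layer itself
        order.append(layer)

    for layer in layer_codes:
        visit(layer)
    return "\n".join(layer_codes[l] for l in order)

codes = {"conv1": "// conv1", "relu1": "// relu1", "fc": "// fc"}
deps = {"relu1": ["conv1"], "fc": ["relu1"]}
print(splice(codes, deps))
```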
S4022: Call a target interface function of the target deployment device to compile the initial deployment code and generate the target deployment code, where the target deployment code is code that runs on the target deployment device.

Here, the target interface function may be a specific interface function of the target deployment device.

Exemplarily, taking the target deployment device being a graphics processor as an example, a target interface function of the NVRTC interface can be called to compile the initial deployment code and generate target deployment code that can run on the graphics processor.

In the embodiments of S4021 to S4022, by splicing the automatically generated first deployment code and second deployment code, and calling the target interface function to compile the spliced initial deployment code, the target deployment code used to deploy the target neural network can be generated in real time, thereby improving the deployment efficiency of the neural network.
In some implementations, as shown in FIG. 6, network reasoning can be performed through the following steps:

S601: Receive deployment information corresponding to the target deployment code sent by the target deployment device; where the deployment information is used to describe the deployment locations of the code corresponding to each network layer of the target neural network.

Here, after receiving the target deployment code, the target deployment device can deploy the target deployment code on the target deployment device and send the deployment information to the neural network reasoning engine, so that the target deployment device can perform neural network reasoning after receiving an inference instruction.

S602: Perform neural network reasoning based on the deployment information and the hierarchical relationships between the network layers obtained by parsing the target neural network.
Here, when performing neural network reasoning based on the deployment information and the hierarchical relationships, the code corresponding to each item of deployment information can be run in turn according to the hierarchical relationships to perform neural network reasoning.

In some embodiments, the hierarchical relationships between the network layers of the target neural network obtained by parsing the target neural network may be obtained in the same parse that determines the network parameters corresponding to each network layer of the target neural network, or in another parse of the target neural network.

In some embodiments, the inference order among the at least one network layer that needs to be used when performing neural network reasoning can be determined according to the hierarchical relationships, and the neural network reasoning engine can send inference instructions to the target deployment device in turn according to the inference order, instructing the target deployment device to run the corresponding code in that order to perform neural network reasoning.
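The instruction-driven reasoning loop above can be sketched as follows; the engine class, the fake device, and the deployment-location encoding are all hypothetical stand-ins for the interfaces described in the text:

```python
class ReasoningEngine:
    """Sketch: the engine walks the inference order derived from the
    hierarchical relationships and issues one inference instruction per
    layer, each carrying the deployment location of that layer's code."""

    def __init__(self, order, deploy_info, device):
        self.order = order              # layers in inference order
        self.deploy_info = deploy_info  # layer -> code deployment location
        self.device = device

    def infer(self, x):
        for layer in self.order:
            x = self.device.run(self.deploy_info[layer], x)
        return x

class FakeDevice:
    """Hypothetical deployment device: 'runs' the code at a given
    location by applying a stored function to the data."""
    def __init__(self, kernels):
        self.kernels = kernels
    def run(self, location, x):
        return self.kernels[location](x)

device = FakeDevice({"addr0": lambda x: x + 1, "addr1": lambda x: x * 2})
engine = ReasoningEngine(["conv", "fc"], {"conv": "addr0", "fc": "addr1"}, device)
print(engine.infer(3))  # (3 + 1) * 2 = 8
```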
In the neural network reasoning method provided by the embodiments of the present disclosure, when deploying the target neural network, the target configuration parameters corresponding to the target network layer are automatically determined based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters, and the target neural network is deployed based on the target configuration parameters, which saves the time spent configuring parameters during the initialization stage of neural network deployment, thereby improving the deployment efficiency of the neural network.

Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.

Based on the same inventive concept, the embodiments of the present disclosure also provide a neural network reasoning apparatus corresponding to the neural network reasoning method. Since the problem-solving principle of the apparatus in the embodiments of the present disclosure is similar to that of the above neural network reasoning method of the embodiments of the present disclosure, the implementation of the apparatus can refer to the implementation of the method.

Referring to FIG. 7, which is a schematic diagram of the architecture of a neural network reasoning apparatus provided by an embodiment of the present disclosure, the parts included in the apparatus can be implemented by a processor in a computer device, or of course by specific logic circuits; in implementation, the processor may be a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a GPU, or the like. The neural network reasoning apparatus includes: a parsing part 701, a determining part 702, and a reasoning part 703, where:
the parsing part 701 is configured to acquire a target neural network to be deployed, parse the target neural network, and determine the network parameters corresponding to each network layer of the target neural network;

the determining part 702 is configured to determine target configuration parameters corresponding to a target network layer based on the network parameters corresponding to the target network layer and a predetermined correspondence between sample network layers and configuration parameters, where the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of the algorithm used when performing the operation corresponding to the sample network layer;

the reasoning part 703 is configured to deploy the target neural network based on the target configuration parameters and perform network reasoning based on the target neural network.
In some implementations, the determining part 702, when determining the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters, is configured to:

determine at least one candidate sample network layer corresponding to the target network layer based on the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layers;

for any candidate sample network layer, deploy the target neural network based on the configuration parameters corresponding to that candidate sample network layer, and determine the running result under the deployment mode corresponding to that candidate sample network layer; and

determine a target candidate sample network layer based on the running results of the target neural network under the deployment modes corresponding to the candidate sample network layers, and take the configuration parameters corresponding to the target candidate sample network layer as the target configuration parameters.

In some implementations, the determining part 702, when determining the at least one candidate sample network layer corresponding to the target network layer based on the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layers, is configured to:

determine the sample network layers whose similarity is greater than a preset similarity as initial sample network layers; and

determine the at least one candidate sample network layer from the initial sample network layers based on a configuration-information screening condition matching the network parameters corresponding to the target network layer and on the configuration parameters corresponding to each initial sample network layer.
在一些实施方式中,所述推理部分703,在基于所述目标配置参数部署所述目标神经网络时,被配置为:In some implementations, the reasoning part 703, when deploying the target neural network based on the target configuration parameters, is configured to:
基于所述目标配置参数,确定所述目标网络层对应的第一部署代码;以及,基于所述目标神经网络中除所述目标网络层外的其他网络层的网络参数,确定所述其他网络层对应的第二部署代码;Based on the target configuration parameters, determine the first deployment code corresponding to the target network layer; and, based on the network parameters of other network layers in the target neural network except the target network layer, determine the other network layers The corresponding second deployment code;
基于所述第一部署代码和所述第二部署代码,生成所述目标神经网络对应的目标部署代码,并将所述目标部署代码添加至目标部署设备。Based on the first deployment code and the second deployment code, generate a target deployment code corresponding to the target neural network, and add the target deployment code to a target deployment device.
In some implementations, when determining the first deployment code corresponding to the target network layer based on the target configuration parameters and the network parameters, the reasoning part 703 is configured to:
encapsulate the target configuration parameters corresponding to the target network layer based on a preset code encapsulation rule, to determine the first deployment code corresponding to the target network layer.
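One way to picture "encapsulating the target configuration parameters under a preset code encapsulation rule" is a fixed code template (the rule) filled with the layer's configuration to yield the first deployment code. The template text and parameter names below are pure assumptions for illustration.

```python
# Hypothetical preset encapsulation rule: a kernel-call template whose
# placeholders are filled from the target layer's configuration parameters.
KERNEL_TEMPLATE = (
    "conv_kernel(input, output, tile_m={tile_m}, tile_n={tile_n}, "
    "algo=\"{algo}\");"
)

def encapsulate_first_deployment_code(target_config: dict) -> str:
    # substitute the configuration parameters into the template
    return KERNEL_TEMPLATE.format(**target_config)
```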
In some implementations, when generating the target deployment code corresponding to the target neural network based on the first deployment code and the second deployment code, the reasoning part 703 is configured to:
splice the first deployment code and the second deployment code to determine initial deployment code;
call a target interface function of the target deployment device to compile the initial deployment code and generate the target deployment code, wherein the target deployment code is code that runs on the target deployment device.
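The two steps above can be sketched as: splice the first and second deployment code into initial deployment code, then hand it to the target deployment device's interface function (modeled here as a caller-supplied `compile_fn`) to obtain code runnable on that device. Names are illustrative assumptions.

```python
from typing import Callable

def build_target_deployment_code(first_code: str, second_code: str,
                                 compile_fn: Callable[[str], bytes]) -> bytes:
    # splice the per-layer code fragments into the initial deployment code
    initial_code = "\n".join([first_code, second_code])
    # compile via the target deployment device's interface function
    return compile_fn(initial_code)
```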
In some implementations, before performing network reasoning based on the target neural network, the reasoning part 703 is further configured to:
receive deployment information corresponding to the target deployment code sent by the target deployment device, wherein the deployment information describes the deployment locations of the code corresponding to the network layers of the target neural network;
the parsing part 701, when parsing the target neural network to determine the network parameters respectively corresponding to the network layers of the target neural network, is configured to:
parse the target neural network to determine the network parameters respectively corresponding to the network layers of the target neural network and the hierarchical relationship between the network layers of the target neural network;
the reasoning part 703, when performing network reasoning based on the target neural network, is configured to:
perform neural network reasoning based on the deployment information and the hierarchical relationship.
In some implementations, when performing neural network reasoning based on the deployment information and the hierarchical relationship, the reasoning part 703 is configured to:
run the code corresponding to each piece of deployment information in turn according to the hierarchical relationship, so as to perform neural network reasoning.
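Inference driven by the deployment information and the hierarchical relationship can be sketched as visiting the layers in hierarchy order and running the deployed code located by each layer's deployment information, feeding each output to the next layer. All names below are assumptions for illustration.

```python
from typing import Callable, Dict, List

def run_inference(hierarchy: List[str],
                  deployment_info: Dict[str, object],
                  run_deployed_code: Callable[[object, object], object],
                  network_input: object) -> object:
    x = network_input
    for layer in hierarchy:  # follow the hierarchical relationship
        # deployment_info[layer] locates the deployed code for this layer
        x = run_deployed_code(deployment_info[layer], x)
    return x
```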
When deploying a target neural network, the neural network reasoning apparatus provided by the embodiments of the present disclosure automatically determines the target configuration parameters corresponding to the target network layer based on the network parameters corresponding to the target network layer and a predetermined correspondence between sample network layers and configuration parameters, and deploys the target neural network based on the target configuration parameters. This saves the time spent configuring parameters in the initialization stage of neural network deployment, thereby improving deployment efficiency. For descriptions of the processing flow of each part of the apparatus and the interaction flow between the parts, reference may be made to the relevant descriptions in the above method embodiments, which are not repeated here.
In the embodiments of the present disclosure and other embodiments, a "part" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a unit, and may be modular or non-modular.
Based on the same technical concept, an embodiment of the present disclosure further provides a computer device. Referring to FIG. 8, which is a schematic structural diagram of a computer device 800 provided by an embodiment of the present disclosure, the computer device includes a processor 801, a memory 802, and a bus 803. The memory 802 is configured to store execution instructions and includes an internal memory 8021 and an external memory 8022. The internal memory 8021 is configured to temporarily store computation data in the processor 801 and data exchanged with the external memory 8022 such as a hard disk; the processor 801 exchanges data with the external memory 8022 through the internal memory 8021. When the computer device 800 runs, the processor 801 communicates with the memory 802 through the bus 803, causing the processor 801 to execute the following instructions:
acquire a target neural network to be deployed, and parse the target neural network to determine network parameters respectively corresponding to network layers of the target neural network;
determine, based on network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing an operation corresponding to the sample network layer;
deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
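The three instructions above form a pipeline: parse, look up the configuration, deploy, infer. The hypothetical sketch below wires them together with caller-supplied callables; none of these names come from the patent.

```python
from typing import Callable

def deploy_and_infer(network: object,
                     parse: Callable, lookup_config: Callable,
                     deploy: Callable, infer: Callable,
                     network_input: object) -> object:
    layer_params = parse(network)                 # per-layer network parameters
    target_config = lookup_config(layer_params)   # from the sample-layer mapping
    engine = deploy(network, target_config)       # deploy with the target config
    return infer(engine, network_input)           # perform network reasoning
```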
Here, the processor 801 may also be referred to as a CPU. The processor 801 may be an integrated circuit chip with signal processing capability. The processor 801 may also be a general-purpose processor, a DSP, an ASIC, an FPGA, a GPU, another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or any conventional processor. In addition, the processor 801 may be jointly implemented by integrated circuit chips.
An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the neural network reasoning method described in the above method embodiments are executed. The storage medium may be a volatile or a non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides a computer program product carrying program code; the instructions included in the program code can be used to execute the steps of the neural network reasoning method described in the above method embodiments, for which reference may be made to the above method embodiments.
The computer program product may be implemented by hardware, software, or a combination thereof. In some implementations, the computer program product is embodied as a computer storage medium; in other implementations, it is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system and apparatus described above, reference may be made to the corresponding processes in the foregoing method embodiments. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The apparatus embodiments described above are illustrative. For example, the division of units is a division by logical function, and there may be other divisions in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part contributing to the related art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only specific implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the technical field may, within the technical scope disclosed herein, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Industrial Applicability
Embodiments of the present disclosure provide a neural network reasoning method and apparatus, a computer device, a computer-readable storage medium, and a computer program product. The neural network reasoning method includes: acquiring a target neural network to be deployed, and parsing the target neural network to determine network parameters respectively corresponding to network layers of the target neural network; determining, based on network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer, wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing an operation corresponding to the sample network layer; and deploying the target neural network based on the target configuration parameters, and performing network reasoning based on the target neural network. The above solution saves the time spent configuring parameters in the initialization stage of neural network deployment, thereby improving the deployment efficiency of the neural network.

Claims (12)

  1. A neural network reasoning method, comprising:
    acquiring a target neural network to be deployed, and parsing the target neural network to determine network parameters respectively corresponding to network layers of the target neural network;
    determining, based on network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing an operation corresponding to the sample network layer;
    deploying the target neural network based on the target configuration parameters, and performing network reasoning based on the target neural network.
  2. The method according to claim 1, wherein the determining, based on the network parameters corresponding to the target network layer and the predetermined correspondence between sample network layers and configuration parameters, the target configuration parameters corresponding to the target network layer comprises:
    determining at least one candidate sample network layer corresponding to the target network layer based on a similarity between the network parameters corresponding to the target network layer and sample network parameters corresponding to the sample network layers;
    for any of the candidate sample network layers, deploying the target neural network based on the configuration parameters corresponding to the candidate sample network layer, and determining a running result of the target neural network in the deployment mode corresponding to the candidate sample network layer;
    determining a target candidate sample network layer based on the running results of the target neural network in the deployment modes corresponding to the candidate sample network layers, and taking the configuration parameters corresponding to the target candidate sample network layer as the target configuration parameters.
  3. The method according to claim 2, wherein the determining the at least one candidate sample network layer corresponding to the target network layer based on the similarity between the network parameters corresponding to the target network layer and the sample network parameters corresponding to the sample network layers comprises:
    determining the sample network layers whose similarity is greater than a preset similarity as initial sample network layers;
    determining the at least one candidate sample network layer from the initial sample network layers based on a configuration-information screening condition matching the network parameters corresponding to the target network layer and the configuration parameters corresponding to each initial sample network layer.
  4. The method according to any one of claims 1 to 3, wherein the deploying the target neural network based on the target configuration parameters comprises:
    determining, based on the target configuration parameters, first deployment code corresponding to the target network layer; and determining, based on the network parameters of the network layers of the target neural network other than the target network layer, second deployment code corresponding to those other network layers;
    generating, based on the first deployment code and the second deployment code, target deployment code corresponding to the target neural network, and adding the target deployment code to a target deployment device.
  5. The method according to claim 4, wherein the determining, based on the target configuration parameters, the first deployment code corresponding to the target network layer comprises:
    encapsulating the target configuration parameters corresponding to the target network layer based on a preset code encapsulation rule, to determine the first deployment code corresponding to the target network layer.
  6. The method according to claim 4, wherein the generating, based on the first deployment code and the second deployment code, the target deployment code corresponding to the target neural network comprises:
    splicing the first deployment code and the second deployment code to determine initial deployment code;
    calling a target interface function of the target deployment device to compile the initial deployment code and generate the target deployment code, wherein the target deployment code is code that runs on the target deployment device.
  7. The method according to any one of claims 4 to 6, wherein before the performing network reasoning based on the target neural network, the method further comprises:
    receiving deployment information corresponding to the target deployment code sent by the target deployment device, wherein the deployment information describes deployment locations of the code corresponding to the network layers of the target neural network;
    the parsing the target neural network to determine the network parameters respectively corresponding to the network layers of the target neural network comprises:
    parsing the target neural network to determine the network parameters respectively corresponding to the network layers of the target neural network and a hierarchical relationship between the network layers of the target neural network;
    the performing network reasoning based on the target neural network comprises:
    performing neural network reasoning based on the deployment information and the hierarchical relationship.
  8. The method according to claim 7, wherein the performing neural network reasoning based on the deployment information and the hierarchical relationship comprises:
    running the code corresponding to each piece of deployment information in turn according to the hierarchical relationship, so as to perform neural network reasoning.
  9. A neural network reasoning apparatus, comprising:
    a parsing part, configured to acquire a target neural network to be deployed, and parse the target neural network to determine network parameters respectively corresponding to network layers of the target neural network;
    a determining part, configured to determine, based on network parameters corresponding to a target network layer and a predetermined correspondence between sample network layers and configuration parameters, target configuration parameters corresponding to the target network layer; wherein the sample network layer is of the same type as the target network layer, and the configuration parameters corresponding to the sample network layer are configuration information of an algorithm used when performing an operation corresponding to the sample network layer;
    a reasoning part, configured to deploy the target neural network based on the target configuration parameters, and perform network reasoning based on the target neural network.
  10. A computer device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the neural network reasoning method according to any one of claims 1 to 8 are executed.
  11. A computer-readable storage medium on which a computer program is stored, wherein when the computer program is run by a processor, the steps of the neural network reasoning method according to any one of claims 1 to 8 are executed.
  12. A computer program product comprising a computer program or instructions, wherein when the computer program or instructions run on an electronic device, the electronic device is caused to execute the steps of the neural network reasoning method according to any one of claims 1 to 8.
PCT/CN2022/090030 2021-12-24 2022-04-28 Neural network reasoning method and apparatus, and computer device, computer-readable storage medium and computer program product WO2023115776A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111595072.1A CN114398040A (en) 2021-12-24 2021-12-24 Neural network reasoning method, device, computer equipment and storage medium
CN202111595072.1 2021-12-24

Publications (1)

Publication Number Publication Date
WO2023115776A1 true WO2023115776A1 (en) 2023-06-29

Family

ID=81227241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090030 WO2023115776A1 (en) 2021-12-24 2022-04-28 Neural network reasoning method and apparatus, and computer device, computer-readable storage medium and computer program product

Country Status (2)

Country Link
CN (1) CN114398040A (en)
WO (1) WO2023115776A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114398040A (en) * 2021-12-24 2022-04-26 上海商汤科技开发有限公司 Neural network reasoning method, device, computer equipment and storage medium
CN116089095B (en) * 2023-02-28 2023-10-27 苏州亿铸智能科技有限公司 Deployment method for ReRAM neural network computing engine network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633785A (en) * 2018-06-21 2019-12-31 清华大学 Method and system for calculating convolutional neural network
CN111144561A (en) * 2018-11-05 2020-05-12 杭州海康威视数字技术股份有限公司 Neural network model determining method and device
CN111582454A (en) * 2020-05-09 2020-08-25 北京百度网讯科技有限公司 Method and device for generating neural network model
CN111985624A (en) * 2020-08-31 2020-11-24 商汤集团有限公司 Neural network training and deploying method, text translation method and related products
CN112947935A (en) * 2021-02-26 2021-06-11 上海商汤智能科技有限公司 Operation method and device, electronic device and storage medium
CN114398040A (en) * 2021-12-24 2022-04-26 上海商汤科技开发有限公司 Neural network reasoning method, device, computer equipment and storage medium


Also Published As

Publication number Publication date
CN114398040A (en) 2022-04-26

Similar Documents

Publication Publication Date Title
WO2023115776A1 (en) Neural network reasoning method and apparatus, and computer device, computer-readable storage medium and computer program product
WO2021098509A1 (en) Neural network joint compilation method, apparatus and electronic device
US11741361B2 (en) Machine learning-based network model building method and apparatus
US20200225921A1 (en) Lookup table optimization for programming languages that target synchronous digital circuits
Gao et al. Android malware detection via graphlet sampling
JP7394211B2 (en) Methods, devices, equipment, and media for parallel execution of smart contracts
CN112101529B (en) Deployment method and architecture for neural network model reasoning cross-platform
US11775862B2 (en) Tracking provenance in data science scripts
CN114968612B (en) Data processing method, system and related equipment
US20220198266A1 (en) Using disentangled learning to train an interpretable deep learning model
WO2023082644A1 (en) Network model processing method and apparatus, and device, storage medium and computer program product
CN113449299A (en) Projected vector modification as suppression of machine learning model string fill
CN114201107A (en) Storage device, method for operating storage device, and electronic device
CN117034273A (en) Android malicious software detection method and system based on graph rolling network
CN113961919A (en) Malicious software detection method and device
CN113672985A (en) Machine learning algorithm script compiling method and compiler for privacy protection
CN113312618A (en) Program vulnerability detection method and device, electronic equipment and medium
CN113971224A (en) Image retrieval system, method and related equipment
CN113326523A (en) Privacy calculation method and device and electronic equipment
US20200342287A1 (en) Selective performance of deterministic computations for neural networks
CN111966383A (en) Quantitative analysis method, system and medium for operating system kernel compatibility
EP4414901A1 (en) Model weight acquisition method and related system
Belyaev et al. LuNA-ICLU compiler for automated generation of iterative fragmented programs
US20230104356A1 (en) Model driven sub-system for design and execution of experiments
CN116755714B (en) Method, device, equipment and storage medium for operating deep neural network model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22909113

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE