WO2021134350A1 - Neural network model inference method and apparatus, computer device, and storage medium - Google Patents

Neural network model inference method and apparatus, computer device, and storage medium

Info

Publication number
WO2021134350A1
WO2021134350A1 · PCT/CN2019/130183 · CN2019130183W
Authority
WO
WIPO (PCT)
Prior art keywords
model
network
node
sub
standard
Prior art date
Application number
PCT/CN2019/130183
Other languages
English (en)
French (fr)
Inventor
庄奇
Original Assignee
深圳元戎启行科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳元戎启行科技有限公司
Priority to CN201980037513.0A (CN113811897B)
Priority to PCT/CN2019/130183
Publication of WO2021134350A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Definitions

  • This application relates to a neural network model inference method and apparatus, a computer device, and a storage medium.
  • An artificial neural network abstracts the neuron network of the human brain from the perspective of information processing, establishes a simplified model of it, and forms different networks according to different connection schemes.
  • Neural network models are widely used in speech recognition, image recognition, and natural language processing. With the development of computer technology, the network structure of neural network models has become more and more complex, and the number of layers has grown from dozens to thousands or even more.
  • The inventor has realized that as the network structure of a neural network model becomes more complex, inferring the model, that is, running the entire neural network model on input data to obtain output data, takes more and more time. How to reduce the inference time of neural network models and improve their inference speed has therefore become a technical problem that needs to be solved.
  • According to various embodiments disclosed in this application, a neural network model inference method is provided.
  • An inference method for a neural network model includes:
  • acquiring a model inference task, the model inference task carrying a model identifier;
  • parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model;
  • obtaining a preset standard sub-network corresponding to the model identifier;
  • traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized;
  • optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and
  • performing inference according to the optimized neural network model to obtain a model inference result.
  • An inference apparatus for a neural network model includes:
  • a task acquisition module, configured to acquire a model inference task, the model inference task carrying a model identifier;
  • a model parsing module, configured to parse the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model;
  • a model traversal module, configured to obtain a preset standard sub-network corresponding to the model identifier, and traverse the original model network according to the standard sub-network to obtain a sub-network to be optimized;
  • a model optimization module, configured to optimize the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and
  • a model inference module, configured to perform inference according to the optimized neural network model to obtain a model inference result.
  • A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to execute the following steps:
  • acquiring a model inference task, the model inference task carrying a model identifier; parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model; obtaining a preset standard sub-network corresponding to the model identifier; traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized; optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and performing inference according to the optimized neural network model to obtain a model inference result.
  • One or more non-volatile computer-readable storage media store computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • acquiring a model inference task, the model inference task carrying a model identifier; parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model; obtaining a preset standard sub-network corresponding to the model identifier; traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized; optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and performing inference according to the optimized neural network model to obtain a model inference result.
  • Fig. 1 is an application scenario diagram of a neural network model inference method according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of a neural network model inference method according to one or more embodiments.
  • Fig. 3 is a partial abstract graph of an original model network according to one or more embodiments.
  • Fig. 4 is a schematic flowchart of the step of optimizing the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain an optimized neural network model, according to an embodiment.
  • Fig. 5 is a block diagram of a neural network model inference apparatus according to one or more embodiments.
  • Fig. 6 is a block diagram of a computer device according to one or more embodiments.
  • The neural network model inference method provided in this application can be applied to a terminal or a server. Taking application to a terminal as an example:
  • The terminal can acquire a model inference task, the model inference task carrying a model identifier.
  • The terminal parses the neural network model corresponding to the model identifier to obtain the original model network corresponding to the neural network model.
  • The terminal obtains a preset standard sub-network corresponding to the model identifier, and traverses the original model network according to the standard sub-network to obtain a sub-network to be optimized.
  • The terminal optimizes the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain an optimized neural network model.
  • The terminal performs inference according to the optimized neural network model to obtain a model inference result.
  • The terminal may specifically include, but is not limited to, various personal computers, notebook computers, smartphones, and tablet computers.
  • Understandably, the neural network model inference method provided in this application realizes inference on a neural network model and can be applied in a variety of application environments, and the neural network model can be of multiple types.
  • For example, the neural network model may include a convolutional neural network model, a recurrent neural network model, and a recursive neural network model.
  • The neural network model can be used to process a variety of different data.
  • For example, the neural network model may specifically include an image recognition model, a feature extraction model, a speech recognition model, a text recognition model, and a scene classification model.
  • In one embodiment, the neural network model inference method provided in this application can be applied in the field of autonomous driving, where the neural network model may specifically include at least one of an image recognition model, a behavior prediction model, or a risk assessment model.
  • For example, the neural network model can be an image recognition model.
  • The neural network model inference method provided in this application can be applied in the application environment shown in Fig. 1.
  • An autonomous vehicle may include a sensor 102 and a terminal 104, and the sensor 102 can communicate with the terminal 104 through a connection established with it.
  • The sensor 102 can collect images of the environment within its visual range. For example, when the autonomous vehicle drives up to an intersection, the sensor 102 can collect traffic signal light images.
  • The terminal 104 performs image recognition on the signal light image collected by the sensor 102 and judges the color of the signal light in the image. Specifically, the terminal 104 may generate a model inference task according to the image recognition task, the model inference task carrying the model identifier corresponding to the image recognition model that needs to be called. The terminal 104 may obtain the neural network model belonging to the image recognition model according to the model identifier, parse the image recognition model, and obtain the original model network corresponding to the image recognition model. The terminal 104 obtains the preset standard image recognition sub-network, and traverses the original model network according to the standard image recognition sub-network to obtain the sub-network to be optimized in the image recognition model.
  • The terminal 104 optimizes the sub-network to be optimized in the image recognition model based on the target image recognition sub-network corresponding to the standard image recognition sub-network to obtain an optimized image recognition model.
  • The terminal 104 performs computational inference on the signal light image according to the optimized image recognition model, and obtains the color of the signal light in the signal light image.
  • When the real-time requirement on the model inference result is low, or the amount of data to be inferred by the neural network model is large, the above neural network model inference method can also be applied to a server.
  • For example, the neural network model inference method provided in this application can be applied in the field of natural language processing, where the neural network model can be a text classification model.
  • Specifically, a terminal may upload a text classification request to the server, and the server classifies the text to be classified according to the text classification request.
  • When the text to be classified needs to be classified by the text classification model, the server can generate a model inference task, the model inference task carrying the model identifier corresponding to the text classification model that needs to be invoked.
  • The server may obtain the pre-configured text classification model according to the model identifier, parse the text classification model, and obtain the original model network corresponding to the text classification model.
  • The server can obtain the preset standard classification sub-network corresponding to the model identifier, and traverse the original model network according to the standard classification sub-network to obtain the sub-network to be optimized corresponding to the text classification model.
  • The server may optimize the sub-network to be optimized based on the target classification sub-network corresponding to the standard classification sub-network to obtain an optimized text classification model.
  • The server can perform inference on the input text to be classified according to the optimized text classification model to obtain a text classification result.
  • In one embodiment, as shown in Fig. 2, an inference method for a neural network model is provided. Taking application of the method to a terminal as an example, the method includes the following steps:
  • Step 202: Obtain a model inference task, the model inference task carrying a model identifier.
  • Model inference refers to computing the data input to a neural network model according to the network structure order of the model, following in turn the operations corresponding to each of the multiple calculation layers it includes, so as to obtain the inference result output by the neural network model. A minimal sketch of this sequential execution is given below.
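  • As an illustration only, the following Python sketch shows this layer-by-layer execution; the Layer objects, their forward operation, and the list-based network order are assumptions for illustration, not structures defined by this application.

```python
# Minimal sketch: inference runs each calculation layer in the fixed network
# order, feeding each layer's output forward until the final result emerges.
# Layer objects and forward() are illustrative assumptions.

def run_inference(layers, input_data):
    """Run input data through the calculation layers in network order."""
    data = input_data
    for layer in layers:             # layers already follow the network order
        data = layer.forward(data)   # each intermediate result is kept in memory
    return data                      # the model inference result
```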
  • The terminal can obtain a model inference task and perform inference on the corresponding neural network model according to the task.
  • Specifically, when the user needs to perform model inference, the terminal may determine the neural network model specified by the user according to the received user operation instruction, and generate a model inference task carrying the model identifier.
  • The terminal can also determine the neural network model that needs to be called according to actual operating requirements, and generate the model inference task accordingly.
  • For example, during image recognition, when the image recognition model needs to be called to recognize an image, the terminal can generate a model inference task, input the image, perform inference on the image recognition model according to the task, and obtain the recognition result output by the image recognition model.
  • A model identifier is carried in the model inference task.
  • The model identifier refers to the mark corresponding to a neural network model, used to label that model; each neural network model has a unique corresponding model identifier.
  • In one embodiment, the terminal may include an inference engine, and the terminal may execute the model inference task through the inference engine to perform inference on the neural network model corresponding to the model identifier.
  • The inference engine refers to the functional module in the terminal used to complete inference.
  • Step 204: Parse the neural network model corresponding to the model identifier to obtain the original model network corresponding to the neural network model.
  • The terminal can parse the acquired model inference task to obtain the model identifier carried in it.
  • The terminal can obtain the neural network model corresponding to the model identifier according to the model inference task.
  • The neural network model can be trained in advance and configured in the terminal.
  • The neural network model can be stored in the memory or storage corresponding to the terminal.
  • The neural network model corresponding to the model identifier may include at least one of a variety of different neural network models. For example, depending on the network structure, it may specifically include at least one of a convolutional neural network model (Convolutional Neural Network, CNN), a recurrent neural network model (Recurrent Neural Network, RNN), and a recursive neural network model.
  • Depending on the function of the neural network model, it may specifically include at least one of an image recognition model, a feature extraction model, a speech recognition model, a text recognition model, and a scene classification model.
  • The terminal can read the neural network model corresponding to the model identifier from a storage location such as memory according to the model inference task, parse the read neural network model, and obtain the original model network corresponding to the neural network model.
  • The original model network refers to the neural network structure corresponding to the neural network model read by the terminal.
  • A neural network model can include multiple calculation layers, and each calculation layer can correspond to a data operation. Conditions or associations can exist between calculation layers. For example, the output of some calculation layers may be the input of corresponding other calculation layers.
  • The calculation layers included in the neural network model and the associations between them constitute the original model network corresponding to the neural network model.
  • In the original model network, the associations and the order between calculation layers are fixed, and different neural network models can have different network structures.
  • For example, a convolutional neural network model may specifically include an input layer, a convolutional layer, a pooling layer, a fully connected layer, and an output layer.
  • The terminal can represent the parsed original model network in various forms.
  • For example, the terminal may record the parsed original model network in the form of a list, where the order of the list can indicate the associations between calculation layers in the original model network.
  • The terminal can also generate a corresponding abstract graph according to the structure of the original model network.
  • As shown in Fig. 3, Fig. 3 is a partial abstract graph of the abstract graph corresponding to an original model network. Since the original model network corresponding to a complete neural network model is relatively large, a partial abstract graph is taken as an example.
  • The abstract graph corresponding to the original model network is a directed graph, and the calculation layers in the neural network model correspond to the nodes in the abstract graph.
  • The directed edges between nodes represent the input-output relationships between calculation layers, and the data output by a calculation layer can be input, along a directed edge, to the calculation layer the arrow points to.
  • Each node may include the calculation layer information of its corresponding calculation layer, and the calculation layer information includes at least one of the calculation layer identifier, calculation layer type, calculation layer attributes, and calculation layer conditions corresponding to that layer.
  • Correspondingly, the terminal may take the calculation layer information as the node information of the corresponding node. One possible in-memory shape of such a graph is sketched below.
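  • As an illustration only, the following sketch shows one way such an abstract graph could be held in memory; the Node fields and the dictionary-based graph are assumptions for illustration that borrow the "concat" example discussed later, not a data structure defined by this application.

```python
# Illustrative data structure for the abstract graph: one node per calculation
# layer, with directed edges recorded as each node's input node identifiers.
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str                                 # calculation layer identifier, e.g. "layer_2"
    node_type: str                               # calculation layer type, e.g. "concat"
    attrs: dict = field(default_factory=dict)    # calculation layer attributes, e.g. {"dim": 1}
    inputs: list = field(default_factory=list)   # node_ids whose output feeds this node

# An original model network is then a directed graph of such nodes.
original_model_network = {
    "layer_0": Node("layer_0", "conv"),
    "layer_1": Node("layer_1", "conv"),
    "layer_2": Node("layer_2", "concat", {"dim": 1}, ["layer_0", "layer_1"]),
}
```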
  • Step 206: Obtain a preset standard sub-network corresponding to the model identifier.
  • The terminal can obtain the standard sub-network corresponding to the model identifier; the model identifier of a neural network model can correspond to one or more standard sub-networks.
  • The standard sub-network can be preset by the user according to actual needs; the user can determine, from the original model network of the neural network model, a sub-network that can be optimized and take it as the standard sub-network.
  • The terminal can receive the standard sub-network entered by the user through an input device.
  • The standard sub-network is a partial network structure of the complete neural network model.
  • The standard sub-network can include multiple calculation layers and the associations between them.
  • The associations between calculation layers can specifically include the logical conditions and the input-output relationships between the calculation layers.
  • The standard sub-network can be preset by the user and stored in the terminal, and can be a subset of the original model network of the neural network model.
  • Corresponding to the original model network, the standard sub-network can also be represented in multiple forms.
  • For example, the terminal can record and represent the standard sub-network in the form of a list, where each row of the list can record a calculation layer of the standard sub-network and the order of the list can indicate the associations between calculation layers.
  • The terminal can also generate an abstract graph from the standard sub-network and represent the standard sub-network in that form.
  • Step 208: Traverse the original model network according to the standard sub-network to obtain a sub-network to be optimized.
  • The terminal can traverse the original model network according to the preset standard sub-network and filter out, in the original model network, the sub-network to be optimized that corresponds to the standard sub-network.
  • The sub-network to be optimized refers to the part of the original model network that can be optimized.
  • A neural network model includes a large number of calculation layers and its network structure is relatively complex, so the terminal takes a long time to infer it; the model inference speed is low, which is unfavorable for data processing that requires high real-time performance. For example, in the field of autonomous driving, data processing results need to be inferred quickly from neural network models.
  • Moreover, after each round of data processing according to the operation of one calculation layer, the terminal needs to temporarily store the processed data in memory. Since a neural network model has many calculation layers, this occupies a large amount of memory space.
  • The terminal can therefore traverse the sub-network to be optimized out of the original model network according to the standard sub-network, so as to optimize the original model network of the neural network model and reduce the number of calculation layers in it.
  • Specifically, the terminal may traverse the calculation layers in the original model network one by one in the order of the calculation layers in the standard sub-network, looking for the calculation layer corresponding to each standard sub-network calculation layer.
  • When a corresponding calculation layer is found, it is determined to be a calculation layer to be optimized.
  • Keeping its position in the original model network unchanged, the terminal continues downward to look for the calculation layer to be optimized corresponding to the next calculation layer in the standard sub-network, until all calculation layers to be optimized are found.
  • The terminal may record the network formed by the calculation layers to be optimized as the sub-network to be optimized; the sub-network to be optimized is a subset of the original model network.
  • For example, the standard sub-network and the original model network can both be represented as abstract graphs.
  • The terminal can obtain the standard nodes of the standard sub-network in the order of its abstract graph, and compare the original model nodes in the abstract graph of the original model network with the standard nodes one by one in the order of the original model network.
  • The terminal can record a successfully compared original model node as a node to be optimized, keep the position in the original model network abstract graph unchanged, obtain the next standard node, and continue comparing standard nodes with original model nodes downward, repeatedly looking for the original model node corresponding to each standard node until the nodes to be optimized corresponding to all standard nodes have been found in the original model network.
  • The terminal may record the network formed by the nodes to be optimized as the sub-network to be optimized.
  • In the standard sub-network, the multiple standard nodes are contiguous.
  • In the sub-network to be optimized, however, the nodes to be optimized may be contiguous or not; that is, between the nodes to be optimized there may be original model nodes that do not belong to the nodes to be optimized.
  • Step 210: Optimize the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain an optimized neural network model.
  • Step 212: Perform inference according to the optimized neural network model to obtain a model inference result.
  • The terminal can optimize the traversed sub-network to be optimized to obtain an optimized neural network model, thereby increasing the inference speed of the neural network model and reducing its inference time.
  • At the same time, compared with the original neural network model, the optimized neural network model simplifies the original model network, reduces the number of calculation layers, and prevents the output results of multiple calculation layers from unnecessarily occupying the terminal's memory.
  • Specifically, the terminal can obtain the target sub-network corresponding to the standard sub-network.
  • The target sub-network may be preset by the user according to actual needs, and there is a unique mapping between the target sub-network and the standard sub-network.
  • The target sub-network refers to the network structure obtained after optimizing the standard sub-network.
  • The terminal can optimize the sub-network to be optimized in the original model network according to the target sub-network to obtain an optimized neural network model.
  • Specifically, the terminal can obtain at least one input node and at least one output node corresponding to the sub-network to be optimized.
  • The terminal can delete the sub-network to be optimized from the original model network, and connect the input nodes and output nodes corresponding to the sub-network to be optimized with the target sub-network respectively, thereby replacing the sub-network to be optimized with the target sub-network and obtaining the optimized neural network model.
  • Compared with the sub-network to be optimized, the target sub-network simplifies the network structure and reduces the number of calculation layers, thereby simplifying the neural network model.
  • The terminal can perform inference based on the optimized neural network model, carrying out the operations of the calculation layers of the optimized model in turn to obtain the inferred data result. For example, based on an optimized image recognition model, the terminal may operate on an input image in the operation order corresponding to the optimized network structure to obtain the recognized image result.
  • In this embodiment, when the terminal obtains a model inference task and needs to perform inference on a neural network model, it parses the neural network model corresponding to the model identifier to obtain the original model network of the neural network model.
  • The terminal can obtain the preset standard sub-network and traverse the original model network according to it, so as to filter out of the original model network the sub-network to be optimized.
  • The terminal optimizes the sub-network to be optimized based on the target sub-network and obtains an optimized neural network model. Because the terminal optimizes the network structure of the neural network model before inferring it, the training gradients of the neural network model are not affected and its accuracy is guaranteed.
  • The terminal performs inference according to the optimized neural network model, which simplifies the network structure of the model and reduces its calculation layers, thereby effectively improving the inference speed of the neural network model, saving its inference time, and improving the real-time performance of model inference.
  • In one embodiment, before the preset standard sub-network corresponding to the model identifier is obtained, the above neural network model inference method further includes: obtaining a standard node association file; generating a network description script according to the standard node association file; and executing the network description script to generate the standard sub-network.
  • That is, before obtaining the preset standard sub-network, the terminal may also generate the standard sub-network according to actual needs. Specifically, the terminal may obtain a standard node association file, which is a template file used to record standard nodes and the associations between them. The user can enter the network structure of the standard sub-network following the specific format of the template file to obtain the standard node association file. Standard nodes are used to represent the calculation layers in the standard sub-network, each calculation layer corresponding to one node.
  • The standard node association file may specifically include a standard node file and a node condition file.
  • The standard node file and the node condition file can record standard nodes in a specific format.
  • For example, the standard node file may be a standard node list in the form of a data table, where each row of the list records one standard node, and the order of the list can indicate the logical order of and associations between the standard nodes.
  • The node condition file can likewise be a node condition list in the form of a data table, where each row of the list records the logical conditions a standard node needs to meet.
  • An input node identifier may include one or more node identifiers corresponding to input nodes.
  • A standard node attribute entry may include at least one node attribute corresponding to the standard node.
  • For example, "layer_0", "layer_1", and "layer_2" may respectively represent the node identifiers of three standard nodes, where the standard node whose identifier is "layer_2" has the type "concat" and concatenates the standard nodes "layer_0" and "layer_1",
  • and the node attribute dim of the standard node "layer_2" is equal to 1.
  • The standard node list may include at least two standard node identifiers.
  • The node condition list may include condition identifiers corresponding to multiple conditions.
  • The standard node identifiers may be the node identifiers corresponding to at least two standard nodes.
  • The standard node attributes refer to the node attributes corresponding to the standard nodes.
  • For example, the node condition with the condition identifier "Condition_0" may indicate that the attribute dim of the standard node "layer_3" should be equal to the attribute dim of the standard node "layer_4".
  • When the attribute dim of "layer_3" equals the attribute dim of "layer_4", the node condition "Condition_0" is met; otherwise it is not met. A hedged illustration of these two files follows.
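  • Since the application does not fix an on-disk format for these files, the following sketch is an assumption for illustration only, showing a standard node list and a node condition list as simple tables mirroring the examples above.

```python
# Standard node list: one row per standard node; the list order encodes the
# logical order of and associations between the standard nodes.
standard_node_list = [
    # node_id    type      input node ids           attributes
    ("layer_0", "conv",   [],                       {}),
    ("layer_1", "conv",   [],                       {}),
    ("layer_2", "concat", ["layer_0", "layer_1"],   {"dim": 1}),
]

# Node condition list: one row per logical condition a standard node must
# meet; "Condition_0" mirrors the example in the text, requiring the dim
# attribute of "layer_3" to equal the dim attribute of "layer_4".
node_condition_list = [
    ("Condition_0", "layer_3.dim == layer_4.dim"),
]
```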
  • The terminal can read the node information of the standard nodes in the standard node association file according to the file's specific format, and generate a network description script from that node information.
  • The terminal can execute the generated network description script and, by running it, generate the standard sub-network.
  • The standard sub-network can be used to represent the part of the network structure in the neural network model that can be optimized.
  • The terminal can optimize the original model network of the neural network model based on the generated standard sub-network, so as to simplify the network structure of the original model network and improve the inference speed of the neural network model.
  • In this embodiment, the terminal can obtain the standard node association file, generate a network description script based on the standard node information recorded in it, and execute the network description script to generate a standard sub-network.
  • The terminal can thus flexibly generate standard sub-networks according to standard node association files, improving the flexibility and diversity of the sub-networks to be optimized.
  • Using the standard sub-network helps filter out more complex sub-networks to be optimized, thereby deeply optimizing the network structure of the neural network model, effectively improving its inference speed, and saving the memory occupied when inferring it.
  • In one embodiment, the step of traversing the original model network according to the standard sub-network to obtain the sub-network to be optimized includes: topologically sorting the standard sub-network and the original model network to obtain a standard node sequence and an original model node sequence; matching the original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and generating the sub-network to be optimized according to the nodes to be optimized.
  • The terminal can traverse the original model network based on the standard sub-network and filter the sub-network to be optimized that corresponds to the standard sub-network out of the original model network.
  • Specifically, the terminal may perform topological sorting on the standard sub-network and the original model network respectively, obtaining the standard node sequence corresponding to the standard sub-network and the original model node sequence corresponding to the original model network.
  • Topological sorting refers to arranging the nodes of a directed network into a sequence that satisfies the topological order; the sequence obtained is a one-dimensional linear sequence.
  • The terminal may record the linear sequence obtained by topologically sorting the standard sub-network as the standard node sequence, and the linear sequence obtained by topologically sorting the original model network as the original model node sequence.
  • The order of the nodes in a linear sequence obtained by the terminal through topological sorting is fixed.
  • The terminal can match the original model nodes in the original model node sequence according to the standard node sequence to obtain the nodes to be optimized, and generate the sub-network to be optimized from them. Understandably, when the terminal obtains multiple preset standard sub-networks, it can match the standard node sequences corresponding to the multiple standard sub-networks against the original model node sequence respectively, and filter out of the original model network the sub-network to be optimized corresponding to each standard sub-network. A sketch of the topological sort is given below.
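  • The application does not prescribe a particular sorting algorithm, so the following sketch uses Kahn's algorithm over the Node structure assumed earlier; it is an illustration, not the implementation.

```python
from collections import deque

def topological_sort(network):
    """Arrange the nodes of a directed network (node_id -> Node) into a
    one-dimensional linear sequence that satisfies the topological order."""
    indegree = {nid: len(node.inputs) for nid, node in network.items()}
    # Sorting the initial zero-indegree nodes keeps the resulting sequence
    # fixed, matching the text's requirement of a fixed node order.
    queue = deque(sorted(nid for nid, deg in indegree.items() if deg == 0))
    sequence = []
    while queue:
        nid = queue.popleft()
        sequence.append(nid)
        for node in network.values():      # successors: nodes fed by nid
            if nid in node.inputs:
                indegree[node.node_id] -= 1
                if indegree[node.node_id] == 0:
                    queue.append(node.node_id)
    return sequence   # e.g. the standard node sequence or original model node sequence
```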
  • Specifically, the terminal may obtain the standard nodes one by one in the order of the standard node sequence.
  • The terminal traverses the multiple original model nodes in the original model node sequence in that sequence's order, judging in turn whether each original model node matches the current standard node.
  • When an original model node matches the standard node, the terminal can record it as a node to be optimized, obtain the next standard node in the standard node sequence, and repeatedly match that standard node against the subsequent original model nodes, until nodes to be optimized matching all standard nodes have been selected from the original model nodes, or all original model nodes in the original model node sequence have been traversed.
  • The terminal can take all the nodes to be optimized in the original model node sequence that match the standard nodes, and determine the network they compose as the sub-network to be optimized.
  • In the standard node sequence, the standard nodes are contiguous.
  • In the sub-network to be optimized, the nodes to be optimized can be contiguous or not, and some nodes to be optimized can be distributed discretely in the original model network.
  • In one embodiment, when an original model node does not match the standard node, the terminal can take the next original model node in the order of the original model node sequence and match it against the standard node.
  • When there is no original model node in the original model node sequence that matches a standard node, the terminal can perform inference based on the original neural network model and generate optimization failure prompt information.
  • The terminal may display the optimization failure prompt information on its display interface, thereby prompting the user that the preset standard sub-network does not correspond to the neural network model of the model identifier and that the inference process of the neural network model has not been optimized.
  • In this embodiment, the terminal obtains the corresponding standard node sequence and original model node sequence by topologically sorting the standard sub-network and the original model network respectively.
  • The topological sequences accurately represent the order of the nodes in the network structures, making it possible to accurately filter out the nodes to be optimized in the original model node sequence.
  • The terminal selects the nodes to be optimized from the original model node sequence according to the standard node sequence and generates the sub-network to be optimized, which facilitates optimizing that sub-network and performing inference based on the optimized neural network model, effectively improving the inference speed of the neural network model and saving the time required for inference.
  • In one embodiment, matching the original model nodes in the original model node sequence according to the standard node sequence to obtain the nodes to be optimized includes: obtaining the standard node information corresponding to each standard node according to the standard node sequence; and traversing the original model nodes in the original model node sequence in turn based on the standard node information, determining an original model node matching the standard node information as a node to be optimized.
  • The terminal can screen out of the original model node sequence, one by one in the order of the standard node sequence, the nodes to be optimized that match the standard nodes.
  • Specifically, the terminal may obtain the standard node information corresponding to a standard node, which may specifically include the standard node identifier, standard node type, standard node attributes, and standard node conditions.
  • Correspondingly, the original model node information of an original model node includes the original model node identifier, original model node type, original model node attributes, and original model node conditions.
  • The terminal traverses the original model nodes based on the standard node information, judging in turn whether the standard node information matches the original model node information.
  • If it matches, the matching original model node is recorded as a node to be optimized, and the standard node information of the next standard node is obtained for matching. If not, the terminal repeatedly judges whether the original model node information of the next original model node matches the standard node information.
  • The terminal can match the multiple kinds of information in the node information in turn.
  • When all the standard node information of a standard node matches the original model node information, it is determined that the standard node matches the original model node.
  • When any piece of the standard node information of a standard node does not match the original model node information, it is determined that the standard node does not match the original model node.
  • Specifically, the terminal can obtain the standard node type of the standard node and traverse the original model node sequence in turn for an original model node matching that type.
  • When no original model node type in the original model node sequence matches the standard node type, it is determined that there is no original model node in the sequence matching the standard node.
  • When the types match, the terminal obtains the original model node attributes of the original model node and compares them with the standard node attributes.
  • When the attributes do not match, the terminal goes back to comparing the standard node type with the type of the next original model node.
  • When the attributes match, the terminal obtains the at least one original input node corresponding to each of the standard node and the original model node; there is a fixed topological order among the original input nodes.
  • The terminal can compare the original input nodes of the standard node and of the original model node pairwise according to that topological order.
  • When the original input nodes do not match, the terminal goes back to comparing the standard node type with the type of the next original model node.
  • When an original input node corresponding to the standard node is an empty node, the corresponding original input node of the original model node is determined to be an input node of the sub-network to be optimized.
  • When the original input nodes match, the terminal can obtain the standard node conditions and the original model node conditions. Similarly, when the standard node conditions do not match the original model node conditions, the terminal goes back to comparing the standard node type with the type of the next original model node. When the standard node conditions match the original model node conditions, the terminal determines that the standard node matches the original model node, and the original model node can be recorded as the node to be optimized matching that standard node.
  • In this embodiment, the terminal matches standard nodes with original model nodes according to the standard node information of the standard nodes and the original model node information of the original model nodes, traversing the original model node sequence to find, as the node to be optimized, the original model node whose information matches that of the standard node. This effectively improves the accuracy of screening the nodes to be optimized, traverses the sub-network to be optimized accurately for optimization, and effectively improves the accuracy of optimizing the neural network model. The sketch below illustrates this matching loop.
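  • As an illustration only, and reusing the Node structure and topological sequences sketched earlier, the following is one plausible shape of the matching loop; a None standard input playing the role of the "empty node" wildcard is an assumption, and node conditions are omitted for brevity.

```python
def find_nodes_to_optimize(std_seq, std_net, orig_seq, orig_net):
    """Walk the original model node sequence once, matching the standard
    nodes in their topological order; return the matched original node ids
    (the nodes to be optimized), or None so the caller can fall back to the
    un-optimized model and raise the optimization-failure prompt."""
    matched = {}                         # standard node_id -> original node_id
    to_optimize = []
    std_ids = iter(std_seq)
    std_id = next(std_ids, None)
    for orig_id in orig_seq:
        if std_id is None:
            break                        # every standard node already matched
        if node_matches(std_net[std_id], orig_net[orig_id], matched):
            matched[std_id] = orig_id
            to_optimize.append(orig_id)
            std_id = next(std_ids, None)
    return to_optimize if std_id is None else None

def node_matches(std, orig, matched):
    """Compare node information in the order described above: type first,
    then attributes, then input nodes (node conditions omitted here)."""
    if std.node_type != orig.node_type:
        return False
    if any(orig.attrs.get(k) != v for k, v in std.attrs.items()):
        return False
    if len(std.inputs) != len(orig.inputs):
        return False
    # Inputs are compared pairwise in their fixed topological order; a None
    # standard input is treated as the "empty node" marking an input node of
    # the sub-network to be optimized.
    for s_in, o_in in zip(std.inputs, orig.inputs):
        if s_in is not None and matched.get(s_in) != o_in:
            return False
    return True
```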
  • In one embodiment, as shown in Fig. 4, optimizing the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain the optimized neural network model includes the following steps:
  • Step 402: Obtain the original input nodes corresponding to the original model nodes.
  • Step 404: Compare the original input nodes with the sub-network to be optimized.
  • Step 406: When an original input node belongs to the sub-network to be optimized, determine the original model node corresponding to that original input node as an output node.
  • Step 408: Eliminate the sub-network to be optimized, and connect the target sub-network with the output nodes to obtain the optimized neural network model.
  • The terminal can obtain the original input nodes of each original model node from the original model network according to the associations between the original model nodes.
  • An original input node refers to an original model node whose output data serves as the input data of another original model node; each original model node can itself be the original input node of other original model nodes.
  • Each original model node may have at least one corresponding original input node.
  • The terminal can compare the multiple original input nodes with the multiple nodes to be optimized in the sub-network to be optimized. When an original input node does not belong to the sub-network to be optimized, the terminal continues comparing the next original input node with the sub-network to be optimized.
  • When an original input node belongs to the sub-network to be optimized, the terminal can determine the original model node corresponding to that original input node as an output node of the sub-network to be optimized.
  • In one embodiment, when traversing the nodes to be optimized, the terminal can determine the input nodes of the sub-network to be optimized from the original input nodes of the original model nodes.
  • The terminal can replace the sub-network to be optimized with the target sub-network to realize the optimization of the neural network model. Specifically, the terminal may remove the sub-network to be optimized from the original model network and substitute the target sub-network for it. The terminal can connect the target sub-network with the input nodes and output nodes of the sub-network to be optimized respectively, obtaining the optimized neural network model.
  • In this embodiment, the terminal can obtain the output nodes of the sub-network to be optimized and replace that sub-network with the target sub-network.
  • By connecting the target sub-network with the output nodes, the terminal ensures the accuracy of the replacement and effectively improves the accuracy of the optimized neural network model. A sketch of this replacement follows.
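  • As an illustration only, the following sketch assumes the target sub-network is given as extra Node entries plus a single output node identifier (target_nodes and target_output_id are illustrative names, not from the application); a real target sub-network could expose several outputs.

```python
def replace_subnetwork(network, to_optimize, target_nodes, target_output_id):
    """Steps 402-408 in sketch form. 'to_optimize' is the set of node ids of
    the sub-network to be optimized inside 'network' (node_id -> Node)."""
    # Steps 402-406: a node outside the sub-network whose original input node
    # belongs to the sub-network to be optimized is an output node, i.e. a
    # downstream consumer that the target sub-network must be connected to.
    output_nodes = [node for nid, node in network.items()
                    if nid not in to_optimize
                    and any(inp in to_optimize for inp in node.inputs)]
    # Step 408: eliminate the sub-network to be optimized ...
    for nid in to_optimize:
        del network[nid]
    # ... insert the target sub-network and connect it with the output nodes.
    network.update(target_nodes)
    for node in output_nodes:
        node.inputs = [target_output_id if inp in to_optimize else inp
                       for inp in node.inputs]
    return network   # the optimized neural network model
```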
  • In one embodiment, before the neural network model corresponding to the model identifier is parsed, the above neural network model inference method further includes: obtaining historical inference data, the historical inference data including historical optimization model identifiers; comparing the model identifier with the historical optimization model identifiers; and, when the model identifier belongs to the historical optimization model identifiers, calling the historically optimized neural network model corresponding to that historical optimization model identifier.
  • Historical inference data refers to data corresponding to neural network models that the terminal inferred in the past, where the past refers to the historical period before the terminal acquires the current model inference task.
  • The historical inference data may specifically include historical optimization model identifiers.
  • A historical optimization model identifier refers to the model identifier of a neural network model that was optimized according to the neural network model inference method of the above embodiments when the terminal performed model inference in the past.
  • The terminal can compare the model identifier with the historical optimization model identifiers to judge whether the model identifier is the same as any of them. If so, it is determined that the model identifier belongs to the historical optimization model identifiers; otherwise it does not.
  • When the model identifier does not belong to the historical optimization model identifiers, the terminal can infer the neural network model corresponding to the model identifier according to the neural network model inference method of the foregoing embodiments.
  • When it does, the terminal can call the historically optimized neural network model corresponding to the historical optimization model identifier, perform inference according to it, and obtain the model inference result.
  • In this embodiment, the terminal can obtain historical inference data and compare the model identifier with the historical optimization model identifiers in it.
  • When the model identifier belongs to the historical optimization model identifiers, the terminal can call the corresponding historically optimized neural network model to perform inference, without having to optimize the neural network model every time, which effectively saves the terminal's computing resources. A sketch of this check is given below.
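  • As an illustration only, the check can be pictured as a cache lookup keyed by model identifier; parse_model and optimize_model below are placeholder names standing in for the parsing and optimization steps of the embodiments above, not functions defined by this application.

```python
historical_models = {}   # historical optimization model identifier -> optimized model

def parse_model(model_id):
    """Placeholder for reading and parsing the stored model (step 204)."""
    ...

def optimize_model(model):
    """Placeholder for the traversal and replacement steps (steps 206-210)."""
    ...

def get_model_for_inference(model_id):
    if model_id in historical_models:           # identifier belongs to the
        return historical_models[model_id]      # historical ids: reuse directly
    optimized = optimize_model(parse_model(model_id))
    historical_models[model_id] = optimized     # record for later inference tasks
    return optimized
```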
  • In one embodiment, as shown in Fig. 5, a neural network model inference apparatus is provided, including a task acquisition module 502, a model parsing module 504, a model traversal module 506, a model optimization module 508, and a model inference module 510, wherein:
  • the task acquisition module 502 is used to acquire a model inference task, the model inference task carrying a model identifier;
  • the model parsing module 504 is used to parse the neural network model corresponding to the model identifier to obtain the original model network corresponding to the neural network model;
  • the model traversal module 506 is used to obtain the preset standard sub-network corresponding to the model identifier, and traverse the original model network according to the standard sub-network to obtain the sub-network to be optimized;
  • the model optimization module 508 is used to optimize the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain the optimized neural network model; and
  • the model inference module 510 is used to perform inference according to the optimized neural network model to obtain the model inference result.
  • In one embodiment, the above neural network model inference apparatus further includes a sub-network generation module, used to obtain the standard node association file, generate a network description script according to the standard node association file, and execute the network description script to generate the standard sub-network.
  • In one embodiment, the above model traversal module 506 is also used to topologically sort the standard sub-network and the original model network respectively to obtain the standard node sequence and the original model node sequence; match the original model nodes in the original model node sequence according to the standard node sequence to obtain the nodes to be optimized; and generate the sub-network to be optimized according to the nodes to be optimized.
  • In one embodiment, the above model traversal module 506 is further used to obtain the standard node information corresponding to the standard nodes according to the standard node sequence, traverse the original model nodes in the original model node sequence in turn based on the standard node information, and determine an original model node matching the standard node information as a node to be optimized.
  • In one embodiment, the aforementioned model traversal module 506 is further used to generate optimization failure prompt information when there is no original model node in the original model node sequence that matches a standard node.
  • In one embodiment, the above model optimization module 508 is also used to obtain the original input nodes corresponding to the original model nodes; compare the original input nodes with the sub-network to be optimized; when an original input node belongs to the sub-network to be optimized, determine the original model node corresponding to that original input node as an output node; and eliminate the sub-network to be optimized and connect the target sub-network with the output nodes to obtain the optimized neural network model.
  • In one embodiment, the above neural network model inference apparatus further includes a model recognition module, used to obtain historical inference data, the historical inference data including historical optimization model identifiers; compare the model identifier with the historical optimization model identifiers; and, when the model identifier belongs to the historical optimization model identifiers, call the historically optimized neural network model corresponding to the historical optimization model identifier.
  • Each module in the above neural network model inference apparatus can be implemented in whole or in part by software, by hardware, or by a combination of the two.
  • The above modules may be embedded in, or independent of, the processor of a computer device in the form of hardware, or stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to them.
  • In one embodiment, a computer device is provided.
  • The computer device may be a server, and its internal structure may be as shown in Fig. 6.
  • The computer device includes a processor, a memory, a network interface, and a database connected through a system bus, wherein the processor of the computer device is used to provide computing and control capabilities.
  • The memory of the computer device includes a non-volatile storage medium and an internal memory.
  • The non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • The internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium.
  • The database of the computer device is used to store the inference data of neural network models.
  • The network interface of the computer device is used to communicate with external terminals through a network connection.
  • When executed by the processor, the computer-readable instructions implement a neural network model inference method.
  • Those skilled in the art can understand that Fig. 6 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution can be applied.
  • A specific computer device may include more or fewer components than shown in the figure, combine certain components, or arrange the components differently.
  • In one embodiment, a computer device is provided that includes a memory and one or more processors.
  • The memory stores computer-readable instructions.
  • When the computer-readable instructions are executed by the one or more processors, the one or more processors execute the steps of the above method embodiments.
  • In one embodiment, one or more non-volatile computer-readable storage media storing computer-readable instructions are provided.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps of the above method embodiments.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A neural network model inference method includes: acquiring a model inference task, the model inference task carrying a model identifier (202); parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model (204); obtaining a preset standard sub-network corresponding to the model identifier (206); traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized (208); optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model (210); and performing inference according to the optimized neural network model to obtain a model inference result (212).

Description

神经网络模型的推理方法、装置、计算机设备和存储介质 技术领域
本申请涉及一种神经网络模型的推理方法、装置、计算机设备和存储介质。
背景技术
人工神经网络是从信息处理角度对人脑神经元网络进行抽象,建立某种简单模型,按不同的连接方式组成不同的网络。神经网络模型在语音识别、图像识别以及自然语言处理等领域均有着广泛的应用。随着计算机技术的发展,神经网络模型的网络结构愈发复杂,层数由最初的几十层发展至上千层甚至更多。
发明人意识到,随着神经网络模型的网络结构变得更多更复杂,推理神经网络模型,即输入数据运行整个神经网络模型得到输出数据的过程需要耗费更多的时间。因此,如何减少神经网络模型的推理时间,提高推理速度成为目前需要解决的技术问题。
发明内容
根据本申请公开的各种实施例,提供一种神经网络模型的推理方法、装置、计算机设备和存储介质。
一种神经网络模型的推理方法包括:
获取模型推理任务,所述模型推理任务携带模型标识;
解析所述模型标识对应的神经网络模型,得到所述神经网络模型对应的原模型网络;
获取预设的与所述模型标识对应的标准子网络;
根据所述标准子网络对所述原模型网络进行遍历,得到待优化子网络;
基于所述标准子网络所对应的目标子网络将所述待优化子网络进行优化,得到优化后的神经网络模型;及
根据所述优化后的神经网络模型进行推理,得到模型推理结果。
一种神经网络模型的推理装置包括:
任务获取模块,用于获取模型推理任务,所述模型推理任务携带模型标识;
模型解析模块,用于解析所述模型标识对应的神经网络模型,得到所述神经网络模型对应的原模型网络;
模型遍历模块,用于获取预设的与所述模型标识对应的标准子网络;根据所述标准子网络对所述原模型网络进行遍历,得到待优化子网络;
模型优化模块,用于基于所述标准子网络所对应的目标子网络将所述待优化子网络进行优化,得到优化后的神经网络模型;及
模型推理模块,用于根据所述优化后的神经网络模型进行推理,得到模型推理结果。
一种计算机设备,包括存储器和一个或多个处理器,所述存储器中储存有计算机可读指令,所述计算机可读指令被所述处理器执行时,使得所述一个或多个处理器执行以下步骤:
获取模型推理任务,所述模型推理任务携带模型标识;
解析所述模型标识对应的神经网络模型,得到所述神经网络模型对应的原模型网络;
获取预设的与所述模型标识对应的标准子网络;
根据所述标准子网络对所述原模型网络进行遍历,得到待优化子网络;
基于所述标准子网络所对应的目标子网络将所述待优化子网络进行优化,得到优化后的神经网络模型;及
根据所述优化后的神经网络模型进行推理,得到模型推理结果。
一个或多个存储有计算机可读指令的非易失性计算机可读存储介质,计算机可读指令被一个或多个处理器执行时,使得一个或多个处理器执行以下 步骤:
获取模型推理任务,所述模型推理任务携带模型标识;
解析所述模型标识对应的神经网络模型,得到所述神经网络模型对应的原模型网络;
获取预设的与所述模型标识对应的标准子网络;
根据所述标准子网络对所述原模型网络进行遍历,得到待优化子网络;
基于所述标准子网络所对应的目标子网络将所述待优化子网络进行优化,得到优化后的神经网络模型;及
根据所述优化后的神经网络模型进行推理,得到模型推理结果。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征和优点将从说明书、附图以及权利要求书变得明显。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其它的附图。
图1为根据一个或多个实施例中神经网络模型的推理方法的应用场景图。
图2为根据一个或多个实施例中神经网络模型的推理方法的流程示意图。
图3为根据一个或多个实施例中原模型网络的局部抽象图。
图4为一个实施例中基于标准子网络所对应的目标子网络将待优化子网络进行优化,得到优化后的神经网络模型步骤的流程示意图。
图5为根据一个或多个实施例中神经网络模型的推理装置的框图。
图6为根据一个或多个实施例中计算机设备的框图。
具体实施方式
为了使本申请的技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。
本申请提供的神经网络模型的推理方法,可以应用于终端,也可以应用于服务器。以应用于终端为例,终端可以获取模型推理任务,模型推理任务携带模型标识。终端解析模型标识对应的神经网络模型,得到神经网络模型对应的原模型网络。终端获取预设的与模型标识对应的标准子网络,根据标准子网络对原模型网络进行遍历,得到待优化子网络。终端基于标准子网络所对应的目标子网络将待优化子网络进行优化,得到优化后的神经网络模型。终端根据优化后的神经网络模型进行推理,得到模型推理结果。终端具体可以包括但不限于各种个人计算机、笔记本电脑、智能手机、平板电脑。
可以理解的,本申请提供的神经网络模型的推理方法实现对神经网络模型进行推理,可以应用于多种应用环境,神经网络模型可以包括多种类型。例如,神经网络模型可以包括卷积神经网络模型、循环神经网络模型以及递归神经网络模型等。神经网络模型可以用于处理多种不同的数据。例如,神经网络模型具体可以包括图像识别模型、特征提取模型、语音识别模型、文本识别模型以及场景分类模型等。
在其中一个实施例中,本申请提供的神经网络模型的推理方法具体可以应用于自动驾驶领域中,神经网络模型具体可以包括图像识别模型、行为预测模型或者风险评估模型等中的至少一种。例如,神经网络模型可以是图像识别模型,本申请提供的神经网络模型的推理方法可以应用与如图1所示的应用环境中。自动驾驶车辆可以包括传感器102和终端104,传感器102可以通过与终端104建立的连接与终端104进行通信。传感器102可以采集视觉范围内的环境图像。比如在自动驾驶车辆行驶至路口时,传感器102可以采集交通信号灯图像。终端104根据传感器102采集的信号灯图像进行图像识别,判断图像中信号灯的颜色。具体的,终端104可以根据图像识别任务 生成模型推理任务,模型推理任务携带了需要调用的图像识别模型对应的模型标识。终端104可以根据模型标识获取属于图像识别模型的神经网络模型,对图像识别模型进行解析,得到图像识别模型对应的原模型网络。终端104获取预设的标准图像识别子网络,根据标准图像识别子网络对原模型网络进行遍历,得到图像识别模型中的待优化子网络。终端104基于标准图像识别子网络所对应的目标图像识别子网络,将图像识别模型中的待优化子网络进行优化,得到优化后的图像识别模型。终端104根据优化后的图像识别模型对信号灯图像进行运算推理,得到信号灯图像中信号灯的颜色。
在其中一个实施例中,当对得到模型推理结果的实时性要求较低,或者根据神经网络模型推理的数据量较大时,上述神经网络模型的推理方法还可以应用于服务器中。例如,本申请提供的神经网络模型的推理方法具体可以应用于自然语言处理领域中,神经网络模型具体可以是文本分类模型。具体的,终端可以向服务器上传文本分类请求,服务器根据文本分类请求对待分类文本进行分类处理。当需要通过文本分类模型对待分类文本进行分类运算时,服务器可以生成模型推理任务,模型推理任务携带了需要调用的文本分类模型对应的模型标识。服务器可以根据模型标识获取预先配置的文本分类模型,对文本分类模型进行解析,得到文本分类模型对应的原模型网络。服务器可以获取预设的与模型标识对应的标准分类子网络,根据标准分类子网络对原模型网络进行遍历,得到文本分类模型对应的待优化子网络。服务器可以基于标准分类子网络对应的目标分类子网络,将待优化子网络进行优化,得到优化后的文本分类模型。服务器可以根据优化后的文本分类模型对输入的待分类文本进行推理,得到文本分类结果。
在其中一个实施例中,如图2所示,提供了一种神经网络模型的推理方法,以该方法应用于终端为例进行说明,包括以下步骤:
步骤202,获取模型推理任务,模型推理任务携带模型标识。
模型推理是指根据神经模型的网络结构顺序,依次按照包括的多个计算层各自对应的操作,对输入神经网络模型的数据进行运算,以此得到神经网 络模型输出的推理结果。终端可以获取模型推理任务,根据模型推理任务对相应的神经网络模型进行推理。
Specifically, when a user needs to perform model inference, the terminal may determine the neural network model specified by the user according to a received operation instruction and generate a model inference task carrying the model identifier. The terminal may also determine the neural network model to be invoked according to actual operating requirements and generate the model inference task accordingly. For example, in an image recognition process, when the image recognition model needs to be invoked to recognize an image, the terminal may generate a model inference task and, according to it, run the image recognition model on the input image to obtain the recognition result output by the model.
The model inference task carries a model identifier. The model identifier is a tag corresponding to the neural network model and is used to mark it; each neural network model has a unique corresponding model identifier. In one embodiment, the terminal may include an inference engine, through which the terminal executes the model inference task and infers the neural network model corresponding to the model identifier. The inference engine is the functional module in the terminal that performs inference.
Step 204: parse the neural network model corresponding to the model identifier to obtain the original model network corresponding to the neural network model.
The terminal may parse the acquired model inference task to obtain the model identifier it carries, and acquire the neural network model corresponding to the model identifier according to the task. The neural network model may be trained in advance and configured on the terminal, and may be stored in the terminal's memory or storage. The neural network model corresponding to the model identifier may include at least one of a variety of models. For example, by network structure, it may specifically include at least one of a convolutional neural network model (CNN), a recurrent neural network model (RNN), a recursive neural network model, and the like. By function, it may specifically include at least one of an image recognition model, a feature extraction model, a speech recognition model, a text recognition model, a scene classification model, and the like.
The terminal may read the neural network model corresponding to the model identifier from a storage location such as memory according to the model inference task, and parse the read model to obtain the original model network corresponding to it. The original model network is the neural network structure corresponding to the neural network model read by the terminal. A neural network model may include multiple computing layers, each of which may correspond to a data operation, and there may be conditions or association relationships between computing layers; for example, the output of some computing layers may serve as the input of corresponding computing layers. The computing layers included in the neural network model and the association relationships between them constitute the original model network of the model. In the original model network, the association relationships and ordering between computing layers are fixed, and different neural network models may have correspondingly different network structures. For example, a convolutional neural network model may specifically include an input layer, convolutional layers, pooling layers, fully connected layers, an output layer, and the like.
The terminal may represent the parsed original model network in multiple forms. For example, the terminal may record the parsed original model network in the form of a list, where the order of the list may represent the association relationships between computing layers. The terminal may also generate a corresponding abstract graph according to the structure of the original model network. FIG. 3 shows a partial abstract graph of an original model network; since the complete original model network of a neural network model is rather large, a partial abstract graph is used as an example. The abstract graph corresponding to the original model network is a directed graph, in which the computing layers of the neural network model correspond to nodes. A directed edge between two nodes represents the input-output relationship between the corresponding computing layers: the data output by a computing layer is input, along the directed edge, to the computing layer the arrow points to. Each node may include the computing layer information of its corresponding layer, the computing layer information including at least one of a computing layer identifier, a computing layer type, computing layer attributes, computing layer conditions, and the like. Correspondingly, the terminal may take the computing layer information as the node information of the corresponding node.
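As an illustrative sketch only, the abstract graph described above can be held as node records carrying the computing layer information together with the directed edges; the field names below are assumptions made for this sketch, not a layout mandated by this application.

```python
# Minimal sketch (illustrative only) of the abstract-graph representation:
# each node stores its computing layer information, and directed edges
# record which nodes feed which. Field names are assumed for illustration.
from dataclasses import dataclass, field

@dataclass
class Node:
    layer_id: str                                # computing layer identifier
    layer_type: str                              # computing layer type, e.g. "concat"
    attrs: dict = field(default_factory=dict)    # computing layer attributes
    inputs: list = field(default_factory=list)   # ids of input nodes (directed edges)

# A fragment of an original model network: layer_2 concatenates layer_0 and layer_1.
graph = {
    "layer_0": Node("layer_0", "conv"),
    "layer_1": Node("layer_1", "conv"),
    "layer_2": Node("layer_2", "concat", {"dim": 1}, ["layer_0", "layer_1"]),
}
```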
Step 206: acquire a preset standard sub-network corresponding to the model identifier.
The terminal may acquire the standard sub-network corresponding to the model identifier; the model identifier of a neural network model may correspond to one or more standard sub-networks. The standard sub-network may be preset by the user according to actual needs: based on the original model network of the neural network model, the user may determine a sub-network that can be optimized and take it as the standard sub-network. The terminal may receive the standard sub-network input by the user through an input device. The standard sub-network is a local network structure of the complete neural network model; it may include multiple computing layers and the association relationships between them, where the association relationships may specifically include logical conditions and input-output relationships between computing layers. The standard sub-network may be preset by the user and stored in the terminal, and may be a subset of the original model network of the neural network model. Corresponding to the original model network, the standard sub-network may also be represented in multiple forms. For example, the terminal may record and represent the standard sub-network as a list, where each row records a computing layer of the standard sub-network and the order of the list represents the association relationships between computing layers. The terminal may also generate an abstract graph from the standard sub-network and represent it in that form.
Step 208: traverse the original model network according to the standard sub-network to obtain a sub-network to be optimized.
The terminal may traverse the original model network according to the preset standard sub-network and filter out, from the original model network, the sub-network to be optimized that corresponds to the standard sub-network. The sub-network to be optimized is the part of the original model network that can be optimized. A neural network model includes a large number of computing layers and has a rather complex network structure, so the terminal takes a long time to infer it; the low inference speed is unfavorable for data processing with high real-time requirements. In the field of autonomous driving, for example, data processing results need to be obtained quickly from the neural network model. Moreover, after processing data according to the operation of each computing layer, the terminal needs to temporarily store the processed data in memory; since the neural network model has many computing layers, this occupies considerable memory. The terminal can therefore traverse the original model network according to the standard sub-network to find the sub-network to be optimized, so as to optimize the original model network of the neural network model and reduce the number of its computing layers.
Specifically, the terminal may traverse the computing layers of the original model network in turn, following the order of the computing layers in the standard sub-network, and look for the computing layer corresponding to each computing layer of the standard sub-network. When a corresponding computing layer is found, it is determined to be a computing layer to be optimized; keeping the position in the original model network unchanged, the search continues downward for the computing layer to be optimized corresponding to the next computing layer of the standard sub-network, until all computing layers to be optimized are found. The terminal may record the network formed by the computing layers to be optimized as the sub-network to be optimized, which is a subset of the original model network.
For example, the standard sub-network and the original model network may be represented as abstract graphs. The terminal may obtain the standard nodes of the standard sub-network in graph order and compare the original model nodes in the abstract graph of the original model network with the standard nodes one by one, in the order of the original model network. The terminal may record a successfully matched original model node as a node to be optimized, keep the position in the abstract graph of the original model network unchanged, obtain the next standard node, continue comparing standard nodes against original model nodes downward, and repeat the search for the original model node corresponding to each standard node, until nodes to be optimized corresponding to all standard nodes have been found in the original model network. The terminal may record the network formed by the nodes to be optimized as the sub-network to be optimized. In the standard sub-network, the standard nodes are contiguous; in the sub-network to be optimized, the nodes to be optimized may be contiguous or non-contiguous, that is, original model nodes that are not nodes to be optimized may lie between them.
Step 210: optimize the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network, obtaining an optimized neural network model.
Step 212: perform inference according to the optimized neural network model to obtain a model inference result.
The terminal may optimize the traversed sub-network to be optimized to obtain an optimized neural network model, thereby increasing the inference speed of the neural network model and reducing its inference time. At the same time, compared with the original neural network model, the optimized model simplifies the original model network and reduces the number of computing layers, preventing the outputs of multiple computing layers from unnecessarily occupying the terminal's memory.
Specifically, the terminal may acquire the target sub-network corresponding to the standard sub-network. The target sub-network may be preset by the user according to actual needs, and there is a unique mapping relationship between the target sub-network and the standard sub-network. The target sub-network is the network structure obtained by optimizing the standard sub-network. The terminal may optimize the sub-network to be optimized in the original model network according to the target sub-network, obtaining an optimized neural network model. Specifically, the terminal may acquire at least one input node and at least one output node corresponding to the sub-network to be optimized. The terminal may delete the sub-network to be optimized from the original model network and connect the target sub-network to the input node and the output node of the deleted sub-network respectively, thereby replacing the sub-network to be optimized with the target sub-network and obtaining the optimized neural network model. Compared with the sub-network to be optimized, the target sub-network simplifies the network structure and reduces the number of computing layers, thereby simplifying the neural network model. The terminal may then perform inference based on the optimized neural network model, computing in turn according to the operations of its computing layers to obtain the inferred data result. For example, the terminal may, based on the optimized image recognition model, compute on the input image in the operation order of the optimized network structure to obtain the recognized image result.
In this embodiment, when the terminal acquires a model inference task and needs to infer the neural network model, it parses the neural network model corresponding to the model identifier and obtains the original model network of the model. The terminal may acquire the preset standard sub-network and traverse the original model network according to it, thereby filtering out of the original model network the sub-network to be optimized. The terminal optimizes the sub-network to be optimized based on the target sub-network, obtaining an optimized neural network model. The terminal optimizes the network structure of the neural network model before inferring it, which does not affect the model's training gradients and preserves its accuracy. Performing inference according to the optimized neural network model simplifies the network structure and reduces the computing layers of the model, which effectively increases the inference speed, saves inference time, and improves the real-time performance of model inference.
In one embodiment, before the acquiring of the preset standard sub-network corresponding to the model identifier, the above inference method further includes: acquiring a standard node association file; generating a network description script according to the standard node association file; and executing the network description script to generate the standard sub-network.
Before acquiring the preset standard sub-network, the terminal also generates the standard sub-network according to actual needs. Specifically, the terminal may acquire a standard node association file, a template file used to record standard nodes and the association relationships between them. The user may enter the network structure of a standard sub-network in the specific format of the template file to obtain the standard node association file. A standard node represents a computing layer of the standard sub-network, with each computing layer corresponding to one node.
The standard node association file may specifically include a standard node file and a node condition file, both of which may record standard nodes in a specific format. For example, the standard node file may be a standard node list in the form of a data table, in which each row records one standard node and the order of the list may represent the logical order and association relationships of the standard nodes. The node condition file may likewise be a node condition list in the form of a data table, in which each row records a logical condition that the standard nodes need to satisfy.
In one embodiment, the standard node in each row of the standard node list may be recorded as "standard node identifier = standard node type([input node identifiers], [standard node attributes])", where the input node identifiers may include the node identifiers of one or more input nodes, and the standard node attributes may include at least one node attribute of the standard node. For example, one standard node in the list may specifically be "layer_2=concat([layer_0,layer_1],[dim=1])", where "layer_0", "layer_1", and "layer_2" are the node identifiers of standard nodes; the type "concat" of the standard node with identifier "layer_2" means concatenating the standard nodes "layer_0" and "layer_1", and the node attribute dim of the standard node "layer_2" equals 1.
The standard node list may include at least two standard node identifiers. The specific format of a node condition in the node condition list may be expressed as "condition identifier = logical expression([standard node identifiers], [standard node attributes])", where the node condition list may include the condition identifiers of multiple conditions, the standard node identifiers may be the node identifiers of at least two standard nodes, and the standard node attributes are the node attributes corresponding to the standard nodes. For example, one node condition in the list may specifically be "Condition_0=eq(layer_3.dim,layer_4.dim)", where the node condition with condition identifier "Condition_0" means that the attribute dim of standard node "layer_3" must be equal to the attribute dim of standard node "layer_4". When the dim attributes of the two standard nodes are equal, the node condition "Condition_0" is satisfied; otherwise it is not.
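By way of illustration only, the following Python sketch parses one row in the "identifier=type([inputs],[attributes])" format exemplified above; the regular expression and the returned dictionary layout are assumptions made for this sketch rather than a format mandated by this application.

```python
# Minimal sketch (illustrative only): parsing one row of the standard node
# list in the format "id=type([inputs],[attrs])".
import re

ROW = re.compile(r"(\w+)=(\w+)\(\[(.*?)\],\[(.*?)\]\)")

def parse_standard_node(row: str) -> dict:
    node_id, node_type, inputs, attrs = ROW.match(row).groups()
    return {
        "id": node_id,
        "type": node_type,
        "inputs": [s for s in inputs.split(",") if s],
        "attrs": dict(kv.split("=") for kv in attrs.split(",") if kv),
    }

print(parse_standard_node("layer_2=concat([layer_0,layer_1],[dim=1])"))
# {'id': 'layer_2', 'type': 'concat', 'inputs': ['layer_0', 'layer_1'], 'attrs': {'dim': '1'}}
```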
The terminal may read the node information of the standard nodes in the standard node association file according to the file's specific format, and generate a network description script from the node information. The terminal may execute the generated network description script, generating the standard sub-network by running the script. The standard sub-network may represent the part of the neural network model's network structure that can be optimized; the terminal may optimize the original model network of the neural network model based on the generated standard sub-network, thereby simplifying the structure of the original model network and increasing the inference speed of the neural network model.
In this embodiment, the terminal may acquire a standard node association file, generate a network description script according to the standard node information recorded in the file, and execute the script to generate the standard sub-network. The terminal can flexibly generate standard sub-networks from standard node association files, which improves the flexibility and diversity of the sub-networks to be optimized. Standard sub-networks help filter out more complex sub-networks to be optimized, allowing deep optimization of the network structure of the neural network model, effectively increasing its inference speed and saving the memory occupied when inferring it.
In one embodiment, the step of traversing the original model network according to the standard sub-network to obtain the sub-network to be optimized includes: topologically sorting the standard sub-network and the original model network respectively to obtain a standard node sequence and an original model node sequence; matching the original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and generating the sub-network to be optimized according to the nodes to be optimized.
The terminal may traverse the original model network based on the standard sub-network and filter out of it the sub-network to be optimized corresponding to the standard sub-network. Specifically, the terminal may topologically sort the standard sub-network and the original model network respectively, obtaining the standard node sequence corresponding to the standard sub-network and the original model node sequence corresponding to the original model network. Topological sorting arranges the nodes of a directed network into a sequence that satisfies the topological order; the result of topological sorting is a one-dimensional linear sequence. The terminal may record the linear sequence obtained by topologically sorting the standard sub-network as the standard node sequence, and the linear sequence obtained by topologically sorting the original model network as the original model node sequence. The order of the nodes in the linear sequences obtained by the terminal's topological sorting is fixed.
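For illustration, topological sorting of such a directed network can follow Kahn's algorithm, as in the Python sketch below; the adjacency format (node identifier mapped to the list of its input node identifiers) is an assumption of the sketch.

```python
# Minimal sketch (illustrative only) of topological sorting (Kahn's
# algorithm): flattening a directed network into a linear node sequence.
from collections import deque

def topo_sort(inputs_of: dict) -> list:
    """Return node ids so that every node appears after all of its inputs."""
    indegree = {n: len(deps) for n, deps in inputs_of.items()}
    consumers = {n: [] for n in inputs_of}
    for n, deps in inputs_of.items():
        for d in deps:
            consumers[d].append(n)
    queue = deque(n for n, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for c in consumers[n]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return order

print(topo_sort({"layer_0": [], "layer_1": [], "layer_2": ["layer_0", "layer_1"]}))
# -> ['layer_0', 'layer_1', 'layer_2']
```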
The terminal may match the original model nodes in the original model node sequence according to the standard node sequence to obtain the nodes to be optimized, and generate the sub-network to be optimized according to them. It can be understood that, when the terminal acquires multiple preset standard sub-networks, it may match the standard node sequences corresponding to the multiple standard sub-networks against the original model node sequence respectively, filtering out of the original model network the sub-networks to be optimized corresponding to each standard sub-network.
Specifically, the terminal may obtain the standard nodes in the order of the standard node sequence. Following the order of the original model node sequence, the terminal traverses the multiple original model nodes in the sequence and judges in turn whether each original model node matches the standard node. When an original model node matches the standard node, the terminal may record it as a node to be optimized, obtain the next standard node in the standard node sequence, and repeatedly match that standard node against the subsequent original model nodes, until nodes to be optimized matching all standard nodes have been filtered out of the original model nodes, or all original model nodes in the original model node sequence have been traversed. The terminal may take all the nodes to be optimized in the original model node sequence that match standard nodes and determine the network formed by them as the sub-network to be optimized. In the standard sub-network, the standard nodes are contiguous; in the sub-network to be optimized, the nodes to be optimized may be contiguous or non-contiguous, and some of them may be scattered across the original model network.
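As an illustrative sketch, under the simplifying assumption that both networks are already flattened into linear sequences, the following Python function scans the original model node sequence for a (possibly non-contiguous) subsequence matching the standard node sequence; `nodes_match` is a placeholder predicate, one possible form of which is sketched after the node-information comparison below.

```python
# Minimal sketch (illustrative only): scanning the original model node
# sequence for a (possibly non-contiguous) subsequence that matches the
# standard node sequence. `nodes_match` is a caller-supplied predicate.

def find_nodes_to_optimize(standard_seq, original_seq, nodes_match):
    """Return matched original nodes, or None if some standard node is unmatched."""
    matched = []
    pos = 0                                   # keep the position in the original network
    for std in standard_seq:
        while pos < len(original_seq) and not nodes_match(std, original_seq[pos]):
            pos += 1                          # skip non-matching original model nodes
        if pos == len(original_seq):
            return None                       # optimization fails: no match for std
        matched.append(original_seq[pos])
        pos += 1
    return matched
```

A `None` result corresponds to the optimization-failure case described in the next paragraph, where no original model node matches a standard node.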
In one embodiment, when an original model node does not match the standard node, the terminal may obtain the next original model node in the order of the original model node sequence and match it against the standard node. When no original model node in the original model node sequence matches the standard node, the terminal may perform inference according to the original neural network model and generate an optimization failure prompt. The terminal may display the optimization failure prompt on its display interface to inform the user that the preset standard sub-network does not correspond to the neural network model of the model identifier and that the inference process of the neural network model has not been optimized.
In this embodiment, by topologically sorting the standard sub-network and the original model network respectively, the terminal obtains the corresponding standard node sequence and original model node sequence; the topological sequences accurately represent the order of the nodes in the network structure, so the nodes to be optimized can be accurately filtered out of the original model node sequence. The terminal filters the nodes to be optimized out of the original model node sequence according to the standard node sequence and generates the sub-network to be optimized, which facilitates optimizing it; performing inference according to the optimized neural network model effectively increases the inference speed of the model and saves the time inference requires.
In one embodiment, the step of matching the original model nodes in the original model node sequence according to the standard node sequence to obtain the nodes to be optimized includes: obtaining the standard node information of the standard nodes according to the standard node sequence; and traversing the original model nodes in the original model node sequence in turn based on the standard node information, determining the original model nodes matching the standard node information as the nodes to be optimized.
The terminal may filter out of the original model node sequence, in the order of the standard node sequence, the nodes to be optimized that match the standard nodes. Specifically, the terminal may obtain the standard node information of a standard node, which may specifically include the standard node identifier, standard node type, standard node attributes, standard node conditions, and the like. Correspondingly, an original model node has original model node information including the original model node identifier, original model node type, original model node attributes, original model node conditions, and the like. Based on the standard node information, the terminal traverses the original model nodes and judges in turn whether the standard node information matches the original model node information. If it does, the matching original model node is recorded as a node to be optimized, and the standard node information of the next standard node is obtained for matching. If not, the terminal repeats the judgment of whether the original model node information of the next original model node matches the standard node information.
The terminal may match the various kinds of information in the node information one by one. When all the standard node information of a standard node matches the original model node information, the standard node is determined to match the original model node; when any of the standard node information does not match the original model node information, the standard node is determined not to match the original model node.
For example, the terminal may obtain the standard node type of the standard node and traverse, in the order of the original model node sequence, for an original model node matching that type. When no original model node type in the sequence matches the standard node type, it is determined that no original model node in the sequence matches the standard node. When an original model node type matching the standard node type is found, the terminal obtains the original model node attributes of that node and compares them with the standard node attributes. When the original model node attributes do not match the standard node attributes, the terminal repeats the comparison of the standard node type with the next original model node type. When the attributes match, the terminal obtains the at least one original input node corresponding to each of the standard node and the original model node; the original input nodes have a fixed topological order. The terminal may compare the original input nodes of the standard node and of the original model node in turn according to that topological order. When the types of corresponding original input nodes differ, the terminal repeats the comparison of the standard node type with the next original model node type. When an original input node of the standard node is an empty node, the corresponding original input node of the original model node is determined to be an input node of the sub-network to be optimized. When all original input nodes match, the terminal may obtain the standard node conditions and the original model node conditions. Likewise, when the standard node conditions do not match the original model node conditions, the terminal repeats the comparison of the standard node type with the next original model node type; when they match, the terminal determines that the standard node matches the original model node and may record the original model node as the node to be optimized matching the standard node.
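For illustration only, the comparison just described can be sketched as the following Python predicate; the dict-based node layout is an assumption, and node conditions are simplified to per-node predicates rather than the cross-node expressions of the condition list.

```python
# Minimal sketch (illustrative only) of the node comparison described above:
# type first, then attributes, then input nodes compared in topological order
# (an empty standard input acts as a wildcard marking an input node of the
# sub-network to be optimized), then node conditions.

def node_info_matches(std: dict, orig: dict, sub_inputs: list) -> bool:
    if std["type"] != orig["type"]:
        return False                          # computing layer types must agree
    if std.get("attrs", {}) != orig.get("attrs", {}):
        return False                          # computing layer attributes must agree
    std_in, orig_in = std.get("inputs", []), orig.get("inputs", [])
    if len(std_in) != len(orig_in):
        return False
    for s, o in zip(std_in, orig_in):
        if s is None:                         # empty node in the standard sub-network:
            sub_inputs.append(o)              # o becomes an input of the sub-network
        elif s["type"] != o["type"]:
            return False                      # input node types must agree
    return all(cond(orig) for cond in std.get("conds", []))
```

Bound to an accumulator list, for example with `functools.partial(node_info_matches, sub_inputs=acc)`, this predicate can serve as the `nodes_match` placeholder in the sequence-scanning sketch above.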
In this embodiment, the terminal matches standard nodes against original model nodes according to the standard node information of the standard nodes and the original model node information of the original model nodes, traversing the original model node sequence for the original model nodes matching the standard node information as the nodes to be optimized. This effectively improves the accuracy of filtering the nodes to be optimized, accurately traverses out the sub-network to be optimized for optimization, and effectively improves the accuracy of optimizing the neural network model.
In one embodiment, as shown in FIG. 4, the step of optimizing the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain an optimized neural network model includes:
Step 402: obtain the original input nodes corresponding to the original model nodes.
Step 404: compare the original input nodes with the sub-network to be optimized.
Step 406: when an original input node belongs to the sub-network to be optimized, determine the original model node corresponding to the original input node as an output node.
Step 408: remove the sub-network to be optimized and connect the target sub-network to the output node, obtaining the optimized neural network model.
The terminal may obtain, from the original model network and according to the association relationships between original model nodes, the original input node corresponding to each original model node. An original input node is an original model node whose output data serves as the input data of the original model node in question; each original model node may itself serve as the original input node of other original model nodes, and each original model node may have at least one corresponding original input node. The terminal may compare the multiple original input nodes with the multiple nodes to be optimized in the sub-network to be optimized. When an original input node does not belong to the sub-network to be optimized, the comparison continues with the next original input node. When an original input node belongs to the sub-network to be optimized, the terminal may determine the original model node corresponding to that original input node as an output node of the sub-network to be optimized. In one embodiment, when traversing the nodes to be optimized, the terminal may determine the input nodes of the sub-network to be optimized according to the original input nodes of the original model nodes.
The terminal may replace the sub-network to be optimized with the target sub-network, realizing the optimization of the neural network model. Specifically, the terminal may remove the sub-network to be optimized from the original model network and replace it with the target sub-network. The terminal may connect the target sub-network to the input nodes and the output nodes of the sub-network to be optimized respectively, obtaining the optimized neural network model.
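The rewiring step can be illustrated with a small Python sketch; the graph layout (a mapping from node identifier to the list of its input node identifiers) and the single fused target node are simplifying assumptions, not the only form the target sub-network may take.

```python
# Minimal sketch (illustrative only): removing the matched sub-network from
# the original model network and wiring the target sub-network to its input
# and output nodes.

def replace_subnetwork(inputs_of, matched, target_id, sub_inputs, out_nodes):
    for n in matched:
        inputs_of.pop(n, None)                 # delete the sub-network to be optimized
    inputs_of[target_id] = list(sub_inputs)    # target consumes the old input nodes
    for out in out_nodes:                      # output nodes now read from the target
        inputs_of[out] = [target_id if i in matched else i
                          for i in inputs_of[out]]
    return inputs_of

# Toy example: replace the chain {"b", "c"} with a fused target node "t".
g = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
print(replace_subnetwork(g, {"b", "c"}, "t", ["a"], ["d"]))
# -> {'a': [], 'd': ['t'], 't': ['a']}
```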
In this embodiment, the terminal may obtain the output nodes of the sub-network to be optimized and replace the sub-network to be optimized with the target sub-network. The terminal connects the target sub-network to the output nodes, which ensures the accuracy of replacing the sub-network to be optimized and effectively improves the accuracy of optimizing the neural network model.
In one embodiment, before the parsing of the neural network model corresponding to the model identifier, the above inference method further includes: acquiring historical inference data, the historical inference data including historically optimized model identifiers; comparing the model identifier with the historically optimized model identifiers; and, when the model identifier belongs to the historically optimized model identifiers, invoking the historically optimized neural network model corresponding to the historically optimized model identifier.
After acquiring the model inference task and parsing out the model identifier it carries, the terminal may acquire historical inference data. Historical inference data is the data corresponding to the neural network models the terminal inferred in historical time, where historical time refers to the historical period before the terminal acquired the model inference task. The historical inference data may specifically include historically optimized model identifiers, that is, the model identifiers corresponding to the neural network models that were optimized according to the inference method of the above embodiments when the terminal performed model inference in historical time.
The terminal may compare the model identifier with the historically optimized model identifiers and judge whether the model identifier is identical to any of them. If so, the model identifier is determined to belong to the historically optimized model identifiers; otherwise it does not. When the model identifier does not belong to the historically optimized model identifiers, the terminal may infer the neural network model corresponding to the model identifier according to the inference method of the above embodiments. When the model identifier belongs to the historically optimized model identifiers, the terminal may invoke the historically optimized neural network model corresponding to the historically optimized model identifier and perform inference according to it to obtain the model inference result.
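By way of illustration only, the reuse of historically optimized models amounts to a cache keyed by model identifier; the helper names `load_model` and `optimize` below are hypothetical placeholders, not functions defined by this application.

```python
# Minimal sketch (illustrative only) of reusing historically optimized models:
# a cache keyed by model identifier avoids repeating the optimization.

optimized_cache = {}   # model identifier -> historically optimized model

def get_model_for_inference(model_id, load_model, optimize):
    if model_id in optimized_cache:            # identifier seen before:
        return optimized_cache[model_id]       # reuse the optimized model
    model = optimize(load_model(model_id))     # otherwise parse and optimize once
    optimized_cache[model_id] = model          # and remember it for next time
    return model
```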
In this embodiment, the terminal may acquire historical inference data and compare the model identifier with the historically optimized model identifiers in it. When the model identifier belongs to the historically optimized model identifiers, the terminal may invoke the historically optimized neural network model corresponding to the historically optimized model identifier for inference, without having to optimize the neural network model every time, which effectively saves the terminal's computing resources.
It should be understood that, although the steps in the flowcharts of FIG. 2 and FIG. 4 are shown in sequence as indicated by the arrows, they are not necessarily performed in the order indicated. Unless explicitly stated in this specification, there is no strict order restriction on the execution of these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 and FIG. 4 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different moments; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 5, an inference apparatus for a neural network model is provided, including a task acquisition module 502, a model parsing module 504, a model traversal module 506, a model optimization module 508, and a model inference module 510, where:
The task acquisition module 502 is configured to acquire a model inference task, the model inference task carrying a model identifier.
The model parsing module 504 is configured to parse the neural network model corresponding to the model identifier to obtain the original model network corresponding to the neural network model.
The model traversal module 506 is configured to acquire a preset standard sub-network corresponding to the model identifier, and to traverse the original model network according to the standard sub-network to obtain a sub-network to be optimized.
The model optimization module 508 is configured to optimize the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain an optimized neural network model.
The model inference module 510 is configured to perform inference according to the optimized neural network model to obtain a model inference result.
In one embodiment, the above inference apparatus further includes a sub-network generation module, configured to acquire a standard node association file; generate a network description script according to the standard node association file; and execute the network description script to generate the standard sub-network.
In one embodiment, the model traversal module 506 is further configured to topologically sort the standard sub-network and the original model network respectively to obtain a standard node sequence and an original model node sequence; match the original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and generate the sub-network to be optimized according to the nodes to be optimized.
In one embodiment, the model traversal module 506 is further configured to obtain the standard node information of standard nodes according to the standard node sequence, and to traverse the original model nodes in the original model node sequence in turn based on the standard node information, determining the original model nodes matching the standard node information as the nodes to be optimized.
In one embodiment, the model traversal module 506 is further configured to generate an optimization failure prompt when no original model node in the original model node sequence matches a standard node.
In one embodiment, the model optimization module 508 is further configured to obtain the original input nodes corresponding to original model nodes; compare the original input nodes with the sub-network to be optimized; when an original input node belongs to the sub-network to be optimized, determine the original model node corresponding to the original input node as an output node; and remove the sub-network to be optimized and connect the target sub-network to the output node, obtaining the optimized neural network model.
In one embodiment, the above inference apparatus further includes a model identification module, configured to acquire historical inference data, the historical inference data including historically optimized model identifiers; compare the model identifier with the historically optimized model identifiers; and, when the model identifier belongs to the historically optimized model identifiers, invoke the historically optimized neural network model corresponding to the historically optimized model identifier.
For specific limitations of the inference apparatus for a neural network model, reference may be made to the limitations of the inference method above, which are not repeated here. Each module of the above apparatus may be implemented wholly or partly by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database; the internal memory provides an environment for running the operating system and the computer-readable instructions in the non-volatile storage medium. The database of the computer device is used to store inference data of neural network models. The network interface of the computer device is used to communicate with external terminals through a network connection. The computer-readable instructions, when executed by the processor, implement an inference method for a neural network model.
A person skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processor, cause the one or more processors to implement the steps of the above method embodiments.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to implement the steps of the above method embodiments.
A person of ordinary skill in the art can understand that all or part of the procedures of the above method embodiments may be completed by computer-readable instructions instructing relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the procedures of the above method embodiments. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application, and their descriptions are relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, all of which fall within the scope of protection of this application. The scope of protection of this application patent shall therefore be subject to the appended claims.

Claims (20)

  1. An inference method for a neural network model, comprising:
    acquiring a model inference task, the model inference task carrying a model identifier;
    parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model;
    acquiring a preset standard sub-network corresponding to the model identifier;
    traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized;
    optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and
    performing inference according to the optimized neural network model to obtain a model inference result.
  2. The method according to claim 1, wherein, before the acquiring of the preset standard sub-network corresponding to the model identifier, the method further comprises:
    acquiring a standard node association file;
    generating a network description script according to the standard node association file; and
    executing the network description script to generate the standard sub-network.
  3. The method according to claim 1, wherein the traversing of the original model network according to the standard sub-network to obtain the sub-network to be optimized comprises:
    topologically sorting the standard sub-network and the original model network respectively to obtain a standard node sequence and an original model node sequence;
    matching original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and
    generating the sub-network to be optimized according to the nodes to be optimized.
  4. The method according to claim 3, wherein the matching of the original model nodes in the original model node sequence according to the standard node sequence to obtain the nodes to be optimized comprises:
    obtaining standard node information of a standard node according to the standard node sequence; and
    traversing the original model nodes in the original model node sequence in turn based on the standard node information, and determining the original model nodes matching the standard node information as the nodes to be optimized.
  5. The method according to claim 3, wherein, after the matching of the original model nodes in the original model node sequence according to the standard node sequence, the method further comprises:
    when no original model node in the original model node sequence matches a standard node, generating an optimization failure prompt.
  6. The method according to claim 1, wherein the optimizing of the sub-network to be optimized based on the target sub-network corresponding to the standard sub-network to obtain the optimized neural network model comprises:
    obtaining original input nodes corresponding to original model nodes;
    comparing the original input nodes with the sub-network to be optimized;
    when an original input node belongs to the sub-network to be optimized, determining the original model node corresponding to the original input node as an output node; and
    removing the sub-network to be optimized, and connecting the target sub-network to the output node to obtain the optimized neural network model.
  7. The method according to any one of claims 1 to 6, wherein, before the parsing of the neural network model corresponding to the model identifier, the method further comprises:
    acquiring historical inference data, the historical inference data comprising historically optimized model identifiers;
    comparing the model identifier with the historically optimized model identifiers; and
    when the model identifier belongs to the historically optimized model identifiers, invoking a historically optimized neural network model corresponding to the historically optimized model identifier.
  8. An inference apparatus for a neural network model, comprising:
    a task acquisition module, configured to acquire a model inference task, the model inference task carrying a model identifier;
    a model parsing module, configured to parse the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model;
    a model traversal module, configured to acquire a preset standard sub-network corresponding to the model identifier, and to traverse the original model network according to the standard sub-network to obtain a sub-network to be optimized;
    a model optimization module, configured to optimize the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and
    a model inference module, configured to perform inference according to the optimized neural network model to obtain a model inference result.
  9. The apparatus according to claim 8, wherein the model traversal module is further configured to topologically sort the standard sub-network and the original model network respectively to obtain a standard node sequence and an original model node sequence; match original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and generate the sub-network to be optimized according to the nodes to be optimized.
  10. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring a model inference task, the model inference task carrying a model identifier;
    parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model;
    acquiring a preset standard sub-network corresponding to the model identifier;
    traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized;
    optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and
    performing inference according to the optimized neural network model to obtain a model inference result.
  11. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    acquiring a standard node association file;
    generating a network description script according to the standard node association file; and
    executing the network description script to generate the standard sub-network.
  12. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    topologically sorting the standard sub-network and the original model network respectively to obtain a standard node sequence and an original model node sequence;
    matching original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and
    generating the sub-network to be optimized according to the nodes to be optimized.
  13. The computer device according to claim 12, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    obtaining standard node information of a standard node according to the standard node sequence; and
    traversing the original model nodes in the original model node sequence in turn based on the standard node information, and determining the original model nodes matching the standard node information as the nodes to be optimized.
  14. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    obtaining original input nodes corresponding to original model nodes;
    comparing the original input nodes with the sub-network to be optimized;
    when an original input node belongs to the sub-network to be optimized, determining the original model node corresponding to the original input node as an output node; and
    removing the sub-network to be optimized, and connecting the target sub-network to the output node to obtain the optimized neural network model.
  15. The computer device according to any one of claims 10 to 14, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    acquiring historical inference data, the historical inference data comprising historically optimized model identifiers;
    comparing the model identifier with the historically optimized model identifiers; and
    when the model identifier belongs to the historically optimized model identifiers, invoking a historically optimized neural network model corresponding to the historically optimized model identifier.
  16. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring a model inference task, the model inference task carrying a model identifier;
    parsing the neural network model corresponding to the model identifier to obtain an original model network corresponding to the neural network model;
    acquiring a preset standard sub-network corresponding to the model identifier;
    traversing the original model network according to the standard sub-network to obtain a sub-network to be optimized;
    optimizing the sub-network to be optimized based on a target sub-network corresponding to the standard sub-network to obtain an optimized neural network model; and
    performing inference according to the optimized neural network model to obtain a model inference result.
  17. The storage media according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    acquiring a standard node association file;
    generating a network description script according to the standard node association file; and
    executing the network description script to generate the standard sub-network.
  18. The storage media according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    topologically sorting the standard sub-network and the original model network respectively to obtain a standard node sequence and an original model node sequence;
    matching original model nodes in the original model node sequence according to the standard node sequence to obtain nodes to be optimized; and
    generating the sub-network to be optimized according to the nodes to be optimized.
  19. The storage media according to claim 18, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    obtaining standard node information of a standard node according to the standard node sequence; and
    traversing the original model nodes in the original model node sequence in turn based on the standard node information, and determining the original model nodes matching the standard node information as the nodes to be optimized.
  20. The storage media according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    obtaining original input nodes corresponding to original model nodes;
    comparing the original input nodes with the sub-network to be optimized;
    when an original input node belongs to the sub-network to be optimized, determining the original model node corresponding to the original input node as an output node; and
    removing the sub-network to be optimized, and connecting the target sub-network to the output node to obtain the optimized neural network model.
PCT/CN2019/130183 2019-12-30 2019-12-30 Inference method and apparatus for neural network model, computer device, and storage medium WO2021134350A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980037513.0A CN113811897B (zh) 2019-12-30 2019-12-30 Inference method and apparatus for neural network model, computer device, and storage medium
PCT/CN2019/130183 WO2021134350A1 (zh) 2019-12-30 2019-12-30 Inference method and apparatus for neural network model, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/130183 WO2021134350A1 (zh) 2019-12-30 2019-12-30 Inference method and apparatus for neural network model, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021134350A1 true WO2021134350A1 (zh) 2021-07-08

Family

ID=76686212

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130183 WO2021134350A1 (zh) Inference method and apparatus for neural network model, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN113811897B (zh)
WO (1) WO2021134350A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116523052B (zh) * 2023-07-05 2023-08-29 成都阿加犀智能科技有限公司 Fast inference method, apparatus, and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103838205A (zh) * 2013-12-09 2014-06-04 浙江大学 Optimal soft-sensor instrument and method for a propylene polymerization production process based on BP global optimization
CN106096727A (zh) * 2016-06-02 2016-11-09 腾讯科技(深圳)有限公司 Machine learning-based network model construction method and apparatus
CN107464217A (zh) * 2017-08-16 2017-12-12 清华-伯克利深圳学院筹备办公室 Image processing method and apparatus
CN107766893A (zh) * 2017-11-03 2018-03-06 电子科技大学 Target recognition method based on a label multi-level coding neural network
CN108875693A (zh) * 2018-07-03 2018-11-23 北京旷视科技有限公司 Image processing method and apparatus, electronic device, and storage medium thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844401B (zh) * 2016-03-22 2019-04-12 北京工商大学 Complex dynamic association model and decision method for lake and reservoir algal bloom control based on case-based reasoning
CN110245741A (zh) * 2018-03-09 2019-09-17 佳能株式会社 Optimization and application method, apparatus, and storage medium for a multi-layer neural network model
CN109299688B (zh) * 2018-09-19 2021-10-01 厦门大学 Ship detection method based on a deformable fast convolutional neural network
CN110348562B (zh) * 2019-06-19 2021-10-15 北京迈格威科技有限公司 Quantization strategy determination method for a neural network, image recognition method, and apparatus
CN110378413A (zh) * 2019-07-17 2019-10-25 Oppo广东移动通信有限公司 Neural network model processing method and apparatus, and electronic device
CN110555514B (zh) * 2019-08-20 2022-07-12 北京迈格威科技有限公司 Neural network model search method, image recognition method, and apparatus


Also Published As

Publication number Publication date
CN113811897B (zh) 2022-05-31
CN113811897A (zh) 2021-12-17

Similar Documents

Publication Publication Date Title
US11741361B2 (en) Machine learning-based network model building method and apparatus
US20210034980A1 (en) Real-time visualization of machine learning models
CN111340237B (zh) 数据处理和模型运行方法、装置和计算机设备
CN112232293B (zh) 图像处理模型训练、图像处理方法及相关设备
CN109583325A (zh) 人脸样本图片标注方法、装置、计算机设备及存储介质
WO2023246801A1 (zh) 算法流水线编排方法、装置、电子设备和存储介质
CN113435330A (zh) 基于视频的微表情识别方法、装置、设备及存储介质
CN111783997A (zh) 一种数据处理方法、装置及设备
CN114492601A (zh) 资源分类模型的训练方法、装置、电子设备及存储介质
WO2022141489A1 (zh) 深度学习模型的推理方法、装置、计算机设备和存储介质
WO2021134350A1 (zh) 神经网络模型的推理方法、装置、计算机设备和存储介质
US20220067495A1 (en) Intelligent processor, data processing method and storage medium
US20240095529A1 (en) Neural Network Optimization Method and Apparatus
US11403267B2 (en) Dynamic transformation code prediction and generation for unavailable data element
CN113326523A (zh) 一种隐私计算方法、装置及电子设备
US20230376781A1 (en) Methods and systems for autonomous task composition of vision pipelines using an algorithm selection framework
CN116798053A (zh) 图标生成方法及装置
US20220351533A1 (en) Methods and systems for the automated quality assurance of annotated images
CN116484016A (zh) 一种基于时序路径自动维护的时序知识图谱推理方法和系统
WO2023273171A1 (zh) 图像处理方法、装置、设备和存储介质
CN115883172A (zh) 异常监测方法、装置、计算机设备和存储介质
WO2021134231A1 (zh) 基于推理引擎的计算资源分配方法、装置和计算机设备
Sagaama et al. Automatic parameter tuning for big data pipelines with deep reinforcement learning
CN111723249A (zh) 一种实现数据处理的方法、装置、计算机存储介质及终端
WO2022217419A1 (zh) 神经网络模型推理方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19958702

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19958702

Country of ref document: EP

Kind code of ref document: A1