CN112598117A - Neural network model design method, deployment method, electronic device and storage medium - Google Patents

Neural network model design method, deployment method, electronic device and storage medium

Info

Publication number
CN112598117A
Authority
CN
China
Prior art keywords
neural network
target
candidate
network model
module
Prior art date
Legal status
Pending
Application number
CN202011598884.7A
Other languages
Chinese (zh)
Inventor
伍宇明
Current Assignee
Guangzhou Xaircraft Technology Co Ltd
Original Assignee
Guangzhou Xaircraft Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Xaircraft Technology Co Ltd
Priority to CN202011598884.7A
Publication of CN112598117A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a neural network model design method, a deployment method, an electronic device and a storage medium. The neural network model design method comprises the following steps: acquiring a neural network module library, the library comprising a plurality of neural network modules; determining, from the neural network module library and according to a scene task, a plurality of target neural network modules matched with the scene task; and constructing, according to the plurality of target neural network modules, a target neural network model matched with the scene task. Because no structure search is required, the invention greatly reduces the time cost of network model design and improves the efficiency of model design.

Description

Neural network model design method, deployment method, electronic device and storage medium
Technical Field
The invention relates to the technical field of deep learning, and in particular to a neural network model design method, a deployment method, an electronic device and a storage medium.
Background
Deep neural networks have achieved great success in many fields. At present, however, deep neural network structures are complex and varied and the design space is large, so manual design requires high labor cost; finding an optimal neural network structure is therefore a technical difficulty.
In the related art, the most appropriate model structure is currently found through schemes based on neural architecture search. Such schemes generally need to reinitialize the neural network search space and search again for each new scene requirement, which consumes a huge amount of time and suffers from low search efficiency.
Disclosure of Invention
An objective of the present invention is to provide a neural network model design method, a neural network model deployment method, an electronic device, and a storage medium, so as to reduce the time consumption for designing a neural network model and improve the efficiency of obtaining an optimal neural network model.
The technical solution of the invention can be realized as follows:
in a first aspect, the present invention provides a neural network model design method, including: acquiring a neural network module library; the neural network module library comprises a plurality of neural network modules; determining a plurality of target neural network modules matched with the scene tasks from the neural network module library according to the scene tasks; and constructing a target neural network model matched with the scene task according to the plurality of target neural network modules.
In a second aspect, the present invention provides a neural network model deployment method, including: acquiring a target neural network model; and when the performance index of the target neural network model on a target hardware platform is within a set performance index range, deploying the target neural network model on the target hardware platform.
In a third aspect, the present invention provides a neural network model designing apparatus, including: the acquisition module is used for acquiring a neural network module library; the neural network module library comprises a plurality of neural network modules; the determining module is used for determining a plurality of target neural network modules matched with the scene tasks from the neural network module library according to the scene tasks; and the construction module is used for constructing a target neural network model matched with the scene task according to the plurality of target neural network modules.
In a fourth aspect, the present invention provides a neural network model deployment apparatus, including: the acquisition module is used for acquiring a target neural network model; and the deployment module is used for deploying the target neural network model on the target hardware platform when the performance index of the target neural network model on the target hardware platform is within the set performance index range.
In a fifth aspect, the present invention provides an electronic device, including a processor and a memory, where the memory stores a computer program executable by the processor, and the processor executes the computer program to implement the neural network model design method of the first aspect and/or the neural network model deployment method of the second aspect.
In a sixth aspect, the present invention provides a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the neural network model design method of the first aspect and/or the neural network model deployment method of the second aspect.
The invention provides a neural network model design method, a deployment method, an electronic device and a storage medium. The neural network model design method comprises: obtaining a neural network module library, the library comprising a plurality of neural network modules; determining, from the library and according to a scene task, a plurality of target neural network modules matched with the scene task; and constructing, according to the plurality of target neural network modules, a target neural network model matched with the scene task. The difference from the prior art is as follows: in the prior art, an optimal neural network model must be obtained by searching the neural network structure, which takes a long time and yields low search efficiency; in the invention, the target neural network model is built directly from pre-accumulated modules without searching, which greatly reduces the time cost of model design and improves design efficiency.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of a neural network model design method provided by an embodiment of the present invention;
fig. 2 is a schematic flow chart of an implementation manner of step S13 provided by the embodiment of the present invention;
fig. 3 is a schematic flow chart of an implementation manner of step S132 provided by the embodiment of the present invention;
fig. 4 is a schematic flowchart of an implementation manner of step S131 provided by the embodiment of the present invention;
FIG. 5 is a schematic flow chart of another neural network model design method provided by an embodiment of the present invention;
FIG. 6 is a schematic flow chart of one implementation of step S14 provided by an embodiment of the present invention;
FIG. 7 is a second schematic flowchart of another neural network model design method according to an embodiment of the present invention;
FIG. 8 is a third schematic flowchart of another neural network model design method provided in an embodiment of the present invention;
FIG. 9 is a fourth schematic flowchart of another neural network model design method provided in the embodiments of the present invention;
FIG. 10 is a schematic flow chart of a neural network model deployment method provided by an embodiment of the present invention;
FIG. 11 is a functional block diagram of a neural network model designing apparatus according to an embodiment of the present invention;
fig. 12 is a functional block diagram of a neural network model deployment apparatus according to an embodiment of the present invention;
fig. 13 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Before describing the embodiments of the present invention, terms involved in the embodiments of the present invention are explained:
Pruning: cutting down the number of feature maps of the neural network.
Quantization: representing neural network operations with a smaller number of bits.
Distillation: obtaining a small model with better performance by having a teacher model (a large model) guide the training of a student model (a small model).
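As an illustration of the quantization term above, the following pure-Python sketch maps floating-point values onto a signed 8-bit integer grid. The function names and the symmetric, per-tensor scheme are illustrative assumptions, not part of the patent.

```python
def quantize(values, num_bits=8):
    # Uniform symmetric quantization: one scale for the whole value list.
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale

def dequantize(q_values, scale):
    # Recover approximate floating-point values from the integer grid.
    return [q * scale for q in q_values]

weights = [0.52, -1.27, 0.03, 0.9]
q, s = quantize(weights)
restored = dequantize(q, s)
```

Real quantization toolchains add per-channel scales, zero points and calibration; the sketch only shows the fewer-bits idea named in the definition.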
Deep neural networks have achieved great success in many areas, such as natural language processing, speech recognition and computer vision. A trained neural network model is generally deployed on a selected hardware platform, such as a cloud device, an edge device, a mobile phone or an embedded terminal, where its performance can be tested and actual scene tasks, such as image recognition, target detection and scene segmentation, can be executed.
In order to obtain an optimal neural network model that can run on a hardware platform, the related art searches for the most appropriate model structure for a specific application or specific hardware through automatic neural architecture search (NAS for short), or first trains a super network and then searches out an optimal sub-network for each task.
During research, the inventor found that these methods for obtaining an optimal neural network model need to reinitialize the neural network search space for each new scene task, which is hugely time-consuming and yields low search efficiency.
To solve the above technical problems, the inventor provides a neural network model design method. The method draws all neural network modules from an existing neural network knowledge base, constructs a series of candidate neural networks matched with the scene function based on the actual scene task, and then, from the modules able to run on the hardware platform, screens out the candidate neural network model with the best running speed and accuracy on that platform as the target neural network model. The whole process requires no model search, which improves the efficiency of model design. Moreover, the neural network model design process in the embodiment of the invention can proceed in parallel with hardware platform development; that is, the design method can be executed on its own to obtain an optimal deployable neural network model while the platform is still being developed. This decouples the model design flow from the hardware platform development flow and from the model deployment flow, reducing research and development time.
First, the implementation process of the neural network model design method provided by the embodiment of the present invention is described in detail below. Referring to fig. 1, fig. 1 is a schematic flow chart of the neural network model design method provided by an embodiment of the present invention; the method includes:
and S11, acquiring a neural network module library.
It is understood that the neural network module library may include a plurality of neural network modules. These modules are drawn from key modules of advanced neural network structures in a neural network knowledge base; a key module may be the core component that distinguishes one neural network structure from others. Modules may also come from published research papers, from modules adjusted according to practical effect, and from custom modules.
For example, the neural network module library may include an inverted residual block (denoted as B1), a Fire module (denoted as B2), a PSP module (denoted as B3), an adjusted PSP module (denoted as B4), a custom upsampling module (denoted as B5), and the like.
In some possible embodiments, each neural network module in the library may further correspond to a precision index that characterizes the module's reference performance in an instantiated neural network. For example, on ImageNet classification, example networks are built by combining modules of a single type, such as MobileNetV3 or SqueezeNet, and the precision index of each neural network module under its example network is recorded as the reference index; for instance, the classification accuracy on ImageNet of SqueezeNet, which consists of Fire modules, is recorded.
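The module library and its per-module precision indices can be pictured as a small registry. The sketch below is hypothetical: the module ids and descriptions follow the example above, while the reference networks and accuracy figures are placeholder assumptions, not values stated in the patent.

```python
# Illustrative registry for the module library of step S11.
MODULE_LIBRARY = {
    "B1": {"desc": "inverted residual block",  "ref_net": "MobileNetV3", "ref_acc": 0.75},
    "B2": {"desc": "Fire module",              "ref_net": "SqueezeNet",  "ref_acc": 0.57},
    "B3": {"desc": "PSP module",               "ref_net": None,          "ref_acc": None},
    "B4": {"desc": "adjusted PSP module",      "ref_net": None,          "ref_acc": None},
    "B5": {"desc": "custom upsampling module", "ref_net": None,          "ref_acc": None},
}

def reference_accuracy(module_id):
    # Look up the precision index recorded for a module, if any.
    return MODULE_LIBRARY[module_id]["ref_acc"]
```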
And S12, determining a plurality of target neural network modules matched with the scene tasks from the neural network module library according to the scene tasks.
In some possible embodiments, a plurality of target neural network modules matched with the scene task are selected from the neural network module library, and a plurality of neural network models can then be obtained by combinatorial design from the selected target modules. The scene task may be segmentation, classification, detection and recognition, and so on. When a user determines that a classification task is currently required, the current scene task can be selected as the classification task. The user may also preset a combination sequence of multiple scene tasks, for example a classification task, then a detection and recognition task, then a segmentation task, and the target neural network modules matched with each task scene are selected in turn according to that sequence.
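Step S12 amounts to filtering the library by task. A minimal sketch follows, assuming hypothetical per-module task tags; the tag assignments below are invented for illustration and are not stated in the patent.

```python
# Hypothetical mapping from module id to the scene tasks it suits.
MODULE_TASKS = {
    "B1": {"classification", "detection"},
    "B2": {"classification"},
    "B3": {"segmentation"},
    "B4": {"segmentation"},
    "B5": {"segmentation", "detection"},
}

def select_target_modules(scene_task, module_tasks=MODULE_TASKS):
    # Step S12: pick every module whose tags contain the scene task.
    return sorted(m for m, tasks in module_tasks.items() if scene_task in tasks)

segmentation_modules = select_target_modules("segmentation")
```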
And S13, constructing a target neural network model matched with the scene task according to the plurality of target neural network modules.
According to the neural network model design method provided by the embodiment of the invention, a neural network module library is obtained, target neural network modules capable of solving the actual scene task can be selected from it, and a neural network model matched with the scene task is then constructed based on those target modules. The difference from the prior art is as follows: in the prior art, an optimal neural network model must be obtained by searching the neural network structure, which takes a long time and yields low search efficiency; the embodiment of the invention constructs the model directly from the module library without searching, which reduces time consumption and improves design efficiency.
Optionally, in some possible scenarios, because the number of selected target neural network modules is large, a large number of neural network models may be designed by combination. Meanwhile, in order to enable the designed target neural network model to run on a target hardware platform, an implementation manner for obtaining the target neural network model is given below. Referring to fig. 2, fig. 2 is a schematic flow chart of an implementation manner of step S13 provided by the embodiment of the present invention; that is, step S13 may include:
s131, constructing a candidate neural network module set and a first candidate neural network set according to the plurality of target neural network modules.
In some possible embodiments, each first candidate neural network in the first candidate neural network set is formed by a combination of at least two target neural network modules. Each first candidate neural network is matched with the scene task; that is, a designed first candidate neural network can solve the current scene task, such as segmentation, classification, or detection and recognition. The function index of each first candidate neural network is within a set function index range; the function index may be a precision index, and, for example, when the precision index of a network is greater than or equal to 80%, the corresponding network may be used as a first candidate neural network. Each candidate neural network module in the candidate neural network module set is one of the plurality of target neural network modules, and each candidate neural network module is supported to run on the target hardware platform.
And S132, determining a target neural network model according to the first candidate neural network set and the candidate neural network module set.
Optionally, in order to select a possible target neural network from the first candidate neural network set, referring to fig. 3, fig. 3 is a schematic flow chart of an implementation manner of step S132 provided in the embodiment of the present invention; that is, a possible implementation manner of step S132 is:
s132-1, determining the first candidate neural networks which meet the set conditions in the first candidate neural network set as the target neural network model.
In some possible embodiments, the set conditions may be: every target neural network module contained in the first candidate neural network belongs to the candidate neural network module set, i.e., all of its modules are supported on the target hardware platform, and the first candidate neural network has the minimum runtime on the target hardware platform.
For example, the target neural network modules may be B1, B2, B3, B4 and B5, and the first candidate neural networks designed in combination according to the scene task are model M1 (comprising B1, B4 and B5), model M2 (comprising B2, B3 and B5) and model M3 (comprising B4 and B5). The candidate neural network module set includes B1, B2, B4 and B5. It can be seen that B3 is not supported to run on the target hardware platform, so model M2 can be eliminated. The runtime of model M3 on the target hardware platform is less than that of model M1, so the running speed of model M3 is optimal, and M3 can be used as the target neural network model.
It is understood that, if the runtime of model M3 on the target hardware platform is the same as that of model M1, then either model M3 or model M1 may be selected as the target neural network model.
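The selection rule of S132-1 can be sketched as follows: keep only candidates whose modules all appear in the candidate module set, then take the one with minimum runtime. The runtime figures below are invented for the example.

```python
def pick_target_model(candidates, supported_modules, runtimes):
    # A candidate is feasible only if every one of its modules is in the
    # candidate module set (i.e. supported on the target hardware platform).
    feasible = [name for name, modules in candidates.items()
                if set(modules) <= set(supported_modules)]
    # Among feasible candidates, the one with minimum runtime wins.
    return min(feasible, key=runtimes.__getitem__) if feasible else None

candidates = {"M1": ["B1", "B4", "B5"],
              "M2": ["B2", "B3", "B5"],   # contains unsupported B3
              "M3": ["B4", "B5"]}
supported = {"B1", "B2", "B4", "B5"}
runtimes = {"M1": 12.0, "M2": 9.0, "M3": 8.0}   # illustrative milliseconds
target = pick_target_model(candidates, supported, runtimes)
```

Returning `None` when no candidate is feasible corresponds to the case handled later by step S14.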
Optionally, in order to facilitate understanding of the implementation of obtaining the candidate neural network and the candidate neural network module according to the embodiment of the present invention, a possible implementation is given below, referring to fig. 4, where fig. 4 is a schematic flowchart of an implementation of S131 provided by the embodiment of the present invention, and step S131 may include:
s131-1, performing weighted combination on at least two target neural network modules in the plurality of target neural network modules according to the scene task and the set hyper-parameters to obtain a plurality of first candidate neural networks to form a first candidate neural network set.
In some possible embodiments, the feature structure of the candidate neural network to be designed may be determined by analyzing the implementation characteristics of the current scene task, so that the designed first candidate neural network can realize the function matched with the current scene task. For example, a segmentation task is characterized by large scenes but small key objects, so multi-layer semantic features are considered during design. Meanwhile, in order to enable the designed neural network to run quickly on the hardware platform, lightweight neural network modules can be selected for the design.
For example, if the current scene task is a segmentation task, then, combining the above design factors, a designed neural network model may be model M1: B4×n1 + B1×n2 + B5×n3, where n1, n2 and n3 are hyper-parameters that can be determined according to the size of the segmented objects and the actual experimental effect. Because hardware platforms support different modules to different degrees, more alternative model structures can be designed, such as model M2: B3×n4 + B2×n5 + B5×n6, where n4, n5 and n6 are likewise hyper-parameters.
In a possible implementation manner, a richer structure may also be added to a first candidate neural network that has already been designed, such as modifying the structure of model M1 and using another structure instead of B1 to implement the cascade between B4 and B4, so as to obtain model M3.
These models are trained and tested to obtain scene segmentation results, and the models whose accuracy meets the service requirement form the first candidate neural network set.
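The combination rule B4×n1 + B1×n2 + B5×n3 can be read as "repeat each module n times and cascade". A sketch with hypothetical hyper-parameter values:

```python
def compose(spec):
    # Expand an ordered [(module_id, repeat_count), ...] spec into the
    # cascaded layer sequence, e.g. M1 = B4*n1 + B1*n2 + B5*n3.
    layers = []
    for module_id, n in spec:
        layers.extend([module_id] * n)
    return layers

# n1 = 2, n2 = 3, n3 = 1 are illustrative hyper-parameter choices,
# not values given in the patent.
m1_layers = compose([("B4", 2), ("B1", 3), ("B5", 1)])
```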
S131-2, cascading each target neural network module to obtain a network corresponding to that target neural network module; and taking the target neural network modules whose corresponding networks are supported to run on the hardware platform as candidate neural network modules to form the candidate neural network module set.
In some possible implementation manners, each target neural network module may be cascaded with itself to obtain a network with an n-layer structure; specifically, n copies of a single target module are cascaded. For example, with module B1 and n = 5, a B1×5 network is formed. The networks formed from the individual modules all have the same scale. Each such network is then converted into a form supported by the same target hardware platform, and the modules whose networks cannot be converted are excluded, yielding the candidate neural network module library. For example, if the target hardware platform does not support running target module B3, while B1, B2, B4 and B5 are all supported, then the candidate neural network module set includes B1, B2, B4 and B5.
In one possible implementation manner, after the candidate neural network module set is obtained, the test runtime of each candidate neural network module may be recorded, and all candidate neural network modules may then be ranked by test runtime; for example, the ranking result of the modules in the candidate neural network module library is B2 > B1 > B4 > B5.
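Measuring and ranking the module networks by test runtime might look like the sketch below; `benchmark` is a stand-in for running the converted B×n network on the platform, and the timing figures used for the ranking are invented for illustration.

```python
import time

def benchmark(run_fn, repeats=10):
    # Average wall-clock time of a stand-in for one cascaded module network.
    start = time.perf_counter()
    for _ in range(repeats):
        run_fn()
    return (time.perf_counter() - start) / repeats

def rank_by_runtime(module_runtimes):
    # Fastest module first, mirroring the ordering B2 > B1 > B4 > B5.
    return sorted(module_runtimes, key=module_runtimes.get)

measured = {"B1": 0.8, "B2": 0.5, "B4": 0.9, "B5": 1.1}  # illustrative seconds
ranking = rank_by_runtime(measured)
```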
Optionally, in one scenario, during the determination of the target neural network model there may be no first candidate neural network that satisfies the set conditions; for example, if the hardware platform does not support B1, B3 and B4, then M1 (comprising B1, B4 and B5), M2 (comprising B2, B3 and B5) and M3 (comprising B4 and B5) all fail the set conditions. In this case a second candidate neural network set may be further constructed. A possible implementation manner is given below on the basis of fig. 1. Referring to fig. 5, fig. 5 is one of the schematic flow charts of another neural network model design method provided by an embodiment of the present invention, and the method further includes:
and S14, when the first candidate neural network which meets the set conditions does not exist, constructing a second candidate neural network set based on the candidate neural network module set.
It is to be appreciated that each second candidate neural network in the second candidate neural network set is matched with the scene task, and the function index of each second candidate neural network is within the set function index range. The scale of a constructed second candidate neural network is consistent with that of the first candidate neural networks, and a target neural network model matched with the hardware platform is then screened out based on the second candidate neural networks.
Optionally, to facilitate understanding of the above implementation process for constructing the second candidate neural network set, a possible implementation is given below, referring to fig. 6, where fig. 6 is a schematic flow chart of an implementation of step S14, and step S14 may include:
s141, modifying the feature layer of at least one neural network module of each first candidate neural network to obtain a modified first candidate neural network, and taking the modified first candidate neural network as a second candidate neural network set.
In some possible embodiments, if the first candidate neural network set contains a first candidate neural network that includes a target neural network module which does not support running on the target hardware platform, but which could be deployed on the platform after a small structural modification, the target neural network module may be modified, and the first candidate neural network corresponding to the modified module may be used as a second candidate neural network.
for example, the model M1 includes modules B1, B4, and B5, and the tested target hardware platform does not support run B1, but may be deployed after modifying the feature layer in B1, for example, modifying B1 to B1 'after modifying the feature layer ReLU6 in B1 to a ReLU layer, to obtain a modified model M1', which includes B4 n1+ B1 'n 2+ B5 n3, and similarly, M3' may also perform similar processing, and make the modified first candidate neural network into the second candidate neural network set.
S142, sequencing all candidate neural network modules based on the running time corresponding to each candidate neural network module; and constructing a second candidate neural network set based on the ordered candidate neural network module sets and the speed selection strategy.
In other possible embodiments, second candidate neural networks matched with the scene task may be reconstructed from the candidate neural network modules in the candidate neural network module set. For example, the candidate neural network module library includes B1, B2, B4 and B5, and the ranking result of the modules by test runtime is B2 > B1 > B4 > B5; based on this, a second candidate neural network may be reconstructed, such as model M4: B2×n7 + B4×n8 + B5×n9, where n7, n8 and n9 are hyper-parameters.
It should be noted that the construction of the second candidate neural networks differs from that of the first candidate neural networks as follows: when the first candidate neural networks are constructed, the speed index of each neural network module being combined is unknown, so the construction is blind and aims to obtain all candidate neural network modules that can match the scene task; whereas the second candidate neural networks are constructed based on the candidate neural network module library, in which each candidate module already has a corresponding runtime, so the modules for constructing a second candidate neural network can be selected chiefly according to running speed.
It should be further noted that there is no fixed execution sequence between step S141 and step S142. In one scenario, the second candidate neural networks may be obtained only through step S141; in another scenario, when the candidate modules cannot be deployed even after structural modification, the neural network modules in the candidate neural network module library may be recombined through step S142 to obtain the second candidate neural network set.
Optionally, the neural network model design process in the embodiment of the present invention may proceed in parallel with hardware platform development, so as to decouple the model design flow from the hardware platform development flow. That is, after the target neural network model is obtained, it may be taken to the target hardware platform for deployment to test its performance. An implementation manner is given below on the basis of fig. 1. Referring to fig. 7, fig. 7 is a second schematic flow chart of another neural network model design method provided in the embodiment of the present invention, and the method may further include:
and S15, when the performance index of the target neural network model on the target hardware platform is in the set performance index range, deploying the target neural network model on the target hardware platform.
Optionally, if the screened target neural network model is deployed on the target hardware platform but its precision or speed cannot reach the set conditions, the target neural network model may be further adjusted so that it can finally meet the set conditions. An implementation manner for adjusting the target neural network model is given below. Referring to fig. 8, fig. 8 is a third schematic flowchart of another neural network model design method provided in the embodiment of the present invention, and the method may further include:
and S16, when the speed index of the target neural network model on the target hardware platform is not in the speed index range and the precision index is in the precision index range, compressing the target neural network model so that the performance index of the compressed target neural network model is in the performance index range.
And S17, distilling the target neural network model when the speed index of the target neural network model on the target hardware platform is in the speed index range and the precision index is not in the precision index range, so that the performance index of the distilled target neural network model is in the performance index range.
For example, suppose the target neural network model is the model M3 in the above embodiment. If, after M3 is deployed, the speed index is not in the speed index range while the accuracy index is in the accuracy index range, the network is further compressed and accelerated by pruning and quantization to obtain a model meeting the requirements, and deployment is completed. Suppose instead the target neural network model is the model M4 in the above embodiment. If, after M4 is deployed, the speed index is in the speed index range while the accuracy index is not in the accuracy index range, the accuracy of M4 is improved by distillation. In the process of modifying the model by distillation, the teacher model is the model with the highest accuracy in the candidate neural network set, and the student model is a model that can be deployed directly on the hardware or a model compressed in step S16; the student model is retrained to obtain the model finally used for deployment, and deployment is then completed.
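The soft-target term minimized when the student is retrained against the teacher can be sketched in plain Python as follows. This is the generic knowledge-distillation loss; the temperature value and the logits are illustrative assumptions rather than values from the embodiment:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the softened teacher distribution and the
    softened student distribution. Retraining the student drives this
    loss down, transferring the high-accuracy teacher's behavior."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))

teacher = [3.0, 1.0, 0.2]      # logits of the highest-accuracy candidate
untrained = [0.0, 0.0, 0.0]    # student logits before retraining
# The loss is strictly smaller once the student matches the teacher:
print(distill_loss(untrained, teacher) > distill_loss(teacher, teacher))  # True
```

In practice this soft-target term is usually combined with the ordinary hard-label cross-entropy on the training data.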
It should be noted that there is no fixed execution sequence between step S16 and step S17. In one scenario, when the speed of the target neural network model does not meet the requirement, step S16 may be executed; if the accuracy of the target neural network model does not meet the requirement, step S17 may be executed.
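The quantization half of the step-S16 compression can be illustrated with a minimal symmetric int8 weight quantizer. The weight values below are made-up examples, and real toolchains (per-channel scales, calibration data) are considerably more involved:

```python
# Minimal symmetric per-tensor int8 quantization sketch: w ≈ scale * q.
# Weight values are illustrative assumptions, not from the embodiment.

def quantize_int8(weights):
    """Quantize floats to int8 codes in [-128, 127] plus one scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [scale * v for v in q]

weights = [0.5, -1.27, 0.02]
q, scale = quantize_int8(weights)
print(q)  # [50, -127, 2]
```

The reconstruction error per weight is bounded by about half the scale, which is why quantization trades a small accuracy loss for much faster integer inference.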
Optionally, in the prior art, another solution based on neural network search is to first train a super network and then search for the optimal sub-network for each task. This search is efficient, but since the super network is determined in advance, it cannot be updated with domain knowledge. To solve this problem, a possible implementation is given below. Referring to fig. 9, fig. 9 is a fourth schematic flowchart of another neural network model design method provided by an embodiment of the present invention, and the method may further include:
and S18, updating the neural network module library according to the neural network knowledge library.
Based on the above embodiments, the neural network model design method in the embodiments of the present invention can obtain a target neural network model that can be deployed on a hardware platform; moreover, the design method can be executed synchronously with the deployment process, thereby decoupling the model design flow from the hardware platform development flow and the model deployment flow, and reducing development time. To this end, an embodiment of the present invention provides a neural network model deployment method. Referring to fig. 10, fig. 10 is a schematic flowchart of a neural network model deployment method provided in an embodiment of the present invention, where the method includes:
and S21, acquiring the target neural network model.
It is understood that the target neural network model can be obtained by various embodiments of the neural network model design method provided in the above embodiments, and the details are not repeated herein.
And S22, when the performance index of the target neural network model on the target hardware platform is in the set performance index range, deploying the target neural network model on the target hardware platform.
It can be understood that when the performance index of the target neural network model on the target hardware platform is within the set performance index range, the target neural network model is deployed on the target hardware platform. If the performance index of the target neural network model on the target hardware platform is not within the set performance index range, the target neural network model may be modified through steps S16 and S17 in the above embodiment, so that the modified target neural network model meets the deployment condition.
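Combining steps S15 to S17, the deployment decision can be sketched as a small dispatcher. The threshold values, and the "redesign" fallback for the case where both indices fail, are assumptions added for illustration:

```python
def deployment_action(speed_ms, accuracy, speed_max_ms, accuracy_min):
    """Map measured performance on the target hardware platform to the
    next step: deploy (S15), compress (S16), distill (S17), or redesign.
    Thresholds are illustrative; the patent only specifies index ranges."""
    speed_ok = speed_ms <= speed_max_ms
    accuracy_ok = accuracy >= accuracy_min
    if speed_ok and accuracy_ok:
        return "deploy"       # S15: both indices within range
    if not speed_ok and accuracy_ok:
        return "compress"     # S16: prune / quantize to speed up
    if speed_ok and not accuracy_ok:
        return "distill"      # S17: raise accuracy via a teacher model
    return "redesign"         # neither index within range

print(deployment_action(25.0, 0.88, speed_max_ms=30.0, accuracy_min=0.90))  # distill
```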
In order to implement the above steps S11 to S18 to achieve the corresponding technical effects, an implementation manner of a neural network model designing apparatus is given below, referring to fig. 11, fig. 11 is a functional block diagram of a neural network model designing apparatus according to an embodiment of the present invention, wherein the neural network model designing apparatus 30 includes: an acquisition module 301, a construction module 302 and a determination module 303.
An obtaining module 301, configured to obtain a neural network module library; the neural network module library comprises a plurality of neural network modules.
And the determining module 303 is configured to determine, according to the scenario task, a plurality of target neural network modules that match the scenario task from the neural network module library.
A building module 302, configured to build a target neural network model matched with the scene task according to the plurality of target neural network modules.
It is understood that the obtaining module 301, the constructing module 302 and the determining module 303 may be used to perform the steps S11 to S13 to achieve the corresponding technical effect.
Optionally, the neural network model designing apparatus 30 may further include a deployment module, and the deployment module is configured to deploy the target neural network model on the target hardware platform when the performance index of the target neural network model on the target hardware platform is within the set performance index range. The deployment module is further configured to compress the target neural network model when the speed index of the target neural network model on the target hardware platform is not in the speed index range and the precision index is in the precision index range, so that the performance index of the compressed target neural network model is in the performance index range; or to distill the target neural network model when the speed index of the target neural network model on the target hardware platform is in the speed index range and the precision index is not in the precision index range, so that the performance index of the distilled target neural network model is in the performance index range.
Optionally, the neural network model designing apparatus 30 may further include an updating module, and the updating module is configured to update the neural network module library according to the neural network knowledge library.
Optionally, the building module 302 is specifically configured to: constructing a candidate neural network module set and a first candidate neural network set according to the plurality of target neural network modules; each first candidate neural network in the first candidate neural network set is formed by combining at least two target neural network modules; each first candidate neural network is matched with a scene task; the function index of each first candidate neural network is in a set function index range; each candidate neural network module in the candidate neural network module set is one of a plurality of target neural network modules and each candidate neural network module is supported to run on a target hardware platform; and determining a target neural network model according to the first candidate neural network set and the candidate neural network module set.
Optionally, the determining module 303 is specifically configured to determine a first candidate neural network in the first candidate neural network set that meets a set condition as the target neural network model; wherein the set condition is that each target neural network module included in the first candidate neural network is included in the candidate neural network module set, and the first candidate neural network has the minimum runtime on the target hardware platform.
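The selection rule implemented by the determining module 303 can be sketched as follows. Candidate names, module lists, and runtimes are illustrative assumptions:

```python
def select_target(candidates, deployable_modules):
    """Among first candidate networks whose every module is in the
    candidate (deployable) module set, return the name of the one with
    the minimum runtime on the target hardware platform; None if no
    candidate is feasible (triggering construction of a second set)."""
    feasible = [c for c in candidates
                if set(c["modules"]) <= set(deployable_modules)]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c["runtime_ms"])["name"]

candidates = [
    {"name": "M1", "modules": ["A", "B"], "runtime_ms": 18.0},
    {"name": "M2", "modules": ["A", "C"], "runtime_ms": 12.0},  # C not deployable
    {"name": "M3", "modules": ["A", "B"], "runtime_ms": 15.0},
]
print(select_target(candidates, ["A", "B"]))  # M3
```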
Optionally, the building module 302 is specifically configured to: combine at least two target neural network modules of the plurality of target neural network modules according to the scene task and the set hyper-parameters to obtain a plurality of first candidate neural networks, forming the first candidate neural network set; cascade each target neural network module to obtain a network corresponding to that target neural network module; and take the target neural network modules whose corresponding networks are supported to run on the target hardware platform as candidate neural network modules, forming the candidate neural network module set. The building module 302 is further specifically configured to modify the structure of a first candidate neural network to obtain a modified first candidate neural network, and add the modified first candidate neural network to the first candidate neural network set.
Optionally, the building module 302 is further configured to: when no first candidate neural network meets the set condition, construct a second candidate neural network set based on the candidate neural network module set; each second candidate neural network in the second candidate neural network set is matched with the scene task, and the function index of each second candidate neural network is within the set function index range. The building module 302 is further specifically configured to: modify the feature layer of at least one target neural network module of each first candidate neural network to obtain a modified first candidate neural network, and take the modified first candidate neural networks as the second candidate neural network set; and/or rank all candidate neural network modules based on the running time corresponding to each candidate neural network module, and construct the second candidate neural network set based on the ranked candidate neural network modules and the speed selection strategy.
In order to implement the above steps S21 to S22 to achieve the corresponding technical effect, an implementation manner of a neural network model deployment apparatus is given below, referring to fig. 12, fig. 12 is a functional block diagram of a neural network model deployment apparatus according to an embodiment of the present invention, where the neural network model deployment apparatus 40 includes: an acquisition module 401 and a deployment module 402.
An obtaining module 401, configured to obtain a target neural network model.
The deployment module 402 is configured to deploy the target neural network model on the target hardware platform when the performance index of the target neural network model on the target hardware platform is within the set performance index range.
Referring to fig. 13, fig. 13 is a block diagram of an electronic device according to an embodiment of the present invention. The electronic device 50 comprises a communication interface 501, a processor 502 and a memory 503. The processor 502, the memory 503, and the communication interface 501 are electrically connected to each other, directly or indirectly, to enable transmission or interaction of data. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 503 can be used for storing software programs and modules, such as program instructions/modules corresponding to the neural network model deployment method and/or the neural network model design method provided by the embodiment of the present invention, and the processor 502 executes various functional applications and data processing by running the software programs and modules stored in the memory 503. The communication interface 501 may be used for communicating signaling or data with other node devices. The electronic device 50 may have a plurality of communication interfaces 501 in the present invention.
The memory 503 may be, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 502 may be an integrated circuit chip having signal processing capabilities. The processor may be a general-purpose processor including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.
It is understood that the respective modules of the neural network model deploying device 40 or the neural network model designing device 30 may be stored in the form of software or Firmware (Firmware) in the memory 503 of the electronic device 50 and executed by the processor 502, and at the same time, data, codes of programs, and the like required for executing the modules may be stored in the memory 503.
Embodiments of the present invention provide a storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements a neural network model deployment method and/or a neural network model design method as in any one of the foregoing embodiments. The computer readable storage medium may be, but is not limited to, various media that can store program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a PROM, an EPROM, an EEPROM, a magnetic or optical disk, etc.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.
In the description of the present invention, it should be noted that if the terms "upper", "lower", "inside", "outside", etc. indicate an orientation or a positional relationship, this is based on the orientation or positional relationship shown in the drawings, or on the orientation in which the product of the present invention is customarily used, and is only for convenience and simplification of description; it does not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.

Claims (16)

1. A method of neural network model design, the method comprising:
acquiring a neural network module library; the neural network module library comprises a plurality of neural network modules;
determining a plurality of target neural network modules matched with the scene tasks from the neural network module library according to the scene tasks;
and constructing a target neural network model matched with the scene task according to the plurality of target neural network modules.
2. The neural network model design method of claim 1, wherein constructing a target neural network model matching the scene task according to the target neural network module comprises:
constructing a candidate neural network module set and a first candidate neural network set according to the plurality of target neural network modules;
wherein each first candidate neural network in the first set of candidate neural networks is formed by combining at least two target neural network modules; each first candidate neural network is matched with a scene task; the function index of each first candidate neural network is in a set function index range; each candidate neural network module in the set of candidate neural network modules is one of the plurality of target neural network modules and each of the candidate neural network modules is supported to run on a target hardware platform;
determining the target neural network model from the first set of candidate neural networks and the set of candidate neural network modules.
3. The neural network model design method of claim 2, wherein determining the target neural network model from the first set of candidate neural networks and the set of candidate neural network modules comprises:
determining a first candidate neural network in the first candidate neural network set that meets a set condition as the target neural network model;
wherein the setting conditions are as follows: each of the target neural network modules included in the first candidate neural network is included in the set of candidate neural network modules, and the first candidate neural network has a minimum runtime on the target hardware platform.
4. The neural network model designing method according to claim 3, further comprising, after determining the first candidate neural networks satisfying a set condition in the first set of candidate neural networks as a target neural network model:
and when the performance index of the target neural network model on the target hardware platform is within a set performance index range, deploying the target neural network model on the target hardware platform.
5. The neural network model design method of claim 1, further comprising, after constructing a target neural network model matching the scenario task from the plurality of target neural network modules:
and updating the neural network module library according to the neural network knowledge library.
6. The neural network model design method of claim 2, wherein constructing a set of candidate neural network modules and a first set of candidate neural networks from the plurality of target neural network modules comprises:
combining at least two target neural network modules in the plurality of target neural network modules according to the scene task and the set hyper-parameters to obtain a plurality of first candidate neural networks to form a first candidate neural network set;
cascading each target neural network module to obtain a network corresponding to the target neural network module; and taking the target neural network modules whose corresponding networks are supported to run on the target hardware platform as the candidate neural network modules to form the candidate neural network module set.
7. The neural network model design method of claim 6, further comprising:
and modifying the structure of the first candidate neural network to obtain a modified first candidate neural network, and adding the modified first candidate neural network to the first candidate neural network set.
8. The neural network model design method of claim 3, further comprising:
when the first candidate neural network meeting the set condition does not exist, constructing a second candidate neural network set based on the candidate neural network module set;
wherein each second candidate neural network in the second set of candidate neural networks is matched to the scenario task; the function index of each second candidate neural network is within the set function index range.
9. The neural network model design method of claim 8, wherein constructing a second set of candidate neural networks based on the set of candidate neural network modules comprises:
modifying the feature layer of at least one target neural network module of each first candidate neural network to obtain a modified first candidate neural network, and taking the modified first candidate neural network as the second candidate neural network set; and/or,
ranking all the candidate neural network modules based on the running time corresponding to each candidate neural network module; and constructing the second candidate neural network set based on the ranked candidate neural network modules and a speed selection strategy.
10. The neural network model design method of claim 4, wherein the performance indicators include a speed indicator and a precision indicator; the performance index range comprises a speed index range and a precision index range; further comprising:
when the speed index of the target neural network model on the target hardware platform is not in the speed index range and the precision index is in the precision index range, compressing the target neural network model so that the performance index of the compressed target neural network model is in the performance index range; or,
when the speed index of the target neural network model on the target hardware platform is in the speed index range and the precision index is not in the precision index range, distilling the target neural network model so that the performance index of the distilled target neural network model is in the performance index range.
11. A neural network model deployment method, the method comprising:
acquiring a target neural network model;
and when the performance index of the target neural network model on a target hardware platform is within a set performance index range, deploying the target neural network model on the target hardware platform.
12. The neural network model deployment method of claim 11, wherein the method comprises: the target neural network model is obtained by the neural network model designing method according to any one of claims 1 to 10.
13. An apparatus for designing a neural network model, comprising:
the acquisition module is used for acquiring a neural network module library; the neural network module library comprises a plurality of neural network modules;
the determining module is used for determining a plurality of target neural network modules matched with the scene tasks from the neural network module library according to the scene tasks;
and the construction module is used for constructing a target neural network model matched with the scene task according to the plurality of target neural network modules.
14. A neural network model deployment device, comprising:
the acquisition module is used for acquiring a target neural network model;
and the deployment module is used for deploying the target neural network model on the target hardware platform when the performance index of the target neural network model on the target hardware platform is within the set performance index range.
15. An electronic device comprising a processor and a memory, the memory storing a computer program executable by the processor, the computer program executable by the processor to implement the neural network model design method of any one of claims 1-10 and/or to implement the neural network model deployment method of any one of claims 11-12.
16. A storage medium having stored thereon a computer program for implementing a neural network model design method according to any one of claims 1 to 10 and/or for implementing a neural network model deployment method according to any one of claims 11 to 12 when executed by a processor.
CN202011598884.7A 2020-12-29 2020-12-29 Neural network model design method, deployment method, electronic device and storage medium Pending CN112598117A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011598884.7A CN112598117A (en) 2020-12-29 2020-12-29 Neural network model design method, deployment method, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011598884.7A CN112598117A (en) 2020-12-29 2020-12-29 Neural network model design method, deployment method, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN112598117A true CN112598117A (en) 2021-04-02

Family

ID=75203775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011598884.7A Pending CN112598117A (en) 2020-12-29 2020-12-29 Neural network model design method, deployment method, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112598117A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408634A (en) * 2021-06-29 2021-09-17 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
JP7090817B1 (en) * 2021-06-14 2022-06-24 三菱電機株式会社 Neural network construction device, neural network construction method, image processing device, and image processing method
CN114708270A (en) * 2021-12-15 2022-07-05 华东师范大学 Semantic segmentation model compression system and method based on knowledge aggregation and decoupling distillation
CN114861890A (en) * 2022-07-05 2022-08-05 深圳比特微电子科技有限公司 Method and device for constructing neural network, computing equipment and storage medium
WO2023116416A1 (en) * 2021-12-24 2023-06-29 中兴通讯股份有限公司 Neural network generation method, indication information sending method, communication node, and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018555A1 (en) * 2016-07-15 2018-01-18 Alexander Sheung Lai Wong System and method for building artificial neural network architectures
CN107977706A (en) * 2017-08-09 2018-05-01 小蚁科技(香港)有限公司 Modularized distribution type artificial neural network
WO2020092810A1 (en) * 2018-10-31 2020-05-07 Movidius Ltd. Automated generation of neural networks
WO2020099606A1 (en) * 2018-11-15 2020-05-22 Camlin Technologies Limited Apparatus and method for creating and training artificial neural networks
US10748057B1 (en) * 2016-09-21 2020-08-18 X Development Llc Neural network modules


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAOLIANG LU et al.: "Task decomposition based on class relations: A modular neural network architecture for pattern classification", International Work-Conference on Artificial Neural Networks *
MENG XI: "Research on optimization design and application of modular neural networks", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7090817B1 (en) * 2021-06-14 2022-06-24 三菱電機株式会社 Neural network construction device, neural network construction method, image processing device, and image processing method
WO2022264186A1 (en) * 2021-06-14 2022-12-22 三菱電機株式会社 Neural network construction device, neural network construction method, image processing device, and image processing method
CN113408634A (en) * 2021-06-29 2021-09-17 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN113408634B (en) * 2021-06-29 2022-07-05 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
WO2023272987A1 (en) * 2021-06-29 2023-01-05 深圳市商汤科技有限公司 Model recommendation method and apparatus, and device and computer storage medium
CN114708270A (en) * 2021-12-15 2022-07-05 华东师范大学 Semantic segmentation model compression system and method based on knowledge aggregation and decoupling distillation
CN114708270B (en) * 2021-12-15 2023-08-08 华东师范大学 Application of compression method based on knowledge aggregation and decoupling distillation in semantic segmentation
WO2023116416A1 (en) * 2021-12-24 2023-06-29 中兴通讯股份有限公司 Neural network generation method, indication information sending method, communication node, and medium
CN114861890A (en) * 2022-07-05 2022-08-05 深圳比特微电子科技有限公司 Method and device for constructing neural network, computing equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112598117A (en) Neural network model design method, deployment method, electronic device and storage medium
CN110807515A (en) Model generation method and device
CN110515986B (en) Processing method and device of social network diagram and storage medium
CN109993298A (en) Method and apparatus for compressing neural network
CN109523022A (en) Terminal data processing method, apparatus and terminal
CN112466334B (en) Audio identification method, equipment and medium
CN110706015A (en) Advertisement click rate prediction oriented feature selection method
CN110929867A (en) Method, device and storage medium for evaluating and determining neural network structure
KR20210157302A (en) Method and Apparatus for Automatic Predictive Modeling Based on Workflow
CN116932384A (en) Software defect prediction method based on feature fusion and feature selection
CN110069997B (en) Scene classification method and device and electronic equipment
CN115237804A (en) Performance bottleneck assessment method, performance bottleneck assessment device, electronic equipment, medium and program product
CN111309946A (en) Established file optimization method and device
CN112348188B (en) Model generation method and device, electronic device and storage medium
CN112784025A (en) Method and device for determining target event
CN112733724A (en) Relativity relationship verification method and device based on discrimination sample meta-digger
CN114168446B (en) Simulation evaluation method and device for mobile terminal operation algorithm model
CN114417982A (en) Model training method, terminal device and computer readable storage medium
CN113157582A (en) Method and device for determining execution sequence of test script
CN112433950A (en) Method for automatically building test environment, electronic equipment and storage medium
CN111414538A (en) Text recommendation method and device based on artificial intelligence and electronic equipment
CN116543796B (en) Audio processing method and device, computer equipment and storage medium
CN112766558A (en) Modeling sample generation method, device, equipment and computer readable storage medium
CN117236387A (en) Super network training method, device, equipment, medium and program product
CN113342648A (en) Test result analysis method and device based on machine learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 510000 Block C, 115 Gaopu Road, Tianhe District, Guangzhou City, Guangdong Province

Applicant after: XAG Co., Ltd.

Address before: 510000 Block C, 115 Gaopu Road, Tianhe District, Guangzhou City, Guangdong Province

Applicant before: Guangzhou Xaircraft Technology Co.,Ltd.

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20210402