CN113469352A - Neural network model optimization method, data processing method and device
- Publication number: CN113469352A
- Application number: CN202010243890.4A
- Authority: CN (China)
- Prior art keywords: candidate, data format, algorithm, network model, parameter
- Legal status: Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The present disclosure provides a neural network model optimization method, a data processing method, and corresponding apparatuses, where the neural network model includes a plurality of operators. The optimization method includes: preprocessing the parameter search space corresponding to each of a plurality of candidate algorithms to obtain a reduced parameter search space for each candidate algorithm; searching the reduced parameter search space of each candidate algorithm to obtain the parameter values corresponding to that algorithm; and searching based on the candidate algorithms supported by each of the plurality of operators and the parameter values of those algorithms to obtain the optimized neural network model. By determining a reduced parameter search space for each candidate algorithm and determining parameter values within it before the model-level search, the method improves the optimization efficiency of the neural network model.
Description
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an optimization method of a neural network model, a data processing method, and an apparatus.
Background
The wide application of neural networks in various fields has driven the deployment of a series of intelligent products, such as image recognition, image restoration, and speech recognition. With the continued development of neural network models, the algorithms used in them have become increasingly diverse and their structures increasingly complex, so neural network development suffers from low efficiency and high cost.
Disclosure of Invention
The embodiment of the disclosure at least provides an optimization method of a neural network model, a data processing method and a data processing device.
In a first aspect, an embodiment of the present disclosure provides a method for optimizing a neural network model, where the neural network model includes a plurality of operators. The method includes: preprocessing the parameter search space corresponding to each of a plurality of candidate algorithms to obtain a reduced parameter search space for each candidate algorithm; searching the reduced parameter search space of each candidate algorithm to obtain the parameter values corresponding to that algorithm; and searching based on the candidate algorithms supported by each of the plurality of operators and the parameter values of those algorithms to obtain the optimized neural network model.
In this way, a target algorithm is selected for each of multiple candidate data formats, and a search path is determined for each operator according to the target algorithms of the data formats that operator supports; the search is then performed over the search paths of all operators to obtain the neural network model. Determining a target algorithm for each data format reduces the search space before the search is carried out, which improves the optimization efficiency of the neural network model.
In an optional implementation, preprocessing the parameter search space corresponding to each of the plurality of candidate algorithms to obtain the reduced parameter search space of each candidate algorithm includes: determining a plurality of parameter sets corresponding to each candidate algorithm, where the same parameter takes different values across the parameter sets; performing performance simulation on the parameter sets over a plurality of task scenarios to obtain a performance simulation result for each parameter set; and determining the reduced parameter search space of each candidate algorithm based on the performance simulation results of its parameter sets.
In this way, multiple parameter sets are determined for each candidate algorithm, performance simulation is carried out on those parameter sets over multiple task scenarios, and a reduced parameter search space is determined for each candidate algorithm from the simulation results. The reduced space is smaller than the original parameter search space while the optimal solution of the original space remains inside it, so the parameter values obtained for each candidate algorithm give that algorithm good performance, which in turn improves the performance of the optimized neural network model.
In an optional implementation, determining the reduced parameter search space of each candidate algorithm based on the performance simulation results of its parameter sets includes: determining at least one target parameter set from the parameter sets based on those performance simulation results; and determining the reduced parameter search space of each candidate algorithm based on the values of the parameters contained in the at least one target parameter set.
In this way, the reduced parameter search space of each candidate algorithm is determined from the performance simulation results of its parameter sets.
In an optional embodiment, the candidate algorithm includes common parameters and at least one private parameter, and the parameter search space is the search space of the at least one private parameter.
In an optional implementation, searching based on the candidate algorithms supported by each of the plurality of operators and the parameter values of those algorithms to obtain the optimized neural network model includes: selecting, for each of multiple candidate data formats, a target algorithm from the at least one candidate algorithm corresponding to that format; determining a search path for each operator according to the target algorithms corresponding to the at least one data format that operator supports, where the multiple candidate data formats include the at least one data format; and searching based on the search paths of all operators and the parameter values of the target algorithms in those paths to obtain the optimized neural network model.
In this way, a target algorithm is selected for each candidate data format, a search path is determined for each operator according to the target algorithms of the data formats it supports, and the search is then performed over those paths together with the parameter values of the supported candidate algorithms to obtain the neural network model.
In an optional implementation, selecting the target algorithm corresponding to each candidate data format from the at least one candidate algorithm corresponding to that format includes: determining the first elapsed time of each such candidate algorithm when executing a computation task based on data in that format; and selecting the target algorithm for each candidate data format from its candidate algorithms according to these first elapsed times.
In this way, a target algorithm is determined for each candidate data format from the first elapsed time of its candidate algorithms when executing computation tasks on data in that format; search paths are then formed from each candidate data format and its target algorithm, which reduces the search space of each operator and improves the optimization efficiency of the neural network model.
In an alternative embodiment, the method further includes: for each of the multiple candidate data formats, selecting a target conversion data format from the candidate data formats. Determining the search path of each operator according to the target algorithms of the data formats it supports then includes: determining the search path of each operator according to the target algorithm of each supported data format and the target conversion data format corresponding to that data format.
In this way, the search space can be further reduced on the basis of the above implementation, further improving the optimization efficiency of the neural network model.
In an alternative embodiment, selecting the target conversion data format corresponding to each candidate data format includes: determining, for a first candidate data format among the candidate data formats, the second elapsed time of converting each of at least one second candidate data format (other than the first) into the first candidate data format; and selecting the target conversion data format for the first candidate data format from the at least one second candidate data format based on the second elapsed times.
In this way, for each first candidate data format, the conversion paths that take less time are retained according to the second elapsed time of converting each second candidate data format into it, further reducing the search space of each operator and further improving the optimization efficiency of the neural network model.
In an optional implementation, selecting the target conversion data format for the first candidate data format from the at least one second candidate data format based on the second elapsed time includes: making the selection based on both the first elapsed time of the target algorithm corresponding to the first candidate data format and the second elapsed time of each second candidate data format.
In this way, for each first candidate data format, the target conversion data format is determined from the first elapsed time of that format's target algorithm and the second elapsed time of each second candidate data format, so that the reduced search space retains the better-performing search paths as far as possible; searching over the search paths of the operators then yields an optimized neural network model with better performance.
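To make the combined criterion concrete, the following is a minimal Python sketch of one plausible reading of this selection step; the function name, the timing-dictionary shapes, and the ratio threshold are illustrative assumptions rather than the disclosure's exact procedure.

```python
def select_target_conversion_formats(first_elapsed, convert_elapsed,
                                     formats, ratio=0.5):
    """For each destination format, keep the source formats whose conversion
    cost (second elapsed time) is small relative to the compute cost of the
    destination's target algorithm (first elapsed time); fall back to the
    single cheapest conversion if none qualifies."""
    kept = {}
    for dst in formats:
        sources = [s for s in formats if s != dst]
        cheap = [s for s in sources
                 if convert_elapsed[(s, dst)] <= ratio * first_elapsed[dst]]
        kept[dst] = cheap or [min(sources,
                                  key=lambda s: convert_elapsed[(s, dst)])]
    return kept
```

Under this reading, a conversion path is kept only when its overhead is cheap relative to the compute that follows it, which matches the stated goal of pruning the search space while retaining the better-performing paths.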
In an optional implementation, searching based on the search paths of the operators and the parameter values of the target algorithms in those paths to obtain the optimized neural network model includes: dividing the neural network model into a plurality of sub-network models, each containing at least one operator of the neural network model; searching based on the search paths of the operators in each sub-network model and the parameter values of the target algorithms in those paths to obtain each optimized sub-network model; and obtaining the optimized neural network model from the optimized sub-network models.
In this way, dividing the neural network model into multiple sub-network models splits the search task into multiple subtasks; different subtasks can be executed in parallel on different arithmetic units, which further accelerates the search process and improves optimization efficiency.
In an alternative embodiment, the dividing the neural network model into a plurality of sub-network models includes: the neural network model is divided into a plurality of sub-network models based on at least one of a network depth threshold condition and a data format condition.
In an optional embodiment, the network depth threshold condition comprises: the network depth of the current sub-network model reaches a preset depth threshold; and/or the data format conditions include: the number of data formats supported by the next operator of the current operator is 1.
This yields a more reasonable partition, so that searching each sub-network model produces sub-network models with better performance, which in turn improves the performance of the final neural network model.
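As an illustration of how the two partition conditions can interact, here is a minimal Python sketch that assumes a linear chain of operators (real models are graphs, so this is a simplification); the names and the default depth threshold are illustrative assumptions.

```python
def partition_model(operators, supported_formats, depth_threshold=8):
    """Cut the operator sequence into sub-network models when the current
    sub-network reaches the depth threshold, or when the next operator
    supports exactly one data format (a natural search boundary)."""
    sub_models, current = [], []
    for i, op in enumerate(operators):
        current.append(op)
        next_op = operators[i + 1] if i + 1 < len(operators) else None
        at_depth = len(current) >= depth_threshold
        at_format_boundary = (next_op is not None
                              and len(supported_formats[next_op]) == 1)
        if at_depth or at_format_boundary:
            sub_models.append(current)
            current = []
    if current:
        sub_models.append(current)
    return sub_models
```

Cutting at a single-format operator is attractive because no data-format choice propagates across that boundary, so the sub-searches on either side are independent.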
In an optional implementation, searching based on the search paths of the operators and the parameter values of the target algorithms in those paths to obtain the optimized neural network model includes: for the search paths of each operator, determining a third elapsed time for each search path based on its target algorithm and that algorithm's parameter values; determining a target path based on the third elapsed times of the search paths of all operators; and obtaining the optimized neural network model based on the target path.
Thus, the target path is determined from the third elapsed time of each search path, and the optimized neural network model is obtained from the target path, realizing the optimization of the neural network model.
In a second aspect, an embodiment of the present disclosure provides a data processing method, including: acquiring data to be processed; executing a data processing task on the data to be processed by using a target neural network model to obtain a data processing result of the data to be processed; the target neural network model is obtained based on the optimization method of the neural network model according to any embodiment of the disclosure.
In this way, the data processing method processes the data to be processed using a neural network model generated by the optimization method provided by any embodiment of the present disclosure; since the generated model has better performance, the obtained data processing result is more accurate.
In a third aspect, an embodiment of the present disclosure provides an apparatus for optimizing a neural network model, where the neural network model includes a plurality of operators. The apparatus includes: a preprocessing module configured to preprocess the parameter search space corresponding to each of a plurality of candidate algorithms to obtain a reduced parameter search space for each candidate algorithm; a first search module configured to search the reduced parameter search space of each candidate algorithm to obtain the parameter values corresponding to that algorithm; and a second search module configured to search based on the candidate algorithms supported by each operator and the parameter values of those algorithms to obtain the optimized neural network model.
In a fourth aspect, an embodiment of the present disclosure provides a data processing apparatus, including: the acquisition module is used for acquiring data to be processed; the processing module is used for executing a data processing task on the data to be processed by utilizing a target neural network model to obtain a data processing result of the data to be processed; the target neural network model is obtained based on the optimization method of the neural network model according to any embodiment of the disclosure.
In a fifth aspect, the present disclosure further provides an electronic device comprising a processor and a memory, the memory storing machine-readable instructions executable by the processor; the processor is configured to execute the machine-readable instructions stored in the memory, and when the machine-readable instructions are executed, the processor performs the steps described in the first aspect or the second aspect, or in any of their possible implementations.
In a sixth aspect, optional implementations of the present disclosure further provide a computer-readable storage medium having a computer program stored thereon, which, when executed, performs the steps described in the first aspect or the second aspect, or in any of their possible implementations.
In a seventh aspect, alternative implementations of the present disclosure also provide a computer program product including computer readable instructions which, when executed by a computer, perform the steps in the possible implementation manners of the first aspect or the second aspect.
Drawings
In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings required by the embodiments are briefly described below. The drawings, which are incorporated in and form part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. The drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those of ordinary skill in the art can derive further related drawings from them without inventive effort.
FIG. 1 shows a flowchart of an optimization method for a neural network model provided by an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a specific method for obtaining a reduced parameter search space for each candidate algorithm provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an algorithmic interface provided by an embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating a specific method of obtaining an optimized neural network model provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart illustrating a specific method for selecting a target algorithm corresponding to each candidate data format according to an embodiment of the present disclosure;
FIG. 6 shows examples, provided by an embodiment of the present disclosure, of the search paths of each operator with and without a target algorithm determined for each candidate data format;
FIG. 7 is a flowchart illustrating a specific method for searching based on the search path corresponding to each of a plurality of operators to obtain a neural network model, provided by an embodiment of the present disclosure;
FIG. 8 is a flow chart illustrating a particular method of partitioning a neural network model into a plurality of sub-network models provided by an embodiment of the present disclosure;
FIG. 9 is a flow chart illustrating another specific method for optimizing the neural network model provided by the embodiment of the present disclosure;
FIG. 10 is a flowchart illustrating a specific method for selecting a target transformed data format corresponding to each candidate data format according to an embodiment of the present disclosure;
FIG. 11 shows examples, provided by an embodiment of the present disclosure, of the search paths of each operator with and without a target algorithm and target conversion data format determined for each candidate data format;
FIG. 12 is a flow chart illustrating a data processing method provided by an embodiment of the present disclosure;
FIG. 13 is a schematic diagram illustrating an apparatus for optimizing a neural network model provided by an embodiment of the present disclosure;
FIG. 14 is a schematic diagram of a data processing apparatus provided by an embodiment of the present disclosure;
FIG. 15 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
A neural network can be abstracted as a computational graph composed of multiple operators, and inference optimization of a neural network model can be decomposed into two directions: optimizing operator speed and optimizing the computational graph. Computational-graph optimization covers operator fusion, algorithm selection for operators, and the like, and current neural network inference frameworks support graph optimization to different degrees. In one class of automatic graph-optimization methods, algorithms are tuned automatically and in real time during graph optimization, so a single optimization pass takes at least hours, and selection among multiple data formats is not supported; the time cost of graph optimization is therefore excessive. Another class of methods reduces the graph-optimization search space as much as possible but does not support real-time automatic algorithm tuning, instead accelerating inference mainly through manually optimized operators, which incurs high development cost.
Based on this research, the present disclosure provides an optimization method for a neural network model, suitable for optimizing a neural network model to be deployed; the method improves optimization efficiency by reducing the search space used during optimization.
The above drawbacks were identified by the inventor through practice and careful study; the discovery of these problems, and the solutions proposed below, should therefore be regarded as the inventor's contribution to the present disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
To facilitate understanding of the present embodiment, the optimization method for a neural network model disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the method is generally an electronic device with a certain computing capability, for example a terminal device, a server, or another processing device; the terminal device may be User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or a wearable device. In some possible implementations, the optimization method may be implemented by a processor calling computer-readable instructions stored in a memory.
The following describes an optimization method of a neural network model provided by an embodiment of the present disclosure.
Referring to FIG. 1, a flowchart of a method for optimizing a neural network model provided by an embodiment of the present disclosure is shown; the method includes steps S101 to S103:
s101: preprocessing a parameter search space corresponding to each candidate algorithm in a plurality of candidate algorithms to obtain a shrinkage parameter search space of each candidate algorithm;
s102: searching based on the shrinkage parameter searching space of each candidate algorithm to obtain a parameter value corresponding to each candidate algorithm;
s103: and searching based on the candidate algorithm supported by each operator in the operators and the parameter values of the supported candidate algorithm to obtain the optimized neural network model.
In this method, the parameter search space corresponding to each of the plurality of candidate algorithms is preprocessed to obtain a reduced parameter search space for each candidate algorithm; the reduced space of each candidate algorithm is searched to obtain that algorithm's parameter values; and the search over the candidate algorithms supported by each operator, together with those parameter values, yields the neural network model. Determining a reduced parameter search space for each candidate algorithm in advance shrinks the search space, so searching the reduced space for each algorithm's parameter values improves the optimization efficiency of the neural network model.
The following describes details of S101 to S103.
I: in S101, the plurality of candidate algorithms are a plurality of algorithms supported by a plurality of operators of the neural network model; the candidate algorithms supported by different operators may be the same, may be different from each other, or may be partially the same. For different algorithms, the corresponding parameters may also be the same, may be different from each other, or may be partially the same. The plurality of candidate algorithms includes, for example, at least one of convolution, pooling, activation, and the like; the convolution includes, for example: one or more of a block convolution, a point-by-point convolution, a separable convolution, a deep convolution, a hole convolution, a direct convolution, a matrix multiplication-based convolution, a fast convolution, a fourier transform-based convolution, and the like.
The implementation of an operator's algorithm generally has some parameters, which include common parameters and private parameters. The private parameters affect the performance the algorithm achieves on specific hardware and at specific computation-task scales. For a given algorithm, the number of private parameters may be N, and the private parameters form an N-tuple (p0, p1, …, pN).
For the same algorithm, different private-parameter values may lead to different required durations and/or different consumed computing resources when computation tasks are executed. To determine parameter values for each candidate algorithm quickly, in the embodiment of the present disclosure the parameter search space corresponding to each candidate algorithm is first preprocessed to obtain a reduced parameter search space. For any candidate algorithm, the types and number of parameters in the reduced parameter search space are the same as in the corresponding original parameter search space; only the value ranges differ. Generally, the value range of at least some parameters in the original parameter search space is larger than their value range in the reduced parameter search space.
Referring to FIG. 2, an embodiment of the present disclosure provides a specific method for preprocessing the parameter search space corresponding to each of the plurality of candidate algorithms to obtain the reduced parameter search space of each candidate algorithm, including:
s201: determining a plurality of parameter sets corresponding to each of the plurality of candidate algorithms; and the values of the same parameter in the plurality of parameter groups are different.
In one possible embodiment, the parameter values in a candidate algorithm's parameter sets may, for example, be determined randomly.
In another possible implementation, the parameter values that the candidate algorithm takes in other trained neural network models may be used as reference values, and the parameter sets corresponding to the candidate algorithm are then determined based on those reference values. Here, the "other trained neural network models" may be, for example, neural network models with functions similar to the model to be generated, models with similar structures, or models with similar deployment environments.
Illustratively, suppose the parameters of a certain candidate algorithm are A, B, C, and D; their values in a first trained neural network model are a1, b1, c1, and d1, and in a second trained neural network model a2, b2, c2, and d2. Taking (a1, b1, c1, d1) and (a2, b2, c2, d2) as reference values, several values near a1 and/or a2 are determined as candidate values of A; several values near b1 and/or b2 as candidate values of B; several values near c1 and/or c2 as candidate values of C; and several values near d1 and/or d2 as candidate values of D. The values determined for A, B, C, and D are then crossed and/or combined to obtain the parameter sets. Here, "near" means, for example, that the difference from the reference value is smaller than a preset difference threshold. Furthermore, (a1, b1, c1, d1) and/or (a2, b2, c2, d2) may themselves be used directly as parameter sets.
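A minimal Python sketch of this neighborhood construction, assuming numeric parameters and using an assumed radius and step in place of the "preset difference threshold" (all names are illustrative):

```python
import itertools

def values_near(refs, radius=2, step=1):
    """All grid values within `radius` of any reference value."""
    vals = set()
    for r in refs:
        v = r - radius
        while v <= r + radius:
            vals.add(v)
            v += step
    return sorted(vals)

def parameter_sets_from_references(reference_sets):
    """reference_sets, e.g. [(a1, b1, c1, d1), (a2, b2, c2, d2)]: one tuple
    per trained model. Cross the per-parameter neighborhoods to obtain the
    candidate parameter sets (the references themselves are included)."""
    per_param = [values_near(refs) for refs in zip(*reference_sets)]
    return list(itertools.product(*per_param))
```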
In another possible implementation, the value range of each parameter may be determined from developer experience, and parameter values chosen within those ranges to obtain the parameter sets; alternatively, the parameter sets may be generated directly from developer experience.
S202: and performing performance simulation on the plurality of parameter groups based on a plurality of task scenes to obtain a performance simulation result of each parameter group in the plurality of parameter groups.
Here, when performance simulation is performed on the parameter sets over multiple task scenarios, a simulation system may, for example, determine the time the candidate algorithm consumes when executing computation tasks with each parameter set and take that elapsed time as the parameter set's performance simulation result; the shorter the elapsed time, the better the performance corresponding to the parameter set. Alternatively, the simulation system may determine the computing resources the candidate algorithm consumes with each parameter set and take the consumed resources as the performance simulation result; the fewer the resources consumed, the better the performance. In addition, a simulation system may be established based on preset computing resources, the elapsed time of the candidate algorithm with each parameter set determined on that system, and the elapsed time under the preset resources taken as each parameter set's performance simulation result.
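For the elapsed-time variant, a minimal sketch follows; the callable `candidate_algorithm` and the scenario objects are assumed stand-ins for the simulation system, not part of the disclosure.

```python
import time

def simulate_performance(candidate_algorithm, parameter_sets, task_scenarios):
    """Run the candidate algorithm on every task scenario with each parameter
    set; the total elapsed time is that parameter set's performance
    simulation result (shorter is better)."""
    results = {}
    for params in parameter_sets:
        start = time.perf_counter()
        for task in task_scenarios:
            candidate_algorithm(task, params)
        results[params] = time.perf_counter() - start
    return results
```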
As an alternative to performance simulation, the performance information corresponding to each parameter set may also be obtained by theoretical calculation.
In addition, in another embodiment of the present disclosure, since N differs between candidate algorithms, the private parameters can no longer be expressed explicitly in a single unified operator interface. The embodiment of the present disclosure therefore provides a unified interface for the search algorithm of the neural network model: the common parameters of the different candidate algorithms supported by an operator are expressed explicitly in the algorithm interface, while for the private parameters of the different algorithms, parameter pointers are set; the storage locations pointed to by these pointers are arranged in a specific order, and the private-parameter values stored there are parsed and used by the corresponding candidate algorithms.
FIG. 3 provides an example of such an algorithm interface. In this example, operator A has several common parameters, and each candidate algorithm supported by operator A has its own private parameters. When performance simulation is performed on the parameter sets over multiple task scenarios, for any parameter set the common parameters of the candidate algorithms supported by operator A and the parameter pointers of their private parameters can be passed to an algorithm through operator A's algorithm interface; the algorithm obtains the values of the private-parameter set through the parameter pointers and then performs simulation computation based on the passed-in common parameters and the obtained private-parameter values to produce the parameter set's performance simulation result.
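The pointer-based layout is most naturally expressed in C; the Python sketch below mimics it with an ordered, opaque sequence of private-parameter payloads. All type and method names here are assumptions for illustration, not the disclosure's interface.

```python
from dataclasses import dataclass
from typing import Any, Sequence

@dataclass
class AlgorithmCall:
    common_params: dict           # explicit in the unified interface
    private_blobs: Sequence[Any]  # ordered, opaque per-algorithm payloads

def run_candidate(algorithm, call: AlgorithmCall):
    """The operator hands over common parameters plus the private-parameter
    payloads through one interface; each algorithm parses only the payload
    laid out in the order it expects."""
    private_values = algorithm.parse_private(call.private_blobs)
    return algorithm.compute(call.common_params, private_values)
```

The design point is that the interface stays fixed as algorithms with different N come and go: only the opaque payload, parsed by the algorithm itself, varies.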
In addition, it should be noted that the algorithm interface structure can also be used in the application process of the neural network model.
S203: and determining a shrinkage parameter search space of each candidate algorithm based on performance simulation results respectively corresponding to the plurality of parameter sets of each candidate algorithm.
In specific implementation, at least one target parameter set may, for example, be determined from the parameter sets based on the performance simulation results of each candidate algorithm's parameter sets; the reduced parameter search space of each candidate algorithm is then determined based on the values of the parameters contained in the at least one target parameter set.
Here, when the simulation system characterizes each parameter set's performance simulation result by the time the candidate algorithm consumes when executing computation tasks with that parameter set, the target parameter sets may be determined from the parameter sets by, for example, comparing each parameter set's elapsed time with a preset duration threshold and taking the parameter sets whose elapsed time is below the threshold as target parameter sets; alternatively, h parameter sets may be selected in ascending order of elapsed time as target parameter sets, where h is a positive integer greater than 0.
Similarly, when the simulation system characterizes each parameter set's performance simulation result by the computing resources consumed when executing computation tasks, the consumed resources of each parameter set may, for example, be compared with a preset computing-resource threshold, and the parameter sets whose consumption is below the threshold taken as target parameter sets; alternatively, h parameter sets may be selected in ascending order of resource consumption as target parameter sets, where h is a positive integer greater than 0.
After the target parameter sets are determined, the reduced parameter search space may be determined for each candidate algorithm, for example, as follows: for each candidate algorithm, determine the set of values each parameter takes across the algorithm's target parameter sets, and take those per-parameter value sets as the reduced parameter search space; in this case the reduced space consists of multiple discrete parameter values.
In another possible implementation, a value range may be determined for each parameter from its values across the target parameter sets, such that the range contains all those values. For example, the maximum of the parameter's values across the target parameter sets may serve as the range's maximum and the minimum as the range's minimum; or an offset may be set, with the range's maximum being that maximum value plus the offset and the range's minimum being that minimum value minus the offset.
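A minimal sketch of the interval variant (the offset and names are assumed; the discrete variant would simply return the per-parameter value sets instead):

```python
def reduced_ranges(target_sets, offset=0):
    """target_sets: the N-tuples that passed the performance screen.
    For the i-th parameter, the range is [min - offset, max + offset]
    over its values across all target sets."""
    return [(min(vals) - offset, max(vals) + offset)
            for vals in zip(*target_sets)]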
In addition, other determination methods can be adopted, and the determination method can be specifically set according to actual needs.
II: in step S102, when the parameter values corresponding to each candidate algorithm are obtained by searching the shrinkage-limited parameter search space based on each candidate algorithm, for example, a plurality of candidate parameter sets may be determined in the shrinkage-limited parameter search space, and the parameter values of at least some of the parameters in different candidate parameter sets are different. And then determining performance information of the candidate algorithm when the candidate parameter set executes the calculation task, and selecting the candidate parameter set with the best performance information from the candidate parameter sets as the parameter value corresponding to the candidate algorithm based on the performance information.
Alternatively, a preset search algorithm may be used to determine the parameter values of each candidate algorithm from its reduced parameter search space.
Specifically, the embodiment of the present disclosure provides a concrete example of determining the reduced parameter search space of a candidate algorithm, including:
a: according to the characteristics of the candidate algorithm, M reasonable N tuples (namely parameter sets) for testing the private parameters of the candidate algorithm are prepared.
B: prepare the computation problem common in K actual scenarios. The K actual scenarios are, for example, common parameters used when the trained model performs a calculation task by using a candidate algorithm.
C: in K calculation problems, performing simulation test on M N-tuples to obtain a performance record MAX _ FLOPS of each N-tuple which best expresses in the K calculation problems, and dividing the performance record by a theoretical performance record PEAK _ FLOPS of current hardware to obtain a performance simulation result MAX _ FLOPS/PEAK _ FLOPS respectively corresponding to each N-tuple.
D: setting a performance threshold T; the N-tuple with MAX _ FLOPS/PEAK _ FLOPS > T is selected.
F: merging the parameter values of each parameter in the selected N-tuple to obtain a set of N parameter values; the N parameter value sets are in one-to-one correspondence with the N parameters in the N-tuple.
G: and determining the value range of each parameter value set aiming at the parameter values of each parameter in each parameter value set so as to obtain the value range of the parameter in the candidate algorithm determined shrinkage parameter search space.
III: After the parameter values of each candidate algorithm are determined, the search over the candidate algorithms supported by each operator and the parameter values of those candidate algorithms can be performed to obtain the neural network model.
Referring to FIG. 4, an embodiment of the present disclosure provides a specific method for searching based on the candidate algorithms supported by each of the plurality of operators and the parameter values of the supported candidate algorithms to obtain the optimized neural network model, including:
s401: selecting a target algorithm corresponding to each candidate data format from at least one candidate algorithm corresponding to each candidate data format in a plurality of candidate data formats.
Here, in S401, a data format is the format in which data is stored in a file or record, such as numeric, character, or binary formats. Each algorithm may support one or more data formats; the formats supported by different algorithms may be identical, entirely different, or partially the same, and the time an algorithm needs to execute a computation task may differ across data formats. The target algorithm selected for a candidate data format is, for example, an algorithm that satisfies certain requirements when executing computation tasks on that format, such as an inference-speed requirement and/or a computing-resource requirement; the computing resources are, for example, one or more of processor, memory, hard disk, and network resources.
Illustratively, taking the case where an algorithm must reach a certain inference speed when executing computation tasks, FIG. 5 shows a specific way, provided by an embodiment of the present disclosure, of selecting the target algorithm corresponding to each candidate data format from its at least one candidate algorithm, including:
s501: determining a first consumed time of at least one candidate algorithm corresponding to each candidate data format when a calculation task is executed based on the data of each candidate data format.
Here, the candidate algorithms corresponding to a candidate data format are the algorithms capable of executing computation tasks based on that format; each candidate data format has at least one such candidate algorithm. To determine a target algorithm for a candidate data format from among its candidate algorithms, each candidate algorithm may execute an actual computation task based on the format, and its first elapsed time on that task is recorded; alternatively, the first elapsed time each candidate algorithm would require may be derived by inference calculation, giving a theoretical first elapsed time.
S502: and selecting a target algorithm corresponding to each candidate data format from the at least one candidate algorithm corresponding to each candidate data format according to the first consumed time of each candidate algorithm in the at least one candidate algorithm corresponding to each candidate data format.
Here, when a candidate data format has only one corresponding candidate algorithm: if the format is one the neural network must use, the candidate algorithm may be directly taken as the format's target algorithm; if the format is not required and the first elapsed time of its candidate algorithm is too long, for example greater than a first duration threshold, the format may be removed from the candidate data formats adopted by the neural network model. Alternatively, to preserve a possible choice for each operator, the candidate algorithm may still be taken as the format's target algorithm even if its first elapsed time is long, kept in the search space, and carried through the subsequent search.
When a candidate data format has at least two corresponding candidate algorithms, the first elapsed time of each may, for example, be compared with a preset first duration threshold, and the candidate algorithms whose first elapsed time is below the threshold taken as the format's target algorithms; alternatively, s candidate algorithms may be selected in ascending order of first elapsed time as the format's target algorithms, where s is a positive integer greater than 0.
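Both selection rules (duration threshold, or the s fastest) in one hedged sketch; the timing structure and the fallback that always keeps one choice per format (mirroring the single-algorithm case above) are illustrative assumptions:

```python
def select_target_algorithms(first_elapsed, threshold=None, top_s=1):
    """first_elapsed[fmt] is a dict {algorithm: first elapsed time}.
    Keep algorithms under the duration threshold if one is given,
    otherwise the top_s fastest; always keep at least one per format."""
    targets = {}
    for fmt, timings in first_elapsed.items():
        ranked = sorted(timings, key=timings.get)
        if threshold is not None:
            picked = [a for a in ranked if timings[a] < threshold]
            targets[fmt] = picked or ranked[:1]
        else:
            targets[fmt] = ranked[:top_s]
    return targets
```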
In another embodiment, taking the case where an algorithm must meet certain computing-resource requirements when executing computation tasks, the computing resources that each candidate data format's candidate algorithms consume when executing computation tasks on that format may be determined, and the target algorithm for each format then selected from its candidate algorithms according to the resources each consumes.
The specific implementation is similar to the embodiment corresponding to FIG. 5 and is not repeated here.
In another embodiment, taking the case where an algorithm must satisfy both an inference-speed condition and a computing-resource condition, the first elapsed time of each candidate data format's candidate algorithms when executing computation tasks on that format may be determined under a preset computing-resource condition, and the target algorithm for each format then selected from its candidate algorithms according to those first elapsed times.
Here, the first elapsed time of each candidate algorithm may be determined, for example, on real hardware that satisfies the above computing-resource condition, or under simulated computing resources.
Continuing from S401 above, the specific method for obtaining the neural network model further includes:
s402: determining a search path corresponding to each operator according to a target algorithm corresponding to at least one data format supported by each operator in a plurality of operators; wherein the plurality of candidate data formats includes the at least one data format.
In specific implementation, after the target algorithm corresponding to each candidate data format is determined, the search path of each operator in the neural network model is determined according to the target algorithms of the data formats that operator supports.
The search paths of an operator form that operator's search space, and the search spaces of all operators in the neural network model together form the model's total search space.
For example, suppose any operator of the neural network model supports m candidate data formats and each candidate data format is supported by n candidate algorithms (n ≠ 1). Without the optimization provided by the embodiments of the present disclosure, the original search space of each operator would contain m × n search paths; part a of FIG. 6 shows an example of an operator's search paths when the search space is not optimized.
If there are d operators in the neural network model, the total search space corresponding to the neural network model includes (m × n)^d search paths. It can be seen that, as the number of operators in the neural network model increases, the number of total search paths in the total search space grows exponentially.
The optimization method provided by the embodiment of the present disclosure optimizes the search space of each operator: for any operator in the neural network model, if the operator supports m candidate data formats and each candidate data format has s corresponding target algorithms, then, as shown in b in fig. 6, m × s search paths are determined for that operator, where s is smaller than n.
The total number of search paths in the total search space corresponding to the neural network model is then (m × s)^d. It can be seen that the total search space is reduced, which in turn makes searching over the reduced search space more efficient.
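As a worked example of this reduction, the following snippet computes the two search-space sizes; the values of m, n, s and d are assumptions chosen only to show the scale:

```python
# Illustrative values: m=3 formats, n=5 candidate algorithms,
# s=2 target algorithms kept per format, d=10 operators.
m, n, s, d = 3, 5, 2, 10

original = (m * n) ** d   # (3*5)**10 = 576_650_390_625 paths
reduced  = (m * s) ** d   # (3*2)**10 = 60_466_176 paths

print(original, reduced, original / reduced)  # reduction factor ~ 9537
```

Because the per-operator factor is raised to the power d, even a modest per-operator pruning (15 paths down to 6) shrinks the total space by roughly four orders of magnitude here.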
Following the above S402, the specific method for obtaining the neural network model further includes:
S403: searching based on the search path corresponding to each operator in the plurality of operators and the parameter value corresponding to the target algorithm in the search path of each operator, to obtain the optimized neural network model.
In specific implementation, after the search path corresponding to each of the plurality of operators in the neural network is determined, the search space of the neural network model is thereby constructed; the search is then performed over this search space to determine the neural network model. Each of the search paths included in the search space specifies: a candidate algorithm, the candidate data format adopted by the candidate algorithm, and the parameter value corresponding to the candidate algorithm.
Here, the purpose of searching the search space is to determine a suitable data format and algorithm for each operator, so that the neural network model composed of all the operators performs optimally when executing the computing task.
Optimal performance means, for example, at least one of: the fastest inference speed, the least consumed computing resources, the fastest inference speed under a given computing resource condition, and the like, determined according to the actual requirements on the neural network model.
For example, taking the fastest inference speed as the indicator of the performance of the neural network model, the embodiment of the present disclosure describes a process of searching the search paths corresponding to the plurality of operators to obtain the optimized neural network model, the process including:
for the search path corresponding to each of the plurality of operators, determining the third time consumption of that search path based on its corresponding target algorithm; determining a target path based on the third time consumptions of the search paths corresponding to the plurality of operators; and obtaining the optimized neural network model based on the target path.
When searching based on the search paths of the plurality of operators, for example, a unified search may be performed over the neural network model: in each search round, a current search path is selected from the search paths corresponding to each operator, the current search paths of all operators are combined into an alternative neural network, and the performance of the alternative neural network is verified.
For example, the alternative neural network is used to process an image, and the processing speed is taken as the index characterizing the performance of the alternative neural network. Alternatively, the computing resources occupied during the processing are taken as the index characterizing its performance.
Through the above process, the alternative neural network with the best performance is determined as the neural network model obtained by the search.
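As a minimal sketch of the speed-based index, assuming `candidate_net` is any callable alternative network (a hypothetical stand-in, not an API defined by this disclosure):

```python
import time

def latency_index(candidate_net, image, runs=20):
    """Mean seconds per forward pass of one alternative network on a
    sample image; smaller is better."""
    candidate_net(image)                      # warm-up, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        candidate_net(image)
    return (time.perf_counter() - start) / runs
```

The resource-based index follows the same pattern, with memory or compute occupancy measured in place of elapsed time.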
In another embodiment, as shown in fig. 7, another specific method for obtaining a neural network model by searching based on a search path corresponding to each operator in a plurality of operators is further provided in the embodiments of the present disclosure, and includes:
S701: dividing the neural network model into a plurality of sub-network models, wherein each sub-network model includes at least one operator in the neural network model.
Here, dividing the neural network model into a plurality of sub-network models divides the search task into a plurality of subtasks; different subtasks can be executed in parallel by different computing units, so that the search speed can be further increased and the optimization efficiency improved.
Illustratively, the neural network model may be divided into a plurality of sub-network models, for example, in the following manner: the neural network model is divided into a plurality of sub-network models based on at least one of a network depth threshold condition and a data format condition.
Here, the network depth threshold condition includes, for example: the network depth of the current sub-network model reaches a preset depth threshold; the data format conditions include: the number of data formats supported by the next operator of the current operator is 1.
In another embodiment, other conditions for partitioning the sub-network models may also be used, including but not limited to any of the following: the current operator is the end point of the neural network model; there are at least two downstream operators directly connected to the current operator; or a downstream operator directly connected to the current operator is itself directly connected to at least two upstream operators.
Illustratively, referring to fig. 8, the embodiment of the present disclosure provides a specific method for dividing a neural network model into a plurality of sub-network models, including:
S801: determining a current operator for the current sub-network model from the operators not yet divided in the computation graph; an upstream operator directly connected to the current operator has already been divided into the current sub-network model or into another sub-network model.
S802: performing the following detection processes S8021 to S8025 on the current operator:
S8021: detecting whether the network depth of the current sub-network model reaches a preset depth threshold;
S8022: detecting whether the number of data formats supported by the operator next to the current operator is 1;
S8023: detecting whether the current operator is the end point of the neural network model;
S8024: detecting whether there are at least two downstream operators directly connected to the current operator;
S8025: detecting whether a downstream operator directly connected to the current operator is directly connected to at least two upstream operators.
If any one of the detection results of S8021 to S8025 is yes, jump to S803; if all of the detection results are no, jump to S804.
For example, the above S8021 to S8025 may be sequentially executed in a certain order; if the detection result of any one of the detection processes from S8021 to S8025 is yes, the process directly jumps to S803, and the subsequent detection process is not executed.
S803: and taking the current operator as an end point operator of the current sub-network model to generate the current sub-network model.
S804: dividing the current operator into a current sub-network model; and returns to the above S801.
This is repeated until all operators in the computation graph have been divided.
For example, the following division process may specifically be adopted (a simplified sketch follows the list):
A: if the current operator is the end point of the computation graph, the current operator is taken as the end point of the current sub-network model and the division stops;
B: if the depth of the current sub-network model equals the threshold, the current operator is the end point of the current sub-network model, and a new sub-network model is divided starting from the next operator;
C: if the next operator is a bifurcation operator, the current sub-network model ends at the next operator; assuming there are b branches, construction of b sub-network models starts at the first operator of each branch;
D: if the next operator is a convergence operator, the current operator is the end point of the sub-network model, and a new sub-network model is divided starting from the next operator;
E: if the next operator supports only one data format, the next operator is a bottleneck operator, the current operator is taken as the end point of the current sub-network model, and a new sub-network model is divided starting from the next operator.
Following the above S701, after the plurality of sub-network models are obtained, the method further includes:
S702: searching based on the search path corresponding to each operator included in each sub-network model, to obtain each searched sub-network model.
Here, searching the search paths corresponding to the operators in a sub-network model means that the search process is performed per sub-network model. The search processes for different sub-network models can be executed in parallel, or in a combination of parallel and serial. In a parallel-serial combination, for example, the plurality of sub-network models is divided into several groups, each containing at least two sub-network models; the groups are searched in parallel, while the sub-network models within each group are searched serially (see the sketch below). The specific search mode can be set according to actual needs.
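A minimal sketch of the parallel-serial combination, assuming `search_one` is a hypothetical function that optimizes a single sub-network model; threads are used here only to illustrate the scheduling, and a real implementation might use processes or separate computing units:

```python
from concurrent.futures import ThreadPoolExecutor

def search_subnetworks(groups, search_one):
    """Search groups of sub-network models: groups run in parallel,
    while the sub-networks inside one group run serially."""
    def search_group(group):
        return [search_one(subnet) for subnet in group]   # serial in group
    with ThreadPoolExecutor() as pool:                    # parallel groups
        results = list(pool.map(search_group, groups))
    return [subnet for group in results for subnet in group]
```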
Illustratively, after the sub-network model division is completed, each operator generates different search paths according to the data formats and algorithms it supports; the time consumption of each search path is taken as the cost of traversing that path, a weighted directed graph is constructed, and a predetermined search algorithm, such as Dijkstra's single-source shortest-path algorithm, is used to solve the shortest path from the start point to the end point of the sub-network model, completing the optimization of the sub-network model. The search process for a sub-network model is as follows (a sketch follows the list):
a: and generating candidate data formats supported by each operator in the subnetwork model and a search path of the candidate algorithm.
B: and obtaining the time consumption of the search path formed by each candidate data format and the corresponding target algorithm.
C: and connecting the search path in each operator to the search path which can be corresponding to the next operator, wherein one search path can correspond to a plurality of search paths of the next operator.
D: and searching by using a universal searching algorithm to obtain a set of searching paths with the least cost.
S703: obtaining the optimized neural network model based on each sub-network model in the plurality of sub-network models.
Here, since the operators in the plurality of sub-network models have a sequential connection order, the neural network model can be obtained, for example, by connecting the plurality of searched sub-network models in that connection order.
In the above method, a target algorithm is selected for each of the plurality of candidate data formats; a search path corresponding to each operator is determined according to the target algorithms of the at least one data format supported by each of the plurality of operators; and the search is performed based on the search path corresponding to each operator and the parameter values of the candidate algorithms supported by each operator to obtain the neural network model. Determining a target algorithm for each data format further reduces the search space of algorithm selection, and searching within this reduced space achieves the purpose of optimizing the neural network model with higher efficiency and lower cost.
Referring to fig. 9, another specific method for obtaining the optimized neural network model by searching based on the candidate algorithm supported by each operator of the multiple operators and the parameter value of the supported candidate algorithm is provided in the embodiment of the present disclosure, and includes:
S901: selecting a target algorithm corresponding to each candidate data format from at least one candidate algorithm corresponding to each candidate data format in a plurality of candidate data formats.
Here, the specific implementation manner of S901 is similar to that of S401 described above, and is not described herein again.
S902: for each candidate data format in the plurality of candidate data formats, selecting a target conversion data format corresponding to each candidate data format from the plurality of candidate data formats.
Here, regarding S902: in some cases, the operators in the neural network model support inter-conversion between data formats. For example, for two operators having a connection relationship, the previous operator outputs data in data format a, while the subsequent operator adopts data format b when executing its computing task; when the subsequent operator receives the relevant data from the previous operator, the data is first converted from data format a to data format b, and the computing task is then executed on the converted data. Such conversion between data formats also takes a certain time and consumes a certain amount of computing resources.
For each candidate data format, a corresponding target conversion data format is determined so that, for example, the process of converting the target conversion data format into the candidate data format meets a certain requirement, such as a conversion speed requirement and/or a computing resource consumption requirement.
Illustratively, taking the requirement of meeting the conversion speed as an example, referring to fig. 10, an embodiment of the present disclosure provides a specific method for selecting a target conversion data format corresponding to each candidate data format from a plurality of candidate data formats, including:
S1001: determining, for a first candidate data format of the plurality of candidate data formats, a second elapsed time for converting at least one second candidate data format other than the first candidate data format into the first candidate data format;
S1002: selecting a target conversion data format corresponding to the first candidate data format from the at least one second candidate data format based on the second elapsed time.
Here, the first candidate data format includes at least part of the plurality of candidate data formats. For each first candidate data format, at least part of the remaining candidate data formats serve as its second candidate data formats. When determining the target conversion data format for each first candidate data format, for example, the same data in each different second candidate data format may be converted into data in the first candidate data format, and the conversion time is taken as the second elapsed time corresponding to that second candidate data format. Then, for example, a second candidate data format whose second elapsed time is smaller than a preset second duration threshold may be determined as a target conversion data format of the first candidate data format; or, w second candidate data formats may be selected, in order of second elapsed time from short to long, as target conversion data formats of the first candidate data format, where w is a positive integer.
In another embodiment, the target conversion data format may also be determined for each first candidate data format as follows: selecting the target conversion data format from the at least one second candidate data format based on the first consumed time of the target algorithm corresponding to the first candidate data format and the second consumed time of each second candidate data format. Here, for example, the first consumed time and each second consumed time may be added to obtain an elapsed-time sum, which is compared with a preset third duration threshold; if the sum is smaller than the third duration threshold, the corresponding second candidate data format is determined as a target conversion data format of the first candidate data format.
In another embodiment of the present disclosure, the target conversion data format may also be determined for a first candidate data format from the second candidate data formats based on the computing resources consumed when converting each second candidate data format into the first candidate data format.
In addition, the computing resources consumed by the conversion and the second consumed time may be combined: for example, the second consumed time of converting each second candidate data format into the first candidate data format is determined under a preset computing resource condition, and the target conversion data format is then determined for the first candidate data format from the respective second candidate data formats according to the second consumed time (a sketch of the elapsed-time-based selection follows).
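A minimal sketch of selecting the w fastest target conversion data formats per first candidate data format; `convert` and `sample` are hypothetical helpers assumed only for illustration:

```python
import time

def select_target_conversion_formats(formats, convert, sample, w=1, runs=10):
    """Keep, per first candidate data format, the w second candidate data
    formats whose conversion into it has the shortest second elapsed time.
    convert(data, src, dst) reformats data; sample(fmt) yields test data."""
    targets = {}
    for first in formats:
        timed = []
        for second in formats:
            if second == first:
                continue
            data = sample(second)
            start = time.perf_counter()
            for _ in range(runs):                  # average over runs
                convert(data, second, first)
            elapsed = (time.perf_counter() - start) / runs
            timed.append((elapsed, second))
        timed.sort(key=lambda pair: pair[0])       # fastest conversion first
        targets[first] = [fmt for _, fmt in timed[:w]]
    return targets
```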
Following the above S902, the specific method for obtaining the neural network model further includes:
S903: determining a search path corresponding to each operator according to a target algorithm corresponding to each data format in at least one data format supported by each operator and a target conversion data format corresponding to each data format.
Here, in a case where the neural network model supports inter-conversion between data formats, if any operator of the neural network model supports m candidate data formats and each candidate data format has n algorithms, then, when the search space is not optimized based on the optimization method provided by the embodiment of the present disclosure, the original search space formed for each operator is as shown in a in fig. 11 and contains m × n × (m-1) search paths; if there are d operators in the neural network model, the total search space corresponding to the neural network model contains (m × n × (m-1))^d search paths.
When the search space is optimized based on the optimization method provided by the embodiment of the present disclosure, if w target conversion data formats and s target algorithms are determined for each candidate data format, with w smaller than m and s smaller than n, then, as shown in b in fig. 11, the search space formed for each operator contains m × s × w search paths; if there are d operators in the neural network model, the total search space corresponding to the neural network model contains (m × s × w)^d search paths.
S904: searching based on the search path corresponding to each operator in the plurality of operators and the parameter value corresponding to the target algorithm in the search path of each operator, to obtain the optimized neural network model.
Here, the specific implementation of S904 is similar to S403 described above, and is not described here again.
In the above method, a target algorithm and a target conversion data format are selected for each of the plurality of candidate data formats; a search path corresponding to each operator is determined according to the target algorithm and the target conversion data format of the at least one data format supported by each of the plurality of operators; and the search is performed based on the search path corresponding to each operator and the parameter values of the candidate algorithms supported by each operator to obtain the neural network model. Determining a target algorithm for each data format reduces the search space, and searching within this reduced space further achieves the purpose of optimizing the neural network model more efficiently and at lower cost.
It will be understood by those skilled in the art that, in the above method of the specific embodiments, the order in which the steps are written does not imply a strict execution order or any limitation on implementation; the specific execution order of the steps should be determined by their functions and possible inherent logic.
Referring to fig. 12, an embodiment of the present disclosure further provides a data processing method, including:
S1201: acquiring data to be processed;
S1202: executing a data processing task on the data to be processed by using a target neural network model to obtain a data processing result of the data to be processed;
the target neural network model is obtained based on the optimization method of the neural network model according to any embodiment of the disclosure.
The following are exemplary:
(1) For the case that the data to be processed includes image data, the processing of the data to be processed includes: at least one of face recognition, object detection, and semantic segmentation. Here, face recognition includes, for example: at least one of face key point recognition, face emotion recognition, face attribute (such as age and gender) recognition, and living body detection. Object detection includes, for example: detection of at least one of object position and object type.
(2) For the case that the data to be processed includes text data, the processing of the data to be processed includes: at least one of dialog generation and text prediction. Dialog generation includes, for example: at least one of intelligent question answering, voice self-service, and the like. Text prediction includes, for example: search keyword prediction, text completion prediction, and the like.
(3) For the case that the data to be processed includes three-dimensional point cloud data, the processing of the data to be processed includes: at least one of obstacle detection and target detection.
In the data processing method provided by the embodiment of the present disclosure, the data to be processed is processed using a neural network model generated by the optimization method provided by any embodiment of the present disclosure; since the generated neural network model has better performance, the obtained data processing result has higher accuracy.
Based on the same inventive concept, the embodiment of the present disclosure further provides an optimization apparatus of a neural network model corresponding to the above optimization method of the neural network model. Since the principle by which the apparatus solves the problem is similar to that of the optimization method of the neural network model in the embodiment of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Referring to fig. 13, there is a schematic diagram of an apparatus for optimizing a neural network model according to an embodiment of the present disclosure, the apparatus includes: a preprocessing module 131, a first search module 132, and a second search module 133; wherein,
the preprocessing module 131 is configured to preprocess a parameter search space corresponding to each candidate algorithm in a plurality of candidate algorithms to obtain a shrinkage parameter search space of each candidate algorithm;
a first searching module 132, configured to search based on the shrinkage parameter search space of each candidate algorithm, to obtain a parameter value corresponding to each candidate algorithm;
the second searching module 133 is configured to search based on the candidate algorithm supported by each operator of the multiple operators and the parameter value of the supported candidate algorithm, so as to obtain the optimized neural network model.
In a possible implementation manner, when preprocessing the parameter search space corresponding to each candidate algorithm in the plurality of candidate algorithms to obtain the shrinkage parameter search space of each candidate algorithm, the preprocessing module 131 is configured to: determine a plurality of parameter sets corresponding to each of the plurality of candidate algorithms, wherein the same parameter takes different values across the plurality of parameter sets; perform performance simulation on the plurality of parameter sets based on a plurality of task scenes to obtain a performance simulation result of each parameter set in the plurality of parameter sets; and determine the shrinkage parameter search space of each candidate algorithm based on the performance simulation results respectively corresponding to the plurality of parameter sets of each candidate algorithm.
In a possible embodiment, when determining the shrinkage parameter search space of each candidate algorithm based on the performance simulation results corresponding to the plurality of parameter sets of each candidate algorithm, the preprocessing module 131 is configured to: determine at least one target parameter set from the plurality of parameter sets based on the performance simulation results respectively corresponding to the plurality of parameter sets of each candidate algorithm; and determine the shrinkage parameter search space of each candidate algorithm based on the values of the parameters contained in the at least one target parameter set (a sketch of this shrinking step follows).
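A minimal sketch of this shrinking step, assuming `simulate` is a hypothetical performance-simulation function returning a score per parameter group and task scene, and keeping the k best-scoring groups as target parameter sets:

```python
def shrink_parameter_search_space(param_groups, simulate, scenes, k=3):
    """Shrink one candidate algorithm's parameter search space: score each
    parameter group over all task scenes, keep the k best groups, and
    retain per parameter only the values those groups contain."""
    scored = [(sum(simulate(group, scene) for scene in scenes), i)
              for i, group in enumerate(param_groups)]
    scored.sort(reverse=True)                     # higher score is better
    best = [param_groups[i] for _, i in scored[:k]]
    space = {}
    for group in best:                            # per-parameter value sets
        for name, value in group.items():
            space.setdefault(name, set()).add(value)
    return space
```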
In a possible embodiment, the candidate algorithm includes a public parameter and at least one private parameter, and the parameter search space is a search space of the at least one private parameter.
In a possible implementation manner, the second searching module 133, when performing a search based on the candidate algorithm supported by each operator of the plurality of operators and the parameter value of the supported candidate algorithm to obtain the optimized neural network model, is configured to: selecting a target algorithm corresponding to each candidate data format from at least one candidate algorithm corresponding to each candidate data format in a plurality of candidate data formats; determining a search path corresponding to each operator according to a target algorithm corresponding to at least one data format supported by each operator in a plurality of operators; wherein the plurality of candidate data formats includes the at least one data format; and searching based on the search path corresponding to each operator in the operators and the parameter value corresponding to the target algorithm in the search path of each operator to obtain the optimized neural network model.
In a possible implementation manner, the second searching module 133, when selecting a target algorithm corresponding to each candidate data format from at least one candidate algorithm corresponding to each candidate data format in a plurality of candidate data formats, is configured to: determining first consumed time of at least one candidate algorithm corresponding to each candidate data format when a calculation task is executed based on the data of each candidate data format; and selecting a target algorithm corresponding to each candidate data format from the at least one candidate algorithm corresponding to each candidate data format according to the first consumed time of each candidate algorithm in the at least one candidate algorithm corresponding to each candidate data format.
In a possible implementation, the second searching module 133 is further configured to: for each candidate data format in the plurality of candidate data formats, select a target conversion data format corresponding to each candidate data format from the plurality of candidate data formats; the second searching module 133, when determining the search path corresponding to each operator according to a target algorithm corresponding to at least one data format supported by each operator in the plurality of operators, is configured to: determine a search path corresponding to each operator according to a target algorithm corresponding to each data format in at least one data format supported by each operator and a target conversion data format corresponding to each data format.
In a possible implementation manner, the second searching module 133, when selecting the target conversion data format corresponding to each candidate data format from the plurality of candidate data formats, is configured to: determine, for a first candidate data format of the plurality of candidate data formats, a second elapsed time for converting at least one second candidate data format other than the first candidate data format into the first candidate data format; and select a target conversion data format corresponding to the first candidate data format from the at least one second candidate data format based on the second elapsed time.
In a possible implementation manner, when selecting, based on the second elapsed time, the target conversion data format corresponding to the first candidate data format from the at least one second candidate data format, the second searching module 133 is configured to: select the target conversion data format corresponding to the first candidate data format from the at least one second candidate data format based on the first consumed time of the target algorithm corresponding to the first candidate data format and the second consumed time of each second candidate data format in the at least one second candidate data format.
In a possible implementation manner, when the optimized neural network model is obtained by searching based on the search path corresponding to each operator in the multiple operators and the parameter value corresponding to the target algorithm in the search path of each operator, the second searching module 133 is configured to divide the neural network model into multiple sub-network models, where each sub-network model includes at least one operator in the neural network model; searching based on a search path corresponding to each operator in at least one operator included in each sub-network model and a parameter value corresponding to a target algorithm in the search path corresponding to each operator to obtain each sub-network model; and obtaining the optimized neural network model based on each sub-network model in the plurality of sub-network models.
In a possible implementation, the second searching module 133, when dividing the neural network model into a plurality of sub-network models, is configured to divide the neural network model into the plurality of sub-network models based on at least one of a network depth threshold condition and a data format condition.
In one possible embodiment, the network depth threshold condition includes: the network depth of the current sub-network model reaches a preset depth threshold; and/or the data format conditions include: the number of data formats supported by the next operator of the current operator is 1.
In a possible implementation manner, when the second searching module 133 performs a search based on a search path corresponding to each operator in the multiple operators and a parameter value corresponding to a target algorithm in the search path of each operator to obtain the optimized neural network model, it is configured to: determining a third time consumption corresponding to a search path based on a target algorithm corresponding to the search path and a parameter value of the target algorithm corresponding to the search path for the search path corresponding to each operator in a plurality of operators; determining a target path based on the third time consumption of the search path corresponding to each operator in the plurality of operators; and obtaining the optimized neural network model based on the target path.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Referring to fig. 14, an embodiment of the present disclosure further provides a data processing apparatus, including:
an obtaining module 141, configured to obtain data to be processed; the processing module 142 is configured to perform a data processing task on the data to be processed by using a target neural network model, so as to obtain a data processing result of the data to be processed; the target neural network model is obtained based on the optimization method of the neural network model according to any embodiment of the disclosure.
An embodiment of the present disclosure further provides an electronic device 150, as shown in fig. 15, which is a schematic structural diagram of the electronic device 150 provided in the embodiment of the present disclosure, and includes:
a processor 151 and a memory 152, the memory 152 storing machine-readable instructions executable by the processor 151; when the electronic device runs, the machine-readable instructions are executed by the processor 151 to perform the following steps: preprocessing a parameter search space corresponding to each candidate algorithm in a plurality of candidate algorithms to obtain a shrinkage parameter search space of each candidate algorithm; searching based on the shrinkage parameter search space of each candidate algorithm to obtain a parameter value corresponding to each candidate algorithm; and searching based on the candidate algorithm supported by each operator in the plurality of operators and the parameter values of the supported candidate algorithms to obtain the optimized neural network model. Alternatively, the machine-readable instructions are executed by the processor to implement the following steps: acquiring data to be processed; and executing a data processing task on the data to be processed by using a target neural network model to obtain a data processing result of the data to be processed; wherein the target neural network model is obtained based on the optimization method of the neural network model provided by any embodiment of the present disclosure.
For the specific execution process of the instruction, reference may be made to the optimization method of the neural network model or the steps of the data processing method described in the embodiments of the present disclosure, which are not described herein again.
The embodiments of the present disclosure also provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program performs the optimization method of the neural network model or the steps of the data processing method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The optimization method for a neural network model or the computer program product of the data processing method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute the steps of the optimization method for a neural network model or the data processing method described in the above method embodiments, which may be specifically referred to in the above method embodiments and are not described herein again.
Embodiments of the present disclosure also provide a computer program product comprising computer-readable instructions; when executed by a computer, the computer-readable instructions implement any of the methods of the preceding embodiments. The computer program product may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, it is embodied in a software product, such as a Software Development Kit (SDK).
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Claims (18)
1. A method of optimizing a neural network model, the neural network model comprising a plurality of operators, the method comprising:
preprocessing a parameter search space corresponding to each candidate algorithm in a plurality of candidate algorithms to obtain a shrinkage parameter search space of each candidate algorithm;
searching based on the shrinkage parameter searching space of each candidate algorithm to obtain a parameter value corresponding to each candidate algorithm;
and searching based on the candidate algorithm supported by each operator in the operators and the parameter values of the supported candidate algorithm to obtain the optimized neural network model.
2. The optimization method according to claim 1, wherein the preprocessing the parameter search space corresponding to each candidate algorithm in the plurality of candidate algorithms to obtain the reduced parameter search space of each candidate algorithm comprises:
determining a plurality of parameter sets corresponding to each of the plurality of candidate algorithms; wherein, in the plurality of parameter groups, the same parameter has different values;
performing performance simulation on the plurality of parameter sets based on a plurality of task scenes to obtain a performance simulation result of each parameter set in the plurality of parameter sets;
and determining a shrinkage parameter search space of each candidate algorithm based on performance simulation results respectively corresponding to the plurality of parameter sets of each candidate algorithm.
3. The optimization method according to claim 2, wherein the determining the constrained parameter search space of each candidate algorithm based on the performance simulation results corresponding to the plurality of parameter sets of each candidate algorithm respectively comprises:
determining at least one target parameter set from the plurality of parameter sets based on performance simulation results corresponding to the plurality of parameter sets of each candidate algorithm, respectively;
and determining a shrinkage parameter search space of each candidate algorithm based on values of all parameters contained in the at least one target parameter group.
4. The optimization method according to any one of claims 1 to 3,
the candidate algorithm comprises a public parameter and at least one private parameter, and the parameter search space is a search space of the at least one private parameter.
5. The optimization method according to any one of claims 1 to 4, wherein the searching based on the candidate algorithm supported by each operator of the plurality of operators and the parameter value of the supported candidate algorithm to obtain the optimized neural network model comprises:
selecting a target algorithm corresponding to each candidate data format from at least one candidate algorithm corresponding to each candidate data format in a plurality of candidate data formats;
determining a search path corresponding to each operator according to a target algorithm corresponding to at least one data format supported by each operator in a plurality of operators; wherein the plurality of candidate data formats includes the at least one data format;
and searching based on the search path corresponding to each operator in the operators and the parameter value corresponding to the target algorithm in the search path of each operator to obtain the optimized neural network model.
6. The optimization method according to claim 5, wherein selecting the target algorithm corresponding to each candidate data format from the at least one candidate algorithm corresponding to each candidate data format of the plurality of candidate data formats comprises:
determining first consumed time of at least one candidate algorithm corresponding to each candidate data format when a calculation task is executed based on the data of each candidate data format;
and selecting a target algorithm corresponding to each candidate data format from the at least one candidate algorithm corresponding to each candidate data format according to the first consumed time of each candidate algorithm in the at least one candidate algorithm corresponding to each candidate data format.
7. The optimization method according to claim 5 or 6, further comprising:
for each candidate data format in the plurality of candidate data formats, selecting a target conversion data format corresponding to each candidate data format from the plurality of candidate data formats;
the determining a search path corresponding to each operator according to a target algorithm corresponding to at least one data format supported by each operator in the plurality of operators comprises:
and determining a search path corresponding to each operator according to a target algorithm corresponding to each data format in at least one data format supported by each operator and a target conversion data format corresponding to each data format.
8. The optimization method according to claim 7, wherein the selecting the target conversion data format corresponding to each candidate data format from the plurality of candidate data formats includes:
determining, for a first candidate data format of the plurality of candidate data formats, a second elapsed time for converting at least one second candidate data format other than the first candidate data format into the first candidate data format;
and selecting a target conversion data format corresponding to the first candidate data format from the at least one second candidate data format based on the second consumed time.
9. The optimization method according to claim 8, wherein selecting the target transformed data format corresponding to the first candidate data format from the at least one second candidate data format based on the second elapsed time comprises:
and selecting a target conversion data format corresponding to the first candidate data format from the at least one second candidate data format based on the first consumed time of the target algorithm corresponding to the first candidate data format and the second consumed time of each second candidate data format in the at least one second candidate data format.
10. The optimization method according to any one of claims 4 to 9, wherein the obtaining the optimized neural network model by searching based on the search path corresponding to each operator in the plurality of operators and the parameter value corresponding to the target algorithm in the search path of each operator comprises:
dividing the neural network model into a plurality of sub-network models, wherein each sub-network model comprises at least one operator in the neural network model;
searching based on a search path corresponding to each operator in at least one operator included in each sub-network model and a parameter value corresponding to a target algorithm in the search path corresponding to each operator to obtain each sub-network model;
and obtaining the optimized neural network model based on each sub-network model in the plurality of sub-network models.
11. The optimization method of claim 10, wherein the dividing the neural network model into a plurality of sub-network models comprises:
the neural network model is divided into a plurality of sub-network models based on at least one of a network depth threshold condition and a data format condition.
12. The optimization method of claim 11, wherein the network depth threshold condition comprises: the network depth of the current sub-network model reaches a preset depth threshold; and/or
The data format conditions include: the number of data formats supported by the next operator of the current operator is 1.
13. The optimization method according to any one of claims 1 to 12, wherein the obtaining the optimized neural network model by performing a search based on the search path corresponding to each operator in the plurality of operators and the parameter value corresponding to the target algorithm in the search path of each operator comprises:
determining a third time consumption corresponding to a search path based on a target algorithm corresponding to the search path and a parameter value of the target algorithm corresponding to the search path for the search path corresponding to each operator in a plurality of operators;
determining a target path based on the third time consumption of the search path corresponding to each operator in the plurality of operators;
and obtaining the optimized neural network model based on the target path.
14. A data processing method, comprising:
acquiring data to be processed;
executing a data processing task on the data to be processed by using a target neural network model to obtain a data processing result of the data to be processed;
wherein the target neural network model is obtained based on the optimization method of the neural network model according to any one of claims 1 to 13.
15. An apparatus for optimizing a neural network model, the neural network model comprising a plurality of operators, the apparatus comprising:
the preprocessing module is used for preprocessing a parameter search space corresponding to each candidate algorithm in a plurality of candidate algorithms to obtain a shrinkage parameter search space of each candidate algorithm;
the first searching module is used for searching based on the shrinkage parameter searching space of each candidate algorithm to obtain a parameter value corresponding to each candidate algorithm;
and the second searching module is used for searching based on the candidate algorithm supported by each operator in the operators and the parameter value of the supported candidate algorithm to obtain the optimized neural network model.
16. A data processing apparatus, comprising:
the acquisition module is used for acquiring data to be processed;
the processing module is used for executing a data processing task on the data to be processed by using a target neural network model to obtain a data processing result of the data to be processed;
wherein the target neural network model is obtained based on the optimization method of the neural network model according to any one of claims 1 to 13.
17. An electronic device, comprising: a processor and a memory, the memory storing machine-readable instructions executable by the processor; the processor is configured to execute the machine-readable instructions stored in the memory, and when the machine-readable instructions are executed by the processor, the processor performs the steps of the method of optimization of a neural network model according to any one of claims 1 to 13, or the steps of the method of data processing according to claim 14.
18. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by an electronic device, carries out the steps of the method of optimization of a neural network model according to any one of claims 1 to 13, or the steps of the method of data processing according to claim 14.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010243890.4A CN113469352B (en) | 2020-03-31 | 2020-03-31 | Optimization method of neural network model, data processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010243890.4A CN113469352B (en) | 2020-03-31 | 2020-03-31 | Optimization method of neural network model, data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113469352A true CN113469352A (en) | 2021-10-01 |
CN113469352B CN113469352B (en) | 2024-08-13 |
Family
ID=77866160
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010243890.4A Active CN113469352B (en) | 2020-03-31 | 2020-03-31 | Optimization method of neural network model, data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113469352B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114896950A (en) * | 2022-07-11 | 2022-08-12 | 浙江大华技术股份有限公司 | Model conversion method, model conversion device, and storage medium |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107229972A (en) * | 2017-03-10 | 2017-10-03 | 东莞理工学院 | A kind of global optimization based on Lamarch inheritance of acquired characters principle, search and machine learning method |
CN108062587A (en) * | 2017-12-15 | 2018-05-22 | 清华大学 | The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning |
CN109598329A (en) * | 2018-11-09 | 2019-04-09 | 上海交通大学 | A kind of convolution reserve pool optimization method based on evolution Edge of Chaos |
US20190171776A1 (en) * | 2017-12-01 | 2019-06-06 | Industrial Technology Research Institute | Methods, devices and non-transitory computer-readable medium for parameter optimization |
CN109862576A (en) * | 2018-12-11 | 2019-06-07 | 中国移动通信集团天津有限公司 | Abandon percentage prediction technique and system |
CN109919304A (en) * | 2019-03-04 | 2019-06-21 | 腾讯科技(深圳)有限公司 | Neural network searching method, device, readable storage medium storing program for executing and computer equipment |
CN110020717A (en) * | 2017-12-08 | 2019-07-16 | 三星电子株式会社 | Method and apparatus for generating fixed point neural network |
CN110197258A (en) * | 2019-05-29 | 2019-09-03 | 北京市商汤科技开发有限公司 | Neural network searching method, image processing method and device, equipment and medium |
US20190286984A1 (en) * | 2018-03-13 | 2019-09-19 | Google Llc | Neural architecture search by proxy |
US20190370648A1 (en) * | 2018-05-29 | 2019-12-05 | Google Llc | Neural architecture search for dense image prediction tasks |
CN110598842A (en) * | 2019-07-17 | 2019-12-20 | 深圳大学 | Deep neural network hyper-parameter optimization method, electronic device and storage medium |
CN110766142A (en) * | 2019-10-30 | 2020-02-07 | 北京百度网讯科技有限公司 | Model generation method and device |
CN110889487A (en) * | 2018-09-10 | 2020-03-17 | 富士通株式会社 | Neural network architecture search apparatus and method, and computer-readable recording medium |
-
2020
- 2020-03-31 CN CN202010243890.4A patent/CN113469352B/en active Active
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107229972A (en) * | 2017-03-10 | 2017-10-03 | 东莞理工学院 | A kind of global optimization based on Lamarch inheritance of acquired characters principle, search and machine learning method |
CN109870903A (en) * | 2017-12-01 | 2019-06-11 | 财团法人工业技术研究院 | Parameter optimization method, device and non-instantaneous computer-readable medium |
US20190171776A1 (en) * | 2017-12-01 | 2019-06-06 | Industrial Technology Research Institute | Methods, devices and non-transitory computer-readable medium for parameter optimization |
CN110020717A (en) * | 2017-12-08 | 2019-07-16 | 三星电子株式会社 | Method and apparatus for generating fixed point neural network |
CN108062587A (en) * | 2017-12-15 | 2018-05-22 | 清华大学 | The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning |
US20190286984A1 (en) * | 2018-03-13 | 2019-09-19 | Google Llc | Neural architecture search by proxy |
US20190370648A1 (en) * | 2018-05-29 | 2019-12-05 | Google Llc | Neural architecture search for dense image prediction tasks |
CN110889487A (en) * | 2018-09-10 | 2020-03-17 | 富士通株式会社 | Neural network architecture search apparatus and method, and computer-readable recording medium |
CN109598329A (en) * | 2018-11-09 | 2019-04-09 | 上海交通大学 | A kind of convolution reserve pool optimization method based on evolution Edge of Chaos |
CN109862576A (en) * | 2018-12-11 | 2019-06-07 | China Mobile Group Tianjin Co., Ltd. | Abandonment rate prediction method and system |
CN109919304A (en) * | 2019-03-04 | 2019-06-21 | Tencent Technology (Shenzhen) Co., Ltd. | Neural network search method, apparatus, readable storage medium and computer device |
CN110197258A (en) * | 2019-05-29 | 2019-09-03 | Beijing SenseTime Technology Development Co., Ltd. | Neural network search method, image processing method and apparatus, device and medium |
CN110598842A (en) * | 2019-07-17 | 2019-12-20 | Shenzhen University | Deep neural network hyperparameter optimization method, electronic device and storage medium |
CN110766142A (en) * | 2019-10-30 | 2020-02-07 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Model generation method and device |
Non-Patent Citations (2)
Title |
---|
T. SINHA et al.: "Particle Swarm Optimization Based Approach for Finding Optimal Values of Convolutional Neural Network Parameters", 2018 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 4 August 2018 (2018-08-04), pages 1 - 6 *
MIAO Xiaofeng, LIU Zhiwei: "Differential evolution algorithm with dynamic mutation operator based on search space size", Computer Systems & Applications, vol. 28, no. 06, 30 June 2019 (2019-06-30), pages 209 - 212 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114896950A (en) * | 2022-07-11 | 2022-08-12 | Zhejiang Dahua Technology Co., Ltd. | Model conversion method, model conversion device, and storage medium |
CN114896950B (en) * | 2022-07-11 | 2022-10-28 | Zhejiang Dahua Technology Co., Ltd. | Model conversion method, model conversion device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113469352B (en) | 2024-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111709533B (en) | Distributed training method and device of machine learning model and computer equipment | |
CN108985309B (en) | Data processing method and device | |
CN113469353A (en) | Neural network model optimization method, data processing method and device | |
CN111461226A (en) | Adversarial sample generation method, device, terminal and readable storage medium | |
EP4152154A1 (en) | Adaptive artificial neural network selection techniques | |
US10769140B2 (en) | Concept expansion using tables | |
CN112949662B (en) | Image processing method and device, computer equipment and storage medium | |
CN111737439B (en) | Question generation method and device | |
CN115114421A (en) | Question-answer model training method | |
CN113469352B (en) | Optimization method of neural network model, data processing method and device | |
Ilhan et al. | EENet: Learning to early exit for adaptive inference | |
CN114118403A (en) | Neural network architecture searching method, device, storage medium and electronic equipment | |
CN113869332A (en) | Feature selection method, device, storage medium and equipment | |
CN113827978A (en) | Churn user prediction method and device, and computer-readable storage medium | |
CN111985624A (en) | Neural network training and deploying method, text translation method and related products | |
CN116561338A (en) | Industrial knowledge graph generation method, device, equipment and storage medium | |
CN112905792B (en) | Text clustering method, device, equipment and storage medium based on non-text scene | |
CN113255231B (en) | Data processing method, device, equipment and storage medium | |
CN113239272B (en) | Intention prediction method and intention prediction device of network management and control system | |
CN109710778A (en) | Multimedia information processing method, device and storage medium | |
CN114648650A (en) | Neural network training method, neural network training device, target detection method, target detection device, equipment and storage medium | |
CN111667028A (en) | Reliable negative sample determination method and related device | |
Liu et al. | Agent-based modelling of polarized news and opinion dynamics in social networks: a guidance-oriented approach | |
CN113778893B (en) | Method, device, equipment and storage medium for generating test case of dialogue robot | |
CN116361319B (en) | Database query method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |