CN113869501A - Neural network generation method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113869501A
CN113869501A (application CN202111215815.8A)
Authority
CN
China
Prior art keywords
network
probability
input
edge
nash equilibrium
Prior art date
Legal status
Granted
Application number
CN202111215815.8A
Other languages
Chinese (zh)
Other versions
CN113869501B (en)
Inventor
薛超
李乾
李明明
陶大程
Current Assignee
Jingdong Technology Information Technology Co Ltd
Original Assignee
Jingdong Technology Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Jingdong Technology Information Technology Co Ltd filed Critical Jingdong Technology Information Technology Co Ltd
Priority to CN202111215815.8A
Publication of CN113869501A
Application granted
Publication of CN113869501B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention discloses a neural network generation method and device, electronic equipment and a storage medium. The method selects multi-path nodes from a pre-trained super network. For each multi-path node, a first game is constructed by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value, so that a game over the topological structure of the super network is constructed. A first Nash equilibrium strategy combination of the first game is then determined, which comprises the retention probability and the discarding probability corresponding to each input edge. At least one input edge whose retention probability or discarding probability meets a preset condition is selected from the first Nash equilibrium strategy combination, and a neural network is generated based on the input edges selected for the multi-path nodes, thereby improving the accuracy of the determined neural network.

Description

Neural network generation method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to a neural network generation method and device, electronic equipment and a storage medium.
Background
Over the past few years, a great deal of work has been devoted to the development of neural architecture search (NAS) algorithms that automatically find the optimal neural network structure for a particular task.
A common current method is to relax the initial neural network structure into a super-network structure, compute the network weight corresponding to each input edge of each node in the super network by a differentiable method, and finally take the single-path network structure formed by the nodes and, for each node, the input edge with the highest network weight as the optimal neural network structure.
In the process of implementing the invention, at least the following technical problems are found in the prior art:
the accuracy of the method for determining the optimal neural network structure according to the network weight values of the edges needs to be improved.
Disclosure of Invention
The embodiment of the invention provides a method and a device for generating a neural network, electronic equipment and a storage medium, which are used for improving the accuracy of the determined neural network.
In a first aspect, an embodiment of the present invention provides a method for generating a neural network, where the method includes:
acquiring a super network pre-trained by using training samples, and selecting multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;
for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge whose retention probability or discarding probability meets a preset condition in the first Nash equilibrium strategy combination; wherein the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and generating a neural network based on the input edges selected for each of the plurality of nodes.
In a second aspect, an embodiment of the present invention further provides an apparatus for generating a neural network, where the apparatus includes:
the node selection module is used for acquiring a super network pre-trained by using training samples and selecting multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;
the input edge selection module is used for constructing, for each multi-path node, a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a first Nash equilibrium strategy combination of the first game; and selecting at least one input edge whose retention probability or discarding probability meets a preset condition in the first Nash equilibrium strategy combination; wherein the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and the neural network generating module is used for generating a neural network based on the input edges selected for the multiple nodes.
In a third aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for generating a neural network according to any embodiment of the present invention.
In a fourth aspect, the embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for generating a neural network according to any of the embodiments of the present invention.
The embodiment of the invention has the following advantages or beneficial effects:
Multi-path nodes are selected in a pre-trained super network. For each multi-path node, a first game is constructed by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value, so that a game over the topological structure of the super network is constructed. A first Nash equilibrium strategy combination of the first game, comprising the retention probability and the discarding probability corresponding to each input edge, is then determined, which yields the Nash equilibrium of the topological-structure game. At least one input edge whose retention probability or discarding probability meets a preset condition is selected from the first Nash equilibrium strategy combination, so that the input edges most likely to remain connected are selected, and a neural network is generated based on the input edges selected for the multi-path nodes, thereby improving the accuracy of the determined neural network.
Drawings
In order to more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, a brief description is given below of the drawings used in describing the embodiments. It should be clear that the described figures are only views of some of the embodiments of the invention to be described, not all, and that for a person skilled in the art, other figures can be derived from these figures without inventive effort.
Fig. 1A is a schematic flowchart of a method for generating a neural network according to an embodiment of the present invention;
FIG. 1B is a schematic diagram of a cell in a hyper-network according to an embodiment of the present invention;
fig. 1C is a schematic process diagram of a first game according to an embodiment of the present invention;
fig. 2 is a schematic flow chart illustrating a method for generating a neural network according to a second embodiment of the present invention;
fig. 3A is a schematic flowchart of a method for generating a neural network according to a third embodiment of the present invention;
fig. 3B is a schematic process diagram of a second game according to the third embodiment of the present invention;
fig. 4 is a schematic structural diagram of a neural network generation apparatus according to a fourth embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before explaining the embodiments provided in the present application, an application scenario of the neural network generation method provided in the present application is exemplarily explained. For example, the neural network generation method provided by the present application may be applied to generate neural networks such as an image classification network, an image segmentation network, an image feature extraction network, an image compression network, an image enhancement network, an image noise reduction network, an image tag generation network, a text classification network, a text translation network, a text digest extraction network, a text prediction network, a keyword conversion network, a text semantic analysis network, a speech recognition network, an audio noise reduction network, an audio synthesis network, an audio equalizer conversion network, a weather prediction network, a commodity recommendation network, an article recommendation network, an action recognition network, a face recognition network, and a facial expression recognition network. The above application scenarios are merely exemplary, and the application scenarios of the neural network generation method are not limited in the present application.
Example one
Fig. 1A is a schematic flowchart of a method for generating a neural network according to an embodiment of the present invention. This embodiment is applicable to the case where a neural network is generated from a super network pre-trained by using training samples, by constructing a first game over the topological structure and determining a Nash equilibrium strategy combination. The method may be executed by a device for generating a neural network, which may be implemented by hardware and/or software, and the method specifically includes the following steps:
s110, acquiring a hyper-network pre-trained by using a training sample, and selecting a plurality of nodes in the hyper-network; wherein a multi-way node is a node having multiple input edges.
Wherein the training samples may be a training data set for training the super network. For example, the training samples may be image data, and the prediction result is an image processing result; or the training sample is text data, and the prediction result is a text processing result; or, the training sample is audio data, and the prediction result is an audio processing result.
For example, if the training sample is image data, the super network may be an image classification super network, and the prediction result output by the super network may be an image classification result; alternatively, the super-network may be an image segmentation super-network, and the prediction result may be an image segmentation result; or, the super network may be an image feature extraction super network, and the prediction result may be an image feature extraction result; alternatively, the super network may be an image compression super network, and the prediction result may be an image compression result; alternatively, the super network may be an image enhancement super network, and the prediction result may be an image enhancement result; or, the super network may be an image denoising super network, and the prediction result may be an image denoising result; alternatively, the super network may be an image tag generation super network, the prediction result may be an image tag, and so on. If the training sample is text data, the super network can be a text classification super network, and a prediction result output by the super network can be a text classification result; alternatively, the hyper-network may be a text prediction hyper-network, and the prediction result may be a text prediction result; or the hyper network can be a text abstract extracting hyper network, and the prediction result can be a text abstract extracting result; alternatively, the hyper-network may be a text translation hyper-network, and the predicted result may be a text translation result; alternatively, the hyper-network may be a keyword translation hyper-network, and the predicted result may be a keyword translation result; alternatively, the super network may be a text semantic analysis super network, the prediction result may be a text semantic analysis result, and so on. If the training sample is audio data, the super network can be a voice recognition super network, and the prediction result output by the super network can be a voice recognition result; alternatively, the super-network may be an audio noise reduction super-network, and the prediction result may be an audio noise reduction result; alternatively, the super network may be an audio synthesis super network, and the prediction result may be an audio synthesis result; alternatively, the super network may be an audio equalizer switch super network, the prediction result may be an audio equalizer switch result, and so on.
In this embodiment, the super network may be a network obtained by relaxing an initial network. The whole super network may be formed by connecting a plurality of cells in series, for example 20 cells; fig. 1B shows a schematic diagram of a cell in the super network. The super network comprises nodes and the input edges of the nodes; an input edge of a node may be one of several parallel edges into the node, and each input edge includes at least one operation operator. For example, the operation operators include, but are not limited to, 3 × 3 max pooling, 3 × 3 average pooling, skip connection, 3 × 3 separable convolution, 5 × 5 separable convolution, 3 × 3 dilated ("void") convolution, and 5 × 5 dilated ("void") convolution. Specifically, data passing through an operation operator on an input edge undergoes the corresponding operation, for example a 3 × 3 dilated convolution. A node in the super network represents the feature obtained by fusing the outputs of the operation operators on its input edges.
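A minimal sketch of one such input edge, assuming a PyTorch-style implementation rather than anything specified in the patent text; the operator list follows the examples above, the separable convolutions are simplified to their depthwise part, and the mixing weights are omitted.

```python
import torch
import torch.nn as nn

class MixedEdge(nn.Module):
    """One input edge of a node: its output fuses all candidate operators
    (here a plain sum; a weighted mixture is also common in relaxed super networks)."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.MaxPool2d(3, stride=1, padding=1),                           # 3x3 max pooling
            nn.AvgPool2d(3, stride=1, padding=1),                           # 3x3 average pooling
            nn.Identity(),                                                  # skip connection
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),   # 3x3 separable (depthwise part)
            nn.Conv2d(channels, channels, 5, padding=2, groups=channels),   # 5x5 separable (depthwise part)
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2),        # 3x3 dilated convolution
            nn.Conv2d(channels, channels, 5, padding=4, dilation=2),        # 5x5 dilated convolution
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The node feature is the fusion of the operator outputs on this edge.
        return sum(op(x) for op in self.ops)
```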
Specifically, after the super network pre-trained by using the training samples is obtained, the multi-path nodes with a plurality of input edges are selected from the super network. The number of the multi-path nodes selected in the super network in the embodiment may be one or more.
And S120, for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value.
In this embodiment, the optimal neural network is discretized out of the pre-trained super network. The embodiment therefore draws on the game-theoretic mathematical model of strategic interaction among rational decision makers, and formulates the task of extracting a suitable neural network structure from a pre-trained super network as a game whose competitors are the input edges of a multi-path node and whose strategies are retention and discarding.
Specifically, for each multi-path node, each input edge of the node is taken as a competitor, the retention and discarding of each input edge are taken as the strategies, and the accuracy of the prediction result output by the super network is taken as the utility value, so that a first game is constructed for each multi-path node. In this embodiment, optionally, the training sample is an image and the prediction result is an image classification result, an image segmentation result, a target detection result or a target tracking result; or, the training sample is a text and the prediction result is a text classification result or a natural language processing result.
The utility value may comprise the accuracy of the prediction result output by the super network after one or more input edges of the multi-path node are deleted from the super network. Specifically, the embodiment calculates the utility values of each multi-path node of the super network and then constructs the first game of each multi-path node, so as to select input edges from the first game of each multi-path node.
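A minimal sketch of tabulating these utility values, assuming a user-supplied `evaluate(dropped_edges)` function (not part of the patent text) that runs the pre-trained super network with the given input edges removed and returns validation accuracy.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Sequence

def utility_table(edges: Sequence[str],
                  evaluate: Callable[[FrozenSet[str]], float]
                  ) -> Dict[FrozenSet[str], float]:
    """Utility value of every non-empty subset of a multi-path node's input edges:
    the super-network accuracy after deleting that subset."""
    table = {}
    for i in range(1, len(edges) + 1):           # delete any i = 1..M input edges
        for dropped in combinations(edges, i):
            table[frozenset(dropped)] = evaluate(frozenset(dropped))
    return table

# e.g. utility_table(["a1", "a2", "a3"], evaluate) yields the seven accuracies
# enumerated in the example below (drop a1; drop a2; ...; drop a1, a2 and a3).
```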
S130, determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge of which the retention probability or the discarding probability meets a preset condition from the first Nash equilibrium strategy combination.
The first Nash equilibrium strategy combination includes the retention probability and the discarding probability corresponding to each input edge. Specifically, the first Nash equilibrium strategy combination of a multi-path node may be calculated based on the utility values of the multi-path node.
In an alternative embodiment, determining the first Nash equilibrium strategy combination of the first game includes: obtaining the utility values of the current multi-path node, where the utility values of the current multi-path node include the accuracy of the prediction result output by the super network after any i input edges of the current multi-path node are deleted from the super network, i being an integer from 1 to M and M being the number of input edges of the current multi-path node; and determining the first Nash equilibrium strategy combination of the first game based on the utility values of the current multi-path node and a Nash equilibrium solving algorithm. In this alternative embodiment, the utility value of the multi-path node is specifically the accuracy of the prediction result output by the super network after one or more input edges of the multi-path node are deleted from the super network. Illustratively, if a multi-path node in the super network has three input edges a1, a2 and a3, then the utility values of the multi-path node include the accuracy of the prediction result output by the super network after deleting a1; after deleting a2; after deleting a3; after deleting a1 and a2; after deleting a1 and a3; after deleting a2 and a3; and after deleting a1, a2 and a3.
In this alternative embodiment, the Nash equilibrium solving algorithm may be a Monte Carlo algorithm. Specifically, for a concrete implementation of determining the first Nash equilibrium strategy combination of the first game based on the utility values and a Nash equilibrium solving algorithm, reference may be made to the paper "Fast Complete Algorithm for Multiplayer Nash Equilibrium" by Sam Ganzfried, published in July 2020, with the MIQCP solver used in that paper replaced by the Nash equilibrium solving algorithm, such as the Monte Carlo algorithm.
In this alternative implementation, the accuracy of the prediction result output by the super network after any combination of input edges of a multi-path node is deleted is used as the utility values of the multi-path node, and the first Nash equilibrium strategy combination of the first game is then determined based on the utility values of each multi-path node and the Nash equilibrium solving algorithm, so that the retention probability and the discarding probability corresponding to each input edge of the multi-path node are determined accurately, which in turn improves the accuracy of the selected input edges.
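A minimal sketch of approximating the mixed keep/drop strategies of the first game by regret matching. This is an illustrative stand-in, not the MIQCP or Monte Carlo procedure the text refers to; regret matching in general converges to a coarse correlated rather than a Nash equilibrium, and the assumed `payoff(dropped)` function must return the super-network accuracy after removing the given input edges (including the empty set, i.e. the full super network), with every edge-player sharing the same payoff.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence

KEEP, DROP = 0, 1

def regret_matching(edges: Sequence[str],
                    payoff: Callable[[FrozenSet[str]], float],
                    iterations: int = 10000,
                    seed: int = 0) -> Dict[str, float]:
    rng = random.Random(seed)
    regret = {e: [0.0, 0.0] for e in edges}      # cumulative regret per action
    counts = {e: [0, 0] for e in edges}          # how often each action was played

    def sample(e: str) -> int:
        # Play proportionally to positive regrets; uniform if none are positive.
        pos = [max(r, 0.0) for r in regret[e]]
        total = sum(pos)
        if total == 0.0:
            return rng.randrange(2)
        return KEEP if rng.random() < pos[KEEP] / total else DROP

    for _ in range(iterations):
        actions = {e: sample(e) for e in edges}
        dropped = frozenset(e for e, a in actions.items() if a == DROP)
        base = payoff(dropped)
        for e in edges:
            counts[e][actions[e]] += 1
            for alt in (KEEP, DROP):             # counterfactual: switch only edge e
                alt_dropped = dropped - {e} if alt == KEEP else dropped | {e}
                regret[e][alt] += payoff(alt_dropped) - base

    # Empirical Keep frequencies approximate the retention probabilities of the
    # equilibrium strategy combination (drop probability = 1 - retention).
    return {e: counts[e][KEEP] / iterations for e in edges}
```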
In this embodiment, after the first Nash equilibrium strategy combination of the first game is determined, at least one input edge may be selected according to the preset condition and the retention probability and discarding probability of each input edge of each multi-path node in the first Nash equilibrium strategy combination. The preset condition may be a preset probability screening condition. For example, selecting at least one input edge whose retention probability or discarding probability satisfies the preset condition in the first Nash equilibrium strategy combination may be: selecting N input edges whose retention probability is higher than a preset retention threshold, or selecting N input edges whose discarding probability is lower than a preset discarding threshold. N can be adjusted according to actual requirements.
And S140, generating a neural network based on the input edges selected for the multi-path nodes.
Specifically, after at least one input edge whose retention probability or discarding probability satisfies the preset condition is selected from the first Nash equilibrium strategy combination, a neural network may be generated based on the multi-path nodes and the input edges selected for them. Alternatively, the input edges in the super network other than the selected input edges may be deleted, and a neural network may be generated based on the super network after the deletion, and so on.
The process of selecting input edges for a multi-path node in this embodiment is exemplarily described with reference to the schematic diagram of the first game shown in fig. 1C. 101 in fig. 1C shows a mixed-weight super network comprising nodes a, b, c and d, where the bottom multi-path node a has three input edges a1, a2 and a3. The three input edges can be regarded as a game with three competitors. As shown at 102 in fig. 1C, retention and discarding are the strategies of the three competitors (K denotes retention, D denotes discarding). The Nash equilibrium strategy combination of multi-path node a is obtained through a Nash equilibrium solving algorithm; as shown at 103 in fig. 1C, it includes a retention probability of 0.1 and a discarding probability of 0.9 for a1, a retention probability of 0.75 and a discarding probability of 0.25 for a2, and a retention probability of 0.8 and a discarding probability of 0.2 for a3. The two input edges with the highest retention probability are selected from the Nash equilibrium strategy combination by an Argmax(2) function (the higher the Keep probability, the more likely the input edge remains connected); as shown at 104 in fig. 1C, a2 and a3 are selected out of a1, a2 and a3. When the game is over, as shown at 105 in fig. 1C, the remaining input edge (a1) is deleted, and the bottom multi-path node a is left with two parallel input edges.
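A minimal sketch of the selection step illustrated in fig. 1C, using the retention probabilities quoted above: the two input edges with the highest Keep probability stay connected and the rest are removed.

```python
keep_prob = {"a1": 0.10, "a2": 0.75, "a3": 0.80}   # numbers from the example

top2 = sorted(keep_prob, key=keep_prob.get, reverse=True)[:2]
dropped = [e for e in keep_prob if e not in top2]

print(top2)     # ['a3', 'a2'] -- kept, as in the figure
print(dropped)  # ['a1'] -- deleted when the game is over
```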
In the technical scheme of this embodiment, multi-path nodes are selected in the pre-trained super network. For each multi-path node, a first game is constructed by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value, so that a game over the topological structure of the super network is constructed. A first Nash equilibrium strategy combination of the first game, comprising the retention probability and the discarding probability corresponding to each input edge, is then determined to obtain the Nash equilibrium of the topological-structure game. At least one input edge whose retention probability or discarding probability meets a preset condition is selected from the first Nash equilibrium strategy combination, so that the input edges most likely to remain connected are selected, and a neural network is generated based on the input edges selected for the multi-path nodes, which improves the accuracy of the determined neural network.
Example two
Fig. 2 is a schematic flow chart of a method for generating a neural network according to the second embodiment of the present invention. Optionally, selecting multi-path nodes in the super network includes: selecting multi-path nodes whose number of input edges is greater than N in the super network; and selecting at least one input edge whose retention probability or discarding probability meets a preset condition in the first Nash equilibrium strategy combination includes: selecting N input edges whose retention probability or discarding probability meets the preset condition in the first Nash equilibrium strategy combination, where N is an integer not less than 1. Explanations of terms that are the same as or correspond to those in the above embodiments are omitted here. Referring to fig. 2, the method for generating a neural network provided in this embodiment includes the following steps:
S210, acquiring a super network pre-trained by using training samples, and selecting multi-path nodes whose number of input edges is greater than N in the super network, where a multi-path node is a node having a plurality of input edges.
Wherein N is an integer not less than 1. N may be set according to the actual accuracy requirements for the neural network, e.g., N may be equal to 3, 4, 5, etc. Specifically, in this embodiment, the multi-path nodes with the number of input edges greater than N need to be selected in the super network, so as to further construct the first game for the multi-path nodes with the number of input edges greater than N.
S220, for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value.
S230, determining a first Nash equilibrium strategy combination of the first game, and selecting N input edges of which the retention probability or the discarding probability meets a preset condition from the first Nash equilibrium strategy combination.
The first Nash equilibrium strategy combination includes the retention probability and the discarding probability corresponding to each input edge. Specifically, in this embodiment, after multi-path nodes whose number of input edges is greater than N are selected in the super network, a first game is constructed for each of these multi-path nodes, and N input edges whose retention probability or discarding probability meets the preset condition are selected from the first Nash equilibrium strategy combination of the first game.
Optionally, selecting N input edges whose retention probability or discarding probability meets the preset condition in the first Nash equilibrium strategy combination includes: selecting the N input edges with the highest retention probability in the first Nash equilibrium strategy combination; or selecting the N input edges with the lowest discarding probability in the first Nash equilibrium strategy combination.
Specifically, the retention probabilities of the input edges in the first Nash equilibrium strategy combination may be sorted and the N input edges with the highest retention probability taken; or the discarding probabilities of the input edges may be sorted and the N input edges with the lowest discarding probability taken. In this optional embodiment, selecting the N input edges with the highest retention probability, or the N input edges with the lowest discarding probability, allows the input edges of each multi-path node to be selected accurately, thereby improving the accuracy of the resulting single-path network.
Of course, the N input edges whose retention probability or discarding probability meets the preset condition in the first Nash equilibrium strategy combination may also be: N input edges whose retention probability is higher than a preset retention threshold in the first Nash equilibrium strategy combination; or N input edges whose discarding probability is lower than a preset discarding threshold in the first Nash equilibrium strategy combination.
In this embodiment, N input edges are selected for each multi-path node whose number of input edges is greater than N by selecting, from the first Nash equilibrium strategy combination, the N input edges whose retention probability or discarding probability meets the preset condition, which ensures that the number of input edges of each multi-path node in the neural network is N, as shown in the sketch below.
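A minimal sketch tying this embodiment together. The helper `solve_first_game(node)` is assumed (not part of the patent text) to return the per-edge retention probabilities of the first Nash equilibrium strategy combination, for example via the regret-matching sketch given earlier.

```python
from typing import Callable, Dict, List, Mapping, Sequence

def select_edges(nodes: Mapping[str, Sequence[str]],
                 solve_first_game: Callable[[str], Dict[str, float]],
                 n: int = 2) -> Dict[str, List[str]]:
    """Keep exactly n input edges for every multi-path node with more than n edges."""
    selected = {}
    for node, edges in nodes.items():
        if len(edges) <= n:                   # not a multi-path node for this N
            selected[node] = list(edges)
            continue
        keep_prob = solve_first_game(node)    # retention probability per edge
        ranked = sorted(edges, key=lambda e: keep_prob[e], reverse=True)
        selected[node] = ranked[:n]           # N edges with the highest retention
    return selected
```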
And S240, generating a neural network based on the input edges selected for the multi-path nodes.
According to the technical scheme of this embodiment, N input edges are selected for each multi-path node with more than N input edges in the super network by selecting, from the first Nash equilibrium strategy combination of the multi-path node, the N input edges whose retention probability or discarding probability meets the preset condition. This ensures that the number of input edges of each multi-path node in the neural network is N, prevents the multi-path nodes in the neural network from having too few input edges, and further improves the accuracy of the generated neural network.
Example three
Fig. 3A is a schematic flow chart of a method for generating a neural network according to the third embodiment of the present invention. Optionally, the method further includes: for each edge in the super network, constructing a second game by taking each candidate operation operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operation operator as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value; and determining a second Nash equilibrium strategy combination of the second game, and selecting at least one operation operator whose retention probability or discarding probability meets a preset condition from the second Nash equilibrium strategy combination; where the second Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each candidate operation operator. Generating a neural network based on the input edges selected for the multi-path nodes then includes: generating a neural network based on the input edges selected for the multi-path nodes and the operation operator selected for each edge. Referring to fig. 3A, the method for generating a neural network provided in this embodiment includes the following steps:
s310, acquiring the super network pre-trained by using the training samples, and selecting multiple nodes in the super network.
Wherein a multi-way node is a node having multiple input edges.
And S320, for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value.
S330, determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge whose retention probability or discarding probability meets a preset condition from the first Nash equilibrium strategy combination; the first Nash equilibrium strategy combination includes the retention probability and the discarding probability corresponding to each input edge.
S340, for each edge in the super network, constructing a second game by taking each candidate operation operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operation operator as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value.
The candidate operation operators are the operators on each edge, such as 3 × 3 dilated convolution, skip connection, max pooling, 5 × 5 separable convolution, and so on. In this embodiment, to construct a more accurate neural network, an optimal operation operator may further be selected for each edge. The embodiment therefore again draws on the game-theoretic mathematical model of strategic interaction among rational decision makers, and formulates the task of extracting a suitable neural network structure from the pre-trained super network as a game whose competitors are the candidate operation operators contained in an edge and whose strategies are retention and discarding.
Specifically, for each edge in the super network, each candidate operation operator contained in the edge is taken as a competitor, the retention and discarding of the candidate operation operators are taken as the strategies, and the accuracy of the prediction result output by the super network is taken as the utility value, so that a second game is constructed for each edge.
The utility value used in constructing the second game may comprise the accuracy of the prediction result output by the super network after one or more operation operators of the current edge are removed from the super network. Specifically, the embodiment calculates the utility values of each edge of the super network and then constructs the second game of each edge, so as to select an operation operator from the second Nash equilibrium strategy combination of the second game of each edge.
S350, determining a second Nash equilibrium strategy combination of the second game, and selecting at least one operation operator of which the retention probability or the discarding probability meets the preset condition from the second Nash equilibrium strategy combination.
The second Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each candidate operation operator. Specifically, the second Nash equilibrium strategy combination of each edge may be calculated based on the utility values of that edge.
Optionally, determining the second Nash equilibrium strategy combination of the second game includes: obtaining the utility values of the current edge, where the utility values of the current edge include the accuracy of the prediction result output by the super network after any j candidate operation operators contained in the current edge are deleted from the super network, j being an integer from 1 to P and P being the number of candidate operation operators contained in the current edge; and determining the second Nash equilibrium strategy combination of the second game based on the utility values of the current edge and a Nash equilibrium solving algorithm.
The utility value of the current edge is specifically the accuracy of the prediction result output by the super network after one or more candidate operation operators of the current edge are deleted from the super network. Illustratively, if an edge in the super network contains three candidate operation operators b1, b2 and b3, then the utility values of the edge include the accuracy of the prediction result output by the super network after deleting b1; after deleting b2; after deleting b3; after deleting b1 and b2; after deleting b1 and b3; after deleting b2 and b3; and after deleting b1, b2 and b3. Optionally, the Nash equilibrium solving algorithm may be a Monte Carlo algorithm.
In this optional implementation, the accuracy of the prediction result output by the super network after any combination of candidate operation operators of an edge is deleted is used as the utility values of that edge, and the second Nash equilibrium strategy combination of the second game is then determined based on the utility values of each edge and the Nash equilibrium solving algorithm, so that the retention probability and the discarding probability corresponding to each candidate operation operator of each edge are determined accurately, which in turn improves the accuracy of the selected operation operator.
Optionally, selecting at least one operation operator whose retention probability or discarding probability meets the preset condition in the second Nash equilibrium strategy combination includes: selecting at least one operation operator with the highest retention probability in the second Nash equilibrium strategy combination; or selecting at least one operation operator with the lowest discarding probability in the second Nash equilibrium strategy combination. Alternatively, it includes: selecting at least one operation operator whose retention probability is higher than a preset retention threshold in the second Nash equilibrium strategy combination; or selecting at least one operation operator whose discarding probability is lower than a preset discarding threshold in the second Nash equilibrium strategy combination.
It should be noted that the execution sequence of S340-S350, and S310-S330 is not limited in this embodiment. Specifically, S340-S350 may be performed after S310-S330, that is, the operator may be selected for the input edge that has been selected for each multi-way node and the input edges of other non-multi-way nodes. Still alternatively, S340-S350 may be performed concurrently with S310-S330, or prior to S310-S330, i.e., operators may be selected for each edge in the original hyper-network.
And S360, generating a neural network based on the input edges selected for the multi-path nodes and the operation operators selected for the edges.
Specifically, after at least one operation operator whose retention probability or discarding probability satisfies the preset condition is selected from the second Nash equilibrium strategy combination, a neural network may be generated from the operation operator selected for each edge and the input edges selected for each multi-path node. Alternatively, the input edges in the super network other than the selected input edges may be deleted, the candidate operation operators other than the selected operation operators may be deleted, and a neural network may be generated based on the super network after these deletions, and so on.
Illustratively, the generating the neural network based on the input edges selected for the respective multi-way nodes and the operation operators selected for the respective edges includes: for each multi-path node, deleting other input edges of the current multi-path node except the input edge selected for the current multi-path node in the hyper network; for each edge, deleting other operation operators contained in the current edge except the operation operator selected for the current edge in the hyper-network; a neural network is generated based on the current hyper-network.
In this exemplary embodiment, after selecting an input edge for each multi-path node and selecting an operator for each edge, the input edges except for the selected input edge in each multi-path node are deleted, the operators except for the selected operator in each edge are deleted, and then a neural network is generated according to the deleted super network.
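A minimal sketch of this pruning step. The super network is modelled here as a plain nested dict and the function name `prune_super_network` is an assumption for illustration; the real structure is framework-specific.

```python
from typing import Dict, List, Set

def prune_super_network(super_net: Dict[str, Dict[str, List[str]]],
                        selected_edges: Dict[str, Set[str]],
                        selected_ops: Dict[str, str]) -> Dict[str, Dict[str, List[str]]]:
    """super_net: node -> {input edge -> candidate operators};
    selected_edges: node -> input edges kept by the first game;
    selected_ops: edge -> the single operator kept by the second game."""
    pruned = {}
    for node, edges in super_net.items():
        kept = {}
        for edge, ops in edges.items():
            if node in selected_edges and edge not in selected_edges[node]:
                continue                          # edge discarded by the first game
            # Keep only the operator chosen by the second game; if the edge was
            # not played in a second game, keep its operators unchanged.
            kept[edge] = [op for op in ops if op == selected_ops.get(edge, op)]
        pruned[node] = kept
    return pruned                                 # the remaining structure is the generated network
```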
For example, the process of selecting an operation operator for each edge in this embodiment is exemplarily described with reference to the schematic diagram of the second game shown in fig. 3B. Each line in 301 in fig. 3B may represent one candidate operation operator: L1, L2, L3, L4 and L5. The five candidate operation operators can be regarded as a game with five competitors; as shown at 302 in fig. 3B, the retention and discarding of each candidate operation operator are the strategies (K denotes retention, D denotes discarding). The Nash equilibrium strategy combination of the edge is obtained through a Nash equilibrium solving algorithm; as shown at 303, it includes a retention probability of 0.99 and a discarding probability of 0.01 for L1, a retention probability of 0.10 and a discarding probability of 0.90 for L2, a retention probability of 0.15 and a discarding probability of 0.85 for L3, a retention probability of 0.90 and a discarding probability of 0.10 for L4, and a retention probability of 0.19 and a discarding probability of 0.81 for L5. The candidate operation operator with the highest retention probability is selected from the Nash equilibrium strategy combination, as shown at 304 in fig. 3B, where operation operator L1 is selected. When the game is over, as shown at 305 in fig. 3B, the candidate operation operators other than the selected operation operator L1 are deleted, and a single operation operator is left to form the final architecture.
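A minimal sketch of the selection step illustrated in fig. 3B, using the retention probabilities quoted above: the candidate operation operator with the highest Keep probability is retained on the edge and the other four are deleted.

```python
keep_prob = {"L1": 0.99, "L2": 0.10, "L3": 0.15, "L4": 0.90, "L5": 0.19}

best = max(keep_prob, key=keep_prob.get)
deleted = [op for op in keep_prob if op != best]

print(best)     # 'L1' -- the operator left to form the final architecture
print(deleted)  # ['L2', 'L3', 'L4', 'L5']
```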
According to the technical scheme of this embodiment, for each edge in the super network, a second game is constructed by taking each candidate operation operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operation operator as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value, so that a game over the operation operators of the super network is constructed. A second Nash equilibrium strategy combination of the second game, comprising the retention probability and the discarding probability corresponding to each candidate operation operator, is then determined to obtain the Nash equilibrium of the operation-operator game. At least one operation operator whose retention probability or discarding probability meets a preset condition is selected from the second Nash equilibrium strategy combination so that the optimal operation operator is selected, and a neural network is generated based on the input edges selected through the first game and the operation operators selected through the second game, which further improves the accuracy of the determined neural network.
Example four
Fig. 4 is a schematic structural diagram of a device for generating a neural network according to the fourth embodiment of the present invention. This embodiment is applicable to the case where a neural network is generated from a super network pre-trained by using training samples, by constructing a first game over the topological structure and determining a Nash equilibrium strategy combination. The device specifically includes: a node selection module 410, an input edge selection module 420, and a neural network generation module 430.
A node selection module 410, configured to obtain a super network pre-trained using a training sample, and select a plurality of nodes in the super network; wherein the multi-way node is a node having a plurality of input edges;
an input edge selection module 420, configured to construct, for each multi-path node, a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determine a first Nash equilibrium strategy combination of the first game; and select at least one input edge whose retention probability or discarding probability meets a preset condition in the first Nash equilibrium strategy combination; where the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and a neural network generating module 430, configured to generate a neural network based on the input edges selected for each of the multiple nodes.
Optionally, the training sample is image data, and the prediction result is an image processing result; or the training sample is text data, and the prediction result is a text processing result; or, the training sample is audio data, and the prediction result is an audio processing result.
Optionally, the node selection module 410 includes a first selection unit, configured to select multi-path nodes whose number of input edges in the super network is greater than N; and the input edge selection module 420 includes a second selection unit, configured to select N input edges whose retention probability or discarding probability meets a preset condition in the first Nash equilibrium strategy combination; where N is an integer not less than 1.
Optionally, the second selecting unit is specifically configured to:
selecting the N input edges with the highest retention probability in the first Nash equilibrium strategy combination; or selecting the N input edges with the lowest discarding probability in the first Nash equilibrium strategy combination.
Optionally, the input edge selection module 420 includes a first strategy combination determining unit, configured to obtain the utility values of the current multi-path node, where the utility values of the current multi-path node include the accuracy of the prediction result output by the super network after any i input edges of the current multi-path node are deleted from the super network, i being an integer from 1 to M and M being the number of input edges of the current multi-path node; and to determine a first Nash equilibrium strategy combination of the first game based on the utility values of the current multi-path node and a Nash equilibrium solving algorithm.
Optionally, the device for generating a neural network further includes an operation operator selection module, configured to, for each edge in the super network, construct a second game by taking each candidate operation operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operation operator as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determine a second Nash equilibrium strategy combination of the second game; and select at least one operation operator whose retention probability or discarding probability meets a preset condition from the second Nash equilibrium strategy combination; where the second Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each candidate operation operator. The neural network generation module 430 includes a first generating unit, configured to generate a neural network based on the input edges selected for the multi-path nodes and the operation operator selected for each edge.
Optionally, the operation operator selection module includes a second strategy combination determining unit, configured to obtain the utility values of the current edge, where the utility values of the current edge include the accuracy of the prediction result output by the super network after any j candidate operation operators contained in the current edge are deleted from the super network, j being an integer from 1 to P and P being the number of candidate operation operators contained in the current edge; and to determine a second Nash equilibrium strategy combination of the second game based on the utility values of the current edge and a Nash equilibrium solving algorithm.
Optionally, the operation operator selection module includes an operation operator selection unit, configured to select at least one operation operator with the highest retention probability in the second Nash equilibrium strategy combination; or to select at least one operation operator with the lowest discarding probability in the second Nash equilibrium strategy combination.
Optionally, the first generating unit is specifically configured to:
for each of the multiple nodes, deleting other input edges of the current multiple nodes except the input edge selected for the current multiple node in the super network;
for each edge, deleting other operation operators contained in the current edge except the operation operator selected for the current edge in the hyper-network;
a neural network is generated based on the current hyper-network.
In this embodiment, multi-path nodes are selected in a pre-trained super network by the node selection module. For each multi-path node, the input edge selection module constructs a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as the strategies, and taking the accuracy of the prediction result output by the super network as the utility value, so that a game over the topological structure of the super network is constructed; it then determines a first Nash equilibrium strategy combination of the first game, comprising the retention probability and the discarding probability corresponding to each input edge, to obtain the Nash equilibrium of the topological-structure game, and selects at least one input edge whose retention probability or discarding probability meets a preset condition in the first Nash equilibrium strategy combination, so that the input edges most likely to remain connected are selected. The neural network generation module then generates a neural network based on the input edges selected for the multi-path nodes, which improves the accuracy of the determined neural network.
The generation device of the neural network provided by the embodiment of the invention can execute the generation method of the neural network provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, the units and modules included in the system are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
Example five
Fig. 5 is a schematic structural diagram of an electronic device according to the fifth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary electronic device 12 suitable for implementing embodiments of the present invention. The electronic device 12 shown in fig. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention. The device 12 is typically an electronic device that carries out the neural network generation function.
As shown in FIG. 5, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a memory 28, and a bus 18 that couples the various components (including the memory 28 and the processing unit 16).
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer-readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 28 may include computer-readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, the storage device 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product 40 having a set of program modules 42 configured to carry out the functions of embodiments of the invention. Program product 40 may be stored, for example, in memory 28; such program modules 42 include, but are not limited to, one or more application programs, other program modules, and program data, and each of these examples, or some combination thereof, may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, mouse, camera, etc., and display), one or more devices that enable a user to interact with electronic device 12, and/or any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), Wide Area Network (WAN), and/or a public Network such as the internet) via the Network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, Redundant processing units, external disk drive Arrays, disk array (RAID) devices, tape drives, and data backup storage devices, to name a few.
The processor 16 executes various functional applications and performs data processing by running the programs stored in the memory 28, for example implementing the neural network generation method provided by the above embodiments of the present invention, which includes:
acquiring a super network pre-trained by using training samples, and selecting multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;
for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge whose retention probability or discarding probability satisfies a preset condition in the first Nash equilibrium strategy combination; wherein the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and generating a neural network based on the input edges selected for each of the multi-path nodes.
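As an illustrative sketch of the final step only, the snippet below keeps, for each multi-path node, the N input edges with the highest retention probability (equivalently, the lowest discarding probability, since the two sum to one for each edge) and, where the optional per-edge second game has also been solved, the single operator with the highest retention probability on each kept edge; the data structures and names are assumptions introduced for this example and are not taken from the embodiments.

from typing import Dict, List, Tuple

def top_n(retention: Dict, n: int) -> List:
    # Items with the highest retention probability come first.
    return sorted(retention, key=retention.get, reverse=True)[:n]

def derive_architecture(input_edges: Dict[int, List[int]],
                        edge_retention: Dict[int, Dict[int, float]],
                        operator_retention: Dict[int, Dict[str, float]],
                        n_inputs: int = 2) -> Dict[int, List[Tuple[int, str]]]:
    # For every multi-path node (more than n_inputs input edges) keep the
    # n_inputs edges with the highest retention probability and delete the rest
    # from the super network; for every kept edge keep only the candidate
    # operator with the highest retention probability.
    architecture: Dict[int, List[Tuple[int, str]]] = {}
    for node, edges in input_edges.items():
        if len(edges) > n_inputs:
            kept_edges = top_n(edge_retention[node], n_inputs)
        else:
            kept_edges = list(edges)
        architecture[node] = [(edge, top_n(operator_retention[edge], 1)[0])
                              for edge in kept_edges]
    return architecture

# Toy usage: node 2 has three input edges, of which two are retained, and each
# retained edge keeps its best-retained candidate operator.
input_edges = {2: [0, 1, 2]}
edge_retention = {2: {0: 0.81, 1: 0.64, 2: 0.23}}
operator_retention = {0: {"conv3x3": 0.70, "skip": 0.30},
                      1: {"conv3x3": 0.55, "maxpool": 0.45},
                      2: {"skip": 0.50, "maxpool": 0.50}}
print(derive_architecture(input_edges, edge_retention, operator_retention))
# {2: [(0, 'conv3x3'), (1, 'conv3x3')]}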
Of course, those skilled in the art can understand that the processor can also implement the technical solution of the neural network generation method provided in any embodiment of the present invention.
EXAMPLE six
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for generating a neural network provided in any embodiment of the present invention, and the method includes:
acquiring a super network pre-trained by using training samples, and selecting multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;
for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge whose retention probability or discarding probability satisfies a preset condition in the first Nash equilibrium strategy combination; wherein the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and generating a neural network based on the input edges selected for each of the multi-path nodes.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (12)

1. A method for generating a neural network, comprising:
acquiring a super network pre-trained by using training samples, and selecting multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;
for each multi-path node, constructing a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a first Nash equilibrium strategy combination of the first game, and selecting at least one input edge whose retention probability or discarding probability satisfies a preset condition in the first Nash equilibrium strategy combination; wherein the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and generating a neural network based on the input edges selected for each of the multi-path nodes.
2. The method of claim 1, wherein the training samples are image data, and the prediction result is an image processing result;
or,
the training samples are text data, and the prediction result is a text processing result;
or,
the training samples are audio data, and the prediction result is an audio processing result.
3. The method of claim 1, wherein the selecting multi-path nodes in the super network comprises:
selecting, in the super network, multi-path nodes of which the number of input edges is greater than N;
the selecting at least one input edge whose retention probability or discarding probability satisfies a preset condition in the first Nash equilibrium strategy combination comprises:
selecting N input edges whose retention probability or discarding probability satisfies the preset condition in the first Nash equilibrium strategy combination; wherein N is an integer not less than 1.
4. The method according to claim 3, wherein the selecting N input edges whose retention probability or discarding probability satisfies the preset condition in the first Nash equilibrium strategy combination comprises:
selecting the N input edges with the highest retention probability in the first Nash equilibrium strategy combination; or,
selecting the N input edges with the lowest discarding probability in the first Nash equilibrium strategy combination.
5. The method of claim 1, wherein the determining a first Nash equilibrium strategy combination of the first game comprises:
obtaining the utility values of the current multi-path node; wherein the utility values of the current multi-path node comprise: the accuracy of the prediction result output by the super network after any i input edges of the current multi-path node are deleted from the super network; wherein i is an integer taking values from 1 to M, and M is the number of input edges of the current multi-path node;
and determining the first Nash equilibrium strategy combination of the first game based on the utility values of the current multi-path node and a Nash equilibrium solving algorithm.
6. The method according to any one of claims 1-5, further comprising:
for each edge in the super network, constructing a second game by taking each candidate operator contained in the current edge as a competitor, taking the retention and discarding of each candidate operator as strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a second Nash equilibrium strategy combination of the second game, and selecting at least one operator whose retention probability or discarding probability satisfies a preset condition in the second Nash equilibrium strategy combination; wherein the second Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each candidate operator;
the generating a neural network based on the input edges selected for each of the multi-path nodes comprises:
generating a neural network based on the input edges selected for each of the multi-path nodes and the operator selected for each edge.
7. The method of claim 6, wherein the determining a second Nash equilibrium strategy combination of the second game comprises:
obtaining the utility values of the current edge; wherein the utility values of the current edge comprise: the accuracy of the prediction result output by the super network after any j candidate operators contained in the current edge are deleted from the super network; wherein j is an integer taking values from 1 to P, and P is the number of candidate operators contained in the current edge;
and determining the second Nash equilibrium strategy combination of the second game based on the utility values of the current edge and a Nash equilibrium solving algorithm.
8. The method according to claim 6, wherein the selecting at least one operator whose retention probability or discarding probability satisfies a preset condition in the second Nash equilibrium strategy combination comprises:
selecting at least one operator with the highest retention probability in the second Nash equilibrium strategy combination; or,
selecting at least one operator with the lowest discarding probability in the second Nash equilibrium strategy combination.
9. The method of claim 6, wherein the generating a neural network based on the input edges selected for each of the multi-path nodes and the operator selected for each edge comprises:
for each multi-path node, deleting from the super network the input edges of the current multi-path node other than the input edges selected for the current multi-path node;
for each edge, deleting from the super network the candidate operators contained in the current edge other than the operator selected for the current edge;
and generating a neural network based on the current super network.
10. An apparatus for generating a neural network, comprising:
a node selection module, used for acquiring a super network pre-trained by using training samples and selecting multi-path nodes in the super network; wherein a multi-path node is a node having a plurality of input edges;
an input edge selection module, used for constructing, for each multi-path node, a first game by taking each input edge of the current multi-path node as a competitor, taking the retention and discarding of each input edge as strategies, and taking the accuracy of the prediction result output by the super network as the utility value; determining a first Nash equilibrium strategy combination of the first game; and selecting at least one input edge whose retention probability or discarding probability satisfies a preset condition in the first Nash equilibrium strategy combination; wherein the first Nash equilibrium strategy combination comprises the retention probability and the discarding probability corresponding to each input edge;
and a neural network generation module, used for generating a neural network based on the input edges selected for the multi-path nodes.
11. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for generating a neural network according to any one of claims 1-9.
12. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method of generating a neural network as claimed in any one of claims 1 to 9.
CN202111215815.8A 2021-10-19 2021-10-19 Neural network generation method and device, electronic equipment and storage medium Active CN113869501B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111215815.8A CN113869501B (en) 2021-10-19 2021-10-19 Neural network generation method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113869501A true CN113869501A (en) 2021-12-31
CN113869501B CN113869501B (en) 2024-06-18

Family

ID=79000300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111215815.8A Active CN113869501B (en) 2021-10-19 2021-10-19 Neural network generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113869501B (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667164A (en) * 2009-09-18 2010-03-10 黄以华 On-chip bus arbiter and processing method thereof
CN109032935A (en) * 2018-07-13 2018-12-18 东北大学 The prediction technique of non-perfect information game perfection software model based on phantom go
US20200058106A1 (en) * 2018-08-15 2020-02-20 Carole Lazarus Deep learning techniques for suppressing artefacts in magnetic resonance images
CN109871943A (en) * 2019-02-20 2019-06-11 华南理工大学 A kind of depth enhancing learning method for big two three-wheel arrangement of pineapple playing card
CN109815631A (en) * 2019-02-26 2019-05-28 网易(杭州)网络有限公司 A kind for the treatment of method and apparatus of game data
WO2020236596A1 (en) * 2019-05-17 2020-11-26 Nvidia Corporation Motion prediction using one or more neural networks
WO2020259502A1 (en) * 2019-06-27 2020-12-30 腾讯科技(深圳)有限公司 Method and device for generating neural network model, and computer-readable storage medium
CN111275174A (en) * 2020-02-13 2020-06-12 中国人民解放军32802部队 Game-oriented radar countermeasure generating method
US20210303897A1 (en) * 2020-03-30 2021-09-30 Sg Gaming, Inc. Gaming environment tracking optimization
CN112437690A (en) * 2020-04-02 2021-03-02 支付宝(杭州)信息技术有限公司 Determining action selection guidelines for an execution device
CN111905373A (en) * 2020-07-23 2020-11-10 深圳艾文哲思科技有限公司 Artificial intelligence decision method and system based on game theory and Nash equilibrium
CN112200304A (en) * 2020-09-30 2021-01-08 北京市商汤科技开发有限公司 Neural network searching method, device, electronic equipment and storage medium
CN112465115A (en) * 2020-11-25 2021-03-09 科大讯飞股份有限公司 GAN network compression method, device, equipment and storage medium
CN112598188A (en) * 2020-12-29 2021-04-02 沃太能源南通有限公司 Neural network generation method, power prediction method, device and storage medium
CN112712131A (en) * 2021-01-14 2021-04-27 清华大学 Game theory framework-based neural network model lifelong learning method
CN112926744A (en) * 2021-02-22 2021-06-08 中山大学 Incomplete information game method and system based on reinforcement learning and electronic equipment
CN113076927A (en) * 2021-04-25 2021-07-06 华南理工大学 Finger vein identification method and system based on multi-source domain migration

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MARTIN BICHLER ET AL.: "Learning equilibria in symmetric auction games using artificial neural networks", Nature Machine Intelligence, 9 August 2021, pages 687-695 *
RUOCHEN WANG ET AL.: "Rethinking Architecture Selection in Differentiable NAS", arXiv.org, 10 August 2021, pages 1-18 *
WU HAIYAN: "Research on the Construction and Application of a Neural Network-Game Combined Decision Model", China Master's Theses Full-text Database, no. 3, 15 September 2004, pages 1-61 *
JIANG TAO: "Research on Robust Optimization Design of Airline Hub-and-Spoke Route Networks", China Doctoral Dissertations Full-text Database, no. 5, 15 May 2009, pages 1-136 *

Also Published As

Publication number Publication date
CN113869501B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
CN112231275B (en) Method, system and equipment for classifying multimedia files, processing information and training models
CN109657054B (en) Abstract generation method, device, server and storage medium
CN110751224B (en) Training method of video classification model, video classification method, device and equipment
CN111462735A (en) Voice detection method and device, electronic equipment and storage medium
CN109522950B (en) Image scoring model training method and device and image scoring method and device
US20130132851A1 (en) Sentiment estimation of web browsing user
US10565401B2 (en) Sorting and displaying documents according to sentiment level in an online community
JP2020027649A (en) Method, apparatus, device and storage medium for generating entity relationship data
CN114942984B (en) Pre-training and image-text retrieval method and device for visual scene text fusion model
CN111898675B (en) Credit wind control model generation method and device, scoring card generation method, machine readable medium and equipment
CN111428049A (en) Method, device, equipment and storage medium for generating event topic
CN108595679B (en) Label determining method, device, terminal and storage medium
CN112836487B (en) Automatic comment method and device, computer equipment and storage medium
CN112035549A (en) Data mining method and device, computer equipment and storage medium
WO2016175785A1 (en) Topic identification based on functional summarization
CN111062431A (en) Image clustering method, image clustering device, electronic device, and storage medium
CN114707007B (en) Image text retrieval method and device and computer storage medium
CN117011737A (en) Video classification method and device, electronic equipment and storage medium
CN114547257A (en) Class matching method and device, computer equipment and storage medium
De Zarate et al. Vocabulary-Based Method for Quantifying Controversy in Social Media.
CN113010785A (en) User recommendation method and device
CN112883736A (en) Medical entity relationship extraction method and device
CN111816306A (en) Medical data processing method, and prediction model training method and device
CN113869501B (en) Neural network generation method and device, electronic equipment and storage medium
CN112784015B (en) Information identification method and device, apparatus, medium, and program

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant