CN114330699A - Neural network structure searching method and device

Neural network structure searching method and device

Info

Publication number
CN114330699A
CN114330699A
Authority
CN
China
Prior art keywords
network
neural network
loss function
time delay
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011043055.2A
Other languages
Chinese (zh)
Inventor
李明阳
周振坤
徐羽琼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202011043055.2A priority Critical patent/CN114330699A/en
Priority to PCT/CN2021/120434 priority patent/WO2022063247A1/en
Publication of CN114330699A publication Critical patent/CN114330699A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A neural network structure searching method and apparatus relate to the field of AI and can determine a neural network structure with excellent performance in a short time, using few computing resources, while ensuring that the theoretical time delay is consistent with the real time delay. The method comprises: obtaining a super network according to a target task; obtaining the running time delay of each deep learning operator in the super network on an electronic device; determining a time delay loss function according to the running time delay of each deep learning operator on the electronic device; performing a training operation on the super network and updating model parameters of the super network according to the time delay loss function and a network loss function until the updated super network meets the conditions for running the target task on the electronic device; and determining a target neural network structure according to the updated architecture parameters of each network layer. The super network comprises a plurality of network layers, each network layer comprises a plurality of nodes, any two nodes of one network layer are connected through a deep learning operator, and the model parameters comprise the architecture parameters of each network layer.

Description

Neural network structure searching method and device
Technical Field
The present application relates to the field of Artificial Intelligence (AI), and in particular, to a neural network structure search method and apparatus.
Background
With the rapid development of AI technology, new neural network models emerge constantly. The performance of a neural network structure has an important influence on how well the corresponding neural network model executes its task: the better the structure's performance, the better the model's task execution effect. Therefore, when constructing a neural network model, how to determine a neural network structure with excellent performance is a research hotspot for those skilled in the art.
Neural architecture search (NAS) technology can automatically search for a neural network structure with optimal performance in a predefined search space. However, in the prior art, searching for a neural network structure with NAS consumes a large amount of computing resources and cannot guarantee consistency between the theoretical time delay and the real time delay.
Disclosure of Invention
The application provides a neural network structure searching method and apparatus, which can determine a neural network structure with excellent performance in a short time, using few computing resources, while ensuring that the theoretical time delay is consistent with the real time delay.
In a first aspect, the application provides a neural network structure searching method: a neural network structure searching apparatus obtains a super network according to a target task, obtains the time delay of each deep learning operator in the super network running on an electronic device, determines a time delay loss function of the super network according to those time delays, then performs a training operation on the super network, updates model parameters of the super network according to the time delay loss function and the network loss function obtained in the training process until the updated super network meets the conditions for running the target task on the electronic device, and determines a target neural network structure according to the updated architecture parameters of each network layer. The super network comprises a plurality of network layers, each network layer comprises a plurality of nodes, any two nodes of one network layer are connected through a deep learning operator, and the model parameters comprise architecture parameters of each of the plurality of network layers.
Thus, the super network obtained according to the target task is a structure comprising a plurality of network layers, each network layer comprising a plurality of nodes, with any two nodes connected through a deep learning operator; the super network therefore covers all possible sub-networks for executing the target task. By training the super network, the application updates the model parameters of the super network, which include the architecture parameters of each network layer; once the updated super network meets the conditions, the target neural network structure, i.e., the neural network structure with optimal performance, can be determined according to the updated architecture parameters of each network layer. Moreover, because the model parameter updates refer to the time delay loss function, which is obtained from the real time delay of each deep learning operator running on the electronic device, consistency between the theoretical time delay and the real time delay can be ensured when determining the target neural network structure.
Optionally, in a possible implementation manner of the present application, the method for determining a delay loss function of a super network according to a delay of each deep learning operator in operation of an electronic device may include: the neural network structure searching device determines a network embedding coefficient corresponding to each deep learning operator according to a corresponding relation between pre-stored operators and the network embedding coefficient, determines a product of the running time delay of each deep learning operator in the electronic equipment and the network embedding coefficient corresponding to the deep learning operator, determines a sum of all the products, and then determines a time delay loss function according to the sum and the time delay consistency coefficient.
Therefore, according to the real time delay of each deep learning operator in the operation of the electronic equipment and the network embedding coefficient corresponding to each deep learning operator, the time delay corresponding to the discrete deep learning operators is constructed into a continuous time delay constraint function, and the consistency of the time delay is ensured.
Optionally, in another possible implementation manner of the present application, the architecture parameter of the network layer includes a connection weight of each deep learning operator of the network layer. In this case, the method of "determining the target neural network structure according to the updated architecture parameters of each network layer" may include: and the neural network structure searching device acquires the connection weight of which the numerical value meets the preset condition in the updated architecture parameters of each network layer, and determines the target neural network structure according to all the acquired connection weights.
In the prior art, the target neural network structure is determined according to only the largest connection weight in the architecture parameters of each network layer, whereas in the present application it is determined according to every connection weight whose value meets a preset condition. The number of connection weights retained per network layer is not limited in this application. When multiple connection weights are retained from each network layer, the resulting target neural network structure is more stable than one built from a single weight as in the prior art, so the task execution effect of the neural network model determined from it is better.
Optionally, in another possible implementation manner of the present application, the method of "updating the model parameters of the super network according to the time delay loss function and the network loss function obtained in the training process" may include: the neural network structure searching device determines the overall loss function of the super network according to the time delay loss function and the network loss function, and updates the model parameters of the super network according to the overall loss function.
Therefore, model parameters in the super network are updated according to the delay loss function and the network loss function, and the target neural network structure is guaranteed to meet the requirements of delay consistency and network precision.
Optionally, in another possible implementation manner of the present application, the method of "updating the model parameters of the super network according to the overall loss function" may include: the neural network structure searching device determines the gradient information of each model parameter according to the overall loss function, and adjusts each model parameter according to its gradient information. The gradient information is used to represent the adjustment coefficient of the corresponding model parameter.
Updating of the model parameters by gradients is achieved.
In a second aspect, a neural network structure search apparatus is provided, which includes various modules for executing the neural network structure search method of the first aspect or any one of the possible implementations of the first aspect.
In a third aspect, a neural network structure search apparatus is provided, which includes a memory and a processor. The memory is coupled to the processor. The memory is for storing computer program code, the computer program code including computer instructions. The neural network structure search means, when the processor executes the computer instructions, performs a neural network structure search method as the first aspect and any one of its possible implementations.
In a fourth aspect, a chip system is provided, which is applied to a neural network structure search apparatus. The chip system includes one or more interface circuits, and one or more processors. The interface circuit and the processor are interconnected through a line; the interface circuit is configured to receive a signal from a memory of the neural network structure search apparatus and send the signal to the processor, the signal including computer instructions stored in the memory. The neural network structure search means, when the processor executes the computer instructions, performs a neural network structure search method as the first aspect and any one of its possible implementations.
In a fifth aspect, there is provided a computer readable storage medium comprising computer instructions which, when run on a neural network structure search apparatus, cause the neural network structure search apparatus to perform a neural network structure search method as in the first aspect and any one of its possible implementations.
In a sixth aspect, the present application provides a computer program product comprising computer instructions which, when run on a neural network structure search apparatus, cause the neural network structure search apparatus to perform a neural network structure search method as in the first aspect and any one of its possible implementations.
For details of the second to sixth aspects and their various implementations, reference may be made to the first aspect and its implementations; for their beneficial effects, reference may be made to the analysis of beneficial effects in the first aspect and its implementations, and details are not described here again.
These and other aspects of the present application will be more readily apparent from the following description.
Drawings
Fig. 1 is a schematic structural diagram of a neural network structure search system according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a computing device according to an embodiment of the present application;
Fig. 3 is a first schematic flowchart of a neural network structure searching method according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a super network according to an embodiment of the present application;
Fig. 5 is a second schematic flowchart of a neural network structure searching method according to an embodiment of the present application;
Fig. 6 is a third schematic flowchart of a neural network structure searching method according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a neural network structure search apparatus according to an embodiment of the present application.
Detailed Description
In the embodiments of the present application, words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, "a plurality" means two or more unless otherwise specified.
At present, the construction process of the neural network model is as follows: and constructing a neural network structure, training and evaluating the constructed neural network structure to obtain a neural network structure with excellent performance, and determining a neural network model according to the neural network structure with excellent performance.
Most existing neural network structures are designed manually. For example, network structures such as ResNet, which shines on image classification tasks, and Transformer, which dominates machine translation tasks, were designed by experts in the field. However, such designs are obtained by experts through rich experience and large numbers of experiments, and suffer from long design time, low accuracy, and inconsistent time delay. Time delay inconsistency refers to inconsistency between the theoretical time delay and the real time delay of a neural network model, where the real time delay is the time delay of the neural network model actually running on the electronic device.
NAS technology can automatically search for a neural network structure with excellent performance in a predefined search space, thereby avoiding the problems of manually designing the neural network structure.
In a first prior-art scheme, reinforcement learning is used to search for a neural network structure. Specifically, the neural network structure search apparatus may use a recurrent neural network (RNN) as a controller and sample a sub-network from a preset search space using the controller's parameters. The sub-network is trained to convergence to obtain model evaluation indexes, such as the sub-network's accuracy and its number of floating-point operations (FLOPs). The controller parameters may then be updated based on these model evaluation indexes. The neural network structure searching apparatus repeats these operations, i.e., it samples another sub-network using the updated controller parameters, trains that sub-network to obtain a new model evaluation index, and updates the controller parameters again according to the new index. This cycle continues until a sub-network with excellent performance is obtained, and that sub-network is taken as the network structure of the neural network model to be determined.
However, the neural network structure searching apparatus needs to train a large number of subnetworks to obtain the subnetworks with optimal performance, and network weights need to be initialized each time the subnetworks are trained, which results in high consumption of computing resources. In addition, when the neural network structure search device updates the controller parameters, reference is made to the FLOPs, and the FLOPs cannot reflect the real time delay of the sub-network on different electronic devices, so that the consistency of the theoretical time delay and the real time delay of the sub-network cannot be ensured.
In a second prior-art scheme, an evolutionary algorithm and reinforcement learning are used to search for a neural network structure. On top of the first scheme, the neural network structure searching apparatus additionally sends each sub-network to the electronic device and receives from the electronic device the real time delay of running that sub-network. Thus, when updating the controller parameters, the apparatus can refer to the real time delay instead of FLOPs, which solves the time delay inconsistency problem of the first scheme.
However, the second scheme still consumes a large amount of computing resources, and sending sub-networks to the electronic device adds further overhead, resulting in low searching efficiency of the neural network structure.
In conclusion, the neural network structure search in the prior art has the problems that the consumption of computing resources is high, and the consistency between the theoretical time delay and the real time delay cannot be ensured.
In order to determine a neural network structure with excellent performance in a short time, using few computing resources, while ensuring that the theoretical time delay is consistent with the real time delay, the embodiments of the present application provide a neural network structure searching method: a super network is obtained according to a target task, and a time delay loss function of the super network is determined according to the time delay of each deep learning operator in the super network running on the electronic device. In the process of training the super network, the model parameters of the super network are updated according to the time delay loss function and the network loss function until the updated super network meets the conditions for running the target task on the electronic device; finally, a target neural network structure is determined according to the updated architecture parameters of each network layer. The super network obtained according to the target task is a structure comprising a plurality of network layers, each network layer comprising a plurality of nodes, with any two nodes connected through a deep learning operator; the super network therefore covers all possible sub-networks for executing the target task. By training the super network, the model parameters, which include the architecture parameters of each network layer, are updated; once the updated super network meets the conditions, the target neural network structure, i.e., the neural network structure with optimal performance, can be determined according to the updated architecture parameters of each network layer. Moreover, because the model parameter updates refer to the time delay loss function, which is obtained from the real time delay of each deep learning operator running on the electronic device, consistency between the theoretical time delay and the real time delay can be ensured when determining the target neural network structure.
The execution main body of the neural network structure searching method provided by the embodiment of the application is a neural network structure searching device.
In one scenario, the neural network structure searching apparatus may be an electronic device, and the electronic device may be a server or a terminal device. That is to say, the electronic device initiates a target task by itself, and determines a target neural network structure with optimal performance by executing the neural network structure search method provided by the embodiment of the present application, so as to determine the neural network model. The electronic device then runs the neural network model to perform the target task.
In another scenario, the neural network structure searching apparatus may be a server, and the terminal device runs the neural network model. That is to say, the server determines a target neural network structure with optimal performance by executing the neural network structure searching method provided by the embodiment of the present application, thereby determining the neural network model, and sending the neural network model to the terminal device. The terminal device runs the received neural network model to execute the target task. Specifically, the neural network structure search method provided by the embodiment of the present application may be applied to a neural network structure search system.
Fig. 1 shows a structure of the neural network structure search system. As shown in fig. 1, the neural network structure search system may include: a server 11 and a terminal device 12. The server 11 and the terminal device 12 establish connection by a wired communication method or a wireless communication method.
The server 11 is the execution body of the neural network structure search method provided in the embodiments of the present application. It is mainly configured to train the super network and update the model parameters of the super network according to the time delay loss function and the network loss function until the updated super network meets the conditions for running the target task on the terminal device 12. It is further configured to determine the target neural network structure according to the updated architecture parameters of each network layer, thereby determining the neural network model, and to send the neural network model to the terminal device 12.
In some embodiments, the server 11 may be one server, a server cluster composed of a plurality of servers, or a cloud computing service center. The embodiment of the present application does not limit the specific form of the server, and fig. 1 illustrates one server as an example.
And the terminal device 12 is used for running the neural network model from the server 11 to execute the target task.
In some embodiments, terminal device 12 may be: a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, an internet of things (IoT) device, etc. The embodiments of the present application do not limit the specific form of the terminal device; a mobile phone is shown as an example of terminal device 12 in fig. 1.
The embodiment of the present application does not limit in which scenario the neural network structure search method is specifically applied.
The basic hardware structures of the server 11 and the terminal device 12 are similar and both include elements included in the computing apparatus shown in fig. 2. The hardware configurations of the server 11 and the terminal device 12 will be described below by taking the computing apparatus shown in fig. 2 as an example.
As shown in fig. 2, the computing device may include a processor 21, a memory 22, a communication interface 23, and a bus 24. The processor 21, the memory 22 and the communication interface 23 may be connected by a bus 24.
The processor 21 is a control center of the computing device, and may be a single processor or a collective term for a plurality of processing elements. For example, the processor 21 may be a Central Processing Unit (CPU), other general-purpose processors, or the like. The general-purpose processor may be a microprocessor, any conventional processor, etc., and may be a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), etc., for example.
For one embodiment, processor 21 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 2.
The memory 22 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a Random Access Memory (RAM) or other type of dynamic storage device that may store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
In a possible implementation, the memory 22 may exist separately from the processor 21, and the memory 22 may be connected to the processor 21 via a bus 24 for storing instructions or program codes. The processor 21, when calling and executing the instructions or program codes stored in the memory 22, can implement the neural network structure searching method provided by the following embodiments of the present application.
In the embodiment of the present application, the software programs stored in the memory 22 are different for the server 11 and the terminal device 12, so the functions implemented by the server 11 and the terminal device 12 are different. The functions performed by the devices will be described in connection with the following flow charts.
In another possible implementation, the memory 22 may also be integrated with the processor 21.
The communication interface 23 is used for connecting the computing apparatus and other devices through a communication network, where the communication network may be an ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), or the like. The communication interface 23 may include a receiving unit for receiving data, and a transmitting unit for transmitting data.
The bus 24 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 2, but it is not intended that there be only one bus or one type of bus.
It should be noted that the structure shown in fig. 2 does not constitute a limitation on the computing device; the computing device may include more or fewer components than those shown in fig. 2, may combine some components, or may use a different arrangement of components.
Based on the hardware structure of the computing device, the embodiment of the present application provides a neural network structure search method, and the following describes the neural network structure search method provided by the embodiment of the present application with reference to the accompanying drawings. In the embodiment of the present application, a scenario in which a server executes a neural network structure search method, determines a neural network model, and a terminal device receives and runs the neural network model is taken as an example, and the neural network structure search method provided in the embodiment of the present application is introduced.
When the neural network structure searching method is applied to the neural network structure searching system shown in fig. 1, as shown in fig. 3, the neural network structure searching method may include the following steps 301 to 305.
301. The server acquires the super network according to the target task.
The target task is used to instruct construction of a neural network model to run on the terminal device. The super network includes a plurality of network layers, each network layer includes a plurality of nodes, and any two nodes of one network layer are connected through one or more deep learning operators. The type of a deep learning operator may be standard convolution, separable convolution, dilated convolution, average pooling, and the like. Moreover, each neural network structure sampled from the super network may be used to perform the target task.
Typically, each network layer includes at least two nodes. The more nodes a network layer includes, the more deep learning operators it contains, the more computing resources are required, and the higher the accuracy of the output result.
When the server acquires a target task instructing it to construct a neural network model to run on the terminal device, it may determine in advance the target neural network structure with optimal performance for that model. Specifically, the server may first obtain the super network according to the target task.
It can be understood that the process of the server acquiring the super network according to the target task is as follows: the server may determine whether a historical task that is the same as or similar to the target task exists locally. If it exists, the server has previously constructed a super network for that historical task and can directly obtain that super network locally. If it does not exist, the server has not yet constructed a super network for the target task, and the server constructs one according to the target task and a preset search space. Of these two ways of acquiring the super network, directly acquiring it locally reduces the workload of searching for the target neural network structure and improves search efficiency.
In addition, the target task may include an output type of the neural network model. For example, the target task may be to construct a face recognition neural network model running on the terminal device, to recognize a face, and output a corresponding person name. For another example, the target task may be to build a hand pose estimation model running on the terminal device for recognizing the hand pose of the person in the picture.
Fig. 4 is a schematic structural diagram of a super network according to an embodiment of the present application. As shown in fig. 4, the super network includes three network layers as an example. The first network layer includes three nodes, and the deep learning operators used by the connections among these nodes include: 3 × 3 standard convolution, 5 × 5 standard convolution, and skip connection operators. The second network layer includes three nodes, and the deep learning operators used by the connections among these nodes include: 3 × 3 standard convolution, 5 × 5 standard convolution, and 3 × 3 separable convolution. The third network layer includes four nodes, and the deep learning operators used by the connections among these nodes include: 3 × 3 standard convolution, 5 × 5 separable convolution, 3 × 3 dilated convolution, and skip connection operators. Thus, as can be seen from fig. 4, the super network includes six deep learning operators in total: the 3 × 3 standard convolution, the 5 × 5 standard convolution, the skip connection operator, the 3 × 3 separable convolution, the 5 × 5 separable convolution, and the 3 × 3 dilated convolution. It is to be understood that the connections shown in each network layer of fig. 4 are only exemplary; how the nodes of each network layer are specifically connected, and which deep learning operator is used for the connection between two nodes, are not limited in the embodiments of the present application.
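To make the structure of fig. 4 concrete for readers, the following is a minimal sketch, assuming PyTorch, of how one edge of such a super network can be built: every candidate deep learning operator between two nodes is instantiated, and their outputs are mixed by softmax-normalized connection weights (the architecture parameters). The operator subset, the channel handling, and the names CANDIDATE_OPS and MixedOp are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Candidate operators between two nodes (illustrative subset of fig. 4).
CANDIDATE_OPS = {
    "conv_3x3":     lambda c: nn.Conv2d(c, c, kernel_size=3, padding=1),
    "conv_5x5":     lambda c: nn.Conv2d(c, c, kernel_size=5, padding=2),
    "sep_conv_3x3": lambda c: nn.Sequential(
        nn.Conv2d(c, c, kernel_size=3, padding=1, groups=c),  # depthwise
        nn.Conv2d(c, c, kernel_size=1)),                      # pointwise
    "skip_connect": lambda c: nn.Identity(),
}

class MixedOp(nn.Module):
    """One edge of the super network: a weighted mix of all candidate ops."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([op(channels) for op in CANDIDATE_OPS.values()])

    def forward(self, x, alpha):
        # alpha holds one connection weight per candidate operator on this edge
        weights = F.softmax(alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

A full super network stacks such mixed edges into the network layers of fig. 4; the alpha vector of each edge is exactly the per-layer architecture parameter that the method updates during training.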
302. The server acquires the running time delay of each deep learning operator in the super network on the terminal device.
After constructing the super network, the server may send each deep learning operator in the super network to the terminal device. The terminal device can run each received deep learning operator and return the time delay of running each one to the server. In this way, the server acquires the time delay of each deep learning operator running on the terminal device.
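As a hedged sketch of this step, the snippet below shows one plausible way for the terminal device to time a single operator and report the average latency back; the warm-up and repeat counts, the input shape, and the function name are assumptions for illustration, since the patent does not specify a measurement protocol.

```python
import time
import torch

def measure_op_latency(op, input_shape=(1, 16, 64, 64), warmup=10, repeats=100):
    """Run one operator repeatedly and return its average latency in seconds."""
    op.eval()
    x = torch.randn(input_shape)
    with torch.no_grad():
        for _ in range(warmup):           # warm up caches / lazy initialization
            op(x)
        start = time.perf_counter()
        for _ in range(repeats):
            op(x)
    return (time.perf_counter() - start) / repeats

# The latency table the terminal device returns to the server: one real
# measured time delay t_op per candidate operator.
# latency_table = {name: measure_op_latency(make_op(16))
#                  for name, make_op in CANDIDATE_OPS.items()}
```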
303. The server determines the time delay loss function of the super network according to the time delay of each deep learning operator running on the terminal device.
After the server obtains the time delay of each deep learning operator in operation on the terminal device, the time delay loss function of the whole super network can be determined according to the time delays. Reference may be made specifically to the description of steps 303A-303C below.
304. The server performs a training operation on the super network and updates the model parameters of the super network according to the time delay loss function and the network loss function acquired in the training process, until the updated super network meets the conditions for running the target task on the terminal device.
The network loss function is used to characterize the difference between the predicted output of the super network and the data label. The larger the output value of the network loss function, the larger the difference between the predicted output and the data label. The training process of the super network can therefore be understood as a process of minimizing the output values of the time delay loss function and the network loss function.
After the server acquires the super network in step 301, the server may train the super network and update the model parameters in the super network according to the time delay loss function determined in step 303 and the network loss function acquired in the training process, until the updated super network meets the conditions for running the target task on the terminal device, at which point the training process stops. The model parameters may include architecture parameters for each of the plurality of network layers.
It is understood that the above conditions may include an accuracy requirement and a time delay requirement. For example, the conditions may include that the accuracy of the output result obtained by running the super network reaches a preset percentage and that the time delay of running the super network is less than a preset time value. The conditions are set in advance according to the target task and the hardware configuration of the terminal device.
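A minimal sketch of how such a stopping condition could be checked is shown below; the 95% accuracy and 20 ms latency thresholds are purely illustrative values, not figures from the patent.

```python
def meets_conditions(accuracy, latency_ms,
                     min_accuracy=0.95, max_latency_ms=20.0):
    """Stop training once both the accuracy requirement and the time delay
    requirement, preset for the target task and device, are satisfied."""
    return accuracy >= min_accuracy and latency_ms <= max_latency_ms
```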
305. The server determines the target neural network structure according to the updated architecture parameters of each network layer.
Wherein, the target neural network structure is a network structure with optimal performance.
In a specific implementation, the architecture parameter of each network layer may include a connection weight of each deep learning operator in all deep learning operators of the network layer. In this case, the process of determining the target neural network structure by the server according to the updated architecture parameters of each network layer may be: the server firstly obtains the connection weight of which the numerical value meets the preset condition in the updated architecture parameters of each network layer, and then determines the target neural network structure according to all the obtained connection weights.
It is understood that the preset condition can be realized in various ways.
In one possible implementation manner, the preset condition may be: a preset number of connection weights in each network layer. The connection weights satisfying the preset condition are then the top preset number of weights obtained by sorting all the connection weights in the network layer in descending order. The preset numbers for different network layers may be the same or different.
In another possible implementation manner, the preset condition may be: a connection weight greater than a preset weight value. In this way, the server may obtain a connection weight greater than a preset weight value from the architecture parameter of each network layer, and determine the target neural network structure according to all the obtained connection weights. Of course, the preset condition may also be that a corresponding preset weight value is set for each network layer. The preset weight values corresponding to different network layers may be the same or different.
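Both variants of the preset condition can be expressed compactly. The sketch below, assuming PyTorch and using illustrative values for k and the threshold, returns the indices of the connection weights to retain in one layer.

```python
import torch

def select_edges_topk(alpha, k=2):
    # variant 1: keep the k largest connection weights of this layer
    return torch.topk(alpha, k).indices.tolist()

def select_edges_threshold(alpha, threshold=0.2):
    # variant 2: keep every connection weight above a preset weight value
    return (alpha >= threshold).nonzero(as_tuple=True)[0].tolist()

# The target neural network structure keeps, in every layer, exactly the
# operators indexed by the retained connection weights.
```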
In the prior art, the target neural network structure is determined according to only the largest connection weight in the architecture parameters of each network layer, whereas in the present application it is determined according to every connection weight whose value meets the preset condition. The number of connection weights retained per network layer is not limited in this application. When multiple connection weights are retained from each network layer, the resulting target neural network structure is more stable than one built from a single weight as in the prior art, so the task execution effect of the neural network model determined from it is better.
According to the neural network structure searching method provided in the embodiments of the present application, the super network is obtained according to the target task, and the time delay loss function of the super network is determined according to the time delay of each deep learning operator in the super network running on the electronic device. In the process of training the super network, the model parameters of the super network are updated according to the time delay loss function and the network loss function until the updated super network meets the conditions for running the target task on the electronic device; finally, the target neural network structure is determined according to the updated architecture parameters of each network layer. The super network obtained according to the target task is a structure comprising a plurality of network layers, each network layer comprising a plurality of nodes, with any two nodes connected through a deep learning operator; the super network therefore covers all possible sub-networks for executing the target task. By training the super network, the model parameters, which include the architecture parameters of each network layer, are updated; once the updated super network meets the conditions, the target neural network structure, i.e., the neural network structure with optimal performance, can be determined according to the updated architecture parameters of each network layer. Moreover, because the model parameter updates refer to the time delay loss function, which is obtained from the real time delay of each deep learning operator running on the electronic device, consistency between the theoretical time delay and the real time delay can be ensured when determining the target neural network structure.
For example, assume that the target task is to build a hand pose estimation model running on the terminal device, and the hand pose estimation model is running on the GPU of the terminal device. Then, in the case where the target neural network structure needs to be determined within one day, thousands of GPUs may be consumed if the target neural network structure is determined using the prior art scheme. If the target neural network structure is determined by the neural network structure searching method provided by the embodiment of the application, only 1-2 GPUs may be consumed. Therefore, the neural network structure searching method provided by the embodiment of the application greatly saves the computing resources required for searching the target neural network structure.
Optionally, in this embodiment of the application, based on fig. 3, as shown in fig. 5, the step 303 may specifically include the following steps 303A to 303C.
303A, the server determines the network embedding coefficient corresponding to each deep learning operator according to the corresponding relationship between the pre-stored operator and the network embedding coefficient.
The network embedding coefficient serves to keep the time delay loss function derived from it consistent in meaning with the network loss function.
303B, the server determines the product of the time delay of each deep learning operator running on the terminal equipment and the network embedding coefficient corresponding to the deep learning operator, and determines the sum of all the products.
After determining the network embedding coefficient corresponding to each deep learning operator, the server may calculate the product of the time delay of each deep learning operator operating on the terminal device and the network embedding coefficient corresponding to the deep learning operator, and add all the products to obtain a sum value.
In a specific implementation, suppose the server uses

$$T = \{ t_{op_i} \mid op_i \in S \}$$

to represent the set of time delays, on the terminal device, of each deep learning operator among the plurality of deep learning operators included in the super network, where $S$ represents the set formed by all deep learning operators in the super network, $op_i$ represents the $i$-th deep learning operator in $S$, and $t_{op_i}$ represents the time delay of the $i$-th deep learning operator in $S$ on the terminal device.
Then, the server can calculate the product of the time delay of each deep learning operator running on the terminal device and the network embedding coefficient corresponding to that operator, and calculate the sum of all the products. The sum satisfies the following formula:

$$\mathrm{lat} = \sum_{op_i \in S} \alpha_i \cdot t_{op_i}$$

where $t_{op_i}$ represents the time delay of the $i$-th deep learning operator in $S$ on the terminal device, $\alpha_i$ represents the network embedding coefficient corresponding to the $i$-th deep learning operator, and $\mathrm{lat}$ represents the weighted summation result over all deep learning operators in $S$, i.e., the product of each operator's time delay and its network embedding coefficient is calculated and all the products are summed.
303C, the server determines a delay loss function according to the sum and the delay consistency coefficient.
After the server obtains the sum $\mathrm{lat}$ of all the products, the time delay loss function can be determined. The time delay loss function satisfies the following formula:

$$loss_{la} = \lambda_{la} \cdot \mathrm{lat} = \lambda_{la} \cdot \sum_{op_i \in S} \alpha_i \cdot t_{op_i}$$

where $\lambda_{la}$ represents the time delay consistency coefficient and $loss_{la}$ represents the time delay loss function.
It can be understood that each $\alpha_i$ above is a variable in a matrix formed by the connection weights of each deep learning operator across the plurality of network layers of the super network, and this matrix is continuously updated during the training process.
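Under the reading above, the time delay loss can be written directly as a differentiable function of the architecture parameters. The sketch below assumes PyTorch and treats the network embedding coefficient $\alpha_i$ as the softmax-normalized connection weight of each operator, which is an interpretation rather than something the patent states explicitly; the value of lambda_la is illustrative.

```python
import torch
import torch.nn.functional as F

def delay_loss(alphas, latencies, lambda_la=0.1):
    """alphas:    architecture-parameter tensors, one per edge of the super net
    latencies: matching tensors of measured on-device time delays t_op
    lambda_la: time delay consistency coefficient (value illustrative)"""
    total = torch.zeros(())
    for alpha, t in zip(alphas, latencies):
        # sum of alpha_i * t_op_i over this edge's candidate operators
        total = total + (F.softmax(alpha, dim=0) * t).sum()
    return lambda_la * total
```

Because the measured latencies enter as constants and the softmax keeps the mixture continuous, this term is differentiable with respect to the connection weights, which is what lets the discrete per-operator delays act as a continuous constraint during training.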
Therefore, according to the real time delay of each deep learning operator in the operation of the electronic equipment and the network embedding coefficient corresponding to each deep learning operator, the time delay corresponding to the discrete deep learning operators is constructed into a continuous time delay constraint function, and the consistency of the time delay is ensured.
Optionally, in this embodiment of the application, based on fig. 5, as shown in fig. 6, the step 304 may specifically include the following steps 304A to 304B.
304A, the server performs a training operation on the super network and determines the overall loss function of the super network according to the time delay loss function and the network loss function.
The time delay loss function is used for ensuring the time delay consistency of the target neural network structure, and the network loss function is used for ensuring the precision requirement, namely the accuracy requirement, of the target neural network structure.
It will be appreciated that in order to avoid overfitting of the super-network, the server also needs to take into account the network regularization term when determining the overall loss function.
Specifically, the server may determine the overall loss function of the super network. The overall loss function satisfies the following formula:

$$loss = loss_{mse} + loss_{la} + loss_{reg}$$

where $loss_{la}$ represents the time delay loss function, $loss_{mse}$ represents the network loss function, $loss_{reg}$ represents the network regularization term, and $loss$ represents the overall loss function.
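A sketch of the overall loss under these definitions follows, assuming PyTorch. Taking the network loss as mean-squared error matches the $loss_{mse}$ symbol, while the plain L2 form of the regularization term and the coefficient values are assumptions, since the garbled source does not preserve the exact regularizer.

```python
import torch
import torch.nn.functional as F

def overall_loss(pred, target, alphas, latencies, net_params,
                 lambda_la=0.1, lambda_reg=1e-4):
    loss_mse = F.mse_loss(pred, target)                  # network loss
    loss_la = lambda_la * sum((F.softmax(a, dim=0) * t).sum()
                              for a, t in zip(alphas, latencies))
    loss_reg = lambda_reg * sum(p.pow(2).sum() for p in net_params)
    return loss_mse + loss_la + loss_reg                 # loss = mse + la + reg
```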
304B, the server updates the model parameters of the super network according to the overall loss function until the updated super network meets the conditions for running the target task on the terminal device.
In a specific implementation, the server may determine gradient information of each model parameter according to the overall loss function, where the gradient information is used to represent the adjustment coefficient of the corresponding model parameter. The server may then adjust each model parameter based on its gradient information.
In the case where the model parameters include a network parameter and an architecture parameter for each network layer, and the network parameter of each network layer includes the weight of each deep learning operator of that layer, the server may first update the network parameter of each network layer. The updated network parameter satisfies the following formula:

$$W_N = W_N' - \nabla_{w}\, loss$$

where $w$ represents the weight of a deep learning operator in the network parameter of a certain network layer, $W_N'$ represents the value of the network parameter $w$ after the last training iteration, $\nabla_{w}\, loss$ represents the gradient information of the network parameter $w$, and $W_N$ represents the updated value of the network parameter $w$.
The server may then update the architecture parameter of each network layer. The updated architecture parameter satisfies the following formula:

$$W_A = W_A' - \nabla_{a}\, loss$$

where $a$ represents the connection weight of a deep learning operator in the architecture parameter of a certain network layer, $W_A'$ represents the value of the architecture parameter $a$ after the last training iteration, $\nabla_{a}\, loss$ represents the gradient information of the architecture parameter $a$, and $W_A$ represents the updated value of the architecture parameter $a$.
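Putting the two updates together, the sketch below shows one training step that first adjusts the network weights w and then the architecture parameters a from the gradients of the overall loss, reusing the overall_loss sketch above. Plain SGD, the supernet(x, alphas) call signature, and the use of the same batch for both updates are assumptions; differentiable-NAS practice often uses separate training and validation batches for the two updates.

```python
import torch

def train_step(supernet, alphas, latencies, batch, w_opt, a_opt):
    # w_opt = torch.optim.SGD(supernet.parameters(), lr=0.025)  (illustrative)
    # a_opt = torch.optim.SGD(alphas, lr=3e-4)                  (illustrative)
    x, y = batch

    # 1. update the network parameters w of each layer (the W_N formula)
    w_opt.zero_grad()
    overall_loss(supernet(x, alphas), y, alphas, latencies,
                 net_params=list(supernet.parameters())).backward()
    w_opt.step()

    # 2. update the architecture parameters a of each layer (the W_A formula)
    a_opt.zero_grad()
    loss = overall_loss(supernet(x, alphas), y, alphas, latencies,
                        net_params=list(supernet.parameters()))
    loss.backward()
    a_opt.step()
    return loss.item()
```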
Therefore, model parameters in the super network are updated according to the delay loss function and the network loss function, and the target neural network structure is guaranteed to meet the requirements of delay consistency and network precision.
The solutions provided in the embodiments of the present application have been described above mainly from the perspective of methods. To implement the above functions, corresponding hardware structures and/or software modules are included for performing each function. Those skilled in the art will readily appreciate that the algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementations should not be considered beyond the scope of the present application.
Fig. 7 is a schematic structural diagram of a neural network structure search apparatus 70 according to an embodiment of the present disclosure. The neural network structure searching apparatus 70 is configured to execute the neural network structure searching method shown in any one of fig. 3, 5, and 6. The neural network structure searching apparatus 70 may include an acquiring unit 71, a determining unit 72, a training unit 73, and an updating unit 74.
The acquiring unit 71 is configured to acquire a super network according to the target task, where the super network includes a plurality of network layers, each network layer includes a plurality of nodes, and any two nodes in one network layer are connected by a deep learning operator; and to acquire the time delay of each deep learning operator in the super network running on the electronic device. For example, in conjunction with fig. 3, the acquiring unit 71 may be configured to perform steps 301 and 302.
The determining unit 72 is configured to determine the time delay loss function of the super network according to the time delay of each deep learning operator running on the electronic device, as acquired by the acquiring unit 71. For example, in conjunction with fig. 3, the determining unit 72 may be configured to perform step 303.
The training unit 73 is configured to perform a training operation on the super network acquired by the acquiring unit 71. For example, in conjunction with fig. 3, the training unit 73 may be configured to perform the training operation on the super network described in step 304.
The updating unit 74 is configured to update the model parameters of the super network according to the time delay loss function determined by the determining unit 72 and the network loss function obtained during training by the training unit 73, until the updated super network meets the conditions for running the target task on the electronic device, where the model parameters include the architecture parameters of each of the plurality of network layers. For example, in conjunction with fig. 3, the updating unit 74 may be configured to perform the part of step 304 that updates the model parameters of the super network according to the time delay loss function and the network loss function obtained in the training process.
The determining unit 72 is further configured to determine the target neural network structure according to the architecture parameters of each network layer as updated by the updating unit 74. For example, in conjunction with fig. 3, the determining unit 72 may be configured to perform step 305.
Optionally, the determining unit 72 is specifically configured to: determining a network embedding coefficient corresponding to each deep learning operator according to the corresponding relation between the pre-stored operators and the network embedding coefficients; determining the product of the time delay of each deep learning operator in the operation of the electronic equipment and the network embedding coefficient corresponding to the deep learning operator, and determining the sum of all the products; and determining a time delay loss function according to the sum value and the time delay consistency coefficient.
Optionally, the architecture parameter of the network layer includes a connection weight of each deep learning operator of the network layer, and the determining unit 72 is specifically configured to: acquiring a connection weight of which the numerical value meets a preset condition in the updated architecture parameters of each network layer; and determining a target neural network structure according to all the acquired connection weights.
Optionally, the updating unit 74 is specifically configured to: determine the overall loss function of the super network according to the time delay loss function and the network loss function; and update the model parameters of the super network according to the overall loss function.
Optionally, the updating unit 74 is specifically configured to: determining gradient information of each model parameter according to the overall loss function, wherein the gradient information is used for expressing an adjusting coefficient of the corresponding model parameter; and adjusting the model parameters according to the gradient information of each model parameter.
Of course, the neural network structure search device 70 provided in the embodiment of the present application includes, but is not limited to, the above modules.
In actual implementation, the acquiring unit 71, the determining unit 72, the training unit 73, and the updating unit 74 may be implemented by the processor shown in fig. 2. For the specific implementation process, reference may be made to the description of the neural network structure search method shown in fig. 3, fig. 5, or fig. 6, which is not repeated here.
Another embodiment of the present application further provides a computer-readable storage medium, in which computer instructions are stored, and when the computer instructions are executed on a neural network structure search apparatus, the neural network structure search apparatus is caused to perform the steps performed by the neural network structure search apparatus in the method flow shown in the above method embodiment.
Another embodiment of the present application further provides a chip system, which is applied to a neural network structure search apparatus. The chip system includes one or more interface circuits, and one or more processors. The interface circuit and the processor are interconnected by a line. The interface circuit is configured to receive a signal from a memory of the neural network structure search apparatus and send the signal to the processor, the signal including computer instructions stored in the memory. When the processor executes the computer instructions, the neural network structure search device executes the steps executed by the neural network structure search device in the method flow shown in the above method embodiment.
In another embodiment of the present application, a computer program product is also provided, which includes computer instructions that, when executed on a neural network structure search apparatus, cause the neural network structure search apparatus to perform the steps performed by the neural network structure search apparatus in the method flow shown in the above method embodiment.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented using software, the embodiments may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)).
The foregoing is only illustrative of the present application. Those skilled in the art can conceive of changes or substitutions based on the specific embodiments provided in the present application, and all such changes or substitutions are intended to be included within the scope of the present application.

Claims (12)

1. A neural network structure search method, comprising:
acquiring a hyper-network according to a target task, wherein the hyper-network comprises a plurality of network layers, each network layer comprises a plurality of nodes, and any two nodes of one network layer are connected through a deep learning operator;
acquiring the running time delay of each deep learning operator in the hyper-network in the electronic equipment;
determining a time delay loss function of the super network according to the running time delay of each deep learning operator in the electronic equipment;
executing training operation on the hyper-network, and updating model parameters of the hyper-network according to the delay loss function and the network loss function obtained in the training process until the updated hyper-network meets the condition that the target task runs on the electronic equipment; the model parameters include architecture parameters for each of the plurality of network layers;
and determining a target neural network structure according to the updated architecture parameters of each network layer.
2. The neural network structure searching method according to claim 1, wherein the determining the time delay loss function of the hyper-network according to the running time delay of each deep learning operator in the electronic equipment comprises:
determining a network embedding coefficient corresponding to each deep learning operator according to the corresponding relation between the pre-stored operators and the network embedding coefficients;
determining the product of the running time delay of each deep learning operator in the electronic equipment and the network embedding coefficient corresponding to the deep learning operator, and determining the sum of all the products;
and determining the time delay loss function according to the sum and the time delay consistency coefficient.
3. The neural network structure searching method according to claim 1 or 2, wherein the architecture parameters of the network layer include connection weights of each deep learning operator of the network layer, and the determining the target neural network structure according to the updated architecture parameters of each network layer comprises:
acquiring a connection weight of which the numerical value meets a preset condition in the updated architecture parameters of each network layer;
and determining the target neural network structure according to all the acquired connection weights.
4. The neural network structure searching method according to any one of claims 1 to 3, wherein the updating the model parameters of the hyper-network according to the time delay loss function and the network loss function obtained in the training process comprises:
determining an overall loss function of the hyper-network according to the time delay loss function and the network loss function;
and updating the model parameters of the hyper-network according to the overall loss function.
5. The neural network structure searching method according to claim 4, wherein the updating the model parameters of the hyper-network according to the overall loss function comprises:
determining gradient information of each model parameter according to the overall loss function, wherein the gradient information is used for representing an adjusting coefficient of the corresponding model parameter;
and adjusting the model parameters according to the gradient information of each model parameter.
6. A neural network structure search apparatus, comprising:
an acquiring unit, used for acquiring a hyper-network according to a target task, wherein the hyper-network comprises a plurality of network layers, each network layer comprises a plurality of nodes, and any two nodes of one network layer are connected through a deep learning operator, and for acquiring the running time delay of each deep learning operator in the hyper-network in the electronic equipment;
a determining unit, used for determining a time delay loss function of the hyper-network according to the running time delay, acquired by the acquiring unit, of each deep learning operator in the electronic equipment;
a training unit, used for performing a training operation on the hyper-network acquired by the acquiring unit;
an updating unit, used for updating the model parameters of the hyper-network according to the time delay loss function determined by the determining unit and the network loss function obtained in the training process of the training unit until the updated hyper-network meets the condition that the target task runs on the electronic equipment, wherein the model parameters include architecture parameters of each of the plurality of network layers;
the determining unit is further configured to determine a target neural network structure according to the architecture parameter of each network layer updated by the updating unit.
7. The apparatus according to claim 6, wherein the determining unit is specifically configured to:
determining a network embedding coefficient corresponding to each deep learning operator according to the corresponding relation between the pre-stored operators and the network embedding coefficients;
determining the product of the running time delay of each deep learning operator in the electronic equipment and the network embedding coefficient corresponding to the deep learning operator, and determining the sum of all the products;
and determining the time delay loss function according to the sum and the time delay consistency coefficient.
8. The apparatus according to claim 6 or 7, wherein the architecture parameters of the network layer include connection weights of each deep learning operator of the network layer, and the determining unit is specifically configured to:
acquiring a connection weight of which the numerical value meets a preset condition in the updated architecture parameters of each network layer;
and determining the target neural network structure according to all the acquired connection weights.
9. The neural network structure search apparatus according to any one of claims 6 to 8, wherein the updating unit is specifically configured to:
determining an overall loss function of the hyper-network according to the time delay loss function and the network loss function;
and updating the model parameters of the hyper-network according to the overall loss function.
10. The apparatus according to claim 9, wherein the updating unit is specifically configured to:
determining gradient information of each model parameter according to the overall loss function, wherein the gradient information is used for representing an adjusting coefficient of the corresponding model parameter;
and adjusting the model parameters according to the gradient information of each model parameter.
11. A neural network structure search apparatus, comprising a memory and a processor; the memory and the processor are coupled; the memory is used for storing computer program code, the computer program code comprising computer instructions; when the processor executes the computer instructions, the neural network structure search apparatus performs the neural network structure search method of any one of claims 1 to 5.
12. A computer-readable storage medium characterized by comprising computer instructions which, when run on a neural network structure search apparatus, cause the neural network structure search apparatus to perform the neural network structure search method of any one of claims 1 to 5.
CN202011043055.2A 2020-09-28 2020-09-28 Neural network structure searching method and device Pending CN114330699A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011043055.2A CN114330699A (en) 2020-09-28 2020-09-28 Neural network structure searching method and device
PCT/CN2021/120434 WO2022063247A1 (en) 2020-09-28 2021-09-24 Neural architecture search method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011043055.2A CN114330699A (en) 2020-09-28 2020-09-28 Neural network structure searching method and device

Publications (1)

Publication Number Publication Date
CN114330699A true CN114330699A (en) 2022-04-12

Family

ID=80844966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011043055.2A Pending CN114330699A (en) 2020-09-28 2020-09-28 Neural network structure searching method and device

Country Status (2)

Country Link
CN (1) CN114330699A (en)
WO (1) WO2022063247A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667056B (en) * 2020-06-05 2023-09-26 北京百度网讯科技有限公司 Method and apparatus for searching model structures
CN114700957B (en) * 2022-05-26 2022-08-26 北京云迹科技股份有限公司 Robot control method and device with low computational power requirement of model
CN114972334B (en) * 2022-07-19 2023-09-15 杭州因推科技有限公司 Pipe flaw detection method, device and medium
CN115358379B (en) * 2022-10-20 2023-01-10 腾讯科技(深圳)有限公司 Neural network processing method, neural network processing device, information processing method, information processing device and computer equipment
CN115829017B (en) * 2023-02-20 2023-05-23 之江实验室 Method, device, medium and equipment for processing data based on core particles

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6890741B2 (en) * 2019-03-15 2021-06-18 三菱電機株式会社 Architecture estimator, architecture estimation method, and architecture estimation program
CN111428854A (en) * 2020-01-17 2020-07-17 华为技术有限公司 Structure searching method and structure searching device
CN111325338B (en) * 2020-02-12 2023-05-05 暗物智能科技(广州)有限公司 Neural network structure evaluation model construction and neural network structure searching method
CN111353601A (en) * 2020-02-25 2020-06-30 北京百度网讯科技有限公司 Method and apparatus for predicting delay of model structure

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115017377A (en) * 2022-08-05 2022-09-06 深圳比特微电子科技有限公司 Method, device and computing equipment for searching target model
CN116051964A (en) * 2023-03-30 2023-05-02 阿里巴巴(中国)有限公司 Deep learning network determining method, image classifying method and device
CN116684480A (en) * 2023-07-28 2023-09-01 支付宝(杭州)信息技术有限公司 Method and device for determining information push model and method and device for information push
CN116684480B (en) * 2023-07-28 2023-10-31 支付宝(杭州)信息技术有限公司 Method and device for determining information push model and method and device for information push

Also Published As

Publication number Publication date
WO2022063247A1 (en) 2022-03-31

Similar Documents

Publication Publication Date Title
CN114330699A (en) Neural network structure searching method and device
CN115456160A (en) Data processing method and data processing equipment
CN107038064B (en) Virtual machine management method and device and storage medium
CN110956202B (en) Image training method, system, medium and intelligent device based on distributed learning
CN113505883A (en) Neural network training method and device
CN113128678A (en) Self-adaptive searching method and device for neural network
EP4303767A1 (en) Model training method and apparatus
CN111428854A (en) Structure searching method and structure searching device
CN113869496A (en) Acquisition method of neural network, data processing method and related equipment
CN113379045B (en) Data enhancement method and device
CN113553138A (en) Cloud resource scheduling method and device
CN116684330A (en) Traffic prediction method, device, equipment and storage medium based on artificial intelligence
CN115730555A (en) Chip layout method, device, equipment and storage medium
CN116090536A (en) Neural network optimization method, device, computer equipment and storage medium
WO2022100607A1 (en) Method for determining neural network structure and apparatus thereof
CN114492742A (en) Neural network structure searching method, model issuing method, electronic device, and storage medium
KR102561799B1 (en) Method and system for predicting latency of deep learning model in device
CN115412401B (en) Method and device for training virtual network embedding model and virtual network embedding
CN115907041A (en) Model training method and device
WO2022252694A1 (en) Neural network optimization method and apparatus
CN113961765B (en) Searching method, searching device, searching equipment and searching medium based on neural network model
WO2022127603A1 (en) Model processing method and related device
WO2022052647A1 (en) Data processing method, neural network training method, and related device
Ahn et al. Scissionlite: Accelerating distributed deep neural networks using transfer layer
WO2021238734A1 (en) Method for training neural network, and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination