CN112101525A - Method, device and system for designing neural network through NAS - Google Patents

Method, device and system for designing neural network through NAS

Info

Publication number
CN112101525A
Authority
CN
China
Prior art keywords
neural network
hardware
architecture
parameters
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010936265.8A
Other languages
Chinese (zh)
Inventor
毛伟
余浩
代柳瑶
朱雪娟
常成
王宇航
李凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Southern University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN202010936265.8A
Publication of CN112101525A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Abstract

The invention discloses a method, a device and a system for designing a neural network through NAS (Neural Architecture Search). The method comprises the following steps: after a generation request is received, a neural network architecture search is performed based on the constructed search space and search strategy to obtain a plurality of neural network architectures; the plurality of neural network architectures are run on hardware based on the training set and the verification set while hardware usage information is monitored, and the evaluation results obtained when the architectures are respectively assigned different neural network parameters are determined according to a preset evaluation scheme, based on the verification results corresponding to the verification set and the hardware usage information; and a target neural network architecture and corresponding target neural network parameters are determined among the plurality of neural network architectures according to the optimal evaluation result, and a corresponding neural network model is generated based on the target neural network architecture and the target neural network parameters. Because the neural network model is determined according to both data processing capability and hardware usage, data processing is accelerated and hardware wear is reduced.

Description

Method, device and system for designing neural network through NAS
Technical Field
The embodiment of the invention relates to computer technology, in particular to a method, a device and a system for designing a neural network through NAS.
Background
With the development of technologies such as big data, the Internet of Things and mobile edge computing, Artificial Intelligence (AI) technology represented by deep neural networks is widely applied in fields such as image processing. The number of edge terminal devices has grown rapidly, and the amount of data they generate has reached the zettabyte (ZB) level.
As the data volume at AI terminals grows, the computational load on the edge side increases. Existing deep-learning-based data processing frameworks mainly convert large data sets into model parameters and use a Neural Architecture Search (NAS) algorithm to traverse candidate architectures under hardware overhead constraints, finding the neural network architecture with the most suitable performance and thereby completing inference analysis.
Existing AI neural network accelerator devices suffer from long processing times and low precision when handling large amounts of data; their computational performance cannot keep up with current data volumes, resulting in high hardware power consumption and low energy efficiency.
Disclosure of Invention
The invention provides a method, a device and a system for designing a neural network through NAS (Neural Architecture Search), which can reduce hardware power consumption and improve energy efficiency.
In a first aspect, an embodiment of the present invention provides a method for designing a neural network by a NAS, including:
after receiving the generation request, searching a neural network architecture based on the constructed search space and the search strategy to obtain a plurality of neural network architectures;
running the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and determining, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information;
and determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
In a second aspect, an embodiment of the present invention further provides an apparatus for designing a neural network by a NAS, including: a search module, an execution module, and a generation module, wherein,
the searching module is used for searching the neural network architecture based on the constructed searching space and the searching strategy after receiving the generation request to obtain a plurality of neural network architectures;
the execution module is used for running the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and for determining, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information;
and the generating module is used for determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
In a third aspect, embodiments of the present invention further provide a system for designing a neural network by NAS which, when executed by a computer processor, performs the method for designing a neural network by NAS described in the first aspect.
The embodiments of the invention provide a method, a device and a system for designing a neural network through NAS (Neural Architecture Search). After a generation request is received, a neural network architecture search is performed based on the constructed search space and search strategy to obtain a plurality of neural network architectures. The architectures are run on hardware based on the training set and the verification set while hardware usage information is monitored, and the evaluation results of the different architectures under their respective parameters are obtained according to a preset evaluation scheme, based on the verification results corresponding to the verification set and the hardware usage information. A target neural network architecture and corresponding target neural network parameters are then determined among the plurality of neural network architectures according to the optimal evaluation result, and a corresponding neural network model is generated based on the target neural network architecture and the target neural network parameters. With this scheme, the neural network model can be determined according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model degrades data processing efficiency, accelerating data processing, and reducing hardware wear.
Drawings
FIG. 1 is a flowchart of a method for designing a neural network by a NAS according to an embodiment of the present invention;
FIG. 2 is a diagram of the mutual mapping between neural network architecture search and hardware;
FIG. 3 is a process diagram of compiling the neural network and then invoking hardware-recognized RISC-V instructions to map it onto the corresponding hardware resources;
FIG. 4 is a flowchart of a method for designing a neural network by a NAS according to a second embodiment of the present invention;
FIG. 5 is a flowchart illustrating step 420 of a method for designing a neural network by a NAS according to a second embodiment of the present invention;
FIG. 6 is a diagram of a spatiotemporally multiplexed systolic data flow based on in-memory computing according to the second embodiment of the present invention;
FIG. 7 is a flowchart, according to the second embodiment of the present invention, for determining a target neural network architecture and corresponding target neural network parameters among a plurality of neural network architectures according to the optimal evaluation result;
FIG. 8 is a flowchart of an implementation of a method for designing a neural network by a NAS according to the second embodiment of the present invention;
FIG. 9 is a schematic diagram of an apparatus for designing a neural network by a NAS according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a flowchart of a method for designing a neural network by a NAS according to the first embodiment of the present invention. The method is applicable to scenarios where a large number of computation tasks must be processed in real time. It may be executed by an AI terminal, whose workload may be allocated by a hardware accelerator, so that the AI terminal's hardware utilization and model accuracy can be evaluated and the neural network model further optimized. The method specifically includes the following steps:
Step 110: after receiving a generation request, perform a neural network architecture search based on the constructed search space and search strategy to obtain a plurality of neural network architectures.
When the AI terminal receives a large amount of data, the computational load on the edge side increases and hardware wear grows, so a neural network model needs to be rebuilt to process the data in time. Specifically, the AI terminal may be an image classification terminal. Upon receiving a neural network model generation request issued by the AI terminal, neural network architectures are searched across different network layers based on a preset search strategy, yielding a plurality of neural network architectures.
Specifically, the generation request may contain a preset search strategy. The search space comprises the list of candidate network architectures. The search strategy may use a differentiable method together with gradient descent to speed up the search. The architecture search can be realized with the NAS algorithm, which, based on design requirements and the corresponding hardware platform, traverses candidate architectures under hardware overhead constraints to find the neural network architecture with the most suitable performance.
In particular, the search strategy may relax the architecture parameters of the neural network architecture to continuous values through a differentiable method, so that the search space becomes differentiable. In the continuous search space, the architecture search then proceeds by gradient descent, for example iteratively approaching the minimum of the loss function step by step, so that the minimal loss function can be obtained. Conversely, in practical applications, data processing efficiency or hardware utilization can also be maximized step by step through gradient ascent.
When searching the NAS neural network architecture according to the preset search strategy, layer-by-layer precision selection and branch selection may be performed on the network layers. Specifically, the network-layer precision may include the network weights, i.e., the weight bit width of each layer. For example, in a five-layer network whose corresponding weight bit values are 2, 4, 6, 4 and 8, the first layer uses 2-bit weights, the second layer 4-bit weights, and so on.
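For illustration only, the sketch below shows one way such a layer-wise precision assignment could be represented, using the five-layer example above; the dictionary layout and layer names are our assumptions, not part of the patent.

```python
# Hypothetical encoding of one candidate point in a layer-wise
# mixed-precision search space (weight bit widths per layer).
candidate_precision = {
    "layer1": 2,  # 2-bit weights
    "layer2": 4,
    "layer3": 6,
    "layer4": 4,
    "layer5": 8,  # 8-bit weights
}
```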
Step 120: run the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and determine, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information.
The plurality of neural network architectures obtained in step 110 may be used to process the data contained in the training set and the verification set; running this processing on hardware yields the hardware usage information. An architecture's data processing capability and the wear it imposes on hardware can both be used to evaluate it: processing the training set on each architecture yields the different neural network parameters for that architecture, and processing the verification set then yields the corresponding verification result. Fig. 2 is a diagram of the mutual mapping between neural network architecture search and hardware. As shown in fig. 2, hardware usage information may be received while data is processed, and the evaluation result for an architecture given its neural network parameters is determined by combining the verification result obtained on the verification set with the received hardware usage information.
For example, the processing of data by the neural network architecture may include convolution, pooling, padding, and the like.
Specifically, the training set includes a sample data set for training, which is mainly used to train parameters in the neural network architecture. The validation set includes a set of sample data for validating performance of the neural network architecture. After training of different neural network architectures on the training set is finished, the performance of each neural network architecture is compared and judged through the verification set. The different neural network architectures may include neural network architectures corresponding to different parameters, and may also refer to neural network architectures with different architectures. The verification result may include data processing capability, and specifically may include data processing accuracy, data loss rate, and the like.
For example, when the AI terminal is an image classification terminal, the classification loss of the corresponding neural network architecture can be obtained after classification from the probability distribution mapping each image to the predicted output classes; specifically, this may be the cross entropy of the image classification. Cross entropy measures the dissimilarity between two probability distributions.
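For reference, the standard definition of the cross entropy between a true distribution p and a predicted distribution q over classes x is:

H(p, q) = -\sum_{x} p(x) \log q(x)

For a one-hot label with true class c, this reduces to -\log q(c), the usual classification loss.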
Fig. 3 shows the process of compiling the neural network and then invoking hardware-recognized RISC-V instructions to map it onto the corresponding hardware resources. As shown in fig. 3, based on the RISC-V instructions and the neural network architecture and its parameters, the instruction set corresponding to the architecture is mapped onto the hardware for execution. This enables co-optimization of the neural network architecture and the hardware: adjusting the architecture or its parameters can raise hardware utilization and thereby improve data processing efficiency. After the neural network is obtained, a software compilation environment maps it into instructions the hardware can recognize, including a precision-setting instruction (setup), data access control instructions (ld-mem/st-mem), an in-memory computing instruction (cim-ld-mem), and the like, which control the architecture and circuit modules on the hardware to run the neural network.
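Purely as an illustration of this compile-and-map step, the sketch below emits the instruction mnemonics named above (setup, ld-mem, st-mem, cim-ld-mem) for each network layer; the program structure, operand fields, and function name are hypothetical and not defined by the patent.

```python
# Hedged sketch: compile each network layer into the hardware-recognized
# instruction mnemonics named in the text. Operand fields are illustrative.
def compile_layer(layer):
    return [
        ("setup",      {"weight_bits": layer["weight_bits"]}),  # precision setting
        ("ld-mem",     {"src": "input_buffer"}),                # data access control
        ("cim-ld-mem", {"op": layer["op"]}),                    # in-memory computing
        ("st-mem",     {"dst": "output_buffer"}),               # data access control
    ]

layers = [{"op": "conv3x3", "weight_bits": 4},
          {"op": "pool2x2", "weight_bits": 8}]
program = [ins for layer in layers for ins in compile_layer(layer)]
```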
The hardware usage information may include the hardware utilization rate, and it can be fed back to the neural network architecture search through the instruction mapping as a search constraint.
The verification result on the verification set indicates the data processing capability of the corresponding neural network architecture, and the hardware usage information indicates the wear the architecture imposes on the hardware. Together, the data processing capability and hardware wear information enable evaluation of an architecture's performance given its neural network parameters. After the architecture search, the neural network architecture is evaluated based on the real-time data and the evaluation scheme before proceeding to the subsequent steps.
Step 130: determine a target neural network architecture and corresponding target neural network parameters among the plurality of neural network architectures according to the optimal evaluation result, and generate a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
The evaluation result comprises the verification result of the verification set on the neural network architecture and the corresponding hardware usage. The evaluation results can be fed back to the search strategy to guide the continued search toward architectures with higher data processing capability, and the optimal verification result and hardware usage determine the target neural network architecture best suited to processing the current data. The target neural network parameters are obtained while training the target neural network architecture on the training set, and together the target architecture and parameters generate the corresponding neural network model.
Specifically, the neural network parameters may include convolution kernel parameters and strides of the neural network architecture, and may further include the weight bit values of the network layers. Given the neural network parameters, the configuration of the neural network architecture is determined, defining the neural network model corresponding to that architecture and those parameters.
This embodiment of the invention provides a method for designing a neural network through NAS (Neural Architecture Search). After a generation request is received, a neural network architecture search is performed based on the constructed search space and search strategy to obtain a plurality of neural network architectures. The architectures are run on hardware based on the training set and the verification set while hardware usage information is monitored, and the evaluation results obtained when they are respectively assigned different neural network parameters are determined according to a preset evaluation scheme, based on the verification results corresponding to the verification set and the hardware usage information. A target neural network architecture and corresponding target neural network parameters are then determined according to the optimal evaluation result, and a corresponding neural network model is generated from them. With this scheme, the neural network model is determined according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model degrades data processing, accelerating data processing, and reducing hardware wear.
Example two
Fig. 4 is a flowchart of a method for designing a neural network by a NAS according to a second embodiment of the present invention, where the second embodiment is optimized based on the foregoing embodiments. As shown in fig. 4, a second embodiment of the present invention provides a method for designing a neural network by a NAS, including the following steps:
step 410, constructing a search space based on a preset network element stacking mode.
Specifically, the search space includes a set of searchable NAS neural network architectures, and traversing the architecture can find a neural network architecture with appropriate performance.
The predetermined network element stacking pattern may include a MobileNet-V2 element.
Preferably, in order to find a suitable neural network architecture, the search space may include a set of neural network architectures searched in a plurality of sets of network layers with different weight bit values.
Step 420: perform a neural network architecture search in the search space through a predefined search strategy.
In particular, the search strategy may comprise a differentiable method, searching the space with a gradient-descent-based strategy.
It should be noted that the search space is normally discrete, i.e., not differentiable. By introducing a differentiable method that relaxes the network's architecture parameters to continuous values through a normalized exponential function (softmax), the search space becomes differentiable, and gradient descent can then search it, greatly accelerating the search.
Gradient descent can solve, for example, least squares problems, and is one of the most commonly used methods for fitting the model parameters of machine learning algorithms, i.e., for unconstrained optimization. To find the minimum of a loss function, gradient descent steps are iterated until the minimized loss function and the corresponding model parameter values are obtained. In machine learning, the basic variants include stochastic gradient descent and batch gradient descent.
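In standard form, each gradient descent iteration moves the model parameters \theta against the gradient of the loss L with learning rate \eta:

\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)

Stochastic gradient descent estimates \nabla_{\theta} L on a single sample or mini-batch, while batch gradient descent computes it over the full training set.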
Fig. 5 is a flowchart of step 420 in a method for designing a neural network by a NAS according to a second embodiment of the present invention, and in an implementation, as shown in fig. 5, step 420 may specifically include:
and step 421, continuously processing the architecture parameters of the neural network architecture by the normalized exponential function.
In particular, the architecture parameters of the neural network architecture are continuous, making the search space differentiable. The method has the advantages that the search space can be searched by adopting a gradient descent method, and the search speed is greatly increased.
Step 422: search for a neural network architecture in the search space using gradient descent.
Specifically, gradient descent finds, within the search space, the neural network architecture that minimizes the objective function; in this embodiment the objective function may be the loss function.
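As a concrete illustration of steps 421 and 422, here is a minimal PyTorch-style sketch (our own, not from the patent) that relaxes the choice among candidate operations with a normalized exponential function and then updates the architecture parameters by gradient descent; all class and variable names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Step 421: softmax-weighted mixture of candidate operations,
    turning the discrete choice into a differentiable weighted sum."""
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.alpha = nn.Parameter(torch.zeros(len(ops)))  # architecture parameters

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)  # normalized exponential function
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Step 422: search by gradient descent on the architecture parameters.
mixed = MixedOp([nn.Conv2d(3, 3, 3, padding=1), nn.MaxPool2d(3, 1, 1)])
opt = torch.optim.SGD([mixed.alpha], lr=0.1)
x, target = torch.randn(1, 3, 8, 8), torch.randn(1, 3, 8, 8)
loss = F.mse_loss(mixed(x), target)  # stand-in for the validation loss
loss.backward()                      # gradient of the loss w.r.t. alpha
opt.step()                           # descend along that gradient
```

After the search converges, the operation with the largest architecture weight would typically be retained as the discrete choice.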
Step 430: train the plurality of neural network architectures on hardware using the training set to obtain the corresponding intermediate neural network models and hardware usage information when the architectures are respectively assigned different neural network parameters.
Specifically, the training set is used to train each neural network architecture to obtain its neural network parameters: a loss is computed on the network outputs, and each neural network parameter in the architecture is then updated by descending along the gradient of the calculated loss. The neural network parameters are obtained through this training.
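A hedged sketch of this weight-training step, assuming a PyTorch-style setup; the model, data loader, and criterion (e.g. nn.CrossEntropyLoss()) are placeholders rather than names from the patent:

```python
import torch

def train_weights(model, train_loader, criterion, lr=0.01, epochs=1):
    """Fit the neural network parameters of one candidate architecture
    on the training set, yielding an intermediate model (step 430)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in train_loader:
            opt.zero_grad()
            loss = criterion(model(x), y)  # e.g. cross entropy on the labels
            loss.backward()                # gradient of the calculated loss
            opt.step()                     # descend along that gradient
    return model
```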
It should be noted that when an intermediate neural network model performs mixed-precision convolution, as shown in fig. 6, an in-memory computing architecture combining a global cache with local caches may be designed, and a systolic data-stream multiplexing method adopted to maximize on-chip data reuse.
Specifically, a systolic array is a network of tightly coupled processing units constrained by a data flow relationship: each node exchanges data with one or more neighboring nodes, intermediate results can be stored in the processing units or passed downstream, and storage combines local caches with a global cache. Systolic-array computation can further optimize the neural network architecture: the input buffer and the weight buffer about to take part in the convolution are expanded into two matrices, on which matrix partial-sum operations are then performed. Long parallel data paths are thereby converted into short data paths through each processing unit, reducing data overlap and easing data storage. The input buffer, the weight buffer and the convolution operation can all use time-division multiplexing: different signals are interleaved into different time slots and transmitted along the same channel, and at the receiving end each slot's signal is extracted and restored to the original, so that multiple signals share one channel.
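To make the buffer-unrolling concrete, the NumPy sketch below (our own simplification: single channel, stride 1, no padding) expands the input and weight buffers into matrices so the convolution becomes a single matrix product of partial sums:

```python
import numpy as np

def conv2d_as_gemm(x, w):
    """Unroll a convolution into a matrix multiply (im2col-style),
    mirroring the expansion of input and weight buffers described above."""
    H, W = x.shape
    K = w.shape[0]
    oh, ow = H - K + 1, W - K + 1
    # "Input buffer" matrix: every KxK patch flattened into one row.
    cols = np.stack([x[i:i + K, j:j + K].ravel()
                     for i in range(oh) for j in range(ow)])
    # "Weight buffer" matrix: the kernel flattened into one column.
    return (cols @ w.ravel()).reshape(oh, ow)

out = conv2d_as_gemm(np.random.rand(8, 8), np.random.rand(3, 3))
```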
Step 440: feed the verification set into the plurality of intermediate neural network models on hardware to obtain the corresponding verification results.
Specifically, feeding the verification set into the intermediate neural network models yields their verification results. A verification result may include the model's data processing capability, such as data accuracy and data loss rate.
Step 450: determine the evaluation results based on the verification results and the hardware usage information.
While the intermediate neural network models process the verification set, they are evaluated according to the evaluation scheme to obtain the evaluation results.
It should be noted that, in addition to the verification result and the hardware usage information, the evaluation result may be determined according to the energy consumption of the intermediate neural network model.
Preferably, in order to make the evaluation results of multiple intermediate neural network models comparable, determining the evaluation result based on the verification result and the hardware usage information may include: determining the evaluation result according to the model accuracy contained in the verification result and the hardware utilization rate contained in the hardware usage information.
The model accuracy may include the intermediate neural network model's data processing accuracy and data loss rate; the hardware utilization rate may be the utilization of the hardware while the intermediate neural network model processes the data.
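As a hedged sketch of what such a preset evaluation scheme could look like (the weighted-sum form and the weights are our assumption; the patent does not fix a formula):

```python
def evaluate(model_accuracy, hw_utilization, w_acc=0.7, w_hw=0.3):
    """Combine model accuracy and hardware utilization (both in [0, 1])
    into one evaluation score; higher is better."""
    return w_acc * model_accuracy + w_hw * hw_utilization

# Picking the optimal evaluation result among (accuracy, utilization) pairs:
results = [(0.91, 0.60), (0.89, 0.85), (0.93, 0.40)]
best = max(results, key=lambda r: evaluate(*r))
```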
Step 460: determine a target neural network architecture and corresponding target neural network parameters among the plurality of neural network architectures according to the optimal evaluation result, and generate a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
Fig. 7 is a flowchart for determining a target neural network architecture and corresponding target neural network parameters in a plurality of neural network architectures according to an optimal evaluation result in the evaluation results provided in the second embodiment of the present invention, and in an implementation, as shown in fig. 7, the method may specifically include:
step 461, determining the optimal evaluation result by maximizing the model accuracy and the hardware utilization rate.
In particular, maximizing model accuracy and hardware utilization may be achieved by normalizing an exponential function.
The model accuracy rate may indicate the data processing capability of the neural network architecture, and specifically may include data processing efficiency, data loss rate, and the like. High data processing efficiency and low data loss rate can correspond to high model accuracy, and accordingly, the neural network architecture has stronger data processing capability.
Hardware utilization may indicate the wear on the hardware as the neural network architecture processes data. A high hardware utilization rate indicates that this wear is spread relatively evenly across the hardware, reducing localized hardware wear.
The maximum of model accuracy and hardware utilization can be computed with a least squares method; in practical applications it can also be found by gradient ascent.
Step 462: determine the neural network architecture corresponding to the optimal evaluation result as the target neural network architecture, and determine the architecture parameters and weight bit values corresponding to the optimal evaluation result as the target neural network parameters.
The neural network parameters may include architecture parameters, i.e., structural parameters in the search space such as convolution kernel parameters and strides, as well as weight bit values.
The optimal evaluation result may include the model accuracy and the hardware utilization rate. Once these are determined, the corresponding neural network architecture yields the target neural network architecture and the target neural network parameters.
The second embodiment of the invention provides a method for designing a neural network through NAS. A search space is constructed by stacking preset network units, and a neural network architecture search is performed within it by a predefined search strategy. The plurality of neural network architectures are trained on hardware using the training set, yielding the corresponding intermediate neural network models and hardware usage information when the architectures are respectively assigned different neural network parameters. The verification set is then fed into the intermediate neural network models on hardware to obtain the corresponding verification results, and the evaluation results are determined from the verification results and the hardware usage information. Finally, a target neural network architecture and corresponding target neural network parameters are determined according to the optimal evaluation result, and a corresponding neural network model is generated from them. With this scheme, the neural network model is determined according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model degrades data processing efficiency, improving data processing efficiency and reducing hardware wear.
Fig. 8 is a flowchart of an implementation of a method for designing a neural network by a NAS according to the second embodiment of the present invention, illustrating how the method is carried out. As shown in fig. 8, after the generation request is received, a neural network architecture search is performed in the search space based on the preset search strategy; the searched architectures comprise network relationship sets determined by a plurality of network units based on the network quantization bit widths. Neural network architecture parameters are then trained on the hardware resources based on the training set and the verification set to obtain the neural network parameters. At the same time, the performance of each trained architecture is evaluated through the verification results on the verification set and the hardware usage. On that basis, either the neural network model is determined, or the architecture search is repeated for further training and performance evaluation, finally yielding the determined neural network model.
EXAMPLE III
Fig. 9 is a schematic diagram of an apparatus for designing a neural network by a NAS according to a third embodiment of the present invention. As shown in fig. 9, the apparatus may include: a search module 910, an execution module 920, and a generation module 930, wherein:
the searching module 910 is configured to, after receiving the generation request, perform neural network architecture search based on the constructed search space and the search policy to obtain a plurality of neural network architectures;
an execution module 920, configured to run the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and to determine, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information;
a generating module 930, configured to determine a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to an optimal evaluation result in the evaluation results, and generate a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
After receiving a generation request, the apparatus for designing a neural network through NAS provided by this embodiment performs a neural network architecture search based on the constructed search space and search strategy to obtain a plurality of neural network architectures; it runs them on hardware based on the training set and the verification set while monitoring hardware usage information, and determines, according to a preset evaluation scheme, the evaluation results obtained when the architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information; it then determines a target neural network architecture and corresponding target neural network parameters according to the optimal evaluation result and generates a corresponding neural network model from them. The apparatus determines the neural network model according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model hampers data processing, accelerating data processing, and reducing hardware wear.
On the basis of the foregoing embodiment, the search module 910 is specifically configured to:
constructing a search space based on a method of stacking preset network elements;
and carrying out neural network architecture search in the search space through a predefined search strategy.
In one embodiment, performing a neural network architecture search within the search space through a predefined search strategy comprises: relaxing the architecture parameters of the neural network architecture to continuous values through a normalized exponential function; and performing the neural network architecture search in the search space by gradient descent.
On the basis of the foregoing embodiment, the execution module 920 is specifically configured to:
training the plurality of neural network architectures on hardware using the training set to obtain the corresponding intermediate neural network models and hardware usage information when the plurality of neural network architectures are respectively assigned different neural network parameters;
feeding the verification set into the plurality of intermediate neural network models on hardware to obtain corresponding verification results;
based on the verification result and the hardware usage information, an evaluation result is determined.
In one embodiment, determining the evaluation result based on the verification result and the hardware usage information includes: and determining an evaluation result according to the model accuracy contained in the verification result and the hardware utilization rate contained in the hardware use information.
On the basis of the foregoing embodiment, the generating module 930 is specifically configured to:
determining an optimal evaluation result by maximizing the model accuracy and the hardware utilization rate;
and determining the architecture parameters and the weight bit values of the target neural network architecture based on the optimal evaluation result, and further determining the target neural network parameters.
The apparatus for designing a neural network by NAS provided in this embodiment may be used to perform the method for designing a neural network by NAS provided in the foregoing embodiment, and has corresponding functions and advantages.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the above search apparatus, each included unit and module are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
Example four
Embodiments of the present invention further provide a system for designing a neural network by NAS which, when executed by a computer processor, performs the method for designing a neural network by NAS according to Embodiment One.
Of course, in the system for designing a neural network by NAS provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the search method provided by any embodiment of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for designing a neural network over a NAS, comprising:
after receiving the generation request, searching a neural network architecture based on the constructed search space and the search strategy to obtain a plurality of neural network architectures;
running the plurality of neural network architectures on hardware based on a training set and a verification set while monitoring hardware usage information, and determining, according to a preset evaluation scheme, evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on verification results corresponding to the verification set and the hardware usage information;
and determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
2. The method of claim 1, wherein performing a neural network architecture search based on the constructed search space and the search strategy comprises:
constructing a search space based on a preset network unit stacking mode;
and carrying out neural network architecture search in the search space through a predefined search strategy.
3. The method of claim 1, wherein running the plurality of neural network architectures on hardware based on a training set and a verification set while monitoring hardware usage information, and determining, according to a preset evaluation scheme, evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on verification results corresponding to the verification set and the hardware usage information, comprises:
training the plurality of neural network architectures on hardware using a training set to obtain corresponding intermediate neural network models and hardware usage information when the plurality of neural network architectures are respectively assigned different neural network parameters;
bringing the verification set into a plurality of intermediate neural network models on hardware to obtain corresponding verification results;
determining an evaluation result based on the verification result and the hardware usage information.
4. The method of claim 3, wherein determining an evaluation result based on the validation result and the hardware usage information comprises:
and determining an evaluation result according to the model accuracy contained in the verification result and the hardware utilization rate contained in the hardware use information.
5. The method of claim 2, wherein performing a neural network architecture search within the search space through a predefined search strategy comprises:
relaxing architecture parameters of the neural network architecture to continuous values through a normalized exponential function;
and searching a neural network architecture in the search space by adopting a gradient descent method.
6. The method of claim 4, wherein the neural network parameters comprise architecture parameters and weight bit values, and wherein determining a target neural network architecture and corresponding target neural network parameters from an optimal one of the evaluation results comprises:
determining the optimal evaluation result by maximizing the model accuracy and the hardware utilization;
and determining the neural network architecture corresponding to the optimal evaluation result as a target neural network architecture, and determining the architecture parameter and the weight bit value corresponding to the optimal evaluation result as target neural network parameters.
7. The method of any one of claims 1-6, wherein the search space comprises a set of neural network architectures searched in a plurality of sets of network layers with different weight bit values.
8. An apparatus for designing a neural network over a NAS, comprising: a search module, an execution module, and a generation module, wherein,
the searching module is used for searching the neural network architecture based on the constructed searching space and the searching strategy after receiving the generation request to obtain a plurality of neural network architectures;
the execution module is used for running the plurality of neural network architectures on hardware based on a training set and a verification set while monitoring hardware usage information, and for determining, according to a preset evaluation scheme, evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on verification results corresponding to the verification set and the hardware usage information;
and the generating module is used for determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
9. The apparatus of claim 8, wherein the search module comprises:
the construction submodule is used for constructing a search space based on a preset network unit stacking mode;
and the searching submodule is used for searching the neural network architecture in the searching space through a predefined searching strategy.
10. A system for designing a neural network over a NAS, wherein the system, when executed by a computer processor, is configured to perform the method for designing a neural network over a NAS of any one of claims 1-7.
CN202010936265.8A 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS Pending CN112101525A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010936265.8A CN112101525A (en) 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010936265.8A CN112101525A (en) 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS

Publications (1)

Publication Number Publication Date
CN112101525A true CN112101525A (en) 2020-12-18

Family

ID=73751687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010936265.8A Pending CN112101525A (en) 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS

Country Status (1)

Country Link
CN (1) CN112101525A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784949A (en) * 2021-01-28 2021-05-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
CN112784140A (en) * 2021-02-03 2021-05-11 浙江工业大学 Search method of high-energy-efficiency neural network architecture
CN112861951A (en) * 2021-02-01 2021-05-28 上海依图网络科技有限公司 Method for determining image neural network parameters and electronic equipment
CN113033784A (en) * 2021-04-18 2021-06-25 沈阳雅译网络技术有限公司 Method for searching neural network structure for CPU and GPU equipment
CN113094504A (en) * 2021-03-24 2021-07-09 北京邮电大学 Self-adaptive text classification method and device based on automatic machine learning
CN113408634A (en) * 2021-06-29 2021-09-17 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN114860417A (en) * 2022-06-15 2022-08-05 中科物栖(北京)科技有限责任公司 Multi-core neural network processor and multi-task allocation scheduling method for processor
WO2022252694A1 (en) * 2021-05-29 2022-12-08 华为云计算技术有限公司 Neural network optimization method and apparatus
WO2022253226A1 (en) * 2021-06-02 2022-12-08 杭州海康威视数字技术股份有限公司 Configuration determination method and apparatus for neural network model, and device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160342893A1 (en) * 2015-05-21 2016-11-24 Google Inc. Rotating data for neural network computations
CN109598332A (en) * 2018-11-14 2019-04-09 北京市商汤科技开发有限公司 Neural network generation method and device, electronic equipment and storage medium
CN110222824A (en) * 2019-06-05 2019-09-10 中国科学院自动化研究所 Intelligent algorithm model is autonomously generated and evolvement method, system, device
CN110659721A (en) * 2019-08-02 2020-01-07 浙江省北大信息技术高等研究院 Method and system for constructing target detection network
CN110909877A (en) * 2019-11-29 2020-03-24 百度在线网络技术(北京)有限公司 Neural network model structure searching method and device, electronic equipment and storage medium
CN111401516A (en) * 2020-02-21 2020-07-10 华为技术有限公司 Neural network channel parameter searching method and related equipment
CN111428224A (en) * 2020-04-02 2020-07-17 苏州杰锐思智能科技股份有限公司 Computer account login method based on face recognition

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160342893A1 (en) * 2015-05-21 2016-11-24 Google Inc. Rotating data for neural network computations
CN109598332A (en) * 2018-11-14 2019-04-09 北京市商汤科技开发有限公司 Neural network generation method and device, electronic equipment and storage medium
CN110222824A (en) * 2019-06-05 2019-09-10 中国科学院自动化研究所 Intelligent algorithm model is autonomously generated and evolvement method, system, device
CN110659721A (en) * 2019-08-02 2020-01-07 浙江省北大信息技术高等研究院 Method and system for constructing target detection network
CN110909877A (en) * 2019-11-29 2020-03-24 百度在线网络技术(北京)有限公司 Neural network model structure searching method and device, electronic equipment and storage medium
CN111401516A (en) * 2020-02-21 2020-07-10 华为技术有限公司 Neural network channel parameter searching method and related equipment
CN111428224A (en) * 2020-04-02 2020-07-17 苏州杰锐思智能科技股份有限公司 Computer account login method based on face recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李孝安 et al. *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784949B (en) * 2021-01-28 2023-08-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
CN112784949A (en) * 2021-01-28 2021-05-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
CN112861951A (en) * 2021-02-01 2021-05-28 上海依图网络科技有限公司 Method for determining image neural network parameters and electronic equipment
CN112861951B (en) * 2021-02-01 2024-03-26 上海依图网络科技有限公司 Image neural network parameter determining method and electronic equipment
CN112784140A (en) * 2021-02-03 2021-05-11 浙江工业大学 Search method of high-energy-efficiency neural network architecture
CN112784140B (en) * 2021-02-03 2022-06-21 浙江工业大学 Search method of high-energy-efficiency neural network architecture
CN113094504A (en) * 2021-03-24 2021-07-09 北京邮电大学 Self-adaptive text classification method and device based on automatic machine learning
CN113033784A (en) * 2021-04-18 2021-06-25 沈阳雅译网络技术有限公司 Method for searching neural network structure for CPU and GPU equipment
WO2022252694A1 (en) * 2021-05-29 2022-12-08 华为云计算技术有限公司 Neural network optimization method and apparatus
WO2022253226A1 (en) * 2021-06-02 2022-12-08 杭州海康威视数字技术股份有限公司 Configuration determination method and apparatus for neural network model, and device and storage medium
CN113408634B (en) * 2021-06-29 2022-07-05 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN113408634A (en) * 2021-06-29 2021-09-17 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN114860417A (en) * 2022-06-15 2022-08-05 中科物栖(北京)科技有限责任公司 Multi-core neural network processor and multi-task allocation scheduling method for processor

Similar Documents

Publication Publication Date Title
CN112101525A (en) Method, device and system for designing neural network through NAS
Li et al. Auto-tuning neural network quantization framework for collaborative inference between the cloud and edge
Banitalebi-Dehkordi et al. Auto-split: A general framework of collaborative edge-cloud AI
WO2021175058A1 (en) Neural network architecture search method and apparatus, device and medium
WO2021143883A1 (en) Adaptive search method and apparatus for neural network
US20230196202A1 (en) System and method for automatic building of learning machines using learning machines
CN111783937A (en) Neural network construction method and system
CN113505883A (en) Neural network training method and device
CN111723910A (en) Method and device for constructing multi-task learning model, electronic equipment and storage medium
CN111428854A (en) Structure searching method and structure searching device
CN116362325A (en) Electric power image recognition model lightweight application method based on model compression
CN113902116A (en) Deep learning model-oriented reasoning batch processing optimization method and system
CN111831359A (en) Weight precision configuration method, device, equipment and storage medium
CN111831354A (en) Data precision configuration method, device, chip array, equipment and medium
CN111831355A (en) Weight precision configuration method, device, equipment and storage medium
CN112001491A (en) Search method and device for determining neural network architecture for processor
CN115016938A (en) Calculation graph automatic partitioning method based on reinforcement learning
US11921667B2 (en) Reconfigurable computing chip
WO2022252694A1 (en) Neural network optimization method and apparatus
CN115952856A (en) Neural network production line parallel training method and system based on bidirectional segmentation
WO2021238734A1 (en) Method for training neural network, and related device
CN116341634A (en) Training method and device for neural structure search model and electronic equipment
CN111984418B (en) Automatic adjusting and optimizing method and device for granularity parameters of sparse matrix vector multiplication parallel tasks
CN112633516B (en) Performance prediction and machine learning compiling optimization method and device
CN115081609A (en) Acceleration method in intelligent decision, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20201218