CN112101525A - Method, device and system for designing neural network through NAS - Google Patents

Method, device and system for designing neural network through NAS

Info

Publication number
CN112101525A
Authority
CN
China
Prior art keywords
neural network
hardware
architecture
parameters
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010936265.8A
Other languages
Chinese (zh)
Inventor
毛伟
余浩
代柳瑶
朱雪娟
常成
王宇航
李凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Southern University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN202010936265.8A
Publication of CN112101525A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Abstract

The invention discloses a method, a device and a system for designing a neural network through NAS (Neural Architecture Search). The method comprises the following steps: after a generation request is received, a neural network architecture search is performed based on the constructed search space and search strategy to obtain a plurality of neural network architectures; the plurality of neural network architectures are run on hardware based on the training set and the verification set while hardware usage information is monitored, and the evaluation results obtained when the architectures are respectively assigned different neural network parameters are determined according to a preset evaluation scheme, based on the verification results corresponding to the verification set and the hardware usage information; and a target neural network architecture and corresponding target neural network parameters are determined among the plurality of neural network architectures according to the optimal evaluation result, and a corresponding neural network model is generated based on the target neural network architecture and the target neural network parameters. Because the neural network model is determined according to both data processing capability and hardware usage, data processing is accelerated and hardware wear is reduced.

Description

Method, device and system for designing neural network through NAS
Technical Field
The embodiment of the invention relates to computer technology, in particular to a method, a device and a system for designing a neural network through NAS.
Background
With the development of technologies such as big data, the Internet of Things and mobile edge computing, Artificial Intelligence (AI) technology represented by deep neural networks is widely applied in fields such as image processing. The number of edge terminal devices has grown rapidly, and the amount of data they generate has reached the zettabyte (ZB) level.
As the data volume at AI terminals grows, the computational load on the edge side increases. Existing deep-learning-based data processing frameworks mainly convert large data sets into model parameters and use a Neural Architecture Search (NAS) algorithm to traverse candidate architectures under hardware overhead constraints, finding the neural network architecture with the most suitable performance and thereby completing inference analysis.
Existing AI neural network accelerator devices suffer from long processing times and low precision when handling large amounts of data; their computational performance cannot keep up with current data volumes, resulting in high hardware power consumption and low energy efficiency.
Disclosure of Invention
The invention provides a method, a device and a system for designing a neural network through NAS (Neural Architecture Search), which can reduce hardware power consumption and improve energy efficiency.
In a first aspect, an embodiment of the present invention provides a method for designing a neural network by a NAS, including:
after receiving the generation request, searching a neural network architecture based on the constructed search space and the search strategy to obtain a plurality of neural network architectures;
running the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and determining, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information;
and determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
In a second aspect, an embodiment of the present invention further provides an apparatus for designing a neural network by a NAS, including: a search module, an execution module, and a generation module, wherein,
the searching module is used for searching the neural network architecture based on the constructed searching space and the searching strategy after receiving the generation request to obtain a plurality of neural network architectures;
the execution module is used for running the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and for determining, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information;
and the generating module is used for determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
In a third aspect, embodiments of the present invention further provide a system for designing a neural network by NAS which, when executed by a computer processor, performs the method for designing a neural network by NAS described in the first aspect.
The embodiments of the invention provide a method, a device and a system for designing a neural network through NAS (Neural Architecture Search). After a generation request is received, a neural network architecture search is performed based on the constructed search space and search strategy to obtain a plurality of neural network architectures. The architectures are run on hardware based on the training set and the verification set while hardware usage information is monitored, and the evaluation results of the different architectures under their respective parameters are obtained according to a preset evaluation scheme, based on the verification results corresponding to the verification set and the hardware usage information. A target neural network architecture and corresponding target neural network parameters are then determined among the plurality of neural network architectures according to the optimal evaluation result, and a corresponding neural network model is generated based on the target neural network architecture and the target neural network parameters. With this scheme, the neural network model can be determined according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model degrades data processing efficiency, accelerating data processing, and reducing hardware wear.
Drawings
FIG. 1 is a flowchart of a method for designing a neural network by a NAS according to an embodiment of the present invention;
FIG. 2 is a diagram of the mutual mapping between neural network architecture search and hardware;
FIG. 3 is a process diagram of compiling the neural network and then invoking hardware-recognized RISC-V instructions to map it onto the corresponding hardware resources;
FIG. 4 is a flowchart of a method for designing a neural network by a NAS according to a second embodiment of the present invention;
FIG. 5 is a flowchart illustrating step 420 of a method for designing a neural network by a NAS according to a second embodiment of the present invention;
FIG. 6 is a diagram of a spatiotemporally multiplexed systolic data flow based on in-memory computing according to the second embodiment of the present invention;
FIG. 7 is a flowchart, according to the second embodiment of the present invention, for determining a target neural network architecture and corresponding target neural network parameters among a plurality of neural network architectures according to the optimal evaluation result;
FIG. 8 is a flowchart of an implementation of a method for designing a neural network by a NAS according to the second embodiment of the present invention;
FIG. 9 is a schematic diagram of an apparatus for designing a neural network by a NAS according to a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a flowchart of a method for designing a neural network by a NAS according to the first embodiment of the present invention. The method is applicable to scenarios where a large number of computation tasks must be processed in real time. It may be executed by an AI terminal, whose workload may be allocated by a hardware accelerator, so that the AI terminal's hardware utilization and model accuracy can be evaluated and the neural network model further optimized. The method specifically includes the following steps:
Step 110: after receiving a generation request, perform a neural network architecture search based on the constructed search space and search strategy to obtain a plurality of neural network architectures.
When the AI terminal receives a large amount of data, the computational load on the edge side increases and hardware wear grows, so a neural network model needs to be rebuilt to process the data in time. Specifically, the AI terminal may be an image classification terminal. Upon receiving a neural network model generation request issued by the AI terminal, neural network architectures are searched across different network layers based on a preset search strategy, yielding a plurality of neural network architectures.
Specifically, the generation request may contain a preset search strategy. The search space comprises the list of candidate network architectures. The search strategy may use a differentiable method together with gradient descent to speed up the search. The architecture search can be realized with the NAS algorithm, which, based on design requirements and the corresponding hardware platform, traverses candidate architectures under hardware overhead constraints to find the neural network architecture with the most suitable performance.
In particular, the search strategy may relax the architecture parameters of the neural network architecture to continuous values through a differentiable method, so that the search space becomes differentiable. In the continuous search space, the architecture search then proceeds by gradient descent, for example iteratively approaching the minimum of the loss function step by step, so that the minimal loss function can be obtained. Conversely, in practical applications, data processing efficiency or hardware utilization can also be maximized step by step through gradient ascent.
When searching the NAS neural network architecture according to the preset search strategy, layer-by-layer precision selection and branch selection may be performed on the network layers. Specifically, the network-layer precision may include the network weights, i.e., the weight bit width of each layer. For example, in a five-layer network whose corresponding weight bit values are 2, 4, 6, 4 and 8, the first layer uses 2-bit weights, the second layer 4-bit weights, and so on.
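For illustration only, the sketch below shows one way such a layer-wise precision assignment could be represented, using the five-layer example above; the dictionary layout and layer names are our assumptions, not part of the patent.

```python
# Hypothetical encoding of one candidate point in a layer-wise
# mixed-precision search space (weight bit widths per layer).
candidate_precision = {
    "layer1": 2,  # 2-bit weights
    "layer2": 4,
    "layer3": 6,
    "layer4": 4,
    "layer5": 8,  # 8-bit weights
}
```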
Step 120: run the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and determine, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information.
The plurality of neural network architectures obtained in step 110 may be used to process the data contained in the training set and the verification set; running this processing on hardware yields the hardware usage information. An architecture's data processing capability and the wear it imposes on hardware can both be used to evaluate it: processing the training set on each architecture yields the different neural network parameters for that architecture, and processing the verification set then yields the corresponding verification result. Fig. 2 is a diagram of the mutual mapping between neural network architecture search and hardware. As shown in fig. 2, hardware usage information may be received while data is processed, and the evaluation result for an architecture given its neural network parameters is determined by combining the verification result obtained on the verification set with the received hardware usage information.
For example, the processing of data by the neural network architecture may include convolution, pooling, padding, and the like.
Specifically, the training set includes a sample data set for training, which is mainly used to train parameters in the neural network architecture. The validation set includes a set of sample data for validating performance of the neural network architecture. After training of different neural network architectures on the training set is finished, the performance of each neural network architecture is compared and judged through the verification set. The different neural network architectures may include neural network architectures corresponding to different parameters, and may also refer to neural network architectures with different architectures. The verification result may include data processing capability, and specifically may include data processing accuracy, data loss rate, and the like.
For example, when the AI terminal is an image classification terminal, the classification loss of the corresponding neural network architecture can be obtained after classification from the probability distribution mapping each image to the predicted output classes; specifically, this may be the cross entropy of the image classification. Cross entropy measures the dissimilarity between two probability distributions.
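For reference, the standard definition of the cross entropy between a true distribution p and a predicted distribution q over classes x is:

H(p, q) = -\sum_{x} p(x) \log q(x)

For a one-hot label with true class c, this reduces to -\log q(c), the usual classification loss.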
Fig. 3 shows the process of compiling the neural network and then invoking hardware-recognized RISC-V instructions to map it onto the corresponding hardware resources. As shown in fig. 3, based on the RISC-V instructions and the neural network architecture and its parameters, the instruction set corresponding to the architecture is mapped onto the hardware for execution. This enables co-optimization of the neural network architecture and the hardware: adjusting the architecture or its parameters can raise hardware utilization and thereby improve data processing efficiency. After the neural network is obtained, a software compilation environment maps it into instructions the hardware can recognize, including a precision-setting instruction (setup), data access control instructions (ld-mem/st-mem), an in-memory computing instruction (cim-ld-mem), and the like, which control the architecture and circuit modules on the hardware to run the neural network.
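Purely as an illustration of this compile-and-map step, the sketch below emits the instruction mnemonics named above (setup, ld-mem, st-mem, cim-ld-mem) for each network layer; the program structure, operand fields, and function name are hypothetical and not defined by the patent.

```python
# Hedged sketch: compile each network layer into the hardware-recognized
# instruction mnemonics named in the text. Operand fields are illustrative.
def compile_layer(layer):
    return [
        ("setup",      {"weight_bits": layer["weight_bits"]}),  # precision setting
        ("ld-mem",     {"src": "input_buffer"}),                # data access control
        ("cim-ld-mem", {"op": layer["op"]}),                    # in-memory computing
        ("st-mem",     {"dst": "output_buffer"}),               # data access control
    ]

layers = [{"op": "conv3x3", "weight_bits": 4},
          {"op": "pool2x2", "weight_bits": 8}]
program = [ins for layer in layers for ins in compile_layer(layer)]
```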
The hardware usage information may include the hardware utilization rate, and it can be fed back to the neural network architecture search through the instruction mapping as a search constraint.
The verification result on the verification set indicates the data processing capability of the corresponding neural network architecture, and the hardware usage information indicates the wear the architecture imposes on the hardware. Together, the data processing capability and hardware wear information enable evaluation of an architecture's performance given its neural network parameters. After the architecture search, the neural network architecture is evaluated based on the real-time data and the evaluation scheme before proceeding to the subsequent steps.
Step 130: determine a target neural network architecture and corresponding target neural network parameters among the plurality of neural network architectures according to the optimal evaluation result, and generate a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
The evaluation result comprises the verification result of the verification set on the neural network architecture and the corresponding hardware usage. The evaluation results can be fed back to the search strategy to guide the continued search toward architectures with higher data processing capability, and the optimal verification result and hardware usage determine the target neural network architecture best suited to processing the current data. The target neural network parameters are obtained while training the target neural network architecture on the training set, and together the target architecture and parameters generate the corresponding neural network model.
Specifically, the neural network parameters may include convolution kernel parameters and strides of the neural network architecture, and may further include the weight bit values of the network layers. Given the neural network parameters, the configuration of the neural network architecture is determined, defining the neural network model corresponding to that architecture and those parameters.
This embodiment of the invention provides a method for designing a neural network through NAS (Neural Architecture Search). After a generation request is received, a neural network architecture search is performed based on the constructed search space and search strategy to obtain a plurality of neural network architectures. The architectures are run on hardware based on the training set and the verification set while hardware usage information is monitored, and the evaluation results obtained when they are respectively assigned different neural network parameters are determined according to a preset evaluation scheme, based on the verification results corresponding to the verification set and the hardware usage information. A target neural network architecture and corresponding target neural network parameters are then determined according to the optimal evaluation result, and a corresponding neural network model is generated from them. With this scheme, the neural network model is determined according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model degrades data processing, accelerating data processing, and reducing hardware wear.
Example two
Fig. 4 is a flowchart of a method for designing a neural network by a NAS according to a second embodiment of the present invention, where the second embodiment is optimized based on the foregoing embodiments. As shown in fig. 4, a second embodiment of the present invention provides a method for designing a neural network by a NAS, including the following steps:
step 410, constructing a search space based on a preset network element stacking mode.
Specifically, the search space includes a set of searchable NAS neural network architectures, and traversing the architecture can find a neural network architecture with appropriate performance.
The predetermined network element stacking pattern may include a MobileNet-V2 element.
Preferably, in order to find a suitable neural network architecture, the search space may include a set of neural network architectures searched in a plurality of sets of network layers with different weight bit values.
Step 420: perform a neural network architecture search in the search space through a predefined search strategy.
In particular, the search strategy may comprise a differentiable method, searching the space with a gradient-descent-based strategy.
It should be noted that the search space is normally discrete, i.e., not differentiable. By introducing a differentiable method that relaxes the network's architecture parameters to continuous values through a normalized exponential function (softmax), the search space becomes differentiable, and gradient descent can then search it, greatly accelerating the search.
Gradient descent can solve, for example, least squares problems, and is one of the most commonly used methods for fitting the model parameters of machine learning algorithms, i.e., for unconstrained optimization. To find the minimum of a loss function, gradient descent steps are iterated until the minimized loss function and the corresponding model parameter values are obtained. In machine learning, the basic variants include stochastic gradient descent and batch gradient descent.
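In standard form, each gradient descent iteration moves the model parameters \theta against the gradient of the loss L with learning rate \eta:

\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)

Stochastic gradient descent estimates \nabla_{\theta} L on a single sample or mini-batch, while batch gradient descent computes it over the full training set.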
Fig. 5 is a flowchart of step 420 in a method for designing a neural network by a NAS according to a second embodiment of the present invention, and in an implementation, as shown in fig. 5, step 420 may specifically include:
and step 421, continuously processing the architecture parameters of the neural network architecture by the normalized exponential function.
In particular, the architecture parameters of the neural network architecture are continuous, making the search space differentiable. The method has the advantages that the search space can be searched by adopting a gradient descent method, and the search speed is greatly increased.
Step 422: search for a neural network architecture in the search space using gradient descent.
Specifically, gradient descent finds, within the search space, the neural network architecture that minimizes the objective function; in this embodiment the objective function may be the loss function.
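As a concrete illustration of steps 421 and 422, here is a minimal PyTorch-style sketch (our own, not from the patent) that relaxes the choice among candidate operations with a normalized exponential function and then updates the architecture parameters by gradient descent; all class and variable names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Step 421: softmax-weighted mixture of candidate operations,
    turning the discrete choice into a differentiable weighted sum."""
    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.alpha = nn.Parameter(torch.zeros(len(ops)))  # architecture parameters

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)  # normalized exponential function
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Step 422: search by gradient descent on the architecture parameters.
mixed = MixedOp([nn.Conv2d(3, 3, 3, padding=1), nn.MaxPool2d(3, 1, 1)])
opt = torch.optim.SGD([mixed.alpha], lr=0.1)
x, target = torch.randn(1, 3, 8, 8), torch.randn(1, 3, 8, 8)
loss = F.mse_loss(mixed(x), target)  # stand-in for the validation loss
loss.backward()                      # gradient of the loss w.r.t. alpha
opt.step()                           # descend along that gradient
```

After the search converges, the operation with the largest architecture weight would typically be retained as the discrete choice.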
Step 430: train the plurality of neural network architectures on hardware using the training set to obtain the corresponding intermediate neural network models and hardware usage information when the architectures are respectively assigned different neural network parameters.
Specifically, the training set is used to train each neural network architecture to obtain its neural network parameters: a loss is computed on the network outputs, and each neural network parameter in the architecture is then updated by descending along the gradient of the calculated loss. The neural network parameters are obtained through this training.
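A hedged sketch of this weight-training step, assuming a PyTorch-style setup; the model, data loader, and criterion (e.g. nn.CrossEntropyLoss()) are placeholders rather than names from the patent:

```python
import torch

def train_weights(model, train_loader, criterion, lr=0.01, epochs=1):
    """Fit the neural network parameters of one candidate architecture
    on the training set, yielding an intermediate model (step 430)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in train_loader:
            opt.zero_grad()
            loss = criterion(model(x), y)  # e.g. cross entropy on the labels
            loss.backward()                # gradient of the calculated loss
            opt.step()                     # descend along that gradient
    return model
```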
It should be noted that when an intermediate neural network model performs mixed-precision convolution, as shown in fig. 6, an in-memory computing architecture combining a global cache with local caches may be designed, and a systolic data-stream multiplexing method adopted to maximize on-chip data reuse.
Specifically, a systolic array is a network of tightly coupled processing units constrained by a data flow relationship: each node exchanges data with one or more neighboring nodes, intermediate results can be stored in the processing units or passed downstream, and storage combines local caches with a global cache. Systolic-array computation can further optimize the neural network architecture: the input buffer and the weight buffer about to take part in the convolution are expanded into two matrices, on which matrix partial-sum operations are then performed. Long parallel data paths are thereby converted into short data paths through each processing unit, reducing data overlap and easing data storage. The input buffer, the weight buffer and the convolution operation can all use time-division multiplexing: different signals are interleaved into different time slots and transmitted along the same channel, and at the receiving end each slot's signal is extracted and restored to the original, so that multiple signals share one channel.
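To make the buffer-unrolling concrete, the NumPy sketch below (our own simplification: single channel, stride 1, no padding) expands the input and weight buffers into matrices so the convolution becomes a single matrix product of partial sums:

```python
import numpy as np

def conv2d_as_gemm(x, w):
    """Unroll a convolution into a matrix multiply (im2col-style),
    mirroring the expansion of input and weight buffers described above."""
    H, W = x.shape
    K = w.shape[0]
    oh, ow = H - K + 1, W - K + 1
    # "Input buffer" matrix: every KxK patch flattened into one row.
    cols = np.stack([x[i:i + K, j:j + K].ravel()
                     for i in range(oh) for j in range(ow)])
    # "Weight buffer" matrix: the kernel flattened into one column.
    return (cols @ w.ravel()).reshape(oh, ow)

out = conv2d_as_gemm(np.random.rand(8, 8), np.random.rand(3, 3))
```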
Step 440: feed the verification set into the plurality of intermediate neural network models on hardware to obtain the corresponding verification results.
Specifically, feeding the verification set into the intermediate neural network models yields their verification results. A verification result may include the model's data processing capability, such as data accuracy and data loss rate.
Step 450: determine the evaluation results based on the verification results and the hardware usage information.
While the intermediate neural network models process the verification set, they are evaluated according to the evaluation scheme to obtain the evaluation results.
It should be noted that, in addition to the verification result and the hardware usage information, the evaluation result may be determined according to the energy consumption of the intermediate neural network model.
Preferably, in order to make the evaluation results of multiple intermediate neural network models comparable, determining the evaluation result based on the verification result and the hardware usage information may include: determining the evaluation result according to the model accuracy contained in the verification result and the hardware utilization rate contained in the hardware usage information.
The model accuracy may include the intermediate neural network model's data processing accuracy and data loss rate; the hardware utilization rate may be the utilization of the hardware while the intermediate neural network model processes the data.
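As a hedged sketch of what such a preset evaluation scheme could look like (the weighted-sum form and the weights are our assumption; the patent does not fix a formula):

```python
def evaluate(model_accuracy, hw_utilization, w_acc=0.7, w_hw=0.3):
    """Combine model accuracy and hardware utilization (both in [0, 1])
    into one evaluation score; higher is better."""
    return w_acc * model_accuracy + w_hw * hw_utilization

# Picking the optimal evaluation result among (accuracy, utilization) pairs:
results = [(0.91, 0.60), (0.89, 0.85), (0.93, 0.40)]
best = max(results, key=lambda r: evaluate(*r))
```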
Step 460: determine a target neural network architecture and corresponding target neural network parameters among the plurality of neural network architectures according to the optimal evaluation result, and generate a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
Fig. 7 is a flowchart for determining a target neural network architecture and corresponding target neural network parameters in a plurality of neural network architectures according to an optimal evaluation result in the evaluation results provided in the second embodiment of the present invention, and in an implementation, as shown in fig. 7, the method may specifically include:
step 461, determining the optimal evaluation result by maximizing the model accuracy and the hardware utilization rate.
In particular, maximizing model accuracy and hardware utilization may be achieved by normalizing an exponential function.
The model accuracy rate may indicate the data processing capability of the neural network architecture, and specifically may include data processing efficiency, data loss rate, and the like. High data processing efficiency and low data loss rate can correspond to high model accuracy, and accordingly, the neural network architecture has stronger data processing capability.
Hardware utilization may indicate the wear on the hardware as the neural network architecture processes data. A high hardware utilization rate indicates that this wear is spread relatively evenly across the hardware, reducing localized hardware wear.
The maximum of model accuracy and hardware utilization can be computed with a least squares method; in practical applications it can also be found by gradient ascent.
Step 462: determine the neural network architecture corresponding to the optimal evaluation result as the target neural network architecture, and determine the architecture parameters and weight bit values corresponding to the optimal evaluation result as the target neural network parameters.
The neural network parameters may include architecture parameters, i.e., structural parameters in the search space such as convolution kernel parameters and strides, as well as weight bit values.
The optimal evaluation result may include the model accuracy and the hardware utilization rate. Once these are determined, the corresponding neural network architecture yields the target neural network architecture and the target neural network parameters.
The second embodiment of the invention provides a method for designing a neural network through NAS. A search space is constructed by stacking preset network units, and a neural network architecture search is performed within it by a predefined search strategy. The plurality of neural network architectures are trained on hardware using the training set, yielding the corresponding intermediate neural network models and hardware usage information when the architectures are respectively assigned different neural network parameters. The verification set is then fed into the intermediate neural network models on hardware to obtain the corresponding verification results, and the evaluation results are determined from the verification results and the hardware usage information. Finally, a target neural network architecture and corresponding target neural network parameters are determined according to the optimal evaluation result, and a corresponding neural network model is generated from them. With this scheme, the neural network model is determined according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model degrades data processing efficiency, improving data processing efficiency and reducing hardware wear.
Fig. 8 is a flowchart of an implementation of a method for designing a neural network by a NAS according to the second embodiment of the present invention, illustrating how the method is carried out. As shown in fig. 8, after the generation request is received, a neural network architecture search is performed in the search space based on the preset search strategy; the searched architectures comprise network relationship sets determined by a plurality of network units based on the network quantization bit widths. Neural network architecture parameters are then trained on the hardware resources based on the training set and the verification set to obtain the neural network parameters. At the same time, the performance of each trained architecture is evaluated through the verification results on the verification set and the hardware usage. On that basis, either the neural network model is determined, or the architecture search is repeated for further training and performance evaluation, finally yielding the determined neural network model.
EXAMPLE III
Fig. 9 is a schematic diagram of an apparatus for designing a neural network by a NAS according to a third embodiment of the present invention. As shown in fig. 9, the apparatus may include: a search module 910, an execution module 920, and a generation module 930, wherein:
the searching module 910 is configured to, after receiving the generation request, perform neural network architecture search based on the constructed search space and the search policy to obtain a plurality of neural network architectures;
an execution module 920, configured to run the plurality of neural network architectures on hardware based on the training set and the verification set while monitoring hardware usage information, and to determine, according to a preset evaluation scheme, the evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information;
a generating module 930, configured to determine a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to an optimal evaluation result in the evaluation results, and generate a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
After receiving a generation request, the apparatus for designing a neural network through NAS provided by this embodiment performs a neural network architecture search based on the constructed search space and search strategy to obtain a plurality of neural network architectures; it runs them on hardware based on the training set and the verification set while monitoring hardware usage information, and determines, according to a preset evaluation scheme, the evaluation results obtained when the architectures are respectively assigned different neural network parameters, based on the verification results corresponding to the verification set and the hardware usage information; it then determines a target neural network architecture and corresponding target neural network parameters according to the optimal evaluation result and generates a corresponding neural network model from them. The apparatus determines the neural network model according to the architecture's data processing capability and the hardware usage, solving the problem that an ill-suited neural network model hampers data processing, accelerating data processing, and reducing hardware wear.
On the basis of the foregoing embodiment, the search module 910 is specifically configured to:
constructing a search space based on a method of stacking preset network elements;
and carrying out neural network architecture search in the search space through a predefined search strategy.
In one embodiment, performing a neural network architecture search within the search space through a predefined search strategy comprises: relaxing the architecture parameters of the neural network architecture to continuous values through a normalized exponential function; and performing the neural network architecture search in the search space by gradient descent.
On the basis of the foregoing embodiment, the execution module 920 is specifically configured to:
training the plurality of neural network architectures on hardware using the training set to obtain the corresponding intermediate neural network models and hardware usage information when the plurality of neural network architectures are respectively assigned different neural network parameters;
feeding the verification set into the plurality of intermediate neural network models on hardware to obtain corresponding verification results;
based on the verification result and the hardware usage information, an evaluation result is determined.
In one embodiment, determining the evaluation result based on the verification result and the hardware usage information includes: and determining an evaluation result according to the model accuracy contained in the verification result and the hardware utilization rate contained in the hardware use information.
On the basis of the foregoing embodiment, the generating module 930 is specifically configured to:
determining an optimal evaluation result by maximizing the model accuracy and the hardware utilization rate;
and determining the architecture parameters and the weight bit values of the target neural network architecture based on the optimal evaluation result, and further determining the target neural network parameters.
The apparatus for designing a neural network by NAS provided in this embodiment may be used to perform the method for designing a neural network by NAS provided in the foregoing embodiment, and has corresponding functions and advantages.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the above search apparatus, each included unit and module are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
Example four
Embodiments of the present invention further provide a system for designing a neural network by NAS which, when executed by a computer processor, performs the method for designing a neural network by NAS according to Embodiment One.
Of course, in the system for designing a neural network by NAS provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the search method provided by any embodiment of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for designing a neural network over a NAS, comprising:
after receiving the generation request, searching a neural network architecture based on the constructed search space and the search strategy to obtain a plurality of neural network architectures;
running the plurality of neural network architectures on hardware based on a training set and a verification set while monitoring hardware usage information, and determining, according to a preset evaluation scheme, evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on verification results corresponding to the verification set and the hardware usage information;
and determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
2. The method of claim 1, wherein performing a neural network architecture search based on the constructed search space and the search strategy comprises:
constructing a search space based on a preset network unit stacking mode;
and carrying out neural network architecture search in the search space through a predefined search strategy.
3. The method of claim 1, wherein running the plurality of neural network architectures on hardware based on a training set and a verification set while monitoring hardware usage information, and determining, according to a preset evaluation scheme, evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on verification results corresponding to the verification set and the hardware usage information, comprises:
training the plurality of neural network architectures on hardware using a training set to obtain corresponding intermediate neural network models and hardware usage information when the plurality of neural network architectures are respectively assigned different neural network parameters;
bringing the verification set into a plurality of intermediate neural network models on hardware to obtain corresponding verification results;
determining an evaluation result based on the verification result and the hardware usage information.
4. The method of claim 3, wherein determining an evaluation result based on the validation result and the hardware usage information comprises:
and determining an evaluation result according to the model accuracy contained in the verification result and the hardware utilization rate contained in the hardware use information.
5. The method of claim 2, wherein performing a neural network architecture search within the search space through a predefined search strategy comprises:
relaxing architecture parameters of the neural network architecture to continuous values through a normalized exponential function;
and searching a neural network architecture in the search space by adopting a gradient descent method.
6. The method of claim 4, wherein the neural network parameters comprise architecture parameters and weight bit values, and wherein determining a target neural network architecture and corresponding target neural network parameters from an optimal one of the evaluation results comprises:
determining the optimal evaluation result by maximizing the model accuracy and the hardware utilization;
and determining the neural network architecture corresponding to the optimal evaluation result as a target neural network architecture, and determining the architecture parameter and the weight bit value corresponding to the optimal evaluation result as target neural network parameters.
7. The method of any one of claims 1-6, wherein the search space comprises a set of neural network architectures searched in a plurality of sets of network layers with different weight bit values.
8. An apparatus for designing a neural network over a NAS, comprising: a search module, an execution module, and a generation module, wherein,
the searching module is used for searching the neural network architecture based on the constructed searching space and the searching strategy after receiving the generation request to obtain a plurality of neural network architectures;
the execution module is used for running the plurality of neural network architectures on hardware based on a training set and a verification set while monitoring hardware usage information, and for determining, according to a preset evaluation scheme, evaluation results obtained when the plurality of neural network architectures are respectively assigned different neural network parameters, based on verification results corresponding to the verification set and the hardware usage information;
and the generating module is used for determining a target neural network architecture and corresponding target neural network parameters in the plurality of neural network architectures according to the optimal evaluation result in the evaluation results, and generating a corresponding neural network model based on the target neural network architecture and the target neural network parameters.
9. The apparatus of claim 8, wherein the search module comprises:
the construction submodule is used for constructing a search space based on a preset network unit stacking mode;
and the searching submodule is used for searching the neural network architecture in the searching space through a predefined searching strategy.
10. A system for designing a neural network over a NAS, wherein the system, when executed by a computer processor, is configured to perform the method for designing a neural network over a NAS of any one of claims 1-7.
CN202010936265.8A 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS Pending CN112101525A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010936265.8A CN112101525A (en) 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010936265.8A CN112101525A (en) 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS

Publications (1)

Publication Number Publication Date
CN112101525A true CN112101525A (en) 2020-12-18

Family

ID=73751687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010936265.8A Pending CN112101525A (en) 2020-09-08 2020-09-08 Method, device and system for designing neural network through NAS

Country Status (1)

Country Link
CN (1) CN112101525A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784949A (en) * 2021-01-28 2021-05-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
CN112784140A (en) * 2021-02-03 2021-05-11 浙江工业大学 Search method of high-energy-efficiency neural network architecture
CN112861951A (en) * 2021-02-01 2021-05-28 上海依图网络科技有限公司 Method for determining image neural network parameters and electronic equipment
CN113033784A (en) * 2021-04-18 2021-06-25 沈阳雅译网络技术有限公司 Method for searching neural network structure for CPU and GPU equipment
CN113094504A (en) * 2021-03-24 2021-07-09 北京邮电大学 Self-adaptive text classification method and device based on automatic machine learning
CN113408634A (en) * 2021-06-29 2021-09-17 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN114860417A (en) * 2022-06-15 2022-08-05 中科物栖(北京)科技有限责任公司 Multi-core neural network processor and multi-task allocation scheduling method for processor
WO2022252694A1 (en) * 2021-05-29 2022-12-08 华为云计算技术有限公司 Neural network optimization method and apparatus
WO2022253226A1 (en) * 2021-06-02 2022-12-08 杭州海康威视数字技术股份有限公司 Configuration determination method and apparatus for neural network model, and device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160342893A1 (en) * 2015-05-21 2016-11-24 Google Inc. Rotating data for neural network computations
CN109598332A (en) * 2018-11-14 2019-04-09 北京市商汤科技开发有限公司 Neural network generation method and device, electronic equipment and storage medium
CN110222824A (en) * 2019-06-05 2019-09-10 中国科学院自动化研究所 Intelligent algorithm model is autonomously generated and evolvement method, system, device
CN110659721A (en) * 2019-08-02 2020-01-07 浙江省北大信息技术高等研究院 Method and system for constructing target detection network
CN110909877A (en) * 2019-11-29 2020-03-24 百度在线网络技术(北京)有限公司 Neural network model structure searching method and device, electronic equipment and storage medium
CN111401516A (en) * 2020-02-21 2020-07-10 华为技术有限公司 Neural network channel parameter searching method and related equipment
CN111428224A (en) * 2020-04-02 2020-07-17 苏州杰锐思智能科技股份有限公司 Computer account login method based on face recognition

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160342893A1 (en) * 2015-05-21 2016-11-24 Google Inc. Rotating data for neural network computations
CN109598332A (en) * 2018-11-14 2019-04-09 北京市商汤科技开发有限公司 Neural network generation method and device, electronic equipment and storage medium
CN110222824A (en) * 2019-06-05 2019-09-10 中国科学院自动化研究所 Intelligent algorithm model is autonomously generated and evolvement method, system, device
CN110659721A (en) * 2019-08-02 2020-01-07 浙江省北大信息技术高等研究院 Method and system for constructing target detection network
CN110909877A (en) * 2019-11-29 2020-03-24 百度在线网络技术(北京)有限公司 Neural network model structure searching method and device, electronic equipment and storage medium
CN111401516A (en) * 2020-02-21 2020-07-10 华为技术有限公司 Neural network channel parameter searching method and related equipment
CN111428224A (en) * 2020-04-02 2020-07-17 苏州杰锐思智能科技股份有限公司 Computer account login method based on face recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李孝安 et al. *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784949B (en) * 2021-01-28 2023-08-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
CN112784949A (en) * 2021-01-28 2021-05-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
CN112861951A (en) * 2021-02-01 2021-05-28 上海依图网络科技有限公司 Method for determining image neural network parameters and electronic equipment
CN112861951B (en) * 2021-02-01 2024-03-26 上海依图网络科技有限公司 Image neural network parameter determining method and electronic equipment
CN112784140A (en) * 2021-02-03 2021-05-11 浙江工业大学 Search method of high-energy-efficiency neural network architecture
CN112784140B (en) * 2021-02-03 2022-06-21 浙江工业大学 Search method of high-energy-efficiency neural network architecture
CN113094504A (en) * 2021-03-24 2021-07-09 北京邮电大学 Self-adaptive text classification method and device based on automatic machine learning
CN113033784A (en) * 2021-04-18 2021-06-25 沈阳雅译网络技术有限公司 Method for searching neural network structure for CPU and GPU equipment
WO2022252694A1 (en) * 2021-05-29 2022-12-08 华为云计算技术有限公司 Neural network optimization method and apparatus
WO2022253226A1 (en) * 2021-06-02 2022-12-08 杭州海康威视数字技术股份有限公司 Configuration determination method and apparatus for neural network model, and device and storage medium
CN113408634B (en) * 2021-06-29 2022-07-05 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN113408634A (en) * 2021-06-29 2021-09-17 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN114860417A (en) * 2022-06-15 2022-08-05 中科物栖(北京)科技有限责任公司 Multi-core neural network processor and multi-task allocation scheduling method for processor

Similar Documents

Publication Publication Date Title
CN112101525A (en) Method, device and system for designing neural network through NAS
Li et al. Auto-tuning neural network quantization framework for collaborative inference between the cloud and edge
Banitalebi-Dehkordi et al. Auto-split: A general framework of collaborative edge-cloud AI
WO2021175058A1 (en) Neural network architecture search method and apparatus, device and medium
WO2021143883A1 (en) Adaptive search method and apparatus for neural network
US20230196202A1 (en) System and method for automatic building of learning machines using learning machines
CN111783937A (en) Neural network construction method and system
CN113505883A (en) Neural network training method and device
CN111723910A (en) Method and device for constructing multi-task learning model, electronic equipment and storage medium
CN111428854A (en) Structure searching method and structure searching device
CN116362325A (en) Electric power image recognition model lightweight application method based on model compression
CN113902116A (en) Deep learning model-oriented reasoning batch processing optimization method and system
CN111831359A (en) Weight precision configuration method, device, equipment and storage medium
CN111831354A (en) Data precision configuration method, device, chip array, equipment and medium
CN111831355A (en) Weight precision configuration method, device, equipment and storage medium
CN112001491A (en) Search method and device for determining neural network architecture for processor
CN115016938A (en) Calculation graph automatic partitioning method based on reinforcement learning
US11921667B2 (en) Reconfigurable computing chip
WO2022252694A1 (en) Neural network optimization method and apparatus
CN115952856A (en) Neural network production line parallel training method and system based on bidirectional segmentation
WO2021238734A1 (en) Method for training neural network, and related device
CN116341634A (en) Training method and device for neural structure search model and electronic equipment
CN111984418B (en) Automatic adjusting and optimizing method and device for granularity parameters of sparse matrix vector multiplication parallel tasks
CN112633516B (en) Performance prediction and machine learning compiling optimization method and device
CN115081609A (en) Acceleration method in intelligent decision, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20201218