CN115511052A - Neural network searching method, device, equipment and storage medium - Google Patents

Neural network searching method, device, equipment and storage medium

Info

Publication number
CN115511052A
CN115511052A (Application CN202211181426.2A)
Authority
CN
China
Prior art keywords
neural network
hyper
neural
performance
parameter configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211181426.2A
Other languages
Chinese (zh)
Inventor
张磊
李富康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jeejio Beijing Technology Co ltd
Original Assignee
Jeejio Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jeejio Beijing Technology Co ltd filed Critical Jeejio Beijing Technology Co ltd
Priority to CN202211181426.2A priority Critical patent/CN115511052A/en
Publication of CN115511052A publication Critical patent/CN115511052A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a neural network searching method, apparatus, device and storage medium, relating to the technical field of neural networks. The method comprises the following steps: determining a neural network search space by using a neural architecture search algorithm from automated artificial intelligence; performing hyper-parameter configuration on each neural network structure in the neural network search space, and optimizing the configured hyper-parameters; and scoring the network performance of the neural networks after hyper-parameter configuration, and selecting the optimal neural network according to the network performance scores corresponding to all the neural networks. Determining the search space with the automated neural architecture search algorithm enlarges the neural network search space; after each neural network structure in the search space is configured with hyper-parameters, the hyper-parameters are optimized, and the optimal neural network is finally determined according to the network performance scores, so that user requirements can be met more flexibly and the accuracy of the neural network search is improved.

Description

Neural network searching method, device, equipment and storage medium
Technical Field
The present invention relates to the field of neural network technologies, and in particular, to a neural network searching method, apparatus, device, and storage medium.
Background
In recent years machine learning has been applied ever more widely, but the machine learning process still requires many manual operations such as feature extraction, model selection and parameter tuning. These steps requiring human intervention greatly limit the speed and efficiency of applications, and the field of automated artificial intelligence has gradually developed in recent years to address this problem. Approximate computing techniques can be applied to the hardware, the software and other aspects of a system; in recent years it has been proposed to build a neural network accelerator that performs approximate computation of a program using a neural network and dedicated hardware. Approximate computing techniques fall roughly into two categories: hardware-based and software-based. The neural network accelerator trains a neural network model, deploys the pre-trained network, and accelerates it at the hardware level as shown in FIG. 1. The accelerator consists of an observation part, a neural network selection part and a binary code generation part. The observation part is mainly used to configure candidate codes and to collect input and output data for training and evaluating the neural networks. The neural network selection part selects an optimal neural network from a set of neural networks. The binary generation stage first generates a corresponding deployment configuration for the pre-trained neural network model.
In the prior art, neural network selection generally uses an enumeration-based selection algorithm. Its basic idea is to determine a set of neural networks, test and evaluate the performance of each network in the set, and output the one with the highest performance. However, this enumeration-based search strategy requires the set of neural networks to be determined in advance, each network to be trained in turn, and the best-performing network to then be picked from the set. The choice of this set is therefore critical: it may happen to contain well-performing networks or only poorly performing ones, which makes the result highly unstable and places a heavy burden on the person selecting the networks. Moreover, the more networks are selected, the higher the probability that a well-performing network is among them, but the right cardinality is hard to determine: too few networks make it unlikely that a good structure is included, while too many greatly increase the amount of computation and hurt overall efficiency.
Disclosure of Invention
In view of this, the present invention provides a neural network searching method, apparatus, device and medium, which can more flexibly meet the user requirements and improve the accuracy of the optimal neural network search. The specific scheme is as follows:
in a first aspect, the present application discloses a neural network searching method, including:
determining a neural network search space by using an automatic artificial intelligence neural structure search algorithm;
carrying out hyper-parameter configuration on each neural network structure in the neural network search space, and optimizing the configured hyper-parameters;
and performing network performance scoring on the neural network after the hyper-parameter configuration, and selecting an optimal neural network according to the network performance scores corresponding to all the neural networks.
Optionally, the performing hyper-parameter configuration on each neural network structure in the neural network search space includes:
and carrying out hyper-parameter configuration on each neural network structure in the neural network search space by utilizing a hyper-parameter calculation formula.
Optionally, the performing hyper-parameter configuration on each neural network structure in the neural network search space by using a hyper-parameter calculation formula includes:
selecting the highest performance score from the network performance scores corresponding to all the neural networks configured with the hyper-parameters;
and carrying out hyper-parameter configuration on one neural network structure in the neural network searching space by utilizing a hyper-parameter calculation formula based on the highest performance score until each neural network structure in the neural network searching space is configured with a hyper-parameter.
Optionally, the hyper-parameter calculation formula is as follows:
EI(λ) = (μ − f_best)·Φ(Z) + σ·ψ(Z),  where Z = (μ − f_best)/σ
wherein λ is a hyper-parameter configuration; M_λ is the regression model for the hyper-parameters, whose prediction for λ has expectation μ and standard deviation σ; f_best is the highest performance score; Φ is the cumulative distribution function of the standard normal distribution; and ψ is the probability density of the standard normal distribution.
Optionally, the optimizing the configured hyper-parameter includes:
optimizing the configured hyperparameters by utilizing a regression model which is constructed in advance based on a Bayesian optimization algorithm and aims at the hyperparameters; the regression model is used to model the conditional probability of the performance assessment of the hyperparameters.
Optionally, the network performance scoring of the neural network after the hyper-parameter configuration includes:
according to the target network parameters corresponding to the neural network after the hyper-parameter configuration, performing performance scoring on the neural network by using a preset evaluation formula;
wherein the target network parameters comprise calculation accuracy, calculation speed and energy consumption; the performance scoring formula is a weighted sum of each of the target network parameters and a respective corresponding weight.
Optionally, before performing the network performance scoring on the neural network after the hyper-parameter configuration, the method further includes:
deploying the neural network after the hyper-parameter configuration to a pre-constructed simulation platform; the simulation platform is constructed based on a neural network accelerator and a central processing unit;
and performing operation simulation on the neural network by using the simulation platform so as to obtain various network parameters corresponding to the neural network from a network output result.
In a second aspect, the present application discloses a neural network searching apparatus, comprising:
the search space determining module is used for determining a neural network search space by using a neural structure search algorithm of automatic artificial intelligence;
the hyper-parameter configuration module is used for carrying out hyper-parameter configuration on each neural network structure in the neural network search space and optimizing the configured hyper-parameters;
and the performance evaluation module is used for scoring the network performance of the neural network after the hyper-parameter configuration and selecting the optimal neural network according to the network performance scores corresponding to all the neural networks.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the neural network searching method described above.
In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program; wherein the computer program when executed by a processor implements the neural network searching method described above.
In the present application, a neural network search space is determined by using an automated artificial intelligence neural architecture search algorithm; hyper-parameter configuration is performed on each neural network structure in the neural network search space, and the configured hyper-parameters are optimized; and the network performance of the neural networks after hyper-parameter configuration is scored, and the optimal neural network is selected according to the network performance scores corresponding to all the neural networks. Determining the neural network search space with the automated neural architecture search algorithm enlarges the search space and covers a more comprehensive range of networks, which solves the problem of a small neural network search space; after hyper-parameter configuration is performed on each neural network structure in the search space, hyper-parameter optimization is carried out, and the optimal neural network is finally determined according to the network performance scores, so that user requirements can be met more flexibly and the accuracy of the optimal neural network search is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic diagram of an acceleration flow of a neural network accelerator at a hardware level;
FIG. 2 is a flow chart of a neural network searching method provided by the present application;
FIG. 3 is a schematic diagram of a specific architecture hyper-parameter formation process provided by the present application;
FIG. 4 is a flow chart illustrating a specific NAS algorithm provided by the present application;
FIG. 5 is a schematic structural diagram of a neural network searching apparatus provided by the present application;
FIG. 6 is a block diagram of an electronic device provided by the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. It is obvious that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
In the prior art, an enumeration-based selection algorithm is generally used for neural network selection. Its basic idea is to determine a set of neural networks, test and evaluate the performance of each network in the set, and output the one with the highest performance. However, this enumeration-based search strategy requires the set of neural networks to be determined in advance, each network to be trained in turn, and the best-performing network to then be picked from the set. The choice of this set is therefore critical: it may happen to contain well-performing networks or only poorly performing ones, which makes the result highly unstable and places a heavy burden on the person selecting the networks. Moreover, the more networks are selected, the higher the probability that a well-performing network is among them, but the right cardinality is hard to determine: too few networks make it unlikely that a good structure is included, while too many greatly increase the amount of computation and hurt overall efficiency. In order to overcome this technical problem, the present application provides a neural network searching method which can meet user requirements more flexibly and improve the accuracy of the optimal neural network search.
The embodiment of the application discloses a neural network searching method, and as shown in fig. 2, the method can comprise the following steps:
step S11: and determining a neural network search space by using an automatic artificial intelligence neural structure search algorithm.
In this embodiment, a neural network search space is determined by using the Neural Architecture Search (NAS) algorithm of Automated Machine Learning (AutoML). It can be understood that many deep learning applications require optimization of the neural network structure, and the NAS algorithm was proposed to solve exactly this problem: it can automatically design a high-performance neural network. A neural network accelerator, or NPU (Neural Processing Unit), is far more efficient than a CPU/GPU when processing neural network computations, so NPUs have very broad application scenarios in the machine learning field. Therefore, in this embodiment, within the workflow of the neural network processor, approximate computation of the neural network structure can be performed through the NAS algorithm, which reduces the amount of computation and greatly improves the efficiency of the processor.
Step S12: and carrying out hyper-parameter configuration on each neural network structure in the neural network search space, and optimizing the configured hyper-parameters.
In this embodiment, hyper-parameter configuration is performed on each neural network structure in the neural network search space, and the configured hyper-parameters are optimized. It can be understood that neural architecture search can automatically obtain, by an economical and efficient search method, a neural network with strong generalization capability that fits the hardware requirements well. The NAS algorithm mainly involves three aspects: the search space, the search strategy and the performance evaluation. Specifically, the configured hyper-parameters may be optimized by using a Bayesian optimization algorithm; that is, in this embodiment the search space of the neural network can be expanded by using the NAS algorithm, and Bayesian optimization is used as the search strategy.
It will be appreciated that the approximate computation problem of a neural network can be described as training the neural network with a training set and a test set over a given neural network search space. The goal is a neural network with optimal performance that is able to meet the needs of the user. User requirements may include calculation accuracy, calculation speed and energy consumption, as well as output quality, real-time constraints and the like.
The selection problem of neural networks can be written as:
η* = argmax_{η ∈ N} Perf(η, D_train, D_test)  subject to the user requirements C
where η denotes a neural network in the neural network search space N, Perf denotes the overall performance of the corresponding neural network, C denotes the user requirements, D_train denotes the training set and D_test denotes the test set; the neural network search is thereby converted into a mathematical model. Further, program approximation is performed using an MLP (Multilayer Perceptron), which can be represented by a hyper-parameter tuple λ = (L, n_0, A_0, n_1, A_1, …, n_L, A_L), where L is the total number of layers of the neural network, n_i is the number of neurons in layer i and A_i is the activation function of layer i; the architecture hyper-parameter formation process is shown in FIG. 3. It can be seen that the neural network selection problem can be restated as a hyper-parameter optimization problem over a parameter space formed as the product of the per-layer domains, in which each n_i ranges over a domain of admissible neuron counts and each A_i ranges over a domain of admissible activation function types for layer i. Through this inference, the neural network search problem is expressed as a hyper-parameter optimization problem.
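For illustration only, the following Python sketch shows one possible encoding of the hyper-parameter tuple λ = (L, n_0, A_0, …, n_L, A_L) and a random draw of a candidate structure from the search space; the neuron-count domain, the activation-function domain and the layer limit are illustrative assumptions rather than values taken from this disclosure.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative domains (assumptions): admissible neuron counts and activation types per layer.
NEURON_DOMAIN = [4, 8, 16, 32, 64]
ACTIVATION_DOMAIN = ["relu", "tanh", "sigmoid"]
MAX_LAYERS = 4

@dataclass
class MLPConfig:
    """Hyper-parameter tuple lambda = (L, n_0, A_0, ..., n_L, A_L) for one MLP structure."""
    layers: List[Tuple[int, str]]  # (neuron count n_i, activation A_i) for each layer

    @property
    def num_layers(self) -> int:
        return len(self.layers)

def sample_config(rng: random.Random) -> MLPConfig:
    """Draw one candidate structure from the search space."""
    depth = rng.randint(1, MAX_LAYERS)
    layers = [(rng.choice(NEURON_DOMAIN), rng.choice(ACTIVATION_DOMAIN)) for _ in range(depth)]
    return MLPConfig(layers)

if __name__ == "__main__":
    rng = random.Random(0)
    cfg = sample_config(rng)
    print(cfg.num_layers, cfg.layers)
```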
In this embodiment, the performing the hyper-parameter configuration on each neural network structure in the neural network search space may include: and carrying out hyper-parameter configuration on each neural network structure in the neural network search space by utilizing a hyper-parameter calculation formula. The hyper-parameters of each neural network structure are calculated in sequence through a preset hyper-parameter calculation formula and then configured.
In this embodiment, the performing the hyper-parameter configuration on each neural network structure in the neural network search space by using the hyper-parameter calculation formula may include: selecting the highest performance score from the network performance scores corresponding to all the neural networks configured with the hyper-parameters; and carrying out hyper-parameter configuration on one neural network structure in the neural network searching space by utilizing a hyper-parameter calculation formula based on the highest performance score until each neural network structure in the neural network searching space is configured with a hyper-parameter.
In this embodiment, the hyper-parameter calculation formula is as follows:
EI(λ) = (μ − f_best)·Φ(Z) + σ·ψ(Z),  where Z = (μ − f_best)/σ
wherein λ is a hyper-parameter configuration; M_λ is the regression model for the hyper-parameters, whose prediction for λ has expectation μ and standard deviation σ; f_best is the highest performance score, namely the highest performance score among all hyper-parameter configurations measured so far; Φ is the cumulative distribution function of the standard normal distribution; and ψ is the probability density of the standard normal distribution.
In this embodiment, in order to automatically generate a plurality of neural networks, the hyper-parameters corresponding to a plurality of neural network structures need to be generated automatically, so an EI (expected improvement) function is adopted: EI is calculated for each candidate structure in N, the configuration with the highest EI is selected as the hyper-parameters of the next neural network structure, the hyper-parameters of the next structure are thereby determined from the scores measured in history, and this step is repeated until the hyper-parameters of all neural network structures are determined.
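A minimal sketch of this EI-based selection, assuming the regression model M_λ supplies a predictive mean μ and standard deviation σ for each candidate configuration; the numbers at the end are illustrative placeholders.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """EI = (mu - f_best) * Phi(Z) + sigma * psi(Z), with Z = (mu - f_best) / sigma."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    improvement = mu - f_best
    z = np.divide(improvement, sigma, out=np.zeros_like(sigma), where=sigma > 0)
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    # With zero predictive uncertainty, EI degenerates to the plain improvement (if positive).
    return np.where(sigma > 0, ei, np.maximum(improvement, 0.0))

# Select the candidate structure with the highest EI as the next one to configure.
mu = [0.72, 0.68, 0.75]     # surrogate means for three candidate structures (illustrative)
sigma = [0.05, 0.10, 0.02]  # surrogate standard deviations (illustrative)
f_best = 0.74               # best performance score measured so far
next_idx = int(np.argmax(expected_improvement(mu, sigma, f_best)))
print("configure candidate", next_idx, "next")
```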
In this embodiment, the optimizing of the configured hyper-parameters may include: optimizing the configured hyper-parameters by using a regression model for the hyper-parameters constructed in advance based on a Bayesian optimization algorithm; the regression model is used to model the conditional probability of the performance evaluation given the hyper-parameters. Because accuracy and speed are determined by many complex hardware and algorithm factors, the performance of the neural network cannot be expressed directly from the hyper-parameters, so a regression model can be built to map the effect of the visible and invisible parameters on the performance of the neural network.
Specifically, a regression tree can be built using the SMAC algorithm (Sequential Model-based Algorithm Configuration) to estimate the conditional probability p(f_Pare | λ) of the performance f_Pare of a given hyper-parameter configuration λ, and the SMAC algorithm can converge to an optimal value within a few iterations. Alternatively, a regression model can also be established using an SMBO (Sequential Model-Based Optimization) algorithm.
It will be appreciated that the neural network design process is essentially a parameter tuning process. In the prior art, grid search and random search are used most often. For grid search, if hyper-parameter A has 3 choices, hyper-parameter B has 2 choices and hyper-parameter C has 3 choices, there are 3 × 2 × 3 = 18 combinations, and the better-performing combination found by traversing these 18 combinations is the output of the grid search. However, grid search requires a large amount of computation and easily leads to a combinatorial explosion, so it is not a very efficient way to optimize hyper-parameters. For random search, if the hyper-parameters take continuous values, values are randomly selected and combined for performance testing and evaluation; random search may happen to be very effective, but it may also be particularly inefficient. Therefore, the effect of hyper-parameter optimization algorithms based on grid search or random search is mediocre. That is, neither the parameter combinations of grid search nor the random parameters of random search can reliably select a well-performing neural network, because of the limited cardinality or the randomness of the parameters. When the parameter choices of grid search are enlarged, the combinations explode, the number of neural networks becomes large, the amount of computation increases greatly and the selection efficiency is low. For the random search method, its randomness makes the result highly unstable, and the performance of the neural network cannot be evaluated well.
For example, FIG. 4 is a flowchart of a specific NAS algorithm provided in this embodiment. First the regression model constructed based on SMAC is initialized. Then the following steps are iterated: generate a candidate neural network structure using the SMAC regression model; train and evaluate it on the target hardware simulator; and update the model with the new data point thus obtained. Finally, the algorithm selects the most energy-efficient optimal neural network according to the constraints specified by the user.
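A compact, self-contained sketch of this loop under illustrative assumptions: candidate structures are represented as fixed-length numeric encodings, a random forest stands in for the SMAC regression model (the per-tree spread serving as the predictive uncertainty), and evaluate_on_simulator is a toy placeholder for training and evaluating a candidate on the target hardware simulator.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def expected_improvement(mu, sigma, f_best):
    """Same EI acquisition as sketched above."""
    improvement = mu - f_best
    z = np.divide(improvement, sigma, out=np.zeros_like(sigma), where=sigma > 0)
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0, ei, np.maximum(improvement, 0.0))

def rf_mean_std(model, X):
    """SMAC-style predictive mean/std taken over the individual trees of the forest."""
    per_tree = np.stack([tree.predict(X) for tree in model.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0)

def evaluate_on_simulator(x):
    """Toy stand-in for training/evaluating a candidate on the target hardware simulator."""
    return float(-np.sum((x - 0.3) ** 2))

def search(candidates, n_init=5, n_iter=15, seed=0):
    rng = np.random.default_rng(seed)
    scores = {int(i): evaluate_on_simulator(candidates[i])
              for i in rng.choice(len(candidates), size=n_init, replace=False)}
    for _ in range(n_iter):
        measured = sorted(scores)
        X, y = candidates[measured], np.array([scores[i] for i in measured])
        model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)  # SMAC stand-in
        pool = [i for i in range(len(candidates)) if i not in scores]
        mu, sigma = rf_mean_std(model, candidates[pool])
        nxt = pool[int(np.argmax(expected_improvement(mu, sigma, y.max())))]
        scores[nxt] = evaluate_on_simulator(candidates[nxt])   # evaluate and update the model data
    best = max(scores, key=scores.get)
    return candidates[best], scores[best]

if __name__ == "__main__":
    encoded = np.random.default_rng(1).random((200, 6))  # 200 encoded candidate structures
    print(search(encoded))
```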
Step S13: performing network performance scoring on the neural networks after hyper-parameter configuration, and selecting an optimal neural network according to the network performance scores corresponding to all the neural networks.
In this embodiment, the network performance of each neural network after hyper-parameter configuration is scored, and the neural network with the highest score among all the networks is selected as the optimal neural network.
In this embodiment, the performing network performance scoring on the neural network after the hyper-parameter configuration may include: according to the target network parameters corresponding to the neural network after the hyper-parameter configuration, performing performance scoring on the neural network by using a preset evaluation formula; wherein the target network parameters comprise calculation accuracy, calculation speed and energy consumption; the performance scoring formula is a weighted sum of each of the target network parameters and a respective corresponding weight. Since the number of network layers and neurons can directly determine the overhead and the calculation accuracy, different neural network structures directly affect the output and performance of the approximation calculation program. Therefore, in the approximate calculation process, the requirements of the user, including calculation accuracy, calculation overhead and the like, need to be comprehensively considered, and in the embodiment, the performance of the neural network is comprehensively scored according to the calculation accuracy, the calculation speed and the energy consumption as feedback parameters.
The performance scoring formula is:
f_Pare = α×P_acc + β×P_speed + ε×P_energy
wherein P_acc denotes the calculation accuracy, P_speed denotes the calculation speed, P_energy denotes the energy consumption, and α, β and ε are their respective weights in the evaluation function. By changing these weights, the search can be steered toward different user requirements, and finally a neural network meeting the user requirements is obtained.
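The weighted-sum score can be computed as below; the default weight values and the assumption that all three metrics are pre-normalized so that larger values are better (e.g., energy expressed as an efficiency score) are illustrative choices, since the disclosure does not fix them.

```python
def performance_score(p_acc, p_speed, p_energy, alpha=0.5, beta=0.3, epsilon=0.2):
    """f_Pare = alpha * P_acc + beta * P_speed + epsilon * P_energy."""
    return alpha * p_acc + beta * p_speed + epsilon * p_energy

# Steering the search toward different user requirements by changing the weights:
balanced = performance_score(0.92, 0.60, 0.75)
accuracy_first = performance_score(0.92, 0.60, 0.75, alpha=0.8, beta=0.1, epsilon=0.1)
print(balanced, accuracy_first)
```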
In this embodiment, before the network performance of the neural network after hyper-parameter configuration is scored, the method may further include: deploying the neural network after hyper-parameter configuration to a pre-constructed simulation platform, the simulation platform being built from a neural network accelerator and a central processing unit; and performing an operation simulation of the neural network with the simulation platform so as to obtain the various network parameters corresponding to the neural network from the network output. Conventional neural network acceleration methods do not include hardware parameters in the performance evaluation of the neural network, yet the target hardware has a large influence on the network, so feedback from the hardware to the neural network needs to be established in some way; the purpose of this is to make the selected neural network better meet the user requirements in both software and hardware. This can be achieved by building a hardware simulation platform: specifically, a CPU+NPU simulation platform is built so that feedback is produced at the output of the platform and a neural network meeting the user requirements can be obtained. Once the unified hardware simulation platform is built, the neural network accelerator can start to work. In the binary generation stage of the accelerator, a corresponding NPU configuration is generated for the pre-trained neural network model, and then an executable file is generated for the CPU+NPU platform.
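As an illustration of how this hardware-level feedback could be wired into the scoring step, the sketch below uses a hypothetical interface to the CPU+NPU simulation platform; the function name, its return values and the platform API are assumptions, since the disclosure does not specify them at code level.

```python
from typing import Dict

def run_on_simulation_platform(network) -> Dict[str, float]:
    """Hypothetical interface: deploy the hyper-parameter-configured network on the
    pre-built CPU+NPU simulation platform, simulate its operation, and read the target
    network parameters back from the network output."""
    # ... deployment and simulation would happen here; placeholder measurements below ...
    return {"accuracy": 0.91, "speed": 0.64, "energy": 0.58}

metrics = run_on_simulation_platform(network=None)
# These values are the P_acc, P_speed and P_energy fed into the scoring formula above.
f_pare = 0.5 * metrics["accuracy"] + 0.3 * metrics["speed"] + 0.2 * metrics["energy"]
print(f_pare)
```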
In the prior art, the performance evaluation criterion of a neural network model is basically accuracy, and the training process usually uses accuracy as the feedback for adjusting parameters. However, the evaluation criteria of a neural network accelerator are not limited to accuracy; they form a series of software and hardware criteria such as the amount of computation, the calculation speed and the power consumption. Evaluating the neural network only for accuracy does not meet the requirements of the accelerator; in other words, it cannot guarantee that the user's power-consumption or other requirements are satisfied, so the manner of performance evaluation needs some changes. In this embodiment, a unified CPU+NPU simulation platform is established to evaluate the neural network comprehensively, and direct feedback on the performance of the neural network can be obtained at the hardware level, so that user requirements can be met in multiple respects such as calculation accuracy and calculation speed, and a better neural network structure is obtained.
It can be seen that, in the research on neural network accelerators, it was found that when determining the neural network, a relatively high requirement on calculation accuracy demands a more elaborate neural network structure, which brings a series of additional overheads such as the amount of computation and power consumption. A user may require high computational accuracy from the neural network structure, or may not, and a balance between the amount of computation and the accuracy may need to be found. In this embodiment, AutoML is used to automatically select an optimal neural network structure according to the calculation accuracy required by the user, so that the neural network structure can be adjusted flexibly and the balance achieved. Specifically, the neural network structure is optimized with the NAS algorithm in AutoML, and hyper-parameter optimization is then performed with a Bayesian optimization algorithm. Compared with the enumeration-based neural network structure selection algorithm, the NAS algorithm can meet user requirements more flexibly. In addition, conventional neural acceleration methods only perform performance evaluation on one processor platform, and it is difficult to feed the hardware architecture directly back into the performance evaluation of the neural network structure, whereas in this embodiment a hardware simulator based on a CPU and an NPU is used to produce direct feedback for the performance evaluation, which helps the NAS algorithm find the most energy-efficient neural network on the target hardware.
As can be seen from the above, in this embodiment a neural network search space is determined by using an automated artificial intelligence neural architecture search algorithm; hyper-parameter configuration is performed on each neural network structure in the neural network search space, and the configured hyper-parameters are optimized; and the network performance of the neural networks after hyper-parameter configuration is scored, and the optimal neural network is selected according to the network performance scores corresponding to all the neural networks. Determining the neural network search space with the automated neural architecture search algorithm enlarges the search space and covers a more comprehensive range of networks, which solves the problem of a small neural network search space; after hyper-parameter configuration is performed on each neural network structure in the search space, hyper-parameter optimization is carried out, and the optimal neural network is finally determined according to the network performance scores of the neural networks, so that user requirements can be met more flexibly and the accuracy of the optimal neural network search is improved.
Correspondingly, the embodiment of the present application further discloses a neural network searching apparatus, as shown in fig. 5, the apparatus includes:
the search space determining module 11 is configured to determine a neural network search space by using a neural structure search algorithm of automated artificial intelligence;
a hyper-parameter configuration module 12, configured to perform hyper-parameter configuration on each neural network structure in the neural network search space, and optimize configured hyper-parameters;
and the performance evaluation module 13 is configured to perform network performance scoring on the neural network after the hyper-parameter configuration, and select an optimal neural network according to the network performance scoring corresponding to all the neural networks.
As can be seen from the above, in this embodiment a neural network search space is determined by using an automated artificial intelligence neural architecture search algorithm; hyper-parameter configuration is performed on each neural network structure in the neural network search space, and the configured hyper-parameters are optimized; and the network performance of the neural networks after hyper-parameter configuration is scored, and the optimal neural network is selected according to the network performance scores corresponding to all the neural networks. Determining the neural network search space with the automated neural architecture search algorithm enlarges the search space and covers a more comprehensive range of networks, which solves the problem of a small neural network search space; after hyper-parameter configuration is performed on each neural network structure in the search space, hyper-parameter optimization is carried out, and the optimal neural network is finally determined according to the network performance scores of the neural networks, so that user requirements can be met more flexibly and the accuracy of the optimal neural network search is improved.
In some embodiments, the hyper-parameter configuration module may specifically include:
and the hyper-parameter configuration unit is used for carrying out hyper-parameter configuration on each neural network structure in the neural network search space by utilizing a hyper-parameter calculation formula.
In some specific embodiments, the hyper-parameter configuration unit may specifically include:
the grading screening unit is used for selecting the highest performance grade from the network performance grades corresponding to all the neural networks configured with the hyper-parameters;
and the hyper-parameter configuration unit is used for carrying out hyper-parameter configuration on one neural network structure in the neural network search space by using a hyper-parameter calculation formula based on the highest performance score until each neural network structure in the neural network search space is configured with a hyper-parameter.
In some embodiments, the hyper-parameter calculation formula is as follows:
EI(λ) = (μ − f_best)·Φ(Z) + σ·ψ(Z),  where Z = (μ − f_best)/σ
wherein λ is a hyper-parameter configuration; M_λ is the regression model for the hyper-parameters, whose prediction for λ has expectation μ and standard deviation σ; f_best is the highest performance score; Φ is the cumulative distribution function of the standard normal distribution; and ψ is the probability density of the standard normal distribution.
In some specific embodiments, the hyper-parameter configuration module may specifically include:
the hyper-parameter optimization unit is used for optimizing the configured hyper-parameters by utilizing a regression model which is constructed in advance based on a Bayesian optimization algorithm and aims at the hyper-parameters; the regression model is used to model the conditional probability of the performance assessment of the hyperparameters.
In some embodiments, the performance evaluation module may specifically include:
the performance scoring unit is used for scoring the performance of the neural network by using a preset evaluation formula according to the target network parameters corresponding to the neural network after the hyper-parameter configuration;
wherein the target network parameters comprise calculation accuracy, calculation speed and energy consumption; the performance scoring formula is a weighted sum of each of the target network parameters and a respective corresponding weight.
In some specific embodiments, the neural network searching apparatus may further include:
the deployment unit is used for deploying the neural network after the hyper-parameter configuration to a pre-constructed simulation platform; the simulation platform is constructed based on a neural network accelerator and a central processing unit;
and the simulation unit is used for simulating the operation of the neural network by using the simulation platform so as to obtain various network parameters corresponding to the neural network from a network output result.
Further, the embodiment of the present application also discloses an electronic device, as shown in fig. 6; the content of the drawing should not be considered as any limitation on the scope of application of the present application.
Fig. 6 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present disclosure. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. The memory 22 is used for storing a computer program, and the computer program is loaded and executed by the processor 21 to implement relevant steps in the neural network searching method disclosed in any one of the foregoing embodiments.
In this embodiment, the power supply 23 is configured to provide a working voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to acquire external input data or output data to the outside, and a specific interface type thereof may be selected according to specific application requirements, which is not specifically limited herein.
In addition, the storage 22 is used as a carrier for storing resources, and may be a read-only memory, a random access memory, a magnetic disk or an optical disk, etc., the resources stored thereon include an operating system 221, a computer program 222, and data 223 including a neural network structure, etc., and the storage manner may be a transient storage or a permanent storage.
The operating system 221 is configured to manage and control each hardware device on the electronic device 20 and the computer program 222, so as to realize the operation and processing of the mass data 223 in the memory 22 by the processor 21; it may be Windows Server, Netware, Unix, Linux or the like. In addition to the computer program that can be used to perform the neural network searching method executed by the electronic device 20 disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs that can be used to perform other specific tasks.
Further, the embodiment of the present application also discloses a computer storage medium, in which computer executable instructions are stored, and when the computer executable instructions are loaded and executed by a processor, the steps of the neural network searching method disclosed in any one of the foregoing embodiments are implemented.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that, in this document, relational terms such as first and second are only used to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises the element.
The neural network searching method, device, equipment and medium provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help understand the method and its core idea; meanwhile, for a person skilled in the art, there may be variations in the specific implementation and application scope according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (10)

1. A neural network searching method, comprising:
determining a neural network search space by using an automatic artificial intelligence neural structure search algorithm;
carrying out hyper-parameter configuration on each neural network structure in the neural network search space, and optimizing the configured hyper-parameters;
and performing network performance scoring on the neural network after the hyper-parameter configuration, and selecting the optimal neural network according to the network performance scores corresponding to all the neural networks.
2. The neural network searching method of claim 1, wherein the performing of the hyper-parameter configuration on each neural network structure in the neural network searching space comprises:
and carrying out hyper-parameter configuration on each neural network structure in the neural network search space by utilizing a hyper-parameter calculation formula.
3. The method according to claim 2, wherein the performing hyper-parameter configuration on each neural network structure in the neural network search space by using a hyper-parameter calculation formula comprises:
selecting the highest performance score from the network performance scores corresponding to all the neural networks configured with the hyper-parameters;
and carrying out hyper-parameter configuration on one neural network structure in the neural network search space by utilizing a hyper-parameter calculation formula based on the highest performance score until each neural network structure in the neural network search space is configured with hyper-parameters.
4. The neural network searching method of claim 3, wherein the hyper-parameter calculation formula is as follows:
EI(λ) = (μ − f_best)·Φ(Z) + σ·ψ(Z),  where Z = (μ − f_best)/σ
wherein λ is a hyper-parameter configuration; M_λ is the regression model for the hyper-parameters, whose prediction for λ has expectation μ and standard deviation σ; f_best is the highest performance score; Φ is the cumulative distribution function of the standard normal distribution; and ψ is the probability density of the standard normal distribution.
5. The neural network searching method of claim 1, wherein the optimizing the configured hyper-parameters comprises:
optimizing the configured hyperparameters by utilizing a regression model which is constructed in advance based on a Bayesian optimization algorithm and aims at the hyperparameters; the regression model is used to model the conditional probability of the performance assessment of the hyperparameters.
6. The method for searching the neural network according to claim 1, wherein the scoring the network performance of the neural network after the hyper-parameter configuration comprises the following steps:
according to the target network parameters corresponding to the neural network after the hyper-parameter configuration, performing performance scoring on the neural network by using a preset evaluation formula;
wherein the target network parameters comprise calculation accuracy, calculation speed and energy consumption; the performance scoring formula is a weighted sum of each of the target network parameters and a respective corresponding weight.
7. The neural network searching method according to any one of claims 1 to 6, wherein before scoring the network performance of the neural network after the hyper-parameter configuration, the method further comprises:
deploying the neural network after the hyper-parameter configuration to a pre-constructed simulation platform; the simulation platform is constructed based on a neural network accelerator and a central processing unit;
and performing operation simulation on the neural network by using the simulation platform so as to obtain various network parameters corresponding to the neural network from a network output result.
8. A neural network searching apparatus, comprising:
the search space determining module is used for determining a neural network search space by using a neural structure search algorithm of automatic artificial intelligence;
the hyper-parameter configuration module is used for carrying out hyper-parameter configuration on each neural network structure in the neural network search space and optimizing the configured hyper-parameters;
and the performance evaluation module is used for scoring the network performance of the neural network after the hyper-parameter configuration and selecting the optimal neural network according to the network performance scores corresponding to all the neural networks.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the neural network searching method of any one of claims 1 to 7.
10. A computer-readable storage medium for storing a computer program; wherein the computer program when executed by the processor implements a neural network searching method as claimed in any one of claims 1 to 7.
CN202211181426.2A 2022-09-27 2022-09-27 Neural network searching method, device, equipment and storage medium Pending CN115511052A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211181426.2A CN115511052A (en) 2022-09-27 2022-09-27 Neural network searching method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211181426.2A CN115511052A (en) 2022-09-27 2022-09-27 Neural network searching method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115511052A true CN115511052A (en) 2022-12-23

Family

ID=84505261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211181426.2A Pending CN115511052A (en) 2022-09-27 2022-09-27 Neural network searching method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115511052A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117764129A (en) * 2024-01-18 2024-03-26 天津大学 Edge equipment for automatically designing multiplication-free neural network
CN117764129B (en) * 2024-01-18 2024-05-07 天津大学 Edge equipment for automatically designing multiplication-free neural network

Similar Documents

Publication Publication Date Title
Zhang et al. A return-cost-based binary firefly algorithm for feature selection
Khritonenko et al. Distributed self-configuring evolutionary algorithms for artificial neural networks design
Ahmadi et al. Learning fuzzy cognitive maps using imperialist competitive algorithm
CN110674965A (en) Multi-time step wind power prediction method based on dynamic feature selection
CN110738362A (en) method for constructing prediction model based on improved multivariate cosmic algorithm
CN115511052A (en) Neural network searching method, device, equipment and storage medium
Valdez et al. Fuzzy control of parameters to dynamically adapt the PSO and GA algorithms
Yan et al. A double weighted Naive Bayes with niching cultural algorithm for multi-label classification
Czajkowski et al. Steering the interpretability of decision trees using lasso regression-an evolutionary perspective
Chen et al. A Spark-based Ant Lion algorithm for parameters optimization of random forest in credit classification
Sun et al. Dynamic Intelligent Supply-Demand Adaptation Model Towards Intelligent Cloud Manufacturing.
CN112508177A (en) Network structure searching method and device, electronic equipment and storage medium
Diao et al. Fuzzy-rough classifier ensemble selection
CN115528750B (en) Power grid safety and stability oriented data model hybrid drive unit combination method
Chen et al. Deep reinforcement learning with model-based acceleration for hyperparameter optimization
Zahari et al. Evaluation of sustainable development indicators with fuzzy TOPSIS based on subjective and objective weights
García et al. A two-step approach of feature construction for a genetic learning algorithm
Valdez et al. A new evolutionary method with a hybrid approach combining particle swarm optimization and genetic algorithms using fuzzy logic for decision making
CN112712202B (en) Short-term wind power prediction method and device, electronic equipment and storage medium
Fonou-Dombeu Ranking semantic web ontologies with ELECTRE
CN114386142A (en) Building energy consumption prediction method based on multisource fusion feature selection and fuzzy difference enhanced Stacking framework
Rahman et al. Implementation of artificial neural network on regression analysis
Butka et al. One approach to combination of FCA-based local conceptual models for text analysis—grid-based approach
Meng et al. Learning non-stationary dynamic Bayesian network structure from data stream
CN113449869A (en) Learning method of easy-reasoning Bayesian network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination