CN110020667A - Neural network structure search method, system, storage medium, and device - Google Patents

Neural network structure search method, system, storage medium, and device

Info

Publication number
CN110020667A
CN110020667A CN201910128954.3A
Authority
CN
China
Prior art keywords
neural network
middle layer
network structure
parameter
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910128954.3A
Other languages
Chinese (zh)
Inventor
贾东亚
赵巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN201910128954.3A priority Critical patent/CN110020667A/en
Publication of CN110020667A publication Critical patent/CN110020667A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/285Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention relates to a neural network structure search method, system, storage medium, and device. The method comprises the following steps. S1: acquire a preset neural network architecture and a sampling structure, where the architecture comprises an input layer, an output layer, and a plurality of sequentially arranged intermediate layers. S2: according to the network structure to be determined in each intermediate layer and its corresponding structure search interval, sample multiple times with the sampling structure to obtain a plurality of sub-neural-network structures. S3: perform classification training on the sub-neural-network structures to obtain updated sub-neural-network structures and an updated structure search interval for each intermediate layer. S4: obtain the classification accuracy of the updated sub-neural-network structures and update the parameters of the sampling structure according to that accuracy. S5: determine the required neural network structure. The invention automatically searches out the neural network structure best suited to the classification task, saving time and improving efficiency.

Description

Neural network structure search method, system, storage medium, and device
Technical field
The present invention relates to the field of signal processing, and more particularly to a neural network structure search method, system, storage medium, and device.
Background technique
For the classification of signal data, the main current methods are conventional machine learning and deep learning. The former artificially extracts features from the signal data, feeds those features into various classifiers for learning, and then classifies with the trained classifier; the latter uses a deep neural network to learn the signal features and classify at the same time. In today's big-data era, deep learning can learn richer and more useful features from large amounts of data, so deep learning methods are increasingly widely used.
The core of a deep learning method is the structure of the neural network, and neural networks mainly include convolutional neural networks, whose structure consists of an input layer, multiple intermediate layers, and an output layer. An intermediate layer may be chosen as a convolutional layer, a pooling layer, a normalization layer, a fully connected layer, or another layer structure, each with its own parameters such as weights; both the choice of layer structure and the setting of its parameters greatly affect the performance of the network. Besides each layer's parameter settings, the connections between layers likewise affect the performance of a convolutional neural network. Therefore, designing a convolutional network for a signal classification task requires a designer with experience in network design and parameter tuning, along with continuous, careful adjustment; the whole design process is complex and laborious.
Summary of the invention
In view of this, an object of the present invention is to provide a neural network structure search method that automatically searches out the neural network structure best suited to a classification task, so that a designer does not need to adjust the structure repeatedly, thereby saving time and improving efficiency.
A neural network structure search method includes the following steps:
Step S1: obtain a preset neural network architecture and a sampling structure, where the architecture comprises an input layer, an output layer, and a plurality of sequentially arranged intermediate layers, and each intermediate layer comprises a network structure to be determined and a structure search interval corresponding to that network structure.
Step S2: according to the network structure to be determined in each intermediate layer and its corresponding structure search interval, sample multiple times with the sampling structure to obtain a plurality of sub-neural-network structures.
Step S3: perform classification training on the sub-neural-network structures, and update both the parameters of the network structure of each intermediate layer in each sub-network and the parameters of the network structures in each intermediate layer's structure search interval, obtaining updated sub-neural-network structures and an updated structure search interval for each intermediate layer.
Step S4: perform classification validation on the updated sub-neural-network structures, obtain their classification accuracy, and update the parameters of the sampling structure according to that accuracy, obtaining an updated sampling structure.
Step S5: repeat steps S2 to S4 a preset number of times, then determine the required neural network structure from the updated sampling structure and the updated structure search interval of each intermediate layer.
The present invention presets a structure search interval for the network structure of each intermediate layer, and the sampling structure automatically searches out the neural network structure best suited to the classification task, without manual adjustment of the structure, saving time and improving efficiency.
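The overall loop of steps S1-S5 can be sketched end to end. The following Python sketch is purely illustrative and is not the patented implementation: the sampler is a simple preference table standing in for the sequence-generation model described later, the validation step is a stand-in function, and all names are hypothetical.

```python
import random

def search(num_rounds, samples_per_round, seed=0):
    """Toy sketch of steps S1-S5: repeatedly sample child structures,
    score them, and update the sampler from validation accuracy."""
    rng = random.Random(seed)
    # S1: a preset architecture with three intermediate layers, each of
    # which picks one processing structure from its search interval.
    search_intervals = [["conv5", "conv3", "pool3"] for _ in range(3)]
    # The sampler here is a preference table (one weight per candidate
    # per layer), standing in for the sequence-generation model.
    sampler = [{op: 1.0 for op in ops} for ops in search_intervals]

    def sample_child():
        # S2: sample one structure per intermediate layer.
        return [rng.choices(list(p), weights=list(p.values()))[0]
                for p in sampler]

    def validate(child):
        # Stand-in for S3/S4: pretend "conv3 everywhere" is optimal.
        return sum(op == "conv3" for op in child) / len(child)

    best = (0.0, None)
    for _ in range(num_rounds):
        for child in [sample_child() for _ in range(samples_per_round)]:
            acc = validate(child)
            # S4: shift sampler preferences toward high-accuracy choices.
            for layer, op in enumerate(child):
                sampler[layer][op] += acc
            if acc > best[0]:
                best = (acc, child)
    return best  # S5: the structure kept as the search result

best_acc, best_child = search(num_rounds=30, samples_per_round=4)
```

The key design point the sketch mirrors is that the sampler is updated from validation accuracy, not from the child networks' training loss, which is what steps S3 and S4 keep separate.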
In one embodiment, the network structure to be determined includes an input structure and a processing structure. When the network structure to be determined is an input structure, its corresponding structure search interval is the input-structure search interval; when it is a processing structure, its corresponding structure search interval is the processing-structure search interval.
In one embodiment, the sampling structure includes a sequence generation model, an encoder, and a decoder.
The step of sampling multiple times with the sampling structure according to the network structure to be determined in each intermediate layer and its corresponding structure search interval, to obtain a plurality of sub-neural-network structures, comprises:
Step S21: obtain the order in which the network structures of the intermediate layers in the neural network architecture need to be determined.
Step S22: input a preset value into the sequence generation model as its input sequence, and obtain the model's output value.
Step S23: obtain the structure search interval corresponding to the network structure of the intermediate layer currently to be determined.
Step S24: determine the size of that structure search interval according to the network structure of the current intermediate layer, and use the decoder to decode the output value of the sequence generation model into decoded data whose size matches the size of the interval.
Step S25: according to the correspondence between the decoded data and the network structures in the search interval, compute the sampling probability of each network structure in the interval.
Step S26: randomly sample among the network structures in the interval according to those probabilities, thereby determining the network structure of the current intermediate layer.
Step S27: judge whether the network structures of all intermediate layers in the architecture have been determined. If not, encode the just-determined structure with the encoder, feed the encoded data into the sequence generation model, take the next undetermined network structure (following the order from step S21) as the one currently to be determined, and return to step S23. If so, obtain one sub-neural-network structure from the determined structures of all intermediate layers.
Step S28: repeat steps S22 to S27 a preset number of times to obtain a plurality of sub-neural-network structures.
Because the sampling structure determines the sub-neural-network structure in a loop, the intermediate-layer structures of each sub-network are interrelated and the sub-network conforms to the data-processing flow of the task, improving the accuracy of prediction.
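The sampling loop of steps S21-S27 can be sketched as follows. This is an illustrative sketch only: the decoder output is replaced by random scores, the encoder feedback is indicated by a comment, and the decision names are hypothetical.

```python
import random

def sample_structure(num_layers, ops, rng):
    """Toy sketch of S21-S27: walk the decision sequence layer by layer."""
    # S21: the decision order -- processing structure of layer 1, then for
    # each later layer its input structure followed by its processing one.
    order = [("op", 0)]
    for k in range(1, num_layers):
        order += [("input", k), ("op", k)]
    child = {}
    for kind, k in order:
        # S23: search interval of the current decision; an input structure
        # chooses among the outputs of layers 0..k-1, a processing
        # structure among the candidate ops.
        interval = list(range(k)) if kind == "input" else ops
        # S24: a 'decoded' score vector matched to the interval's size
        # (random stand-in for the decoder's output).
        scores = [rng.random() for _ in interval]
        # S25/S26: normalize to probabilities and sample one candidate.
        total = sum(scores)
        child[(kind, k)] = rng.choices(interval,
                                       weights=[s / total for s in scores])[0]
        # S27: a real sampler would encode this choice and feed it back
        # into the sequence model before the next decision.
    return child

rng = random.Random(1)
child = sample_structure(num_layers=3, ops=["conv5", "conv3", "pool3"], rng=rng)
```

Note that a three-layer architecture yields five decisions, since the first layer's input defaults to the input layer, matching the determination order described in the embodiment.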
In one embodiment, the step of performing classification training on the sub-neural-network structures and updating both the parameters of the network structure of each intermediate layer and the parameters of the network structures in each intermediate layer's structure search interval, to obtain updated sub-neural-network structures and updated search intervals, comprises:
Step S31: obtain a training set and randomly divide it into multiple subsets of equal size.
Step S32: take one subset and perform classification training on the sub-neural-network structures, obtaining the cross-entropy loss of each sub-network.
Step S33: from the cross-entropy losses, obtain the parameter gradients of the processing structure of each intermediate layer in each sub-network.
Step S34: across the sub-networks, collect the parameter gradients of the same processing structure in the same intermediate layer, average them, and take the mean as the parameter update gradient of that processing structure in that intermediate layer.
Step S35: according to the update gradients, update the parameters of the same processing structure of the same intermediate layer in all sub-networks, and the parameters of that processing structure in the intermediate layer's processing-structure search interval.
Step S36: judge whether every subset has been used for classification training of the sub-networks. If so, obtain the updated sub-neural-network structures and the updated processing-structure search interval of each intermediate layer; if not, take an unused subset and return to step S32.
Training the sub-neural-network structures subset by subset, and using the averaged gradient of the same processing structure in the same intermediate layer as that structure's update gradient, improves the precision of the parameter update.
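The gradient averaging of steps S34-S35 can be illustrated directly. The sketch below is hypothetical: gradients are scalars for readability, and the dictionary layout is not the patented data structure.

```python
def averaged_update(child_grads):
    """Toy sketch of S34: gradients of the *same* processing structure in
    the *same* intermediate layer, collected across the sampled child
    networks, are averaged into a single update gradient per (layer, op)."""
    buckets = {}  # (layer, op) -> list of gradients from different children
    for grads in child_grads:
        for (layer, op), g in grads.items():
            buckets.setdefault((layer, op), []).append(g)
    # S35 would then apply these averaged gradients both to the children
    # and to the shared parameters kept in the search interval.
    return {key: sum(gs) / len(gs) for key, gs in buckets.items()}

# Two children both used conv3 in layer 0, so its gradients are averaged;
# ops used by only one child keep their single gradient.
g = averaged_update([
    {(0, "conv3"): 0.4, (1, "pool3"): 0.2},
    {(0, "conv3"): 0.8, (1, "conv5"): 0.6},
])
```

Averaging in this way is what lets the shared copy of each processing structure in the search interval accumulate training signal from every child that selected it.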
In one embodiment, the step of performing classification validation on the updated sub-neural-network structures, obtaining their classification accuracy, and updating the parameters of the sampling structure according to that accuracy, comprises:
Step S41: obtain a validation set, and compute the accuracy of each updated sub-neural-network structure on it.
Step S42: compute the loss value of the sampling structure from the accuracies.
Step S43: from the loss value of the sampling structure, obtain the parameter update gradients of the sequence generation model, the encoder, and the decoder.
Step S44: update the parameters of the sequence generation model, the encoder, and the decoder according to those gradients, obtaining the updated sampling structure.
In one embodiment, the step of determining the required neural network structure from the updated sampling structure and the updated structure search interval of each intermediate layer comprises:
Step S51: according to the network structure to be determined in each intermediate layer and its updated structure search interval, sample multiple times with the updated sampling structure to obtain a plurality of sub-neural-network structures.
Step S52: perform classification validation on those sub-networks and obtain the sub-neural-network structure with the highest classification accuracy.
Step S53: train the sub-neural-network structure with the highest classification accuracy, and take the trained structure as the required neural network structure.
Multiple sub-neural-network structures are sampled from the updated sampling structure and the updated search intervals, the one with the highest classification accuracy is identified on the validation set, and that structure is then trained, improving the accuracy of classification.
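The final selection of steps S51-S53 reduces to an argmax over validation accuracy. The sketch below is illustrative: `validate` is a stand-in scoring function, and the final training of step S53 is only indicated in the comment.

```python
def pick_best(children, validate):
    """Toy sketch of S51-S52: among candidates sampled with the updated
    sampler, keep the one with the highest validation accuracy. S53 (not
    shown) would then train that structure to convergence."""
    return max(children, key=validate)

children = [["conv5", "pool3"], ["conv3", "conv3"], ["conv3", "pool3"]]
best = pick_best(children, validate=lambda c: c.count("conv3") / len(c))
```

Training only the single selected structure from scratch in S53, rather than every sampled candidate, is what keeps the final stage cheap relative to the search itself.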
The present invention also provides a neural network structure search system, comprising:
a preset-data acquisition module, configured to obtain a preset neural network architecture and a sampling structure, where the architecture comprises an input layer, an output layer, and a plurality of sequentially arranged intermediate layers, and each intermediate layer comprises a network structure to be determined and a corresponding structure search interval;
a sampling module, configured to sample multiple times with the sampling structure according to the network structure to be determined in each intermediate layer and its structure search interval, obtaining a plurality of sub-neural-network structures;
a training module, configured to perform classification training on the sub-neural-network structures and update both the parameters of the network structure of each intermediate layer and the parameters of the network structures in each search interval, obtaining updated sub-networks and updated search intervals;
a sampling-structure update module, configured to perform classification validation on the updated sub-networks, obtain their classification accuracy, and update the parameters of the sampling structure accordingly, obtaining an updated sampling structure; and
a neural network structure determining module, configured to determine the required neural network structure from the updated sampling structure and the updated structure search interval of each intermediate layer.
The present invention presets a structure search interval for the network structure of each intermediate layer, and the sampling structure automatically searches out the neural network structure best suited to the classification task, without manual adjustment of the structure, saving time and improving efficiency.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the above neural network structure search methods.
The present invention also provides a data processing device, including a memory, a main processor, and a computer program stored in the memory and executable by the main processor, the main processor implementing the steps of any of the above neural network structure search methods when executing the program.
For better understanding and implementation, the present invention is described in detail below with reference to the accompanying drawings.
Detailed description of the invention
Fig. 1 is a schematic block diagram of the application environment of the neural network structure search method in one embodiment of the present invention;
Fig. 2 is a flowchart of the neural network structure search method of the present invention;
Fig. 3 is a schematic diagram of the neural network architecture in one embodiment of the present invention;
Fig. 4 is a flowchart of obtaining multiple sub-neural-network structures in the present invention;
Fig. 5 is a schematic diagram of obtaining a sub-neural-network structure in one embodiment of the present invention;
Fig. 6 is a schematic diagram of a sub-neural-network structure obtained in one embodiment of the present invention;
Fig. 7 is a flowchart of updating the parameters of the network structures in the present invention;
Fig. 8 is a flowchart of updating the parameters of the sampling structure in the present invention;
Fig. 9 is a flowchart of obtaining the required neural network structure in the present invention;
Fig. 10 is a schematic diagram of the neural network structure search system of the present invention;
Fig. 11 is a schematic diagram of the sampling module of the present invention;
Fig. 12 is a schematic diagram of the training module of the present invention;
Fig. 13 is a schematic diagram of the sampling-structure update module of the present invention;
Fig. 14 is a schematic diagram of the neural network structure determining module of the present invention.
Specific embodiment
Referring to Fig. 1, Fig. 1 is a schematic block diagram of the application environment of the neural network structure search method in one embodiment of the present invention. As shown in Fig. 1, the application environment of this embodiment's search method is a data processing device 10, which includes a user input receiver 11, a main processor 12, a memory 13, and a display 14.
The data processing device 10 obtains the user-preset neural network architecture and sampling structure through the user input receiver 11, and the memory 13 stores them; the architecture comprises an input layer, an output layer, and a plurality of sequentially arranged intermediate layers, each of which comprises a network structure to be determined and a corresponding structure search interval. The main processor 12 samples multiple times with the sampling structure according to the network structure to be determined in each intermediate layer and its search interval, obtaining multiple sub-neural-network structures; performs classification training on them, updating both the parameters of the network structure of each intermediate layer and the parameters of the network structures in each search interval; performs classification validation on the updated sub-networks, obtains their classification accuracy, and updates the parameters of the sampling structure accordingly. The main processor 12 then judges whether the above steps have been performed a preset number of times; if not, the steps are repeated, and if so, the main processor 12 determines the required neural network structure from the updated sampling structure and the network structures in the updated search interval of each intermediate layer, and displays it on the display 14.
The user input receiver 11 receives data input by the user, including the preset neural network architecture, sampling structure, and preset number of repetitions; it may be a keyboard, a mouse, or the like.
The main processor 12 is implemented by at least one hardware processor that executes the operations of the data processing device 10; it may be a desktop computer, a portable computer, one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components.
The memory 13 can store the information input to and/or displayed by the data processing device 10. The memory provided for data storage in the device may be implemented as nonvolatile memory, such as flash memory. The memory 13 may also store all data generated by the main processor 12 during processing. The stored data are not limited to the above examples; the memory 13 may store any information the data processing device 10 needs to perform its operations.
The display 14 can display the user-preset data obtained by the user input receiver 11, such as the neural network architecture, sampling structure, and preset number of repetitions, for the user to check; it can also display processing results of the main processor 12, such as the found neural network structure. The type of the display 14 is not limited; it may take various forms, such as a plasma display panel (PDP), a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or a flexible display.
The components included in the data processing device 10 are not limited to the exemplary embodiment described above.
Referring to Fig. 2, the neural network structure search method of the present invention includes the following steps:
Step S1: obtain a preset neural network architecture and a sampling structure, where the architecture comprises an input layer, an output layer, and a plurality of sequentially arranged intermediate layers, and each intermediate layer comprises a network structure to be determined and a corresponding structure search interval.
The input layer of the architecture receives the data to be processed; the intermediate layers are the layer structures that process the data, and as many as needed may be provided; the output layer outputs the data after processing by the intermediate layers. For example, referring to Fig. 3, the architecture includes an input layer, an output layer, and three sequentially arranged intermediate layers: the first, second, and third intermediate layers.
The network structure to be determined includes an input structure and a processing structure. When the network structure to be determined is an input structure, its corresponding search interval is the input-structure search interval; when it is a processing structure, its corresponding search interval is the processing-structure search interval.
The input structure is the input source of an intermediate layer, and the input-structure search interval is the set of all possible input sources of that layer. An intermediate layer may take as input not only the output of the layer immediately before it but also the output of any earlier layer; thus the input-structure search intervals differ between layers, and the later a layer is arranged, the more input sources it has and the larger its interval. Specifically, if the sequentially arranged intermediate layers are denoted the first layer, the second layer, the third layer, ..., the K-th layer, then the input-structure search interval of the first layer defaults to [output of the input layer]; that of the second layer is [output of the first layer]; that of the third layer is [output of the first layer, output of the second layer]; ...; and that of the K-th layer is [output of the first layer, output of the second layer, output of the third layer, ..., output of the (K-1)-th layer].
The processing structure is the layer structure with which an intermediate layer processes its input data, and the processing-structure search space is the set of all layer structures that could process that input. A layer structure processes the input data with particular calculation parameters and calculation functions; it may be a convolutional layer with a given convolution kernel, a max-pooling layer with a given pooling kernel, an average-pooling layer with a given pooling kernel, a pass-through layer that transmits the input without processing, or any other structure that can process the input data. The calculation parameters of a layer structure mainly comprise weight parameters. Further, the layer structure matches the signal dimensionality of the input data: a one-dimensional processing layer structure for one-dimensional signals, a two-dimensional one for two-dimensional signals, and a three-dimensional one for three-dimensional signals. To reduce computation and thereby determine the neural network structure easily and quickly, the processing-structure search space of every intermediate layer is the same. In this embodiment, to improve search efficiency, processing structures known from the literature to perform poorly are excluded, and the processing-structure search space is set to [convolutional layer with kernel size 5, convolutional layer with kernel size 3, max-pooling layer with pooling kernel 3, average-pooling layer with pooling kernel 5, pass-through layer]. For one-dimensional input, the search space is set to the one-dimensional variants of these same five layers.
Step S2: according to the network structure to be determined in each intermediate layer and its corresponding structure search interval, sample multiple times with the sampling structure to obtain a plurality of sub-neural-network structures.
Each sub-neural-network structure comprises an input layer, an output layer, and a plurality of sequentially arranged intermediate layers, each intermediate layer having a determined network structure.
Referring to Figs. 4 to 6, in one embodiment the sampling structure includes a sequence generation model, an encoder, and a decoder, and the step of sampling multiple times with the sampling structure to obtain a plurality of sub-neural-network structures comprises:
Step S21: obtain the order in which the network structures of the intermediate layers in the neural network architecture need to be determined.
The network structures are determined in the sequential order of the intermediate layers. Specifically, if the intermediate layers are denoted the first layer, the second layer, the third layer, ..., the K-th layer, the input of the first layer defaults to the output of the input layer, and the output of the K-th layer defaults to the input of the output layer; the determination order is then: processing structure of the first layer, input structure of the second layer, processing structure of the second layer, input structure of the third layer, processing structure of the third layer, ..., input structure of the K-th layer, processing structure of the K-th layer.
Step S22: input a preset value into the sequence generation model, obtaining an output value of the sequence generation model.
For ease of computation, the preset value may be an all-zero vector input into the sequence generation model, from which the output of the sequence generation model is obtained.
The sequence generation model may be a hidden Markov model, a recurrent neural network model, a long short-term memory network, or any other model capable of generating sequences. The sequence generation model contains weight parameters and a computation function; initially, random values are assigned to the weight parameters. After the preset value is input into the sequence generation model, the preset value and the weight parameters of the sequence generation model are combined by the computation function, thereby yielding the output value of the sequence generation model.
Step S23: obtain the search structure section corresponding to the network structure of the middle layer currently to be determined.
Step S24: according to the search structure section corresponding to the network structure of the middle layer currently to be determined, determine the size of that structure search space, and decode, via the decoder, the output value of the sequence generation model into decoded data whose size matches the size of the search structure section.
For example, if the search structure section corresponding to the middle layer currently to be determined is the input search section of the K-th layer, i.e. [the output of the first layer, the output of the second layer, the output of the third layer, ..., the output of the (K-1)-th layer], then the size of the input-structure search section is K-1, and the output of the sequence generation model is decoded into a vector of length K-1 as the decoded data.
Step S25: according to the correspondence between the decoded data and the network structures in the search structure section, calculate the acquisition probability of each network structure in the search structure section.
The correspondence between the decoded data and each input structure or processing structure can be preset. For example, if the network structure of the middle layer currently to be determined is the input structure of the fourth layer, and the decoded data obtained by the decoder is [x1, x2, x3], one may preset x1 as the output of the first layer, x2 as the output of the second layer and x3 as the output of the third layer; alternatively, one may preset x1 as the output of the second layer, x2 as the output of the third layer and x3 as the output of the first layer, and so on. Once preset, the correspondence is no longer changed.
The acquisition probability of each input structure or processing structure is calculated by the softmax function. For example, if the middle layer currently to be determined is the input structure of the fourth layer, the decoded data obtained by the decoder is X = [x1, x2, x3], and x1 corresponds to the output of the first layer, x2 to the output of the second layer and x3 to the output of the third layer, then the probability that the input structure of this middle layer is the output of the first, second or third layer is calculated as follows:

$$P(j) = \frac{e^{x_j}}{\sum_{i=1}^{3} e^{x_i}}, \quad j = 1, 2, 3$$
Step S26: randomly sample among the network structures in the search structure section according to the acquisition probabilities, determining the network structure of the middle layer currently to be determined.
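Steps S25 and S26 can be sketched as follows; this is a minimal illustration using NumPy, and the function names and example values are illustrative, not taken from the patent:

```python
import numpy as np

def acquisition_probabilities(decoded):
    """Softmax over the decoded data: one acquisition probability per candidate."""
    e = np.exp(decoded - np.max(decoded))   # shift by the max for numerical stability
    return e / e.sum()

def sample_structure(decoded, candidates, rng=np.random.default_rng(0)):
    """Step S26: randomly pick one candidate according to its acquisition probability."""
    p = acquisition_probabilities(np.asarray(decoded, dtype=float))
    return candidates[rng.choice(len(candidates), p=p)]

# Input structure of the 4th layer: decoded data X = [x1, x2, x3],
# candidates are the outputs of layers 1 to 3.
p = acquisition_probabilities(np.array([0.2, 1.5, -0.3]))
choice = sample_structure([0.2, 1.5, -0.3], ["layer1_out", "layer2_out", "layer3_out"])
```

Candidates with larger decoded values are sampled more often, but every candidate retains a nonzero probability, which keeps the search exploratory.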
Step S27: judge whether the network structures of all middle layers in the neural network framework have been determined. If not, encode the network structure of the currently determined middle layer via the encoder, input the encoded data into the sequence generation model, and, following the order in which the network structures of the middle layers need to be determined, take the network structure of the next middle layer to be determined as the network structure of the middle layer currently to be determined, returning to step S23. If all have been determined, obtain a sub-neural network structure from the network structures determined for the middle layers.
If the multiple sequentially ordered middle layers are denoted the first layer, the second layer, the third layer, ..., the K-th layer, the order in which the network structures of the middle layers are determined is: the processing structure of the first layer, the input structure of the second layer, the processing structure of the second layer, the input structure of the third layer, the processing structure of the third layer, ..., the input structure of the K-th layer, the processing structure of the K-th layer. That is, at the start the processing structure of the first layer is the network structure of the middle layer currently to be determined; once it has been determined, the input structure of the second layer becomes the network structure of the middle layer currently to be determined; once that has been determined, the processing structure of the second layer becomes the network structure of the middle layer currently to be determined; and so on, until the network structures of all middle layers have been determined.
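The alternating loop of steps S22 to S27 can be sketched as a runnable toy example. The random matrix `W` stands in for the sequence generation model and slicing the state stands in for the decoder; both are illustrative assumptions, not the patent's actual encoder/decoder:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sample_subnetwork(num_layers, ops, hidden=8):
    """One pass of steps S22-S27: alternately pick each layer's input and operation."""
    W = rng.standard_normal((hidden, hidden)) * 0.1   # stand-in sequence generation model
    b = rng.standard_normal(hidden) * 0.1
    state = np.zeros(hidden)                          # all-zero preset value (step S22)
    arch = []
    for layer in range(1, num_layers + 1):
        if layer > 1:                                  # input structure: choose one earlier layer
            state = np.tanh(W @ state + b)
            probs = softmax(state[: layer - 1])        # decoded to the search section's size
            arch.append(("input", layer, int(1 + rng.choice(layer - 1, p=probs))))
        state = np.tanh(W @ state + b)                 # processing structure: choose an operation
        probs = softmax(state[: len(ops)])
        arch.append(("op", layer, ops[rng.choice(len(ops), p=probs)]))
    return arch

arch = sample_subnetwork(3, ["conv5", "conv3", "maxpool3", "avgpool5", "transport"])
```

Each decision is fed back into the model's state, which is what makes the sampled middle-layer structures interrelated rather than independent.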
Referring to Fig. 6, for a neural network whose framework comprises an input layer, a first middle layer, a second middle layer, a third middle layer and an output layer, with the processing-structure search section of each middle layer being [a one-dimensional convolutional layer with convolution kernel 5, a one-dimensional convolutional layer with convolution kernel 3, a one-dimensional max-pooling layer with pooling kernel 3, a one-dimensional average-pooling layer with pooling kernel 5, a transport layer that passes the data through], one sub-neural network structure obtained by sampling via the sampling structure is: an input layer, a one-dimensional convolutional layer with convolution kernel 3, a one-dimensional convolutional layer with convolution kernel 3, a one-dimensional max-pooling layer with pooling kernel 3, and an output layer.
Step S28: repeat steps S23 to S27; after the preset number of repetitions is reached, multiple sub-neural network structures are obtained.
The number of repetitions of steps S23 to S27 can be preset according to the number of sub-neural network structures required. Specifically, the preset number ranges from 2 to 100. If the required neural network structure must be determined quickly, the preset number can be set small, e.g. 5 or 6; if the required neural network structure must be determined accurately, the preset number can be set large, e.g. N = 80 or 90.
To avoid training each newly sampled sub-neural network from scratch in the subsequent training, and thereby save training time, the same processing structure in the same middle layer shares parameters across the multiple sub-neural network structures. For example, suppose the processing structures of the first, second and third middle layers of the first sub-neural network structure are a one-dimensional convolutional layer with convolution kernel 5, a one-dimensional convolutional layer with convolution kernel 3 and a one-dimensional convolutional layer with convolution kernel 5, respectively; those of the second sub-neural network structure are a one-dimensional convolutional layer with convolution kernel 5, a one-dimensional average pooling with pooling kernel 5 and a one-dimensional max-pooling layer with pooling kernel 3; and those of the third sub-neural network structure are a one-dimensional average pooling with pooling kernel 5, a one-dimensional average pooling with pooling kernel 5 and a one-dimensional convolutional layer with convolution kernel 3. Then the parameters of the one-dimensional convolutional layer with convolution kernel 5 in the first middle layer are identical between the first and second sub-neural network structures, and the parameters of the one-dimensional average pooling with pooling kernel 5 in the second middle layer are identical between the second and third sub-neural network structures.
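The sharing scheme above amounts to keying each parameter tensor by its (middle layer, processing structure) pair, so that every sub-network using that pair reads and writes the same tensor. A minimal sketch, with hypothetical names and a toy parameter shape:

```python
import numpy as np

rng = np.random.default_rng(0)
shared = {}  # (middle_layer_index, op_name) -> shared parameter array

def get_params(layer, op, shape=(3,)):
    """Return the shared parameters for this (layer, op) pair, creating them once.
    Initial values are random, as in the text."""
    key = (layer, op)
    if key not in shared:
        shared[key] = rng.standard_normal(shape)
    return shared[key]

# Two sub-networks that both use a kernel-5 conv in middle layer 1
# receive the very same parameter array (not a copy):
p1 = get_params(1, "conv5", shape=(5,))
p2 = get_params(1, "conv5", shape=(5,))
```

Updating `p1` in place during training therefore updates the layer for every sub-network that shares it, which is exactly what removes the need to retrain each sampled sub-network from scratch.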
The sub-neural network structures are determined automatically by the sampling structure in a loop, so that the middle-layer structures of each sub-neural network structure are interrelated and conform to the data-processing flow of the sub-neural network structure, which improves the accuracy of prediction.
Step S3: perform classification training on the multiple sub-neural network structures, and update the parameters of the network structures of the middle layers in the multiple sub-neural network structures as well as the parameters of the network structures in the search structure sections of the middle layers, obtaining updated multiple sub-neural network structures and updated search structure sections of the middle layers.
Initially, the parameters of the network structures in the search structure section of each middle layer are randomly assigned values, and consequently the parameters of the network structures of the middle layers in the sub-neural network structures are also randomly assigned values.
Referring to Fig. 7, in one embodiment, the step of performing classification training on the multiple sub-neural network structures and updating the parameters of the network structures of the middle layers in the multiple sub-neural network structures as well as the parameters of the network structures in the search structure sections of the middle layers, obtaining updated multiple sub-neural network structures and updated search structure sections of the middle layers, comprises:
Step S31: obtain a training set, and randomly divide the training set into multiple subsets of equal size.
A plurality of sequence signals is cut into multiple subsequence signals of equal length, and these subsequence signals are divided into a training set and a validation set according to a preset ratio. For example, if the sequence signals are one-dimensional with C channels and each cut subsequence has length N, the format of the cut signals is N x C; one part of the subsequences serves as the training set and another part serves as the validation set.
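The cutting and splitting above can be sketched as follows; the function name, the 80/20 ratio and the toy signal are illustrative assumptions:

```python
import numpy as np

def cut_and_split(signal, n, train_ratio=0.8, rng=np.random.default_rng(0)):
    """Cut a (length, C) sequence signal into (N x C) subsequences and split them
    randomly into a training part and a validation part."""
    length, _ = signal.shape
    pieces = [signal[i:i + n] for i in range(0, length - n + 1, n)]
    pieces = np.stack(pieces)               # shape: (num_pieces, N, C)
    idx = rng.permutation(len(pieces))
    cut = int(train_ratio * len(pieces))
    return pieces[idx[:cut]], pieces[idx[cut:]]

signal = np.arange(40.0).reshape(20, 2)     # 20 time steps, C = 2 channels
train, val = cut_and_split(signal, n=5)     # subsequences of length N = 5
```

With 20 time steps and N = 5 this yields four non-overlapping subsequences, three of which go to the training set and one to the validation set.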
Step S32: take one of the subsets and perform classification training on the multiple sub-neural network structures, obtaining the cross-entropy loss of each sub-neural network structure.
The cross-entropy loss is calculated as:

$$L = -\log\frac{e^{x[j_{\mathrm{true}}]}}{\sum_{j=1}^{C} e^{x[j]}}$$

In the above formula, L is the cross-entropy loss; C is the number of classes; x[j_true] is the prediction for data x on its correct class when classifying with the sub-neural network structure; x[j] is the prediction for data x on the j-th class when classifying with the sub-neural network structure; and log(·) takes the logarithm, base the natural constant e, of its argument.
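The loss above is the standard softmax cross-entropy; a direct, numerically stable implementation (the shift by the maximum is an implementation detail, not part of the formula):

```python
import math

def cross_entropy(x, j_true):
    """L = -log( exp(x[j_true]) / sum_j exp(x[j]) ), natural log."""
    m = max(x)                                            # shift for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in x))
    return -(x[j_true] - log_z)

# Three-class prediction where class 0 is the correct class:
loss = cross_entropy([2.0, 1.0, 0.1], j_true=0)
```

The loss is small when the prediction on the correct class dominates the others and grows as probability mass shifts to the wrong classes.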
Step S33: according to the cross-entropy losses of the multiple sub-neural network structures, obtain the parameter gradients of the processing structures of the middle layers in the multiple sub-neural network structures.
Step S34: in the multiple sub-neural network structures, obtain the parameter gradients of the same processing structure of the same middle layer, average these parameter gradients, and take the resulting mean as the parameter update gradient of that processing structure of that middle layer.
The parameter update gradient of the same processing structure of the same middle layer is calculated as:

$$\nabla_w \mathbb{E}_{m\sim\pi}\left[L(m, w)\right] \approx \frac{1}{M}\sum_{i=1}^{M} \nabla_w L(m_i, w)$$

In the above formula, w denotes the parameters of the same processing structure of the same middle layer, for example weight parameters; ∇_w denotes the gradient with respect to w; E_{m~π}(L(m, w)) denotes the expected cross-entropy loss over the sub-neural network structures determined by the sampling structure π; M denotes the number of sub-neural networks containing the same processing structure of the same middle layer; m_i denotes the i-th sub-neural network structure; and ∇_w L(m_i, w) denotes the gradient of the cross-entropy loss of the i-th sub-neural network structure with respect to the same processing structure of the same middle layer.
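The averaging in step S34 is a single mean over the per-sub-network gradients of one shared structure; a minimal sketch with toy gradient values:

```python
import numpy as np

def parameter_update_gradient(grads):
    """Average the per-sub-network gradients of one shared processing structure:
    (1/M) * sum_i grad_i, an estimate of the expected-loss gradient over the
    sub-networks sampled by the sampling structure."""
    return np.mean(np.stack(grads), axis=0)

# Gradients of one shared layer obtained from two different sub-networks:
g = parameter_update_gradient([np.array([0.2, -0.4, 0.6]),
                               np.array([0.0,  0.4, 0.2])])
# the shared parameters are then updated with g, e.g. w -= lr * g
```

Because every sub-network containing the structure contributes one gradient, structures that appear in many sampled sub-networks receive a lower-variance update.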
Step S35: according to the parameter update gradients, update the parameters of the same processing structure of the same middle layer in the multiple sub-neural network structures and the parameters of the same processing structure in the processing-structure search section of the same middle layer.
For example, suppose the processing structures of the first, second and third middle layers of the first sub-neural network structure are a one-dimensional convolutional layer with convolution kernel 5, a one-dimensional convolutional layer with convolution kernel 3 and a one-dimensional convolutional layer with convolution kernel 5, respectively; those of the second sub-neural network structure are a one-dimensional convolutional layer with convolution kernel 5, a one-dimensional average pooling with pooling kernel 5 and a one-dimensional max-pooling layer with pooling kernel 3; and those of the third sub-neural network structure are a one-dimensional average pooling with pooling kernel 5, a one-dimensional average pooling with pooling kernel 5 and a one-dimensional convolutional layer with convolution kernel 3. The processing structures of the first middle layer of the first and second sub-neural network structures are then identical: take the gradients of the cross-entropy losses of these two sub-neural network structures with respect to the one-dimensional convolutional layer with convolution kernel 5, compute the mean of the two parameter gradients, and use that mean as the parameter update gradient of the one-dimensional convolutional layer with convolution kernel 5 of the first middle layer. According to this parameter update gradient, the parameters of that convolutional layer in the first middle layer of the first sub-neural network structure are updated, the parameters of that convolutional layer in the first middle layer of the second sub-neural network structure are updated, and, at the same time, the one-dimensional convolutional layer with convolution kernel 5 in the processing-structure search section of the first middle layer is updated.
Step S36: judge whether all subsets have been used for classification training of the multiple sub-neural network structures. If so, obtain the updated multiple sub-neural network structures and the updated processing-structure search sections of the middle layers; if not, take an unused subset and return to step S32.
Classification training is performed on the sub-neural network structures subset by subset, and the mean of the parameter gradients of the same processing structure of the same middle layer is used as the parameter update gradient of that processing structure, which improves the precision of the parameter update.
Step S4: perform classification validation on the updated multiple sub-neural network structures, obtain the classification accuracy of the updated multiple sub-neural network structures, and update the parameters of the sampling structure according to the accuracy, obtaining an updated sampling structure.
Initially, the parameters of the sampling structure are randomly assigned values.
Referring to Fig. 8, in one embodiment, the step of performing classification validation on the updated multiple sub-neural network structures, obtaining the classification accuracy of the updated multiple sub-neural network structures, and updating the parameters of the sampling structure according to the accuracy, obtaining an updated sampling structure, comprises:
Step S41: obtain a validation set, and calculate the accuracy of the updated multiple sub-neural network structures on the validation set.
Step S42: according to the accuracy, calculate the loss value of the sampling structure.
The loss value of the sampling structure is calculated as:

$$L = -\frac{1}{N}\sum_{k=1}^{N} \log P(k) \cdot R_k$$

In the above formula, L denotes the loss value; N denotes the number of sub-neural networks; P(k) denotes the probability that the sampling structure samples the k-th sub-neural network structure, determined by combining the outputs of the sampling structure at each time node; R_k denotes the accuracy of the k-th sub-neural network; and log(·) takes the logarithm, base the natural constant e, of its argument.
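The loss above has the shape of a policy-gradient (REINFORCE-style) objective: sub-networks with higher validation accuracy push the sampling structure toward assigning them higher probability. A minimal sketch with toy probabilities and accuracies:

```python
import math

def sampler_loss(probs, accuracies):
    """L = -(1/N) * sum_k log P(k) * R_k.
    P(k): probability the sampler produced sub-network k; R_k: its accuracy."""
    n = len(probs)
    return -sum(r * math.log(p) for p, r in zip(probs, accuracies)) / n

# Two sampled sub-networks with sampling probabilities 0.5 and 0.25
# and validation accuracies 0.9 and 0.6:
loss = sampler_loss([0.5, 0.25], [0.9, 0.6])
```

Minimizing this loss with respect to the sampler's parameters raises log P(k) in proportion to R_k, so accurate sub-networks become more likely to be sampled in the next round.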
Step S43: according to the loss value of the sampling structure, obtain the parameter update gradient of the sequence generation model, the parameter update gradient of the encoder and the parameter update gradient of the decoder.
The parameter update gradient of the sequence generation model is calculated as:

$$\nabla_\theta L = \frac{\partial L}{\partial \theta}$$

In the above formula, θ denotes a parameter of the sequence generation model, for example a weight parameter, and ∇_θ L denotes the gradient of the loss value L with respect to θ.

The parameter update gradients of the encoder weights and the decoder parameters are calculated in the same way:

$$\nabla_w L = \frac{\partial L}{\partial w}$$

In the above formula, w denotes a weight of the encoder or a parameter of the decoder, for example a weight parameter, and ∇_w L denotes the gradient of the loss value L with respect to w.
Step S44: according to the parameter update gradient of the sequence generation model, the parameter update gradient of the encoder and the parameter update gradient of the decoder, update the parameters of the sequence generation model, the encoder and the decoder, obtaining an updated sampling structure.
Step S5: repeat steps S2 to S4 until a preset number of repetitions is reached, then determine the required neural network structure according to the updated sampling structure and the updated search structure sections of the middle layers.
The number of repetitions of steps S2 to S4 can be preset as needed. Specifically, the preset number ranges from 2 to 100. If the required neural network structure must be determined quickly, the preset number can be set small, e.g. 5 or 6; if the required neural network structure must be determined accurately, the preset number can be set large, e.g. N = 80 or 90.
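The outer loop of steps S2 to S5 can be sketched end to end. Every helper below is a deliberately crude, runnable stand-in (the validator simply favors convolution-heavy sub-networks, and the sampler update just rewards operations that appeared in accurate sub-networks); none of it is the patent's actual training code:

```python
import random

random.seed(0)
OPS = ["conv5", "conv3", "maxpool3", "avgpool5", "transport"]

def sample_subnetworks(weights, n=4):
    """Step S2 stand-in: sample n sub-networks, one op per middle layer (3 layers)."""
    return [tuple(random.choices(OPS, weights=weights, k=3)) for _ in range(n)]

def validate(subnet):
    """Step S4 stand-in: pretend convolution-heavy sub-networks classify better."""
    return sum(op.startswith("conv") for op in subnet) / 3

def update_sampler(weights, subnets, accs):
    """Crude stand-in for steps S43/S44: reward ops seen in accurate sub-networks."""
    for subnet, acc in zip(subnets, accs):
        for op in subnet:
            weights[OPS.index(op)] += acc
    return weights

weights = [1.0] * len(OPS)
for _ in range(10):                      # step S5: repeat S2-S4 a preset number of times
    subnets = sample_subnetworks(weights)
    accs = [validate(s) for s in subnets]
    weights = update_sampler(weights, subnets, accs)

best = max(sample_subnetworks(weights), key=validate)   # keep the most accurate sample
```

Even with these toy stand-ins, the structure of the method is visible: sample, train/evaluate, reward the sampler, repeat, then keep the best sampled architecture for final training.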
Referring to Fig. 9, in one embodiment, the step of determining the required neural network structure according to the updated sampling structure and the updated search structure sections of the middle layers comprises:
Step S51: according to the network structure to be determined in each middle layer and the updated search structure section corresponding to that network structure, sample multiple times via the updated sampling structure, obtaining multiple sub-neural network structures;
Step S52: perform classification validation on the multiple sub-neural network structures, obtaining the sub-neural network structure with the highest classification accuracy.
Step S53: train the sub-neural network structure with the highest classification accuracy, and take the trained sub-neural network structure as the required neural network structure.
When training the sub-neural network structure with the highest classification accuracy, the parameters of each network structure in that sub-neural network structure are randomly assigned values, and the sub-neural network structure is then trained on the training set, yielding the required neural network structure.
The present invention presets, for the network structure of each middle layer in the neural network structure, a corresponding search structure section, and automatically searches out, via the sampling structure, the neural network structure that best fits the classification task, without manually tuning the neural network structure, which saves time and improves efficiency.
Referring to Fig. 10, the present invention also provides a search system 20 for neural network structures, comprising:
a preset-data acquisition module 21, configured to obtain a preset neural network framework and a sampling structure, wherein the neural network framework includes an input layer, an output layer and multiple sequentially ordered middle layers, and each middle layer includes a network structure to be determined and a search structure section corresponding to that network structure;
a sampling module 22, configured to sample multiple times, via the sampling structure, according to the network structure to be determined in each middle layer and the search structure section corresponding to that network structure, obtaining multiple sub-neural network structures;
a training module 23, configured to perform classification training on the multiple sub-neural network structures and update the parameters of the network structures of the middle layers in the multiple sub-neural network structures as well as the parameters of the network structures in the search structure sections of the middle layers, obtaining updated multiple sub-neural network structures and updated search structure sections of the middle layers;
a sampling structure update module 24, configured to perform classification validation on the updated multiple sub-neural network structures, obtain the classification accuracy of the updated multiple sub-neural network structures, and update the parameters of the sampling structure according to the accuracy, obtaining an updated sampling structure;
a neural network structure determination module 25, configured to determine the required neural network structure according to the updated sampling structure and the updated search structure sections of the middle layers.
The present invention presets, for the network structure of each middle layer in the neural network structure, a corresponding search structure section, and automatically searches out, via the sampling structure, the neural network structure that best fits the classification task, without manually tuning the neural network structure, which saves time and improves efficiency.
Please refer to Fig. 11. In one embodiment, the sampling module 22 includes:
a network structure order determination unit 221, configured to obtain the order in which the network structures of the middle layers in the neural network framework need to be determined;
an output value acquisition unit 222, configured to input a preset value into the sequence generation model, obtaining an output value of the sequence generation model;
a search structure section acquisition unit 223, configured to obtain the search structure section corresponding to the network structure of the middle layer currently to be determined;
a decoding unit 224, configured to determine, according to the search structure section corresponding to the network structure of the middle layer currently to be determined, the size of that structure search space, and to decode, via the decoder, the output value of the sequence generation model into decoded data whose size matches the size of the search structure section;
an acquisition probability calculation unit 225, configured to calculate the acquisition probability of each network structure in the search structure section according to the correspondence between the decoded data and the network structures in the search structure section;
a network structure determination unit 226, configured to randomly sample among the network structures in the search structure section according to the acquisition probabilities, determining the network structure of the middle layer currently to be determined;
a judging unit 227, configured to judge whether the network structures of all middle layers in the neural network framework have been determined; if not, to encode the network structure of the currently determined middle layer via the encoder, input the encoded data into the sequence generation model and, following the order in which the network structures of the middle layers need to be determined, take the network structure of the next middle layer to be determined as the network structure of the middle layer currently to be determined, continuing to determine the network structure of the current middle layer; if all have been determined, to obtain a sub-neural network structure from the network structures determined for the middle layers;
a sub-neural network structure acquisition unit 228, configured to obtain multiple sub-neural network structures.
The sub-neural network structures are determined automatically by the sampling structure in a loop, so that the middle-layer structures of each sub-neural network structure are interrelated and conform to the data-processing flow of the sub-neural network structure, which improves the accuracy of prediction.
Please refer to Fig. 12. In one embodiment, the training module 23 includes:
a training set division unit 231, configured to obtain a training set and randomly divide it into multiple subsets of equal size;
a cross-entropy loss calculation unit 232, configured to take one of the subsets and perform classification training on the multiple sub-neural network structures, obtaining the cross-entropy loss of each sub-neural network structure;
a parameter gradient calculation unit 233, configured to obtain, according to the cross-entropy losses of the multiple sub-neural network structures, the parameter gradients of the processing structures of the middle layers in the multiple sub-neural network structures;
a parameter update gradient calculation unit 234, configured to obtain the parameter gradients of the same processing structure of the same middle layer in the multiple sub-neural network structures, average these parameter gradients, and take the resulting mean as the parameter update gradient of that processing structure of that middle layer;
an updating unit 235, configured to update, according to the parameter update gradients, the parameters of the same processing structure of the same middle layer in the multiple sub-neural network structures and the parameters of the same processing structure in the processing-structure search section of the same middle layer;
a classification judging unit 236, configured to judge whether all subsets have been used for classification training of the multiple sub-neural network structures; if so, to obtain the updated multiple sub-neural network structures and the updated processing-structure search sections of the middle layers; if not, to take an unused subset and continue classification training of the multiple sub-neural network structures.
Classification training is performed on the sub-neural network structures subset by subset, and the mean of the parameter gradients of the same processing structure of the same middle layer is used as the parameter update gradient of that processing structure, which improves the precision of the parameter update.
Please refer to Fig. 13. In one embodiment, the sampling structure update module 24 includes:
an accuracy acquisition unit 241, configured to obtain a validation set and calculate the accuracy of each updated sub-neural network structure on the validation set;
a loss value calculation unit 242, configured to calculate the loss value of the sampling structure according to the accuracy;
a gradient calculation unit 243, configured to obtain, according to the loss value of the sampling structure, the parameter update gradient of the sequence generation model, the parameter update gradient of the encoder and the parameter update gradient of the decoder;
a sampling structure updating unit 244, configured to update the parameters of the sequence generation model, the encoder and the decoder according to their respective parameter update gradients, obtaining an updated sampling structure.
Please refer to Fig. 14. In one embodiment, the neural network structure determination module 25 includes:
a sub-neural network structure sampling unit 251, configured to sample multiple times, via the updated sampling structure, according to the network structure to be determined in each middle layer and the updated search structure section corresponding to that network structure, obtaining multiple sub-neural network structures;
a sub-neural network structure acquisition unit 252, configured to perform classification validation on the multiple sub-neural network structures, obtaining the sub-neural network structure with the highest classification accuracy;
a neural network structure acquisition unit 253, configured to train the sub-neural network structure with the highest classification accuracy and take the trained sub-neural network structure as the required neural network structure.
Multiple sub-neural network structures are obtained via the sampling structure from the updated search structure sections corresponding to the network structures; the sub-neural network structure with the highest classification accuracy is then determined via the validation set; finally, that sub-neural network structure is trained, which improves the accuracy of classification.
The present invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of any of the neural network structure search methods described above.
The present invention may take the form of a computer program product implemented on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing program code. Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device.
The present invention also provides a data processing device, including a memory, a main processor, and a computer program stored in the memory and executable by the main processor; when executing the computer program, the main processor implements the steps of any of the neural network structure search methods described above.
The memory stores the obtained preset neural network architecture and the sampling structure, wherein the neural network architecture comprises an input layer, an output layer, and a plurality of sequentially arranged middle layers; each middle layer comprises a network structure to be determined and the search structure section corresponding to that network structure. The primary processor performs multiple rounds of sampling with the sampling structure, according to the network structure to be determined in each middle layer and its corresponding search structure section, to obtain a plurality of sub-neural-network structures; performs classification training on the plurality of sub-neural-network structures, and updates the parameters of the network structure of each middle layer in the plurality of sub-neural-network structures and the parameters of the network structures in the search structure section of each middle layer; then performs classification verification on the updated plurality of sub-neural-network structures to obtain their classification accuracy, and updates the parameters of the sampling structure according to the accuracy. At this point, the primary processor judges whether the above steps have been repeated a preset number of times; if not, the above steps are repeated; if the preset number has been reached, the primary processor determines the required neural network structure according to the updated sampling structure and the network structures in the search structure section of each middle layer.
The primary processor, which executes the operations of the data processing device, is implemented by at least one hardware processor, and may be a desktop computer, a portable computer, one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components.
The memory may store information input at and/or displayed by the data processing device. As the memory providing data storage in the data processing device, it may be implemented as a nonvolatile memory such as flash memory. It may take the form of a computer program product embodied on one or more storage media containing program code (including but not limited to disk memory, CD-ROM, optical memory, and the like). Computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. The memory may also store all data generated by the primary processor during processing. The data stored in the memory is not limited to the above examples; the memory may store all information the data processing device needs, or needs for executing its operations.
In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
The embodiments described above express only several embodiments of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the inventive concept, and these all fall within the protection scope of the present invention.

Claims (19)

1. A searching method of a neural network structure, comprising the following steps:
Step S1: obtaining a preset neural network architecture and a sampling structure; wherein the neural network architecture comprises an input layer, an output layer, and a plurality of sequentially arranged middle layers, and each middle layer comprises a network structure to be determined and the search structure section corresponding to that network structure;
Step S2: performing multiple rounds of sampling with the sampling structure, according to the network structure to be determined in each middle layer and its corresponding search structure section, to obtain a plurality of sub-neural-network structures;
Step S3: performing classification training on the plurality of sub-neural-network structures, and updating the parameters of the network structure of each middle layer in the plurality of sub-neural-network structures and the parameters of the network structures in the search structure section of each middle layer, to obtain the updated plurality of sub-neural-network structures and the updated search structure section of each middle layer;
Step S4: performing classification verification on the updated plurality of sub-neural-network structures to obtain their classification accuracy, and updating the parameters of the sampling structure according to the accuracy, to obtain an updated sampling structure;
Step S5: repeating steps S2 to S4; after a preset number of repetitions is reached, determining the required neural network structure according to the updated sampling structure and the updated search structure section of each middle layer.
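Steps S1 to S5 describe a controller-based architecture-search loop. The following minimal sketch (plain Python; the function names, the toy candidate labels, and the stand-in `train_and_verify` scoring are hypothetical, not part of the claims) shows only the control flow of the claimed method: sample sub-networks, score them, feed the score back into the sampler, repeat a preset number of times, then read off the final structure.

```python
import random

random.seed(0)  # for reproducibility of this sketch

# Toy search structure section shared by every middle layer (claim 13 lists
# the actual candidate operations); here the candidates are just labels.
CANDIDATES = ["conv5", "conv3", "maxpool3", "avgpool5", "identity"]

def sample_subnets(probs, n_layers, n_subnets):
    """Step S2: sample one operation per middle layer for each sub-network."""
    return [
        [random.choices(CANDIDATES, weights=probs)[0] for _ in range(n_layers)]
        for _ in range(n_subnets)
    ]

def train_and_verify(subnet):
    """Stand-in for steps S3-S4: returns a fake verification accuracy."""
    return sum(op != "identity" for op in subnet) / len(subnet)

def search(n_layers=4, n_subnets=3, preset_times=10):
    probs = [1.0] * len(CANDIDATES)          # Step S1: start from a uniform sampler
    for _ in range(preset_times):            # Step S5: repeat S2-S4
        subnets = sample_subnets(probs, n_layers, n_subnets)
        accs = [train_and_verify(s) for s in subnets]
        best = subnets[accs.index(max(accs))]
        for op in best:                      # crude sampler update from accuracy
            probs[CANDIDATES.index(op)] += 1.0
    # Final decision (simplified): most likely operation, used for every layer.
    best_op = CANDIDATES[probs.index(max(probs))]
    return [best_op] * n_layers

arch = search()
print(arch)
```

In the patent, the sampler update is the policy-gradient step of claims 8 and 9 rather than the additive count used here; the sketch keeps only the loop shape of S1–S5.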
2. The searching method of a neural network structure according to claim 1, wherein the network structure to be determined comprises an input structure and a processing structure; when the network structure to be determined is an input structure, its corresponding search structure section is an input structure search section; when the network structure to be determined is a processing structure, its corresponding search structure section is a processing structure search section.
3. The searching method of a neural network structure according to claim 2, wherein the sampling structure comprises a sequence generation model, an encoder, and a decoder;
and the step of performing multiple rounds of sampling with the sampling structure, according to the network structure to be determined in each middle layer and its corresponding search structure section, to obtain the plurality of sub-neural-network structures, comprises:
Step S21: obtaining the order in which the network structures of the middle layers in the neural network architecture are to be determined;
Step S22: inputting a preset value into the sequence generation model, to obtain the output value of the sequence generation model;
Step S23: obtaining the search structure section corresponding to the network structure of the middle layer currently to be determined;
Step S24: determining the size of the search structure space according to the search structure section corresponding to the network structure of the middle layer currently to be determined, and decoding, by the decoder, the output value of the sequence generation model into decoded data matching the size of the search structure section;
Step S25: calculating the sampling probability of each network structure in the search structure section according to the correspondence between the decoded data and each network structure in the search structure section;
Step S26: randomly sampling the network structures in the search structure section according to the sampling probabilities, to determine the network structure of the middle layer currently to be determined;
Step S27: judging whether the network structures of all middle layers in the neural network architecture have been determined; if not, encoding the network structure of the currently determined middle layer by the encoder, inputting the encoded data into the sequence generation model, taking, according to the order, the next network structure to be determined as the network structure of the middle layer currently to be determined, and returning to step S23; if so, obtaining one sub-neural-network structure from the determined network structures of the middle layers;
Step S28: repeating steps S22 to S27 until a preset number of repetitions is reached, to obtain the plurality of sub-neural-network structures.
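The per-layer sampling of steps S21 to S28 can be sketched as follows. The `sample_subnet` function and its tiny linear recurrence are hypothetical stand-ins for the sequence generation model of claim 11 (hidden Markov model, RNN, or LSTM), and the "decode" and "encode" steps are reduced to simple arithmetic; only the loop structure mirrors the claim.

```python
import math
import random

CANDIDATES = ["conv5", "conv3", "maxpool3", "avgpool5", "identity"]

def softmax(xs):
    """Turn decoded values into sampling probabilities (step S25)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sample_subnet(n_layers, seed=0):
    rng = random.Random(seed)
    state = 0.0                      # S22: preset value fed to the sequence model
    subnet = []
    for _ in range(n_layers):        # layer order from S21
        # S23/S24: "decode" the model state into one value per candidate,
        # i.e. decoded data matching the size of the search structure section.
        logits = [state + 0.1 * i for i in range(len(CANDIDATES))]
        probs = softmax(logits)                          # S25
        op = rng.choices(CANDIDATES, weights=probs)[0]   # S26: random sampling
        subnet.append(op)
        # S27: "encode" the chosen structure and feed it back into the model.
        state = 0.5 * state + CANDIDATES.index(op) / len(CANDIDATES)
    return subnet

net = sample_subnet(4)
print(net)
```

Repeating `sample_subnet` a preset number of times (step S28) yields the plurality of sub-neural-network structures.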
4. The searching method of a neural network structure according to claim 2, wherein the step of performing classification training on the plurality of sub-neural-network structures, and updating the parameters of the network structure of each middle layer in the plurality of sub-neural-network structures and the parameters of the network structures in the search structure section of each middle layer, to obtain the updated plurality of sub-neural-network structures and the updated search structure section of each middle layer, comprises:
Step S31: obtaining a training set, and randomly dividing the training set into a plurality of subsets of equal size;
Step S32: taking one subset and performing classification training on the plurality of sub-neural-network structures with it, to obtain the cross-entropy loss of each sub-neural-network structure;
Step S33: obtaining, from the cross-entropy losses of the plurality of sub-neural-network structures, the parameter gradients of the processing structure of each middle layer in the plurality of sub-neural-network structures;
Step S34: collecting, across the plurality of sub-neural-network structures, the parameter gradients of the same processing structure of the same middle layer, averaging them, and taking the averaged value as the parameter update gradient of that processing structure of that middle layer;
Step S35: updating, according to the parameter update gradient, the parameter of the same processing structure of the same middle layer in the plurality of sub-neural-network structures and the parameter of the same processing structure in the processing structure search section of the same middle layer;
Step S36: judging whether all subsets have been used for classification training of the plurality of sub-neural-network structures; if so, obtaining the updated plurality of sub-neural-network structures and the updated processing structure search section of each middle layer; if not, taking another untrained subset and returning to step S32.
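Steps S34 and S35 amount to weight sharing: when several sampled sub-networks use the same processing structure in the same middle layer, their gradients for that structure are averaged before the shared parameter is updated once. A minimal numeric sketch, with hypothetical scalar "parameters" and hand-picked gradient values in place of real layer weights:

```python
# Shared parameters: one scalar per (middle layer, candidate op); values hypothetical.
shared = {(0, "conv3"): 1.0, (0, "conv5"): 2.0, (1, "conv3"): 3.0}

# Per-sub-network gradients for the structures each sub-network used (step S33).
subnet_grads = [
    {(0, "conv3"): 0.2, (1, "conv3"): 0.4},
    {(0, "conv3"): 0.6, (1, "conv3"): 0.0},
    {(0, "conv5"): 0.3, (1, "conv3"): 0.2},
]

def averaged_update(shared, subnet_grads, lr=0.1):
    """S34-S35: average gradients of the same structure, then one SGD step."""
    sums, counts = {}, {}
    for grads in subnet_grads:
        for key, g in grads.items():
            sums[key] = sums.get(key, 0.0) + g
            counts[key] = counts.get(key, 0) + 1
    return {
        key: w - lr * sums[key] / counts[key] if key in sums else w
        for key, w in shared.items()
    }

updated = averaged_update(shared, subnet_grads)
print(updated[(0, "conv3")])   # 1.0 - 0.1 * (0.2 + 0.6) / 2 = 0.96
```

Because the averaged gradient is written back both to the sub-networks and to the search section copy of the structure, every future sub-network that samples the same structure starts from the updated parameter.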
5. The searching method of a neural network structure according to claim 4, wherein the cross-entropy loss is calculated as follows:
In the above formula, L is the cross-entropy loss; C is the number of classes; x[jtrue] is the prediction score of data x on its correct class when data x is classified with the sub-neural-network structure; x[j] is the prediction score of data x on one class j; and log(·) takes the logarithm of the bracketed quantity to the base of the natural constant e.
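The formula image itself did not survive extraction. Given the symbol definitions that accompany it (C classes, scores x[jtrue] and x[j], natural logarithm), the loss is presumably the standard softmax cross-entropy:

```latex
L = -\log\frac{e^{x[j_{\mathrm{true}}]}}{\sum_{j=1}^{C} e^{x[j]}}
  = -x[j_{\mathrm{true}}] + \log\sum_{j=1}^{C} e^{x[j]}
```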
6. The searching method of a neural network structure according to claim 5, wherein the parameter update gradient of the same processing structure of the same middle layer is calculated as follows:
In the above formula, w denotes the parameter of the same processing structure of the same middle layer; ∇w denotes the gradient of w; E_{m~π}(L(m, w)) denotes the expected cross-entropy loss over the sub-neural-network structures m determined by the sampling structure π; M denotes the number of sub-neural networks sharing the same processing structure of the same middle layer; m_i denotes the i-th sub-neural-network structure; and the last term denotes the gradient of the cross-entropy loss function of the i-th sub-neural-network structure with respect to the same processing structure of the same middle layer.
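This formula is likewise missing from the extracted text. Consistent with the definitions accompanying it (an expectation over π estimated from M sub-networks m_i that share the structure), the update gradient is presumably the Monte-Carlo estimate:

```latex
\nabla_w \, \mathbb{E}_{m \sim \pi}\!\left[L(m, w)\right]
  \approx \frac{1}{M} \sum_{i=1}^{M} \nabla_w L(m_i, w)
```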
7. The searching method of a neural network structure according to claim 2, wherein the step of performing classification verification on the updated plurality of sub-neural-network structures, obtaining the classification accuracy of the updated plurality of sub-neural-network structures, and updating the parameters of the sampling structure according to the accuracy to obtain the updated sampling structure, comprises:
Step S41: obtaining a verification set, and calculating the accuracy of the updated plurality of sub-neural-network structures on the verification set;
Step S42: calculating the penalty value of the sampling structure according to the accuracy;
Step S43: obtaining, from the penalty value of the sampling structure, the parameter update gradient of the sequence generation model, the parameter update gradient of the encoder, and the parameter update gradient of the decoder;
Step S44: updating the parameters of the sequence generation model, the encoder, and the decoder according to the respective parameter update gradients, to obtain the updated sampling structure.
8. The searching method of a neural network structure according to claim 7, wherein
the penalty value of the sampling structure is calculated as follows:
In the above formula, L denotes the penalty value; N denotes the number of sub-neural networks; P(k) denotes the probability that the sampling structure samples the k-th sub-neural-network structure; R_k denotes the accuracy of the k-th sub-neural network; and log(·) takes the logarithm of the bracketed quantity to the base of the natural constant e.
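The formula is missing from the extracted text. Given the definitions accompanying it (N sub-networks, sampling probability P(k), accuracy R_k used as a reward), it is presumably the policy-gradient (REINFORCE-style) objective:

```latex
L = -\frac{1}{N} \sum_{k=1}^{N} R_k \,\log P(k)
```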
9. The searching method of a neural network structure according to claim 7, wherein
the parameter update gradient of the sequence generation model is calculated as follows:
In the above formula, θ denotes the parameters of the sequence generation model, and ∇θ denotes the gradient of θ;
the weight gradient of the encoder and the parameter gradient of the decoder are computed identically, both as follows:
In the above formula, w denotes the weights of the encoder or the parameters of the decoder, and ∇w denotes the gradient of w.
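Both formulas are missing from the extracted text. Since the sequence generation model, encoder, and decoder together define the sampling probability P(k), both gradients presumably take the same policy-gradient form, differentiating the penalty value of claim 8 with respect to the respective parameters:

```latex
\nabla_\theta L = -\frac{1}{N} \sum_{k=1}^{N} R_k \,\nabla_\theta \log P(k;\theta),
\qquad
\nabla_w L = -\frac{1}{N} \sum_{k=1}^{N} R_k \,\nabla_w \log P(k; w)
```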
10. The searching method of a neural network structure according to claim 2, wherein the step of determining the required neural network structure according to the updated sampling structure and the updated search structure section of each middle layer comprises:
Step S51: performing multiple rounds of sampling with the updated sampling structure, according to the network structure to be determined in each middle layer and its corresponding updated search structure section, to obtain a plurality of sub-neural-network structures;
Step S52: performing classification verification on the plurality of sub-neural-network structures to obtain the sub-neural-network structure with the highest classification accuracy;
Step S53: training the sub-neural-network structure with the highest classification accuracy, and taking the trained sub-neural-network structure as the required neural network structure.
11. The searching method of a neural network structure according to any one of claims 2 to 10, wherein the sequence generation model is a hidden Markov model, a recurrent neural network model, or a long short-term memory (LSTM) network.
12. The searching method of a neural network structure according to any one of claims 2 to 10, wherein the processing structure search sections of the middle layers are identical.
13. The searching method of a neural network structure according to claim 12, wherein the processing structure search section comprises: a convolutional layer with a convolution kernel of 5, a convolutional layer with a convolution kernel of 3, a max-pooling layer with a pooling kernel of 3, an average-pooling layer with a pooling kernel of 5, and a transport layer that passes the data through unchanged.
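The five candidate operations of claim 13 can be written down directly. The sketch below is plain Python; the field names and string labels are hypothetical, chosen only to make the candidates and their kernel sizes explicit.

```python
# Processing structure search section of claim 13: five candidate operations,
# identical for every middle layer (claim 12). Kernel sizes as in the claim.
SEARCH_SECTION = [
    {"op": "conv",     "kernel": 5},   # convolutional layer, kernel 5
    {"op": "conv",     "kernel": 3},   # convolutional layer, kernel 3
    {"op": "max_pool", "kernel": 3},   # max pooling, pooling kernel 3
    {"op": "avg_pool", "kernel": 5},   # average pooling, pooling kernel 5
    {"op": "identity"},                # transport layer: pass data through
]

# The "size of the search structure space" of step S24 in claim 3, for one layer:
print(len(SEARCH_SECTION))   # 5
```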
14. A search system for a neural network structure, comprising:
a preset data obtaining module, configured to obtain a preset neural network architecture and a sampling structure; wherein the neural network architecture comprises an input layer, an output layer, and a plurality of sequentially arranged middle layers, and each middle layer comprises a network structure to be determined and the search structure section corresponding to that network structure;
a sampling module, configured to perform multiple rounds of sampling with the sampling structure, according to the network structure to be determined in each middle layer and its corresponding search structure section, to obtain a plurality of sub-neural-network structures;
a training module, configured to perform classification training on the plurality of sub-neural-network structures, and to update the parameters of the network structure of each middle layer in the plurality of sub-neural-network structures and the parameters of the network structures in the search structure section of each middle layer, to obtain the updated plurality of sub-neural-network structures and the updated search structure section of each middle layer;
a sampling structure update module, configured to perform classification verification on the updated plurality of sub-neural-network structures, to obtain the classification accuracy of the updated plurality of sub-neural-network structures, and to update the parameters of the sampling structure according to the accuracy, to obtain the updated sampling structure;
a neural network structure determining module, configured to determine the required neural network structure according to the updated sampling structure and the updated search structure section of each middle layer.
15. The search system of a neural network structure according to claim 14, wherein the sampling module comprises:
a network structure order determining unit, configured to obtain the order in which the network structures of the middle layers in the neural network architecture are to be determined;
an output value obtaining unit, configured to input a preset value into the sequence generation model and obtain the output value of the sequence generation model;
a search structure section obtaining unit, configured to obtain the search structure section corresponding to the network structure of the middle layer currently to be determined;
a decoding unit, configured to determine the size of the search structure space according to the search structure section corresponding to the network structure of the middle layer currently to be determined, and to decode, by the decoder, the output value of the sequence generation model into decoded data matching the size of the search structure section;
a sampling probability calculation unit, configured to calculate the sampling probability of each network structure in the search structure section according to the correspondence between the decoded data and each network structure in the search structure section;
a network structure determining unit, configured to randomly sample the network structures in the search structure section according to the sampling probabilities, to determine the network structure of the middle layer currently to be determined;
a judging unit, configured to judge whether the network structures of all middle layers in the neural network architecture have been determined; if not, to encode the network structure of the currently determined middle layer by the encoder, input the encoded data into the sequence generation model, take, according to the order, the next network structure to be determined as the network structure of the middle layer currently to be determined, and continue determining the network structure of the current middle layer; if so, to obtain one sub-neural-network structure from the determined network structures of the middle layers;
a sub-neural-network structure obtaining unit, configured to obtain the plurality of sub-neural-network structures.
16. The search system of a neural network structure according to claim 14, wherein the training module comprises:
a training set division unit, configured to obtain a training set and randomly divide the training set into a plurality of subsets of equal size;
a cross-entropy loss calculation unit, configured to take one subset, perform classification training on the plurality of sub-neural-network structures with it, and obtain the cross-entropy loss of each sub-neural-network structure;
a parameter gradient calculation unit, configured to obtain, from the cross-entropy losses of the plurality of sub-neural-network structures, the parameter gradients of the processing structure of each middle layer in the plurality of sub-neural-network structures;
a parameter update gradient calculation unit, configured to collect, across the plurality of sub-neural-network structures, the parameter gradients of the same processing structure of the same middle layer, average them, and take the averaged value as the parameter update gradient of that processing structure of that middle layer;
an updating unit, configured to update, according to the parameter update gradient, the parameter of the same processing structure of the same middle layer in the plurality of sub-neural-network structures and the parameter of the same processing structure in the processing structure search section of the same middle layer;
a classification judging unit, configured to judge whether all subsets have been used for classification training of the plurality of sub-neural-network structures; if so, to obtain the updated plurality of sub-neural-network structures and the updated processing structure search section of each middle layer; if not, to take another untrained subset and continue the classification training of the plurality of sub-neural-network structures.
17. The search system of a neural network structure according to claim 14, wherein the sampling structure update module comprises:
an accuracy obtaining unit, configured to obtain a verification set and calculate the accuracy of each updated sub-neural-network structure on the verification set;
a penalty value calculation unit, configured to calculate the penalty value of the sampling structure according to the accuracy;
a gradient calculation unit, configured to obtain, from the penalty value of the sampling structure, the parameter update gradient of the sequence generation model, the parameter update gradient of the encoder, and the parameter update gradient of the decoder;
a sampling structure updating unit, configured to update the parameters of the sequence generation model, the encoder, and the decoder according to the respective parameter update gradients, to obtain the updated sampling structure.
18. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the steps of the searching method of a neural network structure according to any one of claims 1 to 13 are implemented.
19. A data processing device, comprising a memory, a primary processor, and a computer program stored in the memory and executable by the primary processor, wherein, when executing the computer program, the primary processor implements the steps of the searching method of a neural network structure according to any one of claims 1 to 13.
CN201910128954.3A 2019-02-21 2019-02-21 Neural network structure search method, system, storage medium, and device Pending CN110020667A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910128954.3A CN110020667A (en) 2019-02-21 2019-02-21 Neural network structure search method, system, storage medium, and device


Publications (1)

Publication Number Publication Date
CN110020667A true CN110020667A (en) 2019-07-16

Family

ID=67189122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910128954.3A Pending CN110020667A (en) 2019-02-21 2019-02-21 Neural network structure search method, system, storage medium, and device

Country Status (1)

Country Link
CN (1) CN110020667A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490320A (en) * 2019-07-30 2019-11-22 西北工业大学 Deep neural network structural optimization method based on forecasting mechanism and Genetic Algorithm Fusion
CN110490323A (en) * 2019-08-20 2019-11-22 腾讯科技(深圳)有限公司 Network model compression method, device, storage medium and computer equipment
CN110659690A (en) * 2019-09-25 2020-01-07 深圳市商汤科技有限公司 Neural network construction method and device, electronic equipment and storage medium
CN110751267A (en) * 2019-09-30 2020-02-04 京东城市(北京)数字科技有限公司 Neural network structure searching method, training method, device and storage medium
CN110826696A (en) * 2019-10-30 2020-02-21 北京百度网讯科技有限公司 Search space construction method and device of hyper network and electronic equipment
CN110851566A (en) * 2019-11-04 2020-02-28 沈阳雅译网络技术有限公司 Improved differentiable network structure searching method
CN111191785A (en) * 2019-12-20 2020-05-22 沈阳雅译网络技术有限公司 Structure searching method based on expanded search space
CN111340221A (en) * 2020-02-25 2020-06-26 北京百度网讯科技有限公司 Method and device for sampling neural network structure
CN111782398A (en) * 2020-06-29 2020-10-16 上海商汤智能科技有限公司 Data processing method, device and system and related equipment
CN111783937A (en) * 2020-05-19 2020-10-16 华为技术有限公司 Neural network construction method and system
CN111882048A (en) * 2020-09-28 2020-11-03 深圳追一科技有限公司 Neural network structure searching method and related equipment
CN112381215A (en) * 2020-12-17 2021-02-19 之江实验室 Self-adaptive search space generation method and device for automatic machine learning
CN112445823A (en) * 2019-09-04 2021-03-05 华为技术有限公司 Searching method of neural network structure, image processing method and device
WO2021057690A1 (en) * 2019-09-24 2021-04-01 华为技术有限公司 Neural network building method and device, and image processing method and device
WO2021164751A1 (en) * 2020-02-21 2021-08-26 华为技术有限公司 Perception network architecture search method and device
US12032571B2 (en) 2019-09-17 2024-07-09 Huawei Cloud Computing Technologies Co., Ltd. AI model optimization method and apparatus


Similar Documents

Publication Publication Date Title
CN110020667A (en) Neural network structure search method, system, storage medium, and device
US11853893B2 (en) Execution of a genetic algorithm having variable epoch size with selective execution of a training algorithm
US20190164538A1 (en) Memory compression in a deep neural network
CN109902706A (en) Recommendation method and device
Shen et al. Fractional skipping: Towards finer-grained dynamic CNN inference
CN106022392B (en) Training method for automatically accepting or rejecting deep neural network samples
CN104751842B (en) Optimization method and system for deep neural networks
CN109359140A (en) Sequential recommendation method and device based on adaptive attention
CN110163433A (en) Ship flow prediction method
CN111079899A (en) Neural network model compression method, system, device and medium
CN106228185A (en) General image classification and recognition system and method based on neural networks
CN108875053A (en) Knowledge graph data processing method and device
KR20180107940A (en) Learning method and apparatus for speech recognition
CN102184205A (en) Multi-pattern string matching algorithm based on extended-precision chaotic hashing
CN106897744A (en) Method and system for adaptively setting deep belief network parameters
CN112000772A (en) Sentence-pair semantic matching method based on semantic feature cubes for intelligent question answering
CN110020435B (en) Method for optimizing text feature selection using a parallel binary bat algorithm
CN109871934A (en) Feature selection method based on a distributed parallel binary moth-flame optimization algorithm on Spark
CN113221950A (en) Graph clustering method and device based on self-supervision graph neural network and storage medium
CN112560985A (en) Neural network searching method and device and electronic equipment
Kiaee et al. Alternating direction method of multipliers for sparse convolutional neural networks
Anderson et al. Performance-oriented neural architecture search
CN115345358A (en) Oil well parameter adaptive regulation and control method based on reinforcement learning
CN107194468A (en) Decision tree incremental learning algorithm for information big data
CN110110447A (en) Steel strip thickness prediction method using a hybrid shuffled frog-leaping feedback extreme learning machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination