CN110210609A - Model training method, device and terminal based on neural architecture search - Google Patents

Model training method, device and terminal based on neural architecture search

Info

Publication number
CN110210609A
CN110210609A
Authority
CN
China
Prior art keywords
network
model
sub-network
parameter set
hyperparameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910509296.2A
Other languages
Chinese (zh)
Inventor
高参
何伯磊
肖欣延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910509296.2A priority Critical patent/CN110210609A/en
Publication of CN110210609A publication Critical patent/CN110210609A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a model training method, device and terminal based on neural architecture search. The method includes: generating multiple sub-network hyperparameter sets according to a network generation model; generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models; obtaining the prior-network hyperparameter set of a prior network model; inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value; and inputting the first expected value and the multiple accuracy rates into the network generation model to update the parameters of the network generation model, so as to generate a new network generation model. By using the prior-network hyperparameters to update the network parameters of the controller, the process by which the controller generates sub-network models is guided. This effectively reduces the time needed to reach the optimal sub-network model and improves the generation efficiency of sub-network models.

Description

Model training method, device and terminal based on neural architecture search
Technical field
The present invention relates to the field of machine learning, and in particular to a model training method, device and terminal based on neural architecture search.
Background art
Deep learning models have achieved good results on many tasks, but tuning a deep model is a painstaking job: its numerous hyperparameters and network architecture parameters form a combinatorially large space. Consequently, neural network architecture search and hyperparameter optimization have become a research hotspot in recent years. The technique of having a computer design a network automatically is usually called neural architecture search (Neural Architecture Search, NAS). The goal of NAS is to replace the various hand-designed network structures with automatically designed ones, reducing the time people spend designing the optimal network model. Hyperparameters are the framework parameters inside a machine learning model; they are usually set by hand and adjusted through repeated trial and error. Hyperparameter optimization is a core concern of automated machine learning, and NAS is a sub-problem within hyperparameter optimization. Currently, the main framework of neural architecture search is divided into two parts: a controller (Controller) and a child network (Child Network). The Controller is usually a recurrent neural network (RNN) model. In the RNN model, every five outputs form one layer of neural network, i.e. the Child Network, and the output of the previous step is the input of the next step, which ensures that the RNN predicts the parameters of the N-th layer based on all parameter information of the preceding N-1 layers. The reward mechanism of current techniques mostly comes from the verification results of the Child Network and does not use other external information, so the efficiency of generating sub-network models is low and individualized demands cannot be met.
Summary of the invention
Embodiments of the present invention provide a model training method, device and terminal based on neural architecture search, to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a model training method based on neural architecture search, including:
generating multiple sub-network hyperparameter sets according to a network generation model;
generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
obtaining the prior-network hyperparameter set of a prior network model;
inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
inputting the first expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
In one embodiment, generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models, includes:
generating a first sub-network model according to a first sub-network hyperparameter set;
inputting a training set into the first sub-network model, and training to obtain first model parameters of the first sub-network model;
inputting a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate;
repeating the above steps until multiple accuracy rates are obtained.
In one embodiment, inputting the sub-network hyperparameter sets and the prior-network hyperparameter set respectively into the discriminator to obtain the first expected value includes:
inputting the sub-network hyperparameter sets into the discriminator, and outputting a second expected value;
inputting the prior-network hyperparameter set into the discriminator to obtain a third expected value;
taking the sum of the second expected value and the third expected value as the first expected value.
In one embodiment, the method further includes:
selecting the maximum accuracy rate from the multiple accuracy rates, and obtaining the sub-network model corresponding to the maximum accuracy rate;
replacing the prior network model with the sub-network model corresponding to the maximum accuracy rate;
inputting the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
inputting the fourth expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
In a second aspect, an embodiment of the present invention provides a model training apparatus based on neural architecture search, including:
a sub-network hyperparameter set generation module, configured to generate multiple sub-network hyperparameter sets according to a network generation model;
an accuracy rate calculation module, configured to generate multiple sub-network models according to the multiple sub-network hyperparameter sets, and to verify the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
a prior-network hyperparameter module, configured to obtain the prior-network hyperparameter set of a prior network model;
a first expected value calculation module, configured to input the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
a first network generation model update module, configured to input the first expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
In one embodiment, the accuracy rate calculation module includes:
a first sub-network model generation unit, configured to generate a first sub-network model according to a first sub-network hyperparameter set;
a first model parameter generation unit, configured to input a training set into the first sub-network model, and to train to obtain first model parameters of the first sub-network model;
an accuracy rate calculation unit, configured to input a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate, and to repeat the above steps until multiple accuracy rates are obtained.
In one embodiment, the first expected value calculation module includes:
a second expected value calculation unit, configured to input the sub-network hyperparameter sets into the discriminator and output a second expected value;
a third expected value calculation unit, configured to input the prior-network hyperparameter set into the discriminator to obtain a third expected value;
a first expected value calculation unit, configured to take the sum of the second expected value and the third expected value as the first expected value.
In one embodiment, the apparatus further includes:
a maximum accuracy rate confirmation module, configured to select the maximum accuracy rate from the multiple accuracy rates, and to obtain the sub-network model corresponding to the maximum accuracy rate;
a network model replacement module, configured to replace the prior network model with the sub-network model corresponding to the maximum accuracy rate;
a fourth expected value calculation module, configured to input the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
a second network generation model update module, configured to input the fourth expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
In a third aspect, an embodiment of the present invention provides a model training terminal based on neural architecture search. The functions of the model training terminal based on neural architecture search can be realized by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In a possible design, the structure of the model training terminal based on neural architecture search includes a processor and a memory; the memory is used to store a program that supports the model training terminal based on neural architecture search in executing the above model training method based on neural architecture search, and the processor is configured to execute the program stored in the memory. The model training terminal based on neural architecture search may further include a communication interface for communicating with other devices or communication networks.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the model training terminal based on neural architecture search, including a program involved in executing the above model training method based on neural architecture search.
One of the above technical solutions has the following advantage or beneficial effect: by using the prior-network hyperparameters to update the network parameters of the controller, the process by which the controller generates sub-network models is guided. This effectively reduces the time needed to reach the optimal sub-network model and improves the generation efficiency of sub-network models.
The above summary is provided for the purpose of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals refer to the same or similar parts or elements throughout the several figures. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 shows a flowchart of a model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of a model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 3 shows a schematic diagram of a sub-network model according to an embodiment of the present invention.
Fig. 4 shows a flowchart of another model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 5 shows a flowchart of yet another model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 6 shows a structural block diagram of a model training apparatus based on neural architecture search according to an embodiment of the present invention.
Fig. 7 shows a structural block diagram of another model training apparatus based on neural architecture search according to an embodiment of the present invention.
Fig. 8 shows a structural schematic diagram of a model training terminal based on neural architecture search according to an embodiment of the present invention.
Specific embodiment
Hereinafter, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments can be modified in various different ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature rather than restrictive.
Embodiment one
In a specific embodiment, a model training method based on neural architecture search is provided. Neural architecture search (Neural Architecture Search, NAS) is an optimal-parameter search problem in a high-dimensional space. The detailed process may include: first defining a search space, then finding candidate network structures through a search strategy, evaluating the candidate network structures, and carrying out the next round of search according to the feedback.
As shown in Fig. 1 and Fig. 2, the method specifically includes:
Step S10: generating multiple sub-network hyperparameter sets according to a network generation model.
In one example, the defined search space includes the sub-network hyperparameter sets. A hyperparameter can be a parameter that defines the network structure, for example, how many layers the network has, the operator of each layer, the filter size in a convolution, and so on. Hyperparameters are high-dimensional, discrete, and interdependent. The network generation model can be a recurrent neural network (RNN, Recurrent Neural Network) model, and the sub-network model can be a convolutional neural network (CNN, Convolutional Neural Networks) model. A minimal sketch of such a search-space definition follows.
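As a hedged illustration only (the patent does not fix concrete candidate values, so every slot name and number below is an assumption), a discrete search space of this kind can be written as a mapping from each hyperparameter slot to its candidate values:
    # Hypothetical discrete NAS search space; names and values are illustrative
    # assumptions, not taken from the patent.
    SEARCH_SPACE = {
        "num_layers": [2, 4, 6],                        # how many layers the network has
        "operator": ["conv3x3", "conv5x5", "maxpool"],  # per-layer operator
        "filter_height": [1, 3, 5, 7],                  # convolution kernel height
        "filter_width": [1, 3, 5, 7],                   # convolution kernel width
        "num_filters": [16, 32, 64],                    # number of convolution kernels
    }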
The process of generating multiple sub-network hyperparameter sets according to the network generation model is as follows: initialize the parameters of the network generation model, and give an initial sub-network hyperparameter, e.g. a1. The initial sub-network hyperparameter is input into the network generation model, which generates multiple sub-network hyperparameters, e.g. a2, a3, ..., aN. The N sub-network hyperparameters constitute a sub-network hyperparameter set, e.g. A1 = {a1, a2, a3, ..., aN}. A hedged code sketch of this generation loop follows.
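The sketch below is a minimal NumPy re-implementation of such a controller, not the patent's actual code: a small recurrent network feeds each step's output back in as the next input and samples N hyperparameter tokens to form a set. The vocabulary, sizes and random initialization are assumptions made for the sketch:
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB = [1, 3, 5, 7]   # hypothetical candidate values for a hyperparameter slot
    HIDDEN = 16

    # Parameters of the network generation model (the controller), randomly initialized.
    W_xh = rng.normal(0, 0.1, (len(VOCAB), HIDDEN))
    W_hh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))
    W_hy = rng.normal(0, 0.1, (HIDDEN, len(VOCAB)))

    def sample_hyperparameter_set(n_steps, init_token=0):
        """Autoregressively sample a1, a2, ..., aN; each output is the next input."""
        h = np.zeros(HIDDEN)
        token, out = init_token, []
        for _ in range(n_steps):
            h = np.tanh(np.eye(len(VOCAB))[token] @ W_xh + h @ W_hh)
            logits = h @ W_hy
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            token = rng.choice(len(VOCAB), p=probs)  # sampled hyperparameter index
            out.append(VOCAB[token])
        return out

    print(sample_hyperparameter_set(5))  # e.g. the five outputs that form one layer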
Step S20: generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models.
In one example, one sub-network model is generated according to a sub-network hyperparameter set, e.g. A1 = {a1, a2, a3, ..., aN}. As shown in Fig. 3, each generated sub-network model includes five outputs: Filter Height (the height, or length, of the convolution kernel), Filter Width (the width of the convolution kernel), Stride Height, Stride Width, and Num of Filters (the number of convolution kernels). The five outputs constitute the hyperparameter set of the sub-network model. A data set D = {Dtrain, Dtest} is obtained, where Dtrain is the training set and Dtest is the test set. The training set and test set can come from a specific corpus task such as sentiment analysis or text classification. The multiple sub-network models are trained on the training set Dtrain respectively to obtain the corresponding sub-network model parameters. Then the multiple sub-network models are verified on the test set Dtest respectively to obtain the corresponding verification results, i.e. the accuracy rates. A hedged train-and-evaluate sketch follows.
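Purely as an illustration of this evaluate-each-child loop (the patent's child networks are CNNs on corpus tasks; the stand-in model, synthetic data and names below are assumptions), each sampled hyperparameter set is turned into a model, trained on Dtrain, and scored on Dtest:
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Stand-in data set D = {Dtrain, Dtest}; a real run would use e.g. a
    # text-classification corpus.
    X, y = make_classification(n_samples=400, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    def accuracy_of(hyperparams):
        """Build one child model from a hyperparameter set, train it, and verify it."""
        model = MLPClassifier(hidden_layer_sizes=tuple(hyperparams),
                              max_iter=300, random_state=0)
        model.fit(X_tr, y_tr)            # training yields the sub-network model parameters
        return model.score(X_te, y_te)   # the verification result, i.e. the accuracy rate

    candidate_sets = [[16, 8], [32], [8, 8, 8]]   # sampled sub-network hyperparameter sets
    accuracies = [accuracy_of(h) for h in candidate_sets]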
Step S30: obtaining the prior-network hyperparameter set of the prior network model.
In one example, the process of obtaining the prior-network hyperparameter set of the prior network model is similar to the aforementioned process of obtaining the sub-network hyperparameter sets, and is not repeated here. Since the prior network model is a hand-designed model that meets the desired requirements, it incorporates more intuitive or theoretical prior knowledge. The purpose of introducing the prior network model is therefore to use the prior-network hyperparameters to update the network parameters of the network generation model, and thereby to guide the process by which the network generation model generates sub-network models. This effectively reduces the time needed to generate the optimal sub-network model and improves the generation efficiency of sub-network models.
Step S40: inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value.
In one example, the guidance of the prior network model over the network generation model is introduced through a discriminator. Specifically, the multiple sub-network hyperparameter sets and the prior-network hyperparameter set are input respectively into the discriminator to obtain the first expected value. The first expected value can be 0 or 1: 0 indicates that the sub-network model parameters differ considerably from the prior network model parameters, and 1 indicates that they differ little. The aim is that the closer the generated sub-network model is to the prior network parameters, the better; this shows that the prior network model has effectively guided the process by which the network generation model generates sub-network models.
It should be pointed out that the input order of the sub-network hyperparameter sets and the prior-network hyperparameter set is not limited. The discriminator can be any classification network, such as a CNN model. A hedged sketch of such a discriminator follows.
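As a hedged stand-in (the patent allows any classification network; the logistic-regression scorer and the fabricated training examples below are assumptions made only for this sketch), a discriminator can be trained to output 1 for hyperparameter sets that resemble the prior network's and 0 otherwise:
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    prior_set = np.array([5.0, 5.0, 1.0, 1.0, 32.0])  # prior-network hyperparameter set

    # Fabricated examples near / far from the prior set, for illustration only.
    positives = prior_set + rng.normal(0, 0.5, (50, 5))
    negatives = rng.uniform(0, 64, (50, 5))
    X = np.vstack([positives, negatives])
    y = np.array([1] * 50 + [0] * 50)

    discriminator = LogisticRegression(max_iter=1000).fit(X, y)

    def expected_value(hyperparam_set):
        """0/1 judgment: does this set look close to the prior network's?"""
        return int(discriminator.predict([hyperparam_set])[0])

    reward_1b = expected_value(prior_set)                   # third expected value
    reward_1a = expected_value([4.8, 5.2, 1.0, 1.0, 30.0])  # second expected value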
Step S50: inputting the first expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
By using the pre-designed prior network model to guide the network generation model, prior knowledge is not only input into the network generation model as part of the network parameters, but is also input into the network generation model as the first reward signal (i.e. the first expected value), guiding the update of the parameters of the network generation model.
In one embodiment, as shown in Fig. 4, step S20 includes:
Step S201: generating a first sub-network model according to a first sub-network hyperparameter set;
Step S202: inputting a training set into the first sub-network model, and training to obtain first model parameters of the first sub-network model;
Step S203: inputting a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate;
Step S204: repeating the above steps until multiple accuracy rates are obtained.
In one example, the network generation model generates N sub-network hyperparameters, which constitute the first sub-network hyperparameter set A1 = {a1, a2, ..., aN}. The first sub-network hyperparameter set A1 is used to construct the first sub-network model C1. Given a data set D = {training set Dtrain; test set Dtest}, the first sub-network model C1 is trained on Dtrain and tested on Dtest, and a verification result, e.g. an accuracy rate R11, is obtained. The above steps are repeated M times to obtain M accuracy rates R11, ..., R1M, so the first sub-network model corresponds to M accuracy rates R11, ..., R1M. Steps S201 to S204 are then repeated: the second sub-network model corresponds to M accuracy rates R21, ..., R2M, and so on, until the i-th sub-network model corresponds to M accuracy rates Ri1, ..., RiM. The accuracy rates Ri1, ..., RiM are expressed as the second reward signal Reward2i.
In one embodiment, as shown in Fig. 4, step S40 includes:
Step S401: inputting the sub-network hyperparameter sets into the discriminator, and outputting a second expected value;
Step S402: inputting the prior-network hyperparameter set into the discriminator to obtain a third expected value;
Step S403: taking the sum of the second expected value and the third expected value as the first expected value.
In one example, the sub-network hyperparameter sets are input into the discriminator, which outputs the second expected value Reward1ai; Reward1ai is 0 or 1. The prior-network hyperparameter set is input into the discriminator, which outputs the third expected value Reward1b; Reward1b is 0 or 1. The first expected value Reward1i = A*Reward1b + B*Reward1ai, where A and B are known values.
Finally, Reward1i and Reward2i are input into the network generation model as feedback information, and the parameters of the network generation model are updated through the policy gradient algorithm in reinforcement learning, so as to generate a new network generation model. A hedged sketch of this update follows.
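The sketch below illustrates the reinforcement-learning update under strong simplifying assumptions (a one-step categorical policy in place of the RNN controller, and hand-picked values for A, B, the stand-in accuracies and the step size; none of these come from the patent). The combined feedback Reward1 + Reward2 drives a REINFORCE policy-gradient ascent step on the controller parameters:
    import numpy as np

    rng = np.random.default_rng(2)
    theta = np.zeros(4)        # controller logits over 4 hyperparameter choices
    A, B, LR = 1.0, 1.0, 0.1   # assumed weighting values and learning rate

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    for step in range(100):
        probs = softmax(theta)
        action = rng.choice(4, p=probs)                  # sample a sub-network choice
        reward1 = A * 1 + B * (1 if action == 2 else 0)  # discriminator-style signal
        reward2 = [0.6, 0.7, 0.9, 0.5][action]           # stand-in accuracy (Reward2)
        reward = reward1 + reward2
        grad_logp = np.eye(4)[action] - probs            # grad of log pi(action | theta)
        theta += LR * reward * grad_logp                 # policy-gradient ascent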
In one embodiment, as shown in Fig. 4, the method further includes:
Step S60: selecting the maximum accuracy rate from the multiple accuracy rates, and obtaining the sub-network model corresponding to the maximum accuracy rate;
Step S70: replacing the prior network model with the sub-network model corresponding to the maximum accuracy rate;
Step S80: inputting the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
Step S90: inputting the fourth expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
In one example, the generated sub-network models are ranked by Reward2i, and the best generated sub-network model is used as the prior network model to guide the sub-network generation process. The whole training process is consistent with the above method and is not repeated here. After M network generation models have been trained, the sub-network model with the maximum accuracy rate needs to be selected again from the candidate set of sub-network models to serve as the prior network model, and the network parameters in the network generation model are randomly fixed. This can reduce the search space of the network generation model. A short sketch of the promotion step follows.
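Continuing the earlier evaluation sketch (candidate_sets and accuracies are the hypothetical names introduced there), the promotion of the best child to new prior network might look like this:
    # Promote the most accurate child to be the new prior network (steps S60-S70).
    best_idx = max(range(len(accuracies)), key=accuracies.__getitem__)
    prior_hyperparams = candidate_sets[best_idx]   # replaces the prior-network set
    remaining_sets = [s for i, s in enumerate(candidate_sets) if i != best_idx]
    # prior_hyperparams and remaining_sets are then fed to the discriminator to
    # obtain the fourth expected value, mirroring steps S80-S90.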
Embodiment two
In another specific embodiment, as shown in Fig. 5, a model training apparatus based on neural architecture search is provided, including:
a sub-network hyperparameter set generation module 10, configured to generate multiple sub-network hyperparameter sets according to a network generation model;
an accuracy rate calculation module 20, configured to generate multiple sub-network models according to the multiple sub-network hyperparameter sets, and to verify the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
a prior-network hyperparameter module 30, configured to obtain the prior-network hyperparameter set of a prior network model;
a first expected value calculation module 40, configured to input the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
a first network generation model update module 50, configured to input the first expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
In one embodiment, as shown in Fig. 6, the accuracy rate calculation module 20 includes:
a first sub-network model generation unit 201, configured to generate a first sub-network model according to a first sub-network hyperparameter set;
a first model parameter generation unit 202, configured to input a training set into the first sub-network model, and to train to obtain first model parameters of the first sub-network model;
an accuracy rate calculation unit 203, configured to input a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate, and to repeat the above steps until multiple accuracy rates are obtained.
In one embodiment, as shown in Fig. 6, the first expected value calculation module 40 includes:
a second expected value calculation unit 401, configured to input the sub-network hyperparameter sets into the discriminator and output a second expected value;
a third expected value calculation unit 402, configured to input the prior-network hyperparameter set into the discriminator to obtain a third expected value;
a first expected value calculation unit 403, configured to take the sum of the second expected value and the third expected value as the first expected value.
In one embodiment, as shown in Fig. 7, the apparatus further includes:
a maximum accuracy rate confirmation module 60, configured to select the maximum accuracy rate from the multiple accuracy rates, and to obtain the sub-network model corresponding to the maximum accuracy rate;
a network model replacement module 70, configured to replace the prior network model with the sub-network model corresponding to the maximum accuracy rate;
a fourth expected value calculation module 80, configured to input the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
a second network generation model update module 90, configured to input the fourth expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
For the functions of the modules in the devices of the embodiments of the present invention, reference may be made to the corresponding descriptions in the above method; they are not repeated here.
Embodiment three
Fig. 8 shows a structural block diagram of a model training terminal based on neural architecture search according to an embodiment of the present invention. As shown in Fig. 8, the terminal includes a memory 910 and a processor 920, and a computer program that can run on the processor 920 is stored in the memory 910. When executing the computer program, the processor 920 implements the model training method based on neural architecture search in the above embodiments. The number of memories 910 and processors 920 may each be one or more.
The terminal further includes:
a communication interface 930, configured to communicate with external devices for data interaction.
The memory 910 may include a high-speed RAM memory, and may also include a non-volatile memory, for example at least one magnetic disk memory.
If the memory 910, the processor 920 and the communication interface 930 are implemented independently, the memory 910, the processor 920 and the communication interface 930 can be connected to each other through a bus and complete mutual communication. The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 8, but this does not mean that there is only one bus or only one type of bus.
Optionally, in a specific implementation, if the memory 910, the processor 920 and the communication interface 930 are integrated on one chip, the memory 910, the processor 920 and the communication interface 930 can complete mutual communication through an internal interface.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program. When the program is executed by a processor, any one of the methods in the above embodiments is implemented.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples" and the like means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the features of different embodiments or examples described in this specification, provided that they do not conflict with each other.
In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless otherwise specifically defined.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be executed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein may be considered, for example, an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus or device). For the purposes of this specification, a "computer-readable medium" can be any apparatus that can contain, store, communicate, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) with one or more wirings, a portable computer disk cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium could even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, processing it in another suitable way, and then stored in a computer memory.
It should be understood that the parts of the present invention can be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods can be implemented with software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art can be used: a discrete logic circuit with logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those skilled in the art will understand that all or part of the steps carried in the method of the above embodiments can be completed by instructing relevant hardware through a program; the program can be stored in a computer-readable storage medium, and the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The above integrated module can be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium. The storage medium can be a read-only memory, a magnetic disk, an optical disc, or the like.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various changes or replacements within the technical scope disclosed by the present invention, and these should all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A model training method based on neural architecture search, characterized by comprising:
generating multiple sub-network hyperparameter sets according to a network generation model;
generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
obtaining the prior-network hyperparameter set of a prior network model;
inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
inputting the first expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
2. The method according to claim 1, characterized in that generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models, comprises:
generating a first sub-network model according to a first sub-network hyperparameter set;
inputting a training set into the first sub-network model, and training to obtain first model parameters of the first sub-network model;
inputting a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate;
repeating the above steps until multiple accuracy rates are obtained.
3. The method according to claim 1, characterized in that inputting the sub-network hyperparameter sets and the prior-network hyperparameter set respectively into the discriminator to obtain the first expected value comprises:
inputting the sub-network hyperparameter sets into the discriminator, and outputting a second expected value;
inputting the prior-network hyperparameter set into the discriminator to obtain a third expected value;
taking the sum of the second expected value and the third expected value as the first expected value.
4. The method according to claim 1, characterized by further comprising:
selecting the maximum accuracy rate from the multiple accuracy rates, and obtaining the sub-network model corresponding to the maximum accuracy rate;
replacing the prior network model with the sub-network model corresponding to the maximum accuracy rate;
inputting the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
inputting the fourth expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
5. A model training apparatus based on neural architecture search, characterized by comprising:
a sub-network hyperparameter set generation module, configured to generate multiple sub-network hyperparameter sets according to a network generation model;
an accuracy rate calculation module, configured to generate multiple sub-network models according to the multiple sub-network hyperparameter sets, and to verify the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
a prior-network hyperparameter module, configured to obtain the prior-network hyperparameter set of a prior network model;
a first expected value calculation module, configured to input the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
a first network generation model update module, configured to input the first expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
6. The apparatus according to claim 5, characterized in that the accuracy rate calculation module comprises:
a first sub-network model generation unit, configured to generate a first sub-network model according to a first sub-network hyperparameter set;
a first model parameter generation unit, configured to input a training set into the first sub-network model, and to train to obtain first model parameters of the first sub-network model;
an accuracy rate calculation unit, configured to input a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate, and to repeat the above steps until multiple accuracy rates are obtained.
7. The apparatus according to claim 5, characterized in that the first expected value calculation module comprises:
a second expected value calculation unit, configured to input the sub-network hyperparameter sets into the discriminator and output a second expected value;
a third expected value calculation unit, configured to input the prior-network hyperparameter set into the discriminator to obtain a third expected value;
a first expected value calculation unit, configured to take the sum of the second expected value and the third expected value as the first expected value.
8. The apparatus according to claim 5, characterized by further comprising:
a maximum accuracy rate confirmation module, configured to select the maximum accuracy rate from the multiple accuracy rates, and to obtain the sub-network model corresponding to the maximum accuracy rate;
a network model replacement module, configured to replace the prior network model with the sub-network model corresponding to the maximum accuracy rate;
a fourth expected value calculation module, configured to input the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
a second network generation model update module, configured to input the fourth expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
9. A model training terminal based on neural architecture search, characterized by comprising:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 4.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 4.
CN201910509296.2A 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search Pending CN110210609A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910509296.2A CN110210609A (en) 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910509296.2A CN110210609A (en) 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search

Publications (1)

Publication Number Publication Date
CN110210609A true CN110210609A (en) 2019-09-06

Family

ID=67792403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910509296.2A Pending CN110210609A (en) 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search

Country Status (1)

Country Link
CN (1) CN110210609A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334949A (en) * 2018-02-11 2018-07-27 浙江工业大学 A kind of tachytelic evolution method of optimization depth convolutional neural networks structure
CN109598332A (en) * 2018-11-14 2019-04-09 北京市商汤科技开发有限公司 Neural network generation method and device, electronic equipment and storage medium
CN109816116A (en) * 2019-01-17 2019-05-28 腾讯科技(深圳)有限公司 The optimization method and device of hyper parameter in machine learning model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BARRET ZOPH et al.: "Neural Architecture Search with Reinforcement Learning", https://arxiv.org/abs/1611.01578 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110543944B (en) * 2019-09-11 2022-08-02 北京百度网讯科技有限公司 Neural network structure searching method, apparatus, electronic device, and medium
CN110543944A (en) * 2019-09-11 2019-12-06 北京百度网讯科技有限公司 neural network structure searching method, apparatus, electronic device, and medium
CN110689127A (en) * 2019-10-15 2020-01-14 北京小米智能科技有限公司 Neural network structure model searching method, device and storage medium
CN110889450A (en) * 2019-11-27 2020-03-17 腾讯科技(深圳)有限公司 Method and device for super-parameter tuning and model building
CN111126564A (en) * 2019-11-27 2020-05-08 东软集团股份有限公司 Neural network structure searching method, device and equipment
CN110889450B (en) * 2019-11-27 2023-08-11 腾讯科技(深圳)有限公司 Super-parameter tuning and model construction method and device
CN111126564B (en) * 2019-11-27 2023-08-08 东软集团股份有限公司 Neural network structure searching method, device and equipment
TWI769418B (en) * 2019-12-05 2022-07-01 財團法人工業技術研究院 Method and electronic device for selecting neural network hyperparameters
US11537893B2 (en) 2019-12-05 2022-12-27 Industrial Technology Research Institute Method and electronic device for selecting deep neural network hyperparameters
CN111488971A (en) * 2020-04-09 2020-08-04 北京百度网讯科技有限公司 Neural network model searching method and device, and image processing method and device
CN111488971B (en) * 2020-04-09 2023-10-24 北京百度网讯科技有限公司 Neural network model searching method and device, and image processing method and device
CN111444884A (en) * 2020-04-22 2020-07-24 万翼科技有限公司 Method, apparatus and computer-readable storage medium for recognizing a component in an image
CN111523665A (en) * 2020-04-23 2020-08-11 北京百度网讯科技有限公司 Super-network parameter updating method and device and electronic equipment
CN111523665B (en) * 2020-04-23 2024-02-13 北京百度网讯科技有限公司 Super network parameter updating method and device and electronic equipment
WO2021114625A1 (en) * 2020-05-28 2021-06-17 平安科技(深圳)有限公司 Network structure construction method and apparatus for use in multi-task scenario
CN111652354A (en) * 2020-05-29 2020-09-11 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training a hyper-network
CN111652354B (en) * 2020-05-29 2023-10-24 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training super network
TWI771745B (en) * 2020-09-07 2022-07-21 威盛電子股份有限公司 Hyper-parameter setting method and building platform for neural network model

Similar Documents

Publication Publication Date Title
CN110210609A (en) Model training method, device and terminal based on neural architecture search
US20230252327A1 (en) Neural architecture search for convolutional neural networks
US11829874B2 (en) Neural architecture search
EP3711000B1 (en) Regularized neural network architecture search
Jozwik et al. Deep convolutional neural networks outperform feature-based but not categorical models in explaining object similarity judgments
EP3871132A1 (en) Generating integrated circuit floorplans using neural networks
US20200265315A1 (en) Neural architecture search
US20190354868A1 (en) Multi-task neural networks with task-specific paths
JP6384065B2 (en) Information processing apparatus, learning method, and program
CN109034365A (en) The training method and device of deep learning model
CN107977748B (en) Multivariable distorted time sequence prediction method
CN109948680A (en) The classification method and system of medical record data
Orsborn et al. Multiagent shape grammar implementation: automatically generating form concepts according to a preference function
Silva et al. Finding multiple roots of a box-constrained system of nonlinear equations with a biased random-key genetic algorithm
CN110187760A (en) Intelligent interactive method and device
CN113609337A (en) Pre-training method, device, equipment and medium of graph neural network
CN112580807A (en) Neural network improvement demand automatic generation method and device based on efficiency evaluation
CN108229640B (en) Emotion expression method and device and robot
CN110334716A (en) Characteristic pattern processing method, image processing method and device
CN111369063B (en) Test paper model training method, test paper combining method and related device
CN107870862A (en) Construction method, traversal method of testing and the computing device of new control forecast model
CN110705889A (en) Enterprise screening method, device, equipment and storage medium
CN110020195A (en) Article recommended method and device, storage medium, electronic equipment
WO2022015390A1 (en) Hardware-optimized neural architecture search
Rajapakshe et al. emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination