CN110210609A - Model training method, device and terminal based on neural architecture search - Google Patents

Model training method, device and terminal based on neural architecture search

Info

Publication number
CN110210609A
CN110210609A
Authority
CN
China
Prior art keywords
network
model
sub-network
parameter set
hyperparameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910509296.2A
Other languages
Chinese (zh)
Inventor
高参
何伯磊
肖欣延
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910509296.2A priority Critical patent/CN110210609A/en
Publication of CN110210609A publication Critical patent/CN110210609A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a model training method, device and terminal based on neural architecture search. The method includes: generating multiple sub-network hyperparameter sets according to a network generation model; generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models; obtaining the prior-network hyperparameter set of a prior network model; inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value; and inputting the first expected value and the multiple accuracy rates into the network generation model to update the parameters of the network generation model, so as to generate a new network generation model. By using the prior-network hyperparameters to update the network parameters of the controller, the process by which the controller generates sub-network models is guided. This effectively reduces the time needed to reach the optimal sub-network model and improves the generation efficiency of sub-network models.

Description

Model training method, device and terminal based on neural architecture search
Technical field
The present invention relates to the field of machine learning, and in particular to a model training method, device and terminal based on neural architecture search.
Background art
Deep learning models have achieved good results on many tasks, but tuning a deep model is a painstaking job: its numerous hyperparameters and network architecture parameters form a combinatorially large space. Consequently, neural network architecture search and hyperparameter optimization have become a research hotspot in recent years. The technique of having a computer design a network automatically is usually called neural architecture search (Neural Architecture Search, NAS). The goal of NAS is to replace the various hand-designed network structures with automatically designed ones, reducing the time people spend designing the optimal network model. Hyperparameters are the framework parameters inside a machine learning model; they are usually set by hand and adjusted through repeated trial and error. Hyperparameter optimization is a core concern of automated machine learning, and NAS is a sub-problem within hyperparameter optimization. Currently, the main framework of neural architecture search is divided into two parts: a controller (Controller) and a child network (Child Network). The Controller is usually a recurrent neural network (RNN) model. In the RNN model, every five outputs form one layer of neural network, i.e. the Child Network, and the output of the previous step is the input of the next step, which ensures that the RNN predicts the parameters of the N-th layer based on all parameter information of the preceding N-1 layers. The reward mechanism of current techniques mostly comes from the verification results of the Child Network and does not use other external information, so the efficiency of generating sub-network models is low and individualized demands cannot be met.
Summary of the invention
Embodiments of the present invention provide a model training method, device and terminal based on neural architecture search, to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a model training method based on neural architecture search, including:
generating multiple sub-network hyperparameter sets according to a network generation model;
generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
obtaining the prior-network hyperparameter set of a prior network model;
inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
inputting the first expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
In one embodiment, generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models, includes:
generating a first sub-network model according to a first sub-network hyperparameter set;
inputting a training set into the first sub-network model, and training to obtain first model parameters of the first sub-network model;
inputting a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate;
repeating the above steps until multiple accuracy rates are obtained.
In one embodiment, inputting the sub-network hyperparameter sets and the prior-network hyperparameter set respectively into the discriminator to obtain the first expected value includes:
inputting the sub-network hyperparameter sets into the discriminator, and outputting a second expected value;
inputting the prior-network hyperparameter set into the discriminator to obtain a third expected value;
taking the sum of the second expected value and the third expected value as the first expected value.
In one embodiment, the method further includes:
selecting the maximum accuracy rate from the multiple accuracy rates, and obtaining the sub-network model corresponding to the maximum accuracy rate;
replacing the prior network model with the sub-network model corresponding to the maximum accuracy rate;
inputting the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
inputting the fourth expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
In a second aspect, an embodiment of the present invention provides a model training apparatus based on neural architecture search, including:
a sub-network hyperparameter set generation module, configured to generate multiple sub-network hyperparameter sets according to a network generation model;
an accuracy rate calculation module, configured to generate multiple sub-network models according to the multiple sub-network hyperparameter sets, and to verify the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
a prior-network hyperparameter module, configured to obtain the prior-network hyperparameter set of a prior network model;
a first expected value calculation module, configured to input the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
a first network generation model update module, configured to input the first expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
In one embodiment, the accuracy rate calculation module includes:
a first sub-network model generation unit, configured to generate a first sub-network model according to a first sub-network hyperparameter set;
a first model parameter generation unit, configured to input a training set into the first sub-network model, and to train to obtain first model parameters of the first sub-network model;
an accuracy rate calculation unit, configured to input a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate, and to repeat the above steps until multiple accuracy rates are obtained.
In one embodiment, the first expected value calculation module includes:
a second expected value calculation unit, configured to input the sub-network hyperparameter sets into the discriminator and output a second expected value;
a third expected value calculation unit, configured to input the prior-network hyperparameter set into the discriminator to obtain a third expected value;
a first expected value calculation unit, configured to take the sum of the second expected value and the third expected value as the first expected value.
In one embodiment, the apparatus further includes:
a maximum accuracy rate confirmation module, configured to select the maximum accuracy rate from the multiple accuracy rates, and to obtain the sub-network model corresponding to the maximum accuracy rate;
a network model replacement module, configured to replace the prior network model with the sub-network model corresponding to the maximum accuracy rate;
a fourth expected value calculation module, configured to input the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
a second network generation model update module, configured to input the fourth expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
In a third aspect, an embodiment of the present invention provides a model training terminal based on neural architecture search. The functions of the model training terminal based on neural architecture search can be realized by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In a possible design, the structure of the model training terminal based on neural architecture search includes a processor and a memory; the memory is used to store a program that supports the model training terminal based on neural architecture search in executing the above model training method based on neural architecture search, and the processor is configured to execute the program stored in the memory. The model training terminal based on neural architecture search may further include a communication interface for communicating with other devices or communication networks.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the model training terminal based on neural architecture search, including a program involved in executing the above model training method based on neural architecture search.
One of the above technical solutions has the following advantage or beneficial effect: by using the prior-network hyperparameters to update the network parameters of the controller, the process by which the controller generates sub-network models is guided. This effectively reduces the time needed to reach the optimal sub-network model and improves the generation efficiency of sub-network models.
The above summary is provided for the purpose of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals refer to the same or similar parts or elements throughout the several figures. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 shows a flowchart of a model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of a model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 3 shows a schematic diagram of a sub-network model according to an embodiment of the present invention.
Fig. 4 shows a flowchart of another model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 5 shows a flowchart of yet another model training method based on neural architecture search according to an embodiment of the present invention.
Fig. 6 shows a structural block diagram of a model training apparatus based on neural architecture search according to an embodiment of the present invention.
Fig. 7 shows a structural block diagram of another model training apparatus based on neural architecture search according to an embodiment of the present invention.
Fig. 8 shows a structural schematic diagram of a model training terminal based on neural architecture search according to an embodiment of the present invention.
Specific embodiment
Hereinafter, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments can be modified in various different ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature rather than restrictive.
Embodiment one
In a specific embodiment, a model training method based on neural architecture search is provided. Neural architecture search (Neural Architecture Search, NAS) is an optimal-parameter search problem in a high-dimensional space. The detailed process may include: first defining a search space, then finding candidate network structures through a search strategy, evaluating the candidate network structures, and carrying out the next round of search according to the feedback.
As shown in Fig. 1 and Fig. 2, the method specifically includes:
Step S10: generating multiple sub-network hyperparameter sets according to a network generation model.
In one example, the defined search space includes the sub-network hyperparameter sets. A hyperparameter can be a parameter that defines the network structure, for example, how many layers the network has, the operator of each layer, the filter size in a convolution, and so on. Hyperparameters are high-dimensional, discrete, and interdependent. The network generation model can be a recurrent neural network (RNN, Recurrent Neural Network) model, and the sub-network model can be a convolutional neural network (CNN, Convolutional Neural Networks) model. A minimal sketch of such a search-space definition follows.
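As a hedged illustration only (the patent does not fix concrete candidate values, so every slot name and number below is an assumption), a discrete search space of this kind can be written as a mapping from each hyperparameter slot to its candidate values:
    # Hypothetical discrete NAS search space; names and values are illustrative
    # assumptions, not taken from the patent.
    SEARCH_SPACE = {
        "num_layers": [2, 4, 6],                        # how many layers the network has
        "operator": ["conv3x3", "conv5x5", "maxpool"],  # per-layer operator
        "filter_height": [1, 3, 5, 7],                  # convolution kernel height
        "filter_width": [1, 3, 5, 7],                   # convolution kernel width
        "num_filters": [16, 32, 64],                    # number of convolution kernels
    }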
The process of generating multiple sub-network hyperparameter sets according to the network generation model is as follows: initialize the parameters of the network generation model, and give an initial sub-network hyperparameter, e.g. a1. The initial sub-network hyperparameter is input into the network generation model, which generates multiple sub-network hyperparameters, e.g. a2, a3, ..., aN. The N sub-network hyperparameters constitute a sub-network hyperparameter set, e.g. A1 = {a1, a2, a3, ..., aN}. A hedged code sketch of this generation loop follows.
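The sketch below is a minimal NumPy re-implementation of such a controller, not the patent's actual code: a small recurrent network feeds each step's output back in as the next input and samples N hyperparameter tokens to form a set. The vocabulary, sizes and random initialization are assumptions made for the sketch:
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB = [1, 3, 5, 7]   # hypothetical candidate values for a hyperparameter slot
    HIDDEN = 16

    # Parameters of the network generation model (the controller), randomly initialized.
    W_xh = rng.normal(0, 0.1, (len(VOCAB), HIDDEN))
    W_hh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))
    W_hy = rng.normal(0, 0.1, (HIDDEN, len(VOCAB)))

    def sample_hyperparameter_set(n_steps, init_token=0):
        """Autoregressively sample a1, a2, ..., aN; each output is the next input."""
        h = np.zeros(HIDDEN)
        token, out = init_token, []
        for _ in range(n_steps):
            h = np.tanh(np.eye(len(VOCAB))[token] @ W_xh + h @ W_hh)
            logits = h @ W_hy
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            token = rng.choice(len(VOCAB), p=probs)  # sampled hyperparameter index
            out.append(VOCAB[token])
        return out

    print(sample_hyperparameter_set(5))  # e.g. the five outputs that form one layer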
Step S20: generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models.
In one example, one sub-network model is generated according to a sub-network hyperparameter set, e.g. A1 = {a1, a2, a3, ..., aN}. As shown in Fig. 3, each generated sub-network model includes five outputs: Filter Height (the height, or length, of the convolution kernel), Filter Width (the width of the convolution kernel), Stride Height, Stride Width, and Num of Filters (the number of convolution kernels). The five outputs constitute the hyperparameter set of the sub-network model. A data set D = {Dtrain, Dtest} is obtained, where Dtrain is the training set and Dtest is the test set. The training set and test set can come from a specific corpus task such as sentiment analysis or text classification. The multiple sub-network models are trained on the training set Dtrain respectively to obtain the corresponding sub-network model parameters. Then the multiple sub-network models are verified on the test set Dtest respectively to obtain the corresponding verification results, i.e. the accuracy rates. A hedged train-and-evaluate sketch follows.
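Purely as an illustration of this evaluate-each-child loop (the patent's child networks are CNNs on corpus tasks; the stand-in model, synthetic data and names below are assumptions), each sampled hyperparameter set is turned into a model, trained on Dtrain, and scored on Dtest:
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Stand-in data set D = {Dtrain, Dtest}; a real run would use e.g. a
    # text-classification corpus.
    X, y = make_classification(n_samples=400, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    def accuracy_of(hyperparams):
        """Build one child model from a hyperparameter set, train it, and verify it."""
        model = MLPClassifier(hidden_layer_sizes=tuple(hyperparams),
                              max_iter=300, random_state=0)
        model.fit(X_tr, y_tr)            # training yields the sub-network model parameters
        return model.score(X_te, y_te)   # the verification result, i.e. the accuracy rate

    candidate_sets = [[16, 8], [32], [8, 8, 8]]   # sampled sub-network hyperparameter sets
    accuracies = [accuracy_of(h) for h in candidate_sets]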
Step S30: obtaining the prior-network hyperparameter set of the prior network model.
In one example, the process of obtaining the prior-network hyperparameter set of the prior network model is similar to the aforementioned process of obtaining the sub-network hyperparameter sets, and is not repeated here. Since the prior network model is a hand-designed model that meets the desired requirements, it incorporates more intuitive or theoretical prior knowledge. The purpose of introducing the prior network model is therefore to use the prior-network hyperparameters to update the network parameters of the network generation model, and thereby to guide the process by which the network generation model generates sub-network models. This effectively reduces the time needed to generate the optimal sub-network model and improves the generation efficiency of sub-network models.
Step S40: inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value.
In one example, the guidance of the prior network model over the network generation model is introduced through a discriminator. Specifically, the multiple sub-network hyperparameter sets and the prior-network hyperparameter set are input respectively into the discriminator to obtain the first expected value. The first expected value can be 0 or 1: 0 indicates that the sub-network model parameters differ considerably from the prior network model parameters, and 1 indicates that they differ little. The aim is that the closer the generated sub-network model is to the prior network parameters, the better; this shows that the prior network model has effectively guided the process by which the network generation model generates sub-network models.
It should be pointed out that the input order of the sub-network hyperparameter sets and the prior-network hyperparameter set is not limited. The discriminator can be any classification network, such as a CNN model. A hedged sketch of such a discriminator follows.
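As a hedged stand-in (the patent allows any classification network; the logistic-regression scorer and the fabricated training examples below are assumptions made only for this sketch), a discriminator can be trained to output 1 for hyperparameter sets that resemble the prior network's and 0 otherwise:
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    prior_set = np.array([5.0, 5.0, 1.0, 1.0, 32.0])  # prior-network hyperparameter set

    # Fabricated examples near / far from the prior set, for illustration only.
    positives = prior_set + rng.normal(0, 0.5, (50, 5))
    negatives = rng.uniform(0, 64, (50, 5))
    X = np.vstack([positives, negatives])
    y = np.array([1] * 50 + [0] * 50)

    discriminator = LogisticRegression(max_iter=1000).fit(X, y)

    def expected_value(hyperparam_set):
        """0/1 judgment: does this set look close to the prior network's?"""
        return int(discriminator.predict([hyperparam_set])[0])

    reward_1b = expected_value(prior_set)                   # third expected value
    reward_1a = expected_value([4.8, 5.2, 1.0, 1.0, 30.0])  # second expected value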
Step S50: inputting the first expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
By using the pre-designed prior network model to guide the network generation model, prior knowledge is not only input into the network generation model as part of the network parameters, but is also input into the network generation model as the first reward signal (i.e. the first expected value), guiding the update of the parameters of the network generation model.
In one embodiment, as shown in Fig. 4, step S20 includes:
Step S201: generating a first sub-network model according to a first sub-network hyperparameter set;
Step S202: inputting a training set into the first sub-network model, and training to obtain first model parameters of the first sub-network model;
Step S203: inputting a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate;
Step S204: repeating the above steps until multiple accuracy rates are obtained.
In one example, the network generation model generates N sub-network hyperparameters, which constitute the first sub-network hyperparameter set A1 = {a1, a2, ..., aN}. The first sub-network hyperparameter set A1 is used to construct the first sub-network model C1. Given a data set D = {training set Dtrain; test set Dtest}, the first sub-network model C1 is trained on Dtrain and tested on Dtest, and a verification result, e.g. an accuracy rate R11, is obtained. The above steps are repeated M times to obtain M accuracy rates R11, ..., R1M, so the first sub-network model corresponds to M accuracy rates R11, ..., R1M. Steps S201 to S204 are then repeated: the second sub-network model corresponds to M accuracy rates R21, ..., R2M, and so on, until the i-th sub-network model corresponds to M accuracy rates Ri1, ..., RiM. The accuracy rates Ri1, ..., RiM are expressed as the second reward signal Reward2i.
In one embodiment, as shown in Fig. 4, step S40 includes:
Step S401: inputting the sub-network hyperparameter sets into the discriminator, and outputting a second expected value;
Step S402: inputting the prior-network hyperparameter set into the discriminator to obtain a third expected value;
Step S403: taking the sum of the second expected value and the third expected value as the first expected value.
In one example, the sub-network hyperparameter sets are input into the discriminator, which outputs the second expected value Reward1ai; Reward1ai is 0 or 1. The prior-network hyperparameter set is input into the discriminator, which outputs the third expected value Reward1b; Reward1b is 0 or 1. The first expected value Reward1i = A*Reward1b + B*Reward1ai, where A and B are known values.
Finally, Reward1i and Reward2i are input into the network generation model as feedback information, and the parameters of the network generation model are updated through the policy gradient algorithm in reinforcement learning, so as to generate a new network generation model. A hedged sketch of this update follows.
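The sketch below illustrates the reinforcement-learning update under strong simplifying assumptions (a one-step categorical policy in place of the RNN controller, and hand-picked values for A, B, the stand-in accuracies and the step size; none of these come from the patent). The combined feedback Reward1 + Reward2 drives a REINFORCE policy-gradient ascent step on the controller parameters:
    import numpy as np

    rng = np.random.default_rng(2)
    theta = np.zeros(4)        # controller logits over 4 hyperparameter choices
    A, B, LR = 1.0, 1.0, 0.1   # assumed weighting values and learning rate

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    for step in range(100):
        probs = softmax(theta)
        action = rng.choice(4, p=probs)                  # sample a sub-network choice
        reward1 = A * 1 + B * (1 if action == 2 else 0)  # discriminator-style signal
        reward2 = [0.6, 0.7, 0.9, 0.5][action]           # stand-in accuracy (Reward2)
        reward = reward1 + reward2
        grad_logp = np.eye(4)[action] - probs            # grad of log pi(action | theta)
        theta += LR * reward * grad_logp                 # policy-gradient ascent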
In one embodiment, as shown in Fig. 4, the method further includes:
Step S60: selecting the maximum accuracy rate from the multiple accuracy rates, and obtaining the sub-network model corresponding to the maximum accuracy rate;
Step S70: replacing the prior network model with the sub-network model corresponding to the maximum accuracy rate;
Step S80: inputting the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
Step S90: inputting the fourth expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
In one example, the generated sub-network models are ranked by Reward2i, and the best generated sub-network model is used as the prior network model to guide the sub-network generation process. The whole training process is consistent with the above method and is not repeated here. After M network generation models have been trained, the sub-network model with the maximum accuracy rate needs to be selected again from the candidate set of sub-network models to serve as the prior network model, and the network parameters in the network generation model are randomly fixed. This can reduce the search space of the network generation model. A short sketch of the promotion step follows.
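Continuing the earlier evaluation sketch (candidate_sets and accuracies are the hypothetical names introduced there), the promotion of the best child to new prior network might look like this:
    # Promote the most accurate child to be the new prior network (steps S60-S70).
    best_idx = max(range(len(accuracies)), key=accuracies.__getitem__)
    prior_hyperparams = candidate_sets[best_idx]   # replaces the prior-network set
    remaining_sets = [s for i, s in enumerate(candidate_sets) if i != best_idx]
    # prior_hyperparams and remaining_sets are then fed to the discriminator to
    # obtain the fourth expected value, mirroring steps S80-S90.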
Embodiment two
In another specific embodiment, as shown in Fig. 5, a model training apparatus based on neural architecture search is provided, including:
a sub-network hyperparameter set generation module 10, configured to generate multiple sub-network hyperparameter sets according to a network generation model;
an accuracy rate calculation module 20, configured to generate multiple sub-network models according to the multiple sub-network hyperparameter sets, and to verify the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
a prior-network hyperparameter module 30, configured to obtain the prior-network hyperparameter set of a prior network model;
a first expected value calculation module 40, configured to input the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
a first network generation model update module 50, configured to input the first expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
In one embodiment, as shown in Fig. 6, the accuracy rate calculation module 20 includes:
a first sub-network model generation unit 201, configured to generate a first sub-network model according to a first sub-network hyperparameter set;
a first model parameter generation unit 202, configured to input a training set into the first sub-network model, and to train to obtain first model parameters of the first sub-network model;
an accuracy rate calculation unit 203, configured to input a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate, and to repeat the above steps until multiple accuracy rates are obtained.
In one embodiment, as shown in Fig. 6, the first expected value calculation module 40 includes:
a second expected value calculation unit 401, configured to input the sub-network hyperparameter sets into the discriminator and output a second expected value;
a third expected value calculation unit 402, configured to input the prior-network hyperparameter set into the discriminator to obtain a third expected value;
a first expected value calculation unit 403, configured to take the sum of the second expected value and the third expected value as the first expected value.
In one embodiment, as shown in Fig. 7, the apparatus further includes:
a maximum accuracy rate confirmation module 60, configured to select the maximum accuracy rate from the multiple accuracy rates, and to obtain the sub-network model corresponding to the maximum accuracy rate;
a network model replacement module 70, configured to replace the prior network model with the sub-network model corresponding to the maximum accuracy rate;
a fourth expected value calculation module 80, configured to input the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
a second network generation model update module 90, configured to input the fourth expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
For the functions of the modules in the devices of the embodiments of the present invention, reference may be made to the corresponding descriptions in the above method; they are not repeated here.
Embodiment three
Fig. 8 shows a structural block diagram of a model training terminal based on neural architecture search according to an embodiment of the present invention. As shown in Fig. 8, the terminal includes a memory 910 and a processor 920, and a computer program that can run on the processor 920 is stored in the memory 910. When executing the computer program, the processor 920 implements the model training method based on neural architecture search in the above embodiments. The number of memories 910 and processors 920 may each be one or more.
The terminal further includes:
a communication interface 930, configured to communicate with external devices for data interaction.
The memory 910 may include a high-speed RAM memory, and may also include a non-volatile memory, for example at least one magnetic disk memory.
If the memory 910, the processor 920 and the communication interface 930 are implemented independently, the memory 910, the processor 920 and the communication interface 930 can be connected to each other through a bus and complete mutual communication. The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 8, but this does not mean that there is only one bus or only one type of bus.
Optionally, in a specific implementation, if the memory 910, the processor 920 and the communication interface 930 are integrated on one chip, the memory 910, the processor 920 and the communication interface 930 can complete mutual communication through an internal interface.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program. When the program is executed by a processor, any one of the methods in the above embodiments is implemented.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples" and the like means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the features of different embodiments or examples described in this specification, provided that they do not conflict with each other.
In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless otherwise specifically defined.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be executed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein may be considered, for example, an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus or device). For the purposes of this specification, a "computer-readable medium" can be any apparatus that can contain, store, communicate, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) with one or more wirings, a portable computer disk cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium could even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, processing it in another suitable way, and then stored in a computer memory.
It should be understood that the parts of the present invention can be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods can be implemented with software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art can be used: a discrete logic circuit with logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit with suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those skilled in the art will understand that all or part of the steps carried in the method of the above embodiments can be completed by instructing relevant hardware through a program; the program can be stored in a computer-readable storage medium, and the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The above integrated module can be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium. The storage medium can be a read-only memory, a magnetic disk, an optical disc, or the like.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various changes or replacements within the technical scope disclosed by the present invention, and these should all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A model training method based on neural architecture search, characterized by comprising:
generating multiple sub-network hyperparameter sets according to a network generation model;
generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
obtaining the prior-network hyperparameter set of a prior network model;
inputting the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
inputting the first expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
2. The method according to claim 1, characterized in that generating multiple sub-network models according to the multiple sub-network hyperparameter sets, and verifying the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models, comprises:
generating a first sub-network model according to a first sub-network hyperparameter set;
inputting a training set into the first sub-network model, and training to obtain first model parameters of the first sub-network model;
inputting a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate;
repeating the above steps until multiple accuracy rates are obtained.
3. The method according to claim 1, characterized in that inputting the sub-network hyperparameter sets and the prior-network hyperparameter set respectively into the discriminator to obtain the first expected value comprises:
inputting the sub-network hyperparameter sets into the discriminator, and outputting a second expected value;
inputting the prior-network hyperparameter set into the discriminator to obtain a third expected value;
taking the sum of the second expected value and the third expected value as the first expected value.
4. The method according to claim 1, characterized by further comprising:
selecting the maximum accuracy rate from the multiple accuracy rates, and obtaining the sub-network model corresponding to the maximum accuracy rate;
replacing the prior network model with the sub-network model corresponding to the maximum accuracy rate;
inputting the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
inputting the fourth expected value and the multiple accuracy rates into the network generation model, and updating the parameters of the network generation model to generate a new network generation model.
5. A model training apparatus based on neural architecture search, characterized by comprising:
a sub-network hyperparameter set generation module, configured to generate multiple sub-network hyperparameter sets according to a network generation model;
an accuracy rate calculation module, configured to generate multiple sub-network models according to the multiple sub-network hyperparameter sets, and to verify the multiple sub-network models respectively to obtain multiple accuracy rates corresponding to the multiple sub-network models;
a prior-network hyperparameter module, configured to obtain the prior-network hyperparameter set of a prior network model;
a first expected value calculation module, configured to input the multiple sub-network hyperparameter sets and the prior-network hyperparameter set respectively into a discriminator to obtain a first expected value;
a first network generation model update module, configured to input the first expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
6. The apparatus according to claim 5, characterized in that the accuracy rate calculation module comprises:
a first sub-network model generation unit, configured to generate a first sub-network model according to a first sub-network hyperparameter set;
a first model parameter generation unit, configured to input a training set into the first sub-network model, and to train to obtain first model parameters of the first sub-network model;
an accuracy rate calculation unit, configured to input a test set into the first sub-network model with the first model parameters to obtain a first accuracy rate, and to repeat the above steps until multiple accuracy rates are obtained.
7. The apparatus according to claim 5, characterized in that the first expected value calculation module comprises:
a second expected value calculation unit, configured to input the sub-network hyperparameter sets into the discriminator and output a second expected value;
a third expected value calculation unit, configured to input the prior-network hyperparameter set into the discriminator to obtain a third expected value;
a first expected value calculation unit, configured to take the sum of the second expected value and the third expected value as the first expected value.
8. The apparatus according to claim 5, characterized by further comprising:
a maximum accuracy rate confirmation module, configured to select the maximum accuracy rate from the multiple accuracy rates, and to obtain the sub-network model corresponding to the maximum accuracy rate;
a network model replacement module, configured to replace the prior network model with the sub-network model corresponding to the maximum accuracy rate;
a fourth expected value calculation module, configured to input the hyperparameter set of the sub-network model corresponding to the maximum accuracy rate and the remaining sub-network hyperparameter sets respectively into the discriminator to obtain a fourth expected value;
a second network generation model update module, configured to input the fourth expected value and the multiple accuracy rates into the network generation model, and to update the parameters of the network generation model to generate a new network generation model.
9. A model training terminal based on neural architecture search, characterized by comprising:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 4.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 4.
CN201910509296.2A 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search Pending CN110210609A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910509296.2A CN110210609A (en) 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910509296.2A CN110210609A (en) 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search

Publications (1)

Publication Number Publication Date
CN110210609A true CN110210609A (en) 2019-09-06

Family

ID=67792403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910509296.2A Pending CN110210609A (en) 2019-06-12 2019-06-12 Model training method, device and terminal based on neural architecture search

Country Status (1)

Country Link
CN (1) CN110210609A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334949A (en) * 2018-02-11 2018-07-27 浙江工业大学 A kind of tachytelic evolution method of optimization depth convolutional neural networks structure
CN109598332A (en) * 2018-11-14 2019-04-09 北京市商汤科技开发有限公司 Neural network generation method and device, electronic equipment and storage medium
CN109816116A (en) * 2019-01-17 2019-05-28 腾讯科技(深圳)有限公司 The optimization method and device of hyper parameter in machine learning model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BARRET ZOPH et al.: "Neural Architecture Search with Reinforcement Learning", https://arxiv.org/abs/1611.01578 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110543944B (en) * 2019-09-11 2022-08-02 北京百度网讯科技有限公司 Neural network structure searching method, apparatus, electronic device, and medium
CN110543944A (en) * 2019-09-11 2019-12-06 北京百度网讯科技有限公司 neural network structure searching method, apparatus, electronic device, and medium
CN110689127A (en) * 2019-10-15 2020-01-14 北京小米智能科技有限公司 Neural network structure model searching method, device and storage medium
CN110889450A (en) * 2019-11-27 2020-03-17 腾讯科技(深圳)有限公司 Method and device for super-parameter tuning and model building
CN111126564A (en) * 2019-11-27 2020-05-08 东软集团股份有限公司 Neural network structure searching method, device and equipment
CN110889450B (en) * 2019-11-27 2023-08-11 腾讯科技(深圳)有限公司 Super-parameter tuning and model construction method and device
CN111126564B (en) * 2019-11-27 2023-08-08 东软集团股份有限公司 Neural network structure searching method, device and equipment
TWI769418B (en) * 2019-12-05 2022-07-01 財團法人工業技術研究院 Method and electronic device for selecting neural network hyperparameters
US11537893B2 (en) 2019-12-05 2022-12-27 Industrial Technology Research Institute Method and electronic device for selecting deep neural network hyperparameters
CN111488971A (en) * 2020-04-09 2020-08-04 北京百度网讯科技有限公司 Neural network model searching method and device, and image processing method and device
CN111488971B (en) * 2020-04-09 2023-10-24 北京百度网讯科技有限公司 Neural network model searching method and device, and image processing method and device
CN111444884A (en) * 2020-04-22 2020-07-24 万翼科技有限公司 Method, apparatus and computer-readable storage medium for recognizing a component in an image
CN111523665A (en) * 2020-04-23 2020-08-11 北京百度网讯科技有限公司 Super-network parameter updating method and device and electronic equipment
CN111523665B (en) * 2020-04-23 2024-02-13 北京百度网讯科技有限公司 Super network parameter updating method and device and electronic equipment
WO2021114625A1 (en) * 2020-05-28 2021-06-17 平安科技(深圳)有限公司 Network structure construction method and apparatus for use in multi-task scenario
CN111652354A (en) * 2020-05-29 2020-09-11 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training a hyper-network
CN111652354B (en) * 2020-05-29 2023-10-24 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training super network
TWI771745B (en) * 2020-09-07 2022-07-21 威盛電子股份有限公司 Hyper-parameter setting method and building platform for neural network model

Similar Documents

Publication Publication Date Title
CN110210609A (en) Model training method, device and terminal based on neural architecture search
US20230252327A1 (en) Neural architecture search for convolutional neural networks
US11829874B2 (en) Neural architecture search
EP3711000B1 (en) Regularized neural network architecture search
Jozwik et al. Deep convolutional neural networks outperform feature-based but not categorical models in explaining object similarity judgments
EP3871132A1 (en) Generating integrated circuit floorplans using neural networks
US20200265315A1 (en) Neural architecture search
US20190354868A1 (en) Multi-task neural networks with task-specific paths
JP6384065B2 (en) Information processing apparatus, learning method, and program
CN109034365A (en) The training method and device of deep learning model
CN107977748B (en) Multivariable distorted time sequence prediction method
CN109948680A (en) The classification method and system of medical record data
Orsborn et al. Multiagent shape grammar implementation: automatically generating form concepts according to a preference function
Silva et al. Finding multiple roots of a box-constrained system of nonlinear equations with a biased random-key genetic algorithm
CN110187760A (en) Intelligent interactive method and device
CN113609337A (en) Pre-training method, device, equipment and medium of graph neural network
CN112580807A (en) Neural network improvement demand automatic generation method and device based on efficiency evaluation
CN108229640B (en) Emotion expression method and device and robot
CN110334716A (en) Characteristic pattern processing method, image processing method and device
CN111369063B (en) Test paper model training method, test paper combining method and related device
CN107870862A (en) Construction method, traversal method of testing and the computing device of new control forecast model
CN110705889A (en) Enterprise screening method, device, equipment and storage medium
CN110020195A (en) Article recommended method and device, storage medium, electronic equipment
WO2022015390A1 (en) Hardware-optimized neural architecture search
Rajapakshe et al. emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination