CN109344968A - Method and device for hyperparameter processing of a neural network - Google Patents
Method and device for hyperparameter processing of a neural network
- Publication number
- CN109344968A CN201811176082.XA
- Authority
- CN
- China
- Prior art keywords
- model
- data
- hyper parameter
- parameter
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A method for hyperparameter processing of a neural network, comprising: dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model; at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations; and selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model. A device for hyperparameter processing of a neural network is also provided. This scheme turns the manual, experience-dependent setting of hyperparameters into an automated process, greatly reducing the workload of model designers and improving working efficiency. At the same time, with better-suited hyperparameter settings, the speed of model training and the precision of the model are improved.
Description
Technical field
The present invention relates to the field of machine learning and neural networks, and in particular to a method and device for hyperparameter processing of a neural network.
Background technique
In machine learning and related fields, the computational model of a neural network (artificial neural network) is inspired by the central nervous system of animals (especially the brain), and is used to estimate or approximate functions that may depend on a large number of inputs and are generally unknown. An artificial neural network typically appears as a system of interconnected "neurons" that compute values from inputs and, owing to their adaptive nature, are capable of machine learning and pattern recognition. Artificial neural networks also possess preliminary adaptive and self-organizing abilities: synaptic weight values are changed during learning or training to adapt to the requirements of the surrounding environment. The same network can perform different functions depending on the mode and content of learning. An artificial neural network is a system with learning ability; it can develop knowledge beyond the original knowledge of its designer. In general, its learning and training modes fall into two kinds. One is supervised (or tutored) learning, in which classification or imitation is performed using given sample standards. The other is unsupervised (or untutored) learning, in which only a learning mode or certain rules are given; the specific learning content then varies with the environment of the system (i.e. the input signals), and the system can automatically discover environmental features and regularities, in a manner closer to the human brain.
A hyperparameter is a parameter whose value is set before the learning process begins, rather than parameter data obtained through training. Under normal conditions, hyperparameters need to be optimized in the machine-learning process, selecting an optimal group of hyperparameters for the learner in order to improve the performance and effect of learning. For example, the learning rate, the regularization parameter, the number of layers of the neural network, the number of neurons in each hidden layer, the number of rounds of learning (epochs), the size of a batch of data (minibatch size), the activation function of the neurons, and the choice of cost function are all hyperparameters.
A concept distinct from hyperparameter is parameter, which is part of what the model learns during training, such as the weights of a neural network. Parameters are obtained through model training, while hyperparameters are configured by humans (they are essentially parameters about the parameters; each time a hyperparameter is changed, the model must be retrained). Hyperparameters influence the learning speed of the neural network and the final classification result: the learning speed of the neural network is mainly related to how fast the cost function declines on the training set, and the final classification result is mainly related to the classification accuracy on the validation set.
In previous neural-network model training, the setting of hyperparameters tended to rely on the experience of the designer. After the hyperparameters were set, the model was trained and evaluated; the hyperparameters were then adjusted according to the training and evaluation results, and the model was retrained and re-evaluated. This process usually has to be repeated many times until a more suitable group of hyperparameters is found. The workload of the model designers is large, and the efficiency is low.
Summary of the invention
In order to solve the above technical problems, the present invention provides a method and device for hyperparameter processing of a neural network, which improve the speed of model training and the precision of the model.
To achieve the object of the invention, the present invention provides a method for hyperparameter processing of a neural network, comprising:
dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model;
at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations;
selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model.
Further, after selecting the parameter combination corresponding to the best mean model score with a mean training time within the specified range as the hyperparameter combination of the model, the method further includes:
performing model evaluation using the selected hyperparameter combination and the data, among the data used for model training, other than the first data.
Further, the parameters in the hyperparameter dictionary include at least the following parameters:
the learning rate, the number of rounds of learning, and the size of a batch of data.
A device for hyperparameter processing of a neural network, comprising a memory and a processor, wherein:
the memory is configured to store a program for hyperparameter processing of a neural network;
the processor is configured to read and execute the program for hyperparameter processing of the neural network, and to perform the following operations:
dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model;
at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations;
selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model.
Further, the processor is also configured to perform model evaluation using the selected hyperparameter combination and the data, among the data used for model training, other than the first data.
Further, the parameters in the hyperparameter dictionary include at least the following parameters:
the learning rate, the number of rounds of learning, and the size of a batch of data.
The scheme of this embodiment turns the manual, experience-dependent setting of hyperparameters into an automated process, greatly reducing the workload of model designers and improving working efficiency. At the same time, with better-suited hyperparameter settings, the speed of model training and the precision of the model are improved.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood through implementation of the invention. The objects and other advantages of the invention can be realized and obtained through the structure particularly pointed out in the description, the claims, and the accompanying drawings.
Detailed description of the invention
The accompanying drawings are provided for a further understanding of the technical solution of the present invention and constitute a part of the specification. Together with the embodiments of the application, they serve to explain the technical solution of the present invention, and do not constitute a limitation thereof.
Fig. 1 is a flowchart of a method for hyperparameter processing of a neural network according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for hyperparameter processing of a neural network according to an application example of the present invention;
Fig. 3 is a schematic diagram of the training-data division in an application example of the present invention;
Fig. 4 is a schematic diagram of a device for hyperparameter processing of a neural network according to an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, in the absence of conflict, the embodiments in the application and the features in the embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one herein.
Fig. 1 is a flowchart of a method for hyperparameter processing of a neural network according to an embodiment of the present invention. As shown in Fig. 1, the method of this embodiment includes:
Step 101: dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model;
Step 102: at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations;
Step 103: selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model.
The method for hyperparameter processing of a neural network proposed by this embodiment of the present invention automates the experience-dependent process, greatly reducing the workload of model designers and improving working efficiency. At the same time, with better-suited hyperparameter settings, the speed of model training and the precision of the model are improved.
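As an illustrative sketch (not part of the patent text), steps 101-103 amount to a grid search combined with K-fold cross-validation. Here `train_fn`, `score_fn`, and `max_time` are assumed, user-supplied names, not identifiers from the patent:

```python
import itertools
import time
import numpy as np

def select_hyperparameters(folds, param_grid, train_fn, score_fn, max_time=None):
    """Steps 101-103: try every combination of hyperparameter values, train K
    models per combination (one per fold), and pick the combination with the
    best mean score whose mean training time is within the specified range."""
    names = list(param_grid)
    best = None
    for values in itertools.product(*(param_grid[n] for n in names)):  # step 102
        params = dict(zip(names, values))
        scores, times = [], []
        for train_data, val_data in folds:  # K-1 parts train, 1 part validates
            start = time.time()
            model = train_fn(params, train_data)
            times.append(time.time() - start)
            scores.append(score_fn(model, val_data))
        mean_score, mean_time = np.mean(scores), np.mean(times)
        if max_time is not None and mean_time > max_time:  # step 103 time filter
            continue
        if best is None or mean_score > best[0]:
            best = (mean_score, params)
    return best[1] if best is not None else None
```

The function is agnostic to the model: any `train_fn(params, data) -> model` and `score_fn(model, data) -> float` can be plugged in.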
Fig. 2 is a flowchart of a method for hyperparameter processing of a neural network according to an application example of the present invention. As shown in Fig. 2, the method of this example includes:
Step 201: the system first automatically divides the data used for model training.
In this example, the original data are automatically divided into a training set and a test set. The training set is used to train the model, and the test set is used to measure the quality of the model's performance.
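A minimal sketch of such an automatic division, assuming the raw data is an indexable sequence; the function name and the test ratio are illustrative, not taken from the patent:

```python
import numpy as np

def split_train_test(data, test_ratio=0.2, seed=0):
    """Shuffle the raw data and split it into a training set (used to train
    the model) and a test set (used to measure the model's performance)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_test = int(len(data) * test_ratio)
    test = [data[i] for i in idx[:n_test]]
    train = [data[i] for i in idx[n_test:]]
    return train, test
```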
Step 202: the system automatically builds a hyperparameter dictionary for the model.
In this example, the hyperparameter dictionary contains several hyperparameters and the corresponding hyperparameter values, for example: {"learning_rate": (0.0001, 0.0009, 0.001, 0.009, 0.01, 0.09, 0.1, 0.9), "batch_size": (1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000), "epoch_num": (1, 5, 9, 15)}. This dictionary states that the learning rate may take the 8 values 0.0001, 0.0009, 0.001, 0.009, 0.01, 0.09, 0.1, and 0.9; that batch_size may take the 8 values 1000, 2000, 3000, 4000, 5000, 6000, 7000, and 8000; and that epoch_num may take the 4 values 1, 5, 9, and 15.
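The example dictionary can be enumerated exhaustively with the standard library; a sketch yielding 8 × 8 × 4 = 256 candidate combinations:

```python
import itertools

param_grid = {
    "learning_rate": (0.0001, 0.0009, 0.001, 0.009, 0.01, 0.09, 0.1, 0.9),
    "batch_size": (1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000),
    "epoch_num": (1, 5, 9, 15),
}

# One dict per permutation-and-combination of the candidate values.
combinations = [dict(zip(param_grid, values))
                for values in itertools.product(*param_grid.values())]
print(len(combinations))  # 256 = 8 * 8 * 4
```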
Step 203: the user defines the number K of data folds for model cross-validation, for example 5. According to this number, the system again automatically divides the training set, for example into 5 equal parts. At each training run, 4 of them are cyclically chosen as the training set and the remaining one as the validation set. As shown in Fig. 3, 5 model trainings are carried out on the same data, each using a different training set and validation set.
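The cyclic division of Fig. 3, where each round holds out one of the K parts for validation, might look like this (function name illustrative):

```python
import numpy as np

def k_fold_splits(train_set, k=5):
    """Divide the training set into k equal parts; on round i, part i is the
    validation set and the other k-1 parts together form the training set."""
    parts = np.array_split(np.arange(len(train_set)), k)
    for i in range(k):
        train_idx = np.concatenate([parts[j] for j in range(k) if j != i])
        yield ([train_set[t] for t in train_idx],
               [train_set[v] for v in parts[i]])
```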
Step 204: the system automatically forms permutations and combinations of the values of the parameters learning_rate, batch_size, and epoch_num in the dictionary. For each permutation and combination, automatic model training and evaluation are carried out using the data of step 203. The model scores of these 5 model evaluations are recorded, and their mean is selected as the final score of this permutation and combination. The times of these 5 model trainings are recorded, and their mean is selected as the final time of this permutation and combination.
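Recording the mean score and mean time over the 5 folds for a single combination could be sketched as follows; `train_fn` and `score_fn` are assumed user-supplied callables, not names from the patent:

```python
import time
import numpy as np

def evaluate_combination(params, folds, train_fn, score_fn):
    """Train and score one hyperparameter combination on each fold, returning
    the mean model score (final score) and mean training time (final time)."""
    scores, times = [], []
    for train_data, val_data in folds:
        start = time.time()
        model = train_fn(params, train_data)      # one of the 5 model trainings
        times.append(time.time() - start)
        scores.append(score_fn(model, val_data))  # one of the 5 evaluations
    return float(np.mean(scores)), float(np.mean(times))
```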
Step 205: after all permutations and combinations of the parameters in the dictionary have been traversed, the parameter combination corresponding to the best mean score whose mean training time is acceptable is taken as the ideal hyperparameter combination of this model.
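The selection rule of step 205, the best mean score among combinations whose mean training time is acceptable, as a sketch; the record format `(params, mean_score, mean_time)` is an assumption:

```python
def select_best(results, max_time):
    """results: list of (params, mean_score, mean_time) records from step 204.
    Keep combinations whose mean training time is within the acceptable bound,
    then return the params of the one with the best mean score."""
    acceptable = [r for r in results if r[2] <= max_time]
    if not acceptable:
        return None
    return max(acceptable, key=lambda r: r[1])[0]
```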
Step 206: using the selected ideal hyperparameter combination and the test data set, the model is evaluated, and the performance data of the final model are obtained.
The method of this example turns the manual, experience-dependent setting of hyperparameters into an automated process, greatly reducing the workload of model designers and improving working efficiency. At the same time, with better-suited hyperparameter settings, the speed of model training and the precision of the model are improved.
Fig. 4 is a schematic diagram of a device for hyperparameter processing of a neural network according to an embodiment of the present invention. As shown in Fig. 4, the device of this embodiment includes a memory and a processor, wherein:
the memory is configured to store a program for hyperparameter processing of a neural network;
the processor is configured to read and execute the program for hyperparameter processing of the neural network, and to perform the following operations:
dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model;
at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations;
selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model.
Optionally, the processor is further configured to perform model evaluation using the selected hyperparameter combination and the data, among the data used for model training, other than the first data.
Optionally, the parameters in the hyperparameter dictionary include at least the following parameters:
the learning rate, the number of rounds of learning, and the size of a batch of data.
The device of this embodiment can automatically complete the matching and evaluation of hyperparameters and the final model test.
An embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the method for hyperparameter processing of a neural network described above.
Those skilled in the art will appreciate that all or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices, may be implemented as software, firmware, hardware, or appropriate combinations thereof. In a hardware embodiment, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or a microprocessor, as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Claims (6)
1. A method for hyperparameter processing of a neural network, characterized by comprising:
dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model;
at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations;
selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model.
2. The method according to claim 1, characterized in that, after selecting the parameter combination corresponding to the best mean model score with a mean training time within the specified range as the hyperparameter combination of the model, the method further comprises:
performing model evaluation using the selected hyperparameter combination and the data, among the data used for model training, other than the first data.
3. The method according to claim 1 or 2, characterized in that the parameters in the hyperparameter dictionary comprise at least the following parameters:
the learning rate, the number of rounds of learning, and the size of a batch of data.
4. A device for hyperparameter processing of a neural network, comprising a memory and a processor, characterized in that:
the memory is configured to store a program for hyperparameter processing of a neural network;
the processor is configured to read and execute the program for hyperparameter processing of the neural network, and to perform the following operations:
dividing first data among the data used for model training into K equal parts, and generating a hyperparameter dictionary for the model;
at each training run, forming permutations and combinations of the values of the parameters in the hyperparameter dictionary, and for each permutation and combination cyclically selecting K-1 parts of the data for training and the remaining part for validation, and recording the mean model score and the mean training time of the K model validations;
selecting the parameter combination corresponding to the best mean model score with a mean training time within a specified range, as the hyperparameter combination of the model.
5. The device according to claim 4, characterized in that:
the processor is further configured to perform model evaluation using the selected hyperparameter combination and the data, among the data used for model training, other than the first data.
6. The device according to claim 4 or 5, characterized in that the parameters in the hyperparameter dictionary comprise at least the following parameters:
the learning rate, the number of rounds of learning, and the size of a batch of data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811176082.XA CN109344968A (en) | 2018-10-10 | 2018-10-10 | A kind of method and device of the hyper parameter processing of neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811176082.XA CN109344968A (en) | 2018-10-10 | 2018-10-10 | A kind of method and device of the hyper parameter processing of neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109344968A true CN109344968A (en) | 2019-02-15 |
Family
ID=65308996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811176082.XA Pending CN109344968A (en) | 2018-10-10 | 2018-10-10 | A kind of method and device of the hyper parameter processing of neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109344968A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110222847A (en) * | 2019-06-13 | 2019-09-10 | 苏州浪潮智能科技有限公司 | A kind of machine learning method and device |
CN110298240A (en) * | 2019-05-21 | 2019-10-01 | 北京迈格威科技有限公司 | A kind of user vehicle recognition methods, device, system and storage medium |
CN110751269A (en) * | 2019-10-18 | 2020-02-04 | 网易(杭州)网络有限公司 | Graph neural network training method, client device and system |
WO2021012894A1 (en) * | 2019-07-23 | 2021-01-28 | 平安科技(深圳)有限公司 | Method and apparatus for obtaining neural network test report, device, and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106408031A (en) * | 2016-09-29 | 2017-02-15 | 南京航空航天大学 | Super parameter optimization method of least squares support vector machine |
CN108062587A (en) * | 2017-12-15 | 2018-05-22 | 清华大学 | The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning |
CN108490782A (en) * | 2018-04-08 | 2018-09-04 | 中南大学 | A kind of method and system being suitable for complex industrial process product quality indicator missing data completion based on selective double layer integrated study |
-
2018
- 2018-10-10 CN CN201811176082.XA patent/CN109344968A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106408031A (en) * | 2016-09-29 | 2017-02-15 | 南京航空航天大学 | Super parameter optimization method of least squares support vector machine |
CN108062587A (en) * | 2017-12-15 | 2018-05-22 | 清华大学 | The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning |
CN108490782A (en) * | 2018-04-08 | 2018-09-04 | 中南大学 | A kind of method and system being suitable for complex industrial process product quality indicator missing data completion based on selective double layer integrated study |
Non-Patent Citations (1)
Title |
---|
陆继明 等: "《同步发电机微机励磁控制》", 30 January 2016 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298240A (en) * | 2019-05-21 | 2019-10-01 | 北京迈格威科技有限公司 | A kind of user vehicle recognition methods, device, system and storage medium |
CN110298240B (en) * | 2019-05-21 | 2022-05-06 | 北京迈格威科技有限公司 | Automobile user identification method, device, system and storage medium |
CN110222847A (en) * | 2019-06-13 | 2019-09-10 | 苏州浪潮智能科技有限公司 | A kind of machine learning method and device |
WO2021012894A1 (en) * | 2019-07-23 | 2021-01-28 | 平安科技(深圳)有限公司 | Method and apparatus for obtaining neural network test report, device, and storage medium |
CN110751269A (en) * | 2019-10-18 | 2020-02-04 | 网易(杭州)网络有限公司 | Graph neural network training method, client device and system |
CN110751269B (en) * | 2019-10-18 | 2022-08-05 | 网易(杭州)网络有限公司 | Graph neural network training method, client device and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109344968A (en) | A kind of method and device of the hyper parameter processing of neural network | |
US20200293892A1 (en) | Model test methods and apparatuses | |
CN110647920A (en) | Transfer learning method and device in machine learning, equipment and readable medium | |
EP3635637A1 (en) | Pre-training system for self-learning agent in virtualized environment | |
KR20200022739A (en) | Method and device to recognize image and method and device to train recognition model based on data augmentation | |
CN109325516B (en) | Image classification-oriented ensemble learning method and device | |
CN108171329A (en) | Deep learning neural network training method, number of plies adjusting apparatus and robot system | |
CN110288007A (en) | The method, apparatus and electronic equipment of data mark | |
CN109635833A (en) | A kind of image-recognizing method and system based on cloud platform and model intelligent recommendation | |
CN109635653A (en) | A kind of plants identification method | |
CN108416187A (en) | A kind of method and device of determining pruning threshold, model pruning method and device | |
CN107544960B (en) | Automatic question-answering method based on variable binding and relation activation | |
CN109086654A (en) | Handwriting model training method, text recognition method, device, equipment and medium | |
CN110458600A (en) | Portrait model training method, device, computer equipment and storage medium | |
CN111494964B (en) | Virtual article recommendation method, model training method, device and storage medium | |
CN110442725A (en) | Entity relation extraction method and device | |
CN105678381A (en) | Gender classification network training method, gender classification method and related device | |
CN108197561A (en) | Human face recognition model optimal control method, device, equipment and storage medium | |
CN107590102A (en) | Random Forest model generation method and device | |
CN109961102A (en) | Image processing method, device, electronic equipment and storage medium | |
CN111026267A (en) | VR electroencephalogram idea control interface system | |
CN110222734B (en) | Bayesian network learning method, intelligent device and storage device | |
CN111160562A (en) | Continuous learning method and device based on meta-learning optimization method | |
CN107274425A (en) | A kind of color image segmentation method and device based on Pulse Coupled Neural Network | |
CN108549899A (en) | A kind of image-recognizing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190215 |