CN107885967A - A regression model hyperparameter optimization method - Google Patents

A regression model hyperparameter optimization method

Info

Publication number
CN107885967A
CN107885967A
Authority
CN
China
Prior art keywords
parameter
comparability
hyperparameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710997220.XA
Other languages
Chinese (zh)
Inventor
姜高霞
王文剑
杜航原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi University
Original Assignee
Shanxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi University filed Critical Shanxi University
Priority to CN201710997220.XA priority Critical patent/CN107885967A/en
Publication of CN107885967A publication Critical patent/CN107885967A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Z INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
    • G16Z99/00 Subject matter not provided for in other main groups of this subclass

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a regression model hyperparameter optimization method comprising: Step 1, training on the whole data set, in turn, the regression model with hyperparameter p_l, obtaining L trained candidate models; Step 2, obtaining the error of each candidate model on each sample (x_i, y_i); Step 3, computing the direction-similarity matrix; Step 4, computing the Comparability of each parameter; Step 5, finding the hyperparameter with the minimum Comparability and returning it as the optimal parameter. The parameter optimization procedure provided by the invention requires no further manually set parameters, so the optimization is free of subjective interference; the optimization requires no data splitting, giving it high efficiency and determinism; the main computational parts of the method are relatively independent and can be processed in parallel on big data; the efficiency of the method is 6-8 times that of cross-validation.

Description

A regression model hyperparameter optimization method
Technical field
The invention belongs to the field of machine learning and optimization modeling, and relates in particular to a hyperparameter optimization method for regression models.
Background technology
Against the background of rapidly growing data and information, machine learning, as a core driving force of data mining, has become an indispensable link in the knowledge extraction process. Inferring the correspondence between input data and output data from empirical data is a major class of problems that machine learning aims to solve. When both the input data and the output data are numeric, such problems are regression prediction problems. Examples include predicting precipitation from temperature, humidity and other conditions; predicting electric load from a region's population, GDP and resident-consumption index; or predicting the next day's opening price of a stock index future by regression on the previous day's closing price, gain and trading volume.
At present, many excellent regression models are used in a variety of practical prediction problems. Common regression models include ridge regression, the LASSO (Least Absolute Shrinkage and Selection Operator), support vector regression, and the elastic net (a mixture of LASSO and ridge regression). These models can in fact be viewed as optimization problems with a regularization term (whose main purpose is to prevent over-fitting), and the regularization parameter in that term usually has to be set in advance. If it is set improperly, the model's prediction performance may be very poor. In practice, a parameter value chosen from experience may not perform badly, but it is rarely the best.
For a given prediction problem, different models have different prediction performance. Even for the same model, different hyperparameters (such as the regularization parameter mentioned above) can lead to very different prediction accuracy. No single model or fixed parameter achieves optimal prediction accuracy on all data. Therefore, when using a regression model, however good, to address a practical problem, the hyperparameter must be selected or tuned carefully.
Cross-validation (CV) is a common, general parameter selection method: it trains on one part of the data and validates on another part to estimate the predictive ability of the model under each parameter, and then selects the parameter it judges to have the best predictive ability. However, training and validation require a random split of the data, which introduces uncertainty into the error estimate and non-uniqueness into the selected parameter; in addition, the splitting, training and validation steps increase the computational cost. When computing power is limited and the application is time-critical (such as real-time short-term traffic-flow prediction), this kind of parameter optimization falls short. How to select the hyperparameter of a regression model efficiently and accurately is therefore an important foundation for effectively using regression models in practical prediction.
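The cross-validation baseline described above can be sketched as follows. This is a minimal illustration, not part of the patent: the deterministic strided folds, the toy shrinkage model `fit_mean_shrink`, and the candidate grid are all assumptions made for the example.

```python
# Minimal k-fold cross-validation for hyperparameter selection, the baseline
# the invention is compared against. Each candidate is scored by its average
# validation error over k folds; with a truly random split the selected
# parameter would be non-deterministic, which is the drawback noted above.

def kfold_select(xs, ys, candidates, fit, k=5):
    n = len(xs)
    folds = [list(range(i, n, k)) for i in range(k)]  # deterministic strided folds
    best_p, best_err = None, float("inf")
    for p in candidates:
        err = 0.0
        for fold in folds:
            train = [i for i in range(n) if i not in fold]
            model = fit([xs[i] for i in train], [ys[i] for i in train], p)
            err += sum((model(xs[i]) - ys[i]) ** 2 for i in fold) / len(fold)
        if err / k < best_err:
            best_p, best_err = p, err / k
    return best_p

def fit_mean_shrink(xs, ys, p):
    """Toy 'model' (hypothetical): predict the training mean shrunk by p."""
    mean = sum(ys) / len(ys)
    return lambda x: mean / (1.0 + p)

xs = list(range(10))
ys = [2.0] * 10
best = kfold_select(xs, ys, [0.0, 0.5, 1.0], fit_mean_shrink)
```

On this constant-output toy data, zero shrinkage fits perfectly, so the selection returns 0.0; the point is only to show the split/train/validate loop the invention avoids.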
Content of the invention
The aim of the invention is to solve the above problems by proposing an efficient and accurate parameter optimization method, thereby improving the predictive ability of regression models. The invention abandons the "split, train and validate" pattern of cross-validation and trains the regression model directly on the original data; the best parameter is then selected using the similarity between the training errors corresponding to the candidate hyperparameters. This both improves efficiency and guarantees the uniqueness of the result.
The regression model hyperparameter optimization method of the invention comprises the following steps:
Step 1: on the whole data set {(x_i, y_i), i = 1, 2, ..., n}, train in turn the regression model m(·|p) with hyperparameter p = p_l, obtaining the L trained candidate models {m(·|p_l), l = 1, 2, ..., L};
Step 2: obtain the error of each candidate model m(·|p_l) on each sample (x_i, y_i): e_i^(p_l) = m(x_i|p_l) - y_i;
Step 3: compute the direction-similarity matrix S = (s_uv)_{L×L};
Step 4: the Comparability of the l-th parameter is the average of all elements in row k and column 2l-k of the direction-similarity matrix, where the row index k satisfies 1 ≤ k ≤ l; compute the Comparability SS(p_l) of each parameter as SS(p_l) = (1/w) Σ_{u+v=2l, u≤v} s_uv,
where w = min{l, L-l+1}, u, v ∈ {1, 2, ..., L}, and p_u, p_v ∈ P;
Step 5: find the hyperparameter with the minimum Comparability and return it as the optimal parameter.
Set the optimal parameter p* = p_1 and the minimum Comparability SS* = 1. For every parameter in P, perform the following in turn: compare its Comparability SS(p_l) with SS*; if SS(p_l) < SS*, update SS* = SS(p_l) and p* = p_l.
The final p* is the selected optimal parameter.
The advantages of the invention are:
(1) the parameter optimization procedure requires no further manually set parameters, so the optimization is free of subjective interference;
(2) the optimization requires no data splitting, giving it high efficiency and determinism;
(3) the main computational parts of the method are relatively independent and can be processed in parallel on big data;
(4) the efficiency of the method is 6-8 times that of cross-validation;
(5) the predictive ability of the parameter selected by the method is close to that of the parameter selected by conventional cross-validation.
Brief description of the drawings
Fig. 1 is a flow chart of the regression model hyperparameter optimization method of the invention.
Fig. 2 compares the optimization time of the invention with that of 10-fold cross-validation (10FCV).
Fig. 3 compares the prediction error of the parameter selected by the invention with that of the parameter selected by 10-fold cross-validation (10FCV).
Embodiment
The invention is described in further detail below with reference to the drawings and embodiments.
Hyperparameter optimization problem: given a regression data set {(x_i, y_i), i = 1, 2, ..., n} and a regression model m(·|p), select for the hyperparameter p of the model, from the L candidates {p_l, l = 1, 2, ..., L}, the parameter p* best suited to this data set, i.e. the one for which the regression model m(·|p*) achieves the best prediction accuracy on this data set.
Notation: {(x_i, y_i), i = 1, 2, ..., n} denotes the available regression data set, where x_i and y_i are the input vector and output value of the i-th sample, respectively, and n is the sample size of the data set; m(·|p) denotes the regression model, where p is its hyperparameter, whose range is a set P = {p_l, l = 1, 2, ..., L} of L elements forming an arithmetic or geometric sequence; p* is the optimal hyperparameter, with p* ∈ P; the positive integer k takes values between 1 and l; e_i^(p_l) is the training error of the regression model m(·|p_l) on the i-th sample (x_i, y_i); S denotes the L×L direction-similarity matrix, whose elements are s_uv with u, v ∈ {1, 2, ..., L} and p_u, p_v ∈ P; I(·) is the indicator function, returning 1 when its argument is logically true and 0 otherwise; SS(p_l) denotes the Comparability of hyperparameter p_l; and w = min{l, L-l+1} is a constant.
Specifically, the invention is a regression model hyperparameter optimization method whose flow is shown in Fig. 1 and which comprises the following steps.
The initial problem is: for a numeric data set {(x_i, y_i), i = 1, 2, ..., n} to be learned, there is a regression model m(·|p) with an undetermined hyperparameter p; the candidate parameter set, an arithmetic or geometric sequence, is P = {p_l, l = 1, 2, ..., L}.
Step 1: on the whole data set {(x_i, y_i), i = 1, 2, ..., n}, train in turn the regression model m(·|p) with hyperparameter p = p_l, obtaining the L trained candidate models {m(·|p_l), l = 1, 2, ..., L}. Here x_i and y_i are the input and output of the i-th sample, n is the sample size of the data set, p in m(·|p) is the hyperparameter to be optimized, its candidate set is P = {p_l, l = 1, 2, ..., L}, and L is the number of candidate hyperparameters.
Step 2: obtain the error of each candidate model m(·|p_l) on each sample (x_i, y_i):
e_i^(p_l) = m(x_i|p_l) - y_i    (1)
i.e. the model's predicted value m(x_i|p_l) minus the true output y_i is the error e_i^(p_l). Once the data, the model and the parameter are given, the training errors are determined.
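Steps 1 and 2 can be sketched as follows. The patent is model-agnostic, so the closed-form 1-D ridge regression used here as the model m(·|p), and the toy data, are assumptions made only for illustration.

```python
# Sketch of Steps 1-2: train one model per candidate hyperparameter on the
# FULL data set (no splitting) and record each model's signed error on every
# sample. A 1-D ridge regression with a closed-form solution stands in for
# the regression model m(.|p); the method itself works with any such model.

def fit_ridge_1d(xs, ys, p):
    """Closed-form 1-D ridge: minimise sum (w*x - y)^2 + p*w^2."""
    w = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + p)
    return lambda x: w * x

def error_matrix(xs, ys, candidates):
    """E[l][i] = m(x_i | p_l) - y_i, i.e. equation (1) for every l and i."""
    E = []
    for p in candidates:
        model = fit_ridge_1d(xs, ys, p)
        E.append([model(x) - y for x, y in zip(xs, ys)])
    return E

xs = [1.0, 2.0, 3.0, 4.0]          # toy inputs (illustrative)
ys = [1.1, 1.9, 3.2, 3.9]          # toy outputs
candidates = [0.01, 0.1, 1.0, 10.0]  # an ordered grid P of candidates
E = error_matrix(xs, ys, candidates)
```

With heavy regularization (p = 10) the fitted slope shrinks toward zero, so that model under-predicts every sample; its errors all share one sign, which is exactly the signal the next step exploits.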
Step 3: compute the direction-similarity matrix S = (s_uv)_{L×L}, with indices u, v ∈ {1, 2, ..., L}. For any two candidate hyperparameters p_u, p_v ∈ P, the element s_uv in row u and column v of S is computed as
s_uv = (1/n) Σ_{i=1}^{n} I( e_i^(p_u) · e_i^(p_v) > 0 )    (2)
where I(·) is the indicator function, returning 1 when the inequality e_i^(p_u) · e_i^(p_v) > 0 holds and 0 otherwise.
Suppose there are L candidate parameters. The direction similarity of any two parameters is the frequency with which their training errors have the same sign (direction). All the direction similarities form the L×L direction-similarity matrix; clearly this matrix is symmetric and its main diagonal is 1.
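A minimal sketch of Step 3, computing the direction-similarity matrix from a training-error matrix; the 3×3 toy error matrix is an illustrative assumption, not data from the patent.

```python
# Sketch of Step 3: entry s_uv of the direction-similarity matrix is the
# fraction of samples on which the models for p_u and p_v err in the SAME
# direction (both over-predict or both under-predict), per equation (2).

def direction_similarity(E):
    L, n = len(E), len(E[0])
    S = [[0.0] * L for _ in range(L)]
    for u in range(L):
        for v in range(L):
            agree = sum(1 for i in range(n) if E[u][i] * E[v][i] > 0)
            S[u][v] = agree / n
    return S

E = [[-0.1,  0.2, -0.3],   # errors of model for p_1 on 3 samples
     [-0.2,  0.1,  0.4],   # errors of model for p_2
     [-0.3, -0.2,  0.5]]   # errors of model for p_3
S = direction_similarity(E)
```

As the text states, the result is symmetric with a diagonal of 1: here S[0][1] = 2/3 because the first two models agree in sign on two of the three samples.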
Step 4: the Comparability of the l-th parameter is the average of all elements in row k and column 2l-k of the direction-similarity matrix, where the row index k satisfies 1 ≤ k ≤ l; compute the Comparability SS(p_l) of each parameter according to
SS(p_l) = (1/w) Σ_{u+v=2l, u≤v} s_uv    (3)
where w = min{l, L-l+1}, u, v ∈ {1, 2, ..., L}, and p_u, p_v ∈ P.
The Comparability of the l-th parameter (l = 1, 2, ..., L) is thus the average of the elements in row k, column 2l-k of the direction-similarity matrix (k ≤ l). For example, the Comparability of the 1st parameter is the element in row 1, column 1; that of the 2nd parameter is the average of the element in row 1, column 3 and the element in row 2, column 2; that of the 3rd parameter is the average of the elements in row 1 column 5, row 2 column 4 and row 3 column 3; and so on.
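Step 4 can be sketched directly from equation (3); the 3×3 similarity matrix below is an illustrative assumption chosen so the worked cases above are easy to check.

```python
# Sketch of Step 4: the Comparability SS(p_l) is the mean of the
# anti-diagonal entries s_uv with u + v = 2l and u <= v (1-based indices),
# i.e. the average similarity of parameter pairs straddling p_l
# symmetrically, with normalizer w = min(l, L - l + 1) per equation (3).

def comparability(S, l):
    """SS(p_l) for 1-based l over an L x L direction-similarity matrix."""
    L = len(S)
    w = min(l, L - l + 1)
    total = sum(S[u - 1][v - 1]
                for u in range(1, L + 1) for v in range(u, L + 1)
                if u + v == 2 * l)
    return total / w

S = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.7],
     [0.3, 0.7, 1.0]]
# l=1: only s_11           -> 1.0
# l=2: mean of s_13, s_22  -> (0.3 + 1.0) / 2 = 0.65
# l=3: only s_33           -> 1.0
```

Note that for the first and last candidates the only qualifying pair is the diagonal element, so their Comparability is always 1; the minimum in Step 5 therefore falls on an interior candidate.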
Step 5: find the hyperparameter with the minimum Comparability and return it as the optimal parameter.
Set the optimal parameter p* = p_1 and the minimum Comparability SS* = 1. For every parameter in P, perform the following in turn: compare its Comparability SS(p_l) with SS*; if SS(p_l) < SS*, update SS* = SS(p_l) and p* = p_l.
The final p* is the optimized hyperparameter value.
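The whole procedure can be sketched end to end: given the error matrix from Steps 1-2, compute S (Step 3), score every candidate (Step 4), and return the minimizer (Step 5). The toy error matrix and candidate grid below are assumptions for illustration.

```python
# End-to-end sketch of Steps 3-5: from an error matrix E[l][i] produced by
# the L candidate models, select the candidate with minimum Comparability.
# All quantities follow equations (1)-(3).

def select_hyperparameter(E, candidates):
    L, n = len(E), len(E[0])
    # Step 3: direction-similarity matrix (fraction of same-sign errors).
    S = [[sum(1 for i in range(n) if E[u][i] * E[v][i] > 0) / n
          for v in range(L)] for u in range(L)]
    # Step 4: Comparability of each candidate (1-based l).
    def ss(l):
        w = min(l, L - l + 1)
        return sum(S[u - 1][v - 1]
                   for u in range(1, L + 1) for v in range(u, L + 1)
                   if u + v == 2 * l) / w
    scores = [ss(l) for l in range(1, L + 1)]
    # Step 5: candidate with minimum Comparability wins.
    best = min(range(L), key=lambda l: scores[l])
    return candidates[best], scores

E = [[ 0.5,  0.4, -0.3,  0.2],   # toy errors for 4 candidates on 4 samples
     [ 0.3, -0.2, -0.1,  0.1],
     [-0.2, -0.3,  0.2, -0.1],
     [-0.4, -0.5,  0.3, -0.2]]
candidates = [0.1, 1.0, 10.0, 100.0]
p_star, scores = select_hyperparameter(E, candidates)
```

Here the second candidate's neighbours disagree in error direction most often (its anti-diagonal averages 0.5), so p* = 1.0 is returned; no data split, no randomness, and the S computation parallelizes over (u, v) pairs, matching the advantages claimed above.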
The effect of the invention can be further illustrated by the following simulation results.
All regression data sets come from the UCI public repository (http://archive.ics.uci.edu/ml/datasets.html): Housing, Energy efficiency, Concrete, MG, Airfoil self-noise, Yacht Hydrodynamics, Geographical Original of Music, Skill Craft Master Table, Combined Cycle Power Plant, and Condition Based Maintenance. The regression model is support vector regression, the parameter to be optimized is the kernel scale parameter γ, and the candidate set is {2^-5, 2^-4, ..., 2^5}.
Fig. 2 compares the parameter optimization time on the above 10 data sets. Because 10FCV takes longer than this method, the ratio of the former's optimization time to the latter's is reported. Both were repeated 10 times, with little deviation; the figure plots the mean and standard deviation of the time ratios. As the figure shows, the efficiency of the invention is 6-8 times that of cross-validation.
Fig. 3 compares the regression prediction error of the parameter selected by this method with that of the parameter selected by 10FCV. For ease of comparison, a diagonal is added in the figure as a reference line. As the figure shows, on all 10 data sets the two prediction errors lie near the diagonal, so the prediction errors of the parameters selected by the two methods differ little.
The invention discloses a parameter optimization method for regression prediction models, used to select, from numerous candidates, the hyperparameter with the best generalization (predictive) ability. The invention applies to ordered grid-type candidate parameter sets, usually an arithmetic or geometric sequence. Its principal features are: the whole parameter optimization procedure requires no further manually set parameters, eliminating subjective interference; the optimization requires no data splitting, which both improves efficiency and avoids the randomness that splitting introduces; the parts accounting for the main computational cost are relatively independent and can be processed in parallel on big data; and the parameter selected by this method performs close to that selected by conventional cross-validation, at 6-8 times the efficiency.

Claims (1)

1. A regression model hyperparameter optimization method, comprising the following steps:
Step 1: on the whole data set {(x_i, y_i), i = 1, 2, ..., n}, train in turn the regression model m(·|p) with hyperparameter p = p_l, obtaining the L trained candidate models {m(·|p_l), l = 1, 2, ..., L}; wherein x_i and y_i are the input and output of the i-th sample, n is the sample size of the data set, p in m(·|p) is the hyperparameter to be optimized, its candidate set is P = {p_l, l = 1, 2, ..., L}, and L is the number of candidate hyperparameters;
Step 2: obtain the error of each candidate model m(·|p_l) on each sample (x_i, y_i):
e_i^(p_l) = m(x_i|p_l) - y_i    (1)
i.e. the model's predicted value m(x_i|p_l) minus the true output y_i is the error e_i^(p_l);
Step 3: compute the direction-similarity matrix S = (s_uv)_{L×L}, with indices u, v ∈ {1, 2, ..., L}; for any two candidate hyperparameters p_u, p_v ∈ P, the element s_uv in row u and column v of S is computed as
s_uv = (1/n) Σ_{i=1}^{n} I( e_i^(p_u) · e_i^(p_v) > 0 )    (2)
wherein I(·) is the indicator function, returning 1 when the inequality e_i^(p_u) · e_i^(p_v) > 0 holds and 0 otherwise;
Step 4: the Comparability of the l-th parameter is the average of all elements in row k and column 2l-k of the direction-similarity matrix, where the row index k satisfies 1 ≤ k ≤ l; compute the Comparability SS(p_l) of each parameter according to
SS(p_l) = (1/w) Σ_{u+v=2l, u≤v} s_uv    (3)
wherein w = min{l, L-l+1}, u, v ∈ {1, 2, ..., L}, and p_u, p_v ∈ P;
Step 5: find the hyperparameter with the minimum Comparability and return it as the optimal parameter:
set the optimal parameter p* = p_1 and the minimum Comparability SS* = 1; for every parameter in P, perform the following in turn: compare its Comparability SS(p_l) with SS*; if SS(p_l) < SS*, update SS* = SS(p_l) and p* = p_l;
the final p* is the selected optimal parameter.
CN201710997220.XA 2017-10-24 2017-10-24 A kind of regression model hyperparameter optimization method Pending CN107885967A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710997220.XA CN107885967A (en) 2017-10-24 2017-10-24 A kind of regression model hyperparameter optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710997220.XA CN107885967A (en) 2017-10-24 2017-10-24 A kind of regression model hyperparameter optimization method

Publications (1)

Publication Number Publication Date
CN107885967A true CN107885967A (en) 2018-04-06

Family

ID=61782117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710997220.XA Pending CN107885967A (en) 2017-10-24 2017-10-24 A kind of regression model hyperparameter optimization method

Country Status (1)

Country Link
CN (1) CN107885967A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508455A (en) * 2018-10-18 2019-03-22 山西大学 A kind of GloVe hyper parameter tuning method
CN109508455B (en) * 2018-10-18 2021-11-19 山西大学 GloVe super-parameter tuning method
CN109816116A (en) * 2019-01-17 2019-05-28 腾讯科技(深圳)有限公司 The optimization method and device of hyper parameter in machine learning model
CN109816116B (en) * 2019-01-17 2021-01-29 腾讯科技(深圳)有限公司 Method and device for optimizing hyper-parameters in machine learning model
CN110084374A (en) * 2019-04-24 2019-08-02 第四范式(北京)技术有限公司 Construct method, apparatus and prediction technique, device based on the PU model learnt
CN111723342A (en) * 2020-06-22 2020-09-29 杭州电力设备制造有限公司 Transformer top layer oil temperature prediction method based on elastic network regression model
CN111723342B (en) * 2020-06-22 2023-11-07 杭州电力设备制造有限公司 Transformer top layer oil temperature prediction method based on elastic network regression model
CN113053113A (en) * 2021-03-11 2021-06-29 湖南交通职业技术学院 PSO-Welsch-Ridge-based anomaly detection method and device

Similar Documents

Publication Publication Date Title
CN107885967A (en) A kind of regression model hyperparameter optimization method
CN104536412B (en) Photoetching procedure dynamic scheduling method based on index forecasting and solution similarity analysis
CN109685252A (en) Building energy consumption prediction technique based on Recognition with Recurrent Neural Network and multi-task learning model
CN107169633A (en) A kind of gas line network, gas storage peak regulating plan integrated evaluating method
CN103810101A (en) Software defect prediction method and system
CN108062302B (en) A kind of recognition methods of text information and device
CN100507460C (en) Dynamic soft measuring and form establishing method base pulse response formwork and parameter optumization
CN109242149A (en) A kind of student performance early warning method and system excavated based on educational data
CN101288089A (en) Load prediction based on-line and off-line training of neural networks
CN106600001B (en) Glass furnace Study of Temperature Forecasting method based on Gaussian mixtures relational learning machine
CN111047085A (en) Hybrid vehicle working condition prediction method based on meta-learning
CN108090788A (en) Ad conversion rates predictor method based on temporal information integrated model
CN106649479A (en) Probability graph-based transformer state association rule mining method
CN106503853A (en) A kind of foreign exchange transaction forecast model based on multiple scale convolutional neural networks
CN111178585A (en) Fault reporting amount prediction method based on multi-algorithm model fusion
CN107644297A (en) A kind of energy-saving of motor system amount calculates and verification method
CN103699947A (en) Meta learning-based combined prediction method for time-varying nonlinear load of electrical power system
CN115146580A (en) Integrated circuit path delay prediction method based on feature selection and deep learning
Nasir Modern services export performances among emerging and developed Asian economies
CN107451684A (en) Stock market&#39;s probability forecasting method based on core stochastic approximation
CN105894138A (en) Optimum weighted composite prediction method for shipment amount of manufacturing industry
CN105787265A (en) Atomic spinning top random error modeling method based on comprehensive integration weighting method
CN103605493A (en) Parallel sorting learning method and system based on graphics processing unit
CN105354644A (en) Financial time series prediction method based on integrated empirical mode decomposition and 1-norm support vector machine quantile regression
CN109493921A (en) A kind of atmospheric distillation process modeling approach based on multi-agent system model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180406