CN109376869A - A machine learning hyperparameter optimization system and method based on asynchronous Bayesian optimization - Google Patents
A machine learning hyperparameter optimization system and method based on asynchronous Bayesian optimization
- Publication number
- CN109376869A (application CN201811588608.5A)
- Authority
- CN
- China
- Prior art keywords
- model
- parameter
- module
- pool
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Feedback Control In General (AREA)
Abstract
The present invention relates to a machine learning hyperparameter optimization system and method based on asynchronous Bayesian optimization, comprising: a Bayesian optimization module, a model parameter pool module, a K-means clustering module, a task scheduling module, and an adaptive model-parallelism determination module. The invention efficiently automates hyperparameter tuning for machine learning under big-data environments and makes full use of multi-machine parallel computing capability, so that big-data machine learning can be better applied in production practice.
Description
Technical field
The present invention relates to a machine learning hyperparameter optimization system and method based on asynchronous Bayesian optimization, and belongs to the field of computer artificial intelligence.
Background technique
With the development of cloud computing and big-data technology, machine learning has become a hot topic in both academia and industry. However, machine learning involves a large body of theory, and machine learning models contain many hyperparameters, so designing an effective model requires considerable experience. To promote the wide application of machine learning and lower its barrier to entry, automatic machine learning (Automatic Machine Learning, AutoML) has emerged: by automating each step of the machine learning pipeline, it allows even beginners to train and apply machine learning models.
The core of AutoML is automated tuning of machine learning models, that is, automatic hyperparameter selection. Hyperparameter selection is critical in practice: different hyperparameters directly determine the effectiveness of a machine learning application (for example, its prediction accuracy). The hyperparameter selection process of a machine learning model is shown in Figure 1. Since machine learning models usually contain many hyperparameters, the parameter space is huge, and tuning it efficiently is an urgent problem. Commonly used tuning methods include simple methods, represented by manual tuning, Grid search, and Random search, and heuristic methods, represented by Bayesian optimization. Grid search and Random search are illustrated in Figure 2.
Manual tuning is the simplest, and also the most artisanal, tuning method. Facing a machine learning application, an experienced machine learning expert can tune the model manually based on empirical values; an unfamiliar newcomer can resort to manual trial and error (running enough experiments to find a group of parameters with reasonably good model performance). In general, manual tuning is a time-consuming and laborious process.
Grid search is one of the simplest automated tuning methods. Its idea is straightforward: the user defines a range of values for each parameter, parameters are combined at fixed intervals, a model is trained for each combination, and the parameters of the model with the best evaluation score are selected. The combination space of Grid search is usually large; for example, for a logistic regression application with 5 parameters, each taking 10 possible values, the whole combination space contains 10^5 points, and training that many models is very time-consuming. Because the combination space is usually large, Grid search suits only scenarios where model training is very fast, and it can hardly be applied in big-data scenarios.
To address the deficiency of Grid search, researchers proposed Random search. Unlike Grid search, which exhaustively enumerates parameter combinations at fixed intervals, Random search selects parameter combinations at random. The study of Bergstra et al. showed that, in general, Random search performs no worse than Grid search. By selecting combinations randomly, Random search can, to some extent, avoid mutual redundancy between parameter points. Its problem is that if two parameter points are close to each other in space (for example, their Euclidean distance is small), they are mutually redundant, which lowers search efficiency; in high-dimensional spaces (when there are many parameters), it easily gets stuck in a local region.
The above methods all search the parameter space by brute force; their efficiency is low, and they are no longer suitable in big-data environments. Bayesian optimization is a sequential model-based optimization algorithm: it uses information from already-trained models as prior knowledge to guide the generation of the next parameter point, and can reach the best model performance faster. Compared with Grid search and Random search it greatly accelerates the whole tuning process, and it is currently close to the best hyperparameter optimization method for machine learning models.
Classical Bayesian optimization has a deficiency: the optimization process is serial, so it cannot exploit the multi-machine computing capability of big-data environments, and its efficiency remains low. As a result, automated tuning of big-data machine learning is difficult, and big-data environments cannot be handled well. Parallelizing classical Bayesian optimization, so as to cope with big-data environments and make better use of big-data machine learning in production practice, is therefore of great significance to social production.
At present, most research on Bayesian optimization in distributed environments is based on synchronous batches. In general, under the synchronous execution mode (Bulk Synchronous Parallel, BSP), tasks often have to wait for one another, whereas under the asynchronous execution mode (Asynchronous Parallel, ASP) tasks need not wait for one another; the asynchronous mode is therefore more efficient than the synchronous mode. As shown in Figure 3, with three compute nodes, in the synchronous mode on the left the execution of tasks 4, 5, and 6 must wait for tasks 1, 2, and 3 to finish, whereas in the asynchronous mode on the right it need not. In general, for the same number of tasks, the asynchronous mode finishes faster than the synchronous mode.
The asynchronous Bayesian optimization proposed by Kandasamy et al. is one way to parallelize classical Bayesian optimization, but their method uses each compute node to evaluate one model; a single node cannot effectively train a model on big data, nor can the method cope with the scenario in which multiple models converge at the same time.
Because of the low efficiency of Bayesian optimization in big-data environments described above, the availability of automated tuning techniques for machine learning under big-data environments remains low.
Summary of the invention
The technical problem solved by the present invention: to overcome the difficulty of automated machine learning tuning in big-data environments and the deficiencies of the prior art, a hyperparameter optimization system and method based on asynchronous Bayesian optimization is provided, which efficiently automates tuning for machine learning under big-data environments and makes full use of multi-machine parallel computing capability, so that big-data machine learning can be better applied in production practice.
The technical solution of the present invention: a machine learning hyperparameter optimization system based on asynchronous Bayesian optimization, comprising: a Bayesian optimization module, a model parameter pool module, a K-means clustering module, a task scheduling module, and an adaptive model-parallelism determination module;
The Bayesian optimization module implements the Bayesian optimization algorithm and generates candidate parameter points. After receiving a signal from the model parameter pool module or the K-means clustering module, it fetches model parameter points and the corresponding model evaluation metrics from the model parameter pool and updates the Bayesian optimizer. It provides a Get interface for model hyperparameters (the hyperparameters of a machine learning model, such as the learning rate and the regularization coefficient), which the model parameter pool module calls directly. For the scenario in which multiple models converge simultaneously, it provides a GetBatch interface for the K-means clustering module. The GetBatch interface implements the following algorithm: randomly generate L (L > 10000) parameter points; compute the EI value (the acquisition value, the criterion by which Bayesian optimization generates candidate points) of each of the L points; find the l (200 < l < 1000) points with the largest EI values; run gradient descent starting from each of these points; and return the local optima found.
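As a minimal, dependency-light sketch of the GetBatch algorithm described above (L and l are shrunk for illustration, the surrogate model is a stand-in callable returning posterior mean and standard deviation, and the gradient-descent refinement is approximated by a cheap random local search — these are assumptions, not the patent's implementation):

```python
import math
import numpy as np

def expected_improvement(mu, sigma, best):
    """EI acquisition value of points with posterior mean mu / std sigma
    against the incumbent best observed value (maximization)."""
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - best) / sigma
    Phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
    phi = np.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)
    return (mu - best) * Phi(z) + sigma * phi

def get_batch(surrogate, best, dim, L=2000, l=20, seed=0):
    """Sketch of GetBatch: sample L random points, keep the top-l by EI,
    locally refine each, and return the l refined points."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(L, dim))      # L random parameter points
    mu, sigma = surrogate(X)
    ei = expected_improvement(mu, sigma, best)
    top = X[np.argsort(ei)[-l:]]                  # l points with largest EI
    refined = []
    for x in top:                                 # stand-in for gradient ascent on EI
        cand = np.clip(x + 0.02 * rng.standard_normal((20, dim)), 0.0, 1.0)
        m, s = surrogate(cand)
        refined.append(cand[np.argmax(expected_improvement(m, s, best))])
    return np.array(refined)
```

With a real Gaussian-process surrogate, `surrogate(X)` would return the GP posterior mean and standard deviation at `X`, and the local-search step would be replaced by gradient ascent on the EI surface.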
The model parameter pool module is responsible for managing model parameter points, which includes: obtaining model hyperparameter points, replacing parameter points in the pool, and supplying the parameter points in the pool to the computing cluster. The model parameter pool is implemented mainly as an array in which each model parameter point is abstracted as a parameter-point object, and it provides Push and Pull interfaces for interaction between the computing cluster and the pool. The pool obtains model parameter points from the Bayesian optimization module through the Get interface, and obtains groups of mutually distinct parameter points from the K-means clustering module through the GetBatch interface. Parameter points in the pool can be Pulled by the computing cluster (a Spark cluster), and the pool receives the model evaluation metrics Pushed by the cluster.
The K-means clustering module is mainly used to generate multiple mutually distinct parameter points. It is called by the model parameter pool module and receives a signal to generate k distinct parameter points; it calls the Bayesian optimization module to generate K (typically greater than k) original candidate parameter points; it clusters the candidate points into k classes with K-means and then selects, from each class, the parameter point with the largest acquisition value, thereby producing k mutually distinct parameter points, and returns the result to the model parameter pool module.
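A minimal sketch of this de-duplication step, assuming a small hand-rolled Lloyd's K-means (the patent does not prescribe a particular K-means implementation) and an array of acquisition values aligned with the candidate points:

```python
import numpy as np

def kmeans_labels(X, k, iters=20, seed=0):
    """Minimal Lloyd's K-means; returns a cluster label for each row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

def k_distinct_points(points, ei_values, k):
    """Cluster the candidate points into k classes and return, from each
    class, the point with the largest acquisition (EI) value."""
    labels = kmeans_labels(points, k)
    chosen = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        if len(idx):                      # skip any cluster that emptied out
            chosen.append(points[idx[np.argmax(ei_values[idx])]])
    return np.array(chosen)
```

Because one representative is taken per cluster, nearby (redundant) candidates collapse into a single returned point, which is exactly the effect the module is after.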
The task scheduling module judges whether a model in the model parameter pool should stop training. Specifically, it includes a model convergence check and an Early Stopping algorithm. The convergence check computes whether the model accuracy has reached a preset threshold; if so, the model has converged, otherwise it has not. Early Stopping first computes the mean E(P) of the model evaluation metric P that historically trained models reached at the current iteration round; if the current model's metric p < E(P) * 0.9, training is stopped, otherwise it continues. The module interacts mainly with the model parameter pool: it judges the state of the model corresponding to each parameter point in the pool and sends signals to the model parameter pool module.
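The two stopping rules can be sketched as follows (the 0.9 slack factor is the one stated above; the function names are illustrative):

```python
def is_converged(accuracy, threshold):
    """Convergence rule: stop once model accuracy reaches the preset threshold."""
    return accuracy >= threshold

def should_early_stop(current_metric, peer_metrics_at_round, slack=0.9):
    """Early Stopping rule: stop a model whose current evaluation metric p
    falls below slack * E(P), where E(P) is the mean metric that previously
    trained models reached at the same iteration round."""
    if not peer_metrics_at_round:
        return False                      # nothing to compare against yet
    mean_p = sum(peer_metrics_at_round) / len(peer_metrics_at_round)
    return current_metric < slack * mean_p
```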
The adaptive model-parallelism determination module adaptively determines the parallelism of models in the computing cluster. It experimentally evaluates the computational efficiency of the model parameter pool at different pool sizes and obtains the pool size with the best computing performance. Specifically, it measures the time taken to execute one round of model iteration at each candidate pool size, normalizes the measurements, and compares them; to avoid the influence of random factors, the experiment is repeated several times, yielding the pool size with the best model execution performance. This module is called mainly by the model parameter pool module to initialize the pool size.
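A sketch of this pool-size search, assuming a caller-supplied `run_one_round(size)` callback that trains `size` models for one round (dividing total time by the number of models trained is one reasonable reading of the normalization step):

```python
import time

def choose_pool_size(run_one_round, candidate_sizes, repeats=3):
    """Time one round of model iteration at each candidate pool size,
    normalize by the number of models trained, average over `repeats`
    runs to damp random noise, and return the best size."""
    best_size, best_cost = None, float("inf")
    for size in candidate_sizes:
        total = 0.0
        for _ in range(repeats):
            start = time.perf_counter()
            run_one_round(size)                 # one round with `size` models
            total += time.perf_counter() - start
        per_model = total / (repeats * size)    # normalized per-model cost
        if per_model < best_cost:
            best_size, best_cost = size, per_model
    return best_size
```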
A machine learning hyperparameter optimization method based on asynchronous Bayesian optimization of the present invention comprises the following steps:
(1) execute the adaptive model-parallelism determination module to determine the best model parallelism of the computing cluster, and pass the result to the model parameter pool module;
(2) the model parameter pool module performs initialization, e.g., setting the model parameter pool size (assumed to be n);
(3) the Bayesian optimization module performs initialization, e.g., configuring the hyperparameter search space and the number of Bayesian optimization iterations;
(4) the model parameter pool module calls the K-means clustering module to generate n initial parameter points, which are filled into the model parameter pool;
(5) the computing cluster runs one round of model iteration for the model corresponding to each parameter point in the pool, and sends the model evaluation metrics to the model parameter pool module;
(6) the scheduling module judges, from the parameter points and evaluation metrics in the pool, whether each corresponding model should stop training; if so, it sends a stop-training signal to the model parameter pool module;
(7) when the model parameter pool module receives a model stop signal from the scheduling module, it requests new model parameter points (if one model stopped, it asks the Bayesian optimization module for one parameter point; if several models stopped, it asks the K-means clustering module for several), the computing cluster takes up the new parameter points in the pool, and a new round of model training starts; the above process repeats until the Bayesian optimization stopping threshold (the number of Bayesian optimization iterations) is reached.
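A toy end-to-end simulation of steps (1)-(7) above (random sampling stands in for the Bayesian optimizer, a fixed round count stands in for the scheduler's convergence test, and all names are illustrative):

```python
import random

def async_tuning_loop(objective, sample_point, pool_size=3, budget=20,
                      converge_after=4, seed=0):
    """Toy sketch of the asynchronous loop: a pool of `pool_size` parameter
    points is iterated round by round; whenever a model is judged finished,
    its slot is immediately refilled with a fresh point, with no global
    synchronization barrier."""
    rng = random.Random(seed)
    pool = [{"x": sample_point(rng), "round": 0} for _ in range(pool_size)]
    best_x, best_score, spent = None, float("-inf"), 0
    while spent < budget:
        for slot in pool:
            slot["round"] += 1                    # one round of model iteration
            score = objective(slot["x"])          # model evaluation metric
            if score > best_score:
                best_x, best_score = slot["x"], score
            if slot["round"] >= converge_after:   # scheduler: model finished
                slot["x"], slot["round"] = sample_point(rng), 0
                spent += 1                        # one optimization trial used
    return best_x, best_score
```

The key structural point the sketch shows is that each slot advances and is refilled independently, which is what distinguishes the asynchronous mode from a synchronous batch.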
Compared with the prior art, the advantages of the present invention are: the mainstream hyperparameter optimization methods Grid Search and Random Search are inefficient and require large amounts of computing resources, while the heuristic Bayesian optimization method can only run serially and cannot exploit the multi-machine parallel computing capability of distributed environments; as a result, hyperparameter optimization for machine learning is hard to carry out in big-data environments. The asynchronous Bayesian optimization method proposed by the present invention, through asynchronous model training based on the model parameter pool, achieves asynchronous parallel hyperparameter optimization while retaining the high tuning efficiency of Bayesian optimization itself. The invention can make full use of multi-machine computing capability in distributed environments and makes automated tuning of machine learning feasible in big-data environments, thereby helping people analyze data and mine data value with big-data machine learning in social production practice.
Detailed description of the invention
Fig. 1 is a schematic diagram of the model selection and tuning process;
Fig. 2 is a schematic diagram of Grid search (left) and Random search (right);
Fig. 3 is a schematic diagram of synchronous execution (left) and asynchronous execution (right);
Fig. 4 is the overall framework diagram of the system of the present invention;
Fig. 5 is a schematic diagram of the model parameter pool module of the present invention;
Fig. 6 is a schematic diagram of the K-means clustering module of the present invention;
Fig. 7 is the implementation flowchart of the task scheduling module of the present invention;
Fig. 8 is the implementation flowchart of the adaptive model-parallelism determination module of the present invention.
Specific embodiment
The technical solution of the present invention is shown in Fig. 4 and mainly includes: a Bayesian optimization module, a model parameter pool module, a K-means clustering module, a task scheduling module, and an adaptive model-parallelism determination module. Through the cooperation of these modules, the machine learning hyperparameter optimization method based on Bayesian optimization proposed by the present invention can be realized.
Among the above modules, the Bayesian optimization module:
The Bayesian optimization module is the basic technology of the present invention. It implements the Bayesian optimization method, which models the relationship between the model evaluation metric and the parameter points and can thereby generate more meaningful parameter points. In the present invention, Bayesian optimization is responsible for generating candidate parameter points and receives feedback (parameter points and the corresponding model evaluation metrics) from the model parameter pool module.
Among the above modules, the model parameter pool module:
The model parameter pool is one of the key technologies of the present invention; its structure is shown in Fig. 5. The pool is responsible for managing model parameter points and receives the parameter points generated by the Bayesian optimization module: for a single model parameter point, it receives the point directly from the Bayesian optimization module; for multiple model parameter points, Bayesian optimization first generates groups of candidate points, and the K-means clustering module then produces groups of mutually distinct parameter points.
The model parameter pool module provides Push and Pull interfaces: the computing cluster (a Spark cluster) Pulls a model parameter point and trains on it; after the model converges, the cluster Pushes the model evaluation metric back to the pool. Because model parameters differ and machine learning models have inherent randomness, models usually have different training times; on this basis, efficient asynchronous parallel tuning can be realized.
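A minimal sketch of such a Push/Pull interface (made thread-safe since cluster-side workers would call it concurrently; the class and method names are illustrative, not the patent's code):

```python
import threading

class ModelParameterPool:
    """Minimal sketch of the pool's Push/Pull interface: workers Pull a
    parameter point to train on, and Push back the evaluation metric."""
    def __init__(self, points):
        self._lock = threading.Lock()
        self._free = list(points)          # points awaiting a worker
        self.results = []                  # (point, metric) feedback for the optimizer

    def pull(self):
        """Hand one parameter point to a worker, or None if the pool is empty."""
        with self._lock:
            return self._free.pop() if self._free else None

    def push(self, point, metric):
        """Record a worker's evaluation metric for a finished parameter point."""
        with self._lock:
            self.results.append((point, metric))

    def refill(self, point):
        """Add a freshly generated parameter point back into the pool."""
        with self._lock:
            self._free.append(point)
```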
Among the above modules, the K-means clustering module:
The K-means clustering module is one of the keys of the present invention. Machine learning models, especially models solved by gradient descent such as logistic regression and support vector machines, generally converge within tens of iteration rounds. During training, a compute node runs one round of iteration for the model corresponding to a parameter point in the pool and then checks convergence, so multiple models may converge at the same time. In that case, generating several groups of candidate points directly with Bayesian optimization would produce mutually redundant parameter points, lowering the efficiency of the whole automated tuning process.
This module implements the K-means clustering algorithm: it receives the groups of original candidate parameter points generated by Bayesian optimization, performs K-means clustering, produces k mutually distinct parameter points with relatively large acquisition values (e.g., of the EI function), and fills these points into the model parameter pool module.
A schematic of generating groups of candidate parameter points with K-means clustering is shown in Fig. 6 (the figure shows four parameter points A, B, C, and D; in practice there are more). The horizontal axis represents parameter points and the vertical axis the acquisition value. Suppose three models converge simultaneously. Directly generating three groups of candidate points with Bayesian optimization would likely yield points A, B, and C (whose acquisition values are large), but A and B are mutually redundant (the distance between them, e.g., the Euclidean distance, is small), which would lower the tuning efficiency of Bayesian optimization. By clustering candidates A, B, C, and D with K-means, A and B will, in theory, fall into the same class, from which the point with the larger acquisition value, A, is returned; this yields the mutually distinct points A, C, and D and improves the efficiency of Bayesian optimization.
Among the above modules, the task scheduling module:
The task scheduling module is one of the important modules of the present invention. It judges the convergence of the models in the model parameter pool and consists of two parts: a model convergence check and the Early Stopping technique.
The convergence check judges whether the model accuracy has reached a preset threshold; if so, the model has converged, otherwise it has not.
In some machine learning applications, performance-related information becomes available during training. Especially when training is iterative, a performance curve is available; for example, for models solved by gradient descent, the model accuracy usually rises as training proceeds, and the accuracy is available at the end of each iteration round (epoch). Using the curve of accuracy against training rounds, one can judge whether the model currently being trained could outperform the best model known so far; for a model that cannot, its training can be terminated early and the corresponding computing resources released in time, so that more promising models can be evaluated. The algorithm based on this idea is called the Early Stopping algorithm. Using the Early Stopping technique effectively accelerates the whole automated tuning process.
Among the above modules, the adaptive model-parallelism determination module:
The adaptive model-parallelism determination module is one of the important modules of the present invention. Model parallelism refers to the number of models executed simultaneously in the computing cluster; it directly affects the computing performance of the whole cluster, and setting it either too large or too small degrades that performance. This module determines the model parallelism adaptively.
For machine learning models on the Spark platform, the module works as follows: it experimentally evaluates the computational efficiency of the model parameter pool at different pool sizes, thereby obtaining the pool size with the best computing performance. The concrete steps are: measure the time taken to execute one round of model iteration at each candidate pool size, normalize the times, and compare them; to avoid the influence of random factors, the experiment is repeated several times (e.g., 3 times), yielding the pool size with the best model execution performance. Compared with the time consumed by the whole tuning process, the time spent on this procedure is negligible.
The present invention is described in detail below with reference to specific embodiments and the drawings.
The present example uses Python as the programming language, together with the big-data processing platform Spark and MLlib, the distributed machine learning library based on Spark, and targets the hyperparameter optimization problem of machine learning in big-data environments. The following explanation takes logistic regression, a classical machine learning model commonly used in big-data environments, as an example.
As shown in Fig. 4, the method is implemented by the following steps:
1. Bayesian optimization module
This module implements the Bayesian optimization algorithm. Bayesian optimization needs an initial parameter range, and a parameter space configuration interface is provided as follows:
For the logistic regression model, the main hyperparameters are: maxIter (number of model iteration rounds), regParam (regularization coefficient), and tol (model convergence tolerance). An initial parameter space range must be configured, as follows:
| Parameter name | Meaning | Interface used |
| --- | --- | --- |
| maxIter | Number of model iteration rounds | hp.randint('maxIter', 100) |
| regParam | Regularization coefficient | hp.loguniform('regParam', 1e-3, 1e+2) |
| tol | Convergence tolerance | hp.loguniform('tol', 1e-4, 1e-6) |
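The `hp.*` calls in the table are the Hyperopt library's search-space primitives. A dependency-free sketch of drawing one sample from an equivalent space (note that the `tol` bounds are written here in increasing order, and that Hyperopt's real `hp.loguniform` takes its bounds in log space; this sketch samples on raw bounds for readability):

```python
import math
import random

def sample_logreg_space(rng):
    """Draw one hyperparameter point for the logistic-regression example:
    an integer iteration count plus two log-uniformly distributed reals."""
    def loguniform(lo, hi):
        # sample uniformly in log space, then exponentiate
        return math.exp(rng.uniform(math.log(lo), math.log(hi)))
    return {
        "maxIter": rng.randrange(100),        # cf. hp.randint('maxIter', 100)
        "regParam": loguniform(1e-3, 1e2),    # cf. hp.loguniform('regParam', ...)
        "tol": loguniform(1e-6, 1e-4),        # cf. hp.loguniform('tol', ...)
    }
```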
When the model parameter pool needs one group of parameters, the Bayesian optimization module directly generates one group and feeds it back to the model parameter pool module (degenerating to the classical Bayesian optimization algorithm); when the pool needs several groups of parameters (say K), Bayesian optimization generates groups of original candidate parameter points, and the K-means clustering module produces K groups of mutually distinct parameter points and feeds them back to the model parameter pool module.
2. Model parameter pool module
As shown in Fig. 5, this module implements the model parameter pool and is responsible for managing model parameters. For this module, the whole automated tuning process has three phases: an initialization phase, a first phase, and a second phase.
The initialization phase: execute the adaptive pool-size determination module to determine the model parameter pool size (assumed to be k).
The first phase: call the Bayesian optimization module to randomly generate k groups of parameter points and fill them into the model parameter pool; the compute nodes train the models corresponding to the parameter points in the pool; the parameter points and the corresponding model evaluations are fed back to Bayesian optimization, initializing the Gaussian process of the Bayesian optimizer.
The second phase: execute the task scheduling module to judge whether the model corresponding to each parameter point in the pool has converged, count the number m of converged models, and mark them in the pool; execute the task scheduling module to judge, by the Early Stopping algorithm, whether the model corresponding to each parameter point should stop training, count the number n of models that should stop, and mark them in the pool; the compute nodes compute the evaluation metrics of these m + n models on the test data set; record the best model and the corresponding best evaluation metric; feed the m + n finished models back to Bayesian optimization and update the Gaussian process; combining Bayesian optimization with K-means clustering, generate m + n groups of candidate parameter points and fill the pool; the compute nodes perform one round of training update for the models corresponding to the parameter points in the pool; execute the above process in a loop until the specified number of optimization rounds is reached, and return the best model and the best model evaluation metric.
3. K-means clustering module
As shown in Fig. 6, the K-means clustering module implements a K-means algorithm, which mainly clusters the groups of initial parameter points produced by the Bayesian optimization module so as to generate K groups of mutually distinct parameter points.
The K-means-based algorithm for generating multiple candidate parameter points mainly comprises the following steps: randomly generate L (L > 10000) parameter points; compute the EI value (the acquisition value, the criterion by which Bayesian optimization generates candidate points) of each of the L points; find the l (200 < l < 1000) points with the largest EI values; run gradient descent from each of these points to find local optima; perform K-means clustering on the l local optima and return, from each class, the point with the largest EI value.
Clustering goal: if candidate parameter points are close to each other (e.g., their Euclidean distance is small), they form redundancy and lower tuning efficiency; what is wanted are multiple mutually distinct points with relatively large EI values.
Clustering data: take the parameter values and the acquisition value of each parameter point as features, normalize them (to avoid clustering failure caused by inconsistent scales between features), and cluster the sample points into k classes (k being the number of parameter points to be generated).
Clustering result: from each class of the clustering result, select and return the parameter point with the largest acquisition value.
Taking Fig. 6 as an example: the original candidate parameter points include L points A, B, C, and D. Kmeans clustering groups them into three classes; in each class, the parameter point with the largest acquisition-function value is found, yielding A, C, and D. The points A, C, and D are thus mutually distinct while still having large acquisition-function values.
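The candidate-generation procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `generate_candidates`, `toy_ei`, and the inlined Lloyd's k-means are all assumed names, and the gradient-descent refinement step is omitted for brevity.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal Lloyd's k-means; returns a cluster label for each point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def generate_candidates(ei, dim, k, big_l=10000, small_l=500, seed=0):
    """Generate up to k mutually distinct candidates with large EI values.

    ei: acquisition function mapping an (n, dim) array to n scores.
    Steps follow the text: sample L random points, keep the l highest-EI
    points, cluster them into k classes, and return the best point of each
    class. (The gradient-descent local refinement is omitted here.)
    """
    rng = np.random.default_rng(seed)
    points = rng.uniform(0.0, 1.0, size=(big_l, dim))   # L random points
    scores = ei(points)
    top = np.argsort(scores)[-small_l:]                  # l best points
    points, scores = points[top], scores[top]
    labels = kmeans(points, k)
    # From every cluster, return the point with the largest EI value.
    return np.array([points[labels == j][np.argmax(scores[labels == j])]
                     for j in range(k) if np.any(labels == j)])

# Toy acquisition function (assumed): peaks at the center of the unit square.
toy_ei = lambda x: -((x - 0.5) ** 2).sum(axis=1)
cands = generate_candidates(toy_ei, dim=2, k=4)
```

Because each returned point comes from a different cluster, the candidates are spread apart instead of piling up around a single EI maximum, which is exactly the redundancy problem the clustering objective describes.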
4. Task scheduling module
The task scheduling module implements the Early Stopping algorithm. This module performs convergence judgments on the parameter points in the model parameter pool.
Each time a model in the model parameter pool completes one round of iteration, the task scheduling module performs a convergence judgment. The module mainly decides, according to the convergence precision, whether a model in the model parameter pool should stop training. To effectively accelerate the overall tuning process, the Early Stopping technique uses the performance curve to predict whether a model can still achieve the best model effect, so that the training of models that cannot is terminated in time and the training of the next group of parameter points can begin.
As shown in Fig. 7, the task scheduling module obtains the data in the model parameter pool (model parameter points, model evaluation indices, etc.). It first judges, based on model convergence, whether a model should stop training: the model precision w is computed and compared with the preset threshold W; if the threshold is reached, the model has converged; otherwise it has not. If the judgment result is "not converged", the Early Stopping algorithm is applied: the mean E(P) of the evaluation indices P of previously trained models at the current iteration round is computed, and if the current model evaluation index p < E(P) * 0.9, the training of that model is terminated.
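The two-stage judgment above (convergence threshold first, then the p < E(P) * 0.9 Early Stopping rule) can be sketched as a single function. The name `should_stop` and its parameter names are illustrative, not from the patent.

```python
def should_stop(current_p, history_p, precision=None, threshold=None):
    """Convergence and Early Stopping judgment as described in the text.

    current_p: evaluation index of the model at its current iteration round.
    history_p: evaluation indices of previously trained models at that round.
    precision/threshold: optional convergence check (precision w vs. threshold W).
    Returns "converged", "early_stop", or "continue".
    """
    if precision is not None and threshold is not None and precision >= threshold:
        return "converged"       # model precision reached the preset threshold W
    if history_p:
        mean_p = sum(history_p) / len(history_p)   # E(P)
        if current_p < mean_p * 0.9:               # p < E(P) * 0.9
            return "early_stop"  # unlikely to reach the best model effect
    return "continue"
```

For example, a model scoring 0.5 while previous models averaged 0.85 at the same round falls below 0.85 * 0.9 = 0.765 and is stopped early, freeing the slot for the next group of parameter points.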
5. Adaptive model parallelism determination module
As shown in Fig. 8, this module mainly implements an algorithm for adaptively determining the model parallelism. It is responsible for determining an appropriate size for the model parameter pool.
For a logistic regression model, the concrete steps are: measure the time taken by each model parameter pool size to execute one round of model iteration, then normalize the times and compare their lengths; to avoid the influence of random factors, the experiment is repeated several times (e.g. 3 times), and the model parameter pool size with the best model execution performance is selected. Compared with the time taken by the entire tuning process, the time consumed by the above procedure is negligible.
For example, for a logistic regression model, the concrete steps are: determine an initial range of model parameter pool sizes, e.g. i = 1 ~ e (where e is the maximum configured model parameter pool size, e > 1); set the model parameter pool size to i and run one round of model iteration three times (repeating 3 times to avoid random factors). This process is executed in a loop over all candidate sizes; the results are sorted by elapsed time, and the i with the shortest time is returned as the model parameter pool size.
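The timing procedure can be sketched as follows, assuming a hypothetical `run_one_iteration(size)` callback standing in for one round of model iteration at a given pool size; the function name and signature are illustrative.

```python
import time

def pick_pool_size(run_one_iteration, max_size, repeats=3):
    """Pick the model parameter pool size with the shortest iteration time.

    run_one_iteration(size) executes one round of model iteration with the
    pool set to `size`. Each measurement is repeated `repeats` times (3 by
    default, as in the text) to average out random factors.
    """
    timings = {}
    for size in range(1, max_size + 1):
        elapsed = 0.0
        for _ in range(repeats):
            start = time.perf_counter()
            run_one_iteration(size)
            elapsed += time.perf_counter() - start
        timings[size] = elapsed / repeats          # mean time for this size
    return min(timings, key=timings.get)           # size with shortest mean time
```

Since only one iteration round is timed per candidate size (times a handful of repeats), the cost of this calibration is negligible next to the full tuning run, as the text notes.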
The above embodiments are provided only for the purpose of describing the present invention and are not intended to limit its scope. The scope of the invention is defined by the following claims; various equivalent replacements and modifications made without departing from the spirit and principles of the invention shall all fall within the scope of the invention.
Claims (2)
1. A machine learning hyperparameter optimization system based on asynchronous Bayesian optimization, characterized by comprising: a Bayesian optimization module, a model parameter pool module, a Kmeans cluster module, a task scheduling module, and an adaptive model parallelism determination module;
the Bayesian optimization module implements a Bayesian optimization algorithm and generates candidate parameter points; upon receiving a signal from the model parameter pool module or the Kmeans cluster module, it accesses the model parameter points and corresponding model evaluation indices in the model parameter pool and updates the Bayesian optimization; it provides a Get interface to be called directly by the model parameter pool module; for the scenario in which multiple models converge simultaneously, it provides a GetBatch interface to be called by the Kmeans cluster module; the GetBatch interface implements the following algorithm: randomly generate L parameter points, L > 10000; compute the EI value, i.e. the acquisition-function value, of each of the L parameter points; find the l parameter points with the largest EI values, where 200 < l < 1000; execute a gradient descent algorithm starting from each of these parameter points to find local optima;
the model parameter pool module is responsible for managing the model parameter points, which specifically includes: obtaining model hyperparameter points, replacing parameter points in the model parameter pool, and supplying the parameter points in the pool for use by the computing cluster; the model parameter pool is implemented as an array, each model parameter point is abstracted into a parameter point object, and Push and Pull interfaces are provided for interaction between the computing cluster and the model parameter pool; the model parameter pool module calls the Get interface of the Bayesian optimization module to obtain a model parameter point, and obtains multiple groups of distinct parameter points from the Kmeans cluster module through the GetBatch interface; the parameter points in the model parameter pool module are Pulled by the computing cluster, and the module receives the model evaluation indices Pushed by the computing cluster;
the Kmeans cluster module generates multiple distinct parameter points through Kmeans clustering; it is called by the model parameter pool module and receives a signal to generate k distinct parameter points; it calls the Bayesian optimization module to generate multiple original candidate parameter points; the candidate parameter points are clustered into k classes through Kmeans clustering, the parameter point with the largest acquisition-function value is then selected from each of the k classes, and the k distinct parameter points thus generated are returned to the model parameter pool module; K is greater than k;
the task scheduling module judges whether a model in the model parameter pool module should stop training, which specifically includes model convergence and an Early Stopping algorithm; model convergence: compute whether the model precision reaches a preset threshold; if so, the model has converged; otherwise, it has not; the Early Stopping algorithm is implemented as: first compute the mean E(P) of the model evaluation indices P of historically trained models at the current iteration round; if the current model evaluation index p < E(P) * 0.9, stop training; otherwise, continue training; the task scheduling module interacts with the model parameter pool module, judges the state of the model corresponding to each parameter in the model parameter pool module, and sends signals to the model parameter pool module;
the adaptive model parallelism determination module adaptively determines the parallelism of the models in the computing cluster; it experimentally evaluates the computational efficiency of the model parameter pool for different model parameter pool sizes and obtains the model parameter pool size with the best computing performance; this module is called by the model parameter pool module and is used to initialize the model parameter pool size.
2. A machine learning hyperparameter optimization method based on asynchronous Bayesian optimization, characterized by comprising the following steps:
(1) execute the adaptive model parallelism determination module to determine the best model parallelism for the computing cluster, and pass the result to the model parameter pool module;
(2) the model parameter pool module performs initialization, including the model parameter pool size;
(3) the Bayesian optimization module performs initialization, including the hyperparameter space configuration of the Bayesian optimization and the number of Bayesian optimization iteration rounds;
(4) the model parameter pool module calls the Kmeans cluster module to generate n initial parameter points, which are filled into the model parameter pool module;
(5) the computing cluster performs one round of model iteration on the models corresponding to the parameters in the model parameter pool module, and sends the model evaluation indices to the model parameter pool module;
(6) the task scheduling module judges, according to the parameter points and model evaluation indices in the model parameter pool module, whether the model corresponding to a parameter should stop training; if it should stop, a stop-training signal is sent to the model parameter pool module;
(7) if the model parameter pool module receives a model stop signal from the task scheduling module, it requests new model parameter points: if one model has stopped, the Bayesian optimization module is requested to generate one parameter point; if multiple models have stopped, the Kmeans cluster module is requested to generate multiple parameter points; the parameter points in the model parameter pool module are used by the computing cluster to start a new round of model training; the above process repeats until the Bayesian optimization stopping threshold, i.e. the number of Bayesian optimization iteration rounds, is reached.
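The method steps of claim 2 can be sketched as a simplified synchronous driver loop. All callables here are hypothetical stand-ins for the modules of claim 1 (the Bayesian Get interface, the Kmeans GetBatch interface, and the task scheduler); the actual system runs these asynchronously against a computing cluster.

```python
import random

def optimize(train_round, propose_one, propose_batch, should_stop,
             n_init=4, rounds=20):
    """Driver loop sketching claim 2's steps (all callables are stand-ins).

    train_round(point)  -> evaluation index after one round of iteration (step 5).
    propose_one()       -> one new parameter point (step 7, single model stopped).
    propose_batch(k)    -> k new parameter points (steps 4 and 7, several stopped).
    should_stop(p, s)   -> True when training of point p should stop (step 6).
    """
    pool = propose_batch(n_init)                  # step (4): fill the pool
    best_point, best_score = None, float("-inf")
    for _ in range(rounds):                       # Bayesian iteration rounds
        scores = [train_round(p) for p in pool]   # step (5): one round each
        if max(scores) > best_score:
            best_score = max(scores)
            best_point = pool[scores.index(best_score)]
        stopped = [i for i, (p, s) in enumerate(zip(pool, scores))
                   if should_stop(p, s)]
        if len(stopped) == 1:                     # step (7): one model stopped
            pool[stopped[0]] = propose_one()
        elif len(stopped) > 1:                    # several stopped: batch request
            for i, p in zip(stopped, propose_batch(len(stopped))):
                pool[i] = p
    return best_point, best_score

# Toy demonstration with stand-in modules (all hypothetical):
random.seed(0)
best, score = optimize(
    train_round=lambda p: -(p - 0.5) ** 2,                        # toy objective
    propose_one=lambda: random.random(),                          # "Get"
    propose_batch=lambda k: [random.random() for _ in range(k)],  # "GetBatch"
    should_stop=lambda p, s: s < -0.1,                            # scheduler
)
```

The design choice the claims describe — batching proposals through Kmeans only when several slots free up at once — keeps the pool full without serializing on the Bayesian model each time a single model stops.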
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811588608.5A CN109376869A (en) | 2018-12-25 | 2018-12-25 | A kind of super ginseng optimization system of machine learning based on asynchronous Bayes optimization and method |
PCT/CN2019/091485 WO2020133952A1 (en) | 2018-12-25 | 2019-06-17 | Asynchronous bayesian optimization-based machine learning super-parameter optimization system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811588608.5A CN109376869A (en) | 2018-12-25 | 2018-12-25 | A kind of super ginseng optimization system of machine learning based on asynchronous Bayes optimization and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109376869A true CN109376869A (en) | 2019-02-22 |
Family
ID=65371987
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811588608.5A Pending CN109376869A (en) | 2018-12-25 | 2018-12-25 | A kind of super ginseng optimization system of machine learning based on asynchronous Bayes optimization and method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109376869A (en) |
WO (1) | WO2020133952A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110334732A (en) * | 2019-05-20 | 2019-10-15 | 北京思路创新科技有限公司 | A kind of Urban Air Pollution Methods and device based on machine learning |
CN110619423A (en) * | 2019-08-06 | 2019-12-27 | 平安科技(深圳)有限公司 | Multitask prediction method and device, electronic equipment and storage medium |
CN110659741A (en) * | 2019-09-03 | 2020-01-07 | 浩鲸云计算科技股份有限公司 | AI model training system and method based on piece-splitting automatic learning |
CN111027709A (en) * | 2019-11-29 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Information recommendation method and device, server and storage medium |
WO2020133952A1 (en) * | 2018-12-25 | 2020-07-02 | 中国科学院软件研究所 | Asynchronous bayesian optimization-based machine learning super-parameter optimization system and method |
JP2020144530A (en) * | 2019-03-05 | 2020-09-10 | 日本電信電話株式会社 | Parameter estimation device, method, and program |
CN111797833A (en) * | 2020-05-21 | 2020-10-20 | 中国科学院软件研究所 | Automatic machine learning method and system oriented to remote sensing semantic segmentation |
CN112261721A (en) * | 2020-10-19 | 2021-01-22 | 南京爱而赢科技有限公司 | Combined beam distribution method based on Bayes parameter-adjusting support vector machine |
CN113305853A (en) * | 2021-07-28 | 2021-08-27 | 季华实验室 | Optimized welding parameter obtaining method and device, electronic equipment and storage medium |
CN113742991A (en) * | 2020-05-30 | 2021-12-03 | 华为技术有限公司 | Model and data joint optimization method and related device |
CN115470910A (en) * | 2022-10-20 | 2022-12-13 | 晞德软件(北京)有限公司 | Automatic parameter adjusting method based on Bayesian optimization and K-center sampling |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI733270B (en) * | 2019-12-11 | 2021-07-11 | 中華電信股份有限公司 | Training device and training method for optimized hyperparameter configuration of machine learning model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015184729A1 (en) * | 2014-06-05 | 2015-12-10 | Tsinghua University | Method and system for hyper-parameter optimization and feature tuning of machine learning algorithms |
CN108062587A (en) * | 2017-12-15 | 2018-05-22 | 清华大学 | The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning |
CN108470210A (en) * | 2018-04-02 | 2018-08-31 | 中科弘云科技(北京)有限公司 | A kind of optimum option method of hyper parameter in deep learning |
CN108573281A (en) * | 2018-04-11 | 2018-09-25 | 中科弘云科技(北京)有限公司 | A kind of tuning improved method of the deep learning hyper parameter based on Bayes's optimization |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105989374B (en) * | 2015-03-03 | 2019-12-24 | 阿里巴巴集团控股有限公司 | Method and equipment for training model on line |
CN108446302A (en) * | 2018-01-29 | 2018-08-24 | 东华大学 | A kind of personalized recommendation system of combination TensorFlow and Spark |
CN109062782B (en) * | 2018-06-27 | 2022-05-31 | 创新先进技术有限公司 | Regression test case selection method, device and equipment |
CN109376869A (en) * | 2018-12-25 | 2019-02-22 | 中国科学院软件研究所 | A kind of super ginseng optimization system of machine learning based on asynchronous Bayes optimization and method |
- 2018-12-25 CN CN201811588608.5A patent/CN109376869A/en active Pending
- 2019-06-17 WO PCT/CN2019/091485 patent/WO2020133952A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015184729A1 (en) * | 2014-06-05 | 2015-12-10 | Tsinghua University | Method and system for hyper-parameter optimization and feature tuning of machine learning algorithms |
CN108062587A (en) * | 2017-12-15 | 2018-05-22 | 清华大学 | The hyper parameter automatic optimization method and system of a kind of unsupervised machine learning |
CN108470210A (en) * | 2018-04-02 | 2018-08-31 | 中科弘云科技(北京)有限公司 | A kind of optimum option method of hyper parameter in deep learning |
CN108573281A (en) * | 2018-04-11 | 2018-09-25 | 中科弘云科技(北京)有限公司 | A kind of tuning improved method of the deep learning hyper parameter based on Bayes's optimization |
Non-Patent Citations (2)
Title |
---|
YAO CHENGWEI ET AL.: "An adaptive hyperparameter optimization method for deep generative models", RESEARCH AND EXPLORATION IN LABORATORY * |
YANG BIN ET AL.: "An adaptive hyperparameter method for support vector machine regression", JOURNAL OF GUANGXI NORMAL UNIVERSITY (NATURAL SCIENCE EDITION) * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020133952A1 (en) * | 2018-12-25 | 2020-07-02 | 中国科学院软件研究所 | Asynchronous bayesian optimization-based machine learning super-parameter optimization system and method |
JP7124768B2 (en) | 2019-03-05 | 2022-08-24 | 日本電信電話株式会社 | Parameter estimation device, method and program |
JP2020144530A (en) * | 2019-03-05 | 2020-09-10 | 日本電信電話株式会社 | Parameter estimation device, method, and program |
WO2020179627A1 (en) * | 2019-03-05 | 2020-09-10 | 日本電信電話株式会社 | Parameter estimation device, method and program |
CN110334732A (en) * | 2019-05-20 | 2019-10-15 | 北京思路创新科技有限公司 | A kind of Urban Air Pollution Methods and device based on machine learning |
CN110619423A (en) * | 2019-08-06 | 2019-12-27 | 平安科技(深圳)有限公司 | Multitask prediction method and device, electronic equipment and storage medium |
CN110619423B (en) * | 2019-08-06 | 2023-04-07 | 平安科技(深圳)有限公司 | Multitask prediction method and device, electronic equipment and storage medium |
CN110659741A (en) * | 2019-09-03 | 2020-01-07 | 浩鲸云计算科技股份有限公司 | AI model training system and method based on piece-splitting automatic learning |
CN111027709A (en) * | 2019-11-29 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Information recommendation method and device, server and storage medium |
CN111797833A (en) * | 2020-05-21 | 2020-10-20 | 中国科学院软件研究所 | Automatic machine learning method and system oriented to remote sensing semantic segmentation |
CN113742991A (en) * | 2020-05-30 | 2021-12-03 | 华为技术有限公司 | Model and data joint optimization method and related device |
CN112261721A (en) * | 2020-10-19 | 2021-01-22 | 南京爱而赢科技有限公司 | Combined beam distribution method based on Bayes parameter-adjusting support vector machine |
CN113305853A (en) * | 2021-07-28 | 2021-08-27 | 季华实验室 | Optimized welding parameter obtaining method and device, electronic equipment and storage medium |
CN115470910A (en) * | 2022-10-20 | 2022-12-13 | 晞德软件(北京)有限公司 | Automatic parameter adjusting method based on Bayesian optimization and K-center sampling |
Also Published As
Publication number | Publication date |
---|---|
WO2020133952A1 (en) | 2020-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109376869A (en) | A kind of super ginseng optimization system of machine learning based on asynchronous Bayes optimization and method | |
Mei et al. | An efficient feature selection algorithm for evolving job shop scheduling rules with genetic programming | |
CN102567391B (en) | Method and device for building classification forecasting mixed model | |
CN103745273B (en) | Semiconductor fabrication process multi-performance prediction method | |
CN107229693A (en) | The method and system of big data system configuration parameter tuning based on deep learning | |
CN105184368A (en) | Distributed extreme learning machine optimization integrated framework system and method | |
CN105809349B (en) | Dispatching method for step hydropower station group considering incoming water correlation | |
Sun et al. | Research and application of parallel normal cloud mutation shuffled frog leaping algorithm in cascade reservoirs optimal operation | |
CN105930916A (en) | Parallel modular neural network-based byproduct gas real-time prediction method | |
Wei et al. | Research on cloud design resources scheduling based on genetic algorithm | |
Chen et al. | You only search once: A fast automation framework for single-stage dnn/accelerator co-design | |
CN111898867A (en) | Airplane final assembly production line productivity prediction method based on deep neural network | |
Shang et al. | Performance of genetic algorithms with different selection operators for solving short-term optimized reservoir scheduling problem | |
CN116629352A (en) | Hundred million-level parameter optimizing platform | |
CN109409746A (en) | A kind of production scheduling method and device | |
CN116307211A (en) | Wind power digestion capability prediction and optimization method and system | |
Cheng et al. | Swiftnet: Using graph propagation as meta-knowledge to search highly representative neural architectures | |
CN113420508A (en) | Unit combination calculation method based on LSTM | |
CN108492013A (en) | A kind of manufacture system scheduling model validation checking method based on quality control | |
CN116523640A (en) | Financial information management system based on scheduling feedback algorithm | |
CN116881224A (en) | Database parameter tuning method, device, equipment and storage medium | |
Seghir et al. | A new discrete imperialist competitive algorithm for QoS-aware service composition in cloud computing | |
Esfahanizadeh et al. | Stream iterative distributed coded computing for learning applications in heterogeneous systems | |
Zhang et al. | Multi-objective evolutionary for object detection mobile architectures search | |
Li et al. | Parameters optimization of back propagation neural network based on memetic algorithm coupled with genetic algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190222 |