CN104200087A - Parameter optimization and feature tuning method and system for machine learning


Info

Publication number
CN104200087A (application); CN104200087B (granted)
Authority
CN
China
Prior art keywords
parameter
parameter sets
machine learning
optimization
sets
Legal status
Granted
Application number
CN201410422475.XA
Other languages
Chinese (zh)
Other versions
CN104200087B (en)
Inventor
杨广文
季颖生
陈宇澍
付昊桓
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201410422475.XA priority Critical patent/CN104200087B/en
Priority to PCT/CN2014/090050 priority patent/WO2015184729A1/en
Publication of CN104200087A publication Critical patent/CN104200087A/en
Application granted granted Critical
Publication of CN104200087B publication Critical patent/CN104200087B/en
Current status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G06N 20/10: Machine learning using kernel methods, e.g. support vector machines [SVM]


Abstract

The invention provides a parameter optimization and feature tuning method for machine learning. The method comprises the following steps: a plurality of parameter ensembles are generated randomly; EnKF-based iterative optimization is performed on the parameter ensembles; performance evaluation is performed on the optimized parameter ensembles, and an ensemble pool and supplementary parameter ensembles are obtained from the evaluation results, the parameter ensembles in the ensemble pool performing better than the supplementary parameter ensembles; EnKF-based iterative optimization and performance evaluation are then performed again on the parameter ensembles in the ensemble pool and the supplementary parameter ensembles, so as to obtain an optimal parameter ensemble. The method improves the accuracy of the results and the computational efficiency of parameter optimization and is highly general. The invention further provides a parameter optimization and feature tuning system for machine learning.

Description

Method and system for parameter optimization and feature tuning in machine learning
Technical field
The present invention relates to the technical field of parameter optimization for machine learning, and in particular to a method and system for parameter optimization and feature tuning for machine learning.
Background technology
For a typical machine learning algorithm, the performance of the model depends primarily on its parameter configuration, and models generated with different parameter combinations often differ greatly in performance. Parameter optimization is a stochastic optimization problem. Its randomness is mainly reflected in two facts: the training data and test data used to generate the model contain only a finite sample and cannot reflect the whole population, and the joint distribution function underlying the parameter space is unknown. The basic problem is defined as follows: given a training dataset X_T drawn from an unknown data distribution G, the goal of parameter optimization is to find a parameter combination θ for a machine learning algorithm F that builds a model f on X_T which maximizes (or minimizes) the evaluation value under a given performance criterion g(·). The basic problem is expressed as

$$\theta_{opt} \approx \arg\min_{\theta} \frac{1}{n} \sum_{x \in X_V} g\big(x, F(\theta, X_T)\big).$$
However, directly computing the expectation over G is very difficult. The usual approach is therefore to take the expected average of the optimal model f over a given validation dataset X_V; model selection techniques such as cross-validation can be adopted in this process to guarantee generalization performance. Since the parameter space is unknown, searching for the best parameter combination θ_opt means selecting a finite number of parameter combinations from the parameter space, training models with them, evaluating the performance of those models, and finally outputting the parameter combination with the best evaluation result.
For machine learning, parameter optimization has always been a challenge, because the high-dimensional continuous space formed by multiple parameters contains a massive number of parameter combinations. Grid search is the most commonly used parameter optimization technique. The parameters of a machine learning algorithm form a grid space: each parameter is given a set of feasible values, and each combination represents one lattice point of the grid. Grid search exhaustively tests every given parameter combination. The technique is simple to implement, highly general, and guaranteed to find the global optimum within the given grid. Its drawback is the scaling problem of exhaustive search: the amount of computation rises exponentially with the number of parameters and the search granularity, making the computational cost large.
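For concreteness, the following is a minimal sketch of exhaustive grid search; `train_and_score` is a hypothetical stand-in for training a model with one parameter combination and returning its validation score.

```python
import itertools

def grid_search(param_grid, train_and_score):
    """Exhaustively evaluate every lattice point of the parameter grid."""
    names = list(param_grid)
    best_score, best_params = float("-inf"), None
    # The number of lattice points grows exponentially with the number of
    # parameters and the granularity of each value list: the scaling problem.
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Usage: grid_search({"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}, my_scorer)
```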
Over the past decade and more, many optimization techniques have been developed to accelerate the parameter optimization of machine learning. They fall roughly into two large classes: numerical optimization methods and evolutionary optimization algorithms. Numerical parameter optimization techniques, such as gradient descent, obtain information through numerical evaluation to determine the search direction and step size of the next iteration; the search has a definite direction and can therefore converge quickly, so these methods have very high search efficiency compared with other techniques. However, they easily fall into local optima, their effectiveness is strongly limited by the starting point, and their convergence slows down markedly when a large number of parameters must be optimized. In addition, most numerical methods are embedded methods that require algorithm-specific derivations and implementations, so their generality is poor. Evolutionary parameter optimization techniques include genetic algorithms, simulated annealing, particle swarm optimization, and so on. Compared with numerical methods, they can effectively avoid local optima and are guaranteed to find approximately globally optimal parameter combinations; however, since most evolutionary algorithms perform random neighborhood search without a definite direction or step size, they converge slowly and are time-consuming.
Another basic problem in the machine learning field is feature tuning, which comprises feature enhancement and feature selection. One pre-processing step of model training is to scale the sample features: either the feature values of the samples are normalized into a unified value range, to avoid the model performance degradation caused by differences in value ranges, or the weights (scaling factors) are adjusted according to the importance of each feature in order to improve model performance, which is referred to as feature enhancement. Feature selection removes redundant and irrelevant features and finds a feature subset for building the model, which reduces the dimensionality and the training time of the model and can even strengthen its generalization ability. Feature selection techniques comprise three types: filter, wrapper, and embedded. Filter techniques assess features according to some statistic and select the most important ones; they are simple to compute but lack model validation, so their precision is poor. Wrapper techniques iteratively select a feature subset, generate a model, and evaluate its performance; the resulting feature subsets are accurate, but the computational cost is large. Embedded techniques perform feature selection during model training, which requires changing the algorithm, so their generality is limited. Besides these three classes, feature selection can equally be achieved by tuning the feature scaling factors. During feature tuning, parameter optimization must likewise be carried out to find the optimal parameters that determine the final model performance. It is therefore natural to merge the two processes into one and tune them simultaneously.
Treating the feature scaling factors as one class of parameters completes feature tuning, but optimizing them together with the parameters of the machine learning algorithm leads to a large number of parameters. At present there is still a lack of parameter optimization techniques for machine learning that are fast, accurate, and general, especially for searching high-dimensional continuous parameter spaces.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art described above.
To this end, one object of the present invention is to propose a parameter optimization and feature tuning method for machine learning that improves the accuracy of the results and the computational efficiency of parameter optimization and is highly general.
Another object of the present invention is to provide a parameter optimization and feature tuning system for machine learning.
To achieve these goals, an embodiment of the first aspect of the present invention proposes a parameter optimization and feature tuning method for machine learning, comprising the following steps: randomly generating multiple parameter ensembles; performing EnKF-based iterative optimization on each of the parameter ensembles; performing performance evaluation on the optimized parameter ensembles and obtaining an ensemble pool and supplementary parameter ensembles according to the evaluation results, wherein the parameter ensembles in the ensemble pool perform better than the supplementary parameter ensembles; and performing EnKF-based iterative optimization and performance evaluation again on the parameter ensembles in the ensemble pool and the supplementary parameter ensembles, so as to obtain an optimal parameter ensemble.
According to the method for the parameter optimization for machine learning of the embodiment of the present invention and feature tuning, can be efficiently in the continuous parameter space of higher-dimension, find optimum solution; And simultaneously processing parameter optimizing of the method and two problems of feature tuning, can promote accuracy and the counting yield of the result of calculation of processing parameter optimizing; In addition, the method is carried out tuning using the zoom factor for the treatment of characteristic as parameter to feature; And the method highly versatile, is applicable to the various algorithms of machine learning.
In addition, the method for the parameter optimization for machine learning according to the above embodiment of the present invention and feature tuning can also have following additional technical characterictic:
In some examples, performing EnKF-based iterative optimization on the parameter ensembles specifically comprises: training each parameter ensemble on a predetermined training dataset with the machine learning algorithm to generate models; performing performance evaluation on the models on a predetermined validation dataset; and updating the parameter ensembles by the EnKF algorithm.
In some examples, randomly generating the multiple parameter ensembles specifically comprises the following steps: randomly generating a parameter vector θ ∈ R^{m×1}, where each parameter takes a random value within a predetermined parameter range; randomly generating a group of normalized vectors {ρ_i | ρ_i ∈ R^{m×1}, i = 1, …, N} and orthogonalizing them to guarantee that the perturbations are linearly independent; and generating the parameter ensemble perturbation, specifically

$$A' = (F_a r_1 \rho_1, F_a r_2 \rho_2, \ldots, F_a r_N \rho_N) \in R^{m \times N}, \quad r_i \sim N(0, S_p),$$

where A' denotes the parameter ensemble perturbation, ρ_i denotes a randomly generated perturbation vector, the variable r_i denotes a random amplitude following a Gaussian distribution with configurable variance S_p, the matrix F_a = (f_1 e_1, f_2 e_2, …, f_m e_m), e_i denotes a unit vector, and f_i is a configurable scaling variable for adjusting the perturbation amplitude. Each perturbation vector ε_i ∈ A' is added to the parameter vector θ to obtain a group of parameters θ_i = θ + ε_i; N groups of parameters are generated in total, forming the parameter ensemble A. The above steps are repeated to generate N_e parameter ensembles.
In some examples, the method further comprises: performing feature scaling on the training dataset and the validation dataset; inputting the parameters of a parameter ensemble into the machine learning algorithm and training on the training dataset to generate a model; predicting each sample with the model to obtain the estimated values of the model; and obtaining the models of all parameter groups and performing performance evaluation, where HA denotes the ensemble containing the predicted values obtained with the models generated from each group of parameters, i.e. HA = (Hθ_1, Hθ_2, …, Hθ_N) ∈ R^{n×N}, and n denotes the number of samples in the validation dataset.
In some examples, the method further comprises updating the parameter ensemble by the following formula:

$$A^a = A^f + A'(HA')^T \big(HA'(HA')^T + \gamma\gamma^T\big)^{-1}(D - HA),$$

where A denotes the parameter ensemble, A^f the current parameter ensemble, A^a the updated parameter ensemble, A' the parameter ensemble perturbation, D the observation ensemble, γ the observation ensemble perturbation, HA the ensemble of model predictions, and HA' the perturbation of HA.
In some examples, performing performance evaluation on the optimized parameter ensembles and obtaining the ensemble pool and the supplementary parameter ensembles according to the evaluation results further comprises dividing the optimized parameter ensembles into three classes according to the evaluation results, where score(A) denotes the performance value of a parameter ensemble: if score(A) ≥ thresh1, the ensemble is judged to perform well and is kept in the ensemble pool; if score(A) ≤ thresh2, the ensemble is judged to perform poorly and is discarded; if thresh2 < score(A) < thresh1, the ensemble is judged to perform average, and the average-performing ensembles are randomly merged in pairs to generate the supplementary parameter ensembles.
In some examples, the method further comprises: choosing a pair of parameter ensembles among the average-performing ensembles and generating a new parameter ensemble with an EnKF-based merge algorithm, specifically comprising: assuming the chosen pair of ensembles is A_i, A_j, computing Q_ij and Q_ji respectively; computing A_ij^m and A_ji^m respectively, where

$$A_{ij}^m = \bar{A}_i + A_i' Q_{ij},$$
$$A_{ji}^m = \bar{A}_j + A_j' Q_{ji};$$

performing UR decomposition on Q_ij and Q_ji respectively; selecting from the resulting matrices the N columns with the largest pivots, and selecting the corresponding parameter vectors from A_ij^m and A_ji^m to form the final parameter ensemble A^m; and, if the list of parameter ensembles to be merged is empty, randomly generating a new parameter ensemble, otherwise randomly selecting another parameter ensemble to merge.
An embodiment of the second aspect of the present invention provides a parameter optimization and feature tuning system for machine learning, comprising: a generation module for generating multiple parameter ensembles; an optimization module for performing EnKF-based iterative optimization on each of the parameter ensembles; an evaluation module for performing performance evaluation on the optimized parameter ensembles and obtaining an ensemble pool and supplementary parameter ensembles according to the evaluation results, wherein the parameter ensembles in the ensemble pool perform better than the supplementary parameter ensembles; and an acquisition module for performing EnKF-based iterative optimization and performance evaluation again on the parameter ensembles in the ensemble pool and the supplementary parameter ensembles, so as to obtain an optimal parameter ensemble.
According to the parameter optimization and feature tuning system of the embodiments of the present invention, an optimal solution can be found efficiently in a high-dimensional continuous parameter space. The system handles the two problems of parameter optimization and feature tuning simultaneously, which improves the accuracy of the results and the computational efficiency of parameter optimization; it tunes the features by treating the feature scaling factors as parameters; and it is highly general, being applicable to all kinds of machine learning algorithms.
In addition, the parameter optimization and feature tuning system for machine learning according to the above embodiment of the present invention may also have the following additional technical features:
In some examples, the optimization module trains each parameter ensemble on a predetermined training dataset with the machine learning algorithm to generate models, performs performance evaluation on the models on a predetermined validation dataset, and updates the parameter ensembles by the EnKF algorithm.
In some examples, the generation module generates the multiple parameter ensembles by: randomly generating a parameter vector θ ∈ R^{m×1}, where each parameter takes a random value within a predetermined parameter range; randomly generating a group of normalized vectors {ρ_i | ρ_i ∈ R^{m×1}, i = 1, …, N} and orthogonalizing them to guarantee that the perturbations are linearly independent; and generating the parameter ensemble perturbation

$$A' = (F_a r_1 \rho_1, F_a r_2 \rho_2, \ldots, F_a r_N \rho_N) \in R^{m \times N}, \quad r_i \sim N(0, S_p),$$

where A' denotes the parameter ensemble perturbation, ρ_i denotes a randomly generated perturbation vector, r_i denotes a random amplitude following a Gaussian distribution with configurable variance S_p, F_a = (f_1 e_1, f_2 e_2, …, f_m e_m), e_i denotes a unit vector, and f_i is a configurable scaling variable for adjusting the perturbation amplitude; adding each perturbation vector ε_i ∈ A' to the parameter vector θ to obtain a group of parameters θ_i = θ + ε_i, N groups in total forming the parameter ensemble A; and repeating the above steps to generate N_e parameter ensembles.
In some examples, the optimization module is further configured to perform feature scaling on the training dataset and the validation dataset; input the parameters of a parameter ensemble into the machine learning algorithm and train on the training dataset to generate a model; predict each sample with the model to obtain the estimated values of the model; and obtain the models of all parameter groups and perform performance evaluation, where HA denotes the ensemble containing the predicted values obtained with the models generated from each group of parameters, i.e. HA = (Hθ_1, Hθ_2, …, Hθ_N) ∈ R^{n×N}, and n denotes the number of samples in the validation dataset.
In some examples, the parameter ensemble is updated by the following formula:

$$A^a = A^f + A'(HA')^T \big(HA'(HA')^T + \gamma\gamma^T\big)^{-1}(D - HA),$$

where A denotes the parameter ensemble, A^f the current parameter ensemble, A^a the updated parameter ensemble, A' the parameter ensemble perturbation, D the observation ensemble, γ the observation ensemble perturbation, HA the ensemble of model predictions, and HA' the perturbation of HA.
In some examples, the evaluation module divides the optimized parameter ensembles into three classes according to the evaluation results, where score(A) denotes the performance value of a parameter ensemble: when score(A) ≥ thresh1, the ensemble is judged to perform well and is kept in the ensemble pool; when score(A) ≤ thresh2, the ensemble is judged to perform poorly and is discarded; and when thresh2 < score(A) < thresh1, the ensemble is judged to perform average, and the average-performing ensembles are randomly merged in pairs to generate the supplementary parameter ensembles.
In some examples, the evaluation module is further configured to choose a pair of parameter ensembles among the average-performing ensembles and generate a new parameter ensemble with the EnKF-based merge algorithm, specifically: assuming the chosen pair of ensembles is A_i, A_j, computing Q_ij and Q_ji respectively; computing A_ij^m and A_ji^m, where

$$A_{ij}^m = \bar{A}_i + A_i' Q_{ij},$$
$$A_{ji}^m = \bar{A}_j + A_j' Q_{ji};$$

performing UR decomposition on Q_ij and Q_ji respectively; selecting from the resulting matrices the N columns with the largest pivots and selecting the corresponding parameter vectors from A_ij^m and A_ji^m to form the final parameter ensemble A^m; and, if the list of parameter ensembles to be merged is empty, randomly generating a new parameter ensemble, otherwise randomly selecting another parameter ensemble to merge.
Additional aspects and advantages of the present invention are given in part in the following description; some will become obvious from the description, and some will be appreciated through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become obvious and easy to understand from the following description of the embodiments in combination with the accompanying drawings, in which:
Fig. 1 is a flow chart of a parameter optimization and feature tuning method for machine learning according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the principle of a parameter optimization and feature tuning method for machine learning according to another embodiment of the present invention;
Fig. 3 is a flow chart of updating a single ensemble according to an embodiment of the present invention;
Fig. 4 is an operational flow chart of ensemble evolution according to an embodiment of the present invention;
Fig. 5 is an operational flow chart of fusion search according to an embodiment of the present invention;
Fig. 6 is a flow chart of a parameter optimization and feature tuning method for machine learning according to another embodiment of the present invention; and
Fig. 7 is a structural block diagram of a parameter optimization and feature tuning system for machine learning according to an embodiment of the present invention.
Embodiment
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numbers throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, intended only to explain the present invention, and are not to be interpreted as limiting the present invention.
The parameter optimization and feature tuning method and system for machine learning according to the embodiments of the present invention are described below in conjunction with the accompanying drawings.
Fig. 1 is a flow chart of a parameter optimization and feature tuning method for machine learning according to an embodiment of the present invention. Fig. 6 is a flow chart of the method according to another embodiment of the present invention. As shown in Fig. 1 and Fig. 6, the method comprises the following steps:
Step S101: randomly generate multiple parameter ensembles.
If the feature enhancement function is not enabled, a parameter ensemble contains the parameters of the machine learning algorithm; if the feature tuning function is enabled, it contains both the parameters of the machine learning algorithm and the feature scaling factors. Let A denote a parameter ensemble whose members θ_i each represent one group of parameters (treated as one system state in the EnKF algorithm); A is an m × N state matrix, where m denotes the number of parameters and N denotes the number of ensemble members. In some examples, let N_e denote the number of parameter ensembles; the specific steps of random ensemble generation are:
Step 1-1: randomly generate a parameter vector θ ∈ R^{m×1}, where each parameter takes a random value within its predetermined parameter range.
Step 1-2: randomly generate a group of normalized vectors {ρ_i | ρ_i ∈ R^{m×1}, i = 1, …, N} and orthogonalize them to guarantee that the perturbations are linearly independent.
Step 1-3: generate the parameter ensemble perturbation, specifically

$$A' = (F_a r_1 \rho_1, F_a r_2 \rho_2, \ldots, F_a r_N \rho_N) \in R^{m \times N}, \quad r_i \sim N(0, S_p),$$

where A' denotes the parameter ensemble perturbation, ρ_i denotes a randomly generated perturbation vector, the variable r_i denotes a random amplitude following a Gaussian distribution with configurable variance S_p, the matrix F_a = (f_1 e_1, f_2 e_2, …, f_m e_m), e_i denotes a unit vector, and f_i is a configurable scaling variable for adjusting the perturbation amplitude.
Step 1-4: add each perturbation vector ε_i ∈ A' to the parameter vector θ to obtain a group of parameters θ_i = θ + ε_i; N groups of parameters are generated in total, forming the parameter ensemble A.
Step 1-5: repeat steps 1-1 to 1-4 to generate N_e parameter ensembles. A sketch of steps 1-1 to 1-5 follows.
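A minimal sketch of steps 1-1 to 1-5, assuming NumPy's QR factorization is used for the orthogonalization in step 1-2 (which requires N ≤ m) and that the parameter ranges, variance S_p, and scaling factors f are user-supplied:

```python
import numpy as np

def generate_ensemble(lo, hi, f, N, S_p, rng):
    """lo, hi, f are length-m arrays; returns an m x N parameter ensemble."""
    m = lo.size
    theta = rng.uniform(lo, hi)                  # step 1-1: random base vector
    rho, _ = np.linalg.qr(rng.standard_normal((m, N)))  # step 1-2 (needs N <= m)
    r = rng.normal(0.0, np.sqrt(S_p), size=N)    # random amplitudes r_i
    A_prime = f[:, None] * rho * r[None, :]      # step 1-3: columns F_a r_i rho_i
    return theta[:, None] + A_prime              # step 1-4: theta_i = theta + eps_i

rng = np.random.default_rng(0)
ensembles = [generate_ensemble(np.zeros(5), np.ones(5), np.full(5, 0.1),
                               N=4, S_p=1.0, rng=rng)
             for _ in range(3)]                  # step 1-5: N_e = 3 ensembles
```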
Step S102: perform EnKF-based iterative optimization on each parameter ensemble. In some examples, this specifically comprises: training each parameter ensemble on a predetermined training dataset with the machine learning algorithm to generate models, performing performance evaluation on the models on a predetermined validation dataset, and finally updating the parameter ensembles by the EnKF algorithm. More specifically, as shown in Fig. 3, this step comprises the following steps:
Step 2-1: perform feature scaling on the training dataset (denoted X_T, for example) and the validation dataset (denoted X_V, for example).
If the feature tuning function is not enabled, the feature values of each sample are uniformly normalized into a specified value range, and these normalized training and validation data are used unchanged until the whole search finishes. If the feature tuning function is enabled, let δ_j ∈ θ_i denote a scaling factor. If feature selection is not performed, each feature is numerically scaled according to δ_j. If feature selection is performed, the scaling factors are first normalized to [0, 1] by the formula δ_i/δ_max, and then only the features whose scaling factors exceed a given threshold are numerically scaled; only the scaled features take part in training and validation, while all δ_i still participate in the EnKF computation. If the feature tuning function is not enabled, step 2-1 is executed only the first time; otherwise this step is executed every time. A sketch of this step is given below.
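A minimal sketch of this scaling step, assuming delta is the vector of (positive) scaling factors taken from θ_i, features are the columns of X, and `threshold` stands for the given selection bound:

```python
import numpy as np

def scale_features(X, delta, select=False, threshold=0.5):
    """Scale (and optionally select) the feature columns of X."""
    if not select:
        return X * delta                   # plain per-feature scaling
    w = delta / delta.max()                # normalize factors to [0, 1]
    keep = w > threshold                   # keep only important features
    return X[:, keep] * w[keep]            # scale only the kept features

X = np.arange(12.0).reshape(4, 3)
print(scale_features(X, np.array([0.2, 1.0, 0.6]), select=True))
```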
Step 2-2: input the parameters in θ_i into the machine learning algorithm and train on the training dataset X_T to generate a model; model selection techniques are usually adopted to guarantee generalization performance.
Step 2-3: predict each sample with the model to obtain the estimated values of the model. More specifically, the model generated with θ_i predicts each sample of the validation dataset X_V and outputs a rank value, i.e. the estimated value output by the model; let Hθ_i denote the vector containing these predictions.
Step 2-4: obtain the models of all parameter groups and perform performance evaluation. Specifically, for each group of parameters θ_i ∈ A, steps 2-1 to 2-3 are repeated to generate a model and evaluate its performance. Let HA denote the ensemble containing the predicted values obtained with the models generated from each group of parameters (in the EnKF algorithm it represents the mapping from state to observation); HA can be expressed as

$$HA = (H\theta_1, H\theta_2, \ldots, H\theta_N) \in R^{n \times N},$$

where n denotes the number of samples in the validation dataset.
Further, for each parameter ensemble A, the above training and evaluation steps are repeated to obtain the corresponding HA, as sketched below.
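A sketch of steps 2-2 to 2-4 under stated assumptions: `make_model` is a hypothetical factory mapping one parameter group θ_i to an unfitted, scikit-learn-style model with fit and predict methods.

```python
import numpy as np

def build_HA(A, X_train, y_train, X_val, make_model):
    """Train one model per ensemble member; columns of HA are H(theta_i)."""
    columns = []
    for theta_i in A.T:                        # each member is one parameter group
        model = make_model(theta_i)            # hypothetical model factory
        model.fit(X_train, y_train)            # step 2-2: train on X_T
        columns.append(model.predict(X_val))   # step 2-3: predictions H theta_i
    return np.column_stack(columns)            # step 2-4: HA is n x N
```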
Further, in some examples, the following steps may also be performed after step S102:
Step 3: generate the observation ensemble. Let D denote the observation ensemble,

$$D = (d_1, d_2, \ldots, d_N) \in R^{n \times N},$$

where d_i denotes the i-th observation vector. An observation represents the given prior probability of a sample; if the prior probabilities of the samples are not provided, an initial observation value is assigned. Let d_0 denote the initial observation vector containing the observations of the samples. The specific steps of observation ensemble generation are:
Step 3-1: randomly generate a group of normalized vectors; if the number of samples n is small, orthogonalize the random vectors, otherwise skip the orthogonalization.
Step 3-2: generate random observation perturbations. Let γ denote the observation ensemble perturbation,

$$\gamma = (v_1 \beta_1, v_2 \beta_2, \ldots, v_N \beta_N) \in R^{n \times N},$$

where β_i denotes a random perturbation vector generated in step 3-1, and the variable v_i denotes a random amplitude following a Gaussian distribution with configurable variance S_o.
Step 3-3: add each perturbation vector v_iβ_i ∈ γ to the initial observation vector d_0 to compute an observation vector d_i = d_0 + v_iβ_i; N groups of observations are generated in total, forming the observation ensemble D.
Step 3-4: repeat steps 3-1 to 3-3 to generate N_e observation perturbation ensembles and observation ensembles.
In step S102, the parameter ensemble is updated by the following formula:

$$A^a = A^f + A'(HA')^T \big(HA'(HA')^T + \gamma\gamma^T\big)^{-1}(D - HA),$$
where A denotes the parameter ensemble, A^f the current parameter ensemble, A^a the updated parameter ensemble, A' the parameter ensemble perturbation, D the observation ensemble, γ the observation ensemble perturbation, HA the ensemble of model predictions, and HA' the perturbation of HA. Further, HA' is computed by

$$\overline{HA} = HA\, M_N,$$
$$HA' = HA - \overline{HA},$$

where \overline{HA} denotes the ensemble mean and M_N is an N × N matrix in which every element equals 1/N. A direct sketch of this update is given below.
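A direct, unoptimized sketch of this analysis update (the UR-decomposition variant described next avoids forming the explicit inverse); shapes follow the text, with A of size m × N and HA, D, γ of size n × N:

```python
import numpy as np

def enkf_update(A, HA, D, gamma):
    """One EnKF analysis step: A^a = A + A'(HA')^T C^{-1} (D - HA)."""
    N = A.shape[1]
    M = np.full((N, N), 1.0 / N)             # M_N: every element is 1/N
    A_prime = A - A @ M                       # ensemble perturbations A'
    HA_prime = HA - HA @ M                    # HA' = HA - mean(HA)
    C = HA_prime @ HA_prime.T + gamma @ gamma.T
    # Solve C X = (D - HA) rather than forming C^{-1} explicitly.
    return A + A_prime @ (HA_prime.T @ np.linalg.solve(C, D - HA))
```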
In the EnKF computation, the cost of evaluating (HA'(HA')^T + γγ^T)^{-1} is excessive, so a Householder-based UR decomposition is adopted to optimize the computation. Let X denote the matrix HA', let X(i, j) denote an element of X and X(:, j) one of its columns, and let τ denote a relative residual bound with 1 ≫ τ > 0. For each column i = 1, …, N of X, the following steps are performed:
Step 4-1: compute the residual norms of the remaining columns:

$$\mathrm{ResNorm}(k) = \lVert X(i{:}n, k) \rVert_2, \quad k = i, \ldots, N.$$
Step 4-2: if

$$\frac{\tau}{1-\tau} \sum_{l=1}^{i-1} |X(l, l)| > (N - i + 1)\, \mathrm{ResNorm}(\hat{k}),$$

where \hat{k} denotes the column index with the maximum residual norm, then set p = i − 1, end the loop, and go to step 4-7; otherwise set p = i and exchange the two columns X(:, \hat{k}) ↔ X(:, i).
Step 4-3: initialize the vector ω_i, where

$$\omega_i(k) = \begin{cases} 0 & k < i \\ X(k, i) & k \ge i. \end{cases}$$
Step 4-4: compute Norm = ResNorm(\hat{k}). If X(i, i) > 0, then ω_i = ω_i + Norm · e_i, and the i-th column of X is updated so that

$$X(k, i) = \begin{cases} X(k, i) & k < i \\ -\mathrm{Norm} & k = i \\ 0 & k > i; \end{cases}$$

otherwise ω_i = ω_i − Norm · e_i, and the i-th column of X is updated so that

$$X(k, i) = \begin{cases} X(k, i) & k < i \\ +\mathrm{Norm} & k = i \\ 0 & k > i. \end{cases}$$
Step 4-5: compute ω_i = ω_i / ||ω_i||_2, and for k = i+1, …, N perform the update

$$X(i{:}n, k) = X(i{:}n, k) - 2\,\omega_i(i{:}n)\big(\omega_i(i{:}n)^T X(i{:}n, k)\big).$$
Step 4-6: set i = i + 1; if i > N, the loop ends, go to step 4-7; otherwise continue from step 4-1.
Step 4-7: form the matrix S, defined as S = X(:, 1:p).
After the UR decomposition of steps 4-1 to 4-7, the following estimate is obtained:

$$\big(HA'(HA')^T + \gamma\gamma^T\big)^{-1} \approx U \begin{pmatrix} (\hat{S}\hat{S}^T)^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^T,$$

where U denotes an orthogonal matrix and \hat{S} denotes the block of nonzero rows of the upper triangular matrix S, formed from the p columns whose diagonal elements have the largest absolute values. After the decomposition, the matrix U is composed of p Householder transformations, U = H(ω_1)H(ω_2)⋯H(ω_p), where the Householder transformation H(ω_i) is defined as H(ω_i) = I − 2ω_iω_i^T. The optimized EnKF update formula is then converted into:
$$A^a = A^f + A'(HA')^T\, U \begin{pmatrix} (\hat{S}\hat{S}^T)^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^T (D - HA).$$

After the UR decomposition, for reasons of computational efficiency, the remaining products in the above formula are evaluated strictly from right to left, finally yielding the updated parameter ensemble A^a. A sketch of this optimized update is given below.
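A sketch of the optimized update under stated assumptions: SciPy's column-pivoted Householder QR stands in for steps 4-1 to 4-7, the relative-residual bound τ is approximated by truncating small pivots, and, unlike the text (which decomposes HA' alone), HA' and γ are stacked here so that the factored product reproduces HA'(HA')^T + γγ^T exactly.

```python
import numpy as np
from scipy.linalg import qr

def enkf_update_ur(A, HA, D, gamma, tau=1e-8):
    """EnKF analysis step using a truncated pivoted Householder QR."""
    N = A.shape[1]
    M = np.full((N, N), 1.0 / N)
    A_prime, HA_prime = A - A @ M, HA - HA @ M
    X = np.hstack([HA_prime, gamma])        # X @ X.T == HA'HA'^T + gamma gamma^T
    Q, R, piv = qr(X, mode="economic", pivoting=True)
    keep = np.abs(np.diag(R)) > tau * max(abs(R[0, 0]), 1e-300)
    p = int(np.sum(keep))                   # retained pivot count
    S_hat = R[:p, :]                        # nonzero rows of the triangle
    # Evaluate U (S S^T)^{-1} U^T (D - HA) strictly from right to left.
    rhs = Q[:, :p].T @ (D - HA)
    core = np.linalg.solve(S_hat @ S_hat.T, rhs)
    return A + A_prime @ (HA_prime.T @ (Q[:, :p] @ core))
```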
Step 5: if the number of EnKF update rounds has reached the set maximum, go to step S102 to train models with the updated parameter ensemble A^a and evaluate their performance, then go to step 6 described later; otherwise, go to step S102 to perform the next EnKF update computation.
Step S103: perform performance evaluation on the optimized parameter ensembles, and obtain the ensemble pool and supplementary parameter ensembles according to the evaluation results, wherein the parameter ensembles in the ensemble pool perform better than the supplementary parameter ensembles. As shown in Fig. 4, this specifically comprises the following steps:
Step 6: in each EnKF update computation, save the parameter ensemble A and the corresponding HA of every EnKF update, perform a unified performance evaluation on them, and output the optimal parameter ensemble of this process; the computation task of each parameter ensemble outputs one pair of A and HA. In addition, all performance evaluation results are recorded.
Step 7: if the number of ensemble evolution rounds has reached the set maximum, go to step 14 described later; otherwise go to step 8 and continue the ensemble evolution.
Step 8: collect the parameter ensembles and prediction results output by each computation task, perform a summary assessment, and divide the optimized parameter ensembles into three classes according to the evaluation results. In particular, two thresholds are adopted to divide the parameter ensembles into three classes; let score(A) denote the performance value of a parameter ensemble. The assessment proceeds as follows:
If score(A) ≥ thresh1, the parameter ensemble is judged to perform well and is kept in the ensemble pool; if the number of ensembles in the pool reaches N_e, go to step 12 described later, otherwise continue this step.
If score(A) ≤ thresh2, the parameter ensemble is judged to perform poorly and is discarded.
If thresh2 < score(A) < thresh1, the parameter ensemble is judged to perform average, and the average-performing ensembles are randomly merged in pairs to generate the supplementary parameter ensembles. After all parameter ensembles have been assessed, if there are ensembles that need to be merged, go to step 9; otherwise go to step 10. A sketch of this three-way split follows.
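A minimal sketch of this three-way split, assuming `scored` is a list of (score, ensemble) pairs produced by the summary assessment:

```python
import random

def classify_ensembles(scored, thresh1, thresh2):
    """Split (score, ensemble) pairs into a pool and random merge pairs."""
    pool, average = [], []
    for score, A in scored:
        if score >= thresh1:
            pool.append(A)                # performs well: keep in the pool
        elif score > thresh2:
            average.append(A)             # average: candidate for merging
        # score <= thresh2: performs poorly, discarded
    random.shuffle(average)               # random pairwise merging
    pairs = list(zip(average[::2], average[1::2]))
    return pool, pairs
```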
Step 9: for the average-performing parameter ensembles, choose a pair of ensembles and adopt the EnKF-based merge algorithm to generate a new parameter ensemble A^m, specifically comprising the following steps:
Step 9-1: assume the chosen pair of parameter ensembles is A_i, A_j; compute Q_ij and Q_ji respectively, where the computation adopts the optimized evaluation of step 4.
Step 9-2: compute A_ij^m and A_ji^m respectively, specifically

$$A_{ij}^m = \bar{A}_i + A_i' Q_{ij},$$
$$A_{ji}^m = \bar{A}_j + A_j' Q_{ji}.$$
Step 9-3: perform UR decomposition on Q_ij and Q_ji respectively, specifically

$$Q_{ij} \sim \tilde{U}_{ij} \tilde{S}_{ij} \tilde{V}_{ij},$$

where \tilde{S}_{ij} denotes an upper triangular matrix whose columns are arranged by the absolute value of the pivots (diagonal elements) in descending order, and \tilde{V}_{ij} denotes a permutation matrix used to exchange columns when selecting pivots.
Step 9-4: select the N columns with the largest pivots (diagonal elements) from the decompositions of Q_ij and Q_ji, and select the corresponding parameter vectors from the ensembles A_ij^m and A_ji^m to form the final parameter ensemble A^m.
Step 9-5: if the list of parameter ensembles to be merged is empty, go to step 10, i.e. randomly generate a new parameter ensemble; otherwise randomly select another parameter ensemble to merge and repeat steps 9-1 to 9-5. A sketch of steps 9-2 to 9-4 follows.
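A sketch of steps 9-2 to 9-4 under stated assumptions: Q_ij and Q_ji are taken as given (their formula is computed with the optimized evaluation of step 4), SciPy's pivoted QR plays the role of the UR decomposition, and half of the N selected columns are drawn from each side, which is one reading of step 9-4.

```python
import numpy as np
from scipy.linalg import qr

def merge_ensembles(A_i, A_j, Q_ij, Q_ji):
    """Merge two m x N ensembles into one, guided by the pivots of Q."""
    N = A_i.shape[1]
    mean_i = A_i.mean(axis=1, keepdims=True)
    mean_j = A_j.mean(axis=1, keepdims=True)
    A_ij = mean_i + (A_i - mean_i) @ Q_ij       # step 9-2
    A_ji = mean_j + (A_j - mean_j) @ Q_ji
    cols = []
    for Q, src, k in ((Q_ij, A_ij, N - N // 2), (Q_ji, A_ji, N // 2)):
        _, R, piv = qr(Q, mode="economic", pivoting=True)  # step 9-3
        cols.append(src[:, piv[:k]])            # columns with largest pivots
    return np.hstack(cols)                      # merged ensemble A^m
```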
Step S104: perform EnKF-based iterative optimization and performance evaluation again on the parameter ensembles in the ensemble pool and the supplementary parameter ensembles, so as to obtain the optimal parameter ensemble. As shown in Fig. 4, this specifically comprises the following steps:
Step 10: if any parameter ensemble has been discarded or merged, execute step S101 to randomly generate new parameter ensembles.
Step 11: carry out a new round of ensemble evolution with the merged or randomly generated parameter ensembles, and go to step S102.
Step 12: after the ensemble evolution stage finishes, perform the fusion search, as shown in Fig. 2. If the search iteration is the initial merge and the number of parameter ensembles in the ensemble pool is 1, go to step 15 described later; otherwise choose a pair of parameter ensembles from the pool, merge them to generate a new parameter ensemble, and execute steps 9-1 to 9-4.
Step 13: perform an EnKF update on each newly generated parameter ensemble, and go to step S102.
Step 14: after the EnKF update, collect the best-performing parameter ensemble of this process, perform a summary assessment, and add the parameter ensembles with score(A) ≥ thresh1 to the ensemble pool. Then go to step 12 to carry out the next iteration of the fusion search.
Step 15: find the group of parameters with the best performance evaluation from the records and output it as the optimal parameter ensemble.
In summary, the method of the present invention adopts the EnKF technique and can be used for both the parameter optimization and the feature tuning problems of machine learning: the feature scaling factors are treated as one class of parameters, feature enhancement and feature selection are handled by searching over the scaling factors, and the scaling factors can be optimized together with the parameters of the machine learning algorithm. As shown in Fig. 2, the method mainly comprises two stages: ensemble evolution and fusion search.
More specifically, the method of the present invention is based on the EnKF technique, which can estimate nonlinear problems containing a large number of random variables; a framework based on the EnKF technique is established, and multiple optimization techniques are adopted. Specifically, first, the method adopts the EnKF technique to treat the parameter optimization and feature tuning problems of machine learning uniformly as one nonlinear system, estimating the parameters as the states of the system. Second, the method establishes an EnKF-based framework and adopts an ensemble strategy to counter the tendency of numerical optimization to fall into local optima: with the ensemble evolution technique, the multiple parameter ensembles updated by the EnKF are assessed, the well-performing ones are retained, the poorly performing ones are discarded, and the average-performing ones are merged, which mainly serves to widen the search range and raise the search efficiency; after the ensemble evolution finishes, the fusion search technique is adopted to merge the well-performing parameter ensembles and search further, guaranteeing that an approximately optimal solution is found in the high-dimensional space. Finally, because the data volume of machine learning is large, part of the matrix computation and storage overhead in the EnKF computation is very large; an efficient UR decomposition technique is therefore adopted to increase the operating efficiency of the EnKF and strengthen its practicality.
According to the method for the parameter optimization for machine learning of the embodiment of the present invention and feature tuning, can be efficiently in the continuous parameter space of higher-dimension, find optimum solution; And simultaneously processing parameter optimizing of the method and two problems of feature tuning, can promote accuracy and the counting yield of the result of calculation of processing parameter optimizing; In addition, the method is carried out tuning using the zoom factor for the treatment of characteristic as parameter to feature; And the method highly versatile, is applicable to the various algorithms of machine learning.
A further embodiment of the present invention also provides a parameter optimization and feature tuning system for machine learning.
Fig. 7 is a structural block diagram of a parameter optimization and feature tuning system for machine learning according to an embodiment of the present invention. As shown in Fig. 7, the system 700 comprises: a generation module 710, an optimization module 720, an evaluation module 730, and an acquisition module 740. A structural sketch is given below.
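A structural sketch of system 700, with each module modeled as one object; the class and method names are illustrative, not taken from the patent.

```python
class ParameterTuningSystem:
    """System 700: generation 710, optimization 720, evaluation 730,
    acquisition 740, wired together in the order of Fig. 7."""

    def __init__(self, generation, optimization, evaluation, acquisition):
        self.generation = generation        # module 710: random ensembles
        self.optimization = optimization    # module 720: EnKF iteration
        self.evaluation = evaluation        # module 730: pool / supplement split
        self.acquisition = acquisition      # module 740: final optimal ensemble

    def run(self):
        ensembles = self.generation.generate()
        ensembles = [self.optimization.optimize(A) for A in ensembles]
        pool, supplement = self.evaluation.assess(ensembles)
        return self.acquisition.finalize(pool, supplement)
```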
The generation module 710 is used for generating multiple parameter ensembles.
If the feature enhancement function is not enabled, a parameter ensemble contains the parameters of the machine learning algorithm; if the feature tuning function is enabled, it contains both the parameters of the machine learning algorithm and the feature scaling factors. Let A denote a parameter ensemble whose members θ_i each represent one group of parameters (treated as one system state in the EnKF algorithm); A is an m × N state matrix, where m denotes the number of parameters and N denotes the number of ensemble members. In some examples, let N_e denote the number of parameter ensembles; the specific steps by which the generation module 710 randomly generates the parameter ensembles are:
Step 1-1: randomly generate a parameter vector θ ∈ R^{m×1}, where each parameter takes a random value within its predetermined parameter range.
Step 1-2: randomly generate a group of normalized vectors {ρ_i | ρ_i ∈ R^{m×1}, i = 1, …, N} and orthogonalize them to guarantee that the perturbations are linearly independent.
Step 1-3: generate the parameter ensemble perturbation, specifically

$$A' = (F_a r_1 \rho_1, F_a r_2 \rho_2, \ldots, F_a r_N \rho_N) \in R^{m \times N}, \quad r_i \sim N(0, S_p),$$

where A' denotes the parameter ensemble perturbation, ρ_i denotes a randomly generated perturbation vector, the variable r_i denotes a random amplitude following a Gaussian distribution with configurable variance S_p, the matrix F_a = (f_1 e_1, f_2 e_2, …, f_m e_m), e_i denotes a unit vector, and f_i is a configurable scaling variable for adjusting the perturbation amplitude.
Step 1-4: add each perturbation vector ε_i ∈ A' to the parameter vector θ to obtain a group of parameters θ_i = θ + ε_i; N groups of parameters are generated in total, forming the parameter ensemble A.
Step 1-5: repeat steps 1-1 to 1-4 to generate N_e parameter ensembles.
The optimization module 720 is used for performing EnKF-based iterative optimization on each of the parameter ensembles. In some examples, the optimization module 720 trains each parameter ensemble on a predetermined training dataset with the machine learning algorithm to generate models, performs performance evaluation on the models on a predetermined validation dataset, and finally updates the parameter ensembles by the EnKF algorithm. More specifically, as shown in Fig. 3, this process may comprise the following steps:
Step 2-1: perform feature scaling on the training dataset (denoted X_T, for example) and the validation dataset (denoted X_V, for example).
If the feature tuning function is not enabled, the feature values of each sample are uniformly normalized into a specified value range, and these normalized training and validation data are used unchanged until the whole search finishes. If the feature tuning function is enabled, let δ_j ∈ θ_i denote a scaling factor. If feature selection is not performed, each feature is numerically scaled according to δ_j. If feature selection is performed, the scaling factors are first normalized to [0, 1] by the formula δ_i/δ_max, and then only the features whose scaling factors exceed a given threshold are numerically scaled; only the scaled features take part in training and validation, while all δ_i still participate in the EnKF computation. If the feature tuning function is not enabled, step 2-1 is executed only the first time; otherwise this step is executed every time.
Step 2-2: input the parameters in θ_i into the machine learning algorithm and train on the training dataset X_T to generate a model; model selection techniques are usually adopted to guarantee generalization performance.
Step 2-3: predict each sample with the model to obtain the estimated values of the model. More specifically, the model generated with θ_i predicts each sample of the validation dataset X_V and outputs a rank value, i.e. the estimated value output by the model; let Hθ_i denote the vector containing these predictions.
Step 2-4: obtain the models of all parameter groups and perform performance evaluation. Specifically, for each group of parameters θ_i ∈ A, steps 2-1 to 2-3 are repeated to generate a model and evaluate its performance. Let HA denote the ensemble containing the predicted values obtained with the models generated from each group of parameters (in the EnKF algorithm it represents the mapping from state to observation); HA can be expressed as

$$HA = (H\theta_1, H\theta_2, \ldots, H\theta_N) \in R^{n \times N},$$

where n denotes the number of samples in the validation dataset.
Further, for each parameter ensemble A, the above training and evaluation steps are repeated to obtain the corresponding HA.
Further, in some examples, the following steps may also be performed after the above process:
Step 3: generate the observation ensemble. Let D denote the observation ensemble,

$$D = (d_1, d_2, \ldots, d_N) \in R^{n \times N},$$

where d_i denotes the i-th observation vector. An observation represents the given prior probability of a sample; if the prior probabilities of the samples are not provided, an initial observation value is assigned. Let d_0 denote the initial observation vector containing the observations of the samples. The specific steps of observation ensemble generation are:
Step 3-1: randomly generate a group of normalized vectors; if the number of samples n is small, orthogonalize the random vectors, otherwise skip the orthogonalization.
Step 3-2: generate random observation perturbations. Let γ denote the observation ensemble perturbation,

$$\gamma = (v_1 \beta_1, v_2 \beta_2, \ldots, v_N \beta_N) \in R^{n \times N},$$

where β_i denotes a random perturbation vector generated in step 3-1, and the variable v_i denotes a random amplitude following a Gaussian distribution with configurable variance S_o.
Step 3-3: add each perturbation vector v_iβ_i ∈ γ to the initial observation vector d_0 to compute an observation vector d_i = d_0 + v_iβ_i; N groups of observations are generated in total, forming the observation ensemble D.
Step 3-4: repeat steps 3-1 to 3-3 to generate N_e observation perturbation ensembles and observation ensembles.
In this example, the optimization module 720 updates the parameter ensemble by the following formula:

$$A^a = A^f + A'(HA')^T \big(HA'(HA')^T + \gamma\gamma^T\big)^{-1}(D - HA),$$

where A denotes the parameter ensemble, A^f the current parameter ensemble, A^a the updated parameter ensemble, A' the parameter ensemble perturbation, D the observation ensemble, γ the observation ensemble perturbation, HA the ensemble of model predictions, and HA' the perturbation of HA.
Further, HA' is computed by

$$\overline{HA} = HA\, M_N,$$
$$HA' = HA - \overline{HA},$$

where \overline{HA} denotes the ensemble mean and M_N is an N × N matrix in which every element equals 1/N. In the EnKF computation, the cost of evaluating (HA'(HA')^T + γγ^T)^{-1} is excessive, so a Householder-based UR decomposition is adopted to optimize the computation. Let X denote the matrix HA', let X(i, j) denote an element of X and X(:, j) one of its columns, and let τ denote a relative residual bound with 1 ≫ τ > 0. For each column i = 1, …, N of X, the following steps are performed:
Step 4-1: compute the residual norms of the remaining columns:

$$\mathrm{ResNorm}(k) = \lVert X(i{:}n, k) \rVert_2, \quad k = i, \ldots, N.$$

Step 4-2: if

$$\frac{\tau}{1-\tau} \sum_{l=1}^{i-1} |X(l, l)| > (N - i + 1)\, \mathrm{ResNorm}(\hat{k}),$$

where \hat{k} denotes the column index with the maximum residual norm, then set p = i − 1, end the loop, and go to step 4-7; otherwise set p = i and exchange the two columns X(:, \hat{k}) ↔ X(:, i).
Step 4-3: initialize the vector ω_i, where

$$\omega_i(k) = \begin{cases} 0 & k < i \\ X(k, i) & k \ge i. \end{cases}$$

Step 4-4: compute Norm = ResNorm(\hat{k}). If X(i, i) > 0, then ω_i = ω_i + Norm · e_i, and the i-th column of X is updated so that

$$X(k, i) = \begin{cases} X(k, i) & k < i \\ -\mathrm{Norm} & k = i \\ 0 & k > i; \end{cases}$$

otherwise ω_i = ω_i − Norm · e_i, and the i-th column of X is updated so that

$$X(k, i) = \begin{cases} X(k, i) & k < i \\ +\mathrm{Norm} & k = i \\ 0 & k > i. \end{cases}$$

Step 4-5: compute ω_i = ω_i / ||ω_i||_2, and for k = i+1, …, N perform the update

$$X(i{:}n, k) = X(i{:}n, k) - 2\,\omega_i(i{:}n)\big(\omega_i(i{:}n)^T X(i{:}n, k)\big).$$

Step 4-6: set i = i + 1; if i > N, the loop ends, go to step 4-7; otherwise continue from step 4-1.
Step 4-7: form the matrix S, defined as S = X(:, 1:p).
After the UR decomposition of steps 4-1 to 4-7, the following estimate is obtained:

$$\big(HA'(HA')^T + \gamma\gamma^T\big)^{-1} \approx U \begin{pmatrix} (\hat{S}\hat{S}^T)^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^T,$$

where U denotes an orthogonal matrix and \hat{S} denotes the block of nonzero rows of the upper triangular matrix S, formed from the p columns whose diagonal elements have the largest absolute values. After the decomposition, the matrix U is composed of p Householder transformations, U = H(ω_1)H(ω_2)⋯H(ω_p), where H(ω_i) = I − 2ω_iω_i^T. The optimized EnKF update formula is then converted into:

$$A^a = A^f + A'(HA')^T\, U \begin{pmatrix} (\hat{S}\hat{S}^T)^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^T (D - HA).$$

After the UR decomposition, for reasons of computational efficiency, the remaining products in the above formula are evaluated strictly from right to left, finally yielding the updated parameter ensemble A^a.
Step 5: if the number of EnKF update rounds has reached the set maximum, train models with the updated parameter ensemble A^a and evaluate their performance, then go to step 6 described later; otherwise perform the next EnKF update computation.
The evaluation module 730 is used for performing performance evaluation on the optimized parameter ensembles and obtaining the ensemble pool and supplementary parameter ensembles according to the evaluation results, wherein the parameter ensembles in the ensemble pool perform better than the supplementary parameter ensembles. In some examples, this process can be summarized in the following steps:
Step 6: in each EnKF update computation, save the parameter ensemble A and the corresponding HA of every EnKF update, perform a unified performance evaluation on them, and output the optimal parameter ensemble of this process; the computation task of each parameter ensemble outputs one pair of A and HA. In addition, all performance evaluation results are recorded.
Step 7: if the number of ensemble evolution rounds has reached the set maximum, go to step 14 described later; otherwise go to step 8 and continue the ensemble evolution.
Step 8: collect the parameter ensembles and prediction results output by each computation task, perform a summary assessment, and divide the optimized parameter ensembles into three classes according to the evaluation results, adopting two thresholds; let score(A) denote the performance value of a parameter ensemble. The assessment proceeds as follows:
If score(A) ≥ thresh1, the parameter ensemble is judged to perform well and is kept in the ensemble pool; if the number of ensembles in the pool reaches N_e, go to step 12 described later, otherwise continue this step.
If score(A) ≤ thresh2, the parameter ensemble is judged to perform poorly and is discarded.
If thresh2 < score(A) < thresh1, the parameter ensemble is judged to perform average, and the average-performing ensembles are randomly merged in pairs to generate the supplementary parameter ensembles. After all parameter ensembles have been assessed, if there are ensembles that need to be merged, go to step 9; otherwise go to step 10.
Step 9: for the average-performing parameter ensembles, choose a pair of ensembles and adopt the EnKF-based merge algorithm to generate a new parameter ensemble A^m, specifically comprising the following steps:
Step 9-1: assume the chosen pair of parameter ensembles is A_i, A_j; compute Q_ij and Q_ji respectively, where the computation adopts the optimized evaluation of step 4.
Step 9-2: compute A_ij^m and A_ji^m respectively, specifically

$$A_{ij}^m = \bar{A}_i + A_i' Q_{ij},$$
$$A_{ji}^m = \bar{A}_j + A_j' Q_{ji}.$$

Step 9-3: perform UR decomposition on Q_ij and Q_ji respectively, specifically

$$Q_{ij} \sim \tilde{U}_{ij} \tilde{S}_{ij} \tilde{V}_{ij},$$

where \tilde{S}_{ij} denotes an upper triangular matrix whose columns are arranged by the absolute value of the pivots (diagonal elements) in descending order, and \tilde{V}_{ij} denotes a permutation matrix used to exchange columns when selecting pivots.
Step 9-4: select the N columns with the largest pivots (diagonal elements) from the decompositions of Q_ij and Q_ji, and select the corresponding parameter vectors from the ensembles A_ij^m and A_ji^m to form the final parameter ensemble A^m.
Step 9-5: if the list of parameter ensembles to be merged is empty, go to step 10, i.e. randomly generate a new parameter ensemble; otherwise randomly select another parameter ensemble to merge and repeat steps 9-1 to 9-5.
The acquisition module 740 performs EnKF-based iterative optimization and performance evaluation again on the parameter sets in the set pool and the supplementary parameter sets, so as to obtain the optimal parameter set. In some examples, this process can be summarized as the following steps:
Step 10: if any parameter sets have been discarded or merged, randomly generate new parameter sets to replace them.
Step 11: carry out a new round of set evolution with the merged or randomly generated parameter sets.
Step 12: after the set-evolution phase finishes, perform the fusion search, as shown in Figure 2. If, at the initial merge of a fusion iteration, the number of parameter sets in the set pool is 1, go to step 15 below; otherwise, choose a pair of parameter sets from the set pool, merge them to generate a new parameter set, and perform steps 9-1 to 9-4.
Step 13: apply the EnKF update to each newly generated parameter set.
Step 14: after the EnKF update, collect the best-performing parameter sets of this pass, perform an aggregate assessment, and add the parameter sets with score(A) ≥ thresh1 to the set pool. Then go to step 12 for the next iteration of the fusion search.
Step 15: from the recorded performance evaluations, find the group of parameters with the best evaluated performance and output it as the optimal parameter set.
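Read end to end, steps 6 through 15 describe a two-phase search. The sketch below ties the phases together at a high level; every callable it accepts (`generate_set`, `enkf_update`, `evaluate`, `merge_sets`) is an assumed stand-in for the corresponding operation above, and the control flow is deliberately simplified (for instance, it runs a fixed number of evolution rounds instead of also stopping once the pool holds $N_e$ sets).

```python
import random

def optimize(generate_set, enkf_update, evaluate, merge_sets,
             n_sets, n_rounds, thresh1, thresh2):
    """Two-phase search: set evolution (steps 6-11),
    then fusion search over the set pool (steps 12-15)."""
    history = []                                   # every (score, A) seen

    # phase 1: set evolution
    active = [generate_set() for _ in range(n_sets)]
    pool = []
    for _ in range(n_rounds):
        active = [enkf_update(A) for A in active]
        scored = [(evaluate(A), A) for A in active]
        history += scored
        pool += [A for s, A in scored if s >= thresh1]
        moderate = [A for s, A in scored if thresh2 < s < thresh1]
        random.shuffle(moderate)                   # step 9: pairwise merge
        active = [merge_sets(a, b)
                  for a, b in zip(moderate[0::2], moderate[1::2])]
        while len(active) < n_sets:                # step 10: refill randomly
            active.append(generate_set())

    # phase 2: fusion search
    while len(pool) > 1:
        i, j = random.sample(range(len(pool)), 2)
        a, b = pool[i], pool[j]
        pool = [p for k, p in enumerate(pool) if k not in (i, j)]
        merged = enkf_update(merge_sets(a, b))     # steps 12-13
        s = evaluate(merged)
        history.append((s, merged))
        if s >= thresh1:                           # step 14
            pool.append(merged)

    # step 15: best-scoring parameter set seen anywhere in the process
    return max(history, key=lambda t: t[0])[1]
```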
In summary, the system of the present invention adopts the EnKF technique and addresses both the parameter-optimization and the feature-tuning problems of machine learning: by treating feature scaling factors as a class of parameters, it handles feature enhancement and feature selection through optimization of the scaling factors, which can be optimized together with the parameters of the machine learning algorithm. As shown in Figure 2, the system operates mainly in two stages: set evolution and fusion search.
More specifically, the system of the present invention is based on the EnKF technique, which can estimate nonlinear problems involving a large number of random variables; a framework based on EnKF has been established, and several optimization techniques are adopted. First, the system uses EnKF to treat the parameter optimization and feature tuning of machine learning uniformly as a nonlinear system, with the parameters regarded as the system state to be estimated. Second, within the EnKF-based framework, to counter the tendency of numerical optimization to become trapped in local optima, the system adopts a set-evolution technique: the multiple parameter sets updated by EnKF are assessed, the well-performing ones are retained, the poorly performing ones are discarded, and the moderately performing ones are merged, mainly to enlarge the search range and improve search efficiency; after the set evolution finishes, a fusion-search technique merges the well-performing parameter sets and searches further, ensuring that an approximate optimal solution can be found in a high-dimensional space. Finally, because machine learning involves large volumes of data, some matrix operations in the EnKF computation incur heavy computation and storage overhead; an efficient UR decomposition technique is therefore adopted to increase the operating efficiency of EnKF and thereby enhance its practicality.
The system for parameter optimization and feature tuning for machine learning according to the embodiments of the present invention can efficiently find optimal solutions in high-dimensional continuous parameter spaces; by handling the parameter-optimization and feature-tuning problems simultaneously, it improves both the accuracy and the computational efficiency of the parameter-optimization results; it tunes features by treating the feature scaling factors as parameters; and it is highly versatile, being applicable to a wide range of machine learning algorithms.
In the description of the present invention, it should be understood that terms indicating orientation or positional relationships, such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial" and "circumferential", are based on the orientations or positional relationships shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation; they therefore cannot be construed as limiting the present invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless otherwise expressly and specifically limited.
In the present invention, unless otherwise expressly specified and limited, terms such as "mounted", "connected", "coupled" and "fixed" should be understood broadly; for example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements, unless otherwise expressly limited. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood according to the particular circumstances.
In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediate medium. Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature; a first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.
In the description of this specification, reference to the terms "an embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a particular feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, where no contradiction arises, those skilled in the art may combine different embodiments or examples, and features of different embodiments or examples, described in this specification.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and cannot be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (14)

1. A method for parameter optimization and feature tuning for machine learning, characterized by comprising the following steps:
randomly generating multiple parameter sets;
performing EnKF-based iterative optimization on each of the multiple parameter sets;
performing performance evaluation on each of the optimized multiple parameter sets, and obtaining a set pool and supplementary parameter sets according to the assessment results, wherein the performance of the parameter sets in the set pool is higher than the performance of the parameter sets in the supplementary parameter sets;
performing EnKF-based iterative optimization and performance evaluation again on the parameter sets in the set pool and in the supplementary parameter sets, so as to obtain the optimal parameter set.
2. The method for parameter optimization and feature tuning for machine learning according to claim 1, characterized in that performing EnKF-based iterative optimization on each of the multiple parameter sets specifically comprises:
training each parameter set on a predetermined training data set by a machine learning algorithm to generate a model;
performing performance evaluation on the model on a predetermined validation data set;
updating the multiple parameter sets by the EnKF algorithm.
3. The method for parameter optimization and feature tuning for machine learning according to claim 1, characterized in that randomly generating multiple parameter sets specifically comprises the following steps:
randomly generating a parameter vector $\theta \in R^{m \times 1}$, wherein each parameter takes a random value within a predetermined parameter range;
randomly generating a group of normalized vectors $\{\rho_i \mid \rho_i \in R^{m \times 1}, i = 1, \dots, N\}$ and orthogonalizing them to ensure that the perturbations are linearly independent;
generating the parameter set perturbations, specifically:
$A' = (F_a r_1 \rho_1, F_a r_2 \rho_2, \dots, F_a r_N \rho_N) \in R^{m \times N}, \quad r_i \sim N(0, S_p)$,
wherein $A'$ denotes the parameter set perturbations, $\rho_i$ denotes a randomly generated perturbation vector, the variable $r_i$ denotes a random amplitude that follows a Gaussian distribution with configurable variance $S_p$, the matrix $F_a = (f_1 e_1, f_2 e_2, \dots, f_n e_n)$, $e_i$ denotes a unit vector, and $f_i$ is a configurable scaling variable for adjusting the perturbation amplitude;
adding each perturbation vector $\varepsilon_i \in A'$ of the perturbation set $A'$ to the parameter vector $\theta$ to obtain a group of parameters $\theta_i = \theta + \varepsilon_i$, generating N groups of parameters in total to form the parameter set A;
repeating the above steps to generate $N_e$ parameter sets.
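To make the generation procedure of claim 3 concrete, here is a minimal NumPy sketch; it is an illustrative reading rather than the patented implementation, assuming N ≤ m so that QR can return N orthonormal perturbation directions, and the argument names (`lo`, `hi`, `f`) are hypothetical.

```python
import numpy as np

def generate_set(lo, hi, N, S_p, f, rng=None):
    """Generate one parameter set A of N members around a random
    parameter vector theta, following the steps of claim 3."""
    rng = rng or np.random.default_rng()
    m = len(lo)
    theta = rng.uniform(lo, hi)                # theta in R^{m x 1}
    # random directions rho_i, orthogonalized so that the
    # perturbations are linearly independent
    rho = rng.standard_normal((m, N))
    rho, _ = np.linalg.qr(rho)                 # orthonormal columns
    r = rng.normal(0.0, np.sqrt(S_p), size=N)  # amplitudes r_i ~ N(0, S_p)
    F_a = np.diag(f)                           # F_a = (f_1 e_1, ..., f_m e_m)
    A_pert = F_a @ (rho * r)                   # A' = (F_a r_1 rho_1, ...)
    return theta[:, None] + A_pert             # columns theta_i = theta + eps_i
```

Repeating this $N_e$ times, as in the last step of the claim, would yield the initial population of parameter sets.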
4. The method for parameter optimization and feature tuning for machine learning according to claim 2, characterized by further comprising:
performing feature scaling on the training data set and the validation data set;
inputting the parameters of a parameter set into the machine learning algorithm and training on the training data set to generate a model;
predicting each sample by the model to obtain the estimated values of the model;
obtaining the models for all parameters and performing performance evaluation, specifically:
wherein HA denotes the set of prediction values obtained from the models generated by each group of parameters in the set, and n denotes the number of samples in the validation data set.
5. The method for parameter optimization and feature tuning for machine learning according to claim 2, characterized by further comprising: updating the parameter sets by the following formula:
$A^a = A^f + A'(HA')^T \big( HA'(HA')^T + \gamma\gamma^T \big)^{-1} (D - HA)$,
wherein A denotes a parameter set, $A^f$ denotes the current parameter set, $A^a$ denotes the updated parameter set, $A'$ denotes the parameter set perturbations, D denotes the observation set, $\gamma$ denotes the observation set perturbations, HA denotes the set of model prediction results, and $HA'$ denotes the perturbations of HA.
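The update formula of claim 5 translates almost directly into NumPy. The sketch below is one possible reading, assuming column-oriented ensembles (one member per column, so A is m × N and HA, D, γ are n × N) and forming the perturbations as deviations from the ensemble mean; a linear solve replaces the explicit matrix inverse for numerical robustness.

```python
import numpy as np

def enkf_update(A, HA, D, gamma):
    """One EnKF analysis step per claim 5:
    A^a = A^f + A'(HA')^T (HA'(HA')^T + gamma gamma^T)^{-1} (D - HA)."""
    A_pert = A - A.mean(axis=1, keepdims=True)      # A'
    HA_pert = HA - HA.mean(axis=1, keepdims=True)   # HA'
    C = HA_pert @ HA_pert.T + gamma @ gamma.T       # n x n innovation term
    gain = A_pert @ HA_pert.T                       # m x n
    return A + gain @ np.linalg.solve(C, D - HA)    # analysis ensemble A^a
```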
6. The method for parameter optimization and feature tuning for machine learning according to claim 1, characterized in that performing performance evaluation on each of the optimized multiple parameter sets and obtaining a set pool and supplementary parameter sets according to the assessment results further comprises:
dividing the optimized multiple parameter sets into three classes according to the assessment results, letting score(A) denote the performance value of a parameter set;
if score(A) ≥ thresh1, judging that the parameter set performs well, and saving the parameter set into the set pool;
if score(A) ≤ thresh2, judging that the parameter set performs poorly, and discarding the parameter set;
if thresh2 < score(A) < thresh1, judging that the parameter set performs moderately, and randomly merging the moderately performing parameter sets in pairs to generate the supplementary parameter sets.
7. The method for parameter optimization and feature tuning for machine learning according to claim 6, characterized by further comprising:
choosing a pair of parameter sets from the moderately performing parameter sets, and generating a new parameter set by the EnKF-based merge algorithm, specifically comprising:
letting the chosen pair of parameter sets be $A_i$ and $A_j$, and computing $Q_{ij}$ and $Q_{ji}$ respectively;
computing $A_{ij}^m$ and $A_{ji}^m$ respectively, wherein:
$A_{ij}^m = \bar{A}_i + A_i' Q_{ij}$,
$A_{ji}^m = \bar{A}_j + A_j' Q_{ji}$;
performing UR decomposition on $Q_{ij}$ and $Q_{ji}$ respectively;
selecting from the decomposed matrices the N columns with the largest pivots respectively, and according to these selecting the corresponding parameter vectors from the parameter sets $A_{ij}^m$ and $A_{ji}^m$ to form the final parameter set $A^m$;
if the list of parameter sets to be merged is empty, randomly generating a new parameter set; otherwise, randomly selecting another pair of parameter sets to merge.
8. A system for parameter optimization and feature tuning for machine learning, characterized by comprising:
a generation module for generating multiple parameter sets;
an optimization module for performing EnKF-based iterative optimization on each of the multiple parameter sets;
an evaluation module for performing performance evaluation on each of the optimized multiple parameter sets and obtaining a set pool and supplementary parameter sets according to the assessment results, wherein the performance of the parameter sets in the set pool is higher than the performance of the parameter sets in the supplementary parameter sets;
an acquisition module for performing EnKF-based iterative optimization and performance evaluation again on the parameter sets in the set pool and in the supplementary parameter sets, so as to obtain the optimal parameter set.
9. The system for parameter optimization and feature tuning for machine learning according to claim 8, characterized in that the optimization module is configured to train each parameter set on a predetermined training data set by a machine learning algorithm to generate a model, to perform performance evaluation on the model on a predetermined validation data set, and to update the multiple parameter sets by the EnKF algorithm.
10. The system for parameter optimization and feature tuning for machine learning according to claim 8, characterized in that the generation module generates multiple parameter sets, specifically comprising:
randomly generating a parameter vector $\theta \in R^{m \times 1}$, wherein each parameter takes a random value within a predetermined parameter range;
randomly generating a group of normalized vectors $\{\rho_i \mid \rho_i \in R^{m \times 1}, i = 1, \dots, N\}$ and orthogonalizing them to ensure that the perturbations are linearly independent;
generating the parameter set perturbations, specifically:
$A' = (F_a r_1 \rho_1, F_a r_2 \rho_2, \dots, F_a r_N \rho_N) \in R^{m \times N}, \quad r_i \sim N(0, S_p)$,
wherein $A'$ denotes the parameter set perturbations, $\rho_i$ denotes a randomly generated perturbation vector, the variable $r_i$ denotes a random amplitude that follows a Gaussian distribution with configurable variance $S_p$, the matrix $F_a = (f_1 e_1, f_2 e_2, \dots, f_n e_n)$, $e_i$ denotes a unit vector, and $f_i$ is a configurable scaling variable for adjusting the perturbation amplitude;
adding each perturbation vector $\varepsilon_i \in A'$ of the perturbation set $A'$ to the parameter vector $\theta$ to obtain a group of parameters $\theta_i = \theta + \varepsilon_i$, generating N groups of parameters in total to form the parameter set A;
repeating the above steps to generate $N_e$ parameter sets.
11. The system for parameter optimization and feature tuning for machine learning according to claim 9, characterized in that the optimization module is further configured to perform feature scaling on the training data set and the validation data set; to input the parameters of a parameter set into the machine learning algorithm and train on the training data set to generate a model; to predict each sample by the model to obtain the estimated values of the model; and to obtain the models for all parameters and perform performance evaluation, specifically:
wherein HA denotes the set of prediction values obtained from the models generated by each group of parameters in the set, and n denotes the number of samples in the validation data set.
12. The system for parameter optimization and feature tuning for machine learning according to claim 9, characterized in that the parameter sets are updated by the following formula:
$A^a = A^f + A'(HA')^T \big( HA'(HA')^T + \gamma\gamma^T \big)^{-1} (D - HA)$,
wherein A denotes a parameter set, $A^f$ denotes the current parameter set, $A^a$ denotes the updated parameter set, $A'$ denotes the parameter set perturbations, D denotes the observation set, $\gamma$ denotes the observation set perturbations, HA denotes the set of model prediction results, and $HA'$ denotes the perturbations of HA.
13. The system for parameter optimization and feature tuning for machine learning according to claim 8, characterized in that the evaluation module is configured to divide the optimized multiple parameter sets into three classes according to the assessment results, with score(A) denoting the performance value of a parameter set: when score(A) ≥ thresh1, the parameter set is judged to perform well and is saved into the set pool; when score(A) ≤ thresh2, the parameter set is judged to perform poorly and is discarded; and when thresh2 < score(A) < thresh1, the parameter set is judged to perform moderately, and the moderately performing parameter sets are randomly merged in pairs to generate the supplementary parameter sets.
14. The system for parameter optimization and feature tuning for machine learning according to claim 13, characterized in that the evaluation module is further configured to choose a pair of parameter sets from the moderately performing parameter sets and to generate a new parameter set by the EnKF-based merge algorithm, specifically comprising:
letting the chosen pair of parameter sets be $A_i$ and $A_j$, and computing $Q_{ij}$ and $Q_{ji}$ respectively;
computing $A_{ij}^m$ and $A_{ji}^m$ respectively, wherein:
$A_{ij}^m = \bar{A}_i + A_i' Q_{ij}$,
$A_{ji}^m = \bar{A}_j + A_j' Q_{ji}$;
performing UR decomposition on $Q_{ij}$ and $Q_{ji}$ respectively;
selecting from the decomposed matrices the N columns with the largest pivots respectively, and according to these selecting the corresponding parameter vectors from the parameter sets $A_{ij}^m$ and $A_{ji}^m$ to form the final parameter set $A^m$;
if the list of parameter sets to be merged is empty, randomly generating a new parameter set; otherwise, randomly selecting another pair of parameter sets to merge.
CN201410422475.XA 2014-06-05 2014-08-25 Parameter optimization and feature tuning method and system for machine learning Active CN104200087B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410422475.XA CN104200087B (en) 2014-06-05 2014-08-25 Parameter optimization and feature tuning method and system for machine learning
PCT/CN2014/090050 WO2015184729A1 (en) 2014-06-05 2014-10-31 Method and system for hyper-parameter optimization and feature tuning of machine learning algorithms

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CNPCT/CN2014/079308 2014-06-05
CN2014079308 2014-06-05
CN201410422475.XA CN104200087B (en) 2014-06-05 2014-08-25 Parameter optimization and feature tuning method and system for machine learning

Publications (2)

Publication Number Publication Date
CN104200087A true CN104200087A (en) 2014-12-10
CN104200087B CN104200087B (en) 2018-10-02

Family

ID=52085380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410422475.XA Active CN104200087B (en) 2014-06-05 2014-08-25 For the parameter optimization of machine learning and the method and system of feature tuning

Country Status (2)

Country Link
CN (1) CN104200087B (en)
WO (1) WO2015184729A1 (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915566A (en) * 2015-06-17 2015-09-16 大连理工大学 Design method for depth calculation model supporting incremental updating
CN107169469A (en) * 2017-06-02 2017-09-15 南京理工大学 A kind of material identification method of the MIMO radar based on machine learning
WO2018040561A1 (en) * 2016-08-31 2018-03-08 华为技术有限公司 Data processing method, device and system
CN107844837A (en) * 2017-10-31 2018-03-27 第四范式(北京)技术有限公司 The method and system of algorithm parameter tuning are carried out for machine learning algorithm
WO2019014933A1 (en) * 2017-07-21 2019-01-24 深圳市汇顶科技股份有限公司 Method and device for setting parameters in signal calculation method
CN109409533A (en) * 2018-09-28 2019-03-01 深圳乐信软件技术有限公司 A kind of generation method of machine learning model, device, equipment and storage medium
WO2019067374A1 (en) * 2017-09-26 2019-04-04 Amazon Technologies, Inc. Dynamic tuning of training parameters for machine learning algorithms
CN109685089A (en) * 2017-10-18 2019-04-26 北京京东尚科信息技术有限公司 The system and method for assessment models performance
CN109726763A (en) * 2018-12-29 2019-05-07 北京神州绿盟信息安全科技股份有限公司 A kind of information assets recognition methods, device, equipment and medium
CN110197285A (en) * 2019-05-07 2019-09-03 清华大学 Security cooperation deep learning method and device based on block chain
CN110263949A (en) * 2019-06-21 2019-09-20 安徽智寰科技有限公司 Merge the data processing method and system of machine mechanism and intelligent algorithm system
CN111275123A (en) * 2020-02-10 2020-06-12 北京信息科技大学 Method and system for generating large-batch confrontation samples
CN111797990A (en) * 2019-04-08 2020-10-20 北京百度网讯科技有限公司 Training method, training device and training system of machine learning model
CN112336310A (en) * 2020-11-04 2021-02-09 吾征智能技术(北京)有限公司 Heart disease diagnosis system based on FCBF and SVM fusion
CN112380044A (en) * 2020-12-04 2021-02-19 腾讯科技(深圳)有限公司 Data anomaly detection method and device, computer equipment and storage medium
US11079739B2 (en) 2019-02-25 2021-08-03 General Electric Company Transfer learning/dictionary generation and usage for tailored part parameter generation from coupon builds
US11472115B2 (en) 2019-03-21 2022-10-18 General Electric Company In-situ monitoring system assisted material and parameter development for additive manufacturing

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10235443B2 (en) * 2016-03-01 2019-03-19 Accenture Global Solutions Limited Parameter set determination for clustering of datasets
CN106169096B (en) * 2016-06-24 2018-07-24 山西大学 A kind of appraisal procedure of machine learning system learning performance
CN107659595B (en) * 2016-07-25 2021-06-25 阿里巴巴集团控股有限公司 Method and device for evaluating capability of distributed cluster to process designated service
US20180082212A1 (en) * 2016-09-20 2018-03-22 Intel Corporation Optimizing machine learning running time
US10762163B2 (en) 2016-12-05 2020-09-01 Microsoft Technology Licensing, Llc Probabilistic matrix factorization for automated machine learning
CN108734330A (en) * 2017-04-24 2018-11-02 北京京东尚科信息技术有限公司 Data processing method and device
GB201805302D0 (en) * 2018-03-29 2018-05-16 Benevolentai Tech Limited Ensemble Model Creation And Selection
US10600005B2 (en) 2018-06-01 2020-03-24 Sas Institute Inc. System for automatic, simultaneous feature selection and hyperparameter tuning for a machine learning model
US10417558B1 (en) 2018-09-28 2019-09-17 Deep Insight Solutions, Inc. Methods and systems for artificial neural network optimistic event processing
CN109041093B (en) * 2018-07-10 2021-08-13 深圳无线电检测技术研究院 Blind signal source power position joint estimation method and system
US20200151576A1 (en) * 2018-11-08 2020-05-14 Uber Technologies, Inc. Training adaptable neural networks based on evolvability search
DE102018220064A1 (en) 2018-11-22 2020-05-28 Volkswagen Aktiengesellschaft Determination of values of production parameters
CN109740113B (en) * 2018-12-03 2023-10-03 东软集团股份有限公司 Super-parameter threshold range determining method and device, storage medium and electronic equipment
CN109376869A (en) * 2018-12-25 2019-02-22 中国科学院软件研究所 A machine learning hyperparameter optimization system and method based on asynchronous Bayesian optimization
CN110070117B (en) * 2019-04-08 2023-04-07 腾讯科技(深圳)有限公司 Data processing method and device
CN112085180B (en) * 2019-06-14 2024-05-17 北京百度网讯科技有限公司 Machine learning super parameter determination method, device, equipment and readable storage medium
US11681931B2 (en) 2019-09-24 2023-06-20 International Business Machines Corporation Methods for automatically configuring performance evaluation schemes for machine learning algorithms
CN113366445A (en) * 2019-09-30 2021-09-07 株式会社日立信息通信工程 State prediction system
CN111259604A (en) * 2020-01-16 2020-06-09 中国科学院空间应用工程与技术中心 High orbit satellite light pressure model identification method and system based on machine learning
CN111401569B (en) * 2020-03-27 2023-02-17 支付宝(杭州)信息技术有限公司 Hyper-parameter optimization method and device and electronic equipment
CN115701294A (en) 2020-06-04 2023-02-07 三菱电机株式会社 Optimal solution calculation device and optimal solution calculation method for optimization problem
US11823076B2 (en) 2020-07-27 2023-11-21 International Business Machines Corporation Tuning classification hyperparameters
CN112816191B (en) * 2020-12-28 2022-07-29 哈尔滨工业大学 Multi-feature health factor fusion method based on SDRSN
CN112835011A (en) * 2020-12-31 2021-05-25 华中光电技术研究所(中国船舶重工集团公司第七一七研究所) Laser radar inversion algorithm based on machine learning parameter compensation
CN112949850B (en) * 2021-01-29 2024-02-06 北京字节跳动网络技术有限公司 Super-parameter determination method, device, deep reinforcement learning framework, medium and equipment
CN113642766B (en) * 2021-07-08 2024-01-30 南方电网科学研究院有限责任公司 Method, device, equipment and medium for predicting power outage number of power system station
CN117077598B (en) * 2023-10-13 2024-01-26 青岛展诚科技有限公司 3D parasitic parameter optimization method based on Mini-batch gradient descent method
CN117800425B (en) * 2024-03-01 2024-06-07 宜宾科全矿泉水有限公司 Water purifier control method and system based on artificial intelligence

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007003343A1 (en) * 2005-06-30 2007-01-11 Biocrates Life Sciences Ag Apparatus and method for analyzing a metabolite profile
CN103744978A (en) * 2014-01-14 2014-04-23 清华大学 Parameter optimization method for support vector machine based on grid search technology
CN103793764A (en) * 2014-02-10 2014-05-14 济南大学 Package optimizing system and method based on GPU and neighboring mass data rapid analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060224532A1 (en) * 2005-03-09 2006-10-05 Case Western Reserve University Iterative feature weighting with neural networks
CN101782976B (en) * 2010-01-15 2013-04-10 南京邮电大学 Automatic selection method for machine learning in cloud computing environment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007003343A1 (en) * 2005-06-30 2007-01-11 Biocrates Life Sciences Ag Apparatus and method for analyzing a metabolite profile
CN103744978A (en) * 2014-01-14 2014-04-23 清华大学 Parameter optimization method for support vector machine based on grid search technology
CN103793764A (en) * 2014-02-10 2014-05-14 济南大学 Package optimizing system and method based on GPU and neighboring mass data rapid analysis

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915566A (en) * 2015-06-17 2015-09-16 大连理工大学 Design method for depth calculation model supporting incremental updating
WO2018040561A1 (en) * 2016-08-31 2018-03-08 华为技术有限公司 Data processing method, device and system
CN107784363A (en) * 2016-08-31 2018-03-09 华为技术有限公司 Data processing method, apparatus and system
CN107784363B (en) * 2016-08-31 2021-02-09 华为技术有限公司 Data processing method, device and system
CN107169469B (en) * 2017-06-02 2020-06-19 南京理工大学 Material identification method of MIMO radar based on machine learning
CN107169469A (en) * 2017-06-02 2017-09-15 南京理工大学 A kind of material identification method of the MIMO radar based on machine learning
WO2019014933A1 (en) * 2017-07-21 2019-01-24 深圳市汇顶科技股份有限公司 Method and device for setting parameters in signal calculation method
CN109791564A (en) * 2017-07-21 2019-05-21 深圳市汇顶科技股份有限公司 The setting method and device of parameter in signal calculating method
US11397887B2 (en) 2017-09-26 2022-07-26 Amazon Technologies, Inc. Dynamic tuning of training parameters for machine learning algorithms
WO2019067374A1 (en) * 2017-09-26 2019-04-04 Amazon Technologies, Inc. Dynamic tuning of training parameters for machine learning algorithms
CN109685089A (en) * 2017-10-18 2019-04-26 北京京东尚科信息技术有限公司 The system and method for assessment models performance
CN109685089B (en) * 2017-10-18 2020-12-22 北京京东尚科信息技术有限公司 System and method for evaluating model performance
CN111652380A (en) * 2017-10-31 2020-09-11 第四范式(北京)技术有限公司 Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm
CN111652380B (en) * 2017-10-31 2023-12-22 第四范式(北京)技术有限公司 Method and system for optimizing algorithm parameters aiming at machine learning algorithm
CN107844837A (en) * 2017-10-31 2018-03-27 第四范式(北京)技术有限公司 The method and system of algorithm parameter tuning are carried out for machine learning algorithm
CN107844837B (en) * 2017-10-31 2020-04-28 第四范式(北京)技术有限公司 Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm
CN109409533A (en) * 2018-09-28 2019-03-01 深圳乐信软件技术有限公司 A kind of generation method of machine learning model, device, equipment and storage medium
CN109409533B (en) * 2018-09-28 2021-07-27 深圳乐信软件技术有限公司 Method, device, equipment and storage medium for generating machine learning model
CN109726763B (en) * 2018-12-29 2021-05-28 绿盟科技集团股份有限公司 Information asset identification method, device, equipment and medium
CN109726763A (en) * 2018-12-29 2019-05-07 北京神州绿盟信息安全科技股份有限公司 A kind of information assets recognition methods, device, equipment and medium
US11609549B2 (en) 2019-02-25 2023-03-21 General Electric Company Transfer learning/dictionary generation and usage for tailored part parameter generation from coupon builds
US11079739B2 (en) 2019-02-25 2021-08-03 General Electric Company Transfer learning/dictionary generation and usage for tailored part parameter generation from coupon builds
US12023860B2 (en) 2019-03-21 2024-07-02 General Electric Company In-situ monitoring system assisted material and parameter development for additive manufacturing
US11472115B2 (en) 2019-03-21 2022-10-18 General Electric Company In-situ monitoring system assisted material and parameter development for additive manufacturing
CN111797990A (en) * 2019-04-08 2020-10-20 北京百度网讯科技有限公司 Training method, training device and training system of machine learning model
CN110197285A (en) * 2019-05-07 2019-09-03 清华大学 Security cooperation deep learning method and device based on block chain
US11954592B2 (en) 2019-05-07 2024-04-09 Tsinghua University Collaborative deep learning methods and collaborative deep learning apparatuses
CN110197285B (en) * 2019-05-07 2021-03-23 清华大学 Block chain-based safe cooperation deep learning method and device
CN110263949A (en) * 2019-06-21 2019-09-20 安徽智寰科技有限公司 Merge the data processing method and system of machine mechanism and intelligent algorithm system
CN111275123A (en) * 2020-02-10 2020-06-12 北京信息科技大学 Method and system for generating large-batch confrontation samples
CN112336310A (en) * 2020-11-04 2021-02-09 吾征智能技术(北京)有限公司 Heart disease diagnosis system based on FCBF and SVM fusion
CN112336310B (en) * 2020-11-04 2024-03-08 吾征智能技术(北京)有限公司 FCBF and SVM fusion-based heart disease diagnosis system
CN112380044A (en) * 2020-12-04 2021-02-19 腾讯科技(深圳)有限公司 Data anomaly detection method and device, computer equipment and storage medium
CN112380044B (en) * 2020-12-04 2024-05-28 腾讯科技(深圳)有限公司 Data anomaly detection method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2015184729A1 (en) 2015-12-10
CN104200087B (en) 2018-10-02

Similar Documents

Publication Publication Date Title
CN104200087A (en) Parameter optimization and feature tuning method and system for machine learning
Jiang et al. Surrogate-model-based design and optimization
CN102246060B (en) Systems and methods for hydrocarbon reservoir development and management optimization
CN112187554B (en) Operation and maintenance system fault positioning method and system based on Monte Carlo tree search
US9921338B2 (en) Selecting and optimizing oil field controls for production plateau
CN109857804B (en) Distributed model parameter searching method and device and electronic equipment
Paler et al. Machine learning optimization of quantum circuit layouts
Ascia et al. Performance evaluation of efficient multi-objective evolutionary algorithms for design space exploration of embedded computer systems
Akbari et al. KASRA: A Kriging-based Adaptive Space Reduction Algorithm for global optimization of computationally expensive black-box constrained problems
CN105893669A (en) Global simulation performance predication method based on data digging
Zaefferer et al. A case study on multi-criteria optimization of an event detection software under limited budgets
Liu et al. Petroleum production forecasting based on machine learning
Zhang et al. A fast active learning method in design of experiments: multipeak parallel adaptive infilling strategy based on expected improvement
US10209403B2 (en) Method of modelling a subsurface volume
Chatterjee et al. Adaptive bilevel approximation technique for multiobjective evolutionary optimization
Kruczyk et al. Random reducts: A monte carlo rough set-based method for feature selection in large datasets
Chowdhury et al. Concurrent surrogate model selection (cosmos) based on predictive estimation of model fidelity
Carè et al. Uncertainty bounds for kernel-based regression: A Bayesian SPS approach
Stinstra et al. Metamodeling by symbolic regression and Pareto simulated annealing
CN104392119A (en) Multiphase support vector regression-based seismic wave crest and trough modeling method
WO2014197637A1 (en) Selecting and optimizing oil field controls for production plateau
CN105022798A (en) Categorical data mining method of discrete Bayesian network on the basis of prediction relationship
Nevrly et al. Heuristic challenges for spatially distributed waste production identification problems
Temizel et al. Improved Optimization Through Procedures as Pseudo Objective Functions in Nonlinear Optimization of Oil Recovery With Next-Generation Reservoir Simulators
Wu Optimal design of the water treatment plants

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant