CN109213805A - Method and apparatus for implementing model optimization - Google Patents
Method and apparatus for implementing model optimization
- Publication number
- CN109213805A CN109213805A CN201811045717.2A CN201811045717A CN109213805A CN 109213805 A CN109213805 A CN 109213805A CN 201811045717 A CN201811045717 A CN 201811045717A CN 109213805 A CN109213805 A CN 109213805A
- Authority
- CN
- China
- Prior art keywords
- hyperparameter
- target algorithm
- value
- target
- training data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Complex Calculations (AREA)
Abstract
The embodiments of the present application disclose a method and apparatus for implementing model optimization, used to generate an optimal model. The method comprises: obtaining the data feature values of a training data set; calculating the distance between the data feature values of the training data set and the data feature values of each sample data set, and determining the sample data sets whose distance is less than a first threshold as approximate sample data sets; obtaining the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets; adjusting each hyperparameter of the target algorithm starting from an initial value using the training data set, to obtain values of the hyperparameters of the target algorithm that meet a preset standard, where the initial value of each hyperparameter of the target algorithm is the preferred value of the corresponding hyperparameter of the target algorithm for the approximate sample data sets; and generating an optimal model corresponding to the target algorithm, where each hyperparameter of the optimal model uses the value of the corresponding hyperparameter that meets the preset standard.
Description
Technical field
This application relates to the field of computer technology, and in particular to a method and apparatus for implementing model optimization.
Background technique
In the field of machine learning, how to use training data to train a better-performing model is an important problem. Two aspects are involved. On the one hand, for a given algorithm, how should the hyperparameters be chosen so that the trained model performs well on a specified metric — the hyperparameter tuning problem. Hyperparameters are parameters whose values must be set before the machine-learning process starts; in general, they need to be optimized so that an optimal group of hyperparameters is selected, improving learning performance and effect. On the other hand, how should one select, from multiple trained models, the model that performs best in the actual application — the multi-model selection problem.
For the hyperparameter tuning problem, commonly used methods include grid search (Grid Search) and random search (Random Search). Such methods traverse as many hyperparameter value combinations as possible and train a model for each combination to obtain the optimal hyperparameters; however, as the number of hyperparameters grows, they suffer from serious performance problems. For this reason, black-box function optimization methods, such as Bayesian optimization, have also been applied to the hyperparameter tuning problem, but with such methods the initialization values of the hyperparameters strongly affect the tuning result.
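The combinatorial blow-up that makes grid search impractical as the number of hyperparameters grows can be illustrated with a short sketch (the search space below is a made-up example, not from this application):

```python
from itertools import product

# Hypothetical search space: each hyperparameter takes a handful of candidate values.
grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "num_layers": [2, 4, 8],
    "batch_size": [32, 64, 128],
}

# Grid search trains one model per combination of candidate values.
combinations = list(product(*grid.values()))
print(len(combinations))  # 3 hyperparameters x 3 values each -> 27 trainings

# Adding one more 3-valued hyperparameter triples the cost: the number of
# required trainings grows exponentially with the hyperparameter dimension.
grid["dropout"] = [0.0, 0.3, 0.5]
print(len(list(product(*grid.values()))))  # 81 trainings
```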
Summary of the invention
In view of this, the embodiments of the present application provide a method and apparatus for implementing model optimization, to solve the technical problem in the prior art that the initialization values of hyperparameters cannot be determined during the hyperparameter tuning process.
To solve the above problem, the technical solution provided by the embodiments of the present application is as follows:
A method for implementing model optimization, the method comprising:
obtaining the data feature values of a training data set, the training data set comprising a plurality of first training data items;
calculating the distance between the data feature values of the training data set and the data feature values of each sample data set, and determining the sample data sets whose distance is less than a first threshold as approximate sample data sets of the training data set;
obtaining the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets being predetermined;
adjusting each hyperparameter of the target algorithm starting from an initial value using the training data set, to obtain values of the hyperparameters of the target algorithm that meet a preset standard, the initial value of each hyperparameter of the target algorithm being the preferred value of the corresponding hyperparameter of the target algorithm for the approximate sample data sets;
generating an optimal model corresponding to the target algorithm, each hyperparameter of the optimal model using the value of the corresponding hyperparameter of the target algorithm that meets the preset standard.
In one possible implementation, the method further comprises:
obtaining the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets being predetermined;
determining the hyperparameters whose sensitivity value exceeds a second threshold as target hyperparameters of the target algorithm;
wherein adjusting each hyperparameter of the target algorithm starting from an initial value using the training data set, to obtain values of the hyperparameters of the target algorithm that meet the preset standard, comprises:
adjusting each target hyperparameter of the target algorithm starting from an initial value using the training data set, while setting the other hyperparameters of the target algorithm to default values, to obtain values of the hyperparameters of the target algorithm that meet the preset standard, the initial value of each target hyperparameter of the target algorithm being the preferred value of the corresponding target hyperparameter of the target algorithm for the approximate sample data sets.
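The dimensionality-reduction step above — tuning only the hyperparameters whose sensitivity value exceeds the second threshold and leaving the rest at defaults — can be sketched as follows (all sensitivity values, defaults, and the threshold here are illustrative assumptions):

```python
# Predetermined sensitivity value of each hyperparameter of the target algorithm
# (illustrative numbers, not taken from this application).
sensitivity = {"learning_rate": 0.9, "num_layers": 0.7, "batch_size": 0.2, "momentum": 0.1}
defaults = {"learning_rate": 0.01, "num_layers": 4, "batch_size": 64, "momentum": 0.9}
preferred = {"learning_rate": 0.05, "num_layers": 6, "batch_size": 128, "momentum": 0.8}

SECOND_THRESHOLD = 0.5

# Target hyperparameters: sensitivity above the second threshold -> tuned,
# starting from the preferred value; all others stay fixed at their defaults.
target_hyperparams = {name for name, s in sensitivity.items() if s > SECOND_THRESHOLD}
initial_values = {
    name: (preferred[name] if name in target_hyperparams else defaults[name])
    for name in defaults
}
print(sorted(target_hyperparams))  # ['learning_rate', 'num_layers']
```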
In one possible implementation, when there are multiple target algorithms, generating the optimal model corresponding to the target algorithm comprises:
generating the optimal model of each target algorithm.
In one possible implementation, the method further comprises:
inputting data to be processed into the optimal model of each target algorithm respectively, to obtain the model output result for the data to be processed from each optimal model;
voting on the multiple model output results for the data to be processed, to obtain a final result for the data to be processed.
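The voting step can be realized as a simple majority vote over the per-model outputs — a minimal sketch with made-up label values:

```python
from collections import Counter

def vote(model_outputs):
    """Return the most common output among the optimal models' predictions."""
    return Counter(model_outputs).most_common(1)[0][0]

# Outputs of three optimal models for one item of data to be processed.
print(vote(["cat", "dog", "cat"]))  # cat
```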
In one possible implementation, the method further comprises:
inputting second training data into the optimal model of each target algorithm respectively, to obtain the model output result for the second training data from each optimal model;
taking the second training data carrying a first label together with multiple copies of the second training data each carrying a second label as third training data, and training a second-level model; the first label is the actual result of the second training data, each second label is one of the model output results for the second training data, and each copy of the second training data carrying a second label corresponds one-to-one with the model output result for the second training data from one optimal model;
inputting data to be processed into the optimal model of each target algorithm respectively, to obtain the model output result for the data to be processed from each optimal model;
inputting multiple copies of the data to be processed, each carrying a third label, into the second-level model, to obtain a final result for the data to be processed; each third label is one of the model output results for the data to be processed, and each copy of the data to be processed carrying a third label corresponds one-to-one with the model output result for the data to be processed from one optimal model.
An apparatus for implementing model optimization, the apparatus comprising:
a first acquisition unit, configured to obtain the data feature values of a training data set, the training data set comprising a plurality of first training data items;
a computing unit, configured to calculate the distance between the data feature values of the training data set and the data feature values of each sample data set, and to determine the sample data sets whose distance is less than a first threshold as approximate sample data sets of the training data set;
a second acquisition unit, configured to obtain the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets being predetermined;
an optimization unit, configured to adjust each hyperparameter of the target algorithm starting from an initial value using the training data set, to obtain values of the hyperparameters of the target algorithm that meet a preset standard, the initial value of each hyperparameter of the target algorithm being the preferred value of the corresponding hyperparameter of the target algorithm for the approximate sample data sets;
a first generation unit, configured to generate an optimal model corresponding to the target algorithm, each hyperparameter of the optimal model using the value of the corresponding hyperparameter of the target algorithm that meets the preset standard.
In one possible implementation, the apparatus further comprises:
a third acquisition unit, configured to obtain the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets being predetermined;
a determination unit, configured to determine the hyperparameters whose sensitivity value exceeds a second threshold as target hyperparameters of the target algorithm;
wherein the optimization unit is specifically configured to adjust each target hyperparameter of the target algorithm starting from an initial value using the training data set, while setting the other hyperparameters of the target algorithm to default values, to obtain values of the hyperparameters of the target algorithm that meet the preset standard, the initial value of each target hyperparameter of the target algorithm being the preferred value of the corresponding target hyperparameter of the target algorithm for the approximate sample data sets.
In one possible implementation, when there are multiple target algorithms, the first generation unit is specifically configured to generate the optimal model of each target algorithm.
In one possible implementation, the apparatus further comprises:
a first input unit, configured to input data to be processed into the optimal model of each target algorithm respectively, to obtain the model output result for the data to be processed from each optimal model;
a first obtaining unit, configured to vote on the multiple model output results for the data to be processed, to obtain a final result for the data to be processed.
In one possible implementation, the apparatus further comprises:
a second input unit, configured to input second training data into the optimal model of each target algorithm respectively, to obtain the model output result for the second training data from each optimal model;
a second generation unit, configured to take the second training data carrying a first label together with multiple copies of the second training data each carrying a second label as third training data, and to train a second-level model; the first label is the actual result of the second training data, each second label is one of the model output results for the second training data, and each copy of the second training data carrying a second label corresponds one-to-one with the model output result for the second training data from one optimal model;
the second input unit being further configured to input data to be processed into the optimal model of each target algorithm respectively, to obtain the model output result for the data to be processed from each optimal model;
a second obtaining unit, configured to input multiple copies of the data to be processed, each carrying a third label, into the second-level model, to obtain a final result for the data to be processed; each third label is one of the model output results for the data to be processed, and each copy of the data to be processed carrying a third label corresponds one-to-one with the model output result for the data to be processed from one optimal model.
A computer-readable storage medium, the computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to execute the above method for implementing model optimization.
A computer program product which, when run on a terminal device, causes the terminal device to execute the above method for implementing model optimization.
It can be seen that the embodiments of the present application have the following beneficial effects:
The embodiments of the present application first obtain the data feature values of the training data set, calculate the distance between the data feature values of the training data set and those of each sample data set, and determine the sample data sets whose distance is less than a preset distance as approximate sample data sets of the training data set. Because the preferred value of each hyperparameter of the corresponding target algorithm has been predetermined for each sample data set, and the approximate sample data sets selected from the multiple sample data sets are similar to the training data set, the preferred values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets can be obtained and used as the initial values of the hyperparameters of the target algorithm for the training data set when carrying out hyperparameter tuning, thereby determining the hyperparameter initialization values in the tuning process. In addition, the embodiments of the present application can further adjust the initial values of the hyperparameters of the target algorithm corresponding to the training data set to obtain the optimal value of each hyperparameter, completing the hyperparameter tuning process and obtaining the optimal model corresponding to the target algorithm.
Detailed description of the invention
Fig. 1 is a flowchart of a method for implementing model optimization provided by an embodiment of the present application;
Fig. 2 is a flowchart of a method for hyperparameter dimensionality reduction provided by an embodiment of the present application;
Fig. 3 is a flowchart of a method for determining the final result of data to be processed provided by an embodiment of the present application;
Fig. 4 is a flowchart of another method for determining the final result of data to be processed provided by an embodiment of the present application;
Fig. 5 is a structural diagram of an apparatus for implementing model optimization provided by an embodiment of the present application.
Specific embodiment
In order to make the above objects, features, and advantages of the present application more apparent, the embodiments of the present application are described in further detail below with reference to the accompanying drawings and specific implementations.
To facilitate understanding of the technical solution provided by the present application, the background of the application is first explained below.
In studying machine learning, the inventors found that training a better model usually involves two aspects. On the one hand, for a given algorithm, how should the hyperparameters be chosen so that the trained model performs well on a performance metric — the hyperparameter tuning problem. On the other hand, how should one select, from multiple trained models, the model that performs best in the application — the multi-model selection problem.
For the hyperparameter tuning problem, commonly used methods include grid search (Grid Search) and random search (Random Search). When the number of hyperparameters is small, i.e., the hyperparameter dimension is low, such methods obtain optimized parameters by traversing as many hyperparameter value combinations as possible, but serious performance problems arise as the hyperparameter dimension grows. For this reason, black-box function optimization methods, such as Bayesian optimization, have also been applied to the hyperparameter tuning problem, but such methods still leave problems such as hyperparameter initialization and hyperparameter dimensionality reduction to be solved.
As for the multi-model selection problem, the commonly used method is to select, from multiple algorithms and according to an evaluation metric, the model with the best performance as the final trained model. However, if only the single best-performing model is selected and the remaining models are discarded, then on the one hand much of the time spent training the many discarded models is wasted; on the other hand, the model that performed best in training may not remain optimal in practical applications.
Based on this, the embodiments of the present application provide a method for implementing model optimization. First, the data feature values of a training data set are obtained; then the distance between the data feature values of the training data set and the data feature values of each sample data set saved in a database is calculated, and the sample data sets whose distance is less than a preset distance are determined as approximate sample data sets of the training data set. The preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets is taken as the initial value of the corresponding hyperparameter of the target algorithm for the training data set, thereby realizing hyperparameter initialization in the tuning process. Furthermore, the hyperparameter initial values can be adjusted to obtain the optimal value of each hyperparameter, solving the hyperparameter tuning problem. In addition, the embodiments of the present application also provide algorithms for determining a final result from the output results of multiple optimal models, avoiding the inaccuracy of a final result that relies on a single optimal model alone.
To help those skilled in the art understand the technical solution of the present application, a method for implementing model optimization provided by an embodiment of the present application is described below with reference to the accompanying drawings.
Referring to Fig. 1, which is a flowchart of a method for implementing model optimization provided by an embodiment of the present application, as shown in Fig. 1 the method may include:
S101: obtain the data feature values of the training data set.
In this embodiment, the data feature values of the training data set are obtained first, where the training data set includes a plurality of first training data items. The data feature values can characterize the attributes of the training data set, for example the data volume of the training data set, the data type (discrete or continuous), the number of missing data values (i.e., the number of null values in the training data set), the entropy of discrete data, the data maximum, the data minimum, and so on.
For ease of understanding, as an example, suppose the training data set contains 100,000 items of continuous first training data; then the data volume of the training data set is 100,000, the data type is 1 (e.g., 0 marks discrete data and 1 marks continuous data), the number of missing data values is 5, the data maximum is max, and the data minimum is min.
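The feature values listed above could be computed along these lines — a sketch under the assumption that the training data is a flat list of numeric values, with `None` standing in for a missing value:

```python
def data_feature_values(data, is_continuous=True):
    """Compute the feature values of a data set as described above: data
    volume, data type flag (0 = discrete, 1 = continuous), number of
    missing values, data maximum, and data minimum."""
    present = [v for v in data if v is not None]
    return {
        "volume": len(data),
        "data_type": 1 if is_continuous else 0,
        "missing": sum(1 for v in data if v is None),
        "max": max(present),
        "min": min(present),
    }

features = data_feature_values([0.5, 1.2, None, 3.4, 0.1])
print(features)  # {'volume': 5, 'data_type': 1, 'missing': 1, 'max': 3.4, 'min': 0.1}
```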
In addition, before the data feature values of the training data set are obtained, a large number of sample data sets can be constructed in the database. Each sample data set may include a plurality of sample data items, and similarly, each sample data set also corresponds to its own data feature values, for example the data volume of the sample data set, the data type (discrete or continuous), the number of missing data values, the entropy of discrete data, the maximum and minimum of continuous data, and so on.
Meanwhile, each sample data set corresponds to at least one target algorithm, and the preferred value of each hyperparameter is calculated for each target algorithm. For example, when the target algorithm is a neural network algorithm, the hyperparameters of the algorithm may be the number of convolutional layers, the number of residual layers, the number of deconvolutional layers, and so on; for each hyperparameter, each sample data set corresponds to a preferred value, e.g., 3 convolutional layers, 5 residual layers, 3 deconvolutional layers, and 5 pooling layers.
When one sample data set corresponds to multiple target algorithms, there is a preferred value of each hyperparameter for each target algorithm. For example, if sample data set 1 corresponds to target algorithm 1 and target algorithm 2, then for target algorithm 1 the preferred values of hyperparameter 1 and hyperparameter 2 in sample data set 1 are A and B respectively, and for target algorithm 2 the preferred values of hyperparameter 3 and hyperparameter 4 in sample data set 1 are C and D respectively.
Similarly, each sample data set corresponds to the preferred value of each hyperparameter of its target algorithm, and the preferred value of each hyperparameter of the target algorithm corresponding to a sample data set is predetermined. In practical applications, these preferred values can be determined through many experiments.
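The database described above amounts to a registry keyed by sample data set, mapping each of its target algorithms to predetermined preferred hyperparameter values. A sketch mirroring the A/B/C/D example above (all names and values are illustrative):

```python
# Predetermined registry: sample data set -> target algorithm -> preferred
# hyperparameter values (illustrative, mirroring the example above).
preferred_values = {
    "sample_set_1": {
        "target_algorithm_1": {"hyperparam_1": "A", "hyperparam_2": "B"},
        "target_algorithm_2": {"hyperparam_3": "C", "hyperparam_4": "D"},
    },
    "sample_set_2": {
        "target_algorithm_1": {"hyperparam_1": "A2", "hyperparam_2": "B2"},
    },
}

def preferred_for(sample_set, algorithm):
    """Look up the predetermined preferred values for one sample data set
    and one of its target algorithms."""
    return preferred_values[sample_set][algorithm]

print(preferred_for("sample_set_1", "target_algorithm_2"))  # {'hyperparam_3': 'C', 'hyperparam_4': 'D'}
```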
S102: calculate the distance between the data feature values of the training data set and the data feature values of each sample data set, and determine the sample data sets whose distance is less than a first threshold as approximate sample data sets of the training data set.
In this embodiment, the distance between the data feature values of the training data set and those of each sample data set is calculated, and the calculated distance is compared with the first threshold. When the distance is less than the first threshold, the training data set is similar to the sample data set, so the sample data set corresponding to that distance is determined as an approximate sample data set of the training data set. When the distances between the data feature values of the training data set and those of multiple sample data sets are all less than the first threshold, those multiple sample data sets are all determined as approximate sample data sets of the training data set. The first threshold may be a fixed value, or may be determined by the number of approximate sample data sets needed; for example, if 5 approximate sample data sets are needed, the distances between the data feature values of the training data set and those of each sample data set are sorted, the 5 nearest sample data sets are selected as the approximate sample data sets, and the first threshold can then be the 6th distance value in the sorted order. That is, the first threshold can be set according to empirical values or practical applications, and the embodiments of the present application place no limit on its setting.
The distance between the data feature values of the training data set and those of a sample data set can be obtained using formula (1):
d = √( Σᵢ (xᵢ − yᵢ)² )    (1)
where X is the training data set, Y is a certain sample data set, xᵢ is the i-th data feature value of training data set X, yᵢ is the i-th data feature value of sample data set Y, and d is the distance between the data feature values of training data set X and the data feature values of sample data set Y.
Using the above formula, the distance between the data feature values of the training data set and those of each sample data set is calculated, and whether the distance is less than the first threshold is judged; if it is, the corresponding sample data set is determined as an approximate sample data set of the training data set, and S103 is executed.
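S102 can be sketched as a Euclidean distance over the feature vectors followed by a threshold selection (the feature vectors and threshold below are illustrative assumptions):

```python
import math

def feature_distance(x, y):
    """Euclidean distance between two data feature vectors, as in formula (1)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Feature vectors of the training data set and of each sample data set
# (illustrative numbers: volume, type flag, missing count, max, min).
training_features = [100000, 1, 5, 9.5, 0.1]
sample_features = {
    "sample_set_1": [99000, 1, 4, 9.0, 0.2],
    "sample_set_2": [20000, 0, 50, 3.0, 1.0],
}

FIRST_THRESHOLD = 2000.0  # sample sets closer than this count as approximate
approximate_sets = [
    name for name, feats in sample_features.items()
    if feature_distance(training_features, feats) < FIRST_THRESHOLD
]
print(approximate_sets)  # ['sample_set_1']
```

In practice the raw feature values would usually be normalized first, since otherwise a large-magnitude feature such as the data volume dominates the distance.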
S103: obtain the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets.
In this embodiment, the preferred value of each hyperparameter of the target algorithm corresponding to each approximate sample data set is obtained. Since the approximate sample data sets are selected from the multiple sample data sets, and the preferred value of each hyperparameter of the target algorithm corresponding to each sample data set has been determined in advance, the preferred values of the hyperparameters of the target algorithms corresponding to the approximate sample data sets are predetermined. In practical applications, the target algorithms can be determined according to which algorithms need hyperparameter tuning, so as to obtain the preferred values of the hyperparameters of the target algorithms corresponding to the approximate sample data sets.
In a specific implementation, the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets is obtained and used, in the hyperparameter tuning process, as the hyperparameter initial value of the training data set for that target algorithm. That is to say, the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets can be substituted into the target algorithm as the initial value of each hyperparameter when hyperparameter tuning is performed for the training data set, realizing the initialization of the hyperparameters.
For example, if the approximate sample data sets correspond to target algorithm 1, and the preferred value of hyperparameter 1 of target algorithm 1 is A and the preferred value of hyperparameter 2 is B, then during hyperparameter tuning the initial value of hyperparameter 1 of target algorithm 1 for the training data set is A and the initial value of hyperparameter 2 is B, realizing the initialization of the hyperparameters of target algorithm 1 for the training data set.
S104: adjust each hyperparameter of the target algorithm starting from its initial value using the training data set, to obtain values of the hyperparameters of the target algorithm that meet the preset standard.
Through the above operations, the initial value of each hyperparameter of the target algorithm for the training data set has been determined. In this embodiment, to realize hyperparameter tuning, the training data set is used to adjust each hyperparameter of the target algorithm. In the implementation process, based on a black-box function optimization algorithm, the training data set can be used to adjust each hyperparameter of the target algorithm starting from its initial value, to obtain values of the hyperparameters of the target algorithm that meet the preset standard.
In a specific implementation, the data in the training data set are substituted into the target algorithm, so that the hyperparameters are adjusted starting from the initial values until values of the hyperparameters that meet the preset standard, i.e. the optimal values of the hyperparameters, are obtained. The initial value of each hyperparameter of the target algorithm is the preferred value of the corresponding hyperparameter of the target algorithm for the approximate sample data sets, which guarantees that the tuning process adjusts from generally effective initial values, ensuring the stability of the hyperparameters. The black-box function optimization algorithm can be a Bayesian optimization algorithm or the like. The preset standard can be determined according to the actual conditions of the black-box function optimization algorithm, for example that the evaluation result of the model on some evaluation metric reaches a preset goal, or that the number of iterations reaches a preset quantity.
In practical applications, based on the black-box function optimization algorithm, the training data set can be used to adjust all hyperparameters of the target algorithm simultaneously, to obtain the optimal value of each hyperparameter. For example, the initial values of hyperparameter 1 and hyperparameter 2 of target algorithm 1 are set to A and B respectively, and hyperparameter 1 and hyperparameter 2 are adjusted together using the training data set, to obtain the optimal values of hyperparameter 1 and hyperparameter 2.
Alternatively, a hyperparameter can be adjusted individually. During the adjustment, hyperparameters that have already been adjusted are substituted with their optimal values, and hyperparameters not yet adjusted keep their initial values unchanged, so as to obtain the optimal value of the current hyperparameter; this is performed in turn until the optimal value of each hyperparameter is obtained. For example, the initial value of hyperparameter 1 of target algorithm 1 for the training data set is A and the initial value of hyperparameter 2 is B; then, based on the black-box function optimization algorithm, hyperparameter 1 of target algorithm 1 is adjusted starting from A using the training data set until its optimal value A′ is obtained, during which hyperparameter 2 remains at its initial value B; similarly, while hyperparameter 2 is adjusted starting from its initial value B, hyperparameter 1 is set to its adjusted optimal value A′, so as to obtain the optimal value B′ of hyperparameter 2.
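The one-at-a-time variant described above is coordinate-wise optimization: each hyperparameter is tuned starting from its initial value while the others are held fixed, then frozen at its best value before the next one is tuned. A minimal sketch, with a toy quadratic standing in for the real train-and-evaluate step (a real system would run a black-box optimizer such as Bayesian optimization over each coordinate instead of a candidate sweep):

```python
def evaluate(params):
    # Stand-in for training the model on the training data set and scoring
    # it on an evaluation metric; lower is better. The optimum of this toy
    # objective is h1 = 3, h2 = -1.
    return (params["h1"] - 3) ** 2 + (params["h2"] + 1) ** 2

def tune_coordinatewise(initial, candidates):
    """Tune hyperparameters one at a time: while one is searched over its
    candidate values, the others stay at their current values (initial for
    not-yet-tuned ones, optimal for already-tuned ones)."""
    current = dict(initial)
    for name in current:
        best = min(candidates[name], key=lambda v: evaluate({**current, name: v}))
        current[name] = best  # freeze at the optimal value before moving on
    return current

initial = {"h1": 0, "h2": 0}  # preferred values from the approximate sample sets
candidates = {"h1": [0, 1, 2, 3, 4], "h2": [-2, -1, 0, 1]}
print(tune_coordinatewise(initial, candidates))  # {'h1': 3, 'h2': -1}
```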
S105: generate the optimal model corresponding to the target algorithm.
After the optimal value of each hyperparameter of the target algorithm has been determined, the optimal values are taken as the values of the hyperparameters of the model corresponding to the target algorithm in practical applications, so that the model corresponding to the target algorithm is determined as the optimal model. Each hyperparameter of the optimal model corresponding to the target algorithm uses the value of the corresponding hyperparameter of the target algorithm that meets the preset standard.
For example, if the optimal value of hyperparameter 1 of target algorithm 1 is A′ and the optimal value of hyperparameter 2 is B′, then A′ and B′ are substituted into the model corresponding to target algorithm 1 to obtain the optimal model.
The embodiment of the present application first obtains the data feature values of the training data set, calculates the distances between the data feature values of the training data set and those of each sample data set, and determines the sample data sets whose distance is less than the preset distance as the approximate sample data sets of the training data set. Since the preferred value of each hyperparameter of the corresponding target algorithm has been determined in advance for every sample data set, and the approximate sample data sets are selected from the multiple sample data sets because they are similar to the training data set, the preferred values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets can be obtained and used as the initial values for hyperparameter tuning of the target algorithm corresponding to the training data set; this determines the initialization values in the hyperparameter evolution process. In addition, the embodiment of the present application can further adjust the initial value of each hyperparameter of the target algorithm corresponding to the training data set to obtain its optimal value, thereby completing the hyperparameter evolution process and obtaining the optimal model corresponding to the target algorithm.
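The selection of approximate sample data sets summarized above can be sketched as a simple feature-distance filter. The sketch below assumes Euclidean distance over the data feature vectors; the meta-feature names, the example values, and the threshold are all hypothetical illustrations, not values from the patent.

```python
import math

def approximate_sets(train_features, sample_sets, first_threshold):
    """Return the sample data sets whose feature-value distance to the
    training data set is below the first threshold."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return {name: feats for name, feats in sample_sets.items()
            if dist(train_features, feats) < first_threshold}

# Hypothetical data feature values: (log row count, column count, class ratio).
train = (4.0, 10.0, 0.5)
samples = {"s1": (4.1, 9.0, 0.5),
           "s2": (7.0, 40.0, 0.9),
           "s3": (3.9, 11.0, 0.4)}
near = approximate_sets(train, samples, first_threshold=2.0)
# "s1" and "s3" pass the threshold; "s2" is far away and is filtered out.
```

Each surviving set would then contribute its pre-computed preferred hyperparameter values as warm-start initial values for tuning.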
In practical applications, when the target algorithm has many hyperparameters, tuning all of them has limitations. Based on this, the embodiment of the present application provides a method for reducing the number of hyperparameters to be tuned: qualified target hyperparameters are selected from the multiple hyperparameters of the target algorithm, and tuning is then performed only on the target hyperparameters of the target algorithm to obtain their optimal values. That is to say, the hyperparameters of the target algorithm are reduced in dimension, and only the target hyperparameters are tuned.
For ease of understanding, the method for reducing the number of hyperparameters provided by the embodiments of the present application is described below with reference to the accompanying drawings. Referring to Fig. 2, which is a flowchart of a hyperparameter dimensionality reduction method provided by the embodiments of the present application, as shown in Fig. 2, the method may include:
S201: Obtain the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets.
In this embodiment, after the approximate sample data sets have been determined through S102, the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets is obtained; that is, the sensitivity value of each hyperparameter of the target algorithm is obtained. The sensitivity value characterizes the degree to which a hyperparameter influences the target algorithm: a higher sensitivity value indicates that a change in the hyperparameter has a larger influence on the change of the target algorithm's computed result, while a lower sensitivity value indicates that a change in the hyperparameter has a smaller influence on the change of the computed result.
Each sample data set is associated with the sensitivity values of the hyperparameters of the target algorithm, and these sensitivity values are determined in advance. In practical applications, the sensitivity values of the hyperparameters of the target algorithm corresponding to a sample data set can also be determined through many experiments. The preferred values and the sensitivity values of the hyperparameters of the target algorithm corresponding to each sample data set can be determined at the same time. In this way, since the approximate sample data sets are selected from the multiple sample data sets, the sensitivity values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets are also determined in advance.
S202: Determine the hyperparameters whose sensitivity value exceeds a second threshold as the target hyperparameters of the target algorithm.
In this embodiment, after the sensitivity value of each hyperparameter of the target algorithm has been obtained, the sensitivity value of each hyperparameter is compared with the second threshold. When the sensitivity value of a hyperparameter exceeds the second threshold, this indicates that a change in the hyperparameter has a considerable influence on the change of the target algorithm's computed result, so the hyperparameter is determined as a target hyperparameter of the target algorithm. This achieves dimensionality reduction of the hyperparameters and reduces the workload of subsequent tuning.
It is understood that multiple approximate sample data sets may be determined through S102. When judging whether the sensitivity value of a hyperparameter is greater than the second threshold, the sensitivity values of that hyperparameter across the target algorithms corresponding to the multiple approximate sample data sets can first be averaged, and the averaged sensitivity value is then compared with the second threshold. For example, suppose two approximate sample data sets are determined: in approximate sample data set 1, the sensitivity value of hyperparameter 1 in target algorithm 1 is 3.5 and that of hyperparameter 2 is 3.4; in approximate sample data set 2, the sensitivity value of hyperparameter 1 in target algorithm 1 is 3.9 and that of hyperparameter 2 is 3.2. Then the average sensitivity value of hyperparameter 1 in target algorithm 1 is 3.7 and that of hyperparameter 2 is 3.3. When the second threshold is 3.5, hyperparameter 1 is determined as the target hyperparameter of target algorithm 1.
The second threshold can be a fixed value, or it can be determined according to the desired number of target hyperparameters. For example, if 5 target hyperparameters are needed, the average sensitivity values of the hyperparameters of the target algorithm are sorted, and the 5 hyperparameters with the highest average sensitivity values are chosen as target hyperparameters; the second threshold is then the 6th average sensitivity value in the sorted order. That is, the second threshold can be set based on empirical values or on the practical application; the embodiment of the present application does not limit the setting of the second threshold.
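The averaging and selection steps can be sketched as below; the numbers reproduce the worked example from the text, and the `top_k` branch shows the sorting-based alternative to a fixed threshold. This is an illustrative sketch under assumed names, not the patent's implementation.

```python
def target_hyperparams(sensitivities, second_threshold=None, top_k=None):
    """sensitivities: {hyperparam: [sensitivity per approximate sample set]}.
    Average each hyperparameter's sensitivity values, then keep those above
    the second threshold, or the top_k highest when a count is given."""
    avg = {h: sum(v) / len(v) for h, v in sensitivities.items()}
    ranked = sorted(avg, key=avg.get, reverse=True)  # most sensitive first
    if top_k is not None:
        return ranked[:top_k]
    return [h for h in ranked if avg[h] > second_threshold]

# Worked example from the text: two approximate sample data sets.
sens = {"hp1": [3.5, 3.9],   # averages to 3.7
        "hp2": [3.4, 3.2]}   # averages to 3.3
chosen = target_hyperparams(sens, second_threshold=3.5)
# Only hp1 exceeds the second threshold of 3.5.
```

The `top_k` form corresponds to choosing the threshold implicitly as the (k+1)-th largest average sensitivity value.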
S203: Use the training data set to adjust each target hyperparameter of the target algorithm starting from its initial value, set the other hyperparameters of the target algorithm to default values, and obtain the values of the hyperparameters of the target algorithm that satisfy the preset standard.
After the target hyperparameters of the target algorithm have been determined through the above operations, the training data set is used to tune each target hyperparameter of the target algorithm. In a specific implementation, a black-box function optimization algorithm can use the training data set to adjust each target hyperparameter starting from its initial value; that is, each target hyperparameter is initialized to the preferred value of the corresponding target hyperparameter of the target algorithm associated with the approximate sample data sets and then adjusted, while the hyperparameters other than the target hyperparameters are set to default values and are not adjusted, so that the optimal value of each target hyperparameter is obtained.
In practical applications, a black-box function optimization algorithm can use the training data set to adjust all target hyperparameters of the target algorithm simultaneously to obtain their optimal values. Alternatively, the target hyperparameters can be adjusted one at a time: during adjustment, the target hyperparameters that have already been tuned are substituted with their optimal values, while those not yet tuned keep their initial values; the optimal value of the current target hyperparameter is obtained in this way, and the process is repeated until the optimal value of every target hyperparameter has been obtained.
In addition, after the optimal values of the target hyperparameters of the target algorithm have been obtained, the target hyperparameters of the model corresponding to the target algorithm are set to these optimal values to generate the optimal model corresponding to the target algorithm.
Through the method provided by this embodiment, when the target algorithm has many hyperparameters, the hyperparameters of the target algorithm can first be reduced in dimension by determining those with higher sensitivity values as target hyperparameters, and tuning is then performed only on the target hyperparameters. This reduces the number of tuned hyperparameters and improves tuning efficiency, so that the optimal values of the target hyperparameters that have a larger influence on the target algorithm are obtained, and the optimal model corresponding to the target algorithm is then generated.
The optimal model corresponding to a target algorithm can be determined through the above method embodiments. When the approximate sample data sets correspond to multiple target algorithms, the method described in Fig. 1 or Fig. 2 is carried out for each target algorithm to generate the optimal model corresponding to each target algorithm, thereby determining multiple optimal models. When multiple optimal models have been determined, they can all be used to evaluate the data to be processed, yielding multiple model output results from which the final result of the data to be processed is determined. The embodiment of the present application provides two methods for determining the final result of the data to be processed: one determines the final result from the output results of the multiple models by voting; the other determines the final result by constructing a second-level model.
For ease of understanding, the above two determination methods are described below with reference to the accompanying drawings.
Referring to Fig. 3, which is a flowchart of a method for determining the final result of the data to be processed provided by the embodiments of the present application, as shown in Fig. 3, the method may include:
S301: Input the data to be processed into the optimal model of each target algorithm respectively, and obtain the model output result of the data to be processed output by each optimal model.
In this embodiment, the data to be processed is input as input data into the optimal model corresponding to each target algorithm, and the model output result of each optimal model for the data to be processed is obtained. The data to be processed has the same attribute features as the first training data included in the training data set.
For example, the model output result of the data to be processed output by the optimal model of target algorithm 1 is a, that output by the optimal model of target algorithm 2 is b, and that output by the optimal model of target algorithm 3 is a.
In practical applications, the model output result is related to the optimal model corresponding to the target algorithm. For example, when the optimal model is a classification model, the model output result is a classification result for the data to be processed; when the optimal model is a prediction model, the model output result is a prediction result for the data to be processed.
S302: Vote on the multiple model output results of the data to be processed, and obtain the final result of the data to be processed.
When multiple model output results have been obtained for the data to be processed, the final result of the data to be processed is obtained by voting; that is, the model output result that receives the most votes is taken as the final result. For example, since the model output result of the optimal model of target algorithm 1 and that of target algorithm 3 are both a, while that of target algorithm 2 is b, the vote count of a is 2 and that of b is 1, so a is taken as the final result of the data to be processed.
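The voting step can be sketched with a simple majority count; the labels follow the worked example above, and the function name is an assumption for illustration.

```python
from collections import Counter

def vote(model_outputs):
    """Return the model output result that receives the most votes."""
    return Counter(model_outputs).most_common(1)[0][0]

# Outputs of the three optimal models from the example: a, b, a.
final = vote(["a", "b", "a"])  # a has 2 votes, b has 1
```

For classification outputs this is plain majority voting; ties would fall to whichever result `Counter` encounters first, so a real system would need an explicit tie-breaking rule.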
The above embodiment describes determining the final result of the data to be processed by voting; the method of determining the final result by constructing a second-level model is described below.
Referring to Fig. 4, which is a flowchart of another method for determining the final result of the data to be processed according to the embodiments of the present application, as shown in Fig. 4, the method may include:
S401: Input the second training data into the optimal model of each target algorithm respectively, and obtain the model output result of the second training data output by each optimal model.
In this embodiment, the second training data is input as input data into the optimal model of each target algorithm to obtain the model output result of each optimal model for the second training data. The second training data has the same attribute features as the first training data, and the second training data is configured with a label of the actual result. For example, the model output result of the second training data output by the optimal model of target algorithm 1 is a, that output by the optimal model of target algorithm 2 is b, and that output by the optimal model of target algorithm 3 is a.
S402: Take the second training data carrying the first label and the multiple copies of the second training data carrying second labels as third training data, and train to generate a second-level model.
The first label can be the actual result of the second training data, and a second label can be one of the model output results of the second training data; the copies of the second training data carrying second labels correspond one-to-one with the model output results of the second training data output by the optimal models. That is, when multiple model output results have been obtained for a piece of second training data, the second training data carrying one model output result serves as one piece of third training data, the same second training data carrying another model output result serves as another piece of third training data, and so on until every model output result is included in a piece of third training data; meanwhile, the second training data carrying the actual result also serves as a piece of third training data. The second-level model is then generated by training on a large amount of third training data, which ensures the accuracy of the second-level model's output results.
For example, suppose there are 100 pieces of second training data and three optimal models. For a given piece of second training data, four pieces of third training data can be generated: the second training data carrying the model output result of the first optimal model, the second training data carrying the model output result of the second optimal model, the second training data carrying the model output result of the third optimal model, and the second training data carrying the actual result. A total of 400 pieces of third training data can then be generated. If the number of pieces of second training data is n_train and the number of optimal models is m, the number of pieces of third training data is n_train*(m+1).
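The construction of third training data, including the n_train*(m+1) count, can be sketched as below. The callables standing in for the optimal models and the toy records are hypothetical; a real system would use trained models in their place.

```python
def build_third_training_data(second_data, actual_results, base_models):
    """For each piece of second training data, emit one record per model
    output result (second labels) plus one record carrying the actual
    result (first label)."""
    third = []
    for x, actual in zip(second_data, actual_results):
        for model in base_models:
            third.append((x, model(x)))   # record carrying a second label
        third.append((x, actual))         # record carrying the first label
    return third

# Hypothetical stand-ins: 100 pieces of second training data, 3 optimal
# models whose constant outputs mirror the a/b/a example above.
models = [lambda x: "a", lambda x: "b", lambda x: "a"]
second = list(range(100))
actuals = ["a"] * 100
third = build_third_training_data(second, actuals, models)
# len(third) equals n_train * (m + 1) = 100 * 4 = 400.
```

The second-level model would then be trained on `third`, learning how much weight to give each base model's output relative to the actual result.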
The second-level model is generated by training on the third training data, so that the second-level model can accurately output a result for input data.
S403: Input the data to be processed into the optimal model of each target algorithm respectively, and obtain the model output result of the data to be processed output by each optimal model.
In this embodiment, the data to be processed is input as input data into the optimal model corresponding to each target algorithm, and the model output result of each optimal model for the data to be processed is obtained.
S404: Input the multiple copies of the data to be processed carrying third labels into the second-level model, and obtain the final result of the data to be processed.
After the second-level model that can accurately identify data results has been generated through S401 and S402, the multiple copies of the data to be processed carrying third labels are input into the second-level model to obtain the final result of the data to be processed, ensuring that the result output by the second-level model is more accurate than the output result of any single model. A third label is one of the model output results of the data to be processed, and the copies of the data to be processed carrying third labels correspond one-to-one with the model output results of the data to be processed output by the optimal models.
Similarly, for example, suppose there are three optimal models, and the model output result of the data to be processed output by each optimal model is obtained. The data input into the second-level model are then: the data to be processed carrying the model output result of the first optimal model, the data to be processed carrying the model output result of the second optimal model, and the data to be processed carrying the model output result of the third optimal model. The final result of the data to be processed is then obtained through the second-level model.
Through the method for determining the final result of the data to be processed provided by the above embodiments, the final result can be determined from the output results of multiple optimal models. Compared with obtaining the final result using only one optimal model as in the prior art, this improves the accuracy of the result; moreover, the other models do not need to be discarded, all of the obtained optimal models are utilized, and the utilization rate of the models is improved.
Based on the above method embodiments, the present application also provides a device for implementing model optimization, which is described below with reference to the accompanying drawings.
Referring to Fig. 5, which is a structural diagram of a device for implementing model optimization provided by the embodiments of the present application, as shown in Fig. 5, the device may include:
a first acquisition unit 501, configured to obtain the data feature values of a training data set, the training data set including a plurality of pieces of first training data;
a computing unit 502, configured to calculate the distance between the data feature values of the training data set and the data feature values of each sample data set, and determine the sample data sets whose distance is less than a first threshold as the approximate sample data sets of the training data set;
a second acquisition unit 503, configured to obtain the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the preferred values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets being determined in advance;
an optimization unit 504, configured to use the training data set to adjust each hyperparameter of the target algorithm starting from its initial value and obtain the values of the hyperparameters of the target algorithm that satisfy a preset standard, the initial value of each hyperparameter of the target algorithm being the preferred value of the corresponding hyperparameter of the target algorithm corresponding to the approximate sample data sets;
a first generation unit 505, configured to generate the optimal model corresponding to the target algorithm, each hyperparameter of the optimal model corresponding to the target algorithm using the value of the corresponding hyperparameter of the target algorithm that satisfies the preset standard.
In one possible implementation, the device further includes:
a third acquisition unit, configured to obtain the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the sensitivity values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets being determined in advance;
a determination unit, configured to determine the hyperparameters whose sensitivity value exceeds a second threshold as the target hyperparameters of the target algorithm;
the optimization unit being specifically configured to use the training data set to adjust each target hyperparameter of the target algorithm starting from its initial value, set the other hyperparameters of the target algorithm to default values, and obtain the values of the hyperparameters of the target algorithm that satisfy the preset standard, the initial value of each target hyperparameter of the target algorithm being the preferred value of the corresponding target hyperparameter of the target algorithm corresponding to the approximate sample data sets.
In one possible implementation, when there are multiple target algorithms, the first generation unit is specifically configured to generate the optimal model of each target algorithm.
In one possible implementation, the device further includes:
a first input unit, configured to input the data to be processed into the optimal model of each target algorithm respectively and obtain the model output result of the data to be processed output by each optimal model;
a first obtaining unit, configured to vote on the multiple model output results of the data to be processed and obtain the final result of the data to be processed.
The device further includes:
a second input unit, configured to input the second training data into the optimal model of each target algorithm respectively and obtain the model output result of the second training data output by each optimal model;
a second generation unit, configured to take the second training data carrying a first label and the multiple copies of the second training data carrying second labels as third training data and train to generate a second-level model; the first label being the actual result of the second training data, a second label being one of the model output results of the second training data, and the copies of the second training data carrying second labels corresponding one-to-one with the model output results of the second training data output by the optimal models;
the second input unit being further configured to input the data to be processed into the optimal model of each target algorithm respectively and obtain the model output result of the data to be processed output by each optimal model;
a second obtaining unit, configured to input the multiple copies of the data to be processed carrying third labels into the second-level model and obtain the final result of the data to be processed, a third label being one of the model output results of the data to be processed, and the copies of the data to be processed carrying third labels corresponding one-to-one with the model output results of the data to be processed output by the optimal models.
It should be noted that, for the specific implementation of each module or unit in this embodiment, reference may be made to the implementation of the methods in Fig. 1 to Fig. 4, which is not described again here.
In addition, the embodiments of the present application also provide a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to execute the above method for implementing model optimization.
The embodiments of the present application also provide a computer program product which, when run on a terminal device, causes the terminal device to execute the above method for implementing model optimization. As can be seen from the above embodiments, the embodiments of the present application first obtain the data feature values of the training data set, calculate the distances between the data feature values of the training data set and those of each sample data set, and determine the sample data sets whose distance is less than the preset distance as the approximate sample data sets of the training data set. Since the preferred value of each hyperparameter of the corresponding target algorithm has been determined in advance for every sample data set, and the approximate sample data sets are selected from the multiple sample data sets because they are similar to the training data set, the preferred values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets are obtained and used as the initial values for hyperparameter tuning of the target algorithm corresponding to the training data set, thereby determining the initialization values in the hyperparameter evolution process. In addition, the embodiments of the present application can further adjust the initial values of the hyperparameters of the target algorithm corresponding to the training data set to obtain their optimal values, thereby completing the hyperparameter evolution process and obtaining the optimal model corresponding to the target algorithm.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may refer to each other. Since the systems or devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, their description is relatively brief, and relevant details may be found in the description of the method parts.
It should be understood that, in this application, "at least one (item)" refers to one or more, and "multiple" refers to two or more. "And/or" describes the association relationship of associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate three situations: only A exists, only B exists, and both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of the following" or a similar expression refers to any combination of those items, including any combination of a single item or multiple items. For example, at least one of a, b, or c may indicate: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be singular or plural.
It should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or also includes elements inherent to such a process, method, article, or device. Unless otherwise restricted, an element limited by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The steps of the methods or algorithms described in conjunction with the embodiments disclosed in this document can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module can be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium well known in the technical field.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method for implementing model optimization, characterized in that the method includes:
obtaining the data feature values of a training data set, the training data set including a plurality of pieces of first training data;
calculating the distance between the data feature values of the training data set and the data feature values of each sample data set, and determining the sample data sets whose distance is less than a first threshold as the approximate sample data sets of the training data set;
obtaining the preferred value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the preferred values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets being determined in advance;
using the training data set to adjust each hyperparameter of the target algorithm starting from its initial value, and obtaining the values of the hyperparameters of the target algorithm that satisfy a preset standard, the initial value of each hyperparameter of the target algorithm being the preferred value of the corresponding hyperparameter of the target algorithm corresponding to the approximate sample data sets;
generating the optimal model corresponding to the target algorithm, each hyperparameter of the optimal model corresponding to the target algorithm using the value of the corresponding hyperparameter of the target algorithm that satisfies the preset standard.
2. The method according to claim 1, characterized in that the method further includes:
obtaining the sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the sensitivity values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets being determined in advance;
determining the hyperparameters whose sensitivity value exceeds a second threshold as the target hyperparameters of the target algorithm;
wherein using the training data set to adjust each hyperparameter of the target algorithm starting from its initial value and obtaining the values of the hyperparameters of the target algorithm that satisfy the preset standard includes:
using the training data set to adjust each target hyperparameter of the target algorithm starting from its initial value, setting the other hyperparameters of the target algorithm to default values, and obtaining the values of the hyperparameters of the target algorithm that satisfy the preset standard, the initial value of each target hyperparameter of the target algorithm being the preferred value of the corresponding target hyperparameter of the target algorithm corresponding to the approximate sample data sets.
3. The method according to claim 1, characterized in that, when there are multiple target algorithms, generating the optimal model corresponding to the target algorithm includes:
generating the optimal model of each of the target algorithms.
4. The method according to claim 3, wherein the method further comprises:
inputting data to be processed into the optimal model of each target algorithm respectively, to obtain a model output result for the data to be processed from each optimal model; and
voting on the multiple model output results of the data to be processed, to obtain a final result for the data to be processed.
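The ensemble step of claim 4 is a plain majority vote over the per-model output results. In this sketch ties fall back to first-seen order, a detail the claim leaves unspecified:

```python
from collections import Counter

def vote(model_outputs):
    # Majority vote over the output results produced by each optimal
    # model for one item of data to be processed.
    return Counter(model_outputs).most_common(1)[0][0]
```

For example, if three optimal models classify an item as `"cat"`, `"dog"`, `"cat"`, the final result is `"cat"`.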
5. The method according to claim 3, wherein the method further comprises:
inputting second training data into the optimal model of each target algorithm respectively, to obtain a model output result for the second training data from each optimal model;
taking the second training data carrying a first label, together with multiple copies of the second training data each carrying a second label, as third training data, and training a second-level model on the third training data; wherein the first label is the actual result of the second training data, each second label is one of the model output results of the second training data, and each copy of the second training data carrying a second label corresponds one-to-one with the model output result of the second training data produced by one of the optimal models;
inputting data to be processed into the optimal model of each target algorithm respectively, to obtain a model output result for the data to be processed from each optimal model; and
inputting multiple copies of the data to be processed, each carrying a third label, into the second-level model to obtain a final result for the data to be processed; wherein each third label is one of the model output results of the data to be processed, and each copy of the data to be processed carrying a third label corresponds one-to-one with the model output result of the data to be processed produced by one of the optimal models.
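Claim 5 describes a stacking arrangement: the base models' outputs on the second training data, paired with the actual results, train a second-level model that arbitrates at prediction time. A toy sketch; the callable base models and the accuracy-counting second-level learner are assumptions, not the patent's construction:

```python
def base_outputs(models, x):
    # One model output result per optimal model, in a fixed model order.
    return [m(x) for m in models]

def build_third_training_data(models, second_train, actual):
    # Pair each item's actual result (first label) with the per-model
    # output results (second labels) to form the third training data.
    return [(actual[i], base_outputs(models, x))
            for i, x in enumerate(second_train)]

def train_second_level(third_data):
    # Toy second-level model: count, per base model, how often its
    # output matched the actual result, then always trust the best one.
    n_models = len(third_data[0][1])
    hits = [0] * n_models
    for truth, outs in third_data:
        for j, out in enumerate(outs):
            hits[j] += (out == truth)
    best = max(range(n_models), key=lambda j: hits[j])
    return lambda outs: outs[best]

def predict(models, second_level, x):
    # Feed the third labels (per-model outputs on x) to the
    # second-level model to obtain the final result.
    return second_level(base_outputs(models, x))
```

A real second-level model would typically be another learner trained on the stacked outputs; the one-to-one correspondence between labels and models in the claim is what fixes the column order of that stacked input.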
6. An apparatus for implementing model optimization, wherein the apparatus comprises:
a first acquisition unit configured to obtain data feature values of a training data set, the training data set comprising multiple items of first training data;
a computing unit configured to compute the distance between the data feature values of the training data set and the data feature values of each sample data set, and to determine the sample data sets whose distance is less than a first threshold as approximate sample data sets of the training data set;
a second acquisition unit configured to obtain a preferred value of each hyperparameter of a target algorithm corresponding to the approximate sample data sets, the preferred values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets being predetermined;
an optimization unit configured to adjust each hyperparameter of the target algorithm from its initial value using the training data set, to obtain values of the hyperparameters of the target algorithm that satisfy a preset standard, wherein the initial value of each hyperparameter of the target algorithm is the preferred value of that hyperparameter of the target algorithm corresponding to the approximate sample data sets; and
a first generation unit configured to generate an optimal model corresponding to the target algorithm, wherein each hyperparameter of the optimal model corresponding to the target algorithm uses the value of that hyperparameter of the target algorithm that satisfies the preset standard.
7. The apparatus according to claim 6, wherein the apparatus further comprises:
a third acquisition unit configured to obtain a sensitivity value of each hyperparameter of the target algorithm corresponding to the approximate sample data sets, the sensitivity values of the hyperparameters of the target algorithm corresponding to the approximate sample data sets being predetermined; and
a determination unit configured to determine the hyperparameters whose sensitivity value exceeds a second threshold as target hyperparameters of the target algorithm;
wherein the optimization unit is specifically configured to adjust each target hyperparameter of the target algorithm from its initial value using the training data set, while setting the other hyperparameters of the target algorithm to their default values, to obtain the values of the hyperparameters of the target algorithm that satisfy the preset standard, wherein the initial value of each target hyperparameter of the target algorithm is the preferred value of that target hyperparameter of the target algorithm corresponding to the approximate sample data sets.
8. The apparatus according to claim 6, wherein, when there are multiple target algorithms, the first generation unit is specifically configured to generate an optimal model for each target algorithm.
9. A computer-readable storage medium having instructions stored therein, wherein, when the instructions are run on a terminal device, they cause the terminal device to perform the method of implementing model optimization according to any one of claims 1 to 5.
10. A computer program product, wherein, when the computer program product is run on a terminal device, it causes the terminal device to perform the method of implementing model optimization according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811045717.2A CN109213805A (en) | 2018-09-07 | 2018-09-07 | A kind of method and device of implementation model optimization |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109213805A true CN109213805A (en) | 2019-01-15 |
Family
ID=64987808
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811045717.2A Pending CN109213805A (en) | 2018-09-07 | 2018-09-07 | A kind of method and device of implementation model optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213805A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109840501A (en) * | 2019-01-31 | 2019-06-04 | 深圳市商汤科技有限公司 | A kind of image processing method and device, electronic equipment, storage medium |
CN109919012A (en) * | 2019-01-28 | 2019-06-21 | 北控水务(中国)投资有限公司 | A kind of indicative microorganism image-recognizing method of sewage treatment based on convolutional neural networks |
CN110222087A (en) * | 2019-05-15 | 2019-09-10 | 平安科技(深圳)有限公司 | Feature extracting method, device and computer readable storage medium |
CN110443126A (en) * | 2019-06-27 | 2019-11-12 | 平安科技(深圳)有限公司 | Model hyper parameter adjusts control method, device, computer equipment and storage medium |
CN111260243A (en) * | 2020-02-10 | 2020-06-09 | 京东数字科技控股有限公司 | Risk assessment method, device, equipment and computer readable storage medium |
CN112488245A (en) * | 2020-12-21 | 2021-03-12 | 中国移动通信集团江苏有限公司 | Service model hyper-parameter configuration determining method, device, equipment and storage medium |
CN112861689A (en) * | 2021-02-01 | 2021-05-28 | 上海依图网络科技有限公司 | Searching method and device of coordinate recognition model based on NAS technology |
CN113051452A (en) * | 2021-04-12 | 2021-06-29 | 清华大学 | Operation and maintenance data feature selection method and device |
CN113342903A (en) * | 2020-02-18 | 2021-09-03 | 北京沃东天骏信息技术有限公司 | Method and device for managing models in data warehouse |
CN113555008A (en) * | 2020-04-17 | 2021-10-26 | 阿里巴巴集团控股有限公司 | Parameter adjusting method and device for model |
CN114127698A (en) * | 2019-07-18 | 2022-03-01 | 日本电信电话株式会社 | Learning device, detection system, learning method, and learning program |
WO2022152166A1 (en) * | 2021-01-13 | 2022-07-21 | International Business Machines Corporation | Supervised vae for optimization of value function and generation of desired data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109213805A (en) | A kind of method and device of implementation model optimization | |
US11693388B2 (en) | Methods and apparatus for machine learning predictions of manufacturing processes | |
CN111427750B (en) | GPU power consumption estimation method, system and medium of computer platform | |
CN108052387B (en) | Resource allocation prediction method and system in mobile cloud computing | |
CN109241290A (en) | A kind of knowledge mapping complementing method, device and storage medium | |
CN110728313B (en) | Classification model training method and device for intention classification recognition | |
CN104392253A (en) | Interactive classification labeling method for sketch data set | |
CN109948680A (en) | The classification method and system of medical record data | |
CN112116104B (en) | Method, device, medium and electronic equipment for automatically integrating machine learning | |
Stiskalek et al. | The scatter in the galaxy–halo connection: a machine learning analysis | |
CN112200296A (en) | Network model quantification method and device, storage medium and electronic equipment | |
CN111192158A (en) | Transformer substation daily load curve similarity matching method based on deep learning | |
CN113516019B (en) | Hyperspectral image unmixing method and device and electronic equipment | |
CN113408622A (en) | Non-invasive load identification method and system considering characteristic quantity information expression difference | |
EP1082646A4 (en) | Pre-processing and post-processing for enhancing knowledge discovery using support vector machines | |
CN107203916B (en) | User credit model establishing method and device | |
EP4080789A1 (en) | Enhanced uncertainty management for optical communication systems | |
Wang et al. | Enhanced soft subspace clustering through hybrid dissimilarity | |
Subbalakshmi et al. | Performance issues on K-mean partitioning clustering algorithm | |
CN113656707A (en) | Financing product recommendation method, system, storage medium and equipment | |
CN115836298A (en) | Automatic selection and filter removal optimization of quantization under energy constraints | |
Hewa Nadungodage et al. | Online multi-dimensional regression analysis on concept-drifting data streams | |
Tyas et al. | Implementation of Particle Swarm Optimization (PSO) to improve neural network performance in univariate time series prediction | |
CN115630772B (en) | Comprehensive energy detection and distribution method, system, equipment and storage medium | |
CN116881748A (en) | Big data-oriented partition combination clustering method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190115 |