CN111105040A - Hyper-parameter optimization method, device, computer equipment and storage medium

Info

Publication number: CN111105040A
Application number: CN201911110640.7A
Authority: CN (China)
Prior art keywords: hyperparameter, target, parameter
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 侯皓龄, 刘云峰
Current assignee: Shenzhen Zhuiyi Technology Co Ltd
Original assignee: Shenzhen Zhuiyi Technology Co Ltd
Application filed by Shenzhen Zhuiyi Technology Co Ltd
Priority to: CN201911110640.7A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The application relates to a hyper-parameter optimization method, a hyper-parameter optimization device, computer equipment and a storage medium. The method comprises the following steps: acquiring a preset number of hyperparameter sets, and predicting a hyperparameter set score for each set through a trained empirical model; screening a candidate hyperparameter set from the acquired sets according to the scores; performing model training with the candidate hyperparameter set and a preset machine learning operator on a training sample set corresponding to the target problem to obtain a trained target problem prediction model; testing the target problem prediction model on the test sample set corresponding to the target problem to obtain an evaluation value for the candidate hyperparameter set; updating the reference evaluation value currently corresponding to the target problem according to the evaluation value, and returning to the step of acquiring the preset number of hyperparameter sets to continue execution until an iteration stop condition is met; and determining the candidate hyperparameter set corresponding to the reference evaluation value as the target hyperparameter set. By adopting the method, optimization efficiency can be improved.

Description

Hyper-parameter optimization method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of parameter optimization technologies, and in particular, to a method and an apparatus for hyper-parameter optimization, a computer device, and a storage medium.
Background
With the development of computer technology, machine learning has emerged as a way to automatically mine valuable information from massive data. Research on and application of machine learning depend on improving the effectiveness of machine learning algorithms, for example through feature engineering, model selection, hyper-parameter optimization and result evaluation. Traditional machine learning relies on expert knowledge, which is inefficient and costly. To address this, automated machine learning techniques have gradually developed; taking hyper-parameter optimization as an example, a machine learning model is trained and evaluated based on a selected hyperparameter set so as to realize hyper-parameter optimization.
Currently, hyper-parameter optimization algorithms generally search the hyper-parameter space by sampling and then evaluating the samples; such algorithms include grid search, random search, Bayesian optimization and evolutionary algorithms. To improve the optimization effect, existing hyper-parameter optimization algorithms generally need to draw a large number of samples and evaluate each of them, so they suffer from low optimization efficiency.
Disclosure of Invention
In view of the above, it is necessary to provide a hyper-parameter optimization method, apparatus, computer device and storage medium that can improve optimization efficiency.
A method of hyper-parameter optimization, the method comprising:
acquiring a preset number of hyperparameter sets, and predicting a hyperparameter set score for each hyperparameter set through a trained empirical model;
screening a candidate hyperparameter set from the acquired hyperparameter sets according to the hyperparameter set scores;
performing model training with the candidate hyperparameter set and a preset machine learning operator on a training sample set corresponding to the target problem, to obtain a trained target problem prediction model;
testing the target problem prediction model according to the test sample set corresponding to the target problem, to obtain an evaluation value corresponding to the candidate hyperparameter set;
updating the reference evaluation value currently corresponding to the target problem according to the evaluation value, and returning to the step of acquiring the preset number of hyperparameter sets and predicting a hyperparameter set score for each set through the trained empirical model, until an iteration stop condition is met;
and determining the candidate hyperparameter set corresponding to the reference evaluation value of the target problem as the target hyperparameter set.
In one embodiment, there are a plurality of trained empirical models; the acquiring of the preset number of hyperparameter sets and the predicting of the hyperparameter set score of each set through the trained empirical models include:
acquiring a hyperparameter set;
predicting on the hyperparameter set through each of the plurality of trained empirical models, to obtain a plurality of hyperparameter set sub-scores for the set;
and obtaining the hyperparameter set score of the set according to the plurality of sub-scores, and returning to the step of acquiring a hyperparameter set to continue execution until the preset number of hyperparameter set scores are obtained.
In one embodiment, the obtaining the hyperparameter set score of the set according to the plurality of sub-scores includes:
weighting and summing the plurality of hyperparameter set sub-scores according to the weights of the corresponding empirical models, to obtain the hyperparameter set score;
the method further comprises:
dynamically updating the weight corresponding to each empirical model according to the plurality of sub-scores corresponding to the candidate hyperparameter set.
In one embodiment, the dynamically updating the weight corresponding to each empirical model according to the plurality of sub-scores corresponding to the candidate hyperparameter set includes:
obtaining the plurality of hyperparameter set sub-scores corresponding to the candidate hyperparameter set;
determining a target problem label corresponding to each empirical model according to the obtained sub-scores;
and dynamically updating the weight of the corresponding empirical model according to the sub-score and the target problem label corresponding to each empirical model.
In one embodiment, the dynamically updating the weights of the respective empirical models according to the sub-score and the target problem label corresponding to each empirical model includes:
dynamically determining a corresponding target weight for each empirical model from its sub-score, target problem label and current weight, according to a preset mapping relation;
and normalizing the target weights to obtain the updated weight corresponding to each empirical model.
In one embodiment, the screening a candidate hyperparameter set from the acquired hyperparameter sets according to the hyperparameter set scores includes:
screening the largest hyperparameter set score from the predicted scores;
determining the hyperparameter set corresponding to the screened score as the candidate hyperparameter set;
the updating the reference evaluation value currently corresponding to the target problem according to the evaluation value includes:
updating the reference evaluation value to the evaluation value when the evaluation value is greater than or equal to the reference evaluation value currently corresponding to the target problem.
In one embodiment, the step of training the empirical model includes:
acquiring an empirical data set; the empirical data set includes empirical hyperparameter sets and the corresponding empirical hyperparameter set scores;
and performing model training with the empirical hyperparameter sets as input features and the corresponding empirical hyperparameter set scores as expected output features, to obtain a trained empirical model.
In one embodiment, the acquiring the empirical data set includes:
acquiring an empirical hyperparameter set;
performing model training with the empirical hyperparameter set and a preset machine learning operator on a training sample set corresponding to a preset problem, to obtain a trained preset problem prediction model;
testing the preset problem prediction model according to the test sample set corresponding to the preset problem, to obtain the empirical hyperparameter set score corresponding to the empirical hyperparameter set, and returning to the step of acquiring an empirical hyperparameter set to continue execution until a target quantity of empirical hyperparameter set scores are obtained;
and obtaining the empirical data set from the empirical hyperparameter sets and the corresponding scores.
In one embodiment, after determining the candidate hyperparameter set corresponding to the reference evaluation value of the target problem as the target hyperparameter set, the method further includes:
performing model training with the target hyperparameter set and the machine learning operator on a target sample set corresponding to the target problem, to obtain a trained target prediction model;
acquiring target data to be predicted when a prediction trigger condition corresponding to the target problem is met;
and inputting the target data into the target prediction model for prediction, to obtain a corresponding prediction result.
A hyper-parameter optimization apparatus, the apparatus comprising:
an acquisition module, used for acquiring a preset number of hyperparameter sets and predicting a hyperparameter set score for each set through a trained empirical model;
a screening module, used for screening a candidate hyperparameter set from the acquired sets according to the hyperparameter set scores;
a training module, used for performing model training with the candidate hyperparameter set and a preset machine learning operator on a training sample set corresponding to the target problem, to obtain a trained target problem prediction model;
an evaluation module, used for testing the target problem prediction model according to the test sample set corresponding to the target problem, to obtain an evaluation value corresponding to the candidate hyperparameter set;
an updating module, used for updating the reference evaluation value currently corresponding to the target problem according to the evaluation value, and instructing the acquisition module to repeat the step of acquiring the preset number of hyperparameter sets and predicting the score of each set through the trained empirical model, until an iteration stop condition is met;
and a determining module, used for determining the candidate hyperparameter set corresponding to the reference evaluation value of the target problem as the target hyperparameter set.
In one embodiment, there are a plurality of trained empirical models; the acquisition module is further used for acquiring a hyperparameter set; predicting on the hyperparameter set through each of the plurality of trained empirical models, to obtain a plurality of hyperparameter set sub-scores for the set; and obtaining the hyperparameter set score of the set according to the plurality of sub-scores, and returning to the step of acquiring a hyperparameter set to continue execution until the preset number of hyperparameter set scores are obtained.
In one embodiment, the acquisition module is further used for performing a weighted summation of the plurality of hyperparameter set sub-scores according to the weights of the corresponding empirical models, to obtain the hyperparameter set score; the updating module is further used for dynamically updating the weight corresponding to each empirical model according to the plurality of sub-scores corresponding to the candidate hyperparameter set.
In one embodiment, the apparatus further comprises:
a prediction module, used for performing model training with the target hyperparameter set and the machine learning operator on a target sample set corresponding to the target problem, to obtain a trained target prediction model; acquiring target data to be predicted when a prediction trigger condition corresponding to the target problem is met; and inputting the target data into the target prediction model for prediction, to obtain a corresponding prediction result.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the method for hyper-parameter optimization as described in the various embodiments above when executing the computer program.
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for hyper-parameter optimization as described in the various embodiments above.
According to the hyper-parameter optimization method, apparatus, computer equipment and storage medium, the hyperparameter set score of each hyperparameter set is predicted through the pre-trained empirical model, and the preset number of hyperparameter sets are preliminarily screened according to those scores, so that only the candidate hyperparameter sets needing further evaluation are retained. This avoids evaluating every hyperparameter set separately, saves evaluation time, and improves hyper-parameter optimization efficiency. Further, the candidate hyperparameter set is evaluated on the training sample set and test sample set corresponding to the target problem, and the reference evaluation value corresponding to the target problem is dynamically updated based on the evaluation value, so that a hyperparameter set suited to the target problem is screened out and the optimization effect is improved. Moreover, the hyperparameter sets are screened and evaluated in an iterative loop, which further improves optimization efficiency while preserving the optimization effect.
Drawings
FIG. 1 is a diagram of an application environment of a hyper-parameter optimization method in one embodiment;
FIG. 2 is a schematic flow chart of a hyper-parameter optimization method in one embodiment;
FIG. 3 is a schematic flow chart of a hyper-parameter optimization method in another embodiment;
FIG. 4 is a schematic flow chart of a hyper-parameter optimization method in yet another embodiment;
FIG. 5 is a block diagram of a hyper-parameter optimization apparatus in one embodiment;
FIG. 6 is a diagram of the internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The hyper-parameter optimization method provided by the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The server 104 acquires a preset number of hyperparameter sets, predicts a hyperparameter set score for each set through a trained empirical model, screens a candidate hyperparameter set from the acquired sets according to the scores, evaluates the candidate hyperparameter set with a preset machine learning operator on the training sample set and test sample set corresponding to the target problem, updates the reference evaluation value corresponding to the target problem according to the obtained evaluation value, returns to the step of acquiring the preset number of hyperparameter sets and predicting their scores until an iteration stop condition is met, and determines the candidate hyperparameter set corresponding to the reference evaluation value as the target hyperparameter set. The server 104 may trigger this hyper-parameter optimization process upon receiving a hyper-parameter optimization instruction sent by the terminal 102, and may also push the determined target hyperparameter set to the terminal 102. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer or portable wearable device, and the server 104 may be implemented by an independent server or a server cluster formed by a plurality of servers.
In one embodiment, as shown in FIG. 2, a hyper-parameter optimization method is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
s202, acquiring a preset number of super parameter sets, and respectively predicting the super parameter set scores of each super parameter set through a trained empirical model.
Wherein the preset number is a number of the preset super parameter sets, such as 20. A hyper-parameter set is a set of one or more hyper-parameters, i.e. a set of hyper-parameters. The number of the hyper-parameters in the hyper-parameter group is determined by corresponding machine learning operators, for example, the machine learning operators are random forests, and the hyper-parameter group comprises two hyper-parameters of learning rate and tree number. The super parameter set may specifically include one or more super parameters to be optimized, and a certain super parameter value corresponding to each super parameter. The empirical model is a model obtained by model training based on an empirical data set and can be used to predict a hyperparameter set score for the hyperparameter set.
The hyperparameter set scoring is a scoring value which is obtained by predicting the hyperparameter set through an empirical model and can be used for representing the quality of the hyperparameter set. The higher the hyperparameter set score is, the greater the optimization assistance of the corresponding hyperparameter set is characterized, that is, the greater the possibility of characterizing the corresponding hyperparameter set as the optimal hyperparameter set is. Correspondingly, the lower the super parameter group score, the lower the possibility that the corresponding super parameter group is represented as the optimal super parameter group, and by eliminating the super parameter group with the lower super parameter group score, the invalid super parameter group can be eliminated, so that the evaluation time of the invalid super parameter group is saved. The value range of the hyperparameter set score can be customized, such as [0,1], or [0,100], and is not specifically limited herein.
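To make the data structure concrete, the following minimal Python sketch represents a hyperparameter set as a mapping from hyper-parameter names to values and samples one set from a search space. The key names and value ranges are illustrative assumptions, not fixed by the patent, although the learning-rate and tree-count pair follows the random forest example above.

    import random

    # Hypothetical search space; keys and ranges are assumptions for illustration.
    SEARCH_SPACE = {
        "learning_rate": (0.001, 0.3),   # continuous hyper-parameter
        "n_trees": (10, 500),            # integer hyper-parameter
    }

    def sample_hyperparameter_set(space=SEARCH_SPACE):
        """Draw one hyperparameter set by sampling each hyper-parameter once."""
        return {
            "learning_rate": random.uniform(*space["learning_rate"]),
            "n_trees": random.randint(*space["n_trees"]),
        }

    hp_set = sample_hyperparameter_set()  # e.g. {"learning_rate": 0.07, "n_trees": 231}

An empirical model then maps such a set to a score in the chosen range, for example [0,1], with higher meaning more likely to be near-optimal.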
Specifically, when the hyper-parameter optimization trigger condition is met, the server acquires a preset number of hyperparameter sets and predicts each acquired set through a trained empirical model to obtain the corresponding score. The trigger condition is met, for example, when a hyper-parameter optimization instruction sent by the terminal is received, or when the current time matches a preset trigger time.
In one embodiment, the server first acquires all of the preset number of hyperparameter sets, and then predicts the score of each acquired set through the trained empirical model.
In one embodiment, the server acquires one hyperparameter set, predicts its score through the trained empirical model, and returns to the step of acquiring a hyperparameter set until the scores of the preset number of sets have been predicted. Concretely, the server may sample each hyper-parameter in the hyper-parameter search space to obtain a hyperparameter set; after predicting the score of that set, it checks the sampling count, and if the count is smaller than the preset sampling count it continues sampling the search space, otherwise it proceeds to the subsequent steps based on the preset number of predicted scores.
In one embodiment, one or more empirical models may be used to predict the score of each hyperparameter set, all of them corresponding to the same machine learning operator. The server pre-trains, for each machine learning operator, empirical models corresponding to a plurality of preset problems. The higher the matching degree between a preset problem and the target problem, the higher the correlation between the empirical model of that preset problem and the target problem. The server may predict the scores using the single empirical model most correlated with the target problem, or using several empirical models correlated with it. The server may also jointly predict each score through all of the pre-trained empirical models for the operator while dynamically adjusting the influence of each model on the score; in that case there is no need to pre-screen the empirical models related to the target problem, which avoids the loss of optimization accuracy that occurs when the screened models are not sufficiently relevant.
For example, taking the machine learning operator to be a random forest, suppose the preset problems include product recommendation, housing price prediction for Shenzhen, and medicine price prediction, and the target problem is housing price prediction for Changsha. The Shenzhen housing price problem matches the target problem closely, so the empirical model corresponding to that preset problem is highly correlated with the target problem.
In one embodiment, the server scores an empirical hyperparameter set based on a preset machine learning operator and the training sample set and test sample set corresponding to a preset problem, obtaining the corresponding empirical hyperparameter set score, and then performs model training on the empirical hyperparameter sets and their scores to obtain a trained empirical model for the preset problem. The machine learning operator involved in training the empirical model may be the same as or different from the operator used to score the empirical hyperparameter sets; for example, the empirical model may be a multilayer perceptron while the sets are scored with a random forest.
In one embodiment, the server selects the hyperparameter set to be scored from the hyper-parameter search space through an optimization operator, for example a Bayesian optimization operator, without specific limitation. The server may also select the set randomly from the search space, or traverse the search space to acquire it. It can be understood that the hyperparameter sets acquired by the server differ from one another across the optimization iterations.
S204, screening a candidate hyperparameter set from the acquired hyperparameter sets according to the hyperparameter set scores.
Specifically, the server compares the preset number of predicted scores and screens candidate hyperparameter sets from the corresponding sets according to the comparison result, so that the screened candidates can then be evaluated.
In one embodiment, the server screens one or more candidate hyperparameter sets from the preset number of sets according to a preset screening condition applied to the predicted scores. The screening condition may be, for example: keep the set with the largest score; keep the sets whose scores are greater than or equal to a preset score, such as 0.9; keep the sets with the several highest-ranked scores; or keep one or more top-ranked sets whose scores are greater than or equal to the preset score. The examples are not exhaustive.
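A minimal sketch of this screening step, covering the largest-score, top-k and threshold conditions just described; the function name and signature are assumptions for illustration.

    def screen_candidates(hp_sets, scores, top_k=1, min_score=None):
        """Screen candidate hyperparameter sets by score: keep the top_k
        highest-scoring sets, optionally requiring score >= min_score."""
        ranked = sorted(zip(hp_sets, scores), key=lambda pair: pair[1], reverse=True)
        if min_score is not None:
            ranked = [(s, sc) for s, sc in ranked if sc >= min_score]
        return [s for s, _ in ranked[:top_k]]

With top_k=1 and no threshold, this reduces to "the set with the largest score".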
In one embodiment, the server locally pre-stores, for the trained empirical model, a positive example and its score and a negative example and its score. After predicting the score of each hyperparameter set, the server compares that score with the positive example score and the negative example score. If the score is larger than the positive example score, the positive example score is updated to this score and the positive example is updated to the corresponding hyperparameter set. If the score is smaller than the negative example score, the negative example score is updated to this score and the negative example is updated to the corresponding hyperparameter set. In this way the server can select hyperparameter sets with reference to the positive and negative examples and choose better sets to score, improving optimization efficiency and accuracy.
S206, performing model training with the candidate hyperparameter set and a preset machine learning operator on the training sample set corresponding to the target problem, to obtain a trained target problem prediction model.
The machine learning operator, that is, the machine learning algorithm, is a basic algorithm for training a machine learning model, such as a multilayer perceptron. The training sample set is a sample set used for training a target problem prediction model, and includes a plurality of training data and a training data label corresponding to each training data.
Specifically, the server obtains a training sample set corresponding to the target problem, takes training data in the training sample set as input features, takes corresponding training data labels as expected output features, and performs model training according to the screened candidate hyperparameter set and a preset machine learning operator to obtain a trained target problem prediction model.
S208, testing the target problem prediction model according to the test sample set corresponding to the target problem, to obtain an evaluation value corresponding to the candidate hyperparameter set.
The test sample set is used to test the trained target problem prediction model, and includes test data and a label for each piece of test data. The evaluation value characterizes the quality of the candidate hyperparameter set; concretely, it can characterize the prediction performance, such as the accuracy, of the target problem prediction model trained with the candidate set.
Specifically, the server obtains the test sample set corresponding to the target problem, and inputs each piece of test data into the trained target problem prediction model to obtain a corresponding prediction label. The server then determines the evaluation value of the candidate hyperparameter set from the prediction label and the test data label of each piece of test data.
In one embodiment, the server calculates the accuracy and/or recall of the target problem prediction model from the prediction labels and the test data labels of the test sample set, and derives the evaluation value of the candidate hyperparameter set from them; for example, the accuracy is taken as the evaluation value.
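A hedged sketch of steps S206 and S208 using scikit-learn, assuming the machine learning operator is a random forest classifier and the evaluation value is test-set accuracy. The data variables are placeholders, and the hyper-parameter key name follows the earlier sketch; note that scikit-learn's random forest has no learning-rate parameter, so only the tree count is used here.

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score

    def evaluate_candidate(candidate_hp, X_train, y_train, X_test, y_test):
        """Train the target problem prediction model with the candidate
        hyperparameter set (S206), then score it on the test sample set (S208)."""
        model = RandomForestClassifier(
            n_estimators=int(candidate_hp["n_trees"]),  # assumed key name
            random_state=0,
        )
        model.fit(X_train, y_train)             # S206: model training
        predictions = model.predict(X_test)     # S208: prediction labels
        return accuracy_score(y_test, predictions)  # evaluation value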
S210, updating the reference evaluation value currently corresponding to the target problem according to the evaluation value, and returning to step S202 to continue execution until the iteration stop condition is met.
The iteration stop condition is the condition for stopping the iterative optimization process, for example that the number of iterations reaches a preset number of iterations, such as 15, or that the evaluation value of the candidate hyperparameter set reaches a preset evaluation value, such as 0.99; neither is specifically limited herein.
Specifically, the server compares the evaluation value of the candidate hyperparameter set with the reference evaluation value currently corresponding to the target problem, and dynamically updates the reference value according to the comparison result: when the evaluation value is greater than or equal to the reference evaluation value, the reference value is updated to the evaluation value; otherwise it is kept unchanged. After updating, the server returns to the step of acquiring the preset number of hyperparameter sets and predicting their scores through the trained empirical model, and continues until the iteration stop condition is met, at which point the iterative optimization process stops.
In one embodiment, when several candidate hyperparameter sets are screened from the preset number of sets, the server determines the evaluation value of each candidate in the manner above, based on the preset machine learning operator and the training and test sample sets of the target problem. The server then selects the largest of these evaluation values and updates the reference evaluation value of the target problem according to it.
In one embodiment, when the evaluation value is judged greater than or equal to the reference evaluation value, the server updates the reference value to the evaluation value and dynamically updates the weight of each empirical model based on the plurality of sub-scores of the candidate hyperparameter set. When the evaluation value is judged smaller than the reference value, the server keeps the reference value unchanged and still dynamically updates the weights from the candidate's sub-scores. After updating the weights, if the iteration count is smaller than the preset count, the server returns to the step of acquiring the preset number of hyperparameter sets and predicting their scores through the trained empirical models; otherwise the iterative optimization stops.
S212, determining the candidate hyperparameter set corresponding to the reference evaluation value of the target problem as the target hyperparameter set.
Specifically, once the iterative optimization stops, the reference evaluation value currently corresponding to the target problem is the best evaluation value recorded during optimization, and the candidate hyperparameter set behind it is the best set found. The server therefore determines that candidate as the target hyperparameter set, so that a corresponding target prediction model can later be trained from it in the application stage.
According to this hyper-parameter optimization method, the score of each hyperparameter set is predicted through the pre-trained empirical model, and the preset number of sets are preliminarily screened by score so that only the candidates needing further evaluation remain; all sets need not be evaluated separately, evaluation time is saved, and optimization efficiency improves. Further, the candidate set is evaluated on the training and test sample sets of the target problem, and the reference evaluation value is dynamically updated from the evaluation value, so that a hyperparameter set suited to the target problem is screened out and the optimization effect improves. Moreover, screening and evaluation proceed in an iterative loop, which further improves efficiency while preserving the optimization effect.
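Putting S202 through S212 together, the loop can be sketched as below. This is an illustrative composition of the earlier sketches, not the patent's exact procedure: sample_hyperparameter_set, screen_candidates and evaluate_candidate are the assumed helpers defined above, and the empirical model is assumed to expose a predict_score method (a hypothetical interface, given a concrete wrapper in the training sketch further down).

    def optimize_hyperparameters(empirical_model, data, n_sets=20, max_iters=15):
        """Illustrative end-to-end loop for S202-S212."""
        X_train, y_train, X_test, y_test = data
        reference_eval, target_hp = float("-inf"), None

        for _ in range(max_iters):  # assumed iteration stop condition
            # S202: acquire sets and score them with the trained empirical model
            hp_sets = [sample_hyperparameter_set() for _ in range(n_sets)]
            scores = [empirical_model.predict_score(hp) for hp in hp_sets]

            # S204: screen the candidate set with the largest score
            candidate = screen_candidates(hp_sets, scores, top_k=1)[0]

            # S206-S208: train and test on the target problem
            evaluation = evaluate_candidate(candidate, X_train, y_train,
                                            X_test, y_test)

            # S210: update the reference evaluation value when improved
            if evaluation >= reference_eval:
                reference_eval, target_hp = evaluation, candidate

        # S212: the candidate behind the reference evaluation value wins
        return target_hp, reference_eval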
In one embodiment, there are a plurality of trained empirical models, and step S202 includes: acquiring a hyperparameter set; predicting on the set through each of the trained empirical models to obtain a plurality of hyperparameter set sub-scores; and obtaining the set's score from the sub-scores, then returning to the step of acquiring a hyperparameter set until the preset number of scores are obtained.
Specifically, the server has trained a plurality of empirical models in advance. When the optimization trigger condition is met, the server acquires one hyperparameter set and predicts on it through each trained empirical model, obtaining one sub-score per model and thus a plurality of sub-scores for the set. The server determines the set's score from these predicted sub-scores, and then returns to the step of acquiring one hyperparameter set until the scores of the preset number of sets have been predicted.
In one embodiment, the server pre-trains, for each machine learning operator, an empirical model under each of several different preset problems. When an optimization process is triggered for a target problem and a preset machine learning operator, the server screens out the empirical models corresponding to that operator from the pre-trained models, and predicts each set's sub-scores through each screened model.
In one embodiment, the server may compute a weighted sum of the predicted sub-scores to obtain the set's score; the weights need not sum to 1. The server may instead average the sub-scores, arithmetically or weighted, and take the mean as the set's score.
In one embodiment, the server acquires the preset number of hyperparameter sets serially or in parallel. After all sets are acquired, the server predicts the sub-scores of each set through each trained empirical model, and determines each set's score from its sub-scores.
In the above embodiment, the trained empirical models are combined to predict the hyperparameter set scores, so that when the sets for the target problem are optimized based on those scores, both the efficiency and the accuracy of optimization improve.
In one embodiment, deriving the hyperparameter set score from the plurality of sub-scores includes: weighting and summing the sub-scores according to the weights of the corresponding empirical models to obtain the set's score. The hyper-parameter optimization method further comprises: dynamically updating the weight of each empirical model according to the plurality of sub-scores of the candidate hyperparameter set.
Specifically, after predicting the set's sub-scores through the trained empirical models, the server computes their weighted sum using the weight of the model behind each sub-score, yielding the set's score. Further, when a weight update trigger condition is met, the server dynamically updates each model's weight according to the sub-scores of the screened candidate set. The trigger condition is not specifically limited; it may be, for example, that a candidate set has been screened from the preset number of sets, that an evaluation value for the candidate has been obtained from the trained target problem prediction model, or that the reference evaluation value of the target problem has been updated from that evaluation value.
In one embodiment, the server computes the weighted sum of each set's sub-scores according to the following formula (1):
$\Phi = \sum_i \omega_i \Phi_i$ (1)
where $\Phi$ is the hyperparameter set score of the set, $\Phi_i$ is the sub-score predicted for the set by the i-th empirical model, and $\omega_i$ is the weight of the i-th empirical model. The total number of empirical models can be customized according to the actual situation, such as 50. It can be understood that the empirical models in formula (1) are distinct models trained for different preset problems; by adaptively adjusting them under the target problem, an adaptive model for the target problem is obtained, namely the combination of the empirical models under the adapted weights.
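A direct transcription of formula (1), with the unweighted plain-mean variant mentioned earlier as a fallback; the predict_score method is the assumed interface from the previous sketches.

    def combined_score(hp_set, empirical_models, weights=None):
        """Formula (1): the weighted sum of the sub-scores each empirical
        model predicts for one hyperparameter set."""
        sub_scores = [m.predict_score(hp_set) for m in empirical_models]
        if weights is None:  # unweighted variant: arithmetic mean
            return sum(sub_scores) / len(sub_scores)
        return sum(w * s for w, s in zip(weights, sub_scores))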
In one embodiment, the ways of combining the empirical models include, but are not limited to, ensemble techniques such as Boosting, Stacking and Blending.
In one embodiment, after the sub-scores of a hyperparameter set are predicted through the trained empirical models, the server dynamically updates the weight of each empirical model according to those sub-scores.
In the above embodiment, the sub-scores are weighted and summed according to the weights of the empirical models to obtain the set's score, so that each weight reflects both the model's degree of influence on the optimization and its correlation with the target problem. Dynamically adjusting the weights makes the ensemble of empirical models self-adaptive: the influence of irrelevant models is reduced and that of relevant models increased, improving both the efficiency and the accuracy of hyper-parameter optimization.
In one embodiment, dynamically updating the weight of each empirical model according to the sub-scores of the candidate hyperparameter set includes: obtaining the plurality of sub-scores of the candidate set; determining the target problem label of each empirical model from the obtained sub-scores; and dynamically updating each model's weight from its sub-score and target problem label.
Here, the target problem label is a label characterizing the degree of correlation between an empirical model and the target problem.
Specifically, when the weight update trigger condition is met, the server acquires the sub-scores of the candidate hyperparameter set and determines, from each sub-score, the target problem label of the corresponding empirical model. The server then dynamically updates each model's current weight from its sub-score and target problem label.
In one embodiment, the server predicts the candidate set's sub-scores through the trained empirical models. When the candidate is screened from the preset number of sets, the server stores its sub-scores locally, and when the weight update trigger condition is met, the server retrieves them from local storage.
In one embodiment, the server determines the target problem label of the empirical model behind each sub-score according to that sub-score's rank among all the sub-scores.
In one embodiment, the target problem labels include a first label, characterizing that the model is relevant to the target problem, and a second label, characterizing that it is not. For example, when the score range is [0,1], the first label is 1 and the second label is 0. The server sorts the candidate set's sub-scores in descending order, assigns the first label to the models behind the specified number of top-ranked sub-scores, and assigns the second label to the remaining models. The specified number can be customized, such as 1, without specific limitation.
For example, the server assigns label 1 to the empirical model with the largest sub-score among the candidate set's sub-scores, and label 0 to the other empirical models.
In one embodiment, the server assigns the first label to each empirical model whose sub-score is greater than or equal to a preset score threshold, and the second label to each model whose sub-score is below it. The preset threshold is, for example, 0.9, without specific limitation.
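A sketch of both labeling rules just described, ranking-based and threshold-based; the 1/0 values follow the first/second label example above, and the function names are assumptions.

    def labels_by_ranking(sub_scores, top_n=1):
        """First label (1) for the models whose sub-scores rank in the
        top_n, second label (0) for the rest."""
        order = sorted(range(len(sub_scores)), key=lambda i: sub_scores[i],
                       reverse=True)
        labels = [0] * len(sub_scores)
        for i in order[:top_n]:
            labels[i] = 1
        return labels

    def labels_by_threshold(sub_scores, threshold=0.9):
        """First label (1) when a model's sub-score reaches the threshold."""
        return [1 if s >= threshold else 0 for s in sub_scores]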
In the above embodiment, the weight of each empirical model is dynamically adjusted based on the candidate set's sub-scores so that the ensemble adapts itself: the influence of models relevant to the target problem is increased, that of irrelevant models is reduced, and optimization efficiency and accuracy improve.
In one embodiment, dynamically updating each model's weight from its sub-score and target problem label includes: dynamically determining a target weight for each empirical model from its sub-score, target problem label and current weight, according to a preset mapping relation; and normalizing the target weights to obtain each model's updated weight.
Specifically, the server dynamically determines each empirical model's target weight from its sub-score, target problem label and current weight based on the preset mapping relation. Once every target weight is determined, the server normalizes each one against the full set of target weights, obtaining the updated weight of each empirical model.
In one embodiment, the server determines the target weight corresponding to each empirical model according to a preset mapping relationship shown in the following formula (2).
$\omega_{i0} = \exp\!\left(-\alpha\,(\Phi_i(x) - \mathrm{label}_i)^2\right)\cdot\omega_i$ (2)
where $\omega_i$ is the weight of the i-th empirical model before updating, i.e. its current weight; $\omega_{i0}$ is the target weight of the i-th empirical model; $x$ is the hyperparameter set to be scored; $\Phi_i(x)$ is the sub-score the i-th empirical model predicts for that set; $\mathrm{label}_i$ is the target problem label of the i-th empirical model; and $\alpha$ is a preset constant.
In one embodiment, since formula (2) multiplies the current weight by a number no greater than 1, the weights would decay toward 0 over multiple rounds of updating. To prevent this, the server normalizes each model's target weight according to the following formula (3) to obtain the updated weights.
$\omega_i = \omega_{i0} \,/\, \sum_j \omega_{j0}$ (3)
where $\omega_{i0}$ is the target weight of the i-th empirical model, $\omega_{j0}$ is the target weight of the j-th empirical model, $\sum_j \omega_{j0}$ is the sum of the target weights of all the empirical models, and $\omega_i$ on the left denotes the updated weight of the i-th empirical model. It can be understood that once the weights have been updated, the server computes the weighted sum of the sub-scores using the updated weights.
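Formulas (2) and (3) combine into a single update step, sketched below; the default value of alpha is a placeholder for the preset constant, and the label list comes from either labeling rule above.

    import math

    def update_weights(weights, sub_scores, labels, alpha=1.0):
        """Formula (2): target weight = exp(-alpha * (sub_score - label)^2)
        times the current weight, followed by the formula (3) normalization."""
        targets = [math.exp(-alpha * (s - l) ** 2) * w
                   for w, s, l in zip(weights, sub_scores, labels)]
        total = sum(targets)
        return [t / total for t in targets]  # formula (3)

A model whose sub-score is close to its label keeps most of its weight, while mismatched models are damped; the normalization then rescales the weights to sum to 1 so they cannot decay to 0.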
In the above embodiment, each empirical model's target weight is dynamically determined from its sub-score, target problem label and current weight, so that experience with a positive influence on the optimization is selected during the adaptation, improving optimization efficiency and accuracy. Normalizing the target weights to obtain the updated weights prevents the weights from decaying to 0 during iterative updating. In this way, the dynamic updates increase the share of empirical models close to the target problem and decrease the share of models far from it, while the target problem is always optimized by the full combination of empirical models, which improves generalization and thus the optimization effect.
In one embodiment, step S204 includes: screening the largest hyperparameter set score from the predicted scores, and determining the corresponding hyperparameter set as the candidate hyperparameter set. Updating the reference evaluation value currently corresponding to the target problem according to the evaluation value then includes: updating the reference evaluation value to the evaluation value when the evaluation value is greater than or equal to the current reference evaluation value.
Specifically, the server compares the predicted scores, screens the largest one from the preset number of scores, and determines the corresponding hyperparameter set as the candidate. Further, the server compares the candidate's evaluation value with the reference evaluation value currently corresponding to the target problem; when the evaluation value is judged greater than or equal to the reference value, the server updates the reference value to the candidate's evaluation value.
In one embodiment, when the candidate's evaluation value is smaller than the current reference evaluation value, the server keeps the reference value unchanged and continues with the subsequent steps.
In one embodiment, when the server updates the reference evaluation value according to the candidate's evaluation value, the server also stores the candidate hyperparameter set locally.
In one embodiment, the reference evaluation value currently corresponding to the target problem is the evaluation value of the currently best candidate hyperparameter set; by dynamically updating the reference value, the best candidate found during optimization is recorded, so that it can be identified from the reference value once iterative optimization stops.
In the above embodiment, the hyperparameter set with the largest score among the preset number of sets is determined as the candidate, which avoids evaluating all the sets and improves the efficiency and accuracy of hyper-parameter optimization. During optimization, the reference evaluation value of the target problem is dynamically updated, so that the best candidate for the target problem can be determined from it, further improving efficiency and accuracy.
In one embodiment, the step of training the empirical model comprises: acquiring an experience data set; the experience data set comprises an experience hyperparameter set and a corresponding experience hyperparameter set score; and taking the experience hyperparameter group as an input feature, and taking the corresponding experience hyperparameter group score as an expected output feature to carry out model training to obtain a trained experience model.
Specifically, the server obtains experience hyper-parameter sets, labels each experience hyper-parameter set to obtain a corresponding experience hyper-parameter set score, and obtains an experience data set according to each experience hyper-parameter set and the corresponding experience hyper-parameter set score. And the server takes the experience hyperparameter group as an input feature and takes the corresponding experience hyperparameter group score as an expected output feature to carry out model training to obtain a trained experience model.
In one embodiment, each experience super parameter set may be labeled manually through the terminal, or labeled automatically by the server based on a preset machine learning operator and the training and test sample sets corresponding to a preset problem.
In one embodiment, the server initializes a model with a multi-layer perceptron network structure, selects cross entropy as the loss function, and performs model training on the empirical data set to obtain a trained empirical model. The server stores the trained empirical model locally, so that during hyper-parameter optimization the hyperparameter set scores can be predicted directly through the locally stored empirical model, accelerating experience-based hyper-parameter optimization.
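As a concrete illustration, a minimal training sketch follows. The patent specifies a multi-layer perceptron with cross entropy as the loss function; the scikit-learn MLPRegressor below minimizes squared error instead, and the toy data, layer sizes and file name are assumptions used only to keep the sketch self-contained.

```python
import numpy as np
import joblib
from sklearn.neural_network import MLPRegressor

# Illustrative empirical data: each row is an experience hyperparameter
# set (here C and lambda), each label the corresponding set score.
X = np.array([[1, 0.5], [2, 0.1], [3, 0.9], [4, 0.3]])
y = np.array([0.99, 0.72, 0.40, 0.65])

# A small multi-layer perceptron; MLPRegressor uses a squared-error
# loss, a simplification of the cross entropy named above.
model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
model.fit(X, y)

# Store the trained empirical model locally so later optimization runs
# can predict hyperparameter set scores directly from it.
joblib.dump(model, "empirical_model.joblib")
```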
In the above embodiment, the empirical model is trained in advance based on the empirical data set, so that preliminary screening of the hyper-parameter set is performed based on the trained empirical model, and the evaluation time of the invalid hyper-parameter set can be saved, thereby improving the efficiency of hyper-parameter optimization.
In one embodiment, obtaining an empirical data set includes: acquiring an experience super parameter set; according to the experience hyper-parameter set and a preset machine learning operator, performing model training according to a training sample set corresponding to a preset problem to obtain a trained preset problem prediction model; testing the preset problem prediction model according to the test sample set corresponding to the preset problem to obtain experience hyperparameter set scores corresponding to the experience hyperparameter sets, and returning to the step of obtaining the experience hyperparameter sets to continue executing until the experience hyperparameter set scores of the target quantity are obtained; and obtaining an experience data set according to the experience hyperparameter set and the corresponding experience hyperparameter set score.
In one embodiment, the server obtains a target number of experience hyper-parameter sets, and determines the experience hyper-parameter set score corresponding to each experience hyper-parameter set in a parallel or serial manner based on a preset machine learning operator, a training sample set corresponding to a preset problem and a test sample set.
In one embodiment, the server selects an experience hyper-parameter set from the hyper-parameter search space through an optimization operator, and evaluates it based on a preset machine learning operator and the training and test sample sets corresponding to a preset problem, obtaining the corresponding experience hyper-parameter set score. By looping the selection and evaluation operations, multiple groups of experience hyper-parameter sets and their corresponding scores are obtained and used as experience data for training an experience model corresponding to the preset problem.
For example, assume the hyper-parameter search space contains two hyper-parameters, C and lambda, where the search range of C is the integers {1, 2, 3, 4} and the search range of lambda is the floating-point interval [0, 1]. The server selects a set of experience hyper-parameters from the search space, say C = 1 and lambda = 0.5, and determines, based on a preset machine learning operator and the training and test sample sets of the preset problem, that the corresponding experience hyper-parameter set score is 0.99. Multiple groups of experience data are obtained by looping these operations.
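Under the same toy search space, the collection loop might be sketched as follows; random sampling and the function names are assumptions (the patent leaves set selection to an optimization operator), and evaluate again stands in for training and testing the preset-problem model.

```python
import random

def collect_experience(evaluate, n_samples=50, seed=0):
    """Sample a set from the search space, score it on the preset
    problem, and accumulate (set, score) pairs as empirical data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        # Search space from the example above: C is an integer in
        # {1, 2, 3, 4}; lambda is a float in [0, 1].
        c = rng.choice([1, 2, 3, 4])
        lam = rng.uniform(0.0, 1.0)
        data.append(((c, lam), evaluate(c, lam)))
    return data
```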
In the above embodiment, based on the training sample set and the test sample set corresponding to the preset problem, the experience super parameter set score corresponding to each experience super parameter set is automatically determined, so that an experience data set related to the preset problem is obtained, and an experience model related to the preset problem can be obtained by performing model training based on the experience data set.
In an embodiment, after step S212, the above-mentioned hyper-parameter optimization method further includes: according to the target hyper-parameter set and the machine learning operator, performing model training according to a target sample set corresponding to the target problem to obtain a trained target prediction model; when a prediction trigger condition corresponding to a target problem is met, target data to be predicted are obtained; and inputting the target data into a target prediction model for prediction to obtain a corresponding prediction result.
The prediction trigger condition is a condition for triggering a prediction operation, for example, a prediction instruction sent by the terminal is received, or the current time is consistent with a preset prediction trigger time. The target data is data that affects the predicted result of the target problem.
Specifically, the target hyper-parameter set is the optimal hyper-parameter set, obtained through optimization under the target problem, corresponding to the preset machine learning operator. The server performs model training according to the target hyper-parameter set, the preset machine learning operator and the target sample set corresponding to the target problem, obtains a trained target prediction model, and stores it locally, so that a comparatively good prediction effect can be obtained when predicting with the target prediction model. When the prediction trigger condition corresponding to the target problem is met, the server acquires the target data to be predicted, inputs it into the trained target prediction model, and obtains the corresponding prediction result.
In one embodiment, the server may determine a training sample set and/or a testing sample set corresponding to the target problem in the super parameter set optimization process as a target sample set corresponding to the target problem.
For example, assuming that the target problem is house price prediction, the target data to be predicted may include gross domestic product, Engel coefficient, average household income, loan interest rate, money supply, and the like, and the corresponding prediction result is the house price predicted from the target data.
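A brief sketch of this final step is given below; the gradient-boosting operator, the feature layout and all names are assumptions used only for illustration, since the patent does not fix the machine learning operator.

```python
from sklearn.ensemble import GradientBoostingRegressor

def train_and_predict(target_X, target_y, best_params, new_X):
    """Retrain with the selected target hyper-parameter set, then
    predict once the prediction trigger condition is met."""
    # best_params holds the target hyper-parameter set found by the
    # optimization, e.g. {"learning_rate": 0.05, "n_estimators": 300}.
    model = GradientBoostingRegressor(**best_params)
    model.fit(target_X, target_y)
    # Rows of new_X carry the target data (GDP, Engel coefficient,
    # household income, loan rate, money supply, ...).
    return model.predict(new_X)
```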
In the above embodiment, the target prediction model corresponding to the target problem is obtained by training according to the selected target hyper-parameter set, and the target data is predicted based on the target prediction model, so that the accuracy of the prediction result can be improved.
As shown in FIG. 3, in one embodiment, an adaptive hyper-parameter optimization method is provided, which mainly comprises acquiring empirical data, accelerating hyper-parameter optimization based on the empirical data, and adaptive processing based on the empirical models. The server first acquires empirical data, trains empirical models from it, and preliminarily screens the hyper-parameter sets through the trained empirical models, so that the optimal hyper-parameter set can be found more quickly; experience is thereby used to improve the hyper-parameter search efficiency, and thus the optimization efficiency. Further, because an empirical model that differs greatly from the target problem may contribute little to optimizing it, the server addresses the correlation between the target problem and the empirical models through adaptive processing, matching the empirical models to the problem and thereby improving the optimization effect.
FIG. 4 is a flow diagram of a hyper-parameter optimization method in one embodiment. The server takes as input a plurality of empirical models and the training and test sample sets corresponding to the target problem. The server initializes the hyper-parameter optimization, for example initializing the weights corresponding to the empirical models and the reference evaluation value corresponding to the target problem. After initialization, the server pre-samples a hyperparameter set to be scored and predicts, through the plurality of empirical models and according to the weights, the hyperparameter set score of that set. After predicting the set score, the server judges whether the number of pre-sampled sets has reached the preset number. If not, the server returns to the pre-sampling step and continues. If so, the server determines the evaluation value of the hyperparameter set with the maximum set score, based on a preset machine learning operator and the training and test sample sets corresponding to the target problem.
The server compares the determined evaluation value with the reference evaluation value currently corresponding to the target problem, which can be understood as the current optimal value. When the evaluation value is greater than or equal to the current optimal value, the server updates the current optimal value with the evaluation value and updates the weight of each empirical model according to the multiple sub-scores of the hyperparameter set with the maximum set score. Otherwise, the server directly updates the weight of each empirical model according to those sub-scores. After updating the weights, the server judges whether the iteration budget is exhausted; if so, the hyperparameter set corresponding to the current optimal value is determined as the optimal hyperparameter set and output, and if not, the server returns to the pre-sampling step and continues.
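Putting the pieces together, a condensed sketch of this loop might read as below. It reuses the hypothetical update_weights sketched earlier; sample and evaluate are placeholder hooks for the patent's pre-sampling and train-and-test steps, and using the evaluation value as the target problem label is an assumption.

```python
import numpy as np

def optimize(models, weights, sample, evaluate, n_presample=20, n_iters=50):
    """Condensed sketch of the Fig. 4 flow: pre-sample candidate sets,
    score each as a weighted sum of the empirical models' sub-scores,
    fully evaluate only the top-scoring set, then update the current
    optimal value and the model weights."""
    best_eval, best_set = float("-inf"), None
    for _ in range(n_iters):
        batch = [sample() for _ in range(n_presample)]
        # sub[i, j] is empirical model j's sub-score for sampled set i.
        sub = np.array([[m.predict([s])[0] for m in models] for s in batch])
        set_scores = sub @ np.asarray(weights, dtype=float)
        top = int(np.argmax(set_scores))
        evaluation = evaluate(batch[top])
        if evaluation >= best_eval:  # advance the current optimal value
            best_eval, best_set = evaluation, batch[top]
        # The weights adapt from the top set's sub-scores either way.
        weights = update_weights(weights, sub[top], target_label=evaluation)
    return best_set, best_eval
```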
In this embodiment, adaptive processing of the empirical models and adaptive optimization of the hyper-parameters are realized by dynamically adjusting the empirical-model weights; the empirical data are fully utilized, and the optimization effect and efficiency can be improved even when training samples are scarce. Continuously increasing the weight of relevant experience improves the optimization effect, and continuously decreasing the weight of irrelevant experience improves the robustness of the system.
In one embodiment, Newton's iteration method is applied in the adaptive model training process, but the training is not limited to Newton's method; for example, a neural network may also be used, which is not limited herein.
In one embodiment, among existing hyper-parameter optimization methods, grid search must traverse every hyper-parameter dimension, Bayesian optimization must build a proxy function over the search space to select sampling points, and evolutionary algorithms must combine a large number of samples with simulated biological evolution, all of which suffer from low optimization efficiency; random search, which optimizes by randomly selecting hyper-parameter sets in the search space, suffers from high randomness and low accuracy. The hyper-parameter optimization method provided by the application pre-trains an empirical model on empirical data and preliminarily screens the hyper-parameter sets through it, saving the evaluation time of invalid hyper-parameter sets and improving optimization efficiency. Through adaptive adjustment of the empirical models, the optimization proportion of relevant experience is increased and that of irrelevant experience is reduced, so the optimization accuracy can be improved.
It should be understood that although the steps in the flowcharts of fig. 2 and 4 are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in fig. 2 and 4 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments, and their execution order is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, there is provided a hyper-parametric optimization apparatus 500, comprising: an acquisition module 502, a screening module 504, a training module 506, an evaluation module 508, an update module 510, and a determination module 512, wherein:
an obtaining module 502, configured to obtain a preset number of sets of hyperparameters, and predict, through a trained empirical model, a hyperparameter set score of each hyperparameter set respectively;
a screening module 504, configured to screen candidate hyperparameter sets from the obtained hyperparameter sets according to the hyperparameter set scores;
a training module 506, configured to perform model training according to the candidate hyperparameter set and a preset machine learning operator and according to a training sample set corresponding to the target problem, to obtain a trained target problem prediction model;
the evaluation module 508 is configured to test the target problem prediction model according to the test sample set corresponding to the target problem, so as to obtain an evaluation value corresponding to the candidate super parameter set;
an updating module 510, configured to update a reference evaluation value currently corresponding to the target problem according to the evaluation value, and instruct the obtaining module 502 to continue to perform the steps of obtaining a preset number of sets of hyperparameters, and respectively predicting, by using a trained empirical model, a hyperparameter set score of each hyperparameter set until an iteration stop condition is met;
the determining module 512 is configured to determine a candidate super parameter set corresponding to the reference evaluation value corresponding to the target problem as a target super parameter set.
In one embodiment, there are a plurality of trained empirical models; the acquisition module is also used for acquiring the super parameter group; respectively predicting the super parameter set through a plurality of trained empirical models to obtain a plurality of super parameter set sub-scores of the super parameter set; and obtaining the super parameter group scores of the super parameter groups according to the plurality of super parameter group sub scores, returning to the step of obtaining the super parameter groups, and continuing to execute until the preset number of super parameter group scores are obtained.
In one embodiment, the obtaining module is further configured to perform weighted summation on the multiple hyperparameter set sub-scores according to the weights corresponding to the respective empirical models to obtain the hyperparameter set scores of the hyperparameter sets; and the updating module is further configured to dynamically update the weight corresponding to each empirical model according to the multiple hyperparameter set sub-scores corresponding to the candidate hyperparameter set.
In one embodiment, the updating module is further configured to obtain multiple hyperparameter set sub-scores corresponding to the candidate hyperparameter set; determine a target problem label corresponding to each empirical model according to the obtained sub-scores; and dynamically update the weight of the corresponding empirical model according to the hyperparameter set sub-score and the target problem label corresponding to each empirical model.
In one embodiment, the updating module is further configured to dynamically determine the corresponding target weight according to a preset mapping relationship, based on the hyperparameter set sub-score, the target problem label and the weight corresponding to each empirical model; and normalize the target weights to obtain the updated weight corresponding to each empirical model.
In one embodiment, the screening module 504 is further configured to screen a maximum hyperparameter set score from the predicted hyperparameter set scores; determining the hyper-parameter group corresponding to the screened hyper-parameter group score as a candidate hyper-parameter group; the updating module 510 is further configured to update the reference evaluation value as the evaluation value when the evaluation value is greater than or equal to the reference evaluation value currently corresponding to the target problem.
In one embodiment, the training module 506 is further configured to train the empirical model, specifically including: acquiring an experience data set; the experience data set comprises an experience hyperparameter set and a corresponding experience hyperparameter set score; and taking the experience hyperparameter group as an input feature, and taking the corresponding experience hyperparameter group score as an expected output feature to carry out model training to obtain a trained experience model.
In one embodiment, the training module 506 is further configured to obtain a set of empirical hyper-parameters; according to the experience hyper-parameter set and a preset machine learning operator, performing model training according to a training sample set corresponding to a preset problem to obtain a trained preset problem prediction model; testing the preset problem prediction model according to the test sample set corresponding to the preset problem to obtain experience hyperparameter set scores corresponding to the experience hyperparameter sets, and returning to the step of obtaining the experience hyperparameter sets to continue executing until the experience hyperparameter set scores of the target quantity are obtained; and obtaining an experience data set according to the experience hyperparameter set and the corresponding experience hyperparameter set score.
In one embodiment, the above-mentioned hyper-parameter optimization apparatus 500 further includes: the prediction module is used for carrying out model training according to a target sample set corresponding to a target problem according to the target hyper-parameter set and the machine learning operator to obtain a trained target prediction model; when a prediction trigger condition corresponding to a target problem is met, target data to be predicted are obtained; and inputting the target data into a target prediction model for prediction to obtain a corresponding prediction result.
For specific limitations of the hyper-parameter optimization apparatus, reference may be made to the above limitations of the hyper-parameter optimization method, which are not repeated here. Each module in the above hyper-parameter optimization apparatus may be implemented wholly or partly by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing the super parameter group, the trained empirical model and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a hyper-parametric optimization method.
Those skilled in the art will appreciate that the structure shown in fig. 6 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the hyper-parameter optimization method in the above embodiments when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for hyper-parameter optimization in the respective embodiments described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that contains no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and while their description is specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A method of hyper-parameter optimization, the method comprising:
acquiring a preset number of super parameter sets, and respectively predicting the super parameter set scores of each super parameter set through a trained empirical model;
screening candidate hyperparameter groups from the acquired hyperparameter groups according to the hyperparameter group scores;
according to the candidate hyperparameter set and a preset machine learning operator, performing model training according to a training sample set corresponding to the target problem to obtain a trained target problem prediction model;
testing the target problem prediction model according to the test sample set corresponding to the target problem to obtain an evaluation value corresponding to the candidate super parameter set;
updating the reference evaluation value currently corresponding to the target problem according to the evaluation value, returning to the step of acquiring the preset number of super parameter sets, and respectively predicting the super parameter set scoring of each super parameter set through a trained empirical model to continue to execute until the iteration stop condition is met;
and determining the candidate hyperparameter set corresponding to the reference evaluation value corresponding to the target problem as a target hyperparameter set.
2. The method of claim 1, wherein there are a plurality of trained empirical models; the acquiring of the preset number of super parameter sets and the predicting of the super parameter set score of each super parameter set through the trained empirical model respectively include:
acquiring a super parameter set;
predicting the super parameter set through a plurality of trained empirical models respectively to obtain a plurality of super parameter set scores of the super parameter set;
and obtaining the super parameter group scores of the super parameter groups according to the plurality of super parameter group sub scores, and returning to the step of obtaining the super parameter groups to continue execution until a preset number of super parameter group scores are obtained.
3. The method of claim 2, wherein the deriving the hyperparameter set scores for the hyperparameter set from the plurality of hyperparameter set sub-scores comprises:
weighting and summing the multiple hyperparameter group scores according to weights corresponding to the corresponding empirical models to obtain the hyperparameter group scores of the hyperparameter groups;
the method further comprises the following steps:
and dynamically updating the weight corresponding to each empirical model according to a plurality of hyperparameter group scores corresponding to the candidate hyperparameter groups.
4. The method of claim 3, wherein dynamically updating the weight for each empirical model based on a plurality of hyperparameter group scores corresponding to the candidate hyperparameter groups comprises:
obtaining a plurality of super parameter group scores corresponding to the candidate super parameter groups;
determining a target problem label corresponding to each empirical model according to the obtained hyperparameter set sub-scores;
and dynamically updating the weight corresponding to the corresponding empirical model according to the hyperparameter set sub-score and the target problem label corresponding to each empirical model.
5. The method of claim 4, wherein dynamically updating the weights corresponding to the respective empirical models according to the hyperparameter set sub-score and the target problem label corresponding to each empirical model comprises:
dynamically determining the corresponding target weight according to a preset mapping relationship, based on the hyperparameter set sub-score, the target problem label and the weight corresponding to each empirical model;
and carrying out normalization processing on the target weight to obtain an updated weight corresponding to each empirical model.
6. The method of claim 1, wherein the screening the obtained set of hyperparameters for candidate hyperparameters based on the hyperparameter set score comprises:
screening the maximum hyperparameter group score from the predicted hyperparameter group scores;
determining the hyper-parameter group corresponding to the screened hyper-parameter group score as a candidate hyper-parameter group;
the updating the reference evaluation value currently corresponding to the target problem according to the evaluation value includes:
updating the reference evaluation value as the evaluation value when the evaluation value is greater than or equal to a reference evaluation value currently corresponding to the target problem.
7. The method according to any one of claims 1 to 6, wherein the step of training the empirical model comprises:
acquiring an experience data set; the empirical data set includes a set of empirical hyperparameters and corresponding scores for the set of empirical hyperparameters;
and taking the experience hyperparameter group as an input feature, and taking the corresponding experience hyperparameter group score as an expected output feature to carry out model training to obtain a trained experience model.
8. The method of claim 7, wherein said obtaining an empirical data set comprises:
acquiring an experience super parameter set;
according to the experience hyper-parameter set and a preset machine learning operator, performing model training according to a training sample set corresponding to a preset problem to obtain a trained preset problem prediction model;
testing the preset problem prediction model according to the test sample set corresponding to the preset problem to obtain experience hyperparameter group scores corresponding to the experience hyperparameter groups, and returning to the step of obtaining the experience hyperparameter groups to continue executing until the experience hyperparameter group scores of the target quantity are obtained;
and obtaining an experience data set according to the experience hyperparameter set and the corresponding experience hyperparameter set grading.
9. The method according to any one of claims 1 to 6, wherein after determining the candidate hyperparameter set corresponding to the reference evaluation value corresponding to the target problem as the target hyperparameter set, the method further comprises:
according to the target hyper-parameter set and the machine learning operator, performing model training according to a target sample set corresponding to the target problem to obtain a trained target prediction model;
when a prediction trigger condition corresponding to the target problem is met, target data to be predicted are obtained;
and inputting the target data into the target prediction model for prediction to obtain a corresponding prediction result.
10. A hyper-parametric optimization apparatus, the apparatus comprising:
the acquisition module is used for acquiring a preset number of super parameter groups and respectively predicting the super parameter group scores of each super parameter group through a trained empirical model;
the screening module is used for screening candidate hyperparameter groups from the acquired hyperparameter groups according to the hyperparameter group scores;
the training module is used for carrying out model training according to the candidate hyper-parameter set and a preset machine learning operator and a training sample set corresponding to the target problem to obtain a trained target problem prediction model;
the evaluation module is used for testing the target problem prediction model according to the test sample set corresponding to the target problem to obtain an evaluation value corresponding to the candidate super parameter set;
the updating module is used for updating the reference evaluation value currently corresponding to the target problem according to the evaluation value, instructing the acquisition module to continue to acquire the preset number of super parameter sets, and respectively predicting the super parameter set grading step of each super parameter set through a trained empirical model until the iteration stop condition is met;
and the determining module is used for determining the candidate super parameter group corresponding to the reference evaluation value corresponding to the target problem as a target super parameter group.
11. The apparatus of claim 10, wherein there are a plurality of trained empirical models; the acquisition module is further used for acquiring a super parameter set; predicting the super parameter set through a plurality of trained empirical models respectively to obtain a plurality of super parameter set scores of the super parameter set; and obtaining the super parameter group scores of the super parameter groups according to the plurality of super parameter group sub scores, and returning to the step of obtaining the super parameter groups to continue execution until a preset number of super parameter group scores are obtained.
12. The apparatus of claim 11, wherein the obtaining module is further configured to perform weighted summation on the multiple hyperparameter group sub-scores according to the weights corresponding to the respective empirical models, so as to obtain the hyperparameter group scores of the hyperparameter groups; the updating module is further configured to dynamically update the weight corresponding to each empirical model according to the multiple hyperparameter group sub-scores corresponding to the candidate hyperparameter groups.
13. The apparatus of claim 10, further comprising:
the prediction module is used for carrying out model training according to the target sample set corresponding to the target problem according to the target hyper-parameter set and the machine learning operator to obtain a trained target prediction model; when a prediction trigger condition corresponding to the target problem is met, target data to be predicted are obtained; and inputting the target data into the target prediction model for prediction to obtain a corresponding prediction result.
14. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 9 when executing the computer program.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
CN201911110640.7A 2019-11-14 2019-11-14 Hyper-parameter optimization method, device, computer equipment and storage medium Pending CN111105040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911110640.7A CN111105040A (en) 2019-11-14 2019-11-14 Hyper-parameter optimization method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111105040A (2020-05-05)

Family

ID=70420676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911110640.7A Pending CN111105040A (en) 2019-11-14 2019-11-14 Hyper-parameter optimization method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111105040A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340244A (en) * 2020-05-15 2020-06-26 支付宝(杭州)信息技术有限公司 Prediction method, training method, device, server and medium
CN113777965B (en) * 2020-05-21 2023-07-18 广东博智林机器人有限公司 Spray quality control method, spray quality control device, computer equipment and storage medium
CN113777965A (en) * 2020-05-21 2021-12-10 广东博智林机器人有限公司 Spraying quality control method and device, computer equipment and storage medium
CN111539536A (en) * 2020-06-19 2020-08-14 支付宝(杭州)信息技术有限公司 Method and device for evaluating service model hyper-parameters
CN111539536B (en) * 2020-06-19 2020-10-23 支付宝(杭州)信息技术有限公司 Method and device for evaluating service model hyper-parameters
US11823076B2 (en) 2020-07-27 2023-11-21 International Business Machines Corporation Tuning classification hyperparameters
WO2022028323A1 (en) * 2020-08-07 2022-02-10 华为技术有限公司 Classification model training method, hyper-parameter searching method, and device
CN112949850A (en) * 2021-01-29 2021-06-11 北京字节跳动网络技术有限公司 Hyper-parameter determination method, device, deep reinforcement learning framework, medium and equipment
CN112949850B (en) * 2021-01-29 2024-02-06 北京字节跳动网络技术有限公司 Super-parameter determination method, device, deep reinforcement learning framework, medium and equipment
CN113139624A (en) * 2021-05-18 2021-07-20 南京大学 Network user classification method based on machine learning
CN113673174A (en) * 2021-09-08 2021-11-19 中国平安人寿保险股份有限公司 Hyper-parameter determination method, device, equipment and storage medium
CN113673174B (en) * 2021-09-08 2023-07-25 中国平安人寿保险股份有限公司 Super parameter determination method, device, equipment and storage medium
WO2023123385A1 (en) * 2021-12-31 2023-07-06 深圳晶泰科技有限公司 Candidate molecule parameter optimization method and apparatus, target molecule design method and apparatus, device, and storage medium
CN114966413A (en) * 2022-05-27 2022-08-30 深圳先进技术研究院 Method for predicting state of charge of energy storage battery pack
WO2023226358A1 (en) * 2022-05-27 2023-11-30 深圳先进技术研究院 Prediction method for state of charge of energy storage battery pack


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination