CN111126616A - Method, device and equipment for realizing super-parameter selection - Google Patents

Method, device and equipment for realizing super-parameter selection

Info

Publication number
CN111126616A
Authority
CN
China
Prior art keywords
data sets
machine learning
target data
training
learning model
Prior art date
Legal status
Pending
Application number
CN201911151006.8A
Other languages
Chinese (zh)
Inventor
侯广健
Current Assignee
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date
Filing date
Publication date
Application filed by Neusoft Corp
Priority to CN201911151006.8A
Publication of CN111126616A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Feedback Control In General (AREA)

Abstract

The embodiment of the application discloses a method, a device and equipment for realizing hyper-parameter selection. After K training verification data sets are obtained, a group of hyper-parameters is selected to establish a current machine learning model, which is determined as the current optimal model. Then another group of hyper-parameters is reselected to establish a new current machine learning model; based on comparing the verification results of the current machine learning model and the current optimal model on the target data sets, the current optimal model is re-determined and the number of groups of target data sets is increased. The processes of establishing a current machine learning model, re-determining the current optimal model by comparison, and increasing the number of target data sets are repeated until a preset stopping condition is met, at which point the hyper-parameters corresponding to the finally obtained current optimal model can be determined as the target hyper-parameters. In this way, the efficiency of the hyper-parameter selection process is improved, the time and memory consumed by hyper-parameter selection are reduced, and the cost of selecting model hyper-parameters is lowered.

Description

Method, device and equipment for realizing super-parameter selection
Technical Field
The present application relates to the technical field of automated machine learning, and in particular to a method, device and equipment for realizing hyper-parameter selection.
Background
Automated machine learning technology is developing rapidly. In this field, the optimal hyper-parameters for a machine learning model (e.g., a neural network model) can be searched for automatically, a process that can involve hyper-parameter space definition, hyper-parameter space search, model training, model evaluation, and so on. The complexity of the model evaluation step directly affects the efficiency of the hyper-parameter selection process.
A common model evaluation method is cross-validation. Because the model evaluation process based on cross-validation is complex, evaluating the machine learning model corresponding to each group of hyper-parameters with cross-validation is inefficient, and selecting hyper-parameters based on those evaluation results is therefore also inefficient.
Disclosure of Invention
In view of this, embodiments of the present application provide a method, an apparatus, and a device for implementing hyper-parameter selection, so as to solve the technical problem in the prior art that the hyper-parameter selection process is inefficient.
In order to solve the above problem, the technical solution provided by the embodiment of the present application is as follows:
a method of implementing hyper-parameter selection, the method comprising:
acquiring K training verification data sets, wherein K is a positive integer;
selecting a group of hyper-parameters to establish a current machine learning model and determining the current machine learning model as a current optimal model;
reselecting a group of hyper-parameters to establish a current machine learning model; if the verification result generated by training the current machine learning model with any one group of target data sets is inferior to the verification result generated by training the current optimal model with the same group of target data sets, eliminating the current machine learning model, and executing the step of reselecting a group of hyper-parameters to establish a current machine learning model and the subsequent steps; wherein each group of the target data sets comprises m training verification data sets, m is a positive integer smaller than K, and the initial number of groups of target data sets is one;
if none of the verification results generated by training the current machine learning model with each group of target data sets is inferior to the verification result generated by training the current optimal model with the same group of target data sets, determining the current machine learning model as the current optimal model, adding a group of target data sets when the number of groups of target data sets has not reached the maximum number or keeping the number unchanged when it has reached the maximum number, and executing the step of reselecting a group of hyper-parameters to establish a current machine learning model and the subsequent steps;
and when a preset stopping condition is met, determining the hyper-parameter corresponding to the current optimal model as a target hyper-parameter.
In one possible implementation, after reselecting a set of hyper-parameters to build the current machine learning model, the method further comprises:
selecting a group of target data sets, training the current machine learning model by using the group of target data sets and generating a verification result;
judging whether the verification result generated by training the current machine learning model with the group of target data sets is inferior to the verification result generated by training the current optimal model with the group of target data sets;
if so, determining that the verification result generated by training the current machine learning model with one group of target data sets is inferior to the verification result generated by training the current optimal model with the same group of target data sets;
if not, when an unused group of target data sets exists, executing the step of selecting a group of target data sets and the subsequent steps, and when all groups of target data sets have been used, determining that none of the verification results generated by training the current machine learning model with each group of target data sets is inferior to the verification result generated by training the current optimal model with the same group of target data sets.
In one possible embodiment, when m is greater than 1, the training the current machine learning model with the set of target data sets and generating the verification result includes:
training the current machine learning model by using each training verification data set and generating a verification result;
calculating an average value of the generated m verification results as a verification result generated by training the current machine learning model by using the group of target data sets;
when m is equal to 1, training the current machine learning model with the set of target data sets and generating a verification result, including:
training the current machine learning model by using each training verification data set and generating a verification result;
and taking the generated verification result as the verification result generated by training the current machine learning model by using the set of target data sets.
In one possible embodiment, the training the current machine learning model with each training verification data set and generating the verification result includes:
training the current machine learning model by using training data in a training verification data set;
and inputting the verification data in the training verification data set into the trained current machine learning model and outputting a verification result.
In a possible implementation, each group of the target data sets includes m training verification data sets that are different from one another, different groups of target data sets have no training verification data set in common, and the maximum number of groups of target data sets is K divided by m, rounded down.
In one possible embodiment, the verification result is obtained according to an evaluation result of at least one of evaluation indexes of the machine learning model.
In a possible embodiment, the preset stop condition is that the running time reaches a first threshold value, or the number of times of performing the step of reselecting the set of hyper-parameters to establish the current machine learning model and the subsequent steps reaches a second threshold value.
An apparatus to enable hyper-parameter selection, the apparatus comprising:
the data acquisition unit is used for acquiring K training verification data sets, wherein K is a positive integer;
the model establishing unit is used for selecting a group of hyper-parameters to establish a current machine learning model and determining the current machine learning model as a current optimal model;
the parameter selection unit is used for reselecting a group of hyper-parameters to establish a current machine learning model; if the verification result generated by training the current machine learning model with any one group of target data sets is inferior to the verification result generated by training the current optimal model with the same group of target data sets, eliminating the current machine learning model, and executing the reselection of a group of hyper-parameters to establish a current machine learning model and the subsequent steps, wherein each group of the target data sets comprises m training verification data sets, m is a positive integer smaller than K, and the initial number of groups of target data sets is one; and if none of the verification results generated by training the current machine learning model with each group of target data sets is inferior to the verification result generated by training the current optimal model with the same group of target data sets, determining the current machine learning model as the current optimal model, adding a group of target data sets when the number of groups has not reached the maximum number or keeping the number unchanged when it has, and executing the reselection of a group of hyper-parameters to establish a current machine learning model and the subsequent steps;
and the parameter determining unit is used for determining the hyper-parameter corresponding to the current optimal model as a target hyper-parameter when a preset stopping condition is met.
A computer-readable storage medium, having stored therein instructions, which, when executed on a terminal device, cause the terminal device to execute any implementation of a method for implementing hyper-parameter selection provided in an embodiment of the present application.
An apparatus for enabling hyper-parameter selection, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements any implementation of the method for realizing hyper-parameter selection provided in the embodiments of the present application.
Therefore, the embodiment of the application has the following beneficial effects:
in the embodiments of the present application, after K training verification data sets are obtained, a group of hyper-parameters is first selected to establish a current machine learning model, and the current machine learning model is determined as the current optimal model; then, another group of hyper-parameters is reselected to establish a new current machine learning model, and the current optimal model is re-determined and the number of groups of target data sets is increased based on comparing the verification results of the current machine learning model and the current optimal model on the target data sets; the processes of establishing a current machine learning model, re-determining the current optimal model by comparison, and increasing the number of target data sets are repeated until a preset stopping condition is met, at which point the hyper-parameters corresponding to the finally obtained current optimal model can be determined as the target hyper-parameters.
Because the current optimal model is never inferior to any machine learning model that has been trained and verified, the target hyper-parameters determined from the finally determined current optimal model are not inferior to the other candidate hyper-parameters, realizing selection of the optimal hyper-parameters. In addition, poorly performing models are trained and verified with only a small number of target data sets, so they can be eliminated after only a little training and verification. This reduces the number of training and verification runs spent on poor models, improves the efficiency of the hyper-parameter selection process, reduces its time and memory consumption, and lowers the cost of selecting model hyper-parameters.
Drawings
Fig. 1 is a schematic diagram of the cross-validation method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a method for implementing hyper-parameter selection according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the acquisition process of K training verification data sets according to an embodiment of the present application;
Fig. 4 is a flowchart of another method for implementing hyper-parameter selection according to an embodiment of the present application;
Fig. 5 is a structural diagram of an apparatus for implementing hyper-parameter selection according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, embodiments accompanying the drawings are described in detail below.
First, the cross-validation method is described with reference to fig. 1. In the embodiments of the present application, a model refers to a machine learning model.
As shown in fig. 1, the process of performing the current model evaluation by using the cross-validation method may specifically be:
the first step is as follows: the training validation data used to perform the model training validation was cut into X-folds (X-flods).
The second step is that: and selecting 1-fold (1-flod) in the X-flods as verification data, selecting the rest X-1-fold ((X-1) -flods) in the X-flods as training data, and training and verifying the current model by using the verification data and the training data to obtain a verification result of the current model.
The third step: the second step is executed X times in a loop, and the selected verification data is different each time the second step is executed.
The fourth step: and taking the average value of the verification results of the X current models as the evaluation result of the current models.
It should be noted that the current model may be any one of model 1, model 2, … …, and model n in fig. 1, where each model corresponds to a set of hyper-parameters, and n is a positive integer. That is, X times of training and verification processes are required for each model, and the evaluation result of the model is obtained. And then obtaining a group of hyper-parameters corresponding to a certain model according to the evaluation results of the n models as a result of hyper-parameter selection.
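For reference, the following is a minimal sketch of the cross-validation-based evaluation described above (the baseline procedure this application improves upon); build_model and the model's fit/score methods are hypothetical placeholders for whatever training framework is actually used:

```python
import numpy as np

def cross_validate(build_model, hyperparams, folds):
    """Evaluate one group of hyper-parameters with X-fold cross-validation
    (the first through fourth steps above)."""
    scores = []
    for i in range(len(folds)):                 # third step: loop X times
        valid = folds[i]                        # second step: 1 fold as verification data
        train = [f for j, f in enumerate(folds) if j != i]  # remaining X-1 folds
        model = build_model(hyperparams)        # fresh model per round
        model.fit(train)
        scores.append(model.score(valid))
    return float(np.mean(scores))               # fourth step: average the X results

# Selecting among n candidate groups then requires n full evaluations:
# best = max(candidates, key=lambda hp: cross_validate(build_model, hp, folds))
```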
In researching the cross-validation method, the inventor found that when cross-validation is used to evaluate the machine learning model corresponding to each group of hyper-parameters in order to select hyper-parameters, a number of models with different groups of hyper-parameters must each be trained and verified. For example, as shown in fig. 1, to select the optimal hyper-parameters from the 1st group of hyper-parameters, the 2nd group, ..., and the nth group, the model having the 1st group (model 1) must undergo X rounds of training and verification, the model having the 2nd group (model 2) must undergo X rounds, ..., and the model having the nth group (model n) must undergo X rounds, all using the cross-validation method described above. Since the first through fourth steps must be executed for every model, hyper-parameter selection based on cross-validation is cumbersome: the selection process is inefficient, takes a long time, consumes a large amount of memory, and makes model hyper-parameter selection costly.
Based on this, the embodiments of the present application provide a method for implementing hyper-parameter selection, which specifically includes: first, K training verification data sets are obtained, so that each target data set can be generated from them; second, a group of hyper-parameters is selected to establish a current machine learning model, which is determined as the current optimal model; then, another group of hyper-parameters is reselected to establish a new current machine learning model, and the current optimal model is re-determined and the number of groups of target data sets is increased based on comparing the verification results of the current machine learning model and the current optimal model on the target data sets; the processes of establishing a current machine learning model, re-determining the current optimal model by comparison, and increasing the number of target data sets are repeated until a preset stopping condition is met, at which point the hyper-parameters corresponding to the finally obtained current optimal model can be determined as the target hyper-parameters.
For the convenience of understanding and explaining the method for implementing hyper-parameter selection provided by the embodiments of the present application, the relevant contents of the "training verification data set" and the "target data set" referred to in the embodiments of the present application will be described first.
The training verification data sets are used to train and verify the machine learning model. Each training verification data set comprises one fold of verification data and at least one fold of training data, and the training verification data sets differ from one another; in particular, the verification data in each training verification data set is different. The training data in the different training verification data sets may be completely different or may partially overlap.
Each group of target data sets is generated from the K training verification data sets, and each group may include m training verification data sets, where m is a positive integer smaller than K. In addition, the m training verification data sets within a group are different from one another, different groups of target data sets have no training verification data set in common, and the maximum number of groups of target data sets is K divided by m, rounded down.
As an example, assume that K is 30, and the K training verification data sets are respectively the 1 st training verification data set to the 30 th training verification data set. Based on the above contents and the above assumptions, when m is 3, the maximum number of target data sets is 10, each group of target data sets may include 3 different training verification data sets, and each training verification data set included in different groups of target data sets is different. For example, the 1 st set of target data sets includes the 1 st through 3 rd training verification data sets, the 2 nd set of target data sets includes the 4 th through 6 th training verification data sets, … …, and the 10 th set of target data sets includes the 28 th through 30 th training verification data sets.
It should be noted that, in the embodiment of the present application, each group of target data sets may be generated in advance according to K training verification data sets, or may be generated one by one along with the determination process of the current optimal model (for example, a group of target data sets is generated each time the current optimal model is determined again until the maximum number of the target data sets is reached), which is not specifically limited in the embodiment of the present application. In addition, the number m of training verification data sets included in the target data set may be predetermined, and may be determined according to application scenarios (e.g., stability requirements for model evaluation and efficiency requirements for the hyper-parameter selection process).
In addition, in the embodiment of the present application, the number of target data sets may be gradually increased along with the determination process of the current optimal model. Based on this, the initial number of target data sets may be set as one set so that the number of target data sets is gradually increased with subsequent re-determination of the current optimal model, that is, one set of target data sets is increased every time the current optimal model is re-determined.
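As an illustrative sketch only, the grouping described above can be realized as follows, where tv_sets stands for the list of K training verification data sets and consecutive slicing (matching the example with K = 30 and m = 3) is just one possible grouping rule:

```python
def make_target_groups(tv_sets, m):
    """Partition K training verification data sets into disjoint groups of m.

    The maximum number of groups is K divided by m, rounded down; any
    leftover sets (when K is not divisible by m) are simply not grouped.
    """
    max_groups = len(tv_sets) // m
    return [tv_sets[i * m:(i + 1) * m] for i in range(max_groups)]

# K = 30, m = 3 -> 10 groups: groups[0] holds the 1st-3rd sets, and so on.
```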
To facilitate understanding of the embodiments of the present application, the method for implementing hyper-parameter selection provided by the embodiments of the present application is described below with reference to the accompanying drawings.
Method embodiment one
Referring to fig. 2, the flowchart of a method for implementing hyper-parameter selection according to an embodiment of the present application is shown. As shown in fig. 2, the method may specifically include steps S201 to S207:
S201: K training verification data sets are obtained, wherein K is a positive integer.
The acquisition process of the K training verification data sets is not limited in the embodiment of the application. For ease of understanding and explanation, a process for acquiring K training verification data sets is described below in conjunction with fig. 3.
As shown in fig. 3, step S201 may specifically be: first, the training verification data is split into K folds, yielding the 1st fold, the 2nd fold, ..., and the Kth fold; then, the 1st training verification data set is formed by taking the Kth fold as verification data and the (K-1)th through 1st folds as training data; the 2nd training verification data set is formed by taking the (K-1)th fold as verification data and the remaining folds as training data; ...; and the Kth training verification data set is formed by taking the 1st fold as verification data and the Kth through 2nd folds as training data. In this way, the verification data in the 1st training verification data set, the 2nd training verification data set, ..., and the Kth training verification data set are all different.
The training verification data is data that can be used to train and verify the machine learning model, and the training verification data can be set or provided according to an application scenario.
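A minimal sketch of this acquisition step, assuming the training verification data is an indexable array and using a simple equal-size split (the embodiment does not prescribe a particular splitting rule):

```python
import numpy as np

def build_tv_sets(data, K):
    """Split data into K folds and form K training verification data sets,
    each holding one distinct fold as verification data and the remaining
    folds as training data (the rotation described for fig. 3)."""
    folds = np.array_split(data, K)
    tv_sets = []
    for i in range(K - 1, -1, -1):          # Kth fold first, 1st fold last
        valid = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        tv_sets.append({"train": train, "valid": valid})
    return tv_sets                          # tv_sets[0] is the 1st set
```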
S202: and selecting a group of hyper-parameters to establish a current machine learning model and determining the current machine learning model as a current optimal model.
In the embodiments of the present application, hyper-parameter selection means choosing, from multiple groups of hyper-parameters, the group that makes the machine learning model perform best. To carry out this selection, a group of hyper-parameters is first chosen from the multiple groups to establish the current machine learning model, which is taken as the current optimal model; subsequently, the current optimal model is re-determined by comparing each newly established current machine learning model against it, so that the current optimal model is at all times not inferior to (i.e., better than or equal to) every machine learning model established so far.
S203: and reselecting a group of hyper-parameters to establish the current machine learning model.
In the embodiment of the present application, the "reselecting a set of hyper-parameters to build the current machine learning model" specifically refers to selecting a set of hyper-parameters from a plurality of sets of unselected hyper-parameters to build the current machine learning model, so that the current machine learning model is different from the built machine learning models. The unselected hyper-parameters are hyper-parameters that have not been used to build the machine learning model at the current time. For ease of understanding and explanation of step S203, the following description is made in conjunction with two examples.
As a first example, if hyper-parameter selection is performed among the 1st group of hyper-parameters, the 2nd group, ..., and the Tth group, and the current optimal model was established with the 1st group, step S203 may specifically be: selecting one group from the 2nd through Tth groups to establish the current machine learning model (e.g., selecting the 5th group), so that whether to re-determine the current optimal model can subsequently be decided by comparing the current machine learning model with the current optimal model.
As a second example, if hyper-parameter selection is performed among the 1st through Tth groups of hyper-parameters, the 1st and 5th groups have already been used to establish machine learning models, and the current optimal model was established with the 1st group, step S203 may specifically be: selecting one group from the 2nd through 4th groups or the 6th through Tth groups to establish the current machine learning model (e.g., selecting the 8th group), so that whether to re-determine the current optimal model can subsequently be decided by comparing the two models.
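The two examples above amount to sampling from the not-yet-selected groups. A minimal sketch, assuming the candidate groups are simply indexed in a list; random choice is only one possible search rule:

```python
import random

def reselect_hyperparams(candidates, tried):
    """Pick one not-yet-selected group of hyper-parameters (step S203).

    candidates is the list of all T groups; tried is the set of indices
    already used to establish models. Grid or sequential order would
    work equally well as a selection rule.
    """
    unused = [i for i in range(len(candidates)) if i not in tried]
    if not unused:
        return None                      # every group has been tried
    i = random.choice(unused)
    tried.add(i)
    return candidates[i]
```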
S204: if the current machine learning model is trained using any one set of target data sets and the generated verification result is inferior to the current optimal model, the current machine learning model is eliminated and step S206 is executed.
In the embodiment of the application, when it is determined that the current machine learning model is trained by using any one group of target data sets and the generated verification result is inferior to the verification result generated by the current optimal model trained by using the same target data set, it is determined that the current machine learning model is inferior to the current optimal model, and at this time, the current machine learning model needs to be eliminated so as to eliminate the hyper-parameters for constructing the current machine learning model. This enables the current optimal model to be consistently no worse than the individual machine learning models that have been built.
In addition, to improve the efficiency of hyper-parameter selection, as soon as the verification result generated by training the current machine learning model with one group of target data sets is found to be inferior to the verification result generated by training the current optimal model with the same group, the current machine learning model is eliminated immediately, without being trained and verified on the remaining unused groups of target data sets, thereby reducing the amount of training and verification. For ease of understanding and explanation, the following description is made in conjunction with an example.
As an example, assume that the 1st through 3rd groups of target data sets exist at the current time, and that the verification result generated by training the current machine learning model with the 1st group is not inferior to that generated by training the current optimal model with the 1st group. Based on these assumptions, if the verification result generated by training the current machine learning model with the 2nd group is then found to be inferior to that generated by training the current optimal model with the 2nd group, the current machine learning model is eliminated and step S206 is executed. At this point, the current machine learning model no longer needs to be trained and verified with the 3rd group of target data sets.
It should be noted that the specific implementation corresponding to the above example will be explained in detail in steps S404 to S408 of the second method embodiment, and will not be described again here.
S205: if the verification results generated by the current machine learning model through training by using each group of target data sets are not different from the verification results generated by the current optimal model through training by using the same target data sets, the current machine learning model is determined as the current optimal model, a group of target data sets is added when the number of the target data sets does not reach the maximum number or the number of the target data sets is kept unchanged when the number of the target data sets reaches the maximum number, and step S206 is executed.
In the embodiments of the present application, when the verification result of the current machine learning model on every group of target data sets is not inferior to that of the current optimal model on the same group, the current machine learning model is judged not inferior to the current optimal model and must therefore be determined as the new current optimal model, ensuring that the current optimal model is never inferior to any established machine learning model. It must then be further determined whether the number of groups of target data sets has reached the maximum. If it has, no more target data sets can be added; the number of groups is kept unchanged and step S206 is executed. If it has not, a group of target data sets is added and step S206 is executed.
It should be noted that, in the embodiments of the present application, a group of target data sets is added each time the current optimal model is re-determined, until the maximum number of groups is reached. For example, after the current optimal model is determined for the first time and before it is re-determined for the second time, models are compared on one group of target data sets; after the second re-determination and before the third, models are compared on two groups of target data sets; ...; and after the Rth re-determination and before the (R+1)th, models are compared on R groups of target data sets, where R is a positive integer.
S206: and judging whether a preset stop condition is met, if so, executing step S207, and if not, executing step S203.
The preset stop condition is a condition for determining whether to stop executing the hyper-parameter selection process, and the preset stop condition may be set in advance, and may be set in particular according to an application scenario.
As a first example, the preset stop condition may be that the operation time reaches a first threshold. Where runtime may refer to the execution time of the hyper-parameter selection process. The first threshold value may be predetermined, and may be set according to an application scenario.
As a second example, the preset stopping condition may be that the number of times the step of reselecting a group of hyper-parameters to establish the current machine learning model and the subsequent steps have been performed reaches a second threshold value, that is, the number of times steps S203-S206 have been executed reaches the second threshold. The second threshold may be set in advance according to the application scenario.
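A minimal sketch of such a check; the threshold values and the time source are assumptions to be set per application scenario:

```python
import time

def should_stop(start_time, rounds_done, first_threshold_s, second_threshold):
    """Preset stopping condition (step S206): the running time reaches the
    first threshold, or the number of reselect-and-compare rounds (steps
    S203-S206) reaches the second threshold."""
    return (time.monotonic() - start_time >= first_threshold_s
            or rounds_done >= second_threshold)
```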
S207: and determining the hyper-parameters corresponding to the current optimal model as target hyper-parameters.
In the embodiment of the present application, when it is determined that the preset stop condition is met, since the current optimal model at the current time is not inferior to each of the established machine learning models, it is necessary to use the current optimal model at the current time as a final optimal model, and determine the hyper-parameter corresponding to the current optimal model as the target hyper-parameter, so that the machine learning model established based on the target hyper-parameter is not inferior to (even superior to) the machine learning models established based on other sets of hyper-parameters.
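Combining steps S202 through S207, one possible end-to-end sketch of the selection loop is given below; build_model and train_and_validate are hypothetical stand-ins for the concrete model and training framework, and verification results are assumed to be scores where higher is better:

```python
def select_hyperparams(candidates, target_groups, build_model,
                       train_and_validate, max_rounds):
    """Sketch of steps S202-S207: a challenger is eliminated as soon as it is
    inferior on any group of target data sets already in use, and one more
    group is added each time the current optimal model is re-determined."""
    tried = {0}
    best_hp = candidates[0]                       # S202: initial optimal model
    n_groups = 1                                  # initial number of groups
    best_scores = {0: train_and_validate(build_model(best_hp), target_groups[0])}

    for _ in range(max_rounds):                   # S206: preset stop condition
        unused = [i for i in range(len(candidates)) if i not in tried]
        if not unused:
            break
        idx = unused[0]                           # S203: reselect hyper-params
        tried.add(idx)
        model = build_model(candidates[idx])
        scores, eliminated = {}, False
        for g in range(n_groups):                 # S204: compare group by group
            scores[g] = train_and_validate(model, target_groups[g])
            if scores[g] < best_scores[g]:        # inferior on any group:
                eliminated = True                 # eliminate immediately
                break
        if not eliminated:                        # S205: new current optimal
            best_hp, best_scores = candidates[idx], scores
            if n_groups < len(target_groups):     # add a group if possible
                best_scores[n_groups] = train_and_validate(
                    model, target_groups[n_groups])
                n_groups += 1
    return best_hp                                # S207: target hyper-params
```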
Based on the specific implementation of the method for implementing hyper-parameter selection provided in method embodiment one: after K training verification data sets are obtained, a group of hyper-parameters is first selected to establish a current machine learning model, which is determined as the current optimal model; then another group of hyper-parameters is reselected to establish a new current machine learning model, and the current optimal model is re-determined and the number of groups of target data sets is increased based on comparing the verification results of the current machine learning model and the current optimal model on the target data sets; these processes are repeated until a preset stopping condition is met, at which point the hyper-parameters corresponding to the finally obtained current optimal model can be determined as the target hyper-parameters.
Because the current optimal model is at all times not inferior to (and often superior to) every machine learning model that has been trained and verified, the target hyper-parameters determined from the finally determined current optimal model are not inferior to (and often superior to) the other candidate hyper-parameters, so the model established with the target hyper-parameters is the best among the candidates. In addition, poorly performing models are trained and verified with only a small number of target data sets, so they can be eliminated after only a little training and verification. This reduces the number of training and verification runs spent on poor models, improves the efficiency of the hyper-parameter selection process, reduces its time and memory consumption, and lowers the cost of selecting model hyper-parameters.
To facilitate understanding of the specific implementation of the method for implementing hyper-parameter selection provided in the embodiments of the present application, the method is first introduced briefly through two method examples, and then described in detail in method embodiment two.
The relevant contents of the first method example are as follows:
When each group of target data sets includes 1 training verification data set (i.e., m equals 1), the method of implementing hyper-parameter selection may specifically include the following steps:
the first step is as follows: the method comprises the steps of obtaining a 1 st training verification data set to a Kth training verification data set, taking the 1 st training data set as a 1 st group target data set, taking a 2 nd training data set as a 2 nd group target data set, … …, and taking the Kth training data set as a Kth group target data set.
The second step is that: and selecting the 1 st group of hyper-parameters to establish a machine learning model and determining the model as a current optimal model (namely, taking the machine learning model with the 1 st group of hyper-parameters as an initial model of the current optimal model), and training the current optimal model by using the 1 st group of target data sets and generating a verification result.
The third step: and selecting the 2 nd group of hyper-parameters to establish a machine learning model, and training the machine learning model with the 2 nd group of hyper-parameters by using the 1 st group of target data sets to generate a verification result.
The fourth step: discarding the machine learning model having the set 2 hyperparameters if the machine learning model having the set 2 hyperparameters produces a verification result that is worse than a verification result produced by the current optimal model using the set 1 target data set. However, if the verification result of the machine learning model with the 2 nd set of hyper-parameters using the 1 st set of target data sets is not inferior to the verification result of the current optimal model using the 1 st set of target data sets, the machine learning model with the 2 nd set of hyper-parameters is determined as the current optimal model, and the current optimal model is trained using the 2 nd set of target data sets and generates the verification result (i.e., after the current optimal model is re-determined once, a set of target data sets for training and verification is added to the current optimal model so that the current optimal model is trained using the added target data sets and generates the verification result).
It should be noted that, for ease of understanding and explanation, the following steps are described under the assumption that the verification result produced by the machine learning model having the 2nd group of hyper-parameters is not inferior to that produced by the current optimal model with the 1st group of target data sets (so that the machine learning model having the 2nd group of hyper-parameters has become the current optimal model).
The fifth step: the 3rd group of hyper-parameters is selected to establish a machine learning model.
The sixth step: the machine learning model having the 3rd group of hyper-parameters is first trained with the 1st group of target data sets and generates a verification result. Two outcomes, (1) and (2), are possible:
(1) If the verification result produced by the machine learning model having the 3rd group of hyper-parameters with the 1st group of target data sets is inferior to the verification result produced by the current optimal model with the 1st group of target data sets, the machine learning model having the 3rd group of hyper-parameters is eliminated.
(2) If that verification result is not inferior, the machine learning model having the 3rd group of hyper-parameters is further trained with the 2nd group of target data sets and generates a verification result, and again two outcomes are possible: (2.1) if the verification result with the 2nd group of target data sets is inferior to the verification result produced by the current optimal model with the 2nd group, the machine learning model having the 3rd group of hyper-parameters is eliminated; (2.2) if it is not inferior on either group, the machine learning model having the 3rd group of hyper-parameters is determined as the current optimal model, and the new current optimal model is trained with the 3rd group of target data sets and generates a verification result.
By analogy, based on the above idea, the machine learning models having the 4th through Hth groups of hyper-parameters (where H is the total number of groups of hyper-parameters) are processed in the same way.
As mentioned above, in this method example each group of target data sets includes only 1 training verification data set, so each time the current optimal model is re-determined, a new group of target data sets (i.e., one more training verification data set) is added for the re-determined current optimal model to be trained with and to generate a verification result; in this way, a new machine learning model can be compared with the current optimal model on every group of target data sets the current optimal model has already used. In addition, because the number of groups of target data sets is limited, once the number of groups used by the current optimal model reaches the maximum, subsequent processing only re-determines the current optimal model without adding new groups of target data sets.
In addition, judging the relative quality of models on the verification result of a single training verification data set is subject to chance: a poor verification result on one training verification data set can cause a better machine learning model to be eliminated and a worse one to be retained, which harms the quality of the constructed model. To avoid this, each group of target data sets may include several training verification data sets, so that model quality is judged on the combined verification results of several training verification data sets.
Based on this, the embodiments of the present application also provide a second method example, which explains the method of implementing hyper-parameter selection when each group of target data sets includes at least 2 training verification data sets (i.e., m ≥ 2). The flow of the method provided in method example two is similar to that of method example one; the two differ only in: (1) how the target data sets are obtained; and (2) how a machine learning model is trained with a group of target data sets to generate a verification result. For brevity, method example two is described below through an example, in terms of these two differences.
For example, when each group of target data sets includes 3 training verification data sets: (1) the acquisition of the target data sets (i.e., the first step) in method example two may specifically be: obtain the 1st through Kth training verification data sets, and take the 1st through 3rd training verification data sets as the 1st group of target data sets, the 4th through 6th as the 2nd group, ..., and the (K-2)th through Kth as the Tth group (T = K/3); and (2) "a machine learning model trains and generates verification results using a group of target data sets" in method example two may specifically be: train the model with each of the three training verification data sets in the group to generate three verification results, and then take the average of the three as the verification result generated by the machine learning model with that group of target data sets.
As described above, in method example two each group of target data sets includes a plurality of (e.g., 3) training verification data sets, so each time the current optimal model is re-determined, a new group of target data sets (i.e., a plurality of, e.g., 3, training verification data sets) is added for the re-determined current optimal model to be trained with and to generate verification results; in this way, a new machine learning model can be compared with the current optimal model on every group of target data sets the current optimal model has already used. This effectively avoids a better machine learning model being eliminated, or a worse one being retained, because of a poor verification result on a single training verification data set, improving the stability and credibility of the model evaluation process.
Based on the two method examples above, the embodiments of the present application further provide other implementations of the method for implementing hyper-parameter selection, which are described in detail in method embodiment two below.
The second method embodiment is an improvement on the basis of the first method embodiment, and for the sake of brevity, the same parts in the second method embodiment as those in the first method embodiment will not be described again.
Referring to fig. 4, the figure is a flowchart of another method for implementing hyper-parameter selection according to an embodiment of the present application. As shown in fig. 4, the method may specifically include steps S401 to S410:
S401: K training verification data sets are obtained, wherein K is a positive integer.
It should be noted that the content of step S401 is the same as that of step S201, and is not described again here.
S402: selecting a group of hyper-parameters to establish a current machine learning model, determining the current machine learning model as a current optimal model, and acquiring a group of target data sets.
It should be noted that the content of the action "selecting a group of hyper-parameters to establish the current machine learning model and determining the current machine learning model as the current optimal model" is the same as the content of the step S202, and is not described herein again.
In the embodiments of the present application, after the K training verification data sets are obtained, a group of hyper-parameters is selected from the multiple groups to establish a current machine learning model, which serves as the current optimal model, i.e., the initial current optimal model. At this time, one group of target data sets is obtained from the K training verification data sets, so that the initial number of groups of target data sets is one.
It should be noted that the content of the "target data set" refers to the content of the "target data set" in the above step S204.
S403: and reselecting a group of hyper-parameters to establish the current machine learning model.
It should be noted that the content of step S403 is the same as that of step S203, and is not described again here.
S404: a set of target data sets is selected, and the current machine learning model is trained using the set of target data sets and generates a verification result.
In the embodiments of the present application, "selecting a set of target data sets" refers to selecting a set of target data sets from target data sets not used by the current machine learning model. For example, assume that the current machine learning model needs to be trained and verified by using the 1 st set of target data sets, the 2 nd set of target data sets, and the 3 rd set of target data sets, and that the verification result generated by the current machine learning model trained by using the 1 st set of target data sets is obtained. Based on the above assumptions, the current machine learning model already uses the 1 st set of target data sets, and does not use the 2 nd set of target data sets and the 3 rd set of target data sets. At this time, step S404 may be: selecting a 2 nd group of target data sets, training a current machine learning model by using the group of target data sets and generating a verification result; alternatively, a 3 rd set of target data sets is selected, and the current machine learning model is trained using the set of target data sets and generates a validation result.
In the embodiment of the present application, the selection rule used when "selecting a group of target data sets" is executed is not limited, and for example, the selection rule may be selected randomly or may be selected in accordance with a preset arrangement order of the target data sets.
In addition, in this embodiment of the present application, "training the current machine learning model with the set of target data sets and generating the verification result" refers to training and verifying the current machine learning model with each training verification data set in the set of target data sets, and obtaining the comprehensive verification result of the current machine learning model on the set of target data sets based on the corresponding verification result of each training verification data set. For ease of understanding and explanation, the following explanation and description are made in conjunction with two embodiments.
As a first embodiment, when m is greater than 1, the action "train the current machine learning model with the set of target data sets and generate the verification result" may specifically include: firstly, training a current machine learning model by using each training verification data set and generating a verification result; then, an average of the generated m verification results is calculated as a verification result generated by training the current machine learning model with the set of target data sets. For ease of understanding and explanation of the first embodiment, the following description is given with reference to an example.
For example, assume m is 3 and the target dataset includes a 1 st training validation dataset, a 2 nd training validation dataset, and a 3 rd training validation dataset. Based on the above contents and the above assumptions, it can be known that the current machine learning model is first trained by using the 1 st training verification data set to generate the 1 st verification result, is trained by using the 2 nd training verification data set to generate the 2 nd verification result, and is then trained by using the 3 rd training verification data set to generate the 3 rd verification result, and then the average value of the 1 st verification result, the 2 nd verification result, and the 3 rd verification result is calculated as the verification result generated by training the current machine learning model by using the set of target data sets.
As a second embodiment, when m is equal to 1, the action "train the current machine learning model with the set of target data sets and generate the verification result" may specifically include: the current machine learning model is trained by utilizing each training verification data set to generate a verification result, and the generated verification result is used as the verification result generated by training the current machine learning model by utilizing the group of target data sets. For ease of understanding and explanation of the second embodiment, the following description is given with reference to an example.
For example, assume m is 1 and the target dataset comprises the 1 st training verification dataset. Based on the above contents and the above assumptions, the current machine learning model may be trained using the 1 st training verification data set to generate the 1 st verification result, and then the 1 st verification result is used as the verification result generated by training the current machine learning model using the set of target data sets.
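A minimal sketch covering both embodiments, assuming a hypothetical train_and_validate_one helper that trains the model with one training verification data set and returns its verification result:

```python
import statistics

def group_verification_result(model, target_group, train_and_validate_one):
    """Verification result of a model on one group of target data sets:
    the average of the m per-set verification results (first embodiment).
    With m = 1 the mean of a single result is that result itself, so the
    second embodiment needs no separate branch."""
    results = [train_and_validate_one(model, tv_set) for tv_set in target_group]
    return statistics.mean(results)
```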
Based on the above two embodiments, the present application further provides an embodiment of the action "train the current machine learning model with each training verification data set and generate the verification result", which will be described in detail after step S410.
S405: judging whether the verification result generated by the current machine learning model through training by using the group of target data sets is inferior to the verification result generated by the current optimal model through training by using the group of target data sets, if so, executing the step S408; if not, steps S406-S407 are executed.
In the embodiment of the present application, if the verification result generated by the current machine learning model through training with the set of target data sets is determined to be inferior to the verification result generated by the current optimal model through training with the same set, it can be directly concluded that the current optimal model is superior to the current machine learning model, and the current machine learning model can be eliminated immediately. If the verification result is determined not to be inferior, however, whether the current machine learning model is inferior to the current optimal model must be further determined by comparing the verification results the two models generate on the other groups of target data sets.
S406: judging whether an unused target data set exists; if so, execute step S404; if not, execute step S407.
In the embodiment of the present application, after it is determined that the verification result generated by the current machine learning model through training with the set of target data sets is not inferior to that generated by the current optimal model through training with the same set, it must be further determined whether an unused target data set exists. If one does, it must then be determined whether the verification result generated by the current machine learning model through training with that unused target data set is inferior to the verification result generated by the current optimal model through training with the same target data set. If no unused target data set remains, that is, all target data sets have been used and the current machine learning model is not inferior to the current optimal model on any of them, the current machine learning model can be determined as the new current optimal model, and whether to add a target data set can be decided according to the current number of target data sets.
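A minimal sketch of the comparison-and-elimination logic of steps S404 to S408 follows, assuming higher verification scores are better and reusing the hypothetical validate_on_group helper sketched above; it is an illustration, not the patent's reference implementation.

```python
def candidate_survives(make_candidate, best_scores, groups):
    """Compare a new candidate against the current optimal model group by
    group (steps S404-S406). `best_scores[i]` is the current optimal
    model's cached verification result on groups[i]; higher is assumed
    better. Returns (survives, candidate_scores)."""
    candidate_scores = []
    for best_score, group in zip(best_scores, groups):
        score = validate_on_group(make_candidate, group)
        candidate_scores.append(score)
        if score < best_score:                # inferior on this group:
            return False, candidate_scores    # eliminate at once (S408)
    return True, candidate_scores             # not inferior anywhere (S407)
```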
S407: determining that the verification results generated by the current machine learning model through training with each group of target data sets are all not inferior to the verification results generated by the current optimal model through training with the same target data sets, determining the current machine learning model as the current optimal model, adding one group of target data sets when the number of target data sets has not reached the maximum number, or keeping the number of target data sets unchanged when it has, and executing step S409.
It should be noted that the content of step S407 is the same as that of step S205, and is not described herein again.
S408: and determining that the verification result generated by the current machine learning model through training by using any one group of target data sets is inferior to the verification result generated by the current optimal model through training by using the same target data set, eliminating the current machine learning model, and executing the step S409.
It should be noted that the content of step S408 is the same as the content of step S204, and is not described herein again.
S409: and judging whether a preset stop condition is met, if so, executing the step S410, and if not, executing the step S403.
It should be noted that, for the technical details of the "preset stop condition", reference is made to the relevant contents in the above step S206. This is explained and illustrated below in connection with two examples.
As a first example, the preset stop condition may be that the running time reaches a first threshold, where the running time refers to the execution time of the hyper-parameter selection process. The first threshold may be set in advance according to the application scenario.
As a second example, the preset stop condition may be that the number of times "reselecting a set of hyper-parameters to establish the current machine learning model and subsequent steps" has been performed reaches a second threshold; that is, the preset stop condition may be that steps S403 to S409 have been executed the second threshold number of times. The second threshold may likewise be set in advance according to the application scenario.
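The two example stop conditions can be sketched as follows; the time budget and candidate cap are illustrative values, not thresholds specified by the patent.

```python
import time

def should_stop(start_time, candidates_tried,
                time_budget_s=3600.0, max_candidates=200):
    """Check the two example stop conditions: running time reaching a
    first threshold, or the number of candidate hyper-parameter sets
    tried reaching a second threshold."""
    if time.monotonic() - start_time >= time_budget_s:  # first example
        return True
    return candidates_tried >= max_candidates           # second example
```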
S410: and determining the hyper-parameters corresponding to the current optimal model as target hyper-parameters.
It should be noted that the content of step S410 is the same as the content of step S207, and is not described herein again.
The foregoing steps S401 to S410 are an implementation manner of the method for implementing super-parameter selection provided in the embodiment of the present application.
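To tie the steps together, here is a hedged end-to-end sketch of steps S401 to S410 that reuses the hypothetical helpers sketched above. One design choice is an assumption of this sketch, not of the patent: when a group of target data sets is added, the new current optimal model is immediately validated on it so that the cached scores stay aligned with the groups.

```python
import time

def select_hyperparameters(sample_hyperparams, make_model, folds, m,
                           time_budget_s=3600.0, max_candidates=200):
    """End-to-end sketch of steps S401-S410. `folds` holds the K training
    verification data sets; `sample_hyperparams` draws one candidate set
    of hyper-parameters; `make_model`, `validate_on_group`,
    `candidate_survives`, and `should_stop` are the illustrative helpers
    sketched earlier in this section."""
    start = time.monotonic()
    max_groups = len(folds) // m             # K divided by m, rounded down
    groups = [folds[:m]]                     # initial number: one group
    best_params = sample_hyperparams()       # S402: first candidate ...
    best_scores = [validate_on_group(
        lambda: make_model(**best_params), groups[0])]  # ... becomes the best
    tried = 1
    while not should_stop(start, tried, time_budget_s, max_candidates):
        params = sample_hyperparams()        # S403: new candidate model
        survives, scores = candidate_survives(
            lambda: make_model(**params), best_scores, groups)
        tried += 1
        if not survives:                     # S408: eliminate and retry
            continue
        best_params, best_scores = params, scores   # S407: new best model
        if len(groups) < max_groups:         # add one group of m data sets
            start_idx = len(groups) * m
            groups.append(folds[start_idx:start_idx + m])
            best_scores.append(validate_on_group(
                lambda: make_model(**best_params), groups[-1]))
    return best_params                       # S410: target hyper-parameters
```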
In addition, an implementation manner of the action "train the current machine learning model with each training verification data set and generate a verification result" is further provided in the embodiments of the present application, and in this implementation manner, the action "train the current machine learning model with each training verification data set and generate a verification result" may specifically include: firstly, training a current machine learning model by using training data in a training verification data set; and then, inputting the verification data in the training verification data set into the trained current machine learning model and outputting a verification result. For ease of understanding and explanation, the following description is made in conjunction with examples.
For example, assume that the training verification data set is the 1st training verification data set in fig. 3, that the 1st to (K-1)th folds in it are training data, and that the Kth fold is verification data. Based on the above content and assumptions, the current machine learning model is trained with the 1st to (K-1)th folds of the 1st training verification data set to obtain a trained current machine learning model, and the Kth fold of the 1st training verification data set is then input into the trained current machine learning model to output a verification result. This realizes the process of training the current machine learning model with the 1st training verification data set and generating a verification result.
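As a sketch of this fold arrangement (NumPy arrays and a scikit-learn-style fit/score interface are assumptions of the illustration):

```python
import numpy as np

def train_validate_first_dataset(model, X_folds, y_folds):
    """Train on folds 1..K-1 of the 1st training verification data set
    and validate on the Kth fold, mirroring the example above."""
    X_train = np.concatenate(X_folds[:-1])   # folds 1..K-1: training data
    y_train = np.concatenate(y_folds[:-1])
    model.fit(X_train, y_train)              # train
    return model.score(X_folds[-1], y_folds[-1])  # validate on fold K
```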
In addition, the embodiment of the present application does not limit the manner of obtaining the verification result, and as a possible implementation, the verification result is obtained according to the evaluation result of at least one of the evaluation indexes of the machine learning model. For ease of understanding and explanation, the following description is made in conjunction with two examples.
As a first example, when the verification result is obtained from the evaluation result of a single evaluation index of the machine learning model, the evaluation result of that index may be used directly as the verification result.
As a second example, when the verification result is obtained from the evaluation results of at least two evaluation indexes of the machine learning model, a weighted average of those evaluation results may be used as the verification result.
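A minimal sketch of the second example, assuming accuracy and macro-F1 as the two evaluation indexes and illustrative weights of 0.7 and 0.3 (neither the metrics nor the weights are specified by the patent):

```python
from sklearn.metrics import accuracy_score, f1_score

def verification_result(y_true, y_pred, w_acc=0.7, w_f1=0.3):
    """Weighted average of two evaluation indexes as the verification
    result; the metric choice and weights are illustrative assumptions."""
    acc = accuracy_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred, average="macro")
    return (w_acc * acc + w_f1 * f1) / (w_acc + w_f1)
```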
The evaluation index of the machine learning model is not limited in the embodiments of the present application, and may be any evaluation index that is currently used or will appear in the future for evaluating the machine learning model.
The above is a specific implementation manner of the method for implementing super-parameter selection provided in method embodiment two, and this implementation manner has the following beneficial effects in addition to the beneficial effects of the method for implementing super-parameter selection provided in method embodiment one:
(1) The target data set may include more than one training verification data set. The more training verification data sets each group contains, the longer the machine learning model (the current machine learning model or the current optimal model) takes to train and verify on each group, so the longer the hyper-parameter selection process runs and the lower its efficiency. On the other hand, the more training verification data sets each group contains, the more stable the model evaluation, so the more likely the finally determined current optimal model is the actual optimal model; this avoids harming the hyper-parameter selection (that is, the model training effect) by directly eliminating a current machine learning model with a good overall effect merely because its verification result on one (or a few) training verification data sets was inferior to that of the current optimal model. To balance model evaluation stability against execution efficiency, an appropriate m may therefore be set according to the application scenario, so that the "method for implementing hyper-parameter selection" runs efficiently while still meeting the stability requirement.
(2) In the embodiment of the application, as soon as the verification result generated by the current machine learning model through training with one group of target data sets is determined to be inferior to the verification result generated by the current optimal model through training with the same group, the current machine learning model is eliminated immediately; its results on the remaining unused target data sets are never computed or compared. A poorly performing model is thus eliminated after only a small amount of training and verification, reducing the number of training and verification runs spent on poor models and improving the efficiency of the hyper-parameter selection process.
Based on the above method embodiment, the present application embodiment further provides a device for implementing super-parameter selection, which will be described below.
Referring to fig. 5, which is a structural diagram of an apparatus for implementing super-parameter selection according to an embodiment of the present application, as shown in fig. 5, the apparatus includes:
a data obtaining unit 501, configured to obtain K training verification data sets, where K is a positive integer;
a model establishing unit 502, configured to select a group of hyper-parameters to establish a current machine learning model and determine the current machine learning model as a current optimal model;
a parameter selection unit 503, configured to reselect a set of hyper-parameters to establish a current machine learning model, and, if the verification result generated by the current machine learning model through training with any one set of target data sets is inferior to the verification result generated by the current optimal model through training with the same target data set, eliminate the current machine learning model and execute the reselecting of a set of hyper-parameters to establish the current machine learning model and subsequent steps; each group of the target data sets comprises m training verification data sets, m is a positive integer smaller than K, and the initial number of the target data sets is one group; if the verification results generated by the current machine learning model through training with each group of target data sets are all not inferior to the verification results generated by the current optimal model through training with the same target data sets, the current machine learning model is determined as the current optimal model, one group of target data sets is added when the number of target data sets has not reached the maximum number, or the number is kept unchanged when it has, and the parameter selection unit 503 returns to continue executing the reselecting of a set of hyper-parameters to establish the current machine learning model;
a parameter determining unit 504, configured to determine, when a preset stop condition is met, a hyper-parameter corresponding to the current optimal model as a target hyper-parameter.
In a possible implementation manner, the parameter selecting unit 503 includes:
the model selection subunit is used for reselecting a group of hyper-parameters to establish the current machine learning model;
the training and verifying subunit is used for selecting a group of target data sets, training the current machine learning model by using the group of target data sets and generating a verification result;
a result judging subunit, configured to judge whether a verification result generated by the current machine learning model through training with the set of target data sets is worse than a verification result generated by the current optimal model through training with the set of target data sets;
a first determining subunit, configured to, if so, determine that the verification result generated by the current machine learning model through training with any one group of target data sets is inferior to the verification result generated by the current optimal model through training with the same target data set;
and a second determining subunit, configured to, if not, return to the training and verifying subunit to perform the selection of one group of target data sets when an unused target data set still exists, and, when every group of target data sets has been used, determine that the verification results generated by the current machine learning model through training with each group of target data sets are all not inferior to the verification results generated by the current optimal model through training with the same target data sets.
In one possible implementation, when m is greater than 1, the training verification subunit includes:
the training verification module is used for training the current machine learning model by utilizing each training verification data set and generating a verification result;
the mean value calculation module is used for calculating the mean value of the generated m verification results to serve as the verification result generated by training the current machine learning model by using the group of target data sets;
when m is equal to 1, the training verification subunit includes:
the training verification module is used for training the current machine learning model by utilizing each training verification data set and generating a verification result;
and the verification result determining module is used for taking the generated verification result as the verification result generated by training the current machine learning model by using the set of target data sets.
In one possible implementation, the training verification module includes:
the training submodule is used for training the current machine learning model by using training data in a training verification data set;
and the verification submodule is used for inputting the verification data in the training verification data set into the trained current machine learning model and outputting a verification result.
In a possible implementation manner, each group of the target data sets includes m training verification data sets that are different from each other, the training verification data sets included in the different groups of target data sets are all different, and the maximum number of the target data sets is obtained by dividing K by m and rounding down.
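A short sketch of this partitioning, under the assumption (of the sketch, not the patent) that any K mod m leftover training verification data sets are simply left unused:

```python
def make_groups(folds, m):
    """Partition the K training verification data sets into disjoint
    groups of m each; the maximum number of groups is K // m."""
    return [folds[i * m:(i + 1) * m] for i in range(len(folds) // m)]
```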
In one possible implementation, the verification result is obtained according to an evaluation result of at least one of evaluation indexes of the machine learning model.
In a possible implementation manner, the preset stop condition is that the running time reaches a first threshold value, or the number of times of performing the step of reselecting the set of hyper-parameters to establish the current machine learning model and the subsequent step reaches a second threshold value.
It should be noted that, implementation of each unit in this embodiment may refer to the foregoing method embodiment, and this embodiment is not described herein again.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is caused to execute the method for implementing hyper-parameter selection described in the foregoing method embodiment.
In addition, an embodiment of the present application provides an apparatus for implementing hyper-parameter selection, including: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for implementing hyper-parameter selection described in the foregoing method embodiments. In this way, in the embodiment of the application, after K training verification data sets are obtained, a set of hyper-parameters is selected to establish a current machine learning model, which is determined as the current optimal model; a set of hyper-parameters is then reselected to establish a new current machine learning model, the current optimal model is re-determined and the number of target data sets is increased based on the relative merits of the current machine learning model and the current optimal model on the target data sets, and the processes of establishing a current machine learning model, re-determining the current optimal model based on the relative merits of the models, and increasing the number of target data sets are repeated until a preset stop condition is met, at which point the hyper-parameters corresponding to the finally obtained current optimal model can be determined as the target hyper-parameters.
Because the current optimal model is always not inferior to every machine learning model that has been trained and verified, the target hyper-parameters determined from the finally obtained current optimal model are not inferior to the other selected hyper-parameters, so the selection of the optimal hyper-parameters is realized. In addition, since poorly performing models are trained and verified on only a small number of target data sets, they can be eliminated after little training and verification, reducing the number of training and verification runs spent on them, improving the efficiency of the hyper-parameter selection process, reducing its time and memory consumption, and lowering the cost of selecting the model hyper-parameters.
It should be noted that, in the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. For the system or the device disclosed by the embodiment, the description is simple because the system or the device corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
It should be understood that in the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" for describing an association relationship of associated objects, indicating that there may be three relationships, e.g., "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of implementing hyper-parameter selection, the method comprising:
acquiring K training verification data sets, wherein K is a positive integer;
selecting a group of hyper-parameters to establish a current machine learning model and determining the current machine learning model as a current optimal model;
reselecting a group of hyper-parameters to establish a current machine learning model, if the verification result generated by the current machine learning model through training by utilizing any group of target data sets is inferior to the verification result generated by the current optimal model through training by utilizing the same target data sets, eliminating the current machine learning model, and executing the reselecting of the group of hyper-parameters to establish the current machine learning model and the subsequent steps; each group of the target data sets comprises m training verification data sets, m is a positive integer smaller than K, and the initial number of the target data sets is one group;
if the verification results generated by the current machine learning model through training by using each group of target data sets are all not inferior to the verification results generated by the current optimal model through training by using the same target data sets, determining the current machine learning model as the current optimal model, adding a group of target data sets when the number of the target data sets does not reach the maximum number or keeping the number of the target data sets unchanged when the number of the target data sets reaches the maximum number, and executing the reselecting of a group of hyper-parameters to establish the current machine learning model and the subsequent steps;
and when a preset stopping condition is met, determining the hyper-parameter corresponding to the current optimal model as a target hyper-parameter.
2. The method of claim 1, wherein after reselecting a set of hyper-parameters to build a current machine learning model, the method further comprises:
selecting a group of target data sets, training the current machine learning model by using the group of target data sets and generating a verification result;
judging whether the verification result generated by the current machine learning model through training by using the group of target data sets is inferior to the verification result generated by the current optimal model through training by using the group of target data sets;
if so, determining that the verification result generated by training the current machine learning model by using any group of target data sets is inferior to the verification result generated by training the current optimal model by using the same target data set;
if not, when an unused target data set exists, executing the step of selecting one group of target data sets and the subsequent steps, and when all groups of target data sets are used, determining that the verification results generated by the current machine learning model through training by using each group of target data sets are all not inferior to the verification results generated by the current optimal model through training by using the same target data set.
3. The method of claim 2, wherein training the current machine learning model with the set of target data sets and generating validation results when m is greater than 1 comprises:
training the current machine learning model by using each training verification data set and generating a verification result;
calculating an average value of the generated m verification results as a verification result generated by training the current machine learning model by using the group of target data sets;
when m is equal to 1, training the current machine learning model with the set of target data sets and generating a verification result, including:
training the current machine learning model by using each training verification data set and generating a verification result;
and taking the generated verification result as the verification result generated by training the current machine learning model by using the set of target data sets.
4. The method of claim 3, wherein training the current machine learning model with each training validation dataset and generating validation results comprises:
training the current machine learning model by using training data in a training verification data set;
and inputting the verification data in the training verification data set into the trained current machine learning model and outputting a verification result.
5. The method according to any one of claims 1-4, wherein each set of target data sets comprises m training verification data sets that are different from each other, wherein the training verification data sets comprised by different sets of target data sets are all different, and wherein the maximum number of target data sets is obtained by dividing K by m and rounding down.
6. The method according to any one of claims 1 to 4, wherein the verification result is obtained from an evaluation result of at least one of evaluation indexes of a machine learning model.
7. The method of claim 1, wherein the predetermined stopping condition is that a running time reaches a first threshold value, or a number of times of performing the step of reselecting a set of hyper-parameters to establish a current machine learning model and the subsequent steps reaches a second threshold value.
8. An apparatus for enabling hyper-parameter selection, the apparatus comprising:
the data acquisition unit is used for acquiring K training verification data sets, wherein K is a positive integer;
the model establishing unit is used for selecting a group of hyper-parameters to establish a current machine learning model and determining the current machine learning model as a current optimal model;
the parameter selection unit is used for reselecting a group of hyper-parameters to establish a current machine learning model, and, if the verification result generated by the current machine learning model through training by using any group of target data sets is inferior to the verification result generated by the current optimal model through training by using the same target data set, eliminating the current machine learning model and executing the reselecting of the group of hyper-parameters to establish the current machine learning model and the subsequent steps; each group of the target data sets comprises m training verification data sets, m is a positive integer smaller than K, and the initial number of the target data sets is one group; if the verification results generated by the current machine learning model through training by using each group of target data sets are all not inferior to the verification results generated by the current optimal model through training by using the same target data sets, determining the current machine learning model as the current optimal model, adding a group of target data sets when the number of the target data sets does not reach the maximum number or keeping the number of the target data sets unchanged when the number of the target data sets reaches the maximum number, and executing the reselecting of a group of hyper-parameters to establish the current machine learning model and the subsequent steps;
and the parameter determining unit is used for determining the hyper-parameter corresponding to the current optimal model as a target hyper-parameter when a preset stopping condition is met.
9. A computer-readable storage medium having stored therein instructions which, when run on a terminal device, cause the terminal device to perform a method of implementing hyper-parameter selection as claimed in any of claims 1-7.
10. An apparatus for enabling hyper-parameter selection, comprising: memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method of implementing hyper-parameter selection as claimed in any of claims 1-7 when executing the computer program.