CN112766402A - Algorithm selection method and device and electronic equipment
- Publication number: CN112766402A (application CN202110121910.5A)
- Authority: CN (China)
- Prior art keywords: algorithm, models, target, jth, data
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06F18/214: Pattern recognition; analysing; design or setup of recognition systems or techniques; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; combinations of networks
- G06N3/08: Computing arrangements based on biological models; neural networks; learning methods
Abstract
The application discloses an algorithm selection method and device and an electronic device. The method comprises the following steps: dividing target sample data into N parts of training data and M parts of test data; when j takes each integer from 1 to G, performing the following process: training the jth alternative algorithm on each part of training data respectively to obtain N models of the jth alternative algorithm; determining a loss value of each of the N models of the jth alternative algorithm according to the M parts of test data; determining a target parameter of the jth alternative algorithm according to the loss values of its N models; and determining a target algorithm from the G alternative algorithms according to the G target parameters, where G is the number of alternative algorithms. Therefore, according to the scheme of the embodiments of the application, only the target parameters of the alternative algorithms need to be determined from the target sample data, so the algorithm with the optimal performance can be selected according to the target parameters without repeated trial and exploration.
Description
Technical Field
The application belongs to the technical field of communication, and particularly relates to an algorithm selection method, an algorithm selection device and electronic equipment.
Background
Machine learning is the process of using existing data to find a feature function that characterizes the data. Currently, supervised machine learning is classified into regression learning and prediction learning, and various learning algorithms exist in the field of supervised learning, such as decision trees, linear regression, Bayesian classification, logistic regression, support vector machines, extreme gradient boosting (xgboost) and the neural network family.
Each algorithm has its own merits for a machine learning task, and when the question of which algorithm to use for an engineering application arises, engineers often select a currently popular machine learning algorithm based on experience. The selected machine learning algorithm is then trained on a known data set, and the difference (similar to a deviation) between the trained labels and the real labels is used for parameter optimization during training, so that the model with the optimal parameters is selected.
However, in the process of implementing the present application, the inventors found that because the characteristics of each application scenario are different, if the evaluation of the final model is not good (for example, the algorithm selection is not appropriate, or the cost of training a model with the selected algorithm is too high for the engineering to be implemented), another algorithm often has to be tried again, which wastes time and cost.
Therefore, in the prior art, the algorithm used in a certain application scenario is often selected based on experience or by trial and error, so the accuracy is low and the exploration cost is high.
Summary of the Application
The embodiments of the present application aim to provide an algorithm selection method, an algorithm selection device and an electronic device, which can solve the problems that selecting the algorithm used in a certain application scenario based on experience or by trial and error has low accuracy and high exploration cost.
In order to solve the technical problem, the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides an algorithm selection method, where the method includes:
dividing target sample data into N parts of training data and M parts of test data, wherein N is greater than 1, and M is greater than or equal to 1;
when j takes each integer of 1 to G, the following process is performed:
training the jth alternative algorithm on each piece of training data respectively to obtain N models of the jth alternative algorithm;
determining a loss value of each model in the N models of the jth alternative algorithm according to the M test data;
determining target parameters of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, wherein the target parameters are used for representing evaluation values of the performance of the candidate algorithms;
determining a target algorithm from the G candidate algorithms according to the G target parameters;
wherein G is the number of the alternative algorithms.
In a second aspect, an embodiment of the present application provides an algorithm selecting apparatus, including:
the splitting module is used for dividing target sample data into N parts of training data and M parts of test data, wherein N is greater than 1, and M is greater than or equal to 1;
a first parameter determining module, configured to, when j takes each integer from 1 to G, perform the following procedure:
training the jth alternative algorithm on each piece of training data respectively to obtain N models of the jth alternative algorithm;
determining a loss value of each model in the N models of the jth alternative algorithm according to the M test data;
determining target parameters of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, wherein the target parameters are used for representing evaluation values of the performance of the candidate algorithms;
the first selection module is used for determining a target algorithm from the G candidate algorithms according to the G target parameters;
wherein G is the number of the alternative algorithms.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In the embodiment of the application, target sample data can be divided into N parts of training data and M parts of test data, then when j is each integer from 1 to G, the j th candidate algorithm is trained through each part of training data respectively to obtain N models of the j th candidate algorithm, so that the loss value of each model of the N models of the j th candidate algorithm is determined according to the M parts of test data, the target parameter of the j th candidate algorithm is determined according to the loss values of the N models of the j th candidate algorithm, and then the target algorithm is determined from the G candidate algorithms according to the G target parameters, wherein G is the number of the candidate algorithms.
Therefore, in the embodiment of the application, the performance goodness index, that is, the target parameter, of each candidate algorithm in the multiple candidate algorithms can be determined according to the target sample data, and then an algorithm with the optimal performance can be selected from the multiple candidate algorithms according to the target parameter for use in the target application scene.
Therefore, in the embodiment of the application, the target parameters of the alternative algorithms are determined only according to the target sample data, so that the algorithm with the optimal performance in the target application scene can be selected according to the target parameters without repeatedly trying and exploring, and therefore the algorithm used in the target application scene selected in the embodiment of the application has high accuracy and low exploration cost.
Drawings
FIG. 1 is a flow chart of an algorithm selection method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating the partitioning of target sample data in the embodiment of the present application;
FIG. 3 is a schematic diagram of model families and the function f* in an embodiment of the present application;
FIG. 4 is a block diagram of an algorithm selecting apparatus according to an embodiment of the present application;
FIG. 5 shows one of the block diagrams of an electronic device provided by an embodiment of the application;
fig. 6 shows a second block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second" and the like in the description and claims of the present application are used to distinguish between similar objects and are not necessarily used to describe a particular sequential or chronological order. It should be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be implemented in orders other than those illustrated or described herein. In addition, "and/or" in the description and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects before and after it.
Referring to fig. 1, an embodiment of the present invention provides an algorithm selection method, which may include the steps of:
step 101: and dividing target sample data into N parts of training data and M parts of test data.
Wherein N is greater than 1 and M is greater than or equal to 1.
When j takes each integer from 1 to G, executing the following steps 102 to 104:
step 102: and aiming at the jth alternative algorithm, respectively training through each piece of training data to obtain N models of the jth alternative algorithm.
Step 103: and determining the loss value of each model in the N models of the jth alternative algorithm according to the M test data.
Step 104: and determining target parameters of the jth alternative algorithm according to the loss values of the N models of the jth alternative algorithm.
The target parameter is used for representing an evaluation value of the performance of the alternative algorithm.
In addition, the alternative algorithm is an algorithm suitable for a target application scenario, that is, the multiple alternative algorithms can achieve the same function. For example, in the application scenario of supervised learning, algorithms such as decision trees, linear regression, bayesian classification, logistic regression, support vector machine, extreme gradient boosting, neural network families, and the like may be used. However, which algorithm has the best performance may be determined according to predetermined target sample data. Therefore, the target parameter is used to represent the evaluation value of the performance of the candidate algorithm, i.e. the degree of performance of the candidate algorithm is represented by the size of the target parameter.
Therefore, according to the embodiment of the application, the performance quality degree of each alternative algorithm in the multiple alternative algorithms suitable for the target application scene can be determined according to the predetermined target sample data.
For example, the alternative algorithms include three algorithms: xgboost, a convolutional neural network (CNN) algorithm, and a fusion algorithm (gbdt+lr). When N is 30 and M is 1, the xgboost algorithm needs to be trained on each of the 30 parts of training data to obtain 30 models of the xgboost algorithm; similarly, the CNN algorithm is trained on each of the 30 parts of training data to obtain 30 models of the CNN algorithm, and gbdt+lr is trained on each of the 30 parts of training data to obtain 30 models of gbdt+lr.
Then, respectively calculating the loss value of each of 30 models of the xgboost algorithm, 30 models of the CNN algorithm and 30 models of the gbdt + lr according to the test data; furthermore, the target parameters of the xgboost algorithm are determined according to the loss values of 30 models of the xgboost algorithm, the target parameters of the CNN algorithm are determined according to the loss values of 30 models of the CNN algorithm, and the target parameters of the gbdt + lr are determined according to the loss values of 30 models of the gbdt + lr.
Therefore, in the embodiment of the application, the predetermined target sample data is divided into N parts of training data and M parts of test data, so that each candidate algorithm is trained through each part of training data to obtain N models of each candidate algorithm, and the loss value of each model of each candidate algorithm is calculated through M parts of test data, so that the target parameters of the candidate algorithms can be determined according to the loss values of the N models of the same candidate algorithm.
The multiple models of the alternative algorithm can be obtained by training the same alternative algorithm through multiple sets of training data, and the target parameters of the alternative algorithm are determined according to the loss values of the multiple models, so that the performance quality of the alternative algorithm can be more accurately represented by the target parameters.
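By way of a non-limiting illustration only, the following Python sketch outlines the overall flow described above; it assumes scikit-learn-style estimators with fit/predict methods, and the helper names model_loss and target_parameter are hypothetical functions sketched later in this description rather than part of the application.

```python
# Hypothetical sketch of the candidate-algorithm evaluation loop (steps 101 to 105).
# `candidates` maps an algorithm name to a factory building an untrained model;
# `train_parts` are the N training parts, `test_parts` the M test parts.

def select_target_algorithm(candidates, train_parts, test_parts,
                            model_loss, target_parameter):
    target_params = {}
    for name, make_model in candidates.items():                     # j = 1 .. G
        models = [make_model().fit(X, y) for X, y in train_parts]   # N models
        losses = [model_loss(m, test_parts) for m in models]        # N loss values
        target_params[name] = target_parameter(losses)              # E_j
    # the candidate with the smallest target parameter is the target algorithm
    best = min(target_params, key=target_params.get)
    return best, target_params
```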
Optionally, before step 101, the method may further include: and performing data cleaning on the target sample data, so that the steps 101-103 are executed by using the target sample data after the data cleaning.
The data cleaning process comprises checking data consistency, processing invalid values and missing values and the like.
Therefore, in the embodiments of the application, after the target sample data has been cleaned, the determined target parameter of each alternative algorithm can more accurately represent the performance of the alternative algorithm.
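As an illustrative aside, a minimal data-cleaning sketch covering the consistency checks and invalid/missing-value handling mentioned above is given below; the use of pandas, the label column name and the median fill strategy are assumptions made here for illustration, not requirements of the application.

```python
import pandas as pd

def clean_target_sample_data(df: pd.DataFrame, label_col: str = "label") -> pd.DataFrame:
    """Basic cleaning: drop duplicates, keep rows with a consistent label,
    and fill missing feature values with the column median."""
    df = df.drop_duplicates()
    df = df[df[label_col].isin([0, 1])].copy()      # discard invalid labels
    feature_cols = [c for c in df.columns if c != label_col]
    df[feature_cols] = df[feature_cols].apply(pd.to_numeric, errors="coerce")
    df[feature_cols] = df[feature_cols].fillna(df[feature_cols].median())
    return df
```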
Step 105: and determining a target algorithm from the G candidate algorithms according to the G target parameters.
Wherein G is the number of the alternative algorithms.
As can be seen from the above, in the algorithm selection method of the embodiment of the present application, in the embodiment of the present application, the target sample data can be divided into N training data and M test data, and then when j is an integer from 1 to G, N models of the jth candidate algorithm are obtained by respectively training each training data for the jth candidate algorithm, so that the loss value of each model of the N models of the jth candidate algorithm is determined according to the M test data, the target parameter of the jth candidate algorithm is determined according to the loss values of the N models of the jth candidate algorithm, and then the target algorithm is determined from the G candidate algorithms according to the G target parameters, where G is the number of the candidate algorithms.
Therefore, in the embodiment of the application, the performance goodness index, that is, the target parameter, of each candidate algorithm in the multiple candidate algorithms can be determined according to the target sample data, and then an algorithm with the optimal performance can be selected from the multiple candidate algorithms according to the target parameter for use in the target application scene.
Therefore, in the embodiment of the application, the target parameters of the alternative algorithms are determined only according to the target sample data, so that the algorithm with the optimal performance in the target application scene can be selected according to the target parameters without repeatedly trying and exploring, and therefore the algorithm used in the target application scene selected in the embodiment of the application has high accuracy and low exploration cost.
Optionally, the dividing the target sample data into N parts of training data and M parts of test data includes:
and dividing the target sample data into N training data and M testing data by adopting a cross verification method.
The purpose of cross-validation is to obtain a reliable and stable model. Namely, the cross verification method can ensure that the training data and the test data have randomness, so that the model obtained by subsequently adopting the training data for training has stability.
Optionally, the difference between the number of pieces of negative sample data in each of the N parts of training data and a preset number is smaller than a preset value. Specifically, for example, each of the N parts of training data contains an equal number of pieces of negative sample data.
Therefore, in the embodiment of the application, the number of the negative sample data in each piece of training data is set to be approximately equal, so that the condition that model training fails due to the fact that no negative sample exists in part of the training data or the negative sample is insufficient is prevented.
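The sketch below shows one possible way to perform such a split so that every training part receives roughly the same number of negative samples; it assumes numpy arrays, binary labels with 0 as the negative class, and a test part sized as one further share of the data, all of which are illustrative assumptions.

```python
import numpy as np

def split_with_balanced_negatives(X, y, n_train_parts=30, seed=0):
    """Split the samples into n_train_parts training parts and one test part,
    dealing the negative samples (label 0) evenly across the training parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    test_size = len(y) // (n_train_parts + 1)
    test_idx, train_idx = idx[:test_size], idx[test_size:]

    neg = [i for i in train_idx if y[i] == 0]
    pos = [i for i in train_idx if y[i] == 1]
    parts = [[] for _ in range(n_train_parts)]
    # round-robin assignment so each part gets an (almost) equal share of negatives
    for k, i in enumerate(neg + pos):
        parts[k % n_train_parts].append(i)

    train_parts = [(X[np.array(p)], y[np.array(p)]) for p in parts]
    test_part = (X[test_idx], y[test_idx])
    return train_parts, test_part
```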
Optionally, each piece of test data includes input data and output data; determining a loss value of one of the N models of the jth candidate algorithm according to the M test data, including:
when i takes each integer from 1 to M, the following process is performed:
inputting input data in the ith test data into a first model, and outputting ith prediction data, wherein the first model is one of N models of the jth alternative algorithm;
substituting output data in the ith part of test data and the ith part of prediction data into a predetermined loss function to obtain an ith loss value;
and determining the loss value of the first model according to the 1 st to Mth loss values.
When M is equal to 1, only the input data in the single piece of test data needs to be input into the first model to output one piece of prediction data, so that the output data in the test data and the obtained prediction data are substituted into the predetermined loss function, and the obtained loss value is the loss value of the first model.
And when M is greater than 1, the above process needs to be performed for each piece of test data, so as to obtain a loss value corresponding to each piece of test data, and then the loss value of the first model is determined according to the loss values corresponding to M pieces of test data.
Optionally, the loss function is one of the following:
absolute value loss function, square loss function, cross entropy loss function, hinge loss function.
The input data in the ith test data comprise L input values, the output data in the ith test data comprise output values which are in one-to-one correspondence with the input values in the ith test data, the ith prediction data comprise predicted values which are in one-to-one correspondence with the output values in the ith test data, and L is larger than zero;
when the loss function is an absolute value loss function, substituting output data in the ith test data and the ith prediction data into a predetermined loss function to obtain an ith loss value, wherein the method comprises the following steps:
substituting the output values included in the ith test data and the predicted values included in the ith prediction data into the absolute value loss function S(i) = Σ_{k=1}^{L} |f_i(k) − y_i(k)| to obtain the ith loss value S(i);
where f_i(k) represents the kth predicted value in the ith prediction data, y_i(k) represents the kth output value in the ith test data, and k is an integer from 1 to L.
When the loss function is a square loss function, substituting output data in the ith test data and the ith prediction data into a predetermined loss function to obtain an ith loss value, wherein the method comprises the following steps:
substituting the output values included in the ith test data and the predicted values included in the ith prediction data into the square loss function S(i) = Σ_{k=1}^{L} (f_i(k) − y_i(k))^2 to obtain the ith loss value S(i);
where f_i(k) represents the kth predicted value in the ith prediction data, y_i(k) represents the kth output value in the ith test data, and k is an integer from 1 to L.
Optionally, the determining the loss value of the first model according to the 1 st to mth loss values includes:
calculating a second average value of the 1 st to Mth loss values, and determining the second average value as a loss value of the first model.
That is, in the embodiment of the present application, the average value of the 1 st to mth loss values is used as the loss value of the first model, so that the loss value of the first model can more accurately represent the deviation between the predicted result and the actual result of the first model on the test data.
It can be understood that the manner of determining the loss value of the first model according to the 1st to Mth loss values is not limited to this.
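The loss functions written out above and the averaging of the M per-part loss values can be sketched as follows; the predict method and the (inputs, outputs) structure of each test part are assumptions for illustration only.

```python
import numpy as np

def absolute_loss(y_true, y_pred):
    # S(i) = sum over k of |f_i(k) - y_i(k)|
    return float(np.sum(np.abs(y_pred - y_true)))

def square_loss(y_true, y_pred):
    # S(i) = sum over k of (f_i(k) - y_i(k))^2
    return float(np.sum((y_pred - y_true) ** 2))

def model_loss(model, test_parts, loss_fn=square_loss):
    """Loss value of one model: the average (second average value)
    of the loss values obtained on the M test parts."""
    part_losses = []
    for X_test, y_test in test_parts:          # i = 1 .. M
        y_pred = model.predict(X_test)         # i-th prediction data
        part_losses.append(loss_fn(y_test, y_pred))
    return float(np.mean(part_losses))
```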
Optionally, the determining the target parameter of the jth candidate algorithm according to the loss values of the N models includes:
calculating a first average b of the loss values of the N models;
calculating the variance v of the loss values of the N models;
substituting the first average value b and the variance v into a preset formula E = k1 × b + k2 × v to obtain the target parameter E;
where k1 and k2 are predetermined weights, respectively.
In addition, for example, if the loss value of the qth model of the pth alternative algorithm is e_pq, the first average value of the N models of the pth alternative algorithm is b_p = (1/N) Σ_{q=1}^{N} e_pq, and the variance of the N models of the pth alternative algorithm is v_p = (1/N) Σ_{q=1}^{N} (e_pq − b_p)^2, where p is an integer, q is an integer from 1 to N, and N is the number of parts of training data (i.e., the number of models of one alternative algorithm).
Further, the deviation describes the difference between the predicted value and the true value; the larger the deviation, the further the prediction departs from the true data. The variance describes the range of variation of the predicted values, that is, their degree of dispersion (the distance from their expected value); the greater the variance, the more dispersed the distribution of the data.
In a machine learning model, the deviation of the estimated model represents the inaccurate part of the estimate caused by the model being too simple: an over-simple model is generally insensitive to the data, and the estimated performance of the whole model may be far from the real data. The variance of the model represents the larger variation space and uncertainty caused by the model being too complicated: an over-complicated model is highly sensitive to the data, so the estimated model fluctuates greatly, which also affects its distance from the real model.
In the embodiments of the present application, the N models of an alternative algorithm may constitute the model family of that alternative algorithm. For example, as shown in FIG. 3, f* denotes the function that truly characterizes the data, point f1 and the other points within the coverage of the first circle 301 denote the model family of the first alternative algorithm, point f2 and the other points within the coverage of the second circle 302 denote the model family of the second alternative algorithm, and point f3 and the other points within the coverage of the third circle 303 denote the model family of the third alternative algorithm. As can be seen from FIG. 3, for the models covered by the first circle 301, the second circle 302 and the third circle 303, the distance to f* is actually made up of two parts: one part is the average distance of the whole model family to f*, and the other part is the variance of the model family itself.
Therefore, when determining the evaluation value for indicating the performance of the candidate algorithm according to the loss values of the N models of the candidate algorithm, the average value and the variance of the loss values of the N models can be considered comprehensively, that is, the sum of the average value and the variance of the loss values of the N models of the candidate algorithm is adopted, so that the performance degree of the candidate algorithm can be more accurately indicated, and the algorithm with the best performance can be selected from the candidate algorithms to serve as the algorithm used in the target application scenario.
Optionally, the determining a target algorithm from the G candidate algorithms according to the G target parameters includes:
and selecting the candidate algorithm with the minimum target parameter as the target algorithm.
When the sum of the average value and the variance of the loss values of the N models of one candidate algorithm is used as the target parameter of the candidate algorithm, the smaller the target parameter is, the higher the performance of the candidate algorithm is, so that in this case, the candidate algorithm with the smallest target parameter is the candidate algorithm (i.e., the target algorithm) with the best performance among the multiple candidate algorithms.
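A short sketch of computing the target parameter E = k1 × b + k2 × v from the N loss values of one candidate algorithm and picking the candidate with the smallest E is given below; the weights k1 = k2 = 1 and the loss values in the example are assumed for illustration only.

```python
import numpy as np

def target_parameter(losses, k1=1.0, k2=1.0):
    """E = k1 * b + k2 * v, where b and v are the mean and the variance
    of the N model loss values of one candidate algorithm."""
    b = float(np.mean(losses))      # first average value
    v = float(np.var(losses))       # variance of the loss values
    return k1 * b + k2 * v

# Example with hypothetical loss values of N = 3 models per candidate.
loss_table = {
    "xgboost": [0.21, 0.19, 0.24],
    "cnn":     [0.30, 0.28, 0.33],
    "gbdt_lr": [0.22, 0.35, 0.18],
}
target_params = {name: target_parameter(losses) for name, losses in loss_table.items()}
target_algorithm = min(target_params, key=target_params.get)   # smallest E wins
```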
Optionally, the method further includes:
calculating parameter information of each model in the N models of the target algorithm, wherein the parameter information comprises one of accuracy, recall rate and comprehensive evaluation index;
and selecting one model from the N models of the target algorithm according to the parameter information of the N models of the target algorithm to serve as a target model used by the target application scene.
The above-mentioned comprehensive evaluation index may be referred to as the F value (F-score).
In addition, the accuracy, the recall rate and the comprehensive evaluation index of each model of the target algorithm can be calculated according to the data obtained after each piece of training data is trained according to the target algorithm, so that one model can be selected from N models in the target algorithm as the target model used by the target application scene according to one of the accuracy, the recall rate and the comprehensive evaluation index.
For example, if the target application scenario places emphasis on comparing the accuracy of the models, the model with the highest accuracy is selected from the N models of the target algorithm as the target model used in the target application scenario; if the target scenario places emphasis on comparing the recall rates of the models, the model with the highest recall rate is selected from the N models of the target algorithm as the target model used in the target application scenario; or, if the target application scenario places emphasis on comparing the comprehensive evaluation indexes of the models, the model with the highest comprehensive evaluation index is selected from the N models of the target algorithm as the target model used in the target application scenario.
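The following sketch selects the target model from the N models of the target algorithm according to whichever metric the scenario emphasizes; the scikit-learn metric functions and the held-out evaluation set are assumptions for illustration.

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

METRICS = {"accuracy": accuracy_score, "recall": recall_score, "f_score": f1_score}

def select_target_model(models, X_eval, y_eval, emphasis="f_score"):
    """Return the model with the highest value of the emphasized metric."""
    metric = METRICS[emphasis]
    scores = [metric(y_eval, m.predict(X_eval)) for m in models]
    best_index = max(range(len(models)), key=lambda i: scores[i])
    return models[best_index], scores[best_index]
```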
Optionally, the determining target parameters of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm includes:
carrying out data normalization processing on loss values of the N models of the jth alternative algorithm;
and determining the target parameters of the jth alternative algorithm according to the loss values of the N models of the jth alternative algorithm after the data normalization processing.
When M is greater than 1, that is, when there are multiple parts of test data, the different parts may contain data from different scenarios; for example, the first part of test data is data about house prices and the second part is data about mobile phone prices. These two kinds of data differ greatly in order of magnitude, so their loss values under the same model also differ greatly. In this case, if the target parameter of a candidate algorithm is calculated directly from the loss values of its N models, the effect of the loss values with a smaller order of magnitude is weakened and the effect of the loss values with a larger order of magnitude is amplified, so that the finally determined target parameter of the candidate algorithm is inaccurate.
In the embodiment of the application, the loss values of the N models of the same alternative algorithm can be respectively normalized, so that the loss values of the N models of the alternative algorithm are in the same quantity level, and the target parameters can more accurately represent the performance quality of the alternative algorithm.
That is, the normalization process achieves normalization within a model family (i.e., within the N models of one candidate algorithm) and passivation of the regression indexes, so that each candidate algorithm can be better evaluated.
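One possible normalization is a min-max scaling of the N loss values within a model family, as sketched below; min-max scaling is only an assumed choice, and any normalization that brings the loss values to the same order of magnitude would serve the same purpose.

```python
import numpy as np

def normalize_losses(losses):
    """Min-max normalize the N loss values of one candidate algorithm
    so that they fall into the common range [0, 1]."""
    losses = np.asarray(losses, dtype=float)
    lo, hi = losses.min(), losses.max()
    if hi == lo:                    # all loss values identical
        return np.zeros_like(losses)
    return (losses - lo) / (hi - lo)
```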
In summary, the specific implementation of the algorithm selection method in the embodiment of the present application can be as follows:
the first step is as follows: and (4) preparing data.
After data cleaning is performed on the target sample data, as shown in FIG. 2, the target sample data is randomly divided into a training data set and a test data set. The training data is then randomly and evenly divided into 30 parts (or more than 30 parts; 30 is an empirical number, and it is generally considered that the statistical indexes of a sample size of 30 or more can describe the overall distribution), and the negative samples are distributed so that each part of training data contains an equal number of negative samples, which prevents model training from failing because some parts of the training data contain no negative samples or insufficient negative samples.
That is, the target sample data is finally divided into 30 parts of training data and 1 part of test data.
The second step is that: alternative algorithms and hyper-parameters are determined.
In theory, the selection of alternative algorithms could exhaust all machine learning models, but in order to reduce the computation cost, the required classification or regression algorithms are generally selected according to the corresponding application scenario or according to what is popular in engineering practice; for example, the three algorithms xgboost, the CNN algorithm and gbdt+lr may be used as the alternative algorithms. The hyper-parameters of each alternative algorithm take empirical values.
The third step: training process and loss evaluation, namely, the following processes are executed for each alternative algorithm:
firstly, training each piece of training data by adopting an alternative algorithm and a hyper-parameter of the algorithm to obtain 30 models of the alternative algorithm, wherein the 30 models of one alternative algorithm can form a model family of the alternative algorithm.
Then, L input values included in the test data are input to each model, respectively, to obtain L predicted values corresponding to each model, and then the following process is performed for each model:
inputting the L output values in the test data and the L predicted values corresponding to one of the models into a predetermined loss function, for example the square loss function e = Σ_{k=1}^{L} (f(k) − y(k))^2, to obtain a loss value e of the model, where f(k) represents the kth predicted value and y(k) represents the kth output value.
Therefore, an alternative algorithm can obtain 30 models through training data, and each model can calculate a loss value through test data, so that an alternative algorithm can correspond to the loss values of 30 models.
The fourth step: the target parameter E of each alternative algorithm is calculated according to the following formula:
E_p = k1 × b_p + k2 × v_p, where b_p = (1/30) Σ_{q=1}^{30} e_pq and v_p = (1/30) Σ_{q=1}^{30} (e_pq − b_p)^2; E_p represents the target parameter of the pth alternative algorithm, e_pq represents the loss value of the qth model of the pth alternative algorithm, p is an integer, q is an integer from 1 to 30, and k1 and k2 are respectively predetermined weights.
The fifth step:
And selecting the candidate algorithm with the minimum target parameter as the algorithm used by the target application scene.
The sixth step:
And selecting one model from the 30 models of the target algorithm as the model used by the target application scenario according to one of the accuracy, the recall rate and the F-score of the 30 models of the target algorithm.
As can be seen from the above, in the embodiments of the present application, the performance of different data sets on different candidate algorithms is used to define the distance between the machine learning model functions learned under each candidate algorithm and the real function, the candidate algorithm with the optimal performance is selected as the direction of the subsequent engineering application, and then the optimal model of that candidate algorithm is selected as the basis for the subsequent construction. Therefore, the method and the device enable a machine learning engineer, during the algorithm selection period, to select from the various candidate algorithms an algorithm that is relatively well suited to the current application scenario, so that the subsequent cost of repeated exploration is reduced, the modeling success rate and the engineering applicability are improved, and the model stability and generalization capability are enhanced.
It should be noted that, in the algorithm selection method provided in the embodiments of the present application, the execution body may be an algorithm selection device, or a control module in the algorithm selection device for executing the algorithm selection method. In the embodiments of the present application, the algorithm selection method provided by the embodiments of the present application is described by taking an algorithm selection device executing the algorithm selection method as an example.
Referring to fig. 4, an embodiment of the present invention provides an algorithm selecting apparatus, and the algorithm selecting apparatus 400 may include the following modules:
the splitting module 401 is configured to divide the target sample data into N parts of training data and M parts of test data, where N is greater than 1 and M is greater than or equal to 1;
a first parameter determining module 402, configured to perform the following process when j takes each integer from 1 to G:
training the jth alternative algorithm on each piece of training data respectively to obtain N models of the jth alternative algorithm;
determining a loss value of each model in the N models of the jth alternative algorithm according to the M test data;
determining target parameters of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, wherein the target parameters are used for representing evaluation values of the performance of the candidate algorithms;
a first selection module 403, configured to determine a target algorithm from the G candidate algorithms according to the G target parameters;
wherein G is the number of the alternative algorithms.
Optionally, the difference between the number of pieces of negative sample data in each of the N parts of training data and a preset number is smaller than a preset value.
Optionally, each piece of test data includes input data and output data; when the first parameter determining module 402 determines the loss value of one model of the N models of the jth candidate algorithm according to the M test data, the first parameter determining module is specifically configured to:
when i takes each integer from 1 to M, the following process is performed:
inputting input data in the ith test data into a first model, and outputting ith prediction data, wherein the first model is one of N models of the jth alternative algorithm;
substituting output data in the ith part of test data and the ith part of prediction data into a predetermined loss function to obtain an ith loss value;
and determining the loss value of the first model according to the 1 st to Mth loss values.
Optionally, when determining the target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, the first parameter determining module 402 is specifically configured to:
calculating a first average value b of loss values of the N models of the jth alternative algorithm;
calculating the variance v of the loss values of the N models of the jth alternative algorithm;
substituting the first average value b and the variance v into a preset formula E = k1 × b + k2 × v to obtain a target parameter E of the jth alternative algorithm;
where k1 and k2 are predetermined weights, respectively.
Optionally, the apparatus further comprises:
a second parameter determining module 404, configured to calculate parameter information of each of the N models of the target algorithm, where the parameter information includes one of an accuracy, a recall rate, and a comprehensive evaluation index;
a second selecting module 405, configured to select one model from the N models of the target algorithm according to the parameter information of the N models of the target algorithm, so as to serve as a target model used in a target application scenario.
Optionally, when the target parameter of the jth candidate algorithm is determined according to the loss values of the N models of the jth candidate algorithm, the first parameter determining module 402 is specifically configured to:
carrying out data normalization processing on loss values of the N models of the jth alternative algorithm;
and determining target parameters of the jth alternative algorithm according to the loss values of the N models of the jth alternative algorithm after data normalization processing.
The algorithm selecting device in the embodiment of the present application may be a device, and may also be a component, an integrated circuit, or a chip in a terminal. The device can be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not particularly limited.
The algorithm selecting device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The algorithm selection device provided in the embodiment of the present application can implement each process implemented in the method embodiment of fig. 1, and is not described here again to avoid repetition.
As can be seen from the above, the algorithm selection device in the embodiment of the application can divide target sample data into N training data and M test data, and then train each training data for the jth candidate algorithm when j is an integer from 1 to G, to obtain N models of the jth candidate algorithm, so as to determine a loss value of each model of the N models of the jth candidate algorithm according to the M test data, determine a target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, and further determine a target algorithm from the G candidate algorithms according to the G target parameters, where G is the number of the candidate algorithms.
Therefore, the algorithm selection device in the embodiment of the application can determine the performance quality index, namely the target parameter, of each alternative algorithm in the multiple alternative algorithms according to the target sample data, and further can select an algorithm with the optimal performance from the multiple alternative algorithms according to the target parameter so as to be used in the target application scene.
Therefore, the algorithm selection device in the embodiment of the application only needs to determine the target parameters of the alternative algorithms according to the target sample data, so that the algorithm with the optimal performance in the target application scene can be selected according to the target parameters without repeatedly trying and exploring, and therefore the algorithm used in the target application scene selected in the embodiment of the application is high in accuracy and low in exploration cost.
Optionally, an electronic device is further provided in this embodiment of the present application, as shown in fig. 5, the electronic device 500 includes a processor 510, a memory 520, and a program or an instruction stored in the memory 520 and capable of being executed on the processor 510, where the program or the instruction is executed by the processor 510 to implement each process of the above embodiment of the algorithm selection method, and can achieve the same technical effect, and in order to avoid repetition, it is not described here again.
It should be noted that the electronic devices in the embodiments of the present application include the mobile electronic devices and the non-mobile electronic devices described above.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and the like.
Those skilled in the art will appreciate that the electronic device 600 may further comprise a power source (e.g., a battery) for supplying power to the various components, and the power source may be logically connected to the processor 610 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 6 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than those shown, or combine some components, or arrange different components, and thus, the description is omitted here.
The processor 610 is configured to perform the following processes:
dividing target sample data into N parts of training data and M parts of test data, wherein N is greater than 1, and M is greater than or equal to 1;
when j takes each integer of 1 to G, the following process is performed:
training the jth alternative algorithm on each piece of training data respectively to obtain N models of the jth alternative algorithm;
determining a loss value of each model in the N models of the jth alternative algorithm according to the M test data;
determining target parameters of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, wherein the target parameters are used for representing evaluation values of the performance of the candidate algorithms;
determining a target algorithm from the G candidate algorithms according to the G target parameters;
wherein G is the number of the alternative algorithms.
As can be seen from the above, the electronic device 600 according to the embodiment of the present application can divide target sample data into N training data and M test data, and then train each training data for the jth candidate algorithm when j is an integer from 1 to G to obtain N models of the jth candidate algorithm, so as to determine a loss value of each model of the N models of the jth candidate algorithm according to the M test data, determine a target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, and further determine a target algorithm from the G candidate algorithms according to the G target parameters, where G is the number of the candidate algorithms.
Therefore, the electronic device 600 in the embodiment of the present application may determine, according to the target sample data, a performance quality index, that is, a target parameter, of each candidate algorithm in the multiple candidate algorithms, and further may select, according to the target parameter, an algorithm with the best performance from the multiple candidate algorithms to use in the target application scenario.
Therefore, the electronic device 600 according to the embodiment of the present application only needs to determine the target parameters of the candidate algorithms according to the target sample data, so that the algorithm with the best performance in the target application scenario can be selected according to the target parameters without repeatedly trying and exploring, and therefore, the accuracy of the algorithm used in the target application scenario selected according to the embodiment of the present application is higher, and the exploration cost is lower.
Optionally, the difference between the number of pieces of negative sample data in each of the N parts of training data and a preset number is smaller than a preset value.
Optionally, each piece of test data includes input data and output data; when determining the loss value of one model of the N models of the jth candidate algorithm according to the M test data, the processor 610 is specifically configured to:
when i takes each integer from 1 to M, the following process is performed:
inputting input data in the ith test data into a first model, and outputting ith prediction data, wherein the first model is one of N models of the jth alternative algorithm;
substituting output data in the ith part of test data and the ith part of prediction data into a predetermined loss function to obtain an ith loss value;
and determining the loss value of the first model according to the 1 st to Mth loss values.
Optionally, when determining the target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, the processor 610 is specifically configured to:
calculating a first average value b of loss values of the N models of the jth alternative algorithm;
calculating the variance v of the loss values of the N models of the jth alternative algorithm;
substituting the first average value b and the variance v into a preset formula E = k1 × b + k2 × v to obtain a target parameter E of the jth alternative algorithm;
where k1 and k2 are predetermined weights, respectively.
Optionally, the processor 610 is further configured to:
calculating parameter information of each model in the N models of the target algorithm, wherein the parameter information comprises one of accuracy, recall rate and comprehensive evaluation index;
and selecting one model from the N models of the target algorithm according to the parameter information of the N models of the target algorithm to serve as a target model used by a target application scene.
Optionally, when determining the target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, the processor 610 is specifically configured to:
carrying out data normalization processing on loss values of the N models of the jth alternative algorithm;
and determining target parameters of the jth alternative algorithm according to the loss values of the N models of the jth alternative algorithm after data normalization processing.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the embodiment of the algorithm selection method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and so on.
The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to execute a program or an instruction to implement each process of the embodiment of the algorithm selection method, and can achieve the same technical effect, and is not described herein again to avoid repetition.
It should be understood that the chips mentioned in the embodiments of the present application may also be referred to as system-on-chip, system-on-chip or system-on-chip, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved, e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. An algorithm selection method, characterized in that the method comprises:
dividing target sample data into N pieces of training data and M pieces of test data, wherein N is greater than 1 and M is greater than or equal to 1;
when j takes each integer from 1 to G, performing the following process:
training the jth candidate algorithm on each piece of training data respectively, to obtain N models of the jth candidate algorithm;
determining a loss value of each model in the N models of the jth candidate algorithm according to the M pieces of test data;
determining a target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, wherein the target parameter is used for representing an evaluation value of the performance of the candidate algorithm; and
determining a target algorithm from the G candidate algorithms according to the G target parameters;
wherein G is the number of the candidate algorithms.
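For illustration only, the selection procedure of claim 1 can be sketched in Python roughly as follows, assuming scikit-learn-style classifiers as the candidate algorithms, log-loss as the predetermined loss function, and the E = k1·b + k2·v target parameter of claim 4 with a smaller-is-better reading; the candidate pool, helper names, and aggregation choices are assumptions, not limitations of the claim:

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.tree import DecisionTreeClassifier

def split_sample_data(X, y, n, m, seed=0):
    """Divide the target sample data into N pieces of training data and M pieces of test data."""
    idx = np.random.default_rng(seed).permutation(len(X))
    parts = np.array_split(idx, n + m)
    return ([(X[p], y[p]) for p in parts[:n]],   # N training pieces
            [(X[p], y[p]) for p in parts[n:]])   # M test pieces

def select_algorithm(candidates, X, y, n=5, m=2, k1=1.0, k2=1.0):
    """Return the candidate algorithm with the smallest target parameter E."""
    train_pieces, test_pieces = split_sample_data(X, y, n, m)
    labels = np.unique(y)  # assumes every training piece still contains each class
    target_params = {}
    for name, algo in candidates.items():        # j = 1 .. G
        # Train the j-th candidate on each training piece: N models.
        models = [clone(algo).fit(Xt, yt) for Xt, yt in train_pieces]
        # Loss value of each model, computed from the M test pieces.
        losses = [np.mean([log_loss(yv, mdl.predict_proba(Xv), labels=labels)
                           for Xv, yv in test_pieces])
                  for mdl in models]
        b, v = float(np.mean(losses)), float(np.var(losses))
        target_params[name] = k1 * b + k2 * v    # target parameter E of candidate j
    return min(target_params, key=target_params.get), target_params

# Example candidate pool (illustrative only).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5),
}
```

Under these assumptions, calling select_algorithm(candidates, X, y) on a labelled data set would return the name of the target algorithm together with the G target parameters.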
2. The algorithm selection method of claim 1, wherein a difference between the number of pieces of negative sample data in each of the N pieces of training data and a predetermined number is smaller than a predetermined value.
3. The algorithm selection method of claim 1, wherein each piece of test data includes input data and output data;
and determining a loss value of one model in the N models of the jth candidate algorithm according to the M pieces of test data comprises:
when i takes each integer from 1 to M, performing the following process:
inputting input data in the ith piece of test data into a first model to output ith prediction data, wherein the first model is one of the N models of the jth candidate algorithm;
substituting output data in the ith piece of test data and the ith prediction data into a predetermined loss function to obtain an ith loss value; and
determining the loss value of the first model according to the 1st to Mth loss values.
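A minimal sketch of this per-model loss computation, assuming the loss value of the first model is the mean of the 1st to Mth loss values and using squared error as an example predetermined loss function (both are assumptions; the claim only requires some loss function and some aggregation of the M loss values):

```python
import numpy as np

def squared_error(y_true, y_pred):
    """Example predetermined loss function (assumed): mean squared error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def model_loss(model, test_pieces, loss_fn=squared_error):
    """Loss value of one model of the j-th candidate algorithm."""
    loss_values = []
    for x_i, y_i in test_pieces:                  # i-th piece of test data (input, output)
        pred_i = model.predict(x_i)               # i-th prediction data
        loss_values.append(loss_fn(y_i, pred_i))  # i-th loss value
    return float(np.mean(loss_values))            # aggregate the 1st..M-th loss values
```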
4. The algorithm selection method of claim 1, wherein determining the target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm comprises:
calculating a first average value b of the loss values of the N models of the jth candidate algorithm;
calculating a variance v of the loss values of the N models of the jth candidate algorithm; and
substituting the first average value b and the variance v into a preset formula E = k1·b + k2·v to obtain the target parameter E of the jth candidate algorithm;
where k1 and k2 are predetermined weights, respectively.
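Read as E = k1·b + k2·v, the target parameter combines average loss and its spread, so that (under a smaller-is-better reading) candidates whose N models are both accurate on average and stable across training pieces are favoured. A short numeric illustration with assumed loss values and weights:

```python
import numpy as np

losses = np.array([0.32, 0.35, 0.30, 0.41, 0.33])  # loss values of the N models (illustrative)
k1, k2 = 1.0, 2.0                                   # predetermined weights (assumed)

b = float(losses.mean())   # first average value b
v = float(losses.var())    # variance v
E = k1 * b + k2 * v        # target parameter E = k1*b + k2*v
print(f"b={b:.4f}, v={v:.4f}, E={E:.4f}")
```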
5. The algorithm selection method of claim 1, further comprising:
calculating parameter information of each model in the N models of the target algorithm, wherein the parameter information comprises one of accuracy, recall rate, and a comprehensive evaluation index; and
selecting, according to the parameter information of the N models of the target algorithm, one model from the N models of the target algorithm as a target model used in a target application scenario.
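A sketch of this final model pick, assuming scikit-learn metrics as the parameter information, a held-out evaluation set, and larger-is-better scores; which metric is used (accuracy, recall, or a comprehensive index such as F1) would depend on the target application scenario:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, recall_score

METRICS = {"accuracy": accuracy_score, "recall": recall_score, "f1": f1_score}

def pick_target_model(models, X_eval, y_eval, metric_name="f1"):
    """Select the model with the best parameter information as the target model."""
    metric = METRICS[metric_name]
    scores = [metric(y_eval, mdl.predict(X_eval)) for mdl in models]
    best = int(np.argmax(scores))   # larger-is-better assumption
    return models[best], scores
```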
6. The algorithm selection method of claim 1, wherein determining the target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm comprises:
performing data normalization processing on the loss values of the N models of the jth candidate algorithm; and
determining the target parameter of the jth candidate algorithm according to the loss values, after the data normalization processing, of the N models of the jth candidate algorithm.
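The claim does not fix a normalization scheme; a minimal sketch assuming min-max normalization of the N loss values before the target parameter is computed:

```python
import numpy as np

def normalize_losses(losses):
    """Min-max normalization of the loss values of the N models (assumed scheme)."""
    losses = np.asarray(losses, dtype=float)
    span = losses.max() - losses.min()
    if span == 0.0:                  # all models equally good: map everything to 0
        return np.zeros_like(losses)
    return (losses - losses.min()) / span
```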
7. An algorithm selection device, the device comprising:
a splitting module, configured to divide target sample data into N pieces of training data and M pieces of test data, wherein N is greater than 1 and M is greater than or equal to 1;
a first parameter determining module, configured to, when j takes each integer from 1 to G, perform the following process:
training the jth candidate algorithm on each piece of training data respectively, to obtain N models of the jth candidate algorithm;
determining a loss value of each model in the N models of the jth candidate algorithm according to the M pieces of test data; and
determining a target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, wherein the target parameter is used for representing an evaluation value of the performance of the candidate algorithm; and
a first selection module, configured to determine a target algorithm from the G candidate algorithms according to the G target parameters;
wherein G is the number of the candidate algorithms.
8. The algorithm selection device of claim 7, wherein a difference between the number of pieces of negative sample data in each of the N pieces of training data and a predetermined number is smaller than a predetermined value.
9. The algorithm selection device of claim 7, wherein each piece of test data comprises input data and output data; and when determining a loss value of one model in the N models of the jth candidate algorithm according to the M pieces of test data, the first parameter determining module is specifically configured to:
when i takes each integer from 1 to M, perform the following process:
inputting input data in the ith piece of test data into a first model to output ith prediction data, wherein the first model is one of the N models of the jth candidate algorithm;
substituting output data in the ith piece of test data and the ith prediction data into a predetermined loss function to obtain an ith loss value; and
determining the loss value of the first model according to the 1st to Mth loss values.
10. The algorithm selection device of claim 7, wherein, when determining the target parameter of the jth candidate algorithm according to the loss values of the N models of the jth candidate algorithm, the first parameter determining module is specifically configured to:
calculate a first average value b of the loss values of the N models of the jth candidate algorithm;
calculate a variance v of the loss values of the N models of the jth candidate algorithm; and
substitute the first average value b and the variance v into a preset formula E = k1·b + k2·v to obtain the target parameter E of the jth candidate algorithm;
where k1 and k2 are predetermined weights, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110121910.5A CN112766402A (en) | 2021-01-28 | 2021-01-28 | Algorithm selection method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110121910.5A CN112766402A (en) | 2021-01-28 | 2021-01-28 | Algorithm selection method and device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112766402A (en) | 2021-05-07
Family
ID=75706552
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110121910.5A Pending CN112766402A (en) | Algorithm selection method and device and electronic equipment | 2021-01-28 | 2021-01-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112766402A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113626331A (en) * | 2021-08-12 | 2021-11-09 | 曙光信息产业(北京)有限公司 | Communication algorithm selection method and device, computer equipment and storage medium |
CN114331238A (en) * | 2022-03-17 | 2022-04-12 | 北京航天晨信科技有限责任公司 | Intelligent model algorithm optimization method, system, storage medium and computer equipment |
CN116700736A (en) * | 2022-10-11 | 2023-09-05 | 荣耀终端有限公司 | Determination method and device for application recommendation algorithm |
CN116700736B (en) * | 2022-10-11 | 2024-05-31 | 荣耀终端有限公司 | Determination method and device for application recommendation algorithm |
Similar Documents
Publication | Title | Publication Date
---|---|---|
CN112766402A (en) | Algorithm selection method and device and electronic equipment | |
CN108197652B (en) | Method and apparatus for generating information | |
CN110766080B (en) | Method, device and equipment for determining labeled sample and storage medium | |
CN114265979B (en) | Method for determining fusion parameters, information recommendation method and model training method | |
CN111435463B (en) | Data processing method, related equipment and system | |
CN116596095B (en) | Training method and device of carbon emission prediction model based on machine learning | |
CN112561082A (en) | Method, device, equipment and storage medium for generating model | |
CN113961765B (en) | Searching method, searching device, searching equipment and searching medium based on neural network model | |
CN111178537B (en) | Feature extraction model training method and device | |
CN112785005B (en) | Multi-objective task assistant decision-making method and device, computer equipment and medium | |
CN113240155A (en) | Method and device for predicting carbon emission and terminal | |
CN111611390B (en) | Data processing method and device | |
CN111159481B (en) | Edge prediction method and device for graph data and terminal equipment | |
CN113378067B (en) | Message recommendation method, device and medium based on user mining | |
CN114970357A (en) | Energy-saving effect evaluation method, system, device and storage medium | |
CN115034379A (en) | Causal relationship determination method and related equipment | |
CN113869377A (en) | Training method and device and electronic equipment | |
CN114330090A (en) | Defect detection method and device, computer equipment and storage medium | |
CN112862021A (en) | Content labeling method and related device | |
CN113409096B (en) | Target object identification method and device, computer equipment and storage medium | |
CN115618065A (en) | Data processing method and related equipment | |
CN113807391A (en) | Task model training method and device, electronic equipment and storage medium | |
CN112819079A (en) | Model sampling algorithm matching method and device and electronic equipment | |
CN109436980A (en) | The condition detection method and system of elevator components | |
CN118094233B (en) | Content processing model integration method and related equipment |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20210507 |