CN112395551A - Optimization method of logistic regression - Google Patents
- Publication number: CN112395551A
- Application number: CN201910751819.4A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The invention discloses a method and a device for optimizing logistic regression, relating to the field of logistic regression. The method comprises the following steps: calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function; updating the first parameter by using a gradient descent method to obtain a second parameter; and substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
Description
Technical Field
The invention relates to the field of logistic regression, in particular to a logistic regression optimization method.
Background
With the development of science and technology, internet applications grow by the day and people depend on the network ever more visibly. Competition among large internet companies has intensified accordingly, and how to increase user visits and prolong users' time online has become an important problem for them. The technique that addresses this problem is click-through rate prediction: with it, a search engine website can select the advertisements that perform best when shown, a shopping website can push the commodities users wish to buy, and a news or entertainment website can show the content users are more interested in. Click-through rate prediction is commonly implemented with a logistic regression model, one of the most popular models in the internet industry, which is typically used to express as a formula the probability distribution behind a random event.
In the prior art, when solving the parameters of a logistic regression model, the log loss derived from maximum likelihood estimation is used as the loss function, and the model parameters are solved by optimizing the log loss during model training. However, when the model is evaluated to judge whether the obtained model parameters are good or bad, the area under the receiver operating characteristic (ROC) curve (Area Under Curve, AUC) is used as the index. Training and evaluation of the same model parameters thus use different criteria with no direct link between them, so the model parameters obtained by optimizing the log loss are not necessarily the optimal solution under the subsequent AUC evaluation, which reduces the prediction accuracy of the logistic regression model.
Therefore, the fact that model parameters obtained by using the log loss as the loss function in the prior art are not necessarily the optimal solution under the subsequent AUC evaluation is a problem to be solved urgently.
Disclosure of Invention
The embodiment of the application provides an optimization method of logistic regression, and solves the problem that in the prior art, model parameters obtained by using log loss as a loss function are not necessarily regarded as optimal solutions in a subsequent AUC evaluation model.
The optimization method of logistic regression provided by the embodiment of the application specifically includes:
calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function;
updating the first parameter by using a gradient descent method to obtain a second parameter;
and substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
In one possible implementation, using the AUC as a loss function includes:
based on the equivalence between the AUC and the Wilcoxon-Mann-Whitney test statistic, the calculation method of the AUC is converted in form, and the converted formula is as follows:
$$\mathrm{AUC}(\beta) = P\left(\beta^{T}x^{+} > \beta^{T}x^{-}\right)$$
wherein $\beta$ is the first parameter vector, $x^{+}$ is a positive sample vector, $x^{-}$ is a negative sample vector, and $\mathrm{AUC}(\beta)$ represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} g\left(\beta^{T}(x^{+}-x^{-})\right)$$
wherein, in the total sample, the data set of the positive samples is $P$, the data set of the negative samples is $Q$, and the counting function $g(x)$ is:
$$g(x)=\begin{cases}1, & x>0\\ 0, & \text{otherwise}\end{cases}$$
In one possible implementation, the method further includes:
converting the counting function $g(x)$ into the logistic function $g(x)=\dfrac{1}{1+e^{-x}}$;
obtaining the AUC as a loss function applicable to the gradient descent method, with the formula:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} \frac{1}{1+e^{-\beta^{T}(x^{+}-x^{-})}}$$
In one possible implementation, updating the first parameter by using the gradient descent method includes:
updating the first parameter according to the formula of the gradient descent method and the AUC as a loss function, wherein the formula is as follows:
In one possible implementation, the method further includes:
applying supervised machine learning to train the first parameter to obtain the second parameter, which comprises the following steps:
acquiring N existing samples and the corresponding N sample results; setting the number of samples needed for one update of the first parameter to M, so that the total number of updates of the first parameter is N/M, wherein N and M are integers greater than 0 and N is greater than M;
and, for each group of M samples and the corresponding M sample results, solving the first parameter by applying the formula with the AUC as a loss function, and updating N/M times by using the gradient descent method to obtain the second parameter.
The embodiment of the present application provides an optimization device for logistic regression, which specifically includes:
a first processing unit, used for calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function, and for updating the first parameter by using a gradient descent method to obtain a second parameter;
and a second processing unit, used for substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
In one possible implementation, using the AUC as a loss function includes:
based on the equivalence between the AUC and the Wilcoxon-Mann-Whitney test statistic, the calculation method of the AUC is converted in form, and the converted formula is as follows:
$$\mathrm{AUC}(\beta) = P\left(\beta^{T}x^{+} > \beta^{T}x^{-}\right)$$
wherein $\beta$ is the first parameter vector, $x^{+}$ is a positive sample vector, $x^{-}$ is a negative sample vector, and $\mathrm{AUC}(\beta)$ represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} g\left(\beta^{T}(x^{+}-x^{-})\right)$$
wherein, in the total sample, the data set of the positive samples is $P$, the data set of the negative samples is $Q$, and the counting function $g(x)$ is:
$$g(x)=\begin{cases}1, & x>0\\ 0, & \text{otherwise}\end{cases}$$
In one possible implementation, the device is further configured for:
converting the counting function $g(x)$ into the logistic function $g(x)=\dfrac{1}{1+e^{-x}}$;
obtaining the AUC as a loss function applicable to the gradient descent method, with the formula:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} \frac{1}{1+e^{-\beta^{T}(x^{+}-x^{-})}}$$
Embodiments of the present application provide a computer device comprising a program or instructions that, when executed, cause a computer to perform the method of any of the above possible designs.
Embodiments of the present application provide a storage medium containing a program or instructions that, when executed, cause a computer to perform the method of any of the above possible designs.
The optimization method of logistic regression provided by the invention has the following beneficial effects: by taking the AUC as the loss function and directly targeting the optimization of AUC during model training, better model performance is obtained, which improves the prediction accuracy of the logistic regression model and in turn the application effect in each business scenario.
Drawings
FIG. 1 is a flow chart of a prior art logistic regression method;
FIG. 2 is a flow chart of a method for logistic regression optimization in an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for optimizing logistic regression using supervised machine learning training parameters according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an apparatus of a logistic regression optimization method in an embodiment of the present application.
Detailed Description
In order to better understand the technical solutions, the technical solutions will be described in detail below with reference to the drawings and the specific embodiments of the specification, and it should be understood that the specific features in the embodiments and examples of the present application are detailed descriptions of the technical solutions of the present application, but not limitations of the technical solutions of the present application, and the technical features in the embodiments and examples of the present application may be combined with each other without conflict.
FIG. 1 is a flow chart of a logistic regression method in the prior art. The logistic regression model is one of the most popular models in the Internet industry and is generally used as a binary classification model.
Step 101: obtaining the probability formula of the logistic regression model, wherein the obtained formula (1) satisfies the following form:
$$h_{\theta}(X) = \Pr(Y = 1 \mid X; \theta) = \frac{1}{1 + e^{-\theta^{T}X}} \tag{1}$$
Specifically, taking click-through rate prediction of whether a user will click on an advertisement as an example, the probability distribution behind a random event such as "whether a person will click on the advertisement" is expressed by the above formula, where X represents the observed features, θ is the parameter that the algorithm needs to solve, and Y = 1 denotes the event that the user clicks on the advertisement, so that the formula gives the predicted probability of a click.
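The probability formula of step 101 can be sketched in a few lines. This is a minimal illustration, assuming the standard logistic form of formula (1); the function names `sigmoid` and `predict_proba` and the feature and parameter values are hypothetical and do not come from the patent.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(theta, x):
    """Pr(Y = 1 | X = x; theta): predicted probability of a click."""
    score = sum(t * v for t, v in zip(theta, x))
    return sigmoid(score)


# Hypothetical user with two numeric features; the theta values are made up.
theta = [0.8, -0.4]
x = [1.0, 2.0]
p_click = predict_proba(theta, x)  # the linear score is 0, so p_click is 0.5
```

A linear score of zero maps to a probability of 0.5; larger scores move the prediction toward 1 and smaller scores toward 0.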
Step 102: when solving the parameters of the probability formula, calculating by using the log loss as the loss function;
Specifically, the model parameter θ is obtained by minimizing the log loss, that is, the negative logarithm of the likelihood function of maximum likelihood estimation, wherein the likelihood function satisfies formulas (2) to (4):
$$\begin{aligned}
L(\theta \mid x) &= \Pr(Y \mid X; \theta) && (2)\\
&= \prod_{i} \Pr(y_i \mid x_i; \theta) && (3)\\
&= \prod_{i} h_{\theta}(x_i)^{y_i}\,\bigl(1 - h_{\theta}(x_i)\bigr)^{1 - y_i} && (4)
\end{aligned}$$
Step 103: updating the parameters by using a gradient descent method;
Specifically, the gradient of the loss function in step 102 is taken and the parameter θ is continuously updated by the gradient descent method; the model parameter θ is trained by machine learning, and when the sample prediction results obtained in training are very close to the actual sample results, that is, when the loss function reaches its minimum, the resulting parameter θ is the finally confirmed model parameter.
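The prior-art training of steps 102 and 103 can be sketched as follows. This is a minimal illustration, assuming the log loss is the mean negative logarithm of the likelihood in formula (4) and that plain batch gradient descent is used; the toy data, the learning rate `lr`, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(theta, x):
    """h_theta(x): predicted probability that y = 1."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))


def log_loss(theta, X, y):
    """Mean negative log-likelihood over the samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(theta, xi)
        total -= yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total / len(X)


def gradient_step(theta, X, y, lr):
    """One gradient descent update: theta <- theta - lr * grad(log_loss)."""
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = predict(theta, xi) - yi
        for k, v in enumerate(xi):
            grad[k] += err * v / len(X)
    return [t - lr * g for t, g in zip(theta, grad)]


# Toy data (made up): the first feature is a constant bias term.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]
theta = [0.0, 0.0]
loss_before = log_loss(theta, X, y)  # equals log(2) when theta is all zeros
for _ in range(100):
    theta = gradient_step(theta, X, y, lr=0.1)
loss_after = log_loss(theta, X, y)
```

The gradient of the log loss with respect to θ is the average of (h_θ(x_i) − y_i)·x_i, which is what `gradient_step` accumulates.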
Step 104: substituting the parameters into the probability formula and evaluating the logistic regression model using AUC.
Specifically, the parameter θ obtained through machine learning training in step 103 is substituted, together with the verification samples, into the probability formula of the logistic regression model, and the AUC value is then computed from the resulting probability values using the AUC calculation method. The larger the AUC value, the better the performance of the logistic regression model; that is, AUC is used to evaluate the logistic regression model and to determine whether the obtained model parameter θ is the optimal solution.
As steps 101-104 show, when solving the parameters of the logistic regression model, the log loss obtained from maximum likelihood estimation is used as the loss function, and the model parameter θ is solved by optimizing (i.e. minimizing) the log loss during model training; the model is then evaluated using AUC as the index to determine whether the obtained model parameter θ is the optimal solution. Training and evaluation of the same model parameter thus use different criteria with no direct link between them, so the model parameter obtained by optimizing the log loss is not necessarily the optimal solution under the subsequent AUC evaluation, which reduces the prediction accuracy of the logistic regression model.
Therefore, in the prior art, the model parameter obtained by using the log loss as the loss function is not necessarily the optimal solution under the subsequent AUC evaluation, which is a problem to be solved urgently. The optimization method of logistic regression of the present application takes AUC directly as the loss function and targets the optimization of AUC during model training to obtain better model performance, thereby improving the prediction accuracy of the logistic regression model and in turn the application effect in each business scenario.
Fig. 2 is a flowchart of a method for optimizing logistic regression according to an embodiment of the present application, and specific steps will be described in detail below.
Step 201: obtaining the probability formula of the logistic regression model, wherein the obtained formula (1) still satisfies the following form:
$$h_{\theta}(X) = \Pr(Y = 1 \mid X; \theta) = \frac{1}{1 + e^{-\theta^{T}X}} \tag{1}$$
Step 202: calculating a first parameter using the AUC as a loss function;
Specifically, when solving the parameters of the probability formula, the area under the receiver operating characteristic (ROC) curve (AUC) is used as the loss function. There are two common and intuitive ways to calculate AUC. The first is to draw the ROC curve; the area under it is the value of AUC. The second is as follows: assume there are (m + n) samples in total, of which m are positive and n are negative, giving m × n positive-negative sample pairs; each pair in which the predicted probability of being positive is greater for the positive sample than for the negative sample is counted as 1, and the sum of these counts divided by (m × n) is the AUC. The first method is typically used when evaluating a model with AUC; here, where AUC replaces the log loss as the loss function, the second method is chosen.
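The two calculation methods described above give the same value, which the following sketch checks on a small made-up example. It assumes the trapezoidal rule for the area under the ROC curve and counts tied pairs as 0.5, a common convention that the text does not specify; the function names and data are hypothetical.

```python
def auc_by_pairs(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_roc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    m = sum(labels)        # number of positive samples
    n = len(labels) - m    # number of negative samples
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Move all samples sharing this score across the threshold at once.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n, tp / m))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up predicted probabilities and actual results (1 = positive sample).
scores = [0.9, 0.8, 0.7, 0.6, 0.4]
labels = [1, 0, 1, 1, 0]
a_pairs = auc_by_pairs(scores, labels)  # 4 of the 6 pairs are ordered correctly
a_roc = auc_by_roc(scores, labels)
```

The pair-counting form is the one that lends itself to a loss function, because it is a sum over sample pairs that can later be smoothed and differentiated.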
To apply the second method of calculating AUC, it must be expressed mathematically. AUC is equivalent to the Wilcoxon-Mann-Whitney test statistic: if one sample is drawn at random from the positive sample set and one from the negative sample set, AUC is the probability that the score of the positive sample is greater than the score of the negative sample. Combining this with the linear form of the logistic regression model, this understanding is formalized mathematically, and the obtained formula (5) satisfies the following form:
$$\mathrm{AUC}(\beta) = P\left(\beta^{T}x^{+} > \beta^{T}x^{-}\right) \tag{5}$$
wherein $\beta$ is the first parameter vector, equivalent to the model parameter θ in the prior art, $x^{+}$ is a positive sample vector, $x^{-}$ is a negative sample vector, and $\mathrm{AUC}(\beta)$ represents the probability that the score of a positive sample is greater than the score of a negative sample;
further, the data set of the positive samples is denoted $P$ and the data set of the negative samples is denoted $Q$;
the counting function $g(x)$ satisfies formula (6):
$$g(x)=\begin{cases}1, & x>0\\ 0, & \text{otherwise}\end{cases} \tag{6}$$
thus, incorporating the counting function $g(x)$ into the mathematical formulation of AUC, the resulting formula (7) satisfies the following form:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} g\left(\beta^{T}(x^{+}-x^{-})\right) \tag{7}$$
This is the formula that takes AUC as the loss function. Because the first parameter must be updated by applying the gradient descent method to the loss function, and the counting function $g(x)$ is a step function whose gradient is zero wherever it is defined, $g(x)$ is replaced by the logistic function $g(x)=1/(1+e^{-x})$ and substituted into formula (7); the resulting formula (8), the AUC loss function applicable to the gradient descent method, satisfies the following form:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} \frac{1}{1+e^{-\beta^{T}(x^{+}-x^{-})}} \tag{8}$$
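The effect of replacing the counting function with a logistic function can be sketched as follows. This is a minimal illustration, assuming the parameter vector is named `beta`, the counting function returns 1 for a strictly positive argument and 0 otherwise, and the smooth version averages the logistic function over all positive/negative pairs; the data and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def pair_scores(beta, P, Q):
    """beta . (x_pos - x_neg) for every positive/negative sample pair."""
    return [
        sum(b * (p - q) for b, p, q in zip(beta, xp, xq))
        for xp in P
        for xq in Q
    ]


def exact_auc(beta, P, Q):
    """AUC with the counting function: 1 if the pair is ordered correctly."""
    z = pair_scores(beta, P, Q)
    return sum(1.0 for v in z if v > 0) / len(z)


def smooth_auc(beta, P, Q):
    """AUC with the counting function replaced by the logistic function."""
    z = pair_scores(beta, P, Q)
    return sum(sigmoid(v) for v in z) / len(z)


# Made-up positive (P) and negative (Q) sample vectors.
P = [[2.0, 1.0], [1.5, 0.5]]
Q = [[0.5, 1.0], [1.8, 2.0]]
beta = [1.0, 0.0]
hard = exact_auc(beta, P, Q)   # 3 of the 4 pairs are ordered correctly
soft = smooth_auc(beta, P, Q)  # differentiable stand-in for `hard`
sharp = smooth_auc([50.0 * b for b in beta], P, Q)  # steeper logistic, closer to `hard`
```

The smooth value differs from the exact count, but it varies continuously with `beta`, which is what makes gradient-based updates possible; scaling `beta` up makes the logistic function steeper and brings the smooth value close to the exact one.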
Step 203: updating the first parameter by using a gradient descent method to obtain a second parameter;
Specifically, the gradient of the formula in step 202 that takes AUC as the loss function is computed, and the first parameter is updated by the gradient descent method; the resulting formula (9) satisfies the following form:
Taking N sample data as training data, the first parameter is trained by supervised machine learning and is continuously updated by applying the gradient descent method, to obtain the second parameter as the optimal solution.
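One way the update of step 203 could look is sketched below. It rests on two assumptions that the text does not confirm: the parameter vector is named `beta`, and the quantity minimized is 1 minus the smooth AUC, so that a descent step raises AUC (AUC itself is to be maximized). Under these assumptions the gradient of the loss is the negative average of s(1 − s)(x⁺ − x⁻) over all pairs, where s is the logistic function of β·(x⁺ − x⁻). The data, learning rate, and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss_and_grad(beta, P, Q):
    """Assumed loss 1 - smooth AUC and its gradient with respect to beta."""
    grad = [0.0] * len(beta)
    auc = 0.0
    for xp in P:
        for xq in Q:
            diff = [p - q for p, q in zip(xp, xq)]
            s = sigmoid(sum(b * d for b, d in zip(beta, diff)))
            auc += s
            for k, d in enumerate(diff):
                # Derivative of (1 - sigmoid(beta . diff)) with respect to beta[k].
                grad[k] -= s * (1.0 - s) * d
    pairs = len(P) * len(Q)
    return 1.0 - auc / pairs, [g / pairs for g in grad]


def descent_step(beta, P, Q, lr):
    """One gradient descent update on the assumed loss."""
    _, grad = loss_and_grad(beta, P, Q)
    return [b - lr * g for b, g in zip(beta, grad)]


# Made-up positive (P) and negative (Q) sample vectors.
P = [[2.0, 1.0], [1.5, 0.5]]
Q = [[0.5, 1.0], [1.8, 2.0]]
beta = [0.0, 0.0]
loss_start, _ = loss_and_grad(beta, P, Q)  # 0.5: every pair scores sigmoid(0)
for _ in range(50):
    beta = descent_step(beta, P, Q, lr=0.5)
loss_end, _ = loss_and_grad(beta, P, Q)
```

Minimizing 1 − AUC and maximizing AUC are the same problem; the sign convention above is chosen only so that the update can be written as a descent step.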
Step 204: substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
Specifically, L sample data are taken as verification data and divided, according to the actual results of the samples, into positive samples and negative samples. The parameter obtained in step 203 is substituted together with the L sample data into the probability formula of the logistic regression model, giving the predicted probability of being positive for each positive sample and for each negative sample. The value of AUC is then calculated by the conventional AUC calculation method, namely drawing the ROC curve and taking the area under it. The larger the value of AUC, the better the logistic regression model performs; that is, AUC is used to evaluate the logistic regression model and to judge whether the calculated second parameter is the optimal solution.
The model parameters obtained through steps 201-203 are optimized against the same index that the AUC evaluation in the subsequent step 204 uses, since both apply the calculation method of AUC. This ensures that the model parameters obtained by optimizing the loss function based on AUC also receive a better evaluation result in the subsequent AUC evaluation. Moreover, because the method takes the optimization of AUC as the direct target while training the model parameters by supervised machine learning, the model parameters obtained by model training are better optimized than the model parameter θ obtained with the log loss function in the prior art.
To better understand how a supervised machine learning method is used to train the model parameters to the optimal solution, and how that solution is verified and evaluated, the supervised machine learning method adopted in steps 201-204 is further described below by way of example.
Fig. 3 is a flowchart of an optimization method of logistic regression using supervised machine learning training parameters in an embodiment of the present application. The supervised machine learning refers to learning a function from a given training data set, and when new data comes, a result can be predicted according to the function. The specific process is described below in conjunction with the practical application of the logistic regression model.
Step 301: preparing training data;
Specifically, training data is prepared first. For example, when a logistic regression model is applied to predict the click-through rate of a travel advertisement, sample data of 200 users is first obtained as the training data. The sample data comprises the observed user features and the click result of each user, wherein the observed user features include name, gender, age, region, and online time, and the click result is either clicked or not clicked.
Step 302: randomly initializing parameters;
Specifically, the model parameters are initialized randomly within the range (0, 1); the number of samples required for one update of the model parameters is set to 10, so the total number of updates of the model parameters is 200/10 = 20.
Step 303: calculating a predicted value;
Specifically, once these rules are set, samples are taken in units of 10. The 10 samples of the first unit are taken and substituted, together with the initialized model parameters (for example 0.1), into the probability formula of the logistic regression model to obtain the prediction results for these 10 samples; the prediction results are compared with the users' click results in the 10 samples, and the quality of the model parameters is judged from the comparison.
Step 304: calculating AUC as loss function;
Specifically, the 10 samples are used as input values, and the above formula taking the AUC as the loss function is applied to solve the model parameters.
Step 305: updating parameters by using a gradient descent method;
Specifically, taking the model parameters solved in step 304 as the starting point, the 10 samples of the second unit are obtained and the model parameters are updated. The 10 samples of the second unit and the updated model parameters are then substituted together into the probability formula of the logistic regression model to obtain the prediction results for these samples, which are compared with the users' click results in the second unit to judge whether the model parameters are good enough.
Step 306: whether the parameters are good enough;
Specifically, the same operation is repeated on the remaining 18 units of 10 samples each to calculate and update the model parameters, which are adjusted continuously according to the comparison results. When the comparison shows that the prediction results are close to the users' click results, for example a match rate of 99%, the parameters are considered good enough and the final parameters are obtained; when the comparison shows a large gap between the prediction results and the users' click results, for example a match rate of 90%, samples continue to be taken for iterative calculation and updating of the model parameters until the comparison result is good, for example a match rate of 99%.
Step 307: and obtaining the final parameters.
Through the processing in step 306, the final parameter is obtained. The independent variable of the logistic regression model, i.e. the observed user features, comprises more than one item and is expressed mathematically as a vector, so the corresponding model parameter is also a vector whose number of components equals the number of items in the independent variable; the final parameter is therefore also a vector.
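The flow of steps 301-307 can be sketched as a mini-batch loop with N = 200 samples and M = 10 samples per update, giving 200/10 = 20 updates. This is a sketch under stated assumptions, not the patented procedure itself: the loss is taken to be 1 minus the smooth AUC, the 200 samples are synthetic numeric stand-ins for the user data, the learning rate is arbitrary, and the "good enough" check of step 306 is omitted. All names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def descent_step(beta, batch, lr):
    """One update on the assumed loss 1 - smooth AUC over the pairs in a batch."""
    pos = [x for x, label in batch if label == 1]
    neg = [x for x, label in batch if label == 0]
    if not pos or not neg:
        return beta, False  # no positive/negative pairs: skip this batch
    grad = [0.0] * len(beta)
    for xp in pos:
        for xq in neg:
            diff = [p - q for p, q in zip(xp, xq)]
            s = sigmoid(sum(b * d for b, d in zip(beta, diff)))
            for k, d in enumerate(diff):
                grad[k] -= s * (1.0 - s) * d
    pairs = len(pos) * len(neg)
    return [b - lr * g / pairs for b, g in zip(beta, grad)], True


# Step 301: synthetic stand-in for 200 user samples (features and labels made up).
N, M = 200, 10
samples = []
for i in range(N):
    label = i % 2
    x1 = 2.0 * label + ((i * 7) % 10 - 4.5) / 5.0  # informative feature
    x2 = ((i * 3) % 11 - 5.0) / 5.0                # uninformative feature
    samples.append(([x1, x2], label))

# Step 302: random initialization in the range (0, 1).
rng = random.Random(0)
beta_init = [rng.random(), rng.random()]

# Steps 303-306: one update per unit of M samples.
beta = list(beta_init)
updates = 0
for start in range(0, N, M):
    beta, changed = descent_step(beta, samples[start:start + M], lr=1.0)
    updates += changed

# Step 307: `beta` is the final parameter vector, one component per feature.
```

A unit that happens to contain only positive or only negative samples yields no pairs, so the sketch skips it; with the synthetic data above every unit contains both classes and all 20 updates are applied.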
Through steps 301-307, the final parameters are trained by supervised machine learning. Sample data of 100 users are then taken as verification data and divided, according to the actual results, into positive samples and negative samples. The final parameters are substituted together with these samples into the probability formula of the logistic regression model to obtain the predicted probability of being positive for each positive sample and each negative sample, and the value of AUC is then obtained by the conventional AUC calculation method. The larger the value of AUC, the better the model parameters, i.e. the closer they are to the optimal solution.
Fig. 4 is a schematic structural diagram of an apparatus of an optimization method of logistic regression in the embodiment of the present application, which includes a first processing unit 401 and a second processing unit 402, and is described in detail below.
The first processing unit 401 is configured to calculate a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function, and to update the first parameter by using a gradient descent method to obtain a second parameter;
and the second processing unit 402 is configured to substitute the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
In one possible implementation, using the AUC as a loss function includes:
based on the equivalence between the AUC and the Wilcoxon-Mann-Whitney test statistic, the calculation method of the AUC is converted in form, and the converted formula is as follows:
$$\mathrm{AUC}(\beta) = P\left(\beta^{T}x^{+} > \beta^{T}x^{-}\right)$$
wherein $\beta$ is the first parameter vector, $x^{+}$ is a positive sample vector, $x^{-}$ is a negative sample vector, and $\mathrm{AUC}(\beta)$ represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} g\left(\beta^{T}(x^{+}-x^{-})\right)$$
wherein, in the total sample, the data set of the positive samples is $P$, the data set of the negative samples is $Q$, and the counting function $g(x)$ is:
$$g(x)=\begin{cases}1, & x>0\\ 0, & \text{otherwise}\end{cases}$$
In one possible implementation, the device is further configured for:
converting the counting function $g(x)$ into the logistic function $g(x)=\dfrac{1}{1+e^{-x}}$;
obtaining the AUC as a loss function applicable to the gradient descent method, with the formula:
$$\mathrm{AUC}(\beta) = \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} \frac{1}{1+e^{-\beta^{T}(x^{+}-x^{-})}}$$
Finally, it should be noted that, as will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (10)
1. A method for optimizing logistic regression, comprising:
calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function;
updating the first parameter by using a gradient descent method to obtain a second parameter;
and substituting the second parameter, as the value of the parameter θ, into the probability formula of the logistic regression model to obtain a probability value.
2. The method of claim 1, wherein using the AUC as a loss function comprises:
converting the form of the AUC calculation on the basis that the AUC is equivalent to the Wilcoxon-Mann-Whitney test statistic, the converted formula being:
[formula not reproduced in the source text]
wherein the symbols of the formula denote, respectively, the first parameter vector, the positive sample vector, and the negative sample vector, and the AUC represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula using the AUC as a loss function is:
[formula not reproduced in the source text]
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
[formula not reproduced in the source text]
5. The method of claim 1, further comprising:
applying supervised machine learning to train the first parameter to obtain the second parameter, which comprises the following steps:
acquiring N existing samples and the N corresponding sample results; setting the number of samples required for one update of the first parameter to M, so that the first parameter is updated N/M times in total, wherein N and M are integers greater than 0 and N is greater than M;
and, according to M samples and the M corresponding sample results, solving for the first parameter by applying the formula that uses the AUC as a loss function, and updating it N/M times by using a gradient descent method to obtain the second parameter.
6. An apparatus for logistic regression optimization, comprising:
a first processing unit configured to calculate a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function, and to update the first parameter by using a gradient descent method to obtain a second parameter;
and a second processing unit configured to substitute the second parameter, as the value of the parameter θ, into the probability formula of the logistic regression model to obtain a probability value.
7. The apparatus of claim 6, wherein using the AUC as a loss function comprises:
converting the form of the AUC calculation on the basis that the AUC is equivalent to the Wilcoxon-Mann-Whitney test statistic, the converted formula being:
[formula not reproduced in the source text]
wherein the symbols of the formula denote, respectively, the first parameter vector, the positive sample vector, and the negative sample vector, and the AUC represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula using the AUC as a loss function is:
[formula not reproduced in the source text]
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
[formula not reproduced in the source text]
9. A computer device comprising a program or instructions that, when executed, perform the method of any one of claims 1 to 5.
10. A storage medium comprising a program or instructions which, when executed, perform the method of any one of claims 1 to 5.
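For illustration only, the flow recited in claims 1 and 5 might be sketched in Python as follows. The claimed formulas are not reproduced in this text, so the sketch rests on stated assumptions: the counting function g(x) is replaced by a sigmoid so that the pairwise loss is differentiable, the loss is taken as one minus the mean of that sigmoid over all (positive, negative) pairs, and the names `make_toy_data`, `surrogate_loss_grad`, `train`, `predict_proba`, the learning rate `lr`, and the toy data are hypothetical rather than taken from the application.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(theta, x):
    return sum(t * v for t, v in zip(theta, x))


def exact_auc(theta, pos, neg):
    """Wilcoxon-Mann-Whitney estimate: share of (positive, negative)
    pairs in which the positive sample receives the higher score."""
    wins = sum(1 for p in pos for q in neg if dot(theta, p) > dot(theta, q))
    return wins / (len(pos) * len(neg))


def surrogate_loss_grad(theta, pos, neg):
    """Loss 1 - mean(sigmoid(theta . (p - q))) over all pairs, and its gradient.
    Assumption: the sigmoid stands in for the step-like counting function."""
    n_pairs = len(pos) * len(neg)
    loss, grad = 0.0, [0.0] * len(theta)
    for p in pos:
        for q in neg:
            diff = [a - b for a, b in zip(p, q)]
            s = sigmoid(dot(theta, diff))
            loss += 1.0 - s
            w = s * (1.0 - s)  # derivative of the sigmoid at this pair
            for k, d in enumerate(diff):
                grad[k] -= w * d
    return loss / n_pairs, [g / n_pairs for g in grad]


def train(samples, labels, batch_size, lr=0.5, seed=0):
    """One gradient-descent update per batch of M samples (N // M updates)."""
    rng = random.Random(seed)
    theta = [rng.uniform(-0.01, 0.01) for _ in samples[0]]  # first parameter
    for start in range(0, len(samples) - batch_size + 1, batch_size):
        xs = samples[start:start + batch_size]
        ys = labels[start:start + batch_size]
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        if not pos or not neg:
            continue  # a batch needs both classes to form pairs
        _, grad = surrogate_loss_grad(theta, pos, neg)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta  # second parameter


def predict_proba(theta, x):
    """Logistic regression probability with theta as the parameter."""
    return sigmoid(dot(theta, x))


def make_toy_data(n, seed=0):
    """Hypothetical 2-D data: positives near (1, 1), negatives near (-1, -1)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for i in range(n):
        y = 1 if i % 2 == 0 else 0
        c = 1.0 if y == 1 else -1.0
        samples.append([rng.gauss(c, 0.5), rng.gauss(c, 0.5)])
        labels.append(y)
    return samples, labels


samples, labels = make_toy_data(200, seed=1)      # N = 200
theta = train(samples, labels, batch_size=20)     # M = 20, so 10 updates
pos = [x for x, y in zip(samples, labels) if y == 1]
neg = [x for x, y in zip(samples, labels) if y == 0]
print("AUC:", exact_auc(theta, pos, neg))
print("P(y=1 | x=[1, 1]):", predict_proba(theta, [1.0, 1.0]))
```

Because a pairwise loss of this kind depends only on differences between scores, it does not determine an intercept term, and the sketch accordingly omits one.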
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910751819.4A CN112395551A (en) | 2019-08-15 | 2019-08-15 | Optimization method of logistic regression |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910751819.4A CN112395551A (en) | 2019-08-15 | 2019-08-15 | Optimization method of logistic regression |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112395551A (en) | 2021-02-23 |
Family
ID=74602789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910751819.4A Pending CN112395551A (en) | 2019-08-15 | 2019-08-15 | Optimization method of logistic regression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112395551A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100094783A1 (en) * | 2008-09-15 | 2010-04-15 | Ankur Jain | Method and System for Classifying Data in System with Limited Memory |
CN106528771A (en) * | 2016-11-07 | 2017-03-22 | 中山大学 | Fast structural SVM text classification optimization algorithm |
CN107909498A (en) * | 2017-10-26 | 2018-04-13 | 厦门理工学院 | Based on the recommendation method for maximizing receiver operating characteristic curve area under |
Non-Patent Citations (2)
Title |
---|
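LIAN YAN ET AL.: "Optimizing Classifier Performance via an Approximation to the Wilcoxon-Mann-Whitney Statistic", Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), pages 1 - 8 * |
王兵 (Wang Bing): "一种基于逻辑回归模型的搜索广告点击率预估方法的研究" [Research on a search advertising click-through rate prediction method based on a logistic regression model], 《中国优秀硕士学位论文全文数据库 信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology], pages 138 - 2316 * |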
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220198289A1 (en) | Recommendation model training method, selection probability prediction method, and apparatus | |
CN111797321B (en) | Personalized knowledge recommendation method and system for different scenes | |
CN110503531B (en) | Dynamic social scene recommendation method based on time sequence perception | |
KR20230070272A (en) | Computer-based systems, computing components, and computing objects configured to implement dynamic outlier bias reduction in machine learning models | |
CN107346464A (en) | Operational indicator Forecasting Methodology and device | |
US11257019B2 (en) | Method and system for search provider selection based on performance scores with respect to each search query | |
JP2024503774A (en) | Fusion parameter identification method and device, information recommendation method and device, parameter measurement model training method and device, electronic device, storage medium, and computer program | |
CN106095887A (en) | Context aware Web service recommendation method based on weighted space-time effect | |
CN112819523B (en) | Marketing prediction method combining inner/outer product feature interaction and Bayesian neural network | |
WO2020135642A1 (en) | Model training method and apparatus employing generative adversarial network | |
Borges et al. | On measuring popularity bias in collaborative filtering data | |
CN113449011A (en) | Big data prediction-based information push updating method and big data prediction system | |
CN113537630A (en) | Training method and device of business prediction model | |
CN113449012A (en) | Internet service mining method based on big data prediction and big data prediction system | |
CN111444930B (en) | Method and device for determining prediction effect of two-classification model | |
CN113763031A (en) | Commodity recommendation method and device, electronic equipment and storage medium | |
CN112218114B (en) | Video cache control method, device and computer readable storage medium | |
CN113065067A (en) | Article recommendation method and device, computer equipment and storage medium | |
CN109145207B (en) | Information personalized recommendation method and device based on classification index prediction | |
CN111309706A (en) | Model training method and device, readable storage medium and electronic equipment | |
CN116578400A (en) | Multitasking data processing method and device | |
Sagaama et al. | Automatic parameter tuning for big data pipelines with deep reinforcement learning | |
CN115600818A (en) | Multi-dimensional scoring method and device, electronic equipment and storage medium | |
CN112395551A (en) | Optimization method of logistic regression | |
CN113128597B (en) | Method and device for extracting user behavior characteristics and classifying and predicting user behavior characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||