CN112395551A - Optimization method of logistic regression - Google Patents

Info

Publication number
CN112395551A
Authority
CN
China
Prior art keywords
parameter
auc
formula
sample
logistic regression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910751819.4A
Other languages
Chinese (zh)
Inventor
林淼哲
方桢
王雨晨
詹杰凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Youkun Information Technology Co ltd
Original Assignee
Shanghai Youkun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Youkun Information Technology Co ltd
Priority to CN201910751819.4A
Publication of CN112395551A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)

Abstract

The invention discloses a method and a device for optimizing logistic regression, relating to the field of logistic regression. The method comprises the following steps: calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function; updating the first parameter by using a gradient descent method to obtain a second parameter; and substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.

Description

Optimization method of logistic regression
Technical Field
The invention relates to the field of logistic regression, in particular to a logistic regression optimization method.
Background
With the development of science and technology, use of the Internet grows day by day and people depend on the network ever more heavily. Correspondingly, competition among large Internet companies has intensified, and how to increase user visits and prolong the time users spend online has become an important problem for them. The technique that addresses this problem is click-through rate prediction. For example, through click-through rate prediction, a search engine website can screen out the better-performing advertisements, a shopping website can push commodities that users wish to buy, and a news or entertainment website can show targeted content that users are more interested in. Click-through rate prediction is typically performed with a logistic regression model, one of the most popular models in the Internet industry, which is usually used to express as a formula the probability distribution underlying a random event.
In the prior art, when the parameters of a logistic regression model are solved, the log loss derived from maximum likelihood estimation is used as the loss function, and the model parameters are solved by optimizing the log loss during model training. However, when the model is evaluated to judge whether the obtained model parameters are good or bad, the area under the receiver operating characteristic (ROC) curve (Area Under Curve, AUC) is used as the index. The same model parameters are thus trained and evaluated in two different ways that have no connection with each other, so the model parameters obtained by optimizing the log loss are not necessarily the optimal solution under the subsequent AUC evaluation of the model, which reduces the prediction accuracy of the logistic regression model.
Therefore, the fact that, in the prior art, model parameters obtained with log loss as the loss function are not necessarily the optimal solution under the subsequent AUC evaluation of the model is a problem to be solved urgently.
Disclosure of Invention
The embodiments of the present application provide an optimization method of logistic regression, which solves the problem in the prior art that model parameters obtained with log loss as the loss function are not necessarily the optimal solution under the subsequent AUC evaluation of the model.
The optimization method of logistic regression provided by the embodiments of the present application specifically includes:
calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function;
updating the first parameter by using a gradient descent method to obtain a second parameter;
and substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
In one possible implementation, using the AUC as a loss function includes:
converting the form of the AUC calculation, based on the AUC being equivalent to the Wilcoxon-Mann-Whitney test statistic, the converted formula being:
Figure BDA0002167439550000021
wherein
Figure BDA0002167439550000022
is the first parameter vector,
Figure BDA0002167439550000023
is the positive sample vector,
Figure BDA0002167439550000024
is the negative sample vector, and
Figure BDA0002167439550000029
Figure BDA0002167439550000025
represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
Figure BDA0002167439550000026
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
Figure BDA0002167439550000027
One possible implementation further includes:
converting the counting function g(x) into a logistic function;
obtaining the AUC as a loss function to which the gradient descent method can be applied, with the formula:
Figure BDA0002167439550000028
In one possible implementation, updating the first parameter by using a gradient descent method includes:
updating the first parameter according to the formula of the gradient descent method and the AUC as a loss function, wherein the formula is as follows:
Figure BDA0002167439550000031
One possible implementation further includes:
applying supervised machine learning to train the first parameter to obtain the second parameter, which comprises the following steps:
acquiring N existing samples and the corresponding N sample results; setting the number of samples needed to update the first parameter once to M, so that the first parameter is updated N/M times in total, wherein N and M are integers greater than 0 and N is greater than M;
and, for each group of M samples and the corresponding M sample results, solving the first parameter by applying the formula with the AUC as a loss function, and updating N/M times by using the gradient descent method to obtain the second parameter.
The embodiment of the present application provides an optimization device for logistic regression, which specifically includes:
a first processing unit, configured to calculate a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function, and to update the first parameter by using a gradient descent method to obtain a second parameter;
and a second processing unit, configured to substitute the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
In one possible implementation, using the AUC as a loss function includes:
converting the form of the AUC calculation, based on the AUC being equivalent to the Wilcoxon-Mann-Whitney test statistic, the converted formula being:
Figure BDA0002167439550000032
wherein
Figure BDA0002167439550000033
is the first parameter vector,
Figure BDA0002167439550000034
is the positive sample vector,
Figure BDA0002167439550000035
is the negative sample vector, and
Figure BDA0002167439550000038
Figure BDA0002167439550000036
represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
Figure BDA0002167439550000037
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
Figure BDA0002167439550000041
One possible implementation further includes:
converting the counting function g(x) into a logistic function;
obtaining the AUC as a loss function to which the gradient descent method can be applied, with the formula:
Figure BDA0002167439550000042
Embodiments of the present application provide a computer device comprising a program or instructions that, when executed, cause a computer to perform the method of any of the above possible designs.
Embodiments of the present application provide a storage medium containing a program or instructions that, when executed, cause a computer to perform the method of any of the above possible designs.
The optimization method of logistic regression provided by the invention has the following beneficial effects: the AUC is taken as the loss function, and optimizing the AUC is directly taken as the target during model training, which yields better model performance, improves the prediction accuracy of the logistic regression model, and further improves the application effect in each business scenario.
Drawings
FIG. 1 is a flow chart of a prior art logistic regression method;
FIG. 2 is a flow chart of a method for logistic regression optimization in an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for optimizing logistic regression using supervised machine learning training parameters according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an apparatus for the logistic regression optimization method in an embodiment of the present application.
Detailed Description
In order to better understand the technical solutions, the technical solutions will be described in detail below with reference to the drawings and the specific embodiments of the specification, and it should be understood that the specific features in the embodiments and examples of the present application are detailed descriptions of the technical solutions of the present application, but not limitations of the technical solutions of the present application, and the technical features in the embodiments and examples of the present application may be combined with each other without conflict.
FIG. 1 is a flow chart of a logistic regression method in the prior art. The logistic regression model is one of the most popular models in the Internet industry and is generally used as a binary classification model.
Step 101: obtaining a probability formula of a logistic regression model, wherein the obtained formula (1) meets the following form:
Figure BDA0002167439550000051
specifically, take click-through rate prediction, where the task is to predict whether a user will click on an advertisement. The probability distribution underlying a random event such as "whether a person will click on the advertisement" is expressed by the above formula, where X represents the observed features, θ is the parameter that the algorithm needs to solve, and Y = 1 denotes the case in which the user clicks on the advertisement, so that the formula gives the predicted probability of a click.
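As a minimal sketch of the probability formula in step 101: the image of formula (1) is not reproduced in this text, so the code below assumes it is the standard logistic (sigmoid) form, Pr(Y = 1 | X; θ) = 1 / (1 + exp(-θ·X)). The function name `predict_proba` and the sample values are hypothetical and serve only as illustration.

```python
import math

def predict_proba(theta, x):
    """Return Pr(Y = 1 | X = x; theta) under the assumed sigmoid form."""
    # Linear score: inner product of the parameter vector and the feature vector
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters and features; the score is 0.5*2.0 - 0.25*4.0 = 0
p = predict_proba([0.5, -0.25], [2.0, 4.0])
print(p)  # 0.5
```

A score of zero maps to a probability of 0.5, and larger positive scores push the predicted click probability towards 1.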
Step 102: when solving the parameters of the probability formula, calculating by using log loss as a loss function;
specifically, when the model parameter θ is solved, the log loss derived from maximum likelihood estimation is used as the loss function, and θ is obtained by minimizing this log loss; the derivation of the loss function, formulas (2) to (4), satisfies the following form:
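$$L(\theta \mid x) = \Pr(Y \mid X; \theta) \tag{2}$$
$$= \prod_i \Pr(y_i \mid x_i; \theta) \tag{3}$$
$$= \prod_i h_\theta(x_i)^{y_i} \left(1 - h_\theta(x_i)\right)^{1 - y_i} \tag{4}$$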
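The link between the likelihood in formulas (2) to (4) and the log loss can be sketched as follows. Taking the negative logarithm of formula (4) turns the product into a sum; averaging over the samples is a common convention and is an assumption here, as are the sigmoid form of h_θ and the function name `log_loss`.

```python
import math

def log_loss(theta, xs, ys):
    """Average negative log-likelihood of formula (4), assuming a sigmoid h_theta."""
    eps = 1e-12  # guards against log(0); an implementation detail, not from the source
    total = 0.0
    for x, y in zip(xs, ys):
        z = sum(t * xi for t, xi in zip(theta, x))
        h = 1.0 / (1.0 + math.exp(-z))
        h = min(max(h, eps), 1.0 - eps)
        # log of h^y * (1 - h)^(1 - y)
        total += y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return -total / len(xs)

# With theta = 0 every prediction is 0.5, so the loss equals log(2)
print(log_loss([0.0], [[1.0], [2.0]], [1, 0]))
```

Minimizing this quantity is equivalent to maximizing the likelihood L(θ | x).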
Step 103: updating parameters by using a gradient descent method;
specifically, the loss function in step 102 is differentiated and the parameter θ is continuously updated by the gradient descent method; the model parameter θ is trained by machine learning, and when the sample prediction results obtained in training are very close to the actual sample results, that is, when the loss function is at its minimum, the parameter θ obtained is the finally confirmed model parameter.
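One update of the prior-art procedure in step 103 might look like the sketch below. It assumes the averaged log loss with a sigmoid h_θ, whose gradient with respect to θ is the mean of (h_θ(x_i) - y_i)·x_i. The learning rate `lr` and the function names are hypothetical; the source does not specify a step size.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gd_step(theta, xs, ys, lr=0.1):
    """One gradient descent update of theta on the averaged log loss."""
    n = len(xs)
    grad = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        h = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        for j, xj in enumerate(x):
            grad[j] += (h - y) * xj / n  # partial derivative for feature j
    # Move against the gradient to reduce the loss
    return [t - lr * g for t, g in zip(theta, grad)]

# One positive sample with feature 1.0: gradient is (0.5 - 1) * 1 = -0.5
print(gd_step([0.0], [[1.0]], [1]))  # [0.05]
```

Repeating this step until the loss stops decreasing yields the finally confirmed parameter θ described above.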
Step 104: substituting the parameters into the probability formula and evaluating the logistic regression model using AUC.
Specifically, the parameter θ obtained through machine learning training in step 103 is substituted, together with the verification samples, into the probability formula of the logistic regression model, and the AUC value is then obtained from the calculated probability values by the AUC calculation method. The larger the AUC value, the better the performance of the logistic regression model; that is, the AUC is used to evaluate the logistic regression model and to determine whether the obtained model parameter θ is the optimal solution.
As can be seen from steps 101 to 104 above, when the parameters of the logistic regression model are solved, the log loss derived from maximum likelihood estimation is used as the loss function, and the model parameter θ is solved by optimizing (i.e., minimizing) the log loss during model training; the model is then evaluated using AUC as the index to determine whether the obtained model parameter θ is the optimal solution. The same model parameter is thus trained and evaluated in two different ways that have no connection with each other, so the model parameter obtained by optimizing the log loss is not necessarily the optimal solution under the subsequent AUC evaluation of the model, which reduces the prediction accuracy of the logistic regression model.
Therefore, the fact that, in the prior art, model parameters obtained with log loss as the loss function are not necessarily the optimal solution under the subsequent AUC evaluation of the model is a problem to be solved urgently. The optimization method of logistic regression of the present application takes the AUC directly as the loss function and takes optimizing the AUC as the target during model training, so as to obtain better model performance, improve the prediction accuracy of the logistic regression model, and further improve the application effect in each business scenario.
Fig. 2 is a flowchart of a method for optimizing logistic regression according to an embodiment of the present application, and specific steps will be described in detail below.
Step 201: obtaining a probability formula of the logistic regression model, wherein the obtained formula (1) still satisfies the following form:
Figure BDA0002167439550000061
Step 202: calculating a first parameter using the AUC as a loss function;
specifically, when the parameters of the probability formula are solved, the area under the receiver operating characteristic (ROC) curve (AUC) is used as the loss function. There are two intuitive ways of calculating the AUC. The first is to draw the ROC curve; the area under the ROC curve is the value of the AUC. The second is as follows: assume there are (m + n) samples in total, of which m are positive and n are negative, giving m × n positive-negative sample pairs; each pair in which the predicted probability of being positive is greater for the positive sample than for the negative sample is counted as 1, and the sum of the counts divided by (m × n) gives the AUC. The first method is typically used when the model is evaluated by AUC; here, where the AUC replaces log loss as the loss function, the second method is chosen.
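The second, pair-counting way of calculating the AUC can be sketched directly. The function name `pairwise_auc` and the scores are hypothetical. Following the description above, a pair counts as 1 only when the positive sample scores strictly higher; counting ties as 0.5 is a common alternative convention that the source does not mention.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive sample scores higher."""
    count = sum(1 for p in pos_scores for q in neg_scores if p > q)
    return count / (len(pos_scores) * len(neg_scores))

# m = 3 positives and n = 2 negatives give 6 pairs; only (0.4, 0.5) is mis-ordered
auc = pairwise_auc([0.9, 0.6, 0.4], [0.5, 0.2])
print(auc)  # 5/6, about 0.8333
```

The cost grows with m × n, which is acceptable for a small batch but is worth keeping in mind for large sample sets.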
To apply the second method of calculating the AUC, it must be converted into mathematical form. The AUC is equivalent to the Wilcoxon-Mann-Whitney test statistic, so it can be read as the probability that, when one sample is drawn at random from the positive sample set and one from the negative sample set, the score of the positive sample is greater than that of the negative sample. Combining this with the fact that the logistic regression model is a linear model and formalizing the understanding mathematically, the obtained formula (5) satisfies the following form:
Figure BDA0002167439550000071
wherein
Figure BDA0002167439550000072
is the first parameter vector, equivalent to the model parameter θ in the prior art,
Figure BDA0002167439550000073
is the positive sample vector,
Figure BDA0002167439550000074
is the negative sample vector, and
Figure BDA0002167439550000075
represents the probability that the score of a positive sample is greater than the score of a negative sample;
further, the data set of positive samples is set as P and the data set of negative samples as Q;
the counting function g(x) satisfies formula (6), as follows:
Figure BDA0002167439550000076
thus, substituting the counting function g(x) into the mathematical expression of the AUC, the resulting formula (7) satisfies the following form:
Figure BDA0002167439550000077
this is the formula that takes the AUC as a loss function. Because the first parameter must be updated from the loss function by applying a gradient descent method, the counting function g(x) is converted into a logistic function and substituted into the formula for
Figure BDA0002167439550000078
that is, formula (8), the AUC loss function to which the gradient descent method can be applied, satisfies the following form:
Figure BDA0002167439550000079
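The formula images for (5) to (8) are not reproduced in this text. Reading only the surrounding definitions, one plausible form is sketched below. The symbols $w$ (first parameter vector), $x^{+}$ and $x^{-}$ (positive and negative sample vectors) and $\widehat{\mathrm{AUC}}$ (the smoothed quantity) are assumptions chosen for illustration, not the patent's own notation.

```latex
\begin{align}
\mathrm{AUC} &= \Pr\left(w^{\top}x^{+} > w^{\top}x^{-}\right) \tag{5}\\
g(x) &= \begin{cases} 1, & x > 0\\ 0, & \text{otherwise} \end{cases} \tag{6}\\
\mathrm{AUC} &= \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} g\left(w^{\top}(x^{+}-x^{-})\right) \tag{7}\\
\widehat{\mathrm{AUC}} &= \frac{1}{|P|\,|Q|}\sum_{x^{+}\in P}\sum_{x^{-}\in Q} \frac{1}{1+e^{-w^{\top}(x^{+}-x^{-})}} \tag{8}
\end{align}
```

The step from (7) to (8) replaces the non-differentiable step function with a logistic function, which is what makes a gradient available. Whether the patent minimizes the negative of this quantity or uses some other sign convention is not recoverable from the text.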
Step 203: updating the first parameter by using a gradient descent method to obtain a second parameter;
specifically, the formula with the AUC as the loss function in step 202 is differentiated, and the gradient descent method is used to update the first parameter
Figure BDA0002167439550000081
The resulting formula (9) satisfies the following form:
Figure BDA0002167439550000082
N sample data are taken as training data, and supervised machine learning is used to train the first parameter
Figure BDA0002167439550000083
during which the first parameter is continuously updated by applying the gradient descent method
Figure BDA0002167439550000084
to obtain the second parameter, which is the optimal solution.
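A possible shape for the update in step 203 is sketched below. The image of formula (9) is not available, so this is an assumption: the loss is taken to be the negative of a sigmoid-smoothed pairwise AUC, and one descent step on that loss is the same as one ascent step on the smoothed AUC. The names `smoothed_auc`, `auc_gd_step` and the learning rate `lr` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def smoothed_auc(w, pos, neg):
    """Mean of sigmoid(w . (x_pos - x_neg)) over all positive-negative pairs."""
    total = 0.0
    for xp in pos:
        for xn in neg:
            total += sigmoid(sum(wj * (a - b) for wj, a, b in zip(w, xp, xn)))
    return total / (len(pos) * len(neg))

def auc_gd_step(w, pos, neg, lr=0.5):
    """One descent step on the loss -smoothed_auc (ascent on the smoothed AUC)."""
    grad = [0.0] * len(w)
    scale = 1.0 / (len(pos) * len(neg))
    for xp in pos:
        for xn in neg:
            d = [a - b for a, b in zip(xp, xn)]
            s = sigmoid(sum(wj * dj for wj, dj in zip(w, d)))
            for j, dj in enumerate(d):
                grad[j] += scale * s * (1.0 - s) * dj  # d(smoothed AUC)/dw_j
    return [wj + lr * gj for wj, gj in zip(w, grad)]

pos = [[2.0], [3.0]]   # hypothetical positive sample vectors
neg = [[0.0], [1.0]]   # hypothetical negative sample vectors
w0 = [0.0]
w1 = auc_gd_step(w0, pos, neg)
print(w1)                                               # [0.25]
print(smoothed_auc(w0, pos, neg), smoothed_auc(w1, pos, neg))
```

In this toy case every positive feature value exceeds every negative one, so the update increases the weight and the smoothed AUC rises above its starting value of 0.5.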
Step 204: substituting the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
Specifically, L sample data are taken as verification data and divided, according to the actual results of the samples, into positive samples and negative samples. The parameter obtained in step 203
Figure BDA0002167439550000085
is substituted, together with the L sample data, into the probability formula of the logistic regression model to obtain the probability that each positive sample is predicted as positive and the probability that each negative sample is predicted as positive. The AUC value is then calculated by the conventional AUC calculation method, that is, by drawing the ROC curve and taking the area under it as the AUC. The larger the AUC value, the better the performance of the logistic regression model; that is, the AUC is used to evaluate the logistic regression model and to judge whether the calculated second parameter is the optimal solution.
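The conventional evaluation described in step 204, drawing the ROC curve and measuring the area under it, can be sketched as follows. The function name `roc_auc` and the example scores are hypothetical; the trapezoidal rule and the grouping of tied scores into one threshold step are implementation choices, not details given in the source.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule (labels are 1 or 0)."""
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    prev_tpr = prev_fpr = 0.0
    area = 0.0
    i = 0
    while i < len(pairs):
        # Treat all samples sharing one score as a single threshold step
        score = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_tpr, prev_fpr = tpr, fpr
    return area

# Hypothetical verification scores: three positives and two negatives
print(roc_auc([0.9, 0.6, 0.4, 0.5, 0.2], [1, 1, 1, 0, 0]))  # 5/6, about 0.8333
```

On data without tied scores this area equals the fraction of correctly ordered positive-negative pairs, which is why a parameter trained on the pairwise form is judged by a consistent index here.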
Model parameters obtained by step 201-
Figure BDA0002167439550000086
remain consistent with the index used in the subsequent step 204 to evaluate the model by AUC, that is, both are obtained by applying the AUC calculation method. This ensures that the model parameters obtained when the AUC is taken as the loss function and minimized
Figure BDA0002167439550000087
can also obtain better evaluation results in the subsequent AUC evaluation of the model. Moreover, when the method applies supervised machine learning to train the model parameters
Figure BDA0002167439550000088
it directly takes optimizing the AUC as the target, so that the model parameters obtained by model training
Figure BDA0002167439550000089
are better than the model parameter θ obtained with the log loss function in the prior art.
To better understand how a supervised machine learning method is used to train the model parameters
Figure BDA00021674395500000810
to obtain the optimal solution, and how that solution is verified and evaluated, the supervised machine learning method adopted in steps 201 to 204 is further described below by way of example.
FIG. 3 is a flowchart of an optimization method of logistic regression using supervised machine learning to train parameters in an embodiment of the present application. Supervised machine learning refers to learning a function from a given training data set so that, when new data arrives, a result can be predicted according to that function. The specific process is described below in conjunction with a practical application of the logistic regression model.
Step 301: preparing training data;
specifically, training data is prepared first. For example, when a logistic regression model is applied to predict the click-through rate of a travel advertisement, sample data of 200 users is first obtained as the training data. The sample data comprises the observed user characteristics and the click result of each user, wherein the observed user characteristics include name, gender, age, region, and online time, and the click result is either click or no click.
Step 302: randomly initializing parameters;
specifically, the model parameters are randomly initialized within the range (0, 1); the number of samples required to update the model parameters once is set to 10, so the total number of updates of the model parameters is 200/10 = 20.
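The initialization and batching in step 302 can be sketched as below. The helper names `init_params` and `make_batches`, the fixed seed, and the choice of five features (one per listed user characteristic) are hypothetical. Note that Python's `random.random()` draws from [0, 1), which only approximates the open range (0, 1) stated above.

```python
import random

def init_params(n_features, seed=0):
    """Draw each initial model parameter uniformly from [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_features)]

def make_batches(samples, batch_size):
    """Split the samples into consecutive units of batch_size samples."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

theta = init_params(5)                       # five features is an assumption
batches = make_batches(list(range(200)), 10)
print(len(batches))  # 20
```

Each of the 20 units then drives one parameter update, matching the count 200/10 = 20.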
Step 303: calculating a predicted value;
specifically, after the rule is set, every 10 samples are taken as one unit. The 10 samples of the first unit are obtained and, together with the initialized model parameters (for example, 0.1), substituted into the probability formula of the logistic regression model to obtain the prediction results for these 10 samples. The prediction results are compared with the users' click results in the 10 samples, and the quality of the model parameters is judged from the comparison.
Step 304: calculating with the AUC as the loss function;
specifically, the 10 samples are used as input values, and the above formula taking the AUC as a loss function is applied to solve the model parameters.
Step 305: updating parameters by using a gradient descent method;
specifically, taking the model parameters solved in step 304 as the starting point, the 10 samples of the second unit are obtained and the model parameters are updated. After the update, the 10 samples of the second unit and the updated model parameters are substituted together into the probability formula of the logistic regression model to obtain the prediction results for the 10 samples of the second unit, which are compared with the users' click results in those 10 samples to judge whether the model parameters are good enough.
Step 306: whether the parameters are good enough;
specifically, the same operation is repeated on the remaining 18 units of 10 samples each to calculate and update the model parameters, and the parameters are adjusted according to each update and the resulting comparison. When the comparison shows that the prediction results are close to the users' click results, for example an approach rate of 99%, the parameters are considered good enough and the final parameters are obtained; when the comparison shows a large difference between the prediction results and the users' click results, for example an approach rate of 90%, samples continue to be taken for iterative calculation and the model parameters continue to be updated until the comparison is good, for example an approach rate of 99%.
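The "good enough" check in step 306 can be sketched as a comparison of predictions against click results. The source does not define the approach rate precisely, so this sketch assumes it is the fraction of samples whose thresholded prediction matches the actual click result. The 0.5 threshold, the sigmoid probability formula, and the names `agreement_rate` and `good_enough` are all assumptions.

```python
import math

def agreement_rate(theta, xs, ys, threshold=0.5):
    """Fraction of samples whose predicted click / no click matches the actual result."""
    hits = 0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-sum(t * xi for t, xi in zip(theta, x))))
        hits += int((p >= threshold) == (y == 1))
    return hits / len(xs)

def good_enough(theta, xs, ys, target=0.99):
    """Stopping test: True once the agreement rate reaches the target."""
    return agreement_rate(theta, xs, ys) >= target

# Hypothetical unit of four samples; the last one is predicted incorrectly
xs = [[2.0], [-2.0], [1.0], [-1.0]]
ys = [1, 0, 1, 1]
print(agreement_rate([1.0], xs, ys))  # 0.75
print(good_enough([1.0], xs, ys))     # False
```

With a target of 99%, training would continue on further units until the test returns True.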
Step 307: and obtaining the final parameters.
Through the processing in step 306, the final parameters are obtained. The independent variable of the logistic regression model, i.e., the observed user characteristics, comprises more than one item and is expressed mathematically as a vector, so the corresponding model parameter is also a vector whose number of values equals the number of items in the independent variable (the observed user characteristics). The final parameter is therefore also a vector.
Through steps 301 to 307, the final parameters are trained by supervised machine learning. Sample data of 100 users are then taken as verification data and divided into positive samples and negative samples. The verification data, together with the final parameters, are substituted into the probability formula of the logistic regression model to obtain the probability that each positive sample is predicted as positive and the probability that each negative sample is predicted as positive, and the AUC value is then obtained by the conventional AUC calculation method. The larger the AUC value, the better the model parameters; that is, the obtained model parameters are the optimal solution.
Fig. 4 is a schematic structural diagram of an apparatus of an optimization method of logistic regression in the embodiment of the present application, which includes a first processing unit 401 and a second processing unit 402, and is described in detail below.
The first processing unit 401 is configured to calculate a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function, and to update the first parameter by using a gradient descent method to obtain a second parameter;
and the second processing unit 402 is configured to substitute the second parameter as the value of the parameter θ into the probability formula of the logistic regression model to obtain a probability value.
In one possible implementation, using the AUC as a loss function includes:
converting the form of the AUC calculation, based on the AUC being equivalent to the Wilcoxon-Mann-Whitney test statistic, the converted formula being:
Figure BDA0002167439550000111
wherein
Figure BDA0002167439550000112
is the first parameter vector,
Figure BDA0002167439550000113
is the positive sample vector,
Figure BDA0002167439550000114
is the negative sample vector, and
Figure BDA0002167439550000115
Figure BDA0002167439550000116
represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
Figure BDA0002167439550000117
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
Figure BDA0002167439550000118
One possible implementation further includes:
converting the counting function g(x) into a logistic function;
obtaining the AUC as a loss function to which the gradient descent method can be applied, with the formula:
Figure BDA0002167439550000119
Finally, it should be noted that, as will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method for optimizing logistic regression, comprising:
calculating a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function;
updating the first parameter by using a gradient descent method to obtain a second parameter;
and substituting the second parameter, as the value of the parameter θ, into the probability formula of the logistic regression model to obtain a probability value.
2. The method of claim 1, wherein using the AUC as a loss function comprises:
rewriting the calculation of the AUC, based on the equivalence of the AUC to the Wilcoxon-Mann-Whitney test statistic, into the following form:
Figure FDA0002167439540000011
wherein
Figure FDA0002167439540000012
is the first parameter vector,
Figure FDA0002167439540000013
is a positive sample vector,
Figure FDA0002167439540000014
is a negative sample vector, and the AUC
Figure FDA0002167439540000015
represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
Figure FDA0002167439540000016
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
Figure FDA0002167439540000017
3. The method of claim 2, further comprising:
converting the counting function g(x) into a logistic function;
obtaining the AUC as a loss function applicable to a gradient descent method, with the formula:
Figure FDA0002167439540000018
4. The method of claim 1, wherein updating the first parameter using a gradient descent method comprises:
updating the first parameter according to the formula of the gradient descent method with the AUC as the loss function, wherein the formula is:
Figure FDA0002167439540000021
5. The method of claim 1, further comprising:
applying supervised machine learning to train the first parameter to obtain the second parameter, comprising:
acquiring N existing samples and the corresponding N sample results; setting the number of samples used for each update of the first parameter to M, so that the first parameter is updated N/M times in total, wherein N and M are integers greater than 0 and N is greater than M;
and, for each update, solving the first parameter from M samples and the corresponding M sample results by applying the formula with the AUC as a loss function, and updating N/M times using the gradient descent method to obtain the second parameter.
6. An apparatus for logistic regression optimization, comprising:
a first processing unit configured to calculate a first parameter using the area under the receiver operating characteristic (ROC) curve (AUC) as a loss function, and to update the first parameter by using a gradient descent method to obtain a second parameter;
and a second processing unit configured to substitute the second parameter, as the value of the parameter θ, into the probability formula of the logistic regression model to obtain a probability value.
7. The apparatus of claim 6, wherein using the AUC as a loss function comprises:
rewriting the calculation of the AUC, based on the equivalence of the AUC to the Wilcoxon-Mann-Whitney test statistic, into the following form:
Figure FDA0002167439540000022
wherein
Figure FDA0002167439540000023
is the first parameter vector,
Figure FDA0002167439540000024
is a positive sample vector,
Figure FDA0002167439540000025
is a negative sample vector, and the AUC
Figure FDA0002167439540000026
represents the probability that the score of a positive sample is greater than the score of a negative sample;
the formula of the AUC as a loss function is:
Figure FDA0002167439540000027
wherein, in the total sample, the data set of positive samples is P, the data set of negative samples is Q, and the counting function g(x) is:
Figure FDA0002167439540000031
8. The apparatus of claim 7, further comprising:
converting the counting function g(x) into a logistic function;
obtaining the AUC as a loss function applicable to a gradient descent method, with the formula:
Figure FDA0002167439540000032
9. A computer device comprising a program or instructions that, when executed, perform the method of any one of claims 1 to 5.
10. A storage medium comprising a program or instructions which, when executed, perform the method of any one of claims 1 to 5.
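The method of claims 1 to 5 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the applicant's implementation: the loss is assumed to be one minus the mean logistic value of the pairwise score differences (the claimed formulas are available only as images), the initial value of theta is treated as the first parameter, and the names `batch_grad`, `train` and `predict_proba`, the learning rate and the initialization are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, computed stably via tanh."""
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def batch_grad(theta, X, y):
    """Gradient of the assumed loss 1 - mean(sigmoid(theta . (x_p - x_q)))
    over all positive/negative pairs found in one batch."""
    pos, neg = X[y == 1], X[y == 0]
    if len(pos) == 0 or len(neg) == 0:
        return np.zeros_like(theta)  # no pairs in this batch: no update
    diff = (pos @ theta)[:, None] - (neg @ theta)[None, :]
    s = sigmoid(diff)
    w = s * (1.0 - s)  # derivative of the sigmoid for each pair
    return -(w.sum(axis=1) @ pos - w.sum(axis=0) @ neg) / w.size


def train(X, y, batch_size, lr=0.5, seed=0):
    """One pass over N samples in batches of M, giving N // M updates."""
    rng = np.random.default_rng(seed)
    # Assumed initialization, standing in for the "first parameter".
    theta = rng.normal(scale=0.01, size=X.shape[1])
    n_updates = len(X) // batch_size
    for k in range(n_updates):
        rows = slice(k * batch_size, (k + 1) * batch_size)
        theta = theta - lr * batch_grad(theta, X[rows], y[rows])
    return theta  # plays the role of the "second parameter"


def predict_proba(theta, X):
    """Logistic regression probability with theta substituted in."""
    return sigmoid(X @ theta)
```

With N = 200 samples and M = 20, `train(X, y, batch_size=20)` performs 10 updates. Shuffling the samples beforehand is advisable so that each batch contains both positive and negative samples; a batch without pairs leaves theta unchanged in this sketch.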
CN201910751819.4A 2019-08-15 2019-08-15 Optimization method of logistic regression Pending CN112395551A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910751819.4A CN112395551A (en) 2019-08-15 2019-08-15 Optimization method of logistic regression

Publications (1)

Publication Number Publication Date
CN112395551A true CN112395551A (en) 2021-02-23

Family

ID=74602789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910751819.4A Pending CN112395551A (en) 2019-08-15 2019-08-15 Optimization method of logistic regression

Country Status (1)

Country Link
CN (1) CN112395551A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100094783A1 (en) * 2008-09-15 2010-04-15 Ankur Jain Method and System for Classifying Data in System with Limited Memory
CN106528771A (en) * 2016-11-07 2017-03-22 中山大学 Fast structural SVM text classification optimization algorithm
CN107909498A (en) * 2017-10-26 2018-04-13 厦门理工学院 Based on the recommendation method for maximizing receiver operating characteristic curve area under

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIAN YAN ET AL.: ""Optimizing Classifier Performance via an Approximation to the Wilcoxon-Mann-Whitney Statistic"", 《PROCEEDINGS OF THE TWENTIETH INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML-2003)》, pages 1 - 8 *
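王兵 (Wang Bing): ""Research on a method for predicting the click-through rate of search advertisements based on a logistic regression model"", 《China Master's Theses Full-text Database, Information Science and Technology》, pages 138 - 2316 *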

Similar Documents

Publication Publication Date Title
US20220198289A1 (en) Recommendation model training method, selection probability prediction method, and apparatus
CN111797321B (en) Personalized knowledge recommendation method and system for different scenes
CN110503531B (en) Dynamic social scene recommendation method based on time sequence perception
KR20230070272A (en) Computer-based systems, computing components, and computing objects configured to implement dynamic outlier bias reduction in machine learning models
CN107346464A (en) Operational indicator Forecasting Methodology and device
US11257019B2 (en) Method and system for search provider selection based on performance scores with respect to each search query
JP2024503774A (en) Fusion parameter identification method and device, information recommendation method and device, parameter measurement model training method and device, electronic device, storage medium, and computer program
CN106095887A (en) Context aware Web service recommendation method based on weighted space-time effect
CN112819523B (en) Marketing prediction method combining inner/outer product feature interaction and Bayesian neural network
WO2020135642A1 (en) Model training method and apparatus employing generative adversarial network
Borges et al. On measuring popularity bias in collaborative filtering data
CN113449011A (en) Big data prediction-based information push updating method and big data prediction system
CN113537630A (en) Training method and device of business prediction model
CN113449012A (en) Internet service mining method based on big data prediction and big data prediction system
CN111444930B (en) Method and device for determining prediction effect of two-classification model
CN113763031A (en) Commodity recommendation method and device, electronic equipment and storage medium
CN112218114B (en) Video cache control method, device and computer readable storage medium
CN113065067A (en) Article recommendation method and device, computer equipment and storage medium
CN109145207B (en) Information personalized recommendation method and device based on classification index prediction
CN111309706A (en) Model training method and device, readable storage medium and electronic equipment
CN116578400A (en) Multitasking data processing method and device
Sagaama et al. Automatic parameter tuning for big data pipelines with deep reinforcement learning
CN115600818A (en) Multi-dimensional scoring method and device, electronic equipment and storage medium
CN112395551A (en) Optimization method of logistic regression
CN113128597B (en) Method and device for extracting user behavior characteristics and classifying and predicting user behavior characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination