CN115809372A - Click rate prediction model training method and device based on decoupling invariant learning - Google Patents

Publication number
CN115809372A
Authority
CN
China
Prior art keywords
environment
invariant
rate prediction
prediction model
click
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310053850.7A
Other languages
Chinese (zh)
Other versions
CN115809372B (en)
Inventor
何向南
张洋
史天昊
冯福利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Application filed by University of Science and Technology of China USTC
Priority to CN202310053850.7A
Publication of CN115809372A
Application granted
Publication of CN115809372B
Current legal status: Active

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a device for training a click rate prediction model based on decoupling invariant learning. The method comprises the following steps: step one, constructing a click rate prediction model and a model optimization target based on a decoupling invariant learning method; step two, randomly sampling the environment data set to obtain a training sample data set; step three, fixing the environment-specific part parameters of the click rate prediction model, mining the environment-invariant features of the training sample data set by using the click rate prediction model, and updating the environment-invariant part parameters of the click rate prediction model; step four, fixing the environment-invariant part parameters of the click rate prediction model, mining the environment-specific features of the training sample data set by using the updated click rate prediction model, and updating the environment-specific part parameters of the click rate prediction model; and step five, iterating step two to step four until the click rate prediction model meets a preset convergence condition, thereby obtaining a trained click rate prediction model.

Description

Training method and device of click rate prediction model based on decoupling invariant learning
Technical Field
The invention relates to the field of recommendation systems, data mining and machine learning, in particular to a click rate prediction model training method and device based on decoupling invariant learning, a click rate prediction method, electronic equipment and a storage medium.
Background
Click rate prediction is a crucial link of a recommendation system. In recent years, feature interaction modeling has been recognized as the core of the click rate prediction problem, and most research focuses on modeling feature interactions efficiently. However, the feature interaction models in the prior art all learn feature interactions by fitting historical data through empirical risk minimization, that is, they learn the feature interactions that best explain the historical data. In real recommendation scenarios, however, the service must be provided in future scenes. Because user interests change continuously, new data drifts away from historical data, and feature interactions obtained by fitting historical data are difficult to generalize to the new data, which damages the performance of the recommendation system.
To address the poor generalization of models learned by empirical risk minimization under distribution drift, those skilled in the art have proposed the paradigm of invariant learning. Invariant learning assumes that the training data is collected from heterogeneous environments, and it identifies invariant correlations by means of the distribution shifts between the environments. While this approach makes stable feature interaction learning possible, it assumes that the target can be sufficiently predicted by the environment-invariant correlations. In a recommendation system, the training data is affected by the coupling of the environment-invariant correlation and the environment-specific correlation, so this assumption cannot be satisfied and the ability to identify stable feature interactions is difficult to guarantee.
Disclosure of Invention
In view of the foregoing problems, the present invention provides a method and an apparatus for training a click rate prediction model based on decoupled invariant learning, a click rate prediction method, an electronic device, and a storage medium, so as to solve at least one of the above problems.
According to a first aspect of the present invention, there is provided a training method for a click rate prediction model based on decoupling invariant learning, comprising:
step one, constructing a click rate prediction model and a model optimization target based on a decoupling invariant learning method, wherein the parameters of the click rate prediction model comprise environment-invariant part parameters and environment-specific part parameters, and the model optimization target comprises an optimization target of the environment-invariant part parameters and an optimization target of the environment-specific part parameters;
step two, randomly sampling an environment data set to obtain a training sample data set, wherein the environment data set represents historical click data of a user in different time periods and comprises label values;
step three, fixing the environment-specific part parameters of the click rate prediction model, mining the environment-invariant features of the training sample data set by using the click rate prediction model to obtain a first prediction result, processing the first prediction result and the label values of the training sample data set with an environment-invariant loss function by a gradient descent method, based on the optimization target of the environment-invariant part parameters, to obtain a first loss value, and updating the environment-invariant part parameters of the click rate prediction model according to the first loss value;
step four, fixing the environment-invariant part parameters of the click rate prediction model, mining the environment-specific features of the training sample data set by using the updated click rate prediction model to obtain a second prediction result, processing the second prediction result and the label values of the training sample data set with an environment-specific loss function by a gradient descent method, based on the optimization target of the environment-specific part parameters, to obtain a second loss value, and updating the environment-specific part parameters of the click rate prediction model according to the second loss value;
and step five, iterating step two to step four until the click rate prediction model meets a preset convergence condition, thereby obtaining a trained click rate prediction model.
According to an embodiment of the present invention, the optimization objective of the above model is represented by formula (1) and formula (2):
[Formula (1): equation image SMS_1 not reproduced in the text]
[Formula (2): equation image SMS_2 not reproduced in the text]
wherein formula (1) represents the optimization objective of the environment-invariant part parameters and formula (2) represents the optimization objective of the environment-specific part parameters. The symbols of the two formulas are likewise images that are not reproduced; in the order in which they are defined, they denote: the environment-invariant part parameters; the environment-specific part parameters in a given environment; the prediction loss calculated in that environment; a hyperparameter controlling the strength of one term of the objective; a regularization constraint that prevents a set of parameters from capturing environment-invariant correlations; the variance of the empirical risks of the different training environments; the weight of the prediction loss of a given environment; and the coefficient of one term of the objective.
According to the embodiment of the present invention, the variance of the empirical risks of the different training environments is expressed by formula (3):
[Formula (3): equation image SMS_18 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: the number of elements in the set of training environments; two different environments; the environment-specific part parameters in a given environment; and the prediction loss calculated in that environment. The variance of the empirical risks of the different training environments is used to capture the patterns shared by different environments;
wherein the weight of the prediction loss of an environment is represented by formula (4):
[Formula (4): equation image SMS_28 not reproduced in the text]
wherein the index symbol of formula (4) denotes traversal of all environments.
According to the embodiment of the present invention, the click rate prediction model comprises a click rate prediction model based on decoupled invariant learning at the click data feature embedding level and/or a click rate prediction model based on decoupled invariant learning at the click data feature domain weight level.
According to the embodiment of the present invention, the click rate prediction model based on decoupled invariant learning at the click data feature embedding level is determined by formula (5):
[Formula (5): equation image SMS_30 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: the environment-invariant part parameters; the environment-specific part parameters in a given environment; the click data features; two individual click data features identified by their indices; the number of click data features; the environment-invariant feature embeddings corresponding to those two features; the environment-specific feature embeddings corresponding to those two features in a given environment; and the click rate prediction model;
the click rate prediction model based on decoupled invariant learning at the click data feature domain weight level is determined by formula (6):
[Formula (6): equation image SMS_53 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: the representations of two domains; the environment-invariant weight between the two domains; the environment-specific weight between the two domains in a given environment; and the number of feature domains (fields).
According to an embodiment of the present invention, the representation of a domain is calculated from the feature embeddings in that domain, as determined by formula (7):
[Formula (7): equation image SMS_71 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: a data feature identified by its index; the set gathering the entries that correspond to all data features of the domain; and the feature embedding corresponding to the indexed data feature.
According to a second aspect of the present invention, there is provided a click rate prediction method, including:
acquiring a historical data set of a user to be predicted, wherein the historical data set of the user to be predicted comprises user characteristic data and user click data;
and mining a prediction result of the environment-invariant feature interaction of the historical data set of the user to be predicted by using a click rate prediction model, wherein the click rate prediction model is obtained by training the click rate prediction model based on decoupling invariant learning through the training method.
According to a third aspect of the present invention, there is provided a training apparatus for a click rate prediction model based on decoupling invariant learning, comprising:
the model building module is used for executing the first step and building a click rate prediction model and a model optimization target based on a decoupling invariant learning method, wherein parameters of the click rate prediction model comprise parameters of an environment invariant part and parameters of an environment specific part, and the model optimization target comprises an optimization target of the parameters of the environment invariant part and an optimization target of the parameters of the environment specific part;
the data sampling module is used for executing the second step, randomly sampling an environment data set to obtain a training sample data set, wherein the environment data set represents historical click data of a user in different time periods, and comprises a label value;
the invariant parameter updating module is used for executing the third step, fixing the environment-specific part parameters of the click rate prediction model, mining the environment-invariant features of the training sample data set by using the click rate prediction model to obtain a first prediction result, processing the first prediction result and the label values of the training sample data set with an environment-invariant loss function by a gradient descent method, based on the optimization target of the environment-invariant part parameters, to obtain a first loss value, and updating the environment-invariant part parameters of the click rate prediction model according to the first loss value;
the specific parameter updating module is used for executing the fourth step, fixing the environment invariant part parameters of the click rate prediction model, mining the environment specific characteristics of the training sample data set by using the updated click rate prediction model to obtain a second prediction result, processing the second prediction result and the label value of the training sample data set by using an environment specific loss function through a gradient descent method based on the optimization target of the environment specific part parameters to obtain a second loss value, and updating the environment specific part parameters of the click rate prediction model according to the second loss value;
and the iteration module is used for iterating the second step to the fourth step until the click rate prediction model meets a preset convergence condition, so as to obtain a trained click rate prediction model.
According to a fourth aspect of the present invention, there is provided an electronic apparatus comprising:
one or more processors;
a storage device to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform a training method based on a click-through rate prediction model of decoupled invariant learning and a click-through rate prediction method.
According to a fifth aspect of the present invention, there is provided a computer readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform a method of training a click-through rate prediction model based on decoupled invariant learning and a method of click-through rate prediction.
According to the training method of the click rate prediction model based on the decoupling invariant learning, provided by the invention, the click rate prediction model with good generalization can be obtained, so that the model can identify stable characteristic interaction in different historical environments, and meanwhile, the problem that the click rate prediction model in the prior art is low in identification accuracy due to the fact that a data drift phenomenon exists between data processed in the model application stage and historical training data is solved, and the prediction accuracy of the model is greatly improved.
Drawings
FIG. 1 is a flow chart of a method of training a click-through rate prediction model based on decoupled invariant learning, according to an embodiment of the present invention;
FIG. 2 (a) is a schematic diagram of a decoupled invariant learning model according to an embodiment of the present invention;
FIG. 2 (b) is a schematic diagram of a light decoupling invariant learning model according to an embodiment of the present invention;
FIG. 3 is a flow chart of a click-through rate prediction method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a training apparatus for a click rate prediction model based on decoupling invariant learning according to an embodiment of the present invention;
FIG. 5 schematically shows a block diagram of an electronic device suitable for implementing a click-through rate prediction model training method and a click-through rate prediction method based on decoupled invariant learning, according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the accompanying drawings in combination with the embodiments.
Click-through rate prediction is a key link in recommendation systems. Early methods were factorization models, which model feature interactions through factorization and inner products. In recent years, with the rapid development of machine learning and deep learning technologies, those skilled in the art have proposed more efficient and more complex feature interaction modeling based on various neural networks, such as multi-layer perceptrons, inner-product or outer-product neural networks, attention-based neural networks, convolutional neural networks, and graph neural networks. Methods based on neural architecture search have also been proposed: some work focuses on automatically searching for the optimal network architecture for modeling feature interactions, and other work focuses on automatically selecting or generating the optimal feature interactions. These efforts enable better feature interaction modeling while greatly reducing human input. However, these feature interaction models suffer from data drift and poor generalization. To address data drift and poor generalization, those skilled in the art have proposed recommendation models based on the invariant learning paradigm. Such models assume that the target can be fully predicted by the environment-invariant correlations. This assumption cannot be met in the actual training and application of the model, so the ability of these models to identify stable feature interactions is difficult to guarantee.
In order to learn stable feature interactions in the click rate prediction problem of a recommendation system and to improve the generalization capability of the model on new data, the invention provides a stable feature interaction capturing method based on decoupling invariant learning. The method divides historical data into different environments in chronological order, and satisfies the invariant learning assumption by decoupling the environment-invariant correlation from the environment-specific correlation and removing the influence of the environment-specific correlation, so that invariant learning can be applied to capture stable feature interactions. Because the method captures stable feature interactions from the heterogeneity of historical data across environments, the learned feature interactions can generalize well in the service phase of a real click rate prediction scenario, which improves the prediction accuracy of the recommendation system.
FIG. 1 is a flowchart of a training method of a click-through rate prediction model based on decoupled invariant learning according to an embodiment of the present invention.
As shown in FIG. 1, the training method of the click rate prediction model based on the decoupling invariant learning includes operations S110 to S150.
In operation S110, a click-through rate prediction model and a model optimization objective are constructed based on a decoupling invariant learning method, where parameters of the click-through rate prediction model include parameters of an environment-invariant portion and parameters of an environment-specific portion, and the model optimization objective includes an optimization objective of the parameters of the environment-invariant portion and an optimization objective of the parameters of the environment-specific portion.
In operation S120, an environment data set is randomly sampled to obtain a training sample data set, where the environment data set represents historical click data of a user in different time periods, and includes a tag value.
In operation S130, the environment-specific part parameters of the click rate prediction model are fixed, the click rate prediction model is used to mine the environment-invariant features of the training sample data set to obtain a first prediction result, the first prediction result and the label values of the training sample data set are processed with the environment-invariant loss function by a gradient descent method, based on the optimization target of the environment-invariant part parameters, to obtain a first loss value, and the environment-invariant part parameters of the click rate prediction model are updated according to the first loss value.
In operation S140, the environment invariant portion parameter of the click rate prediction model is fixed, the updated click rate prediction model is used to mine the environment specific feature of the training sample data set to obtain a second prediction result, the second prediction result and the label value of the training sample data set are processed by using the environment specific loss function through a gradient descent method based on the optimization target of the environment specific portion parameter to obtain a second loss value, and the environment specific portion parameter of the click rate prediction model is updated according to the second loss value.
In operation S150, the operations S120 to S140 are performed iteratively until the click rate prediction model meets a preset convergence condition, so as to obtain a trained click rate prediction model.
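The alternation in operations S120 to S150 can be illustrated with a minimal sketch. It is not the patented implementation: formulas (1) and (2) are not reproduced in this text, so plain log loss stands in for the environment-invariant loss, log loss plus an L2 penalty stands in for the environment-specific loss, and a fixed number of steps stands in for the preset convergence condition. The names `phi`, `psi`, `predict` and `train`, the toy logistic model, and the additive combination of invariant and specific weights are all assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(phi, psi_e, x):
    # Toy model (assumption): invariant and environment-specific weights are summed.
    return sigmoid(sum((p + q) * xi for p, q, xi in zip(phi, psi_e, x)))

def train(envs, dim, steps=200, batch=8, lr=0.1, l2=0.01, seed=0):
    """envs: list of environments, each a list of (features, label) pairs."""
    rng = random.Random(seed)
    phi = [0.0] * dim                      # environment-invariant part
    psi = [[0.0] * dim for _ in envs]      # one environment-specific part per environment
    for _ in range(steps):                 # S150: fixed step count stands in for convergence
        # S120: randomly sample a training batch from every environment.
        batches = [[rng.choice(data) for _ in range(batch)] for data in envs]

        # S130: keep psi fixed, take one gradient step on phi (log loss stand-in).
        grad, n = [0.0] * dim, 0
        for e, samples in enumerate(batches):
            for x, y in samples:
                err = predict(phi, psi[e], x) - y   # d(log loss)/d(logit)
                for i in range(dim):
                    grad[i] += err * x[i]
                n += 1
        phi = [p - lr * g / n for p, g in zip(phi, grad)]

        # S140: keep the updated phi fixed, take one gradient step on each psi_e
        # (log loss plus an L2 penalty as a stand-in regularizer).
        for e, samples in enumerate(batches):
            grad = [l2 * q for q in psi[e]]
            for x, y in samples:
                err = predict(phi, psi[e], x) - y
                for i in range(dim):
                    grad[i] += err * x[i] / len(samples)
            psi[e] = [q - lr * g for q, g in zip(psi[e], grad)]
    return phi, psi

# Two toy environments in which the label follows the sign of the first feature.
envs = [
    [([1.0, 0.0], 1), ([-1.0, 0.0], 0)],
    [([1.0, 1.0], 1), ([-1.0, 1.0], 0)],
]
phi, psi = train(envs, dim=2)
```

The point of the sketch is the update order: each iteration resamples data, updates the shared parameters with the per-environment parameters frozen, and then updates the per-environment parameters against the freshly updated shared parameters.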
The training method of the click rate prediction model provided by the invention can fully mine the invariant features and the specific features of historical data in different environments. The invariant features of the historical data are features that the data shares across different time periods: for example, a user may have a relatively fixed preference for certain items or topics across time periods, follow them for a long time, and click on content related to them. The specific features of the historical data are user preferences for certain items or topics that appear suddenly at a certain time point or during a certain period: for example, a user may pay more attention to a breaking news event or to an item that suddenly becomes popular on social media, which raises the click rate of content related to that event.
According to the method for training the click rate prediction model based on the decoupling invariant learning, the click rate prediction model with good generalization performance can be obtained, so that the model can identify stable characteristic interaction in different historical environments, meanwhile, the problem that the click rate prediction model in the prior art is low in identification accuracy due to the fact that data drift exists between data processed in the model application stage and historical training data is solved, and the prediction accuracy of the model is greatly improved.
According to an embodiment of the present invention, the optimization objective of the above model is represented by formula (1) and formula (2):
[Formula (1): equation image SMS_80 not reproduced in the text]
[Formula (2): equation image SMS_81 not reproduced in the text]
wherein formula (1) represents the optimization objective of the environment-invariant part parameters and formula (2) represents the optimization objective of the environment-specific part parameters. The symbols of the two formulas are likewise images that are not reproduced; in the order in which they are defined, they denote: the environment-invariant part parameters; the environment-specific part parameters in a given environment; the prediction loss calculated in that environment; a hyperparameter controlling the strength of one term of the objective; a regularization constraint that prevents a set of parameters from capturing environment-invariant correlations; the variance of the empirical risks of the different training environments; the weight of the prediction loss of a given environment; and the coefficient of one term of the objective.
According to the embodiment of the present invention, the variance of the empirical risks of the different training environments is expressed by formula (3):
[Formula (3): equation image SMS_97 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: the number of elements in the set of training environments; two different environments; the environment-specific part parameters in a given environment; and the prediction loss calculated in that environment. The variance of the empirical risks of the different training environments is used to capture the patterns shared by different environments;
wherein the weight of the prediction loss of an environment is represented by formula (4):
[Formula (4): equation image SMS_107 not reproduced in the text]
wherein the index symbol of formula (4) denotes traversal of all environments.
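As a rough illustration of a variance-of-risk term, the sketch below computes the per-environment empirical risk as the mean loss in each environment and then takes the population variance of those risks. Formula (3) itself is not reproduced in this text, so this is the textbook definition rather than the patent's exact expression. Since the description of formula (3) refers to two different environments, a pairwise form is shown as well; the two are algebraically equal. The function names are hypothetical.

```python
def environment_risks(losses_by_env):
    """Mean loss per environment; losses_by_env is a list of lists of sample losses."""
    return [sum(losses) / len(losses) for losses in losses_by_env]

def risk_variance(risks):
    """Population variance of the per-environment risks."""
    mean = sum(risks) / len(risks)
    return sum((r - mean) ** 2 for r in risks) / len(risks)

def pairwise_risk_variance(risks):
    """Same quantity written over all ordered pairs of environments."""
    n = len(risks)
    total = sum((a - b) ** 2 for a in risks for b in risks)
    return total / (2 * n * n)

risks = environment_risks([[0.1, 0.3], [0.4], [0.5, 0.7]])
variance = risk_variance(risks)
```

A small variance means the model incurs similar risk in every environment, which is the sense in which such a penalty favours patterns shared across environments.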
According to the embodiment of the present invention, the click rate prediction model comprises a click rate prediction model based on decoupled invariant learning at the click data feature embedding level and/or a click rate prediction model based on decoupled invariant learning at the click data feature domain weight level.
According to the embodiment of the present invention, the click rate prediction model based on decoupled invariant learning at the click data feature embedding level is determined by formula (5):
[Formula (5): equation image SMS_109 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: the environment-invariant part parameters; the environment-specific part parameters in a given environment; the click data features; two individual click data features identified by their indices; the number of click data features; the environment-invariant feature embeddings corresponding to those two features; the environment-specific feature embeddings corresponding to those two features in a given environment; and the click rate prediction model.
The click rate prediction model based on decoupled invariant learning at the click data feature domain weight level is determined by formula (6):
[Formula (6): equation image SMS_132 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: the representations of two domains; the environment-invariant weight between the two domains; the environment-specific weight between the two domains in a given environment; and the number of feature domains (fields).
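One way to picture a domain-weight-level model is the sketch below. Formula (6) is not reproduced in this text, so the functional form is an assumption: each pair of domain representations interacts through an inner product, the interaction is scaled by the sum of an environment-invariant weight and an environment-specific weight, and the total is passed through a sigmoid. The names `field_weight_ctr`, `w_inv` and `w_spec_e` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def field_weight_ctr(field_vecs, w_inv, w_spec_e):
    """field_vecs: one representation vector per domain.
    w_inv[i][j]: environment-invariant weight between domains i and j.
    w_spec_e[i][j]: weight specific to the current environment (assumed additive)."""
    n = len(field_vecs)
    logit = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            weight = w_inv[i][j] + w_spec_e[i][j]
            logit += weight * dot(field_vecs[i], field_vecs[j])
    return 1.0 / (1.0 + math.exp(-logit))

field_vecs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
ones = [[1.0] * 3 for _ in range(3)]
zeros = [[0.0] * 3 for _ in range(3)]

# Invariant weights only: the environment-specific part is switched off.
p_invariant = field_weight_ctr(field_vecs, ones, zeros)

# An environment-specific weight that cancels the only non-zero interaction.
spec = [row[:] for row in zeros]
spec[0][1] = -1.0
p_in_env = field_weight_ctr(field_vecs, ones, spec)
```

Under this assumed form, passing all-zero specific weights yields a prediction that depends only on the environment-invariant part, which is one possible reading of predicting from environment-invariant feature interactions.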
According to an embodiment of the present invention, the representation of a domain is calculated from the feature embeddings in that domain, as determined by formula (7):
[Formula (7): equation image SMS_150 not reproduced in the text]
wherein the symbols, in the order in which they are defined, denote: a data feature identified by its index; the set gathering the entries that correspond to all data features of the domain; and the feature embedding corresponding to the indexed data feature.
For the click rate prediction problem, the invention provides a framework for capturing stable feature interactions, which consists of three parts: a decoupled invariant learning objective for capturing stable feature interactions; a meta-learning optimization framework for optimizing the decoupled invariant learning objective; and a model architecture for implementing decoupled invariant learning.
For the decoupled invariant learning objective that captures stable feature interactions, the historical data is divided chronologically, in equal durations, into
Figure SMS_159
different environments. The invention divides the model parameters that model feature interactions into an environment-invariant part
Figure SMS_160
and an environment-specific part
Figure SMS_161
, which capture environment-invariant and environment-specific correlations, respectively. The invention designs, at the feature embedding level and at the feature domain weight level respectively,
Figure SMS_162
and
Figure SMS_163
. To decouple the environment-invariant correlations from the environment-specific correlations, so that the sufficient prediction assumption of invariant learning is met and stable feature interactions are captured, the invention designs a decoupled invariant learning objective. It consists of an environment-specific learning objective that satisfies the sufficient prediction assumption and an environment-invariant learning objective that removes the influence of environment-specific correlations.
The invention divides the historical data chronologically, in equal durations, into
Figure SMS_164
different environments, which makes it easier to mine the invariant features in the historical data and the specific features tied to each environment. For example, if a user's historical click data is divided into multiple segments, the features shared across the segments can be mined as the user's environment-independent invariant click features, while the features that differ between segments may be the user's environment-specific click features.
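The equal-duration split described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_into_environments`, the `(timestamp, payload)` record layout, and the bucketing rule are assumptions introduced for this example.

```python
from typing import Any, List, Tuple

Record = Tuple[float, Any]  # (timestamp, payload); this layout is an assumption


def split_into_environments(records: List[Record], num_envs: int) -> List[List[Record]]:
    """Split records into num_envs chronologically ordered, equal-duration environments."""
    if num_envs <= 0:
        raise ValueError("num_envs must be positive")
    envs: List[List[Record]] = [[] for _ in range(num_envs)]
    if not records:
        return envs
    start = min(ts for ts, _ in records)
    end = max(ts for ts, _ in records)
    span = (end - start) or 1  # avoid division by zero when all timestamps are equal
    for ts, payload in sorted(records, key=lambda r: r[0]):
        # Map the timestamp to its time bucket; the latest record goes to the last bucket.
        idx = min(int((ts - start) * num_envs / span), num_envs - 1)
        envs[idx].append((ts, payload))
    return envs
```

Each returned list plays the role of one environment; the buckets cover equal time spans, so they may hold different numbers of records.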
Environment-specific learning objective satisfying the sufficient prediction assumption. To satisfy the sufficient prediction assumption of invariant learning, in environment
Figure SMS_165
the combination of
Figure SMS_166
and
Figure SMS_167
should sufficiently predict the target, while the environment-specific part should focus on capturing environment-specific correlations. The following optimization objective is designed accordingly, as shown in equation (8):
Figure SMS_168
(8),
wherein,
Figure SMS_170
denotes the prediction loss computed in environment
Figure SMS_172
,
Figure SMS_174
is the hyperparameter that controls the strength of
Figure SMS_169
, and
Figure SMS_175
is a regularization constraint that prevents
Figure SMS_176
from capturing environment-invariant correlations. It does so by ensuring that the environment-specific parameters
Figure SMS_177
, in any environment other than
Figure SMS_171
, that is, in environment
Figure SMS_173
, contribute nothing to the prediction, as shown in equation (9):
Figure SMS_178
(9)。
By optimizing this learning objective, the environment-invariant part parameters
Figure SMS_179
and the environment-specific part parameters
Figure SMS_180
jointly satisfy the sufficient prediction condition in environment
Figure SMS_181
, while
Figure SMS_182
focuses on capturing environment-specific correlations.
Environment-invariant learning objective that removes the influence of environment-specific correlations. Once
Figure SMS_183
has captured the environment-specific correlations, fixing
Figure SMS_184
is equivalent to removing the influence of
Figure SMS_185
on the prediction target; capturing the environment-invariant correlations then suffices to predict the target (the target with the influence of
Figure SMS_186
removed). The invention therefore fixes
Figure SMS_187
and designs the following invariant learning objective to optimize the environment-invariant model parameters
Figure SMS_188
so as to capture stable feature interactions, as shown in equation (1):
Figure SMS_189
(1),
wherein
Figure SMS_190
denotes the variance of the empirical risks across the training environments, and
Figure SMS_191
denotes, for environment
Figure SMS_192
, the weight of the prediction loss; they are computed as shown in equations (3) and (4):
Figure SMS_193
(3),
Figure SMS_194
(4)。
Minimizing the loss across environments improves performance in all environments, while minimizing the difference in loss between environments,
Figure SMS_195
, limits the performance gap between environments, captures the patterns that the environments share, and learns model parameters that are stable across environments. In addition, assigning larger weights
Figure SMS_196
to environments with high empirical risk makes training pay more attention to difficult environments, further improving the cross-environment generalization of the model parameters.
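The interplay of the weighted loss and the risk variance can be sketched numerically. The exact forms of equations (1), (3) and (4) are not reproduced in this text, so the sketch below makes three labeled assumptions: the variance is the population variance of the per-environment risks, the weights are proportional to the risks, and the objective is the weighted risk plus a coefficient times the variance. All function and parameter names are hypothetical.

```python
from typing import List


def risk_variance(risks: List[float]) -> float:
    """Population variance of per-environment empirical risks (assumed form of equation (3))."""
    mean = sum(risks) / len(risks)
    return sum((r - mean) ** 2 for r in risks) / len(risks)


def environment_weights(risks: List[float]) -> List[float]:
    """Weights that grow with the risk of an environment (assumed form of equation (4))."""
    total = sum(risks)
    if total == 0:
        return [1.0 / len(risks)] * len(risks)
    return [r / total for r in risks]


def invariant_objective(risks: List[float], variance_coeff: float) -> float:
    """Weighted risk plus variance penalty (assumed shape of equation (1))."""
    weights = environment_weights(risks)
    weighted_risk = sum(w * r for w, r in zip(weights, risks))
    return weighted_risk + variance_coeff * risk_variance(risks)
```

With two environments whose risks are 1.0 and 3.0, the harder environment receives three times the weight of the easier one, and the variance term penalizes the gap between them.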
In summary, the overall learning objective of the decoupling invariant learning is shown in equations (1) and (2):
Figure SMS_197
(1),
Figure SMS_198
(2)。
By optimizing this learning objective, the environment-invariant and environment-specific correlations in the different environments are decoupled, and feature interactions that are stable across heterogeneous environments are captured through the risk variance and the environment weighting, so that they generalize well at the model serving stage.
In the meta-learning optimization framework, the two sub-objectives of decoupled invariant learning depend on each other: optimizing the environment-invariant objective requires that
Figure SMS_199
capture the environment-specific correlations and that
Figure SMS_200
be fixed to remove their influence. The invention therefore updates
Figure SMS_201
and
Figure SMS_202
alternately and iteratively. First, update the environment-invariant model parameters
Figure SMS_203
: fix
Figure SMS_204
and optimize
Figure SMS_205
. Because the learning objective of decoupled invariant learning is a complex bi-level optimization problem, the invention optimizes it with meta-learning. In the meta-training phase, an environment
Figure SMS_206
is randomly sampled and the environment-specific learning objective is used to generate intermediate model parameters
Figure SMS_207
, as shown in equation (10):
Figure SMS_208
(10),
Then, in the meta-test phase, the invariant learning loss is computed with the intermediate model parameters and used to optimize
Figure SMS_209
, as shown in equation (11):
Figure SMS_210
(11),
wherein,
Figure SMS_211
denotes, for environment
Figure SMS_212
, the weight of the prediction loss.
Second, update the environment-specific model parameters
Figure SMS_213
. After
Figure SMS_214
has been updated, fix
Figure SMS_215
and directly optimize the environment-specific learning objective to update
Figure SMS_216
, as shown in equation (12):
Figure SMS_217
(12),
wherein,
Figure SMS_218
denotes taking the gradient with respect to
Figure SMS_219
. The two updates are alternated until the model converges.
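The alternating schedule described above can be sketched as a small control loop. This shows only the order of the two updates; the update rules of equations (10) to (12) are not reproduced in this text, so they are passed in as callables. The names `alternate_training`, `update_invariant`, `update_specific`, `has_converged` and `max_rounds` are hypothetical.

```python
from typing import Callable


def alternate_training(
    update_invariant: Callable[[], None],
    update_specific: Callable[[], None],
    has_converged: Callable[[], bool],
    max_rounds: int = 100,
) -> int:
    """Alternate the two updates until convergence; return the number of rounds run."""
    for round_index in range(1, max_rounds + 1):
        # Step 1: environment-specific parameters are held fixed while the
        # environment-invariant parameters are updated (meta-train, then meta-test).
        update_invariant()
        # Step 2: environment-invariant parameters are held fixed while the
        # environment-specific parameters are updated.
        update_specific()
        if has_converged():
            return round_index
    return max_rounds
```

Passing the updates as callables keeps the schedule independent of the model and of the automatic differentiation library used to compute the gradients.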
Fig. 2 (a) and fig. 2 (b) respectively show schematic diagrams of two types of decoupling invariant learning models according to an embodiment of the present invention, where fig. 2 (a) shows the decoupling invariant learning model and fig. 2 (b) shows the light decoupling invariant learning model (LightDIL).
For the model architecture, the invention designs environment-invariant model parameters
Figure SMS_220
and environment-specific model parameters
Figure SMS_221
at the feature embedding level and at the feature domain weight level, respectively, to decouple the two types of correlations. Taking the factorization machine as an example (the method can also be built on other models), the following two model architectures are designed.
The first model architecture, shown in fig. 2 (a), decouples at the feature embedding level. Because feature embedding is the core of a feature interaction model, the invention performs the decoupling at the feature embedding level. Each feature
Figure SMS_222
is given a corresponding environment-invariant embedding vector
Figure SMS_223
and a set of environment-specific embedding vectors
Figure SMS_224
. For the factorization machine, the model prediction formula is then as shown in formula (5):
Figure SMS_225
(5),
wherein
Figure SMS_226
This is the default form of Decoupled Invariant Learning (DIL) on the factorization machine.
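A minimal sketch of embedding-level decoupling on the second-order term of a factorization machine follows. Formula (5) is not reproduced in this text, so the sketch assumes that the environment-invariant and environment-specific embeddings of a feature are combined by addition, and it omits the bias and linear terms. The name `dil_fm_interaction` and its parameters are hypothetical.

```python
from typing import List, Optional

Vector = List[float]


def dil_fm_interaction(
    x: List[float],
    invariant_emb: List[Vector],
    specific_emb: Optional[List[Vector]] = None,
) -> float:
    """Second-order FM term with decoupled embeddings.

    x[i] is the value of feature i, invariant_emb[i] its environment-invariant
    embedding, and specific_emb[i] its embedding for the current environment.
    Passing specific_emb=None uses the invariant embeddings only.
    """
    if specific_emb is None:
        emb = [list(v) for v in invariant_emb]
    else:
        # Assumption: the two embeddings are summed per feature.
        emb = [[a + b for a, b in zip(v, u)] for v, u in zip(invariant_emb, specific_emb)]
    score = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            inner = sum(a * b for a, b in zip(emb[i], emb[j]))
            score += inner * x[i] * x[j]
    return score
```

Every feature carries one extra embedding per environment, which is the parameter growth that motivates the lighter architecture.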
The second model architecture, shown in fig. 2 (b), decouples at the feature domain weight level. Decoupling at the feature embedding level greatly increases the number of model parameters, which makes the model harder to learn and increases the model storage burden and the training overhead. To improve model efficiency, the invention decouples at the feature domain level instead. Specifically, environment-invariant weights and environment-specific weights are assigned to feature interactions at the feature domain level to capture environment-invariant and environment-specific correlations, respectively. Taking the factorization machine as an example, the model prediction formula is shown in formula (6):
Figure SMS_227
(6),
wherein,
Figure SMS_228
is computed from the feature embeddings in domain
Figure SMS_231
, namely
Figure SMS_237
, and is the representation of domain
Figure SMS_229
;
Figure SMS_233
is the environment-invariant weight between domain
Figure SMS_235
and domain
Figure SMS_236
;
Figure SMS_230
is the weight between domain
Figure SMS_232
and domain
Figure SMS_234
that is specific to environment t, with
Figure SMS_238
. This model architecture is named light decoupled invariant learning (LightDIL).
In summary, the invention designs decoupled model architectures at the feature embedding level and at the feature domain weight level, as shown in fig. 2 (a) and fig. 2 (b), respectively. In the serving stage, only the environment-invariant model parameters, that is, the stable feature interactions, are used for prediction, to ensure good generalization. Taking light decoupled invariant learning as an example, the prediction formula is shown in formula (13):
Figure SMS_239
(13),
wherein,
Figure SMS_240
denotes the empty set: in the prediction stage of light decoupled invariant learning, the set of environment-specific model parameters is replaced with the empty set.
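The domain-level architecture and its serving-time behaviour can be sketched together. Formulas (6), (7) and (13) are not reproduced in this text, so the sketch makes two labeled assumptions: a domain representation is the value-weighted sum of the embeddings of the features in that domain, and the environment-invariant and environment-specific weights of a domain pair are added. The names `field_representation` and `lightdil_score` are hypothetical.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def field_representation(feature_ids: List[int], x: List[float], embeddings: List[Vector]) -> Vector:
    """Value-weighted sum of the embeddings of one domain (assumed form of equation (7))."""
    rep = [0.0] * len(embeddings[0])
    for i in feature_ids:
        for d, value in enumerate(embeddings[i]):
            rep[d] += x[i] * value
    return rep


def lightdil_score(
    x: List[float],
    fields: List[List[int]],
    embeddings: List[Vector],
    invariant_w: Matrix,
    specific_w: Optional[Matrix] = None,
) -> float:
    """Weighted sum of pairwise domain interactions.

    specific_w holds the weights of the current training environment; passing
    None mimics serving, where the environment-specific set is empty.
    """
    reps = [field_representation(f, x, embeddings) for f in fields]
    score = 0.0
    for a in range(len(fields)):
        for b in range(a + 1, len(fields)):
            weight = invariant_w[a][b]
            if specific_w is not None:
                weight += specific_w[a][b]  # assumption: weights are combined additively
            score += weight * sum(p * q for p, q in zip(reps[a], reps[b]))
    return score
```

Only one scalar per domain pair and environment is added on top of the shared embeddings, which is why this variant is much lighter than decoupling every embedding.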
FIG. 3 is a flowchart of a click-through rate prediction method according to an embodiment of the invention.
As shown in fig. 3, the click-through rate prediction method includes operations S310 to S320.
In operation S310, a historical data set of a user to be predicted is obtained, where the historical data set of the user to be predicted includes user feature data and user click data.
In operation S320, a click rate prediction model is used to mine the environment-invariant feature interactions of the historical data set of the user to be predicted and obtain a prediction result, where the click rate prediction model is obtained by training with the above-mentioned training method for the click rate prediction model based on decoupled invariant learning.
FIG. 4 is a schematic structural diagram of a training device for a click rate prediction model based on decoupling invariant learning according to an embodiment of the present invention.
As shown in fig. 4, the training apparatus 400 for the click rate prediction model based on the decoupled invariant learning includes a model building module 410, a data sampling module 420, an invariant parameter updating module 430, a specific parameter updating module 440, and an iteration module 450.
The model building module 410 is configured to execute operation S110, and build a click-through rate prediction model and a model optimization target based on a decoupling invariant learning method, where parameters of the click-through rate prediction model include parameters of an environment-invariant portion and parameters of an environment-specific portion, and the model optimization target includes an optimization target of the parameters of the environment-invariant portion and an optimization target of the parameters of the environment-specific portion.
The data sampling module 420 is configured to execute operation S120, and perform random sampling on an environment data set to obtain a training sample data set, where the environment data set represents historical click data of a user in different time periods, and the environment data set includes a tag value.
The invariant parameter updating module 430 is configured to execute operation S130, fix the environment specific part parameter of the click rate prediction model, mine the environment invariant feature of the training sample data set by using the click rate prediction model to obtain a first prediction result, process the first prediction result and the label value of the training sample data set by using the environment invariant loss function through a gradient descent method based on the optimization target of the environment invariant part parameter to obtain a first loss value, and update the environment invariant part parameter of the click rate prediction model according to the first loss value.
The specific parameter updating module 440 is configured to execute operation S140, fix the environment-invariant parameters of the click rate prediction model, mine the environment-specific features of the training sample data set by using the updated click rate prediction model to obtain a second prediction result, process the second prediction result and the label value of the training sample data set by using the environment-specific loss function through a gradient descent method based on the optimization target of the environment-specific parameters to obtain a second loss value, and update the environment-specific parameters of the click rate prediction model according to the second loss value.
And the iteration module 450 is configured to iterate operations S120 to S140 until the click rate prediction model meets a preset convergence condition, so as to obtain a trained click rate prediction model.
In order to better illustrate the advantages of the click rate prediction model obtained by the training method provided by the invention, the click rate prediction model obtained by the training method of the invention is verified by combining a specific experiment.
The experiments take the classical click rate prediction model FM as the base recommendation model and use two public data sets of different types, Douban and MovieLens-10M (ML-10M). FwFMs, AutoFIS, PROFIT, Group-DRO and V-REx are used as comparison models. With 6 months as one period, Douban and ML-10M are divided into 15 and 13 periods, respectively. For Douban, the first five periods are used as the training set, the middle five as the validation set, and the last five as the test set. For ML-10M, the first five periods are used as the training set, the middle four as the validation set, and the last four as the test set. All methods train the model on the training set, select the best parameters on the validation set, and are evaluated on the test set. The average performance over the last several test periods of Douban and ML-10M is reported, measured by AUC and logloss.
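For reference, the two evaluation metrics can be computed as in the following sketch. These are the standard definitions of logloss and AUC, written with the standard library only; the function names and the quadratic-time pairwise AUC are choices made for this illustration and are not taken from the experiments.

```python
import math
from typing import List


def log_loss(labels: List[int], probs: List[float], eps: float = 1e-15) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted click probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep the logarithm finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def auc(labels: List[int], scores: List[float]) -> float:
    """Probability that a random positive is scored above a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A higher AUC and a lower logloss both indicate better click rate prediction.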
The results are shown in Table 1:
TABLE 1 comparison of Performance of different methods on two datasets
Figure SMS_241
Table 1 shows that, on both types of data sets, the method outperforms the general invariant learning methods V-REx and Group-DRO on all metrics. This indicates that decoupling satisfies the sufficient prediction assumption of invariant learning and thus allows invariant learning to be applied to capturing stable feature interactions for click rate prediction. Compared with the recommendation models FwFMs, AutoFIS and PROFIT, the method also obtains excellent results, which shows that it can capture stable feature interactions in different recommendation scenarios, generalize better in serving-stage prediction, and improve prediction accuracy.
FIG. 5 schematically shows a block diagram of an electronic device suitable for implementing a click-through rate prediction model training method based on decoupled invariant learning and a click-through rate prediction method according to an embodiment of the present invention.
As shown in fig. 5, an electronic device 500 according to an embodiment of the present invention includes a processor 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The processor 501 may comprise, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 501 may also include onboard memory for caching purposes. Processor 501 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the present invention.
In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are stored. The processor 501, the ROM 502, and the RAM 503 are connected to each other by a bus 504. The processor 501 performs various operations of the method flow according to the embodiments of the present invention by executing programs in the ROM 502 and/or the RAM 503. Note that the program may also be stored in one or more memories other than the ROM 502 and the RAM 503. The processor 501 may also perform various operations of method flows according to embodiments of the present invention by executing programs stored in the one or more memories.
According to an embodiment of the present invention, the electronic device 500 may also include an input/output (I/O) interface 505, which is also connected to the bus 504. The electronic device 500 may also include one or more of the following components connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as necessary, so that a computer program read from it can be installed into the storage section 508 as necessary.
The present invention also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the present invention.
According to embodiments of the present invention, the computer readable storage medium may be a non-volatile computer readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the invention, a computer-readable storage medium may include ROM 502 and/or RAM 503 and/or one or more memories other than ROM 502 and RAM 503 as described above.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above embodiments further describe the objects, technical solutions and advantages of the present invention in detail. It should be understood that the above are only embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A click rate prediction model training method based on decoupling invariant learning, characterized by comprising the following steps:
step one, constructing a click rate prediction model and a model optimization target based on a decoupling invariant learning method, wherein parameters of the click rate prediction model comprise environment invariant part parameters and environment specific part parameters, and the model optimization target comprises an optimization target of the environment invariant part parameters and an optimization target of the environment specific part parameters;
step two, randomly sampling an environment data set to obtain a training sample data set, wherein the environment data set represents historical click data of a user in different time periods, and the environment data set comprises label values;
step three, fixing the environment specific part parameters of the click rate prediction model, mining the environment invariant features of the training sample data set by using the click rate prediction model to obtain a first prediction result, processing the first prediction result and the label values of the training sample data set by using an environment invariant loss function through a gradient descent method based on the optimization target of the environment invariant part parameters to obtain a first loss value, and updating the environment invariant part parameters of the click rate prediction model according to the first loss value;
step four, fixing the environment invariant part parameters of the click rate prediction model, mining the environment specific features of the training sample data set by using the updated click rate prediction model to obtain a second prediction result, processing the second prediction result and the label values of the training sample data set by using an environment specific loss function through a gradient descent method based on the optimization target of the environment specific part parameters to obtain a second loss value, and updating the environment specific part parameters of the click rate prediction model according to the second loss value;
step five, iterating step two to step four until the click rate prediction model meets a preset convergence condition, so as to obtain a trained click rate prediction model.
2. The method of claim 1, wherein the optimization objective of the model is represented by formula (1) and formula (2):
Figure QLYQS_1
(1),
Figure QLYQS_2
(2),
wherein formula (1) represents an optimization objective of the environment-invariant partial parameter, formula (2) represents an optimization objective of the environment-specific partial parameter,
Figure QLYQS_5
denotes the environment invariant part parameters,
Figure QLYQS_7
denotes, in environment
Figure QLYQS_13
, the environment specific part parameters,
Figure QLYQS_4
denotes the prediction loss computed in environment
Figure QLYQS_8
,
Figure QLYQS_11
is the hyperparameter that controls the strength of
Figure QLYQS_16
,
Figure QLYQS_3
is a regularization constraint that prevents
Figure QLYQS_9
from capturing environment invariant correlations,
Figure QLYQS_12
denotes the variance of the empirical risks of the different training environments,
Figure QLYQS_14
denotes, for environment
Figure QLYQS_6
, the weight of the prediction loss, and
Figure QLYQS_10
denotes the coefficient of
Figure QLYQS_15
.
3. The method of claim 2, wherein the variance of the empirical risks of the different training environments
Figure QLYQS_17
is expressed by equation (3):
Figure QLYQS_18
(3),
wherein,
Figure QLYQS_21
denotes the number of elements in the set of training environments,
Figure QLYQS_22
and
Figure QLYQS_23
denote different environments,
Figure QLYQS_19
denotes, in environment
Figure QLYQS_24
, the environment specific part parameters,
Figure QLYQS_25
denotes the corresponding quantity in environment
Figure QLYQS_26
, and the variance of the empirical risks of the different training environments
Figure QLYQS_20
is used to capture the patterns shared by the different environments;
wherein, for environment
Figure QLYQS_27
, the weight of the prediction loss
Figure QLYQS_28
is expressed by equation (4):
Figure QLYQS_29
(4),
wherein,
Figure QLYQS_30
denotes traversal of all environments.
4. The method of claim 1, wherein the click rate prediction models comprise click rate prediction models based on decoupled invariant learning of a click data feature embedding level and/or click rate prediction models based on decoupled invariant learning of a click data feature domain weight level.
5. The method of claim 4, wherein the click-through rate prediction model based on decoupled invariant learning of click data feature embedding levels is determined by equation (5):
Figure QLYQS_31
(5),
wherein,
Figure QLYQS_45
denotes the environment invariant part parameters,
Figure QLYQS_34
denotes, in environment
Figure QLYQS_39
, the environment specific part parameters,
Figure QLYQS_46
Figure QLYQS_47
Figure QLYQS_48
denotes the features of the click data,
Figure QLYQS_53
denotes the
Figure QLYQS_38
-th click data feature,
Figure QLYQS_43
denotes the
Figure QLYQS_32
-th click data feature,
Figure QLYQS_37
denotes the number of click data features,
Figure QLYQS_35
denotes the environment-invariant feature embedding corresponding to the
Figure QLYQS_41
-th feature,
Figure QLYQS_44
denotes the environment-invariant feature embedding corresponding to the
Figure QLYQS_51
-th feature,
Figure QLYQS_42
denotes the feature embedding of the
Figure QLYQS_49
-th feature that is specific to the
Figure QLYQS_50
-th environment,
Figure QLYQS_52
denotes the feature embedding of the
Figure QLYQS_33
-th feature that is specific to the
Figure QLYQS_36
-th environment, and
Figure QLYQS_40
denotes the click rate prediction model;
wherein the click rate prediction model based on decoupling invariant learning at the click data feature domain weight level is determined by formula (6):
Figure QLYQS_54
(6),
wherein,
Figure QLYQS_57
denotes the representation of domain
Figure QLYQS_59
,
Figure QLYQS_60
denotes the representation of domain
Figure QLYQS_56
,
Figure QLYQS_62
denotes the environment-invariant weight between domain
Figure QLYQS_64
and domain
Figure QLYQS_66
,
Figure QLYQS_55
denotes the weight between domain
Figure QLYQS_63
and domain
Figure QLYQS_65
that is specific to environment
Figure QLYQS_67
, where
Figure QLYQS_58
and
Figure QLYQS_61
denotes the number of feature domains.
6. The method of claim 5, wherein the above-mentioned domain
Figure QLYQS_68
has the representation
Figure QLYQS_69
, which is computed for domain
Figure QLYQS_70
from the feature embeddings
Figure QLYQS_71
in that domain, as determined by equation (7):
Figure QLYQS_72
(7),
wherein,
Figure QLYQS_75
denotes the
Figure QLYQS_76
-th data feature,
Figure QLYQS_77
denotes the set that, for domain
Figure QLYQS_74
, collects over all of its data features
Figure QLYQS_78
the corresponding
Figure QLYQS_79
, and
Figure QLYQS_80
denotes the feature embedding corresponding to the
Figure QLYQS_73
-th data feature.
7. A click-through rate prediction method, comprising:
acquiring a historical data set of a user to be predicted, wherein the historical data set of the user to be predicted comprises user characteristic data and user click data;
and mining, by using a click rate prediction model, the environment-invariant feature interactions of the historical data set of the user to be predicted to obtain a prediction result, wherein the click rate prediction model is obtained by training according to the method of any one of claims 1 to 6.
8. A training device for a click rate prediction model based on decoupling invariant learning, characterized by comprising:
a model construction module, configured to perform a first step of constructing a click rate prediction model and a model optimization target based on a decoupling invariant learning method, wherein the parameters of the click rate prediction model comprise environment-invariant part parameters and environment-specific part parameters, and the model optimization target comprises an optimization target of the environment-invariant part parameters and an optimization target of the environment-specific part parameters;
a data sampling module, configured to perform a second step of randomly sampling an environment data set to obtain a training sample data set, wherein the environment data set represents historical click data of users in different time periods and comprises label values;
an invariant parameter updating module, configured to perform a third step of fixing the environment-specific part parameters of the click rate prediction model, mining the environment-invariant features of the training sample data set by using the click rate prediction model to obtain a first prediction result, processing the first prediction result and the label values of the training sample data set by using an environment-invariant loss function through a gradient descent method, based on the optimization target of the environment-invariant part parameters, to obtain a first loss value, and updating the environment-invariant part parameters of the click rate prediction model according to the first loss value;
a specific parameter updating module, configured to perform a fourth step of fixing the environment-invariant part parameters of the click rate prediction model, mining the environment-specific features of the training sample data set by using the updated click rate prediction model to obtain a second prediction result, processing the second prediction result and the label values of the training sample data set by using an environment-specific loss function through a gradient descent method, based on the optimization target of the environment-specific part parameters, to obtain a second loss value, and updating the environment-specific part parameters of the click rate prediction model according to the second loss value;
and an iteration module, configured to iterate the second step to the fourth step until the click rate prediction model meets a preset convergence condition, so as to obtain a trained click rate prediction model.
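The alternating update performed by the modules above can be sketched as follows. This is a minimal illustration and not the patented implementation: it assumes a linear logistic model, uses plain log loss as a stand-in for both the environment-invariant and the environment-specific loss functions (their exact forms are not given in this claim), and checks convergence with a simple loss-change threshold. The names `train_decoupled`, `w_inv`, `w_env` and `env_data` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_decoupled(env_data, n_features, lr=0.5, batch_size=64,
                    max_iters=500, tol=1e-7, seed=0):
    """env_data maps an environment id (a time period) to a list of (x, y)."""
    rng = random.Random(seed)
    # First step: one shared (invariant) weight vector and one
    # environment-specific weight vector per environment.
    w_inv = [0.0] * n_features
    w_env = {e: [0.0] * n_features for e in env_data}
    pool = [(e, x, y) for e, rows in env_data.items() for x, y in rows]
    prev_loss = float("inf")
    for _ in range(max_iters):
        # Second step: randomly sample a training batch across environments.
        batch = rng.sample(pool, min(batch_size, len(pool)))
        n = len(batch)
        # Third step: w_env is held fixed; w_inv is updated from the
        # invariant-only prediction (assumed log-loss gradient).
        grad = [0.0] * n_features
        for _e, x, y in batch:
            err = sigmoid(dot(w_inv, x)) - y
            for j in range(n_features):
                grad[j] += err * x[j] / n
        w_inv = [w - lr * g for w, g in zip(w_inv, grad)]
        # Fourth step: the updated w_inv is held fixed; each w_env[e] is
        # updated from the full prediction on that environment's samples.
        grads = {e: [0.0] * n_features for e in w_env}
        loss = 0.0
        for e, x, y in batch:
            p = sigmoid(dot(w_inv, x) + dot(w_env[e], x))
            loss -= (y * math.log(p + 1e-12)
                     + (1 - y) * math.log(1 - p + 1e-12)) / n
            for j in range(n_features):
                grads[e][j] += (p - y) * x[j] / n
        for e in w_env:
            w_env[e] = [w - lr * g for w, g in zip(w_env[e], grads[e])]
        # Iteration: stop once the batch loss stops changing (assumed criterion).
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return w_inv, w_env
```

In this sketch a feature whose effect is the same in every time period is absorbed by `w_inv`, while a feature whose effect changes sign between periods is left to the per-environment vectors in `w_env`.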
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-7.
10. A computer-readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to carry out the method according to any one of claims 1 to 7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310053850.7A CN115809372B (en) 2023-02-03 2023-02-03 Click rate prediction model training method and device based on decoupling invariant learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310053850.7A CN115809372B (en) 2023-02-03 2023-02-03 Click rate prediction model training method and device based on decoupling invariant learning

Publications (2)

Publication Number Publication Date
CN115809372A (en) 2023-03-17
CN115809372B (en) 2023-06-16

Family ID: 85487763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310053850.7A Active CN115809372B (en) 2023-02-03 2023-02-03 Click rate prediction model training method and device based on decoupling invariant learning

Country Status (1)

Country Link
CN (1) CN115809372B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490389A (en) * 2019-08-27 2019-11-22 腾讯科技(深圳)有限公司 Clicking rate prediction technique, device, equipment and medium
CN111538761A (en) * 2020-04-21 2020-08-14 中南大学 Click rate prediction method based on attention mechanism
CN113205184A (en) * 2021-04-28 2021-08-03 清华大学 Invariant learning method and device based on heterogeneous hybrid data
US20220083913A1 (en) * 2020-09-11 2022-03-17 Actapio, Inc. Learning apparatus, learning method, and a non-transitory computer-readable storage medium
CN114240555A (en) * 2021-12-17 2022-03-25 北京沃东天骏信息技术有限公司 Click rate prediction model training method and device and click rate prediction method and device
CN114445121A (en) * 2021-12-27 2022-05-06 天翼云科技有限公司 Advertisement click rate prediction model construction and advertisement click rate prediction method
CN115018552A (en) * 2022-06-28 2022-09-06 中国科学技术大学 Method for determining click rate of product

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
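孟露, 王莉: "Click-through rate prediction model for recommender systems" *
郑嘉伟, 王粉花: "Click-through rate prediction model based on multi-level feature interaction" *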

Also Published As

Publication number Publication date
CN115809372B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
CN103502899B (en) Dynamic prediction Modeling Platform
CN111369299B (en) Identification method, device, equipment and computer readable storage medium
CN108229667A (en) Trimming based on artificial neural network classification
CN111145076B (en) Data parallelization processing method, system, equipment and storage medium
KR101828215B1 (en) A method and apparatus for learning cyclic state transition model on long short term memory network
CN109389424B (en) Flow distribution method and device, electronic equipment and storage medium
CN111950810A (en) Multivariable time sequence prediction method and device based on self-evolution pre-training
CN112785342A (en) Real estate dynamic estimation method and device
CN113435430A (en) Video behavior identification method, system and equipment based on self-adaptive space-time entanglement
US20140236869A1 (en) Interactive variable selection device, interactive variable selection method, and interactive variable selection program
CN116684330A (en) Traffic prediction method, device, equipment and storage medium based on artificial intelligence
CN110263136B (en) Method and device for pushing object to user based on reinforcement learning model
CN115359321A (en) Model training method and device, electronic equipment and storage medium
US11475295B2 (en) Predicting and visualizing outcomes using a time-aware recurrent neural network
Larsen et al. Fast continuous and integer L-shaped heuristics through supervised learning
CN114862010A (en) Flow determination method, device, equipment and medium based on space-time data
CN113505583B (en) Emotion reason clause pair extraction method based on semantic decision graph neural network
CN112486784A (en) Method, apparatus and medium for diagnosing and optimizing data analysis system
CN116861262B (en) Perception model training method and device, electronic equipment and storage medium
US11989656B2 (en) Search space exploration for deep learning
CN110717537B (en) Method and device for training user classification model and executing user classification prediction
CN115809372A (en) Click rate prediction model training method and device based on decoupling invariant learning
JP2005222445A (en) Information processing method and analysis device in data mining
WO2020059136A1 (en) Decision list learning device, decision list learning method, and decision list learning program
CN113094602B (en) Hotel recommendation method, system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant