CN107688956B - Information processing method and server - Google Patents

Info

Publication number
CN107688956B
Authority
CN
China
Prior art keywords
user
data
recommendation information
interest
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610639936.8A
Other languages
Chinese (zh)
Other versions
CN107688956A (en)
Inventor
赵丽丽
刘大鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610639936.8A
Publication of CN107688956A
Application granted
Publication of CN107688956B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241: Advertisements
    • G06Q30/0251: Targeted advertisements
    • G06Q30/0269: Targeted advertisements based on user profile or attribute
    • G06Q30/0271: Personalized advertisement

Abstract

The invention discloses an information processing method and a server, comprising the following steps: acquiring a data set, wherein the data set comprises a plurality of user characteristic data, and each user characteristic data indicates that a first interest relationship exists between a user and recommendation information; establishing a first interest relation model according to the data set; and determining a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information.

Description

Information processing method and server
Technical Field
The present invention relates to information processing technologies, and in particular, to an information processing method and a server.
Background
Information recommendation is widely used on the Internet. When information is recommended through Internet media, user experience must be considered, and existing click rate estimation models use the user's feedback behavior as a feature to optimize the model. For information that the user dislikes, the ranking score of a candidate advertisement is reduced accordingly on the basis of statistics such as the user's negative feedback. In online advertising, negative feedback behavior is particularly important because it directly affects both the probability that a user clicks an advertisement and the user's experience.
Existing approaches use negative feedback as statistical data configured according to manual experience, which ignores the weights of the various feedback behaviors and their mutual influence. For a given piece of recommendation information, no single factor comprehensively accounts for the influence of the various feedback data, so the degree to which the user dislikes the current information cannot be comprehensively reflected.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present invention provide an information processing method and a server.
The information processing method provided by the embodiment of the invention comprises the following steps:
acquiring a data set, wherein the data set comprises a plurality of user characteristic data, and each user characteristic data indicates that a first interest relationship exists between a user and recommendation information;
establishing a first interest relation model according to the data set;
and determining a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information.
In this embodiment of the present invention, the acquiring the data set includes:
acquiring at least the following characteristic data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data;
and selecting user characteristic data with a first interest relationship between the user and the recommendation information from the acquired characteristic data.
In an embodiment of the present invention, the establishing a first interest relationship model according to the data set includes:
according to the data set, determining first type of sample data and second type of sample data, wherein the quantity of the first type of sample data is less than that of the second type of sample data;
sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship;
and performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relationship model.
In the embodiment of the present invention, the method further includes:
taking the first interest parameter as a feature in a click rate estimation model, and optimizing the click rate estimation model; the click rate estimation model is used for representing the association between the user click rate and each feature in the click rate estimation model.
In the embodiment of the present invention, the method further includes:
and performing corresponding weight reduction processing on the ranking scores of the candidate recommendation information according to the first interest parameters.
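The weight reduction step can be illustrated with a minimal Python sketch. The source does not give a formula, so the multiplicative rule `score * (1 - strength * dislike)`, the `strength` parameter, and the function name `downweight_scores` are all assumptions made for illustration, with the first interest parameter treated as a value in [0, 1].

```python
from typing import Dict


def downweight_scores(rank_scores: Dict[str, float],
                      dislike: Dict[str, float],
                      strength: float = 0.5) -> Dict[str, float]:
    """Scale each candidate's ranking score by (1 - strength * dislike).

    `dislike` holds the first interest parameter per candidate; the
    multiplicative rule and `strength` are hypothetical choices.
    """
    adjusted = {}
    for ad_id, score in rank_scores.items():
        d = min(max(dislike.get(ad_id, 0.0), 0.0), 1.0)  # clamp to [0, 1]
        adjusted[ad_id] = score * (1.0 - strength * d)
    return adjusted


scores = {"ad_a": 0.80, "ad_b": 0.75}
dislike = {"ad_a": 0.9, "ad_b": 0.1}
adjusted = downweight_scores(scores, dislike)
# Candidates are re-ranked by their adjusted scores, highest first.
ranking = sorted(adjusted, key=adjusted.get, reverse=True)
```

In this toy input, the strongly disliked candidate `ad_a` drops below `ad_b` even though its original ranking score was higher.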
The server provided by the embodiment of the invention comprises:
an acquisition unit, configured to acquire a data set, wherein the data set comprises a plurality of user characteristic data, and each user characteristic data indicates that a first interest relationship exists between a user and recommendation information;
the establishing unit is used for establishing a first interest relationship model according to the data set;
and the determining unit is used for determining a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information.
In an embodiment of the present invention, the obtaining unit includes:
an obtaining subunit, configured to obtain at least the following feature data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data;
and the selecting subunit is used for selecting the user characteristic data with the first interest relationship between the user and the recommendation information from the acquired characteristic data.
In an embodiment of the present invention, the establishing unit includes:
a determining subunit, configured to determine, according to the data set, first type of sample data and second type of sample data, where the number of the first type of sample data is smaller than the number of the second type of sample data;
the sampling subunit is used for sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship;
and the optimization subunit is used for performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relationship model.
In the embodiment of the present invention, the server further includes:
the click rate estimation model unit is used for optimizing the click rate estimation model by taking the first interest parameter as a feature in the click rate estimation model; the click rate estimation model is used for representing the association between the user click rate and each feature in the click rate estimation model.
In the embodiment of the present invention, the server further includes:
and the ranking optimization unit is used for performing corresponding weight reduction processing on the ranking scores of the candidate recommendation information according to the first interest parameters.
According to the technical scheme, a data set is obtained and comprises a plurality of user characteristic data, wherein each user characteristic data indicates that a first interest relationship exists between a user and recommendation information; establishing a first interest relation model according to the data set; and determining a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information. Therefore, various user characteristic data (namely user feedback data) are fused to establish the first interest relation model (namely the feedback model), and a factor (namely the first interest parameter) is abstracted, so that the degree of the user feedback information can be comprehensively reflected.
Drawings
FIG. 1 is a diagram of hardware entities performing information interaction in an embodiment of the present invention;
FIG. 2 is a first flowchart illustrating an information processing method according to an embodiment of the present invention;
FIG. 3 is a second flowchart illustrating an information processing method according to an embodiment of the present invention;
FIG. 4 is a third flowchart illustrating an information processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of the present invention;
FIG. 6 is a first schematic structural diagram of a server according to an embodiment of the present invention;
FIG. 7 is a second schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The following is an explanation of relevant terms in the examples of the present invention:
Online advertising: an emerging Internet advertising mode that uses the advantages of big data to let the most suitable audience see the most suitable advertisement on the most suitable medium at the most suitable time, thereby achieving a win-win result.
Explicit feedback: the user directly expresses a preference about a delivered advertisement by clicking an explicit marker such as like or dislike.
Implicit feedback: the user implicitly expresses a preference about a delivered advertisement through behavior such as clicking or not clicking.
Negative feedback: for example, the user closes (clicks the X on) a disliked advertisement. Negative feedback is an effective means of collecting user opinions.
The following describes the embodiments in further detail with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of hardware entities performing information interaction in an embodiment of the present invention. Fig. 1 includes servers 11 to 1n and terminals 21 to 24. The terminals 21 to 24 exchange information with the servers through a wired or wireless network, and the terminals include mobile phones, desktop computers, PCs, all-in-one machines, and the like. In one example, the servers 11 to 1n may also interact over the network with the terminal of an advertiser (that is, a party that provides advertisement material and promotes content). After the advertiser submits an advertisement to be delivered, the advertisement is stored in the server cluster, and a series of processes, such as an administrator reviewing the advertisement submitted by the advertiser, may also be provided. The terminals 21 to 24 are the terminals of ordinary users (that is, the parties to whom advertisements are presented or exposed). All applications installed on a terminal, or designated applications (such as game, video, or navigation applications), can carry advertisements so as to show more recommendation information to the user.
The example in Fig. 1 is only one system architecture for implementing the embodiments of the present invention; the embodiments are not limited to this architecture, and the various embodiments of the present invention are proposed on the basis of it.
Fig. 2 is a first schematic flow chart of an information processing method according to an embodiment of the present invention, and as shown in fig. 2, the information processing method includes the following steps:
Step 201: Acquire a data set, wherein the data set comprises a plurality of user characteristic data, and each user characteristic data indicates that a first interest relationship exists between a user and recommendation information.
In the embodiment of the present invention, acquiring the data set specifically includes: acquiring at least the following characteristic data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data; and selecting user characteristic data with a first interest relationship between the user and the recommendation information from the acquired characteristic data.
Here, the data fed back by the user is also referred to as user characteristic data, and the data fed back by the user includes: explicit feedback data and implicit feedback data, wherein:
Explicit feedback data refers to explicit feedback operations, such as clicking like or dislike, that the user performs directly on delivered recommendation information (such as advertisements). For example: data on the user closing (clicking the X on) an advertisement or advertisement category per day, week, or month; for another example: data on the user clicking like or dislike on an advertisement or advertisement category per day, week, or month.
Implicit feedback data refers to the user implicitly indicating a preference for delivered recommendation information (such as advertisements) through behavior such as clicking or not clicking. For example: the user's exposure and click data for an advertisement or advertisement category per day, week, or month; for another example: the historical exposure and click data of the advertisement.
The feature data that influence whether the user dislikes recommendation information further include advertisement quality data and user basic data. User basic data include, for example, the user's age, gender, province, and Internet access scenario.
Referring to fig. 5, among the characteristic data such as explicit feedback data, implicit feedback data, advertisement quality data, and user basic data, user characteristic data having a first interest relationship between the user and the recommendation information is selected for subsequent modeling.
Here, a first interest relationship between the user and the recommendation information means that the user dislikes (or is not interested in) the recommendation information. Correspondingly, a second interest relationship between the user and the recommendation information means that the user likes (or is interested in) the recommendation information.
Taking implicit feedback data as an example: recommendation information is exposed to user A multiple times, but user A clicks it only once or never clicks it. User A therefore dislikes or is not interested in the recommendation information, and the implicit feedback data indicates that a first interest relationship exists between the user and the recommendation information, that is, the user dislikes the recommendation information.
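The implicit feedback example can be sketched in Python as follows. The source says only "multiple" exposures and at most one click, so the thresholds `min_exposures=5` and `max_clicks=1`, the `ExposureStats` record, and the function name are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExposureStats:
    user_id: str
    ad_id: str
    exposures: int  # times the recommendation information was shown
    clicks: int     # times the user clicked it


def has_first_interest_relationship(stats: ExposureStats,
                                    min_exposures: int = 5,
                                    max_clicks: int = 1) -> bool:
    """True when the item was exposed many times but clicked at most once.

    Both thresholds are assumed values, not taken from the source.
    """
    return stats.exposures >= min_exposures and stats.clicks <= max_clicks


records = [
    ExposureStats("user_a", "ad_1", exposures=12, clicks=0),
    ExposureStats("user_a", "ad_2", exposures=10, clicks=4),
    ExposureStats("user_b", "ad_1", exposures=2, clicks=0),
]
# Keep only records indicating the first interest relationship (dislike).
negative_feedback = [r for r in records if has_first_interest_relationship(r)]
```

Only the first record qualifies: the second was clicked too often, and the third was exposed too few times to draw a conclusion.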
In the embodiment of the invention, the user characteristic data with the first interest relationship between the user and the recommendation information is also called negative feedback data.
Step 202: Establish a first interest relationship model according to the data set.
In the embodiment of the invention, because the amount of user characteristic data in the data set is relatively small and the data are dense numerical features, an iterative decision tree model is adopted to establish the first interest relationship model.
Specifically, according to the data set, a first type of sample data and a second type of sample data are determined, wherein the number of the first type of sample data is smaller than that of the second type of sample data; sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship; and performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relationship model.
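The sampling step can be sketched as below. This is a minimal illustration that assumes uniform random sampling without replacement; the ratio of 10, the fixed seed, and the name `downsample_majority` are assumptions, since the text only requires that the two sample counts satisfy a preset proportional relationship.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def downsample_majority(minority: Sequence[T],
                        majority: Sequence[T],
                        ratio: int = 10,
                        seed: int = 0) -> List[T]:
    """Randomly keep at most `ratio` majority samples per minority sample."""
    target = min(len(majority), ratio * len(minority))
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return rng.sample(list(majority), target)


first_type = [f"closed_{i}" for i in range(3)]       # the smaller class
second_type = [f"exposed_{i}" for i in range(100)]   # the larger class
sampled_second_type = downsample_majority(first_type, second_type)
```

With 3 samples of the first type and a ratio of 10, 30 of the 100 second-type samples are kept.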
Taking a Gradient Boosting Decision Tree (GBDT) model as an example of the iterative decision tree model: with a GBDT model, the feature data needs no preprocessing such as segmentation (bucketing), because the model learns the splits entirely by itself. The GBDT model may be implemented with the XGBoost toolkit, an open-source tool.
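To make the idea of an iterative decision tree concrete, the following is a deliberately tiny pure-Python sketch of gradient boosting with one-split trees (stumps) on the log loss. It is not XGBoost and not the patent's implementation; the two toy features (a count of closes and a recent click rate), the hyperparameters, and all function names are assumptions for illustration.

```python
import math
from typing import List, Tuple

Stump = Tuple[int, float, float, float]  # (feature index, threshold, left value, right value)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Fit a least-squares regression stump to the current residuals."""
    mean = sum(residuals) / len(residuals)
    best: Stump = (0, float("inf"), mean, mean)  # fallback: no usable split
    best_err = float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
            if err < best_err:
                best_err, best = err, (j, thr, lmean, rmean)
    return best


def train_gbdt(X, y, n_trees: int = 20, lr: float = 0.3):
    """Each round fits a stump to the negative gradient of the log loss."""
    pos_rate = sum(y) / len(y)                    # assumes both classes are present
    base = math.log(pos_rate / (1.0 - pos_rate))  # initial log-odds
    scores = [base] * len(X)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, lv, rv = stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + lr * (lv if row[j] <= thr else rv) for s, row in zip(scores, X)]
    return base, stumps, lr


def predict_dislike(model, row: List[float]) -> float:
    base, stumps, lr = model
    score = base + sum(lr * (lv if row[j] <= thr else rv) for j, thr, lv, rv in stumps)
    return sigmoid(score)


# Toy rows: [times the user closed ads recently, recent click rate]; label 1 = closed.
X = [[3, 0.00], [2, 0.01], [4, 0.00], [0, 0.10], [0, 0.08], [1, 0.12], [0, 0.05], [0, 0.09]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
model = train_gbdt(X, y)
high = predict_dislike(model, [3, 0.0])
low = predict_dislike(model, [0, 0.1])
```

The raw numeric features are passed in without bucketing; each stump chooses its own threshold, which is the property the paragraph above attributes to GBDT.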
The most important step in establishing the first interest relationship model from the data set is model optimization. The final goal of the optimization is to obtain the degree to which the user dislikes the recommendation information, which corresponds in the user behavior data to the user closing (clicking the X on) the recommendation information. Recommendation information closed by the user is taken as a positive sample, and recommendation information that was exposed is taken as a negative sample. Because positive samples are few and negative samples are many, the negative samples are first downsampled so that the ratio of positive to negative samples stays at a preset ratio (for example, about 1:10). In addition, on the basis of the selected user characteristic data, the training data time window is chosen to be at least one day; for example, the window should guarantee a data volume of more than 1000w (that is, 10 million) records. The prepared user characteristic data is then used as the input of the GBDT model, and whether the user closed the recommendation information or was merely exposed to it is used as the output of the model; training samples are generated in libsvm format, and the first interest relationship model is thereby established. Here, the first interest relationship model is a model of the user disliking recommendation information (such as advertisements).
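A training sample in libsvm format is a single text line of the form `<label> <index>:<value> ...`, with 1-based feature indices in ascending order and zero-valued features usually omitted. The sketch below shows one way to produce such lines; the function name `to_libsvm_line` and the example feature values are assumptions for illustration.

```python
from typing import Sequence


def to_libsvm_line(label: int, features: Sequence[float]) -> str:
    """Format one sample as '<label> <index>:<value> ...' (1-based indices)."""
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        if value != 0:  # zero-valued features are left out of the sparse format
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Label 1: the user closed the recommendation information (positive sample).
positive_line = to_libsvm_line(1, [0.25, 0.0, 3.0])
# Label 0: the recommendation information was only exposed (negative sample).
negative_line = to_libsvm_line(0, [0.0, 0.5, 1.0])
```

Here `positive_line` is `1 1:0.25 3:3` and `negative_line` is `0 2:0.5 3:1`.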
In the embodiment of the invention, the optimization target of the model is ultimately applied mainly to user experience, and the data set contains advertisement quality data, so offline training of the model should be refreshed as frequently as possible, for example once every half hour. In addition, the model prediction (i.e., the process of establishing the first interest relationship model) of the embodiment of the invention is performed online in real time; given the user characteristic data and the complexity of the model, the prediction has a processing delay of about 10 ms, which meets the online real-time requirement.
Step 203: Determine a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information.
In the embodiment of the invention, the target recommendation information is input into the first interest relationship model to obtain the corresponding first interest parameter, which represents the degree to which the user dislikes the target recommendation information. The embodiment of the invention integrates the various feature data that may influence whether the user dislikes recommendation information, establishes a model of the user disliking recommendation information, and uses the model to estimate the degree to which the user dislikes the current recommendation information, namely the first interest parameter. The first interest parameter can serve as an abstract factor in other stages of online advertisement optimization, such as click rate estimation optimization and user experience optimization.
Fig. 3 is a second schematic flowchart of an information processing method according to an embodiment of the present invention, and as shown in fig. 3, the information processing method includes the following steps:
Step 301: Acquire a data set, wherein the data set comprises a plurality of user characteristic data, and each user characteristic data indicates that a first interest relationship exists between a user and recommendation information.
In the embodiment of the present invention, acquiring the data set specifically includes: acquiring at least the following characteristic data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data; and selecting user characteristic data with a first interest relationship between the user and the recommendation information from the acquired characteristic data.
Here, the data fed back by the user is also referred to as user characteristic data, and the data fed back by the user includes: explicit feedback data and implicit feedback data, wherein:
Explicit feedback data refers to explicit feedback operations, such as clicking like or dislike, that the user performs directly on delivered recommendation information (such as advertisements). For example: data on the user closing (clicking the X on) an advertisement or advertisement category per day, week, or month; for another example: data on the user clicking like or dislike on an advertisement or advertisement category per day, week, or month.
Implicit feedback data refers to the user implicitly indicating a preference for delivered recommendation information (such as advertisements) through behavior such as clicking or not clicking. For example: the user's exposure and click data for an advertisement or advertisement category per day, week, or month; for another example: the historical exposure and click data of the advertisement.
The feature data that influence whether the user dislikes recommendation information further include advertisement quality data and user basic data. User basic data include, for example, the user's age, gender, province, and Internet access scenario.
Referring to fig. 5, among the characteristic data such as explicit feedback data, implicit feedback data, advertisement quality data, and user basic data, user characteristic data having a first interest relationship between the user and the recommendation information is selected for subsequent modeling.
Here, a first interest relationship between the user and the recommendation information means that the user dislikes (or is not interested in) the recommendation information. Correspondingly, a second interest relationship between the user and the recommendation information means that the user likes (or is interested in) the recommendation information.
Taking implicit feedback data as an example: recommendation information is exposed to user A multiple times, but user A clicks it only once or never clicks it. User A therefore dislikes or is not interested in the recommendation information, and the implicit feedback data indicates that a first interest relationship exists between the user and the recommendation information, that is, the user dislikes the recommendation information.
In the embodiment of the invention, the user characteristic data with the first interest relationship between the user and the recommendation information is also called negative feedback data.
Step 302: Establish a first interest relationship model according to the data set.
In the embodiment of the invention, because the amount of user characteristic data in the data set is relatively small and the data are dense numerical features, an iterative decision tree model is adopted to establish the first interest relationship model.
Specifically, according to the data set, a first type of sample data and a second type of sample data are determined, wherein the number of the first type of sample data is smaller than that of the second type of sample data; sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship; and performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relationship model.
Taking the GBDT model as an example of the iterative decision tree model: with a GBDT model, the feature data needs no preprocessing such as segmentation (bucketing), because the model learns the splits entirely by itself. The GBDT model may be implemented with the XGBoost toolkit, an open-source tool.
The most important step in establishing the first interest relationship model from the data set is model optimization. The final goal of the optimization is to obtain the degree to which the user dislikes the recommendation information, which corresponds in the user behavior data to the user closing (clicking the X on) the recommendation information. Recommendation information closed by the user is taken as a positive sample, and recommendation information that was exposed is taken as a negative sample. Because positive samples are few and negative samples are many, the negative samples are first downsampled so that the ratio of positive to negative samples stays at a preset ratio (for example, about 1:10). In addition, on the basis of the selected user characteristic data, the training data time window is chosen to be at least one day; for example, the window should guarantee a data volume of more than 1000w (that is, 10 million) records. The prepared user characteristic data is then used as the input of the GBDT model, and whether the user closed the recommendation information or was merely exposed to it is used as the output of the model; training samples are generated in libsvm format, and the first interest relationship model is thereby established. Here, the first interest relationship model is a model of the user disliking recommendation information (such as advertisements).
In the embodiment of the invention, the optimization target of the model is ultimately applied mainly to user experience, and the data set contains advertisement quality data, so offline training of the model should be refreshed as frequently as possible, for example once every half hour. In addition, the model prediction (i.e., the process of establishing the first interest relationship model) of the embodiment of the invention is performed online in real time; given the user characteristic data and the complexity of the model, the prediction has a processing delay of about 10 ms, which meets the online real-time requirement.
Step 303: Determine a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information.
In the embodiment of the invention, the target recommendation information is input into the first interest relationship model to obtain the corresponding first interest parameter, which represents the degree to which the user dislikes the target recommendation information. The embodiment of the invention integrates the various feature data that may influence whether the user dislikes recommendation information, establishes a model of the user disliking recommendation information, and uses the model to estimate the degree to which the user dislikes the current recommendation information, namely the first interest parameter.
Step 304: Use the first interest parameter as a feature in a click rate estimation model and optimize the click rate estimation model, where the click rate estimation model represents the association between the user click rate and each feature in the model.
In the embodiment of the invention, the first interest parameter can be used as an abstract factor for other links of online advertisement optimization, such as a click rate estimation optimization link.
Specifically, the degree to which the user dislikes the recommendation information (such as an advertisement) is added to the click rate estimation model as a feedback feature, and the click rate estimation model is optimized accordingly. The candidate features of the click rate estimation model mainly include advertisement features and features associating the advertisement with the user. The advertisement features include advertisement quality, historical click rate, popularity, and the like; the features associating the advertisement with the user refer to the user's preference for the characteristics of the advertisement. In the embodiment of the invention, using the first interest parameter as a candidate feature of the click rate estimation model allows the user's click rate on the recommendation information to be predicted more accurately. In addition, the click rate estimation model needs to segment the data features. Specifically, the data features are sorted by feature value in descending order and an optimal split point is calculated; on that basis, the optimal split nodes of the left and right subtrees are calculated, one of the two is selected, and the next optimal split point is then calculated from it.
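The split-point search described above can be sketched for a single feature as follows. The source does not name the split criterion, so weighted Gini impurity is used here as an assumption, and the toy values (dislike scores against click labels) and the name `best_split` are likewise hypothetical. Only the first split is shown; the same function would be applied again to the chosen subset to find the next split point.

```python
from typing import List, Optional, Tuple


def gini(pos: int, total: int) -> float:
    """Gini impurity of a group with `pos` positives out of `total` samples."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted Gini) of the best binary split of one feature."""
    # Sort samples by feature value in descending order, as in the text.
    pairs = sorted(zip(values, labels), key=lambda vl: vl[0], reverse=True)
    n = len(pairs)
    total_pos = sum(label for _, label in pairs)
    best_thr, best_score = None, float("inf")
    high_pos = 0  # positives among the i largest values seen so far
    for i in range(1, n):
        high_pos += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        low_pos = total_pos - high_pos
        score = (i * gini(high_pos, i) + (n - i) * gini(low_pos, n - i)) / n
        if score < best_score:
            best_score = score
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2.0
    return best_thr, best_score


# Toy data: first interest parameter per impression, and whether it was clicked.
dislike_scores = [0.9, 0.8, 0.75, 0.25, 0.2, 0.1]
clicked = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(dislike_scores, clicked)
```

On this toy input the best threshold falls midway between 0.75 and 0.25, where the two groups are perfectly separated.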
Fig. 4 is a third schematic flowchart of an information processing method according to an embodiment of the present invention, and as shown in fig. 4, the information processing method includes the following steps:
Step 401: obtaining a data set, wherein the data set comprises a plurality of user characteristic data, and each user characteristic data indicates that a first interest relation exists between a user and recommendation information.
In the embodiment of the present invention, acquiring the data set specifically includes: acquiring at least the following characteristic data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data; and selecting user characteristic data with a first interest relationship between the user and the recommendation information from the acquired characteristic data.
Here, the data fed back by the user is also referred to as user characteristic data, and the data fed back by the user includes: explicit feedback data and implicit feedback data, wherein:
Explicit feedback data refers to: direct, obvious feedback operations, such as clicking like or dislike, that the user performs on the delivered recommendation information (such as advertisements). For example: data on the user dismissing (closing) advertisements/advertisement categories per day, week, or month; for another example: data on the user clicking like/dislike on advertisements/advertisement categories per day, week, or month.
Implicit feedback data refers to: the user implicitly expresses a preference regarding the delivered recommendation information (such as advertisements) through operations such as clicking or not clicking. For example: the user's exposure and click data for advertisements/advertisement categories per day, week, or month; for another example: the historical exposure and click data of an advertisement.
The characteristic data influencing whether a user dislikes recommendation information further comprises advertisement quality data and user basic data. User basic data includes, for example, the age of the user, the sex of the user, the province where the user is located, the user's internet access scenario, and other basic data.
Referring to fig. 5, among the characteristic data such as explicit feedback data, implicit feedback data, advertisement quality data, and user basic data, user characteristic data having a first interest relationship between the user and the recommendation information is selected for subsequent modeling.
Here, the user having the first interest relationship with the recommendation information means that the user dislikes (or is not interested in) the recommendation information. Correspondingly, the user having a second interest relationship with the recommendation information means that the user likes (or is interested in) the recommendation information.
Taking implicit feedback data as an example: recommendation information is exposed to user A multiple times, but user A clicks it only once or never clicks it. This suggests that user A dislikes or is not interested in the recommendation information, so the implicit feedback data indicates that a first interest relationship exists between the user and the recommendation information, namely that the user is not interested in the recommendation information.
In the embodiment of the invention, the user characteristic data with the first interest relationship between the user and the recommendation information is also called negative feedback data.
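The selection of negative feedback data described above can be sketched as follows. This is only an illustrative sketch: the record type `ExposureStats`, the function `has_first_interest_relation`, and the thresholds `min_exposures` and `max_clicks` are hypothetical names and values chosen for the example, not values given in the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExposureStats:
    """Per-user, per-advertisement feedback counters (hypothetical schema)."""
    user_id: str
    ad_id: str
    exposures: int           # implicit feedback: times the ad was shown
    clicks: int              # implicit feedback: times the ad was clicked
    dismissed: bool = False  # explicit feedback: user closed the ad


def has_first_interest_relation(stats: ExposureStats,
                                min_exposures: int = 5,
                                max_clicks: int = 1) -> bool:
    """Return True if the record counts as negative feedback."""
    if stats.dismissed:
        return True  # explicit dislike always counts
    # Many exposures with at most one click suggests lack of interest.
    return stats.exposures >= min_exposures and stats.clicks <= max_clicks


def select_negative_feedback(records: List[ExposureStats]) -> List[ExposureStats]:
    """Keep only the records that indicate a first interest relationship."""
    return [r for r in records if has_first_interest_relation(r)]
```

In practice the thresholds would have to be tuned; the description only states that multiple exposures with one click or none indicate a first interest relationship.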
Step 402: establishing a first interest relation model according to the data set.
In the embodiment of the invention, considering that the amount of user characteristic data in the data set is relatively small and that the user characteristic data consists of dense numerical features, an iterative decision tree model is adopted to establish the first interest relation model.
Specifically, according to the data set, a first type of sample data and a second type of sample data are determined, wherein the number of the first type of sample data is smaller than that of the second type of sample data; sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship; and performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relationship model.
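The sampling step above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `downsample_majority`, the `ratio` parameter (standing in for the preset proportional relationship), and the fixed random seed are all hypothetical choices for the example.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def downsample_majority(minority: Sequence[T],
                        majority: Sequence[T],
                        ratio: int = 10,
                        seed: int = 0) -> List[T]:
    """Randomly sample the larger class so that
    len(minority) : len(result) is at most 1 : ratio.
    """
    # Never ask for more samples than the majority class contains.
    target = min(len(majority), len(minority) * ratio)
    rng = random.Random(seed)  # seeded for reproducibility
    return rng.sample(list(majority), target)
```

Here the first type of sample data corresponds to `minority` and the second type to `majority`; sampling without replacement keeps every retained sample a real observation.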
Taking the GBDT (gradient boosting decision tree) model as an example of the iterative decision tree model: when the GBDT model is adopted, the characteristic data does not need preprocessing such as segmentation, and the model learns entirely by itself. The GBDT model may be implemented with the XGBoost toolkit, an open-source tool.
The most important step in establishing the first interest relationship model from the data set is model optimization. Specifically, the final goal of the model optimization is to obtain the degree to which the user dislikes the recommendation information, and the corresponding user behavior data is the user dismissing (closing) the recommendation information. Recommendation information dismissed by the user is used as a positive sample, and recommendation information that was merely exposed is used as a negative sample. Considering that positive samples are few and negative samples are many, the negative samples are sampled first so that the ratio of positive to negative samples is kept at a preset ratio (for example, about 1:10). Meanwhile, on the basis of the selected user characteristic data, a training data time window of at least one day is selected; for example, the time window is at least guaranteed to be more than 1000 w. Then, the prepared user characteristic data is used as the input of the GBDT model, whether the recommendation information was dismissed by the user or merely exposed is used as the output of the model, and training samples in libsvm format are generated, so that the first interest relation model is established. Here, the first interest relationship model represents a model of the user disliking recommendation information (e.g., advertisements).
In the embodiment of the invention, the final application scenario of the model optimization target is mainly user experience, and the data set contains advertisement quality data, so offline training of the model should be as frequent as possible, for example, once every half an hour. In addition, the model prediction (i.e. the process of establishing the first interest relationship model) of the embodiment of the invention is performed online in real time; given the user characteristic data and the complexity of the model, the model prediction has a processing delay of about 10 ms, which can meet the online real-time requirement.
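To make the libsvm training-sample format concrete, the following sketch serializes one sample as a line of the form `<label> <index>:<value> ...`. The function name `to_libsvm_line` and the mapping of features to integer indices are assumptions for this example; the label convention (1 for recommendation information dismissed by the user, 0 for exposure only) follows the positive/negative sample definition in the paragraph above.

```python
from typing import Dict


def to_libsvm_line(label: int, features: Dict[int, float]) -> str:
    """Serialize one training sample in libsvm text format.

    `features` maps a 1-based feature index to its numeric value.
    """
    parts = [str(label)]
    # libsvm expects feature indices in ascending order.
    for index in sorted(features):
        value = features[index]
        if value != 0:  # zero-valued features are conventionally omitted
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Hypothetical sample: feature 1 = exposures, 2 = clicks, 3 = ad quality.
example = to_libsvm_line(1, {1: 12.0, 2: 0.0, 3: 0.35})
```

For the hypothetical sample shown, `example` is the string `1 1:12 3:0.35`. A file of such lines, one per sample, is a common input format for gradient-boosting toolkits.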
Step 403: determining a first interest parameter corresponding to the target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree of first interest of the user in the target recommendation information.
In the embodiment of the invention, the target recommendation information is input into the first interest relation model to obtain the first interest parameter corresponding to the target recommendation information; the first interest parameter represents the degree to which the user dislikes the target recommendation information. The embodiment of the invention integrates the various kinds of characteristic data that may influence whether a user dislikes recommendation information, establishes a model of the user disliking recommendation information, and uses that model to estimate the degree to which the user dislikes the current recommendation information, namely the first interest parameter.
Step 404: performing corresponding weight reduction processing on the ranking scores of the candidate recommendation information according to the first interest parameter.
In the embodiment of the invention, the first interest parameter can be used as an abstract factor for other links of online advertisement optimization, such as a user experience optimization link.
Specifically, the first interest parameter (i.e., the degree to which the user dislikes the recommendation information) is used as an abstract factor, and the user experience link uses this abstract factor to reduce the weight accordingly. The abstract factor effectively takes into account the mutual influence of the various negative feedback behaviors, and therefore reflects negative feedback more reasonably. The specific weight reduction formula is given by formula (1):
$$pctr' = pctr \times (1 - r)^{m} \tag{1}$$
wherein r represents the first interest parameter; pctr represents the estimated click rate reference value; pctr′ represents the down-weighted value corresponding to the estimated click rate. The larger r is, the more the user dislikes the recommendation information and the lower pctr′ is. Meanwhile, the strength with which the first interest parameter influences the weight is adjusted through the parameter m, whose specific value needs to be determined through online testing. Referring to fig. 5, the estimated click rate pctr obtained by the click rate estimation model also indirectly affects the weight reduction process.
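The weight reduction of formula (1) can be sketched in a few lines of Python. This sketch reads m in formula (1) as an exponent applied to (1 - r), which is an assumption about the formula's typesetting; the function name `down_weight`, the default value of `m`, and the range checks are likewise hypothetical.

```python
def down_weight(pctr: float, r: float, m: float = 1.0) -> float:
    """Apply formula (1), read here as pctr' = pctr * (1 - r) ** m.

    pctr: estimated click rate reference value.
    r:    first interest parameter (degree of dislike), assumed in [0, 1].
    m:    tuning parameter, assumed non-negative; set by online testing.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r is assumed to lie in [0, 1]")
    if m < 0:
        raise ValueError("m is assumed to be non-negative")
    return pctr * (1.0 - r) ** m
```

Under this reading, a larger r always yields a lower pctr′, and a larger m makes the reduction more aggressive for the same r, which matches the behavior described for the two parameters.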
Fig. 6 is a schematic structural composition diagram of a server according to an embodiment of the present invention, as shown in fig. 6, the server includes:
the obtaining unit 61 is configured to obtain a data set, where the data set includes a plurality of user feature data, and each user feature data indicates that a first interest relationship exists between a user and recommendation information;
the establishing unit 62 is configured to establish a first interest relationship model according to the data set;
a determining unit 63, configured to determine, according to the first interest relationship model, a first interest parameter corresponding to the target recommendation information, where the first interest parameter is used to represent a degree of a first interest that a user has in the target recommendation information.
Those skilled in the art will understand that the implementation functions of each unit in the server shown in fig. 6 can be understood by referring to the related description of the aforementioned information processing method. The functions of the units in the server shown in fig. 6 may be implemented by a program running on a processor, or may be implemented by specific logic circuits.
Fig. 7 is a schematic structural composition diagram of a server according to an embodiment of the present invention, and as shown in fig. 7, the server includes:
an obtaining unit 71, configured to obtain a data set, where the data set includes a plurality of user feature data, where each user feature data indicates that a first interest relationship exists between a user and recommendation information;
the establishing unit 72 is configured to establish a first interest relationship model according to the data set;
the determining unit 73 is configured to determine, according to the first interest relationship model, a first interest parameter corresponding to the target recommendation information, where the first interest parameter is used to represent a degree of first interest that a user has in the target recommendation information.
The acquisition unit 71 includes:
an obtaining subunit 711, configured to obtain at least the following feature data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data;
a selecting subunit 712, configured to select, from the obtained feature data, user feature data with a first interest relationship between the user and the recommendation information.
The establishing unit 72 includes:
a determining subunit 721, configured to determine, according to the data set, a first type of sample data and a second type of sample data, where a quantity of the first type of sample data is smaller than a quantity of the second type of sample data;
the sampling subunit 722 is configured to sample the second type of sample data, where the numbers of the first type of sample data and the sampled second type of sample data satisfy a preset proportional relationship;
and the optimizing subunit 723 is configured to perform model optimization processing on the first type of sample data and the sampled second type of sample data by using an iterative decision tree model to obtain a first interest relationship model.
The server further comprises:
the click rate pre-estimation model unit 74 is used for optimizing the click rate pre-estimation model by taking the first interest parameter as a feature in the click rate pre-estimation model; the click rate estimation model is used for representing the incidence relation between the user click rate and each characteristic in the click rate estimation model.
The server further comprises:
and the ranking optimization unit 75 is configured to perform corresponding weight reduction processing on the ranking score of the candidate recommendation information according to the first interest parameter.
Those skilled in the art will understand that the implementation functions of each unit in the server shown in fig. 7 can be understood by referring to the related description of the aforementioned information processing method. The functions of the units in the server shown in fig. 7 may be implemented by a program running on a processor, or may be implemented by specific logic circuits.
The technical schemes described in the embodiments of the present invention can be combined arbitrarily without conflict.
In the embodiments provided in the present invention, it should be understood that the disclosed method and intelligent device may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may separately serve as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in the form of hardware, or in the form of hardware plus a software functional unit.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention.

Claims (8)

1. An information processing method, characterized in that the method comprises:
acquiring a data set from at least one of explicit feedback data, implicit feedback data, advertisement quality data and user basic data, wherein the data set comprises a plurality of user characteristic data;
the user characteristic data indicate that a first interest relationship exists between a user and recommendation information, and the first interest relationship represents that the user dislikes the recommendation information;
training, according to the data set, a first interest relation model representing that the user dislikes the recommendation information;
determining a first interest parameter corresponding to target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree to which the user dislikes the target recommendation information;
taking the first interest parameter as a candidate feature in a click rate estimation model, and optimizing the click rate estimation model; the click rate estimation model is used for representing the incidence relation between the user click rate and each characteristic in the click rate estimation model; or
taking the first interest parameter as an abstract factor, and performing weight reduction processing on the ranking score of the target recommendation information in a user experience link to obtain a weight representing the mutual influence of various negative feedback behaviors.
2. The information processing method of claim 1, wherein the obtaining a data set from at least one of explicit feedback data, implicit feedback data, advertisement quality data, and user profile data comprises:
acquiring at least the following characteristic data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data;
and selecting the user characteristic data with the first interest relationship between the user and the recommendation information from the acquired characteristic data.
3. The information processing method of claim 1, wherein the training, according to the data set, a first interest relationship model representing that the user dislikes the recommendation information comprises:
according to the data set, determining first type of sample data and second type of sample data, wherein the quantity of the first type of sample data is less than that of the second type of sample data;
sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship;
and performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relation model representing that the user dislikes the recommendation information.
4. The information processing method according to any one of claims 1 to 3, wherein the candidate features include a recommendation information feature, recommendation information, and a feature associated with the user; the recommendation information characteristics comprise recommendation information quality, historical click rate and popularity; the recommendation information and the user-associated feature are preferences of the user for a recommendation information feature;
the optimizing the click rate pre-estimation model by taking the first interest parameter and the recommended information characteristic as candidate characteristics in the click rate pre-estimation model comprises the following steps:
sorting the candidate features in a descending order according to the feature values to calculate an optimal splitting point;
calculating the optimal splitting point of the left sub-tree and the optimal splitting point of the right sub-tree corresponding to the optimal splitting point;
and selecting an optimal splitting point from the optimal splitting point of the left sub-tree and the optimal splitting point of the right sub-tree to be used as the next optimal splitting point to continue splitting.
5. A server, characterized in that the server comprises:
an acquisition unit, configured to acquire a data set from at least one of explicit feedback data, implicit feedback data, advertisement quality data and user basic data, wherein the data set comprises a plurality of user characteristic data; the user characteristic data indicate that a first interest relationship exists between a user and recommendation information, and the first interest relationship represents that the user dislikes the recommendation information;
an establishing unit, configured to train, according to the data set, a first interest relation model representing that the user dislikes the recommendation information;
a determining unit, configured to determine a first interest parameter corresponding to target recommendation information according to the first interest relationship model, wherein the first interest parameter is used for representing the degree to which the user dislikes the target recommendation information;
a click rate estimation model unit, configured to optimize the click rate estimation model by taking the first interest parameter as a candidate feature in the click rate estimation model; the click rate estimation model is used for representing the incidence relation between the user click rate and each characteristic in the click rate estimation model; or
a weight reduction processing unit, configured to perform weight reduction processing on the ranking score of the target recommendation information in a user experience link by taking the first interest parameter as an abstract factor, to obtain a weight representing the mutual influence of various negative feedback behaviors.
6. The server according to claim 5, wherein the obtaining unit includes:
an obtaining subunit, configured to obtain at least the following feature data: explicit feedback data, implicit feedback data, advertisement quality data and user basic data;
and the selecting subunit is used for selecting the user characteristic data with the first interest relationship between the user and the recommendation information from the acquired characteristic data.
7. The server according to claim 6, wherein the establishing unit comprises:
a determining subunit, configured to determine, according to the data set, first type of sample data and second type of sample data, where the number of the first type of sample data is smaller than the number of the second type of sample data;
the sampling subunit is used for sampling the second type of sample data, wherein the number of the first type of sample data and the number of the sampled second type of sample data meet a preset proportional relationship;
and the optimization subunit is used for performing model optimization processing on the first type of sample data and the sampled second type of sample data by adopting an iterative decision tree model to obtain a first interest relationship model.
8. The server according to any one of claims 5 to 7, wherein the candidate features include a recommendation information feature, recommendation information, and features associated with the user; the recommendation information characteristics comprise recommendation information quality, historical click rate and popularity; the recommendation information and the user-associated feature are preferences of the user for a recommendation information feature;
the click rate pre-estimation model unit is further used for arranging the candidate features according to the feature values in a descending order to calculate the optimal splitting point; calculating the optimal splitting point of the left sub-tree and the optimal splitting point of the right sub-tree corresponding to the optimal splitting point; and selecting an optimal splitting point from the optimal splitting point of the left sub-tree and the optimal splitting point of the right sub-tree to be used as the next optimal splitting point to continue splitting.
CN201610639936.8A 2016-08-05 2016-08-05 Information processing method and server Active CN107688956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610639936.8A CN107688956B (en) 2016-08-05 2016-08-05 Information processing method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610639936.8A CN107688956B (en) 2016-08-05 2016-08-05 Information processing method and server

Publications (2)

Publication Number Publication Date
CN107688956A CN107688956A (en) 2018-02-13
CN107688956B true CN107688956B (en) 2021-09-07

Family

ID=61152009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610639936.8A Active CN107688956B (en) 2016-08-05 2016-08-05 Information processing method and server

Country Status (1)

Country Link
CN (1) CN107688956B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595481A (en) * 2018-03-13 2018-09-28 维沃移动通信有限公司 A kind of notification message display methods and terminal device
CN112053179A (en) * 2019-06-06 2020-12-08 上海晶赞融宣科技有限公司 Information issuing method and device, storage medium and terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101739402A (en) * 2008-11-07 2010-06-16 华为技术有限公司 Method and device for interest analysis
US9129227B1 (en) * 2012-12-31 2015-09-08 Google Inc. Methods, systems, and media for recommending content items based on topics
CN105183772A (en) * 2015-08-07 2015-12-23 百度在线网络技术(北京)有限公司 Release information click rate estimation method and apparatus
CN103177129B (en) * 2013-04-19 2016-03-16 上海新数网络科技股份有限公司 Internet real-time information recommendation prognoses system
CN105631707A (en) * 2015-12-23 2016-06-01 北京奇虎科技有限公司 Advertisement click rate estimation method based on decision tree, application recommendation method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101290631A (en) * 2008-05-28 2008-10-22 北京百问百答网络技术有限公司 Network advertisement automatic delivery method and its system
US20140006166A1 (en) * 2012-06-29 2014-01-02 Mobio Technologies, Inc. System and method for determining offers based on predictions of user interest

Also Published As

Publication number Publication date
CN107688956A (en) 2018-02-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant