CN112380449B - Information recommendation method, model training method and related device

Info

Publication number: CN112380449B
Application number: CN202011398840.XA
Authority: CN
Inventors: 钟子宏
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-12-03
Filing date: 2020-12-03
Publication date: 2021-11-23
Anticipated expiration: 2040-12-03
Also published as: CN112380449A

Abstract

The embodiments of the application disclose an information recommendation method, a model training method, and a related device in the field of artificial intelligence. The information recommendation method comprises the following steps: acquiring a prediction data set, wherein the prediction data set comprises prediction data corresponding to each of a plurality of target users, and the prediction data comprises user characteristic data of the target user in a T-th period, information characteristic data of target recommendation information, and a prediction label; determining, according to the prediction label included in each prediction data in the prediction data set, the proportion of positive and negative samples in the prediction data set as the prediction state transition probability; determining, through an interest probability prediction model, the interest probability of the target user in the target recommendation information in the (T+1)-th period according to the user characteristic data, the information characteristic data, and the prediction state transition probability; and determining, according to the interest probability, whether to recommend the target recommendation information to the target user in the (T+1)-th period. The method can ensure the accuracy of information recommendation.

Description

Information recommendation method, model training method and related device

Technical Field

The present application relates to the technical field of Artificial Intelligence (AI), and in particular, to an information recommendation method, a model training method, and a related apparatus.

Background

In the era of internet big data, how to recommend information to users accurately, so as to meet users' individual needs and keep recommendations effective, has become a key concern for many network platforms.

In the related art, information is recommended mainly by a binary recommendation method based on a Logistic Regression (LR) model, according to behavior characteristics of a user (such as Click Through Rate (CTR)). Specifically, the logistic regression model is trained with the user characteristics, the recommendation information characteristics, and the user's viewing of the recommendation information in the T-th period; the trained logistic regression model then predicts the user's interest probability for the recommendation information according to the user characteristics and the recommendation information characteristics in the (T+1)-th period, and whether to recommend the recommendation information to the user is determined according to that interest probability.

However, the implementation in the related art has the following drawback: owing to factors such as sparse feature data, low feature discrimination, unsatisfactory model performance, and an unreasonable classification threshold, the interest probabilities predicted by the logistic regression model tend to be concentrated, for example clustered near the classification threshold. This makes it hard to distinguish accurately whether a user is interested in the recommended information, which in turn degrades the information recommendation effect.
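To make "concentrated near the classification threshold" concrete, the following minimal sketch measures what share of predicted probabilities falls inside a narrow band around the threshold. The function name `share_near_threshold`, the threshold of 0.5, the band half-width of 0.05, and the sample scores are all illustrative assumptions; none of them comes from this application.

```python
from typing import Sequence


def share_near_threshold(
    probabilities: Sequence[float],
    threshold: float = 0.5,    # assumed classification threshold
    half_width: float = 0.05,  # assumed half-width of the "near" band
) -> float:
    """Return the fraction of predictions within half_width of threshold."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    near = sum(1 for p in probabilities if abs(p - threshold) <= half_width)
    return near / len(probabilities)


# Hypothetical predictions: four of the five scores sit close to 0.5,
# so a small shift in the threshold would flip most decisions.
scores = [0.48, 0.51, 0.49, 0.53, 0.90]
print(share_near_threshold(scores))
```

A high value of this share indicates that small changes in the threshold would change many recommendation decisions, which is the situation the passage above describes.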

Disclosure of Invention

The embodiments of the application provide an information recommendation method, a model training method, and a related device, which can reduce the concentration of the probabilities predicted by an interest probability prediction model and make it easier to distinguish accurately how interested a user is in the recommendation information, thereby ensuring the accuracy of information recommendation.

In view of the above, a first aspect of the present application provides an information recommendation method, including:

obtaining a prediction data set, wherein the prediction data set comprises prediction data corresponding to each of a plurality of target users, the prediction data comprises prediction characteristic data and a prediction label, the prediction characteristic data comprises user characteristic data of the target user in a T-th period and information characteristic data of target recommendation information in the T-th period, and the prediction label is determined according to the target user's viewing of the target recommendation information in the T-th period; T is an integer greater than 1;

determining the proportion of positive and negative samples in the prediction data set according to the prediction label included in each prediction data in the prediction data set, wherein the proportion is used as the prediction state transition probability;

determining, through an interest probability prediction model, the interest probability of the target user in the target recommendation information in the (T+1)-th period according to the prediction feature data included in the prediction data corresponding to the target user and the prediction state transition probability;

and determining, according to the interest probability of the target user in the target recommendation information in the (T+1)-th period, whether to recommend the target recommendation information to the target user in the (T+1)-th period.

A second aspect of the present application provides a model training method, the method comprising:

acquiring a training sample data set, wherein the training sample data set comprises training sample data corresponding to each of a plurality of users, the training sample data comprises training characteristic data, a first training label, and a second training label, the training characteristic data comprises user characteristic data of the user in a (T-1)-th period and information characteristic data of recommended information in the (T-1)-th period, the first training label is determined according to the user's viewing of the recommended information in the (T-1)-th period, and the second training label is determined according to the user's viewing of the recommended information in the T-th period; T is an integer greater than 1;

determining the proportion of positive and negative samples in the training sample data set according to the first training label included in each training sample data in the training sample data set, wherein the proportion is used as the training state transition probability;

and performing iterative training on the interest probability prediction model based on the training feature data and the second training label included in the training sample data set and the training state transition probability.

A third aspect of the present application provides an information recommendation apparatus, including:

a prediction data acquisition module, configured to acquire a prediction data set, wherein the prediction data set comprises prediction data corresponding to each of a plurality of target users, the prediction data comprises prediction characteristic data and a prediction label, the prediction characteristic data comprises user characteristic data of the target user in a T-th period and information characteristic data of target recommendation information in the T-th period, and the prediction label is determined according to the target user's viewing of the target recommendation information in the T-th period; T is an integer greater than 1;

a state transition probability determining module, configured to determine, according to the prediction tag included in each prediction data in the prediction data set, a ratio of positive and negative samples in the prediction data set as a prediction state transition probability;

an interest probability prediction module, configured to determine, through an interest probability prediction model, the interest probability of the target user in the target recommendation information in the (T+1)-th period according to the prediction feature data included in the prediction data corresponding to the target user and the prediction state transition probability;

and an information recommendation module, configured to determine, according to the interest probability of the target user in the target recommendation information in the (T+1)-th period, whether to recommend the target recommendation information to the target user in the (T+1)-th period.

A fourth aspect of the present application provides a model training apparatus, the apparatus comprising:

a training data acquisition module, configured to acquire a training sample data set, wherein the training sample data set comprises training sample data corresponding to each of a plurality of users, the training sample data comprises training characteristic data, a first training label, and a second training label, the training characteristic data comprises user characteristic data of the user in a (T-1)-th period and information characteristic data of recommended information in the (T-1)-th period, the first training label is determined according to the user's viewing of the recommended information in the (T-1)-th period, and the second training label is determined according to the user's viewing of the recommended information in the T-th period; T is an integer greater than 1;

a state transition probability determining module, configured to determine, according to the first training label included in each training sample data in the training sample data set, a ratio of positive and negative samples in the training sample data set as a training state transition probability;

and the model training module is used for performing iterative training on the interest probability prediction model based on the training feature data and the second training label included in the training sample data set and the training state transition probability.

A fifth aspect of the present application provides an apparatus comprising a processor and a memory:

the memory is used for storing a computer program;

the processor is configured to perform the steps of the information recommendation method according to the first aspect or the steps of the model training method according to the second aspect according to the computer program.

A sixth aspect of the present application provides a computer-readable storage medium for storing a computer program for executing the steps of the information recommendation method of the first aspect or the steps of the model training method of the second aspect.

A seventh aspect of the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the steps of the information recommendation method according to the first aspect or the steps of the model training method according to the second aspect.

According to the technical scheme, the embodiment of the application has the following advantages:

The embodiments of the application provide an information recommendation method in which the interest probability prediction model innovatively takes a state transition probability into account when predicting interest probabilities, and the state transition probability is used to correct the model's processing so as to reduce the concentration of the predicted interest probabilities. Specifically, in the information recommendation method provided in the embodiments of the application, after a prediction data set composed of prediction data corresponding to a plurality of target users is obtained, the proportion of positive and negative samples in the prediction data set is determined, according to the prediction label included in each prediction data, as the prediction state transition probability, where the prediction label is determined according to the target user's viewing of the target recommendation information in the T-th period. Then, through the interest probability prediction model, the interest probability of the target user in the target recommendation information in the (T+1)-th period is predicted according to the user characteristic data of the target user in the T-th period and the information characteristic data of the target recommendation information in the prediction data, together with the prediction state transition probability, and whether to recommend the target recommendation information to the target user is determined according to the interest probability.

In this information recommendation method, while predicting the interest probability of a target user in the target recommendation information, the interest probability prediction model introduces a prediction state transition probability determined from the prediction label included in each prediction data in the prediction data set. The prediction state transition probability can effectively correct the interest probability predicted by the model and reduce the concentration of the predicted interest probabilities. The user's degree of interest in the recommended information can therefore be distinguished accurately according to the predicted interest probability, which ensures accurate information recommendation.

Drawings

FIG. 1 is a schematic diagram illustrating the principle of a binary recommendation method based on a logistic regression model in the related art;

FIG. 2 is a schematic view of an application scenario of an information recommendation method according to an embodiment of the present application;

FIG. 3 is a schematic flowchart of an information recommendation method according to an embodiment of the present application;

FIG. 4 is a schematic flowchart of a model training method according to an embodiment of the present application;

FIG. 5 is a schematic diagram of an implementation architecture of an information recommendation method according to an embodiment of the present application;

FIG. 6 is a schematic diagram of an implementation architecture of another information recommendation method according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a first information recommendation device according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a second information recommendation device according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of a third information recommendation device according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of a first model training apparatus according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a second model training apparatus according to an embodiment of the present application;

FIG. 12 is a schematic structural diagram of a third model training apparatus according to an embodiment of the present application;

FIG. 13 is a schematic structural diagram of a fourth model training apparatus according to an embodiment of the present application;

FIG. 14 is a schematic structural diagram of a terminal device according to an embodiment of the present application;

FIG. 15 is a schematic structural diagram of a server according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Machine Learning (ML) is a multi-disciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize its existing knowledge structure so as to keep improving its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, for example, common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical treatment, smart customer service, and the like.

The scheme provided by the embodiment of the application relates to an artificial intelligence information recommendation technology, and is specifically explained by the following embodiment:

FIG. 1 is a schematic diagram illustrating the principle of a binary recommendation method based on a logistic regression model in the related art. As shown in FIG. 1, when information is recommended by this method, a logistic regression model is first trained with the user characteristics and the recommendation information characteristics in the T-th period and with a label determined from the user's viewing of the recommendation information. The trained logistic regression model then predicts the user's interest probability for the recommendation information in the (T+1)-th period according to the user characteristics and the recommendation information characteristics in the (T+1)-th period, and whether to recommend the recommendation information to the user is determined according to the interest probability.

However, the inventor of the present application has found that the above-mentioned binary recommendation method based on logistic regression model has the following disadvantages: the interest probability predicted by the logistic regression model is often high in concentration and often distributed around the classification threshold value, so that the method is not beneficial to accurately distinguishing whether the user is interested in the recommended information or not, and further possibly influences the information recommendation effect.

In view of the problems in the related art, the embodiments of the present application provide an information recommendation method, which can effectively reduce the concentration of interest probabilities predicted by a model, and further, can accurately distinguish whether a user is interested in recommended information, and accordingly, accurately recommend information.

Specifically, in the information recommendation method provided in the embodiments of the present application, a prediction data set is obtained first. The prediction data set comprises prediction data corresponding to each of a plurality of target users; the prediction data comprises prediction characteristic data and a prediction label; the prediction characteristic data comprises user characteristic data of the target user in a T-th period and information characteristic data of target recommendation information in the T-th period; and the prediction label is determined according to the target user's viewing of the target recommendation information in the T-th period. Then, according to the prediction label included in each prediction data in the prediction data set, the ratio of positive and negative samples in the prediction data set is determined as the prediction state transition probability. Next, through an interest probability prediction model, the interest probability of the target user in the target recommendation information in the (T+1)-th period is predicted according to the prediction characteristic data included in the prediction data corresponding to the target user and the prediction state transition probability. Finally, whether to recommend the target recommendation information to the target user in the (T+1)-th period is determined according to the interest probability of the target user in the target recommendation information in the (T+1)-th period.

In this information recommendation method, the interest probability prediction model innovatively takes the state transition probability into account when predicting interest probabilities. That is, while the model predicts a user's interest probability for the recommendation information, the prediction state transition probability determined from the prediction labels in the prediction data set is introduced and used to correct the predicted interest probability, which reduces the concentration of the interest probabilities the model predicts. As a result, the predicted interest probabilities make it easier to distinguish accurately how interested the user is in the recommendation information, which ensures accurate information recommendation.

It should be understood that the information recommendation method provided by the embodiment of the present application may be applied to a device with data processing capability, such as a terminal device or a server. The terminal device may be a computer, a smart phone, a tablet computer, a Personal Digital Assistant (PDA), or the like; the server may specifically be an application server or a Web server, and in actual deployment, the server may be an independent server, or may also be a cluster server or a cloud server.

To facilitate understanding of the information recommendation method provided in the embodiments of the present application, an application scenario of the method is described below by way of example, taking a server as the execution subject of the method.

Referring to FIG. 2, FIG. 2 is a schematic view of an application scenario of the information recommendation method provided in the embodiments of the present application. As shown in FIG. 2, the application scenario includes a server 210, a database 220, and a database 230. The server 210 may access the database 220 and the database 230 through a network, or the database 220 and the database 230 may be integrated in the server 210. The server 210 is configured to execute the information recommendation method provided in the embodiments of the present application, so as to determine whether to recommend target recommendation information to a target user in the (T+1)-th period according to the user characteristic data of the target user in the T-th period, the information characteristic data of the target recommendation information, and the prediction state transition probability. The database 220 stores the user feature data of users on the target network platform, and the database 230 stores the information feature data of recommendation information on the target network platform.

The length of the period can be set according to actual application requirements; for example, it can be set in units of seconds, minutes, hours, days, weeks, months, or years. Assuming that one period is one day, the server 210 needs to predict the probability that the target user views the target recommendation information on the (T+1)-th day according to the user feature data of the target user and the information feature data of the target recommendation information on the T-th day, and according to the prediction state transition probability determined from the target users' viewing of the target recommendation information on the T-th day. The present application does not limit the period length in any way.
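As an illustration of a configurable period length, the following minimal sketch maps a timestamp to the index of the period that contains it. The function name `period_index`, the choice of a fixed origin for the first period, and the 1-based numbering are assumptions made for this example only.

```python
from datetime import datetime, timedelta


def period_index(moment: datetime, origin: datetime, period_length: timedelta) -> int:
    """Return the 1-based index of the period that contains `moment`.

    `origin` is the assumed start of period 1; `period_length` may be any
    positive duration (seconds, hours, days, weeks, ...).
    """
    if period_length <= timedelta(0):
        raise ValueError("period_length must be positive")
    if moment < origin:
        raise ValueError("moment precedes the first period")
    # Dividing one timedelta by another with // yields a whole number of periods.
    return (moment - origin) // period_length + 1


# With one-day periods starting on a hypothetical origin date,
# an event on the third calendar day falls in period 3.
origin = datetime(2020, 12, 1)
print(period_index(datetime(2020, 12, 3, 15, 30), origin, timedelta(days=1)))
```

Changing `period_length` to, say, `timedelta(hours=1)` switches the same data to hourly periods without touching the rest of the logic.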

In practical applications, the server 210 needs to acquire the prediction data set first. Specifically, the server 210 may construct prediction data corresponding to a plurality of target users, and further form a prediction data set by using the prediction data corresponding to the plurality of target users; for example, when the server constructs corresponding prediction data for a target user, the server may retrieve user feature data of the target user in a T-th period from the database 220, retrieve information feature data of target recommendation information recommended to the target user in the T-th period from the database 230, determine a prediction tag according to a viewing condition of the target user for the target recommendation information in the T-th period, and further construct the prediction data corresponding to the target user by using the retrieved user feature data, the information feature data, and the determined prediction tag.
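The construction of the prediction data set described above can be sketched as follows. This is a minimal illustration under stated assumptions: plain dictionaries stand in for database 220 and database 230, a single piece of target recommendation information is shared by all target users, and the names `PredictionRecord` and `build_prediction_set` as well as the 1/0 label encoding are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Set


@dataclass(frozen=True)
class PredictionRecord:
    user_id: str
    user_features: Sequence[float]  # user feature data in period T
    info_features: Sequence[float]  # feature data of the target information in period T
    label: int                      # assumed encoding: 1 = viewed in period T, 0 = not viewed


def build_prediction_set(
    user_store: Dict[str, Sequence[float]],  # stand-in for database 220
    info_features: Sequence[float],          # stand-in for one entry of database 230
    viewers: Set[str],                       # users who viewed the information in period T
) -> List[PredictionRecord]:
    """Assemble one prediction record per target user."""
    return [
        PredictionRecord(uid, feats, info_features, 1 if uid in viewers else 0)
        for uid, feats in user_store.items()
    ]


# Hypothetical data for three target users and one piece of information.
users = {"u1": [0.2, 0.8], "u2": [0.5, 0.1], "u3": [0.9, 0.4]}
prediction_set = build_prediction_set(users, [1.0, 0.0], viewers={"u2"})
print([record.label for record in prediction_set])
```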

Then, the server 210 may determine a ratio of positive and negative samples in the prediction data set as the prediction state transition probability according to the prediction label included in each prediction data in the prediction data set. Prediction labels included in the prediction data are generally divided into two types, one is used for reflecting that a target user views target recommendation information (corresponding to a positive sample) in a T period, and the other is used for reflecting that the target user does not view the target recommendation information (corresponding to a negative sample) in the T period; the server 210 may determine the ratio of each of the two different prediction tags in the prediction data set according to the type of the prediction tag included in each prediction data in the prediction data set, and use the ratio as the prediction state transition probability.
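The computation described above can be sketched in a few lines. The sketch reads "the ratio of each of the two different prediction tags" as the share of positive labels and the share of negative labels in the data set; the 1/0 label encoding and the names `StateTransitionProbability` and `state_transition_probability` are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class StateTransitionProbability(NamedTuple):
    positive: float  # share of samples labelled as viewed
    negative: float  # share of samples labelled as not viewed


def state_transition_probability(labels: Iterable[int]) -> StateTransitionProbability:
    """Compute the positive and negative label shares of a data set."""
    counts = Counter(labels)
    unknown = set(counts) - {0, 1}
    if unknown:
        raise ValueError(f"unexpected label values: {sorted(unknown)}")
    total = counts[0] + counts[1]
    if total == 0:
        raise ValueError("labels must not be empty")
    return StateTransitionProbability(counts[1] / total, counts[0] / total)


# Two of five hypothetical target users viewed the information in period T.
print(state_transition_probability([1, 0, 0, 1, 0]))
```

Because both shares are computed from the same counts, they always sum to 1, so a downstream model could consume either one or both.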

Furthermore, the server 210 may predict, through a pre-trained interest probability prediction model, the interest probability of the target user in the target recommendation information in the (T+1)-th period according to the user feature data and the information feature data included in the prediction data corresponding to the target user and according to the prediction state transition probability. It should be noted that the interest probability prediction model is trained in advance on a training sample data set. The training sample data set includes training sample data corresponding to each of a plurality of users, and the training sample data specifically includes user feature data of the user in the (T-1)-th period, information feature data of recommendation information in the (T-1)-th period, a first training label, and a second training label. The first training label is determined according to the user's viewing of the recommendation information in the (T-1)-th period; when the interest probability prediction model is trained, the training state transition probability determined from the first training labels in the training sample data set is used. The second training label is determined according to the user's viewing of the recommendation information in the T-th period.
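The layout of the training sample data can be sketched as follows. The sketch only prepares the inputs for training: this passage does not specify how the model consumes the training state transition probability, so no model fitting is shown. The names `TrainingSample` and `prepare_training_inputs`, the 1/0 label encoding, and the concatenation of user and information features into one row are assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TrainingSample:
    user_features: Sequence[float]  # user feature data in period T-1
    info_features: Sequence[float]  # information feature data in period T-1
    first_label: int                # 1 if viewed in period T-1, else 0
    second_label: int               # 1 if viewed in period T, else 0 (supervision target)


def prepare_training_inputs(
    samples: Sequence[TrainingSample],
) -> Tuple[List[List[float]], List[int], Tuple[float, float]]:
    """Return feature rows, targets, and (positive share, negative share).

    The shares are computed from the FIRST training labels only; the
    SECOND training labels serve as the targets.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    positive_share = sum(s.first_label for s in samples) / len(samples)
    rows = [list(s.user_features) + list(s.info_features) for s in samples]
    targets = [s.second_label for s in samples]
    return rows, targets, (positive_share, 1.0 - positive_share)


# Four hypothetical users; one viewed in period T-1, two viewed in period T.
samples = [
    TrainingSample([0.1], [1.0], first_label=1, second_label=1),
    TrainingSample([0.4], [1.0], first_label=0, second_label=1),
    TrainingSample([0.7], [1.0], first_label=0, second_label=0),
    TrainingSample([0.9], [1.0], first_label=0, second_label=0),
]
rows, targets, training_transition = prepare_training_inputs(samples)
print(targets, training_transition)
```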

After predicting the interest probability of the target user in the target recommendation information in the (T+1)-th period through the interest probability prediction model, the server 210 may determine, according to the interest probability, whether to recommend the target recommendation information to the target user in the (T+1)-th period.
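The prediction and decision flow in this scenario can be sketched end to end as follows. The interest probability prediction model is passed in as an opaque callable, because its internals are not described in this passage; `toy_model` is a placeholder used only to make the sketch runnable and is not the model of this application. The function name `recommend_for_next_period`, the 1/0 label encoding, and the rule "recommend when the probability is at least 0.5" are likewise assumptions.

```python
from typing import Callable, Dict, Sequence, Tuple

# Assumed model interface:
# (features, (positive share, negative share)) -> interest probability
InterestModel = Callable[[Sequence[float], Tuple[float, float]], float]


def recommend_for_next_period(
    model: InterestModel,
    features_by_user: Dict[str, Sequence[float]],  # period-T prediction features
    labels_by_user: Dict[str, int],                # period-T prediction labels (1/0)
    threshold: float = 0.5,                        # assumed decision threshold
) -> Dict[str, bool]:
    """Decide, per target user, whether to recommend in period T+1."""
    if not labels_by_user:
        raise ValueError("labels_by_user must not be empty")
    positive = sum(labels_by_user.values()) / len(labels_by_user)
    transition = (positive, 1.0 - positive)  # prediction state transition probability
    return {
        uid: model(feats, transition) >= threshold
        for uid, feats in features_by_user.items()
    }


def toy_model(features: Sequence[float], transition: Tuple[float, float]) -> float:
    # Placeholder only: averages the features and blends in the positive share.
    base = sum(features) / len(features)
    return 0.5 * base + 0.5 * transition[0]


decisions = recommend_for_next_period(
    toy_model,
    features_by_user={"u1": [0.9, 0.7], "u2": [0.1, 0.3]},
    labels_by_user={"u1": 1, "u2": 0},
)
print(decisions)
```

Keeping the model behind a callable means the same decision code can be reused with whatever trained interest probability prediction model is actually deployed.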

It should be understood that the application scenario shown in FIG. 2 is only an example. In practical applications, the information recommendation method provided in the embodiments of the present application may also be applied to other scenarios; for example, the method may be executed by a terminal device. The present application does not limit the application scenario of the information recommendation method in any way.

The information recommendation method provided by the present application is described in detail below by way of method embodiments.

Referring to fig. 3, fig. 3 is a schematic flowchart of an information recommendation method provided in an embodiment of the present application. For convenience of description, the following embodiments are again described with the server as the execution subject of the information recommendation method. As shown in fig. 3, the information recommendation method includes the following steps:

Step 301: Obtain a prediction data set. The prediction data set comprises prediction data corresponding to each of a plurality of target users; the prediction data comprises prediction characteristic data and a prediction label; the prediction characteristic data comprises user characteristic data of the target user in a T-th period and information characteristic data of target recommendation information in the T-th period; and the prediction label is determined according to the target user's viewing of the target recommendation information in the T-th period. T is an integer greater than 1.

When the server needs to determine whether to recommend the target recommendation information to the target user in the T +1 th period, the server first needs to acquire the basic data required for predicting the target user's interest probability for the target recommendation information, namely the prediction data set. The prediction data set comprises prediction data corresponding to a plurality of target users, and the prediction data corresponding to each target user comprises prediction characteristic data and a prediction label. The prediction characteristic data comprises user characteristic data of the target user in the T-th period and information characteristic data of the target recommendation information in the T-th period. The prediction label is determined according to the target user's viewing of the target recommendation information in the T-th period and reflects whether the target user viewed the target recommendation information.

In one possible implementation, the server may construct the predictive data set autonomously. For example, the server may regard all login users on the target network platform in the T-th period as target users, construct corresponding prediction data for each target user, and further compose a prediction data set by using the prediction data corresponding to each target user. When the corresponding prediction data is constructed for a certain target user, the server may retrieve user feature data of the target user in a T-th period from a database for storing the user feature data, retrieve information feature data of target recommendation information recommended to the target user in the T-th period from the database for storing the information feature data, determine a corresponding prediction tag according to a viewing condition of the target user for the target recommendation information in the T-th period, and further construct the prediction data corresponding to the target user by using the retrieved user feature data, the information feature data and the determined prediction tag.

In another possible implementation, the server may also obtain already constructed prediction data from other related devices to form the prediction data set. For example, after the T-th period ends, the related device may construct, for each user logged in to the target network platform in the T-th period, corresponding prediction data from the user's feature data in the T-th period, the information feature data of the target recommendation information recommended to that user in the T-th period, and the prediction label determined from the user's viewing of the target recommendation information. When the server needs to predict the interest probability of each target user logged in to the target network platform for the target recommendation information in the T +1 th period, the server may obtain the prediction data corresponding to each target user from the related device and form the prediction data set.

It should be understood that, in practical applications, the server may also obtain the prediction data set in other manners, and the present application does not limit the manner in which the server obtains the prediction data set.

It should be noted that, in this embodiment of the present application, the multiple target users may be all login users or part of login users on the target network platform in the T-th period, may also be all login users or part of login users on the target network platform in the T + 1-th period, and may also be all users or part of users that log in the target network platform in both the T-th period and the T + 1-th period, where this application does not specifically limit the multiple target users.

It should be noted that the target network platform may be any platform with an information recommendation service, such as a shopping platform, an audio playing platform, a video playing platform, a news recommendation platform, and the like, and the type of the target network platform is not limited in this application. Target recommendation information recommended by different target network platforms is often different, for example, for a shopping network platform, the recommended target recommendation information may include commodity information, commodity coupons, and the like, for an audio and video playing platform, the recommended target recommendation information may include audio resources, video resources, and the like, for a news recommendation platform, the recommended target recommendation information may include news articles, news links, news audios and videos, and the like, and the application does not limit the target recommendation information recommended by the target network platforms.

It should be noted that, for different target network platforms, the user characteristic data of the target user in the prediction data may also be different. For example, for a shopping platform, the user characteristic data of the target user may include any one or a combination of more of the following: basic attribute data representing the user's basic personal information (such as gender, age, and region); active attribute data representing the user's operation activity (such as continuous active duration, number of active functions, and the time interval between the user registration time and the T-th period); and consumption attribute data representing the user's purchases (such as recharge amount, consumption amount, number of recharges, number of recharge days, the time interval between the first recharge time and the T-th period, and information about claimed coupons). For an audio and video playing platform, the consumption attribute data used for the shopping platform can be replaced by resource playing attribute data representing the user's historical resource playing. For a news recommendation platform, the consumption attribute data used for the shopping platform can be replaced by information browsing attribute data representing the user's historical news browsing. The present application does not limit the user characteristic data.

It should be noted that, for different target network platforms, the information characteristic data of the target recommendation information in the prediction data may also be different. For example, for a shopping platform, the information characteristic data of the target recommendation information may include any one or a combination of more than one of: click rate of the commodity, rate, collection rate, average payment amount (total payment amount of the commodity/number of paid persons), average active time length (total active time of the commodity/number of active persons), and the like; for the audio and video playing platform, the information characteristic data of the target recommendation information may include any one or more of the following combinations: playing times, collected rate, user score and the like of the audio and video resources in a preset time period; for the news recommendation platform, the information characteristic data of the target recommendation information may include the popularity, the generation time, the number of browsed times, the collection rate, and the like of the news information. The information characteristic data is not specifically limited herein.

It should be noted that the determination method of the prediction tag in the prediction data is also different for different target recommendation information. For example, for a shopping platform, in the case that the target recommendation information is commodity information, if a target user views the commodity information in a T-th period, it may be determined that a prediction tag in prediction data corresponding to the target user is 1, and if the target user does not view the commodity information in the T-th period, it may be determined that a prediction tag in prediction data corresponding to the target user is 0; in the case where the target recommendation information is a product coupon, if the target user downloads the product coupon in the T-th period, the prediction tag in the prediction data corresponding to the target user may be determined to be 1, and if the target user does not download the product coupon in the T-th period, the prediction tag in the prediction data corresponding to the target user may be determined to be 0. For another example, for an audio/video playing platform, if a target user plays a recommended audio/video resource in a T-th period, it may be determined that a prediction tag in prediction data corresponding to the target user is 1, and if the target user does not play the recommended audio/video resource in the T-th period, it may be determined that the prediction tag in prediction data corresponding to the target user is 0. For the news recommendation platform, if the target user views the recommended news information or news link in the T-th period, it may be determined that the prediction tag in the prediction data corresponding to the target user is 1, and if the target user does not view the recommended news information or news link in the T-th period, it may be determined that the prediction tag in the prediction data corresponding to the target user is 0. 
The present application does not set any limit to the determination method of the prediction tag in the prediction data.
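The label rules described above can be sketched as a small lookup from the kind of recommendation information to the user action that counts as a positive label. This is only an illustrative sketch: the type names, the action names, and the `prediction_label` helper are hypothetical and are not defined in the application.

```python
from typing import Dict, Set

# Hypothetical mapping: which user action in the T-th period yields label 1
# for each kind of target recommendation information (names are assumptions).
POSITIVE_ACTION: Dict[str, str] = {
    "commodity_info": "view",        # shopping platform, commodity information
    "commodity_coupon": "download",  # shopping platform, commodity coupon
    "audio_video": "play",           # audio/video playing platform
    "news": "view",                  # news recommendation platform
}


def prediction_label(info_type: str, actions_in_period_t: Set[str]) -> int:
    """Return 1 if the user performed the qualifying action in the T-th period, else 0."""
    return 1 if POSITIVE_ACTION[info_type] in actions_in_period_t else 0


if __name__ == "__main__":
    print(prediction_label("commodity_coupon", {"view"}))      # viewed but did not download
    print(prediction_label("commodity_coupon", {"download"}))  # downloaded the coupon
```

Keeping the rule in a table makes it easy to add a platform-specific rule without touching the labelling logic, which matches the statement that the determination method is not limited.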

Step 302: Determine the proportions of positive and negative samples in the prediction data set, according to the prediction label included in each piece of prediction data in the prediction data set, as the prediction state transition probability.

After the server acquires the prediction data set, the server can calculate the respective proportions of positive samples and negative samples in the prediction data set according to the prediction label included in each piece of prediction data, and use the calculated proportions of positive and negative samples as the prediction state transition probability.

For example, the server may regard prediction data whose prediction label is 1 as a positive sample and prediction data whose prediction label is 0 as a negative sample. On this basis, the server can calculate the proportion of prediction data with a prediction label of 0 in the prediction data set as the negative sample proportion $\hat{p}_{0,T}$ of the prediction data set; accordingly, $1-\hat{p}_{0,T}$ is the positive sample proportion of the prediction data set. Further, $(\hat{p}_{0,T},\ 1-\hat{p}_{0,T})$ is taken as an unbiased estimate of the prediction state transition probability $(p_{0,T},\ 1-p_{0,T})$, where an unbiased estimate means that the mathematical expectation of the estimator equals the true value of the estimated parameter.
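The calculation in step 302 amounts to counting labels. A minimal sketch is given below, assuming the prediction labels are available as a plain sequence of 0/1 integers; the function name `state_transition_probability` is hypothetical.

```python
from typing import Iterable, Tuple


def state_transition_probability(labels: Iterable[int]) -> Tuple[float, float]:
    """Return (negative sample proportion, positive sample proportion).

    Labels equal to 0 are negative samples; labels equal to 1 are positive samples.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("the data set must contain at least one label")
    negatives = sum(1 for y in labels if y == 0)
    p0 = negatives / len(labels)
    return p0, 1.0 - p0


if __name__ == "__main__":
    # Three users did not view the information, one did.
    print(state_transition_probability([1, 0, 0, 0]))
```

The same function can be applied to the first training labels to obtain the training state transition probability, since only the source of the labels differs.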

Step 303: Determine, through an interest probability prediction model, the interest probability of the target user for the target recommendation information in the T +1 th period according to the prediction feature data included in the prediction data corresponding to the target user and the prediction state transition probability.

The server calls a pre-trained interest probability prediction model and inputs into it the prediction characteristic data (namely the user characteristic data and the information characteristic data) included in the prediction data corresponding to the target user, together with the prediction state transition probability calculated in step 302. The interest probability prediction model processes these inputs and outputs a processing result, which is the interest probability of the target user for the target recommendation information in the T +1 th period.

In a possible implementation manner, the server may input only the prediction feature data included in the prediction data corresponding to a certain target user and the prediction state transition probability calculated in step 302 into the interest probability prediction model, so as to predict, by using the interest probability prediction model, the interest probability of the target user in the T +1 th period with respect to the target recommendation information.

In another possible implementation manner, the server may also input, into the interest probability prediction model, prediction feature data included in prediction data corresponding to each target user in the prediction data set and the prediction state transition probability calculated in step 302, so as to predict, by using the interest probability prediction model, a probability of interest of each target user in the target recommendation information in the T +1 th period.

It should be noted that the interest probability prediction model is obtained by training on a training sample data set. The training sample data set comprises training sample data corresponding to a plurality of users, and the training sample data corresponding to each user comprises training characteristic data, a first training label and a second training label. The training characteristic data comprises user characteristic data of the user in a T-1 th period and information characteristic data of recommendation information recommended to the user in the T-1 th period. The first training label is determined according to the user's viewing of the recommendation information in the T-1 th period, and the second training label is determined according to the user's viewing of the recommendation information in the T-th period.

When the interest probability prediction model is trained, the proportions of positive and negative samples in the training sample data set are determined, according to the first training label included in each piece of training sample data, as the training state transition probability. Then, using the training characteristic data and the second training labels included in the training sample data set together with the training state transition probability, the model parameters of the interest probability prediction model being trained are iteratively adjusted through a gradient descent algorithm. Because the solution process of the gradient descent algorithm is influenced by the state transition probability, the convergence rate of the model can be improved to a certain extent. A specific implementation of training the interest probability prediction model is described in detail below through another embodiment.

For example, the above-mentioned interest probability prediction model may be a model that predicts the probability of a positive sample or a model that predicts the probability of a negative sample; the model predicting a positive sample may be as shown in formula (1), and the model predicting a negative sample may be as shown in formula (2).

Wherein $Y_t \in \{0,1\}$, and the probabilities of taking the two states 0 and 1 are $p_0$ and $(1-p_0)$, respectively. In formula (1) and formula (2), the input vector represents the input prediction feature data, $p_0$ and $(1-p_0)$ are the input prediction state transition probability, and the weight terms, including $W_y$, are the model parameters of the interest probability prediction model, which need to be obtained by training based on a training sample data set.

The prediction feature data processed by the interest probability prediction model often includes a large number of sparse features, which may lead to insufficient generalization capability and a poor prediction effect. To address this problem, in the method provided by the embodiment of the present application, the server may determine the sparse feature data in the predicted feature data included in the prediction data corresponding to the target user, and process the sparse feature data through a Deep Neural Network (DNN) model to obtain first target feature data. Accordingly, the server can determine, through the interest probability prediction model, the interest probability of the target user for the target recommendation information in the T +1 th period according to the first target feature data, the feature data other than the sparse feature data in the predicted feature data, and the predicted state transition probability.

For example, the server may extract sparse feature data from the predicted feature data, such as user gender, user identification, and user age; the extracted sparse feature data may then be processed by a DNN model to obtain a corresponding embedding feature (i.e., the first target feature data). Accordingly, when the server predicts the interest probability of the target user for the target recommendation information in the T +1 th period through the interest probability prediction model, the embedding feature obtained through the DNN model, the feature data other than the extracted sparse feature data in the predicted feature data, and the predicted state transition probability calculated in step 302 may be input into the interest probability prediction model.

It should be noted that, in practical applications, a DNN model with a 5-layer network structure may generally be used to process the sparse feature data. Of course, the network structure of the DNN model may also be set according to actual requirements, and the present application does not specifically limit the network structure of the DNN model.
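The sparse-feature path can be illustrated with a forward pass written in plain Python. This is a sketch under several assumptions that the application does not specify: each sparse feature is looked up in an embedding table, the embeddings are concatenated, and five fully connected ReLU layers produce the first target feature data. The class name, layer sizes, and the use of randomly initialised (untrained) weights are all hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, bias)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create a randomly initialised dense layer (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense_relu(x: Sequence[float], layer: Layer) -> List[float]:
    """Compute ReLU(Wx + b) for one dense layer."""
    weights, bias = layer
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


class SparseFeatureDNN:
    """Hypothetical DNN that maps sparse categorical features to a dense vector."""

    def __init__(self, vocabularies: Dict[str, List[str]], embed_dim: int = 4,
                 layer_sizes: Sequence[int] = (16, 16, 8, 8, 4), seed: int = 0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        # One embedding table per sparse feature (e.g. gender, age band).
        self.tables = {
            name: {tok: [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)] for tok in vocab}
            for name, vocab in vocabularies.items()
        }
        sizes = [embed_dim * len(vocabularies), *layer_sizes]
        # Five dense layers by default, following the 5-layer structure in the text.
        self.layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

    def forward(self, sparse_features: Dict[str, str]) -> List[float]:
        x: List[float] = []
        for name, table in self.tables.items():
            # Unknown or missing values fall back to a zero vector.
            x.extend(table.get(sparse_features.get(name), [0.0] * self.embed_dim))
        for layer in self.layers:
            x = dense_relu(x, layer)
        return x


if __name__ == "__main__":
    model = SparseFeatureDNN({"gender": ["f", "m"], "age_band": ["18-24", "25-34"]})
    print(model.forward({"gender": "f", "age_band": "25-34"}))
```

In a real system the weights would be learned rather than random, and a numerical library would replace the list arithmetic; the sketch only shows how sparse values become a fixed-length dense vector that can be passed to the interest probability prediction model.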

In order to obtain a better prediction effect and ensure that the interest probability of the target user on the target recommendation information in the T +1 th period can be predicted more accurately, in the method provided by the embodiment of the application, the server can determine dense feature data in the predicted feature data included in the predicted data corresponding to the target user; and further preprocessing the dense feature data to obtain second target feature data, wherein the preprocessing comprises at least one of the following processing modes: decorrelation processing, normalization processing and feature discretization processing. Accordingly, the server can determine the interest probability of the target user in the T +1 th period to the target recommendation information according to the second target feature data, the feature data except the dense feature data in the predicted feature data, and the predicted state transition probability through the interest probability prediction model.

For example, the server may extract dense feature data in the predicted feature data, for example, dense feature data such as a top-up amount, a consumption amount, an active time length, and a number of purchasers of the goods of the user; further, Principal Component Analysis (PCA) decorrelation processing, normalization processing, feature discretization processing, and the like are performed on the extracted dense feature data to obtain second target feature data. Accordingly, when the server predicts the interest probability of the target user for the target recommendation information in the T +1 th period through the interest probability prediction model, the second target feature data obtained through the preprocessing, the feature data in the predicted feature data except the extracted dense feature data, and the predicted state transition probability calculated through the step 302 may be input into the interest probability prediction model.

It should be understood that, in practical applications, in addition to the foregoing processing method, other processing methods may also be used to preprocess the dense feature data according to practical requirements, and the processing method used in the preprocessing is not limited in this application.
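Two of the preprocessing modes named above, normalization and feature discretization, can be sketched as follows. The concrete choices (min-max scaling and equal-width binning) and the function names are assumptions made for illustration; the application does not fix a particular normalization or binning scheme, and PCA decorrelation is omitted here for brevity.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(normalized: Sequence[float], n_bins: int) -> List[int]:
    """Equal-width binning of values already scaled to [0, 1]."""
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]


if __name__ == "__main__":
    recharge_amounts = [0.0, 50.0, 100.0]  # a dense feature, e.g. recharge amount
    scaled = min_max_normalize(recharge_amounts)
    print(scaled, discretize(scaled, n_bins=4))
```

The statistics used for scaling (minimum and maximum here) would normally be computed on the training data and reused at prediction time so that both data sets are transformed identically.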

It should be noted that, in practical applications, in order to enable the interest probability prediction model to more accurately predict the interest probability of the target user for the target recommendation information in the T +1 th period, the server may both process the sparse feature data in the predicted feature data through the DNN model and preprocess the dense feature data in the predicted feature data. That is, the server can separate the sparse feature data and the dense feature data in the predicted feature data corresponding to the target user, process the sparse feature data with the DNN model to obtain the corresponding first target feature data, and apply preprocessing such as PCA decorrelation, normalization, and feature discretization to the dense feature data to obtain the corresponding second target feature data. The interest probability of the target user for the target recommendation information in the T +1 th period is then predicted through the interest probability prediction model according to the first target feature data, the second target feature data, and the prediction state transition probability determined in step 302.

Step 304: Determine whether to recommend the target recommendation information to the target user in the T +1 th period according to the interest probability of the target user for the target recommendation information in the T +1 th period.

Finally, the server can determine whether the target recommendation information is recommended to the target user in the T +1 th period according to the interest probability of the target user to the target recommendation information in the T +1 th period predicted by the interest probability prediction model.

In a possible implementation manner, a classification threshold can be preset on the server. After the server obtains the interest probability predicted by the interest probability prediction model from the predicted feature data corresponding to a certain target user and the predicted state transition probability, it can judge whether the interest probability is greater than the classification threshold. If the interest probability is greater than the classification threshold, the target user is likely to view the target recommendation information in the T +1 th period, and the server may recommend the target recommendation information to the target user in the T +1 th period. Conversely, if the interest probability is not greater than the classification threshold, the target user is unlikely to view the target recommendation information in the T +1 th period, and the server does not need to recommend the target recommendation information to the target user in the T +1 th period.

In another possible implementation manner, the server may determine a classification threshold for dividing positive and negative samples according to the interest probability of each target user to the target recommendation information in the T +1 th period predicted by the interest probability prediction model; for positive prediction data with the interest probability larger than the classification threshold, target recommendation information can be recommended to the target user corresponding to the prediction data correspondingly, and for negative prediction data with the interest probability not larger than the classification threshold, the target recommendation information does not need to be recommended to the target user corresponding to the prediction data.
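The two decision rules described for step 304 can be sketched side by side. The first function applies a preset threshold. The second derives a threshold from the predicted probabilities themselves; the application does not say how that threshold is derived, so choosing it such that at most a given fraction of users lies strictly above it is purely an assumption, as are the function names.

```python
from typing import Dict, List


def recommend_with_threshold(probs: Dict[str, float], threshold: float) -> List[str]:
    """Return the users whose interest probability is strictly greater than the threshold."""
    return [uid for uid, p in probs.items() if p > threshold]


def threshold_from_scores(probs: Dict[str, float], top_fraction: float) -> float:
    """Pick a threshold so that at most `top_fraction` of users score strictly above it.

    Assumption: the threshold is the (k+1)-th highest score, with k = floor(n * top_fraction).
    """
    ranked = sorted(probs.values(), reverse=True)
    k = int(len(ranked) * top_fraction)
    return ranked[k] if k < len(ranked) else float("-inf")


if __name__ == "__main__":
    predicted = {"u1": 0.9, "u2": 0.2, "u3": 0.6, "u4": 0.4}
    print(recommend_with_threshold(predicted, 0.5))            # preset threshold
    derived = threshold_from_scores(predicted, top_fraction=0.5)
    print(derived, recommend_with_threshold(predicted, derived))
```

With tied scores, the derived threshold may select fewer users than the requested fraction, because only probabilities strictly above the threshold qualify.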

It should be understood that, in practical applications, the server may also determine whether to recommend the target recommendation information to the target user based on the interest probability predicted by the interest probability prediction model in other manners, and the implementation manner of determining whether to recommend the target recommendation information to the target user based on the interest probability is not limited in this application.

According to the information recommendation method provided by the embodiment of the application, the state transition probability is innovatively considered by the interest probability prediction model in the process of predicting the interest probability, namely, in the process of predicting the interest probability of a user for recommendation information by the interest probability prediction model, the prediction state transition probability determined based on the prediction labels included in the prediction data set is introduced, so that the interest probability predicted by the interest probability prediction model is corrected by utilizing the prediction state transition probability to reduce the concentration of the interest probability predicted by the interest probability prediction model; therefore, the interest probability predicted by the interest probability prediction model is convenient to accurately distinguish the interest degree of the user for the recommended information, and accurate information recommendation is guaranteed.

Whether the information recommendation method provided by the embodiment of the present application can recommend information accurately depends to a great extent on the performance of the interest probability prediction model. The method for training the interest probability prediction model is described in detail below through a method embodiment.

Referring to fig. 4, fig. 4 is a schematic flowchart of a model training method provided in the embodiment of the present application. For convenience of description, the following embodiments are described with the server as the execution subject of the model training method. As shown in fig. 4, the model training method includes the following steps:

Step 401: Acquire a training sample data set. The training sample data set comprises training sample data corresponding to each of a plurality of users; the training sample data comprises training characteristic data, a first training label and a second training label; the training characteristic data comprises user characteristic data of the user in a T-1 th period and information characteristic data of recommendation information in the T-1 th period; the first training label is determined according to the user's viewing of the recommendation information in the T-1 th period; and the second training label is determined according to the user's viewing of the recommendation information in the T-th period. T is an integer greater than 1.

When the server trains the interest probability prediction model, a training sample data set needs to be acquired. The training sample data set comprises training sample data corresponding to a plurality of users on a target network platform, and the training sample data corresponding to each user comprises training characteristic data, a first training label and a second training label. The training characteristic data comprises user characteristic data of the user in the T-1 th period and information characteristic data of recommendation information recommended to the user in the T-1 th period. The first training label is determined according to the user's viewing of the recommendation information in the T-1 th period, and the second training label is determined according to the user's viewing of the recommendation information in the T-th period.
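The structure of one piece of training sample data, with features from the T-1 th period and labels from two consecutive periods, can be sketched as below. The `TrainingSample` class, the `build_training_set` helper, and the representation of viewing behaviour as sets of user identifiers are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class TrainingSample:
    user_id: str
    features: Dict[str, float]  # user and information feature data from period T-1
    first_label: int            # 1 if the user viewed the information in period T-1
    second_label: int           # 1 if the user viewed the information in period T


def build_training_set(features_prev: Dict[str, Dict[str, float]],
                       viewers_prev: Set[str],
                       viewers_curr: Set[str]) -> List[TrainingSample]:
    """Join period T-1 features with viewing behaviour in periods T-1 and T."""
    return [
        TrainingSample(uid, feats, int(uid in viewers_prev), int(uid in viewers_curr))
        for uid, feats in features_prev.items()
    ]


if __name__ == "__main__":
    samples = build_training_set(
        {"u1": {"active_days": 5.0}, "u2": {"active_days": 1.0}},
        viewers_prev={"u1"},
        viewers_curr={"u1", "u2"},
    )
    for s in samples:
        print(s)
```

The first label of every sample feeds the training state transition probability, while the second label serves as the target that the model learns to predict.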

Similar to the process of obtaining the prediction data set, the server can either construct the training sample data set itself or obtain already constructed training sample data from other related devices to form the training sample data set. For the specific implementation, reference may be made to the process of acquiring the prediction data set described in step 301, which is not repeated here. The two processes differ in that the feature data to be acquired belongs to different periods, and a second training label needs to be additionally acquired when the training sample data is acquired.

It should be noted that the users corresponding to the training sample data in the training sample data set may be all or some of the login users on the target network platform in the T-1 th period, may be all or some of the login users on the target network platform in the T-th period, or may be all or some of the users who logged in to the target network platform in both the T-1 th period and the T-th period; the present application does not limit the users corresponding to the training sample data in the training sample data set.

It should be noted that the target network platform in this embodiment is the same as the target network platform in the embodiment shown in fig. 3; for details, refer to the description of the target network platform in that embodiment. The user characteristic data and the information characteristic data included in the training sample data in this embodiment are the same as those included in the prediction data in the embodiment shown in fig. 3, except that the characteristic data belongs to a different period; for details, refer to the description of the user characteristic data and the information characteristic data in the embodiment shown in fig. 3. The first training label and the second training label in this embodiment are determined in the same manner as the prediction label in the embodiment shown in fig. 3; for details, refer to the description of how the prediction label is determined in that embodiment.

In practical applications, testing of the interest probability prediction model needs to be interleaved with its training in order to detect whether training is complete. Therefore, after the server acquires the training sample data set, it needs to divide the training sample data in the set into a training sub-sample data set and a testing sub-sample data set according to a preset proportion. Training sample data assigned to the training sub-sample data set is used exclusively for training the interest probability prediction model, and training sample data assigned to the testing sub-sample data set is used exclusively for testing it.

For example, the server may randomly divide the training sample data in the training sample data set into a training sub-sample data set and a testing sub-sample data set according to a preset proportion, such as an 8:2 ratio of the training sample data in the training sub-sample data set to that in the testing sub-sample data set. Of course, in practical application, the preset proportion may be set according to actual requirements and is not limited in any way herein.
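As an illustrative sketch only, the random division described above could be written as follows in Python. The function name `split_samples`, the parameter `train_ratio` and the fixed `seed` are assumptions made for this example and do not appear in the embodiment.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Randomly split samples into (training subset, testing subset)."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical sample identifiers split in an 8:2 ratio.
train_subset, test_subset = split_samples(range(10), train_ratio=0.8)
```

With ten samples and a ratio of 8:2, eight samples fall into the training subset and two into the testing subset; every sample lands in exactly one of the two subsets.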

Step 402: determining the proportion of positive and negative samples in the training sample data set according to the first training label included in each training sample data in the training sample data set, and using the proportion as the training state transition probability.

After the server acquires the training sample data set, the respective proportions of positive samples and negative samples in the training sample data set can be calculated according to the first training label included in each training sample data, and the calculated proportions of positive and negative samples are used as the training state transition probability.

For example, the server may regard training sample data whose first training label is 1 as a positive sample, and training sample data whose first training label is 0 as a negative sample. On this basis, the server may calculate the proportion of training sample data whose first training label is 0 in the training sample data set as the negative sample proportion $\hat{p}_{0,T-1}$ of the training sample data set; accordingly, $1-\hat{p}_{0,T-1}$ is the positive sample proportion of the training sample data set. Further, $(\hat{p}_{0,T-1},\ 1-\hat{p}_{0,T-1})$ is taken as an unbiased estimate of the training state transition probability $(p_{0,T-1},\ 1-p_{0,T-1})$, where an unbiased estimate is one for which the mathematical expectation of the estimator equals the true value of the estimated parameter.
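A minimal sketch of this calculation is given below, assuming the first training labels are available as a list of 0/1 values. The function name `state_transition_probability` is hypothetical and chosen only for this example.

```python
def state_transition_probability(first_labels):
    """Return (negative sample proportion, positive sample proportion)."""
    labels = list(first_labels)
    if not labels:
        raise ValueError("at least one label is required")
    if any(label not in (0, 1) for label in labels):
        raise ValueError("labels must be 0 or 1")
    p0 = labels.count(0) / len(labels)  # share of samples whose first label is 0
    return p0, 1.0 - p0


# Five hypothetical first training labels: three negatives, two positives.
p0, p1 = state_transition_probability([1, 0, 0, 1, 0])
```

For these five labels the negative sample proportion is 3/5 and the positive sample proportion is 2/5.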

Step 403: performing iterative training on the interest probability prediction model based on the training feature data and the second training label included in the training sample data in the training sample data set, and the training state transition probability.

Furthermore, the server may perform iterative training on the interest probability prediction model to be trained according to the training feature data (i.e., the user feature data and the information feature data) and the second training label included in the training sample data in the training sample data set, and the training state transition probability calculated in step 402, until an interest probability prediction model satisfying the training end condition is obtained.

Specifically, the server may input the training feature data included in the training sample data, together with the training state transition probability, into the interest probability prediction model to be trained, and obtain the output result produced by the model from these inputs. Then, based on the difference between the output result and the second training label included in the training sample data, the model parameters of the interest probability prediction model are adjusted through a gradient descent algorithm under state transition (namely, the model parameters are all influenced by the training state transition probability during the gradient descent solving process). By repeating this process over the training sample data in the training sample data set, the model parameters of the interest probability prediction model are continuously adjusted, thereby realizing repeated iterative training of the interest probability prediction model.
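The following is a heavily simplified sketch of such a training loop and is not the model of the embodiment. It assumes, purely for illustration, a logistic regression in which the training state transition probability (p0, 1 - p0) is appended to every feature vector as two extra inputs, so that the gradient of every parameter is computed with those inputs present. The function names, learning rate, epoch count and toy data are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, second_labels, p0, lr=0.5, epochs=200):
    """Batch gradient descent on log loss; returns (weights, loss history)."""
    # Assumption: (p0, 1 - p0) is appended to each feature vector as extra inputs.
    rows = [list(x) + [p0, 1.0 - p0] for x in features]
    weights = [0.0] * len(rows[0])
    history = []
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        loss = 0.0
        for row, y in zip(rows, second_labels):
            pred = sigmoid(sum(w * v for w, v in zip(weights, row)))
            pred = min(max(pred, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(pred) + (1 - y) * math.log(1.0 - pred)
            for j, v in enumerate(row):
                grad[j] += (pred - y) * v  # gradient of log loss w.r.t. weight j
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
        history.append(loss / n)
    return weights, history


# Toy data: one feature per sample, second training labels as targets.
features = [[0.1], [0.2], [0.8], [0.9]]
second_labels = [0, 0, 1, 1]
weights, history = train(features, second_labels, p0=0.5)
```

In this sketch the average loss recorded in `history` decreases over the iterations, which corresponds to the repeated parameter adjustment described above.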

For example, the interest probability prediction model to be trained may be a model that predicts a positive sample or a model that predicts a negative sample, where the model that predicts a positive sample may specifically be as shown in formula (1), and the model that predicts a negative sample may specifically be as shown in formula (2).

In formulas (1) and (2), the inputs are the training feature data and the training state transition probabilities $p_0$ and $(1-p_0)$, and $W_y$ is among the model parameters of the interest probability prediction model to be trained.

It should be understood that, when the training process of the above-described interest probability prediction model is performed in a case where the training sample data set is divided into the training subsample data set and the test subsample data set, the server needs to iteratively train the interest probability prediction model using the training feature data and the second training label included in the training sample data in the training subsample data set, and the training state transition probability.

The training feature data used in training the interest probability prediction model usually includes a large number of sparse features, and training the model directly on these sparse features may result in insufficient generalization capability and a poor prediction effect. To address this problem, in the method provided in the embodiment of the present application, the server may determine the sparse feature data in the training feature data included in the training sample data, and then process the sparse feature data through a DNN model to obtain first training feature data. Accordingly, the server may iteratively train the interest probability prediction model based on the first training feature data, the feature data of the training feature data other than the sparse feature data, and the training state transition probability.

For example, the server may extract the sparse feature data in the training feature data, such as user gender, user identification and user age; the extracted sparse feature data may then be processed by a DNN model to obtain corresponding embedding features (i.e., the first training feature data). Accordingly, the server may iteratively train the interest probability prediction model based on the embedding features obtained through the DNN model processing, the feature data of the training feature data other than the extracted sparse feature data, and the training state transition probability calculated in step 402.

In order to enable the trained interest probability prediction model to achieve a better prediction effect, in the method provided by the embodiment of the application, the server can determine the dense feature data in the training feature data included in the training sample data, and then preprocess the dense feature data to obtain second training feature data, wherein the preprocessing comprises at least one of the following processing modes: decorrelation processing, normalization processing and feature discretization processing. Accordingly, the server may iteratively train the interest probability prediction model based on the second training feature data, the feature data of the training feature data other than the dense feature data, and the training state transition probability.

For example, the server may extract the dense feature data in the training feature data, such as a user's recharge amount, consumption amount, active duration and number of commodities purchased; PCA decorrelation processing, normalization processing, feature discretization processing and the like are then performed on the extracted dense feature data to obtain the second training feature data. Accordingly, the server may iteratively train the interest probability prediction model based on the second training feature data obtained through the above preprocessing, the feature data of the training feature data other than the extracted dense feature data, and the training state transition probability calculated in step 402.
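As a minimal sketch, two of the preprocessing modes named above (normalization and feature discretization) could look as follows for a single dense feature column. The choice of min-max scaling and equal-width bins, the function names and the sample values are assumptions for this example; PCA decorrelation is omitted to keep the sketch short.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(normalized, n_bins=4):
    """Map values in [0, 1] to equal-width bin indices 0 .. n_bins - 1."""
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]


# Hypothetical recharge amounts of four users.
recharge_amounts = [0.0, 25.0, 50.0, 100.0]
normalized = min_max_normalize(recharge_amounts)
bins = discretize(normalized)
```

Here the normalized column is 0.0, 0.25, 0.5 and 1.0, and the four values fall into bins 0, 1, 2 and 3 respectively.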

It should be noted that, in practical application, in order to give the trained interest probability prediction model better model performance, the server may both process the sparse feature data in the training feature data through the DNN model and preprocess the dense feature data in the training feature data. That is, the server can distinguish sparse feature data from dense feature data in the training feature data, process the sparse feature data with the DNN model to obtain the corresponding first training feature data, and perform preprocessing such as PCA decorrelation processing, normalization processing and feature discretization processing on the dense feature data to obtain the corresponding second training feature data; further, the server may iteratively train the interest probability prediction model based on the first training feature data, the second training feature data, and the training state transition probability determined in step 402.

When the training sample data set has been divided into a training sub-sample data set and a testing sub-sample data set, after one round of iterative training of the interest probability prediction model is completed, the server can use the training feature data and the second training label included in the training sample data in the testing sub-sample data set, together with the training state transition probability calculated in step 402, to test whether the model performance of the interest probability prediction model obtained through iterative training meets a preset condition. The preset condition may include at least one of the following: the recall ratio reaches a first preset threshold, the precision ratio reaches a second preset threshold, and the area under the receiver operating characteristic (ROC) curve (AUC) reaches a third preset threshold. When the model performance of the interest probability prediction model is determined to meet the preset condition, training of the interest probability prediction model is determined to be complete.

Specifically, the server may input the training feature data (i.e., user feature data and information feature data) included in the training sample data in the testing sub-sample data set, together with the training state transition probability, into the interest probability prediction model obtained through iterative training, and the model processes these inputs to obtain a corresponding output result. Then, based on the output results obtained by the interest probability prediction model for a plurality of training sample data in the testing sub-sample data set and the second training labels included in those training sample data, the server may determine performance parameters of the interest probability prediction model, such as at least one of the recall ratio, the precision ratio and the AUC. Further, the server may determine, according to these performance parameters, whether training of the interest probability prediction model is currently complete; for example, the server may perform at least one of the following determination operations: judging whether the recall ratio of the interest probability prediction model reaches the first preset threshold, judging whether its precision ratio reaches the second preset threshold, and judging whether its AUC reaches the third preset threshold.

It should be noted that the recall ratio refers to the proportion of correctly predicted positive training sample data among all actually positive training sample data used in the test; the precision ratio refers to the proportion of correctly predicted positive training sample data among all training sample data predicted to be positive; the AUC is the area enclosed by the receiver operating characteristic (ROC) curve and the coordinate axes, and is a standard for judging the quality of the interest probability prediction model. In practical application, the first preset threshold, the second preset threshold and the third preset threshold may be set according to actual requirements and are not limited in any way herein.
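A minimal sketch of how the three performance parameters could be computed from test outputs is given below. The function names, the 0.5 decision threshold and the sample labels and scores are assumptions for this example. The AUC is computed here through the equivalent pairwise formulation, namely the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def recall_precision(labels, scores, threshold=0.5):
    """Return (recall ratio, precision ratio) at the given score threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical second training labels and model outputs for five test samples.
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.2, 0.7]
recall, precision = recall_precision(labels, scores)
area = auc(labels, scores)
```

For this toy data two of the three positive samples are found and two of the three predicted positives are correct, so both the recall ratio and the precision ratio are 2/3, and the AUC is 5/6. Each value would then be compared with its preset threshold.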

In the training method of the interest probability prediction model described above, the training state transition probability is incorporated into the iterative training of the interest probability prediction model, so that the gradient descent process is influenced by the training state transition probability, and the convergence speed of the interest probability prediction model can be improved to a certain extent.

To further understand the information recommendation method provided by the embodiment of the present application, the method is described below as a whole by way of example, taking its application to fueling coupon download prediction in a trip service as the scenario. The purpose of fueling coupon download prediction in the trip service is to predict, from the user's download operations on fueling coupons in the trip service in the T-th period, whether the user will download a fueling coupon in the trip service in the (T+1)-th period.

Fig. 5 and fig. 6 are schematic diagrams of implementation architectures of the information recommendation method provided in the embodiment of the present application, each showing the implementation architecture from a different dimension. The overall implementation process of the information recommendation method is described below with reference to fig. 5 and fig. 6.

The information recommendation method provided by the embodiment of the application can be mainly divided into the following seven stages: a sample data set preprocessing stage, a state transition probability calculation stage, a DNN model feature processing stage, a model training stage under state transition, a model testing stage under state transition, a model prediction stage under state transition, and a classification download recommendation stage, each of which is described in detail below. In the fueling coupon download prediction scenario, the user label in the (T-1)-th period and the user label in the T-th period respectively represent the user's downloading of the fueling coupon in the (T-1)-th period and in the T-th period; a user label of 1 indicates that the user downloaded the fueling coupon, and a user label of 0 indicates that the user did not download the fueling coupon. The user characteristic data can comprise click behavior characteristic data of the user on the trip service function interface, and the information characteristic data can comprise characteristic data such as the click rate of each function on the trip service function interface, the download rate of the fueling coupon, exposure, and click rate.

In the sample data set preprocessing stage, the server may construct training sample data (including training samples and test samples) and prediction data.

When training sample data is constructed, the server can use the user characteristic data, the information characteristic data and the user label in the T-1 th period to construct the training sample data, and distinguish sparse characteristic data and dense characteristic data from the user characteristic data and the information characteristic data. The sparse feature data need to be processed by a DNN model, and the dense feature data need to be processed by PCA decorrelation processing, normalization processing, feature discretization processing and the like. The constructed training sample data is randomly divided into a training sub-sample data set (ratio a) and a testing sub-sample data set (ratio 1-a) according to a preset ratio, for example, the ratio of the training sample data in the training sub-sample data set to the training sample data in the testing sub-sample data set is 8:2 according to general experience.

When the predicted data is constructed, the server can construct the predicted data by using the user characteristic data, the information characteristic data and the user label in the T-th period, and distinguish sparse characteristic data and dense characteristic data from the user characteristic data and the information characteristic data. The sparse feature data need to be processed by a DNN model, and the dense feature data need to be processed by PCA decorrelation processing, normalization processing, feature discretization processing and the like.

The user characteristic data mainly comprises: basic attribute data such as the gender, age and region of a user; active attribute data such as active days, active duration, number of active functions, and the number of days between the current time and the registration time; recharging attribute data such as recharging amount, consumption amount, recharging times, recharging days, and the number of days between the current time and the first recharge; and coupon attribute data such as the number of coupon clicks, information on claimed coupons (such as quantity, times and value), information on used coupons (such as quantity and value), and information on expired coupons (such as quantity and value). In the fueling coupon download prediction scenario, a user label of 1 indicates that the user clicked and downloaded the coupon, and sample data including this user label is a positive sample; a user label of 0 indicates that the user clicked but did not download the coupon, and sample data including this user label is a negative sample.

In the state transition probability calculation stage, the server needs to determine the training state transition probability for the model training stage; that is, the user labels in the (T-1)-th period are used to calculate the positive and negative sample proportions (negative sample proportion $\hat{p}_{0,T-1}$, positive sample proportion $1-\hat{p}_{0,T-1}$), which serve as an unbiased estimate of the training state transition probability $(p_{0,T-1},\ 1-p_{0,T-1})$. The server also needs to determine the prediction state transition probability for the model prediction stage; that is, the positive and negative sample proportions of the prediction data set (negative sample proportion $\hat{p}_{0,T}$, positive sample proportion $1-\hat{p}_{0,T}$) serve as an unbiased estimate of the prediction state transition probability $(p_{0,T},\ 1-p_{0,T})$.

In the DNN model feature processing stage, the server can process the sparse feature data in the training sample data and the test sample data through the DNN model to obtain the corresponding embedding features. A DNN model with a 5-layer network architecture can be used here.

In the model training stage under state transition, the server may use the embedding features corresponding to the sparse feature data in the training sample data, the dense feature data in the training sample data, the user labels of the T-th period, and the training state transition probability $(p_{0,T-1},\ 1-p_{0,T-1})$ to train the interest probability prediction model based on a gradient descent method under state transition, obtaining the model weight W.

In the model testing stage under state transition, the server can use the training sample data in the testing sub-sample data set to test the interest probability prediction model (LR model) based on the model weight W, substituting the training state transition probability $(p_{0,T-1},\ 1-p_{0,T-1})$ during testing. Specifically, the server can test whether the evaluation indexes of the interest probability prediction model (such as recall ratio, precision ratio and AUC) reach a preset evaluation effect; if so, the interest probability prediction model based on the model weight W can be stored, and if not, the process returns to the model training stage under state transition to continue training the interest probability prediction model.

In the model prediction stage under state transition, the server can use the interest probability prediction model based on the model weight W to calculate, from the embedding features corresponding to the sparse feature data in the prediction data, the dense feature data in the prediction data, and the prediction state transition probability $(p_{0,T},\ 1-p_{0,T})$, the user's interest probability for the fueling coupon in the (T+1)-th period, namely the download probability of the fueling coupon.

In the classification download recommendation stage, the server can divide users into positive and negative samples by applying a threshold (such as 0.5) to the predicted download probability, wherein a user corresponding to a positive sample is willing to download the fueling coupon and can be marked as 1, and a user corresponding to a negative sample is not willing to download the fueling coupon and can be marked as 0; the fueling coupon is then recommended to the users marked as 1.
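A minimal sketch of this thresholding step follows. The function name `recommend`, the user identifiers and the probabilities are hypothetical. Whether a probability exactly equal to the threshold counts as positive is not stated in the embodiment; this sketch assumes that it does.

```python
def recommend(download_probabilities, threshold=0.5):
    """Mark each user 1 or 0 and list the users to receive the coupon."""
    # Assumption: a probability equal to the threshold is treated as positive.
    marks = {
        user: 1 if prob >= threshold else 0
        for user, prob in download_probabilities.items()
    }
    recipients = [user for user, mark in marks.items() if mark == 1]
    return marks, recipients


# Hypothetical predicted download probabilities for three users.
marks, recipients = recommend({"user_a": 0.83, "user_b": 0.41, "user_c": 0.5})
```

In this example `user_a` and `user_c` are marked 1 and would be recommended the fueling coupon, while `user_b` is marked 0.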

In addition, the method provided by the embodiment of the application can also be applied to scenarios such as fueling service recommendation. In the fueling service recommendation scenario, the server can predict, from the fueling services experienced by the user in the T-th period, whether the user will experience and use the recommended fueling service in the (T+1)-th period. The specific implementation process in the fueling service recommendation scenario is similar to that in the fueling coupon download prediction scenario; the difference is usually only that the user characteristic data and the information characteristic data in the training sample data and the prediction data differ between the two scenarios. In the fueling service recommendation scenario, the user characteristic data can comprise the basic attribute data, the active attribute data and the recharging attribute data, and can also comprise fueling service attribute data such as the number of fueling service experiences, fueling service detail information and fueling service value, and the information characteristic data is characteristic data related to the fueling service. Accordingly, in the fueling service recommendation scenario, a user label of 1 indicates that the user uses the fueling service, and a user label of 0 indicates that the user does not use the fueling service.

It should be understood that the method provided by the embodiment of the present application may also be applied to scenes such as commodity recommendation, audio and video recommendation, article recommendation, link recommendation, and the like, and training sample data and prediction data are constructed based on user characteristic data and information characteristic data in corresponding scenes in different scenes, where the scenes to which the information recommendation method provided by the embodiment of the present application is applied, and user characteristic data and information characteristic data that should be used in corresponding scenes are not limited at all.

The inventors of the present application performed comparative tests on a logistic regression model in the related art and the interest probability prediction model in the embodiments of the present application, and the test results are shown in table 1.

TABLE 1

Metric	Logistic regression model in the related art	Interest probability prediction model in the present application
Recall ratio	71.61%	88.37%
Precision ratio	63.22%	88.52%
AUC	0.6518	0.9003

The comparison shows that the interest probability prediction model in the embodiment of the application outperforms the logistic regression model in the related art on recall ratio, precision ratio and AUC.

Aiming at the information recommendation method described above, the application also provides a corresponding information recommendation device, so that the information recommendation method is applied and implemented in practice.

Referring to fig. 7, fig. 7 is a schematic structural diagram of an information recommendation apparatus 700 corresponding to the information recommendation method shown in fig. 3. As shown in fig. 7, the information recommendation apparatus 700 includes:

a prediction data obtaining module 701, configured to obtain a prediction data set; the prediction data set comprises prediction data corresponding to a plurality of target users respectively, the prediction data comprises prediction characteristic data and prediction labels, the prediction characteristic data comprises user characteristic data of the target users in a T-th period and information characteristic data of target recommendation information in the T-th period, and the prediction labels are determined according to the viewing conditions of the target users on the target recommendation information in the T-th period; t is an integer greater than 1;

a state transition probability determining module 702, configured to determine, according to the prediction tag included in each prediction data in the prediction data set, a ratio of positive and negative samples in the prediction data set as a prediction state transition probability;

an interest probability prediction module 703, configured to determine, according to the prediction feature data included in the prediction data corresponding to the target user and the prediction state transition probability, a probability of interest of the target user in the T +1 th period to the target recommendation information through an interest probability prediction model;

an information recommending module 704, configured to determine whether to recommend the target recommendation information to the target user in the T +1 th period according to the interest probability of the target user in the T +1 th period for the target recommendation information.

Optionally, on the basis of the information recommendation device shown in fig. 7, referring to fig. 8, fig. 8 is a schematic structural diagram of another information recommendation device 800 provided in the embodiment of the present application. As shown in fig. 8, the information recommendation apparatus further includes:

a sparse feature processing module 801, configured to determine sparse feature data in the prediction feature data included in the prediction data corresponding to the target user; processing the sparse characteristic data through a deep neural network model to obtain first target characteristic data;

the interest probability prediction module 703 is specifically configured to:

and determining the interest probability of the target user to the target recommendation information in the T +1 th period according to the first target feature data, the feature data except the sparse feature data in the prediction feature data and the prediction state transition probability through the interest probability prediction model.

Optionally, on the basis of the information recommendation device shown in fig. 7, referring to fig. 9, fig. 9 is a schematic structural diagram of another information recommendation device 900 provided in the embodiment of the present application. As shown in fig. 9, the information recommendation apparatus further includes:

a dense feature processing module 901, configured to determine dense feature data from the predicted feature data included in the predicted data corresponding to the target user; preprocessing the dense feature data to obtain second target feature data; the pretreatment comprises at least one of the following treatment modes: decorrelation processing, normalization processing and feature discretization processing;

the interest probability prediction module 703 is specifically configured to:

and determining the interest probability of the target user to the target recommendation information in the T +1 th period according to the second target feature data, the feature data except the dense feature data in the predicted feature data and the predicted state transition probability through the interest probability prediction model.

Optionally, on the basis of the information recommendation apparatus shown in fig. 7, the interest probability prediction model is obtained by training based on a training sample data set; the training sample data set comprises training sample data corresponding to a plurality of users respectively, the training sample data comprises training characteristic data, a first training label and a second training label, the training characteristic data comprises user characteristic data of the users in a T-1 period and information characteristic data of recommended information in the T-1 period, the first training label is determined according to the viewing condition of the users on the recommended information in the T-1 period, and the second training label is determined according to the viewing condition of the users on the recommended information in the T period.

Optionally, on the basis of the information recommendation apparatus shown in fig. 7, the user characteristic data includes at least one of the following: basic attribute data used for representing personal basic information, active attribute data used for representing operation activity of a user, and consumption attribute data used for representing purchase condition of the user;

the recommendation information includes at least one of: commodity information, commodity coupons.

The information recommendation device provided by the embodiment of the application innovatively enables the interest probability prediction model to comprehensively consider the state transition probability in the process of predicting the interest probability, namely, in the process of predicting the interest probability of a user for recommendation information by the interest probability prediction model, the prediction state transition probability determined based on the prediction labels included in the prediction data set is introduced, so that the interest probability predicted by the interest probability prediction model is corrected by utilizing the prediction state transition probability to reduce the concentration of the interest probability predicted by the interest probability prediction model; therefore, the interest probability predicted by the interest probability prediction model is convenient to accurately distinguish the interest degree of the user for the recommended information, and accurate information recommendation is guaranteed.

Aiming at the model training method described above, the present application also provides a corresponding model training device, so that the model training method described above can be applied and implemented in practice.

Referring to fig. 10, fig. 10 is a schematic structural diagram of a model training apparatus 1000 corresponding to the model training method shown in fig. 4. As shown in fig. 10, the model training apparatus 1000 includes:

a training data obtaining module 1001, configured to obtain the training sample data set; the training sample data set comprises training sample data corresponding to a plurality of users respectively, the training sample data comprises training characteristic data, a first training label and a second training label, the training characteristic data comprises user characteristic data of the users in a T-1 period and information characteristic data of recommended information in the T-1 period, the first training label is determined according to the viewing condition of the users on the recommended information in the T-1 period, and the second training label is determined according to the viewing condition of the users on the recommended information in the T period; t is an integer greater than 1;

a state transition probability determining module 1002, configured to determine, according to the first training label included in each training sample data in the training sample data set, a ratio of positive and negative samples in the training sample data set as a training state transition probability;

a model training module 1003, configured to perform iterative training on the interest probability prediction model based on the training feature data and the second training label included in the training sample data in the training sample data set and the training state transition probability.
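As an illustration of the state transition probability determining module 1002, the following minimal sketch computes the ratio of positive to negative samples from the first training labels. It assumes that a label of 1 marks a positive sample and 0 a negative sample, and it reads "ratio of positive and negative samples" as positives divided by negatives; the function name `state_transition_probability` and the sample labels are hypothetical and do not come from the embodiment.

```python
from typing import Sequence


def state_transition_probability(first_labels: Sequence[int]) -> float:
    """Return the ratio of positive to negative samples among the labels.

    Assumption: 1 means the user viewed the recommended information in the
    T-1 period (positive sample), 0 means the user did not (negative sample).
    """
    positives = sum(1 for y in first_labels if y == 1)
    negatives = sum(1 for y in first_labels if y == 0)
    if negatives == 0:
        raise ValueError("no negative samples; the ratio is undefined")
    return positives / negatives


# Hypothetical first training labels for eight users in the T-1 period.
labels_t_minus_1 = [1, 0, 0, 1, 0, 0, 0, 1]
ratio = state_transition_probability(labels_t_minus_1)  # 3 positives / 5 negatives
```

The same value is then passed, together with the training feature data and the second training labels, to whatever training routine implements the model training module 1003.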

Optionally, on the basis of the model training apparatus shown in fig. 10, referring to fig. 11, fig. 11 is a schematic structural diagram of another model training apparatus 1100 provided in an embodiment of the present application. As shown in fig. 11, the apparatus further includes:

a sample dividing module 1101, configured to divide each training sample data in the training sample data set into a training sub-sample data set and a test sub-sample data set according to a preset ratio;

the model training module 1003 is specifically configured to perform iterative training on the interest probability prediction model based on the training feature data and the second training label included in the training sample data in the training sub-sample data set and the training state transition probability;

a model testing module 1102, configured to, after the interest probability prediction model has been iteratively trained, test whether the model performance of the interest probability prediction model obtained through iterative training satisfies a preset condition by using the training feature data and the second training label included in the training sample data in the test sub-sample data set and the training state transition probability; the preset condition includes at least one of: the recall reaches a first preset threshold, the precision reaches a second preset threshold, and the area under the sensitivity curve (AUC) reaches a third preset threshold; and, if the model performance of the interest probability prediction model satisfies the preset condition, determine that the training of the interest probability prediction model is finished.
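The division and testing steps performed by modules 1101 and 1102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the split ratio, the 0.5 decision threshold, all function names, and the sample scores are hypothetical; the area under the sensitivity curve is read here as the area under the ROC curve; and the preset condition is modelled as "every threshold that has been configured must be reached, and at least one must be configured".

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_samples(samples: Sequence, train_fraction: float = 0.8,
                  seed: int = 0) -> Tuple[List, List]:
    """Shuffle the samples and divide them by a preset ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def recall_and_precision(labels: Sequence[int], scores: Sequence[float],
                         threshold: float = 0.5) -> Tuple[float, float]:
    """Compute recall and precision, treating score >= threshold as positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def meets_preset_condition(labels: Sequence[int], scores: Sequence[float],
                           min_recall: Optional[float] = None,
                           min_precision: Optional[float] = None,
                           min_auc: Optional[float] = None) -> bool:
    """Check every configured threshold; at least one must be configured."""
    recall, precision = recall_and_precision(labels, scores)
    checks = []
    if min_recall is not None:
        checks.append(recall >= min_recall)
    if min_precision is not None:
        checks.append(precision >= min_precision)
    if min_auc is not None:
        checks.append(roc_auc(labels, scores) >= min_auc)
    if not checks:
        raise ValueError("configure at least one threshold")
    return all(checks)


# Hypothetical data: ten sample ids, and six test labels with model scores.
train_ids, test_ids = split_samples(range(10), train_fraction=0.8)
test_labels = [1, 1, 0, 0, 1, 0]
test_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
training_finished = meets_preset_condition(
    test_labels, test_scores, min_recall=0.6, min_auc=0.6)
```

If the check fails, training would continue with further iterations; how many, and with what schedule, is left open here.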

Optionally, on the basis of the model training apparatus shown in fig. 10, referring to fig. 12, fig. 12 is a schematic structural diagram of another model training apparatus 1200 provided in the embodiment of the present application. As shown in fig. 12, the apparatus further includes:

a sparse feature processing module 1201, configured to determine sparse feature data in the training feature data included in the training sample data; processing the sparse feature data through a deep neural network model to obtain first training feature data;

the model training module 1003 is specifically configured to:

and performing iterative training on the interest probability prediction model based on the first training feature data, the feature data except the sparse feature data in the training feature data and the training state transition probability.
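One common way to process sparse feature data through a neural network is to look up a dense embedding for each active category and pass the pooled result through a hidden layer. The sketch below illustrates that idea only; the embodiment does not specify the network structure in this passage, so the average pooling, the single ReLU layer, the dimensions, the random (untrained) weights, and every name used are assumptions.

```python
import random
from typing import List, Sequence


def make_matrix(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Create a rows x cols matrix of small random weights (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def embed_sparse_feature(active_ids: Sequence[int],
                         embedding_table: List[List[float]],
                         hidden_weights: List[List[float]]) -> List[float]:
    """Average the embeddings of the active ids, then apply one ReLU layer."""
    dim = len(embedding_table[0])
    if active_ids:
        pooled = [sum(embedding_table[i][d] for i in active_ids) / len(active_ids)
                  for d in range(dim)]
    else:
        pooled = [0.0] * dim  # no active category: all-zero input
    # hidden_weights has shape (hidden_units, dim); ReLU keeps outputs >= 0.
    return [max(0.0, sum(w * x for w, x in zip(row, pooled)))
            for row in hidden_weights]


rng = random.Random(0)
VOCAB_SIZE, EMBED_DIM, HIDDEN_UNITS = 1000, 8, 4  # hypothetical sizes
embedding_table = make_matrix(VOCAB_SIZE, EMBED_DIM, rng)
hidden_weights = make_matrix(HIDDEN_UNITS, EMBED_DIM, rng)

# A sparse feature with three active categories out of 1000 possible ones.
first_training_features = embed_sparse_feature(
    [3, 17, 256], embedding_table, hidden_weights)
```

The resulting low-dimensional vector plays the role of the first training feature data and would be concatenated with the remaining, non-sparse features before training.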

Optionally, on the basis of the model training apparatus shown in fig. 10, referring to fig. 13, fig. 13 is a schematic structural diagram of another model training apparatus 1300 provided in the embodiment of the present application. As shown in fig. 13, the apparatus further includes:

a dense feature processing module 1301, configured to determine dense feature data in the training feature data included in the training sample data, and to preprocess the dense feature data to obtain second training feature data; the preprocessing includes at least one of the following processing modes: decorrelation processing, normalization processing and feature discretization processing;

the model training module 1003 is specifically configured to:

and performing iterative training on the interest probability prediction model based on the second training feature data, the feature data except the dense feature data in the training feature data and the training state transition probability.
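Two of the preprocessing modes named for the dense feature processing module 1301, normalization and feature discretization, can be illustrated with a short sketch. Min-max scaling and equal-width binning are chosen here only as simple examples; the embodiment does not state which normalization or discretization scheme is used, the function names and sample values are hypothetical, and decorrelation processing is not shown.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale the values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each value to one of n_bins equal-width bins (0 .. n_bins - 1)."""
    normalized = min_max_normalize(values)
    # The maximum value would land in bin n_bins, so it is clamped to the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]


# Hypothetical dense feature, e.g. minutes of activity per user.
active_minutes = [10.0, 20.0, 30.0, 50.0]
normalized_minutes = min_max_normalize(active_minutes)
binned_minutes = discretize(active_minutes, n_bins=4)
```

Either output could serve as the second training feature data; which modes are combined is a design choice left to the implementation.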

The training device for the interest probability prediction model provided by the embodiment of the application incorporates the training state transition probability into the process of iteratively training the interest probability prediction model, so that the gradient descent process is influenced by the training state transition probability, and the convergence rate of the interest probability prediction model can be improved to a certain extent.

The embodiment of the present application further provides a device for recommending information or training a model, where the device may specifically be a terminal device or a server, and the terminal device and the server provided in the embodiment of the present application will be described in terms of hardware implementation.

Referring to fig. 14, fig. 14 is a schematic structural diagram of a terminal device provided in an embodiment of the present application. As shown in fig. 14, for convenience of explanation, only the parts related to the embodiments of the present application are shown; for specific technical details that are not disclosed, please refer to the method part of the embodiments of the present application. The terminal may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point of sales (POS) terminal, a vehicle-mounted computer, and the like. The following takes the terminal being a computer as an example:

Fig. 14 is a block diagram showing a partial structure of a computer related to the terminal provided in an embodiment of the present application. Referring to fig. 14, the computer includes: a radio frequency (RF) circuit 1410, a memory 1420, an input unit 1430, a display unit 1440, a sensor 1450, an audio circuit 1460, a wireless fidelity (WiFi) module 1470, a processor 1480, and a power supply 1490. Those skilled in the art will appreciate that the computer structure shown in fig. 14 does not limit the computer, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.

The memory 1420 may be used to store software programs and modules, and the processor 1480 executes various functional applications and data processing of the computer by running the software programs and modules stored in the memory 1420. The memory 1420 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the computer, and the like. Further, the memory 1420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.

The processor 1480 is the control center of the computer. It connects the various parts of the entire computer using various interfaces and lines, and performs the various functions of the computer and processes data by running or executing the software programs and/or modules stored in the memory 1420 and calling the data stored in the memory 1420, thereby monitoring the computer as a whole. Optionally, the processor 1480 may include one or more processing units; preferably, the processor 1480 may integrate an application processor, which mainly handles the operating system, user interfaces, application programs, and the like, with a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor may also not be integrated into the processor 1480.

In the embodiment of the present application, the processor 1480 included in the terminal also has the following functions:

or,

Optionally, the processor 1480 is further configured to execute the steps of any implementation manner of the information recommendation method or the model training method provided in the embodiment of the present application.

Referring to fig. 15, fig. 15 is a schematic structural diagram of a server 1500 according to an embodiment of the present application. The server 1500 may vary widely in configuration or performance, and may include one or more central processing units (CPUs) 1522 (e.g., one or more processors), a memory 1532, and one or more storage media 1530 (e.g., one or more mass storage devices) storing applications 1542 or data 1544. The memory 1532 and the storage medium 1530 may be transient storage or persistent storage. The program stored on the storage medium 1530 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 1522 may be configured to communicate with the storage medium 1530 and to execute, on the server 1500, the series of instruction operations in the storage medium 1530.

The server 1500 may also include one or more power supplies 1526, one or more wired or wireless network interfaces 1550, one or more input-output interfaces 1558, and/or one or more operating systems, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.

The steps performed by the server in the above embodiment may be based on the server structure shown in fig. 15.

The CPU 1522 is configured to execute the following steps:

or,

Optionally, the CPU 1522 may also be configured to execute steps of any implementation manner of the information recommendation method or the model training method provided in the embodiment of the present application.

The embodiment of the present application further provides a computer-readable storage medium for storing a computer program, where the computer program is configured to execute any one implementation manner of the information recommendation method or the model training method described in the foregoing embodiments.

Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes any one of the information recommendation methods or the model training methods described in the foregoing embodiments.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only a logical functional division, and other divisions are possible in an actual implementation; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing computer programs.

It should be understood that in the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships are possible; e.g., "A and/or B" may indicate: only A is present, only B is present, or both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.

The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. An information recommendation method, characterized in that the method comprises:

2. The method of claim 1, further comprising:

determining sparse feature data in the predicted feature data included in the predicted data corresponding to the target user;

processing the sparse characteristic data through a deep neural network model to obtain first target characteristic data;

wherein the determining, by the interest probability prediction model, of the interest probability of the target user for the target recommendation information in the T+1 th period according to the prediction feature data included in the prediction data corresponding to the target user and the prediction state transition probability comprises:

3. The method of claim 1, further comprising:

determining dense feature data in the predicted feature data included in the predicted data corresponding to the target user;

preprocessing the dense feature data to obtain second target feature data; the preprocessing comprises at least one of the following processing modes: decorrelation processing, normalization processing and feature discretization processing;

4. The method of claim 1, wherein the probability of interest prediction model is trained based on a training sample data set; the training sample data set comprises training sample data corresponding to a plurality of users respectively, the training sample data comprises training characteristic data, a first training label and a second training label, the training characteristic data comprises user characteristic data of the users in a T-1 period and information characteristic data of recommended information in the T-1 period, the first training label is determined according to the viewing condition of the users on the recommended information in the T-1 period, and the second training label is determined according to the viewing condition of the users on the recommended information in the T period.

5. The method according to claim 1 or 4, wherein the user characteristic data comprises at least one of: basic attribute data used for representing personal basic information, active attribute data used for representing operation activity of a user, and consumption attribute data used for representing purchase condition of the user;

6. A method of model training, the method comprising:

7. The method of claim 6, further comprising:

according to a preset proportion, dividing each training sample data in the training sample data set into a training sub-sample data set and a testing sub-sample data set respectively;

wherein the iteratively training the interest probability prediction model based on the training feature data and the second training label included in the training sample data in the training sample data set and the training state transition probability comprises:

iteratively training the interest probability prediction model based on the training feature data and the second training label included in the training sample data in the training subsample dataset and the training state transition probability;

after iteratively training the interest probability prediction model, the method further comprises:

testing whether the model performance of the interest probability prediction model obtained through iterative training meets a preset condition by using the training feature data and the second training label included in the training sample data in the test sub-sample data set and the training state transition probability; the preset condition includes at least one of: the recall reaches a first preset threshold, the precision reaches a second preset threshold, and the area under the sensitivity curve (AUC) reaches a third preset threshold;

and if the model performance of the interest probability prediction model meets the preset condition, determining that the training of the interest probability prediction model is finished.

8. The method of claim 6, further comprising:

determining sparse feature data in the training feature data included in the training sample data;

processing the sparse feature data through a deep neural network model to obtain first training feature data;

9. The method of claim 6, further comprising:

determining dense feature data in the training feature data included in the training sample data;

preprocessing the dense feature data to obtain second training feature data; the preprocessing comprises at least one of the following processing modes: decorrelation processing, normalization processing and feature discretization processing;

10. An information recommendation apparatus, characterized in that the apparatus comprises:

11. The apparatus of claim 10, further comprising:

the sparse feature processing module is used for determining sparse feature data in the predicted feature data included in the predicted data corresponding to the target user; processing the sparse characteristic data through a deep neural network model to obtain first target characteristic data;

the interest probability prediction module is specifically configured to:

12. The apparatus of claim 10, further comprising:

the dense feature processing module is used for determining dense feature data in the predicted feature data included in the predicted data corresponding to the target user; preprocessing the dense feature data to obtain second target feature data; the preprocessing comprises at least one of the following processing modes: decorrelation processing, normalization processing and feature discretization processing;

the interest probability prediction module is specifically configured to:

13. A model training apparatus, the apparatus comprising:

14. An apparatus, comprising a processor and a memory;

the memory is used for storing a computer program;

the processor is configured to execute the information recommendation method of any one of claims 1 to 5 or the model training method of any one of claims 6 to 9 according to the computer program.

15. A computer-readable storage medium, characterized in that the computer-readable storage medium is configured to store a computer program for executing the information recommendation method of any one of claims 1 to 5 or the model training method of any one of claims 6 to 9.