CN115470397B - Content recommendation method, device, computer equipment and storage medium - Google Patents

Content recommendation method, device, computer equipment and storage medium

Info

Publication number
CN115470397B
CN115470397B
Authority
CN
China
Prior art keywords
sample
content
content recommendation
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110651154.7A
Other languages
Chinese (zh)
Other versions
CN115470397A (en)
Inventor
王良栋
张博
刘书凯
丘志杰
饶君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110651154.7A priority Critical patent/CN115470397B/en
Publication of CN115470397A publication Critical patent/CN115470397A/en
Application granted granted Critical
Publication of CN115470397B publication Critical patent/CN115470397B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A content recommendation method, apparatus, computer device and storage medium. The method includes: in response to a content recommendation event, acquiring a user portrait and a user historical behavior sequence of a user to be recommended; inputting the user portrait and the user historical behavior sequence into a content recommendation incremental model determined through training; and acquiring recommended content output by the content recommendation incremental model determined through training. The content recommendation incremental model is trained based on sample user portraits, sample user historical behavior sequences and a target sharing reflux amount, and the target sharing reflux amount is determined according to a content recommendation total model determined through training. In this method, the trained content recommendation total model can output a more accurate target sharing reflux amount; the content recommendation incremental model is trained on less data, so the model updates quickly; and because the incremental model is trained together with the target sharing reflux amount output by the trained total model, the trained incremental model can recommend more accurate personalized content.

Description

Content recommendation method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a content recommendation method, apparatus, computer device, and storage medium.
Background
The recall stage of a recommendation system can be understood as the process of roughly selecting, from massive information, a small candidate set of contents to be recommended according to the user's historical behavior data. A recommendation system generally relies on many kinds of data for analysis and training in order to obtain more accurate personalized recommendations, and among the data analyzed is user sharing reflux (reflow) data.
When personalized content recommendation is performed in combination with the reflux data, the currently adopted approach is to train a content recommendation total model on all of the data generated each day, predict the reflux data that new content may generate with the trained model, and then perform content recommendation based on the predicted reflux data through the content recommendation total model. However, the acquisition and integration of the sample data required in this manner is delayed, the model updates slowly, and the content recommended by such a model is of poor quality.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a content recommendation method, apparatus, computer device, and storage medium capable of improving recommendation effects.
A content recommendation method, the method comprising:
responding to a content recommendation event, and acquiring a user portrait of a user to be recommended and a user history behavior sequence;
inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training;
acquiring recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training.
A content recommendation device, the device comprising:
the acquisition module is used for responding to the content recommendation event and acquiring a user portrait of the user to be recommended and a user history behavior sequence;
the input module is used for inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training;
the result acquisition module is used for acquiring recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training.
A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:
responding to a content recommendation event, and acquiring a user portrait of a user to be recommended and a user history behavior sequence;
inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training;
acquiring recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
responding to a content recommendation event, and acquiring a user portrait of a user to be recommended and a user history behavior sequence;
inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training;
acquiring recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training.
The content recommendation method, the device, the computer equipment and the storage medium are used for acquiring user portraits and user historical behavior sequences of users to be recommended, inputting the user portraits and the user historical behavior sequences into a content recommendation incremental model determined through training, and acquiring recommendation contents output by the content recommendation incremental model; the content recommendation incremental model is determined based on the sample user portraits, the sample user historical behavior sequences and the target sharing reflux quantity training, and the target sharing reflux quantity is obtained from the content recommendation total model determined through training. According to the method, the content recommendation total model can be trained based on more complete data, the trained content recommendation total model can output more accurate target sharing reflux amount, the content recommendation incremental model can be trained based on less data, model updating is faster, meanwhile, the content recommendation incremental model is combined with the target sharing reflux amount output by the trained content recommendation total model, and the trained incremental model can recommend more accurate personalized content.
Drawings
FIG. 1 is an application environment diagram of a content recommendation method in one embodiment;
FIG. 2 is a flow chart of a content recommendation method according to an embodiment;
FIG. 3 is a flow diagram of a training process for a content recommendation delta model in one embodiment;
FIG. 4 is a schematic flow chart of obtaining a target sharing reflux amount based on first sample user information and a content recommendation total model determined through training in one embodiment;
FIG. 5 is a flow diagram of a training process for a content recommendation full model in one embodiment;
FIG. 6 is a schematic diagram of a full scale model in one embodiment;
FIG. 7 is a schematic diagram of an incremental model in one embodiment;
FIG. 8 is a block diagram showing a structure of a content recommendation device in one embodiment;
fig. 9 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The content recommendation method provided by the application can be applied to an application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. When a content recommendation event triggered by the terminal 102 is responded, a corresponding user to be recommended is acquired, corresponding user portrait, user historical behavior sequence and other information are acquired, the user portrait and the user historical behavior sequence are input into a content recommendation incremental model determined through training, and recommended content output by the content recommendation incremental model is acquired; the content recommendation incremental model is determined based on the sample user portraits, the sample user historical behavior sequences and the target sharing reflux quantity training, and the target sharing reflux quantity is obtained from the content recommendation total model determined through training. Wherein the content recommendation delta model and the content recommendation total model may be determined and stored in training in the server 104. The terminal 102 may be, but not limited to, various personal computers, notebook computers, smartphones, tablet computers, and portable wearable devices, and the server 104 may be implemented by a stand-alone server or a server cluster composed of a plurality of servers.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior in order to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout the various fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching learning.
In one embodiment, as shown in fig. 2, a content recommendation method is provided, and the method is applied to the server in fig. 1 for illustration, it is to be understood that, in other embodiments, the method may also be applied to the terminal in fig. 1. In this embodiment, the method includes steps S210 to S230.
Step S210, responding to the content recommendation event, and acquiring a user portrait of the user to be recommended and a user history behavior sequence.
Wherein, the content recommendation event represents an event requiring content recommendation; in one embodiment, the content recommendation event may be customized according to the actual situation. For example, in one particular embodiment, when a target application launch event is detected, it is determined that a content recommendation event is detected; the target application program may be a content browsing application program, such as a video playing application program, a news event application program, etc., or may be a content browsing applet, etc. In another specific embodiment, when a content loading operation is detected, it is determined that a content recommendation event is detected; in this embodiment, the content loading operation may also be defined according to practical situations, for example, the top page in the application program slides down to request loading of more content, and so on. In one embodiment, the content may represent different content such as pictures, videos, audios, articles, etc., and the user may browse the content in the terminal.
The user to be recommended represents a user for whom content recommendation is required; in one embodiment, when a content recommendation event is monitored, the user who corresponds to the content recommendation event and needs content recommendation is obtained, and then the relevant user information, including the user portrait, the user historical behavior sequence and the like, is obtained for that user; further, the user identification can be obtained to represent the user to be recommended.
A user portrait, also known as a user persona, is widely used in various fields as an effective tool for profiling target users and connecting user demands with design directions. In this embodiment, the user historical behavior sequence represents the historical behavior sequence of the user associated with content; further, in one embodiment, the user historical behavior sequence may include information such as content browsing records of the content browsed by the user and browsing operation records (e.g., forwarding, liking, commenting, collecting, etc.) for each content browsed. Furthermore, the user portrait and the user historical behavior record of the user to be recommended can be obtained from a preset database in which the relevant user information of each user is stored; in other embodiments, the user portrait and the user historical behavior record of the user to be recommended may also be obtained from a database of the terminal where the user to be recommended is located, which stores the relevant user information of the user corresponding to that terminal. In other embodiments, the user portrait and the user historical behavior record of the user to be recommended may be obtained in other ways.
In a specific embodiment, when a user opens a home page of a content browsing application program at a terminal or requests loading of more content from the home page of the content browsing application program, the server determines that a content recommendation event is monitored, responds to the content recommendation event to obtain a corresponding user identifier, and obtains a user portrait and a user history behavior record of the user to be recommended according to the user identifier.
Step S220, inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training.
Wherein the incremental model represents a model constructed based on incremental data, and the incremental data represent data updated over a period of time. The counterpart of the content recommendation incremental model is the full model, which represents a model constructed based on full data, i.e., all of the data. In one embodiment, incremental data is a concept opposed to full data: incremental data represent data updated over a short period of time (e.g., 1 hour), and full data represent data updated over a long period of time (e.g., 8 days). In this embodiment, the incremental model is used to recommend content to the user and is denoted as the content recommendation incremental model.
In this embodiment, the content recommendation incremental model is determined through training in advance, and the training process of the content recommendation incremental model will be described in detail in the following embodiments, which will not be described here again.
Step S230, acquiring recommended content output by a content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training.
In a recommendation scene, a user shares a video into a group chat or a single chat, and a new user may click the shared video and thereby enter the recommendation scene. In this embodiment, the sharing reflux amount indicates how many users are brought in to visit after one user's share; the sharing reflux amount has a large influence on the recommendation scene, especially on daily active users.
After the content recommendation incremental model is trained, the content recommendation incremental model can be used for recommending the content for the user, and in the embodiment, the acquired user portrait and user history behavior sequence of the user to be recommended are input into the content recommendation incremental model determined through training, so that recommended content output by the content recommendation incremental model can be acquired.
In this embodiment, the content recommendation incremental model is determined based on a sample user representation, a sample user historical behavior sequence, and a target sharing reflux amount, which represents an output result of the trained content recommendation full model. In one embodiment, the target sharing reflux amount represents an estimated sharing reflux amount of the trained content recommendation total model for input data output; because the content recommendation total model is trained, the output estimated sharing reflux amount is relatively close to the real possible sharing reflux amount, and in the embodiment, the target sharing reflux amount in the output result of the trained content recommendation total model is used as a pseudo tag of the content recommendation incremental model to train the content recommendation incremental model.
Further, in a specific embodiment, the input data of the content recommendation total model determined through training is the sample data of the content recommendation incremental model; when training the content recommendation incremental model, sample data are first acquired and input into the trained content recommendation total model, the target sharing reflux amount in the output result is acquired, and the target sharing reflux amount is used as an input of the content recommendation incremental model for training. In one embodiment, the target sharing reflux amount is used as an input of the content recommendation incremental model for training, specifically as a pseudo label of the content recommendation incremental model. A pseudo label means that samples in the test set whose predicted results are judged correct with high confidence are added to the training set; in this embodiment, the estimated result output by the trained full model is used as training sample data of the incremental model.
Further, the full-scale model determined by training is used to assist in training the content recommendation incremental model, which is denoted as the content recommendation full-scale model in this embodiment. The training process for the full-scale model will also be described in detail in the following embodiments, and will not be described in detail here.
In one embodiment, the content recommendation incremental model has the same basic structure as the content recommendation full model.
Further, in one embodiment, the process by which the incremental model determined through training outputs the recommended content includes: determining a sharing target prediction vector, a playing target prediction vector and a predicted sharing reflux amount according to the user portrait and the user historical behavior sequence; merging the sharing target prediction vector, the playing target prediction vector and the predicted sharing reflux amount to obtain a recommendation vector; matching the recommendation vector against a preset database; and selecting the contents with high similarity as the recommended content. The content attribute embedding, the content embedding, the user behavior embedding and the context environment embedding are determined and stored during the training of the incremental model.
Further, in one embodiment, merging the sharing target prediction vector, the playing target prediction vector and the predicted sharing reflux amount to obtain the recommendation vector includes: calculating the sum of the playing target prediction vector and the product of the predicted sharing reflux amount and the sharing target prediction vector, and taking the sum as the recommendation vector; the calculation is sketched below.
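A minimal sketch of this merge and the subsequent similarity matching, written here in Python for illustration; the cosine-similarity retrieval, the function names and the parameters are assumptions and do not come from the disclosure itself:

```python
import numpy as np

def build_recommendation_vector(play_vec: np.ndarray,
                                share_vec: np.ndarray,
                                predicted_reflux: float) -> np.ndarray:
    # recommendation vector = playing target vector + predicted sharing reflux * sharing target vector
    return play_vec + predicted_reflux * share_vec

def recommend_top_k(rec_vec: np.ndarray, content_embeddings: np.ndarray, k: int = 10):
    # cosine similarity against the stored content embeddings; return the k most similar contents
    rec = rec_vec / np.linalg.norm(rec_vec)
    db = content_embeddings / np.linalg.norm(content_embeddings, axis=1, keepdims=True)
    scores = db @ rec
    return np.argsort(-scores)[:k]
```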
According to the content recommendation method, for a user to be recommended, a user portrait and a user history behavior sequence of the user to be recommended are obtained, the user portrait and the user history behavior sequence are input into a content recommendation incremental model determined through training, and recommended content output by the content recommendation incremental model is obtained; the content recommendation incremental model is determined based on the sample user portraits, the sample user historical behavior sequences and the target sharing reflux quantity training, and the target sharing reflux quantity is obtained from the content recommendation total model determined through training. According to the method, the content recommendation total model can be trained based on more complete data, the trained content recommendation total model can output more accurate target sharing reflux amount, the content recommendation incremental model can be trained based on less data, model updating is faster, meanwhile, the content recommendation incremental model is combined with the target sharing reflux amount output by the trained content recommendation total model, and the trained incremental model can recommend more accurate personalized content.
In one embodiment, as shown in FIG. 3, the training process of the content recommendation delta model includes steps S310 through S340.
Step S310, acquiring first sample user information, where the first sample user information includes: the first sample user picture, the first sample user behavior sequence.
The sample user information represents sample information for training a model, and specifically includes a portrait of a user and a user behavior sequence, which are respectively marked as a first sample user portrait and a first sample user behavior sequence in the embodiment. In one embodiment, the first sample information for training the content recommendation delta model is user information within a first history period, which may be set according to actual situations, for example, 1 hour, 5 hours, 8 hours, and so on. It should be noted that, in the embodiments of the present application, references to "first" and "second" are only used for distinguishing between the designations, and do not indicate any actual meaning.
Step S320, obtaining the target sharing reflux amount based on the first sample user information and the content recommendation total model determined through training.
The content recommendation incremental model is determined based on the target sharing reflux amount, and the target sharing reflux amount is output by the trained content recommendation total model.
Further, in one embodiment, as shown in fig. 4, the target sharing reflux amount is obtained based on the first sample user information and the content recommendation total model determined through training, and the steps include step S321 to step S323. Wherein:
step S321, according to the first sample user behavior sequence, obtaining a first sample content attribute of each content in the first sample user behavior sequence.
Wherein, the attribute of the content can be used for reflecting the information of the content, and the attribute of one content generally comprises a plurality of attributes such as the label of the content, the category of the content and the like; for example, when the content is video, the attribute of the content may be classification of the video, secondary classification information, such as tags, entities, multimedia attributes, title segmentation, and the like.
In one embodiment, the user behavior sequence includes a browsing record of the user for the content, and each content appearing in the user behavior sequence can be obtained according to the user behavior sequence, so that the content attribute of each content in each user behavior sequence can be obtained. In this embodiment, the user behavior sequence and the content attribute are used to train the content recommendation delta model, and are therefore respectively denoted as a first sample user behavior sequence and a first sample content attribute.
In step S322, the first sample user information and the first sample content attribute are input into the content recommendation total model determined by training.
Because the content recommendation total model is trained, the corresponding target sharing reflux quantity can be output for the input first sample user information and the first sample content attribute; the training content recommendation total model aims to predict the reflux amount data which can be obtained after the content is shared by the user as accurately as possible when the user interacts with the content.
Step S323, obtaining the target sharing reflux amount output by the content recommendation total model determined through training.
In one embodiment, after the first sample user information and the first sample content attribute are input into the trained content recommendation total model, the content recommendation total model outputs a first target sharing reflux amount based on the first sample user information (the first sample user portrait and the first sample user behavior sequence) and a second target sharing reflux amount based on the first sample content attribute.
Further, in a specific embodiment, the content recommendation total model outputs the first target sharing reflux amount based on the first sample user information as follows: obtaining corresponding full sample portrait vectors and full sample sequence vectors according to the first sample user portrait and the first sample user behavior sequence; combining the full sample portrait vectors to obtain a full sample portrait comprehensive vector, and combining the full sample sequence vectors to obtain a full sample sequence comprehensive vector; obtaining a full content attribute comprehensive vector, processed by an attention mechanism, based on the full sample portrait comprehensive vector and the full sample sequence comprehensive vector; and integrating the full sample portrait comprehensive vector, the full sample sequence comprehensive vector and the full content attribute comprehensive vector to obtain a full user comprehensive vector, and performing multi-objective optimization processing on the full user comprehensive vector to obtain a full user behavior estimation result and a first full estimated sharing reflux amount, i.e., the first target sharing reflux amount. The content recommendation total model outputs the second target sharing reflux amount based on the first sample content attribute as follows: the first sample content attribute is processed by an attention mechanism and then by a fully connected layer to obtain a second estimated sharing reflux amount, i.e., the second target sharing reflux amount. In one embodiment, the full user behavior estimation result includes a full predicted shared content vector and a full predicted played content vector. Further, the full predicted shared content vector represents the vector of content that, as predicted by the full model, the user intends to share, and the full predicted played content vector represents the vector of content that, as predicted by the full model, the user intends to play. In one embodiment, multi-objective means that, in addition to the click objective, other recommendation indexes related to user experience, such as duration, sharing, likes, comments and reflux, are also included.
In the present embodiment, "full" in "full sample portrait vector", "full sample sequence vector", "full sample portrait comprehensive vector", "full sample sequence comprehensive vector" and the like is only used to indicate parameters related to the full model.
Further, in one embodiment, the trained full model combines the full sample portrait vectors to obtain the full sample portrait comprehensive vector, and combines the full sample sequence vectors to obtain the full sample sequence comprehensive vector, which can be implemented in any manner. In one embodiment, a weighted average is performed over the full sample portrait vectors to obtain the full sample portrait comprehensive vector, and a weighted average is performed over the full sample sequence vectors to obtain the full sample sequence comprehensive vector. Further, in one embodiment, one Transformer layer may be used to obtain the full sample portrait comprehensive vector, and two weighted Transformer layers may be used to obtain the full sample sequence comprehensive vector.
In one embodiment, the vectors corresponding to the sample user portraits in the first sample user portrait are processed by one Transformer layer to output a comprehensive vector, i.e., the full sample portrait comprehensive vector in this embodiment. The vectors corresponding to the user behavior sequences in the first sample user behavior sequence are processed by two Transformer layers to output a comprehensive vector, i.e., the full sample sequence comprehensive vector in this embodiment. The content attributes of the contents involved in the first sample user behavior sequence are processed by the attention mechanism to output a comprehensive vector, i.e., the full content attribute comprehensive vector in this embodiment.
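The encoding path described above can be illustrated with the following hedged PyTorch sketch. Only the overall structure (one Transformer layer for the portrait vectors, two for the behavior-sequence vectors, an attention mechanism over the content attributes, and concatenation into a user comprehensive vector) follows the description; the layer sizes, pooling choices and module names are assumptions:

```python
import torch
import torch.nn as nn

class FullModelEncoder(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        # one Transformer layer merges the portrait vectors
        self.portrait_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=1)
        # two Transformer layers merge the behavior-sequence vectors
        self.sequence_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=2)
        # attention mechanism over the content attribute vectors
        self.attr_attention = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, portrait_vecs, sequence_vecs, attr_vecs):
        portrait_comp = self.portrait_encoder(portrait_vecs).mean(dim=1)   # sample portrait comprehensive vector
        sequence_comp = self.sequence_encoder(sequence_vecs).mean(dim=1)   # sample sequence comprehensive vector
        query = (portrait_comp + sequence_comp).unsqueeze(1)               # query the attributes with the merged user state
        attr_comp, _ = self.attr_attention(query, attr_vecs, attr_vecs)    # content attribute comprehensive vector
        # user comprehensive vector fed to the multi-objective optimization heads
        return torch.cat([portrait_comp, sequence_comp, attr_comp.squeeze(1)], dim=-1)
```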
In this embodiment, first sample user information for training a content recommendation incremental model and content attributes of each sample obtained based on the first sample user information are input into a trained content recommendation full-scale model, and a corresponding estimated sharing reflux amount output by the full-scale model is obtained as a target sharing reflux amount and used for training the incremental model. Because the data used in the training process of the full-scale model is comprehensive, the estimated sharing reflux quantity corresponding to the input data output of the trained full-scale model is close to the real sharing reflux quantity, and therefore the estimated reflux quantity output by the full-scale model is used as training sample data of the content recommendation incremental model, and the training is facilitated to obtain the content recommendation incremental model with better effect.
Step S330, calculating a first sampling loss based on the first sample user information, and calculating a first sharing reflux amount loss based on the first sample user information and the target sharing reflux amount.
The loss function used in calculating the sampling loss and sharing the return loss can be any loss function.
In one embodiment, calculating a first sampling loss based on the first sample user information, calculating a first shared reflux amount loss based on the first sample user information, the target shared reflux amount, includes: inputting the first sample user information into a preset increment model, and pre-estimating the preset increment model based on the first sample user information to obtain an intermediate increment user behavior pre-estimation result and an intermediate increment sharing reflux amount pre-estimation result; corresponding labeling information can be obtained according to the first sample user information, and a loss function value (marked as first sampling loss in the embodiment) corresponding to the corresponding user behavior can be calculated according to the intermediate increment user behavior estimation result and the labeling information; according to the intermediate increment sharing reflux amount estimation result and the target sharing reflux amount, a loss function value (in this embodiment, referred to as a first sharing reflux amount loss) corresponding to the sharing reflux amount can be determined. In a specific embodiment, the intermediate delta user behavior prediction result comprises an intermediate delta shared content vector and an intermediate delta play content vector; correspondingly, the annotation information corresponding to the first sample user information comprises: sharing content annotation and playing content annotation.
Further, in one embodiment, the first sample user information is input into a preset incremental model, and the preset incremental model performs estimation based on the first sample user information to obtain the intermediate incremental user behavior estimation result and the intermediate incremental sharing reflux estimation result, through the following steps: the preset incremental model obtains corresponding incremental sample portrait vectors and incremental sample sequence vectors according to the first sample user portrait and the first sample user behavior sequence; combines the incremental sample portrait vectors to obtain an incremental sample portrait comprehensive vector, and combines the incremental sample sequence vectors to obtain an incremental sample sequence comprehensive vector; obtains an incremental content attribute comprehensive vector, processed by an attention mechanism, based on the incremental sample portrait comprehensive vector and the incremental sample sequence comprehensive vector; and integrates the incremental sample portrait comprehensive vector, the incremental sample sequence comprehensive vector and the incremental content attribute comprehensive vector to obtain an incremental user comprehensive vector, and performs multi-objective optimization processing on the incremental user comprehensive vector to obtain the intermediate incremental user behavior estimation result and the intermediate incremental sharing reflux estimation result.
In the present embodiment, "incremental" in "incremental sample portrait vector", "incremental sample sequence vector", "incremental sample portrait comprehensive vector", "incremental sample sequence comprehensive vector" and the like is only used to indicate results involved in the incremental model.
Further, in one embodiment, the preset incremental model combines the incremental sample portrait vectors to obtain the incremental sample portrait comprehensive vector, and combines the incremental sample sequence vectors to obtain the incremental sample sequence comprehensive vector, which can be implemented in any manner. In one embodiment, a weighted average is performed over the incremental sample portrait vectors to obtain the incremental sample portrait comprehensive vector, and a weighted average is performed over the incremental sample sequence vectors to obtain the incremental sample sequence comprehensive vector. Further, in one embodiment, one Transformer layer may be used to obtain the incremental sample portrait comprehensive vector, and two weighted Transformer layers may be used to obtain the incremental sample sequence comprehensive vector.
In one embodiment, the vectors corresponding to the sample user portraits in the first sample user portrait are processed by one Transformer layer to output a comprehensive vector, i.e., the incremental sample portrait comprehensive vector in this embodiment. The vectors corresponding to the user behavior sequences in the first sample user behavior sequence are processed by two Transformer layers to output a comprehensive vector, i.e., the incremental sample sequence comprehensive vector in this embodiment. The content attributes of the contents involved in the first sample user behavior sequence are processed by the attention mechanism to output a comprehensive vector, i.e., the incremental content attribute comprehensive vector in this embodiment.
And step S340, training the preset incremental model according to the first sampling loss and the first sharing reflux quantity loss to obtain a content recommendation incremental model determined by training.
Further, after two loss function values (first sampling loss and first sharing reflux amount loss) are determined according to the first sample user information and the target sharing reflux amount, parameters of a preset incremental model can be adjusted based on the loss function values, the process is repeated until a model training stop condition (any one condition can be customized according to actual conditions) is reached, and finally a trained content recommendation incremental model is obtained according to the model parameters when the model training stop condition is reached.
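One incremental-model training step under the two losses described above could look like the following sketch, assuming the trained full model exposes a method that returns the target sharing reflux amount (used as a pseudo label) and that the behavior targets are share/play annotations; the interfaces, loss choices and tensor names are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def incremental_training_step(inc_model, optimizer, batch, full_model):
    # the trained full model provides the target sharing reflux (pseudo label); no gradients flow through it
    with torch.no_grad():
        target_reflux = full_model.predict_reflux(batch["portrait"], batch["behavior_seq"])  # hypothetical interface

    out = inc_model(batch["portrait"], batch["behavior_seq"])  # intermediate incremental estimates (dict, assumed)
    # first sampling loss: user-behavior estimates vs. the share/play annotations (float labels assumed)
    sampling_loss = (F.binary_cross_entropy_with_logits(out["share_logits"], batch["share_label"])
                     + F.binary_cross_entropy_with_logits(out["play_logits"], batch["play_label"]))
    # first sharing reflux loss: incremental reflux estimate vs. the full model's target sharing reflux
    reflux_loss = F.mse_loss(out["reflux"], target_reflux)

    loss = sampling_loss + reflux_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```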
In this embodiment, how to train to obtain the content recommendation incremental model is described in detail, the first sample user information for training the incremental model is input into the trained content recommendation total model, the target sharing reflux amount output by the total model is obtained to be used as the input of the incremental model for training, and the recommendation effect of the content recommendation incremental model obtained based on the target sharing reflux amount training is better because the target sharing reflux amount predicted by the trained content recommendation total model is close to the real sharing reflux amount.
In one embodiment, regularization is employed to constrain the target sharing reflux amount.
Regularization is a constraint algorithm for features and for models. In one embodiment, any regularization mode can be adopted to restrict the target sharing reflux quantity; such as L1 regularization, L2 regularization, etc. Further, the specific process of regularization is any prior art, and will not be described herein.
The reflux amount of head content does not change easily, so it is most suitable for anchoring the reference standard; therefore, regularization is adopted in this embodiment to constrain the target sharing reflux amount. Here, head content represents content that appears with high frequency in the sample user behavior sequence. In a specific embodiment, content whose occurrence frequency is greater than a preset threshold may be determined as head content; in other embodiments, other conditions may be set to determine the head content from a sample user behavior sequence.
In this embodiment, regularization processing is performed on the sharing reflux amounts predicted by the two parts of the trained full model, so that when the incremental model obtains the target sharing reflux amount from the trained full model during training, excessive deviation in the data does not produce an excessive error after the final training. In this way, using regularization to constrain the target sharing reflux amount can reduce the error in the incremental model training process.
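One way such a constraint could be written is sketched below, assuming an L2-style penalty that anchors the predicted sharing reflux of head content to its stable historical value; the anchoring statistics, the weight and the names are illustrative assumptions rather than the patented formulation:

```python
import torch

def reflux_regularization(pred_reflux, content_ids, head_reference, weight=0.1):
    """pred_reflux: predicted sharing reflux per sample (1-D tensor);
    head_reference: dict mapping head-content id -> stable historical reflux value (assumed)."""
    penalties = []
    for i, cid in enumerate(content_ids):
        if cid in head_reference:  # only head content anchors the reference standard
            anchor = torch.tensor(head_reference[cid], dtype=pred_reflux.dtype)
            penalties.append((pred_reflux[i] - anchor) ** 2)
    if not penalties:
        return pred_reflux.new_zeros(())          # no head content in this batch: no penalty
    return weight * torch.stack(penalties).mean() # L2-style anchoring penalty
```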
In one embodiment, as shown in fig. 5, the training process of the content recommendation full model includes steps S510 to S540.
Step S510, obtaining second sample user information, where the second sample user information includes a second sample user portrait, a second sample user behavior sequence, and a sample sharing reflux amount.
The second sample user information represents the sample information used for training the full model, and specifically includes user portraits and user behavior sequences. In one embodiment, the second sample user information used to train the content recommendation full model is user information within a second historical period of time, which may be set according to the actual situation, for example, 1 day, 5 days, 8 days, and so on.
The sample sharing reflux amount is the actual reflux value obtained by performing reflux statistics on content sharing events that have already occurred; in one embodiment, the sample sharing reflux amount is the sharing reflux amount corresponding to the content sharing events in the second sample user behavior sequence of the second sample user information. Counting the sharing reflux value corresponding to a content sharing event can be implemented in any manner, as illustrated below.
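As a small illustration, counting the sharing reflux of logged share events could be done as follows, assuming a hypothetical log schema in which each return visit records the share it came from:

```python
def count_sharing_reflux(share_ids, visit_events):
    """share_ids: ids of the content sharing events that occurred;
    visit_events: iterable of (share_id, user_id) pairs for visits entered via a share."""
    reflux = {sid: set() for sid in share_ids}       # every share event starts with zero reflux
    for share_id, user_id in visit_events:
        if share_id in reflux:
            reflux[share_id].add(user_id)            # a distinct user brought in by this share
    return {sid: len(users) for sid, users in reflux.items()}
```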
Step S520, determining a second sample content attribute of each content in the second sample user behavior sequence.
In one embodiment, the user behavior sequence includes a browsing record of the user for the content, and each content appearing in the user behavior sequence can be obtained according to the user behavior sequence, so that the content attribute of each content in each user behavior sequence can be obtained.
In step S530, a second sampling loss is calculated based on the second sample user information, and a second sharing reflux amount loss is calculated based on the second sample user information, the second sample content attribute, and the sample sharing reflux amount.
The loss function used in calculating the sampling loss and the sharing reflux loss can be any loss function. Similar to the training process of the incremental model, when training the full model, the second sample user information and the second sample content attribute are input into the preset full model, and the preset full model performs estimation based on them to obtain an intermediate full user behavior estimation result and an intermediate full sharing reflux estimation result; corresponding annotation information can be obtained according to the second sample user information, and the loss function value corresponding to the user behavior (denoted as the second sampling loss in this embodiment) can be calculated according to the intermediate full user behavior estimation result and the annotation information; the loss function value corresponding to the sharing reflux amount (denoted as the second sharing reflux loss in this embodiment) can be determined according to the intermediate full sharing reflux estimation result and the sample sharing reflux amount.
The obtaining of the intermediate total sharing reflux amount estimated result based on the second sample user information and the second sample content attribute in the preset total model comprises the following steps: and estimating a first intermediate total sharing reflux amount estimated result based on the second sample user information, and estimating a second intermediate total sharing reflux amount estimated result based on the second sample content attribute.
Further, the estimation step of obtaining the first intermediate full sharing reflux estimation result based on the second sample user information includes: the preset full model obtains corresponding full sample portrait vectors and full sample sequence vectors according to the second sample user portrait and the second sample user behavior sequence; combines the full sample portrait vectors to obtain a full sample portrait comprehensive vector, and combines the full sample sequence vectors to obtain a full sample sequence comprehensive vector; obtains a full content attribute comprehensive vector, processed by an attention mechanism, based on the full sample portrait comprehensive vector and the full sample sequence comprehensive vector; and integrates the full sample portrait comprehensive vector, the full sample sequence comprehensive vector and the full content attribute comprehensive vector to obtain a full user comprehensive vector, and performs multi-objective optimization processing on the full user comprehensive vector to obtain the first intermediate full sharing reflux estimation result.
Obtaining the second intermediate full sharing reflux estimation result based on the second sample content attribute includes: processing the second sample content attribute through an attention mechanism and then a fully connected layer to obtain the second intermediate full sharing reflux estimation result.
In this embodiment, the sharing reflux amount is considered to be related to the content attribute and relatively stable, so a prediction of the sharing reflux amount after the user shares the content, made from the content attribute, is added. Further, in the preset full model of this embodiment, the prediction of the sharing reflux amount is divided into two parts: the first part is determined according to the sample user information, and the second part is determined according to the sample content attribute. The second sharing reflux loss is calculated according to the sharing reflux predictions of these two parts, and the second sharing reflux loss function is continuously optimized to obtain a more accurate prediction, as sketched below.
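A hedged sketch of this two-part estimation and of the second sharing reflux loss follows; the attention-plus-fully-connected attribute path mirrors the description of the content-attribute task, while the module names, dimensions and the use of a mean-squared-error loss are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttrRefluxHead(nn.Module):
    """Content-attribute path: attention over attribute vectors, then a fully connected layer."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.attention = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.fc = nn.Linear(dim, 1)

    def forward(self, attr_vecs):
        pooled, _ = self.attention(attr_vecs, attr_vecs, attr_vecs)
        return self.fc(pooled.mean(dim=1)).squeeze(-1)   # second intermediate sharing reflux estimate

def second_reflux_loss(user_path_reflux, attr_path_reflux, sample_reflux):
    # both the user-information estimate and the attribute estimate are optimized
    # toward the statistically observed sample sharing reflux amount
    return F.mse_loss(user_path_reflux, sample_reflux) + F.mse_loss(attr_path_reflux, sample_reflux)
```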
And step S540, training the preset full model according to the second sampling loss and the second sharing reflux loss to obtain the content recommendation full model determined by training.
Further, after the two loss function values of the preset full model (the second sampling loss and the second sharing reflux loss) are determined according to the second sample user information and the sample sharing reflux amount, the parameters of the preset full model can be adjusted based on the loss function values, and the process is repeated until a model training stop condition is reached (the condition can be customized according to the actual situation); the trained content recommendation full model is finally obtained according to the model parameters at the time the model training stop condition is reached.
In one embodiment, the first sample user information comprises user information over a first historical period of time and the second sample user information comprises user information over a second historical period of time; the first historical period of time is less than the second historical period of time.
The first historical time period and the second historical time period can be set according to actual conditions. In a specific embodiment, the first historical period of time may be set to an hour level, such as 8 hours; the second historical period of time may be set to a day level, such as 5 days, 8 days, etc.
In the related art, in the multi-target sequence recall approach, the model is trained on all of the data generated on the same day, but the acquisition and integration of these data are delayed, so the model is slow to update. For example, under a full model, the statistical period of the data required for predicting the Item reflux amount is one day, so the model cannot learn online and be updated in real time. Experiments show that, under day-level updating, the strategy fits the data of the previous day and performs poorly on the current day. If an incremental model with real-time learning is adopted, labels that cannot be obtained in real time (labels here contain attributes of different fields, such as a user's age, sex, video playing amount, etc.) cannot be processed, or are generally randomly initialized to a certain value for training, and the accuracy of the result obtained after such training is low. For example, when an incremental model is used, the input data do not contain the reflux amount field, and the original model cannot be used for training. To this end, the present application provides a content recommendation method.
The following application scenario applies the above content recommendation method. Specifically, the content recommendation method is applied in this scenario as follows:
In response to a content recommendation event, information such as the user portrait and user historical behavior sequence of the user to be recommended is acquired, the user portrait and the user historical behavior sequence are input into the trained incremental model, and the Top-K recommended contents output by the incremental model are obtained. The incremental model is obtained through training according to the sample user portraits, the sample user behavior sequences and the target sharing reflux amount, and the target sharing reflux amount is obtained based on the trained full model. The incremental model is updated at the hour level, and the full model is updated at the day level. Wherein:
an incremental model is designed based on the full-scale model, and a structural schematic diagram of the full-scale model in this embodiment is shown in fig. 6, and a structural schematic diagram of the incremental model is shown in fig. 7. The following is a description of some of the modules in the full and incremental models:
for introducing a sharing reflux quantity estimation module into a multi-objective optimization layer of the full-quantity model, the sharing reflux quantity after sharing content (item) of a user can be generated through training. The input data in the full-scale model contains the user portraits, the behavior sequences (i.e., the content most recently viewed by the user), and the content attributes. The user portrait and the user behavior sequence are directly modeled, and correspond to a user portrait module and a user behavior sequence module shown in fig. 6 respectively. Because the user behavior sequence is possible to be updated continuously, the prediction result of the part has dynamic property, the user behavior can be changed continuously along with the click of the video by the user, and the system can feed back at any time, so that the provided recommendation result has real-time property, the recommendation result is not fixed, and better individuation is realized.
For the attribute reflux amount prediction (AttrWgtTask) module in the structure shown in fig. 6, an auxiliary task added to the full-scale model is aimed at predicting the size of the shared reflux amount (corresponding to the second target shared reflux amount) through the Item attribute (content attribute). In this embodiment, the sharing reflux amount of the content is considered to have a relatively stable association relationship with the Item attribute. Therefore, in this embodiment, the amount of return after the user shares the Item is predicted by the attribute of the Item. And the user information prediction sharing reflux amount (Wgt Unit) module predicts the sharing reflux amount (corresponding to the first target sharing reflux amount) according to the user portrait and the user behavior sequence. Further, the data generated by the Wgt Unit in the full-scale model will calculate the final shared reflux Loss (Wgt Loss, corresponding to the second shared reflux Loss described above) together with the data generated by the AttrWgtTask module (Item feedback shown in fig. 6 represents the sample shared reflux), and by continuously optimizing Wgt Loss to obtain a more accurate prediction, the prediction will be used as a pseudo-label (filled in the Item feedback shown in fig. 7) of the input sample recall in the incremental model, so that the incremental model can be trained online.
Introducing an attribute regularization (ItemEmbedReg) module into the incremental model; in this embodiment, it is considered that the amount of reflow of the head video does not easily change, so that it is most suitable for anchoring the reference standard. In this embodiment, regularization processing is performed on the data predicted by the Wgt unit and the AttrWgtTask module, so that when the incremental model obtains the target sharing reflux amount from the full-scale model as a pseudo tag used in training, an excessive error is not generated after final training due to excessive data deviation.
In one embodiment, the process of sharing back traffic for training of incremental models with targets output by the trained full-scale model may be referred to as knowledge distillation.
The core idea of knowledge distillation (Knowledge Distillation, KD) is to migrate knowledge to get a small model more suitable for reasoning from a trained large model. The core idea of knowledge distillation is a Teacher-Student (T-S) model, where Teacher is the outputter of knowledge and Student is the recipient of knowledge, i.e., trained together using a soft target-aided hard target, where soft target comes from the predictive output of a large model.
This has the advantage that, compared with the hard target, the soft target contains greater information entropy and carries information about the relations between different classes [2]. For example, when dealing with a video multi-classification problem, although a video is classified as class A, it still has small-probability responses for the other classes; the softening (soft) process amplifies, when calculating the loss function, the information carried by these small probability values. In short, knowledge distillation is a simple way to remedy the shortage of supervision signals in classification problems.
In this embodiment, the logits of the Teacher network (Net-T) are denoted by $v_i$ and the logits of the Student network (Net-S) by $z_i$; $p_i^{T}$ is the value of the softmax output of Net-T at temperature $T$ on the $i$-th class, and $q_i^{T}$ is the value of the softmax output of Net-S at temperature $T$ on the $i$-th class; the ground-truth value on the $i$-th class is denoted $c_i$, taking 1 for a positive label and 0 for a negative label; the total number of labels is denoted $N$.
The general knowledge distillation process is divided into two stages: training the original model and training the reduced model. The first stage trains Net-T, with no restriction on its architecture, number of parameters, or whether it is an ensemble; that is, the original data and model are trained and the softened softmax function is calculated:

$$q_i = \frac{\exp(v_i / T)}{\sum_{j} \exp(v_j / T)}$$

where $T$ is the temperature, a hyper-parameter, and $v_i$ and $v_j$ are the logits (class scores) entering the multi-class softmax expression.
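For concreteness, the softened softmax above can be expressed as a small helper; the function name and tensor layout are illustrative.

```python
import torch

def softened_softmax(v: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    # v holds the logits v_i along the last dimension; T is the temperature.
    return torch.softmax(v / T, dim=-1)
```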
The second stage trains Net-S, i.e., distills the knowledge of Net-T into Net-S. The objective function of the distillation process is a weighted sum of the distill loss (corresponding to the soft target) and the student loss (corresponding to the hard target):

$$L = \alpha L_{soft} + \beta L_{hard}$$
The transfer set is fed into both Net-T and Net-S (the training set used to train Net-T can be directly reused as the transfer set). The softmax distribution produced by Net-T at a high temperature (a larger value of $T$) is used as the soft target, and the cross-entropy between this soft target and the softmax output of Net-S at the same temperature $T$ forms the first part of the loss function, $L_{soft}$:

$$L_{soft} = -\sum_{i}^{N} p_i^{T} \log\left(q_i^{T}\right), \qquad p_i^{T} = \frac{\exp(v_i / T)}{\sum_{k}^{N} \exp(v_k / T)}, \qquad q_i^{T} = \frac{\exp(z_i / T)}{\sum_{k}^{N} \exp(z_k / T)}$$
The cross-entropy between the softmax output of Net-S at $T = 1$ and the ground truth forms the second part of the loss function, $L_{hard}$:

$$L_{hard} = -\sum_{i}^{N} c_i \log\left(q_i^{1}\right)$$
where $q_i^{1}$ is the output of the standard softmax function, i.e., the distribution obtained when $T = 1$. The value of $T$ changes how much attention Net-S pays to the negative labels during training: at a lower temperature, less attention is paid to the negative labels, especially those significantly below the average; at a higher temperature, the values associated with the negative labels are relatively amplified, and Net-S pays relatively more attention to them.
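Putting the two parts together, a sketch of the distillation objective $L = \alpha L_{soft} + \beta L_{hard}$ in code; the temperature and the weights $\alpha$ and $\beta$ are illustrative hyper-parameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.7, beta: float = 0.3):
    p_teacher = F.softmax(teacher_logits / T, dim=-1)          # p_i^T (soft target)
    log_q_student = F.log_softmax(student_logits / T, dim=-1)  # log q_i^T
    l_soft = -(p_teacher * log_q_student).sum(dim=-1).mean()   # cross-entropy with soft target
    l_hard = F.cross_entropy(student_logits, labels)           # cross-entropy with ground truth (q_i^1)
    return alpha * l_soft + beta * l_hard
```

In the setting of this embodiment, the Teacher role is played by the content recommendation full-quantity model and the Student role by the incremental model, with the target sharing reflux amount serving as the soft supervision signal.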
According to the method, the knowledge distillation technique is applied to content recommendation, so that the model can participate in data training in real time, making up for the incremental data lacking sharing reflux amount data. An incremental model is designed on the basis of the full-quantity model, and by modifying the full-quantity model and adding modules to the incremental model, the filled-in data is kept as accurate as possible and the results become more accurate. The reflux amount after the user shares an Item is learned and trained through the Wgt Unit module, and the relation between the Item attribute and the reflux amount target is established through the AttrWgtTask module during full-model training; regularization is introduced by adding the ItemEmbedReg module in the incremental model, so that the data does not deviate excessively due to abnormal fluctuation, improving the robustness of the model. The user portrait, the user behavior sequence, and the Item attribute are comprehensively modeled through these three modules, so that the reflux amount data of the full-quantity model is more accurate and stable when distilled to the incremental model.
This application scenario applies the above content recommendation method. Specifically, the content recommendation method is applied to the "看一看" (Top Stories) feature of the WeChat application. The "看一看" recommendation is composed of recall logic, primary selection logic, and rank logic. The recall logic pulls (recalls) data according to the portrait information of a specific user along multiple dimensions such as precise personalization, generalized personalization, and popularity; the primary selection logic performs an initial screening on the large number of recall results according to specific rules (such as user-content relevance, timeliness, region, and diversity) to reduce the scale of the rank computation; the rank logic orders the final results according to a click-through-rate estimation model and presents them to the user. The main application scheme of the method in "看一看" is as follows:
Knowledge distillation: based on the knowledge distillation technique and idea, the recall data of the full-scale model is applied to the incremental model, so that the incremental model can be trained with the pseudo-labels that the full-scale model generates for its samples, achieving real-time training while keeping the data prediction stable (see the sketch after this list).
Regularization: regularization is also used to process the predicted output, so that different Items have a relatively stable reference quantity during recall prediction, which keeps the hour-level incremental model stable.
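A hedged end-to-end sketch of this loop is given below: the trained full model (teacher) fills in a pseudo shared reflux amount for each recalled sample, and the incremental model (student) is updated against it on the fly. Every module, method, and field name here is an assumption rather than an identifier from this embodiment.

```python
import torch

def incremental_update(full_model, incremental_model, optimizer, batch):
    with torch.no_grad():
        # Teacher output: the target shared reflux amount used as a pseudo-label.
        pseudo_return = full_model.predict_return(
            batch["user_portrait"], batch["behavior_seq"], batch["item_attr"])
    # Student losses: a recommendation (sampling) loss plus a shared-reflux-amount loss.
    rec_loss, return_loss = incremental_model.losses(
        batch["user_portrait"], batch["behavior_seq"],
        click_label=batch["click"], return_target=pseudo_return)
    loss = rec_loss + return_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```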
It should be understood that, although the steps in the flowcharts referred to in the above embodiments are shown in an order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least a part of the steps in the flowcharts referred to in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; the order of execution of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages in other steps.
In one embodiment, as shown in fig. 8, there is provided a content recommendation apparatus, which may employ a software module or a hardware module, or a combination of both, as a part of a computer device, and specifically includes: an acquisition module 810, an input module 820, and a result acquisition module 830, wherein:
An obtaining module 810, configured to obtain a user portrait of a user to be recommended and a user history behavior sequence in response to a content recommendation event;
an input module 820 for inputting user portraits, user history behavior sequences into the content recommendation incremental model determined by training;
the result obtaining module 830 is configured to obtain recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training.
The content recommendation device acquires the user portrait and the user historical behavior sequence of the user to be recommended, inputs them into the content recommendation incremental model determined through training, and acquires the recommended content output by that model; the content recommendation incremental model is trained based on the sample user portrait, the sample user historical behavior sequence, and the target sharing reflux amount, and the target sharing reflux amount is obtained from the content recommendation total model determined through training. The content recommendation total model in the device can be trained on more complete data, so the trained total model can output a more accurate target sharing reflux amount; the content recommendation incremental model can be trained on less data, so the model is updated faster. Because the incremental model is trained with the target sharing reflux amount output by the trained total model, the trained incremental model can recommend more accurate personalized content.
In one embodiment, the apparatus further comprises an incremental model training module, which comprises a first sample acquisition sub-module, a target sharing reflux amount acquisition sub-module, a first loss calculation sub-module, and a first adjustment sub-module, wherein:
the first sample acquisition sub-module is configured to acquire first sample user information, where the first sample user information includes a first sample user portrait and a first sample user behavior sequence; the target sharing reflux amount acquisition sub-module is used for acquiring the target sharing reflux amount based on the first sample user information and the content recommendation total quantity model determined through training; the first loss calculation sub-module is used for calculating a first sampling loss based on the first sample user information and calculating a first sharing reflux amount loss based on the first sample user information and the target sharing reflux amount; the first adjustment sub-module is used for training the preset incremental model according to the first sampling loss and the first sharing reflux amount loss to obtain the content recommendation incremental model determined by training.
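Under one reading, the two terms combined by the first adjustment sub-module could be a click-prediction loss (the first sampling loss) and a regression term against the target sharing reflux amount obtained from the total model (the first sharing reflux amount loss); the sketch below only illustrates that combination, with assumed names and weighting.

```python
import torch.nn.functional as F

def incremental_losses(click_logit, click_label, predicted_return, target_return,
                       return_weight: float = 1.0):
    # first sampling loss: click/recommendation prediction against the observed label
    first_sampling_loss = F.binary_cross_entropy_with_logits(click_logit, click_label)
    # first sharing reflux amount loss: regression to the teacher-provided target
    first_return_loss = F.mse_loss(predicted_return, target_return)
    return first_sampling_loss + return_weight * first_return_loss
```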
In one embodiment, the target sharing reflux amount obtaining sub-module of the apparatus includes: the content attribute determining unit is used for obtaining first sample content attributes of each content in the first sample user behavior sequence according to the first sample user behavior sequence; the input unit is used for inputting the first sample user information and the first sample content attribute into the content recommendation total model determined through training; and the result acquisition unit is used for acquiring the target sharing reflux quantity output by the content recommendation total quantity model determined through training.
In one embodiment, the target shared return amount output by the content recommendation total model includes: the first target shares the reflux quantity and the second target shares the reflux quantity; the first target sharing reflux amount is determined based on a first sample user portrait and a user behavior sequence; the second target shared reflux amount is determined based on the first sample content attribute.
In one embodiment, the incremental model training module of the above apparatus further includes: and the regularization submodule is used for restraining the target sharing reflux quantity in a regularization mode.
In one embodiment, the apparatus further comprises: a full model training module comprising: the second sample acquisition submodule is used for acquiring second sample user information, wherein the second sample user information comprises a second sample user portrait, a second sample user behavior sequence and sample sharing reflux quantity; a content attribute determination submodule for determining a second sample content attribute of each content in the second sample user behavior sequence; the second loss calculation sub-module is used for calculating a second sampling loss based on the second sample user information and calculating a second sharing reflux amount loss based on the second sample user information, the second sample content attribute and the sample sharing reflux amount; and the second adjusting sub-module is used for training the preset full-quantity model according to the second sampling loss and the second sharing reflux quantity loss to obtain a content recommendation full-quantity model determined by training.
In one embodiment, the first sample user information comprises user information over a first historical period of time and the second sample user information comprises user information over a second historical period of time; the first historical period of time is less than the second historical period of time.
For specific embodiments of the content recommendation device, reference may be made to the above embodiments of the content recommendation method, and the description thereof will not be repeated here. The respective modules in the content recommendation apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 9. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data such as content attribute embedding, content embedding and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a content recommendation method.
It will be appreciated by those skilled in the art that the structure shown in fig. 9 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the steps in the above-described method embodiments.
Those skilled in the art will appreciate that implementing all or part of the methods in the above embodiments may be accomplished by a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may include the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. The volatile memory may include Random Access Memory (RAM) or an external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), and the like.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples represent only a few embodiments of the present application, which are described specifically and in detail, but they are not to be construed as limiting the scope of the application. It should be noted that various modifications and improvements can be made by those skilled in the art without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be determined by the appended claims.

Claims (10)

1. A content recommendation method, the method comprising:
responding to a content recommendation event, and acquiring a user portrait of a user to be recommended and a user history behavior sequence;
inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training;
acquiring recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training;
the training process of the content recommendation incremental model comprises the following steps: acquiring first sample user information, the first sample user information comprising: a first sample user portrait, a first sample user behavior sequence; obtaining a first sample content attribute of each content in the first sample user behavior sequence according to the first sample user behavior sequence; inputting the first sample user information and the first sample content attribute into the content recommendation total model determined through training; acquiring target sharing reflux quantity output by the content recommendation total quantity model determined through training; calculating a first sampling loss based on the first sample user information, and calculating a first sharing reflux amount loss based on the first sample user information and a target sharing reflux amount; training a preset incremental model according to the first sampling loss and the first sharing reflux quantity loss to obtain a content recommendation incremental model determined by training;
The training process of the content recommendation full model comprises the following steps: acquiring second sample user information, wherein the second sample user information comprises a second sample user portrait, a second sample user behavior sequence and sample sharing reflux quantity; determining a second sample content attribute of each content in the second sample user behavior sequence; calculating a second sampling loss based on the second sample user information, and calculating a second sharing reflux amount loss based on the second sample user information, a second sample content attribute and a sample sharing reflux amount; training the preset full-quantity model according to the second sampling loss and the second sharing reflux quantity loss to obtain the content recommendation full-quantity model determined by training.
2. The content recommendation method according to claim 1, wherein the target sharing reflux amount output by the content recommendation total amount model comprises: the first target shares the reflux quantity and the second target shares the reflux quantity; the first target sharing reflux amount is determined based on the first sample user portrait and the user behavior sequence; the second target shared back volume is determined based on the first sample content attribute.
3. The content recommendation method according to claim 1, wherein the target sharing reflux amount is constrained in a regularized manner.
4. The content recommendation method according to claim 1, wherein the first sample user information includes user information within a first history period and the second sample user information includes user information within a second history period; the first historical time period is less than the second historical time period.
5. A content recommendation device, the device comprising:
the acquisition module is used for responding to the content recommendation event and acquiring a user portrait of the user to be recommended and a user history behavior sequence;
the input module is used for inputting the user portrait and the user history behavior sequence into a content recommendation increment model determined through training;
the result acquisition module is used for acquiring recommended content output by the content recommendation incremental model determined by training; the content recommendation incremental model is determined based on sample user portraits, sample user historical behavior sequences and target sharing reflux quantity training, and the target sharing reflux quantity is determined according to the content recommendation total model determined through training;
An incremental model training module comprising: the device comprises a first sample acquisition submodule, a target sharing reflux quantity acquisition submodule, an loss calculation submodule and a first adjustment submodule, wherein the target sharing reflux quantity acquisition submodule comprises a content attribute determining unit, an input unit and a result acquisition unit; wherein:
the first sample obtaining sub-module is configured to obtain first sample user information, where the first sample user information includes: a first sample user portrait, a first sample user behavior sequence;
the content attribute determining unit is configured to obtain a first sample content attribute of each content in the first sample user behavior sequence according to the first sample user behavior sequence;
the input unit is used for inputting the first sample user information and the first sample content attribute into the content recommendation total model determined through training;
the result acquisition unit is used for acquiring the target sharing reflux quantity output by the content recommendation total quantity model determined through training; the first loss calculation sub-module is used for calculating a first sampling loss based on the first sample user information and calculating a first sharing reflux amount loss based on the first sample user information and the target sharing reflux amount;
The first adjustment sub-module is used for training a preset incremental model according to the first sampling loss and the first sharing reflux quantity loss to obtain a content recommendation incremental model determined by training;
the full-quantity model training module comprises a second sample acquisition sub-module, a content attribute determination sub-module, a second loss calculation sub-module and a second adjustment sub-module; wherein:
the second sample acquisition submodule is used for acquiring second sample user information, wherein the second sample user information comprises a second sample user portrait, a second sample user behavior sequence and sample sharing reflux quantity;
the content attribute determining submodule is used for determining a second sample content attribute of each content in the second sample user behavior sequence;
the second loss calculation sub-module is configured to calculate a second sampling loss based on the second sample user information, and calculate a second sharing reflux amount loss based on the second sample user information, a second sample content attribute, and a sample sharing reflux amount;
and the second adjustment submodule is used for training a preset full-quantity model according to the second sampling loss and the second sharing reflux quantity loss to obtain a content recommendation full-quantity model determined by training.
6. The content recommendation device according to claim 5, wherein the target sharing reflux amount output by the content recommendation total amount model comprises: the first target shares the reflux quantity and the second target shares the reflux quantity; the first target sharing reflux amount is determined based on the first sample user portrait and the user behavior sequence; the second target shared back volume is determined based on the first sample content attribute.
7. The content recommendation device of claim 5 wherein the incremental model training module further comprises: and the regularization submodule is used for restraining the target sharing reflux quantity in a regularization mode.
8. The content recommendation device of claim 5 wherein the first sample user information comprises user information over a first historical period of time and the second sample user information comprises user information over a second historical period of time; the first historical time period is less than the second historical time period.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 4 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method of any one of claims 1 to 4.
CN202110651154.7A 2021-06-10 2021-06-10 Content recommendation method, device, computer equipment and storage medium Active CN115470397B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110651154.7A CN115470397B (en) 2021-06-10 2021-06-10 Content recommendation method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110651154.7A CN115470397B (en) 2021-06-10 2021-06-10 Content recommendation method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115470397A CN115470397A (en) 2022-12-13
CN115470397B true CN115470397B (en) 2024-04-05

Family

ID=84364458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110651154.7A Active CN115470397B (en) 2021-06-10 2021-06-10 Content recommendation method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115470397B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107203518A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 Method, system and device, the electronic equipment of on-line system personalized recommendation
WO2019001359A1 (en) * 2017-06-30 2019-01-03 众安信息技术服务有限公司 Data processing method and data processing apparatus
CN111461826A (en) * 2020-03-30 2020-07-28 京东数字科技控股有限公司 Information pushing method and device, storage medium and electronic device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107203518A (en) * 2016-03-16 2017-09-26 阿里巴巴集团控股有限公司 Method, system and device, the electronic equipment of on-line system personalized recommendation
WO2019001359A1 (en) * 2017-06-30 2019-01-03 众安信息技术服务有限公司 Data processing method and data processing apparatus
CN111461826A (en) * 2020-03-30 2020-07-28 京东数字科技控股有限公司 Information pushing method and device, storage medium and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Analysis of library mobile user behavior based on big data mining technology; Sun Hui; Modern Electronics Technique; 2020-09-15 (18); pp. 172-175 *

Also Published As

Publication number Publication date
CN115470397A (en) 2022-12-13

Similar Documents

Publication Publication Date Title
CN111538912B (en) Content recommendation method, device, equipment and readable storage medium
US20210027160A1 (en) End-to-end deep collaborative filtering
Zhao et al. Deep reinforcement learning for page-wise recommendations
WO2021159776A1 (en) Artificial intelligence-based recommendation method and apparatus, electronic device, and storage medium
CN111382361B (en) Information pushing method, device, storage medium and computer equipment
CN110781321B (en) Multimedia content recommendation method and device
CN111966914B (en) Content recommendation method and device based on artificial intelligence and computer equipment
CN111191092B (en) Label determining method and label determining model training method
CN110795657B (en) Article pushing and model training method and device, storage medium and computer equipment
CN110717099B (en) Method and terminal for recommending film
CN110941764A (en) Object recommendation method and device, computer equipment and storage medium
CN112749330B (en) Information pushing method, device, computer equipment and storage medium
CN111858969B (en) Multimedia data recommendation method, device, computer equipment and storage medium
CN115048586B (en) Multi-feature-fused news recommendation method and system
CN112149604A (en) Training method of video feature extraction model, video recommendation method and device
CN116935170A (en) Processing method and device of video processing model, computer equipment and storage medium
CN115482021A (en) Multimedia information recommendation method and device, electronic equipment and storage medium
CN114817692A (en) Method, device and equipment for determining recommended object and computer storage medium
CN114443671A (en) Recommendation model updating method and device, computer equipment and storage medium
CN112380427A (en) User interest prediction method based on iterative graph attention network and electronic device
CN115470397B (en) Content recommendation method, device, computer equipment and storage medium
Zeng et al. User Personalized Recommendation Algorithm Based on GRU Network Model in Social Networks
CN113538030B (en) Content pushing method and device and computer storage medium
CN114359803A (en) Video processing method, apparatus, device, medium, and computer program product
Shashikala et al. Enhanced Movie Recommendation System Using Temporal Gradient Entropy

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant