CN114417156B - Training method and device for content recommendation model, server and storage medium - Google Patents

Training method and device for content recommendation model, server and storage medium

Info

Publication number
CN114417156B
Authority
CN
China
Prior art keywords
sample
negative
recommendation
account
information
Prior art date
Legal status
Active
Application number
CN202210061526.5A
Other languages
Chinese (zh)
Other versions
CN114417156A
Inventor
赵致辰
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202210061526.5A
Publication of CN114417156A
Application granted
Publication of CN114417156B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0631 Item recommendations

Abstract

The disclosure relates to a training method and apparatus for a content recommendation model, a server, and a storage medium. The method includes the following steps: in the process of training a content recommendation model with a sample recommendation information set, if the current sample is a positive sample of a first sample account, sampling a first negative sample and a second negative sample from the sample recommendation information set, where the positive sample is sample recommendation information that the first sample account has interacted with, the first negative sample is sample recommendation information that a second sample account has interacted with, the second negative sample is obtained by randomly sampling the sample recommendation information set, and the second sample account is any sample account other than the first sample account; inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model to obtain a predicted recommendation order; and training the content recommendation model based on the difference between the predicted recommendation order and an expected recommendation order. The disclosure can improve the accuracy of a content recommendation model trained by contrastive learning.

Description

Content recommendation model training method and device, server and storage medium
Technical Field
The present disclosure relates to the field of information recommendation technologies, and in particular, to a method and an apparatus for training a content recommendation model, a server, and a storage medium.
Background
With the development of information recommendation technology, content recommendation models are used to recommend information content to users. Such a model can be trained by contrastive learning: for some users, information they have consumed is selected from a sample pool as positive samples, other information is selected at random as negative samples, and the recommendation model is then trained by contrasting the two.
However, when the training of the content recommendation model is implemented by contrastive learning in this way, the negative samples are selected largely at random, so the accuracy of the content recommendation model obtained by such training is relatively low.
Disclosure of Invention
The disclosure provides a training method and apparatus for a content recommendation model, an electronic device, and a storage medium, to at least solve the problem in the related art that a content recommendation model trained by contrastive learning has low accuracy. The technical solution of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a method for training a content recommendation model, including:
in the process of training a content recommendation model with a sample recommendation information set, if the current sample is a positive sample of a first sample account, sampling a first negative sample and a second negative sample from the sample recommendation information set; the positive sample is sample recommendation information that the first sample account has interacted with; the first negative sample is sample recommendation information that a second sample account has interacted with; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any one of the sample accounts, and the second sample account is any sample account other than the first sample account;
inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model to obtain a predicted recommendation order for the first sample account;
training the content recommendation model based on a difference between the predicted recommendation order and an expected recommendation order for the first sample account; in the expected recommendation order, the positive sample is ranked before the first negative sample and the second negative sample.
In an exemplary embodiment, the sampling a first negative sample in the sample recommendation information set includes: sampling from a first negative sample set to obtain a first negative sample; the first negative sample set is composed of sample recommendation information interacted with the second sample account in the sample recommendation information set.
In an exemplary embodiment, the sampling a first negative example and a second negative example in the set of sample recommendation information includes: sampling from the first negative sample set to obtain the first negative sample, and sampling from the sample recommendation information set to obtain the second negative sample; the proportion between the number of the first negative samples and the number of the second negative samples meets a preset proportion relation.
In an exemplary embodiment, the sampling from the first negative sample set to obtain the first negative sample includes: acquiring the occurrence frequency corresponding to each sample recommendation information contained in the first negative sample set; the occurrence frequency is used for representing the interaction times of the second sample account on each sample recommendation information in the first negative sample set; and determining the sampling weight of each sample recommendation information in the first negative sample set according to the occurrence frequency, and sampling from the first negative sample set according to the sampling weight to obtain the first negative sample.
In an exemplary embodiment, the determining, according to the occurrence frequency, a sampling weight of each sample recommendation information in the first negative sample set includes: if the current sample recommendation information is the sample recommendation information with the occurrence frequency being greater than a preset first occurrence frequency and less than a preset second occurrence frequency in the first negative sample set, setting the sampling weight of the current sample recommendation information to be greater than the sampling weights of other sample recommendation information; the other sample recommendation information is sample recommendation information of which the occurrence frequency is less than or equal to the first occurrence frequency or greater than or equal to the second occurrence frequency in the first negative sample set; the first frequency of occurrence is less than the second frequency of occurrence.
In an exemplary embodiment, after the training of the content recommendation model, the method further includes: adding the positive samples to the first set of negative samples.
In an exemplary embodiment, the inputting the positive sample, the first negative sample, and the second negative sample into the content recommendation model respectively to obtain the predicted recommendation order of the first sample account includes: respectively inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model, and obtaining, through the content recommendation model, predicted recommendation probabilities for recommending to the first sample account, which correspond to the positive sample, the first negative sample and the second negative sample, respectively; and sequencing the positive sample, the first negative sample and the second negative sample according to the magnitude relation of the prediction recommendation probability to obtain the prediction recommendation sequence.
In an exemplary embodiment, the method further comprises: in the process of training a content recommendation model by adopting a sample recommendation information set, if a current sample is a negative sample of a first sample account, setting the expected recommendation probability of the negative sample for the first sample account to be zero; the negative sample is sample recommendation information which is not interacted with the first sample account; inputting the negative examples into the content recommendation model to obtain a predicted recommendation probability for recommending to the first sample account for the negative examples; training the content recommendation model based on a difference between a predicted recommendation probability of the negative examples recommending to the first sample account and an expected recommendation probability of the negative examples aiming at the first sample account.
In an exemplary embodiment, after the training of the content recommendation model, the method further includes: responding to a content recommendation request, and acquiring account characteristics of a target account corresponding to the content recommendation request and information characteristics of information to be recommended; inputting the account characteristics and the information characteristics into the trained content recommendation model to obtain the predicted recommendation probability of each piece of information to be recommended for the target account; and screening recall information aiming at the target account from the information to be recommended according to the sequence of the predicted recommendation probability.
According to a second aspect of the embodiments of the present disclosure, there is provided a training apparatus for a content recommendation model, including:
the training sample acquisition unit is configured to, in the process of training the content recommendation model with a sample recommendation information set, sample a first negative sample and a second negative sample from the sample recommendation information set if the current sample is a positive sample of a first sample account; the positive sample is sample recommendation information that the first sample account has interacted with; the first negative sample is sample recommendation information that a second sample account has interacted with; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any one of the sample accounts, and the second sample account is any sample account other than the first sample account;
a prediction order obtaining unit configured to input the positive sample, the first negative sample and the second negative sample into the content recommendation model to obtain a predicted recommendation order for the first sample account;
a recommendation model training unit configured to train the content recommendation model based on a difference between the predicted recommendation order and an expected recommendation order for the first sample account; in the expected recommendation order, the positive sample is ranked before the first negative sample and the second negative sample.
In an exemplary embodiment, the training sample obtaining unit is further configured to perform sampling from a first negative sample set to obtain the first negative sample; the first negative sample set is composed of sample recommendation information interacted with the second sample account in the sample recommendation information set.
In an exemplary embodiment, the training sample obtaining unit is further configured to perform sampling from the first negative sample set to obtain the first negative sample, and sampling from the sample recommendation information set to obtain the second negative sample; the proportion between the number of the first negative samples and the number of the second negative samples meets a preset proportion relation.
In an exemplary embodiment, the training sample obtaining unit is further configured to perform obtaining of occurrence frequencies corresponding to sample recommendation information included in the first negative sample set; the occurrence frequency is used for representing the interaction times of the second sample account on each sample recommendation information in the first negative sample set; and determining the sampling weight of each sample recommendation information in the first negative sample set according to the occurrence frequency, and sampling from the first negative sample set according to the sampling weight to obtain the first negative sample.
In an exemplary embodiment, the training sample obtaining unit is further configured to execute, if the current sample recommendation information is sample recommendation information of which the occurrence frequency is greater than a preset first occurrence frequency and less than a preset second occurrence frequency in the first negative sample set, setting the sampling weight of the current sample recommendation information to be greater than the sampling weights of other sample recommendation information; the other sample recommendation information is sample recommendation information of which the occurrence frequency is less than or equal to the first occurrence frequency or greater than or equal to the second occurrence frequency in the first negative sample set; the first frequency of occurrence is less than the second frequency of occurrence.
In an exemplary embodiment, the apparatus further comprises: a first set construction module for adding the positive samples to the first set of negative samples.
In an exemplary embodiment, the prediction order obtaining unit is further configured to perform inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model, and obtain, through the content recommendation model, prediction recommendation probabilities for recommending to the first sample account, which correspond to the positive sample, the first negative sample and the second negative sample, respectively; and sequencing the positive sample, the first negative sample and the second negative sample according to the magnitude relation of the prediction recommendation probability to obtain the prediction recommendation sequence.
In an exemplary embodiment, the apparatus further comprises: the negative sample training unit is configured to execute the process of training the content recommendation model by adopting the sample recommendation information set, and if the current sample is a negative sample of a first sample account, the expected recommendation probability of the negative sample for the first sample account is set to be zero; the negative sample is sample recommendation information which is not interacted with the first sample account; inputting the negative examples into the content recommendation model to obtain a predicted recommendation probability for recommending to the first sample account for the negative examples; training the content recommendation model based on a difference between a predicted recommendation probability of the negative examples recommending to the first sample account and an expected recommendation probability of the negative examples aiming at the first sample account.
In an exemplary embodiment, the apparatus further comprises: the recall information acquisition unit is configured to execute the steps of responding to a content recommendation request, and acquiring the account characteristics of a target account corresponding to the content recommendation request and the information characteristics of each piece of information to be recommended; inputting the account characteristics and the information characteristics into the trained content recommendation model to obtain the predicted recommendation probability of each piece of information to be recommended for the target account; and screening recall information aiming at the target account from the information to be recommended according to the sequence of the predicted recommendation probability.
According to a third aspect of the embodiments of the present disclosure, there is provided a server, including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the method of training a content recommendation model as defined in any one of the embodiments of the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions, when executed by a processor of a server, enable the server to perform a method of training a content recommendation model as described in any one of the embodiments of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, comprising instructions which, when executed by a processor of a server, enable the server to perform the method of training a content recommendation model as defined in any one of the first aspects.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
in the process of training a content recommendation model with a sample recommendation information set, if the current sample is a positive sample of a first sample account, a first negative sample and a second negative sample are sampled from the sample recommendation information set; the positive sample is sample recommendation information that the first sample account has interacted with; the first negative sample is sample recommendation information that a second sample account has interacted with; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any one of the sample accounts, and the second sample account is any sample account other than the first sample account. The positive sample, the first negative sample and the second negative sample are input into the content recommendation model to obtain a predicted recommendation order for the first sample account, and the content recommendation model is trained based on the difference between the predicted recommendation order and an expected recommendation order for the first sample account, in which the positive sample is ranked before the first negative sample and the second negative sample. In this disclosure, when the content recommendation model is trained by contrastive learning, the negative samples include both the second negative sample obtained by randomly sampling the sample recommendation information set and the first negative sample drawn from sample recommendation information that a second sample account has interacted with. This reduces the randomness of negative-sample selection, and using the easily misjudged sample recommendation information interacted with by a second sample account as a negative sample for the first sample account further improves the accuracy of the content recommendation model trained by contrastive learning.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a method of training a content recommendation model in accordance with an exemplary embodiment.
FIG. 2 is a flow diagram illustrating sampling to obtain a first negative sample, according to an exemplary embodiment.
FIG. 3 is a flow diagram illustrating a process for deriving a predicted recommendation sequence, according to an example embodiment.
FIG. 4 is a flow chart illustrating a method of training a content recommendation model according to another exemplary embodiment.
FIG. 5 is a flow diagram illustrating filtering of recall information according to an exemplary embodiment.
FIG. 6 is a block diagram illustrating a training apparatus of a content recommendation model according to an example embodiment.
FIG. 7 is a block diagram illustrating a server in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
It should be further noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are information and data authorized by the user or sufficiently authorized by each party.
Fig. 1 is a flowchart illustrating a training method of a content recommendation model according to an exemplary embodiment, and as shown in fig. 1, the training method of the content recommendation model is used in a server and includes the following steps.
In step S101, in the process of training the content recommendation model with the sample recommendation information set, if the current sample is a positive sample of the first sample account, a first negative sample and a second negative sample are sampled from the sample recommendation information set; the positive sample is sample recommendation information that the first sample account has interacted with; the first negative sample is sample recommendation information that the second sample account has interacted with; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any one of the sample accounts, and the second sample account is any sample account other than the first sample account.
The sample recommendation information set is the set of sample recommendation information used to train the content recommendation model, and each piece of sample recommendation information may be any kind of content recommended to a user account, for example picture, text or video information. A positive sample is a piece of sample recommendation information in the set with which the first sample account has interacted. For example, if a piece of sample recommendation information is a video recommended to the first sample account and the first sample account interacted with it, for instance by clicking to play it or by commenting on it, then that recommended video can serve as a positive sample for the first sample account. Similarly, if a piece of sample recommendation information is a picture recommended to the first sample account and the first sample account interacted with it, for instance by collecting or sharing it, the recommended picture can likewise serve as a positive sample for the first sample account.
A first negative sample is a piece of sample recommendation information in the set with which a second sample account has interacted, where a second sample account is any sample account other than the first sample account. For example, suppose the sample accounts include sample account A, sample account B and sample account C. If sample account A is the first sample account, then sample account B and sample account C are both second sample accounts, and any piece of sample recommendation information that sample account B or sample account C has interacted with can serve as a first negative sample. If instead sample account B is the first sample account, then sample account A and sample account C are the second sample accounts, and sample recommendation information that sample account A or sample account C has interacted with can serve as a first negative sample. A second negative sample is sample recommendation information sampled directly and at random from the sample recommendation information set.
Specifically, when the recommendation model is trained with the sample recommendation information set, if the currently acquired sample is a positive sample of some first sample account, that is, the first sample account has interacted with that sample recommendation information, the server further samples part of the information in the set that a second sample account has interacted with as first negative samples, and randomly samples another part of the set as second negative samples.
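As a concrete illustration of step S101, the following Python sketch shows one possible way to draw the first and second negative samples when the current sample is a positive sample. The function name, the structure of the sample pool and interaction records, and the numbers of negatives drawn are assumptions for illustration and are not prescribed by this disclosure.

```python
import random

def sample_negatives(sample_pool, interactions, first_account,
                     num_first_neg=2, num_second_neg=8):
    """Illustrative sketch of step S101: draw first and second negative samples.

    sample_pool   -- list of all sample recommendation information ids
    interactions  -- dict mapping account id -> set of item ids it interacted with
    first_account -- the first sample account whose positive sample is being trained on
    """
    # First negative samples: items that some second sample account
    # (any account other than the first sample account) interacted with.
    first_neg_pool = [item
                      for account, items in interactions.items()
                      if account != first_account
                      for item in items]
    first_negatives = random.choices(first_neg_pool, k=num_first_neg)

    # Second negative samples: items drawn uniformly at random
    # from the whole sample recommendation information set.
    second_negatives = random.choices(sample_pool, k=num_second_neg)
    return first_negatives, second_negatives
```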
In step S102, the positive sample, the first negative sample, and the second negative sample are respectively input to the content recommendation model, so as to obtain a predicted recommendation order of the first sample account.
The predicted recommendation order is the order, produced by the content recommendation model, in which the sample recommendation information input to the model, namely the positive sample, the first negative sample and the second negative sample, would be recommended to the first sample account.
In step S103, the content recommendation model is trained based on the difference between the predicted recommendation order and the expected recommendation order for the first sample account; in the expected recommendation order, the positive sample is ranked before the first negative sample and the second negative sample.
The expected recommendation order is the order in which the content recommendation model is expected to rank the positive sample, the first negative sample and the second negative sample. Because the first sample account has interacted with the positive sample, the positive sample is more likely to meet the recommendation needs of the first sample account than the first or second negative sample and should be recommended to the first sample account with higher priority; accordingly, in the expected recommendation order the positive sample is ranked before the first negative sample and the second negative sample. After the predicted recommendation order is obtained in step S102, the content recommendation model can be trained using the difference between the predicted and expected recommendation orders. The goal of this optimization is that the positive sample receive a higher predicted recommendation probability, that is, a higher score, than the first and second negative samples, so that the content recommendation model preferentially recommends the positive sample to the first sample account rather than the first or second negative sample.
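One common way to realize "rank the positive sample before the negative samples" in a contrastive setting is a softmax cross-entropy (InfoNCE-style) loss over the scores of the positive sample and all negative samples; the sketch below illustrates this idea. The loss choice and tensor shapes are assumptions for illustration only; the disclosure only requires that the model be optimized so the positive sample is ranked first.

```python
import torch
import torch.nn.functional as F

def contrastive_ranking_loss(pos_score, neg_scores):
    """pos_score: tensor of shape (batch,); neg_scores: tensor of shape (batch, num_negatives).

    Treats the positive sample as the "correct class" among all candidates,
    so minimizing the loss pushes its predicted score above every negative's.
    """
    logits = torch.cat([pos_score.unsqueeze(1), neg_scores], dim=1)  # (batch, 1 + num_neg)
    target = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)  # positive at index 0
    return F.cross_entropy(logits, target)
```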
In the above training method for a content recommendation model, in the process of training the model with a sample recommendation information set, if the current sample is a positive sample of a first sample account, a first negative sample and a second negative sample are sampled from the sample recommendation information set; the positive sample is sample recommendation information that the first sample account has interacted with, the first negative sample is sample recommendation information that a second sample account has interacted with, the second negative sample is obtained by randomly sampling the sample recommendation information set, the first sample account is any one of the sample accounts, and the second sample account is any sample account other than the first sample account. The positive sample, the first negative sample and the second negative sample are input into the content recommendation model to obtain a predicted recommendation order for the first sample account, and the model is trained based on the difference between the predicted recommendation order and an expected recommendation order in which the positive sample is ranked before both kinds of negative samples. Because the negative samples include both the randomly sampled second negative sample and the first negative sample drawn from sample recommendation information that a second sample account has interacted with, the randomness of negative-sample selection is reduced, and using the easily misjudged sample recommendation information interacted with by a second sample account as a negative sample for the first sample account further improves the accuracy of the content recommendation model trained by contrastive learning.
In an exemplary embodiment, step S101 may further include: sampling from the first negative sample set to obtain a first negative sample; the first negative sample set is composed of sample recommendation information interacted with the second sample account in the sample recommendation information set.
In this embodiment, the server may form a corresponding sample set with the sample recommendation information interacted with the second sample account in advance as the first negative sample set, and when the server performs the first negative sample sampling, the server may directly sample from the first negative sample set, so as to obtain the first negative sample.
For example, the first negative sample set may include sample recommendation information A, sample recommendation information B and sample recommendation information C. When sampling a first negative sample, the server may sample randomly from this first negative sample set, that is, the first negative sample is obtained by sampling from sample recommendation information A, sample recommendation information B and sample recommendation information C.
In this embodiment, a set formed by sample recommendation information interacted with a second sample account, that is, a first negative sample set, may be pre-constructed in the server, so that in the sampling process of the first negative sample, the sample recommendation information interacted with the second sample account may be obtained by sampling in the first negative sample set.
Further, step S101 may further include: sampling from the first negative sample set to obtain a first negative sample, and sampling from the sample recommendation information set to obtain a second negative sample; the proportion between the number of the first negative samples and the number of the second negative samples meets a preset proportion relation.
To keep the numbers of first and second negative samples balanced and thereby help ensure the accuracy of the trained content recommendation model, a ratio between the number of first negative samples and the number of second negative samples can be set in advance in this embodiment, and the server samples according to this ratio. For example, if a total of 10 negative samples is needed and the ratio between first and second negative samples is set to 2:8, the server may randomly sample 2 pieces of sample recommendation information from the first negative sample set as first negative samples and 8 pieces from the sample recommendation information set as second negative samples. If the ratio is set to 4:6, the server may sample 4 pieces from the first negative sample set and 6 pieces from the sample recommendation information set, and so on, thereby keeping the numbers of first and second negative samples relatively balanced.
In this embodiment, the server samples first negative samples from the first negative sample set and second negative samples from the sample recommendation information set according to the preset ratio between their numbers. This keeps the numbers of first and second negative samples relatively balanced at every sampling step during model training and avoids either kind of negative sample being too scarce in any sampling round, which further reduces the randomness of negative-sample selection and improves the accuracy of the trained content recommendation model.
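The small helper below illustrates how a preset ratio (for example 2:8 or 4:6, as in the examples above) could be turned into per-source sample counts before sampling; the function name and rounding rule are illustrative assumptions.

```python
def split_by_ratio(total_negatives, first_ratio, second_ratio):
    """Split a total negative-sample budget according to a preset ratio,
    e.g. split_by_ratio(10, 2, 8) -> (2, 8) and split_by_ratio(10, 4, 6) -> (4, 6)."""
    num_first = round(total_negatives * first_ratio / (first_ratio + second_ratio))
    return num_first, total_negatives - num_first
```

The returned pair could then be passed as the counts of first and second negative samples to whatever sampling routine is used.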
In an exemplary embodiment, as shown in fig. 2, sampling the first negative sample from the first set of negative samples may further include:
in step S201, obtaining an occurrence frequency corresponding to each piece of sample recommendation information included in the first negative sample set; the occurrence frequency is used for representing the interaction times of the second sample account on each sample recommendation information in the first negative sample set.
In this embodiment, the first negative sample set may contain duplicate entries of the same sample recommendation information. A given piece of sample recommendation information may have been interacted with by sample account A, by sample account B, or by both; for example, both sample account A and sample account B may have clicked to play the same recommended video, in which case that recommended video appears repeatedly in the first negative sample set. The occurrence frequency is the number of times each piece of sample recommendation information appears in the first negative sample set, and it represents how many times second sample accounts have interacted with that piece of information. In general, the higher the occurrence frequency, the more sample accounts have exhibited user behavior toward that sample content, that is, the more interactions second sample accounts have had with that sample recommendation information.
In step S202, according to the occurrence frequency, a sampling weight of each piece of sample recommendation information in the first negative sample set is determined, and a first negative sample is obtained by sampling from the first negative sample set according to the sampling weight.
The sampling weight represents the sampling rate of each piece of sample recommendation information in the first negative sample set: the larger the sampling weight of a piece of sample recommendation information, the more likely it is to be sampled as a first negative sample. For example, the first negative sample set may contain sample recommendation information A, B, C, D and E. If the sampling weights of sample recommendation information A, B and C are larger than those of D and E, then when the server samples a first negative sample, A, B and C are more likely to be chosen than D and E. In this way the sampled first negative samples are adapted to the occurrence frequencies, which further improves the sampling quality of the first negative samples.
In this embodiment, when sampling a first negative sample, the server determines a sampling weight for each piece of sample recommendation information in the first negative sample set based on its occurrence frequency, and then samples the first negative sample according to these weights. Because the occurrence frequency represents the number of interactions by second sample accounts, the sampled first negative samples are adapted to those interaction counts, which further improves their sampling quality.
Further, step S202 may further include: if the current sample recommendation information is sample recommendation information in the first negative sample set, the occurrence frequency of which is greater than a preset first occurrence frequency and is less than a preset second occurrence frequency, setting the sampling weight of the current sample recommendation information to be greater than the sampling weights of other sample recommendation information; the other sample recommendation information is sample recommendation information of which the occurrence frequency is less than or equal to a first occurrence frequency or greater than or equal to a second occurrence frequency in the first negative sample set; the first frequency of occurrence is less than the second frequency of occurrence.
The current sample recommendation information is any piece of sample recommendation information in the first negative sample set, and the first and second occurrence frequencies are two frequency thresholds preset in the server, with the first occurrence frequency smaller than the second. If a piece of sample recommendation information appears in the first negative sample set only rarely, it may be a positive sample of only a few user accounts, that is, it may have been triggered only occasionally; such information is unlikely to become a positive sample of the first sample account, but it also contributes little to discriminating between samples. Conversely, if a piece of sample recommendation information appears in the first negative sample set very frequently, it can be regarded as widely popular content; using it as a first negative sample of the first sample account could introduce training errors, because such information has a high probability of becoming a positive sample of the first sample account. Therefore, to ensure the accuracy of first-negative-sample sampling, this embodiment preferentially selects as first negative samples the sample recommendation information whose occurrence frequency is greater than the first occurrence frequency and less than the second occurrence frequency: such information is given a larger sampling weight than the other sample recommendation information whose occurrence frequency is less than or equal to the first occurrence frequency or greater than or equal to the second occurrence frequency.
For example, suppose the first occurrence frequency is set to 2 and the second occurrence frequency to 10, and a first negative sample set contains sample recommendation information A, B, C and D with occurrence frequencies 1, 4, 8 and 13, respectively. Because the occurrence frequencies of sample recommendation information B and C lie between the first and second occurrence frequencies, B and C are given larger sampling weights than A and D; that is, when the server samples a first negative sample, B and C are more likely to be sampled than A and D.
In this embodiment, the server assigns a larger sampling weight to sample content whose occurrence frequency is greater than the preset first occurrence frequency and less than the preset second occurrence frequency, so that such sample recommendation information is preferentially selected as the first negative sample. Compared with using information whose occurrence frequency is at or below the first occurrence frequency, the first negative samples selected in this way are more informative for discrimination; compared with using information whose occurrence frequency is at or above the second occurrence frequency, they reduce training errors. The sampling quality of the first negative samples is therefore ensured, which further improves the accuracy of the trained content recommendation model.
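The sketch below combines steps S201 and S202 with the frequency-band rule described above: items whose occurrence frequency lies strictly between a first and a second threshold receive a larger sampling weight. The default thresholds follow the 2 and 10 of the example above, while the concrete weight values and the use of `random.choices` are assumptions for illustration.

```python
import random
from collections import Counter

def sample_first_negative(first_neg_pool, first_freq=2, second_freq=10,
                          high_weight=1.0, low_weight=0.1, k=1):
    """first_neg_pool: list of item ids interacted with by second sample accounts;
    duplicates are kept, so an item's multiplicity is its occurrence frequency."""
    # Step S201: occurrence frequency of each item in the first negative sample set.
    freq = Counter(first_neg_pool)

    # Step S202: items in the (first_freq, second_freq) band get a larger weight,
    # items outside the band (too rare or too popular) get a smaller one.
    items = list(freq)
    weights = [high_weight if first_freq < freq[item] < second_freq else low_weight
               for item in items]
    return random.choices(items, weights=weights, k=k)
```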
In addition, after step S103, the method may further include: adding the positive samples to the first set of negative samples.
For example, suppose sample recommendation information A is a positive sample for sample account A, meaning that sample account A has interacted with it. If sample account B is taken as the first sample account, then sample account A, being another user account, is a second sample account, so sample recommendation information A is sample recommendation information that a second sample account has interacted with and therefore belongs to the first negative samples. The server therefore needs to add this positive sample to the first negative sample set in which the first negative samples are stored. Specifically, after the content recommendation model has been trained with a positive sample, the positive sample used in that round of training can be added to the first negative sample set to further enrich the sample recommendation information it contains.
In this embodiment, the positive samples used to train the content recommendation model are added to the first negative sample set, so that the first negative sample set is updated during training and the sample recommendation information it contains is continuously enriched and kept up to date.
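A minimal sketch of the update described above, assuming the first negative sample pool is kept as a plain list so that repeated additions of the same item also raise its occurrence frequency:

```python
def update_first_negative_pool(first_neg_pool, positive_item):
    """After a training step on a positive sample, append it to the pool so it can
    later serve as a first negative sample for other sample accounts."""
    first_neg_pool.append(positive_item)
```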
In an exemplary embodiment, as shown in fig. 3, step S102 may further include:
in step S301, the positive sample, the first negative sample, and the second negative sample are respectively input to the content recommendation model, and the predicted recommendation probabilities that the positive sample, the first negative sample, and the second negative sample are respectively corresponding to the first sample account and recommend to the first sample account are obtained through the content recommendation model.
The predicted recommendation probability refers to the probability of recommending to the first sample account corresponding to the positive sample content, the first negative sample content and the second negative sample content respectively, which is output by the content recommendation model. The greater the predicted recommendation probability, the greater the possibility that the first sample account has an interactive action on the sample recommendation information. Specifically, after the positive sample, the first negative sample, and the second negative sample are obtained in step S101, the obtained positive sample, the obtained first negative sample, and the obtained second negative sample may be further input into the content recommendation model, and the predicted recommendation probabilities corresponding to the positive sample, the first negative sample, and the second negative sample are obtained through output by the content recommendation model.
In step S302, the positive samples, the first negative samples, and the second negative samples are sorted according to the magnitude relationship of the prediction recommendation probability to obtain a prediction recommendation order.
Because the predicted recommendation probability represents how likely the first sample account is to interact with a piece of sample recommendation information, the recommendation order of the sample recommendation information can be determined from the relative magnitudes of the scores represented by these probabilities. Therefore, after the predicted recommendation probabilities are obtained in step S301, the positive sample, the first negative sample and the second negative sample are sorted by predicted recommendation probability to obtain their predicted recommendation order.
In this embodiment, after obtaining the positive sample, the first negative sample and the second negative sample, the server inputs them into the content recommendation model, which outputs the corresponding predicted recommendation probabilities; the predicted recommendation order is then derived from the ranking of these probabilities, which improves the accuracy of the resulting predicted recommendation order.
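Steps S301 and S302 can be pictured as scoring each candidate with the model and then sorting by predicted recommendation probability; the sketch below assumes a `model` callable that maps (account features, item) to a probability and is purely illustrative.

```python
def predicted_recommendation_order(model, account_features, positive, first_negs, second_negs):
    """Return candidates sorted by the model's predicted recommendation probability,
    highest first; this sorted list is the predicted recommendation order."""
    candidates = [positive, *first_negs, *second_negs]
    scored = [(model(account_features, item), item) for item in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored]
```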
In addition, in an exemplary embodiment, as shown in fig. 4, the method for training the content recommendation model may further include:
in step S401, in the process of training the content recommendation model by using the sample recommendation information set, if the current sample is a negative sample of the first sample account, setting an expected recommendation probability of the negative sample for the first sample account to zero; negative examples are example recommendation information that the first sample account has not interacted.
If the sample recommendation information currently used to train the content recommendation model has never been recommended to the first sample account, or has been recommended to it but the first sample account did not interact with it, then the first sample account has no user interaction with that information and it can be used as a negative sample of the first sample account. If the sample recommendation information currently used for training is a negative sample of the first sample account, the server further sets the expected recommendation probability of that negative sample to zero, indicating that it is not expected to be recommended to the first sample account.
In step S402, inputting the negative examples into the content recommendation model to obtain a predicted recommendation probability for recommending the negative examples to the first sample account;
in step S403, the content recommendation model is trained based on the difference between the predicted recommendation probability for recommending to the first sample account by the negative example and the expected recommendation probability for the first sample account by the negative example.
The server then inputs the currently obtained negative sample into the content recommendation model, which outputs a predicted recommendation probability of recommending the negative sample to the first sample account, that is, the predicted likelihood of such a recommendation. The server computes the loss between this predicted recommendation probability and the expected recommendation probability of the negative sample for the first sample account, and trains the content recommendation model using this loss, so that the trained model recommends the negative sample to the first sample account with lower probability, which improves the effectiveness of sample recommendation information recommendation.
In this embodiment, if the sample recommendation information currently used for training is a negative sample of the first sample account, the server sets the expected recommendation probability of that negative sample to zero and trains the model using the difference between the predicted recommendation probability output by the content recommendation model and this expected recommendation probability. The trained content recommendation model therefore recommends negative samples to the first sample account with lower probability, which further improves the effectiveness of sample recommendation information recommendation.
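For the case in steps S401-S403, one straightforward realization is a binary cross-entropy loss between the model's predicted recommendation probability and an expected probability of zero; the sketch below is illustrative and assumes the model outputs probabilities in the range (0, 1).

```python
import torch
import torch.nn.functional as F

def negative_sample_loss(predicted_prob):
    """predicted_prob: tensor of predicted recommendation probabilities for negative samples.
    The expected recommendation probability for a negative sample is set to zero, so the
    loss measures how far the prediction is from never recommending the item."""
    expected = torch.zeros_like(predicted_prob)
    return F.binary_cross_entropy(predicted_prob, expected)
```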
In an exemplary embodiment, as shown in fig. 5, after step S103, the method may further include:
in step S501, in response to the content recommendation request, the account characteristics of the target account corresponding to the content recommendation request and the information characteristics of each piece of information to be recommended are acquired.
The content recommendation request is triggered by the target account and is used for acquiring recommendation information. When a user account triggers a recommendation-information acquisition operation through its terminal, a content recommendation request for acquiring the recommendation information delivered by the server can be initiated to the server; for example, the target account can trigger the content recommendation request by performing a refresh operation on a display page. The server responds to the request by acquiring the account features of the target account, such as the interest tags of the target account, together with the information features of each piece of information to be recommended that can be used for information recommendation.
In step S502, the account characteristics and the information characteristics are input to the trained content recommendation model, so as to obtain a predicted recommendation probability of each piece of information to be recommended for the target account;
in step S503, recall information for the target account is screened from the information to be recommended according to the ranking of the predicted recommendation probabilities.
After the account characteristics of the target account and the information characteristics of each piece of information to be recommended are obtained in step S501, the account characteristics and the information characteristics are input into the trained content recommendation model. The server can then screen out a part of the information to be recommended according to the score order represented by the predicted recommendation probabilities and use this part as the recall information of the recall stage of information recommendation. For example, if the number of pieces of recall information for the target account is 100, the information to be recommended can be ranked by the magnitude of the predicted recommendation probability, and the top 100 pieces can be taken as the recall information for the target account.
In this embodiment, after training of the content recommendation model is completed, the trained content recommendation model can be used for screening recall information. When a content recommendation request is triggered by a target account, the account features of the target account and the information features of each piece of information to be recommended are input into the trained content recommendation model, which outputs the predicted recommendation probability that each piece of information to be recommended is recommended to the target account; the corresponding recall information can then be screened from the information to be recommended according to the predicted recommendation order. Because the content recommendation model in this embodiment is trained by contrastive learning with a second negative sample obtained by randomly sampling the sample recommendation information set and a first negative sample composed of sample recommendation information interacted with by the second sample account, the trained content recommendation model has higher accuracy, so the recall information obtained with it is also more accurate, which improves the accuracy of the recommendation-information recall process.
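A minimal sketch of steps S501 to S503, assuming the trained model scores an (account feature, information feature) pair as a recommendation probability; the function name, tensor shapes, top-k value, and the use of PyTorch are illustrative assumptions:

import torch

def recall_for_account(model, account_feat, candidate_feats, top_k=100):
    # account_feat: 1-D feature vector of the target account (e.g. an interest-tag embedding).
    # candidate_feats: (num_candidates, dim) information features of the information to be recommended.
    with torch.no_grad():
        account_batch = account_feat.unsqueeze(0).expand(candidate_feats.shape[0], -1)
        probs = model(account_batch, candidate_feats)   # step S502: predicted recommendation probabilities
    k = min(top_k, probs.shape[0])
    top_probs, top_idx = torch.topk(probs, k=k)         # step S503: rank and keep the top-k as recall information
    return top_idx, top_probs

# Example usage with a trained scorer and 1,000 candidate items (shapes are illustrative):
# top_idx, top_probs = recall_for_account(model, torch.randn(32), torch.randn(1000, 32), top_k=100)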
In an exemplary embodiment, a recommendation model training method based on contrastive learning and hard negative mining is further provided. It refines the contrastive learning to deal with the fact that, in the recall stage, a sample that is a positive sample for other users is not necessarily a positive sample for the current user. By taking the positive samples of other users as negative samples of the current user, such samples can be selected during negative sampling, which enhances the discriminative ability of the model and reasonably avoids most of the misjudgments that occur during recall. The method specifically includes the following steps:
In the recall phase of the recommendation system, samples on which the user generated a user behavior (complete play, like, follow, etc.) can be regarded as positive samples, and the opposite as negative samples. Negative samples come from two sources: samples that were exposed but received no user behavior, and samples that were never exposed. They can be further divided into general negative samples and hard negative samples. A hard negative sample is a sample that led to user behavior for other users (i.e., users who previously sent requests) but not for the current user; otherwise the sample is a general negative sample. In this embodiment, two intermediate storage pools are used: one stores the general samples (sample pool A), which constitute the general negative samples, and the other stores the hard negatives, i.e., the positive samples of other users (sample pool B), which constitute the hard negative samples.
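One possible realization of the two storage pools, given only as an illustrative sketch (the class name, the capacity value, and the policy of dropping the oldest entry when pool B is full are assumptions not stated in the disclosure):

import random
from collections import deque

class NegativeSamplePools:
    """Sketch of the two intermediate storage pools; names and eviction policy are assumptions."""

    def __init__(self, general_samples, hard_pool_capacity=10000):
        # Pool A: general negative samples, kept essentially static during training.
        self.pool_a = list(general_samples)
        # Pool B: hard negatives, i.e. positive samples of other users. The disclosure only
        # states that it is filled until its maximum capacity; dropping the oldest entries
        # when full (deque with maxlen) is an assumption.
        self.pool_b = deque(maxlen=hard_pool_capacity)

    def add_positive(self, sample):
        # A positive sample seen while training on another user becomes a candidate
        # hard negative for subsequent users.
        self.pool_b.append(sample)

    def draw(self, n_general, n_hard):
        general = random.sample(self.pool_a, min(n_general, len(self.pool_a)))
        # Drawing with replacement from pool B; frequency-based weighting is added later.
        hard = random.choices(list(self.pool_b), k=n_hard) if self.pool_b else []
        return general, hard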
During model training, sample pool A and sample pool B are used to store negative samples, and sample pool A remains essentially static. Sample pool B is empty at the beginning; whenever a positive sample appears while training on the corresponding user, that positive sample is added to pool B to provide hard negatives for other users, until the maximum capacity of pool B is reached. In each subsequent training step, if the current sample is a negative sample, a single training step that only considers the positive/negative label of that one sample is performed directly. If the current sample is a positive sample, a certain proportion of negatives is sampled from pool A and some negatives are sampled from pool B, and the ordering relation among the samples is trained: the current positive sample and the two types of negative samples are combined into one list, and the optimization goal is that the score of the positive sample is the highest in the whole list. Denote the outputs of these samples by y1 to yn, where the first term is the positive sample; the scores are first normalized by softmax:
S_i = exp(y_i) / Σ_{j=1}^{n} exp(y_j)
where y_i denotes the score of the i-th sample, n denotes the total number of samples in the list (the positive sample plus the two types of negative samples), and S_i is the softmax-normalized result of the corresponding sample score. Then, the first term, i.e. the output of the positive-sample term, can be maximized in the loss computation as follows:
min −log(S_1)
Since all scores are jointly normalized by softmax, maximizing the output of the first term essentially makes this term larger than the others.
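A minimal sketch of this list-wise objective, with the positive sample placed first in the score list, softmax normalization over all scores, and the loss −log(S_1); the use of PyTorch and the example score values are assumptions:

import torch
import torch.nn.functional as F

def listwise_contrastive_loss(scores):
    # scores: 1-D tensor [y_1, ..., y_n]; by convention the first entry is the positive
    # sample and the remaining entries are the sampled general and hard negatives.
    log_probs = F.log_softmax(scores, dim=0)   # log S_i after joint softmax normalization
    return -log_probs[0]                       # min -log(S_1): push the positive sample to the top

# Example usage: positive score followed by scores of negatives drawn from pools A and B.
scores = torch.tensor([2.3, 1.1, 0.4, 1.9, -0.2], requires_grad=True)
loss = listwise_contrastive_loss(scores)
loss.backward()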
Meanwhile, the negative samples in sample pool B should not all be trained with the same weight. The samples in pool B may be mislabeled: a positive sample of another user may also be a positive sample of the current user. When a sample appears only rarely among other users' positive samples, it may be merely a sporadic positive of those users; the probability that it becomes a positive sample of the current user is small, but it also contributes little to discriminating between samples. When a sample appears very frequently among other users' positive samples, it is a popular, possibly extremely popular, sample; if it is given a large sampling weight during training, a training error may occur, because such a sample has a high probability of also being a positive sample of the current user.
Therefore, the samples in pool B are drawn with a frequency-based sampling scheme that is low at both ends and high in the middle: ordered by occurrence frequency, neither very frequent nor very infrequent samples should be given high weight, while samples with intermediate occurrence frequency should be given greater weight. Because the above list-wise learning only highlights the score of the positive sample and does not itself weight the negative samples, the weighting is realized by changing the sampling rate: samples drawn from pool B are allowed to repeat, so a sample with a large weight is drawn more easily and a sample with a small weight less easily.
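A sketch of this frequency-based weighting over pool B, sampling with replacement so that higher-weight samples are drawn more often; the thresholds and weight values are illustrative assumptions, not values from the disclosure:

import random
from collections import Counter

def hard_negative_weights(occurrence, low_thresh, high_thresh):
    # "Low at both ends, high in the middle": samples whose occurrence frequency lies strictly
    # between the two thresholds get a larger sampling weight; very rare and very popular
    # samples get a smaller one.
    return {sample: (1.0 if low_thresh < freq < high_thresh else 0.2)
            for sample, freq in occurrence.items()}

def sample_hard_negatives(pool_b, occurrence, k, low_thresh=2, high_thresh=50):
    weights = hard_negative_weights(occurrence, low_thresh, high_thresh)
    population = list(pool_b)
    # Sampling with replacement, so higher-weight samples may be drawn repeatedly.
    return random.choices(population, weights=[weights[s] for s in population], k=k)

# occurrence tracks how often each item has appeared as another account's positive sample.
occurrence = Counter({"item_a": 1, "item_b": 10, "item_c": 300})
picked = sample_hard_negatives(["item_a", "item_b", "item_c"], occurrence, k=5)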
In this way, contrastive learning is brought into training whenever a positive sample is present. The samples in sample pool A do not pose this targeting problem, while the newly constructed sample pool B collects the positive samples of other users as hard negatives of the current user. This improves the discriminative ability of the model and achieves a better effect in the recall stage.
In the above embodiment, a contrastive-learning approach is applied to training in the recall stage by constructing sample pool B, which stores the positive samples of other users. During training, these can be used as negative samples of the current user, so that positive and negative samples are effectively distinguished and the discriminative ability of the model is improved.
It should be understood that although the steps in the flowcharts of figures 1 to 4 are displayed in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in figures 1 to 4 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
It can be understood that the same or similar parts among the method embodiments described above in this specification may refer to one another; each embodiment focuses on its differences from the other embodiments, and for related parts, reference may be made to the descriptions of the other method embodiments.
FIG. 6 is a block diagram illustrating a training apparatus of a content recommendation model according to an example embodiment. Referring to fig. 6, the apparatus includes a training sample acquisition unit 601, a prediction order acquisition unit 602, and a recommendation model training unit 603.
A training sample obtaining unit 601, configured to perform, in a process of training a content recommendation model by using a sample recommendation information set, if a current sample is a positive sample of a first sample account, sampling a first negative sample and a second negative sample in the sample recommendation information set; the positive sample is sample recommendation information interacted by the first sample account; the first negative sample is sample recommendation information interacted by the second sample account; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any account in the sample accounts, and the second sample account is any account except the first sample account in the sample accounts;
a prediction order obtaining unit 602 configured to perform input of the positive sample, the first negative sample, and the second negative sample to the content recommendation model, respectively, to obtain a prediction recommendation order of the first sample account;
a recommendation model training unit 603 configured to perform training of the content recommendation model based on a difference between the predicted recommendation order and an expected recommendation order for the first sample account; the desired recommendation order is a recommendation order of positive examples that precedes the recommendation order of the first negative examples and the second negative examples.
In an exemplary embodiment, the training sample obtaining unit 601 is further configured to perform sampling from a first negative sample set to obtain a first negative sample; the first negative sample set is composed of sample recommendation information interacted with the second sample account in the sample recommendation information set.
In an exemplary embodiment, the training sample obtaining unit 601 is further configured to perform sampling from the first negative sample set to obtain a first negative sample, and sampling from the sample recommendation information set to obtain a second negative sample; the proportion between the number of the first negative samples and the number of the second negative samples meets a preset proportion relation.
In an exemplary embodiment, the training sample obtaining unit 601 is further configured to perform obtaining of occurrence frequencies corresponding to each piece of sample recommendation information included in the first negative sample set; the occurrence frequency is used for representing the interaction times of the second sample account on each sample recommendation information in the first negative sample set; and determining the sampling weight of each sample recommendation information in the first negative sample set according to the occurrence frequency, and sampling from the first negative sample set according to the sampling weight to obtain a first negative sample.
In an exemplary embodiment, the training sample obtaining unit 601 is further configured to set the sampling weight of the current sample recommendation information to be greater than the sampling weights of the other sample recommendation information if the current sample recommendation information is the sample recommendation information with an occurrence frequency that is greater than a preset first occurrence frequency and less than a preset second occurrence frequency in the first negative sample set; the other sample recommendation information is sample recommendation information of which the occurrence frequency is less than or equal to a first occurrence frequency or greater than or equal to a second occurrence frequency in the first negative sample set; the first frequency of occurrence is less than the second frequency of occurrence.
In an exemplary embodiment, the training apparatus of the content recommendation model further includes: a first set construction module, configured to add the positive samples to the first negative sample set.
In an exemplary embodiment, the prediction order obtaining unit 602 is further configured to perform inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model, and obtain, through the content recommendation model, prediction recommendation probabilities of recommending to the first sample account respectively corresponding to the positive sample, the first negative sample and the second negative sample; and sequencing the positive samples, the first negative samples and the second negative samples according to the magnitude relation of the prediction recommendation probability to obtain a prediction recommendation sequence.
In an exemplary embodiment, the apparatus for training a content recommendation model further includes: the negative sample training unit is configured to execute the process of training the content recommendation model by adopting the sample recommendation information set, and if the current sample is a negative sample of the first sample account, the expected recommendation probability of the negative sample for the first sample account is set to be zero; the negative sample is sample recommendation information which is not interacted with the first sample account; inputting the negative sample into a content recommendation model to obtain a prediction recommendation probability for recommending the negative sample to the first sample account; and training the content recommendation model based on the difference between the predicted recommendation probability of the negative sample for recommending to the first sample account and the expected recommendation probability of the negative sample for the first sample account.
In an exemplary embodiment, the apparatus for training a content recommendation model further includes: a recall information acquisition unit configured to perform acquisition of an account characteristic of a target account corresponding to a content recommendation request and an information characteristic of each piece of information to be recommended in response to the content recommendation request; inputting the account characteristics and the information characteristics into the trained content recommendation model to obtain the predicted recommendation probability of each piece of information to be recommended for the target account; and screening recall information aiming at the target account from the information to be recommended according to the sequence of the predicted recommendation probability.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
FIG. 7 is a block diagram illustrating an electronic device 700 for training of a content recommendation model, according to an example embodiment. For example, the electronic device 700 may be a server. Referring to fig. 7, electronic device 700 includes a processing component 720 that further includes one or more processors and memory resources, represented by memory 722, for storing instructions, such as applications, that are executable by processing component 720. The application programs stored in memory 722 may include one or more modules that each correspond to a set of instructions. Further, the processing component 720 is configured to execute instructions to perform the above-described methods.
The electronic device 700 may further include: a power component 724 configured to perform power management for the electronic device 700, a wired or wireless network interface 726 configured to connect the electronic device 700 to a network, and an input/output (I/O) interface 728. The electronic device 700 may operate based on an operating system stored in the memory 722, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
In an exemplary embodiment, a computer-readable storage medium comprising instructions, such as memory 722, that are executable by a processor of electronic device 700 to perform the above-described method is also provided. The storage medium may be a computer-readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which comprises instructions executable by a processor of the electronic device 700 to perform the above-described method.
It should be noted that the descriptions of the above-mentioned apparatus, the electronic device, the computer-readable storage medium, the computer program product, and the like according to the method embodiments may also include other embodiments, and specific implementations may refer to the descriptions of the related method embodiments, which are not described in detail herein.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (20)

1. A method for training a content recommendation model, comprising:
in the process of training a content recommendation model by adopting a sample recommendation information set, if a current sample is a positive sample of a first sample account, sampling a first negative sample and a second negative sample in the sample recommendation information set; the positive sample is sample recommendation information interacted by a first sample account; the first negative sample is sample recommendation information interacted by a second sample account; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any account in the sample accounts, and the second sample account is any account except the first sample account in the sample accounts;
inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model respectively to obtain a prediction recommendation sequence of the first sample account;
training the content recommendation model based on a difference between the predicted recommendation order and an expected recommendation order for the first sample account; the desired recommendation order is a recommendation order of the positive examples prior to the recommendation order of the first negative examples and the second negative examples.
2. The method of claim 1, wherein sampling a first negative sample in the set of sample recommendation information comprises:
sampling from a first negative sample set to obtain a first negative sample; the first negative sample set is composed of sample recommendation information interacted with the second sample account in the sample recommendation information set.
3. The method of claim 2, wherein the sampling a first negative example and a second negative example in the set of sample recommendation information comprises:
sampling from the first negative sample set to obtain the first negative sample, and sampling from the sample recommendation information set to obtain the second negative sample; the proportion between the number of the first negative samples and the number of the second negative samples meets a preset proportion relation.
4. The method of claim 2, wherein the sampling from the first set of negative samples to obtain the first negative sample comprises:
acquiring the occurrence frequency corresponding to each sample recommendation information contained in the first negative sample set; the occurrence frequency is used for representing the interaction times of the second sample account on each sample recommendation information in the first negative sample set;
and determining the sampling weight of each sample recommendation information in the first negative sample set according to the occurrence frequency, and sampling from the first negative sample set according to the sampling weight to obtain the first negative sample.
5. The method of claim 4, wherein the determining the sampling weight of each sample recommendation information in the first negative sample set according to the occurrence frequency comprises:
if the current sample recommendation information is the sample recommendation information with the occurrence frequency being greater than a preset first occurrence frequency and less than a preset second occurrence frequency in the first negative sample set, setting the sampling weight of the current sample recommendation information to be greater than the sampling weights of other sample recommendation information; the other sample recommendation information is sample recommendation information of which the occurrence frequency is less than or equal to the first occurrence frequency or greater than or equal to the second occurrence frequency in the first negative sample set; the first frequency of occurrence is less than the second frequency of occurrence.
6. The method of claim 2, wherein after training the content recommendation model, further comprising:
adding the positive samples to the first set of negative samples.
7. The method of claim 1, wherein the inputting the positive sample, the first negative sample, and the second negative sample into the content recommendation model, respectively, to obtain the predicted recommendation order for the first sample account comprises:
inputting the positive sample, the first negative sample and the second negative sample into the content recommendation model respectively, and obtaining, by the content recommendation model, predicted recommendation probabilities of the positive sample, the first negative sample and the second negative sample which are respectively corresponding to recommending to the first sample account;
and sequencing the positive sample, the first negative sample and the second negative sample according to the magnitude relation of the prediction recommendation probability to obtain the prediction recommendation sequence.
8. The method of claim 1, further comprising:
in the process of training a content recommendation model by adopting a sample recommendation information set, if a current sample is a negative sample of a first sample account, setting an expected recommendation probability of the negative sample for the first sample account to be zero; the negative sample is sample recommendation information which is not interacted with the first sample account;
inputting the negative examples into the content recommendation model to obtain a predicted recommendation probability for recommending the negative examples to the first sample account;
training the content recommendation model based on a difference between a predicted recommendation probability of the negative examples recommending to the first sample account and an expected recommendation probability of the negative examples aiming at the first sample account.
9. The method of any of claims 1 to 8, wherein after training the content recommendation model, further comprising:
responding to a content recommendation request, and acquiring account characteristics of a target account corresponding to the content recommendation request and information characteristics of information to be recommended;
inputting the account characteristics and the information characteristics into the trained content recommendation model to obtain the predicted recommendation probability of each piece of information to be recommended for the target account;
and screening recall information aiming at the target account from the information to be recommended according to the sequence of the predicted recommendation probability.
10. An apparatus for training a content recommendation model, comprising:
the training sample acquisition unit is configured to sample a first negative sample and a second negative sample in the sample recommendation information set if the current sample is a positive sample of a first sample account in the process of training the content recommendation model by adopting the sample recommendation information set; the positive sample is sample recommendation information interacted by a first sample account; the first negative sample is sample recommendation information interacted by a second sample account; the second negative sample is obtained by randomly sampling the sample recommendation information set; the first sample account is any account in the sample accounts, and the second sample account is any account except the first sample account in the sample accounts;
a prediction order obtaining unit configured to perform input of the positive sample, the first negative sample and the second negative sample to the content recommendation model, respectively, to obtain a prediction recommendation order of the first sample account;
a recommendation model training unit configured to perform training of the content recommendation model based on a difference between the predicted recommendation order and an expected recommendation order of the first sample account; the desired recommendation order is a recommendation order of the positive examples prior to the recommendation order of the first negative examples and the second negative examples.
11. The apparatus of claim 10, wherein the training sample obtaining unit is further configured to perform sampling from a first negative sample set to obtain the first negative sample; the first negative sample set is composed of sample recommendation information interacted with the second sample account in the sample recommendation information set.
12. The apparatus of claim 11, wherein the training sample obtaining unit is further configured to perform sampling the first negative sample from the first negative sample set and sampling the second negative sample from the sample recommendation information set; the proportion between the number of the first negative samples and the number of the second negative samples meets a preset proportion relation.
13. The apparatus according to claim 11, wherein the training sample obtaining unit is further configured to perform obtaining of occurrence frequencies corresponding to the recommendation information of each sample included in the first negative sample set; the occurrence frequency is used for representing the interaction times of the second sample account on each sample recommendation information in the first negative sample set; and determining the sampling weight of each sample recommendation information in the first negative sample set according to the occurrence frequency, and sampling from the first negative sample set according to the sampling weight to obtain the first negative sample.
14. The apparatus according to claim 13, wherein the training sample obtaining unit is further configured to perform, if the current sample recommendation information is a sample recommendation information with an occurrence frequency greater than a preset first occurrence frequency and less than a preset second occurrence frequency in the first negative sample set, setting a sampling weight of the current sample recommendation information to be greater than sampling weights of other sample recommendation information; the other sample recommendation information is sample recommendation information of which the occurrence frequency is less than or equal to the first occurrence frequency or greater than or equal to the second occurrence frequency in the first negative sample set; the first frequency of occurrence is less than the second frequency of occurrence.
15. The apparatus of claim 11, further comprising: a first set construction module for adding the positive samples to the first negative sample set.
16. The apparatus according to claim 10, wherein the prediction order obtaining unit is further configured to perform inputting the positive sample, the first negative sample, and the second negative sample into the content recommendation model, respectively, and obtain, through the content recommendation model, prediction recommendation probabilities for recommending to the first sample account, respectively, corresponding to the positive sample, the first negative sample, and the second negative sample; and sequencing the positive sample, the first negative sample and the second negative sample according to the magnitude relation of the prediction recommendation probability to obtain the prediction recommendation sequence.
17. The apparatus of claim 10, further comprising: the negative sample training unit is configured to execute the process of training the content recommendation model by adopting the sample recommendation information set, and if the current sample is a negative sample of a first sample account, the expected recommendation probability of the negative sample for the first sample account is set to be zero; the negative sample is sample recommendation information which is not interacted with the first sample account; inputting the negative examples into the content recommendation model to obtain a predicted recommendation probability for recommending to the first sample account for the negative examples; training the content recommendation model based on a difference between a predicted recommendation probability that the negative exemplar recommends to the first exemplar account and an expected recommendation probability of the negative exemplar for the first exemplar account.
18. The apparatus of any one of claims 10 to 17, further comprising: a recall information acquisition unit configured to perform acquisition of an account characteristic of a target account corresponding to a content recommendation request and an information characteristic of each information to be recommended in response to the content recommendation request; inputting the account characteristics and the information characteristics into the trained content recommendation model to obtain the predicted recommendation probability of each piece of information to be recommended for the target account; and screening recall information aiming at the target account from the information to be recommended according to the sequence of the predicted recommendation probability.
19. A server, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of training of a content recommendation model according to any one of claims 1 to 9.
20. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of a server, enable the server to perform a method of training a content recommendation model according to any one of claims 1 to 9.
CN202210061526.5A 2022-01-19 2022-01-19 Training method and device for content recommendation model, server and storage medium Active CN114417156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210061526.5A CN114417156B (en) 2022-01-19 2022-01-19 Training method and device for content recommendation model, server and storage medium

Publications (2)

Publication Number Publication Date
CN114417156A CN114417156A (en) 2022-04-29
CN114417156B true CN114417156B (en) 2022-09-30

Family

ID=81274928

Country Status (1)

Country Link
CN (1) CN114417156B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110046952A (en) * 2019-01-30 2019-07-23 阿里巴巴集团控股有限公司 A kind of training method and device, a kind of recommended method and device of recommended models
CN113495985A (en) * 2020-03-18 2021-10-12 北京达佳互联信息技术有限公司 Article recommendation method and device, short video recommendation method and device and server
CN111597446A (en) * 2020-05-13 2020-08-28 腾讯科技(深圳)有限公司 Content pushing method and device based on artificial intelligence, server and storage medium
CN113901327A (en) * 2021-10-28 2022-01-07 北京达佳互联信息技术有限公司 Target recommendation model training method, recommendation device and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cross-Batch Negative Sampling for Training Two-Tower Recommenders; Fernando Diaz et al.; SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval; 2021-07-11; pp. 1632-1636 *
Item recommendation model with joint pairwise ranking (联合成对排序的物品推荐模型); Wu Bin et al.; Journal on Communications (通信学报); 2019-06-26; pp. 193-206 *

Also Published As

Publication number Publication date
CN114417156A (en) 2022-04-29

Similar Documents

Publication Publication Date Title
CN107451199B (en) Question recommendation method, device and equipment
EP3819821B1 (en) User feature generating method, device, and apparatus, and computer-readable storage medium
CN110781321B (en) Multimedia content recommendation method and device
CN109062919B (en) Content recommendation method and device based on deep reinforcement learning
US10475043B2 (en) Method and system for pro-active detection and correction of low quality questions in a question and answer based customer support system
US20190286675A1 (en) Personalized content recommendations
CN109445884B (en) Function label display method and terminal equipment
CN110413867B (en) Method and system for content recommendation
CN109766492B (en) Learning recommendation method, device, equipment and readable medium
US9009083B1 (en) Mechanism for automatic quantification of multimedia production quality
US11348003B2 (en) Machine-learning-based ethics compliance evaluation platform
CN113836388B (en) Information recommendation method, device, server and storage medium
CN111988642B (en) Method, device, server and storage medium for recommending videos
CN111046156B (en) Method, device and server for determining rewarding data
CN114417156B (en) Training method and device for content recommendation model, server and storage medium
US20200403955A1 (en) Systems and methods to prioritize chat rooms using machine learning
CN116662527A (en) Method for generating learning resources and related products
CN115618121A (en) Personalized information recommendation method, device, equipment and storage medium
JP7013569B2 (en) Efficient use of computing resources in responding to content requests
CN114625894A (en) Appreciation evaluation method, model training method, appreciation evaluation apparatus, model training medium, and computing apparatus
CN110275986B (en) Video recommendation method based on collaborative filtering, server and computer storage medium
CN113515701A (en) Information recommendation method and device
CN113626638A (en) Short video recommendation processing method and device, intelligent terminal and storage medium
CN111581546B (en) Method, device, server and medium for determining multimedia resource ordering model
CN116010712B (en) Form recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant