CN115564532A

CN115564532A - Training method and device of sequence recommendation model

Info

Publication number: CN115564532A
Application number: CN202211284644.9A
Authority: CN
Inventors: 林文芳; 马琼旭; 程红伟; 赵云安; 郭晓波
Original assignee: Zhejiang eCommerce Bank Co Ltd
Current assignee: Zhejiang eCommerce Bank Co Ltd
Priority date: 2022-10-17
Filing date: 2022-10-17
Publication date: 2023-01-03

Abstract

Embodiments of the specification provide a training method and a training device for a sequence recommendation model. The training method comprises: acquiring a historical behavior sequence associated with an object recommendation service, and generating an original interest representation according to the historical objects contained in the historical behavior sequence; calculating an object similarity between each historical object and a reference preference object according to an attention influence parameter of each historical object in at least one dimension; determining a target historical object in the historical behavior sequence based on the object similarity, and generating a counterfactual interest representation according to the target historical object; and generating an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object, and the counterfactual interest representation, and training a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair. With the sequence recommendation model provided by the specification, user interest can be predicted accurately even for users whose behavior sequences are sparse, so that user interest is mined and user experience is improved.

Description

Training method and device of sequence recommendation model

Technical Field

The embodiments of the specification relate to the technical field of sequence recommendation, and in particular to a training method for a sequence recommendation model and an object recommendation method.

Background

With the rapid development of technologies such as computers and artificial intelligence, user-oriented recommendation systems bring convenient service experiences to users: a recommendation system can recommend commodity information of interest to a user based on the user's interest characteristics and historical purchasing behavior. Sequence recommendation systems are an important class of recommendation system. They make recommendations by analyzing a user's historical browsing sequence, so the stability of sequence recommendation is a main problem in current research. Current sequence recommendation systems mainly suffer from inaccurate recommendation results caused by sparse data and noise in the user browsing sequence. How to provide accurate recommendation results for users is therefore a problem that needs to be solved.

Disclosure of Invention

In view of this, the embodiments of the present specification provide a training method for a sequence recommendation model and an object recommendation method. One or more embodiments of the present disclosure relate to a training apparatus for a sequence recommendation model, an object recommendation apparatus, a computing device, a computer-readable storage medium, and a computer program, so as to solve technical deficiencies of the prior art.

According to a first aspect of embodiments of the present specification, there is provided a training method for a sequence recommendation model, including:

acquiring a historical behavior sequence associated with an object recommendation service, and generating an original interest representation according to the historical objects contained in the historical behavior sequence;

calculating an object similarity between each historical object and a reference preference object according to an attention influence parameter of each historical object in at least one dimension;

determining a target historical object in the historical behavior sequence based on the object similarity, and generating a counterfactual interest representation according to the target historical object;

and generating an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object and the counterfactual interest representation, and training a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair.

According to a second aspect of embodiments of the present specification, there is provided an object recommendation method including:

inputting a historical behavior sequence of a target user associated with a target service into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training through a training method of the sequence recommendation model provided by the specification;

and obtaining a recommendation result output by the sequence recommendation model, wherein the recommendation result comprises a service object related to the target service.

According to a third aspect of embodiments of the present specification, there is provided an object recommendation method including:

receiving a participation request of a target user for a target service;

determining a historical behavior sequence of the target user according to the participation request;

inputting the historical behavior sequence into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training through a training method of the sequence recommendation model provided by the specification;

According to a fourth aspect of embodiments herein, there is provided a training apparatus for a sequence recommendation model, including:

a first generation module configured to acquire a historical behavior sequence associated with an object recommendation service, and generate an original interest representation according to the historical objects contained in the historical behavior sequence;

a calculation module configured to calculate an object similarity between each historical object and a reference preference object according to an attention influence parameter of each historical object in at least one dimension;

a second generation module configured to determine a target historical object in the historical behavior sequence based on object similarity and generate a counterfactual interest representation according to the target historical object;

and the training module is configured to generate an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object and the counterfactual interest representation, and train a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair.

According to a fifth aspect of embodiments herein, there is provided an object recommendation apparatus including:

the input module is configured to input a historical behavior sequence of a target user associated with a target service into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training through a training method of the sequence recommendation model provided by the specification;

an obtaining module configured to obtain a recommendation result output by the sequence recommendation model, wherein the recommendation result includes a service object associated with the target service.

According to a sixth aspect of embodiments herein, there is provided an object recommendation apparatus including:

the receiving module is configured to receive a participation request of a target user for a target service;

a determining module configured to determine a historical behavior sequence of the target user according to the participation request;

the input module is configured to input the historical behavior sequence into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training through a training method of the sequence recommendation model provided by the specification;

According to a seventh aspect of embodiments herein, there is provided a computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer instructions, implements the steps of the training method of the sequence recommendation model or of the object recommendation method.

According to an eighth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the training method of the sequence recommendation model or of the object recommendation method.

According to a ninth aspect of embodiments herein, there is provided a computer program which, when executed in a computer, causes the computer to execute the steps of the training method of the sequence recommendation model or of the object recommendation method.

The training method of the sequence recommendation model provided by the specification comprises: acquiring a historical behavior sequence associated with an object recommendation service, and generating an original interest representation according to the historical objects contained in the historical behavior sequence; calculating an object similarity between each historical object and a reference preference object according to an attention influence parameter of each historical object in at least one dimension; determining a target historical object in the historical behavior sequence based on the object similarity, and generating a counterfactual interest representation according to the target historical object; and generating an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object, and the counterfactual interest representation, and training a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair.

According to the embodiments of the specification, an original interest representation is generated according to the historical objects contained in the historical behavior sequence, and the object similarity between each historical object and the reference preference object is calculated according to the attention influence parameter of each historical object, so that a more accurate counterfactual interest representation can be generated through counterfactual enhancement. The sequence recommendation model is then trained with the original interest representation, the reference preference representation, and the counterfactual interest representation. This addresses the problem that objects of interest to the user cannot be accurately predicted when the sequence is sparse or noisy: based on the trained sequence recommendation model, user interest can be predicted accurately even for users whose behavior sequences are sparse or noisy, so that user interest is mined and user experience is improved.

Drawings

FIG. 1 is a model framework diagram of a sequence recommendation model provided in accordance with one embodiment of the present description;

FIG. 2 is a flowchart of a training method for a sequence recommendation model according to an embodiment of the present disclosure;

FIG. 3 is a flowchart illustrating a processing procedure of a training method for a sequence recommendation model according to an embodiment of the present disclosure;

FIG. 4 is a flowchart of an object recommendation method provided in one embodiment of the present specification;

FIG. 5 is a flow chart of another object recommendation method provided in one embodiment of the present description;

FIG. 6 is a schematic structural diagram of a training apparatus for a sequence recommendation model according to an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of an object recommendation apparatus according to an embodiment of the present specification;

FIG. 8 is a schematic structural diagram of another object recommendation device provided in an embodiment of the present specification;

FIG. 9 is a block diagram of a computing device according to an embodiment of the present disclosure.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. The specification can, however, be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; the specification is therefore not limited to the specific implementations disclosed below.

The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification is intended to encompass any and all possible combinations of one or more of the associated listed items.

It will be understood that, although the terms first, second, etc. may be used in one or more embodiments of the specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, a first may also be referred to as a second and, similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the specification. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.

First, the terms involved in one or more embodiments of the specification are explained.

Recommendation system: a recommendation system matches user demand information against the characteristics of recommended objects, performs calculation and screening with a recommendation algorithm, finds the recommended objects most likely to interest the user, and recommends them to the user.

CTR (Click-Through Rate): the number of times recommended content is clicked divided by the total number of times the content is shown within a statistical period. It is often used as a basis for evaluating advertisement recommendation effect or for product selection. The calculation formula is: click-through rate = (number of clicks / number of impressions) × 100%.
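As a small worked illustration of the click-through rate formula above, the following sketch computes the percentage for one statistical period. The function name and the input validation are illustrative assumptions and are not part of the specification.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return the click-through rate as a percentage.

    Implements: CTR = (number of clicks / number of impressions) * 100.
    The name and the validation rules are illustrative assumptions.
    """
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    if clicks < 0:
        raise ValueError("clicks must be non-negative")
    return 100.0 * clicks / impressions


# Content shown 1000 times and clicked 25 times in the period.
ctr_percent = click_through_rate(25, 1000)
```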

User behavior sequence (User Behavior): the ordered data, sorted by time, of a user's behaviors in a business service, such as clicking, sharing, commenting, browsing, and purchasing.

Recommendation strategy (Rank): the recommendation system trains a model on the collected user behavior and recommends content to the user accordingly.

Data augmentation (Data Augmentation): widely applied in image and natural language processing. For example, image augmentation can slightly rotate an image, convert it to grayscale, or add noise to obtain a new image without changing its meaning. In the field of recommendation systems, the technique is mainly used to generate more usable data, such as features, samples, and labels, by applying time-domain methods, frequency-domain methods, decomposition methods, statistical generative models, and the like to sequence data.

Attention mechanism (Attention Mechanism): in a neural network, a resource allocation scheme that allocates computing resources to more important tasks when computing power is limited, thereby addressing information overload.

Contrastive learning (Contrastive Learning): focuses on learning the features common to similar instances and on distinguishing dissimilar instances. The goal is to learn an encoder that encodes data of the same class similarly and makes the encodings of data of different classes as different as possible.

Counterfactual inference (Counterfactual Inference): also known as counterfactual thinking, refers to the mental activity of negating and re-characterizing facts that have already occurred in order to construct a possible hypothesis. The technique is commonly applied in fields such as advertising and recommendation. For example, to answer a counterfactual question in the recommendation field such as "what would the user decide if the recommended goods were different", a counterfactual reasoning framework is used to construct a recommendation system simulator, which then generates a large amount of counterfactual data to address data scarcity.

Graph convolutional neural network (Graph Convolutional Network, GCN): propagates features and messages across a relational graph, and obtains more complete information than a single entity alone provides by using the information of neighbor nodes to supplement the current node.
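To make the neighbor-aggregation idea in the GCN definition concrete, the following is a minimal sketch of a single graph-convolution step. It assumes the widely used symmetric normalization with self-loops and a ReLU activation; the specification does not state which propagation rule is used, and the function name is hypothetical.

```python
import numpy as np


def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj      : (n, n) symmetric adjacency matrix of the relational graph
    features : (n, f) node feature matrix X
    weight   : (f, h) weight matrix W (learned in practice, fixed here)
    The normalization and activation are assumptions for illustration.
    """
    a_hat = adj + np.eye(adj.shape[0])      # self-loops keep each node's own features
    deg = a_hat.sum(axis=1)                 # node degrees including the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


# Nodes 0 and 1 are neighbors; node 2 is isolated.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])
hidden = gcn_layer(adj, np.eye(3), np.eye(3))
```

With identity features and weights, nodes 0 and 1 end up sharing each other's information, while the isolated node 2 keeps only its own.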

At present, recommendation systems are widely applied in business scenarios such as e-commerce, advertising, and social media. By recommending content that may interest users, they increase user dwell time and thereby improve metrics such as click-through rate and conversion rate. To find content of interest to a user more accurately within massive data, the user's interests need to be fully understood from the user's historical behavior. Existing recommendation systems are therefore based on sequence representation learning: the characteristics of how the user consumes content are learned from the content sequence of the user's historical behaviors (exposure, click, conversion, and the like), which drives content distribution in the recommendation system and improves user experience.

Sequence representation learning aims to predict the content a user will be interested in next based on the user's historical interaction data. An important challenge in applying it to a recommendation system is data sparsity. For example, for new users or users who visit infrequently, historical behavior data is sparse or even absent; such users are called cold-start users. Sequence representation learning lacks interest learning for cold-start users: it cannot accurately judge what content interests them from sparse behavior, nor whether they would accept the exploration and discovery of content in new fields. In addition, the implicit feedback of users contains noise. Specifically, user interest and click behavior can be inconsistent because interference factors (such as clickbait titles, position bias, and promotions) may affect the user's first impression. Solving these two problems and learning an accurate and reliable user representation is therefore crucial for a recommendation system.

Based on this, the specification provides a training method for a sequence recommendation model, which trains on user behavior sequences that are sparse and noisy and obtains a sequence recommendation model with high accuracy and robustness.

FIG. 1 is a schematic diagram of the model framework of a sequence recommendation model provided according to an embodiment of the specification. The user behavior sequence is the ordered data, sorted by time, of a user's behaviors in a business service, such as clicking, sharing, commenting, browsing, and purchasing; the timestamp sequence is the sequence formed by the timestamps of the user behaviors in the user behavior sequence. After the user behavior sequence is input into the sequence recommendation model, the model learns the original interest representation of the user based on a GCN, and generates a counterfactual sequence through a counterfactual sequence framework based on time-aware attention (Time-based perceived Sequence Modeling for Recommendation, TCSM). The counterfactual sequence comprises a positive sample, which can be understood as a positive interest sequence, and a negative sample, which can be understood as a negative interest sequence. The model is trained by contrasting these two types of sequences with the original interest representation, which alleviates the sparsity of the user behavior sequence and yields a robust user sequence representation that is less sensitive to negative sequences and places more trust in positive sequences that can represent the user's potential interests. In an embodiment of the specification, a GCN is introduced and a U-I graph is constructed to learn the interests of users whose behavior is sparse. To ensure the accuracy of the generated counterfactual sequence, a time-aware attention approach is provided for learning the similarity between each element in the user behavior sequence and the target object and for mining the time-period pattern of the user behavior.
The counterfactual interest sequence generated by the TCSM serves two purposes: on the one hand, it constrains the user sequence representation generated after mean pooling, which enhances the robustness of the representation; on the other hand, the generated positive interest sequence is output and used for exploring and discovering the user's potential multiple interests. A joint loss of click-through rate estimation and sequence generation is calculated based on the original interest representation, the counterfactual interest representation, and the reference preference representation of the target object; the model parameters are adjusted through training, followed by information fusion, a fully connected layer, and an activation function, so that the trained sequence recommendation model can output more accurate recommendation content.
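The joint objective described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the loss formulas, so the click-through rate term is taken to be binary cross-entropy, the contrastive term is a triplet-style margin loss on cosine similarity, and the names `margin` and `alpha` are hypothetical hyperparameters.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def contrastive_loss(original, positive, negative, margin=0.5) -> float:
    """Assumed triplet-style term: pull the original interest representation
    toward the positive counterfactual and away from the negative one."""
    return max(0.0, margin - cosine(original, positive) + cosine(original, negative))


def bce(pred: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one click-through rate prediction."""
    p = min(max(pred, eps), 1.0 - eps)
    return float(-(label * np.log(p) + (1 - label) * np.log(1.0 - p)))


def joint_loss(ctr_pred, label, original, positive, negative, alpha=0.1) -> float:
    """Assumed combination: CTR loss plus a weighted contrastive term."""
    return bce(ctr_pred, label) + alpha * contrastive_loss(original, positive, negative)


original = np.array([1.0, 0.0])
aligned = joint_loss(0.5, 1, original, np.array([1.0, 0.0]), np.array([-1.0, 0.0]))
misaligned = joint_loss(0.5, 1, original, np.array([-1.0, 0.0]), np.array([1.0, 0.0]))
```

When the original representation already agrees with the positive sequence and disagrees with the negative one, the contrastive term vanishes and only the click-through rate loss remains; the misaligned case is penalized.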

FIG. 2 shows a flowchart of a training method for a sequence recommendation model provided according to an embodiment of the specification; the method includes steps 202 to 208.

Step 202: acquire a historical behavior sequence associated with an object recommendation service, and generate an original interest representation according to the historical objects contained in the historical behavior sequence.

The historical behavior sequence can be understood as the sequence of effective behaviors, such as clicking, sharing, commenting, browsing, and purchasing, performed on a business item over a past period of time, for example a sequence formed by a user's behaviors of receiving a red envelope, using the red envelope, and sharing an activity link. The object recommendation service can be understood as a service in which objects need to be recommended. For example, in an online shopping service, commodities of interest can be recommended to a user according to the user's purchasing and browsing behaviors over a past period; in a takeout service, a full-reduction red envelope or a cash-back red envelope can be recommended according to how the user has used red envelopes. In any kind of service, the recommendation result predicted from the historical behavior sequence needs to be close to the user's interest, that is, the object information most likely to trigger user behavior should be recommended. Taking the online shopping service as an example, the purpose of predicting and recommending commodities is to serve the user more conveniently: the user does not need to screen commodities again and can purchase directly from the recommendations, which improves user experience.
Generating the original interest representation according to the historical objects contained in the historical behavior sequence can be understood as predicting a user interest representation from the historical behavior events in the user behavior sequence. For example, if the historical behavior sequence includes purchasing behaviors in which the user bought shoes and clothes of a certain sports brand, the original interest representation generated from the sequence may be that sports brand, sportswear, or the like. The commodities subsequently recommended to the user may then be other clothes of that sports brand or clothes of other sports brands, which raises the user's interest in browsing the recommendations, lets the user purchase directly from them, shortens the time spent screening commodities, and improves the purchase experience.

In practical application, when the historical object data contained in a historical behavior sequence is sparse, the original interest representation predicted from it may be inaccurate, so that a target object subsequently recommended to the user based on that representation may not match the user's interest, resulting in a poor user experience. For example, if the historical behavior sequence only records that the user purchased shoes of a certain sports brand once, the original interest representation may lean toward the user having a relatively high interest in shoes, and the commodities recommended to the user may be shoes of other styles. The user may have no interest in these recommendations and may even come to dislike them, which results in a bad purchasing experience.

In summary, because the behavior data in the user behavior sequence is sparse, the user interest representation cannot be obtained accurately. To generate a more accurate original interest representation from a sparse user behavior sequence, an embodiment of the specification learns the user interest representation with a GCN. Specifically, generating the original interest representation according to the historical objects contained in the historical behavior sequence includes: determining a target user corresponding to the historical behavior sequence and an associated user corresponding to the target user; determining similar objects of the associated user, and constructing a user behavior relation graph of the target user based on the historical objects contained in the historical behavior sequence and the similar objects; and extracting the original interest representation of the target user from the user behavior relation graph.

The target user can be understood as the user corresponding to the historical behavior sequence, the associated user as a user who has the same historical behavior as the target user, and the similar objects as the objects of the associated user's other historical behaviors. A GCN plays essentially the same role as a convolutional neural network (CNN), namely that of a feature extractor, except that it operates on graph data. Therefore, when the historical behaviors of the target user are sparse, the interests of associated users who share behaviors with the target user can be borrowed.

In practical application, when a target user does not have much user data, the cold-start problem needs to be solved so that the user is satisfied with the recommendation results and willing to use the recommendation system. A user behavior relation graph (U-I graph) can therefore be constructed based on the historical behavior sequence of the target user, so that the target user borrows the interests of neighboring users in the graph to expand the target user's own historical behavior sequence. For example, if the behavior sequence of user A originally contains only one behavior, the interests of associated users with similar behaviors (such as users B and C) can be transferred to user A; equivalently, the historical behaviors of users B and C are added to the historical behavior sequence of the target user. The user interest representation of the target user, namely the original interest representation, can then be generated by learning on the user behavior relation graph.

Taking a purchased commodity as an example, determining a target user A corresponding to a historical purchase behavior sequence, and an associated user B and an associated user C having the same historical purchase behavior as the target user A, taking a corresponding commodity object in the historical purchase behaviors of the associated user B and the associated user C as a similar object, constructing a user purchase behavior relation graph based on the historical purchased commodity and the purchased commodity of the similar object included in the original historical behavior sequence of the target user A, and extracting an original interest representation of the target user A in the user purchase behavior relation graph based on a graph convolution neural network (GCN).
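The graph-construction step in this example can be sketched in plain Python. The sketch only builds the bipartite user-item structure and collects the similar objects borrowed from associated users; the GCN that extracts the interest representation is omitted. All function names and the sample purchase data are hypothetical.

```python
from collections import defaultdict


def build_user_item_graph(histories):
    """Build a bipartite U-I graph as two adjacency maps.

    histories: dict mapping user id -> list of purchased items.
    """
    user_to_items = {user: set(items) for user, items in histories.items()}
    item_to_users = defaultdict(set)
    for user, items in user_to_items.items():
        for item in items:
            item_to_users[item].add(user)
    return user_to_items, dict(item_to_users)


def associated_users(target, user_to_items, item_to_users):
    """Users who share at least one historical object with the target user."""
    users = set()
    for item in user_to_items[target]:
        users |= item_to_users[item]
    users.discard(target)
    return users


def similar_objects(target, user_to_items, item_to_users):
    """Objects from associated users' histories that the target user lacks."""
    own = user_to_items[target]
    borrowed = set()
    for user in associated_users(target, user_to_items, item_to_users):
        borrowed |= user_to_items[user] - own
    return borrowed


# Hypothetical purchase histories: user A has a single behavior.
histories = {
    "A": ["shoes"],
    "B": ["shoes", "jacket"],
    "C": ["shoes", "socks"],
    "D": ["phone"],
}
u2i, i2u = build_user_item_graph(histories)
neighbors = associated_users("A", u2i, i2u)
borrowed = similar_objects("A", u2i, i2u)
```

Here users B and C are associated with A through the shared purchase, so their other purchases become similar objects for A, while user D contributes nothing.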

In conclusion, introducing the user behavior relation graph makes it possible to generate a more accurate user interest representation, on which the sequence recommendation model can subsequently be trained more effectively, so that the model outputs more accurate recommendation results.

Step 204: calculate the object similarity between each historical object and the reference preference object according to the attention influence parameter of each historical object in at least one dimension.

In practical application, the dimensions include a time dimension and a space dimension. The attention influence parameter in the space dimension may be a relation factor between business objects, such as the similarity between commodity A and commodity B; the attention influence parameter in the time dimension may be the time relationship between each business object and the target object. According to the attention influence parameter of each historical object in at least one dimension, the object similarity between each historical object and the reference preference object (the target object) can be calculated; the object similarity characterizes the correlation between each element and the target content.

In practical application, taking the time dimension as an example, the further a historical object is in time from the target object, the smaller its influence on the prediction result, so its attention weight should be lower. In general, an attention mechanism is used to learn the importance of each element in the user's historical behavior sequence, so that the features needed in the target scenario can be focused on. To learn the relationship between the historical behaviors and the target object, some works use a multi-head attention mechanism in which the target object is treated as the query (Q) and the historical objects in the user behavior sequence as the keys (K) and values (V), corresponding to Q, K, and V in the model. When the object similarity is calculated from attention influence parameters in multiple dimensions, an object sub-similarity can be calculated from the attention influence parameter in each dimension, and the final object similarity is determined by any one of means such as averaging, weighted summation, and summation.
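The combination of per-dimension sub-similarities into a final object similarity can be sketched as below. The three options mirror the averaging, weighted summation, and summation mentioned above; the function name, argument layout, and sample values are illustrative assumptions.

```python
import numpy as np


def combine_similarities(sub_sims, method="mean", weights=None):
    """Combine per-dimension sub-similarities into one similarity per object.

    sub_sims : array-like of shape (num_dimensions, num_history_objects),
               e.g. row 0 = time dimension, row 1 = space dimension
    method   : "mean", "sum", or "weighted_sum"
    weights  : one weight per dimension, required for "weighted_sum"
    """
    sub = np.asarray(sub_sims, dtype=float)
    if method == "mean":
        return sub.mean(axis=0)
    if method == "sum":
        return sub.sum(axis=0)
    if method == "weighted_sum":
        if weights is None:
            raise ValueError("weights are required for weighted_sum")
        return np.asarray(weights, dtype=float) @ sub
    raise ValueError(f"unknown method: {method}")


# Two dimensions, two historical objects (hypothetical values).
sub_sims = [[0.2, 0.8],   # time-dimension sub-similarity
            [0.6, 0.4]]   # space-dimension sub-similarity
mean_sim = combine_similarities(sub_sims, "mean")
sum_sim = combine_similarities(sub_sims, "sum")
weighted_sim = combine_similarities(sub_sims, "weighted_sum", weights=[0.25, 0.75])
```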

However, since many history objects in the user's historical behavior are unrelated to the target object, this method introduces time information to reinforce the temporal information and regularity of the user's use of the target object, which specifically includes: determining a target user corresponding to the historical behavior sequence, and determining a reference preference object corresponding to the target user; determining an attention influence parameter of each history object in the time dimension; and calculating the object similarity between each history object and the reference preference object according to the attention influence parameter.

In an embodiment of the present specification, the attention influence parameter in the time dimension is defined as three relationship factors between each history object in the historical behavior sequence and the target object, namely a basic relationship factor, a discrete time relationship factor and a continuous time relationship factor. The three relationship factors are finally fused to obtain a time-aware attention representation, that is, the object similarity. In this way, the time-periodic pattern of the user behavior is mined, so that a more accurate counterfactual sequence can be generated subsequently.

In practical application, the basic relationship factor (Attention Base, A1) and the discrete time relationship factor (Sparse Time, A2) are both expressed by attention: the internal correlation of the sequence is learned by a multi-head attention mechanism and then calculated by the following formula. The three matrices Q (Query), K (Key) and V (Value) all come from the same input. First, the dot product between Q and K is calculated; to prevent the result from becoming too large, it is divided by a scale factor $\sqrt{d_k}$, where $d_k$ is the dimension of a Query and Key vector:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$
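As a minimal sketch of the scaled dot-product step described above, the following pure-Python code computes the attention weights of one query over a set of keys and the resulting weighted sum of values. It covers a single head and a single query only; the function names and the toy vectors are assumptions for illustration, not the model defined in the specification.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Dot product of the query with each key, divided by sqrt(d_k), then softmax."""
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k) for key in keys
    ]
    return softmax(scores)


def attend(query, keys, values):
    """Weighted sum of the value vectors using the attention weights."""
    w = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w[i] * values[i][j] for i in range(len(values))) for j in range(dim)]


# Toy example: the first key matches the query, so it receives the larger weight.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
weights = attention_weights(query, keys)
output = attend(query, keys, values)
```

Dividing by the square root of d_k keeps the scores in a range where the softmax does not saturate as the vector dimension grows.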

The continuous time relationship factor is used to mine the influence of the time interval on the target object. We expect to find a mapping function Φ(·) from the time domain to a d-dimensional vector space. Since we want to define the relationship between the time interval of any two timestamps t1 and t2, that relationship is expressed as the inner product of the corresponding time encodings, ⟨Φ(t1), Φ(t2)⟩. To learn the mapping function, it is defined as a mapping function of the frequency ω and can be calculated by the following formula:

wherein the frequency parameters ω are a series of trainable parameters and also serve as the continuous time relationship factor. Finally, the three relationship factors are multiplied to obtain the final time-aware-attention-based similarity of the user history sequence, that is, the object similarity between each history object and the target object.
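The formula for the mapping function is not reproduced in this text. As an illustration only, the sketch below assumes a sinusoidal time encoding of the kind commonly used for functional time representations, in which each frequency contributes a cosine and a sine feature; with that assumption the inner product of two encodings depends only on the time interval. The encoding form, the function names and the example frequencies are all assumptions, and the frequencies would be trainable in a real model rather than fixed.

```python
import math


def time_encoding(t, omegas):
    """Assumed form: sqrt(1/d) * [cos(w1*t), sin(w1*t), ..., cos(wd*t), sin(wd*t)]."""
    scale = math.sqrt(1.0 / len(omegas))
    feats = []
    for w in omegas:
        feats.append(scale * math.cos(w * t))
        feats.append(scale * math.sin(w * t))
    return feats


def time_kernel(t1, t2, omegas):
    """Inner product of the two time encodings."""
    p1 = time_encoding(t1, omegas)
    p2 = time_encoding(t2, omegas)
    return sum(a * b for a, b in zip(p1, p2))


def fuse_factors(a1, a2, a3):
    """Multiply the three relationship factors element-wise, one value per history object."""
    return [x * y * z for x, y, z in zip(a1, a2, a3)]


omegas = [0.5, 1.0, 2.0]  # illustrative frequencies, fixed here for the sketch
k_near = time_kernel(3.0, 5.0, omegas)
k_shifted = time_kernel(10.0, 12.0, omegas)
similarity = fuse_factors([0.5, 0.2], [0.4, 0.5], [1.0, 0.5])
```

Because cos(a)cos(b) + sin(a)sin(b) = cos(a − b), `k_near` and `k_shifted` agree: both pairs of timestamps are two time units apart.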

On this basis, calculating the object similarity between each history object and the target object determines the correlation between each history object and the target object. Since the target object is the reference preference object of the user, the positive interest objects and negative interest objects of the user can subsequently be selected from the historical behavior sequence based on the object similarity, and the corresponding positive interest sequence and negative interest sequence can be generated, thereby producing a more accurate counterfactual sequence.

Step 206: determining a target history object in the historical behavior sequence based on the object similarity, and generating a counterfactual interest representation according to the target history object.

After the object similarity between each history object and the reference preference object is obtained, the target history objects can be determined in the historical behavior sequence. The target history objects comprise the history objects with high object similarity and the history objects with low object similarity.

In practical application, the screened target history objects include positive history objects and negative history objects: positive history objects are history objects the user is interested in, and negative history objects are history objects the user is not interested in. A positive interest sequence and a negative interest sequence of the user are therefore generated from the target history objects in a counterfactual enhancement manner, and the counterfactual interest sequence comprises these two sequences. Contrastive learning between the two generated sequences and the original interest sequence of the user can address the cold start problem and yield a more robust representation of the user interest sequence. Because the object similarity is calculated based on the time dimension, the counterfactual interest representation derived from the generated counterfactual sequences is more accurate.

In specific implementation, in order to accurately determine the target history objects in the historical behavior sequence based on the object similarity, the history objects in the historical behavior sequence may be sorted according to the object similarity corresponding to each history object, which specifically includes: sorting the history objects in the historical behavior sequence from high to low based on the object similarity; determining a preset number of forward-order history objects and a preset number of reverse-order history objects according to the sorting result; and taking the forward-order history objects and the reverse-order history objects as the target history objects.

The preset number can be understood as a number of history objects to be acquired that is set in advance. For example, after the history objects are sorted from high to low by object similarity, the history objects corresponding to the first five object similarities are acquired as the forward-order history objects, and the history objects corresponding to the last five object similarities are acquired as the reverse-order history objects; the preset number in this example is 5. The preset number can also be a sequence replacement number: when the replacement number is k, the positions of the k elements with the lowest and the highest object similarity are selected from the historical behavior sequence and denoted Min_k and Max_k respectively, and the counterfactual positive interest sequence and the counterfactual negative interest sequence can then be generated according to Min_k and Max_k.

In practical application, the object similarity is calculated and the target history objects are determined for counterfactual enhancement: the higher the object similarity of a history object, the greater the influence on the predicted user interest representation after that history object is replaced. The negative interest sequence can therefore be generated from the history objects with high object similarity and, conversely, the positive interest sequence can be generated from the history objects with low object similarity. The counterfactual interest sequences, including the positive interest sequence and the negative interest sequence, can then be generated from the target history objects.

In an embodiment of the present specification, the history objects in the historical behavior sequence are sorted from high to low by object similarity, the top 3 history objects in the sorting result are selected as the forward-order history objects, the bottom 3 history objects are selected as the reverse-order history objects, and the selected forward-order and reverse-order history objects are used as the target history objects.
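The sorting and selection described above can be sketched as follows. The function name `select_target_history_objects`, the example items and the similarity values are hypothetical; the sketch returns positions in the original sequence, matching the Min_k and Max_k notation.

```python
def select_target_history_objects(history, similarities, k):
    """Return (max_k, min_k): positions of the k most and k least similar history objects."""
    if len(history) != len(similarities):
        raise ValueError("one similarity is required per history object")
    if k <= 0 or 2 * k > len(history):
        raise ValueError("k must be positive and at most half the sequence length")

    # Sort positions from high to low object similarity.
    order = sorted(range(len(history)), key=lambda i: similarities[i], reverse=True)
    max_k = sorted(order[:k])   # forward-order history objects (highest similarity)
    min_k = sorted(order[-k:])  # reverse-order history objects (lowest similarity)
    return max_k, min_k


history = ["a", "b", "c", "d", "e", "f"]
similarities = [0.1, 0.9, 0.5, 0.7, 0.2, 0.3]
max_k, min_k = select_target_history_objects(history, similarities, k=2)
```

Requiring 2k to be no larger than the sequence length is a design choice of this sketch that keeps the two position sets disjoint; the text does not state how short sequences are handled.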

In practical application, noise in the user's historical behavior sequence can interfere with the extraction of the user's real interest, and predicting the content the user is currently likely to be interested in from the user behavior sequence alone may be inaccurate. Out-of-distribution user sequences can therefore be constructed from the selected target history objects in a counterfactual enhancement manner, that is, the counterfactual interest sequences are generated. This specifically includes: performing sequence construction processing on the forward-order history objects based on a preset sequence construction strategy, and generating a counterfactual reverse interest sequence according to the processing result, wherein the preset sequence construction strategy comprises sub-strategies for performing reconstruction processing on the history objects; performing sequence construction processing on the reverse-order history objects based on the preset sequence construction strategy, and generating a counterfactual forward interest sequence according to the processing result; and generating a counterfactual interest representation according to the counterfactual reverse interest sequence and the counterfactual forward interest sequence.

The sequence construction processing can be understood as constructing sequences from the target history objects with the preset sequence construction strategy. The preset sequence construction strategy comprises sub-strategies for performing reconstruction processing on the history objects, namely a dropping strategy (Dropout), a masking strategy (Mask) and a rearrangement strategy (Reorder). Three subsequences corresponding to the forward-order history objects and three subsequences corresponding to the reverse-order history objects are generated through the three sub-strategies respectively; the subsequences can then be combined to generate the negative interest sequence and the positive interest sequence, and the counterfactual representations are generated through pooling aggregation.

In practical application, the forward-order history objects generate one negative interest subsequence in each of the three modes Dropout/Mask/Reorder, and the three negative interest subsequences are combined into the negative interest sequence, that is, the counterfactual reverse interest sequence. The reverse-order history objects likewise generate one positive interest subsequence in each of the three modes, and the three positive interest subsequences are combined into the positive interest sequence, that is, the counterfactual forward interest sequence. Pooling the negative interest sequence yields the negative interest representation, which is considered to differ greatly from the real interest of the user. Similarly, the positive interest representation can be extracted from the positive interest sequence; it represents the real interest of the user while introducing a perturbation of potential interest, so that the potential interest of the user can be discovered.

In one embodiment of the specification, based on the forward-order history objects, the three reconstruction sub-strategies Dropout/Mask/Reorder respectively generate three negative interest subsequences $[Seq^{-,D}, Seq^{-,M}, Seq^{-,R}]$, and the negative interest sequence $Seq^{-,All}$ is generated from the three negative interest subsequences. Three positive interest subsequences $[Seq^{+,D}, Seq^{+,M}, Seq^{+,R}]$ are generated from the reverse-order history objects, and the positive interest sequence $Seq^{+,All}$ is generated from the three positive interest subsequences.
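One possible reading of the three reconstruction sub-strategies is sketched below. The text does not spell out how each sub-strategy edits the sequence, so the sketch assumes that the operations are applied to the original behavior sequence at the selected positions (Max_k or Min_k): Dropout removes those items, Mask replaces them with a placeholder, and Reorder shuffles them among themselves. The `MASK` token, the function names and the choice to return the three subsequences in a dictionary are all assumptions.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def drop_items(seq, positions):
    """Dropout: remove the items at the selected positions."""
    chosen = set(positions)
    return [x for i, x in enumerate(seq) if i not in chosen]


def mask_items(seq, positions):
    """Mask: replace the items at the selected positions with a placeholder."""
    chosen = set(positions)
    return [MASK if i in chosen else x for i, x in enumerate(seq)]


def reorder_items(seq, positions, rng):
    """Reorder: shuffle the selected items among the selected positions only."""
    pos = sorted(positions)
    items = [seq[i] for i in pos]
    rng.shuffle(items)
    out = list(seq)
    for i, item in zip(pos, items):
        out[i] = item
    return out


def build_counterfactual_subsequences(seq, positions, seed=0):
    """Return the D / M / R subsequences built from the selected positions."""
    rng = random.Random(seed)
    return {
        "D": drop_items(seq, positions),
        "M": mask_items(seq, positions),
        "R": reorder_items(seq, positions, rng),
    }


seq = ["a", "b", "c", "d", "e", "f"]
negative = build_counterfactual_subsequences(seq, positions=[1, 3])  # e.g. Max_k
positive = build_counterfactual_subsequences(seq, positions=[0, 4])  # e.g. Min_k
```

Editing the most similar positions is expected to move the sequence away from the user's interest (negative subsequences), while editing the least similar positions should leave the interest largely intact (positive subsequences).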

In specific implementation, in order to obtain the counterfactual interest representation from the counterfactual sequences, the counterfactual sequences need to be subjected to pooling aggregation, which specifically includes: pooling the counterfactual reverse interest sequence to obtain a counterfactual reverse interest representation; pooling the counterfactual forward interest sequence to obtain a counterfactual forward interest representation; and determining the counterfactual interest representation from the counterfactual reverse interest representation and the counterfactual forward interest representation.

Wherein, the pooling processing can be understood as pooling aggregation. Three pooling methods are provided in the present specification, namely mean pooling, max pooling and sum pooling, and the counterfactual interest sequences can be pooled with any one of them to obtain the counterfactual interest representation.

In practical applications, the counterfactual interest sequences include the counterfactual forward interest sequence and the counterfactual reverse interest sequence, so both types of interest sequence need to be pooled. The counterfactual interest representations obtained after pooling accordingly include the counterfactual forward interest representation, which is the positive interest representation, and the counterfactual reverse interest representation, which is the negative interest representation.

In one embodiment of the present specification, mean pooling is performed on the counterfactual reverse interest sequence $Seq^{-,All}$ to obtain the counterfactual reverse interest representation $Seq^{-}$, and mean pooling is performed on the counterfactual forward interest sequence $Seq^{+,All}$ to obtain the counterfactual forward interest representation $Seq^{+}$. The sequence recommendation model can then be trained through contrastive learning based on the counterfactual interest representations and the original interest representation, which improves the training effect of the sequence recommendation model.
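The three pooling methods named above can be sketched as follows, assuming that each element of an interest sequence has already been embedded as a fixed-length vector. The function name `pool` and the toy embeddings are illustrative.

```python
def pool(vectors, method="mean"):
    """Aggregate a sequence of equal-length embedding vectors into one vector."""
    if not vectors:
        raise ValueError("cannot pool an empty sequence")
    columns = list(zip(*vectors))  # one tuple per embedding dimension
    if method == "mean":
        return [sum(c) / len(c) for c in columns]
    if method == "max":
        return [max(c) for c in columns]
    if method == "sum":
        return [sum(c) for c in columns]
    raise ValueError(f"unknown pooling method: {method}")


embedded_sequence = [[1.0, 4.0], [3.0, 2.0]]  # two items, two dimensions
mean_repr = pool(embedded_sequence, "mean")
max_repr = pool(embedded_sequence, "max")
sum_repr = pool(embedded_sequence, "sum")
```

Mean pooling, as used in the embodiment, makes the representation independent of sequence length, which matters because the Dropout subsequence is shorter than the Mask and Reorder subsequences.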

Step 208: generating optimized parameter pairs according to the original interest representation, the reference preference representation corresponding to the reference preference object and the counterfactual interest representation, and training a sequence recommendation model associated with the object recommendation service according to the optimized parameter pairs.

Because the counterfactual interest representation is obtained by counterfactual data enhancement, training the model requires not only optimizing the objective of the model's own service scenario (click-through rate estimation) but also using the two types of counterfactual interest representation to constrain the generated user sequence representation, which enhances the robustness of that representation. The loss function of model training therefore comprises two parts: one part is a cross entropy loss value calculated from the original interest representation and the reference preference representation, and the other part is a contrast loss value calculated from the counterfactual interest representation and the original interest representation. The sequence recommendation model is trained based on these two loss values.

In practical application, one optimization parameter pair is generated from the original interest representation and the reference preference representation corresponding to the reference preference object, and the other optimization parameter pair is generated from the original interest representation and the counterfactual interest representation. Loss values are calculated from the two optimization parameter pairs respectively, and the parameters of the sequence recommendation model are adjusted based on the loss values until the sequence recommendation model reaches a training stop condition.

In specific implementation, because the two optimization parameter pairs use different loss calculation methods, two optimization parameter pairs need to be generated from the original interest representation, the reference preference representation and the counterfactual interest representation, which specifically includes: generating a first optimization parameter pair according to the original interest representation and the reference preference representation corresponding to the reference preference object; and generating a second optimization parameter pair according to the original interest representation and the counterfactual interest representation.

In practical application, the first optimization parameter pair is composed of the original interest representation and the reference preference representation, and its loss value is calculated with a cross entropy loss function; the second optimization parameter pair is composed of the original interest representation and the counterfactual interest representation, and its loss value is calculated with a contrast loss function. Specifically: calculating a cross entropy loss value according to the first optimization parameter pair, and calculating a contrast loss value according to the second optimization parameter pair; calculating a target loss value based on the cross entropy loss value and the contrast loss value; and training the sequence recommendation model associated with the object recommendation service according to the target loss value.

Wherein the cross entropy loss value can be calculated by the following formula, wherein $y_j$ represents the original interest representation:

The contrast loss value can be calculated by the following formula, wherein $x_j$ represents the original interest representation, and the other two terms represent the counterfactual forward interest representation and the counterfactual reverse interest representation, respectively:

In practical applications, after the cross entropy loss value and the contrast loss value are calculated, the target loss value can be calculated based on the sum of the cross entropy loss value and the contrast loss value. The target loss value can be calculated by the following formula, wherein $\lambda_1$ is used for balancing the learning importance between the two tasks:
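The loss formulas themselves are not reproduced in this text, so the following is only a sketch of one way the two-part objective could be assembled. It assumes a binary cross entropy over predicted click probabilities, a softmax-style contrast loss that pulls the original interest representation toward the counterfactual forward representation and away from the counterfactual reverse representation, and a target loss of the form cross entropy plus $\lambda_1$ times contrast loss. The exact forms, the `temperature` parameter and all function names are assumptions.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrast_loss(anchor, positive, negative, temperature=1.0):
    """-log( exp(s+) / (exp(s+) + exp(s-)) ) with dot-product similarities."""
    pos = dot(anchor, positive) / temperature
    neg = dot(anchor, negative) / temperature
    m = max(pos, neg)  # log-sum-exp stabilisation
    log_denominator = m + math.log(math.exp(pos - m) + math.exp(neg - m))
    return log_denominator - pos


def target_loss(ce, cl, lambda_1):
    """Assumed combination: cross entropy plus lambda_1 times the contrast loss."""
    return ce + lambda_1 * cl


ce = binary_cross_entropy([1, 0], [0.9, 0.1])
cl_good = contrast_loss([1.0, 0.0], positive=[1.0, 0.0], negative=[0.0, 1.0])
cl_bad = contrast_loss([1.0, 0.0], positive=[0.0, 1.0], negative=[1.0, 0.0])
total = target_loss(ce, cl_good, lambda_1=0.1)
```

With this form the contrast loss equals log 2 when the original representation is equally similar to both counterfactual representations, and falls below it as the positive representation becomes the closer one.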

On this basis, the model parameters of the sequence recommendation model are adjusted through the target loss value, so that the training effect of the sequence recommendation model is better and a sequence recommendation model with more accurate recommendation results can be obtained subsequently, thereby serving users better and improving the user experience.

The training method of the sequence recommendation model provided by the specification comprises: obtaining a historical behavior sequence associated with an object recommendation service, and generating an original interest representation according to the history objects contained in the historical behavior sequence; calculating the object similarity between each history object and a reference preference object according to the attention influence parameter of each history object in at least one dimension; determining a target history object in the historical behavior sequence based on the object similarity, and generating a counterfactual interest representation according to the target history object; and generating optimized parameter pairs according to the original interest representation, the reference preference representation corresponding to the reference preference object and the counterfactual interest representation, and training a sequence recommendation model associated with the object recommendation service according to the optimized parameter pairs. Because the original interest representation is generated from the history objects contained in the historical behavior sequence and the object similarity between each history object and the reference preference object is calculated from the attention influence parameter of each history object, a more accurate counterfactual interest representation can be generated in a counterfactual enhancement manner. Training the sequence recommendation model with the original interest representation, the reference preference representation and the counterfactual interest representation yields a trained sequence recommendation model and addresses the problem that objects of interest to the user cannot be accurately predicted when the sequence is sparse or noisy.

The following describes the training method of the sequence recommendation model further by taking the application of the training method of the sequence recommendation model provided in the present specification in the financial marketing service as an example, with reference to fig. 3. Fig. 3 is a flowchart illustrating a processing procedure of a training method for a sequence recommendation model according to an embodiment of the present specification, where specific steps include step 302 to step 326.

Step 302: acquiring a historical behavior sequence of the financial marketing service associated with the target user.

In an implementable embodiment, a historical behavior sequence of the target user under the financial marketing service is obtained, wherein the historical behavior sequence comprises behaviors such as marketing red packets and marketing vouchers that the target user has claimed in the past.

Step 304: determining the associated users corresponding to the target user and the similar objects of the associated users.

In an implementable embodiment, the associated users having the same historical behaviors as the target user are determined, and the historical behaviors of the associated users are obtained as similar objects, wherein the similar objects comprise comment participation and forwarding activities.

Step 306: constructing a user behavior relation graph of the target user based on the history objects contained in the historical behavior sequence and the similar objects.

In an implementable embodiment, a user behavior relation graph of the target user is constructed based on the history objects in the historical behavior sequence of the target user and the similar objects.

Step 308: extracting the original interest representation of the target user from the user behavior relation graph.

In an implementable embodiment, a GCN (graph convolutional network) method is adopted to extract the original interest representation of the target user from the user behavior relation graph.
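The text names the GCN method without giving its configuration. As a rough illustration, the sketch below applies one graph-convolution layer using the widely used propagation rule with self-loops and symmetric degree normalisation, followed by a ReLU. The two-node graph, the feature vectors and the identity weight matrix are hypothetical; a real model would learn the weights and typically stack several layers.

```python
import math


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weight):
    """One layer: ReLU( D^-1/2 (A + I) D^-1/2 . H . W )."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    norm = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    projected = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in projected]


# Hypothetical graph: node 0 is the target user, node 1 is a connected object.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(adjacency, features, identity)
user_representation = hidden[0]
```

In this toy graph each node ends up with the average of its own and its neighbour's features, which is how information from similar objects reaches the target user's representation.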

Step 310: determining a reference preference object corresponding to the target user and an attention influence parameter, in the time dimension, of each history object contained in the historical behavior sequence.

In an implementable embodiment, the real reference preference object of the target user is determined, the reference preference object being a full red packet, and three relationship factors of each history object contained in the historical behavior sequence are determined, namely a basic relationship factor, a discrete time relationship factor and a continuous time relationship factor.

Step 312: calculating the object similarity between each history object and the reference preference object according to the attention influence parameter.

In an implementable embodiment, the object similarity between each history object in the user's historical behavior sequence and the reference preference object is calculated according to the three relationship factors.

Step 314: sorting the history objects in the historical behavior sequence from high to low based on the object similarity.

In an implementable embodiment, the history objects in the historical behavior sequence are sorted from high to low by object similarity.

Step 316: determining a preset number of forward-order history objects and a preset number of reverse-order history objects according to the sorting result, and taking the forward-order history objects and the reverse-order history objects as target history objects.

In an implementable embodiment, the preset number is a sequence replacement number k, the forward-order history objects Max_k and the reverse-order history objects Min_k are determined according to the sorting result, and the forward-order history objects and the reverse-order history objects are used as the target history objects.

Step 318: performing sequence construction processing on the forward-order history objects based on a preset sequence construction strategy and generating a counterfactual reverse interest sequence according to the processing result, and performing sequence construction processing on the reverse-order history objects based on the preset sequence construction strategy and generating a counterfactual forward interest sequence according to the processing result.

In an implementable embodiment, the preset sequence construction strategy comprises sub-strategies for reconstructing the history objects, namely the three sub-strategies dropout, mask and reorder. The forward-order history objects are processed according to each of the three reconstruction sub-strategies to obtain three counterfactual reverse interest subsequences, which are combined to generate the counterfactual reverse interest sequence; the counterfactual forward interest sequence is obtained from the reverse-order history objects in the same manner.

Step 320: pooling the counterfactual reverse interest sequence to obtain a counterfactual reverse interest representation, pooling the counterfactual forward interest sequence to obtain a counterfactual forward interest representation, and determining the counterfactual interest representation according to the counterfactual reverse interest representation and the counterfactual forward interest representation.

In an implementable embodiment, the counterfactual reverse interest sequence is pooled to obtain the counterfactual reverse interest representation, the counterfactual forward interest sequence is pooled to obtain the counterfactual forward interest representation, and the counterfactual interest representation is determined according to the counterfactual reverse interest representation and the counterfactual forward interest representation.

Step 322: generating a first optimization parameter pair according to the original interest representation and the reference preference representation corresponding to the reference preference object, and generating a second optimization parameter pair according to the original interest representation and the counterfactual interest representation.

Step 324: calculating a cross entropy loss value according to the first optimization parameter pair, calculating a contrast loss value according to the second optimization parameter pair, calculating a target loss value based on the cross entropy loss value and the contrast loss value, and training the sequence recommendation model associated with the object recommendation service according to the target loss value.

According to the training method of the sequence recommendation model applied to the financial marketing service, the original interest representation is generated from the history objects contained in the historical behavior sequence, and the object similarity between each history object and the reference preference object is calculated from the attention influence parameter of each history object, so that a more accurate counterfactual interest representation can be generated in a counterfactual enhancement manner. The sequence recommendation model is trained with the original interest representation, the reference preference representation and the counterfactual interest representation to obtain the trained sequence recommendation model, which addresses the problem that objects of interest to the user cannot be accurately predicted when the sequence is sparse or noisy. Subsequently, based on the trained sequence recommendation model, user interest can be predicted accurately for users whose behavior sequences are sparse or noisy, the interest of such users is mined, and the user experience is improved.

Fig. 4 shows a flowchart of an object recommendation method provided in accordance with an embodiment of the present specification, which includes steps 402 to 408.

Step 402: inputting the historical behavior sequence of the target user associated with the target service into a sequence recommendation model, wherein the sequence recommendation model is obtained by training with the training method of the sequence recommendation model provided by the specification.

In an implementable embodiment, taking the target service being a commodity purchasing service as an example, the historical behavior sequence of the target user on a purchasing platform is input into the sequence recommendation model, and the historical behavior sequence includes the purchase records, commodity browsing records, collection records and the like of the target user on the purchasing platform.

Step 404: obtaining a recommendation result output by the sequence recommendation model, wherein the recommendation result comprises a service object associated with the target service.

In an implementable embodiment, the sequence recommendation model predicts, according to the input historical behavior sequence, the commodity in which the user is interested, that is, the service object associated with the commodity purchasing service, and recommends the commodity to the user as the recommendation result.

The object recommendation method provided by the specification comprises the steps of inputting a historical behavior sequence of a target user associated with a target service into a sequence recommendation model, wherein the sequence recommendation model is obtained by training through a training method of the sequence recommendation model provided by the specification; and obtaining a recommendation result output by the sequence recommendation model, wherein the recommendation result comprises a service object related to the target service. The sequence recommendation model obtained by the training method based on the sequence recommendation model provided by the specification can solve the problems of sparse user behavior sequence data and noise, can accurately predict a service object according with the preference of a user, and feeds the service object back to the user as a recommendation result, so that the service use experience of the user is improved.

Fig. 5 is a flowchart illustrating another object recommendation method provided in accordance with an embodiment of the present disclosure, including steps 502 through 508.

Step 502: receiving a participation request of a target user for the target service.

In an implementable embodiment, a browsing request of the target user for browsing a shopping platform page is received.

Step 504: determining the historical behavior sequence of the target user according to the participation request.

In an implementable embodiment, according to the browsing request sent by the target user through a terminal device, the historical behavior sequence of the user on the shopping platform is determined in a database, and the historical behavior sequence comprises the purchase records, commodity browsing records, collection records and the like of the target user on the shopping platform.

Step 506: inputting the historical behavior sequence into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training with the training method of the sequence recommendation model provided by the specification.

In an implementable embodiment, the historical behavior sequence is input into the sequence recommendation model, so that the sequence recommendation model predicts the commodities matching the user's interest, providing the user with a faster and more convenient way of purchasing.

Step 508: obtaining a recommendation result output by the sequence recommendation model, wherein the recommendation result comprises a service object associated with the target service.

In an implementable embodiment, the predicted commodity is fed back to the user terminal as the recommendation result.

The object recommendation method provided by this specification comprises: receiving a participation request of a target user for a target service; determining a historical behavior sequence of the target user according to the participation request; inputting the historical behavior sequence into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by the training method of the sequence recommendation model provided by this specification; and obtaining a recommendation result output by the sequence recommendation model, wherein the recommendation result comprises a service object related to the target service. Because the sequence recommendation model is obtained by the training method provided by this specification, it can address the sparsity and noise of user behavior sequence data, accurately predict service objects that match the user's preferences, and feed those service objects back to the user as the recommendation result, thereby improving the user's experience of the service.
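The flow of steps 502 through 508 can be sketched as follows. This is a minimal illustration only: the names `ParticipationRequest`, `HISTORY_DB`, `recommend`, and `toy_model` are hypothetical and do not appear in this specification, and `toy_model` is a placeholder standing in for the trained sequence recommendation model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for the database of behavior records.
HISTORY_DB: Dict[str, List[str]] = {
    "user_1": ["purchase:phone", "browse:phone_case", "collect:charger"],
}


@dataclass
class ParticipationRequest:
    """Step 502: a request of a target user for a target service."""
    user_id: str
    service: str


def recommend(
    request: ParticipationRequest,
    model: Callable[[List[str]], List[str]],
    history_db: Dict[str, List[str]],
) -> List[str]:
    # Step 504: determine the historical behavior sequence from the request.
    history = history_db.get(request.user_id, [])
    # Step 506: input the sequence into the (already trained) model.
    # Step 508: return the model output as the recommendation result.
    return model(history)


def toy_model(history: List[str]) -> List[str]:
    """Placeholder model (assumption): recommends something related to the
    object of the most recent behavior record."""
    if not history:
        return []
    latest_object = history[-1].split(":", 1)[1]
    return [f"related_to_{latest_object}"]


result = recommend(ParticipationRequest("user_1", "shopping"), toy_model, HISTORY_DB)
```

In a real deployment the lookup would query a behavior store and `model` would wrap the trained network; only the order of the four steps is taken from the text above.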

Corresponding to the above method embodiment, the present specification further provides an embodiment of a training apparatus for a sequence recommendation model, and Fig. 6 shows a schematic structural diagram of a training apparatus for a sequence recommendation model provided in an embodiment of the present specification. As shown in Fig. 6, the apparatus includes:

a first generation module 602, configured to obtain a historical behavior sequence of a recommendation service of an associated object, and generate an original interest representation according to a historical object included in the historical behavior sequence;

a calculating module 604 configured to calculate an object similarity between each history object and the reference preference object according to the attention impact parameter of each history object in at least one dimension;

a second generation module 606 configured to determine a target historical object in the historical behavior sequence based on object similarity, and generate a counterfactual interest representation according to the target historical object;

a training module 608 configured to generate an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object, and the counterfactual interest representation, and train a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair.

Optionally, the first generating module 602 is further configured to:

determining a target user corresponding to the historical behavior sequence and an associated user corresponding to the target user;

determining similar objects of the associated user, and constructing a user behavior relation graph of the target user based on the historical objects contained in the historical behavior sequence and the similar objects;

and extracting the original interest representation of the target user from the user behavior relation graph.
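One possible reading of the graph construction and extraction above is sketched below. The specification does not state how edges are weighted or how the representation is extracted, so the weights (1.0 for the target user's own historical objects, 0.5 for similar objects contributed by associated users) and the weighted mean pooling are assumptions, as are the function names.

```python
from typing import Dict, List

Vector = List[float]


def build_behavior_graph(history: List[str], similar_objects: List[str]) -> Dict[str, float]:
    """Return edge weights from the target user node to each object node.

    Assumed weighting: similar objects of associated users get 0.5,
    the target user's own historical objects get 1.0.
    """
    edges: Dict[str, float] = {}
    for obj in similar_objects:
        edges[obj] = 0.5
    for obj in history:
        edges[obj] = 1.0  # the user's own behavior takes precedence
    return edges


def extract_interest(edges: Dict[str, float], embeddings: Dict[str, Vector]) -> Vector:
    """Weighted mean of the embeddings of all objects linked to the target user."""
    dim = len(next(iter(embeddings.values())))
    total = sum(edges.values())
    representation = [0.0] * dim
    if total == 0:
        return representation
    for obj, weight in edges.items():
        for i, value in enumerate(embeddings[obj]):
            representation[i] += weight * value / total
    return representation


embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = build_behavior_graph(history=["a", "b"], similar_objects=["c"])
original_interest = extract_interest(edges, embeddings)
```

Adding the similar objects of associated users gives a user with a short history more nodes to pool over, which is the motivation the text gives for handling sparse behavior sequences.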

Optionally, the calculating module 604 is further configured to:

determining a target user corresponding to the historical behavior sequence, and determining a reference preference object corresponding to the target user;

determining an attention impact parameter of each historical object in a time dimension;

and calculating the object similarity between each historical object and the reference preference object according to the attention influence parameters.
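As an illustration of a time-dimension attention impact parameter, the sketch below weights the cosine similarity between each historical object and the reference preference object by an exponential time decay. The decay form, the 30-day half-life, and the function names are assumptions; the specification only states that the similarity is computed according to the attention impact parameters.

```python
import math
from typing import List

Vector = List[float]


def time_attention(age_days: float, half_life_days: float = 30.0) -> float:
    """Assumed attention impact parameter: halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def object_similarity(history_vecs: List[Vector], ages_days: List[float], reference_vec: Vector) -> List[float]:
    """Time-weighted similarity of each historical object to the reference preference object."""
    return [
        time_attention(age) * cosine(vec, reference_vec)
        for vec, age in zip(history_vecs, ages_days)
    ]


similarities = object_similarity(
    history_vecs=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    ages_days=[0.0, 30.0, 0.0],
    reference_vec=[1.0, 0.0],
)
# similarities == [1.0, 0.5, 0.0]
```

The first two objects have identical embeddings, but the older one receives half the similarity, which shows how the time dimension changes the ranking.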

Optionally, the second generating module 606 is further configured to:

sorting the historical objects in the historical behavior sequence from high to low based on the object similarity;

determining a preset number of forward-order historical objects and a preset number of reverse-order historical objects according to the sorting result;

and taking the forward-order historical objects and the reverse-order historical objects as the target historical objects.
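The selection above amounts to taking the top and bottom entries of a similarity ranking. A minimal sketch, in which the function name and the cap that keeps the two groups from overlapping are assumptions:

```python
from typing import List, Tuple


def select_target_objects(history: List[str], similarities: List[float], k: int) -> Tuple[List[str], List[str]]:
    """Return (forward-order objects, reverse-order objects).

    Forward-order objects are the k most similar to the reference preference
    object; reverse-order objects are the k least similar.
    """
    # Sort indices from high to low similarity.
    order = sorted(range(len(history)), key=lambda i: similarities[i], reverse=True)
    # Assumption: cap k so that the two groups never share an object.
    k = min(k, len(order) // 2)
    forward = [history[i] for i in order[:k]]
    reverse = [history[i] for i in order[len(order) - k:]]
    return forward, reverse


forward_objs, reverse_objs = select_target_objects(
    history=["a", "b", "c", "d", "e"],
    similarities=[0.9, 0.1, 0.5, 0.7, 0.2],
    k=2,
)
# forward_objs == ["a", "d"], reverse_objs == ["e", "b"]
```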

Optionally, the second generating module 606 is further configured to:

performing sequence construction processing on the forward-order historical objects based on a preset sequence construction strategy, and generating a counterfactual backward interest sequence according to the processing result, wherein the preset sequence construction strategy comprises a sub-strategy for performing reconstruction processing on the historical objects;

performing sequence construction processing on the reverse-order historical objects based on the preset sequence construction strategy, and generating a counterfactual forward interest sequence according to the processing result;

and generating a counterfactual interest representation according to the counterfactual backward interest sequence and the counterfactual forward interest sequence.
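The passage does not spell out the reconstruction sub-strategy here, so the sketch below assumes one simple option: replacing the selected objects with substitutes drawn from a pool. Replacing the objects most similar to the reference preference removes the interest signal and yields the counterfactual backward sequence; replacing the least similar objects removes likely noise and yields the counterfactual forward sequence. The function names and the non-empty `replacement_pool` are hypothetical.

```python
from itertools import cycle
from typing import List, Tuple


def reconstruct(sequence: List[str], targets: List[str], replacement_pool: List[str]) -> List[str]:
    """Replace every target object in the sequence with the next pool entry.

    Assumes `replacement_pool` is non-empty.
    """
    target_set = set(targets)
    pool = cycle(replacement_pool)
    return [next(pool) if obj in target_set else obj for obj in sequence]


def build_counterfactual_sequences(
    sequence: List[str],
    forward_objs: List[str],
    reverse_objs: List[str],
    replacement_pool: List[str],
) -> Tuple[List[str], List[str]]:
    # Reconstructing the most similar objects breaks the interest signal.
    backward_seq = reconstruct(sequence, forward_objs, replacement_pool)
    # Reconstructing the least similar objects keeps the interest signal.
    forward_seq = reconstruct(sequence, reverse_objs, replacement_pool)
    return backward_seq, forward_seq


backward_seq, forward_seq = build_counterfactual_sequences(
    sequence=["a", "b", "c", "d", "e"],
    forward_objs=["a", "d"],
    reverse_objs=["e", "b"],
    replacement_pool=["x", "y"],
)
# backward_seq == ["x", "b", "c", "y", "e"]
# forward_seq  == ["a", "x", "c", "d", "y"]
```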

Optionally, the second generating module 606 is further configured to:

performing pooling processing on the counterfactual backward interest sequence to obtain a counterfactual backward interest representation;

performing pooling processing on the counterfactual forward interest sequence to obtain a counterfactual forward interest representation;

and determining a counterfactual interest representation from the counterfactual backward interest representation and the counterfactual forward interest representation.
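The pooling step can be illustrated with mean pooling over the object embeddings of each counterfactual sequence. The choice of mean pooling (rather than max or attention pooling) and the dictionary layout of the result are assumptions.

```python
from typing import Dict, List

Vector = List[float]


def mean_pool(sequence_vecs: List[Vector]) -> Vector:
    """Average the embeddings of a non-empty sequence, dimension by dimension."""
    count = len(sequence_vecs)
    return [sum(column) / count for column in zip(*sequence_vecs)]


def counterfactual_representation(backward_vecs: List[Vector], forward_vecs: List[Vector]) -> Dict[str, Vector]:
    """Keep the backward and forward representations side by side."""
    return {
        "backward": mean_pool(backward_vecs),
        "forward": mean_pool(forward_vecs),
    }


representation = counterfactual_representation(
    backward_vecs=[[1.0, 2.0], [3.0, 4.0]],
    forward_vecs=[[0.0, 2.0], [2.0, 0.0]],
)
# representation == {"backward": [2.0, 3.0], "forward": [1.0, 1.0]}
```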

Optionally, the training module 608 is further configured to:

generating a first optimization parameter pair according to the original interest representation and a reference preference representation corresponding to the reference preference object;

and generating a second optimization parameter pair according to the original interest representation and the counterfactual interest representation.

Optionally, the training module 608 is further configured to:

calculating a cross entropy loss value according to the first optimization parameter pair, and calculating a contrast loss value according to the second optimization parameter pair;

calculating a target loss value based on the cross entropy loss value and the contrast loss value;

and training the sequence recommendation model associated with the object recommendation service according to the target loss value.
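A sketch of how the two loss values might be combined is given below. The dot-product scoring, the two-term contrastive form (forward representation as positive, backward representation as negative), the temperature, and the weighting factor are all assumptions; the passage above only states that a cross entropy loss and a contrast loss are computed and combined into a target loss.

```python
import math
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def cross_entropy(original: Vector, candidates: List[Vector], reference_index: int) -> float:
    """First pair: softmax cross entropy of the reference preference object
    among candidate objects scored against the original interest representation."""
    logits = [dot(original, c) for c in candidates]
    peak = max(logits)
    log_partition = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_partition - logits[reference_index]


def contrast_loss(original: Vector, cf_forward: Vector, cf_backward: Vector, temperature: float = 1.0) -> float:
    """Second pair: pull the original representation toward the counterfactual
    forward representation and away from the backward one."""
    positive = dot(original, cf_forward) / temperature
    negative = dot(original, cf_backward) / temperature
    peak = max(positive, negative)
    return peak + math.log(math.exp(positive - peak) + math.exp(negative - peak)) - positive


def target_loss(ce_value: float, contrast_value: float, weight: float = 0.1) -> float:
    """Assumed combination: weighted sum of the two loss values."""
    return ce_value + weight * contrast_value


original = [1.0, 0.0]
ce = cross_entropy(original, candidates=[[1.0, 0.0], [0.0, 1.0]], reference_index=0)
cl = contrast_loss(original, cf_forward=[1.0, 0.0], cf_backward=[0.0, 1.0])
loss = target_loss(ce, cl)
```

In practice the target loss would be minimized with a gradient-based optimizer in a deep learning framework; the pure-Python version here only shows the arithmetic.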

The training apparatus for the sequence recommendation model provided by this specification comprises: a first generation module configured to acquire a historical behavior sequence of a recommendation service of an associated object, and generate an original interest representation according to the historical objects contained in the historical behavior sequence; a calculation module configured to calculate an object similarity between each historical object and a reference preference object according to an attention impact parameter of each historical object in at least one dimension; a second generation module configured to determine a target historical object in the historical behavior sequence based on the object similarity, and generate a counterfactual interest representation according to the target historical object; and a training module configured to generate an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object, and the counterfactual interest representation, and train a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair. The apparatus generates the original interest representation according to the historical objects contained in the historical behavior sequence and calculates the object similarity between each historical object and the reference preference object according to the attention impact parameter of each historical object, so that a more accurate counterfactual interest representation can be generated through counterfactual augmentation.

The above is a schematic scheme of a training apparatus for a sequence recommendation model according to this embodiment. It should be noted that the technical solution of the training apparatus for sequence recommendation model and the technical solution of the training method for sequence recommendation model belong to the same concept, and for details that are not described in detail in the technical solution of the training apparatus for sequence recommendation model, reference may be made to the description of the technical solution of the training method for sequence recommendation model.

Corresponding to the above method embodiment, the present specification further provides an embodiment of an object recommendation apparatus, and Fig. 7 shows a schematic structural diagram of an object recommendation apparatus provided in an embodiment of the present specification. As shown in Fig. 7, the apparatus includes:

an input module 702, configured to input a historical behavior sequence of a target service associated with a target user into a sequence recommendation model, where the sequence recommendation model is a model obtained by training through a training method of the sequence recommendation model provided in this specification;

an obtaining module 704 configured to obtain a recommendation result output by the sequence recommendation model, wherein the recommendation result includes a service object associated with the target service.

The sequence recommendation model obtained by the training method based on the sequence recommendation model provided by the specification can solve the problems of sparse user behavior sequence data and noise, can accurately predict a service object according with the preference of a user, and feeds the service object back to the user as a recommendation result, so that the service use experience of the user is improved.

The above is a schematic scheme of an object recommendation apparatus of the present embodiment. It should be noted that the technical solution of the object recommendation apparatus and the technical solution of the object recommendation method belong to the same concept, and for details that are not described in detail in the technical solution of the object recommendation apparatus, reference may be made to the description of the technical solution of the object recommendation method.

Corresponding to the above method embodiment, the present specification further provides another embodiment of an object recommendation apparatus, and Fig. 8 shows a schematic structural diagram of another object recommendation apparatus provided in an embodiment of the present specification. As shown in Fig. 8, the apparatus includes:

a receiving module 802 configured to receive a participation request of a target user for a target service;

a determining module 804 configured to determine a historical behavior sequence of the target user according to the participation request;

an input module 806, configured to input the historical behavior sequence into a sequence recommendation model, where the sequence recommendation model is a model obtained by training through a training method of a sequence recommendation model provided in this specification;

an obtaining module 808 configured to obtain a recommendation result output by the sequence recommendation model, wherein the recommendation result includes a service object associated with the target service.

The above is a schematic scheme of an object recommendation apparatus of the present embodiment. It should be noted that the technical solution of the object recommendation apparatus and the technical solution of the object recommendation method described above belong to the same concept, and for details that are not described in detail in the technical solution of the object recommendation apparatus, reference may be made to the description of the technical solution of the object recommendation method described above.

Fig. 9 illustrates a block diagram of a computing device 900 provided in accordance with an embodiment of the present description. Components of the computing device 900 include, but are not limited to, a memory 910 and a processor 920. The processor 920 is coupled to the memory 910 via a bus 930, and a database 950 is used to store data.

Computing device 900 also includes access device 940, which enables computing device 900 to communicate via one or more networks 960. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. Access device 940 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)), whether wired or wireless, such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the present description, the above-described components of computing device 900, as well as other components not shown in FIG. 9, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 9 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.

Computing device 900 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 900 may also be a mobile or stationary server.

When executing the computer instructions, the processor 920 implements the steps of the training method and the object recommendation method of the sequence recommendation model.

The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device belongs to the same concept as the technical solutions of the training method of the sequence recommendation model and the object recommendation method; for details of the technical solution of the computing device that are not described in detail, reference may be made to the description of the technical solutions of the training method of the sequence recommendation model and the object recommendation method.

An embodiment of the present specification further provides a computer readable storage medium, which stores computer instructions, and the computer instructions, when executed by a processor, implement the steps of the training method and the object recommendation method of the sequence recommendation model as described above.

The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solutions of the training method of the sequence recommendation model and the object recommendation method; for details of the technical solution of the storage medium that are not described in detail, reference may be made to the description of the technical solutions of the training method of the sequence recommendation model and the object recommendation method.

An embodiment of the present specification further provides a computer program, wherein when the computer program is executed in a computer, the computer program causes the computer to execute the steps of the training method and the object recommendation method of the sequence recommendation model.

The above is an illustrative scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program belongs to the same concept as the technical solutions of the training method of the sequence recommendation model and the object recommendation method; for details of the technical solution of the computer program that are not described in detail, reference may be made to the description of the technical solutions of the training method of the sequence recommendation model and the object recommendation method.

The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of combinations of acts, but those skilled in the art should understand that the embodiments are not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the embodiments. Furthermore, those skilled in the art will appreciate that the embodiments described in this specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the embodiments of this specification.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. The alternative embodiments do not set out every detail exhaustively, nor do they limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and their practical application, and thereby to enable others skilled in the art to best understand and utilize the specification. The specification is limited only by the claims and their full scope and equivalents.

Claims

1. A training method of a sequence recommendation model comprises the following steps:

2. The training method according to claim 1, wherein generating an original interest representation from historical objects contained in the historical behavior sequence comprises:

3. The training method of claim 1, wherein calculating object similarity between each historical object and a reference preferred object in at least one dimension of time dimension comprises:

4. The training method of claim 1, wherein determining a target historical object in the historical behavior sequence based on object similarity comprises:

5. The training method of claim 4, wherein generating a counterfactual interest characterization from the target historical object comprises:

and generating a counterfactual interest characterization according to the counterfactual backward interest sequence and the counterfactual forward interest sequence.

6. The training method of claim 5, wherein generating a counterfactual interest characterization from the counterfactual backward interest sequence and the counterfactual forward interest sequence comprises:

7. The method of claim 1, wherein generating an optimized parameter pair according to the original interest representation, the reference preference representation corresponding to the reference preference object, and the counterfactual interest representation comprises:

8. The method of claim 7, wherein training a sequence recommendation model associated with the object recommendation service according to the optimized parameter pair comprises:

9. An object recommendation method comprising:

inputting a historical behavior sequence of a target user associated with a target service into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training through the training method of any one of claims 1 to 8;

10. An object recommendation method comprising:

receiving a participation request of a target user for a target service;

inputting the historical behavior sequence into a sequence recommendation model, wherein the sequence recommendation model is a model obtained by training through the training method of any one of claims 1 to 8;

11. A training apparatus for a sequence recommendation model, comprising:

the second generation module is configured to determine a target historical object in the historical behavior sequence based on object similarity, and generate a counterfactual interest representation according to the target historical object;

12. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, the processor implementing the steps of the method of any one of claims 1-8 when executing the computer instructions.

13. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, perform the steps of the method of any one of claims 1-8.