CN114692005B

CN114692005B - Sparse ultrashort sequence-oriented personalized recommendation method, system, medium and device

Info

Publication number: CN114692005B
Application number: CN202210604217.8A
Authority: CN
Inventors: 汤胤; 李泽峥; 沈子璐; 陈永健; 路婕
Original assignee: Jinan University
Current assignee: Jinan University
Priority date: 2022-05-31
Filing date: 2022-05-31
Publication date: 2022-08-12
Anticipated expiration: 2042-05-31
Also published as: CN114692005A

Abstract

The invention discloses a sparse ultrashort sequence-oriented personalized recommendation method, a system, a medium and equipment, wherein the method comprises the following steps: constructing a sparse ultrashort user behavior sequence, constructing a relational graph of users and commodities, and performing embedded representation learning on nodes in the relational graph by adopting a graph embedding method; constructing an expert database based on sparse ultra-short user behavior data, learning a purchase strategy of the expert database, and performing sparse ultra-short user behavior sequence expansion according to the purchase strategy; based on the commodity pre-embedded representation and the expanded sparse ultra-short user behavior sequence, the information enhancement of the commodity embedded representation is completed by adopting a self-attention model, the user embedded representation is obtained according to the final commodity embedded representation and the user behavior data of the sparse ultra-short user behavior sequence, and personalized recommendation is performed. The method can improve the quality of the data input into the self-attention model and realize the application of the self-attention model in a sparse ultra-short data recommendation-oriented scene.

Description

Sparse ultrashort sequence-oriented personalized recommendation method, system, medium and device

Technical Field

The invention relates to the technical field of data customization processing, in particular to a sparse ultrashort sequence-oriented personalized recommendation method, a sparse ultrashort sequence-oriented personalized recommendation system, a sparse ultrashort sequence-oriented personalized recommendation medium and sparse ultrashort sequence-oriented personalized recommendation equipment.

Background

With the development of online business and big data, the application value of the recommendation system on business intelligence is more and more obvious. The personalized recommendation method can better mine the potential interests of the user and make recommendations in a targeted manner. At present, a common practice is to recommend corresponding commodities to a user according to user characteristics, commodity characteristics and interaction information between the user and the commodities, and massive data needs to occupy a large amount of storage space. Meanwhile, in many application scenarios, in the face of an ultra-large commodity set and limited user commodity interaction times, the user commodity interaction data is in a sparse characteristic, and a new user and a low-activity user bring a cold start problem to a recommendation system, so that a commodity purchasing sequence of the user is in an ultra-short characteristic.

With the progress of artificial intelligence technology in recent years, a large-parameter complex model has the potential of better mining the potential interest of a user, a self-attention model can well capture the long-distance dependency of an object from user behavior data, but a large number of parameters are needed to learn a global structure, and the learning on a sparse and ultrashort data set is difficult.

Disclosure of Invention

In order to overcome the defects and shortcomings in the prior art, the invention provides a sparse ultrashort sequence-oriented personalized recommendation method.

The second purpose of the invention is to provide a sparse ultrashort sequence-oriented personalized recommendation system.

A third object of the present invention is to provide a computer-readable storage medium.

It is a fourth object of the invention to provide a computing device.

In order to achieve the purpose, the invention adopts the following technical scheme:

the invention provides a sparse ultrashort sequence-oriented personalized recommendation method, which comprises the following steps:

acquiring sparse ultrashort user behavior data, constructing a sparse ultrashort user behavior sequence, constructing a relation graph of a user and a commodity according to historical data of interaction between the user and the commodity, and performing embedding representation learning on nodes in the relation graph by adopting a graph embedding method;

constructing an expert database based on sparse ultrashort user behavior data, learning a purchase strategy of the expert database by adopting a simulation learning method, and completing the expansion of a sparse ultrashort user behavior sequence by following the purchase strategy;

learning the purchasing strategy of the expert database by adopting a simulated learning method, which specifically comprises the following steps:

setting a sequence length threshold, dividing a sparse ultrashort user behavior sequence into a long sequence and a short sequence according to the sequence length threshold, and storing the long sequence into an expert database;

learning the purchasing strategy of the expert database by adopting a simulated learning method based on a generated countermeasure network;

sampling a real purchase sequence state s from a sparse ultra-short user behavior sequence, and obtaining a purchase decision a by utilizing an initialized purchase strategy pi to obtain a generation experience (s, a);

sampling a long sequence from an expert database, segmenting a part of the long sequence containing the first m commodities as a state S, and taking any one of the rest commodities as an A to obtain expert experience (S, A);

the production experience (S, a) and the expert experience (S, a) are simultaneously input into the discriminator D, and the difference is calculated using the cross entropy:

wherein the content of the first and second substances,

the result of the calculation of the difference is represented,

for true purchase sequence statussDown to policy selection action

The probability of (a) of (b) being,

for the score expectation of the purchase strategy pi,

to usePolicy

The score expectation of (a) is that,

an intermediate parameter is expressed for measuring the uncertainty of the purchasing strategy pi,

representing a preset weight parameter;

iteratively updating the weight parameters of the purchasing strategy pi, learning to obtain the purchasing strategy pi and a reward function, inputting the state s of the real purchasing short sequence into the deep neural network of the purchasing strategy pi and obtaining a purchasing decisionaWill purchase the decisionaAdding real purchase short sequences to complete length expansion until the length is expanded to a preset length;

updating a purchasing strategy pi by taking the imitation learning as a framework;

based on the commodity pre-embedded representation and the expanded sparse ultra-short user behavior sequence, the information enhancement of the commodity embedded representation is completed by adopting a self-attention model, the user embedded representation is obtained according to the final commodity embedded representation and the user behavior data of the sparse ultra-short user behavior sequence, and personalized recommendation is performed according to the embedded representation of the commodity and the user.

As a preferred technical scheme, a graph embedding method is adopted to carry out embedding representation learning on nodes in a relational graph, and the method specifically comprises the following steps:

constructing a user and commodity relation bipartite graph according to a commodity purchasing history sequence of a user;

generating a commodity purchasing sequence of the user and a commodity purchasing sequence of the user, and constructing an inducing lists _x And inducement listss _y ；

Dividing a user-commodity relationship bipartite graph into a plurality of induction listss _x Ands _y the embedded learning is designed as a maximum likelihood problem;

initializing user purchase goods prediction network N _x Commodity purchase prediction network N _y Attribute-embedded network N _a Initializing embedded representation of users or commodities required by an embedding layer, and initializing a weight matrix required by a softmax layer;

from the list of inducerss _x Extract an edge from the induction lists _x Sampling other edges as the neighbors;

from the list of inducerss _y Extract an edge from the induction lists _y Sampling other edges as the neighbors;

setting update target, updating prediction network N _x Predictive network N _y Attribute-embedded network N _a And a commodity or user embedded representation vocabulary;

training until convergence, and updating the weight matrix parameters to obtain the embedded representation of the commodity and the user.

As a preferred technical solution, the construction of the bipartite graph of the relationship between the user and the product according to the history sequence of the product purchased by the user is specifically represented as: g (X, Y, A, t), wherein X, Y respectively represents the user node and the commodity node set, A represents the relevant attribute of the user interacting with the commodity, and t represents the time stamp of the interaction.

As a preferred technical scheme, the induction list is constructeds _x And inducement lists _y The method comprises the following specific steps:

generating a sequence x of purchases of goods by a user node x ^* =[(Y ^* ,A,t)]And the sequence y of the purchase of the goods y by the user ^* =[(X ^* ,A,t)]Wherein X is ^* To purchase a set of users of Y goods, Y ^* Obtaining a guidance list for the commodity set purchased by the x user nodes according to the purchase sequence

And inducement lists

Wherein, in the step (A),

is x ^* And Y ^* The established connecting edge is connected with the network,

is y and X ^* And C, establishing a connecting edge, wherein A represents the related attribute of the interaction between the user and the commodity, and t represents the time stamp of the occurrence of the interaction.

Preferably, the bipartite graph of relationships between users and commodities is divided into a plurality of inducement listss _x Ands _y the embedded learning is designed as a maximum likelihood problem, which is specifically expressed as:

wherein the content of the first and second substances,

is a balance parameter preset according to experience and is used for controlling the balance of the oil in the oil tanks _x Ands _y is to be weighed against the importance of,

are all the parameters that are involved in the model,

representing an objective function;

computing

Specifically, it is represented as:

computing

Specifically, it is represented as:

wherein the content of the first and second substances,

and

respectively represent x andy _i an embedded representation of a node is shown,

an embedding matrix from the set of user nodes X,

an embedding matrix from the commodity node set Y, the computation of the embedding matrix taking into accounty _i Properties of nodesd _i ，d _i Is an attribute embedded network N _a The output result of (a) is obtained,bis the term of the offset, and,

as softmax weighting matrices

The k-th row of the matrix is transposed,

as softmax weighting matrices

First of the matrixlThe lines are transposed so that the lines are transposed,

is taking into account the attributesd _j Softmax weight matrix of

The line k of (a) is transposed,

is taking into account the attributesd _j Softmax weight matrix of

To (1) alThe lines are transposed so that the lines are transposed,

is taking into account the attributed _j Softmax weight matrix of

The line k of (a) is transposed,

is taking into account the attributesd _j Softmax weight matrix of

To (1) alThe lines are transposed so that the lines are transposed,

is a preset hyper-parameter, representing the number of neighbors,a _i represents x andy _i the associated attributes of the interaction of (a),a _j represents x andy _j x represents a user node,y _i representing a commodityiThe node of (a) is selected,y _j representing a commodityjA node of (2);

v _y andv _xi respectively representyAndx _i an embedded representation of a node is shown,v _y an embedding matrix from the commodity node set Y,

an embedding matrix from the set of user nodes X, the computation of which takes into accountx _i Properties of nodesd _i ，

As softmax weight matrix

The transpose of the k-th line of (1),

as softmax weight matrix

To (1) alAnd (5) line transposition.

As a preferred technical solution, the update weight matrix parameter is expressed as:

wherein the content of the first and second substances,

represents iteration totThe parameters of the model at the time of generation,

is the direction of decrease of the parameter,

the step size is represented as a function of time,

are balance parameters that are preset empirically.

As a preferred technical solution, the information enhancement of the commodity embedded representation is completed by using a self-attention model, the user embedded representation is obtained according to the final commodity embedded representation and the user behavior data of the sparse ultrashort user behavior sequence, and personalized recommendation is performed according to the embedded representations of the commodity and the user, and the specific steps include:

inputting a real commodity purchasing length sequence of a user and a commodity purchasing length sequence of a virtual user after the short sequence is expanded in a self-attention model;

performing mask preprocessing on the training samples;

training a self-attention model to be convergent, inputting a historical sequence of commodities purchased by each user for the model to obtain an embedded representation of the commodities appearing in the sequence, and aggregating different embedded representations of the same commodity by adopting a mean aggregation or weighted aggregation mode to obtain a final embedded representation of the commodity;

obtaining user embedded representation by adopting a mean value aggregation commodity final embedded representation mode according to a historical commodity purchasing sequence of a real user;

and calculating and sequencing the cosine similarity of the users and the commodities, and recommending the commodity with the highest similarity to each user.

In order to achieve the second object, the invention adopts the following technical scheme:

a sparse ultrashort sequence-oriented personalized recommendation system comprises: the system comprises a user behavior sequence building module, a relational graph building module, an embedded representation learning module, an expert database building module, an imitation learning module, a sequence expansion module, an embedded representation output module and a personalized recommendation module;

the user behavior sequence construction module is used for acquiring sparse ultrashort user behavior data and constructing a sparse ultrashort user behavior sequence;

the relation graph building module is used for building a relation graph of the user and the commodity according to historical data of interaction between the user and the commodity;

the embedded representation learning module is used for embedding representation learning of the nodes in the relational graph by adopting a graph embedding method;

the expert database construction module is used for constructing an expert database based on sparse ultra-short user behavior data;

the imitation learning module is used for learning the purchasing strategy of the expert database by adopting an imitation learning method;

wherein, the first and the second end of the pipe are connected with each other,

the result of the calculation of the difference is represented,

for true purchase sequence statussSelecting actions according to policy

The probability of (a) of (b) being,

for the score expectation of the purchase strategy pi,

for using the strategy

The score expectation of (a) is,

representing a preset weight parameter;

iteratively updating the weight parameter of the purchasing strategy pi, learning to obtain the purchasing strategy pi and a reward function, inputting the state s of the real purchasing short sequence into the deep neural network of the purchasing strategy pi and obtaining a purchasing decisionaWill purchase the decisionaAdding real purchase short sequences to complete length expansion until the length is expanded to a preset length;

updating a purchasing strategy pi by taking the simulation learning as a framework;

the sequence expansion module is used for completing the expansion of the sparse ultrashort user behavior sequence by adopting a purchase strategy;

the embedded representation output module is used for finishing information enhancement of commodity embedded representation by adopting a self-attention model based on commodity pre-embedded representation and the expanded sparse ultra-short user behavior sequence, and obtaining user embedded representation according to the final commodity embedded representation and the user behavior data of the sparse ultra-short user behavior sequence;

the personalized recommendation module is used for performing personalized recommendation according to the embedded representations of the commodities and the users.

In order to achieve the third object, the invention adopts the following technical scheme:

a computer-readable storage medium, storing a program, which when executed by a processor, implements the sparse ultrashort sequence-oriented personalized recommendation method as described above.

In order to achieve the fourth object, the invention adopts the following technical scheme:

a computer device comprises a processor and a memory for storing a program executable by the processor, wherein the processor executes the program stored in the memory to realize the sparse ultrashort sequence-oriented personalized recommendation method.

Compared with the prior art, the invention has the following advantages and beneficial effects:

(1) according to the method, the generated graph is embedded into the pre-representation, the data sequence is expanded, deep learning of the self-attention module learning embedding representation effect is enhanced, and the self-attention model is applied to a scene for recommending sparse ultra-short sequence data, so that the mining capability of potential interest of a user is enhanced, and the personalized recommendation accuracy is improved.

(2) The method distinguishes the user behavior sequence into the long sequence and the short sequence, learns the purchase strategy which is easier to learn in the long sequence by using the imitation learning so as to expand the short sequence to the specified length, solves the problem of sparse and ultrashort quality of a training sample, improves the sample quality of an input self-attention model, depends on massive sparse ultrashort data, and occupies small storage space.

(3) The invention describes the commodity relation of the user by using a graph structure, learns the commodity embedded representation by using a graph embedding method to obtain high-quality commodity pre-embedded representation and improve the pre-representation quality of the input self-attention model sample.

Drawings

FIG. 1 is a schematic flow chart of a sparse ultrashort sequence-oriented personalized recommendation method according to the present invention;

FIG. 2 is a schematic diagram of an extended sequence according to the present invention;

FIG. 3 is a schematic diagram of a node-embedded representation of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Example 1

As shown in fig. 1, the present embodiment provides a sparse ultrashort sequence-oriented personalized recommendation method, including the following steps:

s1, acquiring sparse and ultrashort user behavior data from a storage space, constructing a user-commodity relation graph based on the sparse and ultrashort user behavior data, and completing pre-embedding expression of commodities by adopting a graph embedding method;

in step S1, the pre-embedding of the merchandise is completed by the graph embedding method, which includes the following steps;

s11, constructing a graph structure G according to the historical sequence of the commodities purchased by the user;

s12, based on the graph structure G, embedding representation learning is carried out on the nodes in the graph by adopting a graph embedding method;

preferably, in the recommendation problem, a user commodity relation is expressed by adopting a heterogeneous graph structure;

preferably, a heterogeneous graph (such as a bipartite graph) embedding method is used for representing and learning nodes, the relationship between the heterogeneous graph and the commodity can be described by the heterogeneous graph (such as the bipartite graph) according to historical data of interaction between users and commodities, a user commodity relationship bipartite graph G = (X, Y, E) is given, X, Y respectively represents node sets of users and commodities, E represents a connecting edge of nodes in the node sets of X and Y, after the relationship bipartite graph is constructed, node embedding is learned by using a deep neural network, and the goal of node embedding is to obtain a function

Sum function

，

And

are bipartite graph nodes respectively

K and

performing dimensional characterization;

in this embodiment, the bipartite graph embedding adopts an IGE (attribute interaction graph) embedding method, which specifically includes the following steps:

(1) constructing a bipartite graph G (X, Y, A, t) according to a commodity purchasing history sequence of a user, wherein X and Y are a user node and commodity node set, A is relevant attributes of interaction (purchase) between the user and commodities, such as purchase price, commodity category and the like, and t is a time stamp of interaction occurrence;

(2) generating a sequence x of purchases of goods by a user node x ^* =[(Y ^* ,A,t)]And the sequence y of the purchase of the goods y by the user ^* =[(X ^* ,A,t)]Wherein X is ^* To purchase a set of users of Y goods, Y ^* Obtaining a guidance list according to the purchase sequence for the commodity set purchased by the x user nodes

And inducement lists

Wherein, in the step (A),

is x ^* And Y ^* Established connecting edge

，

Is y and X ^* Built connecting edge, lists _x /s _y Contains information of all edges related to the node x/y;

(3) embedded learning can be designed as a maximum likelihood problem, and graph G can be divided into multiple inducement listss _x Ands _y then the maximum likelihood function is shown in equation 1:

(formula 1)

Wherein the content of the first and second substances,

is all parameters related to the model, the objective function (formula 1) has two similar parts, steps (5) - (8) describe the calculation of the first part of the objective function, and steps (9) - (12) describe the calculation of the second part of the objective function;

(4) initializing user purchased goods prediction network N _x Commodity purchasing prediction network N _y Attribute-embedded network N _a The embedded representation of the user or commodity required to initialize the embedding layer, i.e. initializing the weight matrix

WhereinFIs a pre-selected number of factors that,V _x is the size of the set of user nodes,V _y is the size of the node set of the commodity,Dis the dimension to which the attribute network is reduced,Kis the embedding dimension of the user and,

is the embedding dimension of the commodity, and a weight matrix required by initializing the softmax layer

，

(5) Froms _x Extract a side

And froms _x Middle sampling

Other edges of the strip

As its neighbor, sample to anythinge _j Probability of and

is proportional to the size of the (c), wherein,

is an adjustable hyper-parameter to take into account samplinge _j The effect of the time of the (c) phase,

representing preset hyper-parameters, such as fixing that each user node has several commodity nodes as neighbors,xto indicate the user or users of the device,y _i indicating merchandiseiThe node(s) of (a) is (are),y _j indicating merchandisejThe node(s) of (a) is (are),t _i represents x andy _i at the interaction oft _i The time of day (e.g. a date),t _j represents x andy _j at the interaction oft _i At the moment of time, the time of day,a _i represents x andy _i related attributes of the interaction (e.g., x purchase)y _i The closing price of (c),a _j represents x andy _j the relevant attributes of the interaction of (1);

(6) according to equation 2

The following are:

(formula 2)

Wherein the content of the first and second substances,y _j =kcan be understood asy _k ，

And

respectively represent x andy _i an embedded representation of a node is shown,

embedding matrix from user node set X

，

Embedding matrix from commodity node set Y

，d _i Is an attribute embedded network N _a The output result of (a) is obtained,

is the term of the offset, and,

can be understood as

The transpose of the k-th row of the matrix,

as softmax weighting matrices

First of the matrixlThe lines are transposed to form a line,

is taking into account the attributesd _j Softmax weight matrix of

The line k of (a) is transposed,

is taking into account the attributesd _j Softmax weight matrix of

To (1) alLine transpose, the calculation of which is as shown in step (4);

(7) according to equation 3

(formula 3)

(8) To be provided with

As large as possible for the target calculation

Updating the prediction network N with Adam according to equation 4 _x Attribute embedded network N _a And commodity/user embedded representation vocabulary (i.e., parameter matrix initialized by the embedding layer), wherein,

represents iteration totThe parameters of the model at the time of generation,

is the direction of decrease of the parameter,

is the step size;

(formula 4)

(9) Froms _y Extract a side

And froms _y Middle sampling

Other edges of the strip

As its neighbor, sample to anythinge _j Probability of and

is proportional to the size of the (c), wherein,

is an adjustable hyper-parameter to take into account samplinge _j The time effect of (a);

(10) according to equation 5

The following are:

(formula 5)

Wherein the content of the first and second substances,x _j =kcan be understood asx _k ，v _y Andv _xi is composed ofyAndx _i an embedded representation of a node is shown,v _y embedded matrix from Y

，

Embedded matrix from X

，

Is an attribute embedded network N _a The output result of (a) is obtained,

is the term of the offset, and,

can be understood as

The transpose of the k-th row of the matrix,

is taking into account the attributesd _j Softmax weight matrix of

The line k of (a) is transposed,

is taking into account the attributed _j Softmax weight matrix of

To (1) alThe lines are transposed so that the lines are transposed,

is a preset hyper-parameter, representing the number of neighbors,

as softmax weight matrix

The transpose of the k-th line of (1),

as softmax weight matrix

To (1) alAnd (5) line transposition. The calculation is as shown in step (4);

(11) according to equation 6

(formula 6)

(12) To be provided with

As large as possible for the target calculation

Updating the prediction network N with Adam according to equation 7 _y Attribute-embedded network N _a And commodity/user embedded expression vocabulary (i.e. parameter matrix initialized by embedding layer), wherein

Represents iteration totThe parameters of the model at the time of generation,

is the direction of decrease of the parameter,

represents a step size;

(formula 4)

(13) Training the steps (5) - (8) and the steps (9) - (12) alternately until convergence, and updating the weight matrix parameters to obtain the embedded representation of the commodity and the user;

s2, constructing an expert database based on sparse ultra-short user behavior data, learning the purchase strategy of the expert database by adopting a simulation learning method, and completing the expansion of a user sparse ultra-short purchase sequence by following the purchase strategy;

as shown in fig. 2, in step S2, the steps of the mimic learning method adopted by the expansion sequence are as follows;

s21, dividing the sparse and ultrashort user behavior sequence into a long sequence and a short sequence, and storing the long sequence into an expert database;

specifically, a sequence length threshold L (set depending on the particular data set) may be set, with sequence length ≧ L being long sequence, and sequence length < L being short sequence;

s22, learning the purchase strategy from the expert database and expanding the sparse ultrashort sequence accordingly;

preferably, the purchasing strategy is learned from an expert database, the sparse ultrashort sequence is expanded accordingly, and a simulation learning method based on a generation countermeasure network is adopted, wherein the steps are as follows;

s221, sampling a real purchasing sequence state S from a user behavior sequence, obtaining a purchasing decision a by utilizing an initialized purchasing strategy pi, and accordingly obtaining a generating experience (S, a);

the real purchase sequence state s is a weighted average value of historical purchased commodity embedded expressions before the purchase decision is made, the weight of the real purchase sequence state s is a learnable parameter, a candidate set of the purchase decision is all purchasable commodities, and a purchase strategy pi and a discriminator D are initialized deep neural networks;

s222, sampling a long sequence from an expert database, segmenting a part of the long sequence containing the first m commodities as a state S, and taking any one of the rest commodities as an A to obtain expert experience (S, A);

s223, simultaneously inputting the generated experience (S, a) and the expert experience (S, a) into the discriminator D to obtain scores D (S, a) and D (S, a) of the two, when the decision-making discrimination score (reward) is the difference between the two, see formula 4, which shows that the difference between the two can be expressed as the sum of the difference between D (S, a) and the real target score (e.g. 1) and the difference between D (S, a) and the false target score (e.g. 0) when implemented, and the difference can be calculated using cross entropy:

(formula 7)

Wherein the content of the first and second substances,

to measure the uncertainty of the purchasing strategy pi,

for true purchase sequence statussSelecting actions according to policy

The probability of (a) of (b) being,

for the score expectation of the purchase strategy pi, the formula for calculating the score is a formula in middle brackets, the weight parameter of the discriminator D is updated by Adam so that the formula 7 is as large as possible, and the weight parameter theta of the purchase strategy pi is updated by TRPO so that the formula 7 is as large as possible

As small as possible in the form of a capsule,

to use the strategy

The score expectation of (a) is,

representing a preset weight parameter;

s224, continuously repeating the process, and finally learning out a pi meeting the expert experience _E On the basis of which the true purchases can be ordered in short sequences (of length m) and a reward function ₁ ) State of (1)s*Inputting a deep neural network of a purchasing strategy pi and obtaining a purchasing decisionaAdding the purchasing decision into the real short sequence to complete the expansion of one length, and adding the expanded sequence (with the length of m) ₁ State of + 1)s*Repeating the above operation, thus circulating N-m ₁ Expanding to a specified length N;

in the embodiment, simulation learning is taken as a framework, a countermeasure generation and deep reinforcement learning technology is adopted for realizing, and a purchasing strategy pi is updated through algorithms of TRPO, PPO or model-based and the like;

the purchase strategy pi can be updated by an algorithm of TRPO, PPO or model-based, etc.;

s3, based on the commodity pre-embedded representation and the expanded user behavior data, completing information enhancement of the commodity embedded representation by adopting a self-attention model, obtaining a user embedded representation according to the final commodity embedded representation and the real user behavior data, and performing personalized recommendation according to the embedded representation of the commodity and the user;

as shown in fig. 3, the information enhancement of the embedded representation of the commodity is completed by using the self-attention model, and the steps are as follows:

s31, providing high-quality sample input for the self-attention model, wherein the sample is pre-expressed as a result of learning by using a graph embedding method, and the problem of sparseness and ultrashort is relieved after a short sequence in the sample is subjected to sequence expansion;

s32, performing commodity representation learning on the input samples by using a self-attention model;

s33, obtaining user representation based on the user behavior data and the final commodity representation, and recommending according to the matching degree of the user and the commodity;

in this embodiment, the self-attention model is a BERT model, and the specific steps are as follows:

(1) inputting a real commodity long sequence purchased by a user and a commodity long sequence purchased by a virtual user after the short sequence is expanded for BERT, wherein the pre-embedding of each commodity in the sequence is represented as a learning result of a bipartite graph embedding model IGE;

(2) i% masking of the training samples, the masked commodity will have j% replaced, where j ₁ % is replaced with random Commodity, j ₂ % is replaced by a mask vector, k% of the commodities which are masked are kept unchanged, and the parameter setting is determined according to a specific data set;

(3) training a BERT model to be convergent, inputting a historical sequence of purchased commodities of each user into the model, obtaining an embedded representation of commodities appearing in the sequence, aggregating different embedded representations of the same commodity in a mean value aggregation or weighted aggregation (the weight can be positively correlated with the sequence length because a longer purchased commodity sequence has higher probability to reflect the purchasing interest of the user, and the randomness of a short sequence to reflect the user interest is high) mode and the like to obtain a final embedded representation of the commodity, and obtaining the user embedded representation in a mean value aggregated commodity final embedded representation mode according to the historical purchased commodity sequence of the real user;

in this embodiment, the bipartite graph embedding is mainly to obtain a good initial product embedding representation, and the user embedding representation here can be obtained according to the final product embedding representation, so as to improve the accuracy of similarity calculation between the final product and the user;

(4) calculating cosine similarity between the user and the commodities, sequencing from big to small, and recommending topK commodities with the highest similarity for each user;

in this embodiment, the representation of the user is the average of the goods that the user has historically purchased;

in this embodiment, the matching degree calculation function between the user and the product is cosine similarity, the embedding dimensions of the user and the product are consistent, and the embedding representation of the user is derived from the related aggregation calculation of the product embedding representation and includes product information, so the cosine similarity can be used to calculate the similarity of the user and the product.

According to the commodity pre-embedded representation obtained by the graph embedding method, the problem of data sparseness is solved by adopting a technical scheme of simulation learning, the technical scheme of a self-attention model is adopted after an expanded data set and the pre-embedded representation are obtained, the technical problem of global insufficient connection among commodities is solved, and finally the high-quality commodity pre-embedded representation is obtained.

Example 2

in this embodiment, the user behavior sequence construction module is configured to obtain sparse ultrashort user behavior data and construct a sparse ultrashort user behavior sequence;

in this embodiment, the relationship graph building module is configured to build a relationship graph between a user and a commodity according to historical data of interaction between the user and the commodity;

in this embodiment, the embedded representation learning module is configured to perform embedded representation learning on nodes in the relational graph by using a graph embedding method;

the method for embedding, representing and learning the nodes in the relational graph by adopting a graph embedding method specifically comprises the following steps:

generating a commodity purchasing sequence of the user and a commodity purchasing sequence of the user, and constructing an inducing lists _x And inducement lists _y ；

Dividing a user-commodity relationship bipartite graph into a plurality of induction listss _x Ands _y design the embedded learning as a maximumA likelihood problem;

In the embodiment, the expert database construction module is used for constructing an expert database based on sparse ultra-short user behavior data;

in this embodiment, the imitation learning module is configured to learn a purchase strategy of the expert database by using an imitation learning method;

the result of the calculation of the difference is represented,

for true purchase sequence statussSelecting actions according to policy

The probability of (a) of (b) being,

for the score expectation of the purchase strategy pi,

to use the strategy

The score expectation of (a) is,

representing a preset weight parameter;

in this embodiment, the sequence expansion module is configured to complete expansion of a sparse and ultrashort user behavior sequence by using a purchase strategy;

in this embodiment, the embedded representation output module is configured to complete information enhancement of the commodity embedded representation by using a self-attention model based on the commodity pre-embedded representation and the extended sparse ultra-short user behavior sequence, and obtain the user embedded representation according to the final commodity embedded representation and the user behavior data of the sparse ultra-short user behavior sequence;

in this embodiment, the personalized recommendation module is configured to perform personalized recommendation according to the embedded representations of the goods and the user.

Example 3

The present embodiment provides a storage medium, which may be a storage medium such as a ROM, a RAM, a magnetic disk, an optical disk, or the like, where one or more programs are stored, and when the programs are executed by a processor, the method for personalized recommendation based on sparse ultrashort sequence oriented in embodiment 1 is implemented.

Example 4

The embodiment provides a computing device, which may be a desktop computer, a notebook computer, a smart phone, a PDA handheld terminal, a tablet computer, or other terminal devices with a display function, and the computing device includes a processor and a memory, where the memory stores one or more programs, and when the processor executes the programs stored in the memory, the sparse ultrashort sequence-oriented personalized recommendation method in embodiment 1 is implemented.

The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims

1. A sparse ultrashort sequence-oriented personalized recommendation method is characterized by comprising the following steps:

wherein the content of the first and second substances,

the result of the calculation of the difference is represented,

for true purchase sequence statussDown to policy selection action

The probability of (a) of (b) being,

for the score expectation of the purchase strategy pi,

to use the strategy

The score expectation of (a) is,

representing a preset weight parameter;

2. The sparse ultrashort sequence-oriented personalized recommendation method as claimed in claim 1, wherein a graph embedding method is adopted to perform embedded representation learning on nodes in the relational graph, and the specific steps include:

initializing user purchased goods prediction network N _x Commodity purchase prediction network N _y Attribute-embedded network N _a Initializing embedded representation of users or commodities required by an embedding layer, and initializing a weight matrix required by a softmax layer;

from the list of inducerss _y Extract an edge from the list and extract an edge from the lists _y Sampling other edges as the neighbors;

3. The sparse ultrashort sequence-oriented personalized recommendation method as claimed in claim 2, wherein the bipartite graph of the relationship between the user and the commodity is constructed according to the historical sequence of the commodity purchased by the user, and is specifically represented as follows: g (X, Y, A, t), wherein X, Y respectively represents the user node and the commodity node set, A represents the relevant attribute of the user interacting with the commodity, and t represents the time stamp of the interaction.

4. Sparse ultrashort sequence oriented personality according to claim 2The recommendation method is characterized in that the induction list is constructeds _x And inducement listss _y The method comprises the following specific steps:

And inducement lists

Wherein, in the step (A),

is x ^* And Y ^* The connection edge is established, and the connection edge is established,

5. The sparse ultrashort sequence-oriented personalized recommendation method as claimed in claim 2, wherein the bipartite graph of the relationship between the user and the commodity is divided into a plurality of induction listss _x Ands _y the embedded learning is designed as a maximum likelihood problem, which is specifically expressed as:

wherein the content of the first and second substances,

is based on experienceBalance parameters set in advance fors _x Ands _y is to be weighed against the importance of,

are all the parameters that are involved in the model,

representing an objective function;

calculating out

Specifically, it is represented as:

computing

Specifically, it is represented as:

wherein the content of the first and second substances,

and

respectively represent x andy _i an embedded representation of a node is shown,

an embedding matrix from the set of user nodes X,

from commodity nodesAn embedding matrix of set Y, the computation of which takes into accounty _i Properties of nodesd _i ，d _i Is an attribute embedded network N _a The output result of (a) is obtained,bis the term of the offset, and,

as softmax weighting matrices

The k-th row of the matrix is transposed,

as softmax weighting matrices

First of the matrixlThe lines are transposed so that the lines are transposed,

is taking into account the attributesd _j Softmax weight matrix of

The line k of (2) is transposed,

is taking into account the attributesd _j Softmax weight matrix of

To (1) alThe lines are transposed so that the lines are transposed,

is taking into account the attributesd _j Softmax weight matrix of

To (1) aThe transposition of the k lines is carried out,

is taking into account the attributesd _j Softmax weight matrix of

To (1) alThe lines are transposed so that the lines are transposed,

is a preset hyper-parameter, representing the number of neighbors,a _i represents x andy _i the associated attributes of the interaction of (a),a _j represents x andy _j x represents a user node,y _i indicating merchandiseiThe node(s) of (a) is (are),y _j indicating merchandisejA node of (2);

As softmax weight matrix

The transpose of the k-th line of (1),

as softmax weight matrix

To (1) alAnd (5) line transposition.

6. The sparse ultrashort sequence-oriented personalized recommendation method of claim 2, wherein the updated weight matrix parameter is expressed as:

wherein the content of the first and second substances,

represents iteration totThe parameters of the model at the time of generation,

is the direction of decrease of the parameter,

the step size is represented as a function of time,

are balance parameters that are preset empirically.

7. The sparse ultrashort sequence-oriented personalized recommendation method as claimed in claim 1, wherein the information enhancement of the commodity embedded representation is completed by adopting a self-attention model, the user embedded representation is obtained according to the final commodity embedded representation and the user behavior data of the sparse ultrashort user behavior sequence, and personalized recommendation is performed according to the embedded representation of the commodity and the user, and the specific steps include:

performing mask preprocessing on the training samples;

8. A sparse ultrashort sequence-oriented personalized recommendation system is characterized by comprising: the system comprises a user behavior sequence building module, a relational graph building module, an embedded representation learning module, an expert database building module, an imitation learning module, a sequence expansion module, an embedded representation output module and a personalized recommendation module;

wherein the content of the first and second substances,

the result of the calculation of the difference is represented,

for true purchase sequence statussSelecting actions according to policy

The probability of (a) of (b) being,

for the score expectation of the purchase strategy pi,

to use the strategy

The score expectation of (a) is,

representing intermediate parameters for measuring purchase strategiesThe uncertainty of pi is determined by the number of the pixels,

representing a preset weight parameter;

9. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the sparse ultrashort sequence-oriented personalized recommendation method as recited in any one of claims 1 to 7.

10. A computer device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the sparse ultrashort sequence oriented personalized recommendation method as claimed in any one of claims 1 to 7.