CN113762477A - Method for constructing sequence recommendation model and sequence recommendation method - Google Patents

Method for constructing sequence recommendation model and sequence recommendation method Download PDF

Info

Publication number
CN113762477A
CN113762477A CN202111051076.3A CN202111051076A CN113762477A CN 113762477 A CN113762477 A CN 113762477A CN 202111051076 A CN202111051076 A CN 202111051076A CN 113762477 A CN113762477 A CN 113762477A
Authority
CN
China
Prior art keywords
sequence
embedding
item
constructing
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111051076.3A
Other languages
Chinese (zh)
Other versions
CN113762477B (en
Inventor
刘玉葆
张岩森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202111051076.3A priority Critical patent/CN113762477B/en
Publication of CN113762477A publication Critical patent/CN113762477A/en
Application granted granted Critical
Publication of CN113762477B publication Critical patent/CN113762477B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a method for constructing a sequence recommendation model and a sequence recommendation method, comprising the following steps: constructing an adaptive adjacency matrix of an input sequence, and constructing a first embedding project based on the adaptive adjacency matrix; constructing a second embedding of the project based on the adjacency matrix of the graph neural network; according to the first embedding of the item and the second embedding of the item, a local interest model of the user is constructed through an attention mechanism; constructing a global interest model of a user and embedding of a target sequence, and constructing a sequence recommendation model according to the embedding of the target sequence, the local interest model of the user and the global interest model; and constructing a loss function of the sequence recommendation model based on gradient descent and Bayes personalized sorting. The sequence recommendation model does not need to depend on the existing composition mode and the prior knowledge, and avoids the improper influence caused by noise points by automatically learning the weights of the edges, so that more accurate project embedding and more accurate local interest are learned, and the sequence recommendation can be realized more effectively and reliably.

Description

Method for constructing sequence recommendation model and sequence recommendation method
Technical Field
The application relates to the technical field of artificial intelligence such as deep learning, in particular to a method for constructing a sequence recommendation model and a sequence recommendation method.
Background
With the rapid development of information technology and big data, people generate a large amount of data every moment. It is a great challenge how to mine useful information from these complex data. The recommendation algorithm aims to help a user select target data which are interesting to the user from massive data. At present, most websites start to adopt corresponding recommendation algorithms to help recommend related products or services, and good effects are obtained.
Whereas in various internet services, a user accesses products or items in a chronological order, where the items that the user is about to interact with may be closely related to those that he has just accessed. This property facilitates an important recommendation task, Sequence Recommendation (SR), which treats user behavior history as a chronologically ordered sequence of actions.
In real life, many times, the user's preference is temporary and non-continuous. This temporal, non-persistent preference interspersed with time-ordered sequences of actions (also referred to as sequences of items) can result in the generation of "noise points" in the sequences.
For example, given a subsequence of a user's sequence of items: (MacBook, iPhone, Bread, iPad, Apple Pencil). It is readily known that the local interest of the user in this sub-sequence is concentrated on the electronic product. The next item may be an accessory to "air pots" or Apple products, which in turn rely on "MacBook", "iPhone", "iPad" and "Apple Pencil", independent of "break". The association between "break" and other items can negatively impact the learning process of the recommendation algorithm, resulting in an inability to effectively capture the true interests of the user.
Therefore, in the case where the user's short-term interests and intentions dynamically change, how to predict user behavior in the near future using sequence dynamics is very challenging.
Disclosure of Invention
In view of this, the present application provides a method for constructing a sequence recommendation model and a sequence recommendation method, so as to construct a sequence recommendation model and predict items that a user is interested in through the sequence recommendation model.
To achieve the above object, a first aspect of the present application provides a method for constructing a sequence recommendation model, including:
constructing an adaptive adjacency matrix of an input sequence, and constructing a first embedding of a project based on the adaptive adjacency matrix; the self-adaptive adjacency matrix is used for learning the relation between items in the input sequence in an end-to-end mode;
constructing an adjacency matrix of the input sequence based on the graph neural network, and constructing a second embedding of the project based on the adjacency matrix; the adjacency matrix is used for aggregating adjacent information of the input sequence;
constructing a local interest model of the user through an attention mechanism according to the first embedding of the item and the second embedding of the item;
constructing a global interest model of a user and embedding of a target sequence, and constructing a sequence recommendation model according to the embedding of the target sequence, the local interest model of the user and the global interest model;
and constructing a loss function of the sequence recommendation model based on gradient descent and Bayesian personalized sorting.
Preferably, the process of constructing a first embedding of an item based on the adaptive adjacency matrix includes:
initializing the adaptive adjacency matrix
Figure BDA0003252766570000021
Wherein the content of the first and second substances,
Figure BDA0003252766570000022
having learnable parameters;
by the tanh activation function will
Figure BDA0003252766570000023
Is limited to a value between-1 and 1;
based on the adaptive matrix
Figure BDA0003252766570000024
Constructing a first layer-by-layer propagation rule of a project, and obtaining a first embedding of the project based on the first layer-by-layer propagation rule
Figure BDA0003252766570000025
Wherein, the mathematical expression of the first embedding of the item is as follows:
Figure BDA0003252766570000026
wherein the content of the first and second substances,
Figure BDA0003252766570000027
the weights used to control the neural network, d is the dimension in which the items are embedded,
Figure BDA0003252766570000028
the final hidden state for the input sequence after r propagation steps.
Preferably, the process of constructing a second embedding of the item based on the adjacency matrix includes:
based on the adjacency matrix A ∈ R(L+R)×(L+R)Constructing a second layer-by-layer propagation rule of the project, and obtaining a second embedding of the project based on the second layer-by-layer propagation rule
Figure BDA0003252766570000029
Wherein, the mathematical expression of the second embedding of the item is as follows:
Figure BDA0003252766570000031
wherein the content of the first and second substances,
Figure BDA0003252766570000032
the weights used to control the neural network, d is the dimension in which the items are embedded.
Preferably, the process of constructing the local interest model of the user through an attention mechanism according to the item first embedding and the item second embedding comprises:
capturing multidimensional attention of the input sequence through an importance scoring matrix, and assigning weights of the multidimensional attention to embedding H 'of the input sequence'u,lIn obtaining attentionWeight matrix S'u,l(ii) a Wherein the embedding of the input sequence is H'u,lFirst embedding by the item
Figure BDA0003252766570000033
And second embedding of said item
Figure BDA0003252766570000034
Merging to obtain;
will notice the weight matrix S'u,lEmbedding with input sequence H'u,lMultiplying to obtain a characterization matrix Z of the input sequenceu,l
Characterization matrix Z of input sequence by means of averaging functionu,lConversion into local interest model
Figure BDA0003252766570000035
The mathematical expression of the local interest model is as follows:
Figure BDA0003252766570000036
characterization matrix Zu,lThe mathematical formula of (1) is as follows:
Figure BDA0003252766570000037
attention weight matrix S'u,lThe mathematical formula of (1) is as follows:
Figure BDA0003252766570000038
insertion of input sequence H'u,lThe mathematical formula of (1) is as follows:
Figure BDA0003252766570000039
wherein the content of the first and second substances,
Figure BDA00032527665700000310
to learn parameters, daRepresents from H'u,lD th of attention extractionaAnd (5) carrying out the following steps.
Preferably, the process of constructing the sequence recommendation model according to the embedding of the target sequence, the local interest model of the user and the global interest model includes:
embedding H 'of the input sequence'u,lAnd embedding of the target sequence Q ∈ Rd*JCarrying out inner product to obtain a project relation; wherein d is the dimension of embedding the items, and J is the number of the items in the target sequence;
user-based local interest model
Figure BDA00032527665700000311
And a global interest model Pu∈Rd′Building user personality characterization
Figure BDA00032527665700000312
d' is the dimension of embedding the user;
characterizing based on the item relationships and the user personality
Figure BDA00032527665700000313
Constructing sequence recommendation models
Figure BDA00032527665700000314
Wherein the user personality characterizes
Figure BDA00032527665700000315
The mathematical formula of (1) is as follows:
Figure BDA00032527665700000316
[·;·]indicating vertical splicing, Wu∈R(d+d′)×dFor modeling local interests
Figure BDA0003252766570000041
And a global interest model PuCompression to visa potential space Rd
Sequence recommendation model
Figure BDA0003252766570000042
The mathematical formula of (1) is as follows:
Figure BDA0003252766570000043
wherein the content of the first and second substances,
Figure BDA0003252766570000044
is a local interest model
Figure BDA0003252766570000045
Q of (a) to (b)jIs the embedding of the target sequence Q ∈ Rd*JColumn j.
Preferably, the mathematical expression of the loss function comprises:
Figure BDA0003252766570000046
wherein (u, S)u,j+J- -) e D represents the generated pairwise preference set, SuRepresenting elements in a user's sequence of historical item interactions, j+And j _ respectively represents a second item subsequence Tu,lσ is a sigmoid function, Θ represents other learnable parameters, and λ is a regularization parameter.
A second aspect of the present application provides a sequence recommendation method, including:
taking a historical item interaction sequence of a user as an input sequence, and inputting the trained sequence recommendation model to obtain a sequence recommendation result;
the sequence recommendation model is a model constructed by any one of the above methods for constructing a sequence recommendation model.
Preferably, the training process of the sequence recommendation model includes:
determining a subsequence from a historical item interaction sequence of a user, and determining a first item subsequence and a second item subsequence from the subsequences, wherein the first item subsequence is used as an input sequence, and the second item subsequence is used as a target sequence;
inputting the input sequence and the target sequence into the sequence recommendation model, and determining an output sequence;
and calculating loss values of all items in the output sequence according to a set objective function, and updating learnable parameters of the sequence recommendation model by taking the loss values approaching a preset loss threshold as a target.
Preferably, the process of determining a subsequence from a sequence of historical item interactions of the user comprises:
and splitting the historical item interaction sequence of the user into fine-grained sub-sequences by adopting a sliding window strategy.
Preferably, said process of determining a first item sub-sequence and a second item sub-sequence from said sub-sequences comprises:
composing said first sub-sequence of items from L consecutive items to the left and R consecutive items to the right of said sub-sequence, ordered in time;
composing said second sub-sequence of items from the remaining T items of said sub-sequence;
wherein L, R and T are preset values, and the total length of the subsequence is L + R + T.
According to the technical scheme, the self-adaptive adjacency matrix of the input sequence is constructed, so that the relation between each item in the input sequence is learned in an end-to-end mode under the condition of no prior knowledge, the influence of neighbors on the item can be learned through the first embedding of the item constructed on the basis, and the dependency relation of the item is represented more accurately.
While using the adaptive adjacency matrix, also considering existing methods of graph neural networks, an adjacency matrix of the input sequence is constructed based on the graph neural network, and a second embedding of the project is constructed based on this.
And building a local interest model of the user by combining the item first embedding and the item second embedding through an attention mechanism. The local interest model is capable of capturing local interest features of a user.
The global interest characteristics of the user are captured by constructing a global interest model of the user. And finally, combining the embedding of the target sequence, the local interest model and the global interest model to construct a sequence recommendation model.
In the design of the loss function, the Bayes personalized ranking target model is optimized based on gradient descent, and the pairwise ranking between the positive sample and the negative sample is optimized.
Based on the characteristics, the sequence recommendation model does not need to depend on the existing composition mode and the prior knowledge, and avoids the improper influence caused by noise points by automatically learning the weights of edges, so that more accurate project embedding and more accurate local interest are learned, and further, the sequence recommendation can be effectively and reliably realized.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic diagram of a method for constructing a sequence recommendation model disclosed in an embodiment of the present application;
FIG. 2 is a schematic diagram of a sequence recommendation model disclosed in an embodiment of the present application;
FIG. 3 illustrates a project diagram disclosed by an embodiment of the present application;
FIG. 4 illustrates an adjacency matrix corresponding to a project diagram disclosed in an embodiment of the present application;
fig. 5 is a schematic diagram of a sequence recommendation method disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The inventors of the present application have discovered that conventional collaborative filtering recommendation algorithms prefer to model user-item behavior in a static manner, resulting in long-term general preferences for users. However, for the subsequences of users listed in the background (MacBook, iPhone, Bread, iPad, Apple Pencil), "Bread" and other items are not very relevant or even completely irrelevant, and their occurrence in the subsequences amounts to a "noise spot". Based on the existing sequence recommendation algorithm of Graph Neural Network (GNN), the project is embedded only by using a predefined adjacency matrix, and the relation of "noise points" can negatively affect the embedded learning of the project and can not effectively capture the real interest of the user. Ultimately resulting in inaccurate item embedding when aggregating information using the neural network of the graph, negatively impacting the downstream predictive tasks.
In order to solve the problem, the application designs an adaptive adjacency matrix in the sequence recommendation model, and the influence of the improper connection is weakened by adjusting the weight of the edge in the project diagram. The adaptive method may learn weights between the item "break" and other items based on user-item interactions. In particular, it can learn different weights for any pair of connections in the subsequence. By doing so, the true relationships between the items can be learned and a more accurate item-embedded representation can be modeled for the next prediction task.
Referring to fig. 1, a method for constructing a sequence recommendation model according to an embodiment of the present application may include the following steps:
step S100, project first embedding is built based on the adaptive adjacency matrix.
Specifically, an adaptive adjacency matrix of the input sequence is constructed, and the item first embedding is constructed based on the adaptive adjacency matrix. Wherein the adaptive adjacency matrix is used for learning the relation between each item in the input sequence in an end-to-end mode.
The inventors of the present application found that existing popular graph-neural network-based approaches often use the same rules to construct a project graph for an interactive sub-sequence of user projects, meaning that they build a fixed graph to capture the relationships between the projects in all generated sub-sequences.
The self-adaptive adjacency matrix is different from the method for explicitly learning item embedding, the relation between items can be implicitly learned, any priori knowledge is not needed, the negative influence of improper connection in the item graph can be weakened, and the real item dependency relation can be found.
And step S200, constructing item second embedding based on the adjacency matrix predefined by the graph neural network.
Specifically, to better model accurate item embedding, the embodiments of the present application also consider existing graph neural network methods, construct an adjacency matrix of the input sequence based on the graph neural network, and construct a second item embedding based on the adjacency matrix.
Wherein, the adjacency matrix adds edges among the items in the input sequence according to the graph neural network method for aggregating the adjacent information of the input sequence.
And step S300, constructing a local interest model of the user according to the first embedding of the item and the second embedding of the item.
Specifically, according to the aforementioned item first embedding and item second embedding, the contribution degree of each neighbor to generate a new feature is learned through an attention mechanism, and the neighbor features are aggregated according to the contribution degree, so that the local interest of the user is captured.
And S400, constructing a sequence recommendation model according to the embedding of the target sequence, the local interest model and the global interest model of the user.
Specifically, a global interest model of the user and embedding of the target sequence are built, and a sequence recommendation model is built according to the embedding of the target sequence, the local interest model of the user and the global interest model.
Wherein the global interests of the user may be determined by the inherent properties of the user. And combining the local interest and the global interest of the user to infer the user preference so as to deduce the item in which the user is interested.
And step S500, constructing a loss function of the sequence recommendation model.
Specifically, the aforementioned loss function of the sequence recommendation model is constructed based on gradient descent and Bayesian Personalized Ranking (Bayesian Personalized Ranking objective).
The Bayesian personalized sorting algorithm can sort all items corresponding to each user according to the preference, screen out few items which have higher priority in the user core, and arrange the items at the position of the front. And the Bayes personalized ranking algorithm is optimized by adopting gradient descent, so that the pairwise ranking between the positive samples and the negative samples can be optimized.
According to the technical scheme, the self-adaptive adjacency matrix of the input sequence is constructed, so that the relation between each item in the input sequence is learned in an end-to-end mode under the condition of no prior knowledge, the influence of neighbors on the item can be learned through the first embedding of the item constructed on the basis, and the dependency relation of the item is represented more accurately.
While using the adaptive adjacency matrix, also considering existing methods of graph neural networks, an adjacency matrix of the input sequence is constructed based on the graph neural network, and a second embedding of the project is constructed based on this.
And building a local interest model of the user by combining the item first embedding and the item second embedding through an attention mechanism. The local interest model is capable of capturing local interest features of a user.
The global interest characteristics of the user are captured by constructing a global interest model of the user. And finally, combining the embedding of the target sequence, the local interest model and the global interest model to construct a sequence recommendation model.
In the design of the loss function, the Bayes personalized ranking target model is optimized based on gradient descent, and the pairwise ranking between the positive sample and the negative sample is optimized.
Based on the characteristics, the sequence recommendation model does not need to depend on the existing composition mode and the prior knowledge, and avoids the improper influence caused by noise points by automatically learning the weights of edges, so that more accurate project embedding and more accurate local interest are learned, and further, the sequence recommendation can be effectively and reliably realized.
The core algorithm used by the sequence recommendation model of the embodiments of the present application will be described in detail below. For ease of reading, the mathematical expressions used in the sequence recommendation model are described below.
The task of sequence recommendation is to take as input the user-item historical sequence of interactions and predict the next item with which the user will interact. Order to
Figure BDA0003252766570000081
A set of users is represented as a set of users,
Figure BDA0003252766570000082
Figure BDA0003252766570000083
represents a collection of items, wherein
Figure BDA0003252766570000084
And
Figure BDA0003252766570000085
representing the number of users and items, respectively. The user-item interaction sequence may be represented by the following chronological sequence
Figure BDA0003252766570000086
Figure BDA0003252766570000087
Wherein
Figure BDA0003252766570000088
Representative set
Figure BDA0003252766570000089
User u in (2) has accessed a sequence from
Figure BDA00032527665700000810
The item of (1).
From the above notation, we define the sequence recommendation task as follows. Given a historical access sequence of M users
Figure BDA00032527665700000811
The goal is to be for each user to receive
Figure BDA00032527665700000812
Recommendation in individual item
Figure BDA00032527665700000813
An item, and evaluating
Figure BDA00032527665700000814
Whether the item in (a) will appear in the recommendation list.
Also, for ease of understanding, reference may be made to FIG. 2 concurrently with a reading of the following sections of the application. Fig. 2 is a schematic diagram of a sequence recommendation model constructed by a method for constructing a sequence recommendation model according to an embodiment of the present application, please refer to fig. 2, where the sequence recommendation model may be divided into an embedding layer 10, a local interest modeling layer 20, and a prediction layer 30, and in the following reading process, reference may be made to the input, output, and flow direction relationships of the layers shown in fig. 2.
On the basis of the foregoing embodiments of the present application, in some other embodiments of the present application, the process of building the item first embedding based on the adaptive adjacency matrix in step S100 may include:
a1, initializing the adaptive adjacency matrix
Figure BDA0003252766570000091
Wherein the content of the first and second substances,
Figure BDA0003252766570000092
with learnable parameters, (L + R) being the input sequence Cu,lLength of (d).
Wherein the sequence C is inputu,lCan be expressed as:
Cu,l={il,...,il+L-1,il+L+T,...,il+L+R+T-1} (1)
a2, will activate the function through tanh
Figure BDA0003252766570000093
Is limited to a value between-1 and 1.
A3, based on the adaptive matrix
Figure BDA0003252766570000094
Constructing a first layer-by-layer propagation rule of a project, and obtaining a first embedding of the project based on the first layer-by-layer propagation rule
Figure BDA0003252766570000095
Wherein the mathematical expression of the first embedding of the item can be expressed as:
Figure BDA0003252766570000096
wherein the content of the first and second substances,
Figure BDA0003252766570000097
weights for controlling the graph neural network, d is the dimension of embedding the items, Su,lAs an input sequence Cu,lAnd embedding the converted embedded representation.
In particular, the matrix is embedded by one item
Figure BDA0003252766570000098
Embedded inputSequence Cu,lThe embedding resulting in the input sequence may be expressed as:
Su,l=[el,...,el+L-1,el+L+R+T,...,el+L+R+T-1] (3)
Figure BDA0003252766570000099
the final hidden state for the input sequence after r propagation steps.
In some embodiments of the present application, the above step S200 builds the adjacency matrix A ∈ R of the input sequence based on the graph neural network(L+R)×(L+R)May include:
for each item in the input sequence, K subsequent items are extracted, and edges are added between the subsequent items, where K is a preset number of items.
For example, referring to FIG. 3, in an alternative embodiment, first, for the input sequence i1,i2,i3,i4,i7,i8For each entry of { circumflex over }, 2 subsequent entries are extracted and edges are added between these subsequent entries. For example, for item i1Extracting a subsequent item i of the item2And i3Adding edges for the two items; for item i2Extracting a subsequent item i of the item3And i4And add edges to the two items. Furthermore, for an item in the middle of the input sequence, no edges are added thereto unless the length of the target sequence is less than the number of edges to be added for subsequent items. The resulting project map is shown in fig. 3.
For the project graph shown in FIG. 3, an adjacency matrix is generated for it, and the weights of the edges are determined according to the number of neighbors to which the project is connected. For example, for item i2And i3Due to i2Has 3 connected neighbors, so i2To i3Is set to 1/3. The resulting adjacency matrix is shown in fig. 4. Then, normalization processing is carried out on the adjacency matrix to obtain a final adjacency matrix A.
In some embodiments of the present application, the step S200 of building a second embedding of the item based on the adjacency matrix may include:
constructing a second layer-by-layer propagation rule of the project based on the adjacency matrix A, and acquiring a second embedding of the project based on the second layer-by-layer propagation rule
Figure BDA0003252766570000101
Wherein the item second embedded mathematical expression may be expressed as:
Figure BDA0003252766570000102
wherein the content of the first and second substances,
Figure BDA0003252766570000103
the weights used to control the neural network, d is the dimension in which the items are embedded.
In some embodiments of the present application, the step S300 of building a local interest model of the user through an attention mechanism according to the item first embedding and the item second embedding may include:
b1, capturing the multidimensional attention of the input sequence through an importance scoring matrix, and assigning the weight of the multidimensional attention to the embedded H 'of the input sequence'u,lIn (1), obtain an attention weight matrix S'u,l
Wherein the input sequence is embedded in H'u,lBy first embedding of items
Figure BDA0003252766570000104
And item second embedding
Figure BDA0003252766570000105
Merging to obtain; specifically, embedding of the input sequence H'u,lBy item first embedding
Figure BDA0003252766570000106
And item second embedding
Figure BDA0003252766570000107
The element-by-element products are combined, and the mathematical expression of the product can be expressed as:
Figure BDA0003252766570000108
b2, attention weight matrix S'u,lEmbedding with input sequence H'u,lMultiplying to obtain a characterization matrix Z of the input sequenceu,l
Wherein, attention weight matrix S'u,lCan be expressed as:
Figure BDA0003252766570000109
wherein the content of the first and second substances,
Figure BDA00032527665700001010
to learn parameters, daRepresents from H'u,lD th of attention extractionaAnd (5) carrying out the following steps.
Characterization matrix Zu,lCan be expressed as:
Figure BDA00032527665700001011
b3, inputting the characterization matrix Z of the sequence through an averaging function Avgu,lConversion into local interest model
Figure BDA00032527665700001012
Wherein, the mathematical expression of the local interest model can be expressed as:
Figure BDA00032527665700001013
in some embodiments of the present application, the step S400 of constructing the sequence recommendation model according to the embedding of the target sequence, the local interest model of the user, and the global interest model may include:
c1, embedding H 'of the input sequence'u,lAnd embedding of the target sequence Q ∈ Rd*JAnd carrying out inner product to obtain the project relation. The item relationship may be expressed as:
Figure BDA0003252766570000111
where d is the dimension of embedding the item, J is the number of items in the target sequence, qjIs the embedding of the target sequence Q ∈ Rd*JColumn j.
C2, local interest model based on user
Figure BDA0003252766570000112
And a global interest model Pu,l∈Rd′Building user personality characterization
Figure BDA0003252766570000113
Wherein the user personality characterizes
Figure BDA0003252766570000114
Can be expressed as:
Figure BDA0003252766570000115
[·;·]denotes vertical splicing (W)u∈R(d+d′)×dFor modeling local interests
Figure BDA0003252766570000116
And a global interest model PuCompression to visa potential space RdAnd d' is the dimension of embedding the user.
C3, based on the item relation sumThe user personality characterization
Figure BDA0003252766570000117
Constructing sequence recommendation models
Figure BDA0003252766570000118
Wherein the sequence recommendation model
Figure BDA0003252766570000119
Can be expressed as:
Figure BDA00032527665700001110
Figure BDA00032527665700001111
is a local interest model
Figure BDA00032527665700001112
Q of (a) to (b)jIs the embedding of the target sequence Q ∈ Rd*JColumn j.
In particular, the amount of the solvent to be used,
Figure BDA00032527665700001113
the value of (d) reflects the predicted score for each item in the input sequence from which items that can be recommended next can be determined.
In some embodiments of the present application, the mathematical expression of the loss function in step S500 may be expressed as:
Figure BDA00032527665700001114
wherein (u, S)u,j+,j-) e.D represents the generated pairwise preference set, SuRepresenting elements in a user's input sequence, j+And j-Respectively represent target sequences Tu,lA positive example and a negative example of (c),σ is a sigmoid function, Θ represents other learnable parameters, and λ is a regularization parameter.
By minimizing the objective function, the partial derivatives with respect to all parameters can be calculated from the back-propagated gradient descent.
Based on the method for constructing the sequence recommendation model, the embodiment of the application further provides a sequence recommendation method. Referring to fig. 5, a sequence recommendation method provided in an embodiment of the present application may include:
and taking the historical item interaction sequence of the user as an input sequence, and inputting the trained sequence recommendation model to obtain a sequence recommendation result.
The sequence recommendation model is a model constructed by the method for constructing the sequence recommendation model provided by any embodiment of the application.
In some embodiments of the present application, the training process of the sequence recommendation model may include:
d1, determining a subsequence from the user's historical item interaction sequence, and determining a first item subsequence and a second item subsequence from the subsequence, with the first item subsequence as an input sequence and the second item subsequence as a target sequence.
And D2, inputting the input sequence and the target sequence into the sequence recommendation model, and determining an output sequence.
And D3, calculating the loss value of each item in the output sequence according to a set objective function, and updating the learnable parameters of the sequence recommendation model by taking the loss value approaching a preset loss threshold value as an objective.
In some embodiments of the present application, the process of determining a subsequence from a sequence of interactions of historical items by the user D1 may include:
and splitting the historical item interaction sequence of the user into fine-grained subsequences by adopting a sliding window strategy.
In some embodiments of the present application, the process of D1 determining the first item sub-sequence and the second item sub-sequence from the sub-sequence may include:
e1, forming a first sub-sequence of items by time-ordering the L consecutive items to the left and the R consecutive items to the right of the sub-sequence.
E2, forming said second item sub-sequence from the remaining T items of the sub-sequence. Wherein L, R and T are preset values, and the total length of the subsequence is L + R + T.
By taking the L consecutive items on the left and R consecutive items on the right of each sub-sequence as inputs and the T items in the middle as target values to be predicted, i.e. target sequences, the past and future context information can be better utilized.
For example, assume with Cu,l={il,...,il+L-1,il+L+T,...,il+L+R+T-1T is the l-th sub-sequence used to represent user u, then Tu,l={il+L,...,il+L+T-1The entries in the } represent the corresponding target values, i.e. target sequences, and the input to the model is a subsequence containing L + R entries.
For convenience, in the description of the specific algorithm in the method for constructing the sequence recommendation model in the previous section of the present application, the input sequence determined by the methods of E1 and E2 and the target sequence are also used for the related description. It is understood that L, R and T are merely representations of the number of elements of the input sequence and the target sequence, which may be any values. When the method of E1 or E2 is not used in practical application to determine the input sequence and the target sequence, the mathematical expression in the algorithm described above can be adapted accordingly. For example, when the input sequence is N consecutive items, then N is substituted for L + R.
In an alternative embodiment, the algorithm pseudo code for training the sequence recommendation model may be as shown in table 1.
Table 1: algorithm pseudo code for training sequence recommendation model
Figure BDA0003252766570000131
Training a sequence recommendation model by pseudo-code as described in Table 1Training, finally, the parameters W obtained by training can be returned*、b*Adaptive adjacency matrix
Figure BDA0003252766570000132
And embedded representations of users and items. Further, the target item of the user can be predicted through the trained sequence recommendation model.
In summary, the following steps:
according to the method and the device, the self-adaptive adjacency matrix of the input sequence is constructed, so that the relation between each item in the input sequence is learned in an end-to-end mode under the condition that no prior knowledge is needed, the influence of neighbors on the items can be learned through the first embedding of the items constructed on the basis, and the dependency relation of the items is represented more accurately.
While using the adaptive adjacency matrix, also considering existing methods of graph neural networks, an adjacency matrix of the input sequence is constructed based on the graph neural network, and a second embedding of the project is constructed based on this.
And building a local interest model of the user by combining the item first embedding and the item second embedding through an attention mechanism. The local interest model is capable of capturing local interest features of a user.
The global interest characteristics of the user are captured by constructing a global interest model of the user. And finally, combining the embedding of the target sequence, the local interest model and the global interest model to construct a sequence recommendation model.
In the design of the loss function, the Bayes personalized ranking target model is optimized based on gradient descent, and the pairwise ranking between the positive sample and the negative sample is optimized.
Based on the characteristics, the sequence recommendation model does not need to depend on the existing composition mode and the prior knowledge, and avoids the improper influence caused by noise points by automatically learning the weights of edges, so that more accurate project embedding and more accurate local interest are learned, and further, the sequence recommendation can be effectively and reliably realized.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, the embodiments may be combined as needed, and the same and similar parts may be referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of constructing a sequence recommendation model, comprising:
constructing an adaptive adjacency matrix of an input sequence, and constructing a first embedding of a project based on the adaptive adjacency matrix; the self-adaptive adjacency matrix is used for learning the relation between items in the input sequence in an end-to-end mode;
constructing an adjacency matrix of the input sequence based on the graph neural network, and constructing a second embedding of the project based on the adjacency matrix; the adjacency matrix is used for aggregating adjacent information of the input sequence;
constructing a local interest model of the user through an attention mechanism according to the first embedding of the item and the second embedding of the item;
constructing a global interest model of a user and embedding of a target sequence, and constructing a sequence recommendation model according to the embedding of the target sequence, the local interest model of the user and the global interest model;
and constructing a loss function of the sequence recommendation model based on gradient descent and Bayesian personalized sorting.
2. The method of claim 1, wherein the process of constructing a first embedding of an item based on the adaptive adjacency matrix comprises:
initializing the adaptive adjacency matrix
Figure FDA0003252766560000011
Wherein the content of the first and second substances,
Figure FDA0003252766560000012
having learnable parameters;
by the tanh activation function will
Figure FDA0003252766560000013
Is limited to a value between-1 and 1;
based on the adaptive matrix
Figure FDA0003252766560000014
Constructing a first layer-by-layer propagation rule of a project, and obtaining a first embedding of the project based on the first layer-by-layer propagation rule
Figure FDA0003252766560000015
Wherein, the mathematical expression of the first embedding of the item is as follows:
Figure FDA0003252766560000016
wherein the content of the first and second substances,
Figure FDA0003252766560000017
the weights used to control the neural network, d is the dimension in which the items are embedded,
Figure FDA0003252766560000018
the final hidden state for the input sequence after r propagation steps.
3. The method of claim 2, wherein said constructing a second embedding of items based on said adjacency matrix comprises:
based on the adjacency matrix A ∈ R(L+R)×(L+R)Constructing a second layer-by-layer propagation rule of the project, and obtaining a second embedding of the project based on the second layer-by-layer propagation rule
Figure FDA0003252766560000019
Wherein, the mathematical expression of the second embedding of the item is as follows:
Figure FDA00032527665600000110
wherein the content of the first and second substances,
Figure FDA00032527665600000111
the weights used to control the neural network, d is the dimension in which the items are embedded.
4. The method of claim 3, wherein the process of constructing the local interest model of the user through an attention mechanism according to the item first embedding and the item second embedding comprises:
capturing multidimensional attention of the input sequence through an importance scoring matrix, and assigning weights of the multidimensional attention to embedding H 'of the input sequence'u,lIn (1), obtain an attention weight matrix S'u,l(ii) a Wherein the embedding of the input sequence is H'u,lFirst embedding by the item
Figure FDA0003252766560000021
And second embedding of said item
Figure FDA0003252766560000022
Merging to obtain;
will notice the weight matrix S'u,lEmbedding with input sequence H'u,lMultiplying to obtain a characterization matrix Z of the input sequenceu,l
Characterization matrix Z of input sequence by means of averaging functionu,lConversion into local interest model
Figure FDA0003252766560000023
The mathematical expression of the local interest model is as follows:
Figure FDA0003252766560000024
characterization matrix Zu,lThe mathematical formula of (1) is as follows:
Figure FDA0003252766560000025
attention weight matrix S'u,lThe mathematical formula of (1) is as follows:
Figure FDA0003252766560000026
insertion of input sequence H'u,lThe mathematical formula of (1) is as follows:
Figure FDA0003252766560000027
wherein the content of the first and second substances,
Figure FDA0003252766560000028
to learn parameters, daRepresents from H'u,lD th of attention extractionaAnd (5) carrying out the following steps.
5. The method of claim 4, wherein the process of constructing the sequence recommendation model according to the embedding of the target sequence, the local interest model of the user and the global interest model comprises:
embedding H 'of the input sequence'u,lAnd embedding of the target sequence Q ∈ Rd*JCarrying out inner product to obtain a project relation; wherein d is the dimension of embedding the items, and J is the number of the items in the target sequence;
user-based local interest model
Figure FDA0003252766560000029
And a global interest model Pu∈Rd′Building user personality characterization
Figure FDA00032527665600000210
d' is the dimension of embedding the user;
characterizing based on the item relationships and the user personality
Figure FDA00032527665600000211
Constructing sequence recommendation models
Figure FDA00032527665600000212
Wherein the user personality characterizes
Figure FDA00032527665600000213
The mathematical formula of (1) is as follows:
Figure FDA00032527665600000214
[·;·]indicating vertical splicing, Wu∈R(d+d′)×dFor modeling local interests
Figure FDA00032527665600000215
And a global interest model PuCompression to visa potential space Rd
Sequence recommendation model
Figure FDA0003252766560000031
The mathematical formula of (1) is as follows:
Figure FDA0003252766560000032
wherein the content of the first and second substances,
Figure FDA0003252766560000033
is a local interest model
Figure FDA0003252766560000034
Q of (a) to (b)jIs the embedding of the target sequence Q ∈ Rd*JColumn j.
6. The method of claim 5, wherein the mathematical expression of the loss function comprises:
Figure FDA0003252766560000035
wherein (u, S)u,j+,j-) e.D represents the generated pairwise preference set, SuFor indicatingElements in the user's sequence of historical item interactions, j+And jRespectively represent a second item subsequence Tu,lσ is a sigmoid function, Θ represents other learnable parameters, and λ is a regularization parameter.
7. A method for sequence recommendation, comprising:
taking a historical item interaction sequence of a user as an input sequence, and inputting the trained sequence recommendation model to obtain a sequence recommendation result;
the sequence recommendation model is a model constructed by the method of any one of claims 1 to 6.
8. The method of claim 7, wherein the training process of the sequence recommendation model comprises:
determining a subsequence from a historical item interaction sequence of a user, and determining a first item subsequence and a second item subsequence from the subsequences, wherein the first item subsequence is used as an input sequence, and the second item subsequence is used as a target sequence;
inputting the input sequence and the target sequence into the sequence recommendation model, and determining an output sequence;
and calculating loss values of all items in the output sequence according to a set objective function, and updating learnable parameters of the sequence recommendation model by taking the loss values approaching a preset loss threshold as a target.
9. The method of claim 8, wherein determining the sub-sequence from the sequence of historical item interactions of the user comprises:
and splitting the historical item interaction sequence of the user into fine-grained sub-sequences by adopting a sliding window strategy.
10. The method of claim 8, wherein said determining a first sub-sequence of items and a second sub-sequence of items from said sub-sequences comprises:
composing said first sub-sequence of items from L consecutive items to the left and R consecutive items to the right of said sub-sequence, ordered in time;
composing said second sub-sequence of items from the remaining T items of said sub-sequence;
wherein L, R and T are preset values, and the total length of the subsequence is L + R + T.
CN202111051076.3A 2021-09-08 2021-09-08 Method for constructing sequence recommendation model and sequence recommendation method Active CN113762477B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111051076.3A CN113762477B (en) 2021-09-08 2021-09-08 Method for constructing sequence recommendation model and sequence recommendation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111051076.3A CN113762477B (en) 2021-09-08 2021-09-08 Method for constructing sequence recommendation model and sequence recommendation method

Publications (2)

Publication Number Publication Date
CN113762477A true CN113762477A (en) 2021-12-07
CN113762477B CN113762477B (en) 2023-06-30

Family

ID=78793953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111051076.3A Active CN113762477B (en) 2021-09-08 2021-09-08 Method for constructing sequence recommendation model and sequence recommendation method

Country Status (1)

Country Link
CN (1) CN113762477B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991163A (en) * 2017-03-31 2017-07-28 福州大学 A kind of song recommendations method based on singer's sound speciality
CN107506823A (en) * 2017-08-22 2017-12-22 南京大学 A kind of construction method for being used to talk with the hybrid production style of generation
CN107613304A (en) * 2017-09-11 2018-01-19 西安电子科技大学 The hidden close transmission method of mobile terminal video stream based on Android platform
CN110717098A (en) * 2019-09-20 2020-01-21 中国科学院自动化研究所 Meta-path-based context-aware user modeling method and sequence recommendation method
CN111209475A (en) * 2019-12-27 2020-05-29 武汉大学 Interest point recommendation method and device based on space-time sequence and social embedded ranking
CN111522962A (en) * 2020-04-09 2020-08-11 苏州大学 Sequence recommendation method and device and computer-readable storage medium
CN112559878A (en) * 2020-12-24 2021-03-26 山西大学 Sequence recommendation system and recommendation method based on graph neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106991163A (en) * 2017-03-31 2017-07-28 福州大学 A kind of song recommendations method based on singer's sound speciality
CN107506823A (en) * 2017-08-22 2017-12-22 南京大学 A kind of construction method for being used to talk with the hybrid production style of generation
CN107613304A (en) * 2017-09-11 2018-01-19 西安电子科技大学 The hidden close transmission method of mobile terminal video stream based on Android platform
CN110717098A (en) * 2019-09-20 2020-01-21 中国科学院自动化研究所 Meta-path-based context-aware user modeling method and sequence recommendation method
CN111209475A (en) * 2019-12-27 2020-05-29 武汉大学 Interest point recommendation method and device based on space-time sequence and social embedded ranking
CN111522962A (en) * 2020-04-09 2020-08-11 苏州大学 Sequence recommendation method and device and computer-readable storage medium
CN112559878A (en) * 2020-12-24 2021-03-26 山西大学 Sequence recommendation system and recommendation method based on graph neural network

Also Published As

Publication number Publication date
CN113762477B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
CN110717098B (en) Meta-path-based context-aware user modeling method and sequence recommendation method
CN109544306B (en) Cross-domain recommendation method and device based on user behavior sequence characteristics
CN111523047A (en) Multi-relation collaborative filtering algorithm based on graph neural network
US20220301024A1 (en) Sequential recommendation method based on long-term and short-term interests
CN110532471B (en) Active learning collaborative filtering method based on gated cyclic unit neural network
CN112597392B (en) Recommendation system based on dynamic attention and hierarchical reinforcement learning
CN110659742A (en) Method and device for acquiring sequence representation vector of user behavior sequence
CN114298851A (en) Network user social behavior analysis method and device based on graph sign learning and storage medium
CN112819575B (en) Session recommendation method considering repeated purchasing behavior
CN112287166A (en) Movie recommendation method and system based on improved deep belief network
CN113420221B (en) Interpretable recommendation method integrating implicit article preference and explicit feature preference of user
CN112396492A (en) Conversation recommendation method based on graph attention network and bidirectional long-short term memory network
Hamada et al. A fuzzy-based approach for modelling preferences of users in multi-criteria recommender systems
CN116521908A (en) Multimedia content personalized recommendation method based on artificial intelligence
CN110059251B (en) Collaborative filtering recommendation method based on multi-relation implicit feedback confidence
CN112364245B (en) Top-K movie recommendation method based on heterogeneous information network embedding
CN117390266B (en) Project recommendation method based on high-order neighbor generation algorithm and heterogeneous graph neural network
CN113761388A (en) Recommendation method and device, electronic equipment and storage medium
CN111897999B (en) Deep learning model construction method for video recommendation and based on LDA
CN113590976A (en) Recommendation method of space self-adaptive graph convolution network
CN116204723A (en) Social recommendation method based on dynamic hypergraph representation learning
CN113449176A (en) Recommendation method and device based on knowledge graph
CN117035914A (en) Product recommendation method and device, computer equipment and storage medium
CN116842277A (en) Social recommendation method based on cross-topic comparison learning
CN116362836A (en) Agricultural product recommendation algorithm based on user behavior sequence

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant