WO2023098098A1

WO2023098098A1 - Tag-aware recommendation method based on attention mechanism and hypergraph convolution

Info

Publication number: WO2023098098A1
Application number: PCT/CN2022/106913
Authority: WO
Inventors: 王海艳; 尤恺翔; 骆健
Original assignee: 南京邮电大学
Priority date: 2021-12-02
Filing date: 2022-07-21
Publication date: 2023-06-08
Also published as: CN114117142A

Abstract

A tag-aware recommendation method based on an attention mechanism and hypergraph convolution. The tag-aware recommendation method constructs different hypergraphs respectively on a user side and an item side according to direct interaction relationships between users and items and indirect interaction relationships among the users, the items and tags, extracts information reflected by high-order relationships by means of the hypergraph convolution, distinguishes information having different degrees of importance by means of the attention mechanism, and carries out recommendation by means of obtained feature representation. The method introduces the hypergraph convolution to mine the high-order relationships for feature extraction, and uses the attention mechanism to allocate weights to features obtained from hypergraphs of user-item direct interaction and hypergraphs constructed by tag-awareness, so that information having different degrees of importance can be better distinguished. By combining the hypergraph convolution and the attention mechanism, the method can fully extract features from the user-item direct interaction relationships and from the interaction relationships with the tags, thus improving the performance of the recommendation method.

Description

A tag-aware recommendation method based on an attention mechanism and hypergraph convolution

Technical field

The invention belongs to the application fields of information services and computer software technology, and in particular relates to a tag-aware recommendation method based on an attention mechanism and hypergraph convolution.

Background art

With the rapid growth of information resources on the Internet, how to recommend resources or products that meet users' needs from massive amounts of data has attracted increasing attention from industry and academia, and service providers need suitable recommendation methods. To improve recommendation accuracy, many Internet service providers have adopted user-generated content (UGC) tagging systems, which allow users to actively tag products, videos, information, and so on. Traditional recommendation methods such as collaborative filtering struggle to capture the complex, diverse, and high-order interactions between users and items, so their performance is limited. With the introduction of deep learning, recommendation methods based on graph neural networks have attracted attention because they can capture topological structure. However, an ordinary graph represents only pairwise relationships between nodes through its edges and cannot represent relationships among three or more nodes.

Summary of the invention

To solve the problem that the complex high-order relationships among users, items, and tags in tag-aware recommendation cannot be represented by a traditional graph neural network, the present invention introduces hypergraphs to model the high-order relationships between nodes, uses hypergraph convolution to carry out information propagation in the neural network, and uses an attention mechanism to reflect the importance of different information, thereby effectively alleviating the loss of high-order information in traditional recommendation methods.

In order to achieve the above object, the present invention is achieved through the following technical solutions:

The present invention is a tag-aware recommendation method based on an attention mechanism and hypergraph convolution. The method constructs different hypergraphs on the user side and the item side from the direct interaction relationships between users and items and the indirect interaction relationships among users, items, and tags, extracts the information reflected by high-order relationships through hypergraph convolution, uses the attention mechanism to distinguish information of different degrees of importance, and makes recommendations from the resulting feature representations. The specific steps are as follows:

Step 1: Initialize the feature representation of users and items to get u, i;

Step 2: Obtain the tag-based user feature u^{tag} and item feature i^{tag} from the interactions of users and of items with tags, respectively;

Step 3: Construct three bipartite interaction graphs G_{user-item}, G_{user-tag}, and G_{item-tag} from the interactions between users and items, users and tags, and items and tags;

Step 4: From the user-item bipartite graph obtained in step 3, construct the user-side and item-side hypergraph structures that represent the direct interaction relationships;

Step 5: From the user-tag and item-tag bipartite graphs obtained in step 3, construct the user-side and item-side hypergraph structures that represent the tag relationships;

Step 6: Using the two hypergraphs obtained in step 4, perform hypergraph convolution on the features u and i obtained in step 1 to obtain the updated neighborhood feature representations u^1 and i^1 of the direct interaction relationships on the user side and item side;

Step 7: Using the two hypergraphs obtained in step 5, perform hypergraph convolution on the features u and i obtained in step 1 to obtain the updated neighborhood feature representations u^2 and i^2 of the tag relationships on the user side and item side;

Step 8: Use the attention mechanism to process the features obtained in steps 2, 6, and 7, assigning weights to the different feature representations to obtain the final user and item feature representations u^*, i^*;

Step 9: Concatenate the user and item feature representations obtained in step 8 to obtain z=[u^*; i^*], feed it into a fully connected layer followed by the Sigmoid function to obtain the predicted probability, and make recommendations based on this score.

A further improvement of the present invention is that, in step 2, the tag-based features of users and items are initialized from the number of times each user applies each tag and the number of times each item is marked with each tag, and are then normalized.

A further improvement of the present invention is that, in step 4, the user-side and item-side hypergraphs representing the direct interaction relationships are constructed from the user-item bipartite graph. Taking the user side as an example: if there is a path between two items m and n, and the number of users on the path is less than k, then the two items are k-order reachable neighbors; for item m, if it has a k-order reachable neighbor n and user u interacts directly with m, then user u is a k-order reachable user of item n. For each item, its k-order reachable users form a set; the users in the set are treated as nodes and the set itself as a hyperedge, which yields the user-side hypergraph.

The item-side hypergraph is constructed in the same way.

A further improvement of the present invention is that, in step 5, the user-side and item-side hypergraphs representing the tag relationships are constructed from the user-tag and item-tag bipartite graphs. The idea is the same as in step 4, except that the user-tag and item-tag relationships are used instead of the user-item relationships. Taking the user side as an example: if there is a path between two tags m and n, and the number of users on the path is less than k, then the two tags are k-order reachable neighbors; for tag m, if it has a k-order reachable neighbor n and user u interacts directly with m, then user u is a k-order reachable user of tag n. For each tag, its k-order reachable users form a set; the users in the set are treated as nodes and the set itself as a hyperedge, which yields the user-side hypergraph.

The item-side hypergraph is constructed in the same way.

The beneficial effects of the present invention are:

(1) The present invention uses the hypergraph data structure to represent the high-order relationships of users and items and uses hypergraph convolution to update information. Compared with a graph neural network, this reduces the loss of information during information propagation and makes full use of high-order interaction relationships to obtain neighborhood feature representations.

(2) The present invention uses the attention mechanism to weight multiple feature representations according to their degrees of importance, preventing high-value information from being overwhelmed by low-value information.

(3) In tag-aware recommendation, the relationship between users, items, and tags is used for feature representation and relationship modeling, and a variety of information is fully utilized for feature representation.

(4) In result prediction, the user-side and item-side features are first concatenated and then processed by a fully connected layer and the Sigmoid activation function, which reduces the information loss caused by taking a direct inner product of the features.

Description of drawings

Figure 1 is a diagram of the relationship between users, items, and tags.

Figure 2 shows hypergraph convolution over the different hypergraph representations.

Figure 3 is the multi-feature processing process based on the attention mechanism.

Figure 4 is the score prediction process.

Detailed description

Embodiments of the present invention are disclosed below with reference to the drawings. For clarity, many practical details are described together in the following description. It should be understood, however, that these practical details should not be used to limit the invention; that is, in some embodiments of the invention, these practical details are not necessary. In addition, to simplify the drawings, some well-known and commonly used structures and components are shown in a simple schematic manner.

To solve the problems in tag-aware recommendation, the present invention introduces hypergraphs to model the high-order relationships between nodes, uses hypergraph convolution to carry out information propagation in the neural network, and uses the attention mechanism to reflect the importance of different information, thereby effectively alleviating the loss of high-order information in traditional recommendation methods. Figure 1 shows the interaction relationships among users, items, and tags, including the direct interactions between users and items and the user-tag and item-tag relationships. To extract features from these complex relationships for recommendation, the present invention introduces the attention mechanism and hypergraph convolution.

As shown in Figures 2-3, the present invention is a tag-aware recommendation method based on an attention mechanism and hypergraph convolution. The method constructs different hypergraphs on the user side and the item side from the direct interaction relationships between users and items and the indirect interaction relationships among users, items, and tags, extracts the information reflected by high-order relationships through hypergraph convolution, uses the attention mechanism to distinguish information of different degrees of importance, and makes recommendations from the resulting feature representations. It comprises the following steps:

Step 1: Initialize features. The initial feature u is built on the user side from personal information such as user ID, gender, and age, and the initial feature i is built on the item side from product information. The tag-aware part is not involved in this step.

Step 2: Use the tuple F=(U, I, T, A) to denote the user set U, the item set I, the tag set T, and the set A of ternary relations among them. The user-tag feature is built from the number of times the user has applied each tag:

u^{tag} = \sigma\left(\left[c_{u,1}, c_{u,2}, \ldots, c_{u,|T|}\right]\right)

Similarly, the item-tag feature is:

i^{tag} = \sigma\left(\left[c_{i,1}, c_{i,2}, \ldots, c_{i,|T|}\right]\right)

where the counts, written here as c_{u,p} and c_{i,p}, denote the number of times that user u has applied tag p and that item i has been marked with tag p, respectively, and σ denotes the normalization operation.
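As a minimal sketch of step 2, the following NumPy code builds the tag-based features from (user, item, tag) index triples. The function name `tag_count_features`, the toy triples, and the choice of L1 row normalization for σ are assumptions made for illustration; the text only states that σ is a normalization operation.

```python
import numpy as np

def tag_count_features(triples, n_users, n_items, n_tags):
    """Count how often each user applies, and each item receives, every tag."""
    user_tag = np.zeros((n_users, n_tags))
    item_tag = np.zeros((n_items, n_tags))
    for u, i, t in triples:
        user_tag[u, t] += 1.0
        item_tag[i, t] += 1.0

    def normalize(counts):
        # Assumed sigma: divide each row by its sum; all-zero rows stay zero.
        totals = counts.sum(axis=1, keepdims=True)
        return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    return normalize(user_tag), normalize(item_tag)

# Toy assignment set A: (user, item, tag) index triples.
triples = [(0, 0, 0), (0, 1, 0), (0, 1, 1), (1, 1, 1)]
u_tag, i_tag = tag_count_features(triples, n_users=2, n_items=2, n_tags=2)
print(u_tag)
print(i_tag)
```

Each row of `u_tag` and `i_tag` then plays the role of u^{tag} or i^{tag} for one user or item; a different normalization (for example min-max or softmax) could be substituted without changing the rest of the pipeline.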

Step 3: Construct three bipartite interaction graphs G_{user-item}, G_{user-tag}, and G_{item-tag} from the interactions between users and items, users and tags, and items and tags. Figure 1 can be regarded as a combined representation of the three bipartite graphs. The purpose of building the bipartite graphs is to construct the hypergraph structures. The user-item bipartite graph uses the direct interactions between users and items, such as a rating matrix.
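One way to hold the three bipartite graphs of step 3 is as binary biadjacency matrices. The sketch below is illustrative only: the function name `build_bipartite_graphs` is hypothetical, and deriving the user-item edges from the tagging triples is an assumption (the text notes that the user-item graph may instead come from direct interactions such as a rating matrix).

```python
import numpy as np

def build_bipartite_graphs(triples, n_users, n_items, n_tags):
    """Return binary biadjacency matrices for the three bipartite graphs."""
    g_user_item = np.zeros((n_users, n_items), dtype=np.int64)
    g_user_tag = np.zeros((n_users, n_tags), dtype=np.int64)
    g_item_tag = np.zeros((n_items, n_tags), dtype=np.int64)
    for u, i, t in triples:
        g_user_item[u, i] = 1  # user u interacted with item i
        g_user_tag[u, t] = 1   # user u applied tag t
        g_item_tag[i, t] = 1   # item i was marked with tag t
    return g_user_item, g_user_tag, g_item_tag

# Toy (user, item, tag) index triples.
triples = [(0, 0, 0), (0, 1, 1), (1, 1, 1), (2, 2, 0)]
g_ui, g_ut, g_it = build_bipartite_graphs(triples, n_users=3, n_items=3, n_tags=2)
print(g_ui)
print(g_ut)
print(g_it)
```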

Step 4: From the user-item bipartite graph, construct the user-side and item-side hypergraphs that represent the direct interaction relationships. Taking the user side as an example: if there is a path between two items m and n, and the number of users on the path is less than k, then the two items are k-order reachable neighbors; for item m, if it has a k-order reachable neighbor n and user u interacts directly with m, then user u is a k-order reachable user of item n. For each item, its k-order reachable users form a set; the users in the set are treated as nodes and the set itself as a hyperedge, which yields the user-side hypergraph.

The item-side hypergraph is constructed in the same way. Specifically: if there is a path between two users u and v, and the number of items on the path is less than k, then the two users are k-order reachable neighbors; for user u, if it has a k-order reachable neighbor v and item m interacts directly with u, then item m is a k-order reachable item of user v. For each user, its k-order reachable items form a set; the items in the set are treated as nodes and the set itself as a hyperedge, which yields the item-side hypergraph.
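The k-order reachability rule can be sketched with matrix products on the biadjacency matrix. This is one possible reading, not necessarily the exact construction intended: the function name `k_order_incidence` is hypothetical, and the code assumes that an item counts as its own neighbor (a path through zero users), so that k=1 reduces to the direct interactions.

```python
import numpy as np

def k_order_incidence(A, k):
    """Incidence matrix of the user-side hypergraph built from biadjacency A.

    A[u, m] = 1 if user u interacts directly with item m. Column n of the
    result is the hyperedge of item n: its k-order reachable users.
    """
    A = (np.asarray(A) > 0).astype(np.int64)
    n_items = A.shape[1]
    # Two items are one hop apart when they share at least one user.
    shares_user = ((A.T @ A) > 0).astype(np.int64)
    # reach[m, n] = 1 if some path from m to n passes through fewer than k users.
    # Assumption: an item reaches itself through zero users.
    reach = np.eye(n_items, dtype=np.int64)
    for _ in range(k - 1):
        reach = ((reach + reach @ shares_user) > 0).astype(np.int64)
    # User u joins hyperedge n if u interacts with some item m that reaches n.
    return ((A @ reach) > 0).astype(np.int64)

# Toy user-item matrix: 3 users (rows) and 3 items (columns).
A = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
H_user = k_order_incidence(A, k=2)
print(H_user)
```

Under the same assumptions, passing `A.T` would give the item-side hypergraph (items as nodes, one hyperedge per user), and passing a user-tag or item-tag matrix would give the tag-based hypergraphs of step 5.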

Step 5: From the user-tag and item-tag bipartite graphs, construct the user-side and item-side hypergraphs that represent the tag relationships. Taking the user side as an example: if there is a path between two tags m and n, and the number of users on the path is less than k, then the two tags are k-order reachable neighbors; for tag m, if it has a k-order reachable neighbor n and user u interacts directly with m, then user u is a k-order reachable user of tag n. For each tag, its k-order reachable users form a set; the users in the set are treated as nodes and the set itself as a hyperedge, which yields the user-side hypergraph.

The item-side hypergraph is constructed in the same way. Specifically: if there is a path between two tags m and n, and the number of items on the path is less than k, then the two tags are k-order reachable neighbors; for tag m, if it has a k-order reachable neighbor n and item p interacts directly with m, then item p is a k-order reachable item of tag n. For each tag, its k-order reachable items form a set; the items in the set are treated as nodes and the set itself as a hyperedge, which yields the item-side hypergraph.

Step 6: Represent each hypergraph, with node set V and hyperedge set E, by its incidence matrix H, whose entries indicate whether node v lies on hyperedge e:

h(v, e) = \begin{cases} 1, & v \in e \\ 0, & v \notin e \end{cases}

The incidence matrices of the two direct-interaction hypergraphs are therefore denoted H_{d-user} and H_{d-item}.

Taking the user side as an example, hypergraph convolution is performed with the spectral method. In the convolution, Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, and D_E is the hyperedge degree matrix. H_{d-user} is the incidence matrix of the user-side direct-interaction hypergraph, with node set V and hyperedge set E. Multiplying the node features by the transpose of H_{d-user} aggregates node features into hyperedge features on that hypergraph, and the subsequent multiplication by H_{d-user} aggregates hyperedge features back into node features. In addition, following the idea of ResNet, the original features are added at each layer of the hypergraph convolution to retain the influence of earlier features, so that the initial features are not drowned out by the neighbor features.
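The convolution formula itself is not legible in this text. As a hedged sketch, the code below assumes the widely used spectral form X^{(l+1)} = σ(D_v^{-1/2} H D_E^{-1} H^T D_v^{-1/2} X^{(l)} Θ^{(l)}) + X^{(l)} with unit hyperedge weights. The ReLU activation, the square Θ, and adding the layer input (rather than the step 1 features) as the residual term are all assumptions; the function name `hypergraph_conv` is hypothetical.

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One spectral hypergraph convolution layer with a residual connection.

    X: (n_nodes, d) node features; H: (n_nodes, n_edges) incidence matrix;
    Theta: (d, d) learnable parameters (square so the residual can be added).
    """
    H = np.asarray(H, dtype=float)
    d_v = H.sum(axis=1)  # node degrees (diagonal of D_v)
    d_e = H.sum(axis=0)  # hyperedge degrees (diagonal of D_E)
    # Inverse degrees, leaving isolated nodes and empty hyperedges at zero.
    dv_inv_sqrt = np.zeros_like(d_v)
    dv_inv_sqrt[d_v > 0] = d_v[d_v > 0] ** -0.5
    de_inv = np.zeros_like(d_e)
    de_inv[d_e > 0] = 1.0 / d_e[d_e > 0]

    # Node -> hyperedge aggregation (multiplication by H^T).
    edge_feat = de_inv[:, None] * (H.T @ (dv_inv_sqrt[:, None] * X))
    # Hyperedge -> node aggregation (multiplication by H).
    node_feat = dv_inv_sqrt[:, None] * (H @ edge_feat)
    out = np.maximum(node_feat @ Theta, 0.0)  # assumed activation: ReLU
    return out + X  # assumed residual: add the layer input

rng = np.random.default_rng(0)
H = np.array([[1, 1, 0],
              [1, 1, 1],
              [1, 1, 1]])
u0 = rng.normal(size=(3, 4))           # initial user features u
Theta = rng.normal(size=(4, 4)) * 0.1  # hypothetical layer parameters
u1 = hypergraph_conv(u0, H, Theta)     # updated features, playing the role of u^1
print(u1.shape)
```

The same function applies unchanged to the item side and to the tag-based hypergraphs by swapping in the corresponding incidence matrix and feature matrix.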

The item side is handled in the same way as the user side. In the item-side hypergraph convolution, Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, and D_E is the hyperedge degree matrix. H_{d-item} is the incidence matrix of the item-side direct-interaction hypergraph. Multiplying the node features by the transpose of H_{d-item} aggregates node features into hyperedge features on that hypergraph, and the subsequent multiplication by H_{d-item} aggregates hyperedge features back into node features.

Thus, on each layer of hypergraph convolution, information is updated in the form of node-hyperedge-node, and information is extracted from high-order relations in the form of hypergraph neural network.

Step 7: As in step 6, obtain the incidence matrices H_{t-user} and H_{t-item} of the two tag-based hypergraphs.

The user-side hypergraph convolution is defined analogously to step 6, where Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, and D_E is the hyperedge degree matrix. H_{t-user} is the incidence matrix of the user-side tag-based hypergraph. Multiplying the node features by the transpose of H_{t-user} aggregates node features into hyperedge features on that hypergraph, and the subsequent multiplication by H_{t-user} aggregates hyperedge features back into node features.

The item-side hypergraph convolution is defined in the same way, with Θ^{(l)}, σ, D_v, and D_E as above. H_{t-item} is the incidence matrix of the item-side tag-based hypergraph. Multiplying the node features by the transpose of H_{t-item} aggregates node features into hyperedge features on that hypergraph, and the subsequent multiplication by H_{t-item} aggregates hyperedge features back into node features.

Step 8: As shown in Figure 3, use the attention mechanism on the features u^{tag}, i^{tag}, u^1, i^1, u^2, and i^2 obtained in steps 2, 6, and 7 to obtain the final feature representations u^*, i^* on the user side and item side.

Taking the user side as an example, since u^{tag} has a different dimension from u^1 and u^2, direct addition is not suitable. Instead, u^{tag} is concatenated with each of them to obtain u^{1-tag} and u^{2-tag}.

The attention mechanism then combines the two feature representations into the final feature representation, with the attention score for k = 1, 2 given by:

a(u,k) = W^T tanh(W u^{k-tag} + b_2)

The weights α(u,k) obtained from the attention mechanism give the final feature representation u^* on the user side:

u^* = α(u,1) u^{1-tag} + α(u,2) u^{2-tag}

On the item side, i^{tag} is concatenated with i^1 and i^2 to obtain i^{1-tag} and i^{2-tag}, and the attention mechanism combines the two feature representations into one:

a(i,k) = W^T tanh(W i^{k-tag} + b_2)

The weights α(i,k) obtained from the attention mechanism give the final feature representation i^* on the item side:

i^* = α(i,1) i^{1-tag} + α(i,2) i^{2-tag}
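The fusion in step 8 can be sketched as additive attention over the two concatenated features. The text does not spell out how the scores a(·,k) become the weights α(·,k); the softmax used below is an assumption, as is the use of a separate projection vector `w` and matrix `W` (the text writes both as W). The function name `attention_fuse` and all dimensions are hypothetical.

```python
import numpy as np

def attention_fuse(features, W, b, w):
    """Fuse K feature vectors of equal length with additive attention.

    features: (K, d); W: (h, d); b: (h,); w: (h,).
    Score: a_k = w^T tanh(W x_k + b); weights: softmax over k (assumed).
    """
    scores = np.tanh(features @ W.T + b) @ w       # (K,) attention scores
    scores = scores - scores.max()                 # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # (K,) weights summing to 1
    return alpha @ features, alpha

rng = np.random.default_rng(0)
u_tag = rng.random(3)                   # tag-based feature from step 2
u1 = rng.normal(size=4)                 # direct-interaction feature from step 6
u2 = rng.normal(size=4)                 # tag-relation feature from step 7
u_1tag = np.concatenate([u1, u_tag])    # u^{1-tag}
u_2tag = np.concatenate([u2, u_tag])    # u^{2-tag}

d, h = u_1tag.size, 8                   # hypothetical feature and hidden sizes
W, b, w = rng.normal(size=(h, d)), np.zeros(h), rng.normal(size=h)
u_star, alpha = attention_fuse(np.stack([u_1tag, u_2tag]), W, b, w)
print(alpha, u_star.shape)
```

The item side would use the same function on the stacked i^{1-tag} and i^{2-tag}, optionally with its own parameters.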

Step 9: As shown in Figure 4, concatenate u^* and i^* to obtain z=[u^*; i^*], pass it through the fully connected layer, and use the Sigmoid function as the activation function to obtain the user-item probability prediction used for recommendation.

According to the obtained probability prediction, the items are recommended to the user by Top-K sorting.

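Use the cross-entropy function as the loss function for model training:

L = -\sum_{(u,i) \in X} \left[ y_{ui} \log \hat{y}_{ui} + (1 - y_{ui}) \log\left(1 - \hat{y}_{ui}\right) \right]

where X is the training sample set, y_{ui} is the observed label for user u and item i, and \hat{y}_{ui} is the predicted probability.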
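Step 9 and the training loss can be sketched together as follows. This is an illustration under assumptions: a single fully connected layer with hypothetical parameters `w_fc` and `b_fc`, random features in place of learned ones, a binary cross-entropy averaged over samples, and toy labels.

```python
import numpy as np

def predict(u_star, i_star, w_fc, b_fc):
    """Concatenate z = [u*; i*], apply one fully connected layer, then Sigmoid."""
    z = np.concatenate([u_star, i_star], axis=-1)
    logits = z @ w_fc + b_fc
    return 1.0 / (1.0 + np.exp(-logits))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over the training samples."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

rng = np.random.default_rng(0)
d = 4
u_star = rng.normal(size=d)                # one user's final feature u^*
items = rng.normal(size=(5, d))            # final features i^* of 5 candidate items
w_fc, b_fc = rng.normal(size=2 * d), 0.0   # hypothetical layer parameters

users = np.tile(u_star, (len(items), 1))   # repeat the user for every candidate
probs = predict(users, items, w_fc, b_fc)  # (5,) interaction probabilities
top_k = np.argsort(-probs)[:3]             # indices of the Top-3 items
labels = np.array([1, 0, 0, 1, 0])         # toy ground-truth interactions
loss = binary_cross_entropy(labels, probs)
print(top_k, loss)
```

Because the score comes from a learned layer on the concatenated vector rather than an inner product, the user and item features do not need to share a dimension.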

The present invention introduces hypergraph convolution to mine high-order relations for feature extraction. At the same time, the attention mechanism is used to assign weights to the features obtained by the user-item direct interaction hypergraph and the label-aware hypergraph construction, which can better distinguish information of different importance. By cleverly combining hypergraph convolution and attention mechanism, the method proposed in the present invention can fully extract the features in the direct user-item interaction relationship and the interaction relationship with tags, effectively improving the performance of the recommendation method.

The above descriptions are only embodiments of the present invention, and are not intended to limit the present invention. Various modifications and variations of the present invention will occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall be included in the scope of the claims of the present invention.

Claims

A tag-aware recommendation method based on an attention mechanism and hypergraph convolution, characterized in that: the method constructs different hypergraphs on the user side and the item side from the direct interaction relationships between users and items and the indirect interaction relationships among users, items, and tags, extracts the information reflected by high-order relationships through hypergraph convolution, uses the attention mechanism to distinguish information of different degrees of importance, and makes recommendations from the resulting feature representations, comprising the following steps:

Step 1: Initialize the feature representation of user u and item i, and obtain the initial feature u of user u and the initial feature i of item i;

Step 2: Obtain the tag-based user feature u^{tag} and item feature i^{tag} from the interactions of users and of items with tags, respectively;

Step 3: Construct three bipartite interaction graphs G_{user-item}, G_{user-tag}, and G_{item-tag} from the interactions between users and items, users and tags, and items and tags;

Step 4: From the user-item bipartite graph obtained in step 3, construct the user-side and item-side hypergraph structures that represent the direct interaction relationships;

Step 5: From the user-tag and item-tag bipartite graphs obtained in step 3, construct the user-side and item-side hypergraph structures that represent the tag relationships;

Step 6: Using the two hypergraphs obtained in step 4, perform hypergraph convolution on the features u and i obtained in step 1 to obtain the updated neighborhood feature representations u^1 and i^1 of the direct interaction relationships on the user side and item side;

Step 7: Using the two hypergraphs obtained in step 5, perform hypergraph convolution on the features u and i obtained in step 1 to obtain the updated neighborhood feature representations u^2 and i^2 of the tag relationships on the user side and item side;

Step 8: Use the attention mechanism to process the features obtained in steps 2, 6, and 7, assigning weights to the different feature representations to obtain the final user and item feature representations u^*, i^*;

Step 9: Concatenate the user and item feature representations obtained in step 8 to obtain z=[u^*; i^*], feed it into a fully connected layer followed by the Sigmoid function to obtain the predicted probability, and make recommendations based on this score.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 1, characterized in that the user-side hypergraph structure in step 4 is constructed as follows: if there is a path between two items m and n, and the number of users on the path is less than k, then the two items are k-order reachable neighbors; for item m, if it has a k-order reachable neighbor n and user u interacts directly with m, then user u is a k-order reachable user of item n; for each item, its k-order reachable users form a set, the users in the set are treated as nodes, and the set is treated as a hyperedge, thereby constructing the user-side hypergraph.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 2, characterized in that the item-side hypergraph structure in step 4 is constructed as follows: if there is a path between two users u and v, and the number of items on the path is less than k, then the two users are k-order reachable neighbors; for user u, if it has a k-order reachable neighbor v and item m interacts directly with u, then item m is a k-order reachable item of user v; for each user, its k-order reachable items form a set, the items in the set are treated as nodes, and the set is treated as a hyperedge, thereby constructing the item-side hypergraph.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 1, characterized in that the user-side hypergraph structure in step 5 is constructed as follows: if there is a path between two tags m and n, and the number of users on the path is less than k, then the two tags are k-order reachable neighbors; for tag m, if it has a k-order reachable neighbor n and user u interacts directly with m, then user u is a k-order reachable user of tag n; for each tag, its k-order reachable users form a set, the users in the set are treated as nodes, and the set is treated as a hyperedge, thereby constructing the user-side hypergraph.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 4, characterized in that the item-side hypergraph structure in step 5 is constructed as follows: if there is a path between two tags m and n, and the number of items on the path is less than k, then the two tags are k-order reachable neighbors; for tag m, if it has a k-order reachable neighbor n and item p interacts directly with m, then item p is a k-order reachable item of tag n; for each tag, its k-order reachable items form a set, the items in the set are treated as nodes, and the set is treated as a hyperedge, thereby constructing the item-side hypergraph.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 1, characterized in that:

in step 6, the user-side hypergraph convolution is expressed in terms of the following quantities: Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, D_E is the hyperedge degree matrix, and H_{d-user} is the incidence matrix of the user-side direct-interaction hypergraph, where V is the node set and E is the hyperedge set; multiplication by the transpose of H_{d-user} represents the aggregation from node features to hyperedge features on that hypergraph, and multiplication by H_{d-user} represents the aggregation from hyperedge features to node features;

in step 6, the item-side hypergraph convolution is expressed in terms of the following quantities: Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, D_E is the hyperedge degree matrix, and H_{d-item} is the incidence matrix of the item-side direct-interaction hypergraph, where V is the node set and E is the hyperedge set; multiplication by the transpose of H_{d-item} represents the aggregation from node features to hyperedge features on that hypergraph, and multiplication by H_{d-item} represents the aggregation from hyperedge features to node features.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 1, characterized in that:

in step 7, the user-side hypergraph convolution is expressed in terms of the following quantities: Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, D_E is the hyperedge degree matrix, and H_{t-user} is the incidence matrix of the user-side tag-based hypergraph, where V is the node set and E is the hyperedge set; multiplication by the transpose of H_{t-user} represents the aggregation from node features to hyperedge features on that hypergraph, and multiplication by H_{t-user} represents the aggregation from hyperedge features to node features;

in step 7, the item-side hypergraph convolution is expressed in terms of the following quantities: Θ^{(l)} is the learnable parameter matrix of layer l, σ is the activation function, D_v is the node degree matrix, D_E is the hyperedge degree matrix, and H_{t-item} is the incidence matrix of the item-side tag-based hypergraph, where V is the node set and E is the hyperedge set; multiplication by the transpose of H_{t-item} represents the aggregation from node features to hyperedge features on that hypergraph, and multiplication by H_{t-item} represents the aggregation from hyperedge features to node features.

The tag-aware recommendation method based on an attention mechanism and hypergraph convolution according to claim 1, characterized in that the user-side and item-side feature processing in step 8 is specifically:

using the attention mechanism on the features u^{tag}, i^{tag}, u^1, i^1, u^2, i^2 obtained in steps 2, 6, and 7 to obtain the final feature representations u^*, i^* on the user side and the item side, wherein:

u^{tag} is concatenated with u^1 and u^2 respectively to obtain u^{1-tag} and u^{2-tag}, and the attention mechanism then combines the two feature representations into one feature representation:

a(u,k) = W^T tanh(W u^{k-tag} + b_2)

the weights obtained from the attention mechanism give the final feature representation u^* on the user side:

u^* = α(u,1) u^{1-tag} + α(u,2) u^{2-tag};

i^{tag} is concatenated with i^1 and i^2 respectively to obtain i^{1-tag} and i^{2-tag}, and the attention mechanism then combines the two feature representations into one feature representation:

a(i,k) = W^T tanh(W i^{k-tag} + b_2)

the weights obtained from the attention mechanism give the final feature representation i^* on the item side:

i^* = α(i,1) i^{1-tag} + α(i,2) i^{2-tag}.