CN115082147A - Sequence recommendation method and device based on hypergraph neural network - Google Patents


Info

Publication number
CN115082147A
CN115082147A (application CN202210668287.XA); granted as CN115082147B
Authority
CN
China
Prior art keywords
user
commodity
hypergraph
embedded
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210668287.XA
Other languages
Chinese (zh)
Other versions
CN115082147B (en)
Inventor
许勇
李想
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202210668287.XA priority Critical patent/CN115082147B/en
Publication of CN115082147A publication Critical patent/CN115082147A/en
Application granted granted Critical
Publication of CN115082147B publication Critical patent/CN115082147B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a sequence recommendation method and device based on a hypergraph neural network. The method comprises the following steps: constructing a hypergraph from the data set; learning dynamic short-term user/commodity embedding vectors through hypergraph convolution; combining the dynamic/static user and dynamic/static commodity vectors to obtain interaction embedding vectors of users and commodities; inputting the user-commodity interaction sequences into a transformer module and learning user embedding vectors at different time granularities according to different sliding windows; fusing the dynamic user embedding vectors of the different time granularities; and performing preference prediction with the final dynamic user embedding vector and the fused static/dynamic commodity embedding vector to obtain a recommendation sequence. The invention comprehensively considers information at multiple time granularities of user-commodity interaction, so that the user vector contains user-to-user correlation information; this both ensures the effectiveness of the features and reduces the time complexity of training the main model. The method can be widely applied in the technical field of sequence recommendation.

Description

Sequence recommendation method and device based on hypergraph neural network
Technical Field
The invention relates to the technical field of sequence recommendation, in particular to a sequence recommendation method and device based on a hypergraph neural network.
Background
Since the beginning of the 21st century, with the development of network technology and the popularization of mobile phones and computers, online shopping has become an indispensable part of residents' lives. The rapid development of Internet technology, however, has brought the problem of information explosion: masses of irrelevant information fill people's lives every day. How to help residents select the most needed and best-matched commodities from a massive catalogue has become a major difficulty for every large e-commerce platform. Solving it can better satisfy residents' daily needs, help merchants locate customer sources more accurately, and better resolve problems such as inventory backlog in product sales. A recommendation system is a technical application that recommends commodities to a user according to behaviors such as the user's shopping history and personal preferences. It can accurately present recommended high-quality commodities to a customer, let users feel that what they see is what they need, and reduce the trouble of being surrounded by massive amounts of uninteresting commodity information, thereby effectively alleviating the problem of information overload; on the other hand, it provides more scientific data support for merchants' decisions and improves the service quality of the e-commerce industry.
Current sequence recommendation algorithms can be roughly divided into three categories: models based on traditional recurrent neural networks, such as GRU4Rec+; attention-based neural network models, such as BERT4Rec and SASRec; and sequence recommendation models based on graph neural networks, such as HyperRec. In a model based on a traditional recurrent neural network, the serialized item embedding vectors are generally input into a GRU network; the output of each time step is a vector that, after passing through a multi-layer perceptron, is converted into prediction scores over all items for the user's choice at the next time step. The drawback of such models is that they exploit only the ordering information of the items; the embeddings are generally randomly initialized at the start of training, so user-side information and the internal relations between items are not well utilized.
Attention-based neural network models typically incorporate an attention mechanism into the sequence recommendation model. They attempt to capture information about user activity from the user's recent interactions, can better capture long-term semantics, find out from the user's historical interactions which items are "relevant", and use them to predict the next item. However, models of this type still do not introduce user-side information well and use only the user's behavioral information; within the same time series they consider only information at a single time granularity and ignore information across time granularities.
Sequence recommendation models based on graph neural networks construct graphs of users and commodities and, with the help of graph convolution, can better learn multi-hop relations between users and between users and commodities, better learn the preferences and characteristics of users and commodities, and capture multi-order connections under multiple layers of graph convolution. Consequently, when the next commodity is predicted, the representation vectors of the user and the commodity are richer and the semantics they contain are more comprehensive. However, the computational cost and complexity of graph convolution far exceed those of traditional neural network models, the dependence on graph construction is high, and such models also suffer from data sparsity in the graph. Better solving the data-sparsity problem of the graph therefore becomes important for fusing user-item interaction information into the model more effectively.
Disclosure of Invention
In order to solve at least one of the technical problems in the prior art to a certain extent, the present invention provides a sequence recommendation method and apparatus based on a hypergraph neural network.
The technical scheme adopted by the invention is as follows:
a sequence recommendation method based on a hypergraph neural network comprises the following steps:
acquiring a data set with user and commodity interaction information and time information, constructing a hypergraph according to the acquired data set, and dividing the hypergraph by using different time periods;
dividing to obtain a hypergraph subgraph of each time period, and clustering the hyperedge users in the hypergraph subgraph at each moment to cluster similar hyperedges;
pre-training learning is carried out by introducing contrastive learning and graph convolution to obtain the initial embedding vectors of commodities and users;
inputting the divided hypergraph subgraphs of each time period and the pre-trained initial commodity/user embedding vectors into the main model, and learning the dynamic commodity/user embedding vectors by hypergraph convolution; fusing the static/dynamic commodity and static/dynamic user embedding vectors through the fusion layer to obtain the user-commodity interaction embedding vectors;
inputting the user-commodity interaction embedding vectors into a transformer module, learning three dynamic user embedding vectors, short-, medium- and long-term, according to different time sliding windows, and fusing the three into a final dynamic user embedding vector;
and performing preference prediction with the final dynamic user embedding vector and the fused dynamic/static commodity embedding vectors to obtain the recommendation sequence.
Further, the constructing a hypergraph from the obtained dataset comprises:
preprocessing the data set, and using the preprocessed data set in the construction of the commodity and user hypergraph subgraphs at different times;
the method for constructing the commodity and user hypergraph subgraphs at different times by using the preprocessed data sets comprises the following steps:
the user: u ═ U 1 ,u 2 ,…u L B, }; wherein u is j Embedding a jth user into a vector, wherein j is more than or equal to 1 and less than or equal to L, and L is the total number of users;
commercial products: i ═ I 1 ,i 2 ,…i N B, }; wherein i j J is more than or equal to 1 and less than or equal to N, and N is the total number of the articles;
time: t ═ T 1 ,t 2 ,…t C B, }; wherein, t j J is more than or equal to 1 and less than or equal to C at the jth moment, and C is the total time length;
Figure BDA0003693820420000021
Figure BDA0003693820420000022
representing an interaction sequence of the user and the commodity, and sequencing according to the interaction time; wherein,
Figure BDA0003693820420000031
an embedded vector representing an article ID of 1, m being the article ID;
Figure BDA0003693820420000032
representing the time t at which the user n interacts with the item 1 1
Figure BDA0003693820420000033
Wherein G represents a constructed hypergraph sub-graph of the commodity and the user at each time t, t represents a moment of time,
Figure BDA0003693820420000034
is shown at the moment t c And then, carrying out a hypergraph sub-graph on the commodity and the user.
Further, in the step of clustering the hyperedge users in the hypergraph subgraph at each moment, the clustering formula is as follows:

E = Σ_{i=1}^{k} Σ_{x∈C_i} ||x − u_i||²_2

where the initial cluster partition of the hyperedge users is (C_1, C_2, …, C_k) and the objective is to minimize the squared error E; u_i is the mean vector of cluster C_i; x ranges over all hyperedge users belonging to cluster center C_i (X_i denoting hyperedge user i); and k is the number of cluster centers.
Further, in the step of performing pre-training learning by introducing contrastive learning and graph convolution, the loss function of the pre-training stage includes two parts:

The first part is BPRLoss. In the pre-training stage, for a user u_i, a positive sample commodity i and a negative sample commodity j, supervised training of the output prediction values is required, and the training loss is defined as follows:

BPRLoss = Σ_{(u,i,j)∈O} −ln σ(ŷ_{u,i} − ŷ_{u,j})

where O denotes O = {(u, i, j) | (u, i) ∈ R+, (u, j) ∈ R−}, with R+ the observed samples and R− the unobserved samples; (u, i) denotes a user and its positive sample pair, (u, j) a user and its negative sample pair, and (u, i, j) a user together with its positive and negative sample pairs.

The second part is the contrastive learning loss. First, an LSTM sequence model is applied, and the output of its last time step is used as the latent user embedding vector; the randomly initialized user embedding vector and the latent embedding vector of the same user form a positive sample pair, and other users serve as negative samples. The formula is as follows:

CLoss = −Σ_{u∈U} log [ exp(s(z′_u, z″_u)/τ) / Σ_{v∈U} exp(s(z′_u, z″_v)/τ) ]

where {(z′_u, z″_u) | u ∈ U} is a positive sample pair and {(z′_u, z″_v) | u, v ∈ U, v ≠ u} is a negative sample pair; s(·,·) is the cosine similarity used to predict the correlation of two vectors, and τ is a temperature hyperparameter.
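A hedged PyTorch sketch of this two-part pre-training loss is given below; pos_score/neg_score are the dot products of a user embedding with its positive/negative commodity embeddings, z1/z2 are the randomly initialized and LSTM-derived user vectors, and the temperature and weight defaults are assumptions.

```python
import torch
import torch.nn.functional as F

def bpr_loss(pos_score, neg_score, params, lam=1e-4):
    # -ln sigma(y_ui - y_uj) over (u, i, j) triples, plus L2 regularization.
    loss = -F.logsigmoid(pos_score - neg_score).mean()
    reg = lam * sum(p.pow(2).sum() for p in params)
    return loss + reg

def contrastive_loss(z1, z2, tau=0.2):
    """InfoNCE over a batch: (z1[u], z2[u]) is the positive pair and every
    (z1[u], z2[v]) with v != u serves as a negative pair."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                    # cosine similarity / temperature
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)        # -log softmax on the diagonal

# pretrain_loss = bpr_loss(pos, neg, model.parameters()) + contrastive_loss(z1, z2)
```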
Further, inputting the divided hypergraph subgraphs of each time period and the pre-trained initial commodity/user embedding vectors into the main model, and learning the dynamic commodity/user embedding vectors by hypergraph convolution, includes:

The hypergraph convolution over the static commodity initial embedding vectors:

I^{(l+1)} = τ(H^{t_n} · W · (H^{t_n})^T · I^{(l)} · P_0)

where I^{(l+1)} is the commodity embedding matrix after l+1 layers of hypergraph convolution; I^{(0)} = E^s_i denotes the static commodity initial embedding vectors; H^{t_n} denotes the commodity-user hypergraph subgraph at time t_n; W is the diagonal weight matrix of the hyperedges; (H^{t_n})^T is the transpose of the hypergraph subgraph at time t_n; P_0 denotes a learnable parameter matrix; and τ is a nonlinear activation function.

The hypergraph-convolution formula for learning the commodity embedding vectors is further expanded, with degree normalization, into the dynamic commodity embedding update:

E^d_i = τ(D_v^{-1/2} · H^{t_n} · W · D_e^{-1} · (H^{t_n})^T · D_v^{-1/2} · E^s_i · P_0)

where D_v is the diagonal degree matrix of the nodes in the hypergraph, D_e is the diagonal degree matrix of the hyperedges, and E^d_i denotes the dynamic commodity initial embedding vectors.
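The expanded formula can be implemented directly. The sketch below assumes a dense n × m incidence matrix H for one time slice and a vector w of hyperedge weights standing in for the diagonal W; it illustrates the normalization above rather than reproducing the patent's exact code.

```python
import torch
import torch.nn as nn

class HypergraphConv(nn.Module):
    """One layer of X' = tau(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X P)."""

    def __init__(self, dim: int = 100):
        super().__init__()
        self.P = nn.Linear(dim, dim, bias=False)    # learnable matrix P

    def forward(self, X, H, w):
        # X: (n, d) item embeddings; H: (n, m) incidence; w: (m,) edge weights.
        Dv = (H @ w).clamp(min=1e-12).pow(-0.5)     # node degrees -> Dv^-1/2
        De = H.sum(dim=0).clamp(min=1e-12).pow(-1)  # edge degrees -> De^-1
        X = Dv.unsqueeze(1) * X                     # Dv^-1/2 X
        X = H.t() @ X                               # H^T ...
        X = (w * De).unsqueeze(1) * X               # W De^-1 ... (both diagonal)
        X = H @ X                                   # H ...
        X = Dv.unsqueeze(1) * X                     # Dv^-1/2 ...
        return torch.relu(self.P(X))                # tau = ReLU
```

Stacking several such layers, as the patent describes, yields node embeddings at different receptive-field depths.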
Further, fusing the static/dynamic commodity and static/dynamic user embedding vectors through the fusion layer to obtain the user-commodity interaction embedding vector includes the fusion layer:

e^{t_n}_{u,i} = e^s_u + e^d_u + e^s_i + e^d_i

where e^s_u denotes the static user initial embedding vector and e^d_u the dynamic user initial embedding vector; the four vectors e^s_u, e^d_u, e^s_i and e^d_i are added to obtain the interaction embedding vector of each user-commodity pair, where i denotes the ID of the paired commodity, u the corresponding user ID, and t_n the moment of the interaction.
Further, the user-commodity interaction embedding vectors are input into the transformer module; the attention-mechanism part of the computation in the transformer module is:

α_{i,j} = softmax_j( (W_Q e^{t_n}_{u,i}) · (W_K e^{t_n}_{u,j})^T / √d )

where e^{t_n}_{u,i} denotes the interaction embedding vector of user u with commodity i, used to compute the attention coefficients between the user and the different commodities; W_Q, W_K, W_V denote learnable parameter matrices; e^{t_n}_{u,L_u} denotes the embedding vector of user u and item L_u; and d denotes the vector dimension.

The short-term dynamic user embedding vector u^{short} is obtained by weighted summation at the last time step t. Adding the short-, medium- and long-term vectors,

e^d_u = u^{short} + u^{mid} + u^{long},

gives the final dynamic user embedding vector, where u^{mid} is the medium-term dynamic user embedding vector and u^{long} the long-term dynamic user embedding vector.
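As a sketch of this step, the snippet below uses PyTorch's built-in transformer encoder as a stand-in for the patent's transformer module and takes the output of the last time step as the dynamic user vector; the layer and head counts are guesses.

```python
import torch
import torch.nn as nn

d = 100
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2,
)

def user_vector(seq: torch.Tensor) -> torch.Tensor:
    """seq: (batch, steps, d) interaction embeddings at one time granularity.
    Returns the dynamic user vector at the last time step, shape (batch, d)."""
    return encoder(seq)[:, -1, :]

# Fuse the three granularities by addition, as in the formula above:
# e_u_dyn = user_vector(seq_short) + user_vector(seq_mid) + user_vector(seq_long)
```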
Further, performing preference prediction with the final dynamic user embedding vector and the fused dynamic/static commodity embedding vectors to obtain a recommendation sequence includes:

using BPRLoss as the loss function to supervise the prediction and back-propagating the gradient into the network for parameter updating;

the prediction formula is as follows:

ŷ_{u,i} = e^d_u · (e^d_i + e^s_i)

where e^d_i is the dynamic commodity embedding vector and e^s_i the static commodity embedding vector;

the expression of BPRLoss is:

Loss = Σ_{(u,i,j)∈O} −ln sigmoid(ŷ_{u,i} − ŷ_{u,j}) + λ||Θ||²_2

where O = {(u, i, j) | (u, i) ∈ R+, (u, j) ∈ R−}, R+ being the observed samples and R− the unobserved samples; sigmoid is a nonlinear activation function; ŷ_{u,i} is the predicted score of the user for the positive sample, ŷ_{u,j} the predicted score of the user for the negative sample j; λ is the weight of the L2 regularization loss; and Θ denotes the learnable parameters.
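A small sketch of the prediction step under the same notation (variable names are assumptions):

```python
import torch

def predict(u_dyn, i_dyn, i_sta):
    # u_dyn: (batch, d) fused dynamic user vectors;
    # i_dyn, i_sta: (batch, d) dynamic/static embeddings of candidate items.
    return (u_dyn * (i_dyn + i_sta)).sum(dim=-1)   # y_hat_{u,i}

# pos = predict(u_dyn, pos_i_dyn, pos_i_sta)
# neg = predict(u_dyn, neg_i_dyn, neg_i_sta)
# loss = bpr_loss(pos, neg, model.parameters())   # BPR sketch given earlier
```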
Further, the method also comprises the steps of constructing an end-to-end model, and utilizing training data to learn and update parameters:
acquiring interaction data and specific time data of a user and commodities, and constructing a hypergraph sub-graph of the user and the commodities by using different time periods;
inputting the user-commodity interaction data into the pre-training model: the sequence of items purchased by a user, after random augmentation and perturbation, is input into the LSTM model, which outputs the user's latent embedding vector; the contrastive learning module then learns the static commodity/user embedding vectors;
learning the dynamic commodity/user embedding vectors at each moment through hypergraph convolution, and fusing the obtained embedding vectors through the fusion layer to obtain the user-commodity interaction embedding vectors;
and using a transformer module with sliding windows to obtain dynamic user embedding vectors at the short, medium and long time granularities, combining the three dynamic user embedding vectors, taking the dot product with the fused dynamic/static commodity embedding vector, using BPRLoss as the loss function for supervision, and learning the model parameters on the training set by gradient-descent back-propagation until the model converges.
The other technical scheme adopted by the invention is as follows:
a sequence recommendation device based on a hypergraph neural network comprises:
at least one processor;
at least one memory for storing at least one program;
when executed by the at least one processor, cause the at least one processor to implement the method described above.
The invention has the following beneficial effects: it comprehensively considers information at multiple time granularities of user-commodity interaction; it first introduces a contrastive learning module in the pre-training stage to better learn the static user and commodity embedding vectors, so that the user vector contains user-to-user correlation information, which both ensures the effectiveness of the features and reduces the time complexity of training the main model.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description is made on the drawings of the embodiments of the present invention or the related technical solutions in the prior art, and it should be understood that the drawings in the following description are only for convenience and clarity of describing some embodiments in the technical solutions of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a method for recommending a merchandise forecast based on a sequence of hypergraph convolution according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a pre-training phase in an embodiment of the present invention;
FIG. 3 is a block diagram of a main model module according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating steps of a sequence recommendation method based on a hypergraph neural network according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including it. If "first" and "second" are described, it is only for the purpose of distinguishing technical features and is not to be understood as indicating or implying relative importance, implicitly indicating the number of technical features indicated, or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in fig. 4, the present embodiment provides a sequence recommendation method based on a hypergraph neural network. Hypergraph subgraphs are constructed from the commodity-user interaction data and the specific interaction-time data, and K-Means clustering is performed on the hyperedges of the subgraph under each timestamp, thereby compressing the sparsity of the subgraphs. Contrastive learning is then integrated into the pre-training stage, strengthening the mutual information among users and yielding the static user/commodity initial embedding vectors. In the main model, hypergraph convolution further learns the commodity and user embedding vectors, which are fused with the pre-trained static user/commodity initial embedding vectors to obtain the user-commodity interaction embedding vectors. Further, the user-commodity interaction sequence is input into the transformer module, dynamic user embedding vectors at different time granularities are learned with different time sliding windows, the information at the multiple time granularities is fused with the user's features, and the short-, medium- and long-term vectors are finally fused into the dynamic user embedding vector, enriching the user's semantics and improving the effect and accuracy of commodity prediction for the user. The method specifically comprises the following steps:
s1, acquiring a data set with user and commodity interaction information and time information, constructing a hypergraph according to the acquired data set, and dividing the hypergraph by using different time periods.
Directly acquiring a data set: a data set with user-commodity interaction information and specific interaction-time information is obtained from open-source data sets, invalid data is filtered, and negative sampling is performed. Public data sets such as goodreads_large and newAmazon contain these relations and, after data preprocessing, can be used to construct the commodity-user hypergraph subgraphs at the different times.
As an optional implementation, the acquired data set is first preprocessed. The time span of the data set is divided, manually partitioning it into the different years and into 12 months, to facilitate the subsequent construction of the hypergraph subgraphs at each time granularity. All data are traversed to extract each user, the corresponding items, and the interaction times; the commodities each user interacted with are put into a user dictionary and sorted in order, while the interaction times are put into a user-time dictionary to store when each user-commodity interaction occurred.
After the user dictionary is obtained, all users are traversed one by one, and users with fewer than 3 commodity interactions are removed; this prevents pollution of the valid data to a certain extent and reduces data sparsity. For the valid data, for each user the commodities before the second-to-last interaction are used as the training set, the second-to-last commodity as the validation set, and the last commodity as the test set, with the corresponding interaction-time information stored in the same way. We build representations of the users, the commodities and time:
the user: u ═ U 1 ,u 2 ,…u L B, }; wherein u is j Embedding a jth user into a vector, wherein j is more than or equal to 1 and less than or equal to L, and L is the total number of users;
commercial products: i ═ I 1 ,i 2 ,…i N B, }; wherein i j J is more than or equal to 1 and less than or equal to N, and N is the total number of the articles;
time: t ═ T 1 ,t 2 ,…t C B, }; wherein, t j J is more than or equal to 1 and less than or equal to C at the jth moment, and C is the total time length;
Figure BDA0003693820420000071
Figure BDA0003693820420000072
representing an interaction sequence of the user and the commodity, and sequencing according to the interaction time; wherein
Figure BDA0003693820420000073
Is an embedded vector with an article ID of 1, m is the article ID;
Figure BDA0003693820420000074
time t1 for user n to interact with item 1.
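An illustrative preprocessing sketch consistent with the steps above, assuming the raw data is a list of (user_id, item_id, timestamp) tuples; all names are assumptions.

```python
from collections import defaultdict

def build_splits(interactions, min_interactions=3):
    """Leave-one-out split: last item -> test, second-to-last -> validation."""
    user_dict, user_time = defaultdict(list), defaultdict(list)
    for u, i, t in sorted(interactions, key=lambda x: x[2]):
        user_dict[u].append(i)          # items per user, in time order
        user_time[u].append(t)          # matching interaction times
    train, valid, test = {}, {}, {}
    for u, items in user_dict.items():
        if len(items) < min_interactions:
            continue                    # drop users with < 3 interactions
        train[u] = items[:-2]
        valid[u] = items[-2]
        test[u] = items[-1]
    return train, valid, test, user_time
```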
And S2, dividing to obtain the hypergraph subgraph of each time segment, and clustering the hyperedges of the users in the hypergraph subgraph at each moment to cluster the similar hyperedges.
Alternatively, the data set is collected or generated automatically: a network service provider can collect the data automatically, and an ordinary user can construct the hypergraph subgraphs of user-commodity interaction relations by collecting user-commodity interaction data and interaction-time data and dividing them by time. The hyperedge users in the hypergraph subgraphs at each time are then clustered with the K-Means algorithm, which better reduces the sparsity of the commodity-user hypergraphs, compresses the size of the hypergraphs, accelerates the hypergraph-convolution operations, and allows the dynamic user/commodity embedding vectors to be learned better.
Constructing hypergraph subgraphs at different times according to the constructed training set and the divided time; constructing a commodity and user hypergraph sub-graph at each time; the method comprises the following specific steps:
G = {G^{t_1}, G^{t_2}, …, G^{t_C}}, where G denotes the commodity-user hypergraph subgraphs constructed at each time t, and t denotes a moment of time.

G^{t_n} = {V^{t_n}, E^{t_n}, t_n, H^{t_n}}: the hypergraph G at each moment consists of 4 parts, where V^{t_n} denotes the commodity nodes of this hypergraph; E^{t_n} denotes the hyperedges, represented by users, of this hypergraph; t_n denotes the moment of time; and H^{t_n} ∈ R^{n×m} is the adjacency (incidence) matrix of commodities and users, with n the number of commodities and m the number of users, i.e. of the corresponding hyperedges.
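A hedged sketch of building one time slice's incidence matrix H^{t_n} from the interactions falling in that period, assuming densely re-indexed user/item IDs:

```python
import torch

def build_incidence(slice_interactions, n_items, n_users):
    """slice_interactions: iterable of (user_idx, item_idx) pairs inside the
    time period t_n. Returns a sparse n x m item-by-user incidence matrix."""
    rows = torch.tensor([i for _, i in slice_interactions])
    cols = torch.tensor([u for u, _ in slice_interactions])
    vals = torch.ones(len(rows))
    # Each user hyperedge connects every item that user interacted with;
    # coalesce() merges duplicate (item, user) entries.
    return torch.sparse_coo_tensor(
        torch.stack([rows, cols]), vals, size=(n_items, n_users)
    ).coalesce()
```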
Next, the hyperedges, i.e. the users, in the commodity-user hypergraph subgraph at each moment are clustered, using the K-Means algorithm to better reduce the sparsity of the hypergraph and aggregate similar commodities.

The K-Means formula:

E = Σ_{i=1}^{k} Σ_{x∈C_i} ||x − u_i||²_2

where we first initialize a cluster partition of the hyperedge users (C_1, C_2, …, C_k); the objective is then to minimize the squared error E, where u_i is the mean vector, called the centroid, of cluster C_i. The hyperedge users in the hypergraph are clustered after several rounds of the heuristic iterative method.
And S3, pre-training learning is carried out by introducing contrast learning and graph convolution, and initial embedded vectors of the commodity and the user are obtained.
After each hypergraph subgraph is constructed, we enter the pre-training stage. Its main functions are to avoid the randomness that arises in main-model training when the commodity and user initial embedding vectors are randomly initialized, to reduce cold start, and to accelerate the convergence of model training. Meanwhile, the pre-training of the users is reflected in the contrastive learning module: on top of the traditional BPRLoss training, contrastive learning over the correlation information among users lets the representation vectors of different users be learned better. The pre-training stage mainly comprises two steps:
in the pre-training stage, after users are randomly sampled, interaction between each user and commodities is traversed, a positive sample and a negative sample are randomly extracted, and a list of three elements is formed and input into a pre-training model;
Second, the pre-training stage model: the loss function mainly comprises two parts, PretrainLoss = BPRLoss + CLoss. The first part takes n randomly sampled users, obtains each user's scores for commodities by the dot product of the user initial embedding vector with the positive/negative commodity initial embedding vectors, and supervises the model with BPRLoss. The training BPRLoss is defined as follows:

BPRLoss = Σ_{(u,i,j)∈O} −ln σ(ŷ_{u,i} − ŷ_{u,j}) + λ||Θ||²_2

where O = {(u, i, j) | (u, i) ∈ R+, (u, j) ∈ R−}, R+ being the observed samples and R− the unobserved samples; σ is a nonlinear activation function; Θ denotes the learnable parameters, and L2 regularization is used to reduce the model's overfitting problem.
The second part is the contrastive learning loss CLoss, which requires processing the user-commodity interaction data: 50 users are randomly drawn, the same user appearing twice is placed in that user's positive samples and 2 different users form a negative pair, with the commodities each user interacted with represented as a list.

The 2 user-commodity interaction sequences are input into an LSTM model, the last embedding vector of the LSTM is taken as the embedding vector of each of the two users, the two embedding vectors are dot-multiplied, and a nonlinear activation function judges whether the two sequences belong to the same user; a cross-entropy loss function with the positive/negative labels supervises this, and the result is back-propagated through the network by gradient to update the parameters of the contrastive learning module.

The formula:

CLoss = −Σ_{u∈U} log [ exp(s(z′_u, z″_u)/τ) / Σ_{v∈U} exp(s(z′_u, z″_v)/τ) ]

where {(z′_u, z″_u) | u ∈ U} is a positive sample pair and {(z′_u, z″_v) | u, v ∈ U, v ≠ u} is a negative sample pair. The final pre-training loss is formed by these two parts together, and the parameters in the network are updated by gradient back-propagation.
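The LSTM branch of the contrastive module can be sketched as follows, taking the output of the last time step as the latent user vector; the dimensions are illustrative.

```python
import torch
import torch.nn as nn

class SeqUserEncoder(nn.Module):
    def __init__(self, n_items: int, dim: int = 100):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, item_seq: torch.Tensor) -> torch.Tensor:
        # item_seq: (batch, steps) item IDs of a (possibly perturbed) sequence.
        out, _ = self.lstm(self.item_emb(item_seq))
        return out[:, -1, :]            # last time step = latent user vector
```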
S4, inputting the divided hypergraph subgraphs of each time period and the initial embedded vectors of the commodities/users to be learned through pre-training into a main model, and performing hypergraph convolution learning on the dynamic commodity/user embedded vectors; and fusing the static/dynamic commodity and the static/dynamic user embedded vector through the fusion layer to obtain the interactive embedded vector of the user and the commodity.
The main model stage comprises sample sampling; hypergraph convolution learning the dynamic user/commodity embedding vectors; a fusion layer fusing the static/dynamic user and static/dynamic commodity embedding vectors to construct the user-commodity interaction embedding vectors; and inputting the user-commodity sequence into the transformer module, obtaining dynamic user embedding vectors at the different time granularities through different time sliding windows, fusing the short-, medium- and long-term granularities into the final dynamic user embedding vector, and predicting with the final dynamic user embedding vector and the commodity embedding vector. The specific steps are as follows:
in the first step, a sample is taken. Randomly selecting n users in the interaction sequence sequences of all the users, taking out the interaction sequences of the users, and selecting the interaction commodities and corresponding time for the interaction sequence value in each sequence; then, ten negative samples are traversed, and one negative sample is randomly drawn.
After the sampling stage is finished, inputting sampled data into the main model;
In the first step the model extracts the sample data, and the static commodity/user embedding vectors learned in the pre-training stage serve as the initial embedding vectors in the main model. This avoids the randomness and contingency of random initialization, reduces cold start, and accelerates the convergence of model training; meanwhile, the mutual information among users has already been extracted by the contrastive learning module. They are represented as follows:
Static commodity initialization embedding vector: e^s_{i_n}, where n denotes a commodity ID;
Static user initialization embedding vector: e^s_{u_n}, where n denotes a user ID.
and traversing all time periods, and extracting initial embedded vectors of the sampled users and the commodities interacted in the time periods.
The static user/commodity embedding vectors and the hypergraph subgraphs are input into the hypergraph convolution module.

First, the commodity embedding vectors undergo a linear transformation through a fully connected layer of dimension 100 × 100, keeping the commodity embedding matrix at dimension (n, 100); the commodity embedding vectors are then input into the hypergraph convolution and multiplied with the hypergraph matrix, and after one layer of hypergraph convolution the resulting commodity embedding vectors pass through a ReLU nonlinear activation function so that the data become nonlinear. The specific formula is:

I^{(l+1)} = τ(H^{t_n} · W · (H^{t_n})^T · I^{(l)} · P^{(l)})

where I^{(0)} holds the initial commodity embedding vectors; I^{(l+1)} is the commodity embedding matrix after l+1 layers of hypergraph convolution; P^{(l)} denotes a learnable parameter matrix; and τ is the nonlinear activation function.

The commodity embedding vectors that have passed through one layer of hypergraph convolution then enter a second hypergraph-convolution layer, the number of layers being adjustable as a hyperparameter; repeating this process several times yields node embedding matrices at different levels.

For the user side, hypergraph convolution is applied to the hypergraph and the pre-trained static commodity embedding vectors to learn the dynamic user embedding vectors; finally, a short-term dynamic user embedding vector is obtained after the hypergraph convolution through one layer of ReLU nonlinear activation.

The hypergraph-convolution formula for learning the user embedding vectors is further expanded into:

U^{t_n} = τ(D_v^{-1/2} · H^{t_n} · W · D_e^{-1} · (H^{t_n})^T · D_v^{-1/2} · I · P)
then, the dynamic user/commodity embedded vectors learned at each time are put into the user embedded vector dictionary and the commodity embedded vector dictionary.
Embedding the index values of the interaction sequences of the user and the commodities in the commodity embedded vector dictionary in each batch in each experiment to correspondingly find the embedded vectors of the commodities; the output dimension is 3 latitude tensor n × 50 × 100; n is the number of samples, each user is a 50 x 100 matrix, and each row represents an interactive merchandise embedding vector.
For the static commodity embedded vector, the static commodity embedded vector nxs 0 × 100 is obtained by using the sequence nx50 of the interaction between the user and the commodity in the original data in the static commodity dictionary obtained by learning in the pre-training process.
For the user side, the static/dynamic user embedded vector is found by sampling the index value of the user.
Next, we combine the static/dynamic user embedding vectors and the static/dynamic commodity embedding vectors, four vectors in total, and learn the user-commodity interaction embedding vector. The specific idea of this method is:
The fusion-layer formula:

e^{t_n}_{u,i} = e^s_u + e^d_u + e^s_i + e^d_i

The four vectors are added by weight to obtain the interaction embedding vector of the user and the commodity.
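A minimal sketch of the fusion layer, assuming the unweighted case of the formula above:

```python
def fuse(e_u_static, e_u_dynamic, e_i_static, e_i_dynamic):
    # All inputs share the embedding dimension d; output shape (..., d).
    return e_u_static + e_u_dynamic + e_i_static + e_i_dynamic
```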
And S5, inputting the user-commodity interaction embedding vectors into the transformer module, learning three dynamic user embedding vectors, short-, medium- and long-term, according to different time sliding windows, and fusing the three into the final dynamic user embedding vector.
The user-commodity interaction sequence is input into the transformer module, dynamic user embedding vectors for the short, medium and long time periods are obtained through the sliding-window operation, the three are fused into the final dynamic user embedding vector, and commodity-preference prediction is performed with the dynamically/statically fused commodity embedding vector; BPRLoss is used as the loss function for supervision, and the gradient is then back-propagated into the network for parameter updates.
The transformer module is introduced, and a mask operation is performed on part of the interactions in the user-commodity interaction sequence, keeping the sequence length of every user consistent. The specific operation is as follows:
For the user-commodity interaction sequence, in each row the non-zero values are represented by True; the mask is converted through the dimensions into n × 1 × 50, and multiplying the user-commodity interaction sequence by the mask vector gives the masked user-commodity interaction sequence.
Next we enter the transformer module, which we apply separately at the three time granularities: the transformer model performs multi-scale temporal feature extraction on the user-commodity interaction sequence and finally obtains a feature representation of the whole sequence, used as the user's embedding vector combining the long-sequence information at that time granularity. The specific steps are as follows:
First, feature extraction on the short-term sequence is performed; considering the user's interaction behavior with a single commodity, the sliding window is set to 1.
The fusion layer yields the user-commodity interaction sequence n × 50 × 100, which is processed before being input into the transformer module: accumulation is performed over single interactions, and since this first step is the short-term sequence each value is accumulated with itself, keeping the final dimension n × 50 × 100; the user-commodity interaction embedding sequence is then input into the transformer module.
After the user-commodity interaction sequence is input into the first transformer layer, the output dimension is n × 50 × 100, and after passing through the transformer module several times the output dimension remains n × 50 × 100.
Since the final dynamic user embedding vector must be extracted from the user's interaction behavior, the dimension of the result is compressed to obtain a final dimension of n × 100. The calculation formula of the attention mechanism is:

Q = E W_Q, K = E W_K, V = E W_V

where W_Q, W_K, W_V ∈ R^{d_model × d_K} are three globally shared trainable parameter matrices, d_model is the dimension of the user-commodity interaction embedding vectors, and d_K is the dimension of the parameter matrices.

The specific calculation formula is:

α_{i,j} = softmax_j( (W_Q e^{t_n}_{u,i}) · (W_K e^{t_n}_{u,j})^T / √d_K )

where {e^{t_n}_{u,1}, …, e^{t_n}_{u,L_u}} represents the sequence of the user's interactions with different commodities in the different time periods, and e^{t_n}_{u,i} represents the interaction embedding vector of the user with a commodity, used to compute the attention coefficients between the user and the commodities. Finally, the dynamic user embedding vector u^{short} is obtained by weighted summation at the last time step t.
For the medium-term sequence, the rolling time window is 2 days, consistent with the short-term approach where the rolling window is one day; the output dimension is n × 25 × 100, and dimension compression then gives the final result n × 100.
Similarly, for the long-term sequence the sliding window is 5 days; the final result has dimension n × 10 × 100, and dimension compression gives the final result n × 100.
Furthermore, the feature extractions of the short-, medium- and long-term sequences are added, so that the final dynamic user embedding vector contains not only the user's short-term interest information but also the medium- and long-term interest features, enriching the user's semantic information (n × 100).
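The sliding-window preparation for the three granularities can be sketched as below, assuming non-overlapping windows that sum the interaction embeddings inside each window (the patent's 1-, 2- and 5-day windows, treated here as window widths in steps):

```python
import torch

def window_aggregate(seq: torch.Tensor, window: int) -> torch.Tensor:
    """seq: (batch, steps, d). Returns (batch, steps // window, d)."""
    b, s, d = seq.shape
    usable = (s // window) * window           # drop the ragged tail, if any
    return seq[:, :usable].reshape(b, s // window, window, d).sum(dim=2)

# seq50: fused interaction sequence of shape (n, 50, 100)
# short, mid, long = seq50, window_aggregate(seq50, 2), window_aggregate(seq50, 5)
# giving (n, 50, 100), (n, 25, 100) and (n, 10, 100), as described above
```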
Then acquiring embedded vectors m multiplied by 100 of the m static/dynamic commodities according to id values of m randomly sampled positive samples; and obtaining the static/dynamic commodity embedded vector m multiplied by 100 by m negative samples in the same way.
And S6, performing preference prediction on the final dynamic user embedded vector and the dynamic and static commodity embedded vectors fused to obtain a recommendation sequence.
Adding the dynamic and static commodity embedded vectors to construct a final commodity embedded vector of the commodity; and finally, performing point multiplication on the user embedded vector and the commodity embedded vector to predict the preference of the commodity.
The calculation formula is:

ŷ_{u,i} = e^d_u · (e^d_i + e^s_i)

Finally, the loss function BPRLoss is constructed:

Loss = Σ_{(u,i,j)∈O} −ln sigmoid(ŷ_{u,i} − ŷ_{u,j}) + λ||Θ||²_2

where Loss is the loss value, O = {(u, i, j) | (u, i) ∈ R+, (u, j) ∈ R−}, R+ being the observed samples and R− the unobserved samples; sigmoid is a nonlinear activation function; Θ denotes the learnable parameters, and L2 regularization is used to reduce the model's overfitting problem.
And S7, constructing an end-to-end model, and learning and updating parameters by using the training data.
According to the user-commodity interaction data and the specific time data, the user-commodity hypergraph subgraphs are constructed for the different time periods. First, the user-commodity interaction data are input into the pre-training model: the sequence of items purchased by a user is randomly augmented and perturbed, input into the LSTM model, and the user's latent embedding vector is output. The contrastive learning module then pulls a user's own two embedding vectors together as a positive pair, with the embedding vectors of other users as negatives, strengthening the extraction of the user's own features; the other pre-training loss consists of BPRLoss, and by optimizing these two losses the gradient is back-propagated into the network to learn the static commodity/user embedding vectors. Next, the dynamic commodity/user embedding vectors at each moment are learned by hypergraph convolution, and the four vectors are fused through the fusion layer to obtain the user-commodity interaction embedding vector. A transformer module with sliding windows then produces dynamic user embedding vectors at the short, medium and long time granularities; the three are combined and dot-multiplied with the fused dynamic/static commodity embedding vector, BPRLoss is used as the loss function for supervision, and the model parameters are learned on the training set by gradient-descent back-propagation until the model converges.
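Tying the pieces together, a schematic training loop under the assumptions of the earlier sketches might look as follows; `model` and `batches` are placeholders, not from the patent.

```python
import torch

def train(model, batches, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for users, pos_items, neg_items in batches:
            pos = model(users, pos_items)    # predicted positive scores
            neg = model(users, neg_items)    # predicted negative scores
            loss = bpr_loss(pos, neg, model.parameters())  # BPR sketch above
            opt.zero_grad()
            loss.backward()                  # gradient back-propagation
            opt.step()
```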
The method is described below with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, the present embodiment provides a method for predicting a sequence recommended commodity based on hypergraph convolution, which specifically includes:
(1) and preparing data:
Directly acquiring a data set: a data set with user-commodity interaction information and specific interaction-time information is obtained from open-source data sets, invalid data is filtered, and negative sampling is performed. Existing public data sets such as goodreads_large and newAmazon are collected directly; they contain these relations and, after data preprocessing, can be used to construct the commodity-user hypergraphs at the different times.
The method comprises the following specific steps:
First, the acquired data set is preprocessed; after traversing each user-item interaction record, the time span of the data set is divided, manually partitioning it into the different years and into 12 months.
All data are traversed to extract each user, the corresponding items, and the interaction-time information; the commodities each user interacted with are put into a user dictionary and sorted in order, while the interaction times are put into a user_time dictionary to store when each user-commodity interaction occurred.
The user: u ═ U 1 ,u 2 ,…u l B, }; wherein u is n Embedding vectors for the nth user, wherein t is more than or equal to 1 and less than or equal to L, and L is the total number of the users.
Commercial products: i ═ I 1 ,i 2 ,…i m B, }; wherein i n N is more than or equal to 1 and less than or equal to m, and m is the total number of the articles.
Time: t ═ T 1 ,t 2 ,…t c B, }; wherein, t c N is more than or equal to 1 and less than or equal to c at the c-th time, and c is the total time length.
Figure BDA0003693820420000131
Figure BDA0003693820420000132
Representing an interaction sequence of the user and the commodity, and sequencing according to the interaction time; wherein
Figure BDA0003693820420000133
Is an embedded vector with an article ID of 1, and m is the article ID;
Figure BDA0003693820420000134
time t1 for user n to interact with item 1.
After the user dictionary is obtained, all users are traversed one by one, and users with fewer than three commodity interactions are removed, which prevents pollution of the valid data to a certain extent and reduces data sparsity.
For the valid data, for each user the commodities before the second-to-last interaction are used as the training set, the second-to-last commodity as the validation set, and the last commodity as the test set, with the corresponding interaction-time information stored in the same way.
Further, the hypergraph subgraphs at the different times are constructed from the built training set and the divided time, one commodity-user hypergraph subgraph for each time, represented as follows:

G = {G^{t_1}, G^{t_2}, …, G^{t_c}}, where G denotes the commodity-user hypergraphs constructed at each time t, and t denotes the moment of time.

G^{t_n} = {V^{t_n}, E^{t_n}, t_n, H^{t_n}}, where the hypergraph G at each moment consists of 4 parts: V^{t_n} denotes the commodity nodes of this hypergraph; E^{t_n} denotes the hyperedges, represented by users, of this hypergraph; t_n denotes the moment of time; and H^{t_n} ∈ R^{n×m} is the adjacency matrix of commodities and users, with n the number of commodities and m the number of users, i.e. of the corresponding hyperedges.
Further, the hyperedge users in the commodity-user hypergraph subgraph at each moment are clustered. The K-Means algorithm from machine learning is applied to better reduce the sparsity of the hypergraph and aggregate similar commodities; finally the number of users in the hypergraph subgraph at each time becomes k, where k is a hyperparameter, set to 200 in this example.

The K-Means formula:

E = Σ_{i=1}^{k} Σ_{x∈C_i} ||x − u_i||²_2

where we first initialize a cluster partition of the hyperedge users (C_1, C_2, …, C_k), with k a hyperparameter; the objective is to minimize the squared error E, where u_i is the mean vector, called the centroid, of cluster C_i. The hyperedge users in the hypergraph are clustered after several rounds of the heuristic iterative method.

k is calculated as follows: at each time, the number of users is divided by five; if the result is greater than 200, the users are clustered into 200 clusters; if the original number is less than 200, k is set directly equal to the number of users.
after each hypergraph graph is constructed, we first go through a pre-training phase.
(2) And (3) learning in a pre-training stage to obtain an initial embedded vector of the commodity and the user:
the pre-training mainly comprises the following steps:
firstly, the problems that the training is difficult to converge and the time consumption is long due to random initialization of the commodity embedded vector and the user embedded vector during the training of the main model are solved.
Secondly, after the contrastive learning module is introduced, the correlations between users can be taken into account: the embedding vectors of similar customers are pulled closer and those of dissimilar customers pushed apart, so that the learned user embedding vectors better combine the high-order semantic information between users and incorporate the correlation information.
Referring to fig. 2, the pre-training phase mainly comprises three steps:
in the first step, in the pre-training stage, the number of samples in a training batch is n, the interaction between each user and the commodity is traversed, and a positive sample and a negative sample are randomly extracted and input into a pre-training model.
The pre-training model comprises a contrastive learning step, so the user-commodity interaction data must be processed first: n users are traversed, with the same user twice placed in that user's positive samples and 2 different users placed in the negative samples.
Secondly, the pre-training stage model: the loss function mainly comprises two parts, pretraining_loss = BPR_loss + CLoss.
BPR_loss mainly takes the n randomly sampled users and commodities, obtains each user's scores for the commodities through the dot product of the user initial embedding vector with the positive/negative commodity initial embedding vectors, and supervises the model with BPRLoss. The specific formula is:

BPRLoss = Σ_{(u,i,j)∈O} −ln σ(ŷ_{u,i} − ŷ_{u,j}) + λ||Θ||²_2

where O = {(u, i, j) | (u, i) ∈ R+, (u, j) ∈ R−}, R+ being the observed samples and R− the unobserved samples; σ is a nonlinear activation function; Θ denotes the learnable parameters, and L2 regularization is used to reduce the model's overfitting problem.
In the third step, the contrastive learning loss is calculated. All sampled examples are traversed, extracting 2 users at a time; the commodity sequences the two users interacted with are taken out and randomly perturbed by substitution, which augments the sequence data.
The interaction sequences of the two users with the commodities are input into the LSTM model respectively, and the last embedded vector output by the LSTM model is taken as each of the two users' embedded vectors.
Further, the dot product of the two obtained embedded vectors is computed, and a sigmoid nonlinear activation function judges whether the two sequences are the same sequence; finally, a cross-entropy loss function over the positive and negative labels supervises the model, and the gradients are updated.
The formula:

$$\mathcal{L}_{CL}=-\sum_{u\in U}\log\frac{\exp\left(s(z'_u,z''_u)/\tau\right)}{\sum_{v\in U}\exp\left(s(z'_u,z''_v)/\tau\right)}$$

wherein $\{(z'_u,z''_u)\mid u\in U\}$ belongs to the positive sample pairs and $\{(z'_u,z''_v)\mid u,v\in U, v\neq u\}$ to the negative sample pairs; s is the cosine similarity used to predict the correlation of two vectors, and $\tau$ is a temperature hyperparameter, similar in effect to softmax scaling.
The pre-training stage is thus supervised by these two loss functions together, pretrain_loss = BPR_loss + CL_loss; the model parameters are updated by gradient back-propagation until the network converges.
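The contrastive step can be sketched as follows, assuming two augmented views per user and in-batch negatives. The InfoNCE-style implementation below uses cross-entropy over cosine similarities, which matches the formula above; the class and function names are ours:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SeqEncoder(nn.Module):
        # LSTM encoder: the last hidden state serves as the user embedding
        def __init__(self, num_items, dim=100):
            super().__init__()
            self.emb = nn.Embedding(num_items, dim, padding_idx=0)
            self.lstm = nn.LSTM(dim, dim, batch_first=True)

        def forward(self, seq):            # seq: (B, L) item ids
            _, (h, _) = self.lstm(self.emb(seq))
            return h[-1]                   # (B, dim)

    def contrastive_loss(z1, z2, tau=0.2):
        # matching rows of z1/z2 are positive pairs, other rows are negatives;
        # s is cosine similarity, tau the temperature hyperparameter
        z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
        logits = z1 @ z2.t() / tau         # (B, B) similarity matrix
        labels = torch.arange(z1.size(0), device=z1.device)
        return F.cross_entropy(logits, labels)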
(3) Obtaining a dynamic user/commodity embedded vector through hypergraph convolution; combining through a fusion layer to obtain a user and commodity interaction embedding vector:
Referring to fig. 3, the hypergraph subgraphs of each divided time period and the initial commodity/user embedded vectors obtained by pre-training are input into the main model together; hypergraph convolution learns the dynamic commodity/user embedded vectors, and the fusion layer fuses the static/dynamic commodity and user embedded vectors to obtain the interaction embedded vectors of the users and commodities. The specific steps are as follows:
The static commodity/user initial embedded vectors obtained in the pre-training stage and the hypergraph subgraphs are input into the hypergraph convolution module. The static commodity initial embedded vector is denoted $E_I^{0}$, the user initial embedded vector is denoted $E_U^{0}$, and the hypergraph subgraph is denoted $G^{t}$, where t represents the interactions at different times.
Firstly, the static commodity initial embedded vector passes through a fully-connected layer of dimension 100 × 100, so that the dimension of the commodity embedded vector remains n × 100; the commodity embedded vector is then input into the hypergraph convolution for matrix multiplication. Here we use the torch.sparse.mm() interface. After one layer of hypergraph convolution, the learned commodity embedded vector passes through a ReLU nonlinear activation function so that the data become nonlinear. The specific formula is as follows:

$$E_I^{1}=\tau\left(G^{t}\,W\,(G^{t})^{\top}\,E_I^{0}\,P_0\right)$$

wherein $E_I^{0}$ holds the initial commodity embedded vector, $E_I^{1}$ is the commodity embedded vector after hypergraph convolution ($E_I^{l}$ after l layers), $G^{t}$ is the hypergraph subgraph at time t with the diagonal hyperedge weight matrix $W$, $P_0$ represents a learnable parameter matrix, and $\tau$ is a nonlinear activation function.
The hypergraph convolution formula for learning the commodity embedded vector is further expanded into:

$$E_I^{l+1}=\tau\left(D_v^{-1/2}\,G^{t}\,W\,D_e^{-1}\,(G^{t})^{\top}\,D_v^{-1/2}\,E_I^{l}\,P_l\right)$$

where $D_v$ and $D_e$ are the diagonal degree matrices of the nodes and of the hyperedges in the hypergraph, respectively.
Further, the commodity embedded vector that has undergone one layer of hypergraph convolution enters a second hypergraph convolution layer, the number of layers being an adjustable hyperparameter. Repeating this process several times yields node embedding matrices at different levels: each hypergraph convolution layer outputs different node characteristics, with shallow features better describing a node and the neighbor features directly connected to it, while deep features capture the node's high-order abstract characteristics and better express the commodity embedded vector.
Finally, the dynamic commodity embedded vector $E_I^{t}$ is obtained.
For the user side, the user hypergraph is matrix-multiplied with the learned dynamic commodity embedded vectors, and the dynamic user embedded vectors are learned; finally, a layer of ReLU nonlinear activation gives the dynamic user embedded vector after hypergraph convolution. The hypergraph convolution formula for learning the user embedded vector expands into:

$$E_U^{t}=\tau\left((G^{t})^{\top}\,E_I^{t}\,P_u\right)$$

finally obtaining the dynamic user embedded vector $E_U^{t}$, where $P_u$ is a learnable parameter matrix.
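As an illustrative PyTorch sketch of one such hypergraph convolution layer, assuming the normalized propagation matrix is precomputed as a sparse tensor and using torch.sparse.mm as the text indicates (all names are ours, not the patent's):

    import torch
    import torch.nn as nn

    class HypergraphConv(nn.Module):
        # one layer: E' = tau(Dv^-1/2 H W De^-1 H^T Dv^-1/2 E P)
        def __init__(self, dim=100):
            super().__init__()
            self.P = nn.Linear(dim, dim, bias=False)   # learnable parameter matrix

        def forward(self, prop, E):
            # prop: sparse (n, n) precomputed normalized propagation matrix
            return torch.relu(torch.sparse.mm(prop, self.P(E)))

    def user_side(H, E_item, P_u):
        # H: sparse (n_items, m_users) incidence matrix; multiplying by its
        # transpose aggregates item embeddings along each user hyperedge
        return torch.relu(torch.sparse.mm(H.transpose(0, 1), P_u(E_item)))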
further, at each time, the embedded vectors of the dynamic users and the commodities and the embedded vectors of the static users and the commodities are spliced to obtain a user embedded vector dictionary and a commodity embedded vector dictionary.
In each training step, the interaction sequence of each batch of users and commodities is looked up in the commodity embedded vector dictionary; through the interaction sequence in the sample and the commodity id values in each row, the commodity embedded vectors at those index values are retrieved. The final output is a 3-dimensional tensor of shape n × 50 × 100, where n is the number of samples; each sample is a 50 × 100 matrix in which each row represents the embedded vector of an interacted commodity.
For the user side, similarly, the user ids are looked up in the dynamic user embedded vector dictionary, and the corresponding user embedded vectors of shape n × 50 × 100 are selected.
Next, the static/dynamic user embedded vectors and static/dynamic commodity embedded vectors learned above are added, and the user-commodity interaction embedded vector is learned; the specific steps of this method are as follows:
The four vectors are added by the fusion layer:

$$x_{u,i}^{t}=e_u+e_u^{t}+e_i+e_i^{t}$$

finally obtaining the interaction embedded vector of the user and the commodity, where u is the user ID, i the paired commodity ID, and t the interaction time.
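As a sketch of the dictionary lookup and fusion-by-addition just described (tensor shapes follow the text; the function and table names are assumptions of ours):

    import torch

    def interaction_embeddings(item_ids, user_ids, E_i_s, E_i_d, E_u_s, E_u_d):
        # item_ids: (n, 50) padded interaction sequences; user_ids: (n,)
        # E_*: (num_items|num_users, 100) static/dynamic embedding tables
        items = E_i_s[item_ids] + E_i_d[item_ids]       # (n, 50, 100)
        users = E_u_s[user_ids] + E_u_d[user_ids]       # (n, 100)
        return items + users.unsqueeze(1)               # broadcast over sequence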
(4) Inputting the user-commodity interaction sequence into the Transformer module and sliding windows at different time granularities, thereby obtaining dynamic user embedded vectors at different time granularities; these are combined into a user embedded vector carrying short-, medium-, and long-term information:
Before input to the Transformer module, we need to apply a mask operation to the user-commodity interaction sequences so that the lengths of the input sequences stay consistent. The specific operation is as follows:
for each row of the user-commodity interaction sequence, positions that are not 0 are marked with a bool value; the mask's dimensions are converted by compression to n × 1 × 50, and element-wise (star) multiplication with the learned user-commodity interaction embedded vectors yields the masked user-commodity interaction embedded vectors.
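A minimal sketch of this mask operation, assuming id 0 marks padding; the patent reshapes the mask to n × 1 × 50, while the equivalent broadcastable shape n × 50 × 1 is used here:

    import torch

    def mask_sequences(item_ids, emb):
        # item_ids: (n, 50) with 0 as padding; emb: (n, 50, 100)
        mask = (item_ids != 0).unsqueeze(-1).float()    # (n, 50, 1) bool -> float
        return emb * mask                               # element-wise ("star") product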
Further, we enter the Transformer module, which we apply separately at three time granularities. The Transformer model extracts features from the user interaction sequence: through its positional encoding, time and order information is stored in the commodity embedded vectors, and the attention mechanism learns the inherent relationships between the commodities in the sequence.
The attention mechanism better extracts the user's long- and short-term information and ensures that the extracted information does not lose the earlier parts of an overly long sequence. This enriches the user's semantics; the feature representation of the whole sequence is finally obtained as the embedded vector combining the user's long- and short-term sequence information. The specific steps are as follows:
According to the three time sliding windows (short, medium, long), feature extraction over the short-term sequence is performed first: single interaction behaviors are considered, and the sliding window size is 1.
The commodity interaction sequence is accumulated within each window using torch.sum. Since the first step is the short-term sequence, the accumulation is just the value itself, and the final dimension is still n × 50 × 100; the accumulated user-commodity interaction sequence is then input into the Transformer module.
For the Transformer module we use PyTorch's built-in nn API (in_features=100, out_features=100, bias=True); the number of heads of the multi-head attention mechanism is a hyperparameter, set here to 1; internally there are 2 fully-connected layers, the first of size 100 × n and the second of size n × 100; the dropout rate is 0.1; two Transformer layers are initialized.
After the user-commodity interaction sequence is input into the first Transformer layer, the output dimension is n × 50 × 100.
Further, this newly obtained result is input into the second Transformer layer; the resulting final dimension is still n × 50 × 100.
Since we only need to extract the final dynamic user embedded vector from the user's interaction behavior, we compress the dimensions of the result and take the last position as the output of shape n × 100. The attention mechanism in the Transformer model is calculated as follows:
The formulas:

$$Q=XW^{Q},\qquad K=XW^{K},\qquad V=XW^{V}$$

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

wherein $W^{Q}, W^{K}, W^{V}\in\mathbb{R}^{d_{model}\times d_k}$ are three globally shared trainable parameter matrices, $d_{model}$ is the dimension of the user-commodity interaction embedded vector, and $d_k$ is the dimension of the parameter matrices. After the query, key, and value vectors of each commodity embedded vector are obtained, the query vector is matched against each key vector: the dot product of the two yields the correlation weight of the query and key vectors, and a softmax activation function converts these into the corresponding weight values; weighted summation then updates each commodity embedded vector. Finally, the multiple heads are concatenated and projected through the parameter matrix $W^{O}$; the specific calculation formula is as follows:

$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\dots,\mathrm{head}_h)\,W^{O}$$

The attention weights of user u under the different commodity interactions are thus obtained, where $s_u$ represents the sequence of the user's transactions with different commodities in different time periods, $x_{u,j}^{t}$ represents the interaction embedded vectors of the user with the different commodities, from which the attention coefficients of the user with the other commodities are calculated; finally, the dynamic user embedded vector $z_u^{t}$ is obtained by weighted summation at the last time t.
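A compact sketch of this scaled dot-product attention with a single head, matching the formulas above (variable names are illustrative):

    import math
    import torch

    def attention(X, Wq, Wk, Wv):
        # X: (n, 50, 100) masked interaction embeddings; Wq/Wk/Wv: (100, d_k)
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.transpose(-2, -1) / math.sqrt(K.size(-1))
        weights = torch.softmax(scores, dim=-1)   # correlation weights per query
        return weights @ V                        # weighted sum updates each position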
For the medium-term sequence, the sliding time window is 2 days, consistent with the overall idea of the single-day sliding; the dimension of the final output is n × 25 × 100, and dimension compression gives a final output of n × 100.
Similarly, for the long-term sequence the sliding window is 5 days; the dimension of the final result is n × 10 × 100, and dimension compression gives a final output of n × 100.
The feature extractions of the short-, medium-, and long-term sequences are added. The resulting dynamic user embedded vector contains not only the user's short-term interests but also the extraction of the user's long-term interest features, enriching the user's semantic information.
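Putting the three granularities together, a sketch using PyTorch's built-in Transformer encoder, assuming window-wise accumulation with torch.sum as described; the window sizes 1, 2, and 5 follow the text, the class name is ours, and a single shared encoder is used here for brevity:

    import torch
    import torch.nn as nn

    class MultiGranularityUserEncoder(nn.Module):
        # encode the interaction sequence at three window sizes and sum the results
        def __init__(self, dim=100, heads=1, layers=2):
            super().__init__()
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                               dropout=0.1, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

        def encode(self, x, window):
            # x: (n, 50, 100); sum consecutive `window` steps -> (n, 50/window, 100)
            n, L, d = x.shape
            x = x.view(n, L // window, window, d).sum(dim=2)
            h = self.encoder(x)            # (n, L/window, 100)
            return h[:, -1, :]             # last position as the user vector

        def forward(self, x):
            # short (1), medium (2), long (5) granularities, added together
            return self.encode(x, 1) + self.encode(x, 2) + self.encode(x, 5)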
(5) Main model prediction module: preference prediction is carried out using the dynamic user embedded vector and the commodity embedded vectors, and the recommendation sequence is obtained:
Then, the embedded vectors n × 100 of n commodities are retrieved according to the id values of the n randomly sampled positive samples; similarly, the commodity embedded vectors n × 100 of the negative samples are obtained from the n negative samples.
The dynamic and static vectors of the positive samples n × 100 and of the negative samples n × 100 are then added respectively to construct the final commodity embedded vectors.
For the positive samples, the dynamic user embedded vector n × 100 is element-wise multiplied with the positive-sample commodity embedded vectors n × 100 and accumulated along each row with torch.sum(axis=1), finally giving the user's scores pos_score on the n commodities; for the negative-sample commodities n × 100, the dynamic user embedded vector is element-wise multiplied with the negative-sample commodity embedded vectors to obtain the user's scores on the n negative-sample commodities.
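A sketch of this scoring step (names are illustrative):

    import torch

    def preference_scores(z_user, e_pos, e_neg):
        # z_user: (n, 100) dynamic user vectors;
        # e_pos/e_neg: (n, 100) fused (static + dynamic) item vectors
        pos_score = (z_user * e_pos).sum(dim=1)   # element-wise mult, row sum
        neg_score = (z_user * e_neg).sum(dim=1)
        return pos_score, neg_score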
Finally, the main-model BPRLoss loss function is constructed to supervise and train the model; parameters are updated by back-propagating gradients until the model converges.
The prediction formula:

$$\hat{y}_{ui}=z_u\cdot\left(e_i+e_i^{t}\right)$$

where $z_u$ is the final dynamic user embedded vector and $e_i$, $e_i^{t}$ are the static and dynamic commodity embedded vectors.

The formula, BPRLoss:

$$Loss=-\sum_{(u,i,j)\in O}\ln\,\mathrm{sigmoid}\left(\hat{y}_{ui}-\hat{y}_{uj}\right)+\lambda\lVert\Theta\rVert_2^{2}$$

here $O=\{(u,i,j)\mid(u,i)\in R^{+},(u,j)\in R^{-}\}$, where $R^{+}$ is the set of observed samples and $R^{-}$ the set of unobserved samples; sigmoid is a nonlinear activation function; $\Theta$ denotes the learnable parameters, with $L_2$ regularization reducing the overfitting of the model.
Further, an end-to-end model is constructed, and parameter learning and updating are performed by using training data, specifically:
The hypergraphs of the commodities and users are constructed over different time periods from the user-commodity interaction data and the specific time data. Firstly, the user-commodity interaction data are input into the pre-training model: after random augmentation and perturbation, the user-commodity interaction sequences are input into the LSTM model, which outputs user embedded vectors. These are then mapped into the contrastive learning module, where treating a user's own augmented interaction sequences as positive samples and other users as negative samples strengthens the user's own feature extraction, so that the user vectors contain the correlation information between users; the other pre-training loss is BPRLoss, and the static user and commodity embedded vectors are finally learned by optimizing these two losses. Then the dynamic commodity and user embedded vectors at each moment are learned through hypergraph convolution, and the four are fused by the fusion layer with an attention mechanism to obtain the user-commodity interaction embedded vectors. Next, the Transformer module learns the final user representation of the whole sequence at different time granularities, and the short-, medium-, and long-term dynamic user embedded vectors are added; the result is dot-multiplied with the combined dynamic and static commodity embedded vectors, and the model parameters are learned on the training set by stochastic gradient descent until the model converges.
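To make the end-to-end flow concrete, a schematic training step under the assumptions of the earlier sketches (every name here comes from those illustrative sketches, not from the patent's reference implementation):

    def train_step(batch, tables, user_encoder, item_table, optimizer):
        # one schematic main-model training step; item_table holds the fused
        # (static + dynamic) commodity embedded vectors
        item_ids, user_ids, pos_ids, neg_ids = batch
        x = interaction_embeddings(item_ids, user_ids, *tables)   # (n, 50, 100)
        x = mask_sequences(item_ids, x)
        z_u = user_encoder(x)                                     # (n, 100)
        pos_s, neg_s = preference_scores(z_u, item_table[pos_ids],
                                         item_table[neg_ids])
        loss = bpr_loss(pos_s, neg_s, list(user_encoder.parameters()))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()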
As can be seen from the above, the method of this embodiment comprehensively considers the information of user-commodity interactions at multiple time granularities. The pre-training stage with its contrastive learning module learns better initial embedded vectors for the static commodities and users, ensuring feature validity and incorporating the interrelations between users, while reducing the training time and convergence complexity of the main model. In the main model, multi-time-scale hypergraph convolution extracts effective information about commodities and users and aggregates multi-order neighbor information of the nodes: shallow features better describe a node and its directly connected neighbors, while deep features capture the node's high-order abstract characteristics, enriching the embedded vectors of commodities and users. The fusion layer then fuses the static/dynamic commodity and static/dynamic user relations into the user-commodity interaction embedded vectors, which are finally input into the Transformer module and combined across different time granularities, enriching the user's semantics and features and yielding the dynamic user embedded vector; the user's next purchased commodity is finally predicted through BPRLoss.
The embodiment further provides a sequence recommendation device based on the hypergraph neural network, which includes:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method illustrated in fig. 4.
The sequence recommendation device based on the hypergraph neural network can execute the sequence recommendation method based on the hypergraph neural network provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A sequence recommendation method based on a hypergraph neural network is characterized by comprising the following steps:
acquiring a data set with user and commodity interaction information and time information, constructing a hypergraph according to the acquired data set, and dividing the hypergraph by using different time periods;
dividing to obtain a hypergraph subgraph of each time period, and clustering the hyperedge users in the hypergraph subgraph at each moment to cluster similar hyperedges;
pre-training learning is carried out by introducing contrast learning and graph convolution to obtain initial embedded vectors of commodities and users;
inputting the hypergraph subgraphs of each divided time period and the commodity/user initial embedded vectors learned through pre-training into the main model, and performing hypergraph convolution learning on the dynamic commodity/user embedded vectors; fusing the static/dynamic commodity and static/dynamic user embedded vectors through the fusion layer to obtain the interaction embedded vectors of users and commodities;
inputting the interaction embedded vectors of users and commodities into the Transformer module, learning short-, medium-, and long-term dynamic user embedded vectors according to different time sliding windows, and fusing the three dynamic user embedded vectors into the final dynamic user embedded vector;
and performing preference prediction on the final dynamic user embedded vector and the dynamic and static commodity embedded vectors which are fused to obtain a recommendation sequence.
2. The hypergraph neural network-based sequence recommendation method according to claim 1, wherein the constructing a hypergraph from the obtained dataset comprises:
preprocessing the data set, and using the preprocessed data set in the construction of the commodity and user hypergraph subgraphs at different times;
the method for constructing the commodity and user hypergraph subgraphs at different times by using the preprocessed data sets comprises the following steps:
the user: u ═ U 1 ,u 2 ,…u L B, }; wherein u is j Embedding a jth user into a vector, wherein j is more than or equal to 1 and less than or equal to L, and L is the total number of users;
commercial products: i ═ I 1 ,i 2 ,…i N B, }; wherein i j J is more than or equal to 1 and less than or equal to N, and N is the total number of the articles;
time: t ═ T 1 ,t 2 ,…t C B, }; wherein, t j J is more than or equal to 1 and less than or equal to C at the jth moment, and C is the total time length;
Figure FDA0003693820410000011
Figure FDA0003693820410000012
representing an interaction sequence of the user and the commodity, and sequencing according to the interaction time; wherein,
Figure FDA0003693820410000013
an embedded vector representing an article ID of 1, m being the article ID;
Figure FDA0003693820410000014
representing the time t at which the user n interacts with the item 1 1
Figure FDA0003693820410000015
Wherein G represents a constructed hypergraph sub-graph of the commodity and the user at each time t, t represents a moment of time,
Figure FDA0003693820410000016
is shown at the moment of time t c And then, carrying out a hypergraph graph of the commodity and the user.
3. The hypergraph neural network-based sequence recommendation method according to claim 1, wherein in the step of clustering the hypergraph users in the hypergraph subgraph at each moment, the clustering formula is as follows:
$$E=\sum_{i=1}^{k}\sum_{x\in C_i}\lVert x-u_i\rVert_2^2$$

wherein the initialized cluster partition of the hyperedge users is $(C_1,C_2,\dots,C_k)$ and the target is to minimize the squared error E; $u_i$ is the mean vector of cluster $C_i$; x ranges over all hyperedge users in cluster $C_i$; $X_i$ is hyperedge user i, and k is the number of cluster centers.
4. The hypergraph neural network-based sequence recommendation method according to claim 1, wherein in the step of pre-training learning by introducing contrast learning and graph convolution, the loss function of the pre-training stage includes two parts; the formula for the first part, BPRLoss, is:
in the pre-training phase, for a user $u_i$, a positive-sample commodity $i^{+}$ and a negative-sample commodity $i^{-}$, the output prediction values require supervised training, and the training loss is defined as follows:

$$\mathcal{L}_{BPR}=-\sum_{(u,i,j)\in O}\ln\sigma\left(\hat{y}_{ui}-\hat{y}_{uj}\right)+\lambda\lVert\Theta\rVert_2^2$$

in the formula, $O=\{(u,i,j)\mid(u,i)\in R^{+},(u,j)\in R^{-}\}$, wherein $R^{+}$ is the set of observed samples and $R^{-}$ the set of unobserved samples; (u, i) represents a user positive-sample pair, and (u, j) represents a user negative-sample pair;
the second part is the contrastive learning formula:
firstly, an LSTM sequence model is applied, and the output of the last time step is used as the potential user embedded vector; the randomly initialized user embedded vector and the potential embedded vector of the same user form a positive sample pair, and other users serve as negative samples;
the formula is as follows:

$$\mathcal{L}_{CL}=-\sum_{u\in U}\log\frac{\exp\left(s(z'_u,z''_u)/\tau\right)}{\sum_{v\in U}\exp\left(s(z'_u,z''_v)/\tau\right)}$$

in the formula, $\{(z'_u,z''_u)\}$ belongs to the positive sample pairs and $\{(z'_u,z''_v)\mid u,v\in U, v\neq u\}$ to the negative sample pairs; $s(\cdot,\cdot)$ is the cosine similarity used to predict the correlation of two vectors, $\tau$ is a hyperparameter, and U is the user set.
5. The hypergraph neural network-based sequence recommendation method according to claim 1, wherein the hypergraph subgraph of each divided time segment and the pre-trained learned commodity/user initial embedding vector are input into a main model to perform hypergraph convolution learning on the dynamic commodity/user embedding vector, and the method comprises the following steps:
the static commodity initial embedded vector is $E_I^{0}$, and one layer of hypergraph convolution gives:

$$E_I^{1}=\tau\left(G^{t_n}\,W\,(G^{t_n})^{\top}\,E_I^{0}\,P_0\right)$$

wherein $E_I^{1}$ is the commodity embedded vector after hypergraph convolution, $E_I^{0}$ represents the static commodity initial embedded vector, $G^{t_n}$ represents the hypergraph subgraph of the commodities and users at time $t_n$, $W$ is the diagonal weight matrix of the hyperedges, $(G^{t_n})^{\top}$ is the transpose of the hypergraph subgraph of the commodities and users at time $t_n$, $P_0$ represents a learnable parameter matrix, and $\tau$ is a nonlinear activation function;
the hypergraph convolution formula for learning commodity embedded vectors is further expanded into:

$$E_I^{t_n}=\tau\left(D_v^{-1/2}\,G^{t_n}\,W\,D_e^{-1}\,(G^{t_n})^{\top}\,D_v^{-1/2}\,E_I^{0}\,P_0\right)$$

obtaining the dynamic commodity embedded vector, wherein $D_v$ is the diagonal degree matrix of the nodes in the hypergraph, $D_e$ is the diagonal degree matrix of the hyperedges in the hypergraph, and $E_I^{t_n}$ represents the dynamic commodity initial embedded vector.
6. The hypergraph neural network-based sequence recommendation method according to claim 5, wherein the fusing static/dynamic commodity and static/dynamic user embedded vector through a fusion layer to obtain user-commodity interaction embedded vector comprises:
a fusion layer:

$$x_{u,i}^{t_n}=e_u+e_u^{t_n}+e_i+e_i^{t_n}$$

in the formula, $e_u$ represents the static user initial embedded vector and $e_u^{t_n}$ represents the dynamic user initial embedded vector; $e_i$ and $e_i^{t_n}$ are the static and dynamic commodity embedded vectors; the four vectors are added to obtain the interaction embedded vector of each user and each commodity, wherein i represents the paired commodity ID, u represents the corresponding user ID, and $t_n$ corresponds to the interaction time.
7. The hypergraph neural network-based sequence recommendation method, wherein the user-commodity interaction embedded vectors are input into the Transformer module, and the attention mechanism in the Transformer module is calculated as follows:

$$\alpha_{u,j}^{t}=\mathrm{softmax}\!\left(\frac{\left(W^{q}\,x_{u}\right)^{\top}\left(W^{K}\,x_{u,j}^{t}\right)}{\sqrt{d}}\right)$$

in the formula, $x_{u,j}^{t}$ represents the interaction embedded vectors of user u with the different commodities, from which the attention coefficients of the user with the different commodities are calculated; $W^{q}$, $W^{K}$, $W^{V}$ represent learnable parameter matrices; $x_u$ denotes the embedded vector of user u and the commodities $L_u$; and d represents the vector dimension;
the short-term dynamic user embedded vector $z_u^{t,s}$ is obtained by weighted summation at the last time t;
the short-, medium-, and long-term vectors are added:

$$z_u=z_u^{t,s}+z_u^{t,m}+z_u^{t,l}$$

obtaining the final dynamic user embedded vector, wherein $z_u^{t,m}$ is the medium-term dynamic user embedded vector and $z_u^{t,l}$ is the long-term dynamic user embedded vector.
8. The sequence recommendation method based on the hypergraph neural network as claimed in claim 1, wherein the preference prediction of the final dynamic user embedded vector and the fused dynamic and static commodity embedded vector to obtain the recommendation sequence comprises:
using BPRLoss as a loss function to supervise prediction and back-propagate gradient into the network for parameter updating;
the prediction formula is as follows:

$$\hat{y}_{ui}=z_u\cdot\left(e_i^{t}+e_i\right)$$

wherein $e_i^{t}$ is the dynamic commodity embedded vector and $e_i$ the static commodity embedded vector;
the expression for BPRLoss is:

$$Loss=-\sum_{(u,i,j)\in O}\ln\,\mathrm{sigmoid}\left(\hat{y}_{ui}-\hat{y}_{uj}\right)+\lambda\lVert\Theta\rVert_2^{2}$$

in the formula, Loss is the loss value and $O=\{(u,i,j)\mid(u,i)\in R^{+},(u,j)\in R^{-}\}$, wherein $R^{+}$ is the set of observed samples and $R^{-}$ the set of unobserved samples; sigmoid is a nonlinear activation function; $\hat{y}_{ui}$ is the prediction score of the user for positive sample i, $\hat{y}_{uj}$ is the prediction score of the user for negative sample j, λ is the weight of the $L_2$ regularization loss, and Θ represents the learnable parameters.
9. The sequence recommendation method based on the hypergraph neural network as claimed in claim 1, further comprising the steps of constructing an end-to-end model, and performing parameter learning and updating by using training data:
acquiring interaction data and specific time data of a user and commodities, and constructing a hypergraph sub-graph of the user and the commodities by using different time periods;
inputting user and commodity interaction data into a pre-training model, inputting a user purchased article sequence into an LSTM model after random enhancement and disturbance, and outputting a user potential embedded vector; then, entering a comparison learning module to learn to obtain a static commodity/user embedded vector;
learning the embedded vector of the dynamic commodity/user at each moment through hypergraph convolution; fusing the obtained embedded vectors through a fusion layer to obtain interactive embedded vectors of the user and the commodities;
and realizing the dynamic user embedded vectors at short, medium, and long time granularities with the Transformer module through sliding windows, combining the three dynamic user embedded vectors, performing point multiplication with the combined dynamic/static commodity embedded vectors, using BPRLoss as the loss function for supervision, and learning the model parameters on the training set by gradient-descent back propagation until the model converges.
10. A sequence recommendation device based on a hypergraph neural network is characterized by comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method according to any one of claims 1-9.
CN202210668287.XA 2022-06-14 2022-06-14 Sequence recommendation method and device based on hypergraph neural network Active CN115082147B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210668287.XA CN115082147B (en) 2022-06-14 2022-06-14 Sequence recommendation method and device based on hypergraph neural network

Publications (2)

Publication Number Publication Date
CN115082147A true CN115082147A (en) 2022-09-20
CN115082147B CN115082147B (en) 2024-04-19

Family

ID=83251505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210668287.XA Active CN115082147B (en) 2022-06-14 2022-06-14 Sequence recommendation method and device based on hypergraph neural network

Country Status (1)

Country Link
CN (1) CN115082147B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990972A (en) * 2021-03-19 2021-06-18 华南理工大学 Recommendation method based on heterogeneous graph neural network
CN113672811A (en) * 2021-08-24 2021-11-19 广东工业大学 Hypergraph convolution collaborative filtering recommendation method and system based on topology information embedding and computer readable storage medium

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115545300A (en) * 2022-09-30 2022-12-30 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Method and device for predicting user behavior based on graph neural network
CN115858926A (en) * 2022-11-29 2023-03-28 杭州电子科技大学 User-based complex multi-mode interest extraction and modeling sequence recommendation method
CN115858926B (en) * 2022-11-29 2023-09-01 杭州电子科技大学 Sequence recommendation method based on complex multi-mode interest extraction and modeling of user
CN116204729A (en) * 2022-12-05 2023-06-02 重庆邮电大学 Cross-domain group intelligent recommendation method based on hypergraph neural network
CN116204729B (en) * 2022-12-05 2024-05-10 武汉光谷康服信息科技有限公司 Cross-domain group intelligent recommendation method based on hypergraph neural network
CN116320511A (en) * 2023-02-03 2023-06-23 华南理工大学 Cross-domain fusion recommendation method based on graph convolution network
CN116055224A (en) * 2023-03-29 2023-05-02 山东省计算中心(国家超级计算济南中心) Encryption application program behavior flow detection method based on space-time hypergraph convolution
CN116055224B (en) * 2023-03-29 2023-06-16 山东省计算中心(国家超级计算济南中心) Encryption application program behavior flow detection method based on space-time hypergraph convolution
CN116257659A (en) * 2023-03-31 2023-06-13 华中师范大学 Dynamic diagram embedding method and system of intelligent learning guiding system
CN116108283B (en) * 2023-04-13 2023-10-13 苏州大学 Uncertainty perception contrast learning method for sequence recommendation
CN116108283A (en) * 2023-04-13 2023-05-12 苏州大学 Uncertainty perception contrast learning method for sequence recommendation
CN116541593A (en) * 2023-04-28 2023-08-04 华中师范大学 Course recommendation method based on hypergraph neural network
CN116541593B (en) * 2023-04-28 2024-05-31 华中师范大学 Course recommendation method based on hypergraph neural network
CN116662676A (en) * 2023-06-09 2023-08-29 北京华品博睿网络技术有限公司 Online recruitment bidirectional reciprocity recommendation system and method based on multi-behavior modeling
CN117474637A (en) * 2023-12-28 2024-01-30 中国海洋大学 Personalized commodity recommendation method and system based on time sequence diagram convolution network
CN117474637B (en) * 2023-12-28 2024-04-16 中国海洋大学 Personalized commodity recommendation method and system based on time sequence diagram convolution network

Also Published As

Publication number Publication date
CN115082147B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
CN115082147B (en) Sequence recommendation method and device based on hypergraph neural network
Pan et al. Study on convolutional neural network and its application in data mining and sales forecasting for E-commerce
CN109299396B (en) Convolutional neural network collaborative filtering recommendation method and system fusing attention model
US20240265309A1 (en) Item recommendation method and apparatus, and storage medium
Biswas et al. A hybrid recommender system for recommending smartphones to prospective customers
CN110955826B (en) Recommendation system based on improved cyclic neural network unit
CN111737578B (en) Recommendation method and system
CN111178986B (en) User-commodity preference prediction method and system
CN115099886B (en) Long-short interest sequence recommendation method, device and storage medium
CN112258262A (en) Conversation recommendation method based on convolution self-attention network
Bouzidi et al. Deep learning-based automated learning environment using smart data to improve corporate marketing, business strategies, fraud detection in financial services, and financial time series forecasting
Zou et al. Deep field relation neural network for click-through rate prediction
Hazrati et al. Entity representation for pairwise collaborative ranking using restricted Boltzmann machine
CN113888238A (en) Advertisement click rate prediction method and device and computer equipment
Yuan et al. Deep learning from a statistical perspective
Liu Deep learning in marketing: a review and research agenda
Chen et al. Poverty/investment slow distribution effect analysis based on Hopfield neural network
Cao et al. Implicit user relationships across sessions enhanced graph for session-based recommendation
Xia An overview of deep learning
Chen et al. Inventory management with multisource heterogeneous information: Roles of representation learning and information fusion
Ferreira et al. Data selection in neural networks
CN116484092A (en) Hierarchical attention network sequence recommendation method based on long-short-term preference of user
Wang et al. Bayesian deep learning based exploration-exploitation for personalized recommendations
Shen et al. A deep embedding model for co-occurrence learning
CN115659277A (en) E-commerce session recommendation method, system, device and medium based on multi-behavior feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant