CN115481325A - Personalized news recommendation method and system based on user global interest migration perception - Google Patents

Info

Publication number
CN115481325A
Authority
CN
China
Prior art keywords
news
user
migration
interest
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211233877.6A
Other languages
Chinese (zh)
Inventor
胡明芮
刘波
严辉
孟青
曹玖新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202211233877.6A
Publication of CN115481325A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a personalized news recommendation method and system based on user global interest migration perception. The method first extracts news headline text and headline entity text from news data, and constructs user sequences, a click relationship network between users and news, and a sequential relationship network between news items. It then computes multi-view news content representations, followed by the user content interest representation. Next, it constructs a global news migration graph, convolves the click relations between users and news and the migration relations between news items, and fuses the two results to obtain the user's migration interest representation. Finally, it combines the user's content interest representation and migration interest representation to build a news recommendation model with user global interest migration perception, achieving personalized recommendation. The system uses web interaction technology to visualize data analysis and recommendation results. The method can effectively improve the accuracy of personalized recommendation and is highly robust.

Description

Personalized news recommendation method and system based on user global interest migration perception
Technical Field
The invention relates to personalized news recommendation, and in particular to a personalized news recommendation method and system based on user global interest migration perception.
Background
The rapid development of online news platforms lets users share, search, and browse news in a timely manner, which helps meet people's information needs but also brings the problem of information overload. Recommendation systems use information filtering techniques to recommend information that may interest a user, and have become an effective solution to the information overload problem.
News recommendation systems can be broadly divided into two categories: collaborative-filtering-based and content-based news recommendation. Many new branches have been derived from these two approaches; according to the problems they target, they are mainly divided into privacy-preserving, knowledge-aware, sequence-aware, and explainable recommendation systems. News is often browsed anonymously, so a user's browsing records are usually stored as session sequences and long-term browsing information is hard to obtain; sequence-based recommendation is therefore particularly important in news recommendation. In 2019, Wang et al. systematically summarized the classification of sequential recommendation systems, dividing them into three categories by technical route: traditional sequential recommendation systems, sequential recommendation systems based on latent semantic models, and recommendation systems based on deep learning.
Traditional sequential recommendation is mainly divided into models based on sequential pattern mining and models based on Markov chains. Sequential-pattern-mining methods mine frequent patterns from sequence data and then use the mined patterns to guide subsequent recommendation. Although simple and straightforward, sequential pattern mining typically produces a large number of redundant patterns, adding unnecessary time and space overhead. Another significant drawback is that frequency constraints often cause it to lose infrequent patterns and items, which limits recommendations to popular items. A Markov-chain-based recommendation system models the transitions between user-item interactions in a sequence to predict the next interaction. Because the Markov property assumes that the current interaction depends only on one or a few of the most recent interactions, such a system captures only short-term dependencies and ignores long-term ones. Furthermore, it captures only point-wise dependencies and ignores dependencies between sets of interactions. Traditional sequential recommendation models are intuitive and simple and are naturally suited to modeling sequential correlations between user-item interactions, but they suffer from the drawbacks above.
Sequential recommendation based on latent semantic models mainly uses a factorization machine to learn a latent representation of each user or item. Unlike collaborative filtering, the matrix or tensor to be factorized is composed of interactions rather than the rating matrix used in collaborative filtering. Such models are easily affected by data sparsity, which makes an ideal recommendation effect difficult to obtain; in addition, the limited linear expressive power of the factorization machine and its inclusion of useless feature cross combinations strongly affect the recommendation results. In recent years, with the development of deep learning, researchers have combined deep learning methods with factorization machines for sequential recommendation to overcome this limited expressive power, but the resulting models are complex in design and computationally expensive.
Deep-learning-based sequential recommendation systems mainly model the interactions between users and items using two representative technologies: recurrent neural networks (RNNs) and graph neural networks (GNNs). Recurrent neural networks have advantages in handling long-term dependencies in sequences, but they still struggle with very long sequences; they also have difficulty modeling higher-order dependencies in sequences and characterizing user preferences accurately when the news browsed by a user is sparse. In recent years, research that treats the interactions between users and news from a graph perspective has attracted growing attention. Researchers typically construct a directed graph from the user-item interaction sequences, with each item as a node and each sequence as a path, learn item representations on the constructed graph with a graph neural network, and then model users by combining their historical information. Using the time information of the user-item interaction sequences, researchers have also divided the sequences into time slices, connected the items within the same time slice by hyperedges to construct a hypergraph, learned item representations through hypergraph convolution, and dynamically represented users with the learned item representations. Converting the relations between sequences into a graph data structure lifts first-order sequential relations to higher-order graph relations, which makes the complex dependencies among item contexts easier to model.
For users with few historically browsed news items, the sequences can be supplemented with context information from the news sequences, which alleviates the difficulty of modeling user preferences when the browsing history is sparse. However, existing research usually considers only the click relationship between users and news and neglects the internal connections within the global sequences of browsed news. It therefore ignores users' global interest migration, describes user interest incompletely, and fails to meet user needs.
How to realize a recommendation algorithm that combines thorough content mining with user global interest migration perception is a problem urgently awaiting solution in news recommendation systems. In light of the above research background, the invention provides a personalized news recommendation system based on user global interest migration perception, addressing the problems of global information aggregation and insufficient news representation in news recommendation systems. User global interest migration perception is defined in the invention as follows: from a global view, the interactive connections formed by click behaviors between users and news are mined, the forward and backward sequential migration connections between news sequences are considered, and the high-order relations between users and news are modeled.
Disclosure of Invention
The purpose of the invention is as follows: in order to overcome the defects of the prior art, the invention provides a personalized news recommendation method and system based on user global interest migration perception.
The technical scheme is as follows: in order to realize the purpose, the invention adopts the following technical scheme:
1. A personalized news recommendation method based on user global interest migration perception, comprising the following steps:
(1) Preprocessing the data of the user's historically browsed news and the candidate news, and constructing a global sequential relation network between news items and a click relation network between users and news;
(2) Candidate news content characterization computation
Using an attention mechanism, self-attention characterizations of the headline text and of the headline entities are computed for the candidate news from two views, the text view and the entity view; cross-attention characterizations between the two views are then computed; and the per-view characterizations and the cross-attention characterizations are combined in a multi-view fusion calculation to obtain the candidate news content characterization;
(3) User content interest characterization computation
Using an attention mechanism, self-attention characterizations of the headline text and of the headline entities are computed for the user's historically browsed news from two views, the text view and the entity view; cross-attention characterizations between the two views are then computed; the per-view characterizations and the cross-attention characterizations are combined in a multi-view fusion calculation to obtain the content characterizations of the user's historically browsed news; and these content characterizations are aggregated with an attention mechanism to obtain the user's content interest characterization;
(4) User migration interest characterization computation
The global sequential relation network between news items and the click relation network between users and news constructed in step (1) are fused to construct a global news migration graph. The user node characterizations and the characterizations of the different neighbor news nodes in the global news migration graph are taken as input, and a two-layer migration-aware graph attention network performs information aggregation and learning, finally yielding the user migration interest characterization, the news characterization, and the migration characterization, where the migration characterization comprises a propagation characterization and an influence characterization;
(5) Joint recommendation model
A joint recommendation module is constructed from the candidate news content characterization obtained in step (2), the user content interest characterization obtained in step (3), and the user migration interest characterization, news characterization, and migration characterization obtained in step (4), and is used to recommend news. Similarity is computed between the user content interest characterization from step (3) and the candidate news content characterization from step (2) to obtain the user content interest score, and between the user migration interest characterization and the candidate news characterization and migration characterization to obtain the user migration interest score. The user content interest score and the user migration interest score are combined by weighted summation to obtain the user's final interaction probability for the candidate news, and a top-k list of recommended news from the candidate news set is returned, ranked by interaction probability;
(6) System function display.
2. The personalized news recommendation method based on user global interest migration perception according to item 1, wherein step (1) specifically comprises: extracting the news headline text field and headline entity field from the user's historically browsed news and the candidate news, and obtaining the corresponding headline word vectors and entity vectors through GloVe word vectors and entity vectors trained on Wikipedia, respectively, as initial vector representations; then, according to the sequence data of the user's historically browsed news, connecting each historically browsed news item to the one clicked immediately after it to construct the global sequential relation network between news items.
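The network construction in step (1) can be illustrated with a minimal Python sketch. It assumes each user's history is available as an ordered list of news ids; the function name `build_relation_networks` and the dictionary layout are illustrative choices, not part of the patent text.

```python
def build_relation_networks(histories):
    """Build click and sequential relation edges.

    histories: dict mapping a user id to the list of news ids that the
    user browsed, in browsing order (an assumed input format).
    """
    click_edges = set()      # (user, news) pairs
    sequence_edges = set()   # (earlier_news, later_news) pairs
    for user, news_seq in histories.items():
        for news in news_seq:
            click_edges.add((user, news))
        # Connect each news item to the one browsed immediately after it.
        for prev_news, next_news in zip(news_seq, news_seq[1:]):
            if prev_news != next_news:
                sequence_edges.add((prev_news, next_news))
    return click_edges, sequence_edges


histories = {"u1": ["n1", "n2", "n3"], "u2": ["n2", "n3", "n4"]}
click_edges, sequence_edges = build_relation_networks(histories)
```

Because both users pass through n2 and n3, their chains share the edge (n2, n3), which is what lets individual browsing sequences merge into one global network.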
3. The personalized news recommendation method based on user global interest migration perception according to item 1, wherein step (2) comprises the following specific steps:
(2-1) For a given candidate news item n_i, the headline text sequence of the news is represented as
Figure BDA00038828202800000417
where T_i denotes the headline text of the ith news item, w_{i,j} is the jth word in the headline text of the ith news item, and |T_i| is the total number of words in the headline text; the entity sequence of the news is represented as
Figure BDA00038828202800000416
where E_i denotes the headline entities of the ith news item, e_{i,j} is the jth entity in the headline of the ith news item, and |E_i| is the total number of entities in the headline. A self-attention mechanism is used to learn the headline text sequence characterization matrix of the news
Figure BDA0003882820280000041
The calculation process is as follows:
Figure BDA0003882820280000042
where
Figure BDA00038828202800000418
is the word vector matrix of the word sequence, the superscript T denotes matrix transposition, d_w is the feature dimension of a word,
Figure BDA0003882820280000043
is the vector representation of the word w_{i,j},
Figure BDA0003882820280000044
is the self-attention impact weight of the jth word in news n_i, exp(·) denotes the exponential function with the natural constant e as its base,
Figure BDA0003882820280000045
is the self-attention characterization weight of the jth word in news n_i, k indexes the kth word, and
Figure BDA0003882820280000046
is the self-attention representation of the word. Each word is updated according to the normalized weight vector to obtain the text sequence characterization matrix
Figure BDA0003882820280000047
Thereafter, a self-attention mechanism is used to calculate the self-attention representation of the ith news headline text
Figure BDA0003882820280000048
The entities in the headline likewise undergo self-attention learning of the entity sequence representation, giving the entity sequence characterization matrix of the news headline
Figure BDA0003882820280000049
where
Figure BDA00038828202800000410
is the self-attention representation of the jth entity in the headline of the ith news item, from which the self-attention representation of the entities in the news headline is derived
Figure BDA00038828202800000411
(2-2) Cross attention is computed between the headline text sequence characterization matrix of the news
Figure BDA00038828202800000412
and the entity sequence characterization matrix of the news
Figure BDA00038828202800000413
to obtain the degree of association between every word and every entity. The rows and the columns of the resulting association matrix are summed to give the weights of the words and of the entities, respectively, and the features of the text set and of the entity set are then aggregated by weighting to obtain the text-level cross-learning representation
Figure BDA00038828202800000414
and the entity-level cross-learning representation
Figure BDA00038828202800000415
The specific calculation process is as follows:
Figure BDA0003882820280000051
Figure BDA0003882820280000052
Figure BDA0003882820280000053
where
Figure BDA0003882820280000054
is the self-attention vector of the word w_{i,j},
Figure BDA0003882820280000055
is the self-attention vector of the entity e_{i,k},
Figure BDA0003882820280000056
is the cross-attention characterization weight of the jth word in news n_i,
Figure BDA0003882820280000057
is the cross-attention impact weight of the jth word in news n_i,
Figure BDA0003882820280000058
is the cross-attention impact weight of the kth word in news n_i,
Figure BDA0003882820280000059
is the cross-attention characterization weight of the kth word in news n_i, and d_w and d_e are the word and entity feature dimensions, respectively. The entity-level cross-learning representation
Figure BDA00038828202800000510
is calculated in the same way as the text-level cross-learning representation
Figure BDA00038828202800000511
(2-3) The headline text self-attention representation obtained in step (2-1)
Figure BDA00038828202800000512
the headline entity self-attention representation
Figure BDA00038828202800000513
the text-level cross-learning representation obtained in step (2-2)
Figure BDA00038828202800000514
and the entity-level cross-learning representation
Figure BDA00038828202800000515
are added and then concatenated to obtain the news content characterization vector. The multi-view characterization vector of the ith news item
Figure BDA00038828202800000516
is calculated as follows:
Figure BDA00038828202800000517
where + denotes element-wise addition of two vectors and || denotes the concatenation operation. The content characterization vector of the ith news item
Figure BDA00038828202800000518
is calculated as follows:
Figure BDA00038828202800000519
where
Figure BDA00038828202800000520
is the weight of the linear layer and
Figure BDA00038828202800000521
is the bias of the linear layer.
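Since the formulas of steps (2-1) to (2-3) survive only as image placeholders, the following NumPy sketch illustrates the described flow rather than reproducing the patent's equations. Several details are assumptions: scaled dot-product scores, mean pooling for the per-view self-attention summary, equal word and entity dimensions, and all function names.

```python
import numpy as np

def softmax(x, axis=-1):
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Update each row of X (items x dim) with softmax-normalised weights."""
    weights = softmax(X @ X.T / np.sqrt(X.shape[1]), axis=1)
    return weights @ X

def cross_view_pool(H_w, H_e):
    """Cross attention between word rows H_w and entity rows H_e."""
    relevance = H_w @ H_e.T / np.sqrt(H_w.shape[1])  # words x entities
    word_weights = softmax(relevance.sum(axis=1))    # row sums -> word weights
    entity_weights = softmax(relevance.sum(axis=0))  # column sums -> entity weights
    return word_weights @ H_w, entity_weights @ H_e

def news_content_vector(words, entities, W, b):
    H_w, H_e = self_attention(words), self_attention(entities)
    # Mean pooling stands in for the patent's (lost) self-attention summary.
    self_text, self_entity = H_w.mean(axis=0), H_e.mean(axis=0)
    cross_text, cross_entity = cross_view_pool(H_w, H_e)
    # Add within each view, then concatenate the two views.
    multi_view = np.concatenate([self_text + cross_text,
                                 self_entity + cross_entity])
    return W @ multi_view + b  # final linear layer

rng = np.random.default_rng(0)
words = rng.normal(size=(5, 8))      # 5 headline words, dimension 8
entities = rng.normal(size=(2, 8))   # 2 headline entities, dimension 8
W, b = rng.normal(size=(16, 16)), np.zeros(16)
vec = news_content_vector(words, entities, W, b)
```

Summing the rows and columns of the association matrix gives each word a weight reflecting its total relevance to all entities, and each entity a weight reflecting its total relevance to all words.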
4. The personalized news recommendation method based on user global interest migration perception according to item 1, wherein step (3) comprises the following specific steps:
(3-1) User sequence content extraction based on self-attention: news content characterization is computed for the news sequence historically browsed by the user, giving the news sequence matrix
Figure BDA00038828202800000522
where
Figure BDA00038828202800000523
is the content characterization vector of the first news item n_1 browsed by user u_i, and
Figure BDA00038828202800000528
is the length of the user's historical browsing sequence. A self-attention mechanism is applied to the matrix built from the set of news sequence vectors to update information among the news items, giving the news self-attention matrix
Figure BDA00038828202800000524
which is normalized by the Softmax function to obtain the weight matrix
Figure BDA00038828202800000525
The weight matrix is then multiplied by the corresponding set of news vectors so that each news item is related to the other news items, giving the self-attention vector matrix of the user's historically browsed news
Figure BDA00038828202800000526
where
Figure BDA00038828202800000527
is the self-attention vector of user u_i for news n_j;
(3-2) An attention mechanism is used to aggregate, by weighting, the set of news self-attention vectors at the user level, characterizing the user's content interest preference
Figure BDA0003882820280000061
The calculation is as follows:
Figure BDA0003882820280000062
Figure BDA0003882820280000063
where
Figure BDA0003882820280000064
is the degree of preference of user u_i for news n_j in the historical browsing sequence,
Figure BDA0003882820280000065
is the parameter of the first linear layer, whose dimension is consistent with that of the news content vector; its output undergoes a nonlinear transformation through the Tanh activation function and is fed into the second linear layer;
Figure BDA0003882820280000066
is the parameter of the second linear layer, which maps the dimension d_n to 1 so that each news vector corresponds to one weight. Normalizing the weights of all historically browsed news gives the distribution of the user's preference over the news in the historical browsing sequence. Finally, a weighted sum over the news self-attention vector matrix
Figure BDA0003882820280000067
gives the user content interest representation
Figure BDA0003882820280000068
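The two-layer attention pooling of step (3-2) can be sketched in NumPy as follows. This is an illustration under assumptions: the parameter names `W1`, `b1`, `w2` and the presence of a bias in the first layer are not confirmed by the text, and softmax is assumed as the normalization.

```python
import numpy as np

def user_content_interest(news_vecs, W1, b1, w2):
    """Attention-pool the self-attention vectors of browsed news.

    news_vecs: (L, d_n) matrix, one row per historically browsed news item.
    W1: (d_n, d_n) first linear layer; b1: (d_n,) bias (assumed).
    w2: (d_n,) second linear layer, mapping dimension d_n to 1.
    """
    hidden = np.tanh(news_vecs @ W1 + b1)    # first layer + Tanh
    scores = hidden @ w2                     # one scalar weight per news item
    exp_scores = np.exp(scores - scores.max())
    prefs = exp_scores / exp_scores.sum()    # normalised preference distribution
    return prefs @ news_vecs, prefs          # weighted sum, and the weights

news_vecs = np.arange(6.0).reshape(3, 2)     # 3 browsed news items, d_n = 2
interest, prefs = user_content_interest(
    news_vecs, np.eye(2), np.zeros(2), np.zeros(2))
```

With an all-zero second layer every news item receives the same score, so the preference distribution is uniform and the interest vector reduces to the mean of the news vectors; trained parameters would instead concentrate weight on the news items most indicative of the user's interest.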
5. The personalized news recommendation method based on user global interest migration perception according to item 1, wherein step (4) comprises the following specific steps:
(4-1) User migration interest modeling based on global sequence perception is performed, which comprises two parts: constructing a global news migration graph, and characterizing migration interest based on migration perception;
First, the global news migration graph is constructed. The news items in a user's historical browsing sequence are linked into a chain in the order in which the user browsed them, and the chains are joined into a graph through the news nodes they share. The original browsing order of the news is the propagation order, representing propagation from the previous news item to the next; the order opposite to the browsing order is the influence order, representing the influence of the previous news item on the next. The relation between users and news is also added to the graph, and fusing the click relations between users and news gives the global news migration graph
Figure BDA0003882820280000069
where V_u denotes the set of user nodes, V_n the set of news nodes, E_C the set of click relation edges, E_P the set of propagation relation edges, and E_I the set of influence relation edges;
(4-2) After the global news migration graph is constructed, the user's migration interest in browsing news is modeled macroscopically based on the multiple relations in the graph. Formally, for a user u in the global news migration graph
Figure BDA00038828202800000610
Figure BDA00038828202800000611
the neighbor news set Nei(u, Click) = {n | n ∈ N, (u, n) ∈ E_C} is obtained according to the click relation; for a news item n, the neighbor user set Nei(n, Click) = {u | u ∈ U, (u, n) ∈ E_C} is obtained according to the click relation; the neighbor news set Nei(n, Propagate) = {n' | n' ∈ N, (n, n') ∈ E_P} is obtained according to the propagation relation; and the neighbor news set Nei(n, Influence) = {n' | n' ∈ N, (n, n') ∈ E_I} is obtained according to the influence relation. The migration interest learning model aggregates and learns information through a migration-aware graph attention network, namely Transition-GAT: the user node characterizations and the characterizations of the different neighbor news nodes are taken as input, and the nodes are aggregated separately according to their edge relations to obtain the user migration interest characterization, the news characterization, and the migration characterization. The first Transition-GAT layer aggregates the initial user and news vectors and mainly learns the relations between news items on the migration network; the second Transition-GAT layer aggregates the user and news vectors learned by the previous layer and mainly learns the migration relations between users and news;
The input of Transition-GAT is the user characterization matrix
Figure BDA0003882820280000071
and the news characterization matrix
Figure BDA0003882820280000072
where
Figure BDA0003882820280000073
represents the initial characterization vector of user u_i and
Figure BDA0003882820280000074
represents the characterization vector of news n_j. For the news Nei(u_i, Click) browsed by user u_i, the neighbor news are aggregated by weighting based on the graph attention mechanism to obtain a new user migration interest representation
Figure BDA0003882820280000075
The calculation formula is as follows:
Figure BDA0003882820280000076
Figure BDA0003882820280000077
Figure BDA0003882820280000078
where
Figure BDA0003882820280000079
is the attention score of neighbor news node n_j for user u_i,
Figure BDA00038828202800000710
is the degree of influence of neighbor news node n_j on user u_i,
Figure BDA00038828202800000711
is the attention score of neighbor news node n_k for user u_i,
Figure BDA00038828202800000712
is the transformation matrix parameter of the user feature vector in the first-layer Transition-GAT,
Figure BDA00038828202800000713
is the transformation matrix parameter of the news feature vector in the first-layer Transition-GAT, d_g is the dimension of the initial feature vectors of users and news, || denotes the concatenation operation, and
Figure BDA00038828202800000714
is the user migration interest representation obtained by weighted aggregation of all news in the set Nei(u_i, Click) using the attention weights;
For the neighbor user nodes Nei(n_i, Click) of the ith news node, the operation is similar to the user's aggregation of news: attention weights are computed and the neighbors are aggregated by weighting to obtain the news representation
Figure BDA00038828202800000715
Information is also aggregated over the propagation relation and the influence relation, and the corresponding news migration representations are obtained by computing weights. For news n_i, aggregation over the propagation relation and over the influence relation, each performed similarly to the graph aggregation described above, gives the news propagation representation
Figure BDA00038828202800000716
and the news influence representation
Figure BDA00038828202800000717
The migration representation of the news is obtained by aggregating its influence representation and propagation representation
Figure BDA00038828202800000718
The calculation is as follows:
Figure BDA0003882820280000081
Figure BDA0003882820280000082
where Nei(n_i, Propagate) denotes the set of neighbor news having a propagation relation with news node n_i,
Figure BDA0003882820280000083
is the attention score of neighbor news node n_k for news node n_i, and Nei(n_i, Influence) denotes the set of neighbor news having an influence relation with news node n_i. The news propagation vector
Figure BDA0003882820280000084
and the news influence vector
Figure BDA0003882820280000085
are obtained by weighted aggregation of the features of the other news nodes in the propagation relation and in the influence relation, respectively, and they explicitly characterize the relations between news items.
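The graph construction of step (4-1) and one attention aggregation of step (4-2) can be sketched together in NumPy. The original formulas are lost, so this follows the standard graph attention pattern as a stand-in: the attention vector `a`, the LeakyReLU nonlinearity, aggregation over transformed neighbor vectors, and the edge direction chosen for each neighbor set are all assumptions, as are the function names.

```python
import numpy as np

def neighbor_sets(histories):
    """Derive click, propagation and influence neighbors.

    histories: dict mapping a user id to news ids in browsing order
    (an assumed input format).
    """
    click, propagate, influence = {}, {}, {}
    for user, seq in histories.items():
        click[user] = set(seq)                       # Nei(u, Click)
        for prev_n, next_n in zip(seq, seq[1:]):
            # Propagation follows the browsing order.
            propagate.setdefault(prev_n, set()).add(next_n)
            # Influence runs opposite to the browsing order.
            influence.setdefault(next_n, set()).add(prev_n)
    return click, propagate, influence

def attention_aggregate(center, neighbors, W_c, W_n, a, slope=0.2):
    """Weighted aggregation of neighbor vectors around one center node."""
    h_c = W_c @ center                       # transformed center vector
    h_n = neighbors @ W_n.T                  # transformed neighbors, (k, d)
    # Concatenate the center with every neighbor: [h_c || h_n_j].
    pairs = np.hstack([np.tile(h_c, (len(h_n), 1)), h_n])
    scores = pairs @ a
    scores = np.where(scores > 0, scores, slope * scores)  # LeakyReLU
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()        # softmax over the neighbors
    return weights @ h_n, weights

histories = {"u1": ["n1", "n2", "n3"], "u2": ["n2", "n3", "n4"]}
click, propagate, influence = neighbor_sets(histories)

rng = np.random.default_rng(0)
d_g = 4
news_vec = {n: rng.normal(size=d_g) for n in ["n1", "n2", "n3", "n4"]}
user_vec = rng.normal(size=d_g)
W_u, W_n = rng.normal(size=(d_g, d_g)), rng.normal(size=(d_g, d_g))
a = rng.normal(size=2 * d_g)
nbrs = np.stack([news_vec[n] for n in sorted(click["u1"])])
g_u, alpha = attention_aggregate(user_vec, nbrs, W_u, W_n, a)
```

The same `attention_aggregate` call can be reused with `propagate[n]` or `influence[n]` as the neighbor set to obtain propagation and influence vectors for a news node, mirroring the per-relation aggregation the text describes.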
6. The personalized news recommendation method based on user global interest migration perception according to item 1, wherein step (5) comprises the following specific steps:
(5-1) Similarity is computed between the user's content interest representation and the candidate news content representation to obtain the user content interest score, and between the user's migration interest representation and the candidate news migration representation to obtain the user migration interest score. The user content interest score
Figure BDA0003882820280000086
and the user migration interest score
Figure BDA0003882820280000087
are calculated as follows:
Figure BDA0003882820280000088
Figure BDA0003882820280000089
where f_u denotes the user's content interest representation,
Figure BDA00038828202800000810
denotes the content representation of the candidate news, gf_u denotes the user's migration interest representation,
Figure BDA00038828202800000811
denotes the candidate news migration representation, and the operator in the formulas denotes the vector inner product;
(5-2) The content interest score and the migration interest score are combined by weighted summation to obtain the user's final interaction probability for the candidate news. The interaction probability of user u for candidate news c_i
Figure BDA00038828202800000812
is calculated as follows:
Figure BDA00038828202800000813
Figure BDA00038828202800000814
Figure BDA00038828202800000815
where
Figure BDA00038828202800000816
is the normalized content interest score of user u for candidate news c_i,
Figure BDA00038828202800000817
is the normalized migration interest score of user u for candidate news c_i, K is the length of the candidate news set of user u, and θ ∈ [0, 1]. The content interest probability and the migration interest probability are combined by weighting through θ to obtain the interaction probability, and finally a top-k list of recommended news from the candidate news set is returned, ranked by interaction probability.
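The joint scoring of steps (5-1) and (5-2) can be sketched in NumPy as below. Two points are assumptions because the formulas are lost: softmax is used as the normalization over the K candidates, and θ is placed on the content term with (1 − θ) on the migration term. The function name `recommend` is illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def recommend(f_u, cand_content, g_u, cand_migration, theta=0.5, k=2):
    """Rank K candidates by a weighted mix of two normalised scores.

    f_u: user content interest vector; cand_content: (K, d) candidate
    content vectors; g_u: user migration interest vector;
    cand_migration: (K, d') candidate migration vectors.
    """
    content_scores = cand_content @ f_u        # inner product per candidate
    migration_scores = cand_migration @ g_u
    prob = (theta * softmax(content_scores)
            + (1 - theta) * softmax(migration_scores))
    top_k = np.argsort(-prob)[:k]              # indices, highest probability first
    return top_k, prob

f_u = np.array([1.0, 0.0])
cand_content = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
g_u = np.array([0.0, 1.0])
cand_migration = np.array([[0.0, 1.0], [0.0, 0.0], [0.0, 3.0]])
top_k, prob = recommend(f_u, cand_content, g_u, cand_migration)
```

Normalizing each score over the candidate set before mixing keeps the two terms on a comparable scale, so θ controls their relative influence regardless of the raw magnitude of the inner products.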
7. The personalized news recommendation method based on user global interest migration perception according to claim 1, wherein in step (6) the system function display comprises visual displays of data analysis, experimental analysis and recommendation analysis; the data analysis comprises charts of the time distribution of news browsed by users, of the distribution of news topics and subtopics, and of the text length and cumulative distribution of news browsed by users; the experimental analysis comprises histograms of the comparison experiment and ablation experiment results of the algorithm on public data sets, and a visualization of the user's attention to news words; the recommendation analysis comprises an online news browsing interface, the recommended news list shown after a user clicks and browses a news item, a display of similar users and their browsing sequences, and a display of news migration relations.
8. A personalized news recommendation system based on user global interest migration perception, which runs the personalized news recommendation method based on user global interest migration perception according to any one of claims 1-7, comprising a data processing module, a service processing module and a visual analysis module;
the data processing module preprocesses the data set offline; the preprocessing is divided into news title preprocessing, user news browsing preprocessing, user-news graph construction and news relation graph construction, after which the preprocessed data are classified and stored;
the service processing module mainly handles system requests and calls the pre-trained model to generate data, and comprises three sub-modules, namely a user content interest mining module, a user migration interest mining module and a joint recommendation module, wherein the user content interest mining module models the user's content interest based on the historical news data and news headline data browsed by the user; the user migration interest mining module models the user's migration interest based on the user-news graph and the news relation graph; the joint recommendation module combines the user's content interest and migration interest to provide the recommendation service to the visualization layer;
the visual analysis module is the interactive interface through which the system serves users; it mainly presents the visual results of data analysis, experimental analysis and recommendation analysis, and provides interactive functions so that users can select news to browse and receive recommendations and analysis based on what they browse.
Beneficial effects: compared with the prior art, the technical scheme adopted by the invention has the following advantages:
(1) The invention designs a personalized news recommendation system based on user global interest migration perception that focuses on how user interest migrates while browsing news: starting from a global view, a global news migration graph is constructed from the historical browsing sequences of all users, and a migration-aware graph attention mechanism convolves over the click relation and the news migration relations; recommendation then combines the user's content interest and migration interest, thereby achieving high-precision personalized recommendation;
(2) When modeling the content information in news headlines, the invention uses a multi-view associated news content modeling method that represents news content by combining the individual views of the headline text and the headline entities with the associated view between them, so that news content information can be fully mined.
Drawings
FIG. 1 is an overall framework diagram of the algorithm of the present invention;
FIG. 2 is a schematic diagram of the news content modeling part of the algorithm;
FIG. 3 is a schematic diagram of the construction of the global news migration graph;
FIG. 4 is a schematic diagram of the user migration interest model based on sequence global perception;
FIG. 5 is a system architecture diagram of the present invention.
Detailed Description
The technical means of the present invention will be described in detail below.
The following is merely an example of the present invention. The invention may have various other embodiments, and those skilled in the art can make corresponding changes and modifications without departing from the spirit and essence of the invention; all such changes and modifications fall within the scope of the appended claims.
The invention discloses a personalized news recommendation method based on user global interest migration perception, which comprises the following steps of:
1. Data preprocessing of the user's historically browsed news and candidate news, and construction of the global news-to-news sequential relation network and the user-news click relation network
The news headline text field and headline entity field are extracted from the user's historically browsed news and from the candidate news, and the corresponding headline word vectors and entity vectors are obtained as initial vector representations from GloVe word vectors and from entity vectors trained on Wikipedia respectively; consecutively clicked news items in the user's historical browsing sequence are then connected to construct the global news-to-news sequential relation network.
2. Candidate news content characterization computation
Performing self-attention characterization calculation of a title text and an entity on candidate news from two different perspectives of a text perspective and an entity perspective respectively by using an attention mechanism, calculating cross-attention characterizations of the two perspective characterizations, and performing multi-perspective fusion calculation to obtain candidate news content characterizations by combining the respective characterizations of the two perspectives and the cross-attention characterizations between the two perspectives:
(2-1) For a given candidate news item n_i, the headline text sequence of the news is represented as
Figure BDA0003882820280000102
where T_i denotes the headline text of the i-th news item, w_{i,j} is the j-th word in the headline text of the i-th news item, and |T_i| is the total number of words in the headline text; the entity sequence of the news is represented as
Figure BDA0003882820280000103
where E_i denotes the headline entities of the i-th news item, e_{i,j} is the j-th entity in the headline of the i-th news item, and |E_i| is the total number of entities in the headline. A self-attention mechanism is used to learn the headline text sequence representation matrix of the news,
Figure BDA0003882820280000101
The calculation process is as follows:
Figure BDA0003882820280000111
Figure BDA0003882820280000112
Figure BDA0003882820280000113
where
Figure BDA0003882820280000114
is the word vector matrix of the word sequence, the superscript T denotes matrix transposition, d_w is the feature dimension of a word,
Figure BDA0003882820280000115
denotes the vector of word w_{i,j},
Figure BDA0003882820280000116
denotes the self-attention influence weight of the j-th word in news n_i, exp(·) denotes the exponential function with the natural constant e as its base,
Figure BDA0003882820280000117
is the self-attention representation weight of the j-th word in news n_i, k indexes the k-th word, and
Figure BDA0003882820280000118
is the self-attention representation of the word. Each word is updated according to the normalized weight vector to obtain the text sequence representation matrix
Figure BDA0003882820280000119
After that, a self-attention mechanism is used to compute the self-attention representation of the headline text of the i-th news item,
Figure BDA00038828202800001110
The same self-attention learning is applied to the entities in the headline to obtain the entity sequence representation matrix of the news headline,
Figure BDA00038828202800001111
where
Figure BDA00038828202800001112
denotes the self-attention representation of the j-th entity in the headline of the i-th news item, from which the self-attention representation of the entities in the news headline is derived,
Figure BDA00038828202800001113
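The word-level self-attention of step (2-1) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the formula images are not reproduced in this text, so the scaling by the square root of d_w and the row-wise softmax are assumptions based on the mention of transposition, d_w and exp(·), and the function names `softmax` and `self_attention` are hypothetical.

```python
import math


def softmax(values):
    """Normalize a list of scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(word_vectors):
    """Update every word vector as a weighted sum of all word vectors.

    word_vectors: list of |T_i| vectors, each of dimension d_w.
    Assumption: influence weights are inner products scaled by sqrt(d_w).
    """
    d_w = len(word_vectors[0])
    updated = []
    for query in word_vectors:
        # Influence weight of every word on the current word
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d_w)
            for key in word_vectors
        ]
        weights = softmax(scores)
        # Weighted sum of all word vectors gives the updated word
        updated.append([
            sum(w * vec[d] for w, vec in zip(weights, word_vectors))
            for d in range(d_w)
        ])
    return updated
```

The same routine could be applied to the entity vectors of a headline to obtain the entity sequence representation matrix.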
(2-2) Cross attention is computed between the headline text sequence representation matrix of the news
Figure BDA00038828202800001114
and the entity sequence representation matrix of the news
Figure BDA00038828202800001115
to obtain the degree of association between every word-entity pair. The row sums and column sums of the resulting association matrix are used as the weights of the words and of the entities respectively, and the features of the text set and of the entity set are then aggregated with these weights. The text-level cross-learning representation
Figure BDA00038828202800001116
and the entity-level cross-learning representation
Figure BDA00038828202800001117
are calculated as follows:
Figure BDA00038828202800001118
Figure BDA00038828202800001119
Figure BDA00038828202800001120
where
Figure BDA00038828202800001121
denotes the self-attention vector of word w_{i,j},
Figure BDA00038828202800001122
denotes the self-attention vector of entity e_{i,k},
Figure BDA00038828202800001123
denotes the cross-attention representation weight of the j-th word in news n_i,
Figure BDA00038828202800001124
denotes the cross-attention influence weight of the j-th word in news n_i,
Figure BDA00038828202800001125
denotes the cross-attention influence weight of the k-th word in news n_i,
Figure BDA00038828202800001126
denotes the cross-attention representation weight of the k-th word in news n_i, and d_w and d_e are the word and entity feature dimensions respectively. The entity-level cross-learning representation
Figure BDA0003882820280000121
is calculated in the same way as the text-level cross-learning representation
Figure BDA0003882820280000122
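A minimal sketch of the cross attention in step (2-2) is given below. It assumes that words and entities share one feature dimension (d_w = d_e), that associations are scaled inner products, and that the row and column sums are normalized with a softmax; none of these details can be read from the text alone, and the name `cross_attention` is hypothetical.

```python
import math


def softmax(values):
    """Normalize a list of scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(word_vecs, entity_vecs):
    """Return (text-level, entity-level) cross-learning vectors.

    Assumption: word and entity vectors have the same dimension d.
    """
    d = len(word_vecs[0])
    # association[j][k]: degree of association between word j and entity k
    association = [
        [sum(a * b for a, b in zip(w, e)) / math.sqrt(d) for e in entity_vecs]
        for w in word_vecs
    ]
    # Row sums weight the words, column sums weight the entities
    word_weights = softmax([sum(row) for row in association])
    entity_weights = softmax([sum(col) for col in zip(*association)])
    text_cross = [
        sum(wt * w[i] for wt, w in zip(word_weights, word_vecs))
        for i in range(d)
    ]
    entity_cross = [
        sum(wt * e[i] for wt, e in zip(entity_weights, entity_vecs))
        for i in range(d)
    ]
    return text_cross, entity_cross
```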
(2-3) The self-attention representation of the headline text obtained in step (2-1),
Figure BDA0003882820280000123
the self-attention representation of the headline entities,
Figure BDA0003882820280000124
the text-level cross-learning representation obtained in step (2-2),
Figure BDA0003882820280000125
and the entity-level cross-learning representation
Figure BDA0003882820280000126
are added and concatenated to obtain the multi-view representation vector of the news content. The multi-view representation vector of the i-th news item,
Figure BDA0003882820280000127
is calculated as follows:
Figure BDA0003882820280000128
where + denotes element-wise addition of two vectors and || denotes concatenation. The content representation vector of the i-th news item,
Figure BDA0003882820280000129
is calculated as follows:
Figure BDA00038828202800001210
where
Figure BDA00038828202800001211
is the weight of the linear layer and
Figure BDA00038828202800001212
is the bias of the linear layer.
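The fusion in step (2-3) might look like the sketch below. Which vectors are added and which are concatenated cannot be recovered from the text, so the pairing used here (self-attention plus cross-learning vector within each view, then concatenation of the two views) is an assumption, as are the function name `fuse_views` and the explicit weight and bias arguments.

```python
def fuse_views(text_self, text_cross, entity_self, entity_cross, weight, bias):
    """Fuse the four view vectors and project them with one linear layer.

    Assumed pairing: (text_self + text_cross) || (entity_self + entity_cross).
    weight: list of rows, each as long as the concatenated vector.
    bias: one value per output dimension.
    """
    text_view = [a + b for a, b in zip(text_self, text_cross)]
    entity_view = [a + b for a, b in zip(entity_self, entity_cross)]
    multi_view = text_view + entity_view  # list concatenation plays the role of ||
    # Linear layer: content vector = weight * multi_view + bias
    return [
        sum(w * x for w, x in zip(row, multi_view)) + b
        for row, b in zip(weight, bias)
    ]
```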
The overall implementation process of news content representation calculation based on multi-view association is shown as algorithm 1:
Figure BDA00038828202800001213
Figure BDA0003882820280000131
3. User content interest characterization computation
Performing self-attention characterization calculation of a title text and an entity on historical browsing news of a user from two different perspectives of a text perspective and an entity perspective respectively by using an attention mechanism, calculating cross attention characterizations of the two perspective characterizations, performing multi-perspective fusion calculation by combining the respective characterizations of the two perspectives and the cross attention characterizations between the two perspectives to obtain content characterizations of the historical browsing news of the user, and aggregating the content characterizations of the historical browsing news of the user based on the attention mechanism to obtain the content interest characterizations of the user; the method comprises the following specific steps:
(3-1) Self-attention extraction over the user's browsing sequence: news content representations are computed for the news sequence historically browsed by the user to obtain the news sequence matrix
Figure BDA0003882820280000132
where
Figure BDA0003882820280000133
is the content representation vector of the first news item n_1 browsed by user u_i, and
Figure BDA0003882820280000139
denotes the length of the user's historical browsing sequence. A self-attention mechanism is applied to the matrix built from the news sequence vectors to exchange information among the news items in the sequence, giving the news self-attention matrix
Figure BDA0003882820280000134
which is normalized by the Softmax function to obtain the weight matrix
Figure BDA0003882820280000135
The weight matrix is then multiplied by the corresponding set of news vectors, so that each news item is related to the other news items, giving the self-attention vector matrix of the user's historically browsed news
Figure BDA0003882820280000136
where
Figure BDA0003882820280000137
is the self-attention vector of user u_i for news n_j;
(3-2) At the user level, an attention mechanism aggregates the self-attention vectors of the news to characterize the user's content interest preferences. The user content interest representation
Figure BDA0003882820280000138
is calculated as follows:
Figure BDA0003882820280000141
Figure BDA0003882820280000142
where
Figure BDA0003882820280000143
denotes the degree of preference of user u_i for news n_j in the historical browsing sequence;
Figure BDA0003882820280000144
is the parameter of the first linear layer, whose dimension matches that of the news content vector, and its output passes through a Tanh activation for a nonlinear transformation before being fed into the second linear layer;
Figure BDA0003882820280000145
is the parameter of the second linear layer, which maps the dimension d_n to 1 so that each news vector corresponds to one weight. Normalizing the weights of all historically browsed news gives the user's preference distribution over the news in the historical browsing sequence. Finally, the self-attention vector matrix of the news
Figure BDA0003882820280000146
is weighted and summed to obtain the user content interest representation
Figure BDA0003882820280000147
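The user-level aggregation of step (3-2) can be illustrated with the following sketch: a first linear layer followed by Tanh, a second linear layer mapping to a single score, softmax normalization over the browsing history, and a weighted sum. The bias terms and the names `attention_pool`, `w1`, `b1`, `w2` and `b2` are assumptions introduced for illustration.

```python
import math


def attention_pool(news_vectors, w1, b1, w2, b2):
    """Aggregate news self-attention vectors into one user interest vector.

    w1, b1: first linear layer (d_n -> d_n); w2, b2: second layer (d_n -> 1).
    Returns (user_interest_vector, preference_weights).
    """
    scores = []
    for h in news_vectors:
        # First linear layer followed by the Tanh nonlinearity
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(w1, b1)
        ]
        # Second linear layer maps the hidden vector to one score
        scores.append(sum(w * x for w, x in zip(w2, hidden)) + b2)
    # Normalize the scores into a preference distribution over the history
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    prefs = [e / total for e in exps]
    d_n = len(news_vectors[0])
    interest = [
        sum(p * h[d] for p, h in zip(prefs, news_vectors)) for d in range(d_n)
    ]
    return interest, prefs
```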
4. User migration interest characterization computation
Fusing the global news and news sequential relationship network constructed in the step (1) and the click relationship network between the users and the news to construct a global news migration graph, taking the user node representations and different neighbor news node representations in the global news migration graph as input, using a two-layer migration perception graph attention network to perform information aggregation and learning, and finally obtaining user migration interest representations, news representations and migration representations, wherein the migration representations comprise a propagation representation and an influence representation; the method comprises the following specific steps:
(4-1) performing user migration interest modeling based on sequence global perception, wherein the user migration interest modeling comprises two parts, namely building a global news migration graph and characterizing migration interest based on migration perception;
First, the global news migration graph is constructed. The news items in each user's historical browsing sequence are linked into a chain in the order in which the user browsed them, and the chains are joined into a graph through the news nodes they share. The original browsing order is the propagation order of the news, indicating propagation from the previous news item to the next; the reverse of the browsing order is the influence order of the news, indicating the influence of the previous news item on the next. The relation between users and news is also added to the graph: fusing the click relation between users and news yields the global news migration graph
Figure BDA0003882820280000148
where V_u denotes the set of user nodes, V_n the set of news nodes, E_C the set of click-relation edges, E_P the set of propagation-relation edges, and E_I the set of influence-relation edges;
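The graph construction described above can be sketched as follows. The input format (a mapping from user id to an ordered list of news ids), the function name `build_migration_graph`, and the choice to store an influence edge as the reverse of the matching propagation edge are illustrative assumptions.

```python
def build_migration_graph(histories):
    """Build the node and edge sets of a global news migration graph.

    histories: dict mapping a user id to the list of news ids the user
    browsed, in browsing order (hypothetical input format).
    Returns (users, news_nodes, click_edges, propagate_edges, influence_edges).
    """
    click_edges, propagate_edges, influence_edges = set(), set(), set()
    for user, sequence in histories.items():
        for news in sequence:
            click_edges.add((user, news))  # E_C: user clicked news
        for prev_news, next_news in zip(sequence, sequence[1:]):
            propagate_edges.add((prev_news, next_news))  # E_P: browsing order
            influence_edges.add((next_news, prev_news))  # E_I: reverse order
    users = set(histories)
    news_nodes = {news for _, news in click_edges}
    return users, news_nodes, click_edges, propagate_edges, influence_edges
```

Because the edges are kept in sets, chains from different users that pass through the same news node merge automatically into one graph.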
(4-2) After the global news migration graph is constructed, the user's migration interest in browsing news is modeled macroscopically on the basis of the multiple relations in the graph. Formally, for a user u in the global news migration graph
Figure BDA0003882820280000149
Figure BDA00038828202800001410
the neighbor news set Nei(u, Click) = {n | n ∈ N, (u, n) ∈ E_C} is obtained from the click relation; for a news item n, the neighbor user set Nei(n, Click) = {u | u ∈ U, (u, n) ∈ E_C} is obtained from the click relation, the neighbor news set Nei(n, Propagate) = {n' | n' ∈ N, (n, n') ∈ E_P} from the propagation relation, and the neighbor news set Nei(n, Influence) = {n' | n' ∈ N, (n, n') ∈ E_I} from the influence relation. The migration interest learning model aggregates and learns information through a migration-aware graph attention network, Transition-GAT. It takes the user node representations and the representations of the different neighbor news nodes as input and aggregates the nodes according to their edge relations to obtain the user migration interest representation, the news representation and the migration representation. The first Transition-GAT layer aggregates the initial user and news vectors and mainly learns the news-to-news relations on the migration network; the second Transition-GAT layer aggregates the user and news vectors learned by the first layer and mainly learns the migration relations between users and news;
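The neighbor sets defined above reduce to simple lookups over the edge sets. The sketch below assumes edges are stored as (source, target) pairs, with click edges stored as (user, news); the helper names `neighbors` and `neighbor_users` are hypothetical.

```python
def neighbors(node, edges):
    """Nei(node, relation): targets of every edge that starts at `node`.

    Works for Nei(u, Click), Nei(n, Propagate) and Nei(n, Influence) when
    `edges` is the matching edge set of (source, target) pairs.
    """
    return {target for source, target in edges if source == node}


def neighbor_users(news, click_edges):
    """Nei(n, Click): users that clicked `news`, given (user, news) pairs."""
    return {user for user, clicked in click_edges if clicked == news}
```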
The input of Transition-GAT is the user representation matrix
Figure BDA0003882820280000151
and the news representation matrix
Figure BDA0003882820280000152
where
Figure BDA0003882820280000153
denotes the initial representation vector of user u_i and
Figure BDA0003882820280000154
denotes the representation vector of news n_j. For the news Nei(u_i, Click) browsed by user u_i, the neighbor news are aggregated with weights given by the graph attention mechanism to obtain a new user migration interest representation
Figure BDA0003882820280000155
The calculation formula is as follows:
Figure BDA0003882820280000156
Figure BDA0003882820280000157
Figure BDA0003882820280000158
where
Figure BDA0003882820280000159
denotes the attention score of neighbor news node n_j for user u_i,
Figure BDA00038828202800001510
denotes the degree of influence of neighbor news node n_j on user u_i,
Figure BDA00038828202800001511
denotes the attention score of neighbor news node n_k for user u_i,
Figure BDA00038828202800001512
is the transformation matrix parameter of the user feature vector in the first Transition-GAT layer,
Figure BDA00038828202800001513
is the transformation matrix parameter of the news feature vector in the first Transition-GAT layer, d_g is the dimension of the initial feature vectors of users and news, || denotes concatenation, and
Figure BDA00038828202800001514
is the user migration interest representation obtained by weighted aggregation of all news in the set Nei(u_i, Click) using the attention weights;
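The user-side aggregation can be sketched in the style of a standard graph attention layer. The text confirms the two transformation matrices, the concatenation, the softmax-style normalization over neighbors and the weighted aggregation; the attention vector `attn`, the LeakyReLU activation and the aggregation of transformed (rather than raw) news vectors are assumptions borrowed from the usual GAT formulation, and all names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def leaky_relu(x, slope=0.2):
    """LeakyReLU activation (assumed, as in standard GAT)."""
    return x if x > 0 else slope * x


def aggregate_user(user_vec, neighbor_news, w_user, w_news, attn):
    """Aggregate the clicked news of one user with attention weights.

    attn: attention vector applied to [W_u h_u || W_n h_n] (assumption).
    Returns (new_user_vector, attention_weights).
    """
    u = matvec(w_user, user_vec)
    transformed = [matvec(w_news, n) for n in neighbor_news]
    # Attention score of every neighbor news node for this user
    scores = [
        leaky_relu(sum(a * x for a, x in zip(attn, u + t)))
        for t in transformed
    ]
    # Normalize the scores over the neighbor set
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(transformed[0])
    new_user = [
        sum(al * t[d] for al, t in zip(alphas, transformed))
        for d in range(dim)
    ]
    return new_user, alphas
```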
For the neighbor user nodes Nei(n_i, Click) of the i-th news node, the operation is similar to the user's aggregation of news: attention weights are computed and a weighted sum gives the news representation
Figure BDA00038828202800001515
The news node also aggregates information over the propagation relation and the influence relation, computing weights to obtain the corresponding news migration representations. For news n_i, applying a graph aggregation similar to the one above to the propagation relation and to the influence relation separately gives the news propagation representation
Figure BDA00038828202800001516
and the news influence representation
Figure BDA00038828202800001517
The migration representation of the news, obtained by aggregating its influence representation and propagation representation,
Figure BDA00038828202800001518
is calculated as follows:
Figure BDA0003882820280000161
Figure BDA0003882820280000162
where Nei(n_i, Propagate) denotes the set of neighbor news that have a propagation relation with news node n_i,
Figure BDA0003882820280000163
denotes the attention score of neighbor news node n_k for news node n_i, and Nei(n_i, Influence) denotes the set of neighbor news that have an influence relation with news node n_i. The news propagation vector
Figure BDA0003882820280000164
and the news influence vector
Figure BDA0003882820280000165
are obtained by weighted aggregation of the features of the other news nodes connected through the propagation relation and the influence relation respectively, and explicitly represent the relations between news items.
The specific implementation algorithm of the user migration interest representation calculation based on the user global interest migration perception is as follows:
Figure BDA0003882820280000166
5. Joint recommendation model
Combining the candidate news content characterization obtained in the step (2), the user content interest characterization obtained in the step (3), the user migration interest characterization, the news characterization and the migration characterization obtained in the step (4), constructing a joint recommendation module and recommending news, performing similarity calculation on the user content interest characterization obtained in the step (3) and the candidate news content characterization obtained in the step (2) to obtain a user content interest score, and performing similarity calculation on the user migration interest characterization, the candidate news characterization and the migration characterization to obtain a user migration interest score; weighting and summing the user content interest scores and the user migration interest scores to obtain the interaction probability of the final user on the candidate news, and finally returning a top-k news list recommended in the candidate news set according to the sequencing of the interaction probability; the method specifically comprises the following steps:
(5-1) Compute the similarity between the user's content interest representation and the candidate news content representation to obtain the user content interest score, and compute the similarity between the user's migration interest representation and the candidate news migration representation to obtain the user migration interest score. The user content interest score
Figure BDA0003882820280000176
and the user migration interest score
Figure BDA0003882820280000177
are calculated as follows:
Figure BDA0003882820280000178
Figure BDA0003882820280000179
where f_u denotes the user's content interest representation,
Figure BDA00038828202800001711
denotes the content representation of candidate news i, gf_u denotes the user's migration interest representation,
Figure BDA00038828202800001710
denotes the migration representation of the candidate news, and the product in the two formulas above is the vector inner product;
(5-2) Compute a weighted sum of the content interest score and the migration interest score to obtain the user's final interaction probability for the candidate news. The interaction probability p_{u,c_i} of user u for candidate news c_i is calculated as follows:
Figure BDA0003882820280000171
Figure BDA0003882820280000172
Figure BDA0003882820280000173
where
Figure BDA0003882820280000174
denotes the normalized content interest score of user u for candidate news c_i,
Figure BDA0003882820280000175
denotes the normalized migration interest score of user u for candidate news c_i, K is the size of the candidate news set of user u, and θ ∈ [0, 1]. The content interest probability and the migration interest probability are combined through the weight θ to obtain the interaction probability, and finally the top-k news items of the candidate news set, ranked by interaction probability, are returned as the recommendation list.
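Steps (5-1) and (5-2) can be sketched end to end as below. The inner-product scores and the θ-weighted combination follow the description; the softmax normalization over the K candidates and the assignment of θ to the content term (with 1 - θ on the migration term) are assumptions, since the formula images are not reproduced here, and the function names are hypothetical.

```python
import math


def softmax(values):
    """Normalize a list of scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Vector inner product."""
    return sum(x * y for x, y in zip(a, b))


def recommend(f_u, gf_u, cand_content, cand_migration, theta=0.5, top_k=2):
    """Rank K candidates by a weighted mix of content and migration interest.

    f_u, gf_u: user content and migration interest vectors.
    cand_content, cand_migration: one vector per candidate news item.
    Returns (indices of the top_k candidates, interaction probabilities).
    """
    content_probs = softmax([dot(f_u, c) for c in cand_content])
    migration_probs = softmax([dot(gf_u, g) for g in cand_migration])
    # Assumed mixing: theta weights content, (1 - theta) weights migration
    probs = [
        theta * c + (1 - theta) * g
        for c, g in zip(content_probs, migration_probs)
    ]
    ranking = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranking[:top_k], probs
```

With θ = 1 the ranking depends only on content interest, and with θ = 0 only on migration interest, which makes θ a convenient knob for ablation-style comparisons.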
6. System function display
The system function display of the personalized news recommendation method based on the user global interest migration perception comprises visual display of data analysis, experimental analysis and recommendation analysis, wherein the data analysis comprises time distribution of news browsing of a user, effect graph display of the distribution of the topics and the subtopics of the news, and effect graph display of text length and cumulative distribution of the news browsing of the user; the experimental analysis comprises contrast experiment and ablation experiment result histogram display of the algorithm on the public data set, and visual display of the attention of the user to the news words; the recommendation analysis comprises online news browsing interface display, recommended news list display after a user clicks browsing news, similar users and browsing sequence display and news migration relation display of the similar users.
The personalized news recommendation system based on the user global interest migration perception, which is operated by the personalized news recommendation method based on the user global interest migration perception, comprises a data processing module, a service processing module and a visual analysis module;
the data processing module preprocesses the data set offline; the preprocessing is divided into news title preprocessing, user news browsing preprocessing, user-news graph construction and news relation graph construction, after which the preprocessed data are classified and stored;
the service processing module mainly handles system requests and calls the pre-trained model to generate data, and comprises three sub-modules, namely a user content interest mining module, a user migration interest mining module and a joint recommendation module, wherein the user content interest mining module models the user's content interest based on the historical news data and news headline data browsed by the user; the user migration interest mining module models the user's migration interest based on the user-news graph and the news relation graph; the joint recommendation module combines the user's content interest and migration interest to provide the recommendation service to the visualization layer;
the visual analysis module is the interactive interface through which the system serves users; it mainly presents the visual results of data analysis, experimental analysis and recommendation analysis, and provides interactive functions so that users can select news to browse and receive recommendations and analysis based on what they browse.

Claims (8)

1. A personalized news recommendation method based on user global interest migration perception is characterized by comprising the following steps:
(1) Carrying out data preprocessing on historical browsing news and candidate news of a user, and constructing a global sequence relation network of news and a click relation network between the user and news;
(2) Candidate news content characterization computation
Performing self-attention characterization calculation of a title text and an entity on candidate news from two different perspectives of a text perspective and an entity perspective respectively by using an attention mechanism, calculating cross attention characterizations of the two perspective characterizations, and performing multi-perspective fusion calculation to obtain candidate news content characterizations by combining the respective characterizations of the two perspectives and the cross attention characterizations between the two perspectives;
(3) User content interest characterization computation
Performing self-attention characterization calculation of a title text and an entity on historical browsing news of a user from two different perspectives of a text perspective and an entity perspective respectively by using an attention mechanism, calculating cross attention characterizations of the two perspective characterizations, performing multi-perspective fusion calculation by combining the respective characterizations of the two perspectives and the cross attention characterizations between the two perspectives to obtain content characterizations of the historical browsing news of the user, and aggregating the content characterizations of the historical browsing news of the user based on the attention mechanism to obtain the content interest characterizations of the user;
(4) User migration interest characterization computation
Fusing the global news and news sequential relationship network constructed in the step (1) and the click relationship network between the users and the news to construct a global news migration graph, taking the user node representations and different neighbor news node representations in the global news migration graph as input, using a two-layer migration perception graph attention network to perform information aggregation and learning, and finally obtaining user migration interest representations, news representations and migration representations, wherein the migration representations comprise a propagation representation and an influence representation;
(5) Joint recommendation model
Combining the candidate news content characterization obtained in the step (2), the user content interest characterization obtained in the step (3), the user migration interest characterization obtained in the step (4), the news characterization and the migration characterization, constructing a joint recommendation module and recommending news, performing similarity calculation on the user content interest characterization obtained in the step (3) and the candidate news content characterization obtained in the step (2) to obtain a user content interest score, and performing similarity calculation on the user migration interest characterization, the candidate news characterization and the migration characterization to obtain a user migration interest score; weighting and summing the user content interest scores and the user migration interest scores to obtain the interaction probability of the final user on the candidate news, and finally returning a top-k news list recommended in the candidate news set according to the sequencing of the interaction probability;
(6) System function display.
2. The personalized news recommendation method based on user global interest migration perception according to claim 1, wherein in step (1) the news headline text field and headline entity field are extracted from the user's historically browsed news and from the candidate news, and the corresponding headline word vectors and entity vectors are obtained as initial vector representations from GloVe word vectors and from entity vectors trained on Wikipedia respectively; consecutively clicked news items in the user's historical browsing sequence are then connected to construct the global news-to-news sequential relation network.
3. The personalized news recommendation method based on the user global interest migration perception according to claim 1, wherein the step (2) comprises the following specific steps:
(2-1) For a given candidate news n_i, the headline text sequence of the news is represented as
Figure FDA0003882820270000021
wherein T_i denotes the headline text of the i-th news, w_{i,j} denotes the j-th word in the headline text of the i-th news, and |T_i| is the total number of words in the headline text; the entity sequence of the news is represented as
Figure FDA0003882820270000022
wherein E_i denotes the headline entities of the i-th news, e_{i,j} denotes the j-th entity in the headline of the i-th news, and |E_i| is the total number of entities in the headline; a self-attention mechanism is used to learn the headline text sequence representation matrix of the news
Figure FDA0003882820270000023
The calculation process is as follows:
Figure FDA0003882820270000024
Figure FDA0003882820270000025
Figure FDA0003882820270000026
wherein
Figure FDA0003882820270000027
is the word vector matrix of the word sequence, the superscript T denotes matrix transposition, d_w is the feature dimension of a word,
Figure FDA0003882820270000028
is the vector representation of the word w_{i,j},
Figure FDA0003882820270000029
is the self-attention influence weight of the j-th word in news n_i, exp(·) denotes the exponential function with the natural constant e as its base,
Figure FDA00038828202700000210
is the self-attention representation weight of the j-th word in news n_i, where k indexes the k-th word,
Figure FDA00038828202700000211
is the self-attention representation of the word; each word is updated according to the normalized weight vector to obtain the text sequence representation matrix
Figure FDA00038828202700000212
Thereafter, a self-attention mechanism is used to compute the self-attention representation of the i-th news headline text
Figure FDA00038828202700000213
Likewise, self-attention learning is applied to the entity sequence in the headline to obtain the entity sequence representation matrix of the news headline
Figure FDA00038828202700000214
Wherein
Figure FDA00038828202700000215
is the representation of an entity in the headline of the i-th news, and a self-attention representation of the entities in the news headline is obtained as
Figure FDA00038828202700000216
(2-2) Cross attention is applied between the headline text sequence representation matrix of the news
Figure FDA00038828202700000217
and the entity sequence representation matrix of the news
Figure FDA00038828202700000218
to obtain the degree of association between every word and every entity; the rows and the columns of the association matrix are summed to give the weights of the words and of the entities, respectively, and the features of the text set and the entity set are then aggregated by these weights to obtain the text-level cross-learning representation
Figure FDA00038828202700000219
and the entity-level cross-learning representation
Figure FDA0003882820270000031
The specific calculation process is as follows:
Figure FDA0003882820270000032
Figure FDA0003882820270000033
Figure FDA0003882820270000034
wherein
Figure FDA0003882820270000035
is the self-attention vector of the word w_{i,j},
Figure FDA0003882820270000036
is the self-attention vector of the entity e_{i,k},
Figure FDA0003882820270000037
is the cross-attention representation weight of the j-th word in news n_i,
Figure FDA0003882820270000038
is the cross-attention influence weight of the j-th word in news n_i,
Figure FDA0003882820270000039
is the cross-attention influence weight of the k-th word in news n_i,
Figure FDA00038828202700000310
is the cross-attention representation weight of the k-th word in news n_i, and d_w and d_e are the word and entity feature dimensions, respectively; the entity-level cross-learning representation
Figure FDA00038828202700000311
is calculated in the same way as the text-level cross-learning representation
Figure FDA00038828202700000312
(2-3) The self-attention representation of the headline text obtained in step (2-1)
Figure FDA00038828202700000313
the self-attention representation of the headline entities
Figure FDA00038828202700000314
the text-level cross-learning representation obtained in step (2-2)
Figure FDA00038828202700000315
and the entity-level cross-learning representation
Figure FDA00038828202700000316
are added and concatenated to obtain the news content representation vector; the multi-view representation vector of the i-th news
Figure FDA00038828202700000317
is calculated as follows:
Figure FDA00038828202700000318
wherein + denotes element-wise addition of two vectors and || denotes the concatenation operation; the content representation vector of the i-th news
Figure FDA00038828202700000319
is calculated as follows:
Figure FDA00038828202700000320
wherein
Figure FDA00038828202700000321
is the weight of the linear layer, and
Figure FDA00038828202700000322
is the bias of the linear layer.
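The cross-attention idea of step (2-2) in claim 3 can be illustrated with a small sketch. The exact formulas are not recoverable from the text, so the following rests on stated assumptions: words and entities share one feature dimension, the association matrix is a dot product scaled by the square root of that dimension, and the row and column sums are normalized with a softmax. All function names are hypothetical.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def cross_attention(word_vecs, entity_vecs):
    """Return (text-level, entity-level) cross-learning representations."""
    scale = math.sqrt(len(word_vecs[0]))  # assumed scaling factor
    # assoc[j][k] is the association between word j and entity k
    assoc = [[dot(w, e) / scale for e in entity_vecs] for w in word_vecs]
    word_scores = [sum(row) for row in assoc]          # row sums
    entity_scores = [sum(col) for col in zip(*assoc)]  # column sums
    text_repr = weighted_sum(softmax(word_scores), word_vecs)
    entity_repr = weighted_sum(softmax(entity_scores), entity_vecs)
    return text_repr, entity_repr

words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
entities = [[1.0, 0.0], [1.0, 0.0]]
text_repr, entity_repr = cross_attention(words, entities)
```

In this toy input both entities point along the first axis, so words with a component on that axis receive larger weights and the text-level representation leans toward it.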
4. The personalized news recommendation method based on the user global interest migration perception according to claim 1, wherein the step (3) comprises the following specific steps:
(3-1) Self-attention extraction of user sequence content: the news content representation is computed for each item in the user's historical browsing news sequence to obtain the news sequence matrix
Figure FDA00038828202700000323
Wherein
Figure FDA00038828202700000324
is the content representation vector of the first news n_1 browsed by user u_i,
Figure FDA00038828202700000325
is the length of the user's historical browsing sequence; a self-attention mechanism is applied to the matrix formed by the set of news sequence vectors to update information across the news sequence, yielding the news self-attention matrix
Figure FDA00038828202700000326
which is normalized by the Softmax function to obtain the weight matrix
Figure FDA00038828202700000327
the weight matrix is then multiplied by the corresponding set of news vectors to relate each news item to the others, giving the self-attention vector matrix of the user's historical browsing news
Figure FDA0003882820270000041
Wherein
Figure FDA0003882820270000042
is the self-attention vector of user u_i for news n_j;
(3-2) An attention mechanism is used to weight and aggregate the set of news self-attention vectors at the user level to characterize the user's content interest preference
Figure FDA0003882820270000043
The calculation is as follows:
Figure FDA0003882820270000044
Figure FDA0003882820270000045
wherein
Figure FDA0003882820270000046
represents the degree of preference of user u_i for news n_j in the historical browsing sequence,
Figure FDA0003882820270000047
are the parameters of the first linear layer, whose dimension matches that of the news content vector; its output passes through a Tanh activation for a nonlinear transformation and is fed into the second linear layer;
Figure FDA0003882820270000048
are the parameters of the second linear layer, which maps the dimension d_n to 1 so that each news vector corresponds to one weight; normalizing the weights of all historically browsed news gives the distribution of the user's preference over the news in the historical browsing sequence; finally, the news self-attention vector matrix
Figure FDA0003882820270000049
is weighted and summed to derive the user content interest representation
Figure FDA00038828202700000410
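The two-layer attention pooling described in step (3-2) of claim 4 can be sketched as below. The structure (a linear layer, Tanh, a second linear layer mapping to a scalar, normalization, weighted sum) follows the text; the use of softmax for the normalization, the parameter names `w1`, `b1`, `w2` and the toy values are assumptions for illustration only.

```python
import math

def attention_pool(news_vecs, w1, b1, w2):
    """Pool news vectors into one user vector with two-layer attention.

    w1, b1: first linear layer (d_n x d_n weights and bias), followed by tanh.
    w2:     second linear layer mapping d_n to a single score per news item.
    """
    scores = []
    for vec in news_vecs:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, vec)) + b)
                  for row, b in zip(w1, b1)]
        scores.append(sum(w * h for w, h in zip(w2, hidden)))
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # preference distribution over history
    dim = len(news_vecs[0])
    user_vec = [sum(w * v[d] for w, v in zip(weights, news_vecs))
                for d in range(dim)]
    return weights, user_vec

news_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w1 = [[1.0, 0.0], [0.0, 1.0]]  # toy parameters, not learned values
b1 = [0.0, 0.0]
w2 = [1.0, 1.0]
weights, user_vec = attention_pool(news_vecs, w1, b1, w2)
```

The returned `weights` sum to one and can be read as the user's preference distribution over the browsed news, while `user_vec` plays the role of the user content interest representation.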
5. The personalized news recommendation method based on the user global interest migration perception according to claim 1, wherein the step (4) comprises the following specific steps:
(4-1) performing user migration interest modeling based on sequence global perception, which comprises two parts: construction of the global news migration graph, and migration-aware migration interest representation;
first, the global news migration graph is constructed: the news in each user's historical browsing sequence is linked into a chain in the order the user browsed it, and the chains are joined into a graph through the news nodes they share; the original browsing order is the propagation order of the news, representing propagation from the previous news to the next news, and the reverse of the browsing order is the influence order of the news, representing the influence of the previous news on the next news; meanwhile, the relation between users and news is added to the graph, and fusing in the user-news click relation gives the global news migration graph
Figure FDA00038828202700000411
wherein V_u denotes the set of user nodes, V_n the set of news nodes, E_C the set of click-relation edges, E_P the set of propagation-relation edges, and E_I the set of influence-relation edges;
(4-2) after the global news migration graph is constructed, the user's migration interest in browsing news is modeled at the macro level based on the multiple relations in the graph; formally, for a user u in the global news migration graph
Figure FDA00038828202700000412
Figure FDA00038828202700000413
the neighbor news set Nei(u, Click) = {n | n ∈ N, (u, n) ∈ E_C} is obtained according to the click relation; for a news n, the neighbor user set Nei(n, Click) = {u | u ∈ U, (u, n) ∈ E_C} is obtained according to the click relation, the neighbor news set Nei(n, Propagate) = {n' | n' ∈ N, (n, n') ∈ E_P} according to the propagation relation, and the neighbor news set Nei(n, Influence) = {n' | n' ∈ N, (n, n') ∈ E_I} according to the influence relation; the migration interest learning model performs information aggregation and learning through a migration-aware graph attention network, namely Transition-GAT, whose inputs are the user node representations and the different neighbor news node representations; each node aggregates information separately along each of its edge relations to obtain the user migration interest representation, the news representation and the migration representation; the first Transition-GAT layer takes the initial user and news vectors and mainly learns the news-to-news relations on the migration network, and the second Transition-GAT layer takes the user and news vectors learned by the first layer and mainly learns the migration relations between users and news;
the input of Transition-GAT is a user representation matrix
Figure FDA0003882820270000051
and a news representation matrix
Figure FDA0003882820270000052
Wherein
Figure FDA0003882820270000053
is the initial representation vector of user u_i,
Figure FDA0003882820270000054
represents news n_j; for the news Nei(u_i, Click) browsed by user u_i, a new user migration interest representation is obtained by weighted aggregation of the neighbor news based on the graph attention mechanism
Figure FDA0003882820270000055
The calculation formula is as follows:
Figure FDA0003882820270000056
Figure FDA0003882820270000057
Figure FDA0003882820270000058
wherein
Figure FDA0003882820270000059
is the attention score of neighbor news node n_j for user u_i,
Figure FDA00038828202700000510
is the degree of influence of neighbor news node n_j on user u_i,
Figure FDA00038828202700000511
is the attention score of neighbor news node n_k for user u_i,
Figure FDA00038828202700000512
is the transformation matrix parameter of the user feature vector in the first Transition-GAT layer,
Figure FDA00038828202700000513
is the transformation matrix parameter of the news feature vector in the first Transition-GAT layer, d_g is the dimension of the initial feature vectors of users and news, || denotes the concatenation operation, and
Figure FDA00038828202700000514
is the user migration interest representation obtained by weighted aggregation of all news in the set Nei(u_i, Click) using the attention weights;
for the neighbor user nodes Nei(n_i, Click) of the i-th news node, the operation is similar to the user's news aggregation: the news representation is obtained by computing attention weights and a weighted sum
Figure FDA00038828202700000515
For the propagation and influence relations, information is collected along both relations at once and weights are computed to obtain the corresponding news migration representation: for news n_i, applying an aggregation similar to the graph aggregation above to the propagation relation and to the influence relation separately gives the news propagation representation
Figure FDA00038828202700000516
and the news influence representation
Figure FDA00038828202700000517
The migration representation of the news, obtained by aggregating its influence representation and propagation representation,
Figure FDA00038828202700000518
is calculated as follows:
Figure FDA0003882820270000061
Figure FDA0003882820270000062
Figure FDA0003882820270000063
wherein Nei(n_i, Propagate) denotes the set of neighbor news having a propagation relation with news node n_i,
Figure FDA0003882820270000064
is the attention score of neighbor news node n_k for news node n_i, Nei(n_i, Influence) denotes the set of neighbor news having an influence relation with news node n_i, and the news propagation vector
Figure FDA0003882820270000065
and the news influence vector
Figure FDA0003882820270000066
are obtained by weighted aggregation of the features of the neighbor nodes under the propagation relation and the influence relation, respectively; the resulting representations capture the relations between news items.
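The graph construction of step (4-1) and the attention-weighted neighbor aggregation of step (4-2) in claim 5 can be sketched together. The edge sets and neighbor lookup follow the definitions of E_C, E_P, E_I and Nei(·) in the text. The scoring function is an assumption borrowed from the standard graph attention network (an attention vector applied to the concatenated transformed vectors, followed by LeakyReLU and softmax), and all names and toy values are hypothetical.

```python
import math

def build_migration_graph(histories):
    """Build click (E_C), propagation (E_P) and influence (E_I) edge sets."""
    click, propagate, influence = set(), set(), set()
    for user, clicks in histories.items():
        click.update((user, news) for news in clicks)
        for prev_news, next_news in zip(clicks, clicks[1:]):
            propagate.add((prev_news, next_news))  # browsing order
            influence.add((next_news, prev_news))  # reverse of browsing order
    return {"click": click, "propagate": propagate, "influence": influence}

def neighbors(graph, node, relation):
    """Nodes reached from `node` along outgoing edges of `relation`."""
    return sorted(dst for src, dst in graph[relation] if src == node)

def matvec(matrix, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

def aggregate(center, neighbor_vecs, w_center, w_neighbor, attn):
    """GAT-style weighted aggregation of neighbor vectors around one node."""
    h_center = matvec(w_center, center)
    h_neighbors = [matvec(w_neighbor, v) for v in neighbor_vecs]
    scores = []
    for h in h_neighbors:
        # list + list concatenates the two transformed vectors
        raw = sum(a * x for a, x in zip(attn, h_center + h))
        scores.append(raw if raw > 0 else 0.2 * raw)  # LeakyReLU (assumed)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(h_neighbors[0])
    return [sum(w * h[d] for w, h in zip(weights, h_neighbors))
            for d in range(dim)]

histories = {"u1": ["n1", "n2", "n3"], "u2": ["n2", "n4"]}
graph = build_migration_graph(histories)
news_vecs = {"n1": [1.0, 0.0], "n2": [0.0, 1.0],
             "n3": [1.0, 1.0], "n4": [0.5, 0.5]}
identity = [[1.0, 0.0], [0.0, 1.0]]  # toy transformation matrices
clicked = neighbors(graph, "u1", "click")
user_repr = aggregate([1.0, 0.0], [news_vecs[n] for n in clicked],
                      identity, identity, attn=[0.1, 0.1, 0.3, -0.2])
```

The same `aggregate` call could be applied with `neighbors(graph, news, "propagate")` or `neighbors(graph, news, "influence")` to obtain propagation and influence representations for a news node; the reverse lookup from a news node to its clicking users is omitted here for brevity.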
6. The personalized news recommendation method based on the user global interest migration perception according to claim 1, wherein the step (5) comprises the following specific steps:
(5-1) Similarity is computed between the user's content interest representation and the candidate news content representation to obtain the user content interest score, and between the user's migration interest representation and the candidate news migration representation to obtain the user migration interest score; the user content interest score
Figure FDA0003882820270000067
and the user migration interest score
Figure FDA0003882820270000068
are calculated as follows:
Figure FDA0003882820270000069
Figure FDA00038828202700000610
wherein f_u denotes the user's content interest representation,
Figure FDA00038828202700000611
denotes the content representation of the candidate news, gf_u denotes the user's migration interest representation,
Figure FDA00038828202700000612
denotes the candidate news migration representation, and the operator between the two vectors denotes the vector inner product;
(5-2) The content interest score and the migration interest score are combined by weighted summation to obtain the final probability that the user interacts with the candidate news; the interaction probability of user u for candidate news c_i
Figure FDA00038828202700000613
is calculated as follows:
Figure FDA00038828202700000614
Figure FDA00038828202700000615
Figure FDA00038828202700000616
wherein
Figure FDA00038828202700000617
is the normalized content interest score of user u for candidate news c_i,
Figure FDA00038828202700000618
is the normalized migration interest score of user u for candidate news c_i, K is the length of the candidate news set of user u, and θ ∈ [0, 1]; the content interest probability and the migration interest probability are combined with weight θ to obtain the interaction probability, and finally the top-k recommended news list from the candidate news set is returned, ranked by interaction probability.
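The joint scoring of claim 6 can be sketched as follows. Inner-product scores, normalization over the K candidates, a θ-weighted sum and top-k ranking follow the text; using softmax for the normalization and assigning θ to the content score (with 1 − θ for the migration score) are assumptions, as are the function names and toy vectors.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def recommend(user_content, user_migration, candidates, theta=0.5, top_k=2):
    """Rank candidates by a theta-weighted mix of two normalized scores.

    `candidates` is a list of (news_id, content_vec, migration_vec) tuples.
    """
    ids = [news_id for news_id, _, _ in candidates]
    p_content = softmax([dot(user_content, c) for _, c, _ in candidates])
    p_migration = softmax([dot(user_migration, m) for _, _, m in candidates])
    # Assumption: theta weights content interest, (1 - theta) migration interest
    probs = [theta * pc + (1.0 - theta) * pm
             for pc, pm in zip(p_content, p_migration)]
    ranked = sorted(zip(ids, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

candidates = [
    ("c1", [1.0, 0.0], [0.0, 1.0]),
    ("c2", [0.0, 1.0], [1.0, 0.0]),
    ("c3", [1.0, 1.0], [1.0, 1.0]),
]
top = recommend([1.0, 0.0], [1.0, 0.0], candidates, theta=0.5, top_k=2)
```

Setting `theta` to 1 or 0 reduces the ranking to content interest alone or migration interest alone, which is one way to run the kind of ablation the method mentions.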
7. The personalized news recommendation method based on user global interest migration perception according to claim 1, wherein the system function presentation in step (6) includes visual presentations of data analysis, experimental analysis and recommendation analysis, wherein the data analysis includes charts of the time distribution of users' news browsing, the topic and subtopic distribution of news, and the text length and cumulative distribution of the news browsed by users; the experimental analysis includes bar charts of the comparison and ablation experiment results of the algorithm on public data sets, and a visualization of the user's attention to news words; the recommendation analysis includes the online news browsing interface, the recommended news list shown after a user clicks to browse a news item, similar users and their browsing sequences, and the news migration relations.
8. A personalized news recommendation system based on user global interest migration perception, which operates according to the personalized news recommendation method based on user global interest migration perception of any one of claims 1 to 7, characterized by comprising a data processing module, a business processing module and a visual analysis module;
the data processing module is used for preprocessing the data set offline, which is divided into news headline preprocessing, user news browsing preprocessing, user-news graph construction and news relation graph construction, and then classifying and storing the preprocessed data;
the business processing module is mainly used for interfacing with system requirements and calling the pre-trained model to generate data, and comprises three sub-modules, namely a user content interest mining module, a user migration interest mining module and a joint recommendation module, wherein the user content interest mining module models the user's content interest based on the historical news data and news headline data browsed by the user; the user migration interest mining module models the user's migration interest based on the user-news graph and the news relation graph; the joint recommendation module combines the user's content interest and migration interest to provide the recommendation service for the visualization layer;
the visual analysis module is the interactive interface module through which the system serves users; it mainly provides users with visual results of data analysis, experimental analysis and recommendation analysis, and provides interactive functions so that users can select news to browse and receive recommendations and analysis based on their browsing.
CN202211233877.6A 2022-10-10 2022-10-10 Personalized news recommendation method and system based on user global interest migration perception Pending CN115481325A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211233877.6A CN115481325A (en) 2022-10-10 2022-10-10 Personalized news recommendation method and system based on user global interest migration perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211233877.6A CN115481325A (en) 2022-10-10 2022-10-10 Personalized news recommendation method and system based on user global interest migration perception

Publications (1)

Publication Number Publication Date
CN115481325A true CN115481325A (en) 2022-12-16

Family

ID=84393157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211233877.6A Pending CN115481325A (en) 2022-10-10 2022-10-10 Personalized news recommendation method and system based on user global interest migration perception

Country Status (1)

Country Link
CN (1) CN115481325A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115878904A (en) * 2023-02-22 2023-03-31 深圳昊通技术有限公司 Intellectual property personalized recommendation method, system and medium based on deep learning
CN115878904B (en) * 2023-02-22 2023-06-02 深圳昊通技术有限公司 Intellectual property personalized recommendation method, system and medium based on deep learning

Similar Documents

Publication Publication Date Title
CN108573411B (en) Mixed recommendation method based on deep emotion analysis and multi-source recommendation view fusion of user comments
CN107330115B (en) Information recommendation method and device
CN111222332B (en) Commodity recommendation method combining attention network and user emotion
CN113158033A (en) Collaborative recommendation model construction method based on knowledge graph preference propagation
CN110245285B (en) Personalized recommendation method based on heterogeneous information network
WO2023065859A1 (en) Item recommendation method and apparatus, and storage medium
WO2021139415A1 (en) Data processing method and apparatus, computer readable storage medium, and electronic device
CN112966091A (en) Knowledge graph recommendation system fusing entity information and heat
CN111949887A (en) Item recommendation method and device and computer-readable storage medium
CN112256916B (en) Short video click rate prediction method based on graph capsule network
CN108733669A (en) A kind of personalized digital media content recommendation system and method based on term vector
Song et al. Coarse-to-fine: A dual-view attention network for click-through rate prediction
CN115481325A (en) Personalized news recommendation method and system based on user global interest migration perception
CN114840745A (en) Personalized recommendation method and system based on graph feature learning and deep semantic matching model
CN112364245B (en) Top-K movie recommendation method based on heterogeneous information network embedding
Zhang et al. AENAR: An aspect-aware explainable neural attentional recommender model for rating predication
Hoang et al. Academic event recommendation based on research similarity and exploring interaction between authors
CN113326384A (en) Construction method of interpretable recommendation model based on knowledge graph
Zhang et al. Cross-graph convolution learning for large-scale text-picture shopping guide in e-commerce search
Sang et al. Position-aware graph neural network for session-based recommendation
CN116010696A (en) News recommendation method, system and medium integrating knowledge graph and long-term interest of user
CN115329215A (en) Recommendation method and system based on self-adaptive dynamic knowledge graph in heterogeneous network
CN106372147B (en) Heterogeneous topic network construction and visualization method based on text network
CN113051468B (en) Movie recommendation method and system based on knowledge graph and reinforcement learning
CN115618079A (en) Session recommendation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination