WO2020224128A1

WO2020224128A1 - News recommendation method and apparatus based on short-term interest of user, and electronic device and medium

Info

Publication number: WO2020224128A1
Application number: PCT/CN2019/103700
Authority: WO
Inventors: 王健宗; 贾雪丽
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-05-08
Filing date: 2019-08-30
Publication date: 2020-11-12
Also published as: CN110275952A

Abstract

A news recommendation method and apparatus based on the short-term interest of a user, together with an electronic device and a storage medium, relating to the field of data analysis and capable of combining the long-term and short-term preferences of the user. The method comprises: collecting behavior data of users on news (S1); obtaining a word vector matrix corresponding to a news matrix (S2); clustering the word vector matrix to obtain the news group to which each piece of news belongs (S3); obtaining a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each piece of news (S4); analyzing a first similarity between the long-term portrait of each user and each news group (S5); sorting the news groups of each user in descending order of the first similarity and taking a first set number of top-ranked news groups (S6); analyzing a second similarity between the most recent short-term portrait of each user and each piece of news in the first set number of news groups (S7); constructing a user-news bipartite graph according to the second similarity (S8); and selecting recommended news on the bipartite graph by using an absorbing random walk method (S9).

Description

News recommendation method and device, electronic equipment and medium based on user's short-term interest

This application claims priority to Chinese patent application No. 201910379183.5, filed on May 8, 2019 and entitled "News recommendation method, device and medium based on user's short-term interests".

Technical field

This application relates to the field of data analysis technology, and more specifically to a news recommendation method and device, electronic equipment, and medium based on users' short-term interests.

Background

Referring to a user's reading history is important when recommending news. A content-based profile of a user is called a user portrait. The key problem in content-based news recommendation is how to construct user portraits from users' reading histories. Most content-based recommendation systems treat a user's reading history as a whole. A user's long-term interest may be relatively stable, but in the short term the content the user pays attention to changes. For example, a sports enthusiast's focus may shift as different competitions take place. Therefore, determining a user's preference from the long-term reading history alone cannot accurately recommend news to that user, nor does it effectively stimulate the user's interest in reading.

Summary of the invention

In view of the above problems, the purpose of this application is to provide a news recommendation method and device, electronic equipment, and medium based on the user's short-term interest that combine the user's long-term and short-term preferences when recommending news to the user.

According to one aspect of this application, there is provided a news recommendation device based on a user's short-term interest, including: a collection module, which collects behavior data of users on news, the behavior data including a news matrix; a word vector matrix module, which obtains a corresponding word vector matrix according to the news matrix; a clustering module, which clusters the word vector matrix to obtain a grouping result for each piece of news and groups each piece of news into a corresponding news group according to the grouping result; a user portrait obtaining module, which obtains a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each piece of news, the long-term portrait and the short-term portrait representing the user's preference for the word vectors corresponding to the words contained in the news; a first similarity obtaining module, which analyzes the similarity between the long-term portrait of each user and the different news groups to obtain multiple first similarities; a preferred news group obtaining module, which sorts the multiple first similarities in descending order and obtains a first set number of news groups corresponding to each user based on the sorting result; a second similarity obtaining module, which analyzes the second similarity between the latest short-term portrait of each user and each piece of news in the first set number of news groups; a bipartite graph construction module, which constructs a user-news bipartite graph according to the second similarity; and a recommendation module, which selects recommended news on the bipartite graph using an absorbing random walk method, so as to obtain the recommended news of each user.

According to a second aspect of the present application, a news recommendation method based on users' short-term interests is provided, including: step S1, collecting behavior data of users on news, the behavior data including a news matrix; step S2, obtaining a corresponding word vector matrix according to the news matrix; step S3, clustering the word vector matrix to obtain a grouping result for each piece of news, and grouping each piece of news into a corresponding news group according to the grouping result; step S4, obtaining a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each piece of news, the long-term portrait and the short-term portrait representing the user's preference for the word vectors corresponding to the words contained in the news; step S5, analyzing the similarity between the long-term portrait of each user and the different news groups to obtain multiple first similarities; step S6, sorting the multiple first similarities in descending order and obtaining a first set number of news groups corresponding to each user based on the sorting result; step S7, analyzing the second similarity between the latest short-term portrait of each user and each piece of news in the first set number of news groups; step S8, constructing a user-news bipartite graph according to the second similarity; step S9, selecting recommended news on the user-news bipartite graph using an absorbing random walk method, so as to obtain the recommended news of each user.

In addition, to achieve the above object, the present application also provides an electronic device including a memory and a processor. The memory stores a news recommendation program based on the user's short-term interest, and when this program is executed by the processor, the above news recommendation method based on the user's short-term interest is implemented.

In addition, to achieve the above object, the present application also provides a computer non-volatile readable storage medium that stores a news recommendation program based on the user's short-term interest. When this program is executed by a processor, the steps of the above news recommendation method based on the user's short-term interest are implemented.

The news recommendation method and device, electronic equipment, and medium based on users' short-term interests described in this application establish a user-item bipartite graph from the long-term and short-term user portraits, seamlessly integrating the two portraits to represent users' reading preferences. An absorbing random walk algorithm then selects news across different topics, which not only provides news articles relevant to the user's interests but also broadens the user's preferences by introducing articles on different topics.

Description of the drawings

FIG. 1 is a schematic diagram of an application environment of a preferred embodiment of a news recommendation method based on a user's short-term interest in this application;

Fig. 2 is a schematic diagram of a news recommendation device based on the short-term interests of users in this application;

Fig. 3 is a flowchart of a preferred embodiment of a news recommendation method based on a user's short-term interest in this application.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the application, and are not used to limit the application.

The specific embodiments of the present application will be described in detail below in conjunction with the accompanying drawings.

This application provides a news recommendation method based on a user's short-term interest, which is applied to an electronic device 1. FIG. 1 is a schematic diagram of an application environment of a preferred embodiment of this method.

In this embodiment, the electronic device 1 may be a terminal device with computing capability, such as a server, a mobile phone, a tablet computer, a portable computer, or a desktop computer.

The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, or card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1. In other embodiments, the readable storage medium may also be an external memory of the electronic device 1, such as a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a Secure Digital (SD) card, or a flash card (Flash Card) equipped on the electronic device 1.

In this embodiment, the readable storage medium of the memory 11 is generally used to store a news recommendation program 10 based on the user's short-term interests installed in the electronic device 1 and the like. The memory 11 can also be used to temporarily store data that has been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor, or another data processing chip, which is used to run the program code or process the data stored in the memory 11, for example to execute the news recommendation program 10 based on the user's short-term interest.

The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the electronic device 1 and other electronic devices.

The communication bus 14 is used to realize the connection and communication between these components.

FIG. 1 only shows the electronic device 1 with the components 11-14, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.

Optionally, the electronic device 1 may also include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with voice recognition functions, and a voice output device such as a loudspeaker or earphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the electronic device 1 may also include a display, which may also be called a display screen or a display unit.

In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, or an organic light-emitting diode (OLED) touch device. The display is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

Optionally, the electronic device 1 further includes a touch sensor. The area provided by the touch sensor for the user to perform a touch operation is called a touch area. In addition, the touch sensor described here may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only a contact type touch sensor, but also a proximity type touch sensor and the like. In addition, the touch sensor may be a single sensor, or may be, for example, a plurality of sensors arranged in an array.

Optionally, the electronic device 1 may also include logic gate circuits, sensors, audio circuits, etc., which will not be repeated here.

In the device embodiment shown in FIG. 1, the memory 11, as a computer storage medium, may include an operating system and the news recommendation program 10 based on the user's short-term interest. When the processor 12 executes the news recommendation program 10 stored in the memory 11, the following steps are implemented:

Step S1, collecting user behavior data on news, the behavior data including a news matrix;

Step S2: Obtain a corresponding word vector matrix according to the news matrix;

Step S3, clustering the word vector matrix to obtain a grouping result of each news, and grouping each news into a corresponding news group according to the grouping result;

Step S4: Obtain a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each piece of news, the long-term portrait and the short-term portrait representing the user's preference for the word vectors corresponding to the words contained in the news;

Step S5: Analyze the similarity between the long-term portrait of each user and different newsgroups to obtain multiple first similarities;

Step S6, sort the plurality of first similarities in descending order, and obtain a first set number of newsgroups corresponding to each user based on the sorting result;

Step S7, analyzing the second similarity between the latest short-term portrait of each user and each news in the first set number of newsgroups;

Step S8, construct a user news bipartite graph according to the second similarity;

Step S9: Use the absorbing random walk method to select recommended news on the user-news bipartite graph, so as to obtain the recommended news of each user.

In other embodiments, the news recommendation program 10 based on the user's short-term interests can also be divided into one or more modules, which are stored in the memory 11 and executed by the processor 12 to implement this application. A module in this application refers to a series of computer program instruction segments that perform a specific function.

The above electronic device obtains the long-term portrait of the user while also modeling the short-term reading preference of the user, and according to the short-term reading preference, recommends articles that can arouse the user's reading interest to expand the user's reading volume.

FIG. 2 is a schematic diagram of a news recommendation device based on a user's short-term interest in this application. As shown in FIG. 2, the news recommendation device includes:

The collection module 110 collects behavior data of users on news. The behavior data includes a news matrix. Preferably, the behavior data further includes a user matrix and a behavior matrix, the behavior matrix being the matrix of behavior indicators of each user in the user matrix for each piece of news in the news matrix;

The word vector matrix module 120 obtains a corresponding word vector matrix according to the news matrix;

The clustering module 130 clusters the word vector matrix to obtain a grouping result of each news, and groups each news into a corresponding news group according to the grouping result;

The user portrait obtaining module 140 obtains a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each piece of news, the long-term portrait and the short-term portrait representing the user's preference for the word vectors corresponding to the words contained in the news;

The first similarity obtaining module 150 analyzes the similarity between the long-term portrait of each user and different news groups to obtain multiple first similarities;

The preferred newsgroup obtaining module 160 sorts the plurality of first similarities in descending order, and obtains a first set number of newsgroups corresponding to each user based on the sorting result;

The second similarity obtaining module 170 analyzes the second similarity between the latest short-term portrait of each user and each news in the first set number of news groups;

The bipartite graph construction module 180 constructs a user-news bipartite graph according to the second similarity;

The recommendation module 190 selects recommended news on the bipartite graph by using an absorbing random walk method, so as to obtain the recommended news of each user.
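This section does not spell out how the random walk of the recommendation module is carried out, so the following is only a minimal sketch under stated assumptions: already-selected news become absorbing nodes, the walk starts at the target user, the first pick is the news with the heaviest edge to that user, and each later pick is the unselected news with the largest expected number of visits before absorption (computed here by a truncated iteration rather than a matrix inverse). The function names `to_graph`, `expected_visits`, and `recommend`, and the sample `edges`, are hypothetical.

```python
def to_graph(edges):
    """Turn {user: {news: weight}} into a symmetric adjacency dict."""
    graph = {}
    for user, scores in edges.items():
        for news, weight in scores.items():
            graph.setdefault(("user", user), {})[("news", news)] = weight
            graph.setdefault(("news", news), {})[("user", user)] = weight
    return graph


def expected_visits(graph, start, absorbing, steps=200):
    """Expected visits per node for a walk from `start`, truncated at `steps`."""
    mass = {start: 1.0}  # probability mass that is still walking
    visits = {}
    for _ in range(steps):
        moved = {}
        for node, p in mass.items():
            total = sum(graph[node].values())
            for nbr, weight in graph[node].items():
                q = p * weight / total  # step probability is proportional to edge weight
                visits[nbr] = visits.get(nbr, 0.0) + q
                if nbr not in absorbing:  # absorbed mass stops walking
                    moved[nbr] = moved.get(nbr, 0.0) + q
        mass = moved
    return visits


def recommend(edges, user, k):
    """Pick up to k news for `user`; selected news act as absorbing nodes."""
    graph = to_graph(edges)
    start = ("user", user)
    # Assumption: the first pick is the news with the heaviest edge to the user.
    selected = [max(graph[start], key=graph[start].get)]
    while len(selected) < k:
        visits = expected_visits(graph, start, set(selected))
        candidates = {n: v for n, v in visits.items()
                      if n[0] == "news" and n not in selected}
        if not candidates:  # nothing reachable without passing an absorbing node
            break
        selected.append(max(candidates, key=candidates.get))
    return [name for _, name in selected]


# Hypothetical edge weights (for example, second similarities).
edges = {"alice": {"n1": 0.9, "n2": 0.8}, "bob": {"n2": 0.7, "n3": 0.6}}
print(recommend(edges, "alice", 2))
```

Because absorbed probability mass stops walking, news that can only be reached through already-selected news collect few visits, which tends to push later picks toward other parts of the graph.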

Preferably, the aforementioned clustering module 130 includes:

The hierarchical clustering unit performs hierarchical clustering on the word vector matrix from the word vector matrix module to obtain a hierarchical clustering dendrogram, where each leaf node of the dendrogram corresponds to one piece of news;

The Dunn index obtaining unit obtains the Dunn index corresponding to each clustering result of the hierarchical clustering unit;

The cutting unit cuts the hierarchical clustering dendrogram at the layer corresponding to the maximum Dunn index obtained by the Dunn index obtaining unit, to obtain the best hierarchical clustering dendrogram;

The news grouping unit assigns the news corresponding to leaf nodes that share the same parent node in the best hierarchical clustering dendrogram produced by the cutting unit to the same news group, thereby obtaining the news group of each piece of news.

In addition, preferably, the above news recommendation device further includes a topic matrix construction module, which analyzes the word vector matrix using Latent Dirichlet Allocation (LDA) to obtain, for each piece of news, the topic probability matrix over multiple topics and the word probability matrix of the different word vectors corresponding to each topic. The topic value of each piece of news is obtained by combining the topic probability matrix, word probability matrix, and word vector matrix of that news, and the topic values of all news form the topic matrix.

The clustering module 130 obtains the topic vector of each news group from the topic matrix constructed by the topic matrix construction module; the first similarity obtaining module 150 uses a vector similarity measure to determine the first similarity between the user's long-term portrait and the topic vector of each news group; and the second similarity obtaining module 170 uses a vector similarity measure to determine the second similarity between the user's short-term portrait and each of the first set number of news groups.

In addition, this application also provides a news recommendation method based on users' short-term interests. FIG. 3 is a flowchart of a preferred embodiment of this method. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the news recommendation method based on the user's short-term interest includes:

Step S1: Collect behavior data of users on news. The behavior data includes a news matrix and preferably also a user matrix and a behavior matrix, where the behavior matrix is the matrix of behavior indicators of each user in the user matrix for each piece of news in the news matrix:

$$U = [u_1, u_2, \ldots, u_a]$$

$$N = [n_1, n_2, \ldots, n_b]$$

$$UN = \begin{bmatrix} UN_1 \\ UN_2 \\ \vdots \\ UN_a \end{bmatrix}, \qquad UN_a = [un_{a1}, un_{a2}, \ldots, un_{ab}]$$

where $U$ is the user matrix, $a$ is the total number of users, $N$ is the news matrix, $b$ is the total number of news items, $UN$ is the behavior matrix formed by each user's behavior indicators for each piece of news, $UN_a$ is the behavior vector of the a-th user, and $un_{ab}$ is the behavior indicator of the a-th user for the b-th piece of news. The behavior indicators include one or more of the number of clicks, the number of reads, the number of likes, the number of evaluations, the reading duration, the click frequency (number of clicks per unit time), the reading frequency, the like frequency, and the evaluation frequency. For example, users' browsing histories on news websites are collected with web crawler technology, the user identifiers are arranged into the user matrix, the news identifiers on the news websites are arranged into the news matrix, and the number of times any user clicks on any piece of news is taken as that user's behavior indicator for that news. When a user has not browsed a piece of news, the user's click count for that news is 0. These values constitute the behavior matrix;

Step S2: Obtain the corresponding word vector matrix according to the news matrix; that is, convert the words in each piece of news in the news matrix into word vectors to form the corresponding word vector matrix:

$$W = \begin{bmatrix} W_1 \\ W_2 \\ \vdots \\ W_b \end{bmatrix}, \qquad W_b = [w_{b1}, w_{b2}, \ldots, w_{bc}]$$

where $W$ is the word vector matrix of all news, $c$ is the number of word vectors in the longest piece of news, $w_{bc}$ is the word vector of the c-th word in the b-th piece of news (when a piece of news has fewer than $c$ word vectors, the remaining positions are filled with zeros), and $W_b$ is the word vector matrix of the b-th piece of news;

Step S3, clustering the word vector matrix to obtain a grouping result of each news, and grouping each news into a corresponding news group according to the grouping result, and the news group represents the grouping of news clusters;

Step S4: Obtain a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each piece of news. Long-term and short-term refer to time spans (for example, the long term may be one month and the short term one week), the long term contains multiple short terms, and the long-term portrait and the short-term portrait represent the user's preference for the word vectors corresponding to the words contained in the news;

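Step S5: Analyze the first similarity, in terms of word vectors, between the long-term portrait of each user and each news group;

Step S6: Sort the first similarities in descending order, and obtain a first set number of news groups corresponding to each user based on the sorting result;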

Step S7: respectively analyze the second similarity of the word vector between the short-term portrait of each user closest to the analysis time and each news in the first set number of news groups;

Step S8, construct a user-news bipartite graph according to the second similarity;

Step S9: Use the absorbing random walk method to select recommended news on the bipartite graph, so as to obtain the recommended news of each user.
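The data preparation in steps S1 and S2 can be sketched as follows, assuming click counts are the only behavior indicator and that word vectors have already been produced by some embedding model (not shown). The helper names `build_behavior_matrix` and `pad_word_vectors` and the sample data are hypothetical.

```python
from collections import Counter


def build_behavior_matrix(click_log):
    """click_log: list of (user_id, news_id) click events.

    Returns the user list, the news list, and a behavior matrix whose
    entry [a][b] is how often user a clicked news b (0 if never).
    """
    users = sorted({u for u, _ in click_log})
    news = sorted({n for _, n in click_log})
    counts = Counter(click_log)  # missing pairs count as 0
    behavior = [[counts[(u, n)] for n in news] for u in users]
    return users, news, behavior


def pad_word_vectors(news_word_vectors, dim):
    """Zero-pad every news item to the length c of the longest one."""
    c = max(len(vectors) for vectors in news_word_vectors)
    return [
        vectors + [[0.0] * dim for _ in range(c - len(vectors))]
        for vectors in news_word_vectors
    ]


click_log = [("u1", "n1"), ("u1", "n1"), ("u1", "n3"), ("u2", "n2")]
users, news, behavior = build_behavior_matrix(click_log)

# Two news items with 2 and 1 word vectors of dimension 2 (made-up values).
news_word_vectors = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.5]]]
padded = pad_word_vectors(news_word_vectors, dim=2)
print(behavior, padded[1])
```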

The above news recommendation method based on users' short-term interests accounts for the evolution of a user's interests when establishing user portraits, seamlessly integrates the long-term and short-term portraits as the user's reading preferences, establishes a graph relating specific news to users, and then applies the absorbing random walk method on the graph to select news articles on different topics.
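To make the long-term and short-term split concrete, the sketch below buckets timestamped behavior into fixed-length time frames, treats the most recent frame as the short-term data, and sums all frames as the long-term data. It aggregates raw click counts only and is not the portrait formula of this application; `split_into_time_frames`, `long_term`, and the sample `events` are hypothetical.

```python
from collections import defaultdict


def split_into_time_frames(events, frame_days):
    """events: (day_index, news_id, clicks) tuples for one user.

    Returns {frame_index: {news_id: clicks}}, one entry per short-term frame.
    """
    frames = defaultdict(lambda: defaultdict(int))
    for day, news_id, clicks in events:
        frames[day // frame_days][news_id] += clicks
    return {frame: dict(counts) for frame, counts in frames.items()}


def long_term(frames):
    """Sum the per-frame counts; the long term spans several frames."""
    total = defaultdict(int)
    for counts in frames.values():
        for news_id, clicks in counts.items():
            total[news_id] += clicks
    return dict(total)


# Hypothetical events: a one-week frame, two weeks of history.
events = [(0, "n1", 2), (3, "n2", 1), (8, "n1", 1), (9, "n3", 4)]
frames = split_into_time_frames(events, frame_days=7)
latest_short_term = frames[max(frames)]  # frame closest to the analysis time
print(latest_short_term, long_term(frames))
```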

In an embodiment of the present application, the foregoing news recommendation method based on the user's short-term interest includes:

In step S4, the word vectors of each piece of news are used as labels, and the long-term portrait and the short-term portrait are the user's preference weights for each label:

$$P = \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_b \end{bmatrix}, \qquad P_b = [p_{b1}, p_{b2}, \ldots, p_{bc}]$$

where $P$ is the short-term portrait of a user, $P'$ is the long-term portrait of a user (of the same form), $P_b$ is the user's short-term weight vector for the b-th piece of news, and $p_{bc}$ is the user's short-term weight for the c-th word vector in the b-th piece of news;

In step S5, a matrix similarity measure is used to determine the first similarity between the user's long-term portrait and each news group, for example the correlation coefficient of the matrices or the cosine of the angle between vectors. The similarity is computed between the news group matrix, formed by the word vectors of the news in the news group, and the corresponding long-term portrait sub-matrix (the preferences for the word vectors of the news in that group). As another example, the news group matrix and the long-term portrait sub-matrix are flattened into vectors and the first similarity is obtained with a vector similarity measure such as the cosine; or the corresponding elements of the news group matrix and the long-term portrait sub-matrix are subtracted, squared, and summed to obtain the first similarity;

In step S7, a matrix similarity measure is used to determine the second similarity between the user's short-term portrait and each of the first set number of news groups;

In step S8, for each user, the news groups are sorted in descending order of the second similarity and the top second set number (less than the first set number) of news groups is taken. A user-news bipartite graph is then constructed from each user and the news in that user's second set number of news groups, where the weight of each edge of the bipartite graph is set according to the user's rating of the news: the higher the rating, the greater the weight.

The above news recommendation method based on the user's short-term interest screens news groups through both the user's long-term portrait and short-term portrait, so that the selected news groups conform to the user's long-term preferences as well as the user's short-term interests, which improves the accuracy of news recommendation.
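Two of the matrix measures mentioned for step S5 in this embodiment can be sketched as below: the correlation coefficient of the two flattened matrices, and the sum of squared element differences. The sketch assumes the news group matrix and the long-term portrait sub-matrix have the same shape; the function names and sample values are hypothetical. Note that the squared difference behaves like a distance, so smaller values indicate a closer match.

```python
import math


def flatten(matrix):
    """Row-major list of all matrix elements."""
    return [x for row in matrix for x in row]


def correlation(a, b):
    """Pearson correlation coefficient of two equally shaped matrices."""
    x, y = flatten(a), flatten(b)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = math.sqrt(sum((p - mx) ** 2 for p in x))
    sy = math.sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy) if sx and sy else 0.0


def squared_difference(a, b):
    """Sum of squared element-wise differences (smaller means closer)."""
    return sum((p - q) ** 2 for p, q in zip(flatten(a), flatten(b)))


# Hypothetical news group matrix and matching long-term portrait sub-matrix.
group_matrix = [[1.0, 0.0], [0.0, 1.0]]
portrait_sub = [[0.8, 0.1], [0.2, 0.9]]
print(correlation(group_matrix, portrait_sub),
      squared_difference(group_matrix, portrait_sub))
```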

In another embodiment, in the above step S7, a vector similarity measure such as Euclidean distance, Manhattan distance, Chebyshev distance, Minkowski distance, normalized Euclidean distance, Mahalanobis distance, cosine of the included angle, Hamming distance, Jaccard distance and Jaccard similarity coefficient, or correlation coefficient and correlation distance is used to obtain the second similarity between the user's short-term portrait and each piece of news in the first set number of news groups. For example, suppose the word vector of a piece of news $n_i$ in one of the first set number of news groups retained after filtering by the user's long-term portrait is $W_i = [w_{11}, w_{12}, \ldots, w_{1c}]$, and the corresponding short-term portrait vector of the user is $P_i = [p_{11}, p_{12}, \ldots, p_{1c}]$. Taking Euclidean distance as an example, the second similarity is obtained as

$$d(P_i, W_i) = \sqrt{\sum_{k=1}^{c} (p_{1k} - w_{1k})^2}$$

where $d(P_i, W_i)$ is the second similarity between the user and news $n_i$;

In step S8, for each user, the news items are sorted in descending order of the second similarity and the top third set number of news items is taken. A user-news bipartite graph is constructed from each user and that user's third set number of news items, where the weight of each edge of the bipartite graph is set according to the user's rating of the news. Preferably, in step S8, the second similarity is used as the weight of each edge when constructing the user-news bipartite graph; alternatively, the user-news bipartite graph can be constructed directly without sorting by the second similarity.

The above news recommendation method based on users' short-term interests selects news in two stages. First, the long-term portrait is used to determine which news groups match the user's preferences; then the short-term portrait is used to filter specific news articles for the user. The user's long-term and short-term preferences are thereby connected seamlessly, which improves the accuracy of recommendations.
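The two-stage selection just described can be sketched as follows, assuming the first and second similarities have already been computed and that the second similarity is used directly as the edge weight (the preferred option in step S8). The names `top_k` and `build_bipartite_edges`, the parameters `n_groups` and `n_news`, and the sample scores are hypothetical.

```python
def top_k(scores, k):
    """Keys of the k largest values, in descending order of score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def build_bipartite_edges(first_sim, second_sim, groups, n_groups, n_news):
    """Return {user: {news: weight}} for the user-news bipartite graph.

    first_sim:  {user: {group: first similarity}}
    second_sim: {user: {news: second similarity}}
    groups:     {group: [news ids in that group]}
    """
    edges = {}
    for user, group_scores in first_sim.items():
        # Stage 1: keep the groups that best match the long-term portrait.
        kept_groups = top_k(group_scores, n_groups)
        # Stage 2: rank only the news inside those groups by short-term match.
        pool = {n: second_sim[user][n] for g in kept_groups for n in groups[g]}
        edges[user] = {n: pool[n] for n in top_k(pool, n_news)}
    return edges


groups = {"sports": ["n1", "n2"], "finance": ["n3"], "tech": ["n4"]}
first_sim = {"u1": {"sports": 0.9, "finance": 0.2, "tech": 0.6}}
second_sim = {"u1": {"n1": 0.3, "n2": 0.8, "n3": 0.95, "n4": 0.5}}
edges = build_bipartite_edges(first_sim, second_sim, groups, n_groups=2, n_news=2)
print(edges)
```

In this made-up example `n3` has the highest second similarity but is never considered, because its group was filtered out by the long-term portrait in the first stage.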

In the second embodiment of the present application, the news recommendation method based on the user's short-term interest includes:

In step S2, LDA (Latent Dirichlet Allocation) is used to analyze the word vector matrix to obtain the topic value of each piece of news and thereby the topic matrix. Specifically, LDA is used to obtain, for each piece of news in the news matrix, the topic probability matrix over multiple topics and the word probability matrix of the different word vectors corresponding to each topic

where $\theta^b$ is the topic probability matrix of the b-th piece of news, whose elements give the probability that the b-th piece of news corresponds to the d-th topic, and the word probability matrix of the b-th piece of news has elements giving the probability that the d-th topic generates the c-th word vector in the b-th piece of news;

Get the topic value of each news through the combination of the topic probability matrix, word probability matrix, and word vector matrix of each news

where $T_b$ is the topic value of the b-th piece of news and "." denotes matrix multiplication;

The topic values of all news constitute the topic matrix $Z = [z_1, z_2, \ldots, z_b]$.

In step S3, the word vector matrix is clustered to obtain the news group to which each piece of news belongs, and thereby the topic vector of each news group. For example, for a news group $[n_i, n_j]$, the corresponding topic vector is $[z_i, z_j]$.

In step S4, LDA is used as the language model for detecting latent topics, and a long-term portrait and a short-term portrait of each user are obtained. Specifically, the long-term portrait and the short-term portrait are obtained from the topic probability matrix, the word probability matrix, and the behavior matrix of each piece of news, where the user's behavior indicator for a piece of news is taken as the user's behavior indicator for each word vector in that news:

$$un_{ab}(c) = [un_{ab}, un_{ab}, \ldots, un_{ab}]^T$$

$$z_a = [z_{a1}, z_{a2}, \ldots, z_{ab}]$$

where $un_{ab}(c)$ is the behavior vector of the a-th user for the $c$ word vectors in the b-th piece of news, that is, $un_{ab}(c)$ consists of $c$ copies of $un_{ab}$; $z_{ab}$ is the topic value of the a-th user for the b-th piece of news; and $z_a$ is the long-term portrait or short-term portrait of the a-th user.

In step S5, a similarity measure is used to determine the first similarity between the user's long-term portrait and each news group. Preferably, the cosine similarity is used to obtain the first similarity:

$$s_{m,n} = \frac{\sum_{i=1}^{b} x_i y_i}{\sqrt{\sum_{i=1}^{b} x_i^2}\,\sqrt{\sum_{i=1}^{b} y_i^2}}$$

where $s_{m,n}$ is the similarity between the m-th long-term portrait and the n-th news group, $(x_1, x_2, \ldots, x_b)$ is the topic vector of the m-th long-term portrait, and $(y_1, y_2, \ldots, y_b)$ is the topic vector of the n-th news group. For example, if a news group X includes the first and the third piece of news, the topic vector of the news group is $(z_1, z_3)$ and the corresponding long-term portrait vector of the a-th user is $(z_{a1}, z_{a3})$.

In step S7, the similarity measure of step S5 is used to determine the second similarity between the user's short-term portrait and each of the first set number of news groups.

In step S8, for each user, the news groups are sorted in descending order of the second similarity and the top second set number (less than the first set number) of news groups is taken. A user-news bipartite graph is then constructed from each user and the news in that user's second set number of news groups, where the weight of each edge of the bipartite graph is set according to the user's rating of the news.

The above news recommendation method based on the user's short-term interest obtains the topic vector of each piece of news and the user's short-term and long-term portrait vectors through LDA analysis, and screens news groups by similarity, which reduces the amount of calculation while preserving the accuracy of recommendation.
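The news group screening in this embodiment can be sketched as below, following the example in step S5: for a group containing the first and third news, the group's topic vector is compared with the matching entries of the user's long-term portrait using cosine similarity, and the best-scoring groups are kept. Topic values are assumed to be precomputed scalars; `top_news_groups` and all numeric values are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def top_news_groups(portrait, topic_values, groups, first_set_number):
    """Rank news groups for one user by first similarity and keep the best ones.

    portrait:     long-term portrait, one value per news (z_a1 ... z_ab)
    topic_values: topic value per news (z_1 ... z_b)
    groups:       {group name: [news indices]}
    """
    scored = []
    for name, members in groups.items():
        x = [portrait[b] for b in members]      # portrait entries for this group
        y = [topic_values[b] for b in members]  # topic vector of the group
        scored.append((cosine(x, y), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:first_set_number]]


topic_values = [0.9, 0.1, 0.8, 0.2]
portrait = [0.7, 0.0, 0.6, 0.9]
groups = {"X": [0, 2], "Y": [1, 3]}  # group X holds the first and third news
print(top_news_groups(portrait, topic_values, groups, first_set_number=1))
```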

In an optional embodiment, in the above-mentioned news recommendation method based on the user's short-term interest:

In step S4, the long-term portrait is obtained by formula (3), and the short-term portrait is obtained by the following formula (5)

In step S7, the similarity measurement method is used to determine the second similarity between the short-term portrait of the user and each news item in each of the first set number of news groups. Preferably, the cosine similarity method is used to obtain the second similarity

Here, $s'_{m,n}$ denotes the similarity between the m-th short-term portrait and the n-th news item, $(x_1, x_2, \ldots, x_c)$ is the topic vector of the m-th short-term portrait, and $(y_1, y_2, \ldots, y_c)$ is the word vector of the n-th news item; both are $1 \times c$ vectors.

The above-mentioned news recommendation method based on the user's short-term interest obtains the topic vector of each news item and the user's short-term and long-term portrait vectors through LDA analysis, and screens news groups and news items separately, which reduces the amount of calculation, speeds up the recommendation, and improves its accuracy.

Preferably, in step S2, LDA is used to analyze the word vector matrix, and the topic vector of each news is obtained by the following formula (7)

In step S7, the second similarity between each user's short-term portrait and each news is obtained by the similarity between each user's short-term portrait and the topic vector of each news.

In each of the foregoing embodiments, in step S4, the step of obtaining the long-term portrait and the short-term portrait of each user through the long-term behavior data and short-term behavior data of each user for each news respectively further includes:

Set a time frame and treat one time frame as the short term, the long term comprising multiple time frames;

Obtain the user portrait in each time frame from the user's behavior data for each word vector of the news in that time frame, thereby obtaining the user's short-term portrait for each time frame;

Obtain the long-term portrait of the user as a weighted combination of the user portraits in the individual time frames, wherein short-term portraits closer to the analysis time receive a higher weight.

Preferably, a time equation is used to combine multiple short-term portraits of a user, by weighting, into a long-term portrait of the user

Here, $P_u$ denotes the long-term portrait, the short-term portrait corresponding to the g-th time frame is indexed by $t_g$, $f(t) = e^{-\lambda t}$ is the time equation, and $\lambda$ is the constant parameter of the time equation.
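One way to picture the time-weighted combination is the Python sketch below, which weights each short-term portrait by $f(t) = e^{-\lambda t}$. Several points are assumptions of the example rather than statements of the application: $t$ is taken to be the age of a time frame relative to the analysis time, the weighted sum is normalised by the total weight, and the names `long_term_portrait`, `ages` and `decay` (standing for $\lambda$) are hypothetical.

```python
import math


def long_term_portrait(short_term_portraits, ages, decay=0.1):
    """Combine per-time-frame portraits with weights f(t) = exp(-decay * t)."""
    if not short_term_portraits or len(short_term_portraits) != len(ages):
        raise ValueError("need one age per short-term portrait")
    weights = [math.exp(-decay * t) for t in ages]
    total = sum(weights)
    combined = [0.0] * len(short_term_portraits[0])
    for portrait, weight in zip(short_term_portraits, weights):
        for i, value in enumerate(portrait):
            combined[i] += weight * value
    # Assumption of this sketch: normalise so the weights sum to one.
    return [value / total for value in combined]


# Hypothetical portraits: the most recent frame (age 0) and one five frames old.
portraits = [[1.0, 0.0], [0.0, 1.0]]
result = long_term_portrait(portraits, ages=[0, 5], decay=0.2)
```

With these inputs the recent frame dominates the result, which matches the requirement that portraits closer to the analysis time carry more weight.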

The aforementioned news recommendation method based on the user's short-term interest first constructs a long-term portrait of a given user using time-sensitive weighting, and then analyzes the user's latest reading history to determine the user's short-term preferences. For recommendation, a user-item bipartite graph is built from the long-term and short-term user portraits, and news is then selected from different topics by the absorbing random walk algorithm, which not only provides news articles relevant to the user's interests but also broadens the user's preferences by introducing articles on different topics.

In the foregoing embodiments, in step S3, the step of clustering the word vector matrix includes:

Perform hierarchical clustering on the word vector matrix to obtain a hierarchical clustering dendrogram, where one leaf node of the hierarchical clustering dendrogram corresponds to one news;

Obtain the Dunn index corresponding to each clustering result of the hierarchical clustering, and cut the hierarchical clustering dendrogram at the layer corresponding to the maximum Dunn index to obtain the best hierarchical clustering dendrogram; the news items corresponding to leaf nodes that belong to the same parent node in the best hierarchical clustering dendrogram belong to the same news group, thereby obtaining the news group of each news item.

The above method of clustering the word vector matrix first uses a hierarchical agglomerative clustering algorithm to construct a news hierarchy based purely on the content of the news articles, and then uses Dunn's validity index to determine the best hierarchical dendrogram, which avoids having to decide the number of clusters in advance. The Dunn index is the shortest distance between any two elements of different clusters (between clusters) divided by the largest distance between two elements of the same cluster (within a cluster). The larger the index, the greater the distance between clusters and the smaller the distance within clusters. The Dunn index is therefore used to decide at which layer to cut the dendrogram. After the news groups are obtained, LDA can be used to analyze each group, and the topic of each group can be represented by a topic vector that is matched against the long-term user portrait for group filtering.
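The clustering and cutting procedure can be illustrated with the short Python sketch below. It is a simplified, standard-library-only illustration under stated assumptions: single-linkage merging and Euclidean distance are chosen for the example (the application does not fix a linkage or distance), the function names are hypothetical, and the sample points merely stand in for news word vectors.

```python
import math
from itertools import combinations


def distance(p, q):
    """Euclidean distance between two points (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def dunn_index(points, clusters):
    """Smallest between-cluster distance divided by largest within-cluster distance."""
    min_between = min(
        distance(points[i], points[j])
        for a, b in combinations(clusters, 2)
        for i in a
        for j in b
    )
    max_within = max(
        (distance(points[i], points[j]) for c in clusters for i, j in combinations(c, 2)),
        default=0.0,
    )
    return min_between / max_within if max_within > 0 else float("inf")


def agglomerate(points):
    """Yield the partition after each single-linkage merge (n-1 clusters down to 1)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        # Pick the two clusters whose closest members are nearest to each other.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                distance(points[i], points[j])
                for i in clusters[pair[0]]
                for j in clusters[pair[1]]
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        yield [list(c) for c in clusters]


def best_cut(points):
    """Return (Dunn index, clusters) for the dendrogram layer with the largest index."""
    best_score, best_clusters = -1.0, None
    for clusters in agglomerate(points):
        if len(clusters) < 2:
            break  # the Dunn index needs at least two clusters
        score = dunn_index(points, clusters)
        if score > best_score:
            best_score, best_clusters = score, clusters
    return best_score, best_clusters


# Hypothetical 2-D stand-ins for news word vectors.
news_vectors = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
best_score, best_clusters = best_cut(news_vectors)
```

Here the layer with three groups is selected, because its clusters are tight internally and far apart from one another.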

In one embodiment, in step S9, news is selected from different topics by the absorbing random walk method. The absorbing random walk method first chooses an initial point and then jumps to a random point on the graph with probability p; the remaining probability 1-p is distributed among the adjacent points according to the edge weights. At every step the walk jumps to a random point or to an adjacent point with these same probabilities, and a transition matrix is used to calculate the jump probabilities. After several iterations the jump probabilities stabilize, and the news item with the highest transition probability is recommended. The random walk method then lowers the jump probability of articles of the same kind as the recommended article, so that more varied news is selected. In this way, the news recommendation method based on the user's short-term interest described in this application can not only provide news articles relevant to the user's interests, but also broaden the user's preferences by introducing articles on different topics.
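The jump-and-iterate part of this procedure can be sketched in Python as below. The sketch covers only the random jump with probability p, the weighted move to adjacent points, and the iteration to stable probabilities; the later down-weighting of similar articles is not shown. The function name `stationary_scores`, the graph layout, the default p = 0.15, and the treatment of nodes without edges are assumptions of the example.

```python
def stationary_scores(graph, p=0.15, iterations=200, tol=1e-12):
    """Iterate jump probabilities on a weighted graph until they stabilise.

    graph maps each node to a dict {neighbour: edge weight}.
    """
    nodes = sorted(graph)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # With probability p the walk jumps to any node, chosen uniformly.
        new_scores = {node: p / n for node in nodes}
        for node in nodes:
            total_weight = sum(graph[node].values())
            if total_weight == 0:
                # Assumption of this sketch: a node without edges spreads its mass uniformly.
                for other in nodes:
                    new_scores[other] += (1 - p) * scores[node] / n
                continue
            # The remaining 1-p is shared among neighbours in proportion to edge weight.
            for neighbour, weight in graph[node].items():
                new_scores[neighbour] += (1 - p) * scores[node] * weight / total_weight
        change = sum(abs(new_scores[node] - scores[node]) for node in nodes)
        scores = new_scores
        if change < tol:
            break
    return scores


# Hypothetical user-news bipartite graph: u1 rated n1 with 5 and n2 with 1.
graph = {
    "u1": {"n1": 5.0, "n2": 1.0},
    "n1": {"u1": 5.0},
    "n2": {"u1": 1.0},
}
scores = stationary_scores(graph)
recommended = max(("n1", "n2"), key=scores.get)
```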

In another embodiment, step S9 includes:

In the user news bipartite graph, each user acts as a node, and each news also acts as a node. The random walk restart method is used to obtain the correlation value between the nodes;

Obtain the adjacent set of each user, formed by the nodes adjacent to that user's node; form the first sub-correlation matrix of each user from the correlation values between any two nodes in the adjacent set; take the reciprocal of the mean value of the off-diagonal elements of the first sub-correlation matrix as the bridging value of each user; and combine it with the bridging values of the user nodes in the adjacent set to form the bridging matrix of each user. For example, for a user node $u_1$ whose adjacent set is $[n_2, n_4, u_3]$, the first sub-correlation matrix of $u_1$ is formed from the correlation values among $n_2$, $n_4$ and $u_3$,

where $r_{23}$ is the correlation value between news node $n_2$ and user node $u_3$, and the bridging value $q_1$ of user node $u_1$ is the reciprocal of the mean value of the off-diagonal elements of the first sub-correlation matrix.

The bridging matrix of user node $u_1$ is $[q_1, q_3]$;

The correlation values between each user node, together with the user nodes in its adjacent set, and the news nodes in the adjacent set constitute the second sub-correlation matrix of each user, as with the second sub-correlation matrix of user node $u_1$ in the above example;

The bridge matrix of each user and the second sub-correlation matrix are multiplied to obtain the recommended value of the news node;

The news nodes are sorted in descending order of recommended value, and the set number of top-ranked news items are recommended to the user.
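The bridging computation of this embodiment can be illustrated with the Python sketch below, which follows the example of user node $u_1$ with adjacent set $[n_2, n_4, u_3]$. It reflects one reading of the text: the second sub-correlation matrix is taken to have one row per user node ($u_1$ and $u_3$) and one column per news node ($n_2$ and $n_4$). That reading, the helper names, the adjacent set assumed for $u_3$, and all correlation values are assumptions of the example.

```python
def symmetric(pairs):
    """Build a symmetric correlation lookup from {(a, b): value}."""
    corr = {}
    for (a, b), value in pairs.items():
        corr.setdefault(a, {})[b] = value
        corr.setdefault(b, {})[a] = value
    return corr


def bridging_value(corr, adjacent):
    """Reciprocal of the mean off-diagonal correlation among the adjacent nodes."""
    off_diagonal = [corr[a][b] for a in adjacent for b in adjacent if a != b]
    return len(off_diagonal) / sum(off_diagonal)


def recommended_values(corr, adjacency, user, user_nodes):
    """Multiply the bridging matrix by the (assumed) second sub-correlation matrix."""
    adjacent = adjacency[user]
    row_users = [user] + [node for node in adjacent if node in user_nodes]
    news = [node for node in adjacent if node not in user_nodes]
    bridging = [bridging_value(corr, adjacency[u]) for u in row_users]
    return {
        item: sum(q * corr[u][item] for q, u in zip(bridging, row_users))
        for item in news
    }


# Hypothetical correlation values, e.g. produced by a random walk with restart.
corr = symmetric({
    ("n2", "n4"): 0.2, ("n2", "u3"): 0.1, ("n4", "u3"): 0.3,
    ("u1", "n2"): 0.6, ("u1", "n4"): 0.4, ("u1", "u3"): 0.5,
})
adjacency = {"u1": ["n2", "n4", "u3"], "u3": ["n2", "n4", "u1"]}

q1 = bridging_value(corr, adjacency["u1"])
values = recommended_values(corr, adjacency, "u1", user_nodes={"u1", "u3"})
ranking = sorted(values, key=values.get, reverse=True)
```

A user whose neighbours are weakly correlated with one another obtains a large bridging value, so news reached through such a user is scored higher.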

Preferably, the step of using a random walk restart method to obtain correlation values between nodes includes:

Taking a node as a starting node, and using a vector composed of the second similarity between the one node and other nodes as a restart vector, and calculating the jump probability between each node on the bipartite graph;

Compose the jump probability between the nodes into an adjacency matrix;

Iterative processing is performed on the adjacency matrix until the adjacency matrix converges, and the elements of the converged adjacency matrix are the correlation values between the one node and the other nodes.
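A random walk with restart of this kind can be sketched in Python as follows. The sketch iterates a score vector for a single starting node rather than a whole matrix, normalises the restart vector so that it sums to one, and uses a restart probability of 0.15; these choices, the function names, and the sample weights are assumptions of the example.

```python
def column_normalise(weights):
    """Turn a square weight matrix into jump probabilities (each column sums to 1)."""
    n = len(weights)
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    return [
        [weights[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(transition, restart, restart_prob=0.15, tol=1e-12, max_iter=1000):
    """Iterate r = (1 - c) * W r + c * e until r stops changing."""
    n = len(restart)
    total = sum(restart)
    e = [value / total for value in restart]  # normalised restart vector
    r = e[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart_prob) * sum(transition[i][j] * r[j] for j in range(n))
            + restart_prob * e[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


# Hypothetical graph with nodes [u1, n1, n2]; edge weights are second similarities.
weights = [
    [0.0, 0.9, 0.3],
    [0.9, 0.0, 0.0],
    [0.3, 0.0, 0.0],
]
transition = column_normalise(weights)
# Restart vector for starting node u1: its second similarities to the other nodes.
correlation = random_walk_with_restart(transition, restart=[0.0, 0.9, 0.3])
```

The converged entries of `correlation` play the role of the correlation values between the starting node and the other nodes.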

In addition, an embodiment of the present application also provides a computer non-volatile readable storage medium that includes a news recommendation program based on the user's short-term interest, and the following steps are implemented when the news recommendation program based on the user's short-term interest is executed by a processor:

Step S1: Collect user behavior data on news, the behavior data includes a user matrix;

The specific implementation of the computer non-volatile readable storage medium of the present application is substantially the same as the specific implementation of the above-mentioned news recommendation method and device based on the user's short-term interest, and electronic equipment, and will not be repeated here.

The above are only preferred embodiments of this application and do not limit its scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or any direct or indirect use in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

A news recommendation method based on users' short-term interests, which is characterized in that it includes:

Step S1, collecting user behavior data on news, the behavior data including a news matrix;

Step S2: Obtain a corresponding word vector matrix according to the news matrix;

Step S3, clustering the word vector matrix to obtain a grouping result of each news, and grouping each news into a corresponding news group according to the grouping result;

Step S4: Obtain a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each news item, the long-term portrait and the short-term portrait being used to represent the user's preference for the word vectors corresponding to the words contained in the news;

Step S5: Analyze the similarity between the long-term portrait of each user and different newsgroups to obtain multiple first similarities;

Step S6, sort the plurality of first similarities in descending order, and obtain a first set number of newsgroups corresponding to each user based on the sorting result;

Step S7, analyzing the second similarity between the latest short-term portrait of each user and each news in the first set number of newsgroups;

Step S8, construct a user news bipartite graph according to the second similarity;

Step S9: Use the absorption random walk method to select recommended news on the user news bipartite graph, so as to obtain the recommended news of each user.
The news recommendation method based on the user's short-term interests according to claim 1, wherein in step S3, the step of clustering the word vector matrix comprises:

Perform hierarchical clustering on the word vector matrix to obtain a hierarchical clustering dendrogram, where one leaf node of the hierarchical clustering dendrogram corresponds to one news;

The Dunn index corresponding to each clustering result of the hierarchical clustering is obtained, and the hierarchical clustering dendrogram is cut at the layer corresponding to the maximum Dunn index to obtain the best hierarchical clustering dendrogram; the news items corresponding to leaf nodes that belong to the same parent node in the best hierarchical clustering dendrogram belong to the same news group, thereby obtaining the news group of each news item.
The news recommendation method based on users' short-term interests according to claim 2, characterized in that:

In step S2, the word vector matrix is analyzed using latent Dirichlet allocation (LDA) to obtain the topic probability matrix of multiple topics of each news item and the word probability matrix of the different word vectors corresponding to each topic; the topic probability matrix, word probability matrix, and word vector matrix of each news item are combined to obtain the topic value of each news item, and the topic values of the news items constitute the topic matrix;

In step S3, the word vector matrix is clustered to obtain the grouping result of each news item, and each news item is assigned to the corresponding news group according to the grouping result, so that the topic values of the news items in each news group constitute the topic vector of that group;

In step S4, LDA is used as a language model for detecting latent topics to obtain a long-term portrait and a short-term portrait of each user;

In step S5, a vector similarity measurement method is used to determine the first similarity between the long-term portrait of the user and each news group;

In step S7, a vector similarity measurement method is used to determine the second similarity between the short-term portrait of the user and each of the first set number of news groups;

In step S8, for each user, the news groups are sorted in descending order of second similarity and the top second set number of news groups are taken, giving the second set number of news groups for each user, and a user-news bipartite graph is constructed from each user and the news in that user's second set number of news groups, wherein the weight of each edge of the bipartite graph is set according to the user's rating of the news, a higher rating giving a greater edge weight.
The news recommendation method based on the user's short-term interest according to claim 3, characterized in that:

In step S1, the behavior data further includes a user matrix and a behavior matrix, and the behavior matrix is a matrix composed of behavior indicators of each user in the user matrix for each news in the news matrix;

In step S4, latent Dirichlet allocation (LDA) is used as a language model for detecting latent topics, and obtaining the long-term and short-term portraits of each user includes:

Use LDA to analyze the word vector matrix to obtain the topic probability matrix of multiple topics of each news item and the word probability matrix of the different word vectors corresponding to each topic;

From the topic probability matrix, word probability matrix and behavior matrix of each news item, obtain the long-term and short-term portraits according to the following formula, where the user's behavior index for a news item is used as the user's behavior index for each word vector in that news item

where $un_{ab}(c) = [un_{ab}, un_{ab}, \ldots, un_{ab}]^T$, $un_{ab}(c)$ denotes the long-term or short-term behavior vector of the a-th user for the c word vectors in the b-th news item, $z_{ab}$ is the long-term or short-term topic value of the b-th news item for the a-th user, $z_a = [z_{a1}, z_{a2}, \ldots, z_{ab}]$ is the long-term or short-term portrait of the a-th user, $\theta_b$ is the topic probability matrix of the b-th news item, and the other matrix in the formula is the word probability matrix of the b-th news item.
The news recommendation method based on the user's short-term interest according to claim 4, characterized in that, in step S4, the short-term portrait is obtained by the following formula

In step S7, a similarity measurement method is used to determine the second similarity between the short-term portrait of the user and each news of each news group of the first set number;

In step S8, for each user, the news items are sorted in descending order of second similarity and the top third set number of news items are taken, giving the third set number of news items for each user, and a user-news bipartite graph is constructed from each user and that user's third set number of news items, wherein the weight of each edge of the bipartite graph is set according to the user's rating of the news.
The news recommendation method based on the user's short-term interest according to claim 5, characterized in that, in step S8, the second similarity is used as the weight of the edges of the bipartite graph to construct the user-news bipartite graph, and the user-news bipartite graph is constructed with or without ranking by the second similarity.
The news recommendation method based on the user's short-term interest according to claim 3, characterized in that, in step S5, the step of using a similarity measurement method to determine the first similarity between the user's long-term portrait and each news group includes: using the cosine similarity method to obtain the first similarity

where $s_{m,n}$ denotes the similarity between the m-th long-term portrait and the n-th news group, $(x_1, x_2, \ldots, x_b)$ is the topic vector of the m-th long-term portrait, and $(y_1, y_2, \ldots, y_b)$ is the topic vector of the n-th news group.
The news recommendation method based on user short-term interests according to claim 3, characterized in that, in step S2, the word vector matrix is analyzed using LDA, and the topic vector of each news is obtained by the following formula

where $\theta_b$ is the topic probability matrix of the b-th news item, the other matrix in the formula is the word probability matrix of the b-th news item, and $z'_b$ is the topic vector of the b-th news item;

In step S7, the second similarity between each user's short-term portrait and each news is obtained by the similarity between each user's short-term portrait and the topic vector of each news.
The news recommendation method based on users' short-term interests according to claim 1, characterized in that the step of obtaining the long-term and short-term portraits of each user from the long-term behavior data and short-term behavior data of each user for each news item includes:

Set a time frame and treat one time frame as the short term, the long term comprising multiple time frames;

Obtain the user portrait in each time frame from the user's behavior data for each word vector of the news in that time frame, thereby obtaining the user's short-term portrait for each time frame;

Obtain the long-term portrait of the user as a weighted combination of the user portraits in the individual time frames, wherein short-term portraits closer to the analysis time receive a higher weight.
The news recommendation method based on the user's short-term interest according to claim 9, wherein the step of obtaining the long-term portrait of the user in a weighted manner according to the user portrait of the user in each time frame comprises:

Use the time equation to weight multiple user short-term portraits into a user long-term portrait

where $P_u$ denotes the long-term portrait, the short-term portrait corresponding to the g-th time frame is indexed by $t_g$, $f(t) = e^{-\lambda t}$ is the time equation, and $\lambda$ is the constant parameter of the time equation.
The news recommendation method based on the user's short-term interests according to claim 1, wherein in step S4, the word vector of each news item is used as a label, and the long-term portrait and the short-term portrait are the user's preference weights for each label; in step S5, a matrix similarity measurement method is used to determine the first similarity between the user's long-term portrait and each news group; in step S7, the matrix similarity measurement method is used to determine the second similarity between the user's short-term portrait and each of the first set number of news groups; in step S8, for each user, the news groups are sorted in descending order of second similarity and the top second set number of news groups are taken, giving the second set number of news groups for each user, and a user-news bipartite graph is constructed from each user and the news in that user's second set number of news groups, wherein the weight of each edge of the bipartite graph is set according to the user's rating of the news, a higher rating giving a greater weight.
The news recommendation method based on the user's short-term interest according to claim 11, wherein in step S7, a vector similarity measurement method is used to obtain the second similarity between the short-term portrait of the user and each news item in the first set number of news groups; in step S8, for each user, the news items are sorted in descending order of second similarity and the top third set number of news items are taken, giving the third set number of news items for each user, and a user-news bipartite graph is constructed from each user and that user's third set number of news items, wherein the weight of each edge of the bipartite graph is set according to the user's rating of the news.
The news recommendation method based on the user's short-term interest according to claim 1, wherein, in step S9, the step of using an absorbing random walk method on the user-news bipartite graph to select recommended news includes: first selecting an initial point; then jumping to a random point on the graph with probability p, the remaining probability 1-p being distributed among the adjacent points according to the edge weights; at every step jumping to a random point or to an adjacent point with these same probabilities, a transition matrix being used to calculate the jump probabilities; and, after several iterations, once the jump probabilities have stabilized, recommending the news item with the highest transition probability.
The news recommendation method based on users' short-term interests according to claim 1, wherein step S9 comprises:

In the user news bipartite graph, each user acts as a node, and each news also acts as a node. The random walk restart method is used to obtain the correlation value between the nodes;

Obtain the adjacent set of each user, formed by the nodes adjacent to that user's node; form the first sub-correlation matrix of each user from the correlation values between any two nodes in the adjacent set; take the reciprocal of the mean value of the off-diagonal elements of the first sub-correlation matrix as the bridging value of each user; and combine it with the bridging values of the user nodes in the adjacent set to form the bridging matrix of each user;

The correlation values between each user node, together with the user nodes in its adjacent set, and the news nodes in the adjacent set form the second sub-correlation matrix of each user;

The bridge matrix of each user and the second sub-correlation matrix are multiplied to obtain the recommended value of the news node;

The news nodes are sorted in descending order of recommended value, and the set number of top-ranked news items are recommended to the user.
The news recommendation method based on the user's short-term interest according to claim 14, wherein the step of using a random walk restart method to obtain correlation values between nodes comprises:

Taking a node as a starting node, and using a vector composed of the second similarity between the one node and other nodes as a restart vector, and calculating the jump probability between each node on the bipartite graph;

Compose the jump probability between the nodes into an adjacency matrix;

Iterative processing is performed on the adjacency matrix until the adjacency matrix converges, and the elements of the converged adjacency matrix are the correlation values between the one node and the other nodes.
A news recommendation device based on users' short-term interests is characterized in that it includes:

The collection module collects user behavior data on news, and the behavior data includes a news matrix;

The word vector matrix module obtains the corresponding word vector matrix according to the news matrix;

A clustering module, clustering the word vector matrix to obtain a grouping result of each news, and grouping each news into a corresponding news group according to the grouping result;

The user portrait acquisition module obtains the long-term portrait and the short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each news item, the long-term portrait and the short-term portrait being used to represent the user's preference for the word vectors corresponding to the words contained in the news;

The first similarity obtaining module analyzes the similarity between the long-term portrait of each user and different newsgroups to obtain multiple first similarities;

A preferred news group obtaining module, which sorts the plurality of first similarities in descending order and obtains a first set number of news groups corresponding to each user based on the sorting result;

The second similarity obtaining module analyzes the second similarity between the latest short-term portrait of each user and each news in the first set number of news groups;

A bipartite graph construction module, which constructs a user-news bipartite graph according to the second similarity;

The recommendation module uses the absorption random walk method to select the recommended news on the bipartite graph, thereby obtaining the recommended news of each user.
The news recommendation device based on a user's short-term interest according to claim 16, wherein the clustering module comprises:

The hierarchical clustering unit performs hierarchical clustering on the word vector matrix of the word vector matrix module to obtain a hierarchical clustering dendrogram, where one leaf node of the hierarchical clustering dendrogram corresponds to one news;

Dunn index obtaining unit, to obtain the Dunn index corresponding to each clustering result of the hierarchical clustering unit;

A cutting unit, cutting the hierarchical clustering dendrogram of the hierarchical clustering unit through the layer corresponding to the maximum Dunn index obtained by the Dunn index obtaining unit to obtain the best hierarchical clustering dendrogram;

The news grouping unit assigns the news items corresponding to leaf nodes that belong to the same parent node in the best hierarchical clustering dendrogram obtained by the cutting unit to the same news group, thereby obtaining the news group of each news item.
The news recommendation device based on the user's short-term interest according to claim 16, characterized in that it further comprises:

The topic matrix building module uses latent Dirichlet allocation (LDA) to analyze the word vector matrix to obtain the topic probability matrix of multiple topics of each news item and the word probability matrix of the different word vectors corresponding to each topic; the topic probability matrix, word probability matrix, and word vector matrix of each news item are combined to obtain the topic value of each news item, and the topic values of the news items constitute the topic matrix,

Wherein, the clustering module obtains the topic vector of each news group from the topic matrix constructed by the topic matrix building module; the first similarity obtaining module uses a vector similarity measurement method to determine the first similarity between the long-term portrait of the user and the topic vector of each news group; and the second similarity obtaining module uses a vector similarity measurement method to determine the second similarity between the short-term portrait of the user and each of the first set number of news groups.
An electronic device, characterized by comprising a memory and a processor, wherein the memory stores a news recommendation program based on the user's short-term interest, and the following steps are implemented when the news recommendation program based on the user's short-term interest is executed by the processor:

Step S1, collecting user behavior data on news, the behavior data including a news matrix;

Step S2: Obtain a corresponding word vector matrix according to the news matrix;

Step S3, clustering the word vector matrix to obtain a grouping result of each news, and grouping each news into a corresponding news group according to the grouping result;

Step S4: Obtain a long-term portrait and a short-term portrait of each user from the long-term behavior data and short-term behavior data of each user for each news item, the long-term portrait and the short-term portrait being used to represent the user's preference for the word vectors corresponding to the words contained in the news;

Step S5: Analyze the similarity between the long-term portrait of each user and different newsgroups to obtain multiple first similarities;

Step S6, sort the plurality of first similarities in descending order, and obtain a first set number of newsgroups corresponding to each user based on the sorting result;

Step S7, analyzing the second similarity between the latest short-term portrait of each user and each news in the first set number of newsgroups;

Step S8, construct a user news bipartite graph according to the second similarity;

Step S9: Use the absorption random walk method to select recommended news on the user news bipartite graph, so as to obtain the recommended news of each user.
A computer non-volatile readable storage medium, wherein the computer non-volatile readable storage medium includes a news recommendation program based on the user's short-term interest, and when the news recommendation program based on the user's short-term interest is executed by a processor, the steps of the news recommendation method based on the user's short-term interest as described in any one of claims 1 to 15 are implemented.