CN110674417B - Label recommendation method based on user attention relationship - Google Patents

Label recommendation method based on user attention relationship

Publication number
CN110674417B
Authority
CN
China
Prior art keywords
user
label
vector
network
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910902974.1A
Other languages
Chinese (zh)
Other versions
CN110674417A (en)
Inventor
赵鑫
侯宇蓬
陈俊华
文继荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China filed Critical Renmin University of China
Priority to CN201910902974.1A priority Critical patent/CN110674417B/en
Publication of CN110674417A publication Critical patent/CN110674417A/en
Application granted granted Critical
Publication of CN110674417B publication Critical patent/CN110674417B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a label recommendation method based on a user attention relationship, which specifically comprises the following steps: 1) generating user influence scores and label influence scores using the conventional PageRank algorithm; 2) training on the user attention network and the user-label network using a graph embedding model to generate user vectors and label vectors, and combining the user influence scores, label influence scores, user vectors and label vectors to recommend labels for the user. The method mines information from the user attention network and the user-label network, both of which contain rich information, so that the user feature information in the social network becomes richer and a service provider can better understand its users.

Description

Label recommendation method based on user attention relationship
Technical Field
The invention relates to the technical field of tag recommendation methods, in particular to a method for recommending tags to users by using a graph embedding technology based on attention relations among users in a social network.
Background
In recent years, microblogging services such as Twitter and Sina Weibo have attracted a large number of users and have formed social networks of great scale and influence. To better manage, organize and understand microblog users, the academic community has proposed the task of automatically recommending tags for them. Automatically recommended tags reveal hidden interests that a user may have and allow the user's preferences and social relationships to be understood along more dimensions. However, previous tag recommendation methods mainly mine the text data generated by users, while the attention relationships among users, another information-rich data type in microblogs, have not been adequately mined and used.
The PageRank algorithm is an algorithm for analyzing the influence of nodes by taking the number and quality of links between the nodes in a network as main factors. The basic assumptions are: more important nodes are more linked by other nodes, and nodes linked by important nodes are more important. The algorithm calculates an influence score for each node in the network, and a high score indicates that the node has a large influence in the network. A schematic diagram of the PageRank algorithm is shown in fig. 1.
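The two assumptions above can be illustrated with a minimal power-iteration sketch. This is not the patent's implementation: the damping factor 0.85, the uniform handling of nodes without outgoing links, and the toy edge list are all assumptions made for illustration.

```python
import numpy as np

def pagerank(edges, num_nodes, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed graph given as (src, dst) pairs."""
    out_degree = np.zeros(num_nodes)
    for src, _ in edges:
        out_degree[src] += 1
    scores = np.full(num_nodes, 1.0 / num_nodes)
    for _ in range(max_iter):
        new_scores = np.full(num_nodes, (1.0 - damping) / num_nodes)
        # Each node passes its score equally along its outgoing links.
        for src, dst in edges:
            new_scores[dst] += damping * scores[src] / out_degree[src]
        # Assumption: nodes without outgoing links spread their score uniformly.
        dangling = scores[out_degree == 0].sum()
        new_scores += damping * dangling / num_nodes
        if np.abs(new_scores - scores).sum() < tol:
            return new_scores
        scores = new_scores
    return scores

# Hypothetical toy network: users 0, 1 and 2 all link to user 3; user 3 links to user 0.
follow_edges = [(0, 3), (1, 3), (2, 3), (3, 0)]
influence = pagerank(follow_edges, num_nodes=4)
```

In this toy network user 3 is linked by the most nodes and receives the highest score, and user 0, who is linked by the important user 3, scores higher than users 1 and 2.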
Graph Embedding (Network Embedding) is a technology for Embedding high-dimensional and discrete graph/Network data into a low-dimensional and dense real vector space by a machine learning method. The embedded real space vector is more easily applied to common machine learning models than high-dimensional, discrete graph data.
The gradient descent method iteratively computes the minimum of the risk function on a training set $\mathcal{D}$.
The formula is expressed as follows:
$$\theta_{t+1} = \theta_t - \alpha \frac{\partial \mathcal{R}_{\mathcal{D}}(\theta_t)}{\partial \theta}$$
where $\theta_t$ is the parameter value at the t-th iteration, $\alpha$ is the learning rate, and $\mathcal{R}_{\mathcal{D}}(\theta)$ is the risk function on the training set $\mathcal{D}$.
The Stochastic Gradient Descent (SGD) method builds on gradient descent: in each iteration only one sample is drawn at random, the gradient of that sample's loss function is computed, and the parameters are updated. After a sufficient number of iterations, stochastic gradient descent can also converge to a locally optimal solution.
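As a minimal sketch of the single-sample update described above, the following fits a line by stochastic gradient descent. The squared-error loss, the learning rate, the step count and the toy data are assumptions chosen for illustration and are unrelated to the patent's objective.

```python
import random

def sgd_fit_line(samples, learning_rate=0.05, num_steps=5000, seed=0):
    """Fit y = w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(num_steps):
        x, y = rng.choice(samples)      # draw exactly one sample at random
        error = (w * x + b) - y
        # Gradient of 0.5 * error**2 with respect to w and b.
        w -= learning_rate * error * x
        b -= learning_rate * error
    return w, b

# Noise-free toy data generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
w, b = sgd_fit_line(data)
```

Each step touches one sample only, which is what distinguishes it from full gradient descent over the whole training set.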
The information disclosed in this background section is only for enhancement of understanding of the general background of the application and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The invention aims to provide a label recommendation method based on a user attention relationship, so as to solve the technical problems in the prior art.
In order to solve the technical problem, the invention provides a tag recommendation method based on a user attention relationship, which specifically comprises the following steps:
1) generating user influence scores and label influence scores using the conventional PageRank algorithm;
2) training on the user attention network and the user-label network using a graph embedding model to generate user vectors and label vectors, and combining the user influence scores, label influence scores, user vectors and label vectors to recommend labels for the user.
As a further technical scheme, the graph embedding model is divided into three parts: modeling explicit similarities between users, modeling implicit similarities between users, and modeling tag semantic information.
As a further technical solution, the modeling of explicit similarity between users specifically includes: sampling user attention relationships $u_1 \to u_2$ and optimizing the user vectors by stochastic gradient descent, so that the vector space and the probability distribution of attention relationships generated in the user attention network fit each other.
As a further technical solution, the probability distribution of attention relationships generated in the user attention network is characterized by the influence scores of the users: users with similar influence scores are more likely to form an attention relationship.
As a further technical solution, the modeling of the implicit similarity between users specifically includes: sampling user triples of two kinds, "jointly attending" (two users pay attention to the same user) and "jointly attended" (two users receive attention from the same user), mapping the original vector space by an affine transformation to a new vector space whose origin is the semantic node, and then optimizing the user vectors by stochastic gradient descent so that the probability distribution of triples generated in the new vector space and that in the user attention network fit each other.
As a further technical solution, the semantic node is the node in a "jointly attending" or "jointly attended" triple to which the two other users are both connected.
As a further technical solution, the modeling of the tag semantic information specifically includes: sampling user-label associations u-t and optimizing the user vectors and label vectors by stochastic gradient descent, so that the vector space fits the probability distribution of user-label associations generated in the user attention network and the user-label network.
By adopting the technical scheme, the invention has the following beneficial effects:
The method uses a graph embedding technique to mine the interest transfer relationships of users from the attention relationships among them and recommends labels to users accordingly, so that hidden interests a user may have become known and topics or users likely to interest the user can be recommended more accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description in the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a diagram of a prior art PageRank algorithm;
FIG. 2 is a schematic diagram of the present invention employing affine transformations on triples.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present invention will be further explained with reference to specific embodiments.
The invention provides a label recommendation method based on a user attention relationship, which specifically comprises the following steps: 1) generating user influence scores and label influence scores using the conventional PageRank algorithm; 2) training on the user attention network and the user-label network using a graph embedding model to generate user vectors and label vectors, and combining the user influence scores, label influence scores, user vectors and label vectors to recommend labels for the user.
The invention provides a novel graph embedding model based on a user attention network and a user label network, and further automatically recommends labels for users according to generated user/label vectors and influence scores.
In this embodiment, as a further technical solution, the graph embedding model is divided into three parts: modeling explicit similarities between users, modeling implicit similarities between users, and modeling tag semantic information. Each user u is represented by two vectors, an in-degree vector (Figure BDA0002212407610000051) and an out-degree vector (Figure BDA0002212407610000052).
In this embodiment, as a further technical solution, the modeling of the explicit similarity between users specifically includes: sampling user attention relationships $u_1 \to u_2$ and optimizing the user vectors by stochastic gradient descent, so that the vector space and the probability distribution of attention relationships generated in the user attention network fit each other. Specifically:
An attention relationship $u_1 \to u_2$ is sampled, and the vectors of user $u_1$ and user $u_2$ are updated by stochastic gradient descent, so that the probability distribution $p_1(u_1, u_2)$ generated by the link $u_1 \to u_2$ in the vector space fits the empirical probability distribution (Figure BDA0002212407610000053) of the corresponding links in the user attention network. The defining formulas are given in Figure BDA0002212407610000054 to Figure BDA0002212407610000057.
Here $a_u$ denotes the influence score of user $u$; $\Delta_{b,b'} = -\alpha \cdot |b - b'|$ measures the similarity between two real numbers $b$ and $b'$; $\mathcal{U}$ (Figure BDA0002212407610000058) denotes the set of all users; and the symbol in Figure BDA0002212407610000059 denotes the users to whom user $u_1$ pays attention. The optimization function is given in Figure BDA00022124076100000510, wherein $KL(\cdot,\cdot)$ denotes the KL divergence (Figure BDA00022124076100000511).
In this embodiment, as a further technical solution, the probability distribution of attention relationships generated in the user attention network is characterized by the influence scores of the users: users with similar influence scores are more likely to form an attention relationship.
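The influence-similarity term and the KL-divergence fit can be sketched as follows. Only $\Delta_{b,b'} = -\alpha \cdot |b - b'|$ and the use of KL divergence come from the text; the softmax form of the model distribution in `follow_distribution`, the function names, and the toy vectors are assumptions, since the patent's defining formulas survive only as image references.

```python
import numpy as np

def influence_similarity(b, b_prime, alpha=1.0):
    """Delta_{b,b'} = -alpha * |b - b'|: zero for equal scores, negative otherwise."""
    return -alpha * abs(b - b_prime)

def follow_distribution(source, out_vecs, in_vecs, influence, alpha=1.0):
    """ASSUMED model form (not from the patent): softmax over candidate targets
    of inner product plus influence similarity."""
    logits = np.array([
        out_vecs[source] @ in_vecs[v]
        + influence_similarity(influence[source], influence[v], alpha)
        for v in range(len(in_vecs))
    ])
    logits -= logits.max()              # subtract the maximum for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

def kl_divergence(empirical, model, eps=1e-12):
    """KL(empirical || model), skipping entries where the empirical mass is zero."""
    empirical = np.asarray(empirical, dtype=float)
    model = np.asarray(model, dtype=float)
    mask = empirical > 0
    return float(np.sum(empirical[mask] * np.log(empirical[mask] / (model[mask] + eps))))

# Hypothetical toy setup: 4 users with 3-dimensional vectors.
rng = np.random.default_rng(0)
out_vecs = 0.1 * rng.normal(size=(4, 3))
in_vecs = 0.1 * rng.normal(size=(4, 3))
influence = np.array([0.40, 0.35, 0.05, 0.20])
model = follow_distribution(0, out_vecs, in_vecs, influence)
empirical = np.array([0.0, 0.5, 0.5, 0.0])   # user 0 pays attention to users 1 and 2
loss = kl_divergence(empirical, model)
```

Under this assumed form, a candidate whose influence score is close to the source user's gets a larger logit, which matches the statement that users with similar influence scores are more likely to form an attention relationship. Stochastic gradient descent would then adjust the vectors to reduce `loss`.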
In this embodiment, as a further technical solution, the modeling of the implicit similarity between users specifically includes: sampling user triples of two kinds, "jointly attending" (two users pay attention to the same user) and "jointly attended" (two users receive attention from the same user), mapping the original vector space by an affine transformation to a new vector space whose origin is the semantic node, and then optimizing the user vectors by stochastic gradient descent so that the probability distribution of triples generated in the new vector space and that in the user attention network fit each other. Specifically:
The implicit-similarity part of the model samples both kinds of triples. Without loss of generality, take the "jointly attended" case as an example: $\langle u_1, u_2, u_3 \rangle$ denotes that $u_1$ and $u_2$ both receive attention from $u_3$. The model applies an affine transformation (as shown in Fig. 2) to map the original vector space into a new vector space whose origin is the out-degree vector of user $u_3$, and updates the vectors of $u_1$ and $u_2$ by stochastic gradient descent so that the probability distribution $p_2(u_1, u_2, u_3)$ generated by the triple in the new vector space fits the empirical probability distribution (Figure BDA0002212407610000061) of the corresponding triples in the user attention network. The defining formulas are given in Figure BDA0002212407610000062 and Figure BDA0002212407610000063, and the affine transformation in Figure BDA0002212407610000064. The symbols in Figure BDA0002212407610000065 and Figure BDA0002212407610000066 are defined analogously to the previous part and are not described again. The optimization function again uses the KL divergence.
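The idea of moving the origin to the semantic node can be sketched with the simplest affine map, a pure translation. The patent's actual transformation survives only as an image reference, so the translation, the choice of in-degree vectors for $u_1$ and $u_2$, and the inner-product comparison below are all assumptions.

```python
import numpy as np

def recenter_on_semantic_node(vec_u1, vec_u2, origin_vec):
    """Translate two user vectors so that origin_vec becomes the origin.
    ASSUMPTION: a pure translation stands in for the patent's affine map."""
    return vec_u1 - origin_vec, vec_u2 - origin_vec

# Hypothetical 2-dimensional vectors for the triple <u1, u2, u3>.
in_u1 = np.array([1.0, 2.0])     # assumed in-degree vector of u1
in_u2 = np.array([1.5, 2.5])     # assumed in-degree vector of u2
out_u3 = np.array([1.0, 1.0])    # out-degree vector of the semantic node u3

local_u1, local_u2 = recenter_on_semantic_node(in_u1, in_u2, out_u3)
# Similarity of u1 and u2 as seen from u3, measured by an inner product.
similarity = float(local_u1 @ local_u2)
```

After recentering, the two users are compared relative to the node that connects them rather than relative to the global origin, so the same pair can look similar from one semantic node and dissimilar from another.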
In this embodiment, as a further technical solution, the semantic node is the node in a "jointly attending" or "jointly attended" triple to which the two other users are both connected.
In this embodiment, as a further technical solution, the modeling of the tag semantic information specifically includes: sampling user-label associations u-t and optimizing the user vectors and label vectors by stochastic gradient descent, so that the vector space fits the probability distribution of user-label associations generated in the user attention network and the user-label network. Specifically:
First, the model concatenates the in-degree vector and the out-degree vector of each user to obtain the user vector (Figure BDA0002212407610000071, Figure BDA0002212407610000072), where the operator in Figure BDA0002212407610000073 denotes vector concatenation. Then a user-label link u-t is sampled, and the user vector (Figure BDA0002212407610000074) and the label vector (Figure BDA0002212407610000075) are updated by stochastic gradient descent so that the probability distribution $p_3(u, t)$ generated by the link u-t in the vector space fits the empirical probability distribution (Figure BDA0002212407610000076) of the corresponding links in the user attention network and the user-label network. The defining formulas are given in Figure BDA0002212407610000077 and Figure BDA0002212407610000078. The symbol in Figure BDA0002212407610000079 is defined analogously to the previous part and is not described again. The optimization function again uses the KL divergence.
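The concatenation step can be sketched in a few lines. The vector values and the dimension are hypothetical; the only point taken from the text is that the user vector is the in-degree vector joined with the out-degree vector, which implies that a label vector compared with it would need the combined dimension.

```python
import numpy as np

def build_user_vector(in_vec, out_vec):
    """Concatenate a user's in-degree and out-degree vectors into one user vector."""
    return np.concatenate([in_vec, out_vec])

# Hypothetical 2-dimensional in-degree and out-degree vectors for one user.
in_vec = np.array([0.1, 0.2])
out_vec = np.array([0.3, 0.4])
user_vec = build_user_vector(in_vec, out_vec)

# A label vector compared with user_vec needs the combined dimension (here 4).
label_vec = np.zeros(user_vec.shape[0])
```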
Finally, the similarity $s(u, t)$ between a user and a label is computed using the formula in Figure BDA00022124076100000710, and for each user $u$ the K labels with the highest $s(u, t)$ are selected for recommendation.
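Top-K selection can be sketched as below. The formula for $s(u, t)$ is not recoverable from the text, so the score used here (inner product of user and label vectors plus the influence-similarity term $-\alpha \cdot |a_u - a_t|$) is an assumption, as are the function name and the toy data; only the final "pick the K highest-scoring labels" step is taken from the source.

```python
import numpy as np

def recommend_top_k(user_vec, user_influence, tag_vecs, tag_influence, k=2, alpha=1.0):
    """Return the indices of the k best labels and all scores.
    ASSUMED score: s(u, t) = <user_vec, tag_vec> - alpha * |a_u - a_t|."""
    scores = tag_vecs @ user_vec - alpha * np.abs(tag_influence - user_influence)
    top = np.argsort(-scores)[:k]       # indices sorted by descending score
    return top.tolist(), scores

# Hypothetical data: one user and three candidate labels.
user_vec = np.array([1.0, 0.0])
tag_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
tag_influence = np.array([0.2, 0.2, 0.2])
top_tags, scores = recommend_top_k(user_vec, 0.2, tag_vecs, tag_influence, k=2)
```

With equal influence scores the ranking here is driven by the inner products alone, so labels 0 and 2 are returned in that order.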
In summary, the invention provides a tag recommendation method based on user attention relations, which is characterized in that interest transfer relations of users are mined from attention relations among the users by using a graph embedding technology, and then tags are recommended to the users, so that hidden interests possibly carried by the users can be known, and further topics or users possibly interested by the users can be better recommended to the users.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A label recommendation method based on a user attention relationship is characterized by specifically comprising the following steps:
1) generating user influence scores and label influence scores using the conventional PageRank algorithm;
2) training on the user attention network and the user-label network using a graph embedding model to generate user vectors and label vectors, and combining the user influence scores, label influence scores, user vectors and label vectors to recommend labels for the user;
the graph embedding model is divided into three parts: modeling explicit similarities between users, implicit similarities between users, and tag semantic information;
the modeling of the explicit similarity between users specifically comprises: sampling user attention relationships $u_1 \to u_2$ and optimizing the user vectors by stochastic gradient descent, so that the vector space and the probability distribution of attention relationships generated in the user attention network fit each other;
sampling a user attention relationship $u_1 \to u_2$ and updating the vectors of user $u_1$ and user $u_2$ by stochastic gradient descent, such that the probability distribution $p_1(u_1, u_2)$ generated by the link $u_1 \to u_2$ in the vector space fits the empirical probability distribution (Figure FDA0003463412920000019) of the corresponding links in the user attention network, wherein the defining formulas are given in Figure FDA00034634129200000110 to Figure FDA00034634129200000113;
here $a_u$ denotes the influence score of user $u$, $\Delta_{b,b'} = -\alpha \cdot |b - b'|$ measures the similarity between two real numbers, $\mathcal{U}$ denotes the set of all users, and the symbol in Figure FDA0003463412920000021 denotes the users to whom user $u_1$ pays attention; the optimization function is given in Figure FDA0003463412920000022, wherein $KL(\cdot,\cdot)$ denotes the KL divergence (Figure FDA0003463412920000023);
the similarity between a user and a label is computed using the formula in Figure FDA0003463412920000024, and for each user $u$ the K labels with the highest $s(u, t)$ are selected for recommendation.
2. The label recommendation method based on the user attention relationship according to claim 1, wherein the probability distribution for generating the attention relationship in the user attention network is characterized by the influence scores of the users, and the probability of forming the attention relationship between users with similar influence scores is higher.
3. The tag recommendation method based on user attention relationship according to claim 1, wherein the modeling of implicit similarity between users specifically comprises: sampling user triples of two kinds, "jointly attending" (two users pay attention to the same user) and "jointly attended" (two users receive attention from the same user), mapping the original vector space by an affine transformation to a new vector space whose origin is the semantic node, and then optimizing the user vectors by stochastic gradient descent so that the probability distribution of triples generated in the new vector space and that in the user attention network fit each other.
4. The tag recommendation method based on user attention relationship according to claim 3, wherein the semantic node is the node in a "jointly attending" or "jointly attended" triple to which the two other users are both connected.
5. The user attention relationship-based tag recommendation method according to claim 1, wherein the modeling of tag semantic information specifically comprises: sampling user-label associations u-t and optimizing the user vectors and label vectors by stochastic gradient descent, so that the vector space fits the probability distribution of user-label associations generated in the user attention network and the user-label network.
CN201910902974.1A 2019-09-24 2019-09-24 Label recommendation method based on user attention relationship Active CN110674417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910902974.1A CN110674417B (en) 2019-09-24 2019-09-24 Label recommendation method based on user attention relationship

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910902974.1A CN110674417B (en) 2019-09-24 2019-09-24 Label recommendation method based on user attention relationship

Publications (2)

Publication Number Publication Date
CN110674417A CN110674417A (en) 2020-01-10
CN110674417B true CN110674417B (en) 2022-03-11

Family

ID=69078571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910902974.1A Active CN110674417B (en) 2019-09-24 2019-09-24 Label recommendation method based on user attention relationship

Country Status (1)

Country Link
CN (1) CN110674417B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740366A (en) * 2016-01-26 2016-07-06 哈尔滨工业大学深圳研究生院 Inference method and device of MicroBlog user interests
CN108804689A (en) * 2018-06-14 2018-11-13 合肥工业大学 The label recommendation method of the fusion hidden connection relation of user towards answer platform
CN110188272A (en) * 2019-05-27 2019-08-30 南京大学 A kind of community's question and answer web site tags recommended method based on user context

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Predicting Scientific Impact via Heterogeneous Academic Network Embedding;Chunjing Xiao等;《PRICAI 2019: Trends in Artificial Intelligence》;20190823;第555-556页 *

Also Published As

Publication number Publication date
CN110674417A (en) 2020-01-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant