CN111931903B - Network alignment method based on double-layer graph attention neural network - Google Patents

Info

  • Publication number: CN111931903B (application CN202010654776.0A; other version: CN111931903A)
  • Authority: CN (China)
  • Prior art keywords: user, node, vector, network, social network
  • Legal status: Active, granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
  • Inventors: Lu Meilian (卢美莲), Dai Yinlong (戴银龙)
  • Original and current assignee: Beijing University of Posts and Telecommunications (the listed assignees may be inaccurate; Google has not performed a legal analysis)
  • Original language: Chinese (zh)
  • Application CN202010654776.0A filed by Beijing University of Posts and Telecommunications; published as CN111931903A; granted and published as CN111931903B

Classifications

  • G06N 3/045: Neural networks; architecture, e.g. interconnection topology; combinations of networks
  • G06N 20/00: Machine learning
  • G06N 3/084: Learning methods; backpropagation, e.g. using gradient descent
  • G06Q 50/01: ICT specially adapted for specific business sectors; social networking
  • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Human Resources & Organizations (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a network alignment method based on a double-layer graph attention neural network, which comprises two stages: network embedded representation and embedded vector space alignment. In the network embedded representation stage, a double-layer graph attention neural network is proposed for network representation learning, so as to extract the embedded vector of each user in a social network; in the embedded vector space alignment stage, a classification model is constructed from the obtained embedded vectors of the social network user nodes and a partially known anchor link set to predict anchor links between different social networks, and a bidirectional embedded vector space alignment strategy is proposed to satisfy the one-to-one matching constraint of user entities between different social networks. With this arrangement, the method can effectively capture the different influence weights among users, neighbor users and features in the social network, so as to learn an accurate representation of each user in the social network and improve the accuracy of anchor link prediction between different social networks.

Description

Network alignment method based on double-layer graph attention neural network
Technical Field
The invention relates to the technical field of data mining and machine learning, in particular to a network alignment method based on a double-layer graph attention neural network.
Background
With the rapid development of the internet and mobile devices, online social networks have become an indispensable platform for people to share and exchange information. Because different social platforms provide different services, a person will typically register accounts on multiple social network platforms at the same time to meet different needs. These users shared by different social networking platforms naturally form anchor links that connect the different social networks, facilitating information interaction between them. Mining information interactions across multiple social domains can be effectively applied to a variety of downstream social network applications such as cross-domain link prediction, cross-domain recommendation and cross-domain information dissemination. However, these social networking platforms are usually maintained individually by different companies and are to some degree informationally isolated from each other. Aligning accounts belonging to the same user across different social platforms has therefore become an urgent research topic. Current research on network alignment methods can be broadly divided into two categories: unsupervised network alignment methods and supervised network alignment methods.
(1) Unsupervised network alignment methods: methods based on unsupervised network alignment models attempt to align user accounts between different social networks without known anchor links. In this type of approach, researchers typically measure user similarity between different social networks based on the rarity of usernames in the social network and the consistency of neighborhood structure, and then predict anchor links using greedy methods or methods that minimize the structural inconsistency between the two social networks.
(2) Supervised network alignment methods: the general idea of supervised network alignment models is to convert the alignment problem between different social networks into a classification problem about anchor links, i.e., to determine whether any two users from different social networks have an anchor link relationship. Early studies built classification models by manually extracting certain features of users in social networks; although this solves the alignment problem for some users in some social network scenarios to a certain extent, it still has significant limitations. First, manually extracting user features is very tedious, one cannot directly judge which features are effective, and the effective features may differ between social network scenarios. Second, because social network platforms protect user privacy, part of a user's real information is often hidden, so some information is missing when user features are extracted manually, which affects the accuracy of the anchor link prediction task.
In recent years, motivated by the wide and successful application of network representation learning in single-social-network analysis tasks, some researchers have begun to apply network representation learning to network alignment tasks across multiple social networks. This type of approach attempts to learn a common embedded vector space for users in different social networks without manually extracting the effective features of the users. These approaches, while attempting to model user behavior in the social network from the user's social structure and profile information, ignore, when capturing the user node representation, both the different influence weights of different neighboring user nodes and the different influence weights of different attribute information on user information interactions.
In view of the importance of network alignment research to multi-social-network analysis tasks and the limitations of existing research, the present invention aims to propose a network alignment method based on a double-layer graph attention neural network, which combines a user-level attention mechanism and a feature-level attention mechanism so that the model can learn an accurate representation of each user in the social network and improve the prediction accuracy of anchor links.
Disclosure of Invention
In view of the above, the invention aims to provide a network alignment method based on a double-layer graph attention neural network, which can effectively capture different influence weights among users, neighbor users and among features in a social network while modeling social behaviors of the users by using information such as attributes, local social structures, global social structures and the like of the users in the social network, thereby learning accurate representation of the users in the social network and improving accuracy of anchor link prediction among different social networks.
Based on the above object, the present invention provides a network alignment method based on a double-layer graph attention neural network, which is characterized by comprising:
basic definition: social network abstraction is a directed graph g= (V, E, X), where v= { V i I=1, …, N } represents a set of user nodes in the social network, N being the number of user nodes in the social network; e= { E i,j =(v i ,v j )|v i ∈V,v j E V represents a set of relationships between users in a social network, e i,j =(v i ,v j ) Representing user v i And user v j The association relation exists between the two; x= { X i I=1, …, N } represents the features of all usersVector set, for each user node v i All have a node feature vector x i Correspondingly, the feature vector can be extracted from the personal data, the behavior and the network social structure information of the user node, without losing generality, the two networks to be aligned are named as a source social network and a target social network, and G is used for each network s And G t A representation;
for any two users from different social networks
Figure BDA0002576285310000031
And->
Figure BDA0002576285310000032
We use
Figure BDA0002576285310000033
Representing an anchor linkage relationship between a source social network and a target social network, wherein +.>
Figure BDA0002576285310000034
And->
Figure BDA0002576285310000035
Is that the same user is respectively in different social networks G s And G t An account in (a); the anchor links are one-to-one link relations between two users in different social networks, and the situation that the two anchor links share the same user account of the same social network does not exist;
two different social networks G s And G t All the anchor links between are defined as the set of anchor links, and
Figure BDA0002576285310000036
representation of->
Figure BDA0002576285310000037
Representing a user account in a source social network, +.>
Figure BDA0002576285310000038
Representing a user account in a target social network; for two different social networks G s =(V s ,E s ,X s ) And G t =(V t ,E t ,X t ) Network alignment aims at finding a set of anchor link sets T between two social networks, where any element e 'in set T' ij E T represents two user accounts +. >
Figure BDA0002576285310000039
And->
Figure BDA00025762853100000310
An anchor link between the two;
S1, a network preprocessing module: preprocessing the social network according to the input network type and the user attribute information it contains, and constructing an initialized user node feature vector matrix;
S2, a network embedded representation module: taking the initialized user node feature vector matrix obtained by the network preprocessing module and the adjacency matrix of the social network as inputs, and capturing the complex information interaction relations of users in the social network through a double-layer graph attention neural network, so as to learn the latent information of the user nodes and obtain accurate user node embedded vectors;
S3, an embedded vector space alignment module: constructing a classification model from the user node embedded vectors of the source and target social networks learned in S2 to predict anchor links, and adopting a bidirectional embedded vector space alignment strategy to satisfy the constraint of one-to-one matching of user accounts between different social networks;
S4, taking the intersection of the anchor links predicted in the two alignment directions, completing the network alignment.
Preferably, the step S2 includes the following: the feature vector of user v_i is expressed as x_i. Based on the network type, various feature vectors of the user are extracted and stacked laterally to generate the initialized feature vector representation x_i ∈ R^d of the user, where d represents the dimension of the user-initialized feature vector (d', d'' and d''' appearing hereinafter represent other dimensions). The initialized feature vectors of all users in the social network are built into a matrix X, where each row is the feature vector of a particular user node: X = (x_1, x_2, …, x_N)^T.
Preferably, the network type is a topological network, the feature vector of the user is randomly initialized in a random matrix mode, and the weight parameters of the random matrix are learned through the training stage of the double-layer diagram attention neural network model.
Preferably, the network type is an attribute network, and the user attributes are vectorized as follows: user information such as the username is randomly initialized in a word-embedding manner to obtain the username feature vector; a Doc2Vec model is adopted to mine the user's language style from the user's long text information and learn the user's text feature vector; the user's trajectory information is vector-initialized through spatial clustering to obtain the user's spatial feature vector; and the user's score and number of check-ins are each taken directly as one feature dimension of the user and vector-initialized.
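As a concrete illustration of the attribute vectorization described above, here is a minimal numpy sketch: the short-text username attribute is randomly initialized (to be refined during training), while the score and check-in count are each used directly as one feature dimension. `EMBED_DIM`, the function name and the toy inputs are illustrative assumptions, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 8  # assumed embedding size for the short-text (username) attribute

def init_user_features(usernames, scores, checkin_counts):
    """Build an initialized feature matrix by laterally stacking attribute vectors.

    Username vectors are randomly initialized (refined later in training);
    score and check-in count are each used directly as one feature dimension.
    """
    n = len(usernames)
    name_vecs = rng.normal(size=(n, EMBED_DIM))            # random embedding init
    numeric = np.column_stack([scores, checkin_counts])    # one dimension each
    return np.hstack([name_vecs, numeric])                 # -> (n, EMBED_DIM + 2)

X = init_user_features(["alice", "bob"], scores=[4.5, 3.0], checkin_counts=[12, 3])
```

A full implementation would stack further blocks for the Doc2Vec text vectors and the spatially clustered trajectory vectors in the same lateral fashion.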
Preferably, the step S2 includes the following:
S2.1, embedding a user layer node into a representation sub-module: capturing different influence weights among users to carry out weighted aggregation on local neighborhood information of the users in the social network, so that node embedding vectors of user levels are learned;
s2.2, embedding a characteristic layer node into a representation sub-module: the method is responsible for learning influence weights among different features of the user so as to capture interaction relations among the features with finer granularity, and therefore node embedding vectors of the user at the feature level are learned;
s2.3, embedding a vector fusion sub-module: and the method is responsible for reserving and resetting the user embedded vectors from different layers of the user level and the feature level so as to fuse the node embedded vectors of multiple views and improve the accuracy of network alignment tasks.
Preferably, S2.1 includes the following: a learnable transformation matrix W_1 ∈ R^{d'×d} converts the input vector into a higher-dimensional vector, namely:

h_i = W_1 · x_i

According to the graph attention neural network, for any two user nodes v_i and v_j, the relation strength e_ij between the two user nodes is first calculated:

e_ij = LeakyReLU(a^T · [W·h_i^(l) ∥ W·h_j^(l)])

where h_i^(l) and h_j^(l) represent the embedded vectors of user nodes v_i and v_j at layer l, W represents the weight parameter of layer l, "∥" is the concatenation operator representing the lateral concatenation of two vectors, and LeakyReLU(·) is the activation function of the neuron. When aggregating the neighborhood information of user node v_i, the information contributions from different neighbors are normalized by applying a softmax(·) function to the relation strengths between the user node and all its neighbor user nodes v_k ∈ N(v_i), calculated as follows:

a_ij = exp(e_ij) / Σ_{v_k ∈ N(v_i)} exp(e_ik)

a_ij is called the attention coefficient between user nodes v_i and v_j; the larger the value of a_ij, the closer the relationship between the two users. Having calculated the attention coefficients between user node v_i and all its neighbor user nodes (including itself), the new embedded vector of each user node v_i may be defined as follows:

h_i = δ(Σ_{v_j ∈ N(v_i)} a_ij · W·h_j)

where δ(·) is the activation function of the neuron. By linearly aggregating the neighborhood information of each user with the different influence weights, the user-level node embedding vector h_i of each user in the social network is obtained, forming the user-level vector matrix M = (h_1, h_2, …, h_N)^T.
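The user-level attention computation above (linear transform, relation strength e_ij, softmax normalization over the neighborhood, weighted aggregation) can be sketched in numpy as follows; the concrete shapes, the self-loop convention and tanh as the activation δ are assumptions made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def user_level_attention(X, adj, W, a):
    """One user-level graph-attention step.

    X: (N, d) input features; adj: (N, N) 0/1 adjacency including self-loops;
    W: (d2, d) shared transform; a: (2*d2,) attention vector.
    """
    H = X @ W.T                                         # W x_i for every node
    d2 = H.shape[1]
    # e_ij = LeakyReLU(a^T [W h_i || W h_j]), split into the two halves of a
    e = leaky_relu((H @ a[:d2])[:, None] + (H @ a[d2:])[None, :])
    e = np.where(adj > 0, e, -np.inf)                   # restrict to N(v_i)
    att = np.exp(e - e.max(axis=1, keepdims=True))      # softmax over neighbours
    att = att / att.sum(axis=1, keepdims=True)          # attention coefficients a_ij
    return np.tanh(att @ H)                             # weighted linear aggregation

N, d, d2 = 4, 5, 3
adj = np.eye(N) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
H_user = user_level_attention(rng.normal(size=(N, d)), adj,
                              rng.normal(size=(d2, d)), rng.normal(size=2 * d2))
```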
Preferably, S2.2 comprises the following: the user-level vector matrix M = (h_1, h_2, …, h_N)^T obtained in S2.1 is taken as input, and the multi-dimensional attention between the features of any two user nodes in the social network is taken into account, i.e., an attention coefficient is calculated for each corresponding dimension of the two user node vectors.

Let h_i and h_j respectively represent two user nodes v_i and v_j in the social network; the relationship between two user node embedded vectors can be defined as:

f(h_i, h_j) = W_5 · tanh(W_4 · h_i + W_3 · h_j + b_2) + b_1

where W_3, W_4 and W_5 are parameter matrices, b_1 and b_2 are bias terms, and tanh(·) is the activation function of the neuron.

A feed-forward neural network is used to calculate, based on f(h_i, h_j), the dependency relationship of any two user nodes at the feature level. Let β_ij represent the attention coefficient vector of the feature layer of user nodes v_i and v_j, and [β_ij]_k its k-th component. To facilitate comparing the attention coefficients of corresponding feature dimensions between different attention coefficient vectors, the attention coefficient vectors of all neighbors of the user are normalized per feature dimension using a softmax(·) function:

[β_ij]_k = exp([f(h_i, h_j)]_k) / Σ_{v_l ∈ N(v_i)} exp([f(h_i, h_l)]_k)

After computing the attention coefficients [β_ij]_k between each feature dimension of any two users, the coefficients can be combined, by corresponding feature dimension, into an attention coefficient vector β_ij = ([β_ij]_1, [β_ij]_2, …, [β_ij]_{d''}) between the two users. The attention vector has the same dimension as the user node vector, and each component [β_ij]_k is the influence weight of the corresponding dimension of the user node vector; the larger [β_ij]_k is, the stronger the association of the k-th-dimension features of the two user nodes v_i and v_j. Finally, for any user node v_i in the social network, the embedded vectors of its neighbor users are weighted and linearly aggregated according to the learned attention coefficient vectors; unlike the aggregation of the user-level attention mechanism, the aggregation function of the feature-level attention mechanism aggregates neighborhood information by element-wise multiplication:

ĥ_i = δ(Σ_{v_j ∈ N(v_i)} β_ij ⊙ h_j)

where ⊙ denotes the element-wise product of two vectors of the same shape (yielding a vector of the same shape) and δ(·) is the activation function of the neuron. By linearly aggregating the neighborhood information of each user according to the influence weights of the different features, the node embedded representation ĥ_i of each user at the feature level is obtained, forming the feature-level vector matrix M̂ = (ĥ_1, ĥ_2, …, ĥ_N)^T.
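The feature-level mechanism differs from the user-level one in that it produces one attention coefficient per feature dimension and aggregates by element-wise multiplication. A numpy sketch under assumed shapes (square parameter matrices, self-loops in the adjacency, tanh as the final activation δ):

```python
import numpy as np

rng = np.random.default_rng(2)

def feature_level_attention(H, adj, W3, W4, W5, b1, b2):
    """Feature-level attention: a coefficient per feature dimension per pair.

    H: (N, d2) user-level vectors; adj: (N, N) adjacency with self-loops;
    W3, W4, W5: (d2, d2) parameter matrices; b1, b2: (d2,) bias terms.
    """
    # f(h_i, h_j) = W5 . tanh(W4 h_i + W3 h_j + b2) + b1, for all node pairs
    f = np.tanh((H @ W4.T)[:, None, :] + (H @ W3.T)[None, :, :] + b2) @ W5.T + b1
    f = np.where(adj[:, :, None] > 0, f, -np.inf)       # restrict to N(v_i)
    beta = np.exp(f - f.max(axis=1, keepdims=True))     # per-dimension softmax
    beta = beta / beta.sum(axis=1, keepdims=True)       # [beta_ij]_k coefficients
    # element-wise weighted aggregation over the neighbourhood
    return np.tanh(np.einsum('ijk,jk->ik', beta, H))

N, d2 = 4, 3
adj = np.eye(N) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
H = rng.normal(size=(N, d2))
W3, W4, W5 = (rng.normal(size=(d2, d2)) for _ in range(3))
H_feat = feature_level_attention(H, adj, W3, W4, W5,
                                 rng.normal(size=d2), rng.normal(size=d2))
```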
Preferably, S2.3 comprises the following: the embedded vector fusion sub-module takes the user-level vector matrix M = (h_1, h_2, …, h_N)^T and the feature-level vector matrix M̂ = (ĥ_1, ĥ_2, …, ĥ_N)^T as input, and uses a gating mechanism to automatically learn the weight parameters of the node embedded vectors of the same user at the different levels, so as to effectively retain and reset the information representations from the different levels.

For any user node v_i in the social network, the module first calculates a weight relation vector between the user-level node embedding vector h_i and the feature-level node embedding vector ĥ_i, as follows:

F = sigmoid(W_6 · h_i + W_7 · ĥ_i + b_3)

where W_6 and W_7 are parameter matrices of the gated neural network, b_3 is a bias term and sigmoid(·) is the activation function of the neuron. According to the learned weight relation vector F, the user node embedded vectors from the different layers can be selectively retained and reset, and the final user node embedded vector is represented as:

z_i = F ⊙ h_i + (1 − F) ⊙ ĥ_i

where F ⊙ h_i represents the selective retention of the user-level node embedded vector and (1 − F) ⊙ ĥ_i represents the selective reset of the feature-level node embedded vector; here 1 − F is a vector operation, subtracting each dimension of the vector F from 1.

By fusing the user-level node embedded representation and the feature-level node embedded representation of each user with the gating mechanism, the final node embedded representation z_i of each user in the social network is obtained, forming the node embedding vector matrix Z = (z_1, z_2, …, z_N)^T.
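The gated retention-and-reset of the two views can be sketched directly from the two formulas above; the shapes and toy inputs are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_user, h_feat, W6, W7, b3):
    """Fuse user-level and feature-level embeddings with a gate F.

    F = sigmoid(W6 h_i + W7 h_hat_i + b3); z_i = F * h_i + (1 - F) * h_hat_i,
    so each dimension of z_i interpolates between the two views.
    """
    F = sigmoid(h_user @ W6.T + h_feat @ W7.T + b3)   # weight relation vector
    return F * h_user + (1.0 - F) * h_feat            # retain / reset per dimension

d2 = 3
h_u = rng.normal(size=(4, d2))
h_f = rng.normal(size=(4, d2))
Z = gated_fusion(h_u, h_f, rng.normal(size=(d2, d2)),
                 rng.normal(size=(d2, d2)), rng.normal(size=d2))
```

Because the gate is a convex combination per dimension, every component of z_i lies between the corresponding components of the two input views.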
For any given pair of user nodes v_i and v_j in the social network, whose node embedded vectors are z_i and z_j respectively, the probability of an edge between the two nodes can be expressed as:

p(v_i, v_j) = σ(z_i^T · z_j)

where σ(x) = 1 / (1 + e^{−x}) is the sigmoid function.

To optimize the model parameters of the double-layer graph attention neural network, we need to define an objective function of the model, whose goal is to maximize the probability of the observable edges in the social network, namely:

O = Σ_{e_{i,j} ∈ E} log p(v_i, v_j)

To avoid trivial solutions, for each observable edge e_{i,j} = (v_i, v_j) we employ a negative sampling technique to maximize the objective function, namely:

log σ(z_i^T · z_j) + Σ_{k=1}^{K} E_{v_k ∼ P_n(v)} [log σ(−z_i^T · z_k)]

where the first term models the positive examples in the social network, and the second term models negative examples by randomly generating edges associated with the nodes through the negative sampling technique; the probability of each node being sampled satisfies P_n(v) ∝ d_v^{3/4}, K represents the number of sampled negative edges, and d_v represents the degree of user node v. With this objective function, the parameters of the double-layer graph attention neural network model can be learned by a back-propagation optimization algorithm, yielding the node vector matrices of the source social network and the target social network, Z^s ∈ R^{|V^s|×d'''} and Z^t ∈ R^{|V^t|×d'''} respectively, where |V^s| represents the number of user nodes in the source social network and |V^t| the number of user nodes in the target social network; z_i^s represents the node embedded vector of user v_i^s in the source social network, and z_j^t represents the node embedded vector of user v_j^t in the target social network; Z^s and Z^t are also referred to as the embedded vector spaces corresponding to the source social network and the target social network.
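The negative-sampling objective can be sketched as follows; the sampling distribution P_n(v) ∝ d_v^{3/4} is the word2vec-style convention assumed here, and the function name and toy inputs are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def log_sigmoid(x):
    return -np.logaddexp(0.0, -x)   # numerically stable log(sigmoid(x))

def edge_objective(Z, edges, degrees, K=5):
    """Negative-sampling objective over the observed edges (to be maximized).

    Positive term: log sigma(z_i . z_j) per observed edge; negative term:
    K nodes drawn with probability proportional to degree^(3/4).
    """
    p = degrees ** 0.75
    p = p / p.sum()
    total = 0.0
    for i, j in edges:
        total += log_sigmoid(Z[i] @ Z[j])             # observed (positive) edge
        for k in rng.choice(len(Z), size=K, p=p):
            total += log_sigmoid(-(Z[i] @ Z[k]))      # sampled negative edge
    return total

Z = rng.normal(size=(6, 3))
loss = edge_objective(Z, edges=[(0, 1), (2, 3)],
                      degrees=np.array([3.0, 2.0, 2.0, 1.0, 1.0, 1.0]))
```

In a trained model the gradient of this objective would flow back through the attention layers; here the sketch only evaluates the objective value.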
Preferably, the step S3 includes the following: based on step S2, the node embedded vector matrix Z^s of the source social network and the node embedded vector matrix Z^t of the target social network are obtained. Each row of a node vector matrix represents the node embedded vector of one user in the social network, and the whole node vector matrix is also called the embedded vector space corresponding to that social network.

A mapping function M is defined that can map a user node vector from one embedded vector space to another. Assuming we now project the source social network into the target social network to find target nodes matching the source nodes, the objective function can be defined as:

min_θ Σ_{(v_i^s, v_j^t) ∈ T'} ∥ M^{s→t}(z_i^s; θ) − z_j^t ∥

where M^{s→t}(·) represents the mapping function from the source social network to the target social network, here constructed with a multi-layer perceptron, θ is the weight parameter of the multi-layer perceptron, and T' is the known anchor link set used for training. The objective function aims at minimizing, over the user pairs with an anchor link relation, the distance between the source user node mapped into the target social network and the target user node; a classification model is thereby constructed to predict whether any two users between different social networks have an anchor link, and the target user node nearest to the projected source user node is selected to construct a candidate anchor link.
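A hypothetical sketch of this mapping-and-matching step: a small multi-layer perceptron stands in for M^{s→t}, and each mapped source node is matched to its nearest target node to form candidate anchor links. All shapes and names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

def mlp_map(Zs, W1, b1, W2, b2):
    """Two-layer perceptron standing in for the mapping M_{s->t}."""
    return np.tanh(Zs @ W1.T + b1) @ W2.T + b2

def candidate_links(Zs_mapped, Zt):
    """Nearest target node (Euclidean) for each mapped source node."""
    dist = np.linalg.norm(Zs_mapped[:, None, :] - Zt[None, :, :], axis=2)
    return dist.argmin(axis=1)

Zs = rng.normal(size=(4, 3))     # source embedded vector space
Zt = rng.normal(size=(5, 3))     # target embedded vector space
W1, W2 = rng.normal(size=(8, 3)), rng.normal(size=(3, 8))
matches = candidate_links(mlp_map(Zs, W1, np.zeros(8), W2, np.zeros(3)), Zt)
```

In practice the perceptron weights would be trained on the known anchor link set before candidate links are extracted.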
From the above, it can be seen that the invention models the complex interaction behavior of users in the social network using a graph neural network together with the users' attributes, local social structure and global social structure, and can thereby obtain more accurate user node embedded vectors;
the invention provides a double-layer graph annotation intention neural network to learn the attention coefficients among users with user levels and feature levels respectively, and captures the difference of influence weights among different users from multiple perspectives, so that the learned user node embedding representation is more in line with the actual situation of the users in the social network;
the invention provides a bidirectional embedded vector space alignment strategy to predict anchor links among different social networks, so that users among different social networks are aligned and one-to-one matching constraint relation is satisfied. Meanwhile, the accuracy of the anchor link prediction is improved by further confirming the bidirectional embedded vector space alignment strategy.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 is a method framework of the present invention;
FIG. 2 is a schematic diagram of the fusion of user node embedded representations from different perspectives in the present invention;
FIG. 3 is a schematic diagram of the bi-directional embedded vector space alignment strategy of the present invention.
Detailed Description
The present invention will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent.
It should be noted that unless otherwise defined, technical or scientific terms used in the embodiments of the present invention should be given the ordinary meaning as understood by one of ordinary skill in the art to which the present disclosure pertains. The terms "first," "second," and the like, as used in this disclosure, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.
As shown in fig. 1 to 3, in this embodiment:
first, the present invention describes the social network alignment problem as follows:
A social network is abstracted as a directed graph $G=(V,E,X)$, where $V=\{v_i \mid i=1,\dots,N\}$ is the set of user nodes in the social network and $N$ is the number of user nodes in the social network; $E=\{e_{i,j}=(v_i,v_j) \mid v_i \in V, v_j \in V\}$ is the set of relationships between users in the social network, where $e_{i,j}=(v_i,v_j)$ indicates an association between user $v_i$ and user $v_j$; $X=\{x_i \mid i=1,\dots,N\}$ is the set of feature vectors of all users. Each user node $v_i$ has a corresponding node feature vector $x_i$, which may be extracted from the node's profile data, behavior, structural attributes, and the like. For each edge $e_{i,j}$ in the relationship set, let $w_{i,j}$ denote the weight of the edge: if the two users have a link relationship in the social network, $w_{i,j}=1$; otherwise $w_{i,j}=0$. The matrix

$$A = (w_{i,j}) \in \mathbb{R}^{N \times N}$$

is called the adjacency matrix of graph $G$. Without loss of generality, we call the two networks to be aligned the source social network and the target social network, denoted $G^s$ and $G^t$ respectively.
For any two users $v_i^s \in V^s$ and $v_j^t \in V^t$ from different social networks, we use $e'_{i,j}=(v_i^s, v_j^t)$ to represent the association between the two user nodes, with $w'_{i,j}$ as the relationship weight. If $v_i^s$ and $v_j^t$ are two accounts of the same user in the source social network and the target social network respectively, then $w'_{i,j}=1$ indicates an anchor link relationship between the two users; otherwise $w'_{i,j}=0$. The goal of network alignment is to find the anchor link set between the different social networks,

$$T = \{\, e'_{i,j} = (v_i^s, v_j^t) \mid w'_{i,j} = 1 \,\},$$

where user node $v_i^s$ comes from the source social network, user node $v_j^t$ comes from the target social network, and $w'_{i,j}=1$ means that the two user nodes $v_i^s$ and $v_j^t$ belong to the same user entity in the real world.
Next, referring to fig. 1, the present invention provides a network alignment method framework comprising three modules: a network preprocessing module, a network embedded representation module, and an embedded vector space alignment module.
The network preprocessing module: for a social network $G=(V,E)$, we first preprocess the network according to the type of the input network and the information it contains, so as to construct an initialized user vector matrix. Common network types can be divided into topology networks and attribute networks. For a topology network, the invention uses an embedding layer to randomly initialize the node embedding vector $x_i$ of each user; the weight parameters of the embedding layer are learned during the training stage of the double-layer graph attention neural network model. For an attribute network, user attributes are vectorized in different ways depending on the information they contain. Short text attributes such as a user name may be randomly initialized with an embedding layer; for long text attributes such as user comments, methods such as topic models are generally adopted to learn the topic preferences of users; for a user's check-in information, users are considered to have similar access preferences toward merchants in the same area, so the check-in information is initialized with a spatial clustering method; numerical attributes such as user ratings and check-in counts can be used directly as one dimension of the user attributes. After extracting the attribute vectors of a user's various attributes, we laterally concatenate these attribute vectors to generate the user's initialization vector in the attribute network. Thus, for each user in the social network, we ultimately generate an initialized representation $x_i \in \mathbb{R}^d$, where $d$ is the dimension of the user initialization vector. The initialization vectors of all users in the social network form a feature matrix $X$, where each row is the feature vector of a specific user node: $X = (x_1, x_2, \dots, x_N)^T$.
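As a concrete illustration of the preprocessing step, the sketch below builds each user's initialization vector by laterally concatenating per-attribute vectors and stacks them into the feature matrix X. The specific attribute choices (a 4-dimensional random user-name embedding, a 2-dimensional topic vector, a 3-region check-in one-hot, a scalar rating) are hypothetical, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_user_vector(name_dim=4, topic_vec=None, region_onehot=None, rating=None):
    """Build one user's initialization vector by laterally concatenating
    attribute vectors, as the preprocessing module describes."""
    parts = [rng.normal(size=name_dim)]          # short text: random embedding
    if topic_vec is not None:
        parts.append(np.asarray(topic_vec, dtype=float))      # long text: topic preferences
    if region_onehot is not None:
        parts.append(np.asarray(region_onehot, dtype=float))  # check-ins: spatial cluster
    if rating is not None:
        parts.append(np.array([rating], dtype=float))         # numeric attribute as one dim
    return np.concatenate(parts)

# Stack all users' vectors into the feature matrix X (one row per user).
users = [init_user_vector(topic_vec=[0.2, 0.8], region_onehot=[0, 1, 0], rating=4.5)
         for _ in range(3)]
X = np.stack(users)   # shape (N, d) with d = 4 + 2 + 3 + 1 = 10
```

Each row of `X` corresponds to one user's $x_i$; the random parts stand in for embedding-layer weights that the model would later learn.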
Network embedded representation module: this module takes the initialized user node feature vector matrix $X=(x_1, x_2, \dots, x_N)^T$ produced by the network preprocessing module and the adjacency matrix $A \in \mathbb{R}^{N \times N}$ of the social network as inputs, and captures the complex information interactions of users in the social network through a double-layer graph attention neural network so as to learn a latent information representation $z_i$ for each user node. The module is further subdivided into three sub-modules: a user-layer node embedding representation sub-module, a feature-layer node embedding representation sub-module, and an embedding vector fusion sub-module.
1. User layer node embedded representation sub-module
The user-layer node embedding representation sub-module is responsible for capturing the different influence weights among users so as to perform weighted aggregation of each user's local neighborhood information in the social network, thereby learning node embedding representations at the user level. To ensure that the node vectors have sufficient representational capacity, the invention first uses a learnable transformation matrix $W_1 \in \mathbb{R}^{d' \times d}$ to map the input vector into a higher-dimensional vector, namely:

$$h_i = W_1 \cdot x_i$$
For any two user nodes $v_i$ and $v_j$, the relation strength $e_{ij}$ between the two users is first calculated:

$$e_{ij} = \mathrm{LeakyReLU}\!\left(a^T\!\left[\, W^{(l)} h_i^{(l)} \,\|\, W^{(l)} h_j^{(l)} \,\right]\right)$$

where $h_i^{(l)}$ and $h_j^{(l)}$ are the vector representations of user nodes $v_i$ and $v_j$ at layer $l$, $W^{(l)}$ is the weight parameter of layer $l$, "$\|$" is the concatenation operator that splices two vectors laterally, and $\mathrm{LeakyReLU}(\cdot)$ is a neuron activation function. To calculate the information contribution ratio of different neighbors when aggregating the neighborhood information of user node $v_i$, a $\mathrm{softmax}(\cdot)$ function is used to normalize the relation strengths between the user node and all of its neighbor user nodes $v_k \in N(v_i)$:

$$a_{ij} = \mathrm{softmax}(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{v_k \in N(v_i)} \exp(e_{ik})}$$

$a_{ij}$ is called the attention coefficient between user nodes $v_i$ and $v_j$; the larger the value of $a_{ij}$, the closer the relationship between the two users. Given the calculated attention coefficients between user node $v_i$ and all of its neighbor nodes (including itself), the new latent information representation of each user node $v_i$ can be defined as:

$$h_i^{(l+1)} = \delta\!\left(\sum_{v_j \in N(v_i)} a_{ij}\, W^{(l)} h_j^{(l)}\right)$$
where δ (·) is the activation function of the neuron. According to the linear aggregation of different influence weights on the neighborhood information of the users, the node embedded representation h of each user at the user level in the social network can be obtained i Form a user level vector matrix m= (h 1 ,h 2 ,…,h N ) T
2. Feature layer node embedded representation submodule
The feature-layer node embedding representation sub-module is responsible for learning the influence weights among different features of users so as to capture finer-grained interactions among features, thereby learning node embedding representations at the feature level. This sub-module takes the user-level vector matrix $M=(h_1, h_2, \dots, h_N)^T$ obtained in the previous stage as input and considers multi-dimensional attention between the features of any two nodes in the social network, i.e., an attention coefficient is calculated for each corresponding dimension of the two user node vectors.
Let $h_i$ and $h_j$ respectively denote the embedding vectors of two user nodes $v_i$ and $v_j$ in the social network. The relationship between the two user node embedding vectors can be defined as:

$$f(h_i, h_j) = W_5 \cdot \tanh(W_4 \cdot h_i + W_3 \cdot h_j + b_2) + b_1$$

where $W_3$, $W_4$, and $W_5$ are parameter matrices, $b_1$ and $b_2$ are bias terms, and $\tanh(\cdot)$ is a neuron activation function. This feed-forward network $f(h_i, h_j)$ is used to calculate the feature-level dependency between any two user nodes. Let $\beta_{ij}$ denote the feature-layer attention coefficient vector of user nodes $v_i$ and $v_j$, and $[\beta_{ij}]_k$ the $k$-th dimension of the attention vector. Likewise, to make the attention coefficients of corresponding feature dimensions comparable across different attention coefficient vectors, the attention coefficient vectors of all of a user's neighbors are normalized per feature dimension with a $\mathrm{softmax}(\cdot)$ function:

$$[\beta_{ij}]_k = \frac{\exp\!\left([f(h_i, h_j)]_k\right)}{\sum_{v_m \in N(v_i)} \exp\!\left([f(h_i, h_m)]_k\right)}$$

After computing the attention coefficient $[\beta_{ij}]_k$ between each feature dimension of any two users, the coefficients can be assembled by corresponding feature dimension into the attention coefficient vector between the two users: $\beta_{ij} = ([\beta_{ij}]_1, [\beta_{ij}]_2, \dots, [\beta_{ij}]_d)$. The attention vector has the same dimensionality as the user node vector, and each dimension $[\beta_{ij}]_k$ is the influence weight of the corresponding dimension of the user node vector; the larger $[\beta_{ij}]_k$, the stronger the association between the features of user nodes $v_i$ and $v_j$ in the $k$-th dimension.
Finally, for any user node $v_i$ in the social network, the latent information representations of its neighbor users are linearly aggregated with weights given by the attention coefficient vectors learned between the different user nodes. Unlike the aggregation of the user-layer attention mechanism, the aggregation function of the feature-layer attention mechanism aggregates neighborhood information by element-wise multiplication:

$$h'_i = \delta\!\left(\sum_{v_j \in N(v_i)} \beta_{ij} \odot h_j\right)$$

where $\odot$ denotes the element-wise product of two vectors of the same shape (the result is a vector of that same shape) and $\delta(\cdot)$ is a neuron activation function. By aggregating each user's neighborhood information with the influence weights of the different features, the feature-level node embedding representation $h'_i$ of each user in the social network is obtained, forming the feature-level vector matrix $M' = (h'_1, h'_2, \dots, h'_N)^T$.
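The per-dimension attention of the feature layer can be sketched as follows, written as an explicit loop for clarity. The square parameter matrices, tanh as the activation $\delta(\cdot)$, and the toy complete graph are assumptions for illustration only.

```python
import numpy as np

def feature_layer_attention(M, A, W3, W4, W5, b1, b2):
    """Feature-layer attention: f(h_i, h_j) = W5.tanh(W4.h_i + W3.h_j + b2) + b1
    yields one score per feature dimension; scores are softmax-normalized over
    the neighbors dimension-by-dimension, then aggregated element-wise."""
    N, d = M.shape
    H_feat = np.zeros_like(M)
    betas = np.zeros((N, N, d))
    for i in range(N):
        nbrs = np.nonzero(A[i] > 0)[0]
        F = np.stack([W5 @ np.tanh(W4 @ M[i] + W3 @ M[j] + b2) + b1
                      for j in nbrs])                 # (n_nbrs, d) scores
        B = np.exp(F - F.max(axis=0))
        B = B / B.sum(axis=0)                         # softmax per dimension k
        betas[i, nbrs] = B
        H_feat[i] = np.tanh((B * M[nbrs]).sum(axis=0))  # delta(sum beta_ij (.) h_j)
    return H_feat, betas

rng = np.random.default_rng(2)
N, d = 4, 3
A = np.ones((N, N))                                   # toy complete graph
M = rng.normal(size=(N, d))
W3, W4, W5 = (rng.normal(size=(d, d)) for _ in range(3))
b1, b2 = rng.normal(size=d), rng.normal(size=d)
H_feat, betas = feature_layer_attention(M, A, W3, W4, W5, b1, b2)
```

Unlike the user layer, where one scalar coefficient scales a whole neighbor vector, each neighbor here contributes through a per-dimension weight vector.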
3. Embedded vector fusion submodule
The embedding vector fusion sub-module is responsible for retaining and resetting user latent information from the two different levels, the user level and the feature level, so as to fuse the multi-view node embedding representations and improve the accuracy of the subsequent network alignment task. The sub-module takes the user-level vector matrix $M=(h_1, h_2, \dots, h_N)^T$ and the feature-level vector matrix $M'=(h'_1, h'_2, \dots, h'_N)^T$ as inputs, and uses a gating mechanism to automatically learn the weight parameters of the same user's node embedding representations at the different levels, so as to effectively retain and reset the information representations from the different levels.
As shown in fig. 2, for any user node $v_i$ in the social network, the module first calculates the weight relation vector between the user-level node embedding representation $h_i$ and the feature-level node embedding representation $h'_i$:

$$F = \mathrm{sigmoid}(W_6 \cdot h_i + W_7 \cdot h'_i + b_3)$$

where $W_6$ and $W_7$ are the parameter matrices of the gated neural network, $b_3$ is a bias term, and $\mathrm{sigmoid}(\cdot)$ is a neuron activation function. According to the learned weight relation vector $F$, the user latent information representations from the different levels can be selectively retained and reset, and the final user node embedding representation is:

$$z_i = F \odot h_i + (1 - F) \odot h'_i$$

where $F \odot h_i$ represents the selective retention of the user-level node embedding representation and $(1-F) \odot h'_i$ represents the selective reset of the feature-level node embedding representation; here $1-F$ is a vector operation that subtracts each dimension of the vector $F$ from 1.
By fusing each user's user-level and feature-level node embedding representations with the gating mechanism, the final node embedding representation $z_i$ of each user in the social network is obtained, forming the node embedding vector matrix $Z = (z_1, z_2, \dots, z_N)^T$.
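The gated fusion step can be sketched directly from the two equations above; the parameter names W6, W7, b3 are illustrative labels for the gate's matrices and bias, since the patent's original symbols are not legible here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h, h_feat, W6, W7, b3):
    """Gate F = sigmoid(W6.h + W7.h' + b3); fused vector
    z = F (.) h + (1-F) (.) h', a per-dimension convex combination
    of the user-level and feature-level embeddings."""
    F = sigmoid(W6 @ h + W7 @ h_feat + b3)
    z = F * h + (1.0 - F) * h_feat
    return z, F

rng = np.random.default_rng(3)
d = 4
h, h_feat = rng.normal(size=d), rng.normal(size=d)
W6, W7 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
b3 = rng.normal(size=d)
z, F = gated_fusion(h, h_feat, W6, W7, b3)
```

Because each gate value lies in (0, 1), every dimension of z lies between the corresponding dimensions of h and h', which is what "retain and reset" amounts to numerically.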
For any given pair of user nodes $v_i$ and $v_j$ in the social network, with node embedding vectors $z_i$ and $z_j$, the probability of an edge between the two nodes can be expressed as:

$$p(v_i, v_j) = \sigma(z_i^T \cdot z_j)$$

where $\sigma(x) = 1/(1+e^{-x})$ is the sigmoid function.
To optimize the model parameters of the double-layer graph attention neural network, we need to define an objective function whose goal is to maximize the probability of the observable edges in the social network, namely:

$$O = \sum_{e_{i,j} \in E} \log p(v_i, v_j)$$

To avoid trivial solutions, for each observable edge $e_{i,j}=(v_i, v_j)$ we adopt a negative sampling technique and maximize the objective:

$$O = \log \sigma(z_i^T z_j) + \sum_{k=1}^{K} \mathbb{E}_{v_k \sim P_n(v)}\!\left[\log \sigma(-z_i^T z_k)\right]$$

where the first term models the positive examples in the social network, and the second term models negative examples, i.e., edges randomly generated around a node by the negative sampling technique; the probability of each node being sampled satisfies $P_n(v) \propto d_v^{3/4}$, $K$ is the number of sampled negative edges, and $d_v$ is the degree of user node $v$. According to this objective function, the parameters of the double-layer graph attention neural network model can be learned with a back-propagation optimization algorithm, yielding the node vector matrices of the source and target social networks, $Z^s \in \mathbb{R}^{|V^s| \times d'}$ and $Z^t \in \mathbb{R}^{|V^t| \times d'}$ respectively, where $|V^s|$ is the number of user nodes in the source social network and $|V^t|$ the number of user nodes in the target social network; $z_i^s$ is the node embedding vector of user $v_i^s$, and $z_j^t$ the node embedding vector of user $v_j^t$. $Z^s$ and $Z^t$ are also called the embedding vector spaces corresponding to the source and target social networks.
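The per-edge negative-sampling objective and the degree-based sampling distribution can be sketched as below; the toy embeddings and degree list are illustrative.

```python
import numpy as np

def log_sigmoid(x):
    return -np.logaddexp(0.0, -x)   # numerically stable log of sigmoid(x)

def edge_objective(Z, edge, neg_nodes):
    """Per-edge objective: log sigma(z_i . z_j) + sum_k log sigma(-z_i . z_k)
    over the K sampled negative nodes."""
    i, j = edge
    return log_sigmoid(Z[i] @ Z[j]) + sum(log_sigmoid(-(Z[i] @ Z[k]))
                                          for k in neg_nodes)

def negative_sampling_dist(degrees):
    """Sampling distribution P_n(v) proportional to d_v ** (3/4)."""
    p = np.asarray(degrees, dtype=float) ** 0.75
    return p / p.sum()

rng = np.random.default_rng(4)
Z = rng.normal(size=(5, 3))                      # toy node embedding matrix
obj = edge_objective(Z, (0, 1), neg_nodes=[2, 4])
Pn = negative_sampling_dist([3, 1, 2, 4, 1])     # node degrees d_v
```

Maximizing `obj` over all observable edges pushes linked nodes together and sampled non-links apart; in practice one would maximize it with gradient ascent (back-propagation), as the text describes.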
Embedded vector space alignment module: from the modules above, the node embedding vector matrix $Z^s$ of the source social network and the node embedding vector matrix $Z^t$ of the target social network are obtained. Each row of a node vector matrix is the node embedding representation of one user in the social network, and the whole node vector matrix is called the embedding vector space of that social network. To align the two embedding vector spaces effectively, we need to project the embedding vector space of the source social network and the embedding vector space of the target social network into a common vector space.
First we define a mapping function $M$ that maps user node vectors from one embedding vector space to another. Suppose we project the source social network onto the target social network to find the target node matching each source node. Given a partially known anchor link set $T$ as supervision information, the objective function can be defined as:

$$\min_{\theta} \sum_{(v_i^s,\, v_j^t) \in T} \left\| M^{s \to t}(z_i^s;\, \theta) - z_j^t \right\|_2$$

where $M^{s \to t}(\cdot)$ is the mapping function from the source social network to the target social network; the invention constructs this mapping function with a multi-layer perceptron, whose weight parameters are $\theta$. The objective is to minimize, for each user pair with an anchor link relationship, the distance between the source user node mapped into the target social network and the target user node, so as to build a classification model that predicts whether an anchor link exists between any two users of different social networks. Since a user typically has only one active account on a given social network platform, the target user node closest to the projected source user node is selected here to build a candidate anchor link.
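A minimal sketch of the mapping and candidate selection follows. The two-layer perceptron with a tanh hidden activation is an assumed stand-in for $M^{s \to t}(\cdot;\theta)$; the patent does not fix the architecture here.

```python
import numpy as np

def mlp_map(z, W1, b1, W2, b2):
    """Two-layer perceptron standing in for the mapping M^{s->t}(z; theta)."""
    return W2 @ np.tanh(W1 @ z + b1) + b2

def alignment_loss(Zs, Zt, anchors, params):
    """Supervised objective: sum over known anchor links (i, j) of
    || M^{s->t}(z_i^s) - z_j^t ||_2."""
    return sum(np.linalg.norm(mlp_map(Zs[i], *params) - Zt[j])
               for i, j in anchors)

def nearest_target(z_mapped, Zt):
    """Candidate anchor link: target node closest to the projected source node."""
    return int(np.argmin(np.linalg.norm(Zt - z_mapped, axis=1)))

Zt = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])   # toy target embeddings
idx = nearest_target(np.array([0.9, 1.1]), Zt)
params = (np.eye(2), np.zeros(2), np.eye(2), np.zeros(2))
loss = alignment_loss(np.array([[0.5, 0.5]]), Zt, [(0, 0)], params)
```

In a real system `params` would be trained by gradient descent on `alignment_loss` over the known anchor set T.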
The user alignment problem between different social networks typically satisfies a one-to-one matching constraint: the same user entity has at most one active account on each social network platform. As shown in fig. 3 (a), a unidirectional embedded vector space mapping may produce one-to-many matching relationships between social networks, which violates the actual network scenario. Therefore, the invention proposes a bidirectional embedded vector space alignment strategy to ensure that the network alignment task between the two social networks satisfies the one-to-one matching constraint. Referring to fig. 3, the invention is described with a specific example whose steps are as follows:
Step 1: construct a multi-layer perceptron model $M^{s \to t}$ that projects from the source social network to the target social network according to the known anchor link set $T$, and learn its weight parameters $\theta_1$ by minimizing the distance between each source user node in an anchor link and the corresponding target user node after projection onto the target social network.

Step 2: with the learned multi-layer perceptron model $M^{s \to t}$, for each user node $v_i^s$ in the source social network, as shown in fig. 3 (a), first project it into the target embedding vector space, then find the target user node closest to its projection in the target social network to form an anchor link with the source user node, and add this link to the candidate anchor link set $A_1$.
Step 3: construct a multi-layer perceptron model $M^{t \to s}$ that projects from the target social network to the source social network according to the known anchor link set $T$, and learn its weight parameters $\theta_2$ by minimizing the distance between each target user node in an anchor link and the corresponding source user node after projection onto the source social network.

Step 4: with the learned multi-layer perceptron model $M^{t \to s}$, for each user node $v_j^t$ in the target social network, as shown in fig. 3 (b), first project it into the source embedding vector space, then find the source user node closest to its projection in the source social network to form an anchor link with the target user node, and add this link to the candidate anchor link set $A_2$.
Step 5: take the intersection of the candidate anchor link sets $A_1$ and $A_2$ as the final predicted anchor link set: $A = A_1 \cap A_2$.
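The bidirectional strategy of Steps 1 to 5 reduces, at prediction time, to intersecting the two candidate sets; the sketch below shows only that final step, with dict-based candidate maps as an illustrative representation.

```python
def bidirectional_align(cand_s2t, cand_t2s):
    """Keep only the anchor links proposed in both directions (A = A1 ∩ A2).
    cand_s2t maps each source node to its nearest target (Step 2, set A1);
    cand_t2s maps each target node to its nearest source (Step 4, set A2)."""
    A1 = {(s, t) for s, t in cand_s2t.items()}
    A2 = {(s, t) for t, s in cand_t2s.items()}
    return A1 & A2

# A one-to-many conflict (sources 0 and 1 both map to target 5) is resolved:
# only the mutually nearest pair survives the intersection.
A = bidirectional_align({0: 5, 1: 5}, {5: 0, 6: 1})
```

The intersection enforces the one-to-one constraint because each direction proposes at most one partner per node, so any surviving pair is mutually nearest.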
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the disclosure, including the claims, is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the invention, the steps may be implemented in any order and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the invention. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the present invention is to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The embodiments of the invention are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the present invention should be included in the scope of the present invention.

Claims (5)

1. A network alignment method based on a dual-layer graph attention neural network, comprising:
basic definition: a social network is abstracted as a directed graph $G=(V,E,X)$, where $V=\{v_i \mid i=1,\dots,N\}$ is the set of user nodes in the social network and $N$ is the number of user nodes in the social network; $E=\{e_{i,j}=(v_i,v_j) \mid v_i \in V, v_j \in V\}$ is the set of relationships between users in the social network, where $e_{i,j}=(v_i,v_j)$ indicates an association between user $v_i$ and user $v_j$; $X=\{x_i \mid i=1,\dots,N\}$ is the set of feature vectors of all users, and each user node $v_i$ has a corresponding node feature vector $x_i$, which can be extracted from the node's profile data, behavior, and network social structure information; without loss of generality, the two networks to be aligned are named the source social network and the target social network, denoted $G^s$ and $G^t$;
for any two users $v_i^s$ and $v_j^t$ from different social networks, we use $e'_{i,j}=(v_i^s, v_j^t)$ to represent an anchor link relationship between the source social network and the target social network, where $v_i^s$ and $v_j^t$ are accounts of the same user in the different social networks $G^s$ and $G^t$; an anchor link is a one-to-one link relationship between two users in different social networks, and no two anchor links share the same user account of the same social network;
the set of all anchor link relationships between two different social networks $G^s$ and $G^t$ is called the anchor link set, denoted $T=\{e'_{i,j}=(v_i^s, v_j^t)\}$, where $v_i^s$ is a user account in the source social network and $v_j^t$ is a user account in the target social network; for two different social networks $G^s=(V^s,E^s,X^s)$ and $G^t=(V^t,E^t,X^t)$, network alignment aims at finding the anchor link set $T$ between the two social networks, where any element $e'_{i,j} \in T$ represents an anchor link between the two user accounts $v_i^s$ and $v_j^t$;
s1, a network preprocessing module: preprocessing a social network according to the input network type and the contained user attribute information, and constructing an initialized user node feature vector matrix;
s2, a network embedded representation module: the method comprises the steps of taking an initialized user node feature vector matrix obtained by a network preprocessing module and an adjacent matrix of a social network as inputs, capturing a complex information interaction relation of a user in the social network through a double-layer graph attention neural network, learning potential information of the user node in the social network, and obtaining an accurate user node embedded vector, wherein the method comprises the following steps of:
s2.1, the user-layer node embedding representation sub-module: captures the different influence weights among users to perform weighted aggregation of each user's local neighborhood information in the social network, thereby learning user-level node embedding vectors; comprising the following steps:
using a learnable transformation matrix $W_1 \in \mathbb{R}^{d' \times d}$ to map the input vector into a higher-dimensional vector, namely:

$$h_i = W_1 \cdot x_i$$
according to the graph attention neural network, for any two user nodes $v_i$ and $v_j$, the relation strength $e_{ij}$ between the two user nodes is first calculated:

$$e_{ij} = \mathrm{LeakyReLU}\!\left(a^T\!\left[\, W^{(l)} h_i^{(l)} \,\|\, W^{(l)} h_j^{(l)} \,\right]\right)$$

where $h_i^{(l)}$ and $h_j^{(l)}$ are the embedding vectors of user nodes $v_i$ and $v_j$ at layer $l$, $W^{(l)}$ is the weight parameter of layer $l$, "$\|$" is the concatenation operator that splices two vectors laterally, and $\mathrm{LeakyReLU}(\cdot)$ is a neuron activation function; to calculate the information contribution ratio of different neighbors when aggregating the neighborhood information of user node $v_i$, a $\mathrm{softmax}(\cdot)$ function is used to normalize the relation strengths between the user node and all of its neighbor user nodes $v_k \in N(v_i)$:

$$a_{ij} = \frac{\exp(e_{ij})}{\sum_{v_k \in N(v_i)} \exp(e_{ik})}$$

$a_{ij}$ is called the attention coefficient between user nodes $v_i$ and $v_j$; the larger the value of $a_{ij}$, the closer the relationship between the two users; given the calculated attention coefficients between user node $v_i$ and all of its neighbor nodes (including itself), the new embedding vector of each user node $v_i$ can be defined as:

$$h_i^{(l+1)} = \delta\!\left(\sum_{v_j \in N(v_i)} a_{ij}\, W^{(l)} h_j^{(l)}\right)$$

where $\delta(\cdot)$ is a neuron activation function; by linearly aggregating each user's neighborhood information with these different influence weights, the user-level node embedding vector $h_i$ of each user in the social network is obtained, forming the user-level vector matrix $M=(h_1, h_2, \dots, h_N)^T$;
S2.2, the feature-layer node embedding representation sub-module: responsible for learning the influence weights among different features of users so as to capture finer-grained interactions among features, thereby learning feature-level node embedding vectors; comprising the following steps:
taking the user-level vector matrix $M=(h_1, h_2, \dots, h_N)^T$ obtained in S2.1 as input, and considering the multi-dimensional attention between the features of any two user nodes in the social network, i.e., calculating an attention coefficient for each corresponding dimension of the two user node vectors,
let $h_i$ and $h_j$ respectively denote the embedding vectors of two user nodes $v_i$ and $v_j$ in the social network; the relationship between the two user node embedding vectors can be defined as:

$$f(h_i, h_j) = W_5 \cdot \tanh(W_4 \cdot h_i + W_3 \cdot h_j + b_2) + b_1$$

where $W_3$, $W_4$, and $W_5$ are parameter matrices, $b_1$ and $b_2$ are bias terms, and $\tanh(\cdot)$ is a neuron activation function,

this feed-forward network $f(h_i, h_j)$ is used to calculate the feature-level dependency of any two user nodes; let $\beta_{ij}$ denote the feature-layer attention coefficient vector of user nodes $v_i$ and $v_j$, and $[\beta_{ij}]_k$ the $k$-th dimension of the attention vector; likewise, to facilitate comparing the attention coefficients of features of corresponding dimensions between different attention coefficient vectors, the attention coefficient vectors of all of a user's neighbors are normalized by corresponding feature dimension using a $\mathrm{softmax}(\cdot)$ function, so that:

$$[\beta_{ij}]_k = \frac{\exp\!\left([f(h_i, h_j)]_k\right)}{\sum_{v_m \in N(v_i)} \exp\!\left([f(h_i, h_m)]_k\right)}$$

after computing the attention coefficient $[\beta_{ij}]_k$ between each feature dimension of any two users, the coefficients can be assembled by corresponding feature dimension into the attention coefficient vector between the two users, $\beta_{ij} = ([\beta_{ij}]_1, [\beta_{ij}]_2, \dots, [\beta_{ij}]_{d''})$; the attention vector and the user node vector have the same dimensionality, each dimension $[\beta_{ij}]_k$ being the influence weight of the corresponding dimension of the user node vector; the larger $[\beta_{ij}]_k$, the stronger the association between the features of user nodes $v_i$ and $v_j$ in the $k$-th dimension,
finally, for any user node $v_i$ in the social network, the embedding vectors of its neighbor users are linearly aggregated with weights given by the attention coefficient vectors learned between the different user nodes; unlike the aggregation of the user-layer attention mechanism, the aggregation function of the feature-layer attention mechanism aggregates neighborhood information by element-wise multiplication:

$$h'_i = \delta\!\left(\sum_{v_j \in N(v_i)} \beta_{ij} \odot h_j\right)$$

where $\odot$ denotes the element-wise product of two vectors of the same shape (the result is a vector of that same shape) and $\delta(\cdot)$ is a neuron activation function; by linearly aggregating each user's neighborhood information with the influence weights of the different features, the feature-level node embedding representation $h'_i$ of each user in the social network is obtained, forming the feature-level vector matrix $M'=(h'_1, h'_2, \dots, h'_N)^T$;
S2.3, the embedding vector fusion sub-module: responsible for retaining and resetting user embedding vectors from the two different levels, the user level and the feature level, so as to fuse the multi-view node embedding vectors and improve the accuracy of the network alignment task; comprising the following steps:
the embedding vector fusion sub-module takes the user-level vector matrix $M=(h_1, h_2, \dots, h_N)^T$ and the feature-level vector matrix $M'=(h'_1, h'_2, \dots, h'_N)^T$ as inputs, and uses a gating mechanism to automatically learn the weight parameters of the same user's node embedding vectors at the different levels, so as to effectively retain and reset the information representations from the different levels,
for any user node $v_i$ in the social network, the module first calculates the weight relation vector between the user-level node embedding vector $h_i$ and the feature-level node embedding vector $h'_i$:

$$F = \mathrm{sigmoid}(W_6 \cdot h_i + W_7 \cdot h'_i + b_3)$$

where $W_6$ and $W_7$ are the parameter matrices of the gated neural network, $b_3$ is a bias term, and $\mathrm{sigmoid}(\cdot)$ is a neuron activation function; according to the learned weight relation vector $F$, the user node embedding vectors from the different levels can be selectively retained and reset, and the final user node embedding vector is:

$$z_i = F \odot h_i + (1 - F) \odot h'_i$$

where $F \odot h_i$ represents the selective retention of the user-level node embedding vector and $(1-F) \odot h'_i$ represents the selective reset of the feature-level node embedding vector, $1-F$ being a vector operation that subtracts each dimension of the vector $F$ from 1,
according to the fusion of the user level node embedded representation and the feature level node embedded representation of the user in the social network by using a gating mechanism, the final node embedded representation z of each user in the social network can be obtained i Component node embedding vector matrix z= (Z) 1 ,z 2 ,...,z N ) T
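As a hedged illustration of the gated fusion step above, the following NumPy sketch computes F = sigmoid(W_1 h_i + W_2 h̃_i + b) and z_i = F ⊙ h_i + (1 − F) ⊙ h̃_i. The names `W1`, `W2`, `b` and the dimension `d = 4` are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(h_user, h_feat, W1, W2, b):
    """Gated fusion of a user-level and a feature-level embedding:
    F = sigmoid(W1 @ h_user + W2 @ h_feat + b)    (weight relation vector)
    z = F * h_user + (1 - F) * h_feat             (element-wise retain/reset)
    """
    F = sigmoid(W1 @ h_user + W2 @ h_feat + b)
    return F * h_user + (1.0 - F) * h_feat

rng = np.random.default_rng(0)
d = 4
h_user = rng.normal(size=d)             # user-level embedding h_i
h_feat = rng.normal(size=d)             # feature-level embedding h~_i
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
b = np.zeros(d)
z = fuse(h_user, h_feat, W1, W2, b)     # fused node embedding z_i
print(z.shape)  # (4,)
```

Because F is a sigmoid output, every dimension of z is a convex combination of the corresponding dimensions of the two input embeddings.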
For any given pair of user nodes v_i and v_j in a social network, with node embedding vectors z_i and z_j respectively, the probability of an edge existing between the two nodes can be expressed as:

p(v_i, v_j) = σ(z_i^T z_j)

where σ(x) = 1 / (1 + e^(−x)) is the sigmoid function;
To optimize the model parameters of the double-layer graph attention neural network, an objective function of the model must be defined; its goal is to maximize the probability of occurrence of the observable edges in the social network, namely:

O = Σ_{e_{i,j} ∈ E} log p(v_i, v_j)
To avoid trivial solutions, for each observable edge e_{i,j} = (v_i, v_j) a negative sampling technique is employed to maximize the objective function, namely:

log σ(z_i^T z_j) + Σ_{k=1}^{K} E_{v_k ∼ P_n(v)} [ log σ(−z_i^T z_k) ]

where the first term models the positive examples in the social network, and the second term models negative examples whose edges associated with each node are randomly generated by the negative sampling technique; the probability of each node being sampled satisfies P_n(v) ∝ d_v^(3/4),
K denotes the number of sampled negative-example edges, and d_v denotes the degree of user node v. According to this objective function, the parameters of the double-layer graph attention neural network model can be learned with a back-propagation optimization algorithm, yielding the node vector matrices of the source social network and the target social network, Z^s ∈ R^(|V^s| × d') and Z^t ∈ R^(|V^t| × d') respectively, where |V^s| denotes the number of user nodes in the source social network and |V^t| the number of user nodes in the target social network; z_i^s denotes the node embedding vector of user v_i^s in the source social network, and z_j^t the node embedding vector of user v_j^t in the target social network. Z^s and Z^t are also referred to as the embedding vector spaces corresponding to the source and target social networks;
S3, embedding vector space alignment module: constructs a classification model from the user node embedding vectors of the source and target social networks learned in S2 to predict anchor links, and adopts a bidirectional embedding-vector-space alignment strategy to satisfy the constraint that user accounts match one-to-one across different social networks;
S4, taking the intersection of the anchor links predicted in the two alignment directions to complete the network alignment.
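The bidirectional strategy of S3/S4 can be sketched as a simple set intersection: only candidate pairs predicted in both mapping directions are kept, which enforces the one-to-one matching constraint. The user identifiers below are purely illustrative.

```python
def intersect_candidates(src_to_tgt, tgt_to_src):
    """src_to_tgt: {(source_user, target_user), ...} predicted source->target.
    tgt_to_src: {(target_user, source_user), ...} predicted target->source.
    Returns the pairs confirmed in both directions."""
    backward = {(s, t) for (t, s) in tgt_to_src}
    return src_to_tgt & backward

forward = {("u1", "a"), ("u2", "b"), ("u3", "c")}
backward = {("a", "u1"), ("b", "u2"), ("d", "u4")}
anchors = intersect_candidates(forward, backward)
print(sorted(anchors))  # [('u1', 'a'), ('u2', 'b')]
```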
2. The network alignment method based on the double-layer graph attention neural network according to claim 1, wherein the step S2 comprises the following steps: the feature vector of user v_i is denoted x_i; according to the network type, various feature vectors of the user are extracted and stacked laterally to generate the user's initialized feature vector representation x_i ∈ R^d, where d denotes the dimension of the user's initialized feature vector and d', d'', d''' appearing hereinafter denote other dimensions; the initialized feature vectors of all users in the social network are assembled into a state matrix X in which each row is the feature vector of a particular user node: X = (x_1, x_2, …, x_N)^T.
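The lateral stacking of per-user feature vectors into the state matrix X can be sketched as follows; the three feature vectors and their values are purely illustrative, not the patent's actual features.

```python
import numpy as np

# Hypothetical feature vectors for one user (dimensions chosen arbitrarily).
name_vec = np.array([0.1, 0.2])            # e.g. user-name embedding
text_vec = np.array([0.3, 0.4, 0.5])       # e.g. Doc2Vec text features
space_vec = np.array([0.6])                # e.g. spatial-cluster feature

# Lateral stacking: concatenate into the user's initialized vector x_i (d = 6).
x_i = np.concatenate([name_vec, text_vec, space_vec])

# State matrix X = (x_1, x_2, ..., x_N)^T: one row per user node
# (the other two rows are synthetic stand-ins for other users).
X = np.stack([x_i, x_i * 0.5, x_i + 1.0])
print(X.shape)  # (3, 6)
```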
3. The network alignment method based on the double-layer graph attention neural network according to claim 2, wherein when the network type is a topology network, the feature vector of the user is randomly initialized by means of a random matrix, and the weight parameters of the random matrix are learned during the training phase of the double-layer graph attention neural network model.
4. The network alignment method based on the double-layer graph attention neural network according to claim 2, wherein if the network type is an attribute network, the user attributes are vectorized as follows: user information such as the user name is randomly initialized by word embedding to obtain a user-name feature vector; the user's language style is mined from the user's long-text information with a Doc2Vec model to learn the user's text feature vector; the user's trajectory information is vector-initialized through spatial clustering to obtain the user's spatial feature vector; and the user's ratings and check-in counts are taken directly as feature dimensions of the user and vector-initialized.
5. The network alignment method based on the double-layer graph attention neural network according to claim 1, wherein the step S3 comprises the following steps: based on step S2, the node embedding vector matrix Z^s of the source social network and the node embedding vector matrix Z^t of the target social network are obtained; each row of a node vector matrix is the node embedding vector of one user in the social network, and the whole node vector matrix is also called the embedding vector space corresponding to the social network;
A mapping function M is defined that can map a user node vector from one embedding vector space to another. Supposing the source social network is projected onto the target social network to find the target nodes matching the source nodes, the objective function can be defined as:

min_θ Σ_{(v_i^s, v_j^t) ∈ T} || M^(s→t)(z_i^s; θ) − z_j^t ||

where T is the set of user pairs with a known anchor-link relationship and M^(s→t)(·) denotes the mapping function from the source social network to the target social network; a multi-layer perceptron is adopted to construct the mapping function, and θ is the weight parameter of the multi-layer perceptron. The objective function aims to minimize the distance, after mapping into the target social network, between the source user node and the target user node of each user pair with an anchor-link relationship, thereby constructing a classification model that predicts whether any two users across different social networks have an anchor link; the target user nodes nearest to the projected source user node are selected to construct candidate anchor links.
CN202010654776.0A 2020-07-09 2020-07-09 Network alignment method based on double-layer graph attention neural network Active CN111931903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010654776.0A CN111931903B (en) 2020-07-09 2020-07-09 Network alignment method based on double-layer graph attention neural network

Publications (2)

Publication Number Publication Date
CN111931903A CN111931903A (en) 2020-11-13
CN111931903B true CN111931903B (en) 2023-07-07

Family

ID=73312715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010654776.0A Active CN111931903B (en) 2020-07-09 2020-07-09 Network alignment method based on double-layer graph attention neural network

Country Status (1)

Country Link
CN (1) CN111931903B (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112396492A (en) * 2020-11-19 2021-02-23 天津大学 Conversation recommendation method based on graph attention network and bidirectional long-short term memory network
CN112395466B (en) * 2020-11-27 2023-05-12 上海交通大学 Fraud node identification method based on graph embedded representation and cyclic neural network
CN112446542B (en) * 2020-11-30 2023-04-07 山西大学 Social network link prediction method based on attention neural network
CN114625978A (en) * 2020-12-10 2022-06-14 国家计算机网络与信息安全管理中心 Heterogeneous network user anchor link prediction method based on type perception and electronic device
CN112381179B (en) * 2020-12-11 2024-02-23 杭州电子科技大学 Heterogeneous graph classification method based on double-layer attention mechanism
CN112507246B (en) * 2020-12-13 2022-09-13 天津大学 Social recommendation method fusing global and local social interest influence
CN112507247B (en) * 2020-12-15 2022-09-23 重庆邮电大学 Cross-social network user alignment method fusing user state information
CN112667920A (en) * 2020-12-29 2021-04-16 复旦大学 Text perception-based social influence prediction method, device and equipment
CN112860810B (en) * 2021-02-05 2023-07-14 中国互联网络信息中心 Domain name multiple graph embedded representation method, device, electronic equipment and medium
CN112818257B (en) * 2021-02-19 2022-09-02 北京邮电大学 Account detection method, device and equipment based on graph neural network
CN113127752B (en) * 2021-03-18 2023-04-07 中国人民解放军战略支援部队信息工程大学 Social network account aligning method and system based on user naming habit mapping learning
CN113095948B (en) * 2021-03-24 2023-06-06 西安交通大学 Multi-source heterogeneous network user alignment method based on graph neural network
CN112800770B (en) * 2021-04-15 2021-07-09 南京樯图数据研究院有限公司 Entity alignment method based on heteromorphic graph attention network
CN113065045B (en) * 2021-04-20 2022-07-22 支付宝(杭州)信息技术有限公司 Method and device for carrying out crowd division and training multitask model on user
CN113238885B (en) * 2021-05-08 2023-07-07 长安大学 Method and equipment for predicting implicit deviation instruction based on graph attention network
CN113409157B (en) * 2021-05-19 2022-06-28 桂林电子科技大学 Cross-social network user alignment method and device
CN113407784B (en) * 2021-05-28 2022-08-12 桂林电子科技大学 Social network-based community dividing method, system and storage medium
CN113240098B (en) * 2021-06-16 2022-05-17 湖北工业大学 Fault prediction method and device based on hybrid gated neural network and storage medium
CN113628059B (en) * 2021-07-14 2023-09-15 武汉大学 Associated user identification method and device based on multi-layer diagram attention network
CN113807012A (en) * 2021-09-14 2021-12-17 杭州莱宸科技有限公司 Water supply network division method based on connection strengthening
CN113901831B (en) * 2021-09-15 2024-04-26 昆明理工大学 Parallel sentence pair extraction method based on pre-training language model and bidirectional interaction attention
CN113779406A (en) * 2021-09-16 2021-12-10 浙江网商银行股份有限公司 Data processing method and device
CN113792937B (en) * 2021-09-29 2022-09-13 中国人民解放军国防科技大学 Social network influence prediction method and device based on graph neural network
CN114662143B (en) * 2022-02-28 2024-05-03 北京交通大学 Sensitive link privacy protection method based on graph embedding
CN115063251B (en) * 2022-05-30 2024-09-03 华侨大学 Social propagation dynamic network representation method based on relationship strength and feedback mechanism
CN114969540A (en) * 2022-06-10 2022-08-30 重庆大学 Method for predicting future interaction behavior of user in social network
CN116049695B (en) * 2022-12-20 2023-07-04 中国科学院空天信息创新研究院 Group perception and standing analysis method, system and electronic equipment crossing social network
CN115861822B (en) * 2023-02-07 2023-05-12 海豚乐智科技(成都)有限责任公司 Target local point and global structured matching method and device
CN116776193B (en) * 2023-05-17 2024-08-06 广州大学 Method and device for associating virtual identities across social networks based on attention mechanism
CN116566743B (en) * 2023-07-05 2023-09-08 北京理工大学 Account alignment method, equipment and storage medium
CN117670572B (en) * 2024-02-02 2024-05-03 南京财经大学 Social behavior prediction method, system and product based on graph comparison learning

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636658A (en) * 2019-01-17 2019-04-16 电子科技大学 A kind of social networks alignment schemes based on picture scroll product

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10268646B2 (en) * 2017-06-06 2019-04-23 Facebook, Inc. Tensor-based deep relevance model for search on online social networks


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Unified Link Prediction Framework for Predicting Arbitrary Relations in Heterogeneous Academic Networks; Meilian Lu, et al.; IEEE Access; 124967-124987 *


Similar Documents

Publication Publication Date Title
CN111931903B (en) Network alignment method based on double-layer graph attention neural network
Tian et al. Spatial‐temporal attention wavenet: A deep learning framework for traffic prediction considering spatial‐temporal dependencies
Ye et al. Coupled layer-wise graph convolution for transportation demand prediction
You et al. Image-based appraisal of real estate properties
CN112925989B (en) Group discovery method and system of attribute network
CN111695415A (en) Construction method and identification method of image identification model and related equipment
CN113255895B (en) Structure diagram alignment method and multi-diagram joint data mining method based on diagram neural network representation learning
Pajares et al. A Hopfield Neural Network for combining classifiers applied to textured images
CN112200266B (en) Network training method and device based on graph structure data and node classification method
CN113761250A (en) Model training method, merchant classification method and device
CN113240086B (en) Complex network link prediction method and system
Yu et al. Forecasting a short‐term wind speed using a deep belief network combined with a local predictor
CN116129286A (en) Method for classifying graphic neural network remote sensing images based on knowledge graph
CN116010813A (en) Community detection method based on influence degree of fusion label nodes of graph neural network
CN111309923A (en) Object vector determination method, model training method, device, equipment and storage medium
CN114254738A (en) Double-layer evolvable dynamic graph convolution neural network model construction method and application
Zhang et al. C 3-GAN: Complex-Condition-Controlled Urban Traffic Estimation through Generative Adversarial Networks
CN114417063A (en) Multi-view-based important node identification method for graph neural network
Wu et al. Learning spatial–temporal pairwise and high-order relationships for short-term passenger flow prediction in urban rail transit
Diao et al. DMSTG: Dynamic Multiview Spatio-Temporal Networks for Traffic Forecasting
CN112465253B (en) Method and device for predicting links in urban road network
Chen et al. Visual Odometry for Self‐Driving with Multihypothesis and Network Prediction
Xiao-Xu et al. An intelligent inspection robot of power distribution network based on image automatic recognition system
CN112560946A (en) Edge server hot spot prediction method for online and offline associated reasoning
Hela et al. CarParkingVQA: Visual Question Answering application on Car parking occupancy detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant