CN109117943B

CN109117943B - Method for enhancing network representation learning by utilizing multi-attribute information

Info

Publication number: CN109117943B
Application number: CN201810820414.7A
Authority: CN
Inventors: 乔立升; 陈恩红; 刘淇; 徐童
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2018-07-24
Filing date: 2018-07-24
Publication date: 2022-09-30
Anticipated expiration: 2038-07-24
Also published as: CN109117943A

Abstract

The invention discloses a method for enhancing network representation by utilizing multi-attribute information, which comprises the following steps: fusing different types of attribute similarity information of the network nodes with structural adjacency information between corresponding network nodes through a first semi-supervised depth model to obtain representations of a plurality of different hidden spaces; optimizing structural adjacency information among the network nodes through a second semi-supervised depth model to obtain the representation of the network nodes in a hidden space, which is obtained only by using the structural adjacency information; and fusing the obtained representations of the hidden space to obtain the final network representation. The method can capture the nonlinear network structure, can solve the problem of non-consistency between the network topology structure and each attribute information to a certain extent from a certain angle, fully exerts the enhancement effect of the attribute information and keeps as much information as possible.

Description

Method for enhancing network representation learning by utilizing multi-attribute information

Technical Field

The invention relates to the technical field of machine learning and network representation optimization, in particular to a method for enhancing network representation learning by utilizing multi-attribute information.

Background

Many realistically important networks, such as social networks, reference networks, and airline networks, have complex structural features and rich attribute information (e.g., gender, reviews, user portrayal, etc.). How to utilize this information to better learn about network characterization is a fundamental but challenging task in network mining. One popular and effective way to represent network nodes is to map them to a low dimensional space, which is suitable for a variety of tasks, such as node classification, community discovery, and targeted advertising.

Recently, many algorithms have been proposed for network characterization, such as deep walk, LINE and node2 vec. While these approaches have proven effective in a number of network analysis tasks, most of them primarily consider structural topology information in network characterization learning. In reality, however, the network nodes may have a number of attributes that may be of different types that may help to improve the learning of the network characterization and benefit the subsequent analysis task. In order to obtain a representation with a better effect, some work considers the introduction of information such as node attributes or community structures to learn network representations.

Although some work has made many important efforts in learning to introduce attribute information for node characterization, how to effectively characterize nodes of a network with sparse connections and non-uniform attribute information is still in an ongoing research phase. Generally, in an actual network, connection information between nodes may be missing, attributes may be heterogeneous with respect to each other, and some newly added nodes may have limited or no connection information. In addition, the common attributes that different communities have may be the same, which means that the node signatures learned from network attribute information of different parts of the network node data, the encoded network structure information is likely to have inconsistencies, and vice versa between the attribute information and the structure topology. How to deal with these inconsistencies when we use attribute information to enhance the learning of network representations. The above problems bring challenges to the existing methods for effectively learning network characterization. There have been some efforts to improve the learning effect of network representation by using texts related to nodes, and some efforts to enhance the learning of network representation by using node attributes in a deep learning manner, but they either fail to address the above-mentioned non-uniformity problem in a targeted manner, or mainly get a common information part between different perspective attribute information.

Therefore, it is highly desirable and necessary to explore more technical approaches that can effectively capture nonlinear structural information and overcome the effects of inconsistencies, and retain as much useful information as possible.

Disclosure of Invention

The invention aims to provide a method for enhancing network representation learning by utilizing multi-attribute information, which can capture a nonlinear network structure, solve the problem of non-consistency between a network topology structure and each attribute information to a certain extent from a certain angle, fully play the role of enhancing the attribute information and keep as much information as possible.

The purpose of the invention is realized by the following technical scheme:

a method for enhancing network characterization learning using multi-attribute information, comprising:

s1, fusing different types of attribute similarity information of the network nodes with structural adjacency information between corresponding network nodes through a first semi-supervised depth model to obtain representations of a plurality of different hidden spaces;

step S2, optimizing the structural adjacency information among the network nodes through a second semi-supervised depth model to obtain the representation of the network nodes in the hidden space, which is obtained only by using the structural adjacency information;

and S3, fusing the representations of the hidden space obtained in the steps S1-S2 to obtain the final network representation.

According to the technical scheme provided by the invention, when the representation vector of the network node is learned, the topological structure information and the node attribute information of the network are considered, and meanwhile, the auxiliary function of the attribute information is fully exerted; the scheme is particularly suitable for learning the representation of the network with sparse connection and inconsistent attribute information, and a better network representation learning effect is obtained in the mode; the scheme can provide more accurate support for tasks such as node classification, connection prediction, community discovery and targeted advertising.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.

FIG. 1 is a flowchart of a method for enhancing learning by using multi-attribute information for network characterization according to an embodiment of the present invention;

fig. 2 is an overall block diagram of a model framework for implementing a method for enhancing network characterization learning by using multi-attribute information according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.

The embodiment of the invention provides a method for enhancing network representation learning by utilizing multi-attribute information, which is mainly used for solving the problems of sparsity and information inconsistency of a network, designs a semi-supervised depth model for combining network topology structure information and node attribute information, and designs a model framework for fusing multi-view information. The design aims to take the topological structure information and the node attribute information of the network into consideration when the characterization vectors of the network nodes are learned, and simultaneously, the auxiliary function of the attribute information is fully exerted.

FIG. 1 is a flow chart of a method for enhancing network characterization learning using multi-attribute information, and FIG. 2 is a corresponding model framework; the method mainly comprises the following steps:

and step S1, fusing different types of attribute similarity information of the network nodes with structural adjacency information between corresponding network nodes through a first semi-supervised depth model to obtain representations of a plurality of different hidden spaces.

In the step, different types of attribute similarity information of the network nodes are used as input of a first semi-supervised depth model, and then the representation in the hidden space is finely adjusted by utilizing the structural adjacency information of the network through a back propagation algorithm.

The attribute similarity information corresponds to the similarity matrix of the attributes, and the similarity matrix C of the t-th attribute ^t Ith row and jth column element in (2)

Representing network nodes v _i Attribute vector of t-th attribute

And network node v _j Attribute vector of tth attribute

The similarity between them.

Structural adjacency information corresponding to adjacencies between pairs of network nodes, specifically each pair of network nodes (v) _i ,v _j ) The weight of the connection between them represents the adjacency between the network node pairs and also the similarity of the topology of the network node pairs. Structural adjacency information is generally represented by a structural adjacency matrix S, in which the ith row and jth column element S _i,j I.e. a network node pair (v) _i ,v _j ) The adjacency therebetween.

By jointly optimizing the structure and attribute similarity information in the first semi-supervised depth model, not only the highly nonlinear network structure information can be reserved, but also the enhancement effect of the attribute information can be improved.

In the model framework, the step corresponds to a multi-attribute enhanced representation part.

In the embodiment of the invention, the used first semi-supervised depth model, also called single attribute enhanced depth network, is an improvement on a depth autoencoder, and the characterization calculation formula of each layer of hidden space is as follows:

wherein the content of the first and second substances,

respectively representing network nodes v _i Characterization of the t-th attribute information in the hidden space of the 1 st layer and the k-th layer; k denotes the number of network space layers, T denotes the number of attributes,

and

and

and respectively representing the weight and the offset parameter of the network space, wherein the upper mark of the weight and the offset parameter is the serial number of the attribute information, and the lower mark of the weight and the offset parameter is the serial number of the network space layer.

In the embodiment of the invention, the loss function of the first semi-supervised depth model is as follows:

wherein the content of the first and second substances,

representing a loss function of the unsupervised training process of the first semi-supervised depth model,

representing a loss function of the supervised training process of the first semi-supervised depth model,

represents L ₂ -norm rule item, superscript t being the sequence number of the attribute information; both alpha and beta are hyperparameters, alpha represents

Beta represents a weight coefficient of

The weight coefficient of (c).

In the first semi-supervised depth model,

where n represents the total number of network nodes, representing a point multiply operation,

the decoder stage (decoding stage) representing the model at the first semi-supervised depth corresponds to

Is then outputted from the output of (a),

is C ^t Line i of (1), C ^t A similarity matrix representing the t-th attribute,

representing the decoder stage in the first semi-supervised depth model corresponding to C ^t The matrix of (a) is a matrix of (b),

represents H ^t The number of the ith row of (a),

H ^t represents the adjustment matrix and the adjustment matrix of the device,

representation corresponds to matrix C ^t Row and column i, s _i,j The element corresponding to the ith row and jth column of the adjacency matrix S; η and γ are hyper-parameters;

and

respectively representing network nodes v _i And v _j The characterization of the tth attribute information in the K layer space of the first semi-supervised depth model;

and

weight parameters of the k-th layer respectively representing the t-th attribute information and

the weight parameter of the corresponding decoder stage.

And step S2, optimizing the structural adjacent information among the network nodes through a second semi-supervised depth model, and obtaining the representation of the network nodes in the hidden space, which is obtained only by using the structural adjacent information.

In the step, the structural adjacency information between the network nodes is used as the input of the second semi-supervised depth model, and then the representation in the hidden space is finely adjusted by using the structural adjacency information between the network nodes through a back propagation algorithm. The step has the effect of enhancing the proportion of the effect of the structural topology information in the final representation, and simultaneously, the step is also used for better processing the condition that the attribute information is too sparse or completely absent.

In the model framework, this step corresponds to the structural characterization part.

In the embodiment of the invention, the second semi-supervised depth model is similar to the first semi-supervised depth model, and the difference is that the attribute similarity matrix C ^t Replacing with the structural adjacency matrix S and using the same in formula (4)

And setting 0.

Network node v obtained in this step for the sake of uniformity of representation _i Characterization in hidden space is also denoted

Where K is 2]And t is s, s is the network node v _i Structural adjacency information with other network nodes. That is to say

In (1), T ∈ { [1, T { ]]S, distinguishing according to the value of t

Is the result obtained in step S1 or step S2.

Those skilled in the art will appreciate that the first and second terms are used only to distinguish between the two semi-supervised depth models.

And step S3, fusing the representations of the hidden space obtained in the steps S1-S2 to obtain the final network representation.

In the step, the representations of different hidden spaces obtained in the previous two steps are fused, so that a better network representation can be obtained, and the representation is robust to a network with sparsity and inconsistent attributes and has more effective information to a certain extent.

In the model frame, this step corresponds to the bonding layer portion.

In the embodiment of the invention, two ways of fusing the representations of different hidden spaces are splicing and weight summation respectively;

for the splicing of two matrixes, the data of the same row are spliced into one row, so that a new matrix is formed; that is, the tokens of step S1 and step S2 are merged into a final token, denoted as y _i 。

For the weight summation method, the weight values can be obtained by an attention mechanism through training with label training data, and the formula is as follows:

wherein, y _i As a final network nodev _i The characterization of (a) is performed,

corresponding to the weight of the hidden space characterization, when T is belonged to [1, T]When t is s, it represents structural adjacency information.

The attention-based weight learning method is defined as follows:

wherein, when T is epsilon [1, T]When, G _t A weight vector representing the t-th attribute information, wherein G is the value of t ═ s _t A weight vector representing the structural adjacency information,

is a network node v _i The concatenation results of the representations of the different hidden spaces obtained in steps S1 to S2, σ (·) represents a sigmoid function.

The characterization of multiple hidden spaces of each network node obtained by splicing the previous modules, or the characterization y with robustness and discrimination can be obtained by formula (6) _i At the same time, such characterization is somewhat more informative.

On the other hand, the embodiment of the invention also provides an optimization algorithm of the model, aiming at realizing the optimization of the first semi-supervised depth model and the second semi-supervised depth model:

1) the first and second semi-supervised depth models are optimized.

The purpose of the optimization is to determine the weight and bias parameters

(i.e., parameters)

Including first and second semi-supervised depth models

And

) Minimizing a loss function of the first semi-supervised depth model

The optimization process is calculated by the following formula

About

Partial derivatives of (a):

in the above formula, the first and second carbon atoms are,

a bias parameter of a k layer representing tth attribute information of a decoding stage of the first semi-supervised depth model;

obtained by the above formula

And then adjusting parameters in the first semi-supervised depth model using a back propagation algorithm.

Similarly, the parameters in the second semi-supervised depth model are adjusted by adopting the method.

2) Obtaining the weights in equation (7) using a back propagation algorithm based on the specific task error

For classification tasks, the weight vector G for the t-th attribute information is minimized _t The objective function of (a) is as follows:

wherein the content of the first and second substances,

is a collection of network nodes, wherein the total number of network nodes is n, f (-) is a softmax function,

is a set of parameter vectors for the classifier,

is a network node v _i In the tag vector, if v _i Belonging to the category p, setting the p-th bit as 1, otherwise, setting the p-th bit as 0;

then, O is calculated _class With respect to G _t Then using back propagation algorithm to obtain the optimized weight in formula (7)

The above method provided by the embodiments of the present invention can be expressed as follows:

inputting:

structural adjacency matrix S, hyper-parameters alpha and beta, wherein

Represents a collection of network nodes, epsilon represents a collection of edges (edges embody the connection relationships between network nodes),

representing a collection of attribute matrices.

Initialization: in a uniform interval [ -1,1 [ ]]Medium random sampling initialization G _t And

t∈{[1,T],s}。

and (3) outputting: and (5) network characterization.

1: based on

Computing attribute similarity matrices

2: pre-training single attribute enhanced deep network through deep belief network to obtain initial parameters

3: repetition of

4: calculation by the aforementioned step S1 or step S2

For the characterization of the K-th hidden space,

for decoder stage to correspond to

An output of (d); t ∈ { [1, T ] here]S } also includes the stepS1 and the result of step S2;

5: updating parameter theta through formulas (2), (8-11) _k ；

6: until convergence

7：

8: repeat (R) to

9: updating the parameter G by the equations (6), (7), (12) _t And

10: until convergence

11: the network characterization is obtained by equation (6).

According to the scheme provided by the embodiment of the invention, when the representation of the network node is learned, the topological structure information and the node attribute information of the network are considered, and meanwhile, the auxiliary function of the attribute information is fully exerted; the scheme is particularly suitable for learning the representation of the network with sparse connection and inconsistent attribute information, and a better network representation learning effect is obtained in the mode; the scheme can provide more accurate support for tasks such as node classification, connection prediction, community discovery and targeted advertising.

Through the description of the above embodiments, it is clear to those skilled in the art that the above embodiments may be implemented by software, or by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.

The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims

1. A method for enhancing network representation learning by using multi-attribute information is applied to a social network and is suitable for node classification, and the method comprises the following steps:

step S1, fusing different types of attribute similarity information of the network nodes with structural adjacency information between corresponding network nodes through a first semi-supervised depth model to obtain a plurality of representations of different hidden spaces, wherein the representation calculation formula of each layer of hidden space is as follows:

wherein the content of the first and second substances,

respectively representing network nodes v _i Characterization of the tth attribute information in the hidden space of the layer 1 and the layer k; k denotes the number of network space layers, T denotes the number of attributes,

and

and

respectively representing the weight and the offset parameter of the network space, wherein the upper mark of the weight and the offset parameter is the serial number of the attribute information, and the lower mark of the weight and the offset parameter is the serial number of the network space layer; the attribute information includes: gender, reviews, and user portraits;

s2, optimizing the structural adjacency information among the network nodes through a second semi-supervised depth model to obtain the representation of the network nodes in the hidden space, which is obtained only by using the structural adjacency information;

s3, fusing the representations of the hidden space obtained in the steps S1-S2 to obtain a final network representation;

wherein a loss function of the first semi-supervised depth model is:

wherein, the first and the second end of the pipe are connected with each other,

a loss function representing an unsupervised training process of the first semi-supervised depth model,

represents L ₂ -norm rule item, superscript t being the sequence number of the attribute information; alpha is a

Beta represents

The weight coefficient of (c).

2. The method for enhancing network characterization learning by using multi-attribute information as claimed in claim 1, wherein, in the first semi-supervised depth model,

where n represents the total number of network nodes,. represents a point multiply operation; c ^t A similarity matrix representing the t-th attribute, wherein the i-th row and the j-th column of elements

Representing network nodes v _i Attribute vector of t-th attribute

And network node v _j Attribute vector of t-th attribute

The similarity between the two or more of the images,

is C ^t The number of the ith row of (a),

representing the decoding phase in the first semi-supervised depth model corresponds to

Is then outputted from the output of (a),

representing the decoding phase in the first semi-supervised depth model corresponding to C ^t The matrix of (a) is,

represents the adjustment matrix H ^t The number of the ith row of (a),

s _i,j elements corresponding to the ith row and jth column elements of the adjacency matrix S, η and γ being hyper-parameters;

and

respectively represent nodes v _i And v _j The representation of the t-th attribute information in the K-th layer space of the first semi-supervised depth model;

respectively representing the weight of the kth layer of the tth attribute information and the weight of the kth layer of the tth attribute information in the decoding stage of the first semi-supervised depth model.

3. The method of claim 2, wherein the second semi-supervised depth model is similar to the first semi-supervised depth model except that the attribute similarity matrix C is used ^t Is replaced by a structural adjacency matrix S, and

in the formula

Setting 0; network node v _i The characteristics in the hidden space are

Where K2, K, t s, s denotes a network node v _i Structural adjacency information with other network nodesInformation; that is to say

In (1), T ∈ { [1, T { ]]S, distinguishing according to the value of t

Is the result obtained in step S1 or step S2.

4. The method for enhancing network characterization learning by using multi-attribute information according to claim 1 or 3, wherein there are two ways of fusing the characterizations of the hidden space in step S3, which are splicing and weight summation respectively;

for the weight summation method, the weight values are obtained by training with label training data through an attention mechanism, and the formula is expressed as follows:

wherein, y _i For the final network node v _i The characterization of (a) is performed,

corresponding to the weight of the hidden space characterization, when T is belonged to [1, T]When t is s, it represents structural adjacency information;

the attention-based weight learning method is defined as follows:

wherein, when T is ∈ [1, T]When, G _t G when t is s, a weight vector representing the t-th attribute information _t A weight vector representing the structural adjacency information,

5. The method of claim 1, further comprising: optimizing the first and second semi-supervised depth models;

the purpose of the optimization is to determine the weight and bias parameters

Minimizing a loss function of the first semi-supervised depth model

The optimization process is calculated by the following formula

About

Partial derivatives of (a):

in the above-mentioned formula, the compound has the following structure,

obtained by the above formula

Then, adjusting parameters in the first half-supervised depth model by using a back propagation algorithm;

the adjustment of the parameters in the second semi-supervised depth model is performed in a similar manner.

6. The method of claim 4, further comprising: obtaining weights using back propagation algorithm based on specific task errors

For classification tasks, the weight vector G for the t-th attribute information is minimized _t The objective function of (2) is as follows:

wherein the content of the first and second substances,

is a set of parameter vectors for the classifier,

is a network node v _i In the tag vector, if v _i Belonging to the category p, setting the p-th bit to be 1, otherwise, setting the p-th bit to be 0;

then, O is calculated _class With respect to G _t Partial derivatives of (a) are weighted by a back propagation algorithm