CN112801288A - Vector representation method and device of graph network - Google Patents

Vector representation method and device of graph network

Info

Publication number
CN112801288A
CN112801288A
Authority
CN
China
Prior art keywords
graph network
matrix
edges
vector representation
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110157860.6A
Other languages
Chinese (zh)
Inventor
陈捷
方凤妹
梁秋梅
俞碧洪
栾江霞
左军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meiya Pico Information Co Ltd
Original Assignee
Xiamen Meiya Pico Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meiya Pico Information Co Ltd filed Critical Xiamen Meiya Pico Information Co Ltd
Priority to CN202110157860.6A priority Critical patent/CN112801288A/en
Publication of CN112801288A publication Critical patent/CN112801288A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a vector representation method and device for a graph network. The method comprises the following steps: constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, where N denotes the number of nodes and K denotes the number of different edge types; updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on the different edge types; standardizing the K adjacency matrices; and applying a feature extraction function f to extract features over the K edge types to form a vector representation D(G) of the graph network. The method greatly reduces the computational complexity of generating the graph network vector. In particular, when the graph network changes, for example when part of its structure is replaced or nodes are added, the vector representation can be regenerated by updating only the adjacency matrices of the changed part, which facilitates rapid training and deployment of the model.

Description

Vector representation method and device of graph network
Technical Field
The invention relates to the field of neural network algorithms, and in particular to a vector representation method and device for a graph network.
Background
With the development of deep neural networks, great success has been achieved in pattern recognition and data mining, for example in target detection, machine translation, and speech recognition. In graph networks, however, the number of nodes of each graph and the number of neighbors of each node vary greatly because of the irregularity of the graph itself; that is, a complex topological structure exists, so existing deep neural network techniques cannot be applied directly. A graph network is generally described as G(V, E), where V denotes the nodes of the graph network and E denotes its edges. To apply deep learning to the study of graph networks, many graph vector representation techniques have been developed in recent years: an adjacency matrix of the graph is built from the association relationships among the nodes, and the node features are concatenated, that is, the nodes of the graph and the relationships among them are represented as vectors, so that the vectorized graph network can be fed into existing deep learning techniques. Commonly used vector representation methods include factorization-based methods and random-walk-based methods.
The factorization-based method requires that each node be represented as a linear combination of its spatially adjacent nodes, i.e., any node V_i in the graph G(V, E) can be represented through its adjacency matrix as:
V'_i = Σ_j w_ij · V_j
where w_ij is the element of the adjacency matrix for nodes V_i and V_j. A vector representation of the graph network is then obtained by minimizing the difference between V_i and V'_i. Random-walk-based methods obtain an initial vector representation of each node by random walks (generally in one of two modes, breadth-first or depth-first); their advantage is that the more common neighbors two nodes have, the smaller the difference between their vectors. The initial vectors are then trained in a word2vec-like manner to obtain the vector representation of the graph network. Both kinds of methods are used for learning on large-scale graph networks, but the results are not ideal for small-scale graph networks or for sparse adjacency matrices, and the computation is complex: the nodes of the graph network and their adjacency matrices must be traversed, which takes a long time. Moreover, when the graph network vector is generated, differences between edges, that is, differences in the connection relationships between nodes, are not considered. For example, in a social network the nodes represent different people, and each pair of nodes may be connected through different contact channels such as WeChat, short messages, and telephone calls; that is, the attributes of the edges between nodes may differ.
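As an illustrative sketch of the factorization idea described in the background (the matrix sizes and random values below are hypothetical, not from the patent): each node vector is approximated by the w_ij-weighted combination of the other node vectors, and the embedding is the set of vectors that minimizes the reconstruction error.

```python
import numpy as np

# Hypothetical setup: 5 nodes with 3-dimensional embeddings.
np.random.seed(2)
N, d = 5, 3
W = np.random.rand(N, N)       # adjacency weights w_ij (stand-in values)
np.fill_diagonal(W, 0.0)       # no self-loops
V = np.random.rand(N, d)       # current node embeddings V_i

V_recon = W @ V                # V'_i = sum_j w_ij * V_j
loss = np.sum((V - V_recon) ** 2)   # objective minimized over V
print(V_recon.shape)  # (5, 3)
```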
Disclosure of Invention
The present invention is directed to a method and an apparatus for vector representation of a graph network, so as to solve the above problems. Therefore, the invention adopts the following specific technical scheme:
according to an aspect of the present invention, there is provided a vector representation method of a graph network, comprising the steps of:
constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N denotes the number of nodes and K denotes the number of different edge types;
updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on the different edge types;
standardizing the K adjacency matrices;
and applying a feature extraction function f to extract features over the K edge types to form a vector representation D(G) of the graph network.
Further, the specific process of standardizing the K adjacency matrices is as follows:
firstly, an upper threshold is set for each adjacency matrix, and if an element value in the matrix exceeds the upper threshold, it is replaced by the upper threshold;
then, each adjacency matrix element is normalized so that its value falls in the range 0 to 1.
Further, the normalization employs maximum normalization, that is, each element value in the adjacency matrix is divided by the maximum of the element values.
Further, the feature extraction function f includes taking the maximum, the minimum, the mean, or a weighted linear combination.
According to another aspect of the present invention, there is also provided a vector representation apparatus of a graph network, comprising:
an adjacency matrix construction module, configured to construct an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N denotes the number of nodes and K denotes the number of different edge types;
an element value updating module, configured to update the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on the different edge types;
a standardization processing module, configured to standardize the K adjacency matrices;
and a vector representation module, configured to apply a feature extraction function f to extract features over the K edge types to form a vector representation D(G) of the graph network.
Further, the specific process of standardizing the K adjacency matrices is as follows:
firstly, an upper threshold is set for each adjacency matrix, and if an element in the matrix exceeds the upper threshold, it is replaced by the upper threshold;
then, each adjacency matrix element is normalized so that its value falls in the range 0 to 1.
Further, the normalization employs maximum normalization, that is, each element value in the adjacency matrix is divided by the maximum of the element values.
Further, the feature extraction function f includes taking the maximum, the minimum, the mean, or a weighted linear combination.
By adopting the above technical scheme, the invention achieves the following beneficial effects: the method greatly reduces the computational complexity of generating the graph network vector; in particular, when the graph network changes, for example when part of its structure is replaced or nodes are added, the vector representation can be regenerated by updating only the adjacency matrices of the changed part, which facilitates rapid training and deployment of the model. In addition, the invention incorporates edge differences, including the different edge types and the number of contacts on each edge, into the vector generation, taking the edge attributes of the graph network into account and benefiting more accurate subsequent applications.
Drawings
To further illustrate the various embodiments, the invention provides the accompanying drawings. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the embodiments. Those skilled in the art will appreciate still other possible embodiments and advantages of the present invention with reference to these figures. Elements in the figures are not drawn to scale and like reference numerals are generally used to indicate like elements.
FIG. 1 is a flow chart of a vector representation method of a graph network of the present invention;
FIG. 2 is a schematic diagram of a graph network;
FIG. 3 is a schematic diagram of an adjacency matrix corresponding to the graph network shown in FIG. 2;
FIG. 4 is a schematic illustration of the adjacency matrix shown in FIG. 3 after normalization;
FIG. 5 is a vector representation matrix of the graph network shown in FIG. 2;
fig. 6 is a block diagram of a vector representation apparatus of the graph network of the present invention.
Detailed Description
The invention will now be further described with reference to the accompanying drawings and detailed description.
As shown in fig. 1, a vector representation method of a graph network includes the following steps:
s1, constructing an adjacency matrix Y of the graph network according to the incidence relation among the nodes of the graph networkN×N×KWherein N represents the number of nodes and K represents the number of edges of different types. Fig. 2 shows an example of a graph network comprising 6 nodes a-F, 3 types of edges, short message, wechat and telephone respectively. The number of adjacency matrices constructed is 3 and the dimensions are 6 × 6.
And S2, updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on the different edge types. Taking the graph network of Fig. 2 as an example, its adjacency matrices can be represented in the form of Fig. 3, where (a) is built from the telephone relationship, (b) from the WeChat relationship, and (c) from the short-message relationship. An element greater than zero indicates that such a connection exists, and a value greater than 1 records the number of times the connection actually occurred.
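Step S2 can be sketched by incrementing counts from a list of contact records; the records below are hypothetical, and an undirected graph is assumed:

```python
import numpy as np

nodes = ["A", "B", "C", "D", "E", "F"]
edge_types = ["phone", "wechat", "sms"]
idx = {n: i for i, n in enumerate(nodes)}
tidx = {t: k for k, t in enumerate(edge_types)}
Y = np.zeros((len(nodes), len(nodes), len(edge_types)))

# Hypothetical contact records: (from, to, edge type).
records = [("A", "B", "phone"), ("A", "B", "phone"), ("B", "C", "sms")]

# Each record increments the element for that node pair and edge type,
# so an entry greater than 1 records how often the connection occurred.
for u, v, t in records:
    Y[idx[u], idx[v], tidx[t]] += 1
    Y[idx[v], idx[u], tidx[t]] += 1  # symmetric: undirected graph assumed

print(Y[idx["A"], idx["B"], tidx["phone"]])  # 2.0
```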
S3, normalizing the K adjacency matrices. This avoids computational anomalies caused by abnormal data in the matrices, for example an association count so large that one element of a matrix becomes extreme and the features of the other data lose significance. Normalization proceeds as follows:
first, an upper threshold is set for each adjacency matrix; if an element of the matrix exceeds the upper threshold, its value is replaced by the threshold. Depending on the edge relationship, the threshold can be the same for all matrices or set to a different constant for each;
Y_l(i, j) = min(Y_l(i, j), N_l)
where N_l is the upper threshold of the l-th edge type;
in the above formula, i and j denote the position of an element in the adjacency matrix and range from 1 to N, and l is the index of the adjacency matrix and ranges from 1 to K;
next, each adjacency matrix element is normalized so that its value falls in the range 0 to 1, for example by maximum normalization, as in the following formula:
Y_l(i, j) = Y_l(i, j) / Max(Y_l)
in the above formula, i and j denote the position of an element in the adjacency matrix and range from 1 to N, l is the index of the adjacency matrix and ranges from 1 to K, and Max(Y_l) denotes the maximum element value in the l-th adjacency matrix; this ensures that the K edge types share the same value range;
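Step S3 (clipping at the threshold, then maximum normalization) can be sketched as follows; the matrix values and per-edge-type thresholds here are illustrative:

```python
import numpy as np

# Stand-in data: 3 random 6 x 6 count matrices with values in 0..49.
np.random.seed(0)
N, K = 6, 3
Y = np.random.randint(0, 50, size=(N, N, K)).astype(float)
thresholds = [20.0, 30.0, 10.0]  # one upper threshold N_l per edge type

for l in range(K):
    Y[:, :, l] = np.minimum(Y[:, :, l], thresholds[l])  # clip outliers at N_l
    m = Y[:, :, l].max()
    if m > 0:
        Y[:, :, l] /= m                                 # maximum normalization

# Every edge type now shares the same 0-to-1 value range.
print(Y.min() >= 0.0 and Y.max() <= 1.0)  # True
```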
S4, applying a feature extraction function f to the K edge types to form a vector representation D(G) of the graph network. The feature extraction can generally take the maximum, the minimum, the mean, or a weighted linear combination:
D(G)_{i,j} = Max_l(Y_l(i, j)), Min_l(Y_l(i, j)), Mean_l(Y_l(i, j)), or Σ_l α_l · Y_l(i, j)
in the above formula, D(G)_{i,j} denotes the element in the i-th row and j-th column of the resulting graph network vector; Max() takes, by comparison, the maximum of the elements at the corresponding position across the K adjacency matrices, Min() the minimum, and Mean() the mean; α_l is a weight factor ranging from 0 to 1, whose value can reflect the importance of the different edge relationships;
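Step S4 can be sketched as an element-wise reduction over the K normalized matrices; the random input and the weights α_l below are illustrative:

```python
import numpy as np

# Stand-in for the K normalized adjacency matrices (values in 0..1).
np.random.seed(1)
N, K = 6, 3
Y = np.random.rand(N, N, K)

D_max = Y.max(axis=2)     # Max(): element-wise maximum over the K edge types
D_min = Y.min(axis=2)     # Min(): element-wise minimum
D_mean = Y.mean(axis=2)   # Mean(): element-wise mean
alpha = np.array([0.5, 0.3, 0.2])        # illustrative weights, sum to 1
D_weighted = (Y * alpha).sum(axis=2)     # weighted linear combination

print(D_mean.shape)  # (6, 6)
```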
Similarly, taking the graph network shown in Fig. 2 as an example, the specific calculation proceeds as follows. Normalization is performed first, as shown in Fig. 4.
Then, taking the mean with equal weight for every edge type, the features are extracted; the resulting vector D_{ij} is shown in Fig. 5.
Finally, the generated graph network vector can be fed into a neural network for training and applied to the corresponding graph network analysis.
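The steps S1 through S4 above can be combined into one end-to-end sketch; the helper function name, node names, records, and thresholds are all illustrative, and the mean with equal weights is used for step S4 as in the worked example:

```python
import numpy as np

def graph_vector(nodes, edge_types, records, thresholds):
    """Build the N x N graph representation D(G) from contact records."""
    idx = {n: i for i, n in enumerate(nodes)}
    tidx = {t: k for k, t in enumerate(edge_types)}
    Y = np.zeros((len(nodes), len(nodes), len(edge_types)))  # S1: Y_{N x N x K}
    for u, v, t in records:                    # S2: actual contact counts
        Y[idx[u], idx[v], tidx[t]] += 1
        Y[idx[v], idx[u], tidx[t]] += 1
    for l, thr in enumerate(thresholds):       # S3: clip, then max-normalize
        Y[:, :, l] = np.minimum(Y[:, :, l], thr)
        m = Y[:, :, l].max()
        if m > 0:
            Y[:, :, l] /= m
    return Y.mean(axis=2)                      # S4: mean over the K edge types

D = graph_vector(["A", "B", "C"], ["phone", "sms"],
                 [("A", "B", "phone"), ("A", "B", "phone"), ("B", "C", "sms")],
                 [5, 5])
print(D.shape)  # (3, 3)
```

If part of the graph changes, only the affected adjacency matrices need to be updated before re-running the normalization and extraction, which is the source of the complexity reduction claimed above.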
As shown in fig. 6, the present invention also provides a vector representation apparatus of a graph network, which includes:
an adjacency matrix construction module 100, configured to construct an adjacency matrix Y of the graph network according to the association relationship between nodes of the graph networkN×N×KWherein N represents the number of nodes and K represents the number of edges of different types.
And an element value updating module 200, configured to update the element values in the K corresponding adjacency matrices according to the actual association times of each node on different types of edges.
The normalization processing module 300 is configured to normalize the K adjacency matrices: first, an upper threshold is set for each adjacency matrix, and if an element exceeds the upper threshold it is replaced by the threshold; then, each element is normalized so that its value falls in the range 0 to 1. This ensures that the K edge types share the same value range.
And a vector representation module 400, configured to apply a feature extraction function f to extract features over the K edge types to form a vector representation D(G) of the graph network. The feature extraction function f may take the maximum, the minimum, the mean, or a weighted linear combination.
The method of the invention greatly reduces the computational complexity of generating the graph network vector; in particular, when the graph network changes, for example when part of its structure is replaced or nodes are added, the vector representation can be regenerated by updating only the adjacency matrices of the changed part. In addition, the invention incorporates edge differences, including the different edge types and the number of contacts on each edge, into the vector generation, taking the edge attributes of the graph network into account and benefiting more accurate subsequent applications.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A method for vector representation of a graph network, comprising the steps of:
constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N denotes the number of nodes and K denotes the number of different edge types;
updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on the different edge types;
standardizing the K adjacency matrices;
and applying a feature extraction function f to extract features over the K edge types to form a vector representation D(G) of the graph network.
2. The method of claim 1, wherein the K adjacency matrices are normalized by:
firstly, setting an upper threshold for each adjacency matrix, and replacing an element value in the matrix with the upper threshold if the element value exceeds it;
then, normalizing each adjacency matrix element so that its value falls in the range 0 to 1.
3. The method of claim 2, wherein the normalization process employs a maximum normalization process, i.e., dividing each element value in the adjacency matrix by a maximum of the element values.
4. The method of claim 1, wherein the feature extraction function f comprises taking a maximum, minimum, mean, or weighted linear combination.
5. An apparatus for vector representation of a graph network, comprising:
an adjacency matrix construction module, configured to construct an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N denotes the number of nodes and K denotes the number of different edge types;
an element value updating module, configured to update the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on the different edge types;
a standardization processing module, configured to standardize the K adjacency matrices;
and a vector representation module, configured to apply a feature extraction function f to extract features over the K edge types to form a vector representation D(G) of the graph network.
6. The apparatus of claim 5, wherein the K adjacency matrices are normalized by:
firstly, setting an upper threshold for each adjacency matrix, and replacing an element in the matrix with the upper threshold if the element exceeds it;
then, normalizing each adjacency matrix element so that its value falls in the range 0 to 1.
7. The apparatus of claim 6, wherein the normalization process employs a maximum normalization process of dividing each element value in the adjacency matrix by a maximum value in the element values.
8. The apparatus of claim 5, wherein the feature extraction function f comprises taking a maximum, minimum, mean, or weighted linear combination.
CN202110157860.6A 2021-02-05 2021-02-05 Vector representation method and device of graph network Pending CN112801288A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110157860.6A CN112801288A (en) 2021-02-05 2021-02-05 Vector representation method and device of graph network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110157860.6A CN112801288A (en) 2021-02-05 2021-02-05 Vector representation method and device of graph network

Publications (1)

Publication Number Publication Date
CN112801288A true CN112801288A (en) 2021-05-14

Family

ID=75814265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110157860.6A Pending CN112801288A (en) 2021-02-05 2021-02-05 Vector representation method and device of graph network

Country Status (1)

Country Link
CN (1) CN112801288A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108022171A (en) * 2016-10-31 2018-05-11 腾讯科技(深圳)有限公司 A kind of data processing method and equipment
CN108648095A (en) * 2018-05-10 2018-10-12 浙江工业大学 A kind of nodal information hidden method accumulating gradient network based on picture scroll
US20180351971A1 (en) * 2017-01-24 2018-12-06 Nec Laboratories America, Inc. Knowledge transfer system for accelerating invariant network learning
CN109101629A (en) * 2018-08-14 2018-12-28 合肥工业大学 A kind of network representation method based on depth network structure and nodal community
CN110009093A (en) * 2018-12-07 2019-07-12 阿里巴巴集团控股有限公司 For analyzing the nerve network system and method for relational network figure
CN110555050A (en) * 2018-03-30 2019-12-10 华东师范大学 heterogeneous network node representation learning method based on meta-path
CN111783100A (en) * 2020-06-22 2020-10-16 哈尔滨工业大学 Source code vulnerability detection method for code graph representation learning based on graph convolution network
CN112131480A (en) * 2020-09-30 2020-12-25 中国海洋大学 Personalized commodity recommendation method and system based on multilayer heterogeneous attribute network representation learning


Similar Documents

Publication Publication Date Title
CN109214441A (en) A kind of fine granularity model recognition system and method
CN110473592B (en) Multi-view human synthetic lethal gene prediction method
CN112685504B (en) Production process-oriented distributed migration chart learning method
CN107832789B (en) Feature weighting K nearest neighbor fault diagnosis method based on average influence value data transformation
CN111259917B (en) Image feature extraction method based on local neighbor component analysis
CN113628059B (en) Associated user identification method and device based on multi-layer diagram attention network
CN113822419B (en) Self-supervision graph representation learning operation method based on structural information
CN113723238B (en) Face lightweight network model construction method and face recognition method
CN112215292A (en) Image countermeasure sample generation device and method based on mobility
CN111898735A (en) Distillation learning method, distillation learning device, computer equipment and storage medium
CN107517201A (en) A kind of network vulnerability discrimination method removed based on sequential
CN114708479B (en) Self-adaptive defense method based on graph structure and characteristics
CN113240105A (en) Power grid steady state discrimination method based on graph neural network pooling
CN113541985A (en) Internet of things fault diagnosis method, training method of model and related device
CN113987236B (en) Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network
CN109063535B (en) Pedestrian re-identification and pedestrian gender classification method based on joint deep learning
CN110598585A (en) Sit-up action recognition method based on convolutional neural network
CN111160536B (en) Convolution embedding representation inference method based on fragmentation knowledge
CN117272195A (en) Block chain abnormal node detection method and system based on graph convolution attention network
CN116089652B (en) Unsupervised training method and device of visual retrieval model and electronic equipment
CN112801288A (en) Vector representation method and device of graph network
CN116232694A (en) Lightweight network intrusion detection method and device, electronic equipment and storage medium
CN111178431A (en) Network node role identification method based on neural network and multi-dimensional feature extraction
Jaiswal et al. Spending your winning lottery better after drawing it
CN114896977A (en) Dynamic evaluation method for entity service trust value of Internet of things

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210514

RJ01 Rejection of invention patent application after publication