CN112801288A - Vector representation method and device of graph network
Vector representation method and device of graph network
- Publication number
- CN112801288A (application CN202110157860.6A)
- Authority
- CN
- China
- Prior art keywords
- graph network
- matrix
- edges
- vector representation
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The invention relates to a vector representation method and device for a graph network, the method comprising the following steps: constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N represents the number of nodes and K represents the number of edge types; updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge; normalizing the K adjacency matrices; and applying a feature extraction function f to extract the features of the K edge types, forming a vector representation D(G) of the graph network. The method greatly reduces the computational complexity of generating the graph network vector; in particular, when the graph network changes, for example when part of the structure is replaced or nodes are added, the vector representation can be regenerated by updating only the adjacency matrices of the changed part, which facilitates rapid training and deployment of the model.
Description
Technical Field
The invention relates to the field of neural network algorithms, in particular to a vector representation method and a vector representation device of a graph network.
Background
With the development of deep neural networks, great success has been achieved in pattern recognition and data mining, for example in target detection, machine translation, and speech recognition. In a graph network, however, the irregularity of the graph itself (the number of nodes per graph and the number of neighbors per node vary greatly) creates a complex topological structure, so existing deep neural network techniques cannot be applied directly. A graph network is generally described as G(V, E), where V denotes the nodes and E denotes the edges of the graph network. To apply deep learning to the study of graph networks, many graph vector representation techniques have been developed in recent years: an adjacency matrix of the graph is built from the association relationships among the nodes, and node features are concatenated, that is, the nodes and the relationships among them are represented as vectors, so that the vectorized graph network can be fed into existing deep learning techniques. Commonly used vector representation methods include factorization-based methods and random-walk-based methods.
The factorization-based method represents each node as a linear combination of its spatially adjacent nodes, i.e. any node V_i in the graph G(V, E) can be represented through its adjacency matrix as:

V'_i = Σ_j w_ij · V_j

where w_ij is the element of the adjacency matrix for nodes V_i and V_j. A vector representation of the graph network is then obtained by minimizing the difference between V_i and V'_i. The random-walk-based method obtains an initial vector representation of each node by random walks (generally in two modes, breadth-first and depth-first); its advantage is that the more common neighbors two nodes have, the smaller the difference between their vectors. The initial vectors are then trained in a word2vec-like manner to obtain the vector representation of the graph network. Both methods are used for learning on large-scale graph networks, but for small-scale graph networks, or when the adjacency matrix is sparse, the results are not ideal; moreover, the computation is complex, since the nodes of the graph network and their adjacency matrices must be traversed, which takes a long time. In addition, when generating the graph network vector, these methods do not consider differences among the edges between nodes, i.e. differences in the connection relationships. For example, in a social network the nodes represent different people, and each pair of nodes may be connected through different channels such as WeChat, short messages, or telephone; that is, the attributes of the edges between nodes may differ.
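As an illustration (not part of the patent), the factorization objective above can be sketched with numpy; the adjacency weights and the vector dimension below are invented for the example:

```python
import numpy as np

# Sketch of the factorization idea: each node vector V_i is approximated
# by the weighted sum of its neighbors, V'_i = sum_j w_ij * V_j, and the
# embedding is sought by minimizing the difference between V_i and V'_i.
np.random.seed(0)
W = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])        # adjacency weights w_ij (assumed)
V = np.random.rand(3, 4)            # candidate 4-dimensional node vectors
V_prime = W @ V                     # reconstructed vectors V'_i
loss = np.sum((V - V_prime) ** 2)   # reconstruction error to minimize
```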
Disclosure of Invention
The present invention is directed to a method and an apparatus for vector representation of a graph network, so as to solve the above problems. Therefore, the invention adopts the following specific technical scheme:
according to an aspect of the present invention, there is provided a vector representation method of a graph network, comprising the steps of:
constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N represents the number of nodes and K represents the number of edge types;
updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge;
normalizing the K adjacency matrices;
and applying a feature extraction function f to extract the features of the K edge types, forming a vector representation D(G) of the graph network.
Further, the specific process of normalizing the K adjacency matrices is as follows:
first, an upper threshold is set for each adjacency matrix, and any element value that exceeds the upper threshold is replaced with the upper threshold;
then, each element of each adjacency matrix is normalized so that its value lies in the range 0 to 1.
Further, the normalization adopts maximum-value normalization, i.e. each element value in the adjacency matrix is divided by the maximum element value.
Further, the feature extraction function f includes taking the maximum, the minimum, the mean, or a weighted linear combination.
According to another aspect of the present invention, there is also provided a vector representation apparatus for a graph network, comprising:
an adjacency matrix construction module for constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N represents the number of nodes and K represents the number of edge types;
an element value updating module for updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge;
a normalization processing module for normalizing the K adjacency matrices;
and a vector representation module for applying a feature extraction function f to extract the features of the K edge types, forming a vector representation D(G) of the graph network.
Further, the specific process of normalizing the K adjacency matrices is as follows:
first, an upper threshold is set for each adjacency matrix, and any element that exceeds the upper threshold is replaced with the upper threshold;
then, each element of each adjacency matrix is normalized so that its value lies in the range 0 to 1.
Further, the normalization adopts maximum-value normalization, i.e. each element value in the adjacency matrix is divided by the maximum element value.
Further, the feature extraction function f includes taking the maximum, the minimum, the mean, or a weighted linear combination.
By adopting the above technical scheme, the invention has the following beneficial effects: the method greatly reduces the computational complexity of generating the graph network vector; in particular, when the graph network changes, for example when part of the structure is replaced or nodes are added, the vector representation can be regenerated by updating only the adjacency matrices of the changed part, which facilitates rapid training and deployment of the model. Furthermore, the invention incorporates edge differences, including the different types of edge relations and the number of contacts on each edge, into the vector generation, taking edge attributes in the graph network into account and benefiting more accurate subsequent applications.
Drawings
To further illustrate the various embodiments, the invention provides the accompanying drawings. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the embodiments. Those skilled in the art will appreciate still other possible embodiments and advantages of the present invention with reference to these figures. Elements in the figures are not drawn to scale and like reference numerals are generally used to indicate like elements.
FIG. 1 is a flow chart of a vector representation method of a graph network of the present invention;
FIG. 2 is a schematic diagram of a graph network;
FIG. 3 is a schematic diagram of the adjacency matrices corresponding to the graph network shown in FIG. 2;
FIG. 4 is a schematic illustration of the adjacency matrix shown in FIG. 3 after normalization;
FIG. 5 is a vector representation matrix of the graph network shown in FIG. 2;
fig. 6 is a block diagram of a vector representation apparatus of the graph network of the present invention.
Detailed Description
The invention will now be further described with reference to the accompanying drawings and detailed description.
As shown in fig. 1, a vector representation method of a graph network includes the following steps:
s1, constructing an adjacency matrix Y of the graph network according to the incidence relation among the nodes of the graph networkN×N×KWherein N represents the number of nodes and K represents the number of edges of different types. Fig. 2 shows an example of a graph network comprising 6 nodes a-F, 3 types of edges, short message, wechat and telephone respectively. The number of adjacency matrices constructed is 3 and the dimensions are 6 × 6.
S2, updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge. Taking the graph network of fig. 2 as an example, its adjacency matrices can be represented in the form of fig. 3, wherein (a) is established according to the telephone relationship, (b) according to the WeChat relationship, and (c) according to the short message relationship. An element greater than zero indicates that such a connection exists, and its value indicates the number of times the connection actually occurred.
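Steps S1 and S2 can be sketched as follows; the node names, edge records, and contact counts below are invented for illustration and are not taken from fig. 2:

```python
import numpy as np

# Build K adjacency matrices Y[:, :, l] for a 6-node network with 3 edge
# types; each element stores the actual number of contacts (step S2).
nodes = ["A", "B", "C", "D", "E", "F"]
edge_types = ["sms", "wechat", "telephone"]
idx = {n: i for i, n in enumerate(nodes)}
tidx = {t: l for l, t in enumerate(edge_types)}

# (node, node, edge type, contact count) records, invented for the sketch
edges = [("A", "B", "wechat", 5),
         ("A", "C", "telephone", 2),
         ("B", "D", "sms", 1)]

N, K = len(nodes), len(edge_types)
Y = np.zeros((N, N, K))
for u, v, t, c in edges:
    Y[idx[u], idx[v], tidx[t]] = c
    Y[idx[v], idx[u], tidx[t]] = c   # keep each matrix symmetric
```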
S3, normalizing the K adjacency matrices. This avoids computational anomalies caused by abnormal data in the matrices, for example an association count so large that one element of a matrix becomes extremely high and the features of the other data become insignificant. The normalization method is as follows:
first, an upper threshold is set for each adjacency matrix, and if an element in the matrix exceeds the upper threshold, its value is replaced with the upper threshold. According to different edge relations, the threshold value can be the same or set to different constants;
In the above formula, i, j represents the position serial number of the element in the adjacency matrix, and takes values from 1 to N, and l is the serial number of the adjacency matrix and takes values from 1 to K;
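The clipping step can be sketched as follows; the per-type thresholds and the outlier count are assumptions for the example:

```python
import numpy as np

# Clip each adjacency matrix at its own upper threshold (step S3, part 1).
Y = np.zeros((6, 6, 3))
Y[0, 1, 1] = 120.0                   # an outlier contact count (invented)
Y[1, 0, 1] = 120.0
T = np.array([10.0, 50.0, 20.0])     # per-type upper thresholds (assumed)
Y_clipped = np.minimum(Y, T)         # broadcasts T over the last axis
```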
next, each adjacent matrix element is normalized to adjust the value to the range from 0 to 1, for example, by maximum normalization, as shown in the following formula:
in the above formula, i, j represents the position serial number of the element in the adjacency matrix, and takes values from 1 to N, l is the serial number of the adjacency matrix, and takes values from 1 to K, and Max () represents the maximum element value in the l-th adjacency matrix, so as to ensure that the value ranges of the K kinds of edges are the same;
S4, applying a feature extraction function f to the K edge types to form a vector representation D(G) of the graph network. The feature extraction can generally take the maximum, the minimum, the mean, or a weighted linear combination:

D(G)_{i,j} = Max_l(Y_l(i, j)), Min_l(Y_l(i, j)), Mean_l(Y_l(i, j)), or Σ_l α_l · Y_l(i, j)

In the above formula, D(G)_{i,j} is the element in row i, column j of the resulting graph network vector; Max() takes the maximum of the elements at the corresponding position across the K adjacency matrices, Min() the minimum, and Mean() the mean; α_l is a weight factor with a value between 0 and 1, whose magnitude can reflect the importance of the different edge relations;
similarly, a specific calculation process will be described by taking the graph network shown in fig. 2 as an example, and a normalization process is first performed, as shown in fig. 4.
Then, taking the mean with equal weight for each edge type, feature extraction is performed, and the resulting vector D_{ij} is as shown in fig. 5.
And finally, the generated graph network vector can be substituted into a neural network for training and applied to corresponding graph network analysis.
As shown in fig. 6, the present invention also provides a vector representation apparatus of a graph network, which includes:
an adjacency matrix construction module 100, configured to construct an adjacency matrix Y of the graph network according to the association relationship between nodes of the graph networkN×N×KWherein N represents the number of nodes and K represents the number of edges of different types.
An element value updating module 200, configured to update the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge.
A normalization processing module 300, configured to normalize the K adjacency matrices: first, an upper threshold is set for each adjacency matrix, and any element exceeding the upper threshold is replaced with the upper threshold; then, each element of each adjacency matrix is normalized so that its value lies in the range 0 to 1. This ensures that the value ranges of the K edge types are the same.
A vector representation module 400, configured to apply a feature extraction function f to extract the features of the K edge types, forming a vector representation D(G) of the graph network. The feature extraction function f may take the maximum, the minimum, the mean, or a weighted linear combination.
The method of the invention greatly reduces the computational complexity of generating the graph network vector; in particular, when the graph network changes, for example when part of the structure is replaced or nodes are added, the vector representation can be regenerated by updating only the adjacency matrices of the changed part. Furthermore, the invention incorporates edge differences, including the different types of edge relations and the number of contacts on each edge, into the vector generation, taking edge attributes in the graph network into account and benefiting more accurate subsequent applications.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (8)
1. A method for vector representation of a graph network, comprising the steps of:
constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N represents the number of nodes and K represents the number of edge types;
updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge;
normalizing the K adjacency matrices;
and applying a feature extraction function f to extract the features of the K edge types, forming a vector representation D(G) of the graph network.
2. The method of claim 1, wherein the K adjacency matrices are normalized as follows:
first, an upper threshold is set for each adjacency matrix, and any element value exceeding the upper threshold is replaced with the upper threshold;
then, each element of each adjacency matrix is normalized so that its value lies in the range 0 to 1.
3. The method of claim 2, wherein the normalization adopts maximum-value normalization, i.e. each element value in the adjacency matrix is divided by the maximum element value.
4. The method of claim 1, wherein the feature extraction function f comprises taking the maximum, minimum, mean, or a weighted linear combination.
5. An apparatus for vector representation of a graph network, comprising:
the adjacency matrix construction module, used for constructing an adjacency matrix Y_{N×N×K} of the graph network according to the association relationships among the nodes of the graph network, wherein N represents the number of nodes and K represents the number of edge types;
the element value updating module, used for updating the element values in the K corresponding adjacency matrices according to the actual number of associations of each node on each type of edge;
the normalization processing module, used for normalizing the K adjacency matrices;
and the vector representation module, used for applying a feature extraction function f to extract the features of the K edge types, forming a vector representation D(G) of the graph network.
6. The apparatus of claim 5, wherein the K adjacency matrices are normalized as follows:
first, an upper threshold is set for each adjacency matrix, and any element exceeding the upper threshold is replaced with the upper threshold;
then, each element of each adjacency matrix is normalized so that its value lies in the range 0 to 1.
7. The apparatus of claim 6, wherein the normalization adopts maximum-value normalization, i.e. each element value in the adjacency matrix is divided by the maximum element value.
8. The apparatus of claim 5, wherein the feature extraction function f comprises taking the maximum, minimum, mean, or a weighted linear combination.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110157860.6A CN112801288A (en) | 2021-02-05 | 2021-02-05 | Vector representation method and device of graph network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112801288A true CN112801288A (en) | 2021-05-14 |
Family
ID=75814265
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110157860.6A Pending CN112801288A (en) | 2021-02-05 | 2021-02-05 | Vector representation method and device of graph network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112801288A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108022171A (en) * | 2016-10-31 | 2018-05-11 | 腾讯科技(深圳)有限公司 | A kind of data processing method and equipment |
CN108648095A (en) * | 2018-05-10 | 2018-10-12 | 浙江工业大学 | A kind of nodal information hidden method accumulating gradient network based on picture scroll |
US20180351971A1 (en) * | 2017-01-24 | 2018-12-06 | Nec Laboratories America, Inc. | Knowledge transfer system for accelerating invariant network learning |
CN109101629A (en) * | 2018-08-14 | 2018-12-28 | 合肥工业大学 | A kind of network representation method based on depth network structure and nodal community |
CN110009093A (en) * | 2018-12-07 | 2019-07-12 | 阿里巴巴集团控股有限公司 | For analyzing the nerve network system and method for relational network figure |
CN110555050A (en) * | 2018-03-30 | 2019-12-10 | 华东师范大学 | heterogeneous network node representation learning method based on meta-path |
CN111783100A (en) * | 2020-06-22 | 2020-10-16 | 哈尔滨工业大学 | Source code vulnerability detection method for code graph representation learning based on graph convolution network |
CN112131480A (en) * | 2020-09-30 | 2020-12-25 | 中国海洋大学 | Personalized commodity recommendation method and system based on multilayer heterogeneous attribute network representation learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109214441A (en) | A kind of fine granularity model recognition system and method | |
CN110473592B (en) | Multi-view human synthetic lethal gene prediction method | |
CN112685504B (en) | Production process-oriented distributed migration chart learning method | |
CN107832789B (en) | Feature weighting K nearest neighbor fault diagnosis method based on average influence value data transformation | |
CN111259917B (en) | Image feature extraction method based on local neighbor component analysis | |
CN113628059B (en) | Associated user identification method and device based on multi-layer diagram attention network | |
CN113822419B (en) | Self-supervision graph representation learning operation method based on structural information | |
CN113723238B (en) | Face lightweight network model construction method and face recognition method | |
CN112215292A (en) | Image countermeasure sample generation device and method based on mobility | |
CN111898735A (en) | Distillation learning method, distillation learning device, computer equipment and storage medium | |
CN107517201A (en) | A kind of network vulnerability discrimination method removed based on sequential | |
CN114708479B (en) | Self-adaptive defense method based on graph structure and characteristics | |
CN113240105A (en) | Power grid steady state discrimination method based on graph neural network pooling | |
CN113541985A (en) | Internet of things fault diagnosis method, training method of model and related device | |
CN113987236B (en) | Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network | |
CN109063535B (en) | Pedestrian re-identification and pedestrian gender classification method based on joint deep learning | |
CN110598585A (en) | Sit-up action recognition method based on convolutional neural network | |
CN111160536B (en) | Convolution embedding representation inference method based on fragmentation knowledge | |
CN117272195A (en) | Block chain abnormal node detection method and system based on graph convolution attention network | |
CN116089652B (en) | Unsupervised training method and device of visual retrieval model and electronic equipment | |
CN112801288A (en) | Vector representation method and device of graph network | |
CN116232694A (en) | Lightweight network intrusion detection method and device, electronic equipment and storage medium | |
CN111178431A (en) | Network node role identification method based on neural network and multi-dimensional feature extraction | |
Jaiswal et al. | Spending your winning lottery better after drawing it | |
CN114896977A (en) | Dynamic evaluation method for entity service trust value of Internet of things |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210514 |