CN113158391B - Visualization method, system, equipment and storage medium for multidimensional network node classification - Google Patents

Info

Publication number
CN113158391B
Authority
CN
China
Prior art keywords
dimensional
node
graph
network
low
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110482521.5A
Other languages
Chinese (zh)
Other versions
CN113158391A (en
Inventor
魏迎梅
韩贝贝
杨雨璇
冯素茹
康来
谢毓湘
蒋杰
万珊珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202110482521.5A
Publication of CN113158391A
Application granted
Publication of CN113158391B

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F30/00 Computer-aided design [CAD]
            • G06F30/10 Geometric CAD
              • G06F30/18 Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
            • G06F30/20 Design optimisation, verification or simulation
              • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
          • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F16/90 Details of database functions independent of the retrieved data types
              • G06F16/904 Browsing; Visualisation therefor
              • G06F16/906 Clustering; Classification
          • G06F18/00 Pattern recognition
            • G06F18/20 Analysing
              • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
                • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
              • G06F18/22 Matching criteria, e.g. proximity measures
              • G06F18/24 Classification techniques
          • G06F2111/00 Details relating to CAD techniques
            • G06F2111/04 Constraint-based CAD
            • G06F2111/08 Probabilistic or stochastic CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the invention discloses a visualization method, system, device and storage medium for multi-dimensional network node classification. Based on a nonlinear dimension reduction algorithm, the low-dimensional embedding matrix is projected to obtain the coordinate values, in a two-dimensional space, of each node in the multi-dimensional graph data, and the classification result is presented with a visualization technique that uses the label information of the nodes as the color mapping. The low-dimensional embedding obtained by the embodiment simultaneously fuses short-range node information (local structure), long-range node information (global structure), and the attribute information of the nodes. The obtained low-dimensional embedding matrix is projected into a two-dimensional layout space by the nonlinear dimension reduction algorithm, so that the influence of the various kinds of feature information in the original multi-dimensional graph network on node classification is displayed intuitively through visualization.

Description

Visualization method, system, equipment and storage medium for multidimensional network node classification
Technical Field
The present disclosure relates to the field of network data processing, and in particular, to a method, system, device, and storage medium for visualizing multi-dimensional network node classification.
Background
With the rapid development of technology over the past decades, emerging technologies represented by the Internet and big data have penetrated every aspect of life, and humanity has entered the information age. The development of Internet-based information technology has led real-world systems to influence one another through different modes of interaction, and the connections among them are becoming increasingly tight. For a given system, the internal connectivity pattern can be described by a network: each component of the system is abstracted as a vertex (or node), and the connections among components are abstracted as edges. Examples include economic networks, social networks, biological networks, traffic networks, and e-commerce information networks; the evolution and transition of these networks also mirror the human real world.
In the current big data era, data that are massive, easy to obtain, and correlated can represent the diversity of relationships among nodes well, so that these relationships can be observed along each dimension. The coexistence of different types of interaction is the source of the collective phenomena observed among nodes, which often cannot arise in a single-layer network: a single-layer homogeneous network (a network containing only one node type and one edge type) can represent only one type of relationship between nodes. Take an online social network as an example: the same group of users interacts simultaneously on three social platforms, Sina Weibo, WeChat, and QQ. If a network with such multiple interaction relationships is represented as a single multi-relation fused network, then the structural characteristics of each dimension's graph network, and the coupling and interaction information between different dimensions, cannot be expressed clearly.
The prior art has the following technical problems: 1) classical random-walk-based network embedding techniques can capture information about a target node's distant neighbors (i.e., global structural features), but they capture only the structural features of the graph network and cannot capture node attribute features; 2) graph convolutional network techniques naturally fuse node attribute features but cannot capture information about a target node's distant neighbors; 3) both random-walk-based network embedding and graph convolutional network techniques are designed for single-layer homogeneous graph networks and cannot be applied directly to multi-dimensional graph networks; 4) classical graph visualization techniques based on nonlinear dimension reduction describe only the similarity exhibited by node structure, without considering the influence of other features of the graph network on node similarity.
Disclosure of Invention
Based on the foregoing, it is necessary to provide a visualization method, system, device and storage medium for multi-dimensional network node classification.
In a first aspect, an embodiment of the present invention provides a visualization method for multi-dimensional network node classification, including the following steps:
converting the nodes of every dimension's graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
computing a weighted average of the low-dimensional embeddings and concatenating it with the original node attribute features of the multi-dimensional graph to obtain second attribute features;
capturing the structure and attribute information of each dimension's graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
capturing the correlation among the graph networks of different dimensions by means of regularization constraints, and obtaining importance weights of the graph networks of different dimensions by means of an attention mechanism to obtain an attention matrix;
and performing a weighted fusion of the low-dimensional embedding matrix and the attention matrix to obtain comprehensive low-dimensional vectors of the n nodes, projecting the comprehensive low-dimensional vectors into a two-dimensional space, and displaying the classification result of the nodes.
Further, converting the nodes of every dimension's graph in the multi-dimensional graph network into dense low-dimensional embeddings based on the random-walk network embedding technique includes:
for any given node v_i in the graph network of dimension r, collecting the context information of the node v_i through a random-walk sampling strategy to obtain a walk sequence;
dividing the walk sequence with a context window to obtain a training sample sequence related to the node information;
inputting the training sample sequence into a Skip-Gram model and optimizing the objective by stochastic gradient descent to obtain the low-dimensional embedding of the node v_i.
Further, capturing the correlation among the graph networks of different dimensions by means of regularization constraints, and obtaining importance weights of the graph networks of different dimensions by means of an attention mechanism to obtain an attention matrix, includes:
applying a regularized consistency constraint to the low-dimensional embedding matrices and measuring the degree of similarity between them, to obtain the similarity between the corresponding dimension graph networks in the original multi-dimensional graph network;
capturing the correlation among the graph networks of different dimensions by regularizing the weight parameters of the M graph neural networks;
based on the attention mechanism, adaptively computing the importance weights of the different dimensions during training, guided by the downstream node classification task.
Further, performing a weighted fusion of the low-dimensional embedding matrix and the attention matrix to obtain comprehensive low-dimensional vectors of the n nodes, projecting the comprehensive low-dimensional vectors into a two-dimensional space, and displaying the classification result of the nodes includes:
projecting the comprehensive low-dimensional vectors into a two-dimensional layout space with a nonlinear dimension reduction technique to obtain the coordinate values of the nodes in the two-dimensional space;
based on the coordinate values, performing color mapping according to the label information of the nodes with a visualization technique, and displaying the classification result of the nodes in visual form.
In another aspect, the embodiment of the invention also provides a visualization system for multi-dimensional network node classification, which comprises:
a global structural feature module, used for converting the nodes of every dimension's graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
a dimension-specific graph network module, used for computing a weighted average of the low-dimensional embeddings and concatenating it with the original node attribute features of the multi-dimensional graph to obtain second attribute features;
a graph convolutional network module, used for capturing the structure and attribute information of each dimension's graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
a correlation constraint module, used for capturing the correlation among the graph networks of different dimensions by means of regularization constraints, and obtaining importance weights of the graph networks of different dimensions by means of an attention mechanism to obtain an attention matrix;
and a classification display module, used for performing a weighted fusion of the low-dimensional embedding matrix and the attention matrix to obtain comprehensive low-dimensional vectors of the n nodes, projecting the comprehensive low-dimensional vectors into a two-dimensional space, and displaying the classification result of the nodes.
Further, the global structural feature module includes a random-walk network embedding unit, and the random-walk network embedding unit is configured to:
for any given node v_i in the graph network of dimension r, collect the context information of the node v_i through a random-walk sampling strategy to obtain a walk sequence;
divide the walk sequence with a context window to obtain a training sample sequence related to the node information;
input the training sample sequence into a Skip-Gram model and optimize the objective by stochastic gradient descent to obtain the low-dimensional embedding of the node v_i.
Further, the correlation constraint module includes a constraint training unit configured to:
apply a regularized consistency constraint to the low-dimensional embedding matrices and measure the degree of similarity between them, to obtain the similarity between the corresponding dimension graph networks in the original multi-dimensional graph network;
capture the correlation among the graph networks of different dimensions by regularizing the weight parameters of the M graph neural networks;
based on the attention mechanism, adaptively compute the importance weights of the different dimensions during training, guided by the downstream node classification task.
Further, the classification display module includes a projection mapping unit configured to:
project the comprehensive low-dimensional vectors into a two-dimensional layout space with a nonlinear dimension reduction technique to obtain the coordinate values of the nodes in the two-dimensional space;
based on the coordinate values, perform color mapping according to the label information of the nodes with a visualization technique, and display the classification result of the nodes in visual form.
The embodiment of the invention also provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
converting the nodes of every dimension's graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
computing a weighted average of the low-dimensional embeddings and concatenating it with the original node attribute features of the multi-dimensional graph to obtain second attribute features;
capturing the structure and attribute information of each dimension's graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
capturing the correlation among the graph networks of different dimensions by means of regularization constraints, and obtaining importance weights of the graph networks of different dimensions by means of an attention mechanism to obtain an attention matrix;
and performing a weighted fusion of the low-dimensional embedding matrix and the attention matrix to obtain comprehensive low-dimensional vectors of the n nodes, projecting the comprehensive low-dimensional vectors into a two-dimensional space, and displaying the classification result of the nodes.
The embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
converting the nodes of every dimension's graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
computing a weighted average of the low-dimensional embeddings and concatenating it with the original node attribute features of the multi-dimensional graph to obtain second attribute features;
capturing the structure and attribute information of each dimension's graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
capturing the correlation among the graph networks of different dimensions by means of regularization constraints, and obtaining importance weights of the graph networks of different dimensions by means of an attention mechanism to obtain an attention matrix;
and performing a weighted fusion of the low-dimensional embedding matrix and the attention matrix to obtain comprehensive low-dimensional vectors of the n nodes, projecting the comprehensive low-dimensional vectors into a two-dimensional space, and displaying the classification result of the nodes.
The beneficial effects of this application are as follows. The embodiment of the invention discloses a visualization method, system, device and storage medium for multi-dimensional network node classification. Based on machine-learning network embedding techniques, namely random-walk network embedding and graph convolutional network embedding, and in combination with a regularization mechanism and an attention mechanism, a low-dimensional dense vector is obtained by fusion for each node in the multi-dimensional graph network, and these vectors form a low-dimensional embedding matrix. The low-dimensional embedding matrix contains various kinds of information about the multi-dimensional graph network, such as node attributes, local and global structure, and the correlation and importance differences among different dimensions. In addition, based on a nonlinear dimension reduction algorithm, the low-dimensional embedding matrix is projected to obtain the coordinate values, in a two-dimensional space, of each node in the multi-dimensional graph data, and the classification result is presented with a visualization technique that uses the label information of the nodes as the color mapping. Because the embodiment combines the random-walk network embedding technique with the graph convolutional network technique, the obtained low-dimensional embedding contains not only information about a node's distant neighbors (global structure) but also information about its near neighbors (local structure) and the attribute information of the node.
Furthermore, the embodiment projects the obtained low-dimensional embedding matrix into a two-dimensional layout space based on the nonlinear dimension reduction algorithm and, from the resulting two-dimensional coordinate values, uses visualization to display intuitively the influence of the various kinds of feature information in the original multi-dimensional graph network on node classification.
Drawings
FIG. 1 is a flow diagram of a visualization method of multi-dimensional network node classification disclosed in one embodiment;
FIG. 2 is a flow diagram of embedding multi-dimensional graph network nodes by random-walk network embedding, as disclosed in one embodiment;
FIG. 3 is a flow diagram of obtaining the correlation and importance of graph networks of different dimensions, as disclosed in one embodiment;
FIG. 4 is a flow diagram of classifying and visualizing multi-dimensional network nodes, as disclosed in one embodiment;
FIG. 5 is a block diagram of a visualization system for multi-dimensional network node classification in one embodiment;
FIG. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
Visualization technology accelerates information processing through visual images and provides strong support for discovering and understanding scientific laws. Graph visualization in particular has become an important method of network data analysis. Graph visualization methods fall mainly into two groups: force-directed methods and methods based on data dimension reduction. A force-directed technique models the graph as a physical system and computes the resultant of the attractive and repulsive forces acting on each node; the nodes move under this resultant force until the whole system reaches a stable state. Graph visualization based on data dimension reduction designs an optimization objective that preserves the similarity between the node distribution in the graph space and that in the two-dimensional layout space; that is, the difference between the two distributions is made as small as possible, so that the node distribution in the two-dimensional layout space reflects the node information in the original graph space. Graph visualization based on data dimension reduction is further divided into linear and nonlinear techniques. Visualization based on linear dimension reduction can reflect only data whose structure is linear, so visualization based on nonlinear dimension reduction is the more widely applied.
In one embodiment, as shown in FIG. 1, a visualization method for multi-dimensional network node classification is provided, including the following steps:
step 101, converting the nodes of every dimension's graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
step 102, computing a weighted average of the low-dimensional embeddings and concatenating it with the original node attribute features of the multi-dimensional graph to obtain second attribute features;
step 103, capturing the structure and attribute information of each dimension's graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
step 104, capturing the correlation among the graph networks of different dimensions by means of regularization constraints, and obtaining importance weights of the graph networks of different dimensions by means of an attention mechanism to obtain an attention matrix;
step 105, performing a weighted fusion of the low-dimensional embedding matrix and the attention matrix to obtain comprehensive low-dimensional vectors of the n nodes, projecting the comprehensive low-dimensional vectors into a two-dimensional space, and displaying the classification result of the nodes.
Specifically, this embodiment uses machine-learning network embedding techniques, namely random-walk network embedding and graph convolutional network embedding, in combination with a regularization mechanism and an attention mechanism, to obtain by fusion a low-dimensional dense vector for each node in the multi-dimensional graph network; these vectors form a low-dimensional embedding matrix. The low-dimensional embedding matrix contains various kinds of information about the multi-dimensional graph network, such as node attributes, local and global structure, and the correlation and importance differences among different dimensions. In addition, based on a nonlinear dimension reduction algorithm, the low-dimensional embedding matrix is projected to obtain the coordinate values, in a two-dimensional space, of each node in the multi-dimensional graph data, and the classification result is presented with a visualization technique that uses the label information of the nodes as the color mapping. Because the embodiment combines the random-walk network embedding technique with the graph convolutional network technique, the obtained low-dimensional embedding contains not only information about a node's distant neighbors (global structure) but also information about its near neighbors (local structure) and the attribute information of the node. Furthermore, the embodiment projects the obtained low-dimensional embedding matrix into a two-dimensional layout space based on the nonlinear dimension reduction algorithm and, from the resulting two-dimensional coordinate values, uses visualization to display intuitively the influence of the various kinds of feature information in the original multi-dimensional graph network on node classification.
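As an illustration of step 102 only, the following minimal sketch averages the per-dimension embeddings of each node and concatenates the result with the node's original attributes. The source does not say how the averaging weights are chosen, so the `weights` argument, the function name `second_attributes`, and the list-of-lists data layout are assumptions made for this example.

```python
def second_attributes(embeddings, weights, attrs):
    """Weighted-average M per-dimension embeddings, then concatenate attributes.

    embeddings: list of M matrices, each n x d (one per graph dimension).
    weights:    list of M non-negative floats (hypothetical; not given in the source).
    attrs:      n x f matrix of original node attribute features.
    Returns an n x (d + f) matrix of "second attribute features".
    """
    total = sum(weights)
    n = len(attrs)
    d = len(embeddings[0][0])
    result = []
    for i in range(n):
        # Weighted average over the M dimension-specific embeddings of node i.
        avg = [
            sum(w * emb[i][k] for w, emb in zip(weights, embeddings)) / total
            for k in range(d)
        ]
        # Concatenate ("splice") with the node's original attributes.
        result.append(avg + list(attrs[i]))
    return result


if __name__ == "__main__":
    emb_dim1 = [[1.0, 0.0]]
    emb_dim2 = [[3.0, 2.0]]
    print(second_attributes([emb_dim1, emb_dim2], [1.0, 1.0], [[7.0]]))
```

With equal weights this reduces to a plain mean of the per-dimension embeddings; unequal weights would let some dimensions contribute more to the global structural feature.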
In one embodiment, as shown in FIG. 2, the flow of embedding multi-dimensional graph network nodes by random-walk network embedding includes:
step 201, for any given node v_i in the graph network of dimension r, collecting the context information of the node v_i through a random-walk sampling strategy to obtain a walk sequence;
step 202, dividing the walk sequence with a context window to obtain a training sample sequence related to the node information;
step 203, inputting the training sample sequence into a Skip-Gram model and optimizing the objective by stochastic gradient descent to obtain the low-dimensional embedding of the node v_i.
Specifically, for the graph network G_r of dimension r, the invention obtains a low-dimensional vector matrix of the n nodes based on a random-walk network embedding model.
For any given node v_i in the graph network of dimension r, the context information of node v_i (namely, its neighbor nodes) is collected by a random-walk sampling strategy to obtain a walk sequence; the walk sequence is divided with a window to obtain a training sample sequence related to the node information; the training samples are input into a Skip-Gram model, and the objective function L_loss is optimized by stochastic gradient descent to obtain the low-dimensional embedding of node v_i.
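Steps 201 and 202 can be sketched as follows. This is a minimal illustration that assumes uniform (unbiased) random walks over an adjacency-list graph; the source does not specify the sampling strategy, walk length, or window size, and the names `random_walk` and `skipgram_pairs` are hypothetical. Training the Skip-Gram model itself (step 203) is omitted.

```python
import random


def random_walk(adj, start, length, rng):
    """Sample one uniform random walk of at most `length` nodes from `start`.

    adj: dict mapping each node to a list of its neighbors in one graph dimension.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def skipgram_pairs(walk, window):
    """Slide a window over the walk and emit (center, context) training pairs."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    walk = random_walk(graph, start=0, length=6, rng=random.Random(42))
    print(walk)
    print(skipgram_pairs(walk, window=2))
```

The resulting pairs are what a Skip-Gram model would consume as training samples; running this per graph dimension yields one set of samples, and hence one embedding, for each dimension r.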
In one embodiment, as shown in FIG. 3, obtaining the correlation and importance of graph networks of different dimensions includes the following steps:
step 301, applying a regularized consistency constraint to the low-dimensional embedding matrices and measuring the degree of similarity between them, to obtain the similarity between the corresponding dimension graph networks in the original multi-dimensional graph network;
step 302, capturing the correlation among the graph networks of different dimensions by regularizing the weight parameters of the M graph neural networks;
step 303, based on the attention mechanism, adaptively computing the importance weights of the different dimensions during training, guided by the downstream node classification task.
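The role of the importance weights from step 303 can be illustrated with a small fusion sketch. The source states that the weights are learned adaptively during training; the example below does not learn anything and simply assumes that unnormalized per-node attention scores are already available. The softmax normalization, the names `softmax` and `fuse`, and the n x M score layout are assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fuse(embeddings, scores):
    """Attention-weighted fusion of M per-dimension embedding matrices.

    embeddings: list of M matrices, each n x d.
    scores:     n x M matrix of unnormalized attention scores (assumed given).
    Returns an n x d matrix of fused node vectors.
    """
    n = len(scores)
    d = len(embeddings[0][0])
    fused = []
    for i in range(n):
        alpha = softmax(scores[i])  # importance weight of each dimension for node i
        fused.append(
            [sum(a * emb[i][k] for a, emb in zip(alpha, embeddings)) for k in range(d)]
        )
    return fused


if __name__ == "__main__":
    z1 = [[0.0, 2.0]]
    z2 = [[4.0, 6.0]]
    print(fuse([z1, z2], [[0.0, 0.0]]))  # equal scores give the plain average
```

In a trained model the scores would come from a learned scoring function optimized together with the node classification loss, so that dimensions more useful for classification receive larger weights.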
Specifically, there is a certain correlation between the different dimensions of a multi-dimensional graph network, and graph networks of different dimensions typically share certain similarities or common features. This correlation information cannot be captured by the M low-dimensional embedding matrices obtained from the M graph convolutional network models alone.
The invention captures this correlation by adopting a regularized consistency constraint and by regularizing the weight parameters of the M graph neural networks. Concretely, two regularization terms are added to the optimization objective of the downstream task:
(1) A consistency regularization term L_CC on the M low-dimensional embedding matrices. L_CC measures the degree of similarity between two different embedding matrices, reflecting the similarity between the corresponding dimension graph networks in the original multi-dimensional graph network.
(2) A constraint on the weight parameters of the M graph convolutional networks.
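The source does not give the formula for L_CC. One common way to compare two embedding matrices, shown here purely as an assumed example, is to row-normalize each matrix, form the node-to-node similarity matrix, and penalize the squared Frobenius distance between the two similarity matrices. The function names below are hypothetical, and this is not claimed to be the patent's definition of L_CC.

```python
import math


def normalize_rows(z):
    """Scale every row of z to unit L2 norm (all-zero rows are left unchanged)."""
    out = []
    for row in z:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out


def similarity(z):
    """Node-to-node cosine similarity matrix of the embedding rows."""
    zn = normalize_rows(z)
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in zn] for ri in zn]


def consistency_loss(z1, z2):
    """Squared Frobenius distance between the two similarity matrices."""
    s1, s2 = similarity(z1), similarity(z2)
    return sum((a - b) ** 2 for r1, r2 in zip(s1, s2) for a, b in zip(r1, r2))


if __name__ == "__main__":
    z_a = [[1.0, 0.0], [0.0, 1.0]]
    z_b = [[1.0, 0.0], [1.0, 0.0]]
    print(consistency_loss(z_a, z_a), consistency_loss(z_a, z_b))
```

A term of this kind is zero when two dimensions induce the same node similarity structure and grows as they disagree, so adding it to the training objective pulls the per-dimension embeddings toward a shared structure. For M dimensions it could be summed over pairs of embedding matrices.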
In one embodiment, as shown in FIG. 4, classifying and visualizing the multi-dimensional network nodes includes the following steps:
step 401, projecting the comprehensive low-dimensional vectors into a two-dimensional layout space with a nonlinear dimension reduction technique to obtain the coordinate values of the nodes in the two-dimensional space;
step 402, based on the coordinate values, performing color mapping according to the label information of the nodes with a visualization technique, and displaying the classification result of the nodes in visual form.
Specifically, projecting the low-dimensional embedding matrix into the two-dimensional layout space based on a nonlinear dimension reduction algorithm comprises the following steps. Based on the low-dimensional embedding matrix, the similarity between nodes, i.e., the P distribution, is calculated. In the two-dimensional layout space, the layout proximity between nodes, i.e., the Q distribution, is calculated. The KL divergence between the P distribution and the Q distribution is calculated, and the objective function is optimized iteratively to reduce the difference between the two distributions, which yields the two-dimensional coordinate values of the n nodes in the two-dimensional layout space. Graph visualization mapping is then performed according to the two-dimensional coordinate values and the labels of the nodes.
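The procedure just described matches the general shape of t-SNE, and the sketch below follows the standard t-SNE definitions under that assumption. It uses Euclidean distance between embedding rows, a single fixed Gaussian bandwidth `sigma` instead of a per-node variance, and plain gradient descent; all of these, and every function name, are simplifications chosen for illustration rather than details taken from the source.

```python
import math


def sq_dists(x):
    """Pairwise squared Euclidean distances between the rows of x."""
    return [[sum((a - b) ** 2 for a, b in zip(ri, rj)) for rj in x] for ri in x]


def p_distribution(emb, sigma=1.0):
    """Symmetric node similarity P computed from the embedding matrix."""
    n = len(emb)
    d2 = sq_dists(emb)
    cond = []
    for i in range(n):
        w = [math.exp(-d2[i][j] / (2 * sigma ** 2)) if j != i else 0.0 for j in range(n)]
        s = sum(w)
        cond.append([v / s for v in w])  # cond[i][j] is p_{j|i}
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]


def q_distribution(y):
    """Layout proximity Q in 2-D, using a Student-t kernel with one degree of freedom."""
    n = len(y)
    d2 = sq_dists(y)
    w = [[1.0 / (1.0 + d2[i][j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    z = sum(map(sum, w))
    return [[v / z for v in row] for row in w]


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all ordered node pairs."""
    return sum(
        a * math.log((a + eps) / (b + eps))
        for rp, rq in zip(p, q)
        for a, b in zip(rp, rq)
        if a > 0
    )


def gradient_step(p, y, lr=0.1):
    """One plain gradient-descent update of the 2-D coordinates y."""
    n = len(y)
    q = q_distribution(y)
    d2 = sq_dists(y)
    new_y = []
    for i in range(n):
        gx = gy = 0.0
        for j in range(n):
            if i == j:
                continue
            c = 4.0 * (p[i][j] - q[i][j]) / (1.0 + d2[i][j])
            gx += c * (y[i][0] - y[j][0])
            gy += c * (y[i][1] - y[j][1])
        new_y.append([y[i][0] - lr * gx, y[i][1] - lr * gy])
    return new_y


if __name__ == "__main__":
    emb = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 5.0, 5.0], [5.1, 5.0, 5.0]]
    y = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
    p = p_distribution(emb)
    for _ in range(100):
        y = gradient_step(p, y)
    print(kl_divergence(p, q_distribution(y)), y)
```

The final coordinates in `y` would then be drawn as a scatter plot with one color per node label. Practical implementations usually choose a per-node bandwidth by perplexity search and add momentum and early exaggeration, none of which is shown here.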
The P distribution describing the similarity between nodes is calculated from the obtained low-dimensional embedding matrix:
Figure BDA0003048977530000101
Figure BDA0003048977530000102
Here, for nodes $v_i$ and $v_j$, the conditional probability $p_{j|i}$ represents the probability that node $v_i$ selects $v_j$ as its nearest neighbor; if $v_i$ and $v_j$ are highly similar, $p_{j|i}$ is large. $d_{ij}$ represents the distance between nodes $v_i$ and $v_j$. The original nonlinear dimension-reduction method uses the graph-theoretic shortest path as this distance, which has the disadvantage that it can describe only the structural similarity between nodes. The invention instead calculates the distance from the low-dimensional embeddings of $v_i$ and $v_j$. Because these embeddings have already captured the local structural features, attribute features, and remote node information (global structural features) of the nodes in the multi-dimensional graph network, as well as the correlation and importance differences between the different dimension graph networks, calculating $d_{ij}$ from the embeddings of $v_i$ and $v_j$ yields a similarity distribution that describes the features of the multi-dimensional graph network more accurately and comprehensively. $\delta_i$ is the variance of the Gaussian distribution centered on node $v_i$.
The proximity between nodes $v_i$ and $v_j$ in the two-dimensional layout space, i.e., the Q distribution, is measured based on the Student-t distribution:
Figure BDA0003048977530000111
where $\|y_i - y_j\|$ represents the distance between nodes $v_i$ and $v_j$ ($y_i$ and $y_j$ are the coordinate values of $v_i$ and $v_j$ in the two-dimensional layout space). Like the P distribution, the Q distribution indicates that similar nodes lie closer together in the two-dimensional layout space and dissimilar nodes lie relatively farther apart.
The KL divergence between the P and Q distributions is:
Figure BDA0003048977530000112
During model optimization, $C_{KL}$ is continuously reduced so that $q_{ij}$ reflects $p_{ij}$ as closely as possible; that is, the node coordinate positions in the two-dimensional layout space reflect the feature information of the original graph network as closely as possible.
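The equation figures referenced above are not reproduced in this text. As a reading aid only, the standard t-SNE forms that match the symbol definitions given here are shown below. This is an assumption based on the usual t-SNE formulation, not a transcription of the patent's figures. In particular, the Gaussian bandwidth is written as $2\delta_i^2$ following common practice (if $\delta_i$ denotes the variance itself, the denominator would be $2\delta_i$), and $n$ is the number of nodes.

```latex
% Conditional and symmetrized similarities in the embedding space (P distribution)
p_{j|i} = \frac{\exp\!\left(-d_{ij}^{2} / 2\delta_i^{2}\right)}
               {\sum_{k \neq i} \exp\!\left(-d_{ik}^{2} / 2\delta_i^{2}\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

% Student-t proximities in the two-dimensional layout space (Q distribution)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^{2}\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^{2}\right)^{-1}}

% Objective: KL divergence between P and Q
C_{KL} = \mathrm{KL}(P \,\|\, Q) = \sum_{i} \sum_{j \neq i} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```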
When the model stops iterating, the current $y_1, y_2, \ldots, y_n$ are the two-dimensional coordinate values of the n nodes. To show the classification result clearly, nodes with the same label are drawn in the same color and nodes with different labels are drawn in different colors; the resulting image is the visualization of the classified nodes of the multi-dimensional graph network.
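The color-mapping rule above (same label, same color; different labels, different colors) can be sketched as follows. The function names and the hex palette are hypothetical choices for illustration; the text does not prescribe a palette or a plotting library.

```python
def color_by_label(labels, palette=("#1f77b4", "#ff7f0e", "#2ca02c",
                                    "#d62728", "#9467bd", "#8c564b")):
    """Return one color per node so that equal labels share a color
    and distinct labels never do."""
    distinct = list(dict.fromkeys(labels))  # distinct labels, first-seen order
    if len(distinct) > len(palette):
        raise ValueError("palette has fewer colors than there are distinct labels")
    lookup = dict(zip(distinct, palette))
    return [lookup[label] for label in labels]


def colored_points(coords, labels):
    """Attach a label color to each 2-D coordinate: (x, y, color) triples."""
    return [(x, y, c) for (x, y), c in zip(coords, color_by_label(labels))]
```

The resulting `(x, y, color)` triples can be handed to any scatter-plot routine to draw the classified nodes.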
It should be understood that although the steps in the flowcharts above are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps may include a plurality of sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, a visualization system for multi-dimensional network node classification is provided, comprising:
a global structural feature module 501, configured to convert the nodes of every dimension graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
a specific dimension graph network module 502, configured to take a weighted average of the low-dimensional embeddings and splice the result with the original attribute features of the nodes of the multi-dimensional graph to obtain second attribute features;
a graph convolutional network module 503, configured to capture the structure and attribute information of each dimension graph network based on a graph convolutional network, and to embed each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
a correlation constraint module 504, configured to capture the correlations between the different dimension graph networks by regularization constraints, and to acquire the importance weights of the different dimension graph networks by an attention mechanism to obtain an attention matrix;
a classification display module 505, configured to fuse the low-dimensional embedding matrix with the attention matrix by weighting to obtain the comprehensive low-dimensional vectors of the n nodes, project the comprehensive low-dimensional vectors into a two-dimensional space, and display the classification results of the nodes.
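The weighted fusion performed by the classification display module can be sketched as below. This is a hedged illustration: how the attention scores are produced is not described in this passage, so they are taken as given inputs, and the softmax normalization, the function names, and the list-of-lists data layout are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_embeddings(embeddings, attention_scores):
    """Fuse M embedding matrices (each n x d) into one n x d matrix.

    attention_scores[i][m] is the raw importance of dimension m for node i
    (hypothetical input); each node's scores are normalized to sum to 1.
    """
    d = len(embeddings[0][0])
    fused = []
    for i, scores in enumerate(attention_scores):
        weights = softmax(scores)
        fused.append([sum(w * Z[i][k] for w, Z in zip(weights, embeddings))
                      for k in range(d)])
    return fused
```

For example, with two dimensions and equal scores for a node, the fused vector is the plain average of that node's two embeddings; a much larger score for one dimension makes the fused vector approach that dimension's embedding.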
In one embodiment, the global structural feature module 501 includes a random-walk network embedding unit configured for:
for any given node $v_i$ in the graph network of dimension r, collecting the context information of node $v_i$ through a random-walk sampling strategy to obtain a walk sequence;
dividing the walk sequence with a visible window to obtain a training sample sequence related to the node information;
inputting the training sample sequence into a Skip-Gram model and optimizing the objective by stochastic gradient descent to obtain the low-dimensional embedding of node $v_i$.
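The first two steps above (sampling a walk, then dividing it with a window) can be sketched as follows. This is a minimal sketch under stated assumptions: a uniform random walk over an adjacency-list dictionary is used, the function names are hypothetical, and the Skip-Gram training itself is not shown; the (center, context) pairs produced here are what such a model would consume.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`.
    `adj` maps each node to a list of its neighbors in one dimension graph."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def window_pairs(walk, window):
    """Slide a window over the walk and emit (center, context) training pairs."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs
```

With `window_pairs(["a", "b", "c"], 1)`, the middle node `"b"` is paired with both neighbors in the walk, while the end nodes are paired with one neighbor each.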
In one embodiment, the relevance constraint module 504 includes a constraint training unit to:
adopting regularized consistency constraint to the low-dimensional embedded matrixes, and measuring the similarity degree between the low-dimensional embedded matrixes to obtain the similarity between corresponding dimension graph networks in the original multi-dimension graph network;
capturing the correlation of different dimension graph networks by using the regularized weight parameters of M graph neural networks;
based on the attention mechanism, in the training process, the importance weights of different dimensions are adaptively calculated by taking downstream node classification tasks as guidance.
In one embodiment, the classification presentation module 505 comprises a projection mapping unit for:
projecting the comprehensive low-dimensional vector into a two-dimensional layout space by adopting a nonlinear dimension reduction technology to obtain coordinate values of nodes in the two-dimensional space;
based on the coordinate values, performing color mapping by using a visualization technology according to label information of the nodes, and displaying the classification result of the nodes in a visual form.
For specific limitations of the visualization system for multi-dimensional network node classification, reference may be made to the limitations of the visualization method for multi-dimensional network node classification above, which are not repeated here. The modules of the visualization system described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.
FIG. 6 shows an internal block diagram of a computer device in one embodiment. As shown in FIG. 6, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the visualization method for multi-dimensional network node classification. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the visualization method for multi-dimensional network node classification. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device may be a touch layer covering the display screen; a key, trackball, or touchpad arranged on the housing of the computer device; or an external keyboard, touchpad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 6 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
converting the nodes of every dimension graph in the multi-dimensional graph network into dense low-dimensional embeddings based on a random-walk network embedding technique;
taking a weighted average of the low-dimensional embeddings and splicing the result with the original attribute features of the nodes of the multi-dimensional graph to obtain second attribute features;
capturing the structure and attribute information of each dimension graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
capturing the correlation among different dimension graph networks by adopting regularization constraint, and acquiring importance weights of the different dimension graph networks by using an attention mechanism to obtain an attention matrix;
and weighting and fusing the low-dimensional embedded matrix and the attention matrix to obtain n comprehensive low-dimensional vectors of the nodes, projecting the comprehensive low-dimensional vectors in a two-dimensional space, and displaying the classification results of the nodes.
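The second step in the list above (weighted average, then splicing) can be sketched as below. The per-dimension weights, the function names, and the choice to place the original attributes before the averaged embedding are assumptions made for illustration; this passage does not specify them.

```python
def weighted_average(embeddings, weights):
    """Weighted average of M per-dimension embedding matrices (each n x d).
    `weights` holds one hypothetical non-negative weight per dimension."""
    total = float(sum(weights))
    n = len(embeddings[0])
    d = len(embeddings[0][0])
    return [[sum(w * E[i][k] for w, E in zip(weights, embeddings)) / total
             for k in range(d)]
            for i in range(n)]


def splice(attributes, averaged):
    """Concatenate each node's original attributes with its averaged embedding
    to form the second attribute features (ordering is an assumption)."""
    return [list(a) + list(e) for a, e in zip(attributes, averaged)]
```

Each node's second attribute feature vector then has length equal to the number of original attributes plus the embedding dimension d.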
In one embodiment, the processor when executing the computer program further performs the steps of:
for any given node $v_i$ in the graph network of dimension r, collecting the context information of node $v_i$ through a random-walk sampling strategy to obtain a walk sequence;
dividing the walk sequence with a visible window to obtain a training sample sequence related to the node information;
inputting the training sample sequence into a Skip-Gram model and optimizing the objective by stochastic gradient descent to obtain the low-dimensional embedding of node $v_i$.
In one embodiment, the processor when executing the computer program further performs the steps of:
adopting regularized consistency constraint to the low-dimensional embedded matrixes, and measuring the similarity degree between the low-dimensional embedded matrixes to obtain the similarity between corresponding dimension graph networks in the original multi-dimension graph network;
capturing the correlation of different dimension graph networks by using the regularized weight parameters of M graph neural networks;
based on the attention mechanism, in the training process, the importance weights of different dimensions are adaptively calculated by taking downstream node classification tasks as guidance.
In one embodiment, the processor when executing the computer program further performs the steps of:
projecting the comprehensive low-dimensional vector into a two-dimensional layout space by adopting a nonlinear dimension reduction technology to obtain coordinate values of nodes in the two-dimensional space;
based on the coordinate values, performing color mapping by using a visualization technology according to label information of the nodes, and displaying the classification result of the nodes in a visual form.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
converting all dimension graph nodes in the multidimensional graph network nodes into dense low-dimension embedding based on a random walk network embedding technology;
performing weighted average on the low-dimensional embedding, and splicing the low-dimensional embedding with the node original attribute characteristics of the multi-dimensional graph to obtain second attribute characteristics;
capturing the structure and attribute information of each dimension graph network based on a graph convolutional network, and embedding each node carrying the second attribute features into a low-dimensional space to obtain a low-dimensional embedding matrix;
capturing the correlation among different dimension graph networks by adopting regularization constraint, and acquiring importance weights of the different dimension graph networks by using an attention mechanism to obtain an attention matrix;
and weighting and fusing the low-dimensional embedded matrix and the attention matrix to obtain n comprehensive low-dimensional vectors of the nodes, projecting the comprehensive low-dimensional vectors in a two-dimensional space, and displaying the classification results of the nodes.
In one embodiment, the processor when executing the computer program further performs the steps of:
for any given node $v_i$ in the graph network of dimension r, collecting the context information of node $v_i$ through a random-walk sampling strategy to obtain a walk sequence;
dividing the walk sequence with a visible window to obtain a training sample sequence related to the node information;
inputting the training sample sequence into a Skip-Gram model and optimizing the objective by stochastic gradient descent to obtain the low-dimensional embedding of node $v_i$.
In one embodiment, the processor when executing the computer program further performs the steps of:
adopting regularized consistency constraint to the low-dimensional embedded matrixes, and measuring the similarity degree between the low-dimensional embedded matrixes to obtain the similarity between corresponding dimension graph networks in the original multi-dimension graph network;
capturing the correlation of different dimension graph networks by using the regularized weight parameters of M graph neural networks;
based on the attention mechanism, in the training process, the importance weights of different dimensions are adaptively calculated by taking downstream node classification tasks as guidance.
In one embodiment, the processor when executing the computer program further performs the steps of:
projecting the comprehensive low-dimensional vector into a two-dimensional layout space by adopting a nonlinear dimension reduction technology to obtain coordinate values of nodes in the two-dimensional space;
based on the coordinate values, performing color mapping by using a visualization technology according to label information of the nodes, and displaying the classification result of the nodes in a visual form.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in relative detail but are not thereby to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims (4)

1. The visualization method for the multi-dimensional network node classification is characterized in that the multi-dimensional network is a multi-dimensional social network, and the multi-dimensional network node is a user in the multi-dimensional social network; the visualization method for multi-dimensional network node classification comprises the following steps:
converting, based on a random-walk network embedding technique, all dimension graph nodes in the multi-dimensional graph network nodes into dense low-dimensional embeddings, comprising: acquiring the context information of any given node in the graph network of dimension r through a random-walk sampling strategy to obtain a walk sequence; dividing the walk sequence with a visible window to obtain a training sample sequence related to the node information; inputting the training sample sequence into a Skip-Gram model, and performing target optimization by a stochastic gradient descent method to obtain the low-dimensional embedding of the node;
performing weighted average on the low-dimensional embedding, and splicing the low-dimensional embedding with the node original attribute characteristics of the multi-dimensional graph to obtain second attribute characteristics;
capturing the structure and attribute information of each dimension graph network based on a graph convolutional network, and embedding each node containing the second attribute characteristics into a low-dimensional space to obtain a low-dimensional embedding matrix;
capturing the correlation among different dimension graph networks by adopting regularization constraint, and acquiring importance weights of the different dimension graph networks by using an attention mechanism to obtain an attention matrix, wherein the method comprises the following steps of: adopting regularized consistency constraint to the low-dimensional embedded matrixes, and measuring the similarity degree between the low-dimensional embedded matrixes to obtain the similarity between corresponding dimension graph networks in the original multi-dimension graph network; capturing the correlation of different dimension graph networks by using the regularized weight parameters of M graph neural networks; based on the attention mechanism, in the training process, the downstream node classification task is used as a guide, and the importance weights of different dimensions are calculated in an adaptive manner;
weighting and fusing the low-dimensional embedded matrix and the attention matrix to obtain n node comprehensive low-dimensional vectors; the comprehensive low-dimensional vector comprises five kinds of information including node attribute characteristics, local structure characteristics, global structure characteristics of the multi-dimensional graph network and correlation and importance difference among different dimensional graph networks;
projecting the comprehensive low-dimensional vector in a two-dimensional space, and displaying the classification result of the node, wherein the method comprises the following steps: projecting the comprehensive low-dimensional vector into a two-dimensional layout space by adopting a nonlinear dimension reduction technology to obtain coordinate values of nodes in the two-dimensional space; based on the coordinate values, performing color mapping by using a visualization technology according to label information of the nodes, and displaying the classification result of the nodes in a visual form.
2. The visualization system for multi-dimensional network node classification is characterized in that the multi-dimensional network is a multi-dimensional social network, and the multi-dimensional network node is a user in the multi-dimensional social network; the visualization system for multi-dimensional network node classification comprises:
the global structural feature module is used for converting all dimension graph nodes in the multi-dimensional graph network nodes into dense low-dimensional embeddings based on a random-walk network embedding technique; the global structural feature module comprises a random-walk network embedding unit, which is used for collecting the context information of any given node in the graph network of dimension r through a random-walk sampling strategy to obtain a walk sequence; dividing the walk sequence with a visible window to obtain a training sample sequence related to the node information; inputting the training sample sequence into a Skip-Gram model, and performing target optimization by a stochastic gradient descent method to obtain the low-dimensional embedding of the node;
the network module of the specific dimension graph is used for carrying out weighted average on the low-dimension embedding and splicing the node original attribute characteristics of the multi-dimension graph to obtain second attribute characteristics;
the graph convolutional network module is used for capturing the structure and attribute information of each dimension graph network based on a graph convolutional network, and embedding each node containing the second attribute characteristics into a low-dimensional space to obtain a low-dimensional embedding matrix;
the correlation constraint module is used for capturing the correlation between the different dimension graph networks by adopting regularization constraint, and acquiring importance weights of the different dimension graph networks by using an attention mechanism to obtain an attention matrix; the correlation constraint module comprises a constraint training unit, wherein the constraint training unit is used for adopting regularized consistency constraint on the low-dimensional embedding matrices, measuring the similarity degree between the low-dimensional embedding matrices and obtaining the similarity between corresponding dimension graph networks in the original multi-dimensional graph network; capturing the correlation of different dimension graph networks by using the regularized weight parameters of M graph neural networks; based on the attention mechanism, in the training process, the downstream node classification task is used as a guide, and the importance weights of different dimensions are calculated in an adaptive manner;
the classification display module is used for weighting and fusing the low-dimensional embedded matrix and the attention matrix to obtain n node comprehensive low-dimensional vectors, wherein the comprehensive low-dimensional vectors comprise five kinds of information including node attribute characteristics, local structure characteristics, global structure characteristics of a multi-dimensional graph network and correlation and importance differences among different dimensional graph networks; the method is also used for projecting the comprehensive low-dimensional vector in a two-dimensional space and displaying the classification result of the node; the classification display module comprises a projection mapping unit, wherein the projection mapping unit is used for projecting the comprehensive low-dimensional vector into a two-dimensional layout space by adopting a nonlinear dimension reduction technology to obtain coordinate values of nodes in the two-dimensional space; based on the coordinate values, performing color mapping by using a visualization technology according to label information of the nodes, and displaying the classification result of the nodes in a visual form.
3. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of claim 1 when executing the computer program.
4. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of claim 1.
CN202110482521.5A 2021-04-30 2021-04-30 Visualization method, system, equipment and storage medium for multidimensional network node classification Active CN113158391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110482521.5A CN113158391B (en) 2021-04-30 2021-04-30 Visualization method, system, equipment and storage medium for multidimensional network node classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110482521.5A CN113158391B (en) 2021-04-30 2021-04-30 Visualization method, system, equipment and storage medium for multidimensional network node classification

Publications (2)

Publication Number Publication Date
CN113158391A CN113158391A (en) 2021-07-23
CN113158391B true CN113158391B (en) 2023-05-30

Family

ID=76873218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110482521.5A Active CN113158391B (en) 2021-04-30 2021-04-30 Visualization method, system, equipment and storage medium for multidimensional network node classification

Country Status (1)

Country Link
CN (1) CN113158391B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705642B (en) * 2021-08-16 2023-10-24 中山大学 Attribute-based cultural relic hierarchical classification method, system and device
CN114090838B (en) * 2022-01-18 2022-06-14 杭州悦数科技有限公司 Method, system, electronic device and storage medium for visually displaying big data
CN116186547B (en) * 2023-04-27 2023-07-07 深圳市广汇源环境水务有限公司 Method for rapidly identifying abnormal data of environmental water affair monitoring and sampling

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753589A (en) * 2018-11-28 2019-05-14 中国科学院信息工程研究所 A kind of figure method for visualizing based on figure convolutional network
CN112231482A (en) * 2020-11-06 2021-01-15 中国人民解放军国防科技大学 Long and short text classification method based on scalable representation learning
CN112417633A (en) * 2020-12-01 2021-02-26 中国人民解放军国防科技大学 Large-scale network-oriented graph layout method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
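A survey of local binary pattern methods; Liu Li et al.; Journal of Image and Graphics; Vol. 19, No. 12; pp. 1696-1720 *
Network representation learning method fusing node structure and content; Zhang Hu et al.; Computer Science; Vol. 47, No. 12; pp. 119-124 *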

Also Published As

Publication number Publication date
CN113158391A (en) 2021-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant