WO2019001070A1 - Adjacency-matrix-based connection information regularization system, graph feature extraction system, graph classification system and method - Google Patents

Info

Publication number
WO2019001070A1
WO2019001070A1 (PCT/CN2018/082111)
Authority
WO
WIPO (PCT)
Prior art keywords
graph
adjacency matrix
matrix
classification
convolution
Application number
PCT/CN2018/082111
Other languages
English (en)
French (fr)
Inventor
罗智凌
尹建伟
吴朝晖
邓水光
李莹
吴健
Original Assignee
浙江大学 (Zhejiang University)
Application filed by 浙江大学 (Zhejiang University)
Publication of WO2019001070A1 publication Critical patent/WO2019001070A1/zh
Priority to US16/727,842 priority Critical patent/US11461581B2/en

Classifications

    • G: PHYSICS; G06: COMPUTING; CALCULATING OR COUNTING
    • G06N 3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06N 3/0418: Neural network architectures using chaos or fractal principles
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 7/01: Probabilistic graphical models, e.g. probabilistic networks
    • G06F 16/9024: Information retrieval; indexing and data structures; graphs; linked lists
    • G06F 16/906: Information retrieval; clustering; classification
    • G06F 17/10: Complex mathematical operations
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/2323: Non-hierarchical clustering techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G06F 18/24: Classification techniques
    • G06F 18/24143: Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G06F 18/2415: Classification based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus false rejection rate
    • G06F 18/29: Graphical models, e.g. Bayesian networks
    • G06V 10/426: Image or video feature extraction; graphical representations
    • G06V 10/454: Integrating filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/764: Recognition using classification, e.g. of video objects
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Recognition using neural networks
    • G06V 10/84: Recognition using probabilistic graphical models, e.g. Markov models or Bayesian networks
    • G06V 20/30: Scenes; scene-specific elements in albums, collections or shared content, e.g. social network photos or video

Definitions

  • the invention belongs to the field of artificial intelligence, and particularly relates to a connection information regularization system based on an adjacency matrix, a graph feature extraction system, a graph classification system and a method.
  • a graph in graph theory is composed of a number of given points and lines connecting pairs of those points. Such a graph is usually used to describe a certain relationship between things: a line connecting two points indicates that a relationship exists between the two things.
  • the graph G in graph theory is an ordered pair (V, E), where V, called the vertex set, is the set of all vertices in the graph, and E, called the edge set, is the set of all edges between vertices. Simply put, vertices represent things, and edges represent relationships between things.
  • a graph is a kind of non-grid data. The characteristic of such data is that in a specific scene the dimension is uncertain, high, and has no upper limit; here, dimension refers to the number of vertices of the graph.
  • a chemical structural formula can correspond to a graph, in which an atom is a vertex in a graph, and a chemical bond between atoms is an edge in a graph.
  • the dimension of a molecule is the number of atoms contained in the molecule. For example, if a molecule consists of 100 atoms, the dimension of the molecule is 100. In a collection of molecules, each molecule is made up of a variable number of atoms, so its dimensions are uncertain. In reality, complex structures such as proteins often consist of dozens or even hundreds of atoms, and their dimensions are as high as tens or even hundreds.
  • a social network may also correspond to a graph, in which a person is a vertex in the graph and a relationship between people is an edge in the graph. The dimension of a social network is even higher and more complex: a large social network can generally have thousands of vertices and tens of thousands of edges, with dimensions in the thousands. The dimensions of graphs in graph theory are thus very high and have no upper limit.
  • data such as pictures, texts, audio and video are all grid data.
  • the characteristics of this type of data are that the dimensions are low (not more than 3 dimensions) and the dimensions are determined.
  • the dimensions of an image are not affected by the number of images: the dimension of an image can be expressed as 2D or 3D, and with more images (for example, hundreds of images) the dimension remains constant, still 2D or 3D.
  • grid data and non-grid data are two completely different kinds of data.
  • non-grid data has higher and uncertain dimensions and a more complicated structure than grid data, and the classification methods and feature extraction methods for the two kinds of data are also completely different.
  • the graph classification problem treats a graph as a complex object, and constructs a deep learning model to learn the graph classification according to the common subgraph structure patterns hidden in the graph.
  • the MUTAG data set consists of a number of nitro compounds, where the class tag can indicate whether the compound has a mutagenic effect on the bacterium.
  • Another example is mapping an unseen compound to its level of activity on cancer cells.
  • the graph classification treats the graph as a complex object, and constructs a deep learning model based on the common subgraph structure patterns hidden in the graph to learn the classification of the graph.
  • a subgraph refers to a graph, in the sense of graph theory, formed by a subset of the vertices of a graph together with the edges connecting those vertices.
  • Complex object classification methods usually measure the similarity distance between two complex objects by designing a suitable similarity function, and then use some classification algorithms to classify complex objects.
  • the existing graph classification based on the graph similarity calculation model is roughly divided into two categories:
  • the subgraph size is commonly referred to as the window size (window-size).
  • the amount of calculation increases exponentially, and beyond a certain limit it exceeds what a computer can tolerate in running time and memory usage. The method is therefore limited by the window size (the selected subgraph size cannot exceed 10 vertices), and subgraph structures that support classification but cannot be captured by small windows (of no more than 10 vertices) are missed among the features of the graph, which in turn may result in a higher classification error rate.
  • the graph kernel and graph embedding are the two most representative methods in the graph classification method of the graph similarity calculation model.
  • Convolutional Neural Networks (CNNs) have achieved significant success in deep learning on grid data such as text, images, audio, video and streaming data, as well as in large-scale scene analysis. These data are grid data: they have fixed, low dimensions, and data in grid form have the properties of translation, scaling and rotation.
  • the graph, however, is non-grid data, and a convolutional neural network (CNN) cannot be applied to the graph directly, because the convolution and pooling operations in a CNN are defined only on regular grid data; there is no convolution operation defined on non-grid data.
  • the PSCN method first labels the vertices of the input graph (Graph Labeling) and sorts the vertices according to the labeling result, selecting the first w vertices as center vertices; for each of the selected w vertices, it proceeds breadth-first and selects the vertex's k adjacent vertices (selected according to the Graph Labeling), so that each vertex and its surrounding neighborhood of size k form a subgraph, and the w vertices yield w subgraphs; through the above steps, w vectors of dimension (k+1) are obtained, each vector corresponding to the information of the vertices in the subgraph centered on one center vertex, and w vectors of dimension (k+1)² are also obtained, each vector corresponding to the information of the edges of a subgraph.
  • the PSCN extracts subgraphs of a specified size (determined by the window size parameter k) centered on several (determined by parameter w) vertices as features, and then applies a standard 1-dimensional convolutional neural network.
  • the paper proposing the PSCN method reports better results than the Deep Graph Kernel on existing open datasets.
  • the selection of w center vertices will limit the number of subgraphs, so there is no guarantee that all subgraph structures can be extracted.
  • the PSCN method is still limited by the window size, with the neighborhood selection restricted to window sizes of less than 10.
  • PSCN cannot perform deep learning effectively with a small window size k when the input graph contains subgraph structures larger than the chosen window size.
  • the PSCN classification results are sensitive to the labeling process, which sorts the vertices in the neighborhood; a labeling method may therefore work on one dataset but fail on another.
  • the prior art methods have two main problems in classifying graphs: first, when the graph is analyzed as a whole object, it is impossible to select features that include both the explicit topology information and the deep implicit information by which the graph is represented; second, when subgraphs are used as features of the graph, the subgraph size is subject to the selection of the window size k, which makes it difficult to capture large complex subgraphs, so that the classification accuracy of the graph is not high.
  • the technical problem to be solved by the present invention is to provide a connection information regularization system and method based on an adjacency matrix in a computer environment, which can effectively concentrate the elements of the adjacency matrix corresponding to the edges of a graph into a diagonal region, so that traversing the diagonal region with a fixed-size window can capture all the subgraph structures of the corresponding size in the graph, reducing the time complexity; by then merging the information of these subgraph structures, large multi-vertex subgraph structure information can be captured, solving the technical problem that the prior art cannot solve.
  • the disadvantages of existing graph classification methods addressed by the present invention include the following: first, analyzing a graph as a whole object requires capturing not only shallow features from the explicit topology of the graph but also the deep features of the correlation structure hidden in the vertices and edges; otherwise, the accuracy of graph classification suffers.
  • the prior art methods find it difficult to represent a graph in a deterministic feature space; the feature space refers to the higher-dimensional space to which the original data are mapped by feature extraction, a feature being a higher-dimensional abstraction of the original data.
  • the present invention reduces the non-connection information elements in advance by concentrating the connection information elements of the adjacency matrix corresponding to the graph into a specific diagonal region of the adjacency matrix, further uses a filter matrix to extract the subgraph structures of the graph along the diagonal direction, and then uses a cascaded convolutional neural network to extract larger subgraph structures. On the one hand this greatly reduces the computational complexity and the amount of calculation, overcoming the computational complexity limitation and the window size limitation; on the other hand it captures large multi-vertex subgraph structures through small windows, as well as the deep features of the correlation structure implicit in vertices and edges, improving the accuracy and speed of graph classification.
  • a first object of the present invention is to provide a connection information regularization system in a computer environment, wherein the connection information regularization system reorders all vertices in a first adjacency matrix of the graph to obtain a second adjacency matrix, and the connection information elements in the second adjacency matrix are concentrated in a diagonal region of the second adjacency matrix having width n, where n is a positive integer, n ≥ 2 and n ≤ N, N being the number of rows (or columns) of the second adjacency matrix.
  • a second object of the present invention is to provide a graph feature extraction system based on an adjacency matrix in a computer environment, wherein the graph feature extraction system extracts features of the graph based on the adjacency matrix of the graph, the features directly corresponding to the subgraph structures that support classification; the features are presented in the form of at least one vector, each vector corresponding to the distribution of one mixed state in the graph.
  • the graph feature extraction system includes a feature generation module and a connection information regularization system in any of the forms described above; the connection information regularization system and the feature generation module cooperate as a whole, and on graph sets of different sizes and different structural complexity they can effectively extract the local patterns and connection features implicit in the specific diagonal region of matrix width n.
  • the connection information regularization system greatly reduces the computational complexity and the amount of computation required by the feature generation module, and overcomes the computational complexity limitation.
  • a third object of the present invention is to provide a graph classification system based on an adjacency matrix in a computer environment, the graph classification system comprising a category labeling module and a graph feature extraction system based on an adjacency matrix in a computer environment in any of the forms described above; the category labeling module performs category labeling on the graph based on the features generated by the graph feature extraction system, and outputs the category of the graph.
  • a fourth object of the present invention is to provide a method of regularizing connection information in a computer environment.
  • a fifth object of the present invention is to provide a graph feature extraction method based on an adjacency matrix in a computer environment.
  • a sixth object of the present invention is to provide a graph classification method based on an adjacency matrix in a computer environment.
  • a seventh object of the present invention is to provide three graph classification methods based on stacked CNN in a computer environment.
  • An eighth object of the present invention is to provide a graph classification system in which the vertices of the graph are arbitrary entities, and the edges of the graph are relationships between arbitrary entities.
  • a ninth object of the present invention is to provide a network structure type discriminating system, wherein the classification system implements network structure classification based on a graph classification system as described above; the vertices of the graph are nodes in the network, and the edges of the graph are the relationships between nodes in the network.
  • a tenth object of the present invention is to provide a compound classification system which realizes compound classification based on a graph classification system as described above; the vertices of the graph are atoms of a compound, and the edges of the graph are chemical bonds between atoms.
  • An eleventh object of the present invention is to provide a social network classification system, the classification system implementing social network classification based on a graph classification system as described above; the vertices of the graph are entities in the social network, and the edges of the graph are relationships between entities. The entities include but are not limited to persons, institutions, events and geographical locations in the social network, and the relationships include but are not limited to friend relationships, follow relationships, private messages, mentions and other associations; a mention refers to referring to a person using @.
  • a twelfth object of the present invention is to provide a computer system comprising any one or more of the connection information regularization system in any of the forms described above, the adjacency-matrix-based graph feature extraction system, the graph classification system, the network structure type discriminating system, the compound classification system and the social network classification system.
  • the present invention concentrates the connection information elements of the adjacency matrix into the diagonal region of the adjacency matrix, reducing the non-connection information elements and concentrating the region of connection information elements, and then extracts the subgraph structures of the graph along the diagonal direction, which greatly reduces the computational complexity of extracting the subgraph structures of the graph;
  • the present invention obtains the features of the graph by using a filter matrix to perform a filtering operation along the diagonal direction of the second adjacency matrix obtained by the connection information regularization system, and at the same time applies a stacked convolutional neural network to the obtained features, so that large multi-vertex subgraph structures are captured through smaller windows and the deep features of the topology are captured.
  • the present invention reduces the non-connection information elements in advance by concentrating the connection information elements of the adjacency matrix corresponding to the graph into a specific diagonal region of the adjacency matrix, further extracts the subgraph structures of the graph along the diagonal direction using the filter matrix, and then uses the cascaded convolutional neural network to extract larger subgraph structures, which greatly reduces the computational complexity and the amount of computation, overcomes the computational complexity limitation and the window size limitation, and captures large multi-vertex subgraph structures through smaller windows, as well as the deep features of the correlation structure implicit in vertices and edges, improving the accuracy and speed of graph classification.
  • the connection information regularization system, the feature generation module and the stacked CNN module in the graph classification system cooperate to capture small subgraph structures with a small window of size n, and then combine these small subgraph structures to obtain larger, deeper and more complex subgraph structures with more than n vertices; that is, a small window (of size n) is used to extract larger (more than n vertices), deeper and more complex features of the graph. This captures large multi-vertex subgraph structures through a small window, together with the deep features of the correlation structure implicit in vertices and edges, and improves the accuracy and speed of graph classification.
  • FIG. 1 is a schematic view of a diagonal region of width 3 in a 6×6 adjacency matrix;
  • FIG. 3 is a schematic diagram of converting a first adjacency matrix into a second adjacency matrix; the left graph is the first adjacency matrix and the right graph is the second adjacency matrix;
  • FIG. 4 is a flow chart of the greedy algorithm;
  • FIG. 5 is a flow chart of the branch and bound algorithm;
  • FIG. 6 is a data flow diagram of the stacked CNN module;
  • FIG. 7 is a data flow diagram of the stacked CNN module (including an independent pooling module and a convolution pooling module);
  • FIG. 8 is a data flow diagram of the stacked CNN module (including an independent pooling module and a plurality of convolution pooling modules);
  • FIG. 10 is a flow chart of the greedy algorithm;
  • FIG. 11 is a schematic diagram of an example of row and column exchange of an adjacency matrix;
  • FIG. 12 shows a first adjacency matrix and the second adjacency matrix obtained by reordering;
  • FIG. 13 shows a graph and the second adjacency matrix corresponding to the graph;
  • FIG. 15 is a schematic diagram of the filter matrix calculation of the feature generation module;
  • FIG. 16 is a schematic diagram of the zeroing operation on the adjacency matrix corresponding to the graph;
  • FIG. 17 is a schematic diagram of a graph classification system based on a stacked CNN;
  • FIG. 18 shows accuracy and time consumption results on MUTAG;
  • FIG. 19 shows accuracy and time consumption results on PTC;
  • FIG. 20 shows accuracy and time consumption results on PTC;
  • FIG. 21 shows accuracy and time consumption as a function of the dropout ratio;
  • FIG. 22 compares classification accuracy and time consumption on each data set with and without the connection information regularization system;
  • FIG. 23 is a convergence curve on MUTAG;
  • FIG. 24 is a convergence curve on PTC;
  • FIG. 25 is a convergence curve on PROTEINS;
  • FIG. 26 shows a filter matrix and its corresponding subgraph structures, where (a) is a positive subgraph structure, (b) is a negative subgraph structure and (c) is the filter matrix;
  • FIG. 27 shows the structures captured by each convolutional layer and their corresponding subgraph structures, where (a) is a 12-vertex graph, (b) is an extracted 4-vertex feature, (c) is an extracted 6-vertex feature, (d) is an extracted 8-vertex feature, (e) is an extracted 10-vertex feature and (f) is an extracted 12-vertex feature;
  • FIG. 29 is a schematic diagram of the subgraph structures captured by the feature generation module and the stacked CNN module;
  • FIG. 30 is a schematic diagram of the implementation flow of the graph classification system based on a stacked CNN.
  • The following describes the connection information regularization system in a computer environment provided by the present invention.
  • the connection information regularization system reorders all vertices in a first adjacency matrix of the graph to obtain a second adjacency matrix.
  • the connection information elements in the second adjacency matrix are concentratedly distributed in a diagonal region of the second adjacency matrix having width n, where n is a positive integer, n ≥ 2 and n ≤ N, N being the number of rows (or columns) of the second adjacency matrix.
  • a connection information element is the element of the adjacency matrix corresponding to an edge of the graph.
  • the connection information regularization system concentrates the connection information elements of the adjacency matrix corresponding to the graph into a specific diagonal region of width n in the second adjacency matrix (n is a positive integer, n ≥ 2 and n ≤ N, where N is the number of rows or columns of the second adjacency matrix, i.e. the number of vertices of the graph). After this processing, the extraction of the subgraph structures of size n in the graph can be completed by traversing the diagonal region with a matrix of size n×n (i.e., a window of size n), which greatly reduces the required computational complexity and amount of computation, and overcomes the computational complexity limitation.
  • the vector of the present invention refers to a quantity having a magnitude and a direction, expressed mathematically as a 1×m matrix, where m is a positive integer greater than one.
  • Features described herein all represent features of a graph.
  • the adjacency matrix according to the present invention refers to a matrix representing the adjacency relationship between the vertices of a graph; a basic property of the adjacency matrix is that switching two columns and the corresponding two rows of the adjacency matrix yields another adjacency matrix of the same graph.
  • the adjacency matrix of a graph G is a square matrix whose order equals the number of vertices of G, with the following properties:
  • the adjacency matrix of an undirected simple graph is symmetric and its main diagonal is all zero (only the undirected simple graph is discussed here); the secondary diagonal is not necessarily zero, and for a directed graph the matrix is not necessarily symmetric;
  • the main diagonal is the diagonal running from the upper left corner to the lower right corner of the matrix; the secondary diagonal is the diagonal running from the upper right corner to the lower left corner of the matrix;
  • in an undirected graph, the degree of any vertex v_i is the number of all non-zero elements in the i-th column (or the i-th row); vertex i refers to the vertex represented by the i-th column (or i-th row) of the matrix. In a directed graph, the out-degree of vertex i is the number of all non-zero elements in the i-th row, and the in-degree is the number of all non-zero elements in the i-th column; the degree of a vertex is the number of edges associated with the vertex, the out-degree is the number of edges pointing from the vertex to other vertices, and the in-degree is the number of edges pointing from other vertices to the vertex;
  • a connection information element is the element of the adjacency matrix corresponding to an edge of the graph; in an undirected graph, the element value in row i, column j represents whether a connection between vertex v_i and vertex v_j exists and whether it carries a connection weight; in a directed graph, the element value in row i, column j represents whether a connection from vertex v_i to vertex v_j exists and whether it carries a connection weight.
  • for vertices v_i and v_j in an undirected graph, if an edge exists between them, the element values in row i, column j and in row j, column i of the adjacency matrix are both 1; if no edge exists, both element values are 0; if the edge carries a weight w, both element values are w.
  • for vertices v_i and v_j in a directed graph, if there is an edge pointing from v_i to v_j, the element value in row i, column j of the adjacency matrix is 1; if there is no edge pointing from v_i to v_j, that element value is 0; and if the edge from v_i to v_j carries a weight w, the element value in row i, column j is w; here i and j are positive integers not exceeding the order of the matrix.
  • preferably, the value of a connection information element is 1 and the value of a non-connection information element is 0; more preferably, if the edges of the graph have weights, the value of a connection information element is the weight of the edge and the value of a non-connection information element is 0. A minimal sketch of these conventions follows.
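  • The following NumPy sketch makes the element-value conventions above concrete (the example graphs, edge lists and array names are illustrative only):

```python
import numpy as np

# Undirected, unweighted graph on 4 vertices with edges (v1,v2), (v2,v3), (v3,v4).
# Vertices are 0-indexed here; an edge (i, j) sets both A[i, j] and A[j, i] to 1.
edges = [(0, 1), (1, 2), (2, 3)]
A = np.zeros((4, 4), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1        # symmetric for an undirected graph

# Weighted undirected graph: the element value is the edge weight instead of 1.
weighted_edges = [(0, 1, 0.5), (1, 2, 2.0)]
W = np.zeros((4, 4))
for i, j, w in weighted_edges:
    W[i, j] = W[j, i] = w

# Directed graph: only the (i, j) element is set for an edge from v_i to v_j.
D = np.zeros((4, 4), dtype=int)
D[0, 1] = 1                      # edge v0 -> v1; D[1, 0] stays 0
```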
  • the first adjacency matrix of the present invention refers to the adjacency matrix obtained by initially converting the graph into an adjacency matrix, that is, the initial adjacency matrix before the corresponding rows and columns are exchanged; the second adjacency matrix refers to the adjacency matrix obtained from the first adjacency matrix by row-and-column exchanges that maximize the concentration of the matrix information, so that the connection information elements in the second adjacency matrix are concentrated in a diagonal region of width n, where n is a positive integer, n ≥ 2 and n ≤ N, N being the number of rows (or columns) of the second adjacency matrix.
  • a schematic diagram of converting the first adjacency matrix into the second adjacency matrix is shown in FIG. 3; the left graph is the first adjacency matrix and the right graph is the second adjacency matrix.
  • the diagonal region of the second adjacency matrix of width n consists of all elements covered when a scanning rectangle of size n×n is moved along the main diagonal of the second adjacency matrix; more preferably, the scanning process is as follows: first, the upper left corner of the scanning rectangle coincides with the upper left corner of the second adjacency matrix; then the scanning rectangle is moved one element grid to the right and one element grid down at a time, until the lower right corner of the scanning rectangle coincides with the lower right corner of the second adjacency matrix. Equivalently, the element in row i, column j belongs to the diagonal region exactly when |i − j| < n, as sketched below.
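  • A minimal sketch of the diagonal region under the scanning-rectangle definition above (the helper name band_mask is illustrative):

```python
import numpy as np

def band_mask(N: int, n: int) -> np.ndarray:
    """Boolean mask of the width-n diagonal region of an N x N matrix:
    the union of all n x n windows whose top-left corner slides from
    (0, 0) to (N - n, N - n) along the main diagonal."""
    mask = np.zeros((N, N), dtype=bool)
    for t in range(N - n + 1):
        mask[t:t + n, t:t + n] = True
    return mask

# For a 6 x 6 matrix and width 3 (as in FIG. 1), element (i, j) lies in the
# diagonal region exactly when |i - j| < 3.
m = band_mask(6, 3)
assert np.array_equal(m, np.abs(np.subtract.outer(np.arange(6), np.arange(6))) < 3)
```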
  • the connection information regularization system is configured to reorder all the vertices of the first adjacency matrix so that the degree of concentration of the connection information elements in the diagonal region of the resulting second adjacency matrix is highest; the degree of concentration of the connection information elements refers to the proportion of non-zero elements in the diagonal region;
  • the reordering method is an integer optimization algorithm, whose function is to concentrate the connection information elements of the matrix into the diagonal region with as high a degree of concentration as possible;
  • an integer optimization algorithm refers to an algorithm that makes the connection information elements of the matrix more concentrated by simultaneously exchanging the corresponding two rows and two columns of the matrix;
  • preferably, the method of reordering is a greedy algorithm, which includes the following steps: the first adjacency matrix of the input graph is taken as the adjacency matrix to be processed, and improving row-and-column exchanges are then applied greedily (see the sketch below);
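  • The full step list of the greedy algorithm is given in FIG. 4 and FIG. 10; the following NumPy sketch shows one plausible greedy procedure, under the assumption that the quantity being minimized is the Loss value LS(A, n) defined later in this section (the function names loss and greedy_reorder are illustrative, not the patent's):

```python
import numpy as np

def loss(A: np.ndarray, n: int) -> float:
    """Total connection information outside the width-n diagonal region
    (the LS(A, n) value discussed below); smaller means more concentrated."""
    i, j = np.indices(A.shape)
    return float(A[np.abs(i - j) >= n].sum())

def greedy_reorder(A: np.ndarray, n: int) -> np.ndarray:
    """Greedily swap vertex pairs (both rows and the corresponding columns)
    while some swap strictly reduces the loss; the result plays the role
    of the second adjacency matrix."""
    A = A.copy()
    N = A.shape[0]
    improved = True
    while improved:
        improved = False
        for u in range(N):
            for v in range(u + 1, N):
                B = A.copy()
                B[[u, v], :] = B[[v, u], :]   # swap rows u and v
                B[:, [u, v]] = B[:, [v, u]]   # swap the corresponding columns
                if loss(B, n) < loss(A, n):
                    A, improved = B, True
    return A
```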
  • preferably, the reordering method is a branch and bound algorithm, which likewise takes the first adjacency matrix of the input graph as the adjacency matrix to be processed (see FIG. 5);
  • the degree of concentration of the connection information elements in the diagonal region of the second adjacency matrix depends on the number of connection information elements and/or the number of non-connection information elements in the diagonal region.
  • the degree of concentration of the connection information elements in the diagonal region of the second adjacency matrix depends on the number of connection information elements outside the diagonal region and/or the number of non-connection information elements.
  • preferably, the degree of concentration can be measured using a Loss value, calculated as the total of the connection information elements lying outside the diagonal region: LS(A, n) = Σ_{i,j: |i−j| ≥ n} a_{i,j}, where LS(A, n) represents the Loss value, A represents the second adjacency matrix, n represents the width of the diagonal region in the second adjacency matrix, and a_{i,j} represents the element in row i, column j of the second adjacency matrix.
  • LS(A, n) represents the Loss value of the second adjacency matrix A when the filter matrix size is n×n; the smaller the Loss value, the higher the degree of concentration of the second adjacency matrix.
  • preferably, the degree of concentration can also be measured by the ZR value, calculated as ZR(A, n) = (TC(A, n) − T1(A, n)) / TC(A, n), where A represents the second adjacency matrix; C represents a matrix of the same size as A in which all elements are connection information elements; A_{i,j} represents the element in row i, column j of A, and C_{i,j} represents the element in row i, column j of C; TC(A, n) represents the total number of elements in the diagonal region of width n; T1(A, n) represents the number of connection information elements in the diagonal region of width n; and ZR(A, n) represents the ZR value, the proportion of non-connection information elements in the diagonal region of width n. A sketch of both measures follows.
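  • A minimal NumPy sketch of the two concentration measures as written above (the function names ls and zr are illustrative):

```python
import numpy as np

def ls(A: np.ndarray, n: int) -> float:
    """LS(A, n): total connection information outside the width-n diagonal region."""
    i, j = np.indices(A.shape)
    return float(A[np.abs(i - j) >= n].sum())

def zr(A: np.ndarray, n: int) -> float:
    """ZR(A, n): proportion of non-connection (zero) elements inside the
    width-n diagonal region. TC counts all elements of the region (the
    region applied to an all-ones matrix C); T1 counts the non-zero ones."""
    i, j = np.indices(A.shape)
    band = np.abs(i - j) < n
    tc = int(band.sum())                  # TC(A, n)
    t1 = int(np.count_nonzero(A[band]))   # T1(A, n)
    return (tc - t1) / tc

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(ls(A, 2), zr(A, 2))   # 0.0 and the zero-ratio inside the band (0.4)
```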
  • An embodiment implements a graph feature extraction system based on an adjacency matrix in a computer environment. The graph feature extraction system extracts features of a graph based on the adjacency matrix of the graph, the features directly corresponding to the subgraph structures that support classification; the features are presented in the form of at least one vector, each vector corresponding to the distribution of one mixed state in the graph.
  • the graph feature extraction system includes a feature generation module and a connection information regularization system in any of the forms described above; the connection information regularization system and the feature generation module cooperate as a whole, and on graph sets of different sizes and different structural complexity they can effectively extract the local patterns and connection features implicit in the specific diagonal region of matrix width n.
  • the connection information regularization system greatly reduces the computational complexity and the amount of computation required by the feature generation module, and overcomes the computational complexity limitation;
  • the diagram is a diagram in graph theory;
  • the feature generation module generates the features of the graph by using a filter matrix, the filter matrix being a square matrix; more preferably, the feature generation module uses at least one filter matrix to perform a filtering operation along the diagonal region of the second adjacency matrix to obtain at least one vector, the at least one vector corresponding to the features of the graph, the features directly corresponding to the subgraph structures that support classification, and each vector corresponding to the distribution of one mixed state in the graph.
  • the distribution refers to the possibility that the subgraph structures in the mixed state appear in the graph; preferably, each mixed state represents a linear weighting of the adjacency matrices corresponding to any plurality of subgraph structures; more preferably, the linear weighting means that the adjacency matrix of each subgraph is multiplied by its corresponding weight and the results are added element-wise, yielding a matrix of the same size as the adjacency matrices of the subgraphs; the weights corresponding to the adjacency matrices sum to 1; the calculation process is shown in FIG. 2.
  • the filtering operation takes the element-wise inner product of the filter matrix with the second adjacency matrix at the current window, sums it, and obtains a value through an activation function; the filter matrix moves along the diagonal direction of the second adjacency matrix to obtain a set of values, forming a vector that corresponds to the distribution of one subgraph structure in the graph (a sketch follows); more preferably, the activation function is a sigmoid function, a ReLU activation function or a pReLU function.
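  • A minimal sketch of the filtering operation: an n×n filter slides along the main diagonal of the second adjacency matrix, the element-wise products are summed at each position, and the activation function (a sigmoid here) is applied; initializing the filters from a Gaussian distribution follows the description below. The toy matrix and names are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def diagonal_filter(A2: np.ndarray, F: np.ndarray) -> np.ndarray:
    """Slide the n x n filter F along the main diagonal of the second
    adjacency matrix A2; at each of the N - n + 1 positions, sum the
    element-wise products and apply the activation function."""
    N, n = A2.shape[0], F.shape[0]
    vals = [np.sum(A2[t:t + n, t:t + n] * F) for t in range(N - n + 1)]
    return sigmoid(np.array(vals))        # one feature vector per filter

# Several filters, each initialized from a Gaussian distribution, yield
# several feature vectors (one distribution per mixed state).
rng = np.random.default_rng(0)
filters = [rng.normal(0.0, 0.1, size=(3, 3)) for _ in range(4)]
A2 = np.eye(8, k=1) + np.eye(8, k=-1)     # toy second adjacency matrix (a path)
features = [diagonal_filter(A2, F) for F in filters]
```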
  • the feature generation module performs the filtering operation by using different filter matrices
  • the initial values of each element in the filter matrix are respectively values of random variables taken from the Gaussian distribution.
  • the Gaussian distribution is a probability distribution
  • the Gaussian distribution is a distribution of continuous random variables with two parameters, μ and σ²; the first parameter μ is the mean of the random variable obeying the normal distribution, and the second parameter σ² is the variance of this random variable. When the value of a random variable is drawn from a Gaussian distribution, values closer to μ have greater probability, and values further from μ have smaller probability.
  • the elements in the filter matrix are real numbers greater than or equal to -1 and less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1.
  • the feature generation module participates in a machine learning process for adjusting values of elements of the filter matrix.
  • the machine learning process utilizes backpropagation: using the classification loss value, it calculates gradient values and adjusts the values of the elements of the filter matrix accordingly.
  • the loss value refers to the error between the output of the machine learning process and the output that should actually be obtained; the gradient can be regarded as the slope of a surface along a given direction, and the gradient of a scalar field is a vector field: the gradient at a point of the scalar field points in the direction in which the scalar field grows fastest, and its magnitude is the largest rate of change in that direction.
  • the machine learning process consists of a forward propagation process and a back propagation process.
  • in the forward propagation process, the input information enters through the input layer, is processed layer by layer through the hidden layers, and is transmitted to the output layer. If the desired output value is not obtained at the output layer, the sum of squares of the error between the output and the expected output is taken as the objective function and back propagation is performed: the partial derivative of the objective function with respect to each neuron weight is computed layer by layer, forming the gradient of the objective function with respect to the weight vector, which serves as the basis for modifying the weights; the machine learning is carried out in this weight modification process.
  • the machine learning process ends when the error converges to the desired value or reaches the maximum number of learnings.
  • the initial values of the elements in the filter matrix are values of random variables taken from the Gaussian distribution, which are then updated by backpropagation during machine learning and are optimized at the end of the machine learning process.
  • the hidden layer refers to layers other than the input layer and the output layer, and the hidden layer does not directly receive signals from the outside, nor directly sends signals to the outside world.
  • preferably, the size of the filter matrix is n×n, that is, the size of the filter matrix equals the width of the diagonal region in the second adjacency matrix; after the connection information elements of the first adjacency matrix have been concentrated into the diagonal region by the connection information regularization system, diagonal convolution with the filter matrix can extract the distribution of as many subgraph structures of size n in the graph as possible under the premise of O(n) time complexity.
  • An embodiment implements a graph classification system based on an adjacency matrix in a computer environment. The graph classification system includes a category labeling module and a graph feature extraction system based on an adjacency matrix in a computer environment in any of the forms described above; the category labeling module performs category labeling on the graph based on the features generated by the graph feature extraction system, and outputs the category of the graph; the graph is a graph in graph theory;
  • the category labeling module calculates the possibility that the graph belongs to each category label, labels the graph with the most likely classification label as its category, and completes the classification of the graph; preferably, the category labeling module uses a classification algorithm to calculate the probability that the graph belongs to each category label; more preferably, the classification algorithm is any one or more selected from kNN and linear classification algorithms.
  • the kNN algorithm means that if the majority of the k nearest samples of a sample in the feature space belong to a certain category, the sample also belongs to that category and has the characteristics of the samples of that category; in making the classification decision, the method determines the category of the sample to be classified based only on the categories of the nearest one or several samples (a sketch follows).
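  • A minimal sketch of such a kNN decision over graph feature vectors (Euclidean distance, k = 3 and the names are illustrative):

```python
import numpy as np
from collections import Counter

def knn_label(x: np.ndarray, X: np.ndarray, y: list, k: int = 3):
    """Label the sample x by majority vote among its k nearest neighbours
    in feature space (Euclidean distance), as described above."""
    d = np.linalg.norm(X - x, axis=1)      # distance to every training sample
    nearest = np.argsort(d)[:k]            # indices of the k nearest samples
    return Counter(y[i] for i in nearest).most_common(1)[0][0]
```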
  • the linear classification algorithm refers to classifying data with a straight line (or plane, or hyperplane) according to the distribution in space of the data as determined by their labels.
  • the tag refers to an identifier that describes the category.
  • preferably, the graph classification system further includes a cascaded CNN module, which performs processing based on the features generated by the graph feature extraction system and fuses the subgraph structures supporting classification that correspond to the features, generating features comprising larger subgraph structures of the graph, where a larger subgraph structure refers to a subgraph structure having more than n vertices;
  • the stacked CNN module includes a convolution submodule and a pooled submodule;
  • the convolution submodule performs a convolution operation based on the features generated by the graph feature extraction system using at least one convolution layer, and fuses the subgraph structures supporting classification that correspond to the features to obtain at least one vector as the convolution result.
  • the input of the first convolution layer is the features generated by the graph feature extraction system in any of the forms described above; if there are multiple convolution layers, the input of each subsequent convolution layer is the output of the previous convolution layer.
  • the output of each convolution layer is at least one vector; each convolution layer performs the convolution operation using at least one filter matrix, and the convolution result of the last convolution layer is output to the pooling submodule;
  • the convolution operation refers to calculating a vector or matrix by translating a filter matrix over the input according to a certain rule, multiplying element-wise, and summing the obtained values.
  • the filter matrix is a square matrix; the number of rows of the filter matrix in each of the convolution layers is the same as the number of vectors input to the convolution layer; preferably, the elements in the filter matrix are greater than or equal to -1, a real number less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1;
  • the pooling submodule is configured to perform a pooling operation on the result obtained by the convolution submodule and obtain at least one vector as the pooling result, which is output to the category labeling module to classify the graph and output its category; the pooling result includes the features of larger subgraph structures in the graph, where a larger subgraph structure refers to a subgraph structure with more than n vertices; preferably, the pooling operation is selected from the maximum pooling operation and the average pooling operation.
  • the maximum pooling operation refers to taking a maximum value of feature points in the neighborhood; the average pooling operation refers to averaging the values of the feature points in the neighborhood.
  • the pooling operation performs mathematical operations on each convolution result on the basis of the convolution operation, thereby reducing the dimension of the convolution result.
  • the mathematical operations include, but are not limited to, averaging and taking a maximum.
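  • A minimal sketch of the two pooling operations reducing a convolution-result vector (window size 2 and the names are illustrative):

```python
import numpy as np

def max_pool(v: np.ndarray, size: int) -> np.ndarray:
    """Maximum pooling: the maximum of the feature points in each window."""
    trimmed = v[: len(v) // size * size].reshape(-1, size)
    return trimmed.max(axis=1)

def avg_pool(v: np.ndarray, size: int) -> np.ndarray:
    """Average pooling: the mean of the feature points in each window."""
    trimmed = v[: len(v) // size * size].reshape(-1, size)
    return trimmed.mean(axis=1)

v = np.array([0.1, 0.9, 0.4, 0.2, 0.7, 0.3])
assert np.allclose(max_pool(v, 2), [0.9, 0.4, 0.7])
assert np.allclose(avg_pool(v, 2), [0.5, 0.3, 0.5])
```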
  • the data flow diagram of the stacked CNN module is shown in FIG. 6.
  • the stacked CNN module fuses the subgraph structures supporting classification that correspond to the features of the graph, and through a series of convolution layers extracts, from the features obtained from the feature generation module, larger, deeper and more complex features corresponding to the larger, deeper and more complex subgraph structures of the graph.
  • the connection information regularization system, the feature generation module and the cascaded CNN module cooperate to capture small subgraph structures (with n vertices) using a small window of size n, while combining these small subgraph structures to obtain larger subgraph structures with more than n vertices.
  • preferably, the graph classification system further includes an independent pooling module and a convolution pooling module. The independent pooling module is configured to perform a pooling operation on the features generated by the graph feature extraction system and obtain at least one vector that is output to the category labeling module as the first pooling result; the convolution pooling module convolves and pools the features generated by the graph feature extraction system in any of the forms described above; the category labeling module labels the graph according to the first pooling result and the second pooling result, and outputs the category of the graph;
  • the larger subgraph structure refers to the subgraph structure with more vertices than n;
  • the convolution pooling module comprises a convolution submodule and a pooling submodule. The convolution submodule convolves the input using at least one filter matrix, fuses the subgraph structures supporting classification that correspond to the features, and passes at least one vector as the convolution result to the pooling submodule; the pooling submodule performs a pooling operation on the convolution result and obtains at least one vector as the second pooling result, which includes the features of larger subgraph structures in the graph and is output to the category labeling module;
  • the filter matrix is a square matrix; the number of rows of the filter matrix in each of the convolution layers is the same as the number of vectors input to the convolution layer; preferably, the elements in the filter matrix are greater than or equal to -1, a real number less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1; preferably, the pooling operation is selected from a maximum pooling operation, an average pool Operation.
  • the data flow diagram of the stacked CNN module including the independent pooling module and the convolution pooling module is shown in FIG. 7.
  • preferably, the graph classification system further includes an independent pooling module and a plurality of convolution pooling modules;
  • the independent pooling module is configured to perform a pooling operation on the features generated by the graph feature extraction system and output the obtained vectors as the first pooling result to the category labeling module. Each convolution pooling module performs a convolution operation and a pooling operation on its input: the convolution operation fuses the subgraph structures supporting classification that correspond to the features to obtain at least one vector as the convolution result, and a pooling operation is then performed on the convolution result to obtain at least one vector as the pooling result, which includes the features of larger subgraph structures; the convolution result of each convolution pooling module is output to the next convolution pooling module, and the pooling result of each convolution pooling module is output to the category labeling module;
  • the category labeling module performs category labeling on the graph according to the first pooling result and the pooling results of all convolution pooling modules, and outputs the category of the graph;
  • the input of the first convolution pooling module is the features generated by a graph feature extraction system in any of the forms described above, and the input of each other convolution pooling module is the convolution result of the previous convolution pooling module;
  • the convolution pooling module comprises a convolution submodule and a pooling submodule. The convolution submodule convolves the input using at least one filter matrix, fuses the subgraph structures supporting classification that correspond to the features, obtains at least one vector as the convolution result, and outputs the convolution result to the next convolution pooling module; the pooling submodule pools the convolution result output by the convolution submodule and outputs at least one vector as the pooling result to the category labeling module, the pooling result including the features of larger subgraph structures in the graph; preferably, the numbers of convolution submodules and pooling submodules may be the same or different; preferably, the number of convolution submodules and of pooling submodules is one or more;
  • the filter matrix is a square matrix; the number of rows of the filter matrix in each of the convolution layers is the same as the number of vectors input to that convolution layer; preferably, the elements in the filter matrix are real numbers greater than or equal to -1 and less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1;
  • the number of the convolution pooling modules is less than or equal to 10; more preferably, the number of convolution pooling modules in the graph classification system is less than or equal to five; still more preferably, the number of convolution pooling modules in the graph classification system is less than or equal to three;
  • the pooling operation is selected from a maximum pooling operation and an average pooling operation.
  • the data flow diagram of the stacked CNN module including the independent pooling module and the plurality of convolution pooling modules is shown in FIG. 8.
  • the element values of the vector corresponding to the convolution result represent the possibility that the sub-graph structure appears at each position in the graph, and the element values of the vectors corresponding to the pooling result, the first pooling result and the second pooling result represent the maximum likelihood or average likelihood that the sub-graph structure appears in the graph.
  • the category labeling module includes an implicit layer unit, an activation unit, and an annotation unit;
  • the hidden layer unit processes the received vectors to obtain at least one mixed vector that is transmitted to the activation unit, where the mixed vector contains the information of all vectors received by the hidden layer unit;
  • the processing of the received vectors by the hidden layer unit means combining and splicing the input vectors into one combined vector, and performing a linear weighting operation on the combined vector using at least one weight vector to obtain at least one mixed vector;
  • the hidden layer refers to layers other than the input layer and the output layer, and the hidden layer does not directly receive signals from the outside, nor directly sends signals to the outside world.
  • the activation unit calculates a value for each mixed vector output by the hidden layer unit using an activation function, and outputs all the obtained values as one vector to the labeling unit; preferably, the activation
  • function is a sigmoid function, a ReLU activation function, or a pReLU function;
  • the labeling unit is configured to calculate, according to the result of the activation unit, the possibility that the graph belongs to each classification label, and to label the most probable classification label as the category of the graph, completing the classification of the graph; preferably, the labeling unit
  • calculates the possibility that the graph belongs to each classification label based on a classification algorithm, labels the most probable classification label as the category of the graph, and completes the classification of the graph; more preferably, the classification algorithm is selected from any one or more of kNN and linear classification algorithms.
  • a fourth object of the present invention is to provide a method for regularizing connection information in a computer environment, the method comprising the steps of:
  • connection information regularization: reordering all the vertices in the first adjacency matrix to obtain a second adjacency matrix, wherein the connection information elements in the second adjacency matrix are concentrated in a diagonal region of width n of the second adjacency matrix, where n is a positive integer, n ≥ 2 and n < |V|, and |V| is the number of rows or columns of the second adjacency matrix;
  • the diagonal region of the second adjacency matrix is composed of the following elements: a positive integer i is traversed from 1 to |V|; when i > max(n, |V|−n), the elements from column (i−n+1) to column |V| of row i are selected; when i ≤ n, the elements from the first column to column (i+n−1) of row i are selected; otherwise, the elements from column (i−n+1) to column (i+n−1) of row i are selected;
  • a connection information element is the element of the adjacency matrix corresponding to an edge of the graph;
  • the graph is a graph in the sense of graph theory;
  • preferably, if the edges of the graph have no weights, the value of a connection information element is 1 and the value of a non-connection information element is 0; more preferably, if the edges of the graph have weights, the value of a connection information element is the weight of the edge and the value of a non-connection information element is 0;
  • the diagonal area refers to a diagonal area in the matrix from the upper left corner to the lower right corner;
  • the diagonal region of the second adjacency matrix refers to the area swept by scanning a rectangle of size n×n once along the diagonal of the second adjacency matrix;
  • the scanning process, sketched in code below, is as follows: first, the upper left corner of the scanning rectangle coincides with the upper left corner of the second adjacency matrix; then the scanning rectangle is moved one element cell to the right and one cell down at a time, until the lower right corner of the scanning rectangle coincides with the lower right corner of the second adjacency matrix.
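For illustration, the swept region can be generated programmatically. The following is a minimal Python sketch; the function name diagonal_region_mask is an illustrative choice, not part of the invention:

```python
import numpy as np

def diagonal_region_mask(V: int, n: int) -> np.ndarray:
    """Boolean mask of the width-n diagonal region of a V x V matrix:
    the cells swept by an n x n rectangle that starts at the upper left
    corner and moves one cell right and one cell down at a time until
    it reaches the lower right corner."""
    mask = np.zeros((V, V), dtype=bool)
    for t in range(V - n + 1):       # each position of the scanning rectangle
        mask[t:t + n, t:t + n] = True
    return mask                      # equivalently: cells with |i - j| <= n - 1
```

For V = 6 and n = 3 this marks exactly the band of FIG. 1, i.e. all cells whose row and column indices differ by at most n − 1.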
  • the method of reordering is an integer optimization algorithm.
  • the method of reordering is a greedy algorithm, including the following steps:
  • the first adjacency matrix of the input graph is taken as the adjacency matrix to be processed
  • the reordering method is a branch and bound algorithm, and includes the following steps:
  • the first adjacency matrix of the input graph is taken as the adjacency matrix to be processed
  • the degree of concentration of the connection information elements in the diagonal region of the second adjacency matrix depends on the number of connection information elements and/or the number of non-connection information elements in the diagonal region.
  • the degree of concentration of the connection information elements in the diagonal region of the second adjacency matrix depends on the number of connection information elements outside the diagonal region and/or the number of non-connection information elements.
  • the degree of concentration can be measured by using a Loss value.
  • the calculation method of the Loss value is as follows: LS(A, n) = Σ_{(i,j) ∉ D(n)} A_{i,j}, where D(n) denotes the diagonal region of width n;
  • LS(A, n) represents the Loss value,
  • A represents the second adjacency matrix,
  • n represents the width of the diagonal region in the second adjacency matrix,
  • A_{i,j} represents the element in row i, column j of the second adjacency matrix; the smaller the Loss value, the higher the degree of concentration.
  • the degree of concentration can also be measured by the ZR value; the smaller the ZR value, the higher the degree of concentration.
  • the ZR value is calculated as follows (a code sketch of both measures follows this item):
  • TC(A, n) = Σ_{(i,j) ∈ D(n)} C_{i,j},
  • T1(A, n) = |{(i,j) ∈ D(n) : A_{i,j} ≠ 0}|,
  • ZR(A, n) = (TC(A, n) − T1(A, n)) / TC(A, n),
  • where A represents the second adjacency matrix, C denotes a matrix of the same size as A in which every element is a connection information element, A_{i,j} and C_{i,j} denote the elements in row i, column j of A and C respectively,
  • TC(A, n) (abbreviated TC) represents the total number of elements in the diagonal region of width n,
  • T1(A, n) (abbreviated T1) represents the number of connection information elements in the diagonal region of width n,
  • and ZR(A, n) represents the ZR value, i.e., the proportion of non-connection information elements in the diagonal region of width n.
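A minimal sketch of the two concentration measures, under the assumption, consistent with the definitions above, that LS sums the connection weight left outside the width-n diagonal region and ZR is the share of non-connection cells inside it (the function names are illustrative):

```python
import numpy as np

def band_mask(V: int, n: int) -> np.ndarray:
    """Cells of the width-n diagonal region (|i - j| <= n - 1)."""
    idx = np.arange(V)
    return np.abs(idx[:, None] - idx[None, :]) <= n - 1

def loss_value(A: np.ndarray, n: int) -> float:
    """LS(A, n): connection information left outside the diagonal region;
    the smaller the value, the more concentrated the matrix."""
    return float(A[~band_mask(A.shape[0], n)].sum())

def zr_value(A: np.ndarray, n: int) -> float:
    """ZR(A, n) = (TC - T1) / TC: proportion of non-connection elements
    in the diagonal region, with TC the number of cells of the region
    and T1 the number of connection information elements inside it."""
    inside = band_mask(A.shape[0], n)
    tc = int(inside.sum())
    t1 = int(np.count_nonzero(A[inside]))
    return (tc - t1) / tc
```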
  • An embodiment specifically implements a graph feature extraction method based on an adjacency matrix in a computer environment, wherein the method extracts the features of a graph based on the adjacency matrix of the graph; the features directly correspond to the sub-graph structures supporting classification
  • and are presented in the form of at least one vector, each vector corresponding to the distribution of one mixed state in the graph; the method comprises the following steps:
  • step (1) connection information regularization: based on the first adjacency matrix of the graph, the second adjacency matrix is obtained using any of the connection information regularization methods described above;
  • step (2) diagonal filtering: based on the second adjacency matrix obtained in step (1), the features of the graph are generated; the features directly correspond to the sub-graph structures supporting classification, and each vector corresponds to the distribution of one mixed state in the graph;
  • preferably, step (2) generates the features of the graph using filter matrices, where a filter matrix is a square matrix; more preferably, step (2) uses at least one filter matrix to perform a filtering operation along the diagonal region of the second adjacency matrix to obtain at least one vector, the at least one vector corresponding to the features of the graph, the features directly corresponding to the sub-graph structures supporting classification, and each vector corresponding to the distribution of one mixed state in the graph;
  • preferably, step (2) performs the filtering operation using different filter matrices;
  • the distribution refers to the possibility that the sub-graph structures of the mixed state appear in the graph; preferably, each mixed state represents a linear weighting of the adjacency matrices corresponding to an arbitrary number of sub-graph structures; more preferably, the linear weighting means that the adjacency matrix of each subgraph is multiplied by its corresponding weight and the results are added element-wise, yielding a matrix of the same size as the subgraph adjacency matrices;
  • the filtering operation computes the sum of the element-wise products of the filter matrix with the aligned block of the second adjacency matrix and obtains one value through an activation function; the filter matrix is moved along the diagonal of the second adjacency matrix to obtain a set of values forming a vector, which corresponds to the distribution of one sub-graph structure in the graph (a sketch of this operation follows this list); more preferably, the activation function is a sigmoid function, a ReLU activation function, or a pReLU function;
  • the initial value of each element in the filter matrix is the value of a random variable drawn from a Gaussian distribution;
  • the elements in the filter matrix are real numbers greater than or equal to -1 and less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1;
  • step (2) involves a machine learning process, which is used to adjust the values of the elements of the filter matrix;
  • the machine learning process uses backpropagation: with the classification loss value, gradient values are calculated to further adjust the value of each element in the filter matrix; more preferably, the feature generation module can perform the above filtering operation using different filter matrices;
  • the value of a connection information element is 1 and the value of a non-connection information element is 0; more preferably, if the edges of the graph have weights, the value of a connection information element is the weight of the edge and the value of a non-connection information element is 0.
  • the diagonal region of the second adjacency matrix refers to the area swept by scanning a rectangle of size n×n once along the diagonal of the second adjacency matrix;
  • the size of the filter matrix is n×n.
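As a concrete illustration of step (2), the following Python sketch slides one n×n filter matrix along the diagonal of the second adjacency matrix; diagonal_filter is an illustrative name, and sigmoid is one of the activation functions named above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def diagonal_filter(A2: np.ndarray, F: np.ndarray) -> np.ndarray:
    """Filtering operation of step (2): at step j, sum the element-wise
    products of the n x n filter matrix F with the n x n block of the
    second adjacency matrix A2 on the diagonal, apply the activation,
    and collect the values into one feature vector of length |V|-n+1."""
    n, V = F.shape[0], A2.shape[0]
    return np.array([sigmoid(np.sum(F * A2[j:j + n, j:j + n]))
                     for j in range(V - n + 1)])
```

Applying n_0 such filters stacks n_0 of these vectors, matching the feature size n_0 × (|V| − n + 1) used later in the embodiments.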
  • An embodiment specifically implements a graph classification method based on an adjacency matrix in a computer environment, and the graph classification method includes the following steps:
  • step (1) feature extraction: the features of the graph are extracted using the adjacency-matrix-based graph feature extraction method in any of the forms described above;
  • step (2) category labeling: the graph is classified based on the features extracted in step (1), and the category of the graph is output; the graph is a graph in the sense of graph theory; preferably, step (2) calculates the possibility that the graph belongs to each classification label, labels the most probable classification label as the category of the graph, and completes the classification of the graph; preferably, step (2) uses a classification algorithm to calculate the possibility that the graph belongs to each classification label, labels the most probable classification label as the category of the graph, and completes the classification of the graph; more preferably, the classification algorithm is selected from any one or more of kNN and linear classification algorithms.
  • An embodiment specifically implements a graph classification method based on a stacked CNN in a computer environment, and the graph classification method includes the following steps:
  • step (2) convolution operation: a convolution operation is performed on the features of the graph extracted in step (1) using at least one convolution layer, fusing the sub-graph structures supporting classification that correspond to the features to obtain at least one vector as a convolution result.
  • the input of the first convolutional layer is the feature of the graph extracted in step (1).
  • each convolutional layer uses at least one filter matrix for convolution operation, and the convolution result of the last convolution layer is output to step (3);
  • the filter matrix is a square matrix;
  • the number of rows of the filter matrix in each convolution layer is the same as the number of vectors in the convolution result input to that convolution layer; preferably, the elements in the filter matrix are real numbers greater than or equal to -1 and less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1;
  • step (3) pooling operation: a pooling operation is performed on the result of the convolution operation of step (2), and at least one vector obtained as the pooling result is delivered to step (4); the pooling result contains features of larger sub-graph structures in the graph,
  • where a larger sub-graph structure refers to a sub-graph structure with more than n vertices; preferably, the pooling operation is selected from a maximum pooling operation and an average pooling operation;
  • step (4) category labeling: based on the pooling result obtained in step (3), the graph is classified and the category of the graph is output.
  • An embodiment specifically implements a graph classification method based on a stacked CNN in a computer environment provided by the present invention, and the graph classification method includes the following steps:
  • step (1) graph feature extraction: the features of the graph are extracted using the adjacency-matrix-based graph feature extraction method in any of the forms described above and passed to step (2) and step (3);
  • step (3) convolution pooling operation: the features of the graph extracted in step (1) are convolved using at least one filter matrix, fusing the sub-graph structures supporting classification that correspond to the features, to obtain at least one vector as a convolution result;
  • the convolution result is then subjected to a pooling operation to obtain at least one vector as a second pooling result, which is passed to step (4); the second pooling result contains features of larger sub-graph structures in the graph,
  • where a larger sub-graph structure refers to a sub-graph structure with more than n vertices;
  • the filter matrix is a square matrix; the number of rows of the filter matrix in each convolution layer is the same as
  • the number of vectors input to that convolution layer; preferably, the elements in the filter matrix are real numbers greater than or equal to -1 and less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1; preferably, the pooling operation is selected from a maximum pooling operation and an average pooling operation;
  • An embodiment specifically implements a graph classification method based on a stacked CNN in a computer environment provided by the present invention, and the graph classification method includes the following steps:
  • step (1) graph feature extraction: the features of the graph are extracted using the adjacency-matrix-based graph feature extraction method in any of the forms described above and passed to step (2);
  • step (2) convolution pooling operation: a convolution operation is performed on the input using at least one filter matrix, fusing the sub-graph structures supporting classification that correspond to the features, to obtain at least one vector as a convolution result; the convolution result is then subjected to a pooling operation to obtain at least one vector as a pooling result, the pooling result containing features of larger sub-graph structures in the graph; the convolution result of each level is transferred to the convolution pooling operation of the next level,
  • and the pooling result of each level of the convolution pooling operation is output to step (4); wherein the input of the first level convolution pooling operation is the feature extracted in step (1),
  • the input of each subsequent level of the convolution pooling operation is the convolution result output by the previous level, and the final level of the convolution pooling
  • operation only outputs its pooling result to step (4);
  • where a larger sub-graph structure refers to a sub-graph structure with more than n vertices;
  • the filter matrix is a square matrix; the number of rows of the filter matrix in each convolution layer
  • is the same as the number of vectors input to that convolution layer; preferably, the elements in the filter matrix are real numbers greater than or equal to -1 and less than or equal to 1; more preferably, the elements in the filter matrix are real numbers greater than or equal to 0 and less than or equal to 1; preferably, the pooling operation is selected from a maximum pooling operation and an average pooling operation;
  • the element values of the vector corresponding to the convolution result represent the possibility that the sub-graph structure appears at each position in the graph, and the element values of the vectors corresponding to the pooling result, the first pooling result and the second pooling result represent the maximum likelihood or average likelihood that the sub-graph structure appears in the graph.
  • category labeling includes the following steps:
  • step (1) hidden layer processing: the received vectors are processed using the hidden layer to obtain at least one mixed vector, which is passed to step (2); the mixed vector contains the information of all vectors received by the hidden layer; the processing combines and splices the input vectors into one combined vector, and performs a linear weighting operation on the combined vector using at least one weight vector to obtain at least one mixed vector;
  • step (2) feature activation: for each mixed vector received, a value is calculated using an activation function, and all the obtained values are combined into a vector and passed to step (3);
  • preferably, the activation function is a sigmoid function, a ReLU activation function, or a pReLU function;
  • step (3) type labeling: using the received vector, the possibility that the graph belongs to each classification label is calculated, and the most probable classification label is labeled as the category of the graph, completing the classification of the graph (a sketch of these three steps is given below); preferably, a classification
  • algorithm is used to calculate the possibility that the graph belongs to each classification label, and the most probable classification label is labeled as the category of the graph, completing the classification of the graph; more preferably, the classification algorithm is selected from any one or more of kNN and linear classification algorithms.
  • An embodiment implements a graph classification system provided by the present invention, where the vertices of the graph are arbitrary entities, and the edges of the graphs are relationships between any entities;
  • any of the entities is any independent individual or set of individuals, and the individual may be real or virtual; preferably, the entity may be any one or a combination of persons, objects, events and concepts; more preferably, any of the entities is selected from the group consisting of atoms in a compound or element, and any one or more of a person, a commodity or an event in a network;
  • the relationship is any association between any entities; more preferably, the association is a chemical bond connecting atoms, a relationship between commodities, or a relationship between persons; more preferably, the relationship between commodities includes causal relationships and association relationships of purchased goods; more preferably, the relationship between persons includes actual blood relationships, friend or follow relationships in a virtual social network, transaction relationships, and message-sending relationships.
  • An embodiment specifically implements a network structure type discriminating system provided by the present invention, and the classification system implements network structure classification based on any form of graph classification system as described above, and the vertex of the graph is a node in the network.
  • the edge of the graph is a relationship of nodes in the network; preferably, the network is selected from the group consisting of an electronic network, a social network, and a logistics network; and more preferably, the electronic network is selected from a local area network, a metropolitan area network, a wide area network, and the Internet.
  • the node is selected from the group consisting of a geographic location, a mobile station, a mobile device, a user equipment, a mobile user, and a network user; more preferably, the relationship of the nodes is selected from information transmission relationships between electronic network nodes, transportation relationships between geographic locations, actual blood relationships between people, friend or follow relationships in virtual social networks, transaction relationships, and message-sending relationships;
  • the network structure types to be classified are selected from star, tree, fully connected and ring structures.
  • An embodiment specifically implements a compound classification system provided by the present invention; the classification system implements compound classification based on the graph classification system in any of the forms described above, the vertices of the graph being atoms of a compound and the
  • edges being chemical bonds between atoms; preferably, the classification is selected from the group consisting of activity, mutagenicity, carcinogenicity, catalytic properties, and the like.
  • An embodiment implements a social network classification system provided by the present invention, and the classification system implements social network classification based on any form of graph classification system as described above, and the vertex of the graph is an entity in a social network.
  • the edge of the figure is a relationship between entities, including but not limited to a person, an organization, an event, a geographical location in a social network, and the relationship includes but is not limited to a friend relationship, a relationship of interest, a private message, Name, association.
  • a mention refers to referring to a person, for example by using @.
  • An embodiment implements a computer system provided by the present invention, the computer system comprising any one or more of the graph feature extraction system, the graph classification system, the network structure type discriminating system, the compound classification system, and the social network classification system in any of the forms described above.
  • an embodiment takes a 6-vertex diagram as an example to describe the connection information regularization system based on the adjacency matrix and the graph feature extraction system based on the adjacency matrix in the computer environment.
  • in this 6-vertex graph, the vertices are denoted a, b, c, d, e and f.
  • the six edges are (a, b), (a, c), (b, e), (b, f), (e, f) and (e, d); the graph structure and its first adjacency matrix ordered according to these vertices are as shown in FIG. 9.
  • the connection information regularization system reorders all the vertices in the first adjacency matrix to obtain a second adjacency matrix; the connection information elements in the second adjacency matrix are concentrated in a diagonal region of width n of the second adjacency matrix,
  • where n is a positive integer, n ≥ 2 and n < |V|;
  • the diagonal region of width n of the second adjacency matrix is composed of the following elements: a positive integer i is traversed from 1 to |V|; when i > max(n, |V|−n), the elements from column (i−n+1) to column |V| of row i are selected; when i ≤ n, the elements from the first column to column (i+n−1) of row i are selected; otherwise, the elements from column (i−n+1) to column (i+n−1) of row i are selected.
  • the vertex reordering method may be a greedy algorithm, including the following steps:
  • swap(A, i, j) indicates that the rows and columns corresponding to i and j in the adjacency matrix A are exchanged simultaneously to obtain a new adjacency matrix, and refresh(A) indicates that the pending row and column exchanges are applied to the adjacency matrix A.
  • the best result can be obtained, as shown in the adjacency matrix on the right side of FIG. 12; this optimal result is the second adjacency matrix.
  • the second adjacency matrix is input to the feature generation module to calculate at least one vector, and the vectors directly correspond to the subgraph structure supporting the classification.
  • these filter matrices are denoted F_{0,i}, i ∈ {1, ..., n_0}.
  • the diagonal feature extracted by the filter matrix F_{0,i} at step j can be expressed as P_{0,i,j} = σ( Σ_{u=1}^{n} Σ_{v=1}^{n} F_{0,i}[u,v] · A[j+u−1, j+v−1] ),
  • where σ(·) is an activation function, such as sigmoid, and A is the second adjacency matrix. Therefore, the feature size obtained from the diagonal convolution is n_0 × (|V| − n + 1).
  • P 0 is used to represent the features obtained by the feature generation module.
  • F 0 is used to represent the filter parameter ⁇ F 0,i ⁇ .
  • FIG. 15(a) is a graph and its second adjacency matrix;
  • FIG. 15(b) shows the two filter matrices used; for convenience, the values in the filter matrices are taken as 0 or 1, and the two filter matrices correspond to
  • the sub-graph structures shown in FIG. 15(c).
  • the area corresponding to the value 0.99 is the area enclosed by the broken line in FIG. 15(a), i.e., the sub-graph structure formed by the three vertices b, e and f; this sub-graph structure is identical to
  • the structure represented by the filter matrix used (the upper structure in FIG. 15(c)).
  • the benefit of the connection information regularization system is that the connection information is concentrated in the diagonal region of the second adjacency matrix; since elements that contain no connection information contribute little to the classification of the graph, this greatly reduces the amount of computation of the system.
  • if the connection information regularization system is not used and the feature generation module extracts features over the full adjacency matrix with filter matrices of size n×n, each filter matrix needs to perform (|V|−n+1)² filtering operations instead of only the (|V|−n+1) operations needed along the diagonal region.
  • an embodiment is further provided to illustrate a specific implementation of the adjacency matrix based graph classification system in a computer environment as described herein, and to verify the effect of such an implementation using a public data set.
  • choice of the window size n: for a data set with graphs of irregular sizes, a suitable window size n needs to be found. If n is set too small, most of the graphs may lose connection information elements after passing through the connection information regularization system. Moreover, an n that is too small may cause the feature generation module to overfit, because fewer sub-graph structural features are captured.
  • FIG. 16(a) shows the structure of a graph of three vertices and its adjacency matrix, which is zero-padded so that the size of the adjacency matrix becomes 5, as shown in FIG. 16(b).
  • when selecting n, a small number of graphs are randomly chosen from the graph data set, the chosen graphs are processed using the connection information regularization system with different window sizes n, and the Loss values of the resulting second adjacency matrices are compared. For the randomly chosen set of graphs, the window size n that minimizes the average Loss value of their second adjacency matrices is selected as the window size of the graph data set (a sketch of this selection procedure follows).
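A sketch of this selection procedure, assuming that regularize(A, n) denotes the connection information regularization routine and loss_value(A, n) the LS measure sketched earlier; both names are illustrative:

```python
import random
import numpy as np

def choose_window_size(adj_matrices, candidate_ns, regularize, loss_value,
                       sample_size=20, seed=0):
    """Randomly sample a few graphs, regularize them with every candidate
    window size n, and keep the n whose second adjacency matrices have
    the smallest average Loss value."""
    random.seed(seed)
    sample = random.sample(adj_matrices, min(sample_size, len(adj_matrices)))
    avg_loss = {n: np.mean([loss_value(regularize(A, n), n) for A in sample])
                for n in candidate_ns}
    return min(avg_loss, key=avg_loss.get)
```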
  • the adjacency matrix is subjected to the zero-padding operation to obtain the first adjacency matrix, and then the first adjacency matrix is processed by the processing flow shown in FIG. 30.
  • the greedy algorithm of Embodiment 1 is applied to the adjacency matrix of the graph
  • to perform connection information regularization, and the feature generation operation is then performed.
  • n_{f0} filter matrices are selected for the filtering operation, the feature extraction of the graph is performed in the manner of the foregoing Embodiment 1, and the result is input to the stacked CNN module.
  • the element values of the vector corresponding to the convolution result P_1 represent the possibility that the sub-graph structure appears at various positions in the graph; by repeatedly adding more convolution sub-modules, more convolution results P_2, P_3, ..., P_m can be obtained, and the sub-graph structures captured by deeper convolution sub-modules become larger and more complex.
  • Table 1 describes the size and number of filter matrices in each convolution submodule and the size of the generated features, where the diagonal convolution represents the feature generation module and the convolutional layer m is the mth convolution submodule.
  • the height of the filter matrix (i.e., the number of rows of the filter matrix) needs to be set to the number of filter matrices in the previous convolution sub-module (i.e., the number of vectors in the convolution result output by the previous convolution sub-module).
  • For example, for convolution sub-module 2, the filter matrix size is n_1 × s_2, which means that the filter matrix height is the same as the number of filter matrices (n_1) in convolution sub-module 1.
  • the convolution result P_{i−1} of the (i−1)-th convolution sub-module is taken as input, with size n_{i−1} × (|V| − n + 1);
  • a convolution operation with the filter matrices
  • F_i (each of size n_{i−1} × s_i) is performed on it to obtain the convolution result P_i.
  • the elements in P_i are computed as P_i[j,k] = σ( Σ_{u=1}^{n_{i−1}} Σ_{v=1}^{s_i} F_{i,j}[u,v] · P_{i−1}[u, k+v−1] ), with zero padding at the boundary,
  • where σ(·) denotes an activation function, such as sigmoid, and
  • j, k denote the position of the element in P_i: the j-th row, the k-th column.
  • s i represents the width of the filter matrix in the i-th convolutional layer
  • n i represents the number of filter matrices in the i-th convolutional layer.
  • deep convolution results P 0 ,..., P m can be obtained.
  • the pooled sub-module is used to pool the convolution results.
  • the maximum pooling operation is selected, and a maximum pooling layer is added after each set of convolution results P_i.
  • a maximum pooling operation is performed on each row to obtain a vector of size n_{i−1} × 1 (a sketch of one convolution sub-module with row-wise maximum pooling follows).
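A minimal sketch of one convolution sub-module followed by row-wise maximum pooling; for brevity it assumes a plain sliding window without the zero padding mentioned elsewhere:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_submodule(P_prev: np.ndarray, filters: np.ndarray) -> np.ndarray:
    """filters has shape (n_i, n_prev, s_i): n_i filter matrices whose
    height n_prev equals the number of rows of the previous result
    P_{i-1}; each filter slides along the columns of P_{i-1}."""
    n_prev, L = P_prev.shape
    n_i, _, s_i = filters.shape
    P = np.empty((n_i, L - s_i + 1))
    for j in range(n_i):                       # one output row per filter
        for k in range(L - s_i + 1):
            P[j, k] = sigmoid(np.sum(filters[j] * P_prev[:, k:k + s_i]))
    return P

def max_pool_rows(P: np.ndarray) -> np.ndarray:
    """Row-wise maximum pooling: the maximum likelihood with which each
    captured sub-graph structure appears anywhere in the graph."""
    return P.max(axis=1)
```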
  • Figure 17 shows the relationship between the convolution sub-module and the pooled sub-module in the stacked CNN, where the arrows indicate the direction of data transfer between the modules.
  • the hidden layer unit is a fully connected layer, and the neurons in the fully connected layer are fully connected to all the activation values of the previous layer.
  • the weight parameter W_h is set,
  • and together with the deviation parameter b_h it is used to compute the activation values from the input vector;
  • a dropout is additionally set to prevent the neural network from overfitting.
  • dropout means that during the training of a deep learning network, neural network units are temporarily dropped from the network with a certain probability; dropout can effectively prevent overfitting.
  • the activation values calculated by the activation unit are taken as input, and multinomial logistic regression (i.e., the softmax function) is performed by another fully connected layer with weight parameter W_s and deviation parameter b_s to obtain
  • the probability distribution over the class labels; the class label corresponding to the highest probability value in the output result is marked as the category of the graph.
  • the training of neural networks in the system is achieved by minimizing the cross-entropy loss.
  • the formula is: Loss = − Σ_{(A_i, y_i) ∈ R} log p(y_i | A_i),
  • where R is the training set, A_i represents the adjacency matrix of the i-th graph in R,
  • y_i represents the i-th class label, and p(y_i | A_i) is the probability the classifier assigns to y_i.
  • the parameters in the neural network are optimized by stochastic gradient descent (SGD), and the back propagation algorithm is used to calculate the gradient.
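A minimal PyTorch-style sketch of this training setup; the tiny classifier and the random batch below are placeholders, not the patent's actual network:

```python
import torch
import torch.nn as nn

# Placeholder classification head standing in for the stacked-CNN model:
# input features of shape (batch, 50, 10), two output classes.
model = nn.Sequential(nn.Flatten(), nn.Linear(50 * 10, 64),
                      nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 2))
criterion = nn.CrossEntropyLoss()                         # cross-entropy loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # SGD optimizer

features = torch.randn(8, 50, 10)    # dummy batch of extracted graph features
labels = torch.randint(0, 2, (8,))   # dummy class labels

for epoch in range(30):              # iterate until convergence
    optimizer.zero_grad()
    loss = criterion(model(features), labels)
    loss.backward()                  # backpropagation computes the gradients
    optimizer.step()                 # stochastic gradient descent update
```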
  • MUTAG is a data set with 188 nitro compounds, where the class indicates whether the compound has mutagenic effects on bacteria.
  • PTC is a data set containing 344 compounds, where the class label indicates whether the compound is carcinogenic to male and female mice.
  • PROTEINS is a collection of graphs in which the vertices are secondary structure elements and the edges represent neighborhood in the amino acid sequence or in 3D space. The other two are the social network data sets IMDB-BINARY and IMDB-MULTI.
  • IMDB-BINARY is a movie collaboration data set collected from IMDB, containing actor/actress and genre information for different movies. For each graph, the vertices represent actors/actresses, and there is an edge between two of them if they appear in the same movie. Each actor/actress generates a collaboration network and an ego network, and each ego network is labeled with the genre it belongs to.
  • IMDB-MULTI is the multi-class version, since movies can belong to several genres at the same time; IMDB-BINARY is the binary version (only two classes).
  • the first implementation uses one independent pooling module and one convolution pooling module; the second graph classification
  • system uses one independent pooling module and four convolution sub-modules.
  • the parameter n in the invention is set from 3 to 17.
  • the filter matrix width s_i used by each convolution layer is selected from {3, 5, 7, 9, 11}.
  • the number of filter matrices in each convolution layer is selected from {20, 30, 40, 50, 60, 70, 80}.
  • the convergence condition is that the difference between the accuracy of the training phase and the accuracy of the previous iteration is less than 0.3%, or that more than 30 iterations have been run.
  • the test set and the training set are randomly split at a ratio of 3:7.
  • for each graph G_i with classification label y_i and predicted category ŷ_i output by the classifier, the accuracy is calculated as Accuracy = (1/N) Σ_{i=1}^{N} δ(y_i = ŷ_i), where N is the number of test graphs;
  • the indicator function δ(·) takes the value 1 if the condition is true and 0 otherwise.
  • the present invention is compared to three representative methods: DGK (Deep graph kernels, Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2015: 1365-1374), PSCN (Learning convolutional neural networks for Graphs, Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016, 2014-2023) and MTL (Joint structure feature exploration and regularization for multi-task graph classification, IEEE Transactions on Knowledge and Data Engineering, 2016 , 28(3): 715-728).
  • Table 2 shows the characteristics of the five data sets used and summarizes the average accuracy and standard deviation of the comparison results. All the examples were run ten times in the same setup.
  • the accuracy of the second graph classification system is 94.99%, which is higher than the best prior result, 92.63% by PSCN.
  • the first graph classification system achieved an accuracy of 92.32%, which is very similar to PSCN.
  • DGK and PSCN achieved approximately 60% accuracy.
  • the first graph classification system reached 62.50%, and the second graph classification system reached 64.99%, which is the best accuracy so far on this data set.
  • the second graph classification system achieves a maximum accuracy of 75.96%, which is slightly better than the best result of 75.89% of PSCN.
  • the second graph classification system has an accuracy of 71.66% on IMDB-BINARY, which is higher than the best PSCN result of 71%; on IMDB-MULTI its highest accuracy is 50.66%, compared with the best prior results of 45% by PSCN and 44% by DGK. In all of the embodiments, the present invention achieves the highest accuracy with a lower probability of false positives.
  • the number of filter matrices is set to 50, and the filter matrix width of the convolution sub-module in the stacked CNN is set to 7.
  • Accuracy and time consuming are the average of ten runs under the same embodiment settings.
  • as the parameter n increases from 3 to 11, the accuracy is not sensitive to the increase of n, while the time consumption is considerably more sensitive; therefore, setting a smaller n is preferable.
  • n is set to 7
  • the number of filter matrices is 50
  • the width of the stacked CNN filter matrices ranges from 3 to 15. Due to zero padding, only odd widths can be used, namely 3, 5, 7, 9, 11, 13 and 15. Each result is again the average of ten runs under the same settings.
  • Figure 18(b), Figure 19(b) and Figure 20(b) show the results on MUTAG, PTC and PROTEINS, respectively.
  • a width of 9 is an approximately optimal setting for the filter matrix on MUTAG, because widths from 9 to 15 achieve similar accuracy while the time consumption at 9 is small.
  • the PTC data set shows that the optimal filter matrix width is 5, because setting the filter matrix width to 9, 11 or 13 yields similar accuracy but takes considerably longer than a small filter matrix width.
  • the optimal filter matrix width is 11.
  • n is set to 7
  • the filter matrix width is set to 7
  • the number of filter matrices is 20-80.
  • FIGS. 18(c), 19(c) and 20(c) show the results on MUTAG, PTC and PROTEINS, respectively. It can be seen that using a larger number of filter matrices, such as 60, may result in worse accuracy on the same data set. This is because more filter matrices require more weights to be trained; training with a larger number of filter matrices therefore overfits more easily.
  • in this embodiment, the number of convolution layers on MUTAG, PTC and PROTEINS is set from 1 to 5.
  • Figure 18(d), Figure 19(d) and Figure 20(d) show the accuracy and time-consuming on the MUTAG, PTC and PROTEINS data sets, respectively.
  • n and the filter matrix width are set to 7, and the number of filter matrices is set to 50.
  • the accuracy of the 5-convolution-layer version is similar to that of the 2-convolution-layer version.
  • the previous embodiments have shown that increasing the filter matrix width, the number of filter matrices, or the number of convolution layers may not improve performance.
  • the next set of embodiments investigates the effect of overfitting by varying the dropout ratio, used together with batch normalization.
  • batch normalization is a method of keeping the inputs of each layer of a neural network in the same distribution during deep neural network training, which helps the neural network converge.
  • Figure 21 shows the results of MUTAG and PTC.
  • the x-axis changes the dropout ratio, the left y-axis is accuracy, and the right y-axis is time-consuming.
  • FIG. 21(a) shows that for MUTAG the accuracy improves when the dropout ratio goes from 0 to 0.2 and decreases when the dropout ratio goes from 0.2 to 0.9.
  • FIG. 21(b) shows the measurement results on PTC: the accuracy is stable for dropout ratios from 0 to 0.4, increases from 0.4 to 0.5, and decreases slightly from 0.5 to 0.9. This set of examples shows that the graph classification system of the present invention obtains the best fit on MUTAG when the dropout ratio is set to 0.2, while the optimal ratio for PTC is 0.5.
  • the invention proposes a graph feature extraction system based on an adjacency matrix, which concentrates the connection information elements in the adjacency matrix and extracts features.
  • the invention is compared here to a conventional CNN.
  • a two-dimensional convolution layer is applied directly on the adjacency matrix, and the pooling layer becomes a two-dimensional pooling layer.
  • the filter matrix width is 7, and the number of filter matrices is 50.
  • FIG. 22(a) shows the accuracy of the two methods; it can be seen that the method of the present invention is more accurate.
  • the time consumption of the ordinary CNN is larger than that of the method of the present invention; that is, the method of the invention has higher accuracy and lower time consumption.
  • Figures 23, 24, and 25 are the convergence processes for the loss of training sets and verification sets for MUTAG, PTC, and PROTEINS.
  • the gray line is the loss on the training set and the blue line is the loss on the verification set. In all three data sets, the loss first decreases and then stabilizes after 30 iterations. As with most machine learning methods, especially neural networks, the loss on the training set can reach lower values than on the verification set, because the stochastic gradient training procedure minimizes the loss on the training set rather than on the verification set.
  • FIG. 26 shows the parameter variation of the filter matrix in the feature generation module and the graph structure represented by it in the training process.
  • the x-axis in the figure represents the number of iterations, from 0 to 30. When the number of iterations is 0, the values are the initial values obtained by random sampling from the Gaussian distribution.
  • FIG. 26(c) shows the values of the filter matrix, which is a 7×7 matrix. The darker a cell in the matrix, the larger its value (closer to 1); white cells are closer to -1, and gray cells have values around 0.
  • Figure 27 shows subgraph features captured in different convolutional layers.
  • Figure 27(a) shows an input map of 12 vertices.
  • a second graph classification system (with 5 convolution layers) is used; the filter matrix size of the feature generation module is set to 4×4, and the filter matrix width of the remaining convolution layers is 3. Therefore, the sub-graph feature sizes of the successive layers are 4, 6, 8, 10 and 12 vertices.
  • FIGS. 27(b), (c), (d), (e) and (f) respectively show the sub-graph patterns learned at each of the five convolution layers. Each adjacency matrix represents the existence probability of each edge: the darker the cell, the higher the probability that the filter captures the corresponding edge.
  • in the first layer, shown in FIG. 27(b), only basic four-vertex patterns can be captured.
  • in the next layer, the filter matrix can capture and represent six-vertex patterns composed of first-layer features. By adding more convolution layers, more complex sub-graph patterns can be captured and represented. Finally, in FIG. 27(f), a 12-vertex feature is captured, which is very similar to the initial input graph in FIG. 27(a).
  • an embodiment is provided mainly to illustrate an important characteristic of the adjacency-matrix-based graph classification system proposed by the present invention: the ability to capture large multi-vertex sub-graph structures with a small window.
  • FIG. 28 shows the physical meaning of using the feature generation module on this graph.
  • the graph has two rings of six vertices each, and two vertices are shared by the two ring structures.
  • existing methods typically require a window size greater than 10.
  • the method of the present invention is effective even if only a window of size 6 is used.
  • the filter matrix can be moved |V| − n + 1 = 5 steps along the diagonal of the second adjacency matrix.
  • the five graphs at the center of Figure 28 show how the filter matrix covers (captures) different subgraphs of the graph in each step.
  • the filter matrix covers all connections between any pair of vertices marked by a, b, c, d, e, f.
  • the filter emphasized by the dotted line covers the ring composed of the vertices a, b, c, d, e, f.
  • different sub-graph structures can be captured by the same filter matrix. For example, steps 1 and 5 capture the same graph structure: a six-vertex ring. Meanwhile, steps 2, 3 and 4 capture another type of graph structure: a six-vertex line. Combining the obtained features yields more complex structures: as shown at the bottom of FIG. 28, three different complex structures can be derived, the middle one being the 10-vertex ring to be captured.
  • FIG. 29 presents numerical examples to describe features captured by the feature generation module and features captured in the stacked CNN.
  • FIG. 29(a) is a 12-vertex graph and its second adjacency matrix; the graph contains two rings of six vertices each, two vertices are shared by the two ring structures, and both rings are also connected to a further vertex.
  • blank elements of the adjacency matrix and the filter matrices in FIG. 29 indicate the value 0, and the element values of the filter matrices are chosen as 0 or 1 in order to simplify the calculation.
  • Figure 29 (b) shows two filter matrices in the feature generation module, and the corresponding sub-graph structure is shown in Figure 29 (c).
  • the filtering operation is performed along the diagonal direction of the second adjacency matrix of the graph using the two filter matrices of FIG. 29(b); the resulting vectors are shown in FIG. 29(d), where the elements surrounded by broken lines result from zero padding.
  • the filter matrix in the stacked CNN is as shown in Fig. 29(e), and the element is also 0 or 1 in order to simplify the calculation.
  • the captured features (FIG. 29(d)) are filtered using the filter matrix in the stacked CNN to obtain the vector shown in FIG. 29(h),
  • and the pooled result is shown in FIG. 29(i).
  • the adjacency matrix represented by the filter matrix in the stacked CNN is obtained as shown in FIG. 29(f), and FIG. 29(g) shows the sub-graph structures represented by the filter matrix in the stacked CNN.
  • FIG. 29(g) shows a double ring with ten vertices, and a six-vertex ring with four additional vertices.
  • the adjacency-matrix-based graph classification system proposed by the invention can capture large multi-vertex sub-graph structures, and deep features of the implicit correlation structures from vertices and edges, through a small window, thereby improving classification accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Discrete Mathematics (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A connection information regularization system, graph feature extraction system, graph classification system and method based on an adjacency matrix. By concentrating the connection information elements of the adjacency matrix corresponding to a graph into a specific diagonal region of the adjacency matrix, the non-connection information elements are pruned in advance, so that traversing the diagonal region with a window of fixed size can capture all sub-graph structures of the corresponding size in the graph, greatly reducing the time complexity. Filter matrices are further used to extract the sub-graph structures of the graph along the diagonal direction, and a stacked convolutional neural network is then used to extract larger sub-graph structures. On the one hand this greatly reduces the computational complexity and the amount of computation, removing the limits imposed by computational complexity and window size; on the other hand it makes it possible to capture large multi-vertex sub-graph structures, as well as deep features of the implicit correlation structures from vertices and edges, through a relatively small window, improving the accuracy and speed of graph classification.

Description

Connection information regularization system, graph feature extraction system, graph classification system and method based on an adjacency matrix. Technical Field
The present invention belongs to the field of artificial intelligence, and specifically relates to a connection information regularization system, a graph feature extraction system, a graph classification system and corresponding methods based on an adjacency matrix.
Background Art
A graph in graph theory is a figure composed of a number of given points and lines connecting pairs of points. Such a figure is usually used to describe a specific relationship between certain objects: a point represents an object, and a line connecting two points indicates that the corresponding two objects have a certain relationship. A graph G in graph theory is an ordered pair (V, E), where V is the vertex set, i.e., the set of all vertices in the graph, and E is the edge set, i.e., the set of all edges between vertices. Simply put, vertices represent objects and edges represent relationships between objects. A graph is non-grid data, which is characterized by a dimension that is uncertain in concrete scenarios, high, and unbounded; the dimension of a graph refers to the number of its vertices. For example, a chemical structural formula can correspond to a graph, in which the atoms are the vertices and the chemical bonds between atoms are the edges. The dimension of a molecule is the number of atoms it contains; for example, a molecule consisting of 100 atoms has dimension 100. In a set of molecules, each molecule consists of a variable number of atoms, so the dimension is uncertain. In reality, complex structures such as proteins often consist of dozens or even hundreds of atoms, so their dimension reaches dozens or hundreds. As another example, a social network can also correspond to a graph, in which people are the vertices and relationships between people are the edges; the dimension of a social network is even higher and more complex. A typical large social network can have thousands of vertices and tens of thousands of edges, i.e., a dimension of several thousand. It can be seen that the dimension of graphs in graph theory is very high and has no upper bound.
On the other hand, data such as images, text, audio and video are grid data, which is characterized by a low dimension (no more than 3) and a fixed dimension. For example, for a set of images, the dimension of an image is not affected by the number of images: the dimension of one image can be expressed as 2 or 3, and for many more images (e.g., hundreds), the dimension remains 2 or 3. It can be seen that grid data and non-grid data are two completely different kinds of data: non-grid data has a higher and uncertain dimension and a more complex structure than grid data, and the classification methods and feature extraction methods for the two kinds of data are also completely different.
Many complex problems in business, science and engineering can be abstracted as graph problems and then solved by using graph analysis algorithms. The graph classification problem treats graphs as complex objects and builds deep learning models to learn graph classification based on common sub-graph structure patterns hidden in the graphs. For example, the MUTAG data set consists of many nitro compounds, where the class label indicates whether a compound has a mutagenic effect on bacteria. Another example is mapping unseen compounds to their activity level on cancer cells.
The graph classification problem treats graphs as complex objects and builds deep learning models to learn graph classification based on common subgraph structure patterns hidden in the graphs. A subgraph refers to a graph, in the sense of graph theory, formed by some of the vertices of a graph together with the edges connecting those vertices. Methods for classifying complex objects usually design a suitable similarity function to measure the similarity distance between two complex objects, and then use classification algorithms to classify the objects. Existing graph classification based on graph similarity computation models falls roughly into two categories:
(1) Local subgraph based methods. These methods compute the similarity between graphs according to whether, or how many times, relatively small sub-graph structures appear in the graphs. Their core idea is to identify important sub-graph structures as key features for graph classification; each graph to be classified is then represented as a vector containing the key feature information of these sub-graph structures, with each element of the vector representing the weight of the corresponding sub-graph structure, and existing machine learning algorithms are finally applied for training and prediction. Using such sub-graph structures as key features is limited by the subgraph size (usually called the window size), because increasing the subgraph size greatly increases the computational complexity and the amount of computation of subgraph enumeration. Typically, increasing the window size by one causes the amount of computation to grow exponentially, and beyond a certain limit it exceeds what a computer can bear in running time and memory usage. This kind of method is therefore limited by the window size (the selected subgraph size cannot exceed 10 vertices), which causes the graph features to lack sub-graph structures that are crucial for classification but cannot be captured through a small window (no more than 10), and may thus lead to a high classification error rate.
(2) Global similarity-based methods. The core idea of these methods is to compute the pairwise similarity (distance) between graphs. They usually first encode subgraph features, then create a distance/similarity matrix, and apply existing supervised learning algorithms, such as kNN and SVM, on the distance matrix for classification.
Graph kernels and graph embeddings are the two most recent representative methods among graph classification methods based on graph similarity computation models.
However, both of the above existing approaches to graph classification have serious drawbacks. First, compared with the classification of grid data such as text, image, video and scene data sets, feature extraction for graphs, as non-grid data, poses some unique challenges. A graph is composed of two types of elements, vertices and edges; analyzing a graph as a whole object requires capturing not only the shallow features of the graph's explicit topology, but also the deep features of the implicit (hidden) correlation structures from vertices and edges. It is therefore difficult to represent a graph in a deterministic feature space. Second, capturing implicit structural correlation patterns is crucial for high-quality graph classification. Neither small, fixed-size subgraph pattern matching (local similarity) nor pairwise graph similarity (global similarity) is sufficient to capture the complex hidden correlation patterns needed to classify graphs of different sizes and different structural complexity.
Convolutional neural networks (CNN) have achieved remarkable success in deep learning for processing grid data, such as text, images, audio/video and streaming data, as well as large-scale scene analysis. All these data are grid data: they have a definite, low dimension, and data in grid form are invariant to translation, scaling and rotation. A graph is non-grid data, and a CNN cannot be applied to a graph directly, because the convolution and pooling operations in a CNN are defined only on regular grid data and cannot be performed directly on non-grid data. (Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering, 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 1-8.)
Mathias Niepert et al. first applied convolutional neural networks to the graph classification problem (Learning convolutional neural networks for graphs, Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016, 2014-2023). For an input graph, the PSCN method proposed in that paper first labels the vertices of the graph (graph labeling), sorts the vertices according to the labeling result, and selects the first w vertices as central vertices; for each of the selected w vertices, its k neighboring vertices are selected in a breadth-first manner (chosen according to the graph labeling order), so that each vertex together with its neighborhood of size k forms a subgraph, and w vertices yield w subgraphs. Through these steps, w vectors of dimension (k+1) are obtained, each corresponding to the vertex information of a subgraph centered on a central vertex, as well as w vectors of dimension (k+1)^2, each corresponding to the edge information of a subgraph centered on a central vertex; a standard convolutional neural network is then applied to these vectors. Simply put, PSCN extracts subgraphs of a specified size (determined by the window size parameter k) centered on a number of vertices (determined by the parameter w), encodes them as features, and then applies a standard one-dimensional convolutional neural network. The PSCN method achieved better results than Deep Graph Kernel on existing open data sets. However, it still has some drawbacks. First, the selection of w central vertices limits the number of subgraphs, so there is no guarantee that all sub-graph structures can be extracted. Second, the PSCN method is still limited by the window size: the neighborhood selection is determined by a window size k smaller than 10, because a larger window size k would lead to unacceptable time consumption and memory usage. Third, PSCN cannot perform deep learning effectively with a small window size k, because it loses complex subgraph features when the input graph has densely connected features beyond the default window size. In addition, the classification results of PSCN are sensitive to the labeling procedure, which sorts the vertices of a neighborhood; a labeling method that works for one data set may therefore fail on another.
In summary, prior art methods for graph classification have two main problems. First, when a graph is analyzed as a whole object, it is impossible to select features that contain both explicit topological information and deep implicit information to represent the graph. Second, when subgraphs are used as graph features, the subgraph size is constrained by the choice of the window size k, which makes it difficult to capture larger complex subgraphs, so the classification accuracy of graphs is not high.
In real life, when many scenarios are abstracted into graphs, the subgraph structures that represent their characteristics are usually quite complex. For example, when a compound in organic chemistry is abstracted into a graph, with its atoms as the vertices and the chemical bonds between atoms as the edges, certain special molecular structures (i.e., subgraphs) usually need to be used as features of the whole compound, and these characteristic molecular structures may contain hundreds of atoms (i.e., vertices). Similarly, when a social network is abstracted into a graph, with the people in the network as the vertices and the relationships between people as the edges, certain special community structures (i.e., subgraphs) usually need to be used as features of the network, and such a community may contain hundreds of people (i.e., vertices). None of the prior art methods can effectively extract larger sub-graph structures from a graph, and therefore none can represent the graph's features well.
Summary of the Invention
In view of the deficiencies of the prior art, the technical problem to be solved by the present invention is to provide a connection information regularization system and method based on an adjacency matrix in a computer environment, which can effectively concentrate the elements of the adjacency matrix corresponding to the edges of a graph into the diagonal region, so that traversing the diagonal region with a window of fixed size can capture all sub-graph structures of the corresponding size in the graph, reducing the time complexity; then, by fusing the information of these sub-graph structures, large multi-vertex sub-graph structure information can be captured, thereby solving technical problems that the prior art cannot solve.
The shortcomings of existing graph classification methods addressed by the present invention include the following. First, since analyzing a graph as a whole object requires capturing not only the shallow features of the graph's explicit topology but also the deep features of the implicit correlation structures from vertices and edges, the accuracy of graph classification otherwise suffers. Prior art methods have difficulty representing a graph in a deterministic feature space, where the feature space refers to extracting features from the raw data and mapping the raw data into a higher-dimensional space, in which the features are higher-dimensional abstractions of the raw data. Second, due to the computational complexity limit caused by the window size, prior art methods cannot capture large multi-vertex sub-graph structures. Compared with existing graph classification methods, the present invention concentrates the connection information elements of the graph's adjacency matrix into a specific diagonal region of the adjacency matrix, prunes the non-connection information elements in advance, further uses filter matrices to extract the sub-graph structures of the graph along the diagonal direction, and then uses a stacked convolutional neural network to extract larger sub-graph structures. On the one hand, this greatly reduces the computational complexity and the amount of computation, removing the limits of computational complexity and window size; on the other hand, it can capture large multi-vertex sub-graph structures, as well as deep features of the implicit correlation structures from vertices and edges, through a relatively small window, improving the accuracy and speed of graph classification.
The first object of the present invention is to provide a connection information regularization system in a computer environment. The connection information regularization system reorders all the vertices in the first adjacency matrix of a graph to obtain a second adjacency matrix, in which the connection information elements are concentrated in a diagonal region of width n of the second adjacency matrix, where n is a positive integer, n ≥ 2 and n < |V|, and |V| is the number of rows or columns of the second adjacency matrix.
The second object of the present invention is to provide a graph feature extraction system based on an adjacency matrix in a computer environment. The graph feature extraction system extracts the features of a graph based on its adjacency matrix; the features directly correspond to the sub-graph structures supporting classification and are presented in the form of at least one vector, each vector corresponding to the distribution of one mixed state in the graph. The graph feature extraction system comprises a feature generation module and a connection information regularization system in a computer environment in any of the forms described above. The connection information regularization system and the feature generation module work together as a whole; their combined effect is to efficiently extract local patterns and connection features hidden in the specific diagonal region of width n on graph sets of different sizes and different structural complexity. The connection information regularization system greatly reduces the computational complexity and amount of computation required by the feature generation module, removing the computational complexity limit.
The third object of the present invention is to provide a graph classification system based on an adjacency matrix in a computer environment. The graph classification system comprises a category labeling module and a graph feature extraction system based on an adjacency matrix in a computer environment in any of the forms described above; the category labeling module labels the category of a graph based on the features generated by the graph feature extraction system and outputs the category of the graph.
The fourth object of the present invention is to provide a connection information regularization method in a computer environment.
The fifth object of the present invention is to provide a graph feature extraction method based on an adjacency matrix in a computer environment.
The sixth object of the present invention is to provide a graph classification method based on an adjacency matrix in a computer environment.
The seventh object of the present invention is to provide three graph classification methods based on a stacked CNN in a computer environment.
The eighth object of the present invention is to provide a graph classification system in which the vertices of the graph are arbitrary entities and the edges of the graph are relationships between arbitrary entities.
The ninth object of the present invention is to provide a network structure type discriminating system, which implements network structure classification based on the graph classification system described above, the vertices of the graph being nodes in the network and the edges of the graph being relationships between nodes in the network.
The tenth object of the present invention is to provide a compound classification system, which implements compound classification based on the graph classification system described above, the vertices of the graph being atoms of a compound and the edges of the graph being chemical bonds between atoms.
The eleventh object of the present invention is to provide a social network classification system, which implements social network classification based on the graph classification system described above, the vertices of the graph being entities in a social network and the edges of the graph being relationships between entities; the entities include but are not limited to people, organizations, events and geographic locations in the social network, and the relationships include but are not limited to friend relationships, follow relationships, private messages, mentions and associations. A mention refers to referring to a person, for example by using @.
The twelfth object of the present invention is to provide a computer system comprising any one or more of the connection information regularization system, the adjacency-matrix-based graph feature extraction system, the graph classification system, the network structure type discriminating system, the compound classification system and the social network classification system in any of the forms described above.
This Summary introduces a selection of concepts in simplified form that are described in detail in the Detailed Description below; it is not intended to identify key or essential features of the claimed subject matter, nor to determine the scope of the claimed subject matter.
The beneficial effects of the present invention are:
1. By concentrating the connection information elements of the adjacency matrix into the diagonal region of the adjacency matrix, pruning the regions of non-connection information elements, concentrating the connection information elements, and then extracting the sub-graph structures of the graph along the diagonal direction, the present invention greatly reduces the computational complexity of extracting sub-graph structures from a graph.
2. By using filter matrices to perform the filtering operation along the diagonal direction of the second adjacency matrix obtained by the connection information regularization system, the present invention obtains the features of the graph, and by applying a stacked convolutional neural network on the obtained features, it captures large multi-vertex sub-graph structures and deep features of the topology through a relatively small window.
3. By concentrating the connection information elements of the graph's adjacency matrix into a specific diagonal region of the adjacency matrix, pruning the non-connection information elements in advance, further using filter matrices to extract the sub-graph structures of the graph along the diagonal direction, and then using a stacked convolutional neural network to extract larger sub-graph structures, the present invention greatly reduces the computational complexity and the amount of computation, removes the limits of computational complexity and window size, and can capture large multi-vertex sub-graph structures, as well as deep features of the implicit correlation structures from vertices and edges, through a relatively small window, improving the accuracy and speed of graph classification.
4. In the graph classification system provided by the present invention, the connection information regularization system, the feature generation module and the stacked CNN module work together. Their combined effect is to use a small window of size n to capture small sub-graph structures, and at the same time use combinations of these small sub-graph structures to obtain larger, deeper and more complex sub-graph structures with more than n vertices; that is, a small window (of size n) is used to extract larger (more than n vertices), deeper and more complex features of the graph. In other words, large multi-vertex sub-graph structures, as well as deep features of the implicit correlation structures from vertices and edges, are captured through a relatively small window, improving the accuracy and speed of graph classification.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a diagonal region of width 3 in a 6×6 adjacency matrix;
FIG. 2 is a schematic diagram of the linear weighting calculation process;
FIG. 3 is a schematic diagram of the conversion of the first adjacency matrix into the second adjacency matrix, with the first adjacency matrix on the left and the second adjacency matrix on the right;
FIG. 4 is a flowchart of the greedy algorithm;
FIG. 5 is a flowchart of the branch and bound algorithm;
FIG. 6 is a data flow diagram of the stacked CNN module;
FIG. 7 is a data flow diagram of the stacked CNN module (comprising an independent pooling module and a convolution pooling module);
FIG. 8 is a data flow diagram of the stacked CNN module (comprising an independent pooling module and a plurality of convolution pooling modules);
FIG. 9 is a graph and the first adjacency matrix corresponding to the graph;
FIG. 10 is a flowchart of the greedy algorithm;
FIG. 11 is a schematic diagram of an example of row-column exchange of an adjacency matrix;
FIG. 12 shows the first adjacency matrix and the second adjacency matrix obtained by reordering;
FIG. 13 is a graph and the second adjacency matrix corresponding to the graph;
FIG. 14 shows the movement of the filter matrix of the feature generation module;
FIG. 15 is a schematic diagram of the filter matrix calculation of the feature generation module;
FIG. 16 is a schematic diagram of the zero-padding operation on the adjacency matrix corresponding to a graph;
FIG. 17 is a schematic diagram of the graph classification system based on a stacked CNN;
FIG. 18 shows the accuracy and time consumption results on MUTAG;
FIG. 19 shows the accuracy and time consumption results on PTC;
FIG. 20 shows the accuracy and time consumption results on PROTEINS;
FIG. 21 shows the variation of accuracy and time consumption with the dropout ratio;
FIG. 22 compares the classification accuracy and time consumption on each data set with and without the connection information regularization system;
FIG. 23 is the convergence curve on MUTAG;
FIG. 24 is the convergence curve on PTC;
FIG. 25 is the convergence curve on PROTEINS;
FIG. 26 shows a filter matrix and its corresponding sub-graph structures, where (a) is the positive sub-graph structure, (b) is the negative sub-graph structure, and (c) is the filter matrix;
FIG. 27 shows the features captured by each convolution layer and their corresponding sub-graph structures, where (a) is a 12-vertex graph and (b) to (f) are the extracted 4-, 6-, 8-, 10- and 12-vertex features;
FIG. 28 is a schematic diagram of the physical meaning of the feature generation module;
FIG. 29 is a schematic diagram of the sub-graph structures captured by the feature generation module and the stacked CNN module;
FIG. 30 is a schematic diagram of the implementation flow of the graph classification system based on a stacked CNN.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solution of the present invention is further described below, taking as examples the adjacency-matrix-based graph feature extraction system and method in a computer environment and the stacked-CNN-based graph classification system and method in a computer environment according to the present invention. The following embodiments are only used to illustrate the present invention and are not intended to limit its scope. In addition, it should be understood that, after reading the teachings of the present invention, those skilled in the art can make various changes or modifications to the present invention, and these equivalent forms likewise fall within the scope defined by the appended claims.
One embodiment implements the connection information regularization system in a computer environment provided by the present invention. The connection information regularization system reorders all the vertices in the first adjacency matrix of a graph to obtain a second adjacency matrix, in which the connection information elements are concentrated in a diagonal region of width n of the second adjacency matrix, where n is a positive integer, n ≥ 2 and n < |V|, and |V| is the number of rows or columns of the second adjacency matrix. Preferably, the diagonal region refers to the diagonal region of the matrix from the upper left corner to the lower right corner; for example, the shaded area in FIG. 1 is the diagonal region of width 3 in a 6×6 adjacency matrix.
The graphs and subgraphs referred to are graphs in the sense of graph theory.
The connection information elements are the elements of the adjacency matrix corresponding to the edges of the graph.
The connection information regularization system concentrates the connection information elements of the graph's adjacency matrix into a specific diagonal region of width n of the second adjacency matrix (n is a positive integer, n ≥ 2 and n < |V|, where |V| is the number of rows or columns of the second adjacency matrix). After this processing, traversing the diagonal region with a matrix of size n×n (i.e., a window of size n) suffices to extract the sub-graph structures with n vertices from the graph; the required computational complexity and amount of computation are greatly reduced, removing the computational complexity limit.
A vector in the present invention is a quantity having a magnitude and a direction, represented mathematically as a 1×m matrix, where m is a positive integer greater than 1. The features described in the present invention all denote features of a graph.
The adjacency matrix in the present invention is a matrix representing the adjacency relationships between the vertices of a graph. A basic property of the adjacency matrix is that, by switching two columns and the corresponding rows of an adjacency matrix, another adjacency matrix representing the same graph can be obtained. Let G = (V, E) be a graph, where V is the vertex set, v_i denotes the i-th vertex in V, |V| denotes the number of vertices in V, i is a positive integer less than or equal to |V|, and E is the edge set. The adjacency matrix of G is a square matrix of order |V| with the following properties:
1) For an undirected graph, the adjacency matrix is necessarily symmetric, and its main diagonal is all zeros (only undirected simple graphs are discussed here), while the anti-diagonal is not necessarily zero; this is not necessarily the case for directed graphs. The main diagonal is the diagonal from the upper left corner to the lower right corner of the matrix; the anti-diagonal is the diagonal from the upper right corner to the lower left corner.
2) In an undirected graph, the degree of any vertex v_i is the number of nonzero elements in the i-th column (or the i-th row); vertex i refers to the vertex represented by the i-th column (or the i-th row) of the matrix. In a directed graph, the out-degree of vertex i is the number of nonzero elements in the i-th row, and the in-degree is the number of nonzero elements in the i-th column. The degree of a vertex is the number of edges incident to it; the out-degree of a vertex is the number of edges pointing from it to other vertices; the in-degree of a vertex is the number of edges pointing from other vertices to it.
3) Representing a graph with an adjacency matrix requires |V|^2 elements in total. Since the adjacency matrix of an undirected graph is necessarily symmetric, apart from the zero diagonal only the upper-triangular or lower-triangular data needs to be stored, i.e., only |V|×(|V|−1)/2 elements. When the edges of an undirected graph carry weights, the connection element values in the adjacency matrix are replaced by the weights, and elements without connections are replaced by 0.
The connection information elements of the present invention are the elements of the adjacency matrix corresponding to the edges of the graph. In an undirected graph, the element value in row i, column j indicates whether a connection between vertex v_i and vertex v_j exists and whether it has a connection weight; in a directed graph, the element value in row i, column j indicates whether a connection from vertex v_i to vertex v_j exists and whether it has a connection weight. For example, for vertices v_i and v_j in an undirected graph, if there is an edge between them, the element values in row i, column j and in row j, column i of the adjacency matrix are both 1; if there is no edge, both are 0; if there is an edge with weight w, both are w. As another example, for vertices v_i and v_j in a directed graph, if there is an edge from v_i to v_j, the element value in row i, column j of the adjacency matrix is 1; if there is no such edge, it is 0; if there is an edge from v_i to v_j with weight w, it is w. Here i and j are positive integers less than or equal to |V|, |V| is the number of vertices in the graph, and w is an arbitrary real number.
Preferably, if the edges of the graph carry no weights, the value of a connection information element is 1 and the value of a non-connection information element is 0; more preferably, if the edges of the graph carry weights, the value of a connection information element is the weight of the edge and the value of a non-connection information element is 0.
本发明所述第一邻接矩阵是指一开始将图转化为邻接矩阵得到的第一个邻接矩阵，即交换对应行列之前的初始邻接矩阵，所述第二邻接矩阵指的是通过对第一邻接矩阵进行行列交换，将矩阵信息最大限度集中化之后的邻接矩阵，第二邻接矩阵中的连接信息元素集中分布在所述第二邻接矩阵的宽度为n的对角线区域，其中n为正整数，n≥2且n<|V|，所述的|V|为第二邻接矩阵的行数或列数。第一邻接矩阵转换成第二邻接矩阵的示意图如图3所示，左图为第一邻接矩阵，右图为第二邻接矩阵。
进一步地，所述第二邻接矩阵的对角线区域由以下元素组成：正整数i从1遍历至|V|，当i>max(n,|V|-n)时，选取第i行中第(i-n+1)到|V|列的元素；当i≤n时，选取第i行中第0至i+n-1列的元素；当max(n,|V|-n)≥i≥min(|V|-n,n)时，选取第i行中第(i-n+1)到第(i+n-1)列的元素；
优选的,所述第二邻接矩阵的对角线区域是指使用一个尺寸为n×n的扫描矩形框沿所述第二邻接矩阵的对角线扫描一遍所经过的区域;更优选的,所述的扫描过程如下:首先,将所述扫描矩形框的左上角与第二邻接矩阵的左上角重合;然后每次将所述扫描矩形框往右方和下方各移动一个元素格,直至所述扫描矩形框的右下角与所述第二邻接矩阵的右下角重合。
进一步地,所述连接信息规整系统用于对所述第一邻接矩阵的全部顶点进行重新排序,使得排序之后第二邻接矩阵的对角线区域中连接信息元素的集中程度最高;所述连接信息元素的集中程度是指在对角线区域中非零元素的占比;
优选的,所述重新排序的方法为整数优化算法,其作用为将矩阵中的连接信息元素集中到对角线区域中,并使连接信息元素的集中程度尽可能的高;所述整数优化算法指的是通过同时交换矩阵中的对应两行或两列,使得矩阵的连接信息元素的集中程度更高的算法;
进一步地,所述重新排序的方法为贪心算法,包括以下步骤:
(1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
(2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
(3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出待处理邻接矩阵得到所述的第二邻接矩阵,所述的贪心算法结束;否则,从尚未处理过的顶点交换对中任意选择一个顶点交换对作为当前顶点交换对,同时交换其对应的两个顶点在待处理邻接矩阵中对应的两行及对应的两列,生成新邻接矩阵,并跳转至步骤(4);
(4)交换效果评定:计算新邻接矩阵中连接信息元素的集中程度,若所述新邻接矩阵中连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度,则用所述新邻接矩阵替代所述的待处理邻接矩阵,并跳转至步骤(2);若所述新邻接矩阵中连接信息元素的集中程度低于或等于所述待处理邻接矩阵中连接信息元素的集中程度,则放弃这种交换,并标记所述的当前顶点交换对为已处理状态,跳转至步骤(3)。
所述贪心算法的流程图参见附图4。
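为便于理解，下面给出上述贪心算法的一个示意性Python实现草稿（并非本发明的限定性实现）：其中集中程度以后文定义的Loss值衡量（Loss值越小集中程度越高），loss、greedy_reorder等函数名为本示例自行设定，并对交换对"已处理状态"的登记做了简化处理（每当矩阵更新后重新遍历全部交换对）。

```python
import itertools
import numpy as np

def loss(A, n):
    # Loss值：宽度为n的对角线区域之外的连接信息元素之和（定义见后文公式）
    i, j = np.indices(A.shape)
    return A[np.abs(i - j) >= n].sum()

def greedy_reorder(A, n):
    # 贪心算法：反复尝试同时交换两行及对应两列，仅保留使Loss降低（集中程度提高）的交换
    A = A.copy()
    improved = True
    while improved:                          # 对应步骤(2)：矩阵更新后重新统计交换对
        improved = False
        for p, q in itertools.combinations(range(A.shape[0]), 2):
            B = A.copy()                     # 对应步骤(3)：行列交换
            B[[p, q], :] = B[[q, p], :]
            B[:, [p, q]] = B[:, [q, p]]
            if loss(B, n) < loss(A, n):      # 对应步骤(4)：交换效果评定
                A, improved = B, True
                break                        # 回到步骤(2)
    return A
```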
进一步地,所述重新排序的方法为分支定界算法,包括以下步骤:
(1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
(2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
(3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出所述的待处理邻接矩阵得到所述第二邻接矩阵,所述的分支定界算法结束;否则,对所有可能的顶点交换对中的每一个未处理过的顶点交换对分别执行交换操作,并跳转至步骤(4),所述的交换操作是指同时交换所述顶点交换对对应的两个顶点在所述待处理邻接矩阵中对应的两行及对应的两列,对每一个所述的顶点交换对执行所述交换操作都会生成一个新邻接矩阵;
(4)交换效果评定:计算每一个所述的新邻接矩阵中连接信息元素的集中程度,若存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则选择集中程度最高的新邻接矩阵代替所述的待处理矩阵,并标记生成该集中程度最高的新邻接矩阵的顶点交换对为已处理状态,然后跳转至步骤(3);若不存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则输出当前待处理邻接矩阵得到所述的第二邻接矩阵,所述的分支定界算法结束。
所述分支定界算法流程图参见附图5。
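类似地，下面给出上述分支定界算法的一个示意性Python草稿（沿用上例中的itertools导入与loss函数；branch_and_bound_reorder为本示例自拟的函数名）：

```python
def branch_and_bound_reorder(A, n):
    # 分支定界算法示意：每轮对全部顶点交换对分别试探交换，选取Loss下降最多的新矩阵；
    # 若没有任何交换能提高集中程度，则输出当前矩阵作为第二邻接矩阵
    A = A.copy()
    while True:
        best, best_loss = None, loss(A, n)
        for p, q in itertools.combinations(range(A.shape[0]), 2):
            B = A.copy()
            B[[p, q], :] = B[[q, p], :]
            B[:, [p, q]] = B[:, [q, p]]
            if loss(B, n) < best_loss:
                best, best_loss = B, loss(B, n)
        if best is None:
            return A
        A = best
```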
进一步地,所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域内的连接信息元素的数量和/或非连接信息元素的数量。
进一步地,所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域外的连接信息元素的数量和/或非连接信息元素的数量。
进一步地,所述的集中程度可以利用Loss值来衡量,Loss值越小,集中程度越高,所述的Loss值的计算方法如下:
$$LS(A,n)=\sum_{|i-j|\ge n}A_{i,j}$$
式中，LS(A,n)代表损失Loss值，A代表所述的第二邻接矩阵，n代表所述第二邻接矩阵中对角线区域的宽度，A_{i,j}表示所述第二邻接矩阵中第i行第j列的元素。优选的，所述LS(A,n)表示第二邻接矩阵A在过滤矩阵大小为n×n时的Loss值，Loss值越小，第二邻接矩阵的集中程度越高。
进一步地,所述的集中程度还可以利用ZR值来衡量,ZR值越小,集中程度越高,所述ZR值的计算方法如下:
$$TC(A,n)=\sum_{|i-j|<n}C_{i,j}$$
$$T1(A,n)=\sum_{|i-j|<n}A_{i,j}$$
$$ZR(A,n)=\frac{TC(A,n)-T1(A,n)}{TC(A,n)}$$
式中，A代表第二邻接矩阵，C表示所有元素均为连接信息元素且尺寸大小与A相同的矩阵，A_{i,j}表示A中第i行第j列的元素，C_{i,j}表示C中第i行第j列的元素，TC(A,n)、TC表示宽度为n的对角线区域中元素的总个数，T1(A,n)、T1表示宽度为n的对角线区域中连接信息元素的个数，ZR(A,n)代表ZR值，该值表示宽度为n的对角线区域中非连接信息元素的占比。
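作为示意，下面给出按上述公式计算ZR值的一个Python草稿（沿用前文示例中的numpy导入；对带权图，此处以非零元素计数作为连接信息元素个数的一种假设性处理）：

```python
def zr(A, n):
    # ZR值：宽度为n的对角线区域中非连接信息元素的占比，ZR = (TC - T1) / TC
    i, j = np.indices(A.shape)
    band = np.abs(i - j) < n      # 宽度为n的对角线区域（与Loss使用同一区域划分）
    TC = band.sum()               # 区域中元素总个数
    T1 = (A[band] != 0).sum()     # 区域中连接信息元素的个数
    return (TC - T1) / TC
```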
一个实施例具体实现本发明提供的一种在计算机环境中基于邻接矩阵的图特征提取系统,所述的图特征提取系统基于图的邻接矩阵抽取出图的特征,所述的特征直接对应支持分类的子图结构,所述的特征以至少一个向量的形式呈现,每一个向量对应一种混合态在图中的分布情况;所述的图特征提取系统包含特征生成模块和如前所述的任何一种形式的在计算机环境中的连接信息规整系统。所述图特征提取系统包括连接信息规整系统和特征生成模块,所述的连接信息规整系统和特征生成模块作为一个整体协同作用,其作用是可以在不同大小、不同结构复杂度的图集上有效地提取隐含在矩阵宽度为n的特定的对角线区域中的局部模式和连接特征。所述连接信息规整系统使得特征生成模块所需要的计算复杂度和计算量大大减小,解决了计算复杂度的限制;所述的图为图论中的图;
优选的,所述的特征生成模块利用过滤矩阵生成图的特征,所述的过滤矩阵为正方形矩阵;更优选的,所述的特征生成模块利用至少一个过滤矩阵,沿所述第二邻接矩阵的对角线区域进行过滤操作,得到至少一个向量,所述的至少一个向量对应于所述的图的特征,所述的特征直接对应支持分类的子图结构,每一个向量对应一种混合态在图中的分布情况。
优选的,所述的分布情况是指图中出现该混合态中的子图结构的可能性;优选的,每一种所述的混合态代表任意多个子图结构对应的邻接矩阵的线性加权;更优选的,所述的线性加权是指每一个子图的邻接矩阵乘以该邻接矩阵对应的权值,然后对位相加到一起,得到一个与子图的邻接矩阵相同大小的矩阵;所述邻接矩阵对应的权值的加和为1;计算过程如图2所示。
优选的,所述的过滤操作是利用所述的过滤矩阵对所述第二邻接矩阵对位的矩阵内积的加和,通过激活函数得到一个值,让过滤矩阵沿所述第二邻接矩阵的对角线方向移动,从而得到一组值,形成一个向量,该向量对应一种子图结构在图中的分布情况;更优选的,所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数。
优选的,所述的特征生成模块利用不同的过滤矩阵,进行所述的过滤操作;
优选的，所述过滤矩阵中每一个元素的初始值分别为从高斯分布中取出的随机变量的值。所述高斯分布是一种概率分布，是具有两个参数μ和σ的连续型随机变量的分布，第一个参数μ是服从正态分布的随机变量的均值，第二个参数σ是此随机变量的方差；通过高斯分布取随机变量值时，所取随机变量值与μ越邻近，概率越大，而离μ越远则概率越小。
优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数。
优选的,所述的特征生成模块参与机器学习过程,所述机器学习过程用于调整所述过滤矩阵的元素的值。
优选的,所述的机器学习过程是利用反向传播,利用分类的损失值,计算梯度值,进一步调节过滤矩阵中的各个元素的值。
所述损失值,指的是机器学习过程中的输出与实际应该得到的输出之间的误差;所述梯度可以看作是一个曲面沿着给定方向的倾斜程度,标量场的梯度是一个向量场。标量场中某一点上的梯度指向标量场增长最快的方向,梯度值是这个方向上最大的变化率。
所述的机器学习过程由正向传播过程和反向传播过程组成。在正向传播过程中,输入信息通过输入层经隐含层,逐层处理并传向输出层。如果在输出层得不到期望的输出值,则取输出与期望的误差的平方和作为目标函数,转入反向传播,逐层求出目标函数对各神经元权值的偏导数,构成目标函数对权值向量的梯度,作为修改权值的依据,机器学习过程在权值修改过程中完成。误差收敛到期望值或达到最大学习次数时,机器学习过程结束。所述过滤矩阵中元素的初始值为从高斯分布中取出的随机变量的值,然后在机器学习过程中通过反向传播进行更新,并在机器学习过程结束时达到最优。
优选的,所述隐含层是指除输入层和输出层以外的其他各层,隐含层不直接接受外界的信号,也不直接向外界发送信号。
进一步地,所述过滤矩阵的尺寸为n×n,即所述过滤矩阵的尺寸与所述第二邻接矩阵中的对角线区域宽度相同;通过所述的连接信息规整系统将第一邻接矩阵中的连接信息元素集中到对角线区域之后,使用过滤矩阵进行对角卷积就可以在O(n)的时间复杂度的前提下,尽可能多的把图中大小为n的子图结构的分布情况提取出来。
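下面给出沿对角线区域进行过滤操作的一个示意性Python草稿（diagonal_filter为本示例自拟的函数名，沿用前文的numpy导入；默认激活函数取sigmoid，仅为前述多种可选激活函数之一）：

```python
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def diagonal_filter(A2, F, act=sigmoid):
    # 过滤操作：让n×n过滤矩阵F沿第二邻接矩阵A2的对角线每次向右下移动一格，
    # 每一步做对位相乘求和并经过激活函数，得到长度为|V|-n+1的特征向量
    n, V = F.shape[0], A2.shape[0]
    return np.array([act((A2[j:j + n, j:j + n] * F).sum())
                     for j in range(V - n + 1)])
```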
一个实施例具体实现本发明提供的一种在计算机环境中基于邻接矩阵的图分类系统,所述的图分类系统包含类别标注模块和如前所述任何一种形式的在计算机环境中基于邻接矩阵的图特征提取系统,所述的类别标注模块基于所述图特征提取系统生成的特征对图进行类别标注,输出图的类别;所述的图为图论中的图;
优选的,所述的类别标注模块计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;
优选的，所述的类别标注模块利用分类算法计算出图属于各个分类标签的可能性，并将可能性最高的分类标签标注为图的类别，完成图的分类；更优选的，所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
所述kNN算法是指如果一个样本在特征空间中的k个最相邻的样本中的大多数属于某一个类别,则该样本也属于这个类别,并具有这个类别上样本的特性。该方法在确定分类决策上只依据最邻近的一个或者几个样本的类别来决定待分样本所属的类别。所述线性分类算法是指,根据标签确定的数据在其空间中的分布,使用一条直线(或者平面,超平面)进行分割来对数据进行分类。所述标签指的是对类别进行描述的标识。
进一步地,所述的图分类系统还包含层叠CNN模块,所述的层叠CNN模块基于所述的图特征提取系统生成的特征进行处理,融合所述的特征对应的支持分类的子图结构,生成包含图中更大子图结构的特征,所述的更大子图结构是指顶点个数多于n的子图结构;
优选的,所述的层叠CNN模块包含卷积子模块和池化子模块;
所述的卷积子模块使用至少一个卷积层基于所述的图特征提取系统生成的特征进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果;第一个卷积层的输入为如前所述任何一种形式的图特征提取系统生成的特征,如果有多个卷积层,每一个卷积层的输入为前一个卷积层的输出结果,每一个卷积层的输出结果均为至少一个向量,每一个卷积层使用至少一个过滤矩阵进行卷积操作,最后一个卷积层的卷积结果输出至所述的池化子模块;
进一步地,所述卷积操作是指使用一个过滤矩阵在邻接矩阵上以某种规律进行平移,对位相乘再相加,将得到的值构成向量或矩阵的计算方法。
所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同;优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;
所述的池化子模块用于对所述卷积子模块得到的矩阵进行池化操作,得到至少一个向量作为池化结果输出至所述的类别标注模块,对图进行类别标注,输出图的类别,所述池化结果包含图中更大子图结构的特征;所述的更大子图结构是指顶点个数多于n的子图结构;优选的,所述的池化操作选自最大池化操作、平均池化操作。所述最大池化操作是指对邻域内特征点取最大值;所述平均池化操作是指对邻域内特征点的值求平均。
进一步地,所述的池化操作是在卷积操作的基础上,对每个卷积结果进行数学操作,进而缩小卷积结果的维数。所述的数学操作包括但不限于取平均值、取最大值。
优选的,所述层叠CNN模块的数据流图参见附图6。
所述的层叠CNN模块融合所述图的特征对应的支持分类的子图结构，通过一系列的卷积层，从所述特征生成模块得到的特征中提取更大、更深层、更复杂的特征，对应于图中更大、更深层、更复杂的子图结构。所述连接信息规整系统、特征生成模块和层叠CNN模块协同作用，实现了利用大小为n的小窗口捕获小的子图结构（顶点数为n），同时利用这些小的子图结构（顶点数为n）的组合来获取顶点数大于n的更大、更深层、更复杂的子图结构，即用小窗口（窗口大小为n）来提取图的更大（顶点数大于n）、更深层、更复杂的特征，由此利用较小的窗口捕获大型多顶点的子图结构，以及来自顶点和边的隐式相关结构的深层特征，提高了图分类的准确性和速度。
进一步地,所述的图分类系统还包含独立池化模块和卷积池化模块;所述的独立池化模块用于对所述的图特征提取系统生成的特征进行池化操作,得到至少一个向量作为第一池化结果输出至所述的类别标注模块;所述的卷积池化模块对输入的如前所述任何一种形式的图特征提取系统生成的特征进行卷积和池化处理,融合所述的特征对应的支持分类的子图结构,生成包含图中更大子图结构特征的第二池化结果,将其输出至所述的类别标注模块;所述的类别标注模块根据所述第一池化结果和第二池化结果对图进行类别标注,输出图的类别;所述的更大子图结构是指顶点个数多于n的子图结构;
优选的,所述的卷积池化模块包含卷积子模块和池化子模块;所述的卷积子模块使用至少一个过滤矩阵对输入进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果传递给池化子模块;所述的池化子模块对所述的卷积结果进行池化操作,得到至少一个向量作为第二池化结果,所述第二池化结果包含图中更大子图结构的特征,将所述的池化结果输出至所述的类别标注模块;
所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同;优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;优选的,所述的池化操作选自最大池化操作、平均池化操作。
优选的,所述包含独立池化模块和卷积池化模块的层叠CNN模块的数据流图参见附图7。
进一步地,所述的图分类系统还包含独立池化模块和多个卷积池化模块;所述的独立池化模块用于对所述的图特征提取系统生成的特征进行池化操作,得到至少一个向量作为第一池化结果输出至所述的类别标注模块;所述的卷积池化模块对输入的特征依次进行卷积操作和池化操作,所述的卷积操作融合所述的特征对应的支持分类的子图结构得到至少一个向量作为卷积结果,然后对所述的卷积结果进行池化操作,得到至少一个向量作为池化结果,所述池化结果中包含图中更大子图结构的特征;上一个卷积池化模块的卷积结果输出至下一个卷积池化模块,每一个卷积池化模块的池化结果均输出至所述的类别标注模块;所述的类别标注模块根据所述第一池化结果和全部卷积池化模块的池化结果对图进行类别标注,输出图的类别;
其中,第一个所述卷积池化模块的输入为如前所述任何一种形式的图特征提取系统生成的特征,其他卷积池化模块的输入为上一个卷积池化模块的卷积结果;最后一个卷积池化模块仅将池化结果输出至类别标注模块;所述的更大子图结构是指顶点个数多于n的子图结构;
优选的，所述的卷积池化模块包含卷积子模块和池化子模块；所述的卷积子模块使用至少一个过滤矩阵对输入进行卷积操作，融合所述的特征对应的支持分类的子图结构得到至少一个向量作为卷积结果，并将所述的卷积结果输出至下一个所述的卷积池化模块；所述的池化子模块对所述卷积子模块输出的卷积结果进行池化，得到至少一个向量作为池化结果输出至所述的类别标注模块，所述池化结果包含图中更大子图结构的特征；优选的，所述卷积子模块、池化子模块的数量可相同或不同；优选的，所述卷积子模块、池化子模块的数量为1个或多个；
所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同;优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;
优选的,所述卷积池化模块的数量小于或等于10个,更优选的,所述的图分类系统中所述卷积池化模块的数量小于或等于5个;更优选的,所述的图分类系统中所述卷积池化模块的数量小于或等于3个;
优选的,所述的池化操作选自最大池化操作、平均池化操作。
优选的,所述包含独立池化模块和多个卷积池化模块的层叠CNN模块的数据流图参见附图8。
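作为参考，下面给出"1个独立池化模块+多个卷积池化模块"这一结构的一个基于PyTorch的示意性草稿（并非本发明的限定性实现）：其中过滤矩阵数量、宽度等超参数均为假设值，隐含层单元、激活单元与标注单元简化为单个全连接层，池化操作取最大池化。

```python
import torch
import torch.nn as nn

class StackedGraphCNN(nn.Module):
    # 含1个独立池化模块和多个卷积池化模块的层叠CNN示意（对应附图8的数据流）
    def __init__(self, n0=50, n_classes=2, widths=(7, 7, 7), channels=(50, 50, 50)):
        super().__init__()
        convs, prev = [], n0
        for s, c in zip(widths, channels):   # 每个卷积池化模块对应一个一维卷积层
            convs.append(nn.Conv1d(prev, c, kernel_size=s, padding=(s - 1) // 2))
            prev = c
        self.convs = nn.ModuleList(convs)
        self.fc = nn.Linear(n0 + sum(channels), n_classes)  # 隐含层/标注单元简化为一个全连接层

    def forward(self, P0):                   # P0: (batch, n0, |V|-n+1)，特征生成模块的输出
        pooled = [P0.max(dim=2).values]      # 独立池化：对每个特征向量做最大值池化
        x = P0
        for conv in self.convs:              # 上一级卷积结果下传，各级池化结果汇总到类别标注
            x = torch.sigmoid(conv(x))
            pooled.append(x.max(dim=2).values)
        return self.fc(torch.cat(pooled, dim=1))  # 输出各分类标签的得分
```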
进一步地,所述的卷积结果对应的向量的元素值代表子图结构在图上各个位置出现的可能性,所述池化结果、第一池化结果、第二池化结果对应的向量的元素值代表子图结构在图中出现的最大可能性或平均可能性。
进一步地,所述的类别标注模块包括隐含层单元、激活单元、标注单元;
所述的隐含层单元对接收到的向量进行处理,得到至少一个混合向量传递至所述的激活单元,所述的混合向量包含所述隐含层单元接收到的所有向量的信息;优选的,所述的隐含层单元对接收到的向量进行的处理是指对输入的向量进行合并拼接成一个组合向量,并使用至少一个权重向量对所述的组合向量进行线性加权操作得到至少一个混合向量;优选的,所述隐含层是指除输入层和输出层以外的其他各层,隐含层不直接接受外界的信号,也不直接向外界发送信号。
所述的激活单元对所述隐含层单元输出的每一个混合向量使用激活函数计算得到一个值,并将所有得到的值组成一个向量输出到所述的标注单元;优选的,所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数;
所述的标注单元用于根据激活单元的结果计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;优选的,所述标注单元基于分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;更优选的,所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
本发明的第四个目的是提供一种在计算机环境中的连接信息规整方法,所述的方法包括如下步骤:
(1)初始输入:将图转化为第一邻接矩阵;
(2)连接信息规整:对第一邻接矩阵中的全部顶点进行重新排序,得到第二邻接矩阵,所述第二邻接矩阵中的连接信息元素集中分布在所述第二邻接矩阵的宽度为n的对角线区域,其中n为正整数,n≥2且n<|V|,所述的|V|为第二邻接矩阵的行数或列数;
所述第二邻接矩阵的对角线区域由以下元素组成：正整数i从1遍历至|V|，当i>max(n,|V|-n)时，选取第i行中第(i-n+1)到|V|列的元素；当i≤n时，选取第i行中第0至i+n-1列的元素；当max(n,|V|-n)≥i≥min(|V|-n,n)时，选取第i行中第(i-n+1)到第(i+n-1)列的元素。
所述的连接信息元素是图的边在邻接矩阵中对应的元素;
所述的图为图论中的图;
优选的,如果所述的图的边上没有权重,则所述的连接信息元素的值为1,非连接信息元素的值为0;更优选的,如果所述的图的边上带有权重,则所述的连接信息元素的值为边的权重值,非连接信息元素的值为0;
优选的,所述对角线区域指矩阵中从左上角至右下角的对角线区域;
优选的,所述第二邻接矩阵的对角线区域是指使用一个尺寸为n×n的扫描矩形框沿所述第二邻接矩阵的对角线扫描一遍所经过的区域;
更优选的,所述的扫描过程如下:首先,将所述扫描矩形框的左上角与第二邻接矩阵的左上角重合;然后每次将所述扫描矩形框往右方和下方各移动一个元素格,直至所述扫描矩形框的右下角与所述第二邻接矩阵的右下角重合。
优选的,所述重新排序的方法为整数优化算法。
进一步地,所述重新排序的方法为贪心算法,包括以下步骤:
(1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
(2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
(3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出待处理邻接矩阵得到所述的第二邻接矩阵,所述的贪心算法结束;否则,从尚未处理过的顶点交换对中任意选择一个顶点交换对作为当前顶点交换对,同时交换其对应的两个顶点在待处理邻接矩阵中对应的两行及对应的两列,生成新邻接矩阵,并跳转至步骤(4);
(4)交换效果评定:计算新邻接矩阵中连接信息元素的集中程度,若所述新邻接矩阵中连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度,则用所述新邻接矩阵替代所述的待处理邻接矩阵,并跳转至步骤(2);若所述新邻接矩阵中连接信息元素的集中程度低于或等于所述待处理邻接矩阵中连接信息元素的集中程度,则放弃这种交换,并标记所述的当前顶点交换对为已处理状态,跳转至步骤(3)。
进一步地,所述重新排序的方法为分支定界算法,包括以下步骤:
(1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
(2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
(3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出所述的待处理邻接矩阵得到所述第二邻接矩阵,所述的分支定界算法结束;否则,对所有可能的顶点交换对中的每一个未处理过的顶点交换对分别执行交换操作,并跳转至步骤(4),所述的交换操作是指同时交换所述顶点交换对对应的两个顶点在所述待处理邻接矩阵中对应的两行及对应两列,对每一个所述的顶点交换对执行所述交换操作都会生成一个新邻接矩阵;
(4)交换效果评定:计算每一个所述的新邻接矩阵中连接信息元素的集中程度,若存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则选择集中程度最高的新邻接矩阵代替所述的待处理矩阵,并标记生成该集中程度最高的新邻接矩阵的顶点交换对为已处理状态,然后跳转至步骤(3);若不存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则输出当前待处理邻接矩阵得到所述的第二邻接矩阵,所述的分支定界算法结束。
进一步地,所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域内的连接信息元素的数量和/或非连接信息元素的数量。
进一步地,所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域外的连接信息元素的数量和/或非连接信息元素的数量。
进一步地,所述的集中程度可以利用Loss值来衡量,Loss值越小,集中程度越高,所述的Loss值的计算方法如下:
$$LS(A,n)=\sum_{|i-j|\ge n}A_{i,j}$$
式中，LS(A,n)代表损失Loss值，A代表所述的第二邻接矩阵，n代表所述第二邻接矩阵中对角线区域的宽度，A_{i,j}表示所述第二邻接矩阵中第i行第j列的元素。
进一步地,所述的集中程度还可以利用ZR值来衡量,ZR值越小,集中程度越高,所述ZR值的计算方法如下:
$$TC(A,n)=\sum_{|i-j|<n}C_{i,j}$$
$$T1(A,n)=\sum_{|i-j|<n}A_{i,j}$$
$$ZR(A,n)=\frac{TC(A,n)-T1(A,n)}{TC(A,n)}$$
式中，A代表第二邻接矩阵，C表示所有元素均为连接信息元素且尺寸大小与A相同的矩阵，A_{i,j}表示A中第i行第j列的元素，C_{i,j}表示C中第i行第j列的元素，TC(A,n)、TC表示宽度为n的对角线区域中元素的总个数，T1(A,n)、T1表示宽度为n的对角线区域中连接信息元素的个数，ZR(A,n)代表ZR值，该值表示宽度为n的对角线区域中非连接信息元素的占比。
一个实施例具体实现本发明提供的一种在计算机环境中基于邻接矩阵的图特征提取方法,所述的方法基于图的邻接矩阵抽取出图的特征,所述的特征直接对应支持分类的子图结构,所述的特征以至少一个向量的形式呈现,每一个向量对应一种混合态在图中的分布情况,所述的方法包括以下步骤:
(1)连接信息规整:基于图的第一邻接矩阵,采用如前所述的任何一种连接信息规整方法得到第二邻接矩阵;
(2)对角过滤:基于步骤(1)得到的第二邻接矩阵,生成图的特征,所述的特征直接对应支持分类的子图结构,每一个向量对应一种混合态在图中的分布情况;
所述的图、子图均为图论中的图;
优选的,所述的步骤(2)利用过滤矩阵生成图的特征,所述的过滤矩阵为正方形矩阵;更优选的,所述的步骤(2)利用至少一个过滤矩阵,沿所述第二邻接矩阵的对角线区域进行过滤操作,得到至少一个向量,所述的至少一个向量对应于所述的图的特征,所述的特征直接对应支持分类的子图结构,每一个向量对应一种混合态在图中的分布情况;
优选的所述的步骤(2)利用不同的过滤矩阵,进行所述的过滤操作;
优选的,所述的分布情况是指图中出现该混合态中的子图结构的可能性;优选的,每一种所述的混合态代表任意多个子图结构对应的邻接矩阵的线性加权;更优选的,所述的线性加权是指每一个子图的邻接矩阵乘以该邻接矩阵对应的权值,然后对位相加到一起,得到一个与子图的邻接矩阵相同大小的矩阵;
优选的,所述的过滤操作是利用所述的过滤矩阵对所述第二邻接矩阵对位的矩阵内积的加和,通过激活函数得到一个值,让过滤矩阵沿所述第二邻接矩阵的对角线方向移动,从而得到一组值,形成一个向量,该向量对应一种子图结构在图中的分布情况;更优选的,所述的激活函数为sigmoid函 数、ReLU激活函数、pReLU函数;
优选的，所述的过滤矩阵中每一个元素的初始值分别为从高斯分布中取出的随机变量的值；
优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;
优选的,所述的步骤(2)参与机器学习过程,所述机器学习过程用于调整所述过滤矩阵的元素的值;
优选的,所述的机器学习过程是利用反向传播,利用分类的损失值,计算梯度值,进一步调节过滤矩阵中的各个元素的值;更优选的,所述的特征生成模块可以利用不同的过滤矩阵,进行上述的过滤操作;
优选的,所述的连接信息的值为1,非连接信息的值为0;更优选的,如果所述的图的边上带有权重,则所述的连接信息的值为边的权重值,非连接信息的值为0。
优选的,所述第二邻接矩阵的对角线区域是指使用一个尺寸为n×n的扫描矩形框沿所述第二邻接矩阵的对角线扫描一遍所经过的区域;
进一步地,所述过滤矩阵的尺寸为n×n。
一个实施例具体实现本发明提供的一种在计算机环境中基于邻接矩阵的图分类方法,所述的图分类方法包括如下步骤:
(1)特征提取:利用如前所述任何一种形式的基于邻接矩阵的图特征提取方法提取图的特征;
(2)类别标注:基于步骤(1)提取的特征对图进行类别标注,输出图的类别;所述的图为图论中的图;优选的,所述的步骤(2)计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;优选的,所述的步骤(2)利用分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;更优选的,所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
一个实施例具体实现本发明提供的一种在计算机环境中基于层叠CNN的图分类方法,所述的图分类方法包括如下步骤:
(1)图特征提取:利用如前所述任何一种形式的基于邻接矩阵的图特征提取方法提取图的特征;
(2)卷积操作:利用至少一个卷积层对步骤(1)提取的图的特征进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果;第一个卷积层的输入为步骤(1)提取的图的特征,如果有多个卷积层,每一个卷积层的输入为前一个卷积层的输出结果,每一个卷积层的输出结果均为至少一个向量,每一个卷积层使用至少一个过滤矩阵进行卷积操作,最后一个卷积层的卷积结果输出至步骤(3);所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的卷积结果中向量的数量相同;优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;
(3)池化操作:对步骤(2)中卷积操作的结果进行池化操作,得到至少一个向量作为池化结果传递至步骤(4),所述池化结果中包含图中更大子图结构的特征,所述的更大子图结构是指顶点个数多于n的子图结构;优选的,所述的池化操作选自最大池化操作、平均池化操作;
(4)类别标注:根据步骤(3)得到池化结果,对图进行类别标注,输出图的类别。
一个实施例具体实现本发明提供的另一种在计算机环境中基于层叠CNN的图分类方法,所述的图分类方法包括以下步骤:
(1)图特征提取:利用如前所述任何一种形式的基于邻接矩阵的图特征提取方法提取图的特征,并传递至步骤(2)和步骤(3);
(2)独立池化操作:对步骤(1)提取的图的特征进行池化操作,得到至少一个向量作为第一池化结果输出至步骤(4);
(3)卷积池化操作:使用至少一个过滤矩阵对步骤(1)提取的图的特征进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果,然后,对所述的卷积结果进行池化操作,得到至少一个向量作为第二池化结果传递至步骤(4),所述第二池化结果中包含图中更大子图结构的特征;所述的更大子图结构是指顶点个数多于n的子图结构;所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同;优选的,所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数;更优选的,所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;优选的,所述的池化操作选自最大池化操作、平均池化操作;
(4)类别标注:根据所述的第一池化结果和第二池化结果,对图进行类别标注,输出图的类别。
一个实施例具体实现本发明提供的另一种在计算机环境中基于层叠CNN的图分类方法,所述的图分类方法包括以下步骤:
(1)图特征提取:利用如前所述任何一种形式的基于邻接矩阵的图特征提取方法提取图的特征,并传递至步骤(2);
(2)独立池化操作:对步骤(1)提取的图的特征进行池化操作,得到至少一个向量作为第一池化结果输出至步骤(3);
(3)卷积池化操作：使用至少一个过滤矩阵对输入进行卷积操作，融合所述的特征对应的支持分类的子图结构得到至少一个向量作为卷积结果，然后，对所述的卷积结果进行池化操作，得到至少一个向量作为池化结果，所述池化结果包含图中更大子图结构的特征，上一级的卷积结果传递至下一级的卷积池化操作，每一级卷积池化操作的池化结果均输出至步骤(4)；其中，第一级卷积池化操作的输入为步骤(1)提取的图的特征，如果有多级卷积池化操作，每一级卷积池化操作的输入为前一级的卷积池化操作的输出结果，最后一级卷积池化操作仅将池化结果输出至步骤(4)；所述的更大子图结构是指顶点个数多于n的子图结构；所述的过滤矩阵为正方形矩阵；每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同；优选的，所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数；更优选的，所述的过滤矩阵中的元素为大于等于0、小于等于1的实数；优选的，所述的池化操作选自最大池化操作、平均池化操作；
(4)类别标注:根据所述的第一池化结果和步骤(3)的全部池化结果,对图进行类别标注,输出图的类别。
进一步地,所述的卷积结果对应的向量的元素值代表子图结构在图上各个位置出现的可能性,所述池化结果、第一池化结果、第二池化结果对应的向量的元素值代表子图结构在图中出现的最大可能性或平均可能性。
进一步地,所述的类别标注包括以下步骤:
(1)特征合并:使用隐含层对接收到的向量进行处理,得到至少一个混合向量传递至步骤(2);所述的混合向量包含所述隐含层接收到的所有向量的信息;优选的,所述的处理对输入的向量进行合并拼接成一个组合向量,并使用至少一个权重向量对所述的组合向量进行线性加权操作得到至少一个混合向量;
(2)特征激活:对接收到的每一个混合向量使用激活函数计算得到一个值,并将所有得到的值组成一个向量传递至步骤(3),优选的,所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数;
(3)类型标注:利用接收到的向量计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;优选的,所述标注单元基于分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类;更优选的,所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
一个实施例具体实现本发明提供的一种图分类系统,所述图的顶点为任意实体,所述图的边为任意实体之间的关系;
优选的,所述的任意实体是任意的独立个体或个体集合,所述的个体是实际存在或虚拟的;优选的,所述的实体可以是任意人、事、事件、物、概念中的一种或多种的组合;更优选的,所述的任意实体选自化合物或单质中的原子,网络中的人、商品、事件的任意一种或任意多种;
优选的,所述的关系为任意实体之间的任意关联性;更优选的,所述关联性是连接原子的化学键、商品之间的联系、人与人之间的关系;更优选的,所述商品之间的联系包括购买商品的因果关系、关联关系;更优选的,所述人与人之间的关系包括实际的血缘关系、虚拟社交网络中的好友关系或关注关系、交易关系、发送消息关系。
一个实施例具体实现本发明提供的一种网络结构类型判别系统,所述的分类系统基于如前所述任何一种形式的图分类系统实现网络结构分类,所述图的顶点为网络中的节点,所述图的边为网络中节点的关系;优选的,所述网络选自电子网络、社交网络、物流网络;更优选的,所述电子网络选自局域网、城域网、广域网、互联网、4G、CDMA、Wi-Fi、GSM、WiMax、802.11、红外、EV-DO、蓝牙、GPS卫星、和/或任意其他适当有线/无线技术或协议的网络的至少一部分中无线发送至少一些信息的任意通信方案;优选的,所述节点选自地理位置、移动站、移动设备、用户装备、移动用户、网络用户;更优选的,所述节点的关系选自电子网络节点之间的信息传输关系、地理位置之间运输关系、人与人之间实际的血缘关系、虚拟社交网络中的好友关系或关注关系、交易关系、发送消息关系;优选的,所述分类选自网络的结构类型;所述结构类型选自星型、树形、全连接型、环形。
一个实施例具体实现本发明提供的一种化合物分类系统,所述的分类系统基于如前所述任何一种形式的图分类系统实现化合物分类,所述图的顶点为化合物的原子,所述图的边为原子之间的化学键;优选的,所述的分类选自化合物的活性、诱变性、致癌性、催化性等。
一个实施例具体实现本发明提供的一种社交网络分类系统,所述的分类系统基于如前所述任何一种形式的图分类系统实现社交网络分类,所述图的顶点为社交网络中的实体,所述图的边为实体之间的关系,所述的实体包括但不限于社交网络中的人、机构、事件、地理位置,所述的关系包括但不限于好友关系、关注关系、私信、点名、关联。所述的点名是指提及某个人,可以用@的方式。
一个实施例具体实现本发明提供的一种计算机系统,所述的计算机系统包括如前所述的任何一种形式的所述的图特征提取系统、所述的图分类系统、所述的网络结构类型判别系统、所述的化合物分类系统、所述的社交网络分类系统中的任意一种或任意多种。
此外,一个实施例以一个6顶点图为例,对本发明的在计算机环境中基于邻接矩阵的连接信息规整系统及基于邻接矩阵的图特征提取系统进行详细描述。对于这个6顶点图,将其各个顶点用a,b,c,d,e,f表示,按照字母顺序,六条边分别是(a,b),(a,c),(b,e),(b,f),(e,f)和(e,d),其图结构以及其根据此顶点排序的第一邻接矩阵如图9所示。
所述的连接信息规整系统对第一邻接矩阵中的全部顶点进行重新排序，得到第二邻接矩阵，所述第二邻接矩阵中的连接信息元素集中分布在所述第二邻接矩阵的宽度为n的对角线区域，其中n为正整数，n>=2且n远小于|V|，所述的|V|为第二邻接矩阵的行数（或列数）。所述第二邻接矩阵的所述的宽度为n的对角线区域由以下元素组成：正整数i从1遍历至|V|，当n<i<|V|-n时，选取第i行中第(i-n+1)到(i+n-1)列的元素；当i<=n时，选取第i行中第0至i+n-1列的元素；当i>=|V|-n时，选取第i行中第(i-n+1)到第|V|列的元素。
所述的顶点重新排序方法可以为贪心算法,包括以下步骤:
(1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵A;
(2)交换对统计：计算A中所有可能的顶点交换对；对A中列号进行1~6标号，所有可能的顶点交换对（即列号对）为pairs={(m,h)|1<=m<=5,m+1<=h<=6}；特殊的，每次待处理矩阵更新之后，会对待处理矩阵中列号重新标号，然后所有可能的顶点交换对重新初始化为15对；初始化i=1，j=2；
(3)行列交换：判断i是否大于5，若是，则输出A得到所述的第二邻接矩阵，所述的贪心算法结束；否则，从pairs中选择(i,j)作为当前顶点交换对，执行swap(i,j)操作，生成新邻接矩阵，并跳转至步骤(4)；
(4)交换效果评定：计算新邻接矩阵中连接信息的集中程度，若所述新邻接矩阵中连接信息的集中程度高于A中连接信息的集中程度，则执行refresh(A)操作（用所述新邻接矩阵替代A），并跳转至步骤(2)；若所述新邻接矩阵中连接信息的集中程度低于或等于所述待处理邻接矩阵中连接信息的集中程度，则放弃这种交换，执行j=j+1；若j大于6，则执行i=i+1和j=i+1操作，跳转至步骤(3)；若j小于或等于6，直接跳转至步骤(3)。
具体流程图如图10所示,其中swap(A,i,j)表示同时交换邻接矩阵A中的i,j对应的行和列,得到新邻接矩阵;refresh(A)表示邻接矩阵应用这种行列交换。
所述的连接信息集中程度利用Loss值以及ZR值来衡量,其计算方法如下方公式所示。例如图13(a)中,损失Loss(A,3)=0,ZR(A,3)=12/24=0.5;图13(b)中,Loss(A,3)=2,ZR(A,3)=10/24=5/12。Loss值或者ZR值越小,表示连接信息集中程度越高。
$$LS(A,n)=\sum_{|i-j|\ge n}A_{i,j}$$
$$TC(A,n)=\sum_{|i-j|<n}C_{i,j}$$
$$T1(A,n)=\sum_{|i-j|<n}A_{i,j}$$
$$ZR(A,n)=\frac{TC(A,n)-T1(A,n)}{TC(A,n)}$$
以图9中提到的图为例，选择n=3，交换第一邻接矩阵中的对应的两行和两列，如图11所示。图11中(a)为输入的第一邻接矩阵，其损失Loss(A,3)=4，ZR(A,3)=16/24=2/3。图11(b)为交换a、d对应的行列之后所得到新邻接矩阵A'，其损失Loss(A',3)=6，ZR(A',3)=18/24=9/12，Loss(A',3)>Loss(A,3)，ZR(A',3)>ZR(A,3)，即连接信息元素集中程度降低，故放弃这种交换；图11(c)为交换b、c对应的行列之后得到的新邻接矩阵A''，其损失Loss(A'',3)=2，ZR(A'',3)=14/24=7/12，Loss(A'',3)<Loss(A,3)，ZR(A'',3)<ZR(A,3)，经这样的交换后集中程度变高了，故采用这种交换，用A''替代A。经过不断尝试之后，可以得到最优结果，如图12右边的邻接矩阵，最优结果即为第二邻接矩阵。此时第二邻接矩阵的顶点顺序变为c,a,b,f,e,d，所有的连接信息元素（值为"1"的元素）均落在了第二邻接矩阵中宽度为n（n=3）的对角线区域中。
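利用前文示例中的loss与zr函数，可以按如下方式复现本段的计算过程（顶点到矩阵下标的编码方式为本示例自行设定）：

```python
# 图9中的6顶点图，顶点顺序a,b,c,d,e,f，即图11(a)的第一邻接矩阵
A = np.zeros((6, 6), dtype=int)
for p, q in [(0, 1), (0, 2), (1, 4), (1, 5), (4, 5), (3, 4)]:  # (a,b),(a,c),(b,e),(b,f),(e,f),(d,e)
    A[p, q] = A[q, p] = 1
print(loss(A, 3), zr(A, 3))    # 输出 4 与 16/24≈0.667，与Loss(A,3)=4、ZR(A,3)=16/24一致

B = A.copy()                    # 交换b、c对应的行和列，得到图11(c)中的A''
B[[1, 2], :] = B[[2, 1], :]
B[:, [1, 2]] = B[:, [2, 1]]
print(loss(B, 3), zr(B, 3))    # 输出 2 与 14/24≈0.583，集中程度提高，故采用该交换
```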
连接信息规整系统的一个重要作用是：给定一个第一邻接矩阵，可能存在不止一种方式对图顶点重新排序，且排序后连接信息的集中程度均达到最高（即Loss值均为最低），因此存在多于一个的第二邻接矩阵，这些第二邻接矩阵之间是同构的。如图13(a)中所示，两个邻接矩阵均是通过连接信息规整系统得到的第二邻接矩阵，连接信息均在邻接矩阵中宽度为n（n=3）的对角线区域中，但是两者的顶点排序顺序并不相同，故可能存在多个第二邻接矩阵。在本发明中，利用这个同构的特性来生成图的不同矩阵表示，这些同构的第二邻接矩阵被用来增加图分类系统深度学习过程中预处理阶段的训练集。
将第二邻接矩阵输入到特征生成模块计算得到至少一个向量，这些向量直接对应支持分类的子图结构。特征生成模块中使用n_0>=1个大小为n×n的过滤矩阵，沿第二邻接矩阵的对角线区域移动，进行卷积运算，如图14所示。这些过滤矩阵使用F_{0,i}表示，i∈{1,…,n_0}。那么过滤矩阵F_{0,i}在第j步提取的对角线特征可以表示为：
$$P_{0,i,j}=\alpha\Big(\sum_{x=1}^{n}\sum_{y=1}^{n}F_{0,i}(x,y)\,A_{j+x-1,\;j+y-1}\Big)$$
其中α(·)是激活函数，例如sigmoid。因此，从对角卷积获得的特征大小是n_0×(|V|-n+1)。在之后的说明中，使用P_0表示特征生成模块得到的特征{P_{0,i,j}}，并使用F_0表示滤波参数{F_{0,i}}。
同样以图9中提到的图为例,使用n 0=2个大小为3×3的过滤矩阵沿其第二邻接矩阵对角线方向移动计算,如图15所示。图15(a)中为图以及其第二邻接矩阵,图15(b)为使用的两个过滤矩阵,为了方便起见,这里将过滤矩阵中的值均取为0或1,两个过滤矩阵对应的子图结构如图15(c)所示。使用(b)上方的过滤矩阵沿第二邻接矩阵对角线方向移动计算,所述的计算即对位相乘再相加,故能得到一个向量(4,4,6,4);同样地,使用(b)中下方的过滤矩阵沿第二邻接矩阵对角线方向移动计算,可以得到另一个向量(4,4,4,4)。即经过两个过滤矩阵过滤操作之后,可以得到两个向量,如图15(d)所示,经过激活函数(Sigmoid)可以得到向量如图15(e)所示。其中图15(d)、图15(e)向量中的值越高,表示所使用的过滤矩阵所代表的子图结构在向量中该值对应的区域出现的可能性越大。比如在图15(e)中0.99所对应的区域为图15(a)中虚线所框出的区域,即b,e,f三个顶点所表示的子图结构,其子图结构与使用的过滤矩阵所表示的结构(图15(c)上方结构)完全相同。
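上述过滤过程可以用如下Python片段复现（沿用前文的diagonal_filter函数；其中两个过滤矩阵的具体取值是根据图15(c)的文字描述推测的假设值：上方对应三顶点全连接（三角形）子图，下方对应三顶点链式子图）：

```python
# 第二邻接矩阵：顶点顺序c,a,b,f,e,d（见图12右图）
A2 = np.zeros((6, 6), dtype=int)
for p, q in [(0, 1), (1, 2), (2, 3), (2, 4), (3, 4), (4, 5)]:
    A2[p, q] = A2[q, p] = 1

F_tri  = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])  # 三顶点全连接（三角形）子图的邻接矩阵
F_path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # 三顶点链式子图的邻接矩阵

print(diagonal_filter(A2, F_tri,  act=lambda x: x))   # [4. 4. 6. 4.]，对应图15(d)上行
print(diagonal_filter(A2, F_path, act=lambda x: x))   # [4. 4. 4. 4.]，对应图15(d)下行
print(diagonal_filter(A2, F_tri))                     # 经sigmoid后最大值约0.998，即图15(e)中b,e,f处约0.99的峰值
```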
连接信息规整系统的主要优点是将连接信息集中到第二邻接矩阵的对角线区域上，因为不包含连接信息的元素对于图的分类没有显著贡献，这使得系统的计算量大大减少。具体来说，没有经过连接信息规整系统时，特征生成模块中使用大小为n×n的过滤矩阵提取特征，每个过滤矩阵需要进行(|V|-n+1)²次运算；而经过连接信息规整系统之后，在使用大小为n×n的过滤矩阵提取特征时，每个过滤矩阵仅需要进行(|V|-n+1)次运算。以图14为例，取n=3，经过连接信息规整系统后每个过滤矩阵需要进行的运算次数从(6-3+1)²=16次减少到6-3+1=4次，计算量仅为原来的25%。可见，带有连接信息规整系统的图特征提取系统比不带有连接信息规整系统的图特征提取系统计算量大大减小，前者计算量仅为后者的25%。
此外，再提供一个实施例详细说明本发明所述的在计算机环境中基于邻接矩阵的图分类系统的一种具体实现，并使用公开数据集验证了这种实现的效果。
对于具有不规则大小的图的数据集，需要为其找到一个合适的窗口大小n。当n设置得太小时，可能导致大部分图经过连接信息规整系统后丢失连接信息元素；此外，n太小可能导致特征生成模块过拟合，因为捕获到的可能的子图结构特征较少。首先，我们对所有图的邻接矩阵的尺寸进行统一，选取图数据集中顶点最多的图的顶点数|V|_max作为统一的邻接矩阵的大小（行数或列数）。对于顶点数小于|V|_max的图，例如3个顶点的图，我们采用补零操作（追加0），使其邻接矩阵的行数和列数等于|V|_max，在统一邻接矩阵尺寸的同时也保证了原始输入图中现有的连接信息得到维护，即追加的0不会破坏或更改图中原有的点和边。所述的补零操作如图16所示，图16(a)为3个顶点的图的图结构以及其邻接矩阵，对其进行补零使其邻接矩阵的大小变为5，如图16(b)所示。
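补零操作的一个最小化示意如下（pad_to为本示例自拟的函数名，沿用前文的numpy导入）：

```python
def pad_to(A, V_max):
    # 补零操作：在邻接矩阵右侧与下方追加0，使其行列数统一为V_max，不改变原有的点和边
    V = A.shape[0]
    return np.pad(A, ((0, V_max - V), (0, V_max - V)))
```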
在选择n时,首先从图数据集中随机选取少量图,然后使用不同窗口大小n的连接信息规整系统对选取的图进行处理,比较最终的第二邻接矩阵的Loss指标。对随机选取的这一组图,选择使得这组图的第二邻接矩阵的平均Loss值最小的窗口大小n作为这个图数据集的窗口大小。
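上述窗口大小的选取过程可以写成如下示意性草稿（choose_window为本示例自拟的函数名，沿用前文的loss与greedy_reorder函数；抽样数量20为假设值）：

```python
import random

def choose_window(adj_list, candidates, sample_size=20):
    # 从图数据集中随机抽取少量图，对每个候选窗口大小n做连接信息规整，
    # 选取使这组图的第二邻接矩阵平均Loss值最小的n作为该数据集的窗口大小
    sampled = random.sample(adj_list, min(sample_size, len(adj_list)))
    avg_loss = lambda n: np.mean([loss(greedy_reorder(A, n), n) for A in sampled])
    return min(candidates, key=avg_loss)
```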
对于每张图，将其邻接矩阵进行补零操作得到第一邻接矩阵之后，采用如图30所示的处理流程对第一邻接矩阵进行处理：首先采用实施例1中的贪心算法对图的邻接矩阵进行连接信息规整和特征生成操作，在特征生成操作中，选取n_{f0}个过滤矩阵进行过滤操作，以前述实施例1中的方式进行图的特征提取，输入到层叠CNN模块。在层叠CNN模块中经过第一个卷积子模块得到第一个卷积结果P_1，所述的卷积结果对应的向量的元素值代表子图结构在图上各个位置出现的可能性，然后通过反复添加更多的卷积子模块，可以得到更多的卷积结果P_2,P_3,…,P_m，越深层的卷积子模块得到的卷积结果代表的子图越大、越复杂。表1中介绍了每个卷积子模块中过滤矩阵的大小和数量以及生成的特征的大小，其中对角卷积代表特征生成模块，卷积层m为第m个卷积子模块。需要注意的是，层叠CNN中的每个卷积子模块，需要将过滤矩阵的高度（即过滤矩阵的行数）设置为上一个卷积子模块中的过滤矩阵数量（即上一个卷积子模块输出的卷积结果中向量的数量）。例如，对于卷积子模块2，过滤矩阵大小为n_1×s_2，这意味着过滤矩阵高度与卷积子模块1中的过滤矩阵的数量（n_1）相同。
正式地，对于第i个卷积子模块，将第(i-1)个卷积子模块的卷积结果P_{i-1}作为输入，其大小为n_{i-1}×(|V|-n+1)。在其左右均使用(s_i-1)/2个零填充，获得大小为n_{i-1}×(|V|-n+s_i)的特征$\tilde P_{i-1}$。之后使用n_i个大小为(n_{i-1}×s_i)的过滤矩阵F_i进行卷积运算，获得卷积结果P_i。定义P_i中的元素如下：
$$P_i(j,k)=\alpha\Big(\sum_{x=1}^{n_{i-1}}\sum_{y=1}^{s_i}F_{i,j}(x,y)\,\tilde P_{i-1}(x,\;k+y-1)\Big)$$
式中，α(·)表示激活函数，如sigmoid；j、k表示元素在P_i中的位置，即第j行、第k列；s_i表示第i个卷积层中过滤矩阵的宽度，n_i表示第i个卷积层中过滤矩阵的个数。
表1 图分类系统各层设置及特征大小
在经过m个卷积子模块之后，可以得到深层的卷积结果P_0,…,P_m。使用池化子模块对各卷积结果进行池化操作，这里选取最大值池化操作，在每组卷积结果P_i后添加最大值池化层。对于大小为n_i×(|V|-n+1)的矩阵P_i，对其每一行进行最大值池化操作，得到一个大小为n_i×1的向量。
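按行的最大值池化可以用如下片段示意（max_pool_rows为本示例自拟的函数名）：

```python
def max_pool_rows(P):
    # 最大值池化：对卷积结果P的每一行取最大值，得到列向量（每行对应一个过滤矩阵捕获的特征）
    return P.max(axis=1, keepdims=True)
```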
附图17表示层叠CNN中卷积子模块和池化子模块之间的关系，其中箭头表示模块之间数据的传递方向。隐含层单元为一个全连接层，全连接层中的神经元与上一层的所有激活值有完全的连接。该层中设置了权重参数W_h、偏差参数b_h来对输入的向量进行计算得到激活值，另外设置了dropout来防止神经网络过拟合。所述的dropout是指在深度学习网络的训练过程中，对于神经网络单元，按照一定的概率将其暂时从网络中丢弃，dropout能够有效防止过拟合。
在标注单元，将激活单元计算得到的激活值作为输入，通过包含权重参数W_s、偏差参数b_s的另一个全连接层进行多项Logistic回归（即softmax函数），计算在类标签向量x上的概率分布，将输出结果中最高概率值对应的类标签标注为该图的类别。
系统中神经网络的训练是通过最小化交叉熵(cross-entropy)损失实现的,其公式为:
$$\mathcal{L}=-\sum_{i=1}^{|R|}\log p\big(y_i\mid A_i\big)$$
式中，|R|是训练图集R中图的总数，A_i表示R中第i个图的邻接矩阵，y_i表示第i个图的类标签，p(y_i|A_i)表示系统预测第i个图属于类别y_i的概率。神经网络中参数用随机梯度下降（SGD）优化，采用反向传播算法计算梯度。
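该训练过程的一个示意性PyTorch草稿如下（沿用前文StackedGraphCNN示例中的torch与nn导入；train、loader等名称与学习率、迭代次数等超参数均为本示例的假设值）：

```python
import torch.optim as optim

def train(model, loader, epochs=30, lr=0.01):
    # 以交叉熵为损失、SGD为优化器的训练循环示意；
    # loader按批产出（特征生成模块的输出P0, 类标签y）
    opt = optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()        # softmax多项Logistic回归对应的交叉熵损失
    for _ in range(epochs):
        for P0, y in loader:
            opt.zero_grad()
            out = model(P0)                  # 正向传播
            criterion(out, y).backward()     # 反向传播计算梯度
            opt.step()                       # 随机梯度下降更新过滤矩阵等参数
```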
为了评估本发明的效果，使用五个公开图数据集进行测试。其中包括三个生物信息学数据集：MUTAG、PTC和PROTEINS。MUTAG是一个具有188种硝基化合物的数据集，其中类别表明化合物对细菌是否具有诱变作用。PTC是包含344种化合物的数据集，类别为化合物对雄性和雌性老鼠的致癌性。PROTEINS是图的集合，其中图顶点是二级结构元素，图的边表示氨基酸序列中或3D空间中的邻域。另外两个为社交网络数据集IMDB-BINARY和IMDB-MULTI。IMDB-BINARY是一个电影合作数据集，收集了IMDB上不同电影的演员/女演员和流派信息。对于每个图，顶点表示演员/女演员，并且如果它们出现在相同的电影中，它们之间就存在一条边连接它们。每个演员/女演员生成一个协作网络和一个自我网络，自我网络标有它所属的类型。IMDB-MULTI是多分类版本，因为电影可以同时属于多种类型；IMDB-BINARY是二分类版本（只有两种类别）。
基于上述数据集，使用本发明的基于层叠CNN的图分类系统的两种不同实现进行验证：第一种实现采用1个独立池化模块和1个卷积池化模块；第二种图分类系统采用1个独立池化模块和4个卷积子模块。将本发明中的参数n设置为从3到17。另外，每个卷积层使用的过滤矩阵大小s_i从{3,5,7,9,11}中调整，每个卷积层中过滤矩阵的数量从{20,30,40,50,60,70,80}中调整。收敛条件设定为训练阶段的准确性与前一次迭代的准确性差异小于0.3%，或超过30次迭代次数。每个实施例中根据3:7的比例，随机抽取测试集和训练集。
给定有N个图的测试集合，每个图G_i及其分类标签y_i和分类器的预测类别$\hat y_i$，准确性（Accuracy）计算公式如下：
$$\mathrm{Accuracy}=\frac{1}{N}\sum_{i=1}^{N}\delta\big(y_i=\hat y_i\big)$$
其中指标函数δ(·)如果条件为真,则获得值“1”,否则得到值“0”。
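准确性的计算可以直接写为（accuracy为本示例自拟的函数名，沿用前文的numpy导入）：

```python
def accuracy(y_true, y_pred):
    # Accuracy = (1/N) * Σ δ(y_i = ŷ_i)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))
```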
将本发明与三种代表性的方法进行比较:DGK(Deep graph kernels,Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2015:1365-1374)、PSCN(Learning convolutional neural networks for graphs,Proceedings of the 33rd International Conference on Machine Learning,New York,NY,USA,2016,2014-2023)和MTL(Joint structure feature exploration and regularization for multi-task graph classification,IEEE Transactions on Knowledge and Data Engineering,2016,28(3):715-728)。表2显示了使用的五个数据集的特征,并总结了比较结果的平均准确性和标准偏差。所有实施例在相同的设置中运行了十次。
表2 实施例结果比较
数据集 MUTAG PTC PROTEINS IMDB-BINARY IMDB-MULTI
图的数量 188 344 1113 1000 1500
类别数 2 2 2 2 3
最大顶点数 28 109 620 136 89
平均顶点数 17.9 25.5 39.1 19.77 13
DGK 82.94±2.68(5s) 59.17±1.56(30s) 73.30±0.82(143s) 66.96±0.56 44.55±0.52
PSCN 92.63±4.21(3s) 60.00±4.82(6s) 75.89±2.76(30s) 71.00±2.29 45.23±2.84
MTL 82.81±1.22(0.006s) 54.46±1.61(0.045s) 59.74±2.11(0.014s) 59.50±3.23 36.53±3.23
第一种图分类系统 92.32±4.10(0.01s) 62.50±4.51(0.10s) 74.99±2.13(0.39s) 63.43±2.50 46.22±1.15
第二种图分类系统 94.99±5.63(0.01s) 68.57±1.72(0.08s) 75.96±2.98(0.60s) 71.66±2.71 50.66±4.10
对于数据集MUTAG，与PSCN 92.63%的最佳结果相比，第二种图分类系统的准确性为94.99%，高于PSCN；第一种图分类系统达到92.32%的准确性，与PSCN非常接近。对于PTC数据集，DGK和PSCN获得了约60%的准确性，第一种图分类系统达到62.50%，第二种图分类系统达到68.57%，这是本数据集上迄今为止最好的准确性。对于数据集PROTEINS，第二种图分类系统达到最高准确性75.96%，略高于PSCN 75.89%的最佳结果。对于两个社交网络数据集，第二种图分类系统对IMDB-BINARY具有71.66%的准确性，高于PSCN最佳的71.00%；对于IMDB-MULTI，其最高准确性为50.66%，而PSCN最佳约为45%，DGK最好约为44%。在所有的实施例中，本发明都取得了最高的准确性，出现误判的概率更低。
考察系统中参数变化对分类结果的准确性和时间复杂度的影响。
窗口大小n:
这是决定本发明中的系统能否覆盖给定图数据集中最重要的子图模式的关键参数。因为小的n可能导致大多数图无法将全部连接信息元素集中到宽度为n的对角线区域。因此,可能会丢失更多的连接信息元素,这对于图数据集的分类可能至关重要。另一方面,由于窗口效应,较大的n将导致高时间复杂度和计算成本。图18(a)显示了本发明在数据集MUTAG上的准确性和耗时随n变化的结果。在本实施例中,对于所有实施例,过滤矩阵的数量设置为50,层叠CNN中卷积子模块的过滤矩阵宽度设置为7。准确性和耗时均为相同实施例设置下运行十次的平均值。从图18(a),图19(a)和图20(a)中可以看到,对于MUTAG、PTC和PROTEINS数据集,随着参数n从3增加到11,准确性对n的增加不敏感,而耗时更敏感。因此,设定较小的n更为理想。从表2中可以看到PTC中最大顶点数为109,平均顶点数为25.5,PROTEINS中最大顶点数为620,平均顶点数为39.1,而窗口大小n在3到11,故n的选择会远小于图的顶点数|V|。
层叠CNN过滤矩阵宽度s_i
简单起见，这里将层叠CNN中所有的卷积子模块设置为相同的过滤矩阵宽度来比较。设置较大的宽度s_i意味着每个过滤矩阵可以捕获更复杂的子图结构特征，复合子图结构特征也具有较高的组合可能性；然而，也很难确定过滤矩阵宽度以覆盖所有可能的组合。在本实施例中设置n为7，过滤矩阵数为50，层叠CNN过滤矩阵宽度为3到15。由于零填充（Zero-padding），只能使用奇数值的宽度，即3,5,7,9,11,13,15。同样在相同设置下运行十次计算平均值。图18(b)、图19(b)和图20(b)分别为在MUTAG、PTC和PROTEINS上的结果。结果表明在MUTAG上，当过滤矩阵宽度从3增加到9时，准确性随着过滤矩阵宽度的增加而增加，并且随着过滤矩阵宽度从9增加到15变得更加稳定，这表明9是过滤矩阵宽度的近似最优设置，因为宽度9比11和15耗时更少。与MUTAG相似，PTC数据集显示过滤矩阵宽度的最佳设置为5：设置过滤矩阵宽度为9、11和13时准确性与之接近，但相对于较小的过滤矩阵宽度（如5或7）耗时更长。在PROTEINS数据集中，即图20(b)，可以看到最佳过滤矩阵宽度为11。
过滤矩阵数
与过滤矩阵宽度类似，将所有卷积层设置成相同的过滤矩阵数量。本实施例中，将n设置为7，过滤矩阵宽度设置为7，过滤矩阵数量为20~80。图18(c)、图19(c)和图20(c)分别为在MUTAG、PTC和PROTEINS上的结果。可以看到，使用较大的过滤矩阵数量（例如60）在同一数据集上可能会导致更差的准确性。这是因为使用了更多的过滤矩阵，需要训练的权重更多，因此在更大的过滤矩阵数量的训练中更容易过拟合。
卷积层数量
为了更好地观察本发明在不同卷积层数上的有效性和效率，在本实施例中将MUTAG、PTC和PROTEINS上的卷积层数量设置为1至5。图18(d)、图19(d)和图20(d)分别是在MUTAG、PTC和PROTEINS数据集上的准确性和耗时。在本实施例中，n和过滤矩阵宽度设置为7，过滤矩阵数量设置为50。一个有趣的事实是，在不调整其他参数的情况下，增加卷积层数量并不会显著提高准确性。在图18(d)中，5卷积层版本的准确性与2卷积层版本相似，这是因为在不增加过滤矩阵数量和过滤矩阵宽度的情况下，较深的卷积网络不能利用其容量来表示更复杂的特征。在图20(d)中，5卷积层版本的准确性甚至比2卷积层版本差，这意味着在当前参数n的情况下，该过滤矩阵宽度和数量在2卷积层上运行良好，但限制了5卷积层版本的性能。因此，在这种情况下，需要在PROTEINS数据集上放大5卷积层版本的其他参数。
Dropout比例
前面的实施例已经表明，增加过滤矩阵宽度、过滤矩阵大小以及数量和卷积层数量可能不会提高性能。下一组实施例通过调整dropout比例来研究过拟合的影响，训练中使用了batch normalization。所述的batch normalization是在深度神经网络训练过程中使得每一层神经网络的输入保持相同分布的方法，它能帮助神经网络收敛。图21显示了MUTAG和PTC的结果，x轴为dropout比例，左y轴为准确性，右y轴为耗时。图21(a)示出了当dropout比例为0~0.2时准确性提高，当MUTAG的dropout比例为0.2~0.9时准确性降低。图21(b)显示了PTC的测量结果：当dropout比例为0至0.4时准确性稳定，当dropout比例为0.4至0.5时上升，当dropout比例为0.5至0.9时略有下降。该组实施例表明，当dropout比例设定为0.2时，本发明的图分类系统得到MUTAG的最佳拟合，PTC的最佳比例为0.5。
本发明提出了基于邻接矩阵的图特征提取系统,将邻接矩阵中的连接信息元素集中并提取特征。这里将本发明与普通CNN进行比较。在普通CNN方法中,直接在邻接矩阵上应用二维卷积层,且池化层变为2维池化。对于这两种方法,实施例的配置都为n=7,过滤矩阵宽度为7,过滤矩阵数量为50。结果如图22所示。图22(a)是这两种方法的准确性,可以看到本发明的方法准确性更高。在图22(b)中,普通CNN的耗时大于本发明的方法。即本发明方法有着更高的准确性和更低的耗时。
收敛
图23、24、25为MUTAG、PTC和PROTEINS的训练集和验证集上损失的收敛过程。灰线是训练集上的损失，蓝线是验证集上的损失。可以看到，在三个数据集中，损失首先减少，在30次迭代后趋于稳定。就像大多数机器学习方法（特别是神经网络）一样，训练集上的损失可以比验证集具有更低的值，这是因为随机梯度下降训练程序使用的是训练集上的损失而不是验证集上的损失。
特征的训练
本实施例在MUTAG数据集上进行，n设置为7，过滤矩阵宽度设置为7，过滤矩阵数量设置为20。图26为训练过程中特征生成模块中的过滤矩阵的参数变化以及其代表的图结构。其中，图中x轴表示迭代次数，从0到30；迭代次数为0时表示值为从高斯分布中随机采样得到的初始值。图26(c)为过滤矩阵的值，它是一个7×7矩阵。矩阵中的单元格越暗，值越大、越接近1，而白色单元格的值更接近-1，灰度单元格的值约为0。在初始阶段，更多的单元格是灰色的，值在0左右。随着训练过程的进一步发展，一些黑的单元格变得更亮，一些白的单元格变得更暗，特别是在左上角；而最右边的最黑的单元格在训练期间保持黑色，这意味着这些位置在给定的图数据集的分类中起重要作用。这是因为反向传播仅修改对输入的图的分类无贡献的单元格的值。为了更好地理解子图结构，图26(a)和(b)中分别绘制了图的正子图和负子图。绘制正子图时，如果单元格值大于0，则将该单元格设置为1，如果其值小于或等于0，则设置为0；由于这表示应出现的边，所以称为正子图。相反，绘制负子图时，如果单元格值小于或等于0则将该单元格设置为1，如果其值大于0则设置为0；负子图表示不应出现的边。可以看到，正子图和负子图都在训练过程中从初始状态逐渐变化，并在训练结束时达到稳定的结构，这意味着训练过程最终达到了收敛状态。
特征可视化
图27示出了在不同卷积层中捕获的子图特征。图27(a)显示了12个顶点的输入图。本实施例中使用第二种图分类系统（5卷积层），设置特征生成模块的过滤矩阵大小为4×4，其余卷积层过滤矩阵宽度为3。因此，每层的特征大小为4,6,8,10,12。图27(b),(c),(d),(e),(f)分别示出了在五个卷积层中的每一层学习得到的子图模式。其邻接矩阵表示每个边的存在概率，单元格越暗，该滤波器捕获相应边的概率越高。在图27(b)所示的第一层中，只能处理基本的四顶点模式。向前移动到图27(c)所示的第二层，过滤矩阵可以捕获并表示由第一层特征组成的六顶点模式。通过进一步添加更多的卷积层，可以捕获和表示更复杂的子图模式。最后，在图27(f)中，捕获了12顶点特征，这与图27(a)中的原始输入图非常相似。
最后,提供一个实施例主要说明了本发明提出的基于邻接矩阵的图分类系统的重要特性:能够利用较小的窗口捕获大型多顶点的子图结构。
以一个由十个顶点(|V|=10)组成的图为例，图28示出了在这个图上使用特征生成模块的物理意义。可以看到，该图具有两个大小为六个顶点的环，并且两个顶点由这两个环结构共享。为了捕获这种基于环的图模式，现有的方法通常需要使窗口大小大于10；然而，即使仅使用大小为6的窗口，本发明的方法也是有效的。考虑图28左上方的图，我们用连接信息规整系统将连接信息元素集中到n=6的对角线区域，对顶点进行重新排序，右上方为得到的标注图，使用a,b,c,d,e,f,g,h,i,j表示排序后顶点的顺序。然后用大小为6×6的过滤矩阵（即n=6）进行过滤操作，过滤矩阵可以通过|V|-n+1=10-6+1=5步移动。图28中心的五个图显示了过滤矩阵如何在每个步骤中覆盖（捕获）图的不同子图。例如，在第一步中，过滤矩阵覆盖由a,b,c,d,e,f标记的任何一对顶点之间的所有连接；如图28的第1步所示，由虚线强调的滤波器覆盖由顶点a,b,c,d,e,f组成的环。更有趣的是，使用特征生成模块，可以通过相同的过滤矩阵捕获不同的子图结构（特征）。例如，第1步和第5步捕获相同的图结构：六顶点环；同时，第2步、第3步和第4步捕获另一种类型的图结构：六顶点线。将得到的特征进行组合，可以得到更复杂的结构，如图28最下方所示，可以得出3种不同的复杂结构，其中中间的结构即为想要捕获的10顶点环。
更具体地，图29给出了数值化的例子来描述特征生成模块捕获的特征以及层叠CNN中捕获的特征。图29(a)为一个12顶点的图以及该图的第二邻接矩阵，图中包含了两个大小为六个顶点的环，并且两个顶点由这两个环结构共享，两个环上都另外连了一个顶点。图29中的邻接矩阵及过滤矩阵空白的元素表示值为0，为了简化计算，过滤矩阵中元素的值均选为0或1。图29(b)为特征生成模块中的两个过滤矩阵，对应表示的子图结构如图29(c)所示。使用图29(b)的两个过滤矩阵沿该图的第二邻接矩阵的对角线方向进行过滤操作，可以计算得到如图29(d)所示的向量，由虚线所包围的元素是零填充（zero-padding）。层叠CNN中的过滤矩阵如图29(e)所示，为了简化计算，同样使其元素为0或1。使用层叠CNN中的过滤矩阵对捕获的特征（图29(d)）进行过滤操作，得到如图29(h)所示的向量。考虑层叠CNN中过滤矩阵所代表的物理意义，它所表示的是特征生成模块捕获的子图结构的组合，故可以将特征生成模块的过滤矩阵根据层叠CNN中的过滤矩阵的值进行堆叠，如图29(i)所示，得到层叠CNN中过滤矩阵所表示的邻接矩阵，如图29(f)所示；图29(g)即为层叠CNN中过滤矩阵所表示的子图结构。可以看到图29(g)中一个为十顶点的双环，另一个为六顶点环外接4个顶点。
本发明提出的基于邻接矩阵的图分类系统能够通过较小的窗口捕获大型多顶点的子图结构,以及来自顶点和边的隐式相关结构的深层特征,进而提高分类的准确性。

Claims (117)

  1. 一种在计算机环境中基于邻接矩阵的连接信息规整系统,其特征在于:所述的连接信息规整系统用于将图对应的第一邻接矩阵中的全部顶点进行重新排序,得到第二邻接矩阵,所述第二邻接矩阵中的连接信息元素集中分布在所述第二邻接矩阵的宽度为n的对角线区域,其中n为正整数,n≥2且n<|V|,所述的|V|为第二邻接矩阵的行数或列数;
    所述第二邻接矩阵的对角线区域由以下元素组成：正整数i从1遍历至|V|，当i>max(n,|V|-n)时，选取第i行中第(i-n+1)到|V|列的元素；当i≤n时，选取第i行中第0至i+n-1列的元素；当max(n,|V|-n)≥i≥min(|V|-n,n)时，选取第i行中第(i-n+1)到第(i+n-1)列的元素；
    所述的连接信息元素是图中的边在邻接矩阵中对应的元素;
    所述的图为图论中的图。
  2. 根据权利要求1所述的连接信息规整系统,其特征在于:如果所述的图的边上没有权重,则所述的连接信息元素的值为1,非连接信息元素的值为0。
  3. 根据权利要求1所述的连接信息规整系统,其特征在于:如果所述的图的边上带有权重,则所述的连接信息元素的值为边的权重值,非连接信息元素的值为0。
  4. 根据权利要求1所述的连接信息规整系统,其特征在于:所述对角线区域指矩阵中从左上角至右下角的对角线区域。
  5. 根据权利要求1所述的连接信息规整系统,其特征在于:所述第二邻接矩阵的对角线区域是使用一个尺寸为n×n的扫描矩形框沿所述第二邻接矩阵的对角线扫描一遍所经过的区域。
  6. 根据权利要求5所述的连接信息规整系统,其特征在于所述的扫描过程如下:首先,将所述扫描矩形框的左上角与第二邻接矩阵的左上角重合;然后每次将所述扫描矩形框往右方和下方各移动一个元素格,直至所述扫描矩形框的右下角与所述第二邻接矩阵的右下角重合。
  7. 根据权利要求1所述的连接信息规整系统,其特征在于:所述连接信息规整系统用于对所述第一邻接矩阵的全部顶点进行重新排序,使得排序之后第二邻接矩阵的对角线区域中连接信息元素的集中程度最高。
  8. 根据权利要求7所述的连接信息规整系统,其特征在于:所述重新排序的方法为整数优化算法。
  9. 根据权利要求7所述的连接信息规整系统,其特征在于:所述重新排序的方法为贪心算法,包括以下步骤:
    (1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
    (2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
    (3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出待处理邻接矩阵得到所述的第二邻接矩阵,所述的贪心算法结束;否则,从尚未处理过的顶点交换对中任意选择一个顶点交换对作为当前顶点交换对,同时交换其对应的两个顶点在待处理邻接矩阵中对应的两行及对应的两列,生成新邻接矩阵,并跳转至步骤(4);
    (4)交换效果评定:计算新邻接矩阵中连接信息元素的集中程度,若所述新邻接矩阵中连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度,则用所述新邻接矩阵替代所述的待处理邻接矩阵,并跳转至步骤(2);若所述新邻接矩阵中连接信息元素的集中程度低于或等于所述待处理邻接矩阵中连接信息元素的集中程度,则放弃这种交换,并标记所述的当前顶点交换对为已处理状态,跳转至步骤(3)。
  10. 根据权利要求7所述的连接信息规整系统,其特征在于:所述重新排序的方法为分支定界算法,包括以下步骤:
    (1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
    (2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
    (3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出所述的待处理邻接矩阵得到所述第二邻接矩阵,所述的分支定界算法结束;否则,对所有可能的顶点交换对中的每一个未处理过的顶点交换对分别执行交换操作,并跳转至步骤(4),所述的交换操作是指同时交换所述顶点交换对对应的两个顶点在所述待处理邻接矩阵中对应的两行及对应的两列,对每一个所述的顶点交换对执行所述交换操作都会生成一个新邻接矩阵;
    (4)交换效果评定:计算每一个所述的新邻接矩阵中连接信息元素的集中程度,若存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则选择集中程度最高的新邻接矩阵代替所述的待处理矩阵,并标记生成该集中程度最高的新邻接矩阵的顶点交换对为已处理状态,然后跳转至步骤(3);若不存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则输出当前待处理邻接矩阵得到所述的第二邻接矩阵,所述的分支定界算法结束。
  11. 根据权利要求1-10任一项所述的连接信息规整系统,其特征在于:所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域内的连接信息元素的数量和/或非连接信息元素的数量。
  12. 根据权利要求1-10任一项所述的连接信息规整系统,其特征在于:所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域外的连接信息元素的数量和/或非连接信息元素的数量。
  13. 根据权利要求1-12任一项所述的连接信息规整系统,其特征在于:所述的集中程度利用Loss值来衡量,Loss值越小,集中程度越高,所述的Loss值的计算方法如下:
    $$LS(A,n)=\sum_{|i-j|\ge n}A_{i,j}$$
    式中，LS(A,n)代表损失Loss值，A代表所述的第二邻接矩阵，n代表所述第二邻接矩阵中对角线区域的宽度，A_{i,j}表示所述第二邻接矩阵中第i行第j列的元素。
  14. 根据权利要求1-12任一项所述的连接信息规整系统,其特征在于:所述的集中程度利用ZR值来衡量,ZR值越小,集中程度越高,所述ZR值的计算方法如下:
    $$TC(A,n)=\sum_{|i-j|<n}C_{i,j}$$
    $$T1(A,n)=\sum_{|i-j|<n}A_{i,j}$$
    $$ZR(A,n)=\frac{TC(A,n)-T1(A,n)}{TC(A,n)}$$
    式中，A代表第二邻接矩阵，C表示所有元素均为连接信息元素且尺寸大小与A相同的矩阵，A_{i,j}表示A中第i行第j列的元素，C_{i,j}表示C中第i行第j列的元素，TC(A,n)、TC表示宽度为n的对角线区域中元素的总个数，T1(A,n)、T1表示宽度为n的对角线区域中连接信息元素的个数，ZR(A,n)代表ZR值，该值表示宽度为n的对角线区域中非连接信息元素的占比。
  15. 一种在计算机环境中基于邻接矩阵的图特征提取系统，其特征在于：所述的图特征提取系统基于图的邻接矩阵抽取出图的特征，所述的特征直接对应支持分类的子图结构，所述的特征以至少一个向量的形式呈现，每一个向量对应一种混合态在图中的分布情况；所述的图特征提取系统包括特征生成模块和权利要求1-14任一项所述的连接信息规整系统；其中：所述的连接信息规整系统用于将图对应的第一邻接矩阵中的全部顶点进行重新排序，得到第二邻接矩阵；所述的特征生成模块基于所述的第二邻接矩阵，生成图的特征，所述的特征直接对应支持分类的子图结构，每一个向量对应一种混合态在图中的分布情况；所述的图、子图均为图论中的图。
  16. 根据权利要求15所述的图特征提取系统,其特征在于:所述的分布情况是指图中出现该混合态中的子图结构的可能性。
  17. 根据权利要求15所述的图特征提取系统,其特征在于:每一种所述的混合态代表任意多个子图结构对应的邻接矩阵的线性加权。
  18. 根据权利要求17所述的图特征提取系统,其特征在于:所述的线性加权是指每一个子图的邻接矩阵乘以该邻接矩阵对应的权值,然后对位相加到一起,得到一个与子图的邻接矩阵相同大小的矩阵。
  19. 根据权利要求15-18任一项所述的图特征提取系统,其特征在于:所述的特征生成模块利用过滤矩阵生成图的特征,所述的过滤矩阵为正方形矩阵。
  20. 根据权利要求15-18任一项所述的图特征提取系统,其特征在于:所述的特征生成模块利用至少一个过滤矩阵,沿所述第二邻接矩阵的对角线区域进行过滤操作,得到至少一个向量,所述的至少一个向量对应于所述的图的特征,所述的特征直接对应支持分类的子图结构,每一个向量对应一种混合态在图中的分布情况。
  21. 根据权利要求20所述的图特征提取系统,其特征在于:所述的特征生成模块利用不同的过滤矩阵,进行所述的过滤操作。
  22. 根据权利要求20或21所述的图特征提取系统,其特征在于:所述的过滤操作是利用所述的过滤矩阵对所述第二邻接矩阵对位的矩阵内积的加和,通过激活函数得到一个值,让过滤矩阵沿所述第二邻接矩阵的对角线方向移动,从而得到一组值,形成一个向量,该向量对应一种子图结构在图中的分布情况。
  23. 根据权利要求22所述的图特征提取系统,其特征在于:所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数。
  24. 根据权利要求19-23任一项所述的图特征提取系统,其特征在于:所述过滤矩阵的尺寸为n×n。
  25. 根据权利要求19-23任一项所述的图特征提取系统,其特征在于:所述过滤矩阵中每一个元素的初始值分别从高斯分布中取出的随机变量的值。
  26. 根据权利要求19-23任一项所述的图特征提取系统,其特征在于:所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数。
  27. 根据权利要求19-23任一项所述的图特征提取系统,其特征在于:所述的过滤矩阵中的元素为大于等于0、小于等于1的实数。
  28. 根据权利要求19-23任一项所述的图特征提取系统,其特征在于:所述的特征生成模块参与机器学习过程,所述机器学习过程用于调整所述过滤矩阵的元素的值。
  29. 根据权利要求15-28任一项所述的图特征提取系统,其特征在于:所述的机器学习过程是利用反向传播,利用分类的损失值,计算梯度值,进一步调节过滤矩阵中的各个元素的值。
  30. 一种在计算机环境中基于邻接矩阵的图分类系统,其特征在于:所述的图分类系统包括类别标注模块和权利要求15-29任一项所述的图特征提取系统,所述的类别标注模块基于所述图特征提取系统生成的特征对图进行类别标注,输出图的类别;所述的图为图论中的图。
  31. 根据权利要求30所述的图分类系统,其特征在于:所述的类别标注模块计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  32. 根据权利要求30-31任一项所述的图分类系统,其特征在于:所述的类别标注模块利用分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  33. 根据权利要求32所述的图分类系统,其特征在于:所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
  34. 根据权利要求30-33任一项所述的图分类系统,其特征在于:
    所述的图分类系统还进一步包含层叠CNN模块,所述的层叠CNN模块基于所述的图特征提取系统生成的特征进行处理,融合所述的特征对应的支持分类的子图结构,生成包含图中更大子图结构的特征,所述的更大子图结构是指顶点个数多于n的子图结构。
  35. 根据权利要求34所述的图分类系统,其特征在于:所述的层叠CNN模块包括卷积子模块和池化子模块;所述的卷积子模块使用至少一个卷积层基于所述的图特征提取系统生成的特征进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果;第一个卷积层的输入为权利要求15-29任一项所述的图特征提取系统生成的特征,如果有多个卷积层,每一个卷积层的输入为前一个卷积层的输出结果,每一个卷积层的输出结果均为至少一个向量,每一个卷积层使用至少一个过滤矩阵进行卷积操作,最后一个卷积层的卷积结果输出至所述的池化子模块;
    所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同;
    所述的池化子模块用于对所述卷积子模块得到的矩阵进行池化操作,得到至少一个向量作为池化结果输出至所述的类别标注模块,对图进行类别标注,
    输出图的类别,所述池化结果包含图中更大子图结构的特征;所述的更大子图结构是指顶点个数多于n的子图结构。
  36. 根据权利要求30-33任一项所述的图分类系统,其特征在于:
    所述的图分类系统还进一步包含独立池化模块和卷积池化模块;所述的独立池化模块用于对所述的图特征提取系统生成的特征进行池化操作,得到至少一个向量作为第一池化结果输出至所述的类别标注模块;所述的卷积池化模块对输入的权利要求15-29任一项所述的图特征提取系统生成的特征进行卷积和池化处理,融合所述的特征对应的支持分类的子图结构,生成包含图中更大子图结构特征的第二池化结果,将其输出至所述的类别标注模块;所述的类别标注模块根据所述第一池化结果和第二池化结果对图进行类别标注,输出图的类别;所述的更大子图结构是指顶点个数多于n的子图结构。
  37. 根据权利要求36所述的图分类系统,其特征在于:所述的卷积池化模块包含卷积子模块和池化子模块;所述的卷积子模块使用至少一个过滤矩阵对输入进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果传递给池化子模块;所述的池化子模块对所述的卷积结果进行池化操作,得到至少一个向量作为第二池化结果,所述第二池化结果包含图中更大子图结构的特征,将所述的池化结果输出至所述的类别标注模块;所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同。
  38. 根据权利要求30-33任一项所述的图分类系统,其特征在于:
    所述的图分类系统还进一步包含独立池化模块和多个卷积池化模块;所述的独立池化模块用于对所述的图特征提取系统生成的特征进行池化操作,得到至少一个向量作为第一池化结果输出至所述的类别标注模块;所述的卷积池化模块对输入的特征依次进行卷积操作和池化操作,所述的卷积操作融合所述的特征对应的支持分类的子图结构得到至少一个向量作为卷积结果,然后对所述的卷积结果进行池化操作,得到至少一个向量作为池化结果,所述池化结果中包含图中更大子图结构的特征;上一个卷积池化模块的卷积结果输出至下一个卷积池化模块,每一个卷积池化模块的池化结果均输出至所述的类别标注模块;所述的类别标注模块根据所述第一池化结果和全部卷积池化模块的池化结果对图进行类别标注,输出图的类别;
    其中,第一个所述卷积池化模块的输入为权利要求15-29任一项所述的图特征提取系统生成的特征,其他卷积池化模块的输入为上一个卷积池化模块的卷积结果;最后一个卷积池化模块仅将池化结果输出至类别标注模块;所述的更大子图结构是指顶点个数多于n的子图结构。
  39. 根据权利要求38所述的图分类系统,其特征在于:所述的卷积池化模块包含卷积子模块和池化子模块;所述的卷积子模块使用至少一个过滤矩阵对输入进行卷积操作,融合所述的特征对应的支持分类的子图结构得到至少一个向量作为卷积结果,并将所述的卷积结果输出至下一个所述的卷积池化模块;所述的池化子模块对所述卷积子模块输出的卷积结果进行池化,得到至少一个向量作为池化结果输出至所述的类别标注模块,所述池化结果包含图中更大子图结构的特征;所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同。
  40. 根据权利要求39所述的图分类系统,其特征在于:所述卷积子模块、池化子模块的数量可相同或不同。
  41. 根据权利要求39所述的图分类系统,其特征在于:所述卷积子模块、池化子模块的数量为1个或多个。
  42. 根据权利要求39所述的图分类系统,其特征在于:所述卷积池化模块的数量小于或等于10个。
  43. 根据权利要求39所述的图分类系统,其特征在于:所述卷积池化模块的数量小于或等于5个。
  44. 根据权利要求39所述的图分类系统,其特征在于:所述卷积池化模块的数量小于或等于3个。
  45. 根据权利要求34-44任一项所述的图分类系统,其特征在于:所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数。
  46. 根据权利要求34-44任一项所述的图分类系统,其特征在于:所述的过滤矩阵中的元素为大于等于0、小于等于1的实数。
  47. 根据权利要求34-44任一项所述的图分类系统,其特征在于:所述的池化操作选自最大池化操作、平均池化操作。
  48. 根据权利要求34-47任一项所述的图分类系统,其特征在于:所述的卷积结果对应的向量的元素值代表子图结构在图上各个位置出现的可能性,所述池化结果、第一池化结果、第二池化结果对应的向量的元素值代表子图结构在图中出现的最大可能性或平均可能性。
  49. 根据权利要求30-48任一项所述的图分类系统,其特征在于:
    所述的类别标注模块包括隐含层单元、激活单元、标注单元;
    所述的隐含层单元对接收到的向量进行处理,得到至少一个混合向量传递至所述的激活单元,所述的混合向量包含所述隐含层单元接收到的所有向量的信息;
    所述的激活单元对所述隐含层单元输出的每一个混合向量使用激活函数计算得到一个值,并将所有得到的值组成一个向量输出到所述的标注单元;
    所述的标注单元用于根据激活单元的结果计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  50. 根据权利要求49所述的图分类系统,其特征在于:所述的隐含层单元对接收到的向量进行的处理是指对输入的向量进行合并拼接成一个组合向量,并使用至少一个权重向量对所述的组合向量进行线性加权操作得到至少一个混合向量。
  51. 根据权利要求49所述的图分类系统,其特征在于:所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数。
  52. 根据权利要求49所述的图分类系统,其特征在于:所述标注单元基于分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注 为图的类别,完成图的分类。
  53. 根据权利要求52所述的图分类系统,其特征在于:所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
  54. 根据权利要求30-53任一项所述的图分类系统,其特征在于:所述图的顶点为任意实体,所述图的边为任意实体之间的关系。
  55. 根据权利要求54所述的图分类系统,其特征在于:所述的任意实体是任意的独立个体或个体集合,所述的个体是实际存在或虚拟的。
  56. 根据权利要求54所述的图分类系统,其特征在于:所述的实体可以是任意人、事、事件、物、概念中的一种或多种的组合。
  57. 根据权利要求54所述的图分类系统，其特征在于：所述的任意实体选自化合物或单质中的原子，网络中的人、商品、事件的任意一种或任意多种。
  58. 根据权利要求54-57任一项所述的图分类系统,其特征在于:所述的关系为任意实体之间的任意关联性。
  59. 根据权利要求58所述的图分类系统,其特征在于:所述关联性是连接原子的化学键、商品之间的联系、人与人之间的关系。
  60. 根据权利要求58所述的图分类系统,其特征在于:所述商品之间的联系包括购买商品的因果关系、关联关系。
  61. 根据权利要求58所述的图分类系统,其特征在于:所述人与人之间的关系包括实际的血缘关系、虚拟社交网络中的好友关系或关注关系、交易关系、发送消息关系。
  62. 一种在计算机环境中基于邻接矩阵的连接信息规整方法,其特征在于,所述的方法包括如下步骤:
    (1)初始输入:将图转化为第一邻接矩阵;
    (2)连接信息规整:对所述第一邻接矩阵中的全部顶点进行重新排序,得到第二邻接矩阵,所述第二邻接矩阵中的连接信息元素集中分布在所述第二邻接矩阵的宽度为n的对角线区域,其中n为正整数,n≥2且n<|V|,所述的|V|为第二邻接矩阵的行数或列数;
    所述第二邻接矩阵的对角线区域由以下元素组成：正整数i从1遍历至|V|，当i>max(n,|V|-n)时，选取第i行中第(i-n+1)到|V|列的元素；当i≤n时，选取第i行中第0至i+n-1列的元素；当max(n,|V|-n)≥i≥min(|V|-n,n)时，选取第i行中第(i-n+1)到第(i+n-1)列的元素；
    所述的连接信息元素是图中的边在邻接矩阵中对应的元素;所述的图为图论中的图。
  63. 根据权利要求62所述的连接信息规整方法，其特征在于：如果所述的图的边上没有权重，则所述的连接信息元素的值为1，非连接信息元素的值为0。
  64. 根据权利要求62所述的连接信息规整方法，其特征在于：如果所述的图的边上带有权重，则所述的连接信息元素的值为边的权重值，非连接信息元素的值为0。
  65. 根据权利要求62-64任一项所述的连接信息规整方法,其特征在于:所述第二邻接矩阵的对角线区域指矩阵中从左上角至右下角的对角线区域。
  66. 根据权利要求62-64任一项所述的连接信息规整方法,其特征在于:所述第二邻接矩阵的对角线区域是使用一个尺寸为n×n的扫描矩形框沿所述第二邻接矩阵的对角线扫描一遍所经过的区域;
  67. 根据权利要求66所述的连接信息规整方法,其特征在于:所述的扫描过程如下:首先,将所述扫描矩形框的左上角与第二邻接矩阵的左上角重合;然后每次将所述扫描矩形框往右方和下方各移动一个元素格,直至所述扫描矩形框的右下角与所述第二邻接矩阵的右下角重合。
  68. 根据权利要求62-67任一项所述的连接信息规整方法,其特征在于:所述步骤(2)对所述第一邻接矩阵的全部顶点进行重新排序,使得排序之后第二邻接矩阵的对角线区域中连接信息元素的集中程度最高。
  69. 根据权利要求62-67任一项所述的连接信息规整方法,其特征在于:所述重新排序的方法为整数优化算法。
  70. 根据权利要求68所述的连接信息规整方法,其特征在于:所述重新排序的方法为贪心算法,包括以下步骤:
    (1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
    (2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
    (3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出待处理邻接矩阵得到所述的第二邻接矩阵,所述的贪心算法结束;否则,从尚未处理过的顶点交换对中任意选择一个顶点交换对作为当前顶点交换对,同时交换其对应的两个顶点在待处理邻接矩阵中对应的两行及对应的两列,生成新邻接矩阵,并跳转至步骤(4);
    (4)交换效果评定:计算新邻接矩阵中连接信息元素的集中程度,若所述新邻接矩阵中连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度,则用所述新邻接矩阵替代所述的待处理邻接矩阵,并跳转至步骤(2);若所述新邻接矩阵中连接信息元素的集中程度低于或等于所述待处理邻接矩阵中连接信息元素的集中程度,则放弃这种交换,并标记所述的当前顶点交换对为已处理状态,跳转至步骤(3)。
  71. 根据权利要求68所述的连接信息规整方法,其特征在于:所述重新排序的方法为分支定界算法,包括以下步骤:
    (1)初始输入:输入图的第一邻接矩阵作为待处理邻接矩阵;
    (2)交换对统计:计算待处理邻接矩阵中所有可能的顶点交换对;
    (3)行列交换:判断是否所有可能的顶点交换对均为已处理状态,若是,则输出所述的待处理邻接矩阵得到所述第二邻接矩阵,所述的分支定界算法结束;否则,对所有可能的顶点交换对中的每一个未处理过的顶点交换对分别执行交换操作,并跳转至步骤(4),所述的交换操作是指同时交换所述顶点交换对对应的两个顶点在所述待处理邻接矩阵中对应的两行及对应两列,对每一个所述的顶点交换对执行所述交换操作都会生成一个新邻接矩阵;
    (4)交换效果评定:计算每一个所述的新邻接矩阵中连接信息元素的集中程度,若存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则选择集中程度最高的新邻接矩阵代替所述的待处理矩阵,并标记生成该集中程度最高的新邻接矩阵的 顶点交换对为已处理状态,然后跳转至步骤(3);若不存在连接信息元素的集中程度高于所述待处理邻接矩阵中连接信息元素的集中程度的新邻接矩阵,则输出当前待处理邻接矩阵得到所述的第二邻接矩阵,所述的分支定界算法结束。
  72. 根据权利要求62-71任一项所述的连接信息规整方法,其特征在于:所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域内的连接信息元素的数量和/或非连接信息元素的数量。
  73. 根据权利要求62-71任一项所述的连接信息规整方法,其特征在于:所述第二邻接矩阵的对角线区域中连接信息元素的集中程度依赖于所述的对角线区域外的连接信息元素的数量和/或非连接信息元素的数量。
  74. 根据权利要求62-73任一项所述的连接信息规整方法,其特征在于:所述的集中程度利用Loss值来衡量,Loss值越小,集中程度越高,所述的Loss值的计算方法如下:
    $$LS(A,n)=\sum_{|i-j|\ge n}A_{i,j}$$
    式中，LS(A,n)代表损失Loss值，A代表所述的第二邻接矩阵，n代表所述第二邻接矩阵中对角线区域的宽度，A_{i,j}表示所述第二邻接矩阵中第i行第j列的元素。
  75. 根据权利要求62-73任一项所述的连接信息规整方法,其特征在于:所述的集中程度利用ZR值来衡量,ZR值越小,集中程度越高,所述ZR值的计算方法如下:
    $$TC(A,n)=\sum_{|i-j|<n}C_{i,j}$$
    $$T1(A,n)=\sum_{|i-j|<n}A_{i,j}$$
    $$ZR(A,n)=\frac{TC(A,n)-T1(A,n)}{TC(A,n)}$$
    式中，A代表第二邻接矩阵，C表示所有元素均为连接信息元素且尺寸大小与A相同的矩阵，A_{i,j}表示A中第i行第j列的元素，C_{i,j}表示C中第i行第j列的元素，TC(A,n)、TC表示宽度为n的对角线区域中元素的总个数，T1(A,n)、T1表示宽度为n的对角线区域中连接信息元素的个数，ZR(A,n)代表ZR值，该值表示宽度为n的对角线区域中非连接信息元素的占比。
  76. 一种在计算机环境中基于邻接矩阵的图特征提取方法,其特征在于,所述的图特征提取方法基于图的邻接矩阵抽取出图的特征,所述的特征直接对应支持分类的子图结构,所述的特征以至少一个向量的形式呈现,每一个向量对应一种混合态在图中的分布情况,所述的方法包括如下步骤:
    (1)连接信息规整:利用权利要求62-75任一项所述的连接信息规整方法对图的第一邻接矩阵进行处理,得到第二邻接矩阵;
    (2)对角过滤:基于步骤(1)得到的第二邻接矩阵,生成图的特征,所述的特征直接对应支持分类的子图结构,每一个向量对应一种混合态在图中的分布情况;
    所述的图、子图均为图论中的图;
  77. 根据权利要求76所述的图特征提取方法,其特征在于:所述的步骤(2)利用过滤矩阵生成图的特征,所述的过滤矩阵为正方形矩阵。
  78. 根据权利要求76所述的图特征提取方法,其特征在于:所述的步骤(2)利用至少一个过滤矩阵,沿所述第二邻接矩阵的对角线区域进行过滤操作,得到至少一个向量,所述的至少一个向量对应于所述的图的特征,所述的特征直接对应支持分类的子图结构,每一个向量对应一种混合态在图中的分布情况。
  79. 根据权利要求78所述的图特征提取方法,其特征在于:所述的步骤(2)利用不同的过滤矩阵,进行过滤操作。
  80. 根据权利要求78或79所述的图特征提取方法,其特征在于:所述的过滤操作是利用所述的过滤矩阵对所述第二邻接矩阵对位的矩阵内积的加和,通过激活函数得到一个值,让过滤矩阵沿所述第二邻接矩阵的对角线方向移动,从而得到一组值,形成一个向量,该向量对应一种子图结构在图中的分布情况。
  81. 根据权利要求80所述的图特征提取方法,其特征在于:所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数。
  82. 根据权利要求76-81任一项所述的图特征提取方法,其特征在于:所述的分布情况是指图中出现该混合态中的子图结构的可能性。
  83. 根据权利要求76-81任一项所述的图特征提取方法,其特征在于:每一种所述的混合态代表任意多个子图结构对应的邻接矩阵的线性加权。
  84. 根据权利要求83所述的图特征提取方法,其特征在于:所述的线性加权是指每一个子图的邻接矩阵乘以该邻接矩阵对应的权值,然后对位相加到一起,得到一个与子图的邻接矩阵相同大小的矩阵。
  85. 根据权利要求76-81所述的图特征提取方法,其特征在于:所述的过滤矩阵中每一个元素的初始值分别从高斯分布中取出的随机变量的值。
  86. 根据权利要求76-81所述的图特征提取方法,其特征在于:所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数。
  87. 根据权利要求76-81所述的图特征提取方法,其特征在于:所述的过滤矩阵中的元素为大于等于0、小于等于1的实数。
  88. 根据权利要求76-81所述的图特征提取方法,其特征在于:所述的过滤矩阵的尺寸为n×n。
  89. 根据权利要求76-88任一项所述的图特征提取方法，其特征在于：所述的步骤(2)参与机器学习过程，所述机器学习过程用于调整所述过滤矩阵的元素的值。
  90. 根据权利要求89所述的图特征提取方法,其特征在于:所述的机器学习过程是利用反向传播,利用分类的损失值,计算梯度值,进一步调节过滤矩阵中的各个元素的值。
  91. 根据权利要求76-90所述的图特征提取方法,其特征在于:所述的连接信息的值为1,非连接信息的值为0。
  92. 根据权利要求76-90所述的图特征提取方法,其特征在于:如果所述的图的边上带有权重,则所述的连接信息的值为边的权重值,非连接信息的值为0。
  93. 一种在计算机环境中基于邻接矩阵的图分类方法,其特征在于:所述的图分类方法包括如下步骤:
    (1)图特征提取:利用权利要求76-92任一项所述的基于邻接矩阵的图特征提取方法提取图的特征;
    (2)类别标注:基于步骤(1)提取的特征对图进行类别标注,输出图的类别;所述的图为图论中的图。
  94. 根据权利要求93所述的图分类方法,其特征在于:所述的步骤(2)计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  95. 根据权利要求93所述的图分类方法,其特征在于:所述的步骤(2)利用分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  96. 根据权利要求95所述的图分类方法,其特征在于:所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
  97. 一种在计算机环境中基于层叠CNN的图分类方法,其特征在于:所述的图分类方法包括如下步骤:
    (1)图特征提取:利用权利要求76-96任一项所述的基于邻接矩阵的图特征提取方法提取图的特征;
    (2)卷积操作:利用至少一个卷积层对步骤(1)提取的图的特征进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果;第一个卷积层的输入为步骤(1)提取的图的特征,如果有多个卷积层,每一个卷积层的输入为前一个卷积层的输出结果,每一个卷积层的输出结果均为至少一个向量,每一个卷积层使用至少一个过滤矩阵进行卷积操作,最后一个卷积层的卷积结果输出至步骤(3);所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的卷积结果中向量的数量相同;
    (3)池化操作:对步骤(2)中卷积操作的结果进行池化操作,得到至少一个向量作为池化结果传递至步骤(4),所述池化结果中包含图中更大子图结构的特征,所述的更大子图结构是指顶点个数多于n的子图结构;
    (4)类别标注:根据步骤(3)得到池化结果,对图进行类别标注,输出图的类别。
  98. 一种在计算机环境中基于层叠CNN的图分类方法,其特征在于所述的图分类方法包括以下步骤:
    (1)图特征提取:利用权利要求76-96任一项所述的基于邻接矩阵的图特征提取方法提取图的特征,并传递至步骤(2)和步骤(3);
    (2)独立池化操作:对步骤(1)提取的图的特征进行池化操作,得到至少一个向量作为第一池化结果输出至步骤(4);
    (3)卷积池化操作:使用至少一个过滤矩阵对步骤(1)提取的图的特征进行卷积操作,融合所述的特征对应的支持分类的子图结构,得到至少一个向量作为卷积结果,然后,对所述的卷积结果进行池化操作,得到至少一个向量作为第二池化结果传递至步骤(4),所述第二池化结果中包含图中更大子图结构的特征;所述的更大子图结构是指顶点个数多于n的子图结构;所述的过滤矩阵为正方形矩阵;每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同;
    (4)类别标注:根据所述的第一池化结果和第二池化结果,对图进行类别标注,输出图的类别。
  99. 一种在计算机环境中基于层叠CNN的图分类方法,其特征在于,所述的图分类方法包括以下步骤:
    (1)图特征提取:利用权利要求76-96任一项所述的基于邻接矩阵的图特征提取方法提取图的特征,并传递至步骤(2);
    (2)独立池化操作:对步骤(1)提取的图的特征进行池化操作,得到至少一个向量作为第一池化结果输出至步骤(3);
    (3)卷积池化操作：使用至少一个过滤矩阵对输入进行卷积操作，融合所述的特征对应的支持分类的子图结构得到至少一个向量作为卷积结果，然后，对所述的卷积结果进行池化操作，得到至少一个向量作为池化结果，所述池化结果包含图中更大子图结构的特征，上一级的卷积结果传递至下一级的卷积池化操作，每一级卷积池化操作的池化结果均输出至步骤(4)；其中，第一级卷积池化操作的输入为步骤(1)提取的图的特征，如果有多级卷积池化操作，每一级卷积池化操作的输入为前一级的卷积池化操作的输出结果，最后一级卷积池化操作仅将池化结果输出至步骤(4)；所述的更大子图结构是指顶点个数多于n的子图结构；所述的过滤矩阵为正方形矩阵；每一个所述卷积层中所述过滤矩阵的行数与输入该卷积层的向量的数量相同；
    (4)类别标注:根据所述的第一池化结果和步骤(3)的全部池化结果,对图进行类别标注,输出图的类别。
  100. 根据权利要求97-99任一项所述的基于层叠CNN的图分类方法,其特征在于:所述的过滤矩阵中的元素为大于等于-1、小于等于1的实数。
  101. 根据权利要求97-99任一项所述的基于层叠CNN的图分类方法,其特征在于:所述的过滤矩阵中的元素为大于等于0、小于等于1的实数;
  102. 根据权利要求97-101任一项所述的基于层叠CNN的图分类方法,其特征在于:步骤(3)所述的池化操作选自最大池化操作、平均池化操作。
  103. 根据权利要求97-102任一项所述的图分类方法,其特征在于:
    所述的卷积结果对应的向量的元素值代表子图结构在图上各个位置出现的可能性,所述池化结果、第一池化结果、第二池化结果对应的向量的元素值代表子图结构在图中出现的最大可能性或平均可能性。
  104. 根据权利要求97-103任一项所述的图分类方法,其特征在于:所述的类别标注包括以下步骤:
    (1)特征合并:使用隐含层对接收到的向量进行处理,得到至少一个混合向量传递至步骤(2);所述的混合向量包含所述隐含层接收到的所有向量的信息;优选的,所述的处理对输入的向量进行合并拼接成一个组合向量,并使用至少一个权重向量对所述的组合向量进行线性加权操作 得到至少一个混合向量;
    (2)特征激活:对接收到的每一个混合向量使用激活函数计算得到一个值,并将所有得到的值组成一个向量传递至步骤(3)。
    (3)类型标注:利用接收到的向量计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  105. 根据权利要求104所述的图分类方法,其特征在于:所述的激活函数为sigmoid函数、ReLU激活函数、pReLU函数。
  106. 根据权利要求104-105任一项所述的图分类方法,其特征在于所述标注单元基于分类算法计算出图属于各个分类标签的可能性,并将可能性最高的分类标签标注为图的类别,完成图的分类。
  107. 根据权利要求106所述的图分类方法,其特征在于:所述的分类算法选自kNN、线性分类算法中的任意一种或任意多种。
  108. 一种网络结构类型判别系统,其特征在于:所述的分类系统基于权利要求30-61任一项所述的图分类系统实现网络结构分类,所述图的顶点为网络中的节点,所述图的边为网络中节点的关系。
  109. 根据权利要求108所述的网络结构类型判别系统,其特征在于:所述网络选自电子网络、社交网络、物流网络。
  110. 根据权利要求109所述的网络结构类型判别系统,其特征在于:所述电子网络选自局域网、城域网、广域网、互联网、5G、4G、CDMA、Wi-Fi、GSM、WiMax、802.11、红外、EV-DO、蓝牙、GPS卫星、和/或任意其他适当有线/无线技术或协议的网络的至少一部分中无线发送至少一些信息的任意通信方案。
  111. 根据权利要求108-110任一项所述的网络结构类型判别系统,其特征在于:所述节点选自地理位置、移动站、移动设备、用户装备、移动用户、网络用户。
  112. 根据权利要求111所述的网络结构类型判别系统,其特征在于:所述节点的关系选自电子网络节点之间的信息传输关系、地理位置之间运输关系、人与人之间实际的血缘关系、虚拟社交网络中的好友关系或关注关系、交易关系、发送消息关系。
  113. 根据权利要求108-112任一项所述的网络结构类型判别系统,其特征在于:所述分类选自网络的结构类型;所述结构类型选自星型、树形、全连接型、环形。
  114. 一种化合物分类系统,其特征在于:所述的分类系统基于权利要求30-61任一项所述的图分类系统实现化合物分类,所述图的顶点为化合物的原子,所述图的边为原子之间的化学键。
  115. 根据权利要求114所述的化合物分类系统,其特征在于:所述的分类选自化合物的活性、诱变性、致癌性、催化性。
  116. 一种社交网络分类系统,其特征在于:所述的分类系统基于权利要求30-61任一项所述的图分类系统实现社交网络分类,所述图的顶点为社交网络中的实体,所述图中的边为实体之间的关系,所述的实体包括但不限于社交网络中的人、机构、事件、地理位置,所述的关系包括但不限于好友关系、关注关系、私信、点名、关联。
  117. 一种计算机系统,其特征在于:所述的计算机系统包括权利要求1-14任一项所述的连接信息规整系统、权利要求15-29任一项所述的基于邻接矩阵的图特征提取系统、权利要求30-61任一项所述的图分类系统、权利要求108-113任一项所述的网络结构类型判别系统、权利要求114-115任一项所述的化合物分类系统、权利要求116所述的社交网络分类系统中的任意一种或任意多种。
PCT/CN2018/082111 2017-06-28 2018-04-08 一种基于邻接矩阵的连接信息规整系统、图特征提取系统、图分类系统和方法 WO2019001070A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/727,842 US11461581B2 (en) 2017-06-28 2019-12-26 System and method of connection information regularization, graph feature extraction and graph classification based on adjacency matrix

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
CN201710510474.4 2017-06-28
CN201710510474 2017-06-28
CN201710529419 2017-07-01
CN201710529419.X 2017-07-01
CN201810286686.3 2018-03-31
CN201810286686.3A CN108520275A (zh) 2017-06-28 2018-03-31 一种基于邻接矩阵的连接信息规整系统、图特征提取系统、图分类系统和方法

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/727,842 Continuation US11461581B2 (en) 2017-06-28 2019-12-26 System and method of connection information regularization, graph feature extraction and graph classification based on adjacency matrix

Publications (1)

Publication Number Publication Date
WO2019001070A1 true WO2019001070A1 (zh) 2019-01-03

Family

ID=63431650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/082111 WO2019001070A1 (zh) 2017-06-28 2018-04-08 一种基于邻接矩阵的连接信息规整系统、图特征提取系统、图分类系统和方法

Country Status (3)

Country Link
US (1) US11461581B2 (zh)
CN (1) CN108520275A (zh)
WO (1) WO2019001070A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815440A (zh) * 2019-01-16 2019-05-28 江西师范大学 联合图优化和投影学习的维数约简方法
CN110674829A (zh) * 2019-09-26 2020-01-10 哈尔滨工程大学 一种基于图卷积注意网络的三维目标检测方法
CN111428562A (zh) * 2020-02-24 2020-07-17 天津师范大学 一种基于部件引导图卷积网络的行人再识别方法
CN111916144A (zh) * 2020-07-27 2020-11-10 西安电子科技大学 基于自注意力神经网络和粗化算法的蛋白质分类方法
CN112818179A (zh) * 2019-11-18 2021-05-18 中国科学院深圳先进技术研究院 基于Hybrid存储格式的图遍历访存优化方法、系统及电子设备
CN115797345A (zh) * 2023-02-06 2023-03-14 青岛佳美洋食品有限公司 一种海鲜烘烤异常识别方法

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11037330B2 (en) * 2017-04-08 2021-06-15 Intel Corporation Low rank matrix compression
CN108062551A (zh) * 2017-06-28 2018-05-22 浙江大学 一种基于邻接矩阵的图特征提取系统、图分类系统和方法
CN109460793B (zh) * 2018-11-15 2023-07-18 腾讯科技(深圳)有限公司 一种节点分类的方法、模型训练的方法及装置
CN109684185B (zh) * 2018-12-26 2022-02-01 中国人民解放军国防科技大学 基于启发式遍历的超级计算机大数据处理能力测试方法
CN109656798B (zh) * 2018-12-26 2022-02-01 中国人民解放军国防科技大学 基于顶点重排序的超级计算机大数据处理能力测试方法
CN109858618B (zh) * 2019-03-07 2020-04-14 电子科技大学 一种卷积神经单元块、构成的神经网络及图像分类方法
CN110009625B (zh) * 2019-04-11 2021-02-12 上海科技大学 基于深度学习的图像处理系统、方法、终端、及介质
US11216150B2 (en) * 2019-06-28 2022-01-04 Wen-Chieh Geoffrey Lee Pervasive 3D graphical user interface with vector field functionality
CN112306468A (zh) * 2019-08-02 2021-02-02 伊姆西Ip控股有限责任公司 用于处理机器学习模型的方法、设备和计算机程序产品
CN110957012B (zh) * 2019-11-28 2021-04-09 腾讯科技(深圳)有限公司 化合物的性质分析方法、装置、设备及存储介质
WO2021111606A1 (ja) * 2019-12-05 2021-06-10 日本電気株式会社 グラフ探索装置、グラフ探索方法、及びコンピュータ読み取り可能な記録媒体
US20210374499A1 (en) * 2020-05-26 2021-12-02 International Business Machines Corporation Iterative deep graph learning for graph neural networks
CN113761286B (zh) * 2020-06-01 2024-01-02 杭州海康威视数字技术股份有限公司 一种知识图谱的图嵌入方法、装置及电子设备
CN111860656B (zh) * 2020-07-22 2023-06-16 中南民族大学 分类器训练方法、装置、设备以及存储介质
US10885387B1 (en) * 2020-08-04 2021-01-05 SUPERB Al CO., LTD. Methods for training auto-labeling device and performing auto-labeling by using hybrid classification and devices using the same
US10902291B1 (en) * 2020-08-04 2021-01-26 Superb Ai Co., Ltd. Methods for training auto labeling device and performing auto labeling related to segmentation while performing automatic verification by using uncertainty scores and devices using the same
CN111949792B (zh) * 2020-08-13 2022-05-31 电子科技大学 一种基于深度学习的药物关系抽取方法
US20220067186A1 (en) * 2020-09-02 2022-03-03 Cookie.AI, Inc. Privilege graph-based representation of data access authorizations
CN112148931B (zh) * 2020-09-29 2022-11-04 河北工业大学 用于高阶异构图分类的元路径学习方法
CN112487199B (zh) * 2020-11-24 2022-02-18 杭州电子科技大学 一种基于用户购买行为的用户特征预测方法
CN112529069B (zh) * 2020-12-08 2023-10-13 广州大学华软软件学院 一种半监督节点分类方法、系统、计算机设备和存储介质
CN112560273B (zh) * 2020-12-21 2023-11-10 北京轩宇信息技术有限公司 面向数据流模型的模型组件执行顺序确定方法及装置
CN112801206B (zh) * 2021-02-23 2022-10-14 中国科学院自动化研究所 基于深度图嵌入网络与结构自学习的图像关键点匹配方法
CN112965888B (zh) * 2021-03-03 2023-01-24 山东英信计算机技术有限公司 一种基于深度学习预测任务量的方法、系统、设备及介质
CN113515908A (zh) * 2021-04-08 2021-10-19 国微集团(深圳)有限公司 驱动矩阵及其生成方法、门电路信息的表示方法、图
CN113360496B (zh) * 2021-05-26 2024-05-14 国网能源研究院有限公司 一种构建元数据标签库的方法及装置
CN113313173B (zh) * 2021-06-01 2023-05-30 中山大学 基于图表示和改进Transformer的人体解析方法
US20220156322A1 (en) * 2021-09-29 2022-05-19 Intel Corporation Graph reordering and tiling techniques
CN113990353B (zh) * 2021-10-27 2024-05-07 北京百度网讯科技有限公司 识别情绪的方法、训练情绪识别模型的方法、装置及设备
US20230297626A1 (en) * 2022-03-21 2023-09-21 Armand E. Prieditis Method and system for facilitating graph classification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951442A (zh) * 2014-03-24 2015-09-30 华为技术有限公司 一种确定结果向量的方法和装置
US20150356166A1 (en) * 2014-06-09 2015-12-10 Alcatel-Lucent Bell N.V. Method and System for Representing Paths on a Graph Based on a Classification
CN106203469A (zh) * 2016-06-22 2016-12-07 南京航空航天大学 一种基于有序模式的图分类方法
US20170163502A1 (en) * 2015-12-04 2017-06-08 CENX, Inc. Classifier based graph rendering for visualization of a telecommunications network topology
CN106897739A (zh) * 2017-02-15 2017-06-27 国网江苏省电力公司电力科学研究院 一种基于卷积神经网络的电网设备分类方法


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NIEPERT, M. ET AL.: "Learning Convolutional Neural Networks for Graphs", PROCEEDINGS OF THE 33RD INTERNATIONAL CONFERENCE ON MACHINE LEARNING, 8 June 2016 (2016-06-08), XP055457122 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815440A (zh) * 2019-01-16 2019-05-28 江西师范大学 联合图优化和投影学习的维数约简方法
CN109815440B (zh) * 2019-01-16 2023-06-23 江西师范大学 联合图优化和投影学习的维数约简方法
CN110674829A (zh) * 2019-09-26 2020-01-10 哈尔滨工程大学 一种基于图卷积注意网络的三维目标检测方法
CN112818179A (zh) * 2019-11-18 2021-05-18 中国科学院深圳先进技术研究院 基于Hybrid存储格式的图遍历访存优化方法、系统及电子设备
CN111428562A (zh) * 2020-02-24 2020-07-17 天津师范大学 一种基于部件引导图卷积网络的行人再识别方法
CN111428562B (zh) * 2020-02-24 2022-09-23 天津师范大学 一种基于部件引导图卷积网络的行人再识别方法
CN111916144A (zh) * 2020-07-27 2020-11-10 西安电子科技大学 基于自注意力神经网络和粗化算法的蛋白质分类方法
CN111916144B (zh) * 2020-07-27 2024-02-09 西安电子科技大学 基于自注意力神经网络和粗化算法的蛋白质分类方法
CN115797345A (zh) * 2023-02-06 2023-03-14 青岛佳美洋食品有限公司 一种海鲜烘烤异常识别方法

Also Published As

Publication number Publication date
US11461581B2 (en) 2022-10-04
US20200134362A1 (en) 2020-04-30
CN108520275A (zh) 2018-09-11

Similar Documents

Publication Publication Date Title
WO2019001070A1 (zh) 一种基于邻接矩阵的连接信息规整系统、图特征提取系统、图分类系统和方法
WO2019001071A1 (zh) 一种基于邻接矩阵的图特征提取系统、图分类系统和方法
Bhatti et al. Deep learning with graph convolutional networks: An overview and latest applications in computational intelligence
Abu-El-Haija et al. Learning edge representations via low-rank asymmetric projections
Chen et al. DAGCN: dual attention graph convolutional networks
CN110991532B (zh) 基于关系视觉注意机制的场景图产生方法
Sajid et al. Zoomcount: A zooming mechanism for crowd counting in static images
CN107506786A (zh) 一种基于深度学习的属性分类识别方法
CN113052254B (zh) 多重注意力幽灵残差融合分类模型及其分类方法
Xia et al. Weakly supervised multimodal kernel for categorizing aerial photographs
CN107392254A (zh) 一种通过联合嵌入从像素中构造图像的语义分割方法
CN109919172A (zh) 一种多源异构数据的聚类方法及装置
Wang et al. Deep multi-person kinship matching and recognition for family photos
Setiawan et al. Sequential inter-hop graph convolution neural network (SIhGCN) for skeleton-based human action recognition
Liu et al. Active deep densely connected convolutional network for hyperspectral image classification
CN116645579A (zh) 一种基于异质图注意力机制的特征融合方法
CN104598898A (zh) 一种基于多任务拓扑学习的航拍图像快速识别系统及其快速识别方法
Poelmans et al. Text mining with emergent self organizing maps and multi-dimensional scaling: A comparative study on domestic violence
Khlifi et al. Graph-based deep learning techniques for remote sensing applications: Techniques, taxonomy, and applications—A comprehensive review
He et al. Classification of metro facilities with deep neural networks
CN111259938B (zh) 基于流形学习和梯度提升模型的图片偏多标签分类方法
Li et al. Dual-stream GNN fusion network for hyperspectral classification
Li et al. MultiLineStringNet: a deep neural network for linear feature set recognition
Zhang et al. An MCMC-based prior sub-hypergraph matching in presence of outliers
Seydi et al. Leveraging involution and convolution in an explainable building damage detection framework

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18824809

Country of ref document: EP

Kind code of ref document: A1