CN116383446A - Author classification method based on heterogeneous citation network - Google Patents


Info

Publication number: CN116383446A
Application number: CN202310359202.4A
Authority: CN (China)
Prior art keywords: graph, graph structure, new, meta, heterogeneous
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 李晋, 孙青宇, 张锋, 林森, 程建华
Current and original assignee: Harbin Engineering University (the listed assignees may be inaccurate)
Application filed by Harbin Engineering University
Priority to CN202310359202.4A, published as CN116383446A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; database structures therefor; file system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/901 - Indexing; data structures therefor; storage structures
    • G06F 16/9024 - Graphs; linked lists
    • G06F 16/906 - Clustering; classification
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/042 - Knowledge-based neural networks; logical representations of neural networks
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • Y02D - Climate change mitigation technologies in information and communication technologies [ICT]
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

An author classification method based on a heterogeneous citation network, in particular a method for classifying the authors in a citation network using meta-structure-based heterogeneous graph representation learning. It addresses the low efficiency and accuracy of author classification in citation networks, which results from the low efficiency and accuracy of heterogeneous graph representation learning when GNN methods classify the authors of a citation network. A citation network in a given field is abstracted into a heterogeneous graph, and the heterogeneous graph and the meta-paths and meta-graphs it contains are defined; the node types of the heterogeneous graph comprise articles, authors, and conferences. The heterogeneous graph is processed in turn by a graph structure learner, a graph structure expander, and a graph structure filter to obtain the filtered new graph structures. A graph structure analyzer is constructed according to the HAN model; the analyzer computes node embeddings on the filtered new graph structures, completing the heterogeneous graph representation learning of the citation network, and the authors are classified according to the learned representations to obtain the classified authors. The invention belongs to the field of author classification.

Description

Author classification method based on heterogeneous citation network
Technical Field
The invention relates to an author classification method, in particular to a method for classifying the authors in a citation network using meta-structure-based heterogeneous graph representation learning, and belongs to the field of author classification.
Background
Existing graph neural networks (GNNs) operate on either homogeneous or heterogeneous graphs, and each setting has limitations when extracting information. On homogeneous graphs, graph-convolution-based operations treat the type information of edges and nodes as ordinary edge and node features, ignoring the effect that edge and node types have on the structure. On heterogeneous graphs, the concept of the meta-path was introduced to address these problems: for example, work based on meta-path-guided random walks uses a meta-path sequence to guide the search for neighbor nodes, i.e., when searching for neighbors, the nodes connected through a meta-path rather than through a single edge are retrieved, which yields rich semantics when extracting neighbor node information.
A citation network is a typical heterogeneous graph containing nodes of different types such as A (author), P (paper), and C (conference). Authors in a citation network often consult or cite the work of other authors when publishing papers, and such authors generally belong to the same research field, so it is useful to group the authors of the same field together. However, a large citation network can contain hundreds of millions or even billions of authors, making manual classification impossible. GNN methods are therefore applied to author classification in citation networks: meta-paths are used to mine additional authors and their associated information, and the authors are classified accordingly. For example, the meta-path A (author) - P (paper) - A (author) expresses a co-author relationship between two authors, while A (author) - P (paper) - C (conference) - P (paper) - A (author) indicates that two authors have both published articles at the same conference.
Meta-paths can thus further enrich author information, but the approach has several limitations. First, common GNN methods require predefined meta-paths, and manually constructing them demands strong prior knowledge; since not every meta-path helps enrich node information in a citation network, the useful meta-paths must be identified in advance, which is difficult for non-experts, costs considerable effort and time, and hurts the efficiency of heterogeneous graph representation learning. Second, these methods lack the ability to handle sparse and missing connections between distant nodes in a complex citation network, so some meta-paths useful for enriching author information are never found, reducing the accuracy of heterogeneous graph representation learning. Third, different meta-paths carry different semantics, and the importance of the same meta-path differs across tasks; yet most existing methods simply aggregate the information contained in different meta-paths, which also reduces the accuracy of heterogeneous graph representation learning, harms the subsequent author classification task, and results in low author classification efficiency and accuracy in citation networks.
Disclosure of Invention
The invention aims to solve the following problems of GNN-based author classification in citation networks: great effort and time are required to determine meta-paths on the heterogeneous graph, which hurts the efficiency of heterogeneous graph representation learning; not all meta-paths useful for enriching author information can be found, and the information contained in different meta-paths is simply aggregated, which hurts the accuracy of heterogeneous graph representation learning; as a result, author classification in citation networks has low efficiency and accuracy. The invention therefore provides an author classification method based on a heterogeneous citation network.
The technical scheme adopted by the invention is as follows:
It comprises the following steps:
S1, abstract a citation network in a given research field into a heterogeneous graph, and define the heterogeneous graph and the meta-paths and meta-graphs it contains;
S2, sample and recombine the heterogeneous graph with a graph structure learner to obtain new subgraphs, multiply the new subgraphs as matrices to obtain new graph structures, expand the new graph structures with a graph structure expander to obtain the expanded new graph structures, and apply diversity definition and screening to the expanded new graph structures with a graph structure filter to obtain the filtered new graph structures;
S3, construct a graph structure analyzer according to the HAN model; take the filtered new graph structures as the input of the graph convolutional network GCN inside the analyzer and output node embeddings; transform the node embeddings nonlinearly with a multilayer perceptron; measure the weight of each graph structure under its semantic-specific node embedding by the similarity between the transformed node embedding and a semantic-level attention vector; fuse the weights with the semantic-specific node embeddings to obtain the final node embedding, completing the heterogeneous graph representation learning of the citation network; finally, classify the authors of the citation network of the research field in S1 according to the learned representations to obtain the classified authors.
Further, the specific process of S1 is as follows:
S11, define the heterogeneous graph:
Abstract a citation network in a given research field into a heterogeneous graph G = (V, E) with relation schema T_G = (T_v, T_e), where V is the set of nodes in the heterogeneous graph, E is the set of edges, T_v is the set of node types (articles P, authors A, and conferences C of the field), and T_e is the set of edge types (P-A, A-P, P-C, C-P).
For any two nodes in the heterogeneous graph, obtain the corresponding edge type and store each edge type as an adjacency matrix A ∈ R^(N×N), where N = |V|. The heterogeneous graph can then be stored as a collection of adjacency matrices, i.e., the heterogeneous graph comprises several adjacency matrices and is expressed as the tensor 𝒜 ∈ R^(K×N×N), where K is the number of edge types; each adjacency matrix is in fact a subgraph.
S12, abstract the relations between the nodes of the citation network into meta-structures, which comprise meta-paths and meta-graphs; a meta-path is a path connecting edges of different types on the heterogeneous graph.
S13, define meta-paths based on the heterogeneous graph:
Define P = v_1 →(e_1) v_2 →(e_2) … →(e_l) v_(l+1) as a meta-path, where e_l denotes the l-th type of edge in the meta-path, e_l ∈ T_e.
S14, define meta-graphs based on the heterogeneous graph:
A meta-graph M is a directed acyclic graph with a single source node v_s and a single target node v_t, i.e., v_s has in-degree 0 and v_t has out-degree 0. A meta-graph is thus denoted M = (V_M, E_M, A_M, R_M, v_s, v_t), where V_M ⊆ V, E_M ⊆ E, A_M ⊆ T_v, and R_M ⊆ T_e are constrained by the heterogeneous graph; V_M denotes the set of nodes in meta-graph M, E_M the set of edges in M, A_M the set of node types in M, and R_M the set of edge types in M.
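As a concrete illustration of the meta-path definition in S13, a meta-path over a citation network can be realized numerically as a product of typed adjacency matrices. The sketch below uses a toy graph and illustrative names (`A_AP`, `A_PC` are not from the patent) to show the A-P-C and A-P-C-P-A meta-paths mentioned in the background:

```python
import numpy as np

# Toy citation network: 2 authors (A), 2 papers (P), 1 conference (C).
# Illustrative data only. Each edge type is stored as its own
# adjacency block, one block per edge type.
A_AP = np.array([[1, 0],   # author 0 wrote paper 0
                 [0, 1]])  # author 1 wrote paper 1
A_PC = np.array([[1],      # paper 0 appeared at conference 0
                 [1]])     # paper 1 appeared at conference 0

# A meta-path corresponds to a product of typed adjacency matrices:
# A-P-C ("author published a paper at a conference") is A_AP @ A_PC.
A_APC = A_AP @ A_PC

# A-P-C-P-A: two authors published at the same conference.
A_APCPA = A_APC @ A_APC.T

print(A_APC.ravel())  # each author reaches conference 0
print(A_APCPA)        # nonzero off-diagonal: the two authors share a venue
```

The off-diagonal entries of `A_APCPA` count shared venues, matching the semantics the background section attributes to this meta-path.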
Further, the specific process of S2 is as follows:
S21, define several graph structure generation layers, each consisting of l graph structure learners. A graph structure learner samples the heterogeneous-graph tensor 𝒜 of S11 to obtain several subgraphs A_i and recombines all subgraphs into a new subgraph Q; the l graph structure learners yield l new subgraphs, which are multiplied as matrices to obtain a new graph structure H containing meta-path-type meta-structures of length 1 to l. One generation layer thus yields one new graph structure H, several generation layers yield several new graph structures H, and these form the tensor ℋ of new graph structures.
S22, expand the tensor of new graph structures with the graph structure expander to obtain the tensor of expanded new graph structures, whose graph structures contain meta-graph-type meta-structures;
S23, apply diversity definition and screening to the tensor of expanded new graph structures with the graph structure filter to obtain the tensor of filtered new graph structures.
Further, the specific process of S21 is as follows:
S211, define several graph structure generation layers, each consisting of l graph structure learners; the number of generation layers is expressed as the channel number C.
S212, in each graph structure learner, sample the heterogeneous-graph tensor 𝒜 of S11 to obtain several subgraphs A_i, and use two 1×1 convolution layers to weight and recombine all subgraphs A_i into a new subgraph Q:

    Q = φ(𝒜; softmax(W_φ)) = Σ_(i=1..K) α_i A_i    (1)

where φ denotes the convolution layer, W_φ ∈ R^(1×1×K) denotes the parameter of φ, and A_i and α_i denote the sub-elements of the heterogeneous-graph tensor 𝒜 and of softmax(W_φ), respectively.
The l graph structure learners in each generation layer yield l new subgraphs Q_1, Q_2, …, Q_l.
S213, multiply the l new subgraphs as matrices to obtain a new graph structure H containing meta-path-type meta-structures of length 1 to l:

    H = Q_1 Q_2 … Q_l    (2)

expanding the product over edge types,

    H = Σ_(t_1 ∈ T_e) … Σ_(t_l ∈ T_e) (α_(t_1)^(1) α_(t_2)^(2) … α_(t_l)^(l)) A_(t_1) A_(t_2) … A_(t_l)    (3)

where α_(t_l)^(l) is the weight of the length-l meta-structure in the t_l-th graph structure learner; thus a new graph structure H of meta-structures of length l is obtained. One generation layer yields one new graph structure H, several generation layers yield several new graph structures H, and these form the tensor ℋ ∈ R^(C×N×N) of new graph structures, whose first dimension depends on the channel number C.
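The S212 recombination and the S213 matrix product can be sketched as a minimal numpy routine. This is a hedged reading of the step, not the patent's implementation; the function name, random data, and shapes are illustrative:

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def graph_structure_learner(A, w_phi):
    """One learner (S212): Q = sum_i softmax(w_phi)_i * A_i.
    A: tensor of K edge-type adjacency matrices, shape (K, N, N)."""
    alpha = softmax(w_phi)                  # convex weights over edge types
    return np.tensordot(alpha, A, axes=1)   # weighted recombination -> (N, N)

rng = np.random.default_rng(0)
K, N, l = 4, 5, 2
A = rng.integers(0, 2, size=(K, N, N)).astype(float)

# l learners -> l new subgraphs Q_1..Q_l; their matrix product is the
# new graph structure H with meta-structures up to length l (S213).
Qs = [graph_structure_learner(A, rng.normal(size=K)) for _ in range(l)]
H = np.linalg.multi_dot(Qs) if l > 1 else Qs[0]
print(H.shape)  # (5, 5)
```

Because each `Q` is a convex combination of typed adjacency matrices, their product expands into a weighted sum over all length-l edge-type sequences, which is the sense in which the layer searches over meta-paths.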
Further, the specific process of S22 is:
The graph structure expander is a Hadamard product operation. From the tensor ℋ of new graph structures, arbitrarily select two adjacency matrices H_i and H_j and expand them with the Hadamard product to obtain a new graph structure containing meta-graph-type meta-structures:

    H_HP = H_i ⊙ H_j    (4)

Since a graph structure can be treated as a graph matrix, normalize each element of H_HP by the total value of its matrix row (row-value-based normalization) to obtain an expanded new graph structure. Repeating this operation yields several new graph structures H_HP, which together form the tensor ℋ_HP of expanded new graph structures.
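A minimal sketch of the S22 expander, assuming the row-value normalization divides each entry by its row total; the function and variable names are illustrative:

```python
import numpy as np

def expand_hadamard(H_i, H_j, eps=1e-12):
    """Meta-graph-style expansion (S22): elementwise (Hadamard) product
    of two learned graph structures, then row-sum normalization."""
    H_hp = H_i * H_j                       # Hadamard product
    row_sums = H_hp.sum(axis=1, keepdims=True)
    return H_hp / (row_sums + eps)         # normalize each row by its total

rng = np.random.default_rng(1)
H_i = rng.random((4, 4))
H_j = rng.random((4, 4))
H_hp = expand_hadamard(H_i, H_j)
print(np.allclose(H_hp.sum(axis=1), 1.0))  # rows sum to 1 after normalization
```

The Hadamard product keeps an entry only where both input structures connect the same node pair, which is how two meta-paths sharing endpoints combine into a meta-graph-type structure.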
Further, the specific process of S23 is:
Given an ensemble model HC with member models h_t and weights α_t, the amb (ambiguity) diversity measure is defined as:

    amb(h_t | x) = (h_t(x) − HC(x))^2    (5)

Applying formula (5) to graph structures, the diversity of the information of the different graph structures is defined as:

    div(H_i) = (H_i − H̄)^2, with H̄ = (1/W) Σ_(j=1..W) H_j    (6)

where W is the total number of graph structures.
Based on formula (6), compute the diversity of all new graph structures H_i in the tensor ℋ of new graph structures and the tensor ℋ_HP of expanded new graph structures, sort them from largest to smallest diversity, and select the P new graph structures with the largest diversity, in graph-structure-tensor form, as the output of the graph structure filter:

    ℋ_selected = Top-P(div(H_i))    (7)

where ℋ_selected denotes the tensor of filtered new graph structures, ℋ_selected ∈ R^(P×N×N).
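Assuming the ambiguity-style measure of formulas (5)-(6) scores each structure by its squared deviation from the mean structure, the filter's top-P selection can be sketched as follows (names and data are illustrative):

```python
import numpy as np

def select_diverse(structures, P):
    """S23 filter: score each graph structure by its squared deviation
    from the mean structure (an ambiguity-style diversity measure) and
    keep the P most diverse ones."""
    H = np.stack(structures)                   # (W, N, N)
    H_bar = H.mean(axis=0)                     # ensemble "consensus" structure
    div = ((H - H_bar) ** 2).sum(axis=(1, 2))  # one diversity score per structure
    top = np.argsort(div)[::-1][:P]            # most diverse first
    return H[top], div

rng = np.random.default_rng(2)
W, N, P = 6, 4, 3
candidates = [rng.random((N, N)) for _ in range(W)]
selected, scores = select_diverse(candidates, P)
print(selected.shape)  # (3, 4, 4)
```

Keeping only high-diversity structures discards near-duplicate meta-structures, which is the stated purpose of the screening step.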
Further, the specific process of S3 is as follows:
S31, take the P new graph structures in the tensor ℋ_selected of S23 as the input of the graph convolutional network GCN and output P node embeddings Z_1, Z_2, …, Z_P:

    Z_a = σ(D_a^(−1) H_a X W)

where Z_a denotes the node embedding under the a-th new graph structure, a ∈ {1, …, P}; H_a denotes the adjacency matrix corresponding to the a-th new graph structure in the tensor ℋ_selected; D_a denotes the degree matrix of H_a; X ∈ R^(N×d) denotes the feature matrix; and W ∈ R^(d×d) denotes a trainable weight matrix.
S32, obtain the weight of each new graph structure from the P node embeddings:

    (β_1, β_2, …, β_P) = att_sem(Z_1, Z_2, …, Z_P)    (8)

where att_sem denotes a deep neural network that performs semantic-level attention.
S33, apply a one-layer multilayer perceptron to transform the node embeddings nonlinearly; measure the importance of each new graph structure's semantic-specific node embedding by the similarity between the transformed node embedding and a semantic-level attention vector q; and average the importance over all semantic-specific node embeddings to obtain the importance w_i of each new graph structure:

    w_i = (1/|V|) Σ_(v ∈ V) q^T · tanh(W · z_v^i + b)    (9)

where W is a trainable weight matrix, b is a bias, and q is the semantic-level attention vector.
Normalize the importance of each new graph structure with a softmax function to obtain the weight of the corresponding new graph structure:

    β_i = exp(w_i) / Σ_(j=1..P) exp(w_j)    (10)

where β_i is the weight of each new graph structure; the higher β_i, the more important the new graph structure.
S34, fuse the weights β_i of the new graph structures with the corresponding semantic-specific node embeddings to obtain the final node embedding Z:

    Z = Σ_(i=1..P) β_i Z_i    (11)

This completes the heterogeneous graph representation learning of the citation network; the authors of the citation network of the research field in S1 are classified according to the learned representations, yielding the classified authors.
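The S33/S34 semantic attention and fusion can be sketched together; the mean over nodes, softmax, and weighted sum follow the steps above, while the parameter initialization and names are illustrative:

```python
import numpy as np

def semantic_attention(Zs, W, b, q):
    """Semantic-level attention over P structure-specific embeddings
    (S33/S34): score each Z_i by the mean similarity of tanh(W z_v + b)
    with the attention vector q, softmax the scores, and fuse."""
    w = np.array([np.mean(np.tanh(Z @ W.T + b) @ q) for Z in Zs])  # importance w_i
    beta = np.exp(w - w.max())
    beta /= beta.sum()                              # softmax -> weights beta_i
    Z = sum(bi * Zi for bi, Zi in zip(beta, Zs))    # final embedding Z
    return Z, beta

rng = np.random.default_rng(4)
P, N, d = 3, 5, 4
Zs = [rng.normal(size=(N, d)) for _ in range(P)]   # P structure-specific embeddings
W = rng.normal(size=(d, d))
b = rng.normal(size=d)
q = rng.normal(size=d)
Z, beta = semantic_attention(Zs, W, b, q)
print(Z.shape, np.isclose(beta.sum(), 1.0))  # (5, 4) True
```

The final `Z` is then fed to a downstream classifier over author nodes; a higher `beta[i]` marks the i-th graph structure as more important for the task.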
Advantageous effects
The invention abstracts a citation network in a given research field into a heterogeneous graph, defines the heterogeneous graph and the meta-paths and meta-graphs it contains, and splits the heterogeneous graph into several subgraphs for subsequent computation; the node types of the heterogeneous graph comprise articles P, authors A, and conferences C of the field, and the edge types comprise P-A, A-P, P-C, and C-P. The subgraphs of the heterogeneous graph are sampled and recombined by the graph structure learner to obtain new graph structures; because the learner adaptively generates the meta-paths useful for enriching author node information, the cost of representation learning on the heterogeneous citation network is greatly reduced. The new graph structures are then expanded with the graph structure expander to obtain the expanded new graph structures; this establishes connections between distant author nodes in the citation network, enriches author node information, finds all useful meta-paths, and improves the accuracy of heterogeneous graph representation learning. The expanded new graph structures are then screened by diversity with the graph structure filter, yielding the filtered new graph structures and retaining genuinely useful information.
Finally, the graph structure analyzer produces the final node embeddings of the heterogeneous graph, completing the heterogeneous graph representation learning, and the authors of the citation network of the given research field are classified according to the learned representations to obtain the classified authors. In this step different meta-paths receive different weights, which better matches reality and further improves the accuracy and efficiency of heterogeneous graph representation learning, and thereby the accuracy and efficiency of author classification.
Drawings
FIG. 1 is a schematic diagram of a single-channel graph structure learner;
FIG. 2 is an exemplary diagram of a matrix-multiplication expansion relationship;
FIG. 3 is an exemplary diagram of the operation of the graph structure expander;
FIG. 4 is a schematic diagram of the graph structure analyzer process;
FIG. 5 is a schematic diagram of a visual analysis of a meta-structure;
FIG. 6 is a graph of the results of a node classification experiment.
Detailed Description
The first embodiment: the author classification method based on a heterogeneous citation network according to this embodiment is described with reference to FIGS. 1-6. Since every author node has node features, heterogeneous graph representation learning can be performed from these features; the invention therefore accurately predicts the research field of each author through the meta-structure-based heterogeneous representation learning method and classifies the author nodes. It comprises the following steps:
S1, abstract a citation network in a given research field into a heterogeneous graph, and define the heterogeneous graph and the meta-paths and meta-graphs it contains; the specific process is as follows:
S11, define the heterogeneous graph:
The citation networks involved in the invention include DBLP (Digital Bibliography & Library Project) and ACM (Association for Computing Machinery). A citation network of a given research field is abstracted into a heterogeneous graph G = (V, E) with relation schema T_G = (T_v, T_e), where V is the set of nodes, E is the set of edges, T_v is the set of node types (P (articles), A (authors), C (conferences), etc. of the field), and T_e is the set of edge types (P-A, A-P, P-C, C-P, etc.). The heterogeneous graph G = (V, E) has a node type mapping function f_v: V → T_v and an edge type mapping function f_e: E → T_e. Define |T_v| as the number of node types and |T_e| as the number of edge types; if |T_v| + |T_e| > 2, the graph may contain multiple types of nodes, multiple types of edges, or both.
For any two nodes of the heterogeneous graph (of the same or of different types), the corresponding edge type is obtained, and each edge type is stored as an adjacency matrix; that is, each edge type can be understood as an adjacency matrix A ∈ R^(N×N) storing that part of the heterogeneous graph, where N = |V|. If there is an edge e_ij between node v_i and node v_j, then a_ij in the adjacency matrix A is non-zero, otherwise it is zero; v_i is the i-th node and v_j the j-th node of the heterogeneous graph, v_i, v_j ∈ V, and e_ij ∈ E is the edge connecting v_i and v_j. The element a_ij ∈ A is computed as:

    a_ij = 1 if e_ij ∈ E, and a_ij = 0 otherwise

According to the edge type, the heterogeneous graph can be represented as 𝒜 = {A_1, …, A_K}, where K denotes the number of edge types, K = |T_e|, k denotes the k-th type, k ∈ {1, …, K}, and A_k ∈ R^(N×N); when there is an edge of the k-th type between node v_i and node v_j, A_k[i, j] is non-zero. More precisely, the heterogeneous graph consists of K adjacency matrices and can be expressed as the tensor 𝒜 ∈ R^(K×N×N). Each adjacency matrix is in fact a subgraph; the adjacency matrix is merely its representation in matrix form.
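The per-edge-type adjacency storage of S11 (a_ij = 1 iff edge e_ij exists) can be sketched with a toy graph; the node ids and edges are illustrative, not from the patent:

```python
import numpy as np

# Toy heterogeneous citation graph stored as one adjacency matrix per
# edge type (the tensor view described above).
nodes = ["a0", "a1", "p0", "p1", "c0"]   # two authors, two papers, one conference
idx = {n: i for i, n in enumerate(nodes)}
edges = {                                 # edge type -> typed edge list
    "A-P": [("a0", "p0"), ("a1", "p1")],
    "P-C": [("p0", "c0"), ("p1", "c0")],
}

N = len(nodes)
tensor = {}
for etype, pairs in edges.items():
    A = np.zeros((N, N))
    for u, v in pairs:
        A[idx[u], idx[v]] = 1.0           # a_ij = 1 iff edge e_ij exists
    tensor[etype] = A                     # one subgraph per edge type

print(sorted(tensor), int(tensor["A-P"].sum()))  # ['A-P', 'P-C'] 2
```

Stacking the K per-type matrices gives the tensor 𝒜 ∈ R^(K×N×N) used by the graph structure learner.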
S12, abstract the relations between any nodes of the citation network into meta-structures, which comprise meta-paths and meta-graphs; a meta-path is a path connecting edges of different types on the heterogeneous graph.
S13, define meta-paths based on the heterogeneous graph:
Define P = v_1 →(e_1) v_2 →(e_2) … →(e_l) v_(l+1) as a meta-path, where e_l denotes the l-th type of edge in the meta-path, e_l ∈ T_e. The meta-path defines a composite relation R = R_1 ∘ R_2 ∘ … ∘ R_l between v_1 and v_(l+1), where ∘ denotes the composition operation between node relations.
S14, define meta-graphs based on the heterogeneous graph:
A meta-graph M is a directed acyclic graph with a single source node v_s and a single target node v_t, i.e., v_s has in-degree 0 and v_t has out-degree 0. A meta-graph is thus denoted M = (V_M, E_M, A_M, R_M, v_s, v_t), where V_M ⊆ V, E_M ⊆ E, A_M ⊆ T_v, and R_M ⊆ T_e are constrained by the heterogeneous graph; V_M denotes the set of nodes in meta-graph M, E_M the set of edges in M, A_M the set of node types in M, and R_M the set of edge types in M.
S2, sample and recombine the heterogeneous graph with the graph structure learner to obtain new subgraphs, multiply the new subgraphs as matrices to obtain new graph structures, expand the new graph structures with the graph structure expander to obtain the expanded new graph structures, and apply diversity definition and screening to the expanded new graph structures with the graph structure filter to obtain the filtered new graph structures; the specific process is as follows:
S21, define several graph structure generation layers, each consisting of l graph structure learners. A graph structure learner samples the heterogeneous-graph tensor 𝒜 of S11 to obtain several subgraphs A_i and recombines all subgraphs into a new subgraph Q; the l graph structure learners yield l new subgraphs, which are multiplied as matrices to obtain a new graph structure H containing meta-path-type meta-structures of length 1 to l. One generation layer thus yields one new graph structure H, several generation layers yield several new graph structures H, and these form the tensor ℋ of new graph structures. A graph structure can be regarded as representing the new heterogeneous graph in the form of a graph matrix:
to construct an automated learning process, all sub-graphs sampled from the heterogeneous graph G are combined into a new heterogeneous graph using a graph structure learner. The graph structure learner is a single channel, as shown in fig. 1.
The heterogeneous graph G consists of nodes V of different types and edges E of different types, so for G the initial tensor is 𝒜 ∈ R^(K×N×N). In other words, the set T_e of edge types of E can be used to split 𝒜 into a smaller set of subgraph adjacency matrices:

    𝒜 = {A[e] : e ∈ T_e}

where T_e denotes the edge type set of E, and A[e] denotes the subgraph adjacency matrix in the tensor 𝒜 corresponding to all edges of type e; A[e] contains only the nodes incident to edges of type e.
The heterogeneous graph is thus split into several minimal-unit subgraphs (subgraph adjacency matrices). Each subgraph adjacency matrix A[e] can be regarded as a meta-structure of length 1, while building a more complex meta-structure amounts to selecting corresponding subsets from the tensor 𝒜 and connecting them through matrix multiplication. For example, in the citation network, the adjacency matrix A_APC of the length-2 meta-structure Author-Paper-Conference (APC) can be obtained by multiplying the adjacency matrices A_AP of Author-Paper and A_PC of Paper-Conference, i.e., A_APC = A_AP × A_PC.
Explained from the point of view of the adjacency matrix: each graph structure learner samples the tensor 𝔸 of the heterogeneous graph in S11 to obtain multiple subgraphs A_i, uses two 1×1 convolution layers whose convolution kernels are passed through softmax to compute the weight of each selected adjacency matrix, and weights and recombines all subgraphs A_i to obtain a new subgraph Q; in the number of sampled nodes and edges, Q is smaller than the subgraphs A_i. Each new subgraph Q is a graph structure, as shown in the following formula:

Q = Φ(𝔸; softmax(W_φ)) = Σ_i α_i A_i

wherein Φ represents a convolution layer; W_φ ∈ R^(1×1×K) is the parameter of Φ, and the convolution layer parameters comprise node features, input feature dimension, output feature dimension and the like; A_i and α_i are respectively sub-elements of the heterogeneous graph tensor 𝔸 and of W_φ, e.g. A_i represents the i-th subgraph of 𝔸 and α_i represents the parameter of A_i; Q is in fact a weighted sum of the subgraphs in 𝔸.
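A minimal sketch of one graph structure learner under the formula above, assuming the 1×1 convolution reduces to a softmax over the kernel followed by a weighted sum of the K edge-type subgraphs (the toy matrices and untrained kernel are assumptions):

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def learn_subgraph(A_stack, w_phi):
    """One graph structure learner: a 1x1 convolution over the K-channel
    adjacency tensor acts as a softmax-weighted sum of the K subgraphs."""
    alpha = softmax(w_phi)                        # alpha_i, one per edge type
    return np.tensordot(alpha, A_stack, axes=1)   # Q = sum_i alpha_i * A_i

# Toy tensor: K=2 edge-type subgraphs over N=3 nodes (illustrative values).
A = np.stack([np.eye(3), np.ones((3, 3))])
w = np.array([0.0, 0.0])                          # untrained kernel -> equal weights
Q = learn_subgraph(A, w)
```

With equal weights the learner returns the plain average of the two subgraphs; training would shift the softmax toward the more useful edge types.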
In the above manner, two graph structure learners can learn from the heterogeneous graph tensor 𝔸 two new subgraphs Q_1 and Q_2, and multiplying the two new subgraphs in matrix form generates a new graph structure H containing a meta-path type meta-structure of length 2:

H = D^(−1) Q_1 Q_2

wherein D is the degree matrix of the heterogeneous graph tensor 𝔸, used for normalization, which keeps the numerical values of the graph structure stable.
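A sketch of this normalized composition; taking D from the row sums of the product Q_1 Q_2 is one reasonable reading of the degree normalization and is an assumption here:

```python
import numpy as np

def degree_normalized_product(Q1, Q2):
    """Compose two learned subgraphs into a length-2 meta-path structure,
    then left-multiply by D^-1 (row-sum degrees of the product, assumed)
    so repeated compositions stay numerically stable."""
    P = Q1 @ Q2
    deg = P.sum(axis=1)
    deg[deg == 0] = 1.0          # guard isolated nodes against divide-by-zero
    return P / deg[:, None]      # D^-1 * P: scales each node's row to sum 1

Q1 = np.array([[0., 1.], [1., 1.]])
Q2 = np.array([[1., 0.], [1., 1.]])
H = degree_normalized_product(Q1, Q2)
```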
Each graph structure learner outputs a new subgraph Q, represented as an adjacency matrix. In order to obtain a new graph structure containing meta-path type meta-structures of length l, the l graph structure learners are formed into one graph structure generation layer whose output is a new graph structure H. The number of graph structure generation layers is expressed as the channel number C (channels), so the total number of graph structure learners is l×C; l and C are hyperparameters that can be set freely in training. The new subgraphs Q_1, Q_2, …, Q_l obtained by the l graph structure learners are multiplied in matrix form to obtain a new graph structure H, i.e. one graph structure generation layer (one channel) generates one new graph structure H, as shown in the following formula:
H=Q 1 Q 2 …Q l
wherein Q_l = Σ_{t_l ∈ T_e} α_{t_l}^(l) A_{t_l}, and α_{t_l}^(l) is the weight of the length-l meta-structure in the t_l-th graph structure learner, resulting in a new graph structure H of meta-structures of length l:

H = Σ_{t_1 ∈ T_e} ⋯ Σ_{t_l ∈ T_e} (α_{t_1}^(1) α_{t_2}^(2) ⋯ α_{t_l}^(l)) A_{t_1} A_{t_2} ⋯ A_{t_l}
According to the above process, the obtained multiple new graph structures H form the tensor of the new graph structure:

ℍ = [H_1, H_2, …, H_C] ∈ R^(N×N×C)

wherein the number of new graph structures depends on the channel number C.
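The generation layer just described can be sketched as follows; the layer width l, channel count C, and random toy values are illustrative assumptions:

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def generation_layer(A, W_phi):
    """One graph structure generation layer: l graph structure learners each
    form a softmax-weighted sum Q of the K edge-type subgraphs, and the l
    results are chained by matrix multiplication into H = Q_1 Q_2 ... Q_l."""
    H = None
    for w in W_phi:                               # one 1x1 kernel per learner
        Q = np.tensordot(softmax(w), A, axes=1)   # Q = sum_i alpha_i * A_i
        H = Q if H is None else H @ Q
    return H

rng = np.random.default_rng(0)
K, N, l, C = 3, 4, 2, 2                  # edge types, nodes, learners, channels
A = rng.random((K, N, N))                # toy heterogeneous-graph tensor
layers = [rng.normal(size=(l, K)) for _ in range(C)]
H_tensor = np.stack([generation_layer(A, W) for W in layers])  # C structures, N x N each
```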
The present invention considers a concrete example of why the form of matrix multiplication is used, as shown in fig. 2. A director-film-company heterogeneous graph is used as the input of a graph structure learner. The original relation graph (namely the original heterogeneous graph) is first split into sub-relations indexed by edge relation; as shown in fig. 2, the sub-relations director-(directs)-film and production company-(produces)-film are obtained respectively. If the two sub-relations are directly multiplied, a new relation director-production company, which is not found in the original graph, is obtained. In the data stream, all relations are stored in matrix form, so multiplying the adjacency matrices of the subgraphs naturally yields a new graph structure.
S22, expanding tensors of the new graph structure by using a graph structure expander to obtain tensors of the expanded new graph structure, wherein the expanded new graph structure comprises a meta-graph type meta-structure;
the graph structure learner only linearly expands the graph structure, and the corresponding meta-structure is a meta-path such as A-P-A or P-V-A. In order to enable the graph structure learner to learn more complex structural information, a Hadamard product operation is adopted to expand the tensor ℍ obtained in step S21, namely the Hadamard product operation is performed by a graph structure expander.
The hadamard product operation refers to multiplication of elements at corresponding positions of two matrices, and in a data stream, a relation exists in a matrix form, as shown in the following formula:
Figure BDA0004164416940000106
The elements in the matrix come from the product of the convolution layer weights of each sub-relation and the weights between different layers, so each element is obviously a fraction between 0 and 1. From the tensor of the new graph structure ℍ, two adjacency matrices H_i and H_j are arbitrarily selected, and through the Hadamard product H_i and H_j yield a new graph structure H_HP containing a meta-graph type meta-structure. The graph structure may be used as a graph matrix in which each element corresponds to an attention factor applied to both graph structures. As shown in fig. 3, after the Hadamard product operation on H_i and H_j, the graph structure changes the weight of each sub-relation (each small cell in the figure represents a sub-relation, e.g. F-P or V-P); the weight size is represented by the depth of the color block, a light block representing small attention and a dark block representing large attention. Because decimals between 0 and 1 are multiplied, the values of the new graph structure H_HP always shrink under the Hadamard product, and a large number of Hadamard product operations reduces the numerical stability of the graph structure. Therefore, a normalization method based on matrix row values is adopted for the new graph structure H_HP: each element of the graph matrix is normalized by the total value of its row, which is essentially normalizing all the out-degrees of one node. The specific definition is as follows:

H_HP[k, j] = H_HP[k, j] / Σ_{j=1}^{N} H_HP[k, j]

wherein N is the number of nodes, and k and j respectively represent the k-th row and the j-th column of the new graph structure H_HP matrix.
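A sketch of the expander step under these definitions (Hadamard product followed by row-value normalization; the toy matrices are assumed):

```python
import numpy as np

def expand_hadamard(H_i, H_j):
    """Graph structure expander: element-wise (Hadamard) product of two
    learned graph structures, followed by row-sum normalization so repeated
    products of values in [0, 1] do not shrink toward zero."""
    H_hp = H_i * H_j                           # Hadamard product
    row_sums = H_hp.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0              # avoid division by zero
    return H_hp / row_sums                     # normalize each node's out-degree

H_i = np.array([[0.2, 0.8], [0.5, 0.5]])
H_j = np.array([[0.5, 0.5], [0.9, 0.1]])
H_hp = expand_hadamard(H_i, H_j)
```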
The graph structure expander obtains a plurality of new graph structures H_HP through the Hadamard product and normalization operations, and the multiple new graph structures H_HP together form the tensor of the expanded new graph structure ℍ_HP.
S23, performing diversity definition and screening on tensors of the expanded new graph structure by using a graph structure screening device to obtain tensors of the screened new graph structure;
Steps S21 and S22 yield ℍ and ℍ_HP, i.e. a plurality of new graph structures containing meta-path type and meta-graph type meta-structures respectively. The weight coefficient of each meta-structure differs across graph structures, and so does the improvement each brings to the downstream learning task. In order to find the meta-structures with the best lifting effect, the learned graph structures need to be screened initially, and the graph structures with larger structural differences are retained as output.
The diversity is defined, the diversity identifies the extraction capacity of different models on input features, and the model with larger diversity has better feature extraction performance. Diversity measurements were first proposed by adaboost.nc, which named amb as the proposed method. Suppose that when the predicted x and the actual result agree, h t (x) Equal to 1; otherwise h t (x) Equal to-1. Given an integrated model HC and a weight alpha t The definition of amb can be obtained as shown in the following formula:
Figure BDA0004164416940000115
The idea of amb is to take the difference of each result against all other results, i.e. to compute T×(T−1)/2 differences, T representing the number of results. amb mainly addresses the problem of ranking the performances of different models within an integrated model. The graph structure learners learn abundant structural information in different graphs, and the output graph structures likewise face the question of how much structural information they contain; the graph structure information contained in different graph structures can therefore be defined through diversity, as shown in the following formula:

Div(H_i) = Σ_{j=1}^{W} |H_i − H_j|
where W is the total number of graph structures. The definition of diversity is equivalent to taking the difference between two graph structures and taking absolute values; the absolute value of the difference is defined as the distance between the two graph structures. For any one graph structure, its diversity is defined as the sum of its distances to all other graph structures.

Based on the above, the diversities of all the new graph structures H_i in the tensor ℍ of the new graph structure and the tensor ℍ_HP of the expanded new graph structure are calculated and sorted from large to small, and the P new graph structures with the largest diversity are selected as the output of the graph structure filter, namely the screened new graph structure, whose form is the graph structure tensor shown in the following formula:

ℍ_selected = [H_1, H_2, …, H_P] ∈ R^(N×N×P)
wherein ℍ_selected ∈ R^(N×N×P). The number of meta-structures in the screened graphs is not determined by P, which only defines how many graph structures there are; a single graph structure H_i is a weighted combination of all meta-structures of length 1 to l and therefore contains various meta-structures. Adopting diversity screening is thus equivalent to guiding the graph structure generation layer, when learning meta-structures, to select graph structures that differ as much as possible from the original graph structure and from the graph structures learned by the learners of other channels. The purpose of the diversity graph structure filter is to optimize the output of the graph structure generation layer so that the training process prefers to learn unique graph structures, which benefits the learning process of downstream tasks.
The P graph structures in the tensor ℍ_selected of the new graph structure are each a graph structure H_i ∈ R^(N×N); any value in the graph structure matrix represents a weighted sum of the meta-structures between two points. Taking the paper citation network as an example, the matrix corresponding to the meta-structure author-paper-author (A-P-A) is H_A-P-A = (α_1 × H_A-P)(α_2 × H_P-A). The initial H_A-P and H_P-A are the corresponding adjacency matrices, whose values are 0 or 1, where 0 represents non-connection and 1 represents connection; α_1 and α_2 are both parameter vectors consisting of numbers between 0 and 1, so the values of the final H_A-P-A lie in the range [0, 1], and the weights indirectly represent the selected meta-structures.
S3, constructing a graph structure analyzer according to the HAN model, taking the screened new graph structure as the input of the graph convolutional network GCN in the graph structure analyzer and outputting node embeddings, performing nonlinear conversion on the node embeddings with a multi-layer perceptron, measuring the weight of each specific graph structure under specific-semantic node embedding according to the similarity between the nonlinearly converted node embedding and a semantic-level attention vector, fusing the weights with the specific-semantic node embeddings to obtain the final node embedding, completing heterogeneous graph representation learning of the citation network, and classifying the authors of the citation network of the research field in S1 according to the heterogeneous graph representation learning to obtain classified authors. The specific process is as follows:
S31, the P new graph structures in the tensor ℍ_selected of the new graph structure in S23 are used as the input of the graph convolutional network GCN, which outputs P node embeddings Z_1, Z_2, …, Z_P:

Z_a = σ(D_a^(−1) H_a X W)

wherein Z_a represents the node embedding under the a-th new graph structure, a ∈ {1, …, P}; H_a represents the adjacency matrix corresponding to the a-th new graph structure in the tensor ℍ_selected; D_a represents the degree matrix of H_a; X ∈ R^(N×d) represents the feature matrix; and W ∈ R^(d×d) represents a training weight matrix.
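A sketch of this GCN pass on one learned structure; ReLU is assumed for the nonlinearity σ, and the toy inputs are illustrative:

```python
import numpy as np

def gcn_layer(H_a, X, W):
    """One GCN pass on a learned graph structure: Z = sigma(D^-1 H X W),
    with ReLU assumed for sigma."""
    deg = H_a.sum(axis=1)
    deg[deg == 0] = 1.0                    # guard isolated nodes
    Z = (H_a / deg[:, None]) @ X @ W       # D^-1 H aggregates, W projects
    return np.maximum(Z, 0.0)              # ReLU

H_a = np.array([[1., 1.], [0., 1.]])       # toy adjacency (2 nodes)
X = np.array([[1., 0.], [0., 1.]])         # toy feature matrix
W = np.eye(2)                              # untrained weight matrix
Z = gcn_layer(H_a, X, W)
```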
S32, the weight of each new graph structure is obtained from the P node embeddings:

(β_1, β_2, …, β_P) = att_sem(Z_1, Z_2, …, Z_P)

wherein att_sem represents a deep neural network performing semantic-level attention. Research has shown that semantic-level attention can capture the various semantic information behind heterogeneous graphs.
S33, a one-layer multi-layer perceptron (MLP) is used to nonlinearly convert the node embeddings; the similarity between the converted node embedding and a semantic-level attention vector q measures the importance of each specific-semantic node embedding of a specific new graph structure, and the importances of all specific-semantic node embeddings are averaged to obtain the importance w_i of each new graph structure:

w_i = (1/|V|) Σ_{v∈V} q^T · tanh(W · z_v^i + b)
wherein W is a training weight matrix, b is a bias value, and q is the semantic-level attention vector. After obtaining the importance of each graph structure, the importance of each new graph structure is normalized by a softmax function to obtain the weight of the corresponding new graph structure, as shown in the following formula:

β_i = exp(w_i) / Σ_{j=1}^{P} exp(w_j)

wherein β_i is the weight of each new graph structure, i.e. the contribution of each graph structure to a particular heterogeneous graph learning task; the higher β_i, the more important the graph structure. The same graph structure may have different weights for different tasks.
S34, the weight β_i of each graph structure is fused with the specific-semantic node embedding of the corresponding graph structure to obtain the final node embedding Z, completing the heterogeneous graph representation learning of the citation network; the authors of the citation network of the research field in S1 are classified according to the heterogeneous graph representation learning, obtaining classified authors:

Z = Σ_{i=1}^{P} β_i Z_i
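Steps S32 to S34 can be sketched together as follows (HAN-style semantic-level attention; the toy embeddings, weight matrix, bias, and attention vector are illustrative assumptions):

```python
import numpy as np

def semantic_fusion(Zs, W, b, q):
    """HAN-style semantic-level attention sketch: score each structure-specific
    embedding by q . tanh(W z + b) averaged over nodes, softmax the scores
    into weights beta, and fuse the final embedding Z = sum_i beta_i * Z_i."""
    w = np.array([np.mean(np.tanh(Z @ W.T + b) @ q) for Z in Zs])
    e = np.exp(w - w.max())                  # numerically stable softmax
    beta = e / e.sum()
    Z = sum(b_i * Z_i for b_i, Z_i in zip(beta, Zs))
    return Z, beta

# Toy node embeddings from P=2 learned graph structures (3 nodes, dim 2).
Zs = [np.ones((3, 2)), np.zeros((3, 2))]
W_att, b_att, q = np.eye(2), np.zeros(2), np.ones(2)
Z, beta = semantic_fusion(Zs, W_att, b_att, q)
```

Here the all-ones embedding receives the larger weight β, so it dominates the fused embedding Z.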
In general, heterogeneous graphs contain different meaningful and complex semantic information, which is usually reflected by meta-structures and utilized for node embedding. Different meta-structures in a heterogeneous graph may extract different semantic information, which is of different importance to a node; for example, in a movie network with the two meta-structure types Movie-Year and Movie-Actor, the meta-structure Movie-Actor is obviously more important to a movie. In order to solve the difficult problems of meta-structure selection and semantic fusion in heterogeneous graphs, semantic-level attention is applied to the graph structures through the graph structure analyzer, automatically learning the importance of graph structures consisting of different meta-structures. The graph structure analyzer implementation is shown in fig. 4.
The citation network in the present invention undergoes representation learning through the meta-structure-based heterogeneous graph representation learning method, so the research fields of paper authors can be analyzed; the authors can further be classified according to their inferred research fields, facilitating reference and citation of documents.
Node classification experiments were conducted on real data sets in the above exemplary embodiment. The present invention is denoted MS-GAN; the graph representation methods selected for the comparison group are DeepWalk, metapath2vec, GCN, GAT, HAN and GTN; the selected data sets are DBLP, IMDB and ACM; and the evaluation index used is Macro-F1.
In fig. 5, the generated meta-structures are analyzed visually, showing the meta-structures learned by the graph structure generation layers of different channels; color depth represents weight size. Taking the leftmost channel as an example, the weight within the meta-structure concentrates on the three sub-relations V-P, F-P and P-A, that is, the generation layer of that channel attaches the greatest importance to the two meta-structures V-P-A and F-P-A. As can be seen from fig. 6, compared with other similar algorithms, the graph representation learning scheme of the present invention has the highest Macro-F1 on all three data sets, i.e. the best classification effect. Fig. 5 and 6 illustrate that the technical scheme of the invention has the characteristic of high accuracy while automatically learning meta-structures.

Claims (7)

1. An author classification method based on a heterogeneous citation network, characterized by comprising the following steps:
s1, abstracting a citation network of a certain research field into a heterogeneous graph, and respectively defining the heterogeneous graph and the meta-paths and meta-graphs contained in it;
s2, sampling and recombining the heterogeneous graph using graph structure learners to obtain new subgraphs, multiplying the new subgraphs in matrix form to obtain a new graph structure, expanding the new graph structure using a graph structure expander to obtain an expanded new graph structure, and performing diversity definition and screening on the expanded new graph structure using a graph structure filter to obtain a screened new graph structure;
s3, constructing a graph structure analyzer according to the HAN model, taking the screened new graph structure as the input of the graph convolutional network GCN in the graph structure analyzer and outputting node embeddings, performing nonlinear conversion on the node embeddings with a multi-layer perceptron, measuring the weight of each specific graph structure under specific-semantic node embedding according to the similarity between the nonlinearly converted node embedding and a semantic-level attention vector, fusing the weights with the specific-semantic node embeddings to obtain the final node embedding, completing heterogeneous graph representation learning of the citation network, and classifying the authors of the citation network of the research field in S1 according to the heterogeneous graph representation learning to obtain classified authors.
2. The author classification method based on a heterogeneous citation network according to claim 1, wherein: the specific process of S1 is as follows:
s11, defining the heterogeneous graph:
abstracting a citation network of a certain research field into a heterogeneous graph defined as G=(V,E), whose relation schema is defined as T_G=(T_v,T_e), wherein V is the set of nodes in the heterogeneous graph, E is the set of edges in the heterogeneous graph, T_v is the set of node types in the heterogeneous graph, the node types comprising papers P, authors A and conferences C of a certain field, and T_e is the set of edge types in the heterogeneous graph, the edge types comprising P-A, A-P, P-C and C-P;
obtaining the corresponding edge type from any two nodes in the heterogeneous graph, and storing each edge type with an adjacency matrix A ∈ R^(N×N), wherein N=|V|; the heterogeneous graph can then be stored with adjacency matrices, i.e. the heterogeneous graph comprises a plurality of adjacency matrices, so the heterogeneous graph is a tensor 𝔸 ∈ R^(N×N×K), and each adjacency matrix is actually a subgraph;
s12, abstracting the relations among nodes in the citation network into meta-structures, wherein the meta-structures comprise meta-paths and meta-graphs, and a meta-path is a path connecting different types of edges on the heterogeneous graph;
s13, defining a meta-path based on the heterogeneous graph:
defining 𝒫 = e_1 ∘ e_2 ∘ ⋯ ∘ e_l to represent a meta-path, wherein e_l represents the l-th type of edge in the meta-path and e_l ∈ T_e;
S14, defining a primitive graph based on the heterogeneous graph:
metagraph M is a single source node v s And a single target node v t Directed acyclic graph of (v), i.e. v s Is of the degree of penetration of 0, v t The degree of departure of (2) is 0, so M= (V) M ,E M ,A M ,R M ,v s ,v t ) Representing a metagraph, wherein,
Figure FDA0004164416930000021
are respectively subjected to
Figure FDA0004164416930000022
Constraint of V M Representing a set of nodes in metagraph M, E M Representing a set of edges in metagraph M, A M Representing a set of node types in a metagraph M, R M Representing a collection of edge types in metagraph M.
3. The author classification method based on a heterogeneous citation network according to claim 2, wherein: the specific process of S2 is as follows:
s21, defining a plurality of graph structure generation layers, wherein each graph structure generation layer consists of l graph structure learners; using a graph structure learner, sampling the tensor 𝔸 of the heterogeneous graph in S11 to obtain multiple subgraphs A_i and recombining all subgraphs to obtain a new subgraph Q, so that the l graph structure learners yield l new subgraphs; multiplying the l new subgraphs in matrix form to obtain a new graph structure H containing meta-path type meta-structures of length 1 to l, i.e. one new graph structure H is obtained by one graph structure generation layer, multiple new graph structures H are obtained by multiple graph structure generation layers, and the multiple new graph structures H form the tensor ℍ of the new graph structure;
S22, expanding tensors of the new graph structure by using a graph structure expander to obtain tensors of the expanded new graph structure, wherein the expanded new graph structure comprises a meta-graph type meta-structure;
s23, performing diversity definition and screening on tensors of the expanded new graph structure by using a graph structure screening device to obtain tensors of the screened new graph structure.
4. A heterogeneous citation network based author classification method as claimed in claim 3, wherein: the specific process of S21 is as follows:
s211, defining a plurality of graph structure generation layers, wherein each graph structure generation layer consists of l graph structure learners, and the number of the graph structure generation layers is expressed as a channel number C;
s212, in each graph structure learner, sampling the tensor 𝔸 of the heterogeneous graph in S11 to obtain multiple subgraphs A_i, obtaining the weights of all subgraphs A_i using two 1×1 convolution layers, and weighting and recombining them to obtain a new subgraph Q:

Q = Φ(𝔸; softmax(W_φ)) = Σ_i α_i A_i (1)

wherein Φ represents a convolution layer, W_φ ∈ R^(1×1×K) represents the parameter of Φ, and A_i, α_i respectively represent sub-elements of the heterogeneous graph tensor 𝔸 and of W_φ;
obtaining l new subgraphs Q_1, Q_2, …, Q_l for the l graph structure learners in each graph structure generation layer;
s213, multiplying the l new subgraphs in matrix form to obtain a new graph structure H containing meta-path type meta-structures of length 1 to l:

H=Q_1 Q_2 ⋯ Q_l (2)

wherein Q_l = Σ_{t_l ∈ T_e} α_{t_l}^(l) A_{t_l}, and α_{t_l}^(l) is the weight of the length-l meta-structure in the t_l-th graph structure learner, giving a new graph structure H of meta-structures of length l:

H = Σ_{t_1 ∈ T_e} ⋯ Σ_{t_l ∈ T_e} (α_{t_1}^(1) α_{t_2}^(2) ⋯ α_{t_l}^(l)) A_{t_1} A_{t_2} ⋯ A_{t_l} (3)
one new graph structure H is obtained by one graph structure generation layer, multiple new graph structures H are obtained by multiple graph structure generation layers, and the multiple new graph structures H form the tensor of the new graph structure:

ℍ = [H_1, H_2, …, H_C] ∈ R^(N×N×C) (4)

wherein the number of new graph structures depends on the channel number C.
5. The author classification method based on a heterogeneous citation network according to claim 4, wherein: the specific process of S22 is as follows:
the graph structure expander is a Hadamard product operation: from the tensor ℍ of the new graph structure, two adjacency matrices H_i and H_j are arbitrarily selected and expanded by the Hadamard product H_HP = H_i ⊙ H_j, obtaining a new graph structure H_HP containing a meta-graph type meta-structure; since the graph structure can be used as a graph matrix, each element of the new graph structure H_HP is normalized by the total value of its row of the graph matrix using a normalization method based on matrix row values, obtaining an expanded new graph structure; the operation is repeatedly executed to obtain multiple new graph structures H_HP, which together form the tensor ℍ_HP of the expanded new graph structure.
6. The author classification method based on a heterogeneous citation network according to claim 5, wherein: the specific process of S23 is as follows:
given an integrated model HC and weights α_t, the definition of the amb diversity measurement method is obtained as shown in the following formula:

amb = Σ_{t=1}^{T} Σ_{s=t+1}^{T} α_t α_s |h_t(x) − h_s(x)| (5)

based on formula (5), the structural information contained in different graph structures is given a diversity definition, as shown in the following formula:

Div(H_i) = Σ_{j=1}^{W} |H_i − H_j| (6)
where W is the total number of graph structures;
calculating, based on formula (6), the diversities of all the new graph structures H_i in the tensor ℍ of the new graph structure and the tensor ℍ_HP of the expanded new graph structure, sorting them from large to small, and selecting the P new graph structures with the largest diversity to form the graph structure tensor as the output of the graph structure filter, as shown in the following formula:

ℍ_selected = [H_1, H_2, …, H_P] (7)
wherein ℍ_selected ∈ R^(N×N×P) represents the tensor of the screened new graph structure.
7. The author classification method based on a heterogeneous citation network according to claim 6, wherein: the specific process of S3 is as follows:
s31, the P new graph structures in the tensor ℍ_selected of the new graph structure in S23 are used as the input of the graph convolutional network GCN, which outputs P node embeddings Z_1, Z_2, …, Z_P:

Z_a = σ(D_a^(−1) H_a X W)

wherein Z_a represents the node embedding under the a-th new graph structure, a ∈ {1, …, P}, H_a represents the adjacency matrix corresponding to the a-th new graph structure in the tensor ℍ_selected, D_a represents the degree matrix of H_a, X ∈ R^(N×d) represents the feature matrix, and W ∈ R^(d×d) represents a training weight matrix;
s32, obtaining the weight of each new graph structure from the P node embeddings:

(β_1, β_2, …, β_P) = att_sem(Z_1, Z_2, …, Z_P) (8)
wherein att_sem represents a deep neural network performing semantic-level attention;
s33, using a one-layer multi-layer perceptron to nonlinearly convert the node embeddings, measuring the importance of each specific-semantic node embedding of a specific new graph structure by the similarity between the converted node embedding and a semantic-level attention vector q, and averaging the importances of all specific-semantic node embeddings to obtain the importance w_i of each new graph structure:

w_i = (1/|V|) Σ_{v∈V} q^T · tanh(W · z_v^i + b) (9)
Wherein W is a training weight matrix, b is a bias value, and q is a semantic hierarchy attention vector;
normalizing the importance of each new graph structure through a softmax function to obtain the weight of the corresponding new graph structure, as shown in the following formula:

β_i = exp(w_i) / Σ_{j=1}^{P} exp(w_j) (10)
wherein β_i is the weight of each new graph structure; the higher β_i, the more important the new graph structure;
s34, fusing the weight β_i of each new graph structure with the specific-semantic node embedding of the corresponding new graph structure to obtain the final node embedding Z:

Z = Σ_{i=1}^{P} β_i Z_i (11)
and finally, completing the heterogeneous graph representation learning of the citation network, and classifying the authors of the citation network of the research field in S1 according to the heterogeneous graph representation learning, obtaining the classified authors.
CN202310359202.4A 2023-04-06 2023-04-06 Author classification method based on heterogeneous quotation network Pending CN116383446A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310359202.4A CN116383446A (en) 2023-04-06 2023-04-06 Author classification method based on heterogeneous quotation network


Publications (1)

Publication Number Publication Date
CN116383446A true CN116383446A (en) 2023-07-04

Family

ID=86980287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310359202.4A Pending CN116383446A (en) 2023-04-06 2023-04-06 Author classification method based on heterogeneous quotation network

Country Status (1)

Country Link
CN (1) CN116383446A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116578751A (en) * 2023-07-12 2023-08-11 中国医学科学院医学信息研究所 Main path analysis method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110677284A (en) * 2019-09-24 2020-01-10 Beijing Technology and Business University Heterogeneous network link prediction method based on meta path
WO2020088439A1 (en) * 2018-10-30 2020-05-07 Tencent Technology (Shenzhen) Co., Ltd. Method for identifying isomerism graph and molecular spatial structural property, device, and computer apparatus
US20200342006A1 (en) * 2019-04-29 2020-10-29 Adobe Inc. Higher-Order Graph Clustering
KR20210035786A (en) * 2020-06-19 2021-04-01 Beijing Baidu Netcom Science and Technology Co., Ltd. Method and apparatus for generating model for representing heterogeneous graph node, electronic device, storage medium and program
CN113095439A (en) * 2021-04-30 2021-07-09 Southeast University Heterogeneous graph embedding learning method based on attention mechanism
WO2021179838A1 (en) * 2020-03-10 2021-09-16 Alipay (Hangzhou) Information Technology Co., Ltd. Prediction method and system based on heterogeneous graph neural network model
CN113806488A (en) * 2021-09-24 2021-12-17 Shijiazhuang Tiedao University Heterogeneous graph conversion text mining method based on meta-structure learning
CN113868482A (en) * 2021-07-21 2021-12-31 National University of Defense Technology Heterogeneous network link prediction method suitable for scientific cooperative network
CN114239711A (en) * 2021-12-06 2022-03-25 National University of Defense Technology Node classification method based on heterogeneous information network small-sample learning
CN114564573A (en) * 2022-03-14 2022-05-31 Tianjin University Academic cooperative relationship prediction method based on heterogeneous graph neural network


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LI JIN et al.: "Heterogeneous Graph Embedding for Cross-Domain Recommendation Through Adversarial Learning", LECTURE NOTES IN COMPUTER SCIENCE, 22 September 2020 (2020-09-22), pages 507 - 522 *
ZHU DANHAO et al.: "Paper citation prediction method based on heterogeneous feature fusion", Journal of Data Acquisition and Processing, 18 October 2022 (2022-10-18), pages 1134 - 1144 *
TAN XINYUAN et al.: "Research on heterogeneous graph neural network models aggregating higher-order neighbor nodes", Journal of Chinese Computer Systems, 8 July 2022 (2022-07-08), pages 1954 - 1960 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116578751A (en) * 2023-07-12 2023-08-11 Institute of Medical Information, Chinese Academy of Medical Sciences Main path analysis method and device
CN116578751B (en) * 2023-07-12 2023-09-22 Institute of Medical Information, Chinese Academy of Medical Sciences Main path analysis method and device

Similar Documents

Publication Publication Date Title
Ferentinou et al. Computational intelligence tools for the prediction of slope performance
CN110176280B (en) Method for describing crystal structure of material and application thereof
CN107358264A (en) A kind of method that graphical analysis is carried out based on machine learning algorithm
CN111737535A (en) Network characterization learning method based on element structure and graph neural network
CN108229578B (en) Image data target identification method based on three layers of data, information and knowledge map framework
Ibáñez et al. Using Bayesian networks to discover relationships between bibliometric indices. A case study of computer science and artificial intelligence journals
CN116383446A (en) Author classification method based on heterogeneous quotation network
Wang et al. Design of the Sports Training Decision Support System Based on the Improved Association Rule, the Apriori Algorithm.
CN114937173A (en) Hyperspectral image rapid classification method based on dynamic graph convolution network
CN107818328A (en) With reference to the deficiency of data similitude depicting method of local message
Nie et al. Adap-EMD: Adaptive EMD for aircraft fine-grained classification in remote sensing
Granell et al. Unsupervised clustering analysis: a multiscale complex networks approach
El Wakil et al. Data management for construction processes using fuzzy approach
Elhebir et al. A novel ensemble approach to enhance the performance of web server logs classification
CN114254199A (en) Course recommendation method based on bipartite graph projection and node2vec
CN109768890B (en) Symbolized directed weighted complex network building method based on STL decomposition method
Wijayanto et al. Predicting future potential flight routes via inductive graph representation learning
Divya et al. Survey on outlier detection techniques using categorical data
Bitsakidis et al. Hybrid Cellular Ants for Clustering Problems.
Dold et al. Evaluating the feasibility of interpretable machine learning for globular cluster detection
FUENTES HERRERA et al. Rough Net Approach for Community Detection Analysis in Complex Networks
Dilmaghani et al. Innovation Networks from Inter-organizational Research Collaborations
Yadav et al. Study of graph theory based machine learning
Han et al. Rough set-based decision tree using a core attribute
Bazan et al. A Classifier Based on a Decision Tree with Temporal Cuts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination