CN116306779A - Knowledge reasoning method based on structure distinguishable representation graph neural network - Google Patents

Knowledge reasoning method based on structure distinguishable representation graph neural network

Info

Publication number
CN116306779A
CN116306779A (application CN202310089311.9A)
Authority
CN
China
Prior art keywords
graph
knowledge
model
network model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310089311.9A
Other languages
Chinese (zh)
Inventor
Zhou Zhengbin
Wang Zhen
Hui Bo
Sun Ming
Kang Zhao
Wang Yong
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Creative Information Technology Co ltd
Original Assignee
Creative Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Creative Information Technology Co ltd filed Critical Creative Information Technology Co ltd
Priority to CN202310089311.9A
Publication of CN116306779A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a knowledge reasoning method based on a structure-distinguishable representation graph neural network, belonging to the technical field of knowledge graph reasoning. The method comprises: preprocessing training data; establishing a neighborhood aggregation mechanism; constructing a graph attention network model through the neighborhood aggregation mechanism, and learning an embedded representation of the preprocessed training data by using the graph attention network model; setting a loss function, performing end-to-end training on the graph attention network model, and updating the network parameters of the model by using the loss function; and inputting the preprocessed training data into the updated graph attention network model for training, and completing the knowledge graph by using the trained model. The invention first unifies the representation of the training data. It then designs a neighborhood aggregation mechanism that enhances the distinguishability of the network model, constructs a complete graph attention network model with an encoder-decoder structure built around the neighborhood aggregation mechanism, and embeds the training data through the representations learned by the graph attention network, improving the distinguishing capability of the model.

Description

Knowledge reasoning method based on structure distinguishable representation graph neural network
Technical Field
The invention relates to the technical field of knowledge graph processing, in particular to a knowledge reasoning method based on a structure distinguishable representation graph neural network.
Background
In the process of constructing a knowledge graph, a large amount of knowledge is derived from documents and web pages, and extracting knowledge from documents often introduces deviations from two sources: (1) documents contain much noisy, useless information, which may originate from the knowledge extraction algorithm itself or from the quality of the language text itself; (2) the limited amount of information in documents does not cover all knowledge, especially much common-sense knowledge. Both factors leave the resulting knowledge graph incomplete, so knowledge graph completion is increasingly important in knowledge graph construction.
Knowledge graph completion, also known as link prediction, focuses on predicting the head entity (subject) or tail entity (object) that a test triple lacks. A link prediction method defines a scoring function that assigns a value to each triple, such that true triples score higher than false triples. The most advanced knowledge graph completion methods can currently be divided into translation models, tensor factorization models, convolutional-neural-network-based models, and graph-neural-network-based models.
Translation models represent entities and relations as vectors in a low-dimensional vector space to aid knowledge graph completion (KGC). TransE represents the relation between the head and tail entities as a translation operation, and can model composition and inverse relations but not symmetric ones. DistMult scores a triple with a bilinear product over the head entity, relation, and tail entity, capturing the actual and missing inferred relations between entities in a knowledge graph subgraph; however, its symmetric formulation cannot represent antisymmetric and inverse relations. ComplEx resolves DistMult's limitation by using complex-valued embeddings to represent symmetric, antisymmetric and inverse relations, but it cannot model composition relations. Translation models perform well on simple graphs using simple operations and a limited number of parameters.
Tensor factorization models represent the knowledge graph as a third-order binary tensor in which each element encodes whether the corresponding triple is a true fact. The core idea of the RESCAL model is to encode the whole knowledge graph into a three-dimensional tensor, decompose it into a core tensor and factor matrices, and take the reconstruction from the core tensor and factor matrices as the probability that a triple is true. The TuckER model decomposes the three-dimensional tensor representing the KG into a core tensor and three matrices. It is a fully expressive model, and the lower bound on the number of parameters it requires for full expressiveness is several orders of magnitude smaller than that of other models. These methods frame the knowledge base completion task as a three-dimensional (3D) binary tensor completion problem.
Recently, several convolutional neural network (CNN) based models, namely ConvE, ConvKB and CapsE, have achieved good performance. ConvE applies a global two-dimensional convolution to the reshaped entity and relation embeddings; it is highly parameter-efficient and achieves better results than many other methods. In ConvKB, each triple (head entity, relation, tail entity) is represented as a three-column matrix in which each column vector corresponds to one triple element, allowing the model to capture global relationships and translational characteristics between entities and relations in the knowledge base. CapsE models relation triples with a capsule network; building on ConvKB, it encapsulates all feature maps into a capsule after the convolution layer. These methods use local entity features to complete the knowledge graph but ignore global structure information, and thus cannot adequately distinguish the representations of the graph neural network.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a knowledge reasoning method based on a structure-distinguishable representation graph neural network, which helps to solve the problem that current knowledge reasoning methods ignore global structure information during knowledge graph completion, so that the representations of the graph neural network cannot be well distinguished.
The aim of the invention is realized by the following technical scheme:
The invention provides a knowledge reasoning method based on a structure-distinguishable representation graph neural network, comprising the following steps:
S1: preprocessing training data;
S2: establishing a neighborhood aggregation mechanism;
S3: constructing a graph attention network model through the neighborhood aggregation mechanism, and learning an embedded representation of the preprocessed training data by using the graph attention network model;
S4: setting a loss function, performing end-to-end training on the graph attention network model, and updating the network parameters of the model by using the loss function;
S5: inputting the preprocessed training data into the updated graph attention network model for training, and completing the knowledge graph by using the trained model.
Further, the step S1 specifically comprises: adopting the WN18RR and FB15K-237 datasets as training data and representing them as knowledge triples in the subject-relation-object form.
Further, the step S2 specifically includes the following substeps:
S201: improving the GAT encoder, using multi-layer perceptrons to model GAT and learn an injective function;
S202: adopting a multi-head attention mechanism to acquire the relevant neighborhood information of the target entity in the knowledge triples on different subspaces through multiple K-hop iterative computations.
Further, the step S3 specifically includes the following substeps:
S301: setting an encoder, calculating the attention values of the neighbors around the target entities in all knowledge triples through a designed attention-head module, and obtaining the embedded representations of all target entities by using the neighborhood aggregation mechanism;
S302: setting a decoder, and learning the embedded representations of the target entities in all knowledge triples by adopting 3×3 convolution filters to obtain the relative attention values of all knowledge triples and thereby obtain the graph attention network model.
Further, the step S4 specifically includes the following substeps:
S401: dividing the preprocessed training data into a training set, a test set and a validation set at a ratio of about 28:1:1, and setting a control group;
S402: performing end-to-end training on the graph attention network model with the training set, performing optimization training on the decoder of the graph attention network model by using a soft-margin loss, and updating the network parameters of the graph attention network model.
The invention has the beneficial effects that: the invention provides a knowledge reasoning method based on a structure-distinguishable representation graph neural network, which comprises the steps of preprocessing training data; then establishing a neighborhood aggregation mechanism; constructing a graph attention network model through the neighborhood aggregation mechanism, and learning an embedded representation of the preprocessed training data by using the graph attention network model; setting a loss function, performing end-to-end training on the graph attention network model, and updating the network parameters of the model by using the loss function; and inputting the preprocessed training data into the updated graph attention network model for training, and completing the knowledge graph by using the trained model. The invention unifies the data representation by preprocessing the training data. A neighborhood aggregation mechanism is then designed to enhance the distinguishability of the graph attention network model. Finally, a complete graph attention network model is constructed, an encoder-decoder structure is designed around the neighborhood aggregation mechanism, and the embedded representations learned by the graph attention network improve its distinguishing capability.
Drawings
FIG. 1 is a flow chart of method steps of the present invention;
FIG. 2 is an overall block diagram of the graph attention network model;
FIG. 3 is a schematic diagram of the polymerization process of the attention layer of the present invention.
Description of the embodiments
For a clearer understanding of technical features, objects, and effects of the present invention, a specific embodiment of the present invention will be described with reference to the accompanying drawings.
Referring to FIG. 1, FIG. 1 shows a flow chart of the steps of the knowledge reasoning method based on a structure-distinguishable representation graph neural network, which specifically includes the following steps:
S1: preprocessing training data;
S2: establishing a neighborhood aggregation mechanism;
S3: constructing a graph attention network model through the neighborhood aggregation mechanism, and learning an embedded representation of the preprocessed training data by using the graph attention network model;
S4: setting a loss function, performing end-to-end training on the graph attention network model, and updating the network parameters of the model by using the loss function;
S5: inputting the preprocessed training data into the updated graph attention network model for training, and completing the knowledge graph by using the trained model.
Further, in one embodiment, the step S1 specifically comprises: adopting the WN18RR and FB15K-237 datasets as training data, representing the training data as knowledge triples in the subject-relation-object form, and filtering the data. After irrelevant data are filtered out, the WN18RR dataset contains 40943 entities and 11 relations, and the FB15K-237 dataset contains 14541 entities and 237 relations.
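As an illustration of this preprocessing step, the following is a minimal Python sketch of reading such a dataset split into subject-relation-object triples; the tab-separated file layout and the helper names are assumptions for illustration, since the patent does not specify an on-disk format:

```python
from typing import Dict, List, Tuple

def load_triples(path: str) -> List[Tuple[str, str, str]]:
    """Parse one knowledge triple per line: <subject>\t<relation>\t<object>."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 3:          # drop malformed / irrelevant rows
                triples.append((parts[0], parts[1], parts[2]))
    return triples

def build_vocab(triples: List[Tuple[str, str, str]]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Assign contiguous integer ids to entities and relations."""
    entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
    relations = sorted({t[1] for t in triples})
    return ({e: i for i, e in enumerate(entities)},
            {r: i for i, r in enumerate(relations)})
```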
Further, in one embodiment, the step S2 specifically includes the following substeps:
S201: the GAT encoder is improved by using multi-layer perceptrons (MLPs) to model GAT and learn an injective function, which rests on the universal approximation theorem. The hidden representation h_s^{(k-1)} is weighted by a learnable parameter \phi, and the graph neural network then updates the node representation according to the following formula, where \phi^{(k)} is the learnable parameter at the k-th iteration, h_s^{(k-1)} is the hidden representation of entity s in the (k-1)-th iteration, a_s^{(k)} denotes the neighborhood information of entity s collected within K hops, k is the current iteration number, and s is the corresponding entity:

h_s^{(k)} = \mathrm{MLP}^{(k)}\left( \left(1 + \phi^{(k)}\right) \cdot h_s^{(k-1)} + a_s^{(k)} \right)
K-hop may also be read as the K-neighborhood. In the formula above, the K-hop operation finds the set of all vertices whose shortest path from a given vertex (entity s) is at most K hops (or K steps), i.e., the neighborhood information of entity s. K is a positive integer ≥ 1.
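For concreteness, the following is a minimal Python sketch of this K-hop operation, gathering every vertex reachable from entity s within K hops by breadth-first search; the adjacency-dictionary graph format is an assumption for illustration:

```python
from collections import deque

def k_hop_neighborhood(adj: dict, s, k: int) -> set:
    """Collect all vertices whose shortest path from s is at most k hops."""
    seen, frontier = {s}, deque([(s, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:                      # do not expand past k hops
            continue
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen - {s}                       # neighborhood information of s

# Toy example with K = 2
adj = {"s": ["a"], "a": ["b"], "b": ["c"]}
print(k_hop_neighborhood(adj, "s", 2))      # {'a', 'b'}
```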
S202: in order to stabilize the learning process, more neighborhood information is packaged, a multi-head attention mechanism is adopted, related domain information on different subspaces is obtained through multiple times of calculation, wherein a (K) s represents the neighborhood information collected in a K-hop, M is the number of heads, MLP is a multi-layer perceptron and phi (k) Is a learnable parameter at the kth iteration, corresponding to the GAT layer shown in fig. 2, h (k) s represents the final entity representation learned from our structurally distinguishable GAT model. The formula is as follows:
Figure SMS_2
The multi-head attention mechanism is an improvement on the single-head attention mechanism: the attention operations are divided into groups (heads), so that feature information can be extracted from multiple representation subspaces.
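The multi-head update above can be sketched as follows in PyTorch: each of the M heads applies its own MLP to (1 + \phi)·h_s + a_s, and the head outputs are concatenated. The two-layer MLP shape and layer sizes are illustrative assumptions, not the patent's exact architecture:

```python
import torch
import torch.nn as nn

class MultiHeadStructuralUpdate(nn.Module):
    """One layer in the style above: M heads of MLP((1 + phi) * h_s + a_s), concatenated."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.phis = nn.Parameter(torch.zeros(num_heads))      # learnable phi per head
        self.mlps = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_heads)
        )

    def forward(self, h_s: torch.Tensor, a_s: torch.Tensor) -> torch.Tensor:
        # h_s: entity representation from round k-1; a_s: aggregated K-hop info
        heads = [mlp((1.0 + phi) * h_s + a_s)
                 for mlp, phi in zip(self.mlps, self.phis)]
        return torch.cat(heads, dim=-1)                       # concat over M heads
```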
Further, in one embodiment, the step S3 specifically includes the following substeps:
S301: an encoder is set up, the attention values of the neighbors around the target entity are calculated by a designed attention-head module, and the embedded representation of the target entity is obtained using the neighborhood aggregation mechanism; the aggregation process is shown in FIG. 3. To obtain a hidden embedding of the target entity s, an embedded representation of each knowledge triple related to s must be learned. For simplicity, assume that the neighborhood of s is N(s) = \{(r_i, o_i) \mid (s, r_i, o_i) \in T\}. The neighbor information is learned by applying a linear transformation to the concatenation of r_i and o_i, as in the following formula, where W_1 is a projection matrix, the vectors r_i and o_i are the embedded representations of the i-th relation and entity, and c_i is the learned i-th neighbor information:

c_i = W_1 \left[ r_i \,\Vert\, o_i \right]
The encoder then learns an embedded representation of the entire knowledge triple to obtain the absolute attention value d_i of the i-th knowledge triple, as in the following formula (reconstructed in the usual graph-attention form, the original equation image being unavailable), where s is the embedded representation of the target entity, c_i is the i-th neighbor information learned above, and W_2 is a second projection matrix:

d_i = \mathrm{LeakyReLU}\left( W_2 \left[ s \,\Vert\, c_i \right] \right)
The relative attention value can then be derived. The following formula gives the relative attention value p_i of a single knowledge triple as a softmax over the absolute attention values, where d_i and d_j are the absolute attention values of the i-th and j-th knowledge triples and N(s) denotes the neighborhood of entity s:

p_i = \frac{\exp(d_i)}{\sum_{j \in N(s)} \exp(d_j)}
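Putting S301 together, a minimal PyTorch sketch of the encoder attention might look as follows: neighbor information c_i = W_1[r_i ∥ o_i], an absolute attention value d_i, and the softmax-normalized relative attention p_i. The LeakyReLU scoring layer W_2 is an assumption patterned on standard graph-attention encoders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeighborhoodAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.W1 = nn.Linear(2 * dim, dim, bias=False)   # projection of [r_i || o_i]
        self.W2 = nn.Linear(2 * dim, 1, bias=False)     # absolute-attention scorer (assumed)

    def forward(self, s: torch.Tensor, r: torch.Tensor, o: torch.Tensor) -> torch.Tensor:
        # s: (dim,) target entity; r, o: (n, dim) its n neighbor relations/entities
        c = self.W1(torch.cat([r, o], dim=-1))               # c_i, shape (n, dim)
        d = F.leaky_relu(self.W2(torch.cat([s.expand_as(c), c], dim=-1)))  # d_i, (n, 1)
        p = torch.softmax(d, dim=0)                          # relative attention p_i
        return (p * c).sum(dim=0)                            # aggregated embedding of s
```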
S302: a decoder is provided to learn the output of the knowledge triples using 3×3 convolution filters, in order to obtain an embedded representation of each knowledge triple. As mentioned in ConvKB, the purpose is to obtain an embedded representation of each knowledge triple while preserving its translational characteristics. The scoring function of the decoder of the invention is defined as follows, where g is the ReLU activation function; \Omega denotes the parameter-sharing convolution filters, which are independent of s, r and o; * denotes the convolution operator; concat denotes the concatenation operator; W denotes a linear transformation matrix; and s_i, r_i and o_i are the embedded representations of the i-th entity, relation and entity:

\phi(s_i, r_i, o_i) = \mathrm{concat}\left( g\left( \left[ s_i, r_i, o_i \right] * \Omega \right) \right) \cdot W
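The scoring function above follows ConvKB's form, with 3×3 filters as stated in the patent. A minimal PyTorch sketch, with the filter count chosen arbitrarily for illustration:

```python
import torch
import torch.nn as nn

class ConvDecoder(nn.Module):
    """Score phi(s, r, o) = concat(g([s, r, o] * Omega)) . W with 3x3 filters."""
    def __init__(self, dim: int, num_filters: int = 8):
        super().__init__()
        assert dim >= 3, "embedding size must allow a 3x3 kernel"
        self.conv = nn.Conv2d(1, num_filters, kernel_size=3)     # Omega, shared filters
        self.W = nn.Linear(num_filters * (dim - 2), 1, bias=False)

    def forward(self, s: torch.Tensor, r: torch.Tensor, o: torch.Tensor) -> torch.Tensor:
        x = torch.stack([s, r, o], dim=-1).view(1, 1, -1, 3)     # (1, 1, dim, 3) matrix
        feats = torch.relu(self.conv(x))                         # g([s, r, o] * Omega)
        return self.W(feats.flatten())                           # concat, then W
```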
further, the step S4 specifically includes the following substeps:
S401: the preprocessed training data is divided into a training set, a test set and a validation set at a ratio of about 28:1:1, and a control group is set. The training set is used for end-to-end training of the model and for updating its parameters; the validation set is used for a preliminary evaluation of the model's capability and for tuning the hyperparameters accordingly; the test set is used for evaluating the generalization capability of the final model and for computing its performance metrics;
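A minimal sketch of this ~28:1:1 split; the shuffle seed is an illustrative assumption:

```python
import random

def split_28_1_1(triples, seed: int = 0):
    """Shuffle, then carve off ~1/30 each for validation and test."""
    data = list(triples)
    random.Random(seed).shuffle(data)
    n_val = n_test = max(1, len(data) // 30)
    return (data[n_val + n_test:],            # training set (~28/30)
            data[:n_val],                     # validation set (~1/30)
            data[n_val:n_val + n_test])       # test set (~1/30)
```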
S402: a loss function is set. We use the Adam optimizer to train the decoder, minimizing the loss function with L2 regularization of the model's weight matrix W. We train the decoder using a soft-margin loss, as in the following formulas, where \phi(s_i, r_i, o_i) denotes the score of the decoder, T' is the set of false knowledge triples generated by corrupting the true knowledge triples in T, l_i is the label of the i-th triple, \lambda is the regularization coefficient, and L_{decoder} is the final loss function:

L_{decoder} = \sum_{(s_i, r_i, o_i) \in T \cup T'} \log\left( 1 + \exp\left( l_i \cdot \phi(s_i, r_i, o_i) \right) \right) + \frac{\lambda}{2} \left\Vert W \right\Vert_2^2

l_i = \begin{cases} 1, & (s_i, r_i, o_i) \in T \\ -1, & (s_i, r_i, o_i) \in T' \end{cases}
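A minimal sketch of this soft-margin objective, following the ConvKB convention reconstructed above in which true triples carry label +1 and corrupted triples -1 (so true triples are driven toward low scores); the \lambda default is illustrative:

```python
import torch
import torch.nn.functional as F

def soft_margin_loss(scores: torch.Tensor, labels: torch.Tensor,
                     W: torch.Tensor, lam: float = 0.001) -> torch.Tensor:
    """scores: decoder outputs phi(s_i, r_i, o_i); labels: +1 for T, -1 for T'."""
    data_term = F.softplus(labels * scores).mean()   # log(1 + exp(l_i * phi_i)), numerically stable
    reg_term = 0.5 * lam * W.pow(2).sum()            # L2 regularization of W
    return data_term + reg_term
```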
For the different datasets we use the following settings: for the FB15K-237 dataset, dropout is set to 0.3, the learning rate to 0.001, the embedding size to 100, and the number of MLP layers to 3; for the WN18RR dataset, dropout is set to 0.0, the learning rate to 0.0005, the embedding size to 50, and the number of MLP layers to 3.
Specifically, in step S5, the training data unified as knowledge triples is input into the network, and the completed knowledge graph is output. In the training and test data, inverse relations affect the performance of the model, and experiments show that, by using the structure-distinguishable neighborhood aggregation mechanism, the model performs better on the completion of various large-scale general knowledge graphs containing inverse relations.
Further, experiments were performed on the WN18RR and FB15K-237 datasets, and the results of the knowledge reasoning method based on the structure-distinguishable representation graph neural network provided by the invention were compared with those of the classical model ConvE, whose results are taken from reference [1]: Dettmers T, Minervini P, Stenetorp P, et al. Convolutional 2D Knowledge Graph Embeddings. 2017. The evaluation metric is the proportion of correct entities ranked in the first N positions (Hits@1, Hits@3, Hits@10). SD-GAT obtained better scores than all baselines, showing the expressive power of our model. More specifically, on FB15K-237, SD-GAT achieves an improvement of 0.17 on Hits@1, 0.13 on Hits@3 and 0.06 on Hits@10. On WN18RR, SD-GAT achieves an improvement of 0.02 on Hits@3.
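The Hits@N metric used in this comparison is straightforward to compute; a minimal sketch:

```python
from typing import List

def hits_at_n(ranks: List[int], n: int) -> float:
    """Proportion of test triples whose correct entity is ranked in the first n."""
    return sum(1 for rank in ranks if rank <= n) / len(ranks)

print(hits_at_n([1, 4, 2, 11], 3))   # 0.5, i.e. Hits@3 = 0.5
```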
The foregoing has shown and described the basic principles, main features and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above; the embodiments and descriptions above merely illustrate the principles of the invention, and various changes and modifications may be made without departing from its spirit and scope. The scope of the invention is defined by the appended claims and their equivalents.

Claims (5)

1. A knowledge reasoning method based on a structure-distinguishable representation graph neural network, comprising:
S1: preprocessing training data;
S2: establishing a neighborhood aggregation mechanism;
S3: constructing a graph attention network model through the neighborhood aggregation mechanism, and learning an embedded representation of the preprocessed training data by using the graph attention network model;
S4: setting a loss function, performing end-to-end training on the graph attention network model, and updating the network parameters of the model by using the loss function;
S5: inputting the preprocessed training data into the updated graph attention network model for training, and completing the knowledge graph by using the trained model.
2. The knowledge reasoning method based on the structure-distinguishable representation graph neural network according to claim 1, wherein the step S1 specifically comprises: adopting the WN18RR and FB15K-237 datasets as training data and representing them as knowledge triples in the subject-relation-object form.
3. The knowledge reasoning method based on the structure-distinguishable representation graph neural network according to claim 1, wherein the step S2 specifically comprises the following sub-steps:
S201: improving the GAT encoder, using multi-layer perceptrons to model GAT and learn an injective function;
S202: adopting a multi-head attention mechanism to acquire the relevant neighborhood information of the target entity in the knowledge triples on different subspaces through multiple K-hop iterative computations.
4. The knowledge reasoning method based on the structure-distinguishable representation graph neural network according to claim 1, wherein the step S3 specifically comprises the following substeps:
S301: setting an encoder, calculating the attention values of the neighbors around the target entities in all knowledge triples through a designed attention-head module, and obtaining the embedded representations of all target entities by using the neighborhood aggregation mechanism;
S302: setting a decoder, and learning the embedded representations of the target entities in all knowledge triples by adopting 3×3 convolution filters to obtain the relative attention values of all knowledge triples and thereby obtain the graph attention network model.
5. The knowledge reasoning method based on the structure-distinguishable representation graph neural network according to claim 1, wherein the step S4 specifically comprises the following sub-steps:
S401: dividing the preprocessed training data into a training set, a test set and a validation set at a ratio of 28:1:1, and setting a control group;
S402: performing end-to-end training on the graph attention network model with the training set, performing optimization training on the decoder of the graph attention network model by using a soft-margin loss, and updating the network parameters of the graph attention network model.
CN202310089311.9A 2023-02-09 2023-02-09 Knowledge reasoning method based on structure distinguishable representation graph neural network Pending CN116306779A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310089311.9A CN116306779A (en) 2023-02-09 2023-02-09 Knowledge reasoning method based on structure distinguishable representation graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310089311.9A CN116306779A (en) 2023-02-09 2023-02-09 Knowledge reasoning method based on structure distinguishable representation graph neural network

Publications (1)

Publication Number Publication Date
CN116306779A (en) 2023-06-23

Family

ID=86795086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310089311.9A Pending CN116306779A (en) 2023-02-09 2023-02-09 Knowledge reasoning method based on structure distinguishable representation graph neural network

Country Status (1)

Country Link
CN (1) CN116306779A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220051083A1 (en) * 2020-08-11 2022-02-17 Nec Laboratories America, Inc. Learning word representations via commonsense reasoning
CN114708479A (en) * 2022-03-31 2022-07-05 杭州电子科技大学 Self-adaptive defense method based on graph structure and characteristics
CN115578211A (en) * 2022-10-25 2023-01-06 南京工业职业技术大学 Directed symbol network representation learning method and process with link prediction and node sequencing functions

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220051083A1 (en) * 2020-08-11 2022-02-17 Nec Laboratories America, Inc. Learning word representations via commonsense reasoning
CN114708479A (en) * 2022-03-31 2022-07-05 杭州电子科技大学 Self-adaptive defense method based on graph structure and characteristics
CN115578211A (en) * 2022-10-25 2023-01-06 南京工业职业技术大学 Directed symbol network representation learning method and process with link prediction and node sequencing functions

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XUE ZHOU et al., "A structure distinguishable graph attention network for knowledge base completion", Neural Computing and Applications, vol. 33, pages 16005-16017, XP037608476, DOI: 10.1007/s00521-021-06221-1 *
SUN Zhengyu et al., "Medical knowledge graph construction method based on big data technology", Software, vol. 41, no. 1, pages 13-17 *
LIN Youfang et al., "Traffic Big Data", Beijing Jiaotong University Press, page 18 *

Similar Documents

Publication Publication Date Title
Cai et al. Path-level network transformation for efficient architecture search
Morris et al. Weisfeiler and leman go machine learning: The story so far
Chamberland et al. Deep neural decoders for near term fault-tolerant experiments
CN112417219B (en) Hyper-graph convolution-based hyper-edge link prediction method
Li et al. Deep learning methods for molecular representation and property prediction
CN113360673B (en) Entity alignment method, device and storage medium of multi-mode knowledge graph
CN113299354B (en) Small molecule representation learning method based on transducer and enhanced interactive MPNN neural network
CN112633478A (en) Construction of graph convolution network learning model based on ontology semantics
CN115346372A (en) Multi-component fusion traffic flow prediction method based on graph neural network
Kolajoobi et al. Investigating the capability of data-driven proxy models as solution for reservoir geological uncertainty quantification
Hong et al. Variational gridded graph convolution network for node classification
CN117524353B (en) Molecular large model based on multidimensional molecular information, construction method and application
Juan et al. INS-GNN: Improving graph imbalance learning with self-supervision
CN117408336A (en) Entity alignment method for structure and attribute attention mechanism
Lu et al. Soft-orthogonal constrained dual-stream encoder with self-supervised clustering network for brain functional connectivity data
CN117111464A (en) Self-adaptive fault diagnosis method under multiple working conditions
CN116306779A (en) Knowledge reasoning method based on structure distinguishable representation graph neural network
CN115423076A (en) Directed hypergraph chain prediction method based on two-step framework
CN114693873A (en) Point cloud completion method based on dynamic graph convolution and attention mechanism
Han et al. SurfNet: Learning surface representations via graph convolutional network
Alshara Multilayer Graph-Based Deep Learning Approach for Stock Price Prediction
CN113360732A (en) Big data multi-view graph clustering method
Ma et al. A multi-scale disperse dynamic routing capsule network knowledge graph embedding model based on relational memory
CN115631786B (en) Virtual screening method, device and execution equipment
Fei et al. A GNN Architecture with Local and Global-Attention Feature for Image Classification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination