CN113761221B - Knowledge graph entity alignment method based on graph neural network - Google Patents
Knowledge graph entity alignment method based on graph neural network
- Publication number
- CN113761221B
- Authority
- CN
- China
- Prior art keywords
- entity
- aligned
- graph
- neural network
- knowledge
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
To address the information loss caused by existing knowledge graph entity alignment methods, the invention discloses a knowledge graph entity alignment method based on a graph neural network. The method comprises: data preprocessing, in which the two knowledge graphs to be aligned and the existing alignment seeds are preprocessed and the result is used as the input of the next step; constructing a graph neural network model, in which the preprocessing result is input into a graph convolutional neural network and the two knowledge graphs to be aligned are modeled jointly by the graph neural network to obtain vectorized representations of the entities in the knowledge graphs; and, based on a greedy algorithm, searching the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity and taking it as the aligned entity. By constructing a unified graph neural network that jointly models the two graphs to be aligned, the invention makes more effective use of the combined information of the two graphs, obtains more accurate entity vector representations, and improves the accuracy of entity alignment.
Description
Technical Field
The invention relates to the field of knowledge graph-based entity alignment, in particular to a knowledge graph entity alignment method based on a graph neural network.
Background
In recent years, knowledge graph technology has developed rapidly, and research based on knowledge graphs has proliferated. Researchers have constructed a large number of general-purpose and domain-specific knowledge graphs, whose contents both overlap and complement one another. How to fuse such multi-source heterogeneous knowledge graphs into a single, more complete knowledge graph that better supports downstream applications is an urgent problem. Entity alignment is an important technique for knowledge graph fusion.
At present, knowledge graph entity alignment methods mainly use two graph neural networks to model the two graphs to be aligned independently, obtain vector representations of the entities and relations in each graph separately, and then search the vector space, for example with a greedy algorithm, taking the pair of entities with the closest vector representations as aligned entities. Because the two graphs are modeled separately while the entity alignment task needs to make full use of the combined information of both graphs, these methods cause a certain degree of information loss.
Disclosure of Invention
To address the information loss caused by existing knowledge graph entity alignment methods, the invention discloses a knowledge graph entity alignment method based on a graph neural network.
The method comprises the following specific steps:
S1, data preprocessing: preprocessing the two knowledge graphs to be aligned and the existing alignment seeds, and using the result as the input of step S2;
S2, constructing a graph neural network model: inputting the preprocessing result of step S1 into a graph convolutional neural network, and jointly modeling the two knowledge graphs to be aligned with the graph neural network to obtain vectorized representations of the entities in the knowledge graphs;
S3, based on a greedy algorithm, searching the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity, and taking that entity as the aligned entity.
In step S1, during data preprocessing, all triples contained in the two knowledge graphs and the aligned entity seed pairs are processed and randomly initialized using an embedding layer of the Keras artificial neural network library, yielding vector representations of the entities and relations in the triples. To ensure that the subsequently constructed graph neural network models the two knowledge graphs jointly, the two graphs are linked through the alignment seeds so that the initial data are fully exploited. The pre-aligned seeds are treated as alignment triples and used to construct an adjacency matrix. To construct the cross-graph relation triples, all existing triples are traversed; for each triple containing an entity that appears in an alignment seed, that entity is replaced with its pre-aligned counterpart to generate a new triple. This produces the cross-graph adjacency matrix and relation triples, and thus a preprocessing result that combines the information of the two knowledge graphs to be aligned.
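The adjacency-matrix construction described for step S1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `build_joint_adjacency`, the `zh:`/`en:` entity prefixes, the undirected treatment of edges, and the unit edge weights are all assumptions made for the example.

```python
import numpy as np

def build_joint_adjacency(triples_1, triples_2, seeds):
    """Build one adjacency matrix covering both graphs, linked by seed pairs."""
    # Collect the entities of both graphs in a single shared index space.
    entities = sorted(
        {e for h, _, t in triples_1 + triples_2 for e in (h, t)}
        | {e for pair in seeds for e in pair}
    )
    index = {e: i for i, e in enumerate(entities)}
    adj = np.zeros((len(entities), len(entities)), dtype=np.float32)
    # Intra-graph edges from the relation triples (treated as undirected here).
    for h, _, t in triples_1 + triples_2:
        adj[index[h], index[t]] = adj[index[t], index[h]] = 1.0
    # Cross-graph edges: each pre-aligned seed pair is treated as an edge.
    for a, b in seeds:
        adj[index[a], index[b]] = adj[index[b], index[a]] = 1.0
    return adj, index

# Hypothetical toy data; the prefixes only keep the two graphs' entities distinct.
triples_zh = [("zh:China", "capital", "zh:Beijing")]
triples_en = [("en:China", "belongsto", "en:Asia")]
seeds = [("zh:China", "en:China")]
adj, index = build_joint_adjacency(triples_zh, triples_en, seeds)
```

With the seed edge in place, the two graphs form a single connected graph, which is what lets one network model both of them at once.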
In step S2, the preprocessing result obtained in step S1 is input into a graph convolutional neural network, and the two knowledge graphs to be aligned are modeled jointly to obtain vectorized representations of the entities and relations in the graphs. The vector representations of the entities and relations in the graph convolutional neural network are adjusted iteratively, with the convergence target that the similarity between entity vector representations be consistent with the semantic similarity between the entities. The specific iteration process is as follows:
S21, aggregating information: after the graph vector representations are initialized, an aggregation operation is performed on each node in the graph, aggregating the information of all connected neighbor nodes to update the vector representation of the central node. The aggregation formula is as follows:
[formula not preserved in the source text]
where the terms of the formula are the vector representation of the i-th entity $e_i$ at the l-th iteration, the vector representation of the k-th entity $e_k$ at the l-th iteration, and the set of all neighbor nodes of $e_i$.
S22, computing the gradients of the loss function with respect to all variables of the graph neural network model by backpropagation according to the chain rule, and updating the model parameters by gradient descent. The loss function L is expressed as follows:
[formula not preserved in the source text]
where P denotes the set of pre-aligned seed entity pairs and P' denotes the set of other entity pairs obtained by random negative sampling; the remaining terms are the vector representations, after encoding by the graph neural network, of the pre-aligned seed entities $e_i$ and $e_j$ and of the randomly negative-sampled entities $e'_i$ and $e'_j$; and λ denotes the threshold. The loss function is differentiated with respect to all variables in the graph neural network, and the vector representations of the knowledge graph are updated accordingly.
S23, repeating steps S21 and S22 until training is complete and the graph vector representations no longer change.
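The aggregation of step S21 can be illustrated with a simple mean aggregator. Since the patent's exact formula is not available in this text, the sketch below is only one plausible form: the self-loops, the row normalization, and the absence of trainable weights or an activation function are assumptions, and the name `aggregate` is hypothetical.

```python
import numpy as np

def aggregate(h, adj):
    """One mean-aggregation step: each node averages itself and its neighbors."""
    # Add self-loops so a node keeps part of its own representation.
    a_hat = adj + np.eye(adj.shape[0], dtype=adj.dtype)
    # Row-normalize so every row sums to 1 (mean over the neighborhood).
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    return a_hat @ h

# Toy graph: nodes 0 and 1 are connected, node 2 is isolated.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])
h = np.array([[1.0, 0.0],
              [3.0, 2.0],
              [5.0, 5.0]])
h_next = aggregate(h, adj)
```

When `adj` is the joint adjacency matrix of both knowledge graphs, a node's neighborhood includes entities reached through seed links, so information flows across the two graphs.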
In step S3, a greedy algorithm is used to search the entire vector space; pairs of entities from different knowledge graphs whose node vector representations are separated by a Euclidean distance smaller than a threshold are taken as aligned entities.
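A minimal sketch of the greedy search in step S3, assuming the entity embeddings of the two graphs are available as NumPy arrays. The one-to-one matching constraint and the closest-first ordering are assumptions of this example (the text only requires the distance to be below a threshold), and `greedy_align` is a hypothetical name.

```python
import numpy as np

def greedy_align(emb_1, emb_2, threshold):
    """Greedily pair entities across graphs by ascending Euclidean distance."""
    # Pairwise Euclidean distances, shape (n1, n2).
    dist = np.linalg.norm(emb_1[:, None, :] - emb_2[None, :, :], axis=-1)
    # Candidate pairs under the threshold, closest first.
    candidates = sorted(
        (dist[i, j], i, j)
        for i in range(dist.shape[0])
        for j in range(dist.shape[1])
        if dist[i, j] < threshold
    )
    used_1, used_2, pairs = set(), set(), []
    for _, i, j in candidates:
        # Accept a pair only if neither entity is already matched.
        if i not in used_1 and j not in used_2:
            pairs.append((i, j))
            used_1.add(i)
            used_2.add(j)
    return pairs

emb_1 = np.array([[0.0, 0.0], [1.0, 1.0]])
emb_2 = np.array([[1.1, 1.0], [0.1, 0.0], [5.0, 5.0]])
pairs = greedy_align(emb_1, emb_2, threshold=0.5)
```

Entities with no counterpart within the threshold, such as the third row of `emb_2` here, are left unaligned.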
The invention has the beneficial effects that:
1. The method improves on existing graph neural network entity alignment methods by replacing separate modeling and optimization of the two graphs with modeling of the two graphs as a single large graph, so that the information linking the two graphs to be aligned is captured better and entity alignment accuracy is improved. The joint modeling approach is simple to implement and extensible, and can be embedded as a module into existing entity alignment models to improve their accuracy.
2. The invention offers a new idea for other knowledge graph applications: the training set need not serve only as the supervision signal during gradient descent, but can also be fed into the model as part of the input in the form of enhanced features, making fuller use of the training data and improving the final performance of the model.
Drawings
FIG. 1 is a flow chart of the steps of the method.
Detailed Description
For a better understanding of the present disclosure, an example is given here.
The invention discloses a knowledge graph entity alignment method based on a graph neural network, which comprises the following steps:
S1, data preprocessing: preprocessing the two knowledge graphs to be aligned and the existing alignment seeds, and using the result as the input of step S2;
S2, constructing a graph neural network model: inputting the preprocessing result of step S1 into a graph convolutional neural network, and jointly modeling the two knowledge graphs to be aligned with the graph neural network to obtain vectorized representations of the entities in the knowledge graphs;
S3, based on a greedy algorithm, searching the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity, and taking that entity as the aligned entity.
In step S1, during data preprocessing, all triples contained in the two knowledge graphs and the aligned entity seed pairs are processed and randomly initialized using an embedding layer of the Keras artificial neural network library, yielding vector representations of the entities and relations in the triples. To ensure that the subsequently constructed graph neural network models the two knowledge graphs jointly, the two graphs are linked through the alignment seeds so that the initial data are fully exploited. The pre-aligned seeds are treated as alignment triples and used to construct an adjacency matrix. To construct the cross-graph relation triples, all existing triples are traversed; for each triple containing an entity that appears in an alignment seed, that entity is replaced with its pre-aligned counterpart to generate a new triple. This produces the cross-graph adjacency matrix and relation triples, and thus a preprocessing result that combines the information of the two knowledge graphs to be aligned. For example, suppose two knowledge graphs, one Chinese and one English, are to be aligned. The Chinese graph contains the triple china -> capital -> Beijing (lowercase "china" denoting the Chinese-graph entity), the English graph contains the triple China -> belongsto -> Asia, and the pair (china, China) is a pre-aligned entity seed. Following the joint-modeling principle described above, two new triples can be derived from these two triples: china -> belongsto -> Asia and China -> capital -> Beijing. The expanded triples are added to the training set as augmented data, which can effectively improve the final entity alignment accuracy of the model.
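The triple expansion in this example can be sketched as below. The function name `expand_triples` and the `zh:`/`en:` prefixes (used only to tell the two graphs' entities apart) are assumptions of the sketch, as is the choice to replace one entity per new triple.

```python
def expand_triples(triples, seeds):
    """Generate cross-graph triples by swapping seed entities for their counterparts."""
    # Map each seed entity to its aligned counterpart, in both directions.
    counterpart = {}
    for a, b in seeds:
        counterpart[a] = b
        counterpart[b] = a
    new_triples = set()
    for head, rel, tail in triples:
        # Replace the head with its aligned entity, if it is a seed entity.
        if head in counterpart:
            new_triples.add((counterpart[head], rel, tail))
        # Likewise for the tail.
        if tail in counterpart:
            new_triples.add((head, rel, counterpart[tail]))
    # Keep only triples that are genuinely new.
    return sorted(new_triples - set(triples))

triples = [
    ("zh:China", "capital", "zh:Beijing"),   # from the Chinese graph
    ("en:China", "belongsto", "en:Asia"),    # from the English graph
]
seeds = [("zh:China", "en:China")]
expanded = expand_triples(triples, seeds)
```

For the toy input, `expanded` holds the two cross-graph triples of the example: the English China with the Chinese graph's capital relation, and the Chinese china with the English graph's belongsto relation.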
In step S2, the preprocessing result obtained in step S1 is input into a graph convolutional neural network, and the two knowledge graphs to be aligned are modeled jointly to obtain vectorized representations of the entities and relations in the graphs. The vector representations of the entities and relations in the graph convolutional neural network are adjusted iteratively, with the convergence target that the similarity between entity vector representations be consistent with the semantic similarity between the entities. The specific iteration process is as follows:
S21, aggregating information: after the graph vector representations are initialized, an aggregation operation is performed on each node in the graph, aggregating the information of neighbor nodes to update the vector representation of the central node. Unlike existing graph neural networks, which aggregate only the neighbors within the graph that contains the node, the aggregation here also covers nodes linked through the cross-graph adjacency matrix and cross-graph relation triples constructed in step S1, so the information of all associated nodes is aggregated. The aggregation formula is as follows:
[formula not preserved in the source text]
where the terms of the formula are the vector representation of the i-th entity $e_i$ at the l-th iteration, the vector representation of the k-th entity $e_k$ at the l-th iteration, and the set of all neighbor nodes of $e_i$.
S22, computing the gradients of the loss function with respect to all variables of the graph neural network model by backpropagation according to the chain rule, and updating the model parameters by gradient descent. The loss function L is expressed as follows:
[formula not preserved in the source text]
where P denotes the set of pre-aligned seed entity pairs and P' denotes the set of other entity pairs obtained by random negative sampling; the remaining terms are the vector representations, after encoding by the graph neural network, of the pre-aligned seed entities $e_i$ and $e_j$ and of the randomly negative-sampled entities $e'_i$ and $e'_j$; and λ denotes the threshold. The loss function is differentiated with respect to all variables in the graph neural network, and the vector representations of the knowledge graph are updated accordingly.
S23, repeating steps S21 and S22 until training is complete and the graph vector representations no longer change.
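The loss of step S22 is described in terms of seed pairs P, negative pairs P', and a threshold λ. A common loss with exactly these ingredients is the margin-based ranking loss, $L = \sum_{(e_i, e_j) \in P} \sum_{(e'_i, e'_j) \in P'} \max(0,\ d(e_i, e_j) - d(e'_i, e'_j) + \lambda)$, where d is a distance between encoded vectors. The sketch below assumes that form, with Euclidean distance and λ acting as a margin; whether it matches the patent's expression term for term cannot be confirmed from this text, and `margin_loss` is a hypothetical name.

```python
import numpy as np

def margin_loss(h, pos_pairs, neg_pairs, margin):
    """Margin-based ranking loss over seed pairs and sampled negative pairs."""
    def dist(pairs):
        # Euclidean distance between the two embeddings of each pair.
        idx = np.asarray(pairs)
        return np.linalg.norm(h[idx[:, 0]] - h[idx[:, 1]], axis=1)
    d_pos = dist(pos_pairs)  # distances of seed pairs, to be made small
    d_neg = dist(neg_pairs)  # distances of negative pairs, to stay larger
    # Hinge on every (positive, negative) combination.
    return np.maximum(0.0, d_pos[:, None] - d_neg[None, :] + margin).sum()

# Toy embeddings: entities 0 and 1 form a seed pair, (0, 2) is a negative pair.
h = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [3.0, 4.0]])
loss_small_margin = margin_loss(h, [(0, 1)], [(0, 2)], margin=3.0)
loss_large_margin = margin_loss(h, [(0, 1)], [(0, 2)], margin=5.0)
```

The loss is zero once every negative pair is at least `margin` farther apart than every seed pair, which is the state the training iterations push toward.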
In step S3, a greedy algorithm is used to search the entire vector space; pairs of entities from different knowledge graphs whose node vector representations are separated by a Euclidean distance smaller than a threshold are computed and taken as aligned entities.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
Claims (2)
1. A knowledge graph entity alignment method based on a graph neural network, characterized by comprising the following specific steps:
S1, data preprocessing: preprocessing the two knowledge graphs to be aligned and the existing alignment seeds, and using the result as the input of step S2;
S2, constructing a graph neural network model: inputting the preprocessing result of step S1 into a graph convolutional neural network, and jointly modeling the two knowledge graphs to be aligned with the graph neural network to obtain vectorized representations of the entities in the knowledge graphs;
S3, based on a greedy algorithm, searching the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity, and taking that entity as the aligned entity;
in step S1, during data preprocessing, all triples contained in the two knowledge graphs and the aligned entity seed pairs are processed and randomly initialized using an embedding layer of the Keras artificial neural network library, yielding vector representations of the entities and relations in the triples; to ensure that the subsequently constructed graph neural network models the two knowledge graphs jointly, the two graphs are linked through the alignment seeds so that the initial data are fully exploited; the pre-aligned seeds are treated as alignment triples and used to construct an adjacency matrix; to construct the cross-graph relation triples, all existing triples are traversed, and for each triple containing an entity that appears in an alignment seed, that entity is replaced with its pre-aligned counterpart to generate a new triple, thereby constructing the cross-graph adjacency matrix and relation triples and obtaining a preprocessing result that combines the information of the two knowledge graphs to be aligned;
in step S2, the preprocessing result obtained in step S1 is input into a graph convolutional neural network, and the two knowledge graphs to be aligned are modeled jointly to obtain vectorized representations of the entities and relations in the graphs; the vector representations of the entities and relations in the graph convolutional neural network are adjusted iteratively, with the convergence target that the similarity between entity vector representations be consistent with the semantic similarity between the entities, and the specific iteration process is as follows:
S21, aggregating information: after the graph vector representations are initialized, an aggregation operation is performed on each node in the graph, aggregating the information of all connected neighbor nodes to update the vector representation of the central node; the aggregation formula is as follows:
[formula not preserved in the source text]
where the terms of the formula are the vector representation of the i-th entity $e_i$ at the l-th iteration, the vector representation of the k-th entity $e_k$ at the l-th iteration, and the set of all neighbor nodes of $e_i$;
S22, computing the gradients of the loss function with respect to all variables of the graph neural network model by backpropagation according to the chain rule, and updating the model parameters by gradient descent, wherein the loss function L is expressed as follows:
[formula not preserved in the source text]
where P denotes the set of pre-aligned seed entity pairs and P' denotes the set of other entity pairs obtained by random negative sampling; the remaining terms are the vector representations, after encoding by the graph neural network, of the pre-aligned seed entities $e_i$ and $e_j$ and of the randomly negative-sampled entities $e'_i$ and $e'_j$; and λ denotes the threshold; the loss function is differentiated with respect to all variables in the graph neural network, and the vector representations of the knowledge graph are updated accordingly;
S23, repeating steps S21 and S22 until training is complete and the graph vector representations no longer change.
2. The knowledge graph entity alignment method based on a graph neural network as claimed in claim 1, wherein in step S3 a greedy algorithm is used to search the entire vector space, and pairs of entities from different knowledge graphs whose node vector representations are separated by a Euclidean distance smaller than a threshold are taken as the aligned entities.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110734416.6A CN113761221B (en) | 2021-06-30 | 2021-06-30 | Knowledge graph entity alignment method based on graph neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110734416.6A CN113761221B (en) | 2021-06-30 | 2021-06-30 | Knowledge graph entity alignment method based on graph neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113761221A CN113761221A (en) | 2021-12-07 |
CN113761221B CN113761221B (en) | 2022-02-15 |
Family
ID=78787538
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110734416.6A Active CN113761221B (en) | 2021-06-30 | 2021-06-30 | Knowledge graph entity alignment method based on graph neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113761221B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114880484B (en) * | 2022-05-11 | 2023-06-16 | 军事科学院系统工程研究院网络信息研究所 | Satellite communication frequency track resource map construction method based on vector mapping |
CN116150405B (en) * | 2023-04-19 | 2023-06-27 | 中电科大数据研究院有限公司 | Heterogeneous data processing method for multiple scenes |
CN116227592B (en) * | 2023-05-06 | 2023-07-18 | 城云科技(中国)有限公司 | Multisource knowledge graph alignment model, construction method, device and application thereof |
CN117172321A (en) * | 2023-11-02 | 2023-12-05 | 中国科学院空天信息创新研究院 | Geographic entity alignment method and device for introducing graphic neural network and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109344285A (en) * | 2018-09-11 | 2019-02-15 | 武汉魅瞳科技有限公司 | A kind of video map construction and method for digging, equipment towards monitoring |
CN109389151A (en) * | 2018-08-30 | 2019-02-26 | 华南师范大学 | A kind of knowledge mapping treating method and apparatus indicating model based on semi-supervised insertion |
CN110955780A (en) * | 2019-10-12 | 2020-04-03 | 中国人民解放军国防科技大学 | Entity alignment method for knowledge graph |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255002B (en) * | 2018-09-11 | 2021-08-27 | 浙江大学 | Method for solving knowledge graph alignment task by utilizing relationship path mining |
CN110472065B (en) * | 2019-07-25 | 2022-03-25 | 电子科技大学 | Cross-language knowledge graph entity alignment method based on GCN twin network |
CN110941722B (en) * | 2019-10-12 | 2022-07-01 | 中国人民解放军国防科技大学 | Knowledge graph fusion method based on entity alignment |
CN111046186A (en) * | 2019-10-30 | 2020-04-21 | 平安科技(深圳)有限公司 | Entity alignment method, device and equipment of knowledge graph and storage medium |
CN111191045B (en) * | 2019-12-30 | 2023-06-16 | 创新奇智(上海)科技有限公司 | Entity alignment method and system applied to knowledge graph |
CN112417159B (en) * | 2020-11-02 | 2022-04-15 | 武汉大学 | Cross-language entity alignment method of context alignment enhanced graph attention network |
CN112445876B (en) * | 2020-11-25 | 2023-12-26 | 中国科学院自动化研究所 | Entity alignment method and system for fusing structure, attribute and relationship information |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109389151A (en) * | 2018-08-30 | 2019-02-26 | 华南师范大学 | A kind of knowledge mapping treating method and apparatus indicating model based on semi-supervised insertion |
CN109344285A (en) * | 2018-09-11 | 2019-02-15 | 武汉魅瞳科技有限公司 | A kind of video map construction and method for digging, equipment towards monitoring |
CN110955780A (en) * | 2019-10-12 | 2020-04-03 | 中国人民解放军国防科技大学 | Entity alignment method for knowledge graph |
Also Published As
Publication number | Publication date |
---|---|
CN113761221A (en) | 2021-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113761221B (en) | Knowledge graph entity alignment method based on graph neural network | |
CN111898364B (en) | Neural network relation extraction method, computer equipment and readable storage medium | |
CN109614614A (en) | A kind of BILSTM-CRF name of product recognition methods based on from attention | |
CN111275172B (en) | Feedforward neural network structure searching method based on search space optimization | |
US20210018332A1 (en) | Poi name matching method, apparatus, device and storage medium | |
CN110347881A (en) | A kind of group's discovery method for recalling figure insertion based on path | |
CN108399268B (en) | Incremental heterogeneous graph clustering method based on game theory | |
CN113111657B (en) | Cross-language knowledge graph alignment and fusion method, device and storage medium | |
CN114218389A (en) | Long text classification method in chemical preparation field based on graph neural network | |
CN109857457A (en) | A kind of function level insertion representation method learnt in source code in the hyperbolic space | |
CN115481682A (en) | Graph classification training method based on supervised contrast learning and structure inference | |
CN114332519A (en) | Image description generation method based on external triple and abstract relation | |
Wang et al. | Fast gunrock subgraph matching (gsm) on gpus | |
Sun et al. | Graph force learning | |
CN112035689A (en) | Zero sample image hash retrieval method based on vision-to-semantic network | |
Bi et al. | MM-GNN: Mix-moment graph neural network towards modeling neighborhood feature distribution | |
CN113177107B (en) | Intelligent contract similarity detection method based on syntax tree matching | |
CN114528971A (en) | Atlas frequent relation mode mining method based on heterogeneous atlas neural network | |
CN111159424B (en) | Method and device for labeling knowledge graph entity, storage medium and electronic equipment | |
CN113220820A (en) | Efficient SPARQL query response method, device and equipment based on graph | |
CN117010373A (en) | Recommendation method for category and group to which asset management data of power equipment belong | |
CN114997360B (en) | Evolution parameter optimization method, system and storage medium of neural architecture search algorithm | |
Wang et al. | Gradient Flow of Energy: A General and Efficient Approach for Entity Alignment Decoding | |
CN110609914B (en) | Online Hash learning image retrieval method based on rapid category updating | |
Han et al. | Text-to-Image Person Re-identification Based on Multimodal Graph Convolutional Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||