CN113761221B - Knowledge graph entity alignment method based on graph neural network

Info

Publication number
CN113761221B
Authority
CN
China
Prior art keywords
entity
aligned
graph
neural network
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110734416.6A
Other languages
Chinese (zh)
Other versions
CN113761221A (en)
Inventor
张静
栾瑞鹏
亓东林
孙晓
陈曙东
朱浩洋
欧阳小叶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese People's Liberation Army 32801
Original Assignee
Chinese People's Liberation Army 32801
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-30
Filing date
2021-06-30
Publication date
2022-02-15
Application filed by Chinese People's Liberation Army 32801
Priority to CN202110734416.6A
Publication of CN113761221A
Application granted
Publication of CN113761221B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

To address the information loss caused by existing knowledge graph entity alignment methods, the invention discloses a knowledge graph entity alignment method based on a graph neural network. The method comprises: data preprocessing, in which the two knowledge graphs to be aligned and the existing alignment seeds are preprocessed and the result is used as the input of the next step; constructing a graph neural network model, in which the preprocessing result is input into a graph convolutional neural network and the two knowledge graphs to be aligned are modeled jointly by that network to obtain vectorized representations of the entities in the knowledge graphs; and, based on a greedy algorithm, searching the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity and taking it as the aligned entity. By constructing a single unified graph neural network that jointly models the two graphs to be aligned, the invention makes more effective use of the combined information of the two graphs, obtains more accurate entity vector representations, and improves the accuracy of entity alignment.

Description

Knowledge graph entity alignment method based on graph neural network
Technical Field
The invention relates to the field of knowledge graph-based entity alignment, in particular to a knowledge graph entity alignment method based on a graph neural network.
Background
In recent years, knowledge graph technology has developed rapidly, and research based on knowledge graphs has proliferated. Researchers have constructed a large number of general-purpose and domain-specific knowledge graphs, whose contents both overlap and complement one another. How to fuse multi-source heterogeneous knowledge graphs into a more complete knowledge graph, and thereby better support graph-based applications, is an urgent problem. Entity alignment is an important technique for knowledge graph fusion.
At present, knowledge graph entity alignment methods mainly use two graph neural networks to model the two graphs to be aligned independently, obtain vector representations of the entities and relations in each graph separately, search the vector space with a greedy algorithm or similar, and take the pair of entities with the closest vector representations as aligned entities. Because such methods model the two graphs separately, while the entity alignment task needs to make full use of the combined information of both graphs, they cause a certain degree of information loss.
Disclosure of Invention
The invention discloses a knowledge graph entity alignment method based on a graph neural network, aiming at the problem that the existing knowledge graph entity alignment method can cause information loss to a certain degree.
The invention discloses a knowledge graph entity alignment method based on a graph neural network, which comprises the following specific steps:
S1, data preprocessing: preprocess the two knowledge graphs to be aligned and the existing alignment seeds, and use the result as the input of step S2;
S2, constructing a graph neural network model: input the preprocessing result of step S1 into a graph convolutional neural network, and jointly model the two knowledge graphs to be aligned with the graph neural network to obtain vectorized representations of the entities in the knowledge graphs;
S3, based on a greedy algorithm, search the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity, and take that entity as the aligned entity.
In step S1, during data preprocessing, all triples and aligned entity seed pairs contained in the two knowledge graphs are processed, and an embedding layer from the Keras neural network library is used for random initialization to obtain vector representations of the entities and relations in the triples. To ensure that the subsequently constructed graph neural network jointly models the two knowledge graphs to be aligned, the two graphs are linked through the alignment seeds so that the initial data is fully exploited. The pre-aligned seeds are treated as aligned triples, and an adjacency matrix is constructed. To construct cross-graph relation triples, all existing triples are traversed; for each triple that contains an entity appearing in an alignment seed, that entity is replaced with its seed-aligned counterpart to generate a new triple. This yields the cross-graph adjacency matrix and relation triples, and a preprocessing result that combines the information of the two knowledge graphs to be aligned.
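The triple-expansion part of step S1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `expand_cross_graph_triples`, the `zh:`/`en:` entity prefixes, and the assumption that each seed entity has exactly one counterpart are all hypothetical choices made for the example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def expand_cross_graph_triples(
    triples: List[Triple], seeds: List[Tuple[str, str]]
) -> List[Triple]:
    """Return the new triples obtained by swapping seed entities for their counterparts."""
    # Assumption: seeds are one-to-one, so each entity maps to a single counterpart.
    counterpart: Dict[str, str] = {}
    for a, b in seeds:
        counterpart[a] = b
        counterpart[b] = a

    existing: Set[Triple] = set(triples)
    new_triples: List[Triple] = []
    for head, rel, tail in triples:
        candidates: List[Triple] = []
        if head in counterpart:
            candidates.append((counterpart[head], rel, tail))
        if tail in counterpart:
            candidates.append((head, rel, counterpart[tail]))
        for cand in candidates:
            # Skip triples that already exist or were already generated.
            if cand not in existing:
                existing.add(cand)
                new_triples.append(cand)
    return new_triples


# Hypothetical toy input: one triple per graph and one seed pair.
toy_triples = [("zh:A", "r1", "zh:B"), ("en:A", "r2", "en:C")]
toy_seeds = [("zh:A", "en:A")]
print(expand_cross_graph_triples(toy_triples, toy_seeds))
```

The expanded triples would then be added to the original ones before the adjacency matrix is built, so that each seed entity is linked to the neighbors of its counterpart.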
In step S2, the preprocessing result obtained in step S1 is input into a graph convolutional neural network, and the two knowledge graphs to be aligned are modeled jointly to obtain vectorized representations of the entities and relations in the graphs. The vector representations of entities and relations in the graph convolutional neural network are adjusted iteratively, with the convergence target that the similarity between entity vector representations be consistent with the semantic similarity between the entities. The iteration proceeds as follows:
S21, information aggregation: after the graph vector representations are initialized, an aggregation operation is performed on each node in the graph; the information of its neighbor nodes is aggregated to update the vector representation of the central node, so that information from all connected nodes is aggregated. The aggregation is computed as:
Figure BDA0003141045500000031
where
Figure BDA0003141045500000032
denotes the vector representation of the i-th entity e_i at the l-th iteration,
Figure BDA0003141045500000033
denotes the vector representation of the k-th entity e_k at the l-th iteration, and
Figure BDA0003141045500000034
denotes the set of all neighbor nodes of e_i.
S22, according to the loss function, the gradients with respect to all variables of the graph neural network model are computed by backpropagation using the chain rule, and the model parameters are updated by gradient descent. The loss function L is:
Figure BDA0003141045500000035
where P denotes the pre-aligned seed entity pairs, P' denotes the other entity pairs obtained by random negative sampling,
Figure BDA0003141045500000036
denote the vector representations, after graph neural network encoding, of the pre-aligned seed entities e_i and e_j respectively,
Figure BDA0003141045500000037
denote the vector representations, after graph neural network encoding, of the randomly negatively sampled entities e'_i and e'_j respectively, and λ denotes the threshold. Backpropagating the loss through all variables of the graph neural network updates the vector representations of the knowledge graph.
S23, repeat steps S21 and S22 until training is complete and the graph vector representations no longer change.
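Steps S21 and S22 can be illustrated with the sketch below. The formulas in this record survive only as image placeholders, so the concrete forms are assumptions: `aggregate` uses a plain mean over a node and its neighbors (no learned weights or nonlinearity), and `margin_loss` uses a hinge on Euclidean distances with every seed pair compared against every negative pair. All names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def aggregate(
    h: Dict[str, Vector], neighbors: Dict[str, List[str]]
) -> Dict[str, Vector]:
    """One aggregation round: each node becomes the mean of itself and its neighbors.

    Assumption: mean aggregation with a self-loop; the patent's formula may differ.
    """
    updated: Dict[str, Vector] = {}
    for node, vec in h.items():
        group = [vec] + [h[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(v[d] for v in group) / len(group) for d in range(len(vec))
        ]
    return updated


def margin_loss(
    h: Dict[str, Vector],
    positives: List[Tuple[str, str]],
    negatives: List[Tuple[str, str]],
    margin: float,
) -> float:
    """Hinge loss: seed pairs should be closer than negative pairs by at least `margin`.

    Assumption: Euclidean distance and an all-pairs sum over P and P'.
    """
    total = 0.0
    for ei, ej in positives:
        pos_dist = math.dist(h[ei], h[ej])
        for ni, nj in negatives:
            neg_dist = math.dist(h[ni], h[nj])
            total += max(0.0, pos_dist - neg_dist + margin)
    return total


# Hypothetical toy graph with three nodes.
h0 = {"a": [0.0, 0.0], "b": [2.0, 0.0], "c": [0.0, 4.0]}
nbrs = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
h1 = aggregate(h0, nbrs)
print(h1["b"], h1["c"])
print(margin_loss(h0, [("a", "b")], [("a", "c")], margin=3.0))
```

In a real model the gradient of the loss would be obtained by automatic differentiation, and the loop of S23 would stop once the representations change by less than some tolerance between iterations.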
In step S3, a greedy algorithm is used to search the entire vector space, and pairs of entities from different knowledge graphs whose node vector representations have a Euclidean distance smaller than a threshold are taken as aligned entities.
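A minimal sketch of the search in step S3 is given below. The text specifies only a greedy search with a Euclidean distance threshold; matching each entity at most once and processing candidate pairs in order of increasing distance are assumptions of this example, as are the function name `greedy_align` and the toy vectors.

```python
import math
from typing import Dict, List, Set, Tuple

Vector = List[float]


def greedy_align(
    src: Dict[str, Vector], tgt: Dict[str, Vector], threshold: float
) -> List[Tuple[str, str, float]]:
    """Greedily pair entities across two graphs by ascending Euclidean distance."""
    # Collect every cross-graph pair whose distance is below the threshold.
    candidates: List[Tuple[float, str, str]] = []
    for s, u in src.items():
        for t, v in tgt.items():
            dist = math.dist(u, v)
            if dist < threshold:
                candidates.append((dist, s, t))
    candidates.sort()

    used_src: Set[str] = set()
    used_tgt: Set[str] = set()
    aligned: List[Tuple[str, str, float]] = []
    for dist, s, t in candidates:
        # Assumption: one-to-one alignment, so each entity is matched at most once.
        if s in used_src or t in used_tgt:
            continue
        used_src.add(s)
        used_tgt.add(t)
        aligned.append((s, t, dist))
    return aligned


# Hypothetical learned vectors for two small graphs.
zh_vectors = {"zh:China": [0.0, 0.0], "zh:Beijing": [5.0, 5.0]}
en_vectors = {"en:China": [0.1, 0.0], "en:Asia": [9.0, 9.0]}
for s, t, d in greedy_align(zh_vectors, en_vectors, threshold=1.0):
    print(s, t, round(d, 3))
```

The all-pairs scan is quadratic in the number of entities; for large graphs an approximate nearest-neighbor index would be a natural substitute, although the patent does not discuss one.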
The invention has the beneficial effects that:
1. The method improves on existing graph neural network entity alignment methods by replacing the separate modeling and optimization of the two graphs with unified modeling of the two graphs as one large graph, which better captures the associations between the two graphs to be aligned and improves entity alignment accuracy. The joint modeling scheme is simple and easy to implement, extends well, and can be embedded as a module into existing entity alignment models to improve their accuracy.
2. The invention offers a new idea for other knowledge graph-based applications: the training set serves not only as the supervision signal during gradient descent but can also be fed into the model as part of the enhanced input features, making fuller use of the training data and improving the final performance of the model.
Drawings
FIG. 1 is a flow chart of the steps of the method.
Detailed Description
For a better understanding of the present disclosure, an example is given here.
The invention discloses a knowledge graph entity alignment method based on a graph neural network, which comprises the following steps:
S1, data preprocessing: preprocess the two knowledge graphs to be aligned and the existing alignment seeds, and use the result as the input of step S2;
S2, constructing a graph neural network model: input the preprocessing result of step S1 into a graph convolutional neural network, and jointly model the two knowledge graphs to be aligned with the graph neural network to obtain vectorized representations of the entities in the knowledge graphs;
S3, based on a greedy algorithm, search the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity, and take that entity as the aligned entity.
In step S1, during data preprocessing, all triples and aligned entity seed pairs contained in the two knowledge graphs are processed, and an embedding layer from the Keras neural network library is used for random initialization to obtain vector representations of the entities and relations in the triples. To ensure that the subsequently constructed graph neural network jointly models the two knowledge graphs to be aligned, the two graphs are linked through the alignment seeds so that the initial data is fully exploited. The pre-aligned seeds are treated as aligned triples, and an adjacency matrix is constructed. To construct cross-graph relation triples, all existing triples are traversed; for each triple that contains an entity appearing in an alignment seed, that entity is replaced with its seed-aligned counterpart to generate a new triple. This yields the cross-graph adjacency matrix and relation triples, and a preprocessing result that combines the information of the two knowledge graphs to be aligned. As an illustration of step S1, suppose a Chinese knowledge graph and an English knowledge graph need to be aligned, and let the suffixes (zh) and (en) mark the graph an entity belongs to. The Chinese graph contains the triple China(zh) -> capital -> Beijing(zh), the English graph contains the triple China(en) -> belongsto -> Asia(en), and China(zh)-China(en) is a pre-aligned entity seed. Following the joint modeling principle described above, two new triples can be derived from these two triples: China(zh) -> belongsto -> Asia(en) and China(en) -> capital -> Beijing(zh). The expanded triples are added to the training set as augmentation data, which can effectively improve the final entity alignment accuracy of the model.
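For the Chinese/English example above, the joint adjacency matrix could be built along the following lines. This is only a sketch under stated assumptions: the edges are undirected and unweighted, each seed pair is itself added as an edge (one reading of "regarding the pre-aligned seeds as aligned triples"), and the `zh:`/`en:` prefixes and the function name `build_joint_adjacency` are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_joint_adjacency(
    triples: List[Triple], seeds: List[Tuple[str, str]]
) -> Tuple[List[str], List[List[int]]]:
    """Build one symmetric 0/1 adjacency matrix over the entities of both graphs."""
    entities = sorted(
        {e for h, _, t in triples for e in (h, t)}
        | {e for pair in seeds for e in pair}
    )
    index: Dict[str, int] = {e: i for i, e in enumerate(entities)}
    adj = [[0] * len(entities) for _ in entities]

    def link(a: str, b: str) -> None:
        # Assumption: relations are treated as undirected edges.
        adj[index[a]][index[b]] = 1
        adj[index[b]][index[a]] = 1

    for head, _, tail in triples:
        link(head, tail)
    for a, b in seeds:
        # Assumption: a seed pair is connected directly, like an "aligned" triple.
        link(a, b)
    return entities, adj


# Original triples plus the two expanded cross-graph triples from the example.
triples = [
    ("zh:China", "capital", "zh:Beijing"),
    ("en:China", "belongsto", "en:Asia"),
    ("en:China", "capital", "zh:Beijing"),
    ("zh:China", "belongsto", "en:Asia"),
]
seeds = [("zh:China", "en:China")]
entities, adj = build_joint_adjacency(triples, seeds)
for name, row in zip(entities, adj):
    print(name, row)
```

With this matrix, both `zh:China` and `en:China` are adjacent to the same two neighbors as well as to each other, which is what allows a single graph neural network to aggregate information across the two graphs.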
In step S2, the preprocessing result obtained in step S1 is input into a graph convolutional neural network, and the two knowledge graphs to be aligned are modeled jointly to obtain vectorized representations of the entities and relations in the graphs. The vector representations of entities and relations in the graph convolutional neural network are adjusted iteratively, with the convergence target that the similarity between entity vector representations be consistent with the semantic similarity between the entities. The iteration proceeds as follows:
S21, information aggregation: after the graph vector representations are initialized, an aggregation operation is performed on each node in the graph, and the information of its neighbor nodes is aggregated to update the vector representation of the central node. Because of the processing in S1, the aggregated information differs from that of existing graph neural networks, which aggregate only the neighbors within the node's own graph: step S1 links the graphs through the cross-graph adjacency matrix and cross-graph relation triples, so the information of all linked nodes is aggregated. The aggregation is computed as:
Figure BDA0003141045500000051
where
Figure BDA0003141045500000052
denotes the vector representation of the i-th entity e_i at the l-th iteration,
Figure BDA0003141045500000053
denotes the vector representation of the k-th entity e_k at the l-th iteration, and
Figure BDA0003141045500000054
denotes the set of all neighbor nodes of e_i.
S22, according to the loss function, the gradients with respect to all variables of the graph neural network model are computed by backpropagation using the chain rule, and the model parameters are updated by gradient descent. The loss function L is:
Figure BDA0003141045500000061
where P denotes the pre-aligned seed entity pairs, P' denotes the other entity pairs obtained by random negative sampling,
Figure BDA0003141045500000062
denote the vector representations, after graph neural network encoding, of the pre-aligned seed entities e_i and e_j respectively,
Figure BDA0003141045500000063
denote the vector representations, after graph neural network encoding, of the randomly negatively sampled entities e'_i and e'_j respectively, and λ denotes the threshold. Backpropagating the loss through all variables of the graph neural network updates the vector representations of the knowledge graph.
S23, repeat steps S21 and S22 until training is complete and the graph vector representations no longer change.
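The two formulas of S21 and S22 are available in this record only as image placeholders. For orientation, one form that is consistent with the symbol definitions given in the text is written out below. It is an assumption, not a transcription: the symbols h, N(·), W and σ, the mean normalization, and the use of the Euclidean norm in the loss are all choices made for this illustration.

```latex
% Assumed aggregation step (S21): mean over neighbors, a weight matrix W^{(l)}, activation \sigma
h_{e_i}^{(l+1)} = \sigma\!\left( \frac{1}{|N(e_i)|} \sum_{e_k \in N(e_i)} W^{(l)} \, h_{e_k}^{(l)} \right)

% Assumed margin-based loss (S22): seed pairs in P, negative samples in P', threshold \lambda
L = \sum_{(e_i, e_j) \in P} \; \sum_{(e'_i, e'_j) \in P'}
    \max\!\left( 0,\; \lVert h_{e_i} - h_{e_j} \rVert - \lVert h_{e'_i} - h_{e'_j} \rVert + \lambda \right)
```

Under this reading, the loss is zero once every seed pair is closer than every sampled negative pair by at least λ, which matches the stated convergence target that vector similarity agree with semantic similarity.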
In step S3, a greedy algorithm is used to search the entire vector space, and pairs of entities from different knowledge graphs whose node vector representations have a Euclidean distance smaller than a threshold are taken as aligned entities.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (2)

1. A knowledge graph entity alignment method based on a graph neural network is characterized by comprising the following specific steps:
S1, data preprocessing: preprocess the two knowledge graphs to be aligned and the existing alignment seeds, and use the result as the input of step S2;
S2, constructing a graph neural network model: input the preprocessing result of step S1 into a graph convolutional neural network, and jointly model the two knowledge graphs to be aligned with the graph neural network to obtain vectorized representations of the entities in the knowledge graphs;
S3, based on a greedy algorithm, search the vector space for the entity whose vector representation has the highest semantic similarity to that of a given entity, and take that entity as the aligned entity;
in step S1, during data preprocessing, all triples and aligned entity seed pairs contained in the two knowledge graphs are processed, and an embedding layer from the Keras neural network library is used for random initialization to obtain vector representations of the entities and relations in the triples; to ensure that the subsequently constructed graph neural network jointly models the two knowledge graphs to be aligned, the two graphs are linked through the alignment seeds so that the initial data is fully exploited; the pre-aligned seeds are treated as aligned triples, and an adjacency matrix is constructed; to construct cross-graph relation triples, all existing triples are traversed, and for each triple that contains an entity appearing in an alignment seed, that entity is replaced with its seed-aligned counterpart to generate a new triple, thereby yielding the cross-graph adjacency matrix and relation triples and a preprocessing result that combines the information of the two knowledge graphs to be aligned;
in step S2, the preprocessing result obtained in step S1 is input into a graph convolutional neural network, and the two knowledge graphs to be aligned are modeled jointly to obtain vectorized representations of the entities and relations in the graphs; the vector representations of entities and relations in the graph convolutional neural network are adjusted iteratively, with the convergence target that the similarity between entity vector representations be consistent with the semantic similarity between the entities, and the iteration proceeds as follows:
S21, information aggregation: after the graph vector representations are initialized, an aggregation operation is performed on each node in the graph; the information of its neighbor nodes is aggregated to update the vector representation of the central node, so that information from all connected nodes is aggregated; the aggregation is computed as:
Figure FDA0003465288360000021
where
Figure FDA0003465288360000022
denotes the vector representation of the i-th entity e_i at the l-th iteration,
Figure FDA0003465288360000023
denotes the vector representation of the k-th entity e_k at the l-th iteration, and
Figure FDA0003465288360000024
denotes the set of all neighbor nodes of e_i;
S22, according to the loss function, the gradients with respect to all variables of the graph neural network model are computed by backpropagation using the chain rule, and the model parameters are updated by gradient descent, the loss function L being:
Figure FDA0003465288360000025
where P denotes the pre-aligned seed entity pairs, P' denotes the other entity pairs obtained by random negative sampling,
Figure FDA0003465288360000026
Figure FDA0003465288360000027
denote the vector representations, after graph neural network encoding, of the pre-aligned seed entities e_i and e_j respectively,
Figure FDA0003465288360000028
denote the vector representations, after graph neural network encoding, of the randomly negatively sampled entities e'_i and e'_j respectively, and λ denotes the threshold; backpropagating the loss through all variables of the graph neural network updates the vector representations of the knowledge graph;
S23, repeat steps S21 and S22 until training is complete and the graph vector representations no longer change.
2. The knowledge graph entity alignment method based on a graph neural network according to claim 1, wherein in step S3 a greedy algorithm is used to search the entire vector space, and pairs of entities from different knowledge graphs whose node vector representations have a Euclidean distance smaller than a threshold are taken as aligned entities.
CN202110734416.6A 2021-06-30 2021-06-30 Knowledge graph entity alignment method based on graph neural network Active CN113761221B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110734416.6A CN113761221B (en) 2021-06-30 2021-06-30 Knowledge graph entity alignment method based on graph neural network

Publications (2)

Publication Number Publication Date
CN113761221A CN113761221A (en) 2021-12-07
CN113761221B true CN113761221B (en) 2022-02-15

Family

ID=78787538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110734416.6A Active CN113761221B (en) 2021-06-30 2021-06-30 Knowledge graph entity alignment method based on graph neural network

Country Status (1)

Country Link
CN (1) CN113761221B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114880484B (en) * 2022-05-11 2023-06-16 军事科学院系统工程研究院网络信息研究所 Satellite communication frequency track resource map construction method based on vector mapping
CN116150405B (en) * 2023-04-19 2023-06-27 中电科大数据研究院有限公司 Heterogeneous data processing method for multiple scenes
CN116227592B (en) * 2023-05-06 2023-07-18 城云科技(中国)有限公司 Multisource knowledge graph alignment model, construction method, device and application thereof
CN117172321A (en) * 2023-11-02 2023-12-05 中国科学院空天信息创新研究院 Geographic entity alignment method and device for introducing graphic neural network and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344285A (en) * 2018-09-11 2019-02-15 武汉魅瞳科技有限公司 A kind of video map construction and method for digging, equipment towards monitoring
CN109389151A (en) * 2018-08-30 2019-02-26 华南师范大学 A kind of knowledge mapping treating method and apparatus indicating model based on semi-supervised insertion
CN110955780A (en) * 2019-10-12 2020-04-03 中国人民解放军国防科技大学 Entity alignment method for knowledge graph

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255002B (en) * 2018-09-11 2021-08-27 浙江大学 Method for solving knowledge graph alignment task by utilizing relationship path mining
CN110472065B (en) * 2019-07-25 2022-03-25 电子科技大学 Cross-language knowledge graph entity alignment method based on GCN twin network
CN110941722B (en) * 2019-10-12 2022-07-01 中国人民解放军国防科技大学 Knowledge graph fusion method based on entity alignment
CN111046186A (en) * 2019-10-30 2020-04-21 平安科技(深圳)有限公司 Entity alignment method, device and equipment of knowledge graph and storage medium
CN111191045B (en) * 2019-12-30 2023-06-16 创新奇智(上海)科技有限公司 Entity alignment method and system applied to knowledge graph
CN112417159B (en) * 2020-11-02 2022-04-15 武汉大学 Cross-language entity alignment method of context alignment enhanced graph attention network
CN112445876B (en) * 2020-11-25 2023-12-26 中国科学院自动化研究所 Entity alignment method and system for fusing structure, attribute and relationship information

Also Published As

Publication number Publication date
CN113761221A (en) 2021-12-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant