CN115587187A - Knowledge graph complementing method based on small sample - Google Patents
Knowledge graph complementing method based on small sample
- Publication number
- CN115587187A (application CN202110759582.1A)
- Authority
- CN
- China
- Prior art keywords
- entity
- query
- lstm
- adopting
- knowledge graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Animal Behavior & Ethology (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a small-sample knowledge graph completion method. The method comprises the following steps: obtaining a reference set and a query set for a relation r in the meta-training set, and performing negative sampling on the query set; enhancing the word-embedding representation of the current entity with a neighbor coding module; enhancing the interaction between reference entity pairs with an LSTM aggregator to obtain a general word-embedding representation of the reference set; and finally, performing similarity matching between the query entity pair and the reference set with an LSTM-based matching network. The method performs well on small-sample knowledge graph completion, outperforms most existing models, and offers short network training time and strong stability. It has broad application prospects in information retrieval, natural language understanding, question answering systems, and similar areas.
Description
Technical Field
The invention relates to a small-sample knowledge graph completion method and belongs to the field of deep learning.
Background
Large-scale knowledge graphs (KGs) such as Freebase, NELL, and Wikidata are extremely useful resources for many NLP tasks, such as information retrieval, machine reading, and relation extraction. A typical knowledge graph is a multi-relational graph whose facts are represented as triples (h, r, t), in which a pair of entities h and t is connected by a relation r. Although a KG contains a large number of triples, it is still incomplete. Knowledge graph completion aims to automatically infer missing facts from existing ones and has therefore received considerable attention. One promising approach, KG embedding, has been successfully applied to this task. Its key idea is to embed entities and relations into a continuous vector space and to make predictions from these embedded representations. However, current embedding methods need sufficient training triples for every relation in order to learn effective embeddings of entities and relations, whereas a large share of the relations in real knowledge graphs are long-tailed, that is, only a limited number of triples is available for them. For such long-tailed relations, the effectiveness of the model is directly degraded.
Several recent studies address small-sample knowledge graph completion. GMatching is the first known study of one-shot learning on knowledge graphs. It proposes a neighbor encoder that enhances the embedding of the current entity by aggregating information from its one-hop neighbors, and uses an LSTM matching processor to perform multi-step matching through LSTM blocks. FSRL extends GMatching to the few-shot case and uses an attention mechanism to further capture local graph structure features. MetaR completes incomplete facts by transferring shared knowledge extracted from the few existing facts to them. However, these studies learn static representations of entities or references and ignore their dynamic nature. The present invention seeks to learn dynamic representations of entities and reference sets through an adaptive attention network. Dynamic characteristics have also been explored in other work: Luo et al. (2019) model user preferences with a recurrent network with adaptive attention and make recommendations accordingly. These results indicate that modeling dynamic properties can enhance the learning capability of an algorithm.
Disclosure of Invention
The invention provides a small-sample knowledge graph completion method. When performing small-sample knowledge graph completion, the method fully considers how the different roles of the nodes connected to a central node influence that node, models the central node dynamically, and thereby enhances the word-embedding representation of the entity. It also uses an LSTM-based aggregator to model the interaction between the entity pairs of a reference set, which strengthens the representation capability of the reference set.
The invention achieves this purpose through the following technical scheme:
The small-sample knowledge graph completion method comprises the following steps:
Step one: obtain a reference set and a query set for the relation r in the meta-training set, and perform negative sampling on the query set.
Step two: enhance the word-embedding representation of the current entity with a neighbor coding module.
Step three: enhance the interaction between the reference entity pairs with an LSTM aggregator to obtain a general word-embedding representation of the reference set.
Step four: perform similarity matching between the query entity pair and the reference set with an LSTM-based matching network.
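The negative sampling in step one can be illustrated with a minimal sketch. It assumes that a negative pair is built by keeping the head entity and replacing the tail with a candidate that does not form a known true pair, which matches the (h, t') notation used in the claims. The function name `negative_sample`, the candidate list, and the toy entities are hypothetical and are not taken from the patent.

```python
import random


def negative_sample(query_pairs, candidate_tails, known_pairs, seed=0):
    """Pair each positive (h, t) with one corrupted (h, t') that is not a known fact."""
    rng = random.Random(seed)
    samples = []
    for h, t in query_pairs:
        # Keep only tails that differ from t and do not yield a known true pair.
        pool = [c for c in candidate_tails if c != t and (h, c) not in known_pairs]
        if not pool:
            continue  # no valid corruption exists for this pair
        samples.append(((h, t), (h, rng.choice(pool))))
    return samples


# Toy query set for one relation (hypothetical data).
query = [("paris", "france"), ("berlin", "germany")]
candidates = ["france", "germany", "spain"]
known = set(query)
pairs = negative_sample(query, candidates, known)
```

Each element of `pairs` holds a positive pair followed by its corrupted counterpart, which is the form the matching step needs when comparing positive and negative scores.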
The function of the neighbor encoder in step two is explained as follows:
(1) For a reference entity pair related to the query relation r, the word-embedding representation of r can be obtained from the current entity pair (h, t) under the assumption of the TransE model, as shown in formula (1).
$r = t - h$ (1)
(2) Among the neighbor nodes connected to the current entity node, different neighbors contribute differently to the central node, and these contributions change with the query relation r. To compute the contribution of each connected node dynamically under different query relations, the query relation r is matched for similarity against the neighbor relations, and an attention mechanism assigns a different weight coefficient to each neighbor node according to the query relation r, as shown in formulas (2), (3), and (4).
(3) To further enhance the word-embedding representation of the central entity, the entity and relation of each neighbor node are combined by circular convolution and multiplied by the corresponding weight to represent the central node. The resulting word-embedding representation is shown in formula (5).
$f(h) = \sigma(W_1 h + W_2 c_{nbr})$ (5)
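The neighbor encoder can be sketched end to end under some stated assumptions. Formulas (2) to (4) are not reproduced in this text, so the sketch substitutes a dot-product similarity followed by a softmax for the attention weights; that choice, the reading of c_nbr as the attention-weighted sum of circular-convolution outputs, and the use of a sigmoid for σ are all assumptions rather than the patent's exact definitions. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def circular_convolution(a, b):
    """out[i] = sum_j a[j] * b[(i - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def encode_entity(h, neighbors, r_query, W1, W2):
    """Sketch of f(h) = sigma(W1 h + W2 c_nbr); neighbors is a list of (relation, entity)."""
    # Assumed attention: similarity of the query relation to each neighbor relation.
    alphas = softmax([dot(r_query, r_nbr) for r_nbr, _ in neighbors])
    c_nbr = [0.0] * len(h)
    for alpha, (r_nbr, e_nbr) in zip(alphas, neighbors):
        mixed = circular_convolution(r_nbr, e_nbr)
        c_nbr = [c + alpha * m for c, m in zip(c_nbr, mixed)]
    pre = [dot(r1, h) + dot(r2, c_nbr) for r1, r2 in zip(W1, W2)]
    return [1.0 / (1.0 + math.exp(-p)) for p in pre], alphas  # sigmoid assumed for sigma


# Toy 2-dimensional example (hypothetical values).
h, t = [1.0, 0.0], [1.5, 1.0]
r = [ti - hi for ti, hi in zip(t, h)]  # formula (1): r = t - h
neighbors = [([0.5, 1.0], [0.2, 0.1]), ([-1.0, 0.0], [0.3, 0.4])]
identity = [[1.0, 0.0], [0.0, 1.0]]
encoded, alphas = encode_entity(h, neighbors, r, identity, identity)
```

In this toy case the first neighbor's relation is identical to the query relation, so it receives the larger attention weight, which is the behavior the description attributes to the attention mechanism.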
The function of the LSTM aggregator in step three is explained as follows:
Each reference set contains its corresponding reference entity pairs, and the interaction between different entity pairs is important for enhancing the representation of the reference set. An LSTM-based aggregator is therefore used to enhance the interaction between the reference entity pairs and obtain a general representation of the reference set, as shown in formulas (6) and (7).
Here $m'$ is the representation of a reference entity pair in the current reference set, $\beta_k$ is the weight coefficient of the current reference entity pair, and $f_\epsilon(R_r)$ is the aggregated representation of the current reference entity pairs.
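The final weighting step of the aggregator can be illustrated with a simplified sketch. Formulas (6) and (7) are not reproduced in this text, and the LSTM encoder-decoder that produces each m' is omitted here. The sketch only assumes that the aggregated representation is a weighted sum of the pair representations with softmax-normalized weights β_k; the scoring vector `u` and the function name are hypothetical stand-ins, not the patent's definitions.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_reference_set(pair_reprs, u):
    """Weighted sum of reference pair representations m'_k with weights beta_k."""
    # Assumed scoring: dot product of each m'_k with a learnable vector u.
    scores = [sum(ui * mi for ui, mi in zip(u, m)) for m in pair_reprs]
    betas = softmax(scores)
    dim = len(pair_reprs[0])
    aggregated = [sum(b * m[i] for b, m in zip(betas, pair_reprs)) for i in range(dim)]
    return aggregated, betas


# Three reference pairs in a 2-dimensional toy space (hypothetical values).
pair_reprs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
u = [1.0, 1.0]
aggregated, betas = aggregate_reference_set(pair_reprs, u)
```

In a full model the inputs would be the LSTM hidden states for each reference pair rather than fixed vectors, and `u` would be trained jointly with the rest of the network.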
Drawings
FIG. 1 is a model architecture diagram of the small-sample knowledge graph completion method of the present invention.
FIG. 2 is an architecture diagram of the neighbor coding module of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
FIG. 1 is the model architecture diagram of the small-sample knowledge graph completion method. Small-sample knowledge graph completion relies mainly on three modules: a neighbor coding module, an LSTM-based aggregation module, and an LSTM-based matching module. Under different query relations r, the neighbor coding module dynamically assigns weight coefficients to the neighbor nodes through an attention mechanism, so that the influence of each neighbor node on the central node is fully considered. The aggregation module enhances the interaction of the reference entity pairs in the reference set through an LSTM to obtain a general representation of the reference set. The matching network computes the similarity between each query entity pair and the reference set to identify the correct query entity pair and thereby predict the missing entity.
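The matching step can be sketched in a simplified form. The patent's matching network is LSTM-based and its exact scoring function is not given in this text, so the sketch below uses cosine similarity as a stand-in and a margin-based ranking loss as one common way to push positive scores above negative ones. Both choices, the margin value, and all names are assumptions for illustration.

```python
import math


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def rank_candidates(reference_repr, candidates):
    """Score each candidate pair representation against the reference set, best first."""
    scored = [(name, cosine(reference_repr, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def margin_ranking_loss(pos_score, neg_scores, margin=0.5):
    """Mean hinge loss; zero once the positive beats every negative by the margin."""
    return sum(max(0.0, margin + s - pos_score) for s in neg_scores) / len(neg_scores)


# Toy representations (hypothetical values).
reference = [1.0, 0.0]
candidates = {"pos": [0.9, 0.1], "neg_a": [0.0, 1.0], "neg_b": [-1.0, 0.0]}
ranked = rank_candidates(reference, candidates)
scores = dict(ranked)
loss = margin_ranking_loss(scores["pos"], [scores["neg_a"], scores["neg_b"]])
```

At prediction time only the ranking is needed: the candidate entity whose pair scores highest against the reference set is taken as the missing entity.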
FIG. 2 is the architecture diagram of the neighbor coding module. The module performs similarity matching between the query relation r and the relation of the reference entity set, dynamically assigns attention coefficients to the different neighbor nodes, combines the entity and relation of each neighbor node by circular convolution, and updates the central node with the result, further enhancing the word-embedding representation of the central entity.
The invention was tested on public datasets to validate its effectiveness and reliability. Details of the datasets used in the experiments are shown in Table 1.
Table 1 Detailed information of the experimental datasets
To verify the effectiveness of the small-sample knowledge graph completion method, the method of the invention was compared with models including RESCAL, TransE, DistMult, ComplEx, GMatching, FSRL, and FANN. The results, shown in Table 2, indicate that the method described in this patent clearly improves small-sample knowledge graph completion.
Table 2 Experimental results
Claims (4)
1. A small-sample knowledge graph completion method, characterized by comprising the following steps:
step one: obtaining a reference set and a query set for the relation r in the meta-training set, and performing negative sampling on the query set;
step two: enhancing the word-embedding representation of the current entity with a neighbor coding module;
step three: enhancing the interaction between the reference entity pairs with an LSTM aggregator to obtain a general word-embedding representation of the reference set;
step four: performing similarity matching between the query entity pair and the reference set with an LSTM-based matching network.
2. The neighbor coding module according to claim 1, wherein the neighbor coding module used in step two adopts the idea of local graph convolution: the entities connected to the central node are used to enhance the embedded representation of the current entity; considering the dynamic characteristic that different roles have different influence under the query relation r, an attention mechanism assigns corresponding weight coefficients to the different neighbor nodes; a circular convolution operation then combines the entity and relation of each neighbor node, and the result is multiplied by the corresponding weight coefficient to enhance the word-embedding representation of the current central node.
3. The LSTM aggregator according to claim 1, wherein the LSTM-based aggregator used in step three enhances the interaction between reference entity pairs and aggregates the information of all reference sets using an encoder-decoder structure to obtain a general word-embedding representation associated with the reference set.
4. The matching network according to claim 1, wherein in step four an LSTM-based matching network performs similarity matching between all query entity pairs (positive entity pairs $(h_l, t_l)$ and negatively sampled pairs $(h_l, t'_l)$) and the general representation of the reference set, such that the similarity score of the positive query entity pair is higher than the similarity scores of all negative query entity pairs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110759582.1A CN115587187A (en) | 2021-07-05 | 2021-07-05 | Knowledge graph complementing method based on small sample |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110759582.1A CN115587187A (en) | 2021-07-05 | 2021-07-05 | Knowledge graph complementing method based on small sample |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115587187A true CN115587187A (en) | 2023-01-10 |
Family
ID=84771486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110759582.1A Pending CN115587187A (en) | 2021-07-05 | 2021-07-05 | Knowledge graph complementing method based on small sample |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115587187A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116629356A (en) * | 2023-05-09 | 2023-08-22 | 华中师范大学 | Encoder and Gaussian mixture model-based small-sample knowledge graph completion method |
-
2021
- 2021-07-05 CN CN202110759582.1A patent/CN115587187A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116629356A (en) * | 2023-05-09 | 2023-08-22 | 华中师范大学 | Encoder and Gaussian mixture model-based small-sample knowledge graph completion method |
CN116629356B (en) * | 2023-05-09 | 2024-01-26 | 华中师范大学 | Encoder and Gaussian mixture model-based small-sample knowledge graph completion method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Xie et al. | Deep learning enabled semantic communication systems | |
CN111753024B (en) | Multi-source heterogeneous data entity alignment method oriented to public safety field | |
CN110880019B (en) | Method for adaptively training target domain classification model through unsupervised domain | |
CN112131404A (en) | Entity alignment method in four-risk one-gold domain knowledge graph | |
CN111930894B (en) | Long text matching method and device, storage medium and electronic equipment | |
CN109871504B (en) | Course recommendation system based on heterogeneous information network and deep learning | |
CN111210002B (en) | Multi-layer academic network community discovery method and system based on generation of confrontation network model | |
JP2022169743A (en) | Information extraction method and device, electronic equipment, and storage medium | |
CN110263236A (en) | Social network user multi-tag classification method based on dynamic multi-view learning model | |
CN115982480A (en) | Sequence recommendation method and system based on cooperative attention network and comparative learning | |
CN117237559A (en) | Digital twin city-oriented three-dimensional model data intelligent analysis method and system | |
CN115587187A (en) | Knowledge graph complementing method based on small sample | |
CN111339258B (en) | University computer basic exercise recommendation method based on knowledge graph | |
CN112529057A (en) | Graph similarity calculation method and device based on graph convolution network | |
CN107633259A (en) | A kind of cross-module state learning method represented based on sparse dictionary | |
CN116881416A (en) | Instance-level cross-modal retrieval method for relational reasoning and cross-modal independent matching network | |
CN116055436A (en) | Knowledge-graph-driven multi-user cognitive semantic communication system and method | |
CN112836511B (en) | Knowledge graph context embedding method based on cooperative relationship | |
CN115587192A (en) | Relationship information extraction method, device and computer readable storage medium | |
SHIDIK et al. | Linked open government data as background knowledge in predicting forest fire | |
Sun et al. | Task-Oriented Explainable Semantic Communications Based on Structured Scene Graphs | |
Zhang et al. | A Spatial-Aware Representation Learning Model for Link Completion in GeoKG: A Case Study on Wikidata and OpenStreetMap | |
CN113254597B (en) | Model training method, query processing method and related equipment | |
CN114338093B (en) | Method for transmitting multi-channel secret information through capsule network | |
Chakeri et al. | From geolocation-based only to semantically-aware digital advertising: A neural embedding approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||