WO2022108206A1 - Method and apparatus for completing an explainable knowledge graph - Google Patents

Method and apparatus for completing an explainable knowledge graph

Info

Publication number
WO2022108206A1
WO2022108206A1 PCT/KR2021/015999 KR2021015999W WO2022108206A1 WO 2022108206 A1 WO2022108206 A1 WO 2022108206A1 KR 2021015999 W KR2021015999 W KR 2021015999W WO 2022108206 A1 WO2022108206 A1 WO 2022108206A1
Authority
WO
WIPO (PCT)
Prior art keywords
segments
explainable
knowledge graph
paths
subject
Prior art date
Application number
PCT/KR2021/015999
Other languages
English (en)
Korean (ko)
Inventor
박영택
이민호
이완곤
바트셀렘작바랄
Original Assignee
숭실대학교산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-11-19
Filing date
2021-11-05
Publication date
2022-05-27
Priority claimed from KR1020210016548A (KR102464999B1)
Application filed by 숭실대학교산학협력단
Publication of WO2022108206A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • The present invention relates to a method and apparatus for completing an explainable knowledge graph.
  • A knowledge graph is information that expresses the relationships between resources accumulated from various sources, such as the web, and represents the meaning between these concepts in graph form.
  • Knowledge graphs suffer from missing triples and from insufficient connections between some data.
  • The present invention proposes a knowledge graph completion method and apparatus that can present the derivation process as valid grounds for a link prediction.
  • According to one aspect, an explainable knowledge graph completion apparatus is provided, comprising: a processor; and a memory connected to the processor, wherein the memory stores program instructions executable by the processor to: extract, for a query triple including a subject, a predicate, and an object, a plurality of relationship paths capable of connecting the subject and the object; generate a plurality of explainable segments using the extracted plurality of relationship paths; extract an embedding vector for each of the generated explainable segments using a neural network model combining a CNN and an LSTM; compare, using an attention mechanism, the semantic similarity between the plurality of explainable segments represented by the embedding vectors and the query predicate included in the query triple; and determine, through the semantic similarity comparison, a segment with high importance in link prediction for the query triple among the plurality of explainable segments.
  • The plurality of relationship paths may be defined as paths that, among the one or more entities and one or more relationships connecting the subject to the object, consist only of the relationships, with the entities excluded.
  • The program instructions may extract the plurality of relationship paths by searching the one or more entities and the one or more relationships between the subject and the object through a random walk using a path ranking algorithm (PRA).
  • The program instructions may express the subject and object of every triple connected by the query predicate as a pair, and remove some of the plurality of relationship paths by using the random walk probability of each pair for each of the plurality of relationship paths.
  • The program instructions may remove some of the plurality of relationship paths by using the ratio of pairs whose random walk probability is greater than 0, the average value of the random walk probability, and the length of each of the plurality of relationship paths.
  • Each of the plurality of explainable segments is preprocessed to the same length n, and each entity and relationship is expressed as a d-dimensional vector. The CNN receives, for each explainable segment, data converted into an n × d matrix as input and outputs a feature map of that segment. The LSTM includes a forward LSTM layer and a backward LSTM layer, and receives the feature map as input to generate an embedding vector for each explainable segment.
  • The program instructions may calculate an attention score for each of the plurality of explainable segments by comparing the semantic similarity, and determine a segment having high importance in the link prediction result for the query triple based on the attention score.
  • According to another aspect, an explainable knowledge graph completion method is provided, performed in a device including a processor and a memory connected to the processor, the method comprising: extracting, for a query triple including a subject, a predicate, and an object, a plurality of relationship paths capable of connecting the subject and the object; generating a plurality of explainable segments using the extracted plurality of relationship paths; extracting an embedding vector for each of the generated explainable segments using a neural network model combining a CNN and an LSTM; comparing, using an attention mechanism, the semantic similarity between the plurality of explainable segments represented by the embedding vectors and the query predicate included in the query triple; and determining, through the semantic similarity comparison, a segment having high importance for link prediction with respect to the query triple from among the plurality of explainable segments.
  • FIG. 1 is a diagram illustrating the configuration of an explanatory knowledge graph completion apparatus according to an exemplary embodiment of the present invention.
  • FIG. 2 is a view for explaining a process of completing an explanatory knowledge graph according to the present embodiment.
  • FIG. 3 is a diagram illustrating an explainable segment embedding process according to an embodiment of the present invention.
  • FIG. 4 is a diagram illustrating the structure of an attention mechanism for link prediction according to an embodiment of the present invention.
  • Knowledge graph completion is the task of supplementing an incomplete knowledge graph by predicting missing links. Given a query triple <subject, predicate, ?>, it predicts the object corresponding to "?".
  • The subject and the object are each defined as an entity, and the predicate is defined as a relationship.
  • The present invention relates to a method capable of explaining the result of link prediction. When a query triple is input, it not only predicts a link to the object corresponding to the correct answer among a plurality of candidate objects connected to the subject, but also presents an inference path that provides an explanation supporting the predicted link.
  • An inference path is defined as a set of entities and relationships that starts at the subject and reaches the object, and an inference path that can serve as an explanation is defined as an explanation segment.
  • FIG. 1 is a diagram illustrating the configuration of an explanatory knowledge graph completion apparatus according to an exemplary embodiment of the present invention.
  • The knowledge graph completion apparatus may include a processor 100 and a memory 102.
  • The processor 100 may include a central processing unit (CPU) capable of executing a computer program, or other virtual machines.
  • The memory 102 may include a non-volatile storage device such as a fixed hard drive or a removable storage device.
  • The removable storage device may include a compact flash unit, a USB memory stick, and the like.
  • The memory 102 may also include volatile memory, such as various random access memories.
  • The memory 102 stores program instructions executable by the processor 100.
  • The program instructions according to the present embodiment extract, for a query triple including a subject, a predicate, and an object, a plurality of relationship paths that can connect the subject and the object; generate a plurality of explainable segments using the extracted relationship paths; extract an embedding vector for each of the generated explainable segments using a neural network model that combines a CNN and an LSTM; compare, using an attention mechanism, the semantic similarity between the explainable segments expressed by the embedding vectors and the predicate included in the query triple; and determine, through the semantic similarity comparison, a segment having high importance in link prediction for the query triple from among the plurality of explainable segments.
  • Hereinafter, the process of determining an explanation segment with high importance for link prediction, in order to complete the knowledge graph, will be described in detail.
  • The object of the query triple may be the object corresponding to the correct answer among the objects that can be connected to the subject.
  • FIG. 2 is a view for explaining a process of completing an explanatory knowledge graph according to the present embodiment.
  • FIG. 2 illustrates, by way of example, the case in which the United States is the correct object for the query triple <Tom Cruise, nationality, ?>.
  • The explainable segments in FIG. 2 are the three inference paths that exist between Tom Cruise and the United States.
  • Here, an explanation means an explanation supporting the result of link prediction, and the present invention distinguishes meaningful segments (those with high importance in link prediction) from meaningless segments among the various explanation segments.
  • A segment with high importance in the link prediction result for the query triple may be determined as a segment whose attention score, described below, is equal to or higher than a preset value, or as a segment ranked at or above a preset rank among the plurality of segments.
  • Explanation segment 3, which cannot be presented as a basis for the inference result, is classified as a meaningless explanation segment, while explanation segments 1 and 2, which can be presented as a basis for link prediction, are classified as meaningful explanation segments.
  • An explainable segment is any of the various paths that can connect the subject (s) and the object (o) of a triple <s, r, o>.
  • A relationship path is a path that can connect the subject to the object and that consists only of the relationships along that path, not the entities.
  • Here, e denotes an entity and r denotes a relationship.
  • A number of entities and relationships between the subject and the object are searched through a random walk using a path ranking algorithm (PRA), and various relationship paths are extracted from the result.
  • The subject and object of every triple connected by the query predicate are expressed as a pair (s, o), and the random walk probability of each pair is calculated for every relationship path.
  • A random walk is a mathematical formalization of moving randomly, that is, probabilistically, at each step in a given space; the random walk probability of a pair (s, o) for a relationship path is the probability that such a walk, starting at s and following the path, arrives at o.
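  • As an illustration, the pair-wise random walk probability and the three pruning criteria named in this description (ratio of pairs with probability greater than 0, average probability, and path length) can be sketched as below. This is a minimal sketch and not the patented implementation: the toy triples, the uniform choice among outgoing edges, the function names, and the thresholds in `keep_path` are all assumptions introduced here.

```python
from collections import defaultdict

def build_index(triples):
    """Map (head, relationship) -> list of tail entities."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[(head, rel)].append(tail)
    return index

def random_walk_probability(index, subject, obj, relation_path):
    """Probability that a walk from `subject` along `relation_path` ends at `obj`.

    Assumption: at each step the walker picks uniformly among the edges
    that carry the required relationship.
    """
    dist = {subject: 1.0}
    for rel in relation_path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            tails = index.get((node, rel), [])
            for tail in tails:
                nxt[tail] += prob / len(tails)
        dist = nxt
    return dist.get(obj, 0.0)

def path_statistics(index, pairs, relation_path):
    """Return (ratio of pairs with probability > 0, mean probability, path length)."""
    probs = [random_walk_probability(index, s, o, relation_path) for s, o in pairs]
    coverage = sum(1 for p in probs if p > 0) / len(probs)
    return coverage, sum(probs) / len(probs), len(relation_path)

def keep_path(stats, min_coverage=0.5, min_mean=0.2, max_length=3):
    """Hypothetical pruning rule; the thresholds are illustrative only."""
    coverage, mean_prob, length = stats
    return coverage >= min_coverage and mean_prob >= min_mean and length <= max_length

# Toy graph (hypothetical data, not taken from the patent figures).
TRIPLES = [
    ("tom", "born_in", "syracuse"), ("syracuse", "city_of", "usa"),
    ("tom", "acted_in", "film_x"), ("film_x", "produced_in", "usa"),
    ("tom", "acted_in", "film_y"), ("film_y", "produced_in", "uk"),
    ("ann", "born_in", "paris"), ("paris", "city_of", "france"),
]
INDEX = build_index(TRIPLES)
PAIRS = [("tom", "usa"), ("ann", "france")]  # pairs linked by the query predicate

for path in (["born_in", "city_of"], ["acted_in", "produced_in"]):
    stats = path_statistics(INDEX, PAIRS, path)
    print(path, stats, keep_path(stats))
```

  • In this sketch the walk is evaluated exactly by propagating a probability distribution one relationship at a time; a sampled random walk, as the description suggests, would approximate the same quantity on larger graphs.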
  • FIG. 3 is a diagram illustrating an explainable segment embedding process according to an embodiment of the present invention.
  • An embedding vector for each of the generated explainable segments is extracted using a neural network model combining a CNN and an LSTM.
  • Each entity and relationship is expressed as a d-dimensional vector, and each segment is transformed into an n × d matrix and input to the CNN.
  • CNNs are widely used to extract and enhance features of text data as well as images, and perform relatively well at extracting semantic and grammatical relationships among several words.
  • Here, the CNN is used to express the characteristics of each entity and relationship in an explainable segment as a condensed vector.
  • The CNN applies k filters with a window size of 2, sliding one position at a time over the entities and relationships in the order they appear in the explainable segment, and outputs a feature map.
  • A pooling operation is then performed to reduce the dimension while preserving the key information, finally producing a vector that preserves local information.
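  • The convolution and pooling steps can be sketched in plain Python as below. The description only states that k filters with a window size of 2 move one position at a time and that a pooling operation follows; the ReLU activation, the use of max pooling with a width of 2, and all function names are assumptions made for this sketch.

```python
def conv_window2(segment, filters, biases):
    """Slide k filters of window size 2 over an n x d segment matrix.

    segment: list of n rows, each a list of d floats (entity/relationship vectors).
    filters: list of k filters, each a pair of length-d weight rows.
    Returns an (n - 1) x k feature map (ReLU applied, an assumption).
    """
    feature_map = []
    for i in range(len(segment) - 1):
        window = segment[i] + segment[i + 1]      # two adjacent rows, 2*d values
        row = []
        for weights, bias in zip(filters, biases):
            flat = weights[0] + weights[1]        # flatten the 2 x d filter
            activation = sum(x * w for x, w in zip(window, flat)) + bias
            row.append(max(0.0, activation))
        feature_map.append(row)
    return feature_map

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling along the sequence axis (assumed pooling type)."""
    pooled = []
    for start in range(0, len(feature_map) - size + 1, size):
        block = feature_map[start:start + size]
        pooled.append([max(column) for column in zip(*block)])
    return pooled

# Tiny example: n = 3 positions, d = 2 dimensions, k = 1 filter.
segment = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filters = [[[1.0, 0.0], [0.0, 1.0]]]
biases = [0.0]
feature_map = conv_window2(segment, filters, biases)
pooled = max_pool(feature_map)
print(feature_map, pooled)
```

  • With a window of 2 and a stride of 1, each feature-map row summarizes one adjacent pair in the segment, such as an entity and the relationship that follows it.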
  • LSTM stands for Long Short-Term Memory.
  • A bidirectional LSTM is applied.
  • An explainable segment starts at the subject and arrives at the object by successively connecting entities and relationships.
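  • A bidirectional LSTM over the CNN feature map can be sketched as below, again in plain Python. The weights are random and untrained, and taking the segment embedding as the concatenation of the final forward and final backward hidden states is an assumption of this sketch; the description only states that a forward layer and a backward layer receive the feature map and produce an embedding vector. All names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_lstm(input_dim, hidden_dim, rng):
    """Random (untrained) parameters for one LSTM direction."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    # One weight matrix and bias vector per gate:
    # input (i), forget (f), output (o), candidate (g).
    return {name: (matrix(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
            for name in ("i", "f", "o", "g")}

def _gate(params, name, z, fn):
    weights, bias = params[name]
    return [fn(sum(w * v for w, v in zip(row, z)) + b) for row, b in zip(weights, bias)]

def lstm_run(params, sequence, hidden_dim):
    """Run one LSTM direction over `sequence` and return the final hidden state."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in sequence:
        z = list(x) + h                            # concatenate input and previous hidden
        i = _gate(params, "i", z, sigmoid)
        f = _gate(params, "f", z, sigmoid)
        o = _gate(params, "o", z, sigmoid)
        g = _gate(params, "g", z, math.tanh)
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

def bilstm_embed(forward_params, backward_params, feature_map, hidden_dim):
    """Concatenate final forward and backward hidden states (assumed read-out)."""
    forward = lstm_run(forward_params, feature_map, hidden_dim)
    backward = lstm_run(backward_params, feature_map[::-1], hidden_dim)
    return forward + backward

rng = random.Random(0)
fwd = init_lstm(input_dim=3, hidden_dim=4, rng=rng)
bwd = init_lstm(input_dim=3, hidden_dim=4, rng=rng)
feature_map = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5], [0.4, 0.0, 0.2]]
embedding = bilstm_embed(fwd, bwd, feature_map, hidden_dim=4)
print(len(embedding))
```

  • Reading the segment in both directions lets the embedding reflect the path as seen from the subject toward the object and from the object back toward the subject.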
  • An attention mechanism is applied to evaluate the importance of each explainable segment.
  • FIG. 4 is a diagram illustrating the structure of an attention mechanism for link prediction according to an embodiment of the present invention.
  • The importance of each explanation segment to the link prediction result is identified by calculating the semantic similarity between the query predicate and each explanation segment, which is expressed as an embedding vector through the CNN and LSTM.
  • For example, explanation segments 3 and 4 can be classified as explanation segments that do not help the link prediction result because their attention scores are low.
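  • The attention-based scoring and the selection of important segments by threshold or rank can be sketched as below. Using a dot product as the semantic similarity and a softmax to turn similarities into attention scores are assumptions of this sketch, as are the example vectors and function names; the description specifies only that an attention score is computed from the similarity between each segment embedding and the query predicate.

```python
import math

def attention_scores(segment_vectors, predicate_vector):
    """Softmax over dot-product similarities (assumed similarity measure)."""
    sims = [sum(a * b for a, b in zip(vec, predicate_vector)) for vec in segment_vectors]
    peak = max(sims)                               # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

def important_segments(scores, threshold=None, top_k=None):
    """Indices of segments kept by a preset score threshold and/or a preset rank."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if threshold is not None:
        ranked = [i for i in ranked if scores[i] >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked

# Hypothetical embeddings for four segments and one query predicate.
segments = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
predicate = [1.0, 0.0]
scores = attention_scores(segments, predicate)
print(scores)
print(important_segments(scores, threshold=0.3))
print(important_segments(scores, top_k=2))
```

  • In this toy input the first two segments point in nearly the same direction as the predicate vector and so receive the highest scores, mirroring how low-scoring segments are set aside as unhelpful explanations.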

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and apparatus for completing an explainable knowledge graph are disclosed. According to the present invention, the explainable knowledge graph completion apparatus comprises: a processor; and a memory connected to the processor, the memory storing program instructions executable by the processor to: extract a plurality of relationship paths for connecting a subject and an object in a query triple comprising the subject, a predicate, and the object; generate a plurality of explainable segments using the extracted plurality of relationship paths; extract an embedding vector for each of the generated plurality of explainable segments using an artificial neural network model in which a CNN and an LSTM are combined; compare, using an attention mechanism, the semantic similarity between the plurality of explainable segments represented by the embedding vectors and the query predicate included in the query triple; and determine, through the semantic similarity comparison, a segment having high importance for link prediction concerning the query triple among the plurality of explainable segments.
PCT/KR2021/015999 2020-11-19 2021-11-05 Method and apparatus for completing an explainable knowledge graph WO2022108206A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2020-0155501 2020-11-19
KR20200155501 2020-11-19
KR10-2021-0016548 2021-02-05
KR1020210016548A KR102464999B1 (ko) 2020-11-19 2021-02-05 Explainable knowledge graph completion method and apparatus

Publications (1)

Publication Number Publication Date
WO2022108206A1 (fr) 2022-05-27

Family

ID=81709267

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/015999 WO2022108206A1 (fr) 2020-11-19 2021-11-05 Method and apparatus for completing an explainable knowledge graph

Country Status (1)

Country Link
WO (1) WO2022108206A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115826627A (zh) * 2023-02-21 2023-03-21 白杨时代(北京)科技有限公司 Method, system, device, and storage medium for determining formation instructions

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120097840A (ko) * 2011-02-25 2012-09-05 주식회사 솔트룩스 Method and apparatus for selecting RDF triples using a vector space model, and recording medium storing a program for executing the method
KR20190033269A (ko) * 2017-09-21 2019-03-29 숭실대학교산학협력단 Knowledge base construction method and server therefor
KR101991320B1 (ko) * 2017-03-24 2019-06-21 (주)아크릴 Method for extending an ontology using resources represented by the ontology
US20200065668A1 (en) * 2018-08-27 2020-02-27 NEC Laboratories Europe GmbH Method and system for learning sequence encoders for temporal knowledge graph completion
KR20200083404A (ko) * 2018-09-19 2020-07-08 주식회사 포티투마루 Artificial intelligence question answering system, method, and computer program
KR102203065B1 (ko) * 2019-09-03 2021-01-14 숭실대학교산학협력단 Triple verification apparatus and method

Similar Documents

Publication Publication Date Title
WO2021096009A1 (fr) Method and device for supplementing knowledge on the basis of a relation network
WO2019103224A1 (fr) System and method for extracting a core keyword from a document
WO2017057921A1 (fr) Method and system for automatically classifying data expressed by a plurality of factors with values of a sequence of text words and symbols by means of deep learning
WO2018016673A1 (fr) Device and method for automatically extracting an alternative word, and recording medium for implementing the method
WO2020111314A1 (fr) Conceptual graph-based query-response apparatus and method
WO2020045714A1 (fr) Method and system for recognizing content
WO2021095987A1 (fr) Method and apparatus for multi-type entity-based knowledge completion
CN111597314A (zh) Reasoning question answering method, apparatus, and device
WO2021215551A1 (fr) Blockchain-based electronic research note verification method and electronic research note management apparatus using the method
WO2022108206A1 (fr) Method and apparatus for completing an explainable knowledge graph
WO2022114368A1 (fr) Method and device for knowledge completion through neuro-symbolic relation embedding
WO2019107625A1 (fr) Machine translation method and apparatus therefor
WO2023063486A1 (fr) Method for creating a machine learning model and device therefor
WO2022080583A1 (fr) Deep learning-based bitcoin block data prediction system taking into account time series distribution characteristics
WO2018147543A1 (fr) Concept graph-based question-answering system and context search method using same
WO2021132760A1 (fr) Method for predicting columns and tables used when translating SQL queries from natural language on the basis of a neural network
WO2014148664A1 (fr) Multi-language search system, multi-language search method, and image search system based on the meaning of a word
CN117076608A (zh) Script event prediction method and apparatus integrating external event knowledge based on dynamic text spans
WO2022186539A1 (fr) Image classification-based celebrity identification method and apparatus
WO2022154376A1 (fr) Apparatus and method for providing a user's interior style analysis model on the basis of SNS text
KR102464999B1 (ko) Explainable knowledge graph completion method and apparatus
WO2022154586A1 (fr) Method for determining a target protein of a compound, and target protein determination apparatus implementing the method
WO2022035117A1 (fr) Artificial intelligence feedback method and artificial intelligence feedback system
WO2023178798A1 (fr) Image classification method and apparatus, and device and medium
WO2022098092A1 (fr) Method of video search in an electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21894977

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21894977

Country of ref document: EP

Kind code of ref document: A1