WO2022107989A1 - Method and device for completing knowledge by using relation learning between query and knowledge graph - Google Patents

Method and device for completing knowledge by using relation learning between query and knowledge graph Download PDF

Info

Publication number
WO2022107989A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
embedding
knowledge
value
knowledge graph
Prior art date
Application number
PCT/KR2020/018966
Other languages
French (fr)
Korean (ko)
Inventor
박영택 (Young-Tack Park)
이완곤 (Wan-Gon Lee)
김민성 (Min-Sung Kim)
이민호 (Min-Ho Lee)
Original Assignee
숭실대학교산학협력단 (Soongsil University Industry-Academic Cooperation Foundation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 숭실대학교산학협력단 (Soongsil University Industry-Academic Cooperation Foundation)
Publication of WO2022107989A1 publication Critical patent/WO2022107989A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3334Selection or weighting of terms from queries, including natural language queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Definitions

  • the present invention relates to a knowledge completion method and apparatus using a relationship learning between a query and a knowledge graph.
  • a knowledge graph refers to a network composed of relationships between entities.
  • knowledge graphs can be incomplete due to problems such as missing relationships for specific entities or incorrectly connected relationships.
  • the present invention intends to propose a knowledge completion method and apparatus using a relationship learning between a query sentence and a knowledge graph capable of inferring missing knowledge by using a specific query sentence and a knowledge graph.
  • a knowledge completion apparatus comprising: a query embedding module that outputs a query embedding value corresponding to an input query; a topic extraction module that extracts a topic from the input query; a knowledge graph embedding module that outputs embedding values for the predicates, subjects, and objects included in the knowledge graph; a similarity calculation module that determines the predicate most similar to the query by computing the similarity between the query embedding value and the embedding value of each predicate; an embedding concatenation module that concatenates the query embedding value with the embedding value of the most similar predicate; and a scoring module that infers a new triple using the extracted topic, the concatenated embedding value, and the subjects and objects of the knowledge graph.
  • the query embedding module may determine an embedding value corresponding to the input query using a BERT-based RoBERT model.
  • the topic extraction module may perform tokenization that separates the input query into individual words and may extract the topic by excluding words that appear in the knowledge graph more than a preset number of times.
  • the similarity calculation module may compute the dot product between the query embedding value and the embedding value of each predicate obtained through the knowledge graph embedding module, apply a sigmoid, and select the predicate with the highest value as the one most similar to the query.
  • the scoring module places the extracted topic in the subject position, places the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate in the predicate position, and places the subjects and objects of the knowledge graph in the candidate object position to infer a new triple.
  • the scoring module may sequentially place a plurality of candidate objects into the score calculation function and determine the entity with the highest score as the object related to the extracted topic and the concatenated embedding value of the query and its most similar predicate.
  • a method of completing knowledge using relation learning between a query and a knowledge graph in an apparatus including a processor and a memory, the method comprising: outputting a query embedding value corresponding to an input query; extracting a topic from the input query; outputting embedding values for the predicates, subjects, and objects included in the knowledge graph; determining the predicate most similar to the query by computing the similarity between the query embedding value and the embedding value of each predicate; concatenating the query embedding value and the embedding value of the most similar predicate; and inferring a new triple using the extracted topic, the concatenated embedding value, and the subjects and objects of the knowledge graph.
  • FIG. 1 is a diagram showing the configuration of a knowledge completion device according to an embodiment of the present invention.
  • FIG. 2 is a diagram illustrating a detailed configuration of a query embedding module according to the present embodiment.
  • FIG. 3 is a diagram illustrating a case in which the number of appearances is used for topic extraction according to the present embodiment.
  • FIG. 4 is a diagram for explaining a process of searching for a predicate similar to a query according to the present embodiment.
  • FIG. 5 is a diagram for explaining a process of inferring a new triple through the score calculation function according to the present embodiment.
  • the present invention provides a method for inferring missing knowledge by using a specific query and a knowledge graph.
  • a topic is automatically extracted from a question-type query to obtain the corresponding topic embedding value, and a new triple is created by learning the relationship between the topic and the query from the knowledge graph by using the query embedding and the knowledge graph embedding.
  • predicate embedding of a knowledge graph related to a specific query is used together.
  • FIG. 1 is a diagram showing the configuration of a knowledge completion device according to an embodiment of the present invention.
  • the knowledge completion device includes a query embedding module 100, a topic extraction module 102, a knowledge graph embedding module 104, a similarity calculation module 106, an embedding connection module 108, and a scoring module 110.
  • the query embedding module 100 outputs an embedding value corresponding to the query input by the user.
  • query embedding means embedding a query inputted through various algorithms in a vector form in a multidimensional space.
  • FIG. 2 is a diagram illustrating a detailed configuration of a query embedding module according to the present embodiment.
  • a BERT-based RoBERT model is used for query embedding.
  • the BERT model on which RoBERT is based is a context-dependent model.
  • for example, the word 'bank' can mean either 'bank deposit' or 'river bank', so the model has the advantage of producing an embedding value that captures a word's characteristics by considering its context, that is, the surrounding words.
  • each word constituting the query is split into tokens and fed to the model, and the first value of the output is used as the embedding value of the input query.
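This pooling step can be sketched as follows (a minimal illustration; `token_vectors` stands in for the per-token output matrix of the RoBERT encoder, which is not shown here):

```python
import numpy as np

def query_embedding(token_vectors: np.ndarray) -> np.ndarray:
    """Given the encoder's per-token output matrix (seq_len x dim),
    use the first position's vector as the whole-query embedding."""
    return token_vectors[0]

# Hypothetical encoder output for a 7-token query with 4-dim embeddings.
outputs = np.arange(28, dtype=float).reshape(7, 4)
q_emb = query_embedding(outputs)
print(q_emb.shape)  # (4,)
```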
  • the topic extraction module 102 extracts a topic in consideration of the number of appearances in the knowledge graph of each word included in the input query.
  • for example, when the input query is “What does Christian Bale star in?”, prior approaches mark the topic separately, as in “What does [Christian Bale] star in?”; therefore, extracting the topic requires an extra marking step.
  • FIG. 3 is a diagram illustrating a case in which the number of appearances is used for topic extraction according to the present embodiment.
  • a topic is extracted by counting how many times each word in the query appears in the knowledge graph and selecting the words that appear infrequently. For example, given the query “What does Christian Bale star in?”, the words 'What', 'does', 'star', 'in', and '?' are far more likely to appear in other queries than “Christian Bale”, so it is desirable to extract the infrequent words as the topic.
  • a topic may be extracted by performing tokenization that divides the query into words and excluding words that appear a preset number of times or more (e.g., 2,000 times or more).
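The appearance-count filter can be sketched as follows (a minimal illustration; the word counts are made up, the 2,000 threshold follows the example above, and tokenization is simplified to a regular expression):

```python
import re
from collections import Counter

def extract_topic(query: str, kg_word_counts: Counter, max_count: int = 2000):
    """Tokenize the query and keep only words that appear in the
    knowledge graph fewer than `max_count` times."""
    tokens = re.findall(r"\w+|\?", query)
    return [t for t in tokens if kg_word_counts[t] < max_count]

# Hypothetical counts: frequent function words vs. a rare entity name.
counts = Counter({"What": 9000, "does": 8500, "star": 3000, "in": 9500,
                  "?": 9999, "Christian": 12, "Bale": 9})
print(extract_topic("What does Christian Bale star in?", counts))
# ['Christian', 'Bale']
```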
  • the knowledge graph embedding module 104 outputs an embedding matrix that represents the knowledge graph (KG) well.
  • a triple is composed of a predicate corresponding to a relation and entities (a subject and an object); the knowledge graph embedding module 104 outputs embedding values for all predicates and entities included in the knowledge graph.
  • the ComplEx model, which represents embeddings with real and imaginary parts and can express both symmetric and asymmetric relations, can be used; all triples of the KG can be learned through its score function.
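The ComplEx score function mentioned above can be sketched as follows (a minimal illustration with random complex-valued embeddings, not trained values):

```python
import numpy as np

def complex_score(s: np.ndarray, p: np.ndarray, o: np.ndarray) -> float:
    """ComplEx score: real part of the trilinear product <s, p, conj(o)>."""
    return float(np.real(np.sum(s * p * np.conj(o))))

rng = np.random.default_rng(0)
dim = 4
s = rng.normal(size=dim) + 1j * rng.normal(size=dim)
p = rng.normal(size=dim) + 1j * rng.normal(size=dim)
o = rng.normal(size=dim) + 1j * rng.normal(size=dim)

# ComplEx can model asymmetric relations: score(s, p, o) != score(o, p, s)
# in general, because of the conjugation of the object embedding.
score = complex_score(s, p, o)
print(round(score, 4))
```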
  • an embedding value of a query sentence and a knowledge graph is used to find a predicate similar to the query sentence.
  • the similarity calculation module 106 calculates the similarity between the embedding values output through the query embedding module 100 and the embedding values of all predicates obtained through the knowledge graph embedding module 104 to determine the predicate most similar to the query.
  • the similarity calculation module 106 performs a dot product between the embedding value output by the query embedding module 100 and the embedding values of all predicates obtained through the knowledge graph embedding module 104, applies a sigmoid, and looks for the highest value to find the most similar predicate embedding.
  • FIG. 4 is a diagram for explaining a process of searching for a predicate similar to a query according to the present embodiment.
  • an embedding value is obtained through the query embedding module 100 for “What does Christian Bale star in?”. Taking the dot product with all predicate embedding values of the knowledge graph in matrix form and applying a sigmoid yields the similarity between the query and each predicate, as shown in the box on the right. Among these, the highest value, “starred_actors”, is the predicate embedding most similar to the query.
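This search can be sketched as follows (a minimal illustration with made-up embedding values; in the device they would come from the query embedding module 100 and the knowledge graph embedding module 104):

```python
import numpy as np

def most_similar_predicate(query_emb, predicate_embs, names):
    """Dot the query embedding against every predicate embedding,
    squash with a sigmoid, and return the best-matching predicate."""
    logits = predicate_embs @ query_emb       # one score per predicate
    sims = 1.0 / (1.0 + np.exp(-logits))      # sigmoid
    best = int(np.argmax(sims))
    return names[best], sims

names = ["starred_actors", "directed_by", "release_year"]
predicate_embs = np.array([[0.9, 0.8, 0.1],
                           [0.2, 0.1, 0.7],
                           [0.1, 0.3, 0.2]])
query_emb = np.array([1.0, 1.0, 0.0])         # hypothetical query embedding
best, sims = most_similar_predicate(query_emb, predicate_embs, names)
print(best)  # starred_actors
```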
  • the embedding concatenation module 108 concatenates the query embedding value output from the query embedding module 100 and the predicate embedding value most similar to the query sentence determined by the similarity calculation module 106 .
  • the scoring module 110 infers the missing triple by using the score calculation function (Equation 1) of the knowledge graph embedding module.
  • the scoring module 110 places the topic extracted from the query in the subject position of the score calculation function, places the embedding value obtained by concatenating the query embedding and the embedding of its most similar predicate in the predicate position, and places the entities (subjects and objects) of the knowledge graph as candidates in the object position; after the scores are computed, the object with the highest value is found and the corresponding triple is newly inferred.
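The inference loop can be sketched as follows (a minimal illustration; the score function shown is a ComplEx-style stand-in, all embedding values and entity names are made up, and dimensions are simplified, ignoring that a concatenated embedding would normally be projected back to the entity dimension):

```python
import numpy as np

def score(s, p, o):
    """Stand-in score calculation function (ComplEx-style trilinear form)."""
    return float(np.real(np.sum(s * p * np.conj(o))))

def infer_object(topic_emb, query_pred_emb, candidates):
    """Place the topic in the subject slot and the concatenated
    query+predicate embedding in the predicate slot, then score every
    candidate entity in the object slot and keep the highest-scoring one."""
    best_name, best_score = None, float("-inf")
    for name, o_emb in candidates.items():
        sc = score(topic_emb, query_pred_emb, o_emb)
        if sc > best_score:
            best_name, best_score = name, sc
    return best_name, best_score

topic_emb = np.array([1.0 + 0j, 0.5 + 0j])
query_pred_emb = np.array([1.0 + 0j, 1.0 + 0j])
candidates = {"Batman_Begins": np.array([1.0 + 0j, 1.0 + 0j]),
              "Some_Other_Film": np.array([-1.0 + 0j, 0.0 + 0j])}
best, _ = infer_object(topic_emb, query_pred_emb, candidates)
print(best)  # Batman_Begins
```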
  • FIG. 5 is a diagram for explaining a process of inferring a new triple through the score calculation function according to the present embodiment.
  • the knowledge completion process using the query and knowledge graph relationship learning may be performed in an apparatus including a processor and a memory.
  • the processor may include a central processing unit (CPU) or other virtual machine capable of executing a computer program.
  • the memory may include a non-volatile storage device such as a fixed hard drive or a removable storage device.
  • the removable storage device may include a compact flash unit, a USB memory stick, and the like.
  • the memory may also include volatile memory, such as various random access memories.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

Disclosed are a method and a device for completing knowledge by using relation learning between a query and a knowledge graph. According to the present invention, provided is a device for completing knowledge by using relation learning between a query and a knowledge graph, comprising: a query embedding module that outputs a query embedding value corresponding to an input query; a topic extraction module that extracts topics from the input query; a knowledge graph embedding module that outputs embedding values for a plurality of predicates, subjects, and objects included in the knowledge graph; a similarity calculation module that determines the predicate most similar to the query by calculating the similarity between the embedding value of the query and the embedding values of each of the plurality of predicates; an embedding connection module that connects the embedding value of the query with the embedding value of the most similar predicate; and a scoring module that infers a new triple by using the extracted topic, the embedding value connecting the query embedding value and the embedding value of the most similar predicate, and the subjects and objects of the knowledge graph.

Description

Method and device for knowledge completion using relation learning between a query and a knowledge graph
The present invention relates to a knowledge completion method and apparatus using relation learning between a query and a knowledge graph.
A knowledge graph is a network composed of relationships between entities. Such knowledge graphs suffer from incompleteness due to problems such as missing relationships for specific entities or incorrectly connected relationships.
Many studies addressing the problem of incomplete knowledge graphs have proposed learning methods that use artificial neural networks based on natural-language embeddings, and various knowledge graph completion systems are being studied with these methods.
In the prior art, the topic in a query such as “What does Christian Bale star in?” is marked explicitly, as in “What does [Christian Bale] star in?”. This has two drawbacks: extracting the topic requires an extra marking step, and the predicates of the knowledge graph are not utilized.
To solve these problems of the prior art, the present invention proposes a knowledge completion method and apparatus using relation learning between a query and a knowledge graph, capable of inferring missing knowledge by using a specific query and the knowledge graph.
To achieve the above object, according to an embodiment of the present invention, there is provided a knowledge completion apparatus using relation learning between a query and a knowledge graph, comprising: a query embedding module that outputs a query embedding value corresponding to an input query; a topic extraction module that extracts a topic from the input query; a knowledge graph embedding module that outputs embedding values for the predicates, subjects, and objects included in the knowledge graph; a similarity calculation module that determines the predicate most similar to the query by computing the similarity between the query embedding value and the embedding value of each predicate; an embedding concatenation module that concatenates the query embedding value with the embedding value of the most similar predicate; and a scoring module that infers a new triple using the extracted topic, the concatenated embedding value, and the subjects and objects of the knowledge graph.
The query embedding module may determine the embedding value corresponding to the input query using a BERT-based RoBERT model.
The topic extraction module may perform tokenization that separates the input query into words and may extract the topic by excluding words that appear in the knowledge graph more than a preset number of times.
The similarity calculation module may compute the dot product between the query embedding value and the embedding values of all predicates obtained through the knowledge graph embedding module, apply a sigmoid, and find the highest value to identify the predicate most similar to the query.
The scoring module may place the extracted topic in the subject position, place the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate in the predicate position, and place the subjects and objects of the knowledge graph in the candidate object position to infer a new triple.
The scoring module may sequentially place a plurality of candidate objects into the score calculation function and determine the entity with the highest score as the object related to the extracted topic and the concatenated embedding value.
According to another aspect of the present invention, there is provided a method of completing knowledge using relation learning between a query and a knowledge graph in an apparatus including a processor and a memory, the method comprising: outputting a query embedding value corresponding to an input query; extracting a topic from the input query; outputting embedding values for the predicates, subjects, and objects included in the knowledge graph; determining the predicate most similar to the query by computing the similarity between the query embedding value and the embedding value of each predicate; concatenating the query embedding value and the embedding value of the most similar predicate; and inferring a new triple using the extracted topic, the concatenated embedding value, and the subjects and objects of the knowledge graph.
According to yet another aspect of the present invention, there is provided a computer-readable program for performing the above method.
According to the present invention, topic selection is automated, which has the advantage of expanding the range of usable datasets.
In addition, according to the present invention, using the predicate embedding value of the knowledge graph together with the query embedding value fills in information that would be missing if only the query were used.
FIG. 1 is a diagram showing the configuration of a knowledge completion device according to a preferred embodiment of the present invention.
FIG. 2 is a diagram showing the detailed configuration of the query embedding module according to the present embodiment.
FIG. 3 is a diagram illustrating the use of appearance counts for topic extraction according to the present embodiment.
FIG. 4 is a diagram for explaining the process of searching for a predicate similar to the query according to the present embodiment.
FIG. 5 is a diagram for explaining the process of inferring a new triple through the score calculation function according to the present embodiment.
Since various changes can be made to the present invention and it can have various embodiments, specific embodiments are illustrated in the drawings and described in detail.
However, this is not intended to limit the present invention to specific embodiments; it should be understood to include all modifications, equivalents, and substitutes falling within the spirit and scope of the present invention.
The present invention presents a method for inferring missing knowledge by using a specific query and a knowledge graph.
First, a topic is automatically extracted from a question-form query to obtain the corresponding topic embedding value; then, using the query embedding and the knowledge graph embedding, the relationship between the topic and the query is learned from the knowledge graph to infer a new triple.
In the present invention, to improve the performance of inferring missing knowledge, the predicate embedding of the knowledge graph related to the specific query is used together.
Hereinafter, the knowledge completion method according to the present embodiment will be described in detail with reference to the drawings.
FIG. 1 is a diagram showing the configuration of a knowledge completion device according to a preferred embodiment of the present invention.
As shown in FIG. 1, the knowledge completion device according to the present embodiment may include a query embedding module 100, a topic extraction module 102, a knowledge graph embedding module 104, a similarity calculation module 106, an embedding concatenation module 108, and a scoring module 110.
The query embedding module 100 outputs an embedding value corresponding to the query input by the user.
Here, query embedding means embedding the input query as a vector in a multidimensional space through various algorithms.
FIG. 2 is a diagram showing the detailed configuration of the query embedding module according to the present embodiment.
In this embodiment, a BERT-based RoBERT model is used for query embedding.
The BERT model on which RoBERT is based is a context-dependent model. For example, the word 'bank' can mean either 'bank deposit' or 'river bank', so the model has the advantage of producing embedding values that capture a word's characteristics by considering its context, that is, the surrounding words.
Referring to FIG. 2, when the input query is “What does Christian Bale star in?”, each word constituting the query is split into tokens and fed to the model, and the first value of the output is used as the embedding value of the input query.
The topic extraction module 102 according to the present embodiment extracts a topic by considering how many times each word in the input query appears in the knowledge graph.
For example, when the input query is “What does Christian Bale star in?”, prior approaches mark the topic separately, as in “What does [Christian Bale] star in?”, so extracting the topic requires an extra marking step.
FIG. 3 is a diagram illustrating the use of appearance counts for topic extraction according to the present embodiment.
Referring to FIG. 3, in the present embodiment, a topic is extracted by counting how many times each word in the query appears in the knowledge graph and selecting the infrequent words. For example, given the query “What does Christian Bale star in?”, the words 'What', 'does', 'star', 'in', and '?' are far more likely to appear in other queries than “Christian Bale”, so the infrequent words are extracted as the topic. The topic may thus be extracted by tokenizing the query into words and excluding words that appear a preset number of times or more (for example, 2,000 times or more).
지식 그래프 임베딩 모듈(104)은 지식 그래프(Knowledge Graph: KB)를 잘 표현하는 임베딩 매트릭스를 출력한다. The knowledge graph embedding module 104 outputs an embedding matrix that well represents a knowledge graph (KB).
트리플은 릴레이션에 해당하는 술어(predicate)와 주어 및 목적에 해당하는 엔티티(subject, object)로 구성되며, 지식 그래프 임베딩 모듈(104)은 지식 그래프에 포함되는 모든 술어 및 엔티티들에 대한 임베딩 값을 출력한다. A triple is composed of a predicate corresponding to a relation and entities corresponding to a subject and an object, and the knowledge graph embedding module 104 outputs embedding values for all predicates and entities included in the knowledge graph.
지식 그래프 임베딩을 위해 실수와 허수로 대칭, 비대칭 관계 표현이 가능한 ComplEx 모델을 사용할 수 있고, Score Function을 통해 KG의 모든 트리플 학습이 가능하다. For knowledge graph embedding, the ComplEx model, which uses real and imaginary components and can therefore represent both symmetric and asymmetric relations, may be used, and all triples in the KG can be learned through its score function.
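A minimal sketch of the ComplEx score function, Re(⟨e_s, w_p, conj(e_o)⟩). The variable names and example vectors are assumptions introduced for illustration, but the snippet shows the property the text relies on: a purely real relation embedding scores symmetrically, while an imaginary component makes the score direction-sensitive (asymmetric).

```python
import numpy as np

def complex_score(e_s, w_p, e_o):
    """ComplEx score: real part of the trilinear product of the subject,
    relation, and conjugated object embeddings (complex-valued vectors)."""
    return float(np.real(np.sum(e_s * w_p * np.conj(e_o))))

# Illustrative complex-valued entity embeddings.
e_s = np.array([1 + 1j, 2 - 1j])
e_o = np.array([0.5 - 2j, -1 + 0.5j])

w_sym = np.array([0.3 + 0.0j, -0.7 + 0.0j])   # purely real relation
w_asym = np.array([0.3 + 0.9j, -0.7 + 0.4j])  # relation with imaginary part

# Real relation: swapping subject and object leaves the score unchanged.
print(complex_score(e_s, w_sym, e_o), complex_score(e_o, w_sym, e_s))
# Imaginary component: the two directions score differently.
print(complex_score(e_s, w_asym, e_o), complex_score(e_o, w_asym, e_s))
```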
종래기술에서는 지식 그래프의 술어를 활용하지 못하였으나, 본 발명에서는 질의문과 지식 그래프의 주어를 관계 학습을 할 때, 질의문과 유사한 술어를 찾아 학습하는데 같이 사용하면 더 좋은 성능을 얻을 수 있다는 것을 확인하였다. The prior art did not utilize the predicates of the knowledge graph; in the present invention, however, it was confirmed that better performance is obtained when, in learning the relation between the query and the subject of the knowledge graph, a predicate similar to the query is found and used together in training.
본 발명의 바람직한 일 실시예에 따르면, 질의문과 유사한 술어를 찾기 위해 질의문과 지식 그래프의 임베딩 값을 이용한다. According to a preferred embodiment of the present invention, an embedding value of a query sentence and a knowledge graph is used to find a predicate similar to the query sentence.
유사도 계산 모듈(106)은 질의문 임베딩 모듈(100)을 통해 출력된 임베딩 값과 지식 그래프 임베딩 모듈(104)을 통해 얻은 모든 술어의 임베딩 값의 유사도를 계산하여 질의문과 가장 유사한 술어를 결정한다. The similarity calculation module 106 calculates the similarity between the embedding values output through the query embedding module 100 and the embedding values of all predicates obtained through the knowledge graph embedding module 104 to determine the predicate most similar to the query.
유사도 계산 모듈(106)은 질의문 임베딩 모듈(100)을 통해 출력된 임베딩 값과 지식 그래프 임베딩 모듈(104)을 통해 얻은 모든 술어의 임베딩 값(relation embeddings)을 dot product 연산을 하고 sigmoid를 취하여 가장 높은 값을 찾아 유사한 술어 임베딩 값을 찾는다. The similarity calculation module 106 performs a dot-product operation between the embedding value output from the query embedding module 100 and the embedding values (relation embeddings) of all predicates obtained through the knowledge graph embedding module 104, applies a sigmoid, and selects the highest value to find the most similar predicate embedding.
도 4는 본 실시예에 따른 질의문과 유사한 술어를 탐색하는 과정을 설명하기 위한 도면이다. 4 is a diagram for explaining a process of searching for a predicate similar to a query according to the present embodiment.
도 4를 참조하면, “What does Christian Bale star in?”에 대해 질의문 임베딩 모듈(100)을 통해 임베딩 값을 얻는 것을 볼 수 있다. 그리고 지식 그래프의 모든 술어 임베딩 값만 매트릭스 형태로 가져와 sigmoid를 취하면 우측 박스와 같이 질의문과 각 술어의 유사도 수치를 볼 수 있다. 이 중 가장 높은 값을 갖는 “starred_actors”가 해당 질의문과 가장 유사한 술어 임베딩 값이 된다. Referring to FIG. 4, an embedding value is obtained through the query embedding module 100 for “What does Christian Bale star in?”. Then, taking all predicate embedding values of the knowledge graph in matrix form and applying the sigmoid yields the similarity between the query and each predicate, as shown in the box on the right. Among them, “starred_actors”, which has the highest value, becomes the predicate embedding most similar to the query.
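The similarity search described above can be sketched as follows. The embedding values are illustrative assumptions: the first relation vector is deliberately aligned with the query vector so that “starred_actors” wins, mirroring the FIG. 4 example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def most_similar_predicate(query_emb, relation_embs, relation_names):
    """Dot product between the query embedding and every relation
    embedding, squashed by a sigmoid; the highest score wins."""
    scores = sigmoid(relation_embs @ query_emb)  # one score per relation
    best = int(np.argmax(scores))
    return relation_names[best], float(scores[best])

# Illustrative embeddings (not the trained values of the patent).
query_emb = np.array([1.0, 0.5, -0.2])
names = ["starred_actors", "directed_by", "release_year"]
rels = np.array([[0.9, 0.6, -0.1],   # close to the query direction
                 [-0.4, 0.1, 0.8],
                 [0.1, -0.7, 0.3]])

name, score = most_similar_predicate(query_emb, rels, names)
print(name)  # starred_actors
```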
임베딩 연결 모듈(108)은 질의문 임베딩 모듈(100)에서 출력하는 질의문 임베딩 값과, 유사도 계산 모듈(106)의 결정된 질의문과 가장 유사한 술어 임베딩 값을 연결(concatenate)한다. The embedding concatenation module 108 concatenates the query embedding value output from the query embedding module 100 and the predicate embedding value most similar to the query sentence determined by the similarity calculation module 106 .
스코어링 모듈(110)은 지식 그래프 임베딩 모듈의 스코어 계산 함수(수학식 1)를 이용하여 누락된 트리플을 추론한다. The scoring module 110 infers the missing triple by using the score calculation function (Equation 1) of the knowledge graph embedding module.
수학식 1 / Equation 1:

$\mathrm{score}(s, p, o) = \operatorname{Re}\left(\sum_{k=1}^{K} e_{s,k}\, w_{p,k}\, \overline{e_{o,k}}\right)$
스코어링 모듈(110)은 질의문으로부터 추출된 토픽을 스코어 계산 함수의 주어($e_s$)에 위치시키고, 질의문과 해당 질의문과 유사한 술어의 임베딩 값을 연결하여 스코어 계산 함수의 술어 자리($w_p$)에 위치시킨다. 마지막 목적어 자리에는 지식 그래프의 주어, 목적어와 같은 엔티티($e_o$)들이 후보가 되어 계산 후 가장 높은 값을 갖는 목적어를 찾아 해당 트리플을 새로 추론하게 된다. The scoring module 110 places the topic extracted from the query in the subject slot ($e_s$) of the score calculation function, and places the embedding obtained by concatenating the query embedding with the similar predicate embedding of that query in the predicate slot ($w_p$). For the last object slot, entities of the knowledge graph, i.e., its subjects and objects ($e_o$), become candidates; after the computation, the object with the highest value is found and the corresponding triple is newly inferred.
도 5는 본 실시예에 따른 스코어 계산 함수를 통해 새로운 트리플을 추론하는 과정을 설명하기 위한 도면이다. 5 is a diagram for explaining a process of inferring a new triple through the score calculation function according to the present embodiment.
도 5를 참조하면, “What does Christian Bale star in?” 질의문으로부터 추출한 토픽 “Christian Bale”의 임베딩 값을 $e_s$에 넣고, 질의문과 이와 유사한 술어인 “starred_actors”의 임베딩 값을 연결하여 스코어 계산 함수의 술어 $w_p$에 넣는다. 이후 후보로 올 수 있는 지식 그래프의 주어, 목적어 후보들인 “Christian Bale”, “Bruce Wayne”, “Batman Series”, “The Dark Knight”의 임베딩 값들이 목적어 $e_o$에 하나씩 대입되어 계산된다. 가장 높은 점수(0.87)를 갖는 “The Dark Knight”를 목적어로 선정하여 새로운 트리플로 <Christian Bale, starred_actors, The Dark Knight>를 추론하게 된다. Referring to FIG. 5, the embedding value of the topic “Christian Bale” extracted from the query “What does Christian Bale star in?” is placed in $e_s$, and the embedding obtained by concatenating the query with the similar predicate “starred_actors” is placed in the predicate slot $w_p$ of the score calculation function. The embedding values of the candidate subjects and objects of the knowledge graph, “Christian Bale”, “Bruce Wayne”, “Batman Series”, and “The Dark Knight”, are then substituted one by one into the object slot $e_o$ and scored. “The Dark Knight”, which has the highest score (0.87), is selected as the object, and <Christian Bale, starred_actors, The Dark Knight> is inferred as a new triple.
본 발명의 바람직한 일 실시예에 따른 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 과정은 프로세서 및 메모리를 포함하는 장치에서 수행될 수 있다. The knowledge completion process using the query and knowledge graph relationship learning according to the preferred embodiment of the present invention may be performed in an apparatus including a processor and a memory.
프로세서는 컴퓨터 프로그램을 실행할 수 있는 CPU(central processing unit)나 그밖에 가상 머신 등을 포함할 수 있다. The processor may include a central processing unit (CPU) or other virtual machine capable of executing a computer program.
메모리는 고정식 하드 드라이브나 착탈식 저장 장치와 같은 불휘발성 저장 장치를 포함할 수 있다. 착탈식 저장 장치는 컴팩트 플래시 유닛, USB 메모리 스틱 등을 포함할 수 있다. 메모리는 각종 랜덤 액세스 메모리와 같은 휘발성 메모리도 포함할 수 있다.The memory may include a non-volatile storage device such as a fixed hard drive or a removable storage device. The removable storage device may include a compact flash unit, a USB memory stick, and the like. The memory may also include volatile memory, such as various random access memories.
상기한 본 발명의 실시예는 예시의 목적을 위해 개시된 것이고, 본 발명에 대한 통상의 지식을 가지는 당업자라면 본 발명의 사상과 범위 안에서 다양한 수정, 변경, 부가가 가능할 것이며, 이러한 수정, 변경 및 부가는 하기의 특허청구범위에 속하는 것으로 보아야 할 것이다.The above-described embodiments of the present invention have been disclosed for purposes of illustration, and various modifications, changes, and additions will be possible within the spirit and scope of the present invention by those skilled in the art having ordinary knowledge of the present invention, and such modifications, changes and additions should be regarded as belonging to the following claims.

Claims (10)

  1. 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치로서, A knowledge completion apparatus using relation learning between a query and a knowledge graph, the apparatus comprising:
    입력된 질의문에 상응하는 질의문 임베딩 값을 출력하는 질의문 임베딩 모듈; a query embedding module that outputs a query embedding value corresponding to the input query;
    상기 입력된 질의문에서 토픽을 추출하는 토픽 추출 모듈; a topic extraction module for extracting topics from the input query;
    상기 지식 그래프에 포함된 복수의 술어, 주어 및 목적어들에 대한 임베딩 값을 출력하는 지식 그래프 임베딩 모듈; a knowledge graph embedding module for outputting embedding values for a plurality of predicates, subjects, and objects included in the knowledge graph;
    상기 질의문 임베딩 값과 상기 복수의 술어 각각의 임베딩 값의 유사도를 계산하여 질의문과 가장 유사한 술어를 결정하는 유사도 계산 모듈; a similarity calculation module for determining a predicate most similar to a query statement by calculating a similarity between the query embedding value and the embedding values of each of the plurality of predicates;
    상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결하는 임베딩 연결 모듈; 및an embedding connection module that connects the embedding value of the query with the embedding value of the most similar predicate; and
    상기 추출된 토픽, 상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결한 임베딩 값 및 상기 지식 그래프의 주어 및 목적어들을 이용하여 새로운 트리플을 추론하는 스코어링 모듈을 포함하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치. and a scoring module for inferring a new triple using the extracted topic, the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate, and the subjects and objects of the knowledge graph.
  2. 제1항에 있어서, The apparatus of claim 1,
    상기 질의문 임베딩 모듈은, BERT 기반의 RoBERT 모델을 이용하여 상기 입력된 질의문에 상응하는 임베딩 값을 결정하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치. wherein the query embedding module determines an embedding value corresponding to the input query using a BERT-based RoBERT model.
  3. 제1항에 있어서, According to claim 1,
    상기 질의문 임베딩 모듈은, 상기 입력된 질의문을 각 단어로 분리하는 토큰화를 수행하고 상기 지식 그래프 내에서 미리 설정된 등장 횟수 이상의 단어들을 제외시켜 토픽을 추출하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치. wherein the query embedding module performs tokenization to split the input query into individual words and extracts the topic by excluding words whose number of appearances in the knowledge graph is equal to or greater than a preset number.
  4. 제1항에 있어서, According to claim 1,
    상기 유사도 계산 모듈은, 상기 그래프 임베딩 모듈을 통해 얻은 모든 술어의 임베딩 값을 dot product 연산을 하고 sigmoid를 취하여 가장 높은 값을 찾아 상기 질의문과 가장 유사한 술어를 탐색하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치. wherein the similarity calculation module performs a dot-product operation on the embedding values of all predicates obtained through the graph embedding module, applies a sigmoid, and finds the highest value to search for the predicate most similar to the query.
  5. 제1항에 있어서, The apparatus of claim 1,
    상기 스코어링 모듈은 상기 추출된 토픽을 주어에 위치시키고, 상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결한 임베딩 값을 술어에 위치시키고, 상기 지식 그래프의 주어 및 목적어들을 후보 목적어에 위치시켜 새로운 트리플을 추론하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치. wherein the scoring module infers a new triple by placing the extracted topic in the subject slot, placing the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate in the predicate slot, and placing the subjects and objects of the knowledge graph as candidate objects.
  6. 제5항에 있어서, The apparatus of claim 5,
    상기 스코어링 모듈은 스코어 계산 함수에 복수의 후보 목적어들을 순차적으로 위치시켜 스코어가 가장 높은 엔티티를 상기 추출된 토픽 및 상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결한 임베딩 값과 관련된 목적어로 결정하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 장치. wherein the scoring module sequentially places a plurality of candidate objects in the score calculation function and determines the entity with the highest score as the object associated with the extracted topic and the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate.
  7. 프로세서 및 메모리를 포함하는 장치에서 질의문과 지식 그래프 관계 학습을 이용하여 지식을 완성하는 방법으로서, A method of completing knowledge using relation learning between a query and a knowledge graph in a device including a processor and a memory, the method comprising:
    입력된 질의문에 상응하는 질의문 임베딩 값을 출력하는 단계; outputting a query embedding value corresponding to the inputted query;
    상기 입력된 질의문에서 토픽을 추출하는 단계; extracting a topic from the input query;
    상기 지식 그래프에 포함된 복수의 술어, 주어 및 목적어들에 대한 임베딩 값을 출력하는 단계; outputting embedding values for a plurality of predicates, subjects, and objects included in the knowledge graph;
    상기 질의문 임베딩 값과 상기 복수의 술어 각각의 임베딩 값의 유사도를 계산하여 질의문과 가장 유사한 술어를 결정하는 단계; determining a predicate most similar to a query statement by calculating a similarity between the query embedding value and each of the embedding values of the plurality of predicates;
    상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결하는 단계; 및concatenating the embedding value of the query and the embedding value of the most similar predicate; and
    상기 추출된 토픽, 상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결한 임베딩 값 및 상기 지식 그래프의 주어 및 목적어들을 이용하여 새로운 트리플을 추론하는 단계를 포함하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 방법. and inferring a new triple using the extracted topic, the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate, and the subjects and objects of the knowledge graph.
  8. 제7항에 있어서, 8. The method of claim 7,
    상기 질의문과 가장 유사한 술어를 결정하는 단계는, 상기 그래프 임베딩 모듈을 통해 얻은 모든 술어의 임베딩 값을 dot product 연산을 하고 sigmoid를 취하여 가장 높은 값을 찾아 상기 질의문과 가장 유사한 술어를 탐색하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 방법. wherein the determining of the predicate most similar to the query comprises performing a dot-product operation on the embedding values of all predicates obtained through the graph embedding module, applying a sigmoid, and finding the highest value to search for the predicate most similar to the query.
  9. 제7항에 있어서, The method of claim 7,
    상기 새로운 트리플을 추론하는 단계는, 상기 추출된 토픽을 주어에 위치시키고, 상기 질의문 임베딩 값과 상기 가장 유사한 술어의 임베딩 값을 연결한 임베딩 값을 술어에 위치시키고, 상기 지식 그래프의 주어 및 목적어들을 후보 목적어에 위치시켜 새로운 트리플을 추론하는 질의문과 지식 그래프 관계 학습을 이용한 지식 완성 방법. wherein the inferring of the new triple comprises placing the extracted topic in the subject slot, placing the embedding value obtained by concatenating the query embedding value and the embedding value of the most similar predicate in the predicate slot, and placing the subjects and objects of the knowledge graph as candidate objects.
  10. 제7항에 따른 방법을 수행하는 컴퓨터 판독 가능한 프로그램.A computer readable program for performing the method according to claim 7.
PCT/KR2020/018966 2020-11-23 2020-12-23 Method and device for completing knowledge by using relation learning between query and knowledge graph WO2022107989A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2020-0157981 2020-11-23
KR1020200157981A KR102442422B1 (en) 2020-11-23 2020-11-23 Knowledge Completion Method and Apparatus Using Query and Knowledge Graph Relationship Learning

Publications (1)

Publication Number Publication Date
WO2022107989A1 true WO2022107989A1 (en) 2022-05-27

Family

ID=81709225

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2020/018966 WO2022107989A1 (en) 2020-11-23 2020-12-23 Method and device for completing knowledge by using relation learning between query and knowledge graph

Country Status (2)

Country Link
KR (1) KR102442422B1 (en)
WO (1) WO2022107989A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102583818B1 (en) * 2022-09-14 2023-10-04 주식회사 글로랑 Method for sampling process of personality test using question and answer network representing group of respondents based on bert

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101267038B1 (en) * 2011-02-25 2013-05-24 주식회사 솔트룩스 Method and apparatus for selecting RDF triple using vector space model
KR101662450B1 (en) * 2015-05-29 2016-10-05 포항공과대학교 산학협력단 Multi-source hybrid question answering method and system thereof
US20170357906A1 (en) * 2016-06-08 2017-12-14 International Business Machines Corporation Processing un-typed triple store data
KR20180108257A (en) * 2017-03-24 2018-10-04 (주)아크릴 Method for extending ontology using resources represented by the ontology
CN111639171A (en) * 2020-06-08 2020-09-08 吉林大学 Knowledge graph question-answering method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101228865B1 (en) 2011-11-23 2013-02-01 주식회사 한글과컴퓨터 Document display apparatus and method for extracting key word in document


Also Published As

Publication number Publication date
KR102442422B1 (en) 2022-09-08
KR20220070919A (en) 2022-05-31


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20962595

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20962595

Country of ref document: EP

Kind code of ref document: A1