KR102346244B1

KR102346244B1 - Neural network-based auto-slot filling method and apparatus

Info

Publication number: KR102346244B1
Application number: KR1020190021846A
Authority: KR
Inventors: 최재식; 권순재; 임성우; 이세현
Original assignee: 울산과학기술원
Priority date: 2018-11-13
Filing date: 2019-02-25
Publication date: 2022-01-04
Also published as: KR20200058263A

Abstract

A neural network-based automatic slot filling technique and apparatus are provided. A slot filling apparatus according to an embodiment includes a candidate entity extraction module that inputs an input sequence and an input entity into a pre-trained first neural network to extract one or more candidate entities that may have a specific relationship with the input entity, and a relationship prediction module that inputs the candidate entities into a pre-trained second neural network to extract the specific relationship between the input entity and each candidate entity. The first neural network and the second neural network are trained with a shared input layer.

Description

Neural network-based automatic slot filling technology and device {NEURAL NETWORK-BASED AUTO-SLOT FILLING METHOD AND APPARATUS}

The following embodiments relate to a neural network-based automatic slot filling technique and apparatus.

A knowledge base is an effective means of flexibly representing unstructured relationships in data. Knowledge bases play an important role in realizing various artificial intelligence technologies, such as automatic question-answering systems. Nevertheless, manually building and maintaining such a vast knowledge base is very difficult in practice. Relation extraction, in which computer software reads materials such as books and the Internet on its own and automatically extracts important relationships, is therefore a key technology for automatically expanding a knowledge base (knowledge base population).

Various techniques exist for automatically extracting relationships. Among them, the slot-filling problem is defined as follows: given a relationship (e.g., headquarters location) and a keyword (e.g., Google), find in a given document the answer (e.g., California) that has that relationship with the keyword. For example, the task is to read the sentence "Google has been piloting a car-sharing service around its California headquarters since last May …", find the relationship that Google's headquarters is in California, and add it to the knowledge base. Solving the slot-filling problem makes it possible to expand a knowledge base by extracting new relationships from natural-language documents alone, even when the existing knowledge base stores no such relationship, and to derive new knowledge (relationships) through inference algorithms. A knowledge base expanded in this way can improve the quality of personalized automatic question-answering services and markedly reduce the time doctors or lawyers spend searching vast amounts of material related to patients or precedents.

Existing slot-filling methods have mainly taken the approach of extracting candidate phrases for an input entity with a module such as named-entity recognition (NER), and then inferring the correct answer from the candidate phrases with a module such as a relation extractor. In these methods, however, the training and prediction processes of each module are completely separate, so it is difficult to represent knowledge common to the candidate phrase extraction and relationship prediction processes, and the slot-filling problem cannot be optimized as a whole.

Embodiments aim to share free variables (parameters) between the candidate entity extraction module and the relationship prediction module during training.

Embodiments aim to combine the loss functions of the two modules when training the candidate entity extraction module and the relationship prediction module.

Embodiments aim to store the embedding corresponding to each word in a lookup table of a database when training the candidate entity extraction module and the relationship prediction module, and to update the table during training.

Embodiments aim to use a long short-term memory (LSTM) recurrent neural network for the candidate entity extraction module.

Embodiments aim to use a piecewise convolutional neural network (PCNN) for the relationship prediction module.

A slot filling method according to an embodiment includes: inputting an input sequence and an input entity into a pre-trained first neural network to extract one or more candidate entities that may have a specific relationship with the input entity; and inputting the candidate entities into a pre-trained second neural network to extract the specific relationship between the input entity and each candidate entity, wherein the first neural network and the second neural network are trained with a shared input layer.

The first neural network and the second neural network may be trained using a new loss function generated by combining the loss function of the first neural network and the loss function of the second neural network.

The first neural network and the second neural network may be trained by sharing the free variables of the input layer, in that both use the same word embedding.

The word embedding may be recorded in a database of the first neural network and the second neural network in the form of a lookup table, and the lookup table may be updated during training.

The input sequence may be composed of tokens, and extracting the candidate entities may include: inputting the input sequence token by token into the first neural network and outputting a tag sequence composed of the tags corresponding to the tokens; and extracting the candidate entities based on the tag sequence.

Extracting the specific relationship may include: re-ranking the candidate entities based on the degree to which each candidate entity has the specific relationship with the input entity; extracting the specific relationship between the re-ranked candidate entities and the input entity; and extracting a numerical value corresponding to the specific relationship.

The input sequence may include a text sequence.

The first neural network may include a long short-term memory (LSTM) recurrent neural network.

The second neural network may include a piecewise convolutional neural network (PCNN).

A slot filling apparatus according to an embodiment includes: a candidate entity extraction module that inputs an input sequence and an input entity into a pre-trained first neural network to extract one or more candidate entities that may have a specific relationship with the input entity; and a relationship prediction module that inputs the candidate entities into a pre-trained second neural network to extract the specific relationship between the input entity and each candidate entity, wherein the first neural network and the second neural network are trained with a shared input layer.

The input sequence may be composed of tokens, and the candidate entity extraction module may input the input sequence token by token into the first neural network, output a tag sequence composed of the tags corresponding to the tokens, and extract the candidate entities based on the tag sequence.

The relationship prediction module may re-rank the candidate entities based on the degree to which each candidate entity has the specific relationship with the input entity, and may extract the specific relationship between the re-ranked candidate entities and the input entity together with a numerical value corresponding to the specific relationship.

The second neural network may include a piecewise convolutional neural network (PCNN).

Embodiments may share free variables between the candidate entity extraction module and the relationship prediction module during training.

Embodiments may combine the loss functions of the two modules when training the candidate entity extraction module and the relationship prediction module.

Embodiments may store the embedding corresponding to each word in a lookup table of a database when training the candidate entity extraction module and the relationship prediction module, and may update the table during training.

Embodiments may use a long short-term memory (LSTM) recurrent neural network for the candidate entity extraction module.

Embodiments may use a piecewise convolutional neural network (PCNN) for the relationship prediction module.

FIG. 1 is a diagram for explaining a method of operating a slot filling apparatus according to an embodiment.
FIG. 2 is a diagram for explaining a specific operation method of a candidate entity extraction module according to an embodiment.
FIG. 3 is a diagram for explaining a specific operation method of a relationship prediction module according to an embodiment.
FIG. 4 is a flowchart of a slot filling method according to an embodiment.
FIG. 5 is a diagram for explaining a slot filling method according to an embodiment.

The specific structural or functional descriptions of the embodiments according to the concept of the present invention disclosed herein are given only to explain those embodiments. Embodiments according to the concept of the present invention may be implemented in various forms and are not limited to the embodiments described herein.

Because embodiments according to the concept of the present invention may be modified in various ways and may take various forms, the embodiments are illustrated in the drawings and described in detail herein. This is not intended to limit the embodiments to the specific forms disclosed; the embodiments include modifications, equivalents, and substitutes that fall within the spirit and technical scope of the present invention.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of rights according to the concept of the present invention, a first component may be termed a second component, and similarly a second component may be termed a first component.

When a component is described as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may also be present. In contrast, when a component is described as being "directly connected" or "directly coupled" to another component, it should be understood that no intervening components are present. Expressions describing the relationship between components, such as "between" and "directly between" or "directly adjacent to", should be interpreted in the same way.

The terminology used herein is only for describing specific embodiments and is not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to exclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in this specification.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. Like reference numerals in the drawings denote like elements.

FIG. 1 is a diagram for explaining a method of operating a slot filling apparatus according to an embodiment.

Slot filling is a technique that, based on an input sequence, finds and returns entities that have a specific relationship (e.g., date of birth, date of death, date of establishment, religion, spouse, child, sister) with an input entity (e.g., a person name, organization name, place name, time, date, or religion). For example, if the input entity is 'Barack Obama' and the query relationship is 'spouse', then from the input sequence "Michelle Obama, the First Lady of the Barack Obama administration, visited Korea", the technique can output that 'Michelle Obama' is the spouse of 'Barack Obama'. Slot filling may also be referred to as relation extraction. Entities are items with a unique meaning, such as person names, organization names, place names, times, and dates, and may be meaningful words rather than morphemes or parts of speech.

Referring to FIG. 1, the slot filling apparatus 100 according to an embodiment includes a candidate entity extraction module 110 and a relationship prediction module 150.

The candidate entity extraction module 110 may include a pre-trained first neural network (not shown). The candidate entity extraction module 110 may input an input sequence and an input entity into the pre-trained first neural network to extract one or more candidate entities that may have a specific relationship with the input entity. The input sequence is composed of tokens and may include a text sequence, that is, natural language data. A specific relationship between the input entity and a candidate entity may include date of birth, date of death, date of establishment, religion, spouse, child, sister, and the like. The specific relationships are not limited to the examples given in this specification and may be determined in advance at the training stage. A candidate entity may be an entity in the input sequence that may have a specific relationship with the input entity. The specific operation of the candidate entity extraction module 110 is described in detail below with reference to FIG. 2.

The relationship prediction module 150 may include a pre-trained second neural network (not shown). The relationship prediction module 150 may input the candidate entities into the pre-trained second neural network to extract the specific relationship between the input entity and each candidate entity. The specific operation of the relationship prediction module 150 is described in detail below with reference to FIG. 3.

The first neural network of the candidate entity extraction module 110 and the second neural network of the relationship prediction module 150 in the slot filling apparatus 100 are trained with a shared input layer. Because free variables are shared while training the first neural network of the candidate entity extraction module 110 and the second neural network of the relationship prediction module 150, common knowledge can be represented and the slot-filling problem can be optimized as a whole.

The first neural network and the second neural network may share the free variables of the input layer by using the same word embedding. To implement the slot filling method, the first neural network of the candidate entity extraction module 110 and the second neural network of the relationship prediction module 150 may use a word embedding model that converts words into vectors. A word embedding may be a vector representation of a word's meaning. For example, it may be a form of unsupervised learning that takes a large corpus as input and maps each word in the corpus to an n-dimensional real vector space to capture word meaning. Word embedding generally splits the input corpus into tokens and then produces similar real-valued vectors for tokens with high semantic relatedness; tokens may be generated per word, that is, per whitespace-delimited unit. Alternatively, for Korean, tokens may be formed per morpheme to learn word embeddings. Embodiments of the word embedding method may be implemented in various forms and are not limited to the embodiments described herein.

In addition, the first neural network and the second neural network may store the embedding corresponding to each word in a lookup table of a database and update it during training. The candidate entity extraction module 110 and the relationship prediction module 150 share the free variables of the input layer by using the same word embedding, store the embedding corresponding to each word in a lookup table of a database, and update it during training, so that the two modules share the common knowledge needed for the slot-filling problem.
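The shared lookup table can be illustrated with a minimal, framework-free sketch. The class names `EmbeddingTable` and `Module`, the toy vocabulary, the gradient values, and the learning rate are all hypothetical and chosen only for illustration; the source does not specify an implementation.

```python
# Hypothetical sketch: one embedding lookup table shared by two modules.
class EmbeddingTable:
    def __init__(self, vocab, dim):
        self.index = {word: i for i, word in enumerate(vocab)}
        self.vectors = [[0.0] * dim for _ in vocab]

    def lookup(self, word):
        return self.vectors[self.index[word]]

    def update(self, word, grad, lr):
        # Plain gradient step on the table row for `word`.
        row = self.vectors[self.index[word]]
        for k in range(len(row)):
            row[k] -= lr * grad[k]


class Module:
    def __init__(self, name, table):
        self.name = name
        self.table = table  # shared reference, not a copy


table = EmbeddingTable(["Barack", "Obama", "Michelle"], dim=2)
extractor = Module("candidate_extraction", table)
predictor = Module("relation_prediction", table)

# An update driven by the extractor's training signal (made-up gradient)...
extractor.table.update("Obama", grad=[1.0, -2.0], lr=0.5)
# ...is visible to the predictor, because both read the same table.
shared_row = predictor.table.lookup("Obama")
```

Because both modules hold a reference to the same table, a training update made through either one changes the embedding that the other reads, which is the sense in which the two networks share the input layer's free variables.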

The first neural network and the second neural network may be trained using a new loss function generated by combining the loss function of the first neural network and the loss function of the second neural network. The specific method of generating the new loss function is described in detail below with reference to FIG. 4.

FIG. 2 is a diagram for explaining a specific operation method of a candidate entity extraction module according to an embodiment.

Referring to FIG. 2, the candidate entity extraction module 110 according to an embodiment may input an input sequence and an input entity into a pre-trained first neural network to extract one or more candidate entities that may have a specific relationship with the input entity. The first neural network may include a recurrent neural network (RNN). For example, the first neural network may include a long short-term memory (LSTM) recurrent neural network. Alternatively, the candidate entity extraction module 110 may include a bidirectional LSTM-CRF model.

The first neural network may be trained to perform natural language processing algorithms such as part-of-speech tagging, named entity recognition, lemmatization, and dependency parsing.

The candidate entity extraction module 110 may feed the input sequence token by token into the first neural network and output a tag sequence composed of the tags corresponding to the tokens. For example, the input sequence may be "President Barack Obama and First Lady Michelle Obama welcome Donald Trump at White House", and the input entity may be "Barack Obama". For convenience, the description below is based on this input sequence. Tokens may be determined per word in the sequence, that is, per whitespace-delimited unit. For example, the input sequence "President Barack Obama and First Lady Michelle Obama welcome Donald Trump at White House" consists of 14 tokens.

Using the trained first neural network and given the input entity "Barack Obama", the candidate entity extraction module 110 may output the tag sequence "B-cand E1 E1 O O O B-cand I-cand O O O O O O", composed of the tags corresponding to each token of the input sequence. "B-cand" is the tag marking the beginning of a candidate entity, "I-cand" is the tag marking the continuation of a candidate entity, "E1" is the tag marking the input entity, and "O" may be the tag for a token that is neither a candidate entity nor the input entity.

The candidate entity extraction module 110 may extract candidate entities based on the tag sequence. Based on the tag sequence "B-cand E1 E1 O O O B-cand I-cand O O O O O O", "President" and "Michelle Obama" may be extracted as candidate entities.
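The decoding step from tag sequence to candidate entities can be sketched as follows. This is a minimal illustration using the example sentence and tags from the text; the function name `extract_candidates` is hypothetical, and the tags here are written by hand rather than produced by a trained network.

```python
def extract_candidates(tokens, tags):
    """Collect spans that start with B-cand and continue with I-cand."""
    candidates, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-cand":
            if current:  # close a span that is directly followed by a new one
                candidates.append(" ".join(current))
            current = [token]
        elif tag == "I-cand" and current:
            current.append(token)
        else:  # "E1", "O", or a stray I-cand closes any open span
            if current:
                candidates.append(" ".join(current))
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


sentence = ("President Barack Obama and First Lady Michelle Obama "
            "welcome Donald Trump at White House")
tokens = sentence.split()  # whitespace tokenization, as in the text
tags = "B-cand E1 E1 O O O B-cand I-cand O O O O O O".split()

candidates = extract_candidates(tokens, tags)
print(candidates)  # ['President', 'Michelle Obama']
```

Whitespace splitting yields the 14 tokens mentioned in the text, one per tag.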

FIG. 3 is a diagram for explaining a specific operation method of a relationship prediction module according to an embodiment.

Referring to FIG. 3, the relationship prediction module 150 according to an embodiment may input the candidate entities into the second neural network and re-rank them based on the degree to which each candidate entity has a specific relationship with the input entity. The second neural network may include a piecewise convolutional neural network (PCNN). A PCNN extends the CNN model; the main difference is that the max pooling layer used in a CNN is extended to a piecewise max pooling layer.
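The difference between ordinary and piecewise max pooling can be shown with a small sketch. It assumes the commonly described PCNN design in which one convolution feature map is split into three segments at the positions of the two entities; the source does not spell out the segmentation, so the split rule, the function name `piecewise_max_pool`, and the feature values are assumptions.

```python
def piecewise_max_pool(features, pos_a, pos_b):
    """Max-pool one feature map over three segments split at two positions.

    features: one convolution output per token (list of floats).
    pos_a, pos_b: token indices of the two entities (assumed split points).
    """
    first, second = sorted((pos_a, pos_b))
    segments = [
        features[: first + 1],             # up to and including the first entity
        features[first + 1 : second + 1],  # between the entities, incl. the second
        features[second + 1 :],            # after the second entity
    ]
    # An empty segment (entity at a sequence boundary) falls back to 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]


features = [0.1, 0.9, 0.3, 0.2, 0.7, 0.4, 0.8, 0.5]  # made-up values
pooled = piecewise_max_pool(features, pos_a=2, pos_b=5)
plain = max(features)  # ordinary max pooling keeps a single value

print(pooled)  # [0.9, 0.7, 0.8]
print(plain)   # 0.9
```

Ordinary max pooling reduces the whole feature map to one number, whereas the piecewise version keeps one maximum per segment and so retains coarse information about where a feature occurs relative to the two entities.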

The relationship prediction module 150 may extract the specific relationship between the re-ranked candidate entities and the input entity, and may further extract a numerical value corresponding to that relationship. For example, the relationship prediction module 150 may input the candidate entities "President" and "Michelle Obama" into the second neural network to re-rank them, and based on the result may determine that the candidate entity "President" has the "title" relationship with the input entity "Barack Obama" and that the candidate entity "Michelle Obama" has the "spouse" relationship with the input entity "Barack Obama". It may also determine that the "title" relationship between "President" and "Barack Obama" has a score of 0.87 and, likewise, that the "spouse" relationship between "Michelle Obama" and "Barack Obama" has a score of 0.46.
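As a minimal sketch of the re-ranking step, the candidates can be sorted by their relationship score. The dictionary layout is hypothetical; the scores 0.87 and 0.46 are the example values from the text and are hard-coded here rather than computed by a network.

```python
# Scores taken from the example in the text; in the apparatus they would
# come from the second neural network.
scored = [
    {"candidate": "Michelle Obama", "relation": "spouse", "score": 0.46},
    {"candidate": "President", "relation": "title", "score": 0.87},
]

# Re-rank so that the candidate with the strongest relationship comes first.
reranked = sorted(scored, key=lambda item: item["score"], reverse=True)

for item in reranked:
    print(item["candidate"], item["relation"], item["score"])
```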

The relationship prediction module 150 according to an embodiment may determine a specific relationship for a candidate entity when the score corresponding to that relationship between the candidate entity and the input entity is at or above a threshold. A candidate entity may therefore have multiple specific relationships with the input entity, provided the score for each is at or above the threshold. For example, if the input sequence is "IBM chief Ginni Rometty puts emphasis on responsible use of data" and the input entity is "IBM", the candidate entity "Ginni Rometty" may have both the "employee" relationship and the "top employee" relationship with the input entity.
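The threshold rule can be sketched as a simple filter that keeps every relationship whose score reaches the threshold, which is how one candidate can end up with several relationships. The function name `select_relations`, the scores, the extra "founder" label, and the threshold value are hypothetical; the source gives no numbers for this example.

```python
def select_relations(scores, threshold):
    """Return every relation whose score is at or above the threshold."""
    return sorted(rel for rel, score in scores.items() if score >= threshold)


# Hypothetical scores for the candidate "Ginni Rometty" and input entity "IBM".
scores = {"employee": 0.81, "top employee": 0.74, "founder": 0.12}
selected = select_relations(scores, threshold=0.5)

print(selected)  # ['employee', 'top employee']
```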

FIG. 4 is a flowchart of a slot filling method according to an embodiment.

Referring to FIG. 4, steps 410 and 420 may be performed by the slot filling apparatus 100 described above with reference to FIGS. 1 to 3. The slot filling apparatus 100 may be implemented by one or more hardware modules, one or more software modules, or various combinations thereof.

In step 410, the slot filling apparatus 100 inputs the input sequence and the input entity into the pre-trained first neural network to extract one or more candidate entities that may have a specific relationship with the input entity.

In step 420, the slot filling apparatus 100 inputs the candidate entities into the pre-trained second neural network to extract the specific relationship between the input entity and each candidate entity.

The first neural network and the second neural network may be trained using a new loss function generated by combining the loss function of the first neural network and the loss function of the second neural network.

The loss function of the first neural network may be given by Equation 1.

Here, loss_cand is the loss function of the first neural network, X is the input sequence, and the remaining symbol in Equation 1 is the output of the first neural network, namely the tag sequence corresponding to the input sequence.

The loss function of the second neural network may be given by Equation 2.

Here, loss_rerank is the loss function of the second neural network, e₁ is the input entity, and the candidate-entity symbol indexed by i denotes the i-th candidate entity. For example, e₁ may be "Barack Obama", and two of the candidate entities may be "President" and "Michelle Obama". The relation symbol may denote a specific relationship between the input entity and the candidate entity, and the remaining term of Equation 2 may be given by Equation 3.

The new loss function generated by combining the loss function of the first neural network and the loss function of the second neural network may be given by Equation 4.

As in Equation 4, combining the loss functions of the two modules yields a new loss function that can be optimized for the slot filling problem as a whole, rather than for each module separately.
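The effect of training on a combined loss can be illustrated with a toy sketch. The exact form of Equation 4 is not reproduced in this text, so the weighted sum used below is an assumption, as are the `weight` parameter, the quadratic stand-in losses, and the single shared parameter `w` (which plays the role of a parameter in the shared input layer). Only the names loss_cand and loss_rerank come from the description.

```python
def loss_cand(w: float) -> float:
    """Toy stand-in for the first network's loss; minimized at w = 1."""
    return (w - 1.0) ** 2


def loss_rerank(w: float) -> float:
    """Toy stand-in for the second network's loss; minimized at w = -3."""
    return (w + 3.0) ** 2


def joint_loss(w: float, weight: float = 1.0) -> float:
    """Assumed combination: a weighted sum of the two losses."""
    return loss_cand(w) + weight * loss_rerank(w)


def numeric_grad(f, w: float, eps: float = 1e-6) -> float:
    """Central-difference estimate of df/dw."""
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


# Plain gradient descent on the shared parameter using the joint loss.
w = 0.0
for _ in range(200):
    w -= 0.1 * numeric_grad(joint_loss, w)
print(round(w, 4))
```

In this toy setting the shared parameter settles between the two individual optima, which is the sense in which a combined loss optimizes both modules together instead of either one alone.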

FIG. 5 is a diagram for explaining a slot filling method according to an embodiment.

Referring to FIG. 5, the input sequence "President Barack Obama and First Lady Michelle Obama welcome Donald Trump at White House" and the input entity "Barack Obama" may be input to the candidate entity extraction module 510 of the slot filling apparatus 500 according to an embodiment, and the candidate entity extraction module 510 may output the tag sequence "B-cand E1 E1 O O O B-cand I-cand O O O O O O". Based on this tag sequence, "President" and "Michelle Obama" may be extracted as candidate entities. The candidate entity extraction module 510 may include a bidirectional LSTM-CRF model.
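Decoding the tag sequence into candidate entities can be sketched as follows. The sketch assumes that "B-cand" opens a candidate span, "I-cand" continues it, and any other tag (such as "E1" for the input entity or "O") closes it, and that tokens are separated by whitespace; `extract_candidates` is a hypothetical helper, not a function named in the description.

```python
def extract_candidates(tokens: list[str], tags: list[str]) -> list[str]:
    """Collect the token spans tagged B-cand / I-cand as candidate entities."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    candidates: list[str] = []
    current: list[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-cand":
            # A new span starts; flush any span that was still open.
            if current:
                candidates.append(" ".join(current))
            current = [token]
        elif tag == "I-cand" and current:
            # Continue the currently open span.
            current.append(token)
        else:
            # Any other tag (E1, O, or a stray I-cand) closes the span.
            if current:
                candidates.append(" ".join(current))
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


tokens = ("President Barack Obama and First Lady Michelle Obama "
          "welcome Donald Trump at White House").split()
tags = "B-cand E1 E1 O O O B-cand I-cand O O O O O O".split()
candidates = extract_candidates(tokens, tags)
print(candidates)
```

The input sequence has 14 tokens and the tag sequence has 14 tags, so each token is paired with exactly one tag.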

The relationship prediction module 550 may extract the specific relationships between the candidate entities "President" and "Michelle Obama" and the input entity "Barack Obama". For example, it may determine that the candidate entity "President" has the "title" relationship with the input entity "Barack Obama", and that the candidate entity "Michelle Obama" has the "spouse" relationship with the input entity "Barack Obama". It may also extract a score for each relationship: 0.87 for the "title" relationship and 0.46 for the "spouse" relationship. The relationship prediction module 550 may include a piecewise convolutional neural network (PCNN).

The embodiments described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the apparatuses, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored in one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

Although the embodiments have been described with reference to a limited set of drawings, a person of ordinary skill in the art may apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or if components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the following claims.

Claims

1. A slot filling method comprising:
extracting one or more candidate entities that may have a specific relationship with an input entity by inputting an input sequence and the input entity into a pre-trained first neural network; and
extracting the specific relationship between the input entity and the candidate entity by inputting the candidate entity into a pre-trained second neural network,
wherein the first neural network and the second neural network are trained while sharing an input layer, and
wherein the steps are performed by at least one processor.

2. The slot filling method of claim 1,
wherein the first neural network and the second neural network are trained using a new loss function generated by combining the loss function of the first neural network and the loss function of the second neural network.

3. The slot filling method of claim 1,
wherein the first neural network and the second neural network are trained while sharing the free parameters of the input layer by using the same word embedding.

4. The slot filling method of claim 3,
wherein the word embedding is stored in a database of the first neural network and the second neural network in the form of a lookup table, and the lookup table is updated during training.

5. The slot filling method of claim 1,
wherein the input sequence is composed of tokens, and
wherein extracting the candidate entity comprises:
outputting a tag sequence composed of tags corresponding to the tokens by inputting the input sequence into the first neural network token by token; and
extracting the candidate entity based on the tag sequence.

6. The slot filling method of claim 5,
wherein extracting the specific relationship comprises:
inputting the candidate entity into the second neural network and re-ranking the candidate entity based on the degree to which the candidate entity has the specific relationship with the input entity;
extracting the specific relationship between the re-ranked candidate entity and the input entity; and
extracting a numerical value corresponding to the specific relationship.

7. The slot filling method of claim 1,
wherein the input sequence comprises a text sequence.

8. The slot filling method of claim 1,
wherein the first neural network comprises a recurrent neural network of the long short-term memory (LSTM) type.

9. The slot filling method of claim 1,
wherein the second neural network comprises a piecewise convolutional neural network (PCNN).

10. A computer program stored in a computer-readable recording medium and combined with hardware to execute the method of any one of claims 1 to 9.

11. A slot filling apparatus comprising:
a candidate entity extraction module configured to extract one or more candidate entities that may have a specific relationship with an input entity by inputting an input sequence and the input entity into a pre-trained first neural network; and
a relationship prediction module configured to extract the specific relationship between the input entity and the candidate entity by inputting the candidate entity into a pre-trained second neural network,
wherein the first neural network and the second neural network are trained while sharing an input layer.

12. The slot filling apparatus of claim 11,
wherein the first neural network and the second neural network are trained using a new loss function generated by combining the loss function of the first neural network and the loss function of the second neural network.

13. The slot filling apparatus of claim 11,
wherein the first neural network and the second neural network are trained while sharing the free parameters of the input layer by using the same word embedding.

14. The slot filling apparatus of claim 13,
wherein the word embedding is stored in a database of the first neural network and the second neural network in the form of a lookup table, and the lookup table is updated during training.

15. The slot filling apparatus of claim 11,
wherein the input sequence is composed of tokens, and
wherein the candidate entity extraction module is configured to:
output a tag sequence composed of tags corresponding to the tokens by inputting the input sequence into the first neural network token by token; and
extract the candidate entity based on the tag sequence.

16. The slot filling apparatus of claim 15,
wherein the relationship prediction module is configured to:
input the candidate entity into the second neural network and re-rank the candidate entity based on the degree to which the candidate entity has the specific relationship with the input entity; and
extract the specific relationship between the re-ranked candidate entity and the input entity, together with a numerical value corresponding to the specific relationship.

17. The slot filling apparatus of claim 11,
wherein the input sequence comprises a text sequence.

18. The slot filling apparatus of claim 11,
wherein the first neural network comprises a recurrent neural network of the long short-term memory (LSTM) type.

19. The slot filling apparatus of claim 11,
wherein the second neural network comprises a piecewise convolutional neural network (PCNN).