KR101988396B1

KR101988396B1 - System for generating query to knowledge base from natural language question and ranking resources and question answering system including the same

Info

Publication number: KR101988396B1
Application number: KR1020170176481A
Authority: KR
Inventors: 이종민; 양승원; 이경일
Original assignee: 주식회사 솔트룩스
Priority date: 2017-12-20
Filing date: 2017-12-20
Publication date: 2019-06-12

Abstract

According to an exemplary embodiment of the present invention, provided is a question and answer system, which comprises: a question pattern analysis unit configured to generate at least one question pattern arranged in accordance with a ranking by mapping tokens extracted from a natural language question to ontology components based on meta-knowledge data; and a query generation unit configured to generate at least one query on the knowledge base based on a question pattern template from the at least one question pattern. The meta-knowledge data may include meta-information of the ontology components in the knowledge base.

Description

System for generating a query on a knowledge base from a natural language question and ranking resources, and question answering system including the same

The technical idea of the present invention relates to a knowledge base and, more particularly, to a system and method for generating a query on a knowledge base.

The present invention is derived from research led and carried out by Saltlux Inc. as part of the SW Computing Industry Source Technology Development Program (SW) of the Ministry of Science and ICT. [Research period: 2017.03.01 to 2017.12.31; research management agency: Institute for Information & Communications Technology Promotion; project title: WiseKB: Development of a self-learning knowledge base and inference technology based on big data understanding; project number: 2013-0-00109]

A knowledge base that stores knowledge data and provides the stored knowledge data may be built. To provide the knowledge a user wants from among the knowledge (or knowledge data) it contains, the knowledge base may support queries. An expert can write an appropriate query on the knowledge base directly to obtain the desired knowledge, whereas an ordinary user who is not an expert may find it difficult to write such a query.

A question answering system may generate an answer to a user's question based on a knowledge base. For example, the question answering system may generate an answer by searching the knowledge base for relevant knowledge according to a natural language question received from the user. In such a question answering system, the query on the knowledge base that is generated from the natural language question can determine the quality of the answer the system provides.

The technical idea of the present invention provides a system and method for generating a query on a knowledge base.

To achieve the above object, a system according to the technical idea of the present invention may include: a question pattern analysis unit configured to generate at least one question pattern, ordered by rank, by mapping tokens extracted from a natural language question to ontology components based on meta-knowledge data; and a query generation unit configured to generate at least one query on a knowledge base from the at least one question pattern based on a question pattern template. The meta-knowledge data may include meta information of the ontology components in the knowledge base.

According to an exemplary embodiment of the present invention, the meta-knowledge data may include meta information of an entity, including at least one of the entity's importance, link count, property list, and class list, and the question pattern analysis unit may include an entity evaluation unit configured to calculate a score of at least one entity corresponding to a token based on the meta information of the at least one entity.

According to an exemplary embodiment of the present invention, the entity evaluation unit may calculate the score of the at least one entity so that it is proportional to at least one of the importance and the link count.

According to an exemplary embodiment of the present invention, the entity evaluation unit may calculate the score of the at least one entity so that the score increases when the extracted tokens include a token corresponding to a property in the property list or a class in the class list.
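
The entity scoring described in the embodiments above can be sketched as follows. This is only a minimal illustration: the `EntityMeta` record, the weights, and the linear form of the score are assumptions made for the example, since the text specifies only that the score is proportional to the importance and/or link count and increases when a matching property or class token is present.

```python
from dataclasses import dataclass, field


@dataclass
class EntityMeta:
    """Hypothetical record holding one entity's meta information."""
    importance: float
    link_count: int
    properties: set = field(default_factory=set)  # names in the property list
    classes: set = field(default_factory=set)     # names in the class list


def entity_score(meta, other_tokens, w_importance=1.0, w_links=0.01, bonus=0.5):
    """Score grows with importance and link count (assumed linear weights),
    plus a fixed bonus for each other token that names one of the entity's
    properties or classes."""
    score = w_importance * meta.importance + w_links * meta.link_count
    matches = sum(
        1 for t in other_tokens if t in meta.properties or t in meta.classes
    )
    return score + bonus * matches
```

With such a function, an entity candidate whose property list contains another token of the question (for example a token naming a "father" property) would be scored above an otherwise identical candidate that lacks it.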

According to an exemplary embodiment of the present invention, the meta-knowledge data may include meta information of a property, including at least one of hierarchy information, domain information, and range information of properties, and the question pattern analysis unit may include a property evaluation unit configured to calculate a score of at least one property corresponding to a token based on the meta information of the at least one property.

According to an exemplary embodiment of the present invention, the property evaluation unit may calculate the score of the at least one property, based on the hierarchy information of the properties, so that a higher-level property has a lower score.

According to an exemplary embodiment of the present invention, the property evaluation unit may calculate the score of the at least one property, based on at least one of the domain information and the range information of the properties, so that the score increases when the extracted tokens include a token corresponding to a class in the domain or range related to the at least one property.
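
A property score with these two characteristics might look like the sketch below. The function name, the numeric constants, and the use of hierarchy depth (0 for a top-level property) are assumptions for illustration; the text states only that higher-level properties score lower and that a matching domain or range class raises the score.

```python
def property_score(depth, domain_classes, range_classes, token_classes,
                   base=1.0, step=0.2, bonus=0.5):
    """Hypothetical property score.

    depth          -- level in the property hierarchy, 0 = top-level property
    domain_classes -- classes in the property's domain
    range_classes  -- classes in the property's range
    token_classes  -- classes named by other tokens of the question
    """
    # Deeper (more specific) properties score higher, so higher-level
    # properties end up with lower scores.
    score = base + step * depth
    # Bonus when any question token names a class in the domain or range.
    related = set(domain_classes) | set(range_classes)
    if related & set(token_classes):
        score += bonus
    return score
```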

According to an exemplary embodiment of the present invention, the meta-knowledge data may include meta information of a class, including hierarchy information of classes, and the question pattern analysis unit may include a class evaluation unit configured to calculate a score of at least one class corresponding to a token based on the meta information of the at least one class.

According to an exemplary embodiment of the present invention, the class evaluation unit may calculate the score of the at least one class, based on the hierarchy information of the classes, so that a higher-level class has a lower score.
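
One simple way to realize a hierarchy-based class score is to use the depth of the class in the hierarchy, as in the sketch below. The `parent_of` mapping, the class names, and the choice of depth as the score are assumptions for this example, and the hierarchy is assumed to be acyclic.

```python
def class_depth(cls, parent_of):
    """Number of ancestors of cls; a root class has depth 0.
    parent_of is a hypothetical child -> parent mapping (assumed acyclic)."""
    depth = 0
    while cls in parent_of:
        cls = parent_of[cls]
        depth += 1
    return depth


def class_score(cls, parent_of, step=1.0):
    """Higher-level (shallower) classes receive lower scores."""
    return step * class_depth(cls, parent_of)
```

Under this choice, a general class such as a root "Thing" would score below a more specific class such as "Politician", so the more specific interpretation of a token is preferred.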

According to an exemplary embodiment of the present invention, the question pattern analysis unit may include a question pattern ranking unit configured to determine the rank of the at least one question pattern based on the scores of the ontology components included in the at least one question pattern.
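
Ranking question patterns by the scores of their components can be sketched as below. Representing a pattern as a list of `(kind, uri, score)` tuples and aggregating by a plain sum are assumptions for illustration; the text does not specify how component scores are combined.

```python
def pattern_score(pattern):
    """Sum of component scores; pattern is a list of (kind, uri, score)
    tuples, where kind is "E", "P", or "C" (hypothetical layout)."""
    return sum(score for _, _, score in pattern)


def rank_patterns(patterns):
    """Return the patterns ordered from highest to lowest total score."""
    return sorted(patterns, key=pattern_score, reverse=True)
```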

According to the system and method of the technical idea of the present invention, a query on the knowledge base that matches the intent of the natural language question can be generated.

In addition, according to the system and method of the technical idea of the present invention, a query and an answer can be generated quickly for a user's question because the meta-knowledge data is built in advance.

In addition, according to the system and method of the technical idea of the present invention, the meta-knowledge data can be updated automatically as the knowledge base is updated.

FIG. 1 is a block diagram illustrating a question answering system and its input/output relationships according to an exemplary embodiment of the present invention.
FIG. 2 is a block diagram illustrating an example of the question pattern analysis unit of FIG. 1 according to an exemplary embodiment of the present disclosure.
FIG. 3 illustrates an example of the at least one token list of FIG. 2 according to an exemplary embodiment of the present disclosure.
FIG. 4 illustrates an example of the entity meta information of FIG. 2 according to an exemplary embodiment of the present disclosure.
FIGS. 5 and 6 illustrate examples of the property meta information of FIG. 2 according to exemplary embodiments of the present disclosure.
FIG. 7 illustrates an example of the at least one question pattern of FIG. 2 according to an exemplary embodiment of the present disclosure.
FIG. 8 is a diagram illustrating an example of pattern template data included in the pattern template database of FIG. 1 according to an exemplary embodiment of the present disclosure.
FIG. 9 is a flowchart illustrating a method of generating a query on a knowledge base according to an exemplary embodiment of the present disclosure.
FIG. 10 is a flowchart illustrating an example of step S40 of FIG. 9 according to an exemplary embodiment of the present disclosure.
FIG. 11 is a block diagram illustrating a meta-knowledge construction system and its input/output relationships according to an exemplary embodiment of the present disclosure.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. The embodiments are provided to explain the present invention more completely to those of ordinary skill in the art. Because the present invention can be modified in various ways and can take various forms, specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the present invention to the particular forms disclosed, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. Like reference numerals are used for like elements in describing each drawing. In the accompanying drawings, the dimensions of structures are enlarged or reduced relative to the actual dimensions for clarity.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

In the drawings and description below, a component shown or described as a single block may be a hardware block or a software block. For example, each component may be an independent hardware block that exchanges signals with the others, or a software block executed on a single processor. In addition, a "system" or "database" herein may refer to a computing system that includes at least one processor and a memory accessed by the processor.

FIG. 1 is a block diagram illustrating a question answering system 100 and its input/output relationships according to an exemplary embodiment of the present invention. As shown in FIG. 1, the question answering system 100 may receive a question from a user 20 and provide an answer to the question to the user 20. The question answering system 100 may also be communicatively connected to a natural language recognition system 200, a meta-knowledge database 300, a pattern template database 400, and a knowledge base 500. In some embodiments, unlike what is shown in FIG. 1, at least one of the natural language recognition system 200, the meta-knowledge database 300, the pattern template database 400, and the knowledge base 500 may be included in the question answering system 100.

The question answering system 100 may receive a natural language question from the user 20. For example, the user 20 may use a terminal connected to a network and enter a natural language question into the terminal by voice or text, and the question answering system 100 may receive the natural language question over the network. The question answering system 100 may obtain knowledge (or knowledge data) included (or stored) in the knowledge base 500 based on the natural language question, generate an answer to the natural language question based on the obtained knowledge, and provide the answer to the terminal of the user 20 over the network. As shown in FIG. 1, the question answering system 100 may include a user interface 110, a question pattern analysis unit 130, a query generation unit 150, a query execution unit 170, and an answer generation unit 190.

The user interface 110 may be connected to a network such as a LAN (Local Area Network) or a WAN (Wide Area Network), or to a dedicated channel for one-to-one communication with the user 20 or the terminal of the user 20. The user interface 110 may provide the natural language question received from the user 20 to the question pattern analysis unit 130, and may provide the answer received from the answer generation unit 190 to the user 20.

The question pattern analysis unit 130 may receive the natural language question from the user interface 110 and provide it to the natural language recognition system 200. The natural language recognition system 200 may analyze the natural language question (for example, by morphological analysis) to generate at least one token list corresponding to the question and provide the token list to the question pattern analysis unit 130. A token list may include at least one token, and a token may mean a meaningful lexical unit in the natural language question. For example, from the natural language question "이순신의 고향은" (roughly, "Yi Sun-sin's hometown is"), a token list including the tokens "이순신" (Yi Sun-sin), "의" (possessive particle), "고향" (hometown), "은" (topic particle), and "?" may be generated. In some embodiments, particles such as "의" and "은" may be excluded from the token list. The natural language recognition system 200 may also be referred to as an NLU (Natural Language Understanding) system.
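
The optional exclusion of particles from a token list can be illustrated with the short sketch below. The particle set and the function name are assumptions for the example; in practice a morphological analyzer would tag particles rather than rely on a fixed set.

```python
# Hypothetical set of Korean particles to exclude; a real system would use
# part-of-speech tags from morphological analysis instead of a fixed list.
PARTICLES = {"의", "은", "는", "이", "가"}


def drop_particles(tokens):
    """Return the token list with particle tokens removed, order preserved."""
    return [t for t in tokens if t not in PARTICLES]


tokens = ["이순신", "의", "고향", "은", "?"]
filtered = drop_particles(tokens)  # ["이순신", "고향", "?"]
```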

In some embodiments, the natural language recognition system 200 may refer to the knowledge data included in the knowledge base 500, as shown in FIG. 1. For example, the natural language recognition system 200 may recognize and extract tokens from the natural language question based on the knowledge data stored in the knowledge base 500. The natural language recognition system 200 may generate a score for a token list and provide the score together with the token list to the question pattern analysis unit 130. The score of a token list may refer to a value indicating how well the token list fits the given natural language question lexically. Examples of token lists provided by the natural language recognition system 200 are described below with reference to FIG. 3.

The question pattern analysis unit 130 may generate at least one question pattern by mapping the tokens in the token list received from the natural language recognition system 200 to ontology components. An ontology expresses things that exist or that a person can recognize in a form a computer can handle, and ontology components may include entities (or instances), classes, and properties. Ontology components may additionally include relations (properties between entities or between classes), function terms, restrictions, rules, events, and the like.

By mapping the tokens extracted from the natural language question to ontology components, the question pattern analysis unit 130 may generate a question pattern composed of ontology components. A question pattern may be a combination of ontology components, for example an entity (E), a class (C), and a property (P), and the question pattern analysis unit 130 may generate the question pattern by referring to the meta-knowledge data (or meta-knowledge) stored in the meta-knowledge database 300. The meta-knowledge data is data about the knowledge data stored in the knowledge base 500 (that is, metadata) and may include information about the knowledge data. For example, the meta-knowledge data may include meta information of entities, as described below with reference to FIG. 4, meta information of properties, as described below with reference to FIGS. 5 and 6, and meta information of classes. In some embodiments, the meta information of entities may depend on the knowledge data stored in the knowledge base 500, while the meta information of properties and/or classes may be data used to build the knowledge base 500. Examples of the question pattern analysis unit 130 and the meta-knowledge data are described below with reference to FIGS. 2 to 7.

The query generation unit 150 may receive a question pattern from the question pattern analysis unit 130 and generate at least one query based on a pattern template (or pattern template data) stored in the pattern template database 400. For example, the knowledge base 500 may include knowledge data expressed using RDF (Resource Description Framework), and the query generation unit 150 may generate a SPARQL (SPARQL Protocol and RDF Query Language) query, as an example of an ontology language. The pattern template database 400 may store a plurality of pattern templates and provide a pattern template to the query generation unit 150. A pattern template may define a question pattern and a corresponding query template, as described below with reference to FIG. 8. The query generation unit 150 may generate a query by substituting the values (for example, URIs) of the ontology components included in the question pattern into the query template corresponding to the question pattern received from the question pattern analysis unit 130. An example of the operation of the query generation unit 150 is described below with reference to FIG. 8.
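
The substitution of component URIs into a query template can be sketched as follows. The template text, the "E P" pattern key, and the example URIs are hypothetical and are not the pattern templates of FIG. 8; the sketch only shows the general idea of selecting a query template by question pattern and filling in URIs.

```python
# Hypothetical pattern templates: a tuple of component kinds maps to a SPARQL
# query template with positional placeholders for the component URIs.
TEMPLATES = {
    ("E", "P"): "SELECT ?answer WHERE {{ <{0}> <{1}> ?answer . }}",
}


def generate_query(pattern):
    """pattern is a list of (kind, uri) pairs, e.g. [("E", ...), ("P", ...)].
    Returns the filled-in SPARQL string, or None if no template matches."""
    kinds = tuple(kind for kind, _ in pattern)
    template = TEMPLATES.get(kinds)
    if template is None:
        return None
    return template.format(*(uri for _, uri in pattern))


query = generate_query([
    ("E", "http://example.org/entity/0000148525"),  # assumed URI form
    ("P", "http://example.org/property/father"),    # assumed URI form
])
```

Keeping the templates in a lookup table mirrors the role of the pattern template database 400: new question patterns can be supported by adding templates without changing the generation logic.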

The query execution unit 170 may provide the query received from the query generation unit 150 to the knowledge base 500 and receive the knowledge data that the knowledge base 500 provides in response to the query. In some embodiments, the query execution unit 170 may receive a plurality of queries corresponding to one natural language question from the query generation unit 150 and receive knowledge data corresponding to the plurality of queries. The query execution unit 170 may provide the knowledge data received from the knowledge base 500 to the answer generation unit 190.

The answer generation unit 190 may receive knowledge data from the query execution unit 170, generate an answer based on the received knowledge data, and provide the answer to the user interface 110. In some embodiments, the answer generation unit 190 may receive knowledge data from the query execution unit 170 as a plurality of responses of the knowledge base 500 corresponding to a plurality of queries, and may determine a ranking of the plurality of responses. In some embodiments, the answer generation unit 190 may generate a natural language answer from the knowledge data. In some embodiments, the answer generation unit 190 may generate, based on the knowledge data, a natural language question in reply to the natural language question of the user 20 as part of a dialogue with the user 20.

Objects to which scores are assigned, such as token lists, ontology components, and question patterns, may be referred to as resources. By evaluating these resources, that is, by ranking them, appropriate resources can be determined and, as a result, an appropriate query can be generated. Accordingly, as described below with reference to the drawings, the question answering system 100 of FIG. 1 can generate a query that matches the intent of the natural language question, and an answer to the question of the user 20 can be generated quickly thanks to the meta-knowledge data built in advance. In addition, because the meta-knowledge data is updated as the knowledge base is updated, queries generated from the meta-knowledge data can continue to match the intent of natural language questions.

Although FIG. 1 shows the question pattern analysis unit 130 and the query generation unit 150 as included in the question answering system 100, in some embodiments, like the natural language recognition system 200, the question pattern analysis unit 130 and the query generation unit 150 may be included in a separate, independent system communicatively connected to the question answering system 100, and such an independent system may be referred to as a question conversion system.

FIG. 2 is a block diagram illustrating an example of the question pattern analysis unit 130 of FIG. 1 according to an exemplary embodiment of the present disclosure. FIG. 3 illustrates an example of the at least one token list D210 of FIG. 2, FIG. 4 illustrates an example of the entity meta information D310 of FIG. 2, FIGS. 5 and 6 illustrate examples of the property meta information D330 of FIG. 2, and FIG. 7 illustrates an example of the at least one question pattern D130 of FIG. 2, each according to exemplary embodiments of the present disclosure. FIGS. 2 to 7 are described below with reference to FIG. 1.

As described above with reference to FIG. 1, the question pattern analysis unit 130' of FIG. 2 may receive at least one token list D210 generated from a natural language question from the natural language recognition system 200, and may generate at least one question pattern D130 by mapping the token list to ontology components with reference to the meta-knowledge data stored in the meta-knowledge database 300'. As shown in FIG. 2, the question pattern analysis unit 130' may include an entity evaluation unit 131, a property evaluation unit 133, a class evaluation unit 135, and a pattern ranking unit 137. The meta-knowledge database 300' may include entity meta information D310, property meta information D330, and class meta information D350.

As shown in FIG. 2, the at least one token list D210 may include a first token list TL₁ containing a series of tokens T₁, ..., T_n. For example, as shown in the token lists D210' of FIG. 3, when the natural language question "버락오바마의 아버지가 태어난 나라의 수도는?" ("What is the capital of the country where Barack Obama's father was born?") is received from the user 20, the first token list TL₁ may include the tokens "버락오바마" (Barack Obama), "의" (possessive particle), "아버지" (father), "가" (subject particle), "태어난" (born), "나라" (country), "의", "수도" (capital), "는" (topic particle), and "?". Similarly, the second to fourth token lists TL₂, ..., TL₄ may each include tokens segmented differently, as shown in FIG. 3. A token list may have a score assigned by the natural language recognition system 200, and the score may indicate the degree to which the natural language recognition system 200 evaluated that token list as the most suitable among the token lists. For example, the natural language recognition system 200 may calculate a score for each of the token lists based on predefined sentence structures and/or the knowledge data stored in the knowledge base 500. In some embodiments, the question pattern analysis unit 130' may receive the score of each token list together with the token lists. In some embodiments, instead of receiving the scores, the question pattern analysis unit 130' may receive from the natural language recognition system 200 the token lists sorted in descending order of score.
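The score-ordered hand-off described above can be sketched in Python as follows. The `TokenList` type, the numeric scores, and the second segmentation are hypothetical illustrations, since FIG. 3 and the scoring rule of the natural language recognition system 200 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class TokenList:
    tokens: list   # tokens in question order
    score: float   # hypothetical suitability score from the recognizer


def order_by_score(token_lists):
    """Return the token lists sorted from the highest score to the lowest."""
    return sorted(token_lists, key=lambda tl: tl.score, reverse=True)


# Two candidate segmentations of the same question (scores are made up).
candidates = [
    TokenList(["버락", "오바마", "의", "아버지", "가", "태어난", "나라", "의", "수도", "는", "?"], 0.4),
    TokenList(["버락오바마", "의", "아버지", "가", "태어난", "나라", "의", "수도", "는", "?"], 0.9),
]
ordered = order_by_score(candidates)
```

With this ordering, a downstream unit that receives only `ordered` can treat the first element as the preferred segmentation without seeing the scores themselves.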

Referring again to FIG. 2, the entity evaluation unit 131, the property evaluation unit 133, and the class evaluation unit 135 may receive a token list (e.g., TL₁) included in the at least one token list D210, and may evaluate the tokens included in the received token list based on the meta-knowledge data included in the meta-knowledge database 300'. The operation of each of the entity evaluation unit 131, the property evaluation unit 133, and the class evaluation unit 135 is described below.

The entity evaluation unit 131 may evaluate, among the tokens included in the token list, a token corresponding to an entity with reference to the entity meta information D310 included in the meta-knowledge database 300'. The entity meta information D310 may include information about the entities included in the knowledge base 500. For example, as shown in FIG. 4, the entity meta information D310' for the entity "버락오바마" (Barack Obama) may include a plurality of fields including a URI (Uniform Resource Identifier), a link count, a Korean name, an original-language name, a property list, a class list, and a rank. The URI may refer to a unique address for accessing the entity "버락오바마" and may have the value "0000148525". The link count may refer to the number of links of the entity "버락오바마", that is, the number of links including, according to properties, in-links that point to the entity and out-links that depart from the entity. The property list may include the properties that the entity "버락오바마" has, and similarly the class list may include the classes that the entity has (or belongs to). The rank may refer to a value indicating the importance of the entity "버락오바마". For example, the rank may be calculated based on the link count, or based on the frequency of recent questions (or queries) about the entity.

In some embodiments, the entity evaluation unit 131 may calculate the score of the entity corresponding to a token so that the score is proportional to the link count and/or the rank. For example, for the entity "버락오바마", the entity evaluation unit 131 may calculate the score of the entity (e.g., the average of the two values) so that it is proportional to the link count value "770" and/or the rank value "790". In some embodiments, the entity evaluation unit 131 may calculate a higher score for the entity when the tokens included in the given token list include a token corresponding to a property in the property list. Similarly, in some embodiments, the entity evaluation unit 131 may calculate a higher score for the entity when the tokens included in the given token list include a token corresponding to a class in the class list. In some embodiments, when the tokens included in the given token list include a token corresponding to a property or class in the property list or class list, the entity evaluation unit 131 may calculate a lower score for the entity or may exclude the entity.

The entity evaluation unit 131 may evaluate an entity according to at least one of the schemes described above. When a plurality of entities correspond to a particular token, for example when a plurality of entities with the same name correspond to the token, the entity evaluation unit 131 may calculate a score for each of the entities and then sort the entities according to their scores.
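A minimal sketch of this entity evaluation follows, assuming the base score is the average of link count and rank and that a matching property or class multiplies the score by a fixed factor. The `EntityMeta` type, the boost factor, the property and class lists, and the link counts and ranks of the two lesser-known entities are hypothetical; only the URIs and the values 770 and 790 come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class EntityMeta:
    uri: str
    link_count: int
    rank: int
    properties: set = field(default_factory=set)  # hypothetical property list
    classes: set = field(default_factory=set)     # hypothetical class list


def entity_score(meta, question_properties, question_classes, boost=1.5):
    """Average of link count and rank, boosted when the question mentions
    a property or a class that the entity has (boost factor is assumed)."""
    score = (meta.link_count + meta.rank) / 2
    if meta.properties & question_properties:
        score *= boost
    if meta.classes & question_classes:
        score *= boost
    return score


def rank_entities(candidates, question_properties, question_classes):
    """Sort same-name entity candidates from the highest score to the lowest."""
    return sorted(
        candidates,
        key=lambda m: entity_score(m, question_properties, question_classes),
        reverse=True,
    )


candidates = [
    EntityMeta("adr:0000126163", 3, 5),
    EntityMeta("adr:0000148525", 770, 790, {"adp:father"}, {"adc:person_00006026"}),
    EntityMeta("adr:0030786216", 12, 10),
]
ranked = rank_entities(candidates, {"adp:father", "adp:bornIn"}, set())
```

Here the entity with URI `adr:0000148525` ranks first both because of its larger base score and because the question contains a token mapped to one of its properties.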

The property evaluation unit 133 may evaluate, among the tokens included in the token list, a token corresponding to a property with reference to the property meta information D330 included in the meta-knowledge database 300'. The property meta information D330 may include information about the properties included in the knowledge base 500. The property meta information D330 may include information on a plurality of properties, and the properties may have a hierarchical structure according to their meanings. For example, as shown in FIG. 5, the properties that an entity can have may be included in the property meta information D330', and in the property meta information D330' of FIG. 5 a property in the right column may be a higher-level property of the property in the left column. A higher-level property may have a meaning that encompasses the meaning of its lower-level properties. For example, the property "adaptedAs" may correspond to "각색하다" (adapt, dramatize), "개작하다" (rework), and "번안하다" (adapt a foreign work), while the property "creation", as the higher-level property of "adaptedAs", may correspond to "만들다" (make), "발명하다" (invent), "생성하다" (generate), "창시하다" (found), and "창조하다" (create). Similarly, the property "adapter" may have the property "creator" as its higher-level property, and the property "adopt" may have the property "activity" as its higher-level property. As described above, because a higher-level property has a meaning that encompasses the meaning of a lower-level property, a lower-level property corresponds to a more specific meaning. In the example of FIG. 5, the property meta information D330' is shown as pairs of a lower-level property and a higher-level property, but it will be understood that the properties may have a hierarchical structure such as a tree.

In some embodiments, the property evaluation unit 133 may calculate the score of a property corresponding to a token, based on the hierarchy information of the properties included in the property meta information D330', so that a higher-level property has a lower score. For example, when two or more properties correspond to the same token, the property evaluation unit 133 may assign a higher score to the lower-level property and a lower score to the higher-level property. For example, the property evaluation unit 133 may calculate the score so that it is inversely proportional to the number of levels, in the tree structure of the properties, from a leaf up through its parents to the property corresponding to the token.
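The level-based scoring above can be sketched as follows. The child-to-parent pairs are the ones named for FIG. 5; the formula 1 / (1 + height above the leaves) is only one assumed way to make the score inversely related to the level, and the function names are hypothetical.

```python
# Child -> parent pairs named in the description of FIG. 5.
PARENT = {"adaptedAs": "creation", "adapter": "creator", "adopt": "activity"}


def height_above_leaf(prop, parent=PARENT):
    """Number of levels between `prop` and the deepest leaf beneath it."""
    children = [child for child, up in parent.items() if up == prop]
    if not children:
        return 0  # a leaf sits at height 0
    return 1 + max(height_above_leaf(child, parent) for child in children)


def hierarchy_score(prop, parent=PARENT):
    """Assumed scoring rule: more specific (lower-level) properties score higher."""
    return 1.0 / (1 + height_above_leaf(prop, parent))


specific = hierarchy_score("adaptedAs")  # leaf property
general = hierarchy_score("creation")    # its higher-level property
```

The same helper could serve a class hierarchy by passing a child-to-parent map of classes instead of properties.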

In some embodiments, the property meta information D330 of FIG. 2 may include domain information and range information of the properties. A domain may correspond to the subject of a property, a range may correspond to the object or value of a property, and each of the domain and the range may be defined as a set of classes. For example, the property "쓰다 (write)" may include the class "person" as its domain and the classes "novel", "diary", "document", and the like as its range. The homonymous property "쓰다 (bitter)" may include the class "food" as its domain and may have the class "boolean" as its range. FIG. 6 shows examples of the domains and ranges of some properties. Specifically, FIG. 6 shows properties and ranges according to the class included in the domain. For example, among the properties having the class "accident" as a domain, the property "agent" may have the class "organization" and the like as its range. Similarly, the property "awarded" may have the class "award" as its range. In some embodiments, the property evaluation unit 133 may, based on at least one of the domain information and the range information of the properties, calculate the score of a property corresponding to a token so that the score increases when the token list includes a token corresponding to a class in the domain or range related to that property.
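As a sketch of how domain and range information could separate the two readings of "쓰다", the example below adds a bonus for every domain or range class that also appears in the question. The `PropertyMeta` type, the base score, and the additive bonus are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyMeta:
    name: str
    domain: frozenset   # classes that may act as the subject
    range_: frozenset   # classes that may act as the object or value


def property_score(prop, question_classes, base=1.0, bonus=1.0):
    """Base score plus a bonus per domain/range class found in the question."""
    related = prop.domain | prop.range_
    return base + bonus * len(related & question_classes)


write = PropertyMeta("write", frozenset({"person"}),
                     frozenset({"novel", "diary", "document"}))
bitter = PropertyMeta("bitter", frozenset({"food"}), frozenset({"boolean"}))

# Suppose another token of the question was mapped to the class "novel".
question_classes = {"novel"}
best = max([write, bitter], key=lambda p: property_score(p, question_classes))
```

Because "novel" is in the range of `write` but unrelated to `bitter`, the homonymous token resolves to `write` in this example.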

The class evaluation unit 135 may evaluate, among the tokens included in the token list, a token corresponding to a class with reference to the class meta information D350 included in the meta-knowledge database 300'. The class meta information D350 may include information about the classes included in the knowledge base 500. The class meta information D350 may include information on a plurality of classes, and, as with the property meta information D330, the classes may have a hierarchical structure according to their meanings. A higher-level class may have a meaning that encompasses the meaning of its lower-level classes, so a lower-level class corresponds to a more specific meaning.

In some embodiments, the class evaluation unit 135 may calculate the score of a class corresponding to a token, based on the hierarchy information of the classes included in the class meta information D350, so that a higher-level class has a lower score. For example, when two or more classes correspond to the same token, the class evaluation unit 135 may assign a higher score to the lower-level class and a lower score to the higher-level class. For example, the class evaluation unit 135 may calculate the score so that it is inversely proportional to the number of levels, in the tree structure of the classes, from a leaf up through its parents to the class corresponding to the token.

In some embodiments, the class meta information D350 of FIG. 2 may include property information and range information of the classes. For example, as described above with reference to FIG. 6, a class may be related to the properties that have the class as a domain and to the classes corresponding to the ranges of those properties. In some embodiments, the information shown in FIG. 6 may be shared by the property evaluation unit 133 and the class evaluation unit 135, and the class evaluation unit 135 may, based on at least one of the property information and the range information of the classes, calculate the score of a class corresponding to a token so that the score increases when the token list includes a token corresponding to a property or a range class related to that class.

As described above, the property meta information D330 and the class meta information D350 may include hierarchy information of the properties and hierarchy information of the classes, respectively. Such hierarchy information may be used to build the knowledge base 500 and may be referred to as an ontology schema. That is, the meta-knowledge data referenced by the question pattern analysis unit 130' of FIG. 2 may include not only information produced from the knowledge data included in the knowledge base 500, such as the entity meta information D310, but also construction information used to build the knowledge base 500 (e.g., D330 and D350 of FIG. 2).

The pattern ranking unit 137 may receive the ontology components and their scores from the entity evaluation unit 131, the property evaluation unit 133, and the class evaluation unit 135, and may determine the ranking of the question patterns based on them. For example, the pattern ranking unit 137 may receive the entities corresponding to a token and their scores from the entity evaluation unit 131, the properties corresponding to a token and their scores from the property evaluation unit 133, and the classes corresponding to a token and their scores from the class evaluation unit 135. The pattern ranking unit 137 may calculate, based on the received scores, a score for each question pattern, which is a combination of entities, properties, and classes, and may generate the at least one question pattern D130 by sorting the question patterns in descending order of score.
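The ranking step can be sketched as enumerating one candidate per token and aggregating the candidate scores. The description does not state how the scores are combined, so the plain sum used here is an assumption, as are the integer scores and the function name.

```python
from itertools import product


def rank_patterns(candidates_per_token):
    """candidates_per_token: one list per token of (kind, uri, score) tuples,
    where kind is "E", "P", or "C". Returns (score, pattern, uris) tuples
    sorted from the highest aggregate score to the lowest."""
    patterns = []
    for combo in product(*candidates_per_token):
        pattern = "".join(kind for kind, _, _ in combo)
        score = sum(s for _, _, s in combo)  # assumed aggregation: plain sum
        patterns.append((score, pattern, [uri for _, uri, _ in combo]))
    patterns.sort(key=lambda p: p[0], reverse=True)
    return patterns


# "버락오바마" has one entity candidate; "아버지" has a property and a class candidate.
candidates_per_token = [
    [("E", "adr:0000148525", 90)],
    [("P", "adp:father", 80), ("C", "adc:fatherfigure_09435082", 30)],
]
ranked = rank_patterns(candidates_per_token)
```

In this toy input the "EP" reading outranks the "EC" reading, so the property interpretation of "아버지" would be tried first.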

Referring to FIG. 7, as an example of the at least one question pattern D130 of FIG. 2, each of a plurality of question patterns D130' may consist of a token list, a score, and a pattern. As shown in FIG. 7, different scores and different patterns may correspond to the same token list, and the tokens corresponding to postpositional particles in the token list may be excluded from the question pattern. For example, as shown in FIG. 7, the question pattern "EPPPC" having the highest score may mean that the tokens "버락오바마", "아버지", "태어난", "나라", and "수도" correspond to an entity (E), a property (P), a property (P), a property (P), and a class (C), respectively. Although not shown in FIG. 7, the ontology components included in a question pattern may include the URI corresponding to the respective token. For example, in the pattern "EPPPC", the entity "E" may include the URI of "버락오바마" (e.g., adr:0000148525), and the class "C" may include the URI of "수도".
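How a pattern string such as "EPPPC" arises from a mapped token list can be sketched as below. The tags "E", "P", and "C" follow the description; the tags "J" for particles and "PUNCT" for punctuation are borrowed from the analysis log shown later, and the function name is hypothetical.

```python
ONTOLOGY_KINDS = {"E", "P", "C"}  # entity, property, class


def pattern_of(mapped_tokens):
    """Concatenate the ontology kinds in token order, skipping particles
    and punctuation, which do not take part in the question pattern."""
    return "".join(kind for _, kind in mapped_tokens if kind in ONTOLOGY_KINDS)


mapped_tokens = [
    ("버락오바마", "E"), ("의", "J"), ("아버지", "P"), ("가", "J"),
    ("태어난", "P"), ("나라", "P"), ("의", "J"), ("수도", "C"),
    ("는", "J"), ("?", "PUNCT"),
]
pattern = pattern_of(mapped_tokens)
```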

FIG. 8 is a diagram showing an example of the pattern template data included in the pattern template database 400 of FIG. 1 according to an exemplary embodiment of the present disclosure. As described above with reference to FIG. 1, the query generation unit 150 of FIG. 1 may generate at least one query from the question pattern received from the question pattern analysis unit 130, based on a pattern template stored in the pattern template database 400. FIG. 8 shows, by way of example, pattern template data D410 including templates for generating SPARQL queries. FIG. 8 is described below with reference to FIG. 1.

Referring to FIG. 8, the pattern template data D410 may include a plurality of patterns and the templates corresponding to them. As shown in FIG. 8, in the template corresponding to the "EP" pattern (or question pattern), the URI of the entity may be inserted at "%E%", and the URI of the property (or the name defined in the ontology schema) may be inserted at "%P%". For example, when an "EP" pattern consisting of the entity "버락오바마" and the property "아버지" is generated from the natural language question "버락오바마의 아버지는?" ("Who is Barack Obama's father?"), the URI of the entity "버락오바마" is "adr:0000148525", and the URI of the property is "adp:father", the query generation unit 150 may generate a query defined as "SELECT{ adr:0000148525 adp:father ?X. }" with reference to the pattern template data D410 of FIG. 8. The generated query may be provided to the knowledge base 500 by the query execution unit 170, and the knowledge base 500 may, in response to the query, extract the value corresponding to "X" from the knowledge data and provide it.
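The placeholder substitution can be sketched as follows. The template text for "EP" is inferred from the generated query quoted above rather than copied from FIG. 8, so it and the function name should be read as assumptions.

```python
# Assumed template for the "EP" pattern, inferred from the example query.
TEMPLATES = {"EP": "SELECT{ %E% %P% ?X. }"}


def fill_template(pattern, uris, templates=TEMPLATES):
    """Insert each URI at the next placeholder of its kind (%E%, %P%, %C%),
    in the order the kinds appear in the pattern."""
    query = templates[pattern]
    for kind, uri in zip(pattern, uris):
        query = query.replace(f"%{kind}%", uri, 1)
    return query


query = fill_template("EP", ["adr:0000148525", "adp:father"])
```

Replacing only the first remaining occurrence per step lets a pattern with repeated kinds, such as several "P" entries, fill its placeholders in order.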

An example of processing the natural language question "버락오바마의 아버지가 태어난 나라의 수도는?" ("What is the capital of the country where Barack Obama's father was born?") according to the exemplary embodiments of the present invention described above with reference to the drawings is given below. The result of analyzing the natural language question by the natural language recognition system 200 may be illustrated as follows.

[
  {
    "nlp": "버락오바마",
    "log": " E{(adr:0000148525) L[버락 오바마] T[[adc:person_00006026 , adc:entity_00001740 , adc:causal_agent_00005598 ,...]]}
      E{(adr:0030786216) L[버락 오바마] T[[adc:ability_05295659 , adc:cognition_00020729 , adc:psychological_feature_00020333 ,...]]}
      E{(adr:0000126163) L[버락 오바마] T[[]]} "
  },
  {
    "nlp": "의",
    "log": " L{(J) L[의] T[]} "
  },
  {
    "nlp": "아버지",
    "log": " P{(adp:father) L[아버지] T[]} 아버지 C{(adc:fatherfigure_09435082) L[fatherfigure] T[]}
      E{(adr:0031245687) L[아버지] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 , adc:movie_06205452 ,...]]}
      E{(adr:0031545743) L[아버지] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 , adc:communication_00028764 ,...]]}
      E{(adr:0031108925) L[아버지] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 , adc:communication_00028764 ,...]]}
      E{(adr:0031334934) L[아버지] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 , adc:movie_06205452 ,...]]}
      E{(adr:0012072422) L[아버지] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 , adc:movie_06205452 ,...]]}
      E{(adr:0000210970) L[아버지] T[[adc:person_00006026 , adc:entity_00001740 , adc:causal_agent_00005598 , adc:organism_00003226 ,...]]}
      E{(adr:0030567899) L[아버지] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 ,...]]}
      E{(adr:0000499795) L[박명수] T[[adc:person_00006026 , adc:institution_07563940 , adc:entity_00001740 , adc:group_00026769 ,...]]}
      E{(adr:0000112273) L[테오도어 아이케] T[[adc:person_00006026 , adc:entity_00001740 , adc:causal_agent_00005598 ,...]]}
      E{(adr:0000424316) L[박동수] T[[adc:person_00006026 , adc:entity_00001740 , adc:causal_agent_00005598 ,...]]}
      W{(WIKI) L[아버지] T[[]]} 아버지 U{(U_NE) L[AF] T[]} "
  },
  {
    "nlp": "가",
    "log": " L{(J) L[가] T[]} "
  },
  {
    "nlp": "태어난",
    "log": " P{(adp:bornIn) L[태어나다] T[]} P{(adp:bornOn) L[태어나다] T[]} L{(V_태어나다) L[undefined] T[]} "
  },
  {
    "nlp": "나라",
    "log": " P{(adp:nation) L[나라] T[]} 나라 C{(adc:state_07673557) L[body politic] T[]}
      E{(adr:0000347468) L[나라] T[[adc:person_00006026 , adc:entity_00001740 , adc:actor_09145973 , adc:performer_09740423 ,...]]}
      E{(adr:0000501329) L[나라] T[[adc:person_00006026 , adc:entity_00001740 , adc:causal_agent_00005598 ,...]]}
      E{(adr:0000478478) L[나라 시] T[[adc:city_08005407 , adc:geographical_area_08050136 , adc:state_capital_08163309 ,...]]}
      E{(adr:0000247685) L[국가] T[[adc:cognition_00020729 , adc:abstraction_00020486 , adc:content_05473476 ,...]]}
      E{(adr:0000620719) L[나라 역] T[[adc:entity_00001740 , adc:sp_GeoEntity , adc:facility_03194800 , adc:structure_04174544 ,...]]}
      E{(adr:0000078432) L[백나라] T[[adc:person_00006026 , adc:entity_00001740 , adc:causal_agent_00005598 ,...]]}
      W{(WIKI) L[나라] T[[]]} 나라 U{(U_NE) L[CH] T[]} "
  },
  {
    "nlp": "의",
    "log": " L{(J) L[의] T[]} "
  },
  {
    "nlp": "수도",
    "log": " P{(adp:capital) L[수도] T[]} P{(adp:capitalOf) L[수도] T[]} 수도
      C{(adc:national_capital_08159919) L[national capital] T[]} 수도 C{(adc:plumbing_fixture_03819002) L[plumbing fixture] T[]}
      E{(adr:0000340502) L[수도] T[[adc:cognition_00020729 , adc:content_05473476 , adc:concept_05498421 ,...]]}
      E{(adr:0000510352) L[수도] T[[adc:abstraction_00020486 , adc:term_05916288]]}
      E{(adr:0000652025) L[수도] T[[adc:entity_00001740 , adc:sp_GeoEntity , adc:land_08750552 , adc:object_00016236 , adc:island_08734830]]}
      E{(adr:0000048066) L[수도] T[[]]}
      E{(adr:0000610209) L[수도] T[[adc:entity_00001740 , adc:sp_GeoEntity , adc:land_08750552 , adc:object_00016236 , adc:island_08734830]]}
      E{(adr:0031955789) L[수도] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 ,...]]}
      E{(adr:0000387233) L[수도] T[[adc:entity_00001740 , adc:sp_GeoEntity , adc:land_08750552 , adc:object_00016236 , adc:island_08734830]]}
      E{(adr:0031979581) L[수도] T[[adc:abstraction_00020486 , adc:event_00025950 , adc:social_event_06841042 ,...]]}
      E{(adr:0000531168) L[삼도] T[[adc:abstraction_00020486 , adc:term_05916288]]} W{(WIKI) L[수도] T[[]]} "
  },
  {
    "nlp": "는",
    "log": " L{(J) L[는] T[]} "
  },
  {
    "nlp": "?",
    "log": " L{(PUNCT) L[?] T[]} "
  }
]
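The candidate entries in each "log" string follow a regular shape, a one-letter kind, an identifier in parentheses, a label after `L`, and a type list after `T`. A small parser for that shape might look like the sketch below; the regular expression and the dictionary keys are assumptions based only on the listing above, not a documented log format.

```python
import re

# kind{(identifier) L[label] T[types]}
ENTRY = re.compile(r"([A-Z])\{\(([^)]*)\)\s+L\[([^\]]*)\]\s+T\[(.*?)\]\}")


def parse_log(log):
    """Return one dict per candidate entry found in a "log" string."""
    return [
        {"kind": kind, "id": ident, "label": label, "types": types}
        for kind, ident, label, types in ENTRY.findall(log)
    ]


born = parse_log(
    " P{(adp:bornIn) L[태어나다] T[]} P{(adp:bornOn) L[태어나다] T[]} "
    "L{(V_태어나다) L[undefined] T[]} "
)
obama = parse_log(" E{(adr:0000126163) L[버락 오바마] T[[]]} ")
```

Grouping the parsed entries by `kind` yields, per token, the entity, property, and class candidates that the evaluation units would then score.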

From the above token analysis result, a query may be generated by the question pattern analysis unit 130 and the query generation unit 150, for example the following SPARQL query.

SELECT DISTINCT ?X
{
  adr:0000148525 adp:father ?V1 .
  ?V1 adp:bornIn ?b .
  ?b a adc:state_07673557 .
  ?b adp:capital ?X .
}
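To show how the four triple patterns of this query chain together, the sketch below evaluates them against a tiny in-memory triple set with a naive matcher. The `ex:` identifiers and the triples themselves are hypothetical stand-ins for the knowledge data of the knowledge base 500, and the matcher is an illustration rather than a SPARQL engine.

```python
# Hypothetical knowledge data; only the adr:/adp:/adc: terms come from the query.
TRIPLES = {
    ("adr:0000148525", "adp:father", "ex:father"),
    ("ex:father", "adp:bornIn", "ex:kenya"),
    ("ex:kenya", "a", "adc:state_07673557"),
    ("ex:kenya", "adp:capital", "ex:nairobi"),
}

PATTERNS = [
    ("adr:0000148525", "adp:father", "?V1"),
    ("?V1", "adp:bornIn", "?b"),
    ("?b", "a", "adc:state_07673557"),
    ("?b", "adp:capital", "?X"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns in order."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    head, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(head, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)


answers = {b["?X"] for b in match(PATTERNS, TRIPLES)}
```

Each pattern narrows the bindings produced by the previous one, which mirrors how the variables `?V1` and `?b` link the father, the country, and the capital in the query.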

The answer generation unit 190 may receive the response of the knowledge base 500 to the above query (i.e., "나이로비" (Nairobi) as the value corresponding to X), and may generate the following natural language answer according to the natural language question and the received response.

"버락오바마의 아버지가 태어난 나라의 수도는 나이로비입니다." ("The capital of the country where Barack Obama's father was born is Nairobi.")

FIG. 9 is a flowchart showing a method of generating a query to a knowledge base according to an exemplary embodiment of the present disclosure. In some embodiments, the method of FIG. 9 may be performed by the question answering system 100 of FIG. 1, and FIG. 9 is described below with reference to FIG. 1.

In step S20, an operation of obtaining tokens extracted from a natural language question may be performed. For example, the question pattern analysis unit 130 of the question answering system 100 may receive at least one token list from the natural language recognition system 200 and may obtain the tokens included in the token list.

In step S40, an operation of generating a question pattern by mapping the tokens to ontology components with reference to the meta-knowledge data D300 may be performed. For example, the question pattern analysis unit 130 of the question answering system 100 may refer to the meta-knowledge data D300 stored in the meta-knowledge database 300 and may generate at least one question pattern by mapping the tokens of the token list to ontology components such as an entity (E), a property (P), and a class (C). The question pattern analysis unit 130 may calculate a score for the question pattern corresponding to the token list based on the meta-knowledge data D300, and the score for the question pattern may be calculated from the scores of the individual ontology components corresponding to the tokens included in the token list. The question pattern analysis unit 130 may calculate the score for the question pattern by aggregating the scores of the individual ontology components, and may sort a plurality of question patterns according to their scores. An example of step S40 is described later with reference to FIG. 10.

In step S60, an operation of generating a query from the question pattern with reference to the question pattern template D400 may be performed. For example, the query generation unit 150 of the question answering system 100 may refer to the question pattern template D400 stored in the pattern template database 400 and may generate a query by inserting the URIs corresponding to the tokens into the template corresponding to the question pattern.

FIG. 10 is a flowchart illustrating an example of step S40 of FIG. 9 according to an exemplary embodiment of the present disclosure. As described above with reference to FIG. 9, in step S40', as in step S40 of FIG. 9, an operation of generating a question pattern by mapping the tokens to ontology components with reference to the meta-knowledge data D300' may be performed. As shown in FIG. 10, step S40' may include a plurality of steps S42, S44, S46, and S48, and the meta-knowledge data D300' may include entity meta information D310', attribute meta information D330', and class meta information D350'. Although the three steps S42, S44, and S46 are shown in FIG. 10 as being performed sequentially, in some embodiments they may be performed in parallel.

In step S42, an operation of calculating the score of an entity with reference to the entity meta information D310' may be performed. For example, the score for an entity corresponding to a token included in the token list may be calculated based on the importance of the entity, or may be calculated based on the relationship between the entity and an attribute or a class included in the token list.
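The entity scoring of step S42 might look like the following sketch. The `EntityMeta` fields, the additive combination, and the `bonus` weight are all assumptions chosen for illustration; the disclosure only states that importance and the relationship to attributes or classes in the token list may be used.

```python
from dataclasses import dataclass, field


@dataclass
class EntityMeta:
    """Hypothetical shape of one record of the entity meta information D310'."""
    importance: float
    attributes: set = field(default_factory=set)  # attributes the entity has
    classes: set = field(default_factory=set)     # classes the entity belongs to


def entity_score(meta, token_attributes, token_classes, bonus=0.5):
    """Start from importance and add a bonus per related attribute or class
    that also appears among the tokens (an assumed weighting)."""
    related = len(meta.attributes & set(token_attributes))
    related += len(meta.classes & set(token_classes))
    return meta.importance + bonus * related
```

With this weighting, an entity whose own attribute is also mentioned in the question scores higher than the same entity considered in isolation.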

In step S44, an operation of calculating the score of an attribute with reference to the attribute meta information D330' may be performed. For example, the score for an attribute corresponding to a token included in the token list may be calculated based on the hierarchy information of the attribute such that a lower-level attribute has a higher score, or may be calculated based on the domain and range information of the attribute.
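A minimal sketch of the attribute scoring of step S44 follows, under stated assumptions: hierarchy level is represented by an integer `depth`, the depth itself is used as the base score, and a fixed `bonus` is added when a class among the tokens matches the attribute's domain or range. None of these specifics come from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class AttributeMeta:
    """Hypothetical shape of one record of the attribute meta information D330'."""
    depth: int          # 0 for a top-level attribute, larger for sub-attributes
    domain_class: str   # class of the subjects the attribute applies to
    range_class: str    # class of the values the attribute takes


def attribute_score(meta, token_classes, bonus=1.0):
    """Deeper (more specific) attributes score higher; a class token matching
    the domain or range adds an assumed fixed bonus."""
    score = float(meta.depth)
    if meta.domain_class in token_classes or meta.range_class in token_classes:
        score += bonus
    return score
```

The effect is that, between a general attribute and one of its sub-attributes matching the same token, the sub-attribute is preferred.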

In step S46, an operation of calculating the score of a class with reference to the class meta information D350' may be performed. For example, the score for a class corresponding to a token included in the token list may be calculated based on the hierarchy information of the class such that a lower-level class has a higher score, or may be calculated based on the attributes having the class as their domain and the corresponding range information.
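The class scoring of step S46 can be sketched in the same spirit. The `PARENT` and `DOMAIN_OF` tables are hypothetical stand-ins for the class meta information D350', depth in the hierarchy is used directly as the base score, and the range information mentioned above is omitted for brevity.

```python
# Hypothetical class hierarchy: class -> parent class (None at the root).
PARENT = {"ex:City": "ex:Place", "ex:Place": "ex:Thing", "ex:Thing": None}

# Hypothetical index: class -> attributes that have the class as their domain.
DOMAIN_OF = {"ex:City": {"ex:population"}}


def class_depth(cls):
    """Count the steps from the class up to the root of the hierarchy."""
    depth = 0
    while PARENT.get(cls) is not None:
        cls = PARENT[cls]
        depth += 1
    return depth


def class_score(cls, token_attributes, bonus=1.0):
    """Lower-level classes score higher; an attribute token whose domain is
    the class adds an assumed fixed bonus."""
    score = float(class_depth(cls))
    if any(a in DOMAIN_OF.get(cls, set()) for a in token_attributes):
        score += bonus
    return score
```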

In step S48, an operation of calculating the score of the question pattern may be performed. For example, the score of the question pattern may be calculated based on the scores calculated in the three steps S42, S44, and S46. Accordingly, the question pattern evaluated as best suited to the natural language question may have the highest score. A plurality of question patterns may be generated for a plurality of token lists, and a plurality of question patterns may also be generated for the same token list.
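Step S48 combines the component scores into one score per pattern. The sketch below uses the arithmetic mean, which is an assumption; the disclosure does not fix a particular aggregation function.

```python
def pattern_score(component_scores):
    """Aggregate component scores into one pattern score (mean, by assumption)."""
    if not component_scores:
        return 0.0
    return sum(component_scores) / len(component_scores)


def rank_patterns(candidates):
    """Sort (pattern, component scores) pairs from best to worst."""
    return sorted(candidates, key=lambda c: pattern_score(c[1]), reverse=True)
```

Using a mean rather than a sum keeps patterns built from token lists of different lengths comparable, which matters when several token lists are produced for one question.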

FIG. 11 is a block diagram illustrating a meta-knowledge building system 700 and its input/output relationships according to an exemplary embodiment of the present disclosure. As shown in FIG. 11, the meta-knowledge building system 700 may be communicatively connected to the meta-knowledge database 300 and the knowledge base 500. As described above with reference to FIG. 1, the knowledge base 500 may include knowledge data, and the meta-knowledge database 300 may include meta-knowledge data.

The knowledge base building system 600 may update the knowledge base 500. For example, the knowledge base building system 600 may be controlled by expert users, and may add new knowledge data to the knowledge base 500 or modify knowledge data stored in the knowledge base 500. In addition, the knowledge base building system 600 may be connected to a network such as the Internet, and may update the knowledge base 500 based on information collected through the network. As described above with reference to FIG. 2, the meta-knowledge database 300 may include data used in building the knowledge base 500 (e.g., D330 and D350 of FIG. 2), and the knowledge base building system 600 may update the knowledge base 500 with reference to this data. For example, the knowledge base building system 600 may generate the triples constituting the knowledge data based on the hierarchy information of attributes and/or the hierarchy information of classes included in the meta-knowledge database 300.

The meta-knowledge building system 700 may update the meta-knowledge database 300 based on the knowledge data stored in the knowledge base 500. For example, when the knowledge base 500 is updated by the knowledge base building system 600, the meta-knowledge building system 700 may generate meta-knowledge data from the added or modified knowledge data, and, based on the generated meta-knowledge data, may add new meta-knowledge data to the meta-knowledge database 300 or modify the meta-knowledge data stored in the meta-knowledge database 300. Accordingly, the meta-knowledge database 300 may reflect the updated knowledge base 500, and, because it is used in processing natural language questions, appropriate question patterns may be generated by the question answering system (e.g., 100 of FIG. 1).
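How meta-knowledge data might be derived from knowledge data can be sketched as follows. The record layout, the `rdf:type` convention for class membership, and the definition of the link count as the number of non-type triples with the entity as subject are all assumptions made for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # assumed predicate marking class membership


def build_entity_meta(triples):
    """Derive per-entity meta information from (subject, predicate, object) triples."""
    meta = defaultdict(
        lambda: {"link_count": 0, "attributes": set(), "classes": set()}
    )
    for subject, predicate, obj in triples:
        entry = meta[subject]
        if predicate == RDF_TYPE:
            entry["classes"].add(obj)
        else:
            entry["attributes"].add(predicate)
            entry["link_count"] += 1
    return dict(meta)
```

Rerunning such a derivation over added or modified triples is one simple way the meta-knowledge database could be kept in step with the knowledge base.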

As described above, exemplary embodiments have been disclosed in the drawings and the specification. Although the embodiments have been described herein using specific terms, these terms are used only to explain the technical idea of the present invention and are not used to limit its meaning or to restrict the scope of the invention set forth in the claims. Therefore, those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical idea of the appended claims.

Claims

A question answering system comprising:
a question pattern analysis unit configured to generate at least one question pattern, sorted according to a ranking, by mapping tokens extracted from a natural language question to ontology components based on meta-knowledge data; and
a query generation unit configured to generate at least one query for a knowledge base from the at least one question pattern based on a question pattern template,
wherein the meta-knowledge data includes meta information of the ontology components in the knowledge base,
wherein the meta-knowledge data includes meta information of attributes, the meta information including at least one of hierarchy information, domain information, and range information of the attributes, and
wherein the question pattern analysis unit includes an attribute evaluation unit configured to calculate a score of at least one attribute corresponding to a token based on the meta information of the at least one attribute.

The question answering system according to claim 1,
wherein the meta-knowledge data includes meta information of entities, the meta information including at least one of an importance, a link count, an attribute list, and a class list of an entity, and
wherein the question pattern analysis unit includes an entity evaluation unit configured to calculate a score of at least one entity corresponding to a token based on the meta information of the at least one entity.

The question answering system according to claim 2,
wherein the entity evaluation unit calculates the score of the at least one entity so as to be proportional to at least one of the importance and the link count.

The question answering system according to claim 2,
wherein the entity evaluation unit calculates the score of the at least one entity such that the score increases when the extracted tokens include a token corresponding to an attribute included in the attribute list or to a class included in the class list.

(Deleted)

The question answering system according to claim 1,
wherein the attribute evaluation unit calculates the score of the at least one attribute, based on the hierarchy information of the attributes, such that a higher-level attribute has a lower score.

The question answering system according to claim 1,
wherein the attribute evaluation unit calculates the score of the at least one attribute, based on at least one of the domain information and the range information of the attributes, such that the score increases when the extracted tokens include a token corresponding to a class of the domain or range associated with the at least one attribute.

The question answering system according to claim 1,
wherein the meta-knowledge data includes meta information of classes, the meta information including hierarchy information of the classes, and
wherein the question pattern analysis unit includes a class evaluation unit configured to calculate a score of at least one class corresponding to a token based on the meta information of the at least one class.

The question answering system according to claim 8,
wherein the class evaluation unit calculates the score of the at least one class, based on the hierarchy information of the classes, such that a higher-level class has a lower score.

The question answering system according to claim 1,
wherein the question pattern analysis unit includes a question pattern ranking unit configured to determine the ranking of the at least one question pattern based on the scores of the ontology components included in the at least one question pattern.