KR102401333B1

KR102401333B1 - System and Method for Robust and Scalable Dialogue

Info

Publication number: KR102401333B1
Application number: KR1020210090211A
Authority: KR
Inventors: 여진영; 이진식; 김태윤; 전희원
Original assignee: 에스케이텔레콤 주식회사
Priority date: 2019-09-19
Filing date: 2021-07-09
Publication date: 2022-05-23
Also published as: KR20210033782A; KR102282695B1; KR20210089626A

Abstract

A dialogue system and method for use in task-oriented dialogue systems are disclosed.
The present embodiment aims to provide a scalable and robust dialogue system and method for use in task-oriented dialogue systems. Scalability of the dialogue system is increased by applying question intent retrieval, question pattern estimation, and conditional answering to user queries on the basis of an FAQ knowledge base (KB). Robustness of the learning model is increased by applying a training process that uses weakly labeled question pairs.

Description

System and Method for Robust and Scalable Dialogue

The present invention relates to a scalable and robust dialogue system and method for use in task-oriented dialogue systems.

The content described below merely provides background information related to the present invention and does not constitute prior art.

Interest is growing in task-oriented, or target domain-specific, dialogue systems that interact with a user so that the user can achieve a goal such as booking a restaurant or a flight. One of the main aims of a task-oriented dialogue system is to provide an answer that fits the user's query intent.

One way to provide a suitable answer is to incorporate Frequently Asked Question (FAQ) answering into the task-oriented dialogue system. A direct way to do so is to map a new user request arising in the dialogue system to an existing question that already has a corresponding answer. This type of problem is closest to paraphrase identification (PI); a PI model addresses the problem of deciding whether two sentences are equivalent in context.

Recently, the use of BERT (Bidirectional Encoder Representations from Transformers; see Non-Patent Document 1) has brought considerable progress in the accuracy of solutions to the PI problem.

However, for a task-oriented dialogue system to provide answers that match the user's intent, two issues must be resolved: scalability and robustness. Here, scalability means that the dialogue system accommodates as many user questions as possible, and robustness means the ability to cope efficiently with errors arising from the training process of the PI model. Accordingly, there is a need for a scalable and robust dialogue system and method, applicable to task-oriented dialogue systems, that is built on appropriate solutions to both issues.

Non-Patent Document 1: Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT.
Non-Patent Document 2: Robertson, S.; Zaragoza, H.; et al. 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval.

The main purpose of the present disclosure is to provide a scalable and robust dialogue system and method for use in task-oriented dialogue systems. It does so by increasing the scalability of the dialogue system through question intent retrieval, question pattern estimation, and conditional answering applied to user queries on the basis of an FAQ knowledge base (KB), and by increasing the robustness of the learning model through a training process that uses weakly labeled question pairs.

According to an embodiment of the present invention, there is provided a dialogue system comprising: a Question Intent Retriever (QIR) that searches a knowledge base, which stores answer data for user queries as data composed of question intents, question patterns, and conditional answers, selects a plurality of question pattern groups related to a user query, and provides the plurality of question pattern groups together with a plurality of question intents linked to those groups; a Question Pattern Reader (QPR) that selects a Top-1 question intent using the user query, the plurality of question intents, and the plurality of question pattern groups; and a Conditional Answerer (CA) that selects an answer related to the Top-1 question intent using the conditional answers in the knowledge base.

According to another embodiment of the present invention, there is provided a learning method performed by a computing device, the method comprising: task fine-tuning a paraphrase identification (PI) model and a refinery using a dataset that includes question pairs labeled as similar or not similar, where the task is to identify whether an input question pair is similar; and domain fine-tuning the PI model and the refinery using a domain-specific database that includes positive question pairs with weak positive labels and negative question pairs with negative labels, wherein the domain fine-tuning includes an iterative training process for the PI model and a refinement process for the refinery.

According to another embodiment of the present invention, there is provided a learning apparatus used by a dialogue system, the apparatus comprising: a paraphrase identification (PI) model that generates an identification result indicating whether an input question pair is similar; and a refinery that estimates a confidence score for the input question pair. The PI model and the refinery are task fine-tuned using a dataset that includes question pairs labeled as similar or not similar, and are then domain fine-tuned using a domain-specific database that includes positive question pairs with weak positive labels and negative question pairs with negative labels. In the domain fine-tuning, iterative training of the PI model is performed using the confidence score, and refinement of the refinery is performed using the identification result.

According to another embodiment of the present invention, there is provided a computer-implemented dialogue method of a dialogue system, the method comprising: searching a knowledge base, which stores answer data for user queries as data composed of question intents, question patterns, and conditional answers, to select a plurality of question pattern groups related to a user query, and providing the plurality of question pattern groups together with a plurality of question intents linked to those groups; inputting the user query and the plurality of question pattern groups into a pre-trained paraphrase identification (PI) model to obtain identification results indicating whether each pair is a similar question, and, using the identification results, selecting as the Top-1 question intent the question intent, among the plurality of question intents, that indexes the question pattern group containing the highest proportion of similar questions; and selecting an answer related to the Top-1 question intent using the conditional answers in the knowledge base.

According to another embodiment of the present invention, there is provided a computer program stored in a computer-readable recording medium for executing each step of the learning method performed by the computing device.

According to another embodiment of the present invention, there is provided a computer program stored in a computer-readable recording medium for executing each step of the dialogue method of the dialogue system.

As described above, the present embodiment provides a dialogue system and method for use in task-oriented dialogue systems, in which question intent retrieval, question pattern estimation, and conditional answering based on an FAQ knowledge base (KB) are applied to user queries, and in which a training process using weakly labeled question pairs is applied. This has the effect of increasing both the scalability and the robustness of the dialogue system.

FIG. 1 is a block diagram of an FAQ answerer and a learning model according to an embodiment of the present invention.
FIG. 2 is a conceptual diagram illustrating the operation of the components of the FAQ answerer according to an embodiment of the present invention.
FIG. 3 is a conceptual diagram illustrating the training process of the learning model according to an embodiment of the present invention.
FIG. 4 is a block diagram of the learning model, including the PI model and the refinery, according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating the procedure by which the FAQ answerer answers a user query according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying exemplary drawings. In assigning reference numerals to the components of each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the embodiments, detailed descriptions of related well-known configurations or functions are omitted where they might obscure the gist of the embodiments.

In describing the components of the embodiments, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another; they do not limit the nature, sequence, or order of the components. Throughout the specification, when a part is said to "include" or "comprise" a component, this means that it may further include other components rather than excluding them, unless stated otherwise. Terms such as "unit" and "module" used in the specification denote a unit that processes at least one function or operation, and such a unit may be implemented in hardware, in software, or in a combination of the two.

The detailed description set forth below in conjunction with the accompanying drawings is intended to describe exemplary embodiments of the present invention and is not intended to represent the only embodiments in which the invention may be practiced.

A task-oriented dialogue system may include a belief state tracker (also called a dialogue state tracker) and an answerer that responds to user queries. The present invention concerns the implementation of the answerer in particular, and focuses on a dialogue system that includes a Frequently Asked Question (FAQ) answerer for responding to user queries.

FIG. 1 is a block diagram of an FAQ answerer and a learning model according to an embodiment of the present invention.

FIG. 1 shows an FAQ answerer 100 used in a task-oriented dialogue system. It includes an FAQ knowledge base (KB) 110, a Question Intent Retriever (QIR) 111, a Question Pattern Reader (QPR) 112, and a Conditional Answerer (CA) 113. The QPR 112 in turn includes a paraphrase identification (PI) model 114 and a Question Intent Selector (QIS) 115.

The components of the FAQ answerer 100 according to the present embodiment are not limited to those listed. For example, the FAQ answerer may additionally include a learning model and a training apparatus for training it, or may be implemented so as to work with an external training apparatus. The learning model 120 shown in FIG. 1, for instance, is trained by a training apparatus. The learning model 120 includes the PI model 114 and a refinery 121; the PI model 114 resides in the FAQ answerer 100, whereas the refinery 121 resides outside it. The refinery 121 is therefore used only during training and plays no part when the FAQ answerer 100 is operating.

FIG. 2 is a conceptual diagram illustrating the operation of the components of the FAQ answerer according to an embodiment of the present invention.

The form and operation of the components of the FAQ answerer 100 are described below with reference to FIGS. 1 and 2.

The FAQ KB 110 stores the data used to answer user queries as triples. Each triple has the form (question intent, question pattern, conditional answer), where the conditional answer carries a tag of the form 'answer condition|corresponding answer'.
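The triple layout can be sketched as a small in-memory structure. This is only an illustration: the class names, the field names, and the four-column row format (intent, pattern, 'condition|value' tag, answer text) are assumptions made here, not the storage format of the FAQ KB 110.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConditionalAnswer:
    condition_slot: str   # answer condition, e.g. "phone_name" (hypothetical field name)
    condition_value: str  # value the condition must take, e.g. "HH phone"
    text: str             # answer text returned when the condition holds


@dataclass
class FaqEntry:
    intent: str                                        # question intent: index of the group
    patterns: List[str] = field(default_factory=list)  # question pattern group
    answers: List[ConditionalAnswer] = field(default_factory=list)


def build_kb(rows) -> Dict[str, FaqEntry]:
    """Group (intent, pattern, 'condition|value', answer text) rows by intent."""
    kb: Dict[str, FaqEntry] = {}
    for intent, pattern, tag, text in rows:
        entry = kb.setdefault(intent, FaqEntry(intent))
        if pattern not in entry.patterns:
            entry.patterns.append(pattern)
        slot, value = tag.split("|", 1)
        answer = ConditionalAnswer(slot, value, text)
        if answer not in entry.answers:  # dataclasses compare field by field
            entry.answers.append(answer)
    return kb
```

Keying the dictionary by question intent mirrors the role of the intent as an index over a group of similar question patterns.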

A question intent is an index over a group of similar question patterns. It serves as an identifier (ID) for the intent behind a user's question and is used to make answer retrieval efficient. As long as it can serve as an index for a question pattern group, a question intent may be expressed either as a string such as "[phone_belief_description]" or as a plain number.

Question patterns are question sentences collected from dialogue logs, either sentences that users actually used or sentences that users are expected to use. Each question pattern belongs to a question pattern group classified by question intent, and the corresponding question intent acts as the index that links the group together. For example, the natural-language sentence "Could you tell me about the GG Note?" and the keyword combination "HH phone product description" belong to the same group and are linked to the related question intent [phone_belief_description].

The PI model 114 compares the user query with the question patterns in the FAQ KB 110 in terms of context. For this reason, an original question such as "I'd like a quick overview of the GG Note" is converted into the question pattern "I'd like a quick overview of the $phone$" before it is stored. That is, multiple questions and question intents concerning descriptions of different phone products are merged into a single question intent and question pattern group, and information about a specific item such as "GG Note" or "HH phone" is replaced by a placeholder as illustrated above. Handling of the specific item is thereby deferred to the conditional answer stage.
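The conversion from an original question to a question pattern can be illustrated with a simple string substitution. This is a minimal sketch: the function name and the entity dictionary are hypothetical, and the text above does not specify how the patterns are actually produced.

```python
def to_question_pattern(question, entities):
    """Replace known entity mentions with $slot$ placeholders.

    `entities` maps a surface form (e.g. "GG Note") to a slot name (e.g. "phone").
    Returns the pattern and the slot values that were removed from the question.
    """
    pattern = question
    found = {}
    # Try longer names first so a short name does not clip a longer one.
    for name, slot in sorted(entities.items(), key=lambda kv: -len(kv[0])):
        if name in pattern:
            pattern = pattern.replace(name, f"${slot}$")
            found.setdefault(slot, name)
    return pattern, found
```

The removed value (here, the phone name) is kept separately because it is needed later as the answer condition when a conditional answer is chosen.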

A conditional answer is a set of answering conditions and corresponding answers linked to a question intent. For example, the question intent [phone_belief_description] is linked to entries that combine an answer text with an 'answer condition|corresponding answer' tag, such as "The HH phone was recently released … 'phone_name|HH phone'" or "The GG Note is the most popular … 'phone_name|GG Note'". Here the answer condition may be a slot type that can be supplied by the belief state tracker in the dialogue system. If the slot type is not available, the answer condition may be obtained from an external system through an API (Application Programming Interface) call, or from the user through a follow-up question. To support the latter case, a conditional answer may include a plurality of follow-up questions related to the question intent.

The FAQ KB 110 may be generated from a subset of the questions in a domain-specific database, such as the dialogue logs of a telecommunications service system. When the FAQ KB 110 is generated, the question intents and question patterns may be produced automatically or manually using known methods.

Question patterns are used in place of the original questions, and conditional answers are introduced, because it is not easy for the PI model 114 to perform well enough to identify the specific words that determine the answer. Even if it could, the model complexity required would raise several problems, including implementation cost, training time, and difficulty of real-time operation. The complexity of the PI model would also constrain the scalability of the FAQ answerer 100.

The structure of the FAQ KB 110 according to the present embodiment makes it possible to shorten the time needed to retrieve an answer to a user query and to improve retrieval accuracy. In addition, because the FAQ answerer 100 contains a component for each element of the FAQ KB 110 (question intents, question patterns, and conditional answers), it is scalable in the sense that it can accommodate as many user questions as possible in a given time.

Next, as shown in FIG. 2, the QIR 111 according to the present embodiment selects from the FAQ KB 110 a plurality of question intents related to the user query. The selection works much like document retrieval in a search engine. For retrieval, each question intent serves as a document identifier, and the question patterns and conditional answers linked to it are concatenated to form the text of the document. The QIR 111 searches the FAQ KB using the words of the user query as keywords. It ranks the question pattern groups by how many keyword-related words they contain, and selects the question intents that index the top-ranked groups. As shown in FIG. 2, the output of the QIR 111 is a plurality of question intents (denoted Top-X) together with the question pattern group linked to each of them. The number of question intents selected may be increased or decreased depending on factors such as the retrieval capability of the QIR 111 and the size of the FAQ KB 110.
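The retrieval step can be sketched with a plain term-overlap score. This is an assumption-laden stand-in: the text only says that groups are ranked by how many keyword-related words they contain, so the whitespace tokenizer, the scoring rule, and the function names below are illustrative choices. A production retriever would more likely use an established ranking function such as BM25.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for illustration)."""
    return text.lower().split()


def retrieve_intents(query, documents, top_x=3):
    """Return up to `top_x` question intents ranked by keyword overlap.

    `documents` maps each question intent (the document identifier) to the
    concatenated text of its question patterns and conditional answers.
    """
    query_terms = set(tokenize(query))
    scored = []
    for intent, text in documents.items():
        counts = Counter(tokenize(text))
        # Count every occurrence of a query keyword in the document text.
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, intent))
    # Highest score first; ties are broken by intent name for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [intent for _, intent in scored[:top_x]]
```

Only the intents returned here, with their question pattern groups, would be passed on for paraphrase identification, which keeps the number of PI model calls small.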

The QIR 111 filters the very large set of question patterns in the FAQ KB 110 down to those linked to the question intents that are closely related to the user query. The aim is to raise the probability of a correct answer by narrowing the search range; the narrower range also reduces the computing power required for inference by the PI model 114.

Next, as shown in FIG. 2, the QPR 112 according to the present embodiment takes the question intents and question pattern groups output by the QIR 111 and inputs the user query together with each question pattern into the PI model 114, which identifies whether the two are duplicate (similar) questions. To identify similar questions, the QPR 112 uses BERT (Bidirectional Encoder Representations from Transformers), a neural network that has shown strong results in natural language processing (NLP). For use in a natural language understanding (NLU) application such as the FAQ answerer 100, the PI model is prepared by training BERT in advance so that it can solve the paraphrase identification (PI) problem, that is, decide whether two questions are similar.

Next, the inference method of the QIS 115, which is based on the output of the PI model 114, is described. Let $q$ be the user query, let $P_I$ be the question pattern group linked to question intent $I$, and let $p \in P_I$ be a question pattern belonging to that group. The QIS 115 selects the Top-1 question intent $I^{*}$ according to Equation 1:

$$I^{*} = \underset{I}{\arg\max}\; \frac{1}{|P_I|} \sum_{p \in P_I} f_{\mathrm{PI}}(q, p) \qquad (1)$$

Here, $|P_I|$ is the size of the question pattern group linked to question intent $I$, and $f_{\mathrm{PI}}(q, p)$ is the classification result of the PI model 114 for the user query $q$ and the question pattern $p$: it is 1 if the pair is classified as a similar question pair and 0 otherwise.

In summary, using the identification results of the PI model 114, the QPR 112 selects as the Top-1 question intent the question intent, among the plurality of question intents, that indexes the question pattern group containing the highest proportion of similar questions, and passes it to the CA 113.
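The selection rule (highest proportion of similar questions per group) can be sketched as follows. The function names are hypothetical, and `toy_pi` is a word-overlap stand-in used only so that the example runs; it is not the BERT-based PI model 114.

```python
def select_top1_intent(query, candidate_groups, is_paraphrase):
    """Pick the intent whose pattern group has the highest share of paraphrases.

    `candidate_groups` maps each candidate question intent to its list of
    question patterns. `is_paraphrase(query, pattern)` must return 1 or 0 and
    stands in for the PI model's binary decision.
    """
    best_intent, best_ratio = None, -1.0
    for intent, patterns in candidate_groups.items():
        if not patterns:
            continue  # an empty group has no defined ratio
        hits = sum(is_paraphrase(query, p) for p in patterns)
        ratio = hits / len(patterns)
        if ratio > best_ratio:  # on ties the first group seen is kept
            best_intent, best_ratio = intent, ratio
    return best_intent, best_ratio


def toy_pi(query, pattern):
    """Stand-in classifier: 1 if Jaccard word overlap is at least 0.5."""
    q, p = set(query.lower().split()), set(pattern.lower().split())
    return int(len(q & p) / len(q | p) >= 0.5)
```

Dividing by the group size keeps large question pattern groups from winning simply because they contain more patterns.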

Next, as shown in FIG. 2, the CA 113 according to the present embodiment uses the Top-1 question intent selected by the QPR 112 and the answer conditions linked to it to select the correct answer from the list of answers related to that intent, and provides it to the user. As explained in the description of the FAQ KB 110, the answer condition may be supplied by the belief state tracker in the dialogue system, obtained from an external system through an API call, or obtained from the user through a follow-up question.
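The three ways of obtaining an answer condition can be sketched as a fallback chain. The function signature, the tuple layout of the conditional answers, the return values, and the default follow-up wording are all assumptions made for illustration; the text does not fix the order in which the sources are tried.

```python
def select_answer(intent, kb_answers, tracker_slots, external_lookup=None, follow_ups=None):
    """Choose the answer for `intent` whose condition matches a known slot value.

    `kb_answers` maps an intent to a list of (slot, value, text) conditional
    answers. `tracker_slots` holds slot values from the belief state tracker.
    `external_lookup(slot)` stands in for an API call to an external system.
    `follow_ups` maps an intent to a follow-up question for the user.
    """
    candidates = kb_answers.get(intent, [])
    # Visit each distinct answer-condition slot once, in first-seen order.
    for slot in dict.fromkeys(s for s, _, _ in candidates):
        value = tracker_slots.get(slot)
        if value is None and external_lookup is not None:
            value = external_lookup(slot)
        if value is None:
            # No condition available yet: ask the user rather than guess.
            question = (follow_ups or {}).get(intent, f"Which {slot} do you mean?")
            return ("follow_up", question)
        for s, v, text in candidates:
            if s == slot and v == value:
                return ("answer", text)
    return ("no_answer", None)
```

Returning a tagged tuple lets the caller tell a final answer apart from a follow-up question that still needs a reply from the user.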

As described above, the FAQ answerer 100 according to the present embodiment may include a neural-network-based learning model and may use it to train the PI model 114. This learning model may be a model trained in advance so that it can handle the PI problem, on the basis of a dataset labeled to indicate whether each question pair is similar and a weakly labeled domain-specific database.

FIG. 3 is a conceptual diagram illustrating the training process of the learning model according to an embodiment of the present invention.

The training process of the learning model 120 is described below with reference to FIG. 3. During training, the learning model 120 consists of the PI model 114 and the refinery 121. In the present embodiment both the PI model 114 and the refinery 121 are implemented with BERT, but the invention is not limited to this; they may be implemented with any neural network applicable to natural language processing. Representative alternatives include recurrent neural network (RNN) models such as the Long Short-Term Memory (LSTM) model and the Gated Recurrent Unit (GRU), as well as Transformer decoders.

First, as shown in FIG. 3, a training device (not shown) pre-trains the PI model 114 and the refiner 121 by applying unsupervised learning to a large unlabeled corpus (dialogue corpus). Wikipedia and BooksCorpus may be used as the large corpus, and hundreds of millions to billions of words extracted from it are used for pre-training.

Next, as shown in FIG. 3, the training device according to the embodiment performs task fine-tuning of the PI model 114 and the refiner 121 using a dataset of labeled question pairs, where each label indicates whether the two questions are similar. The task here is paraphrase identification, that is, deciding whether the question pair given as input is similar. Quora Question Pairs (QQP) is the best-known such dataset; it contains about 404K question pairs, each labeled as similar or not.

Task fine-tuning of the PI model 114 and the refiner 121 is described below. A question pair in the dataset is denoted x_i and its binary label y_i. The question pairs in the dataset may be translated into a target language (for example, Korean) by an NMT (Natural Language Translator) before use.

When the dataset contains N question pairs in total, the parameters of the PI model 114 and the refiner 121 are updated based on the cross-entropy loss function shown in Equation 2.

Here, C is the class number, which depends on the final output of the BERT implementing the PI model 114 or the refiner 121. The final stage of the PI model 114 is implemented as a softmax function and outputs both the probability that the input question pair is similar and the probability that it is dissimilar. The class number C is therefore 2, and the two binary labels reflect the nature of the question pair: one is 1 and the other is 0. The final stage of the refiner 121 is implemented as a sigmoid function and outputs the probability of the label attached to the input question pair. In that case the class number C is 1, and the single binary label takes the value 1 or 0 according to the nature of the question pair.

The loss function of Equation 2 uses cross-entropy as its metric, but the invention is not limited thereto; any metric usable for supervised learning may be used in the loss function.
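
For illustration, the cross-entropy case for the two output stages can be sketched as follows. Equation 2 is not reproduced in this text, so the averaging over the N question pairs, the choice of index 1 as the "similar" class, and all function names are assumptions of this sketch rather than the specification's definition.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def pi_cross_entropy(batch_logits, labels):
    """Mean cross-entropy for a two-class softmax output (C = 2).

    labels[i] is 1 for a similar pair and 0 for a dissimilar pair; index 1 of
    the softmax output is taken as the "similar" class (assumption).
    """
    total = 0.0
    for logits, y in zip(batch_logits, labels):
        total -= math.log(softmax(logits)[y])
    return total / len(labels)


def refiner_cross_entropy(batch_logits, labels):
    """Mean binary cross-entropy for a single sigmoid output (C = 1)."""
    total = 0.0
    for z, y in zip(batch_logits, labels):
        p = sigmoid(z)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

In practice a deep learning framework's built-in loss would normally be used; the plain-Python version only makes the two cases (C = 2 and C = 1) explicit.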

Next, as shown in FIG. 3, the training device performs domain fine-tuning of the PI model 114 and the refiner 121 using a weakly labeled domain-specific database. Domain fine-tuning includes iterative training of the PI model 114 and a refinement process for the refiner 121.

When domain-specific question pairs are given to the PI model 114 and the refiner 121 after task fine-tuning with QQP alone, accuracy drops sharply. For example, the questions "How can I start the roaming service?" and "Do I need to reboot my phone to use roaming?" do not match in their general meaning. In a domain such as telecommunication services, however, they may be treated as similar questions, and the same answer is provided: "Reboot your phone to start the roaming service." The PI model 114 and the refiner 121 are therefore domain fine-tuned to raise identification accuracy on domain-specific question pairs.

The process of generating question pairs from domain-specific conversation logs, such as those of a telecommunication service system, is described below. First, automatic duplicate question clustering is applied to the domain-specific conversation logs to generate clusters of similar questions (question clusters). All possible pairs of questions within the same cluster are generated as positive question pairs, and a positive label is attached to each. Because a positive label may not always be true, owing to clustering errors and other causes, the positive label may be a pseudo label or weak label, and the corresponding pair may be a pseudo-positive question pair. The data representing a positive question pair has the form {question 1, question 2, pseudo positive label}. Hereinafter, for convenience, "positive label" and "pseudo positive label" are used interchangeably, as are "positive question pair" and "pseudo-positive question pair".

Next, for each positive question pair, a corresponding negative question pair is generated, represented as data of the form {question 1, question 3, negative label}. Corresponding positive and negative question pairs contain the same question 1, and question 3 belongs to a different question cluster. Because the two questions of a negative pair are drawn from different question clusters, the negative label contains no error.
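
The pair generation described above can be sketched as follows, taking the question clusters as given. The specification does not state how question 3 is chosen from the other clusters, so the uniform random choice, the integer label encoding (1 for pseudo-positive, 0 for negative), and the name `build_question_pairs` are assumptions of this sketch.

```python
import itertools
import random


def build_question_pairs(clusters, seed=0):
    """Build (question_a, question_b, label) triples from question clusters.

    label 1 marks a pseudo-positive pair (same cluster); label 0 marks a
    negative pair (different clusters).
    """
    rng = random.Random(seed)
    pairs = []
    for idx, cluster in enumerate(clusters):
        # Candidate "question 3" values: every question outside this cluster.
        others = [q for j, c in enumerate(clusters) if j != idx for q in c]
        for q1, q2 in itertools.combinations(cluster, 2):
            pairs.append((q1, q2, 1))
            if others:
                # One negative pair per positive pair, sharing question 1.
                pairs.append((q1, rng.choice(others), 0))
    return pairs
```

Generating one negative pair per positive pair keeps the two classes balanced, which matches the one-to-one correspondence described above.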

The training device according to the present invention domain fine-tunes the PI model 114 using the positive and negative question pairs. Because some pseudo-positive question pairs are erroneous, however, the PI model 114 is trained by a weakly supervised method that uses pseudo labels. To cope with the errors in the pseudo labels, a confidence score must be assigned, as a weight, to each positive question pair used in training the PI model. The embodiment adds the refiner 121 to estimate this confidence score.

FIG. 4 is a block diagram of a learning model including a PI model and a refiner according to an embodiment of the present invention.

The refinement process of the refiner 121 and the iterative training of the PI model 114 are described below with reference to FIG. 4. In the iterative training process, the PI model 114 and the refiner 121 each use the other's training result. The confidence score that the refiner 121 computes for a positive question pair is used as a weight in the loss function of the PI model 114, and the classification result of the PI model 114 is in turn used in the loss function of the refiner 121. In addition, during iterative training, relabeling of a positive question pair is attempted based on its confidence score.

First, to train the refiner 121, positive question pairs are input to the refiner 121 to estimate a confidence score for each pseudo positive label, and negative question pairs are input to estimate a confidence score for each negative label. The parameters of the refiner 121 are updated based on the loss function of Equation 3, which includes the confidence scores for the pseudo positive labels and the negative labels together with the classification result of the PI model 114 for the positive labels.

Here, the classification result of the PI model 114 for the positive question pair (p_i, q_i) is 1 if the pair is classified as a similar question pair and 0 otherwise. The loss function also contains a hyperparameter that is adjusted during training, and i indexes the positive question pairs (or the same number of negative question pairs). As explained above, the final stage of the refiner is implemented as a sigmoid function, so the confidence score ranges from 0 to 1. As a result, in the present embodiment the refiner 121 is trained to output, within the range from 0 to 1, a confidence score indicating how certain it is that the pseudo positive label attached to a positive question pair is true.

Next, to train the PI model 114, positive and negative question pairs are input to the PI model 114 alternately to estimate their similarity. The parameters of the PI model 114 are updated based on the loss function of Equation 4, which combines Equation 2 with the confidence scores of the positive question pairs provided by the refiner 121.

Here, the weighting function used in Equation 4 is given by Equation 5.

Here, c and a second constant are hyperparameters adjusted during training, and t denotes the iteration step. Early in training the confidence scores output by the refiner 121 are not yet reliable, so they should have little influence on training. As Equation 5 shows, when t is small the function is dominated by the exponential term and takes a value close to 1, which reduces the influence of the confidence score. As training proceeds and t increases, the exponential term converges to 0 and the function comes to depend on the confidence score.
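
Equation 5 itself is not reproduced in this text, so the following is only one functional form consistent with the behavior described above, not the specification's formula. It assumes an additive exponential term `c * exp(-t / tau)` clamped to 1; `tau` is a hypothetical name for the second hyperparameter, and the default values are arbitrary.

```python
import math


def positive_pair_weight(confidence, step, c=1.0, tau=100.0):
    """Weight for a positive question pair at iteration `step`.

    Early on (small `step`) the exponential term dominates and the weight is
    close to 1; later it decays toward the refiner's confidence score.
    """
    return min(1.0, confidence + c * math.exp(-step / tau))
```

With these defaults the weight is exactly 1 at step 0 regardless of the confidence score, and it approaches the confidence score as the step count grows.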

According to Equation 5, only the confidence score of the positive label is reflected in the loss function of Equation 4. Because the function takes a value of at most 1, in Equation 4 it acts as a weight that reduces the contribution of the loss from positive labels.
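
The weighting role described above can be sketched as follows. Equation 4 is not reproduced in this text; the sketch assumes that the weight multiplies the cross-entropy term of each positive pair, that negative pairs are left unweighted, and that the loss is averaged over all pairs. The function name and argument layout are hypothetical.

```python
import math


def weighted_pi_loss(p_similar, labels, weights):
    """Confidence-weighted cross-entropy for the PI model.

    p_similar[i]: PI model probability that pair i is similar (0 < p < 1).
    labels[i]:    1 for a pseudo-positive pair, 0 for a negative pair.
    weights[i]:   weight in [0, 1] for a positive pair; ignored for negatives.
    """
    total = 0.0
    for p, y, w in zip(p_similar, labels, weights):
        if y == 1:
            total -= w * math.log(p)    # positive term, scaled down by w
        else:
            total -= math.log(1.0 - p)  # negative term, unweighted
    return total / len(labels)
```

A positive pair with weight 0 contributes nothing to the loss, so a pair whose pseudo label the refiner distrusts has little effect on the PI model's update.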

Meanwhile, if the confidence score inferred by the refiner 121 for a positive label is smaller than a preset threshold, the corresponding question pair is converted into a negative question pair and given a negative label. A negative question pair produced by such relabeling serves as stronger training data than the negative question pairs that existed originally.
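
The relabeling rule can be sketched as follows. The threshold value of 0.5 is an arbitrary placeholder, since the specification only says the threshold is preset, and the triple layout (question_a, question_b, label) and the function name are assumptions of this sketch.

```python
def relabel_low_confidence(pairs, confidences, threshold=0.5):
    """Turn low-confidence pseudo-positive pairs into negative pairs.

    pairs:       list of (question_a, question_b, label), label 1 or 0.
    confidences: refiner confidence score for each pair, in [0, 1].
    """
    relabeled = []
    for (qa, qb, label), score in zip(pairs, confidences):
        if label == 1 and score < threshold:
            label = 0  # pseudo-positive label judged unreliable
        relabeled.append((qa, qb, label))
    return relabeled
```

The function returns a new list instead of modifying its input, so the original weak labels remain available for comparison.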

By introducing the confidence score and the relabeling of weak labels into the training process as described above, the FAQ answerer 100 according to the present embodiment gains robustness, that is, the ability to cope efficiently with errors arising in the training process.

FIG. 5 is a flowchart illustrating a procedure by which the FAQ answerer answers a user query according to an embodiment of the present invention.

First, the FAQ answerer 100 obtains a user query (S501), and the QIR 111 searches the FAQ KB 110 to select a plurality of question intents related to the user query, together with the plurality of question pattern groups linked to those intents (S502). As described above, the FAQ KB 110 stores answer data for user queries as records composed of a question intent, question patterns, and a conditional answer.

Next, the QPR 112 inputs the user query and the question pattern groups into the pre-trained PI model 114 to identify which question patterns are similar to the query. Using the identification results, it selects as the top-one question intent the intent that indexes the question pattern group containing the highest proportion of similar questions (S503).
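
Step S503 can be sketched as follows. The trained PI model is stood in for by a caller-supplied predicate `is_paraphrase(query, pattern)`; that interface, the dictionary layout of the question pattern groups, and the tie-breaking rule (the first intent with the highest ratio wins) are assumptions of this sketch.

```python
def select_top_one_intent(query, pattern_groups, is_paraphrase):
    """Pick the intent whose pattern group has the highest share of
    question patterns judged similar to `query`.

    pattern_groups: dict mapping an intent id to a list of question patterns.
    is_paraphrase:  callable (query, pattern) -> bool, standing in for the
                    PI model's identification result.
    """
    best_intent, best_ratio = None, -1.0
    for intent, patterns in pattern_groups.items():
        if not patterns:
            continue  # an empty group cannot be scored
        hits = sum(1 for pattern in patterns if is_paraphrase(query, pattern))
        ratio = hits / len(patterns)
        if ratio > best_ratio:
            best_intent, best_ratio = intent, ratio
    return best_intent
```

Using the proportion rather than the raw count of similar patterns keeps large pattern groups from winning merely because they contain more patterns.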

Next, the CA 113 uses the answer conditions to select the correct answer from the list of answers associated with the top-one question intent (S504). Here, the CA 113 makes the selection using the answer conditions and the answer list that together make up the conditional answer in the knowledge base.

Next, the FAQ answerer 100 provides the selected answer to the user (S505).

The results of evaluating the performance of the FAQ answerer 100 according to the present embodiment are described below. The database used for the evaluation stores conversation logs between a particular telecommunications carrier and its users and consists of millions of question-answer pairs. As described above, part of this database was used to train the learning model 120 and to build the FAQ KB 110. The accuracy of the answers provided by the FAQ answerer 100 was then measured using the resulting FAQ KB 110. The measured accuracy was 84%, better than the 70% achieved by a conventional information retrieval (IR) model (BM25; see Non-Patent Document 2).

As described above, the present embodiment applies question intent retrieval, question pattern estimation, and conditional answering to user queries based on the FAQ KB, and applies a training process that uses weakly labeled question pairs. It thereby provides a dialogue system and method for use in task-oriented dialogue, with the effect of increasing the scalability and the robustness of the dialogue system.

Although each flowchart according to the present embodiment describes the processes as being executed sequentially, the embodiment is not limited thereto. The processes described in a flowchart may be executed in a different order, or one or more of them may be executed in parallel, so the flowcharts are not limited to a time-series order.

Various implementations of the systems and techniques described herein may be realized in digital electronic circuitry, integrated circuits, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation as one or more computer programs executable on a programmable system. The programmable system includes at least one programmable processor (which may be a special-purpose or a general-purpose processor) coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. Computer programs (also known as programs, software, software applications, or code) contain instructions for the programmable processor and are stored on a "computer-readable medium".

A computer-readable medium refers to any computer program product, apparatus, and/or device (for example, a non-volatile or non-transitory recording medium such as a CD-ROM, ROM, memory card, hard disk, magneto-optical disk, or storage device) used to provide instructions and/or data to a programmable processor.

Various implementations of the systems and techniques described herein may be implemented by a programmable computer. Here, the computer includes a programmable processor, a data storage system (including volatile memory, non-volatile memory, another kind of storage system, or a combination thereof), and at least one communication interface. For example, the programmable computer may be one of a server, a network appliance, a set-top box, an embedded device, a computer expansion module, a personal computer, a laptop, a Personal Data Assistant (PDA), a cloud computing system, or a mobile device.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the art to which the embodiment pertains will be able to make various modifications and variations without departing from its essential characteristics. The embodiments are therefore intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by them. The scope of protection of the present embodiment should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be interpreted as falling within the scope of rights of the present embodiment.

100: FAQ answerer 110: FAQ knowledge base
111: question intent searcher 112: question pattern estimator
113: conditional answerer 114: PI model
120: learning model 121: refiner

Claims

A learning method performed by a computing device, the method comprising:
task fine-tuning a paraphrase identification (PI) model and a refiner using a dataset including labeled question pairs whose labels distinguish whether the pairs are similar, wherein the task is identifying whether an input question pair is similar; and
domain fine-tuning the PI model and the refiner using a domain-specific database including positive question pairs having weak positive labels and negative question pairs having negative labels,
wherein the domain fine-tuning includes an iterative training process for the PI model and a refinement process for the refiner, and
wherein the refiner estimates confidence scores for the weak positive labels and the negative labels to cope with errors in the process of generating the positive question pairs.

The learning method of claim 1,
wherein the task fine-tuning uses the PI model and the refiner that have been pre-trained using a large unlabeled dialogue corpus.

The learning method of claim 1,
wherein the task fine-tuning updates the parameters of the PI model based on a distance metric between the label distinguishing similarity and the output of the PI model.

The learning method of claim 1,
wherein the task fine-tuning updates the parameters of the refiner based on a distance metric between the label distinguishing similarity and the output of the refiner.

The learning method of claim 1,
wherein the domain fine-tuning trains the PI model based on the weak positive labels and the negative labels so that the PI model identifies whether question pairs included in the domain-specific database are similar.

(Deleted)

The learning method of claim 1,
wherein the refinement process updates the parameters of the refiner based on a loss function that combines the confidence scores for the weak positive labels and the negative labels with the identification result of the PI model for the positive question pairs.

The learning method of claim 1,
wherein the iterative training process calculates a weight for each positive question pair using the confidence score for its weak positive label.

The learning method of claim 8,
wherein the iterative training process updates the parameters of the PI model based on a distance metric between the weak positive labels and the output of the PI model, a distance metric between the negative labels and the output of the PI model, and the weight.

The learning method of claim 1,
wherein, in the iterative training process, when the confidence score for a weak positive label is smaller than a preset threshold, the corresponding question pair is changed to a negative question pair and given the negative label.

A learning device used by a dialogue system, the learning device comprising:
a paraphrase identification (PI) model that generates an identification result as to whether an input question pair is similar; and
a refiner that estimates a confidence score corresponding to the input question pair,
wherein the PI model and the refiner are task fine-tuned using a dataset including labeled question pairs whose labels distinguish whether the pairs are similar, and are then domain fine-tuned using a domain-specific database including positive question pairs having weak positive labels and negative question pairs having negative labels,
wherein the domain fine-tuning performs iterative training of the PI model using the confidence score and performs refinement of the refiner using the identification result, and
wherein the refiner estimates the confidence scores for the weak positive labels and the negative labels to cope with errors in the process of generating the positive question pairs.

A computer program stored in a computer-readable recording medium to execute the learning method performed by the computing device according to any one of claims 1 to 5 and 7 to 10.