KR20080024752A - Dialog management apparatus and method for chatting agent

Info

Publication number: KR20080024752A
Application number: KR1020060089264A
Authority: KR
Inventors: 이청재; 김석환; 이근배
Original assignee: 학교법인 포항공과대학교; 포항공과대학교 산학협력단
Priority date: 2006-09-14
Filing date: 2006-09-14
Publication date: 2008-03-19
Also published as: KR100818979B1

Abstract

A dialogue management apparatus and method for a chatting agent are provided that allow a chat robot's performance to improve continuously by producing the robot's answers from a discourse stack that takes the dialogue context into account. A dialogue analyzer (120) analyzes a user utterance (110) received from a user. A domain determiner (130) performs domain determination and verification to decide the domain of the user utterance based on the analysis result. A dialogue expert (140) provides a system utterance (170) in response to the user utterance by searching a dialogue example database constructed from a dialogue corpus. The domain determiner includes linguistic and semantic feature extractors (121, 122) and a keyword feature extractor (123), and determines the domain by integrating the extracted features and applying them to a feature-based probability model. The dialogue expert includes a discourse stack (150) that stores the dialogue example session information and the dialogue act and main act analysis results of previous user utterances, and a dialogue example selector (160) that indexes information such as the semantic information of previous user utterances and scenario session information added from a scenario-based dialogue corpus.

Description

Dialog management apparatus and method for chatting agent

FIG. 1 is a block diagram showing the overall structure of a dialogue management apparatus for a chat agent that reflects the dialogue context according to the present invention.

FIG. 2 is a flowchart of domain determination and verification according to the present invention.

FIG. 3 is a flowchart illustrating the process of building the dialogue example DB according to the present invention.

FIG. 4 is a diagram showing example records of the dialogue example DB according to the present invention.

<Explanation of reference numerals for the main parts of the drawings>

110: user utterance    121: linguistic feature extractor

122: semantic feature extractor    123: keyword feature extractor

130: domain determiner    140: dialogue expert

150: discourse history stack    160: dialogue example selector

170: system utterance

The present invention relates to an apparatus and method for managing dialogue between a person and a chat robot used in an intelligent chat agent, and more particularly to a dialogue management apparatus and method that use a discourse stack to provide the chat robot's answers in consideration of the dialogue context.

Chat, which lets a user converse with another party over a network using a computer, a portable terminal, or a similar device, has long been widely used. Chat between people, however, has the drawback that it is impossible when no other party is available, and chat robots emerged to overcome this limitation. They are increasingly needed as a means of natural-language communication between people and computers (robots) in the intelligent agents now demanded in many fields. Most chat robots, however, can answer only when the input sentence matches a stored pattern exactly, so a large number of dialogue examples is required and building the dialogue example database (DB) is expensive. Existing chat robots also fail to consider the dialogue context, so the chat proceeds as isolated question-and-answer pairs regardless of what was said before. Existing chat robots therefore differ from the way people converse with one another, and their dialogue models need to be improved.

To overcome these problems of the prior art, the present invention proposes a method of building a dialogue example DB for an intelligent chat robot, and a dialogue management apparatus and method in which the chat robot's answer varies according to past information by taking the dialogue context into account.

To achieve the above object, the present invention provides a dialogue management method for a chat agent capable of reflecting the dialogue context using a discourse stack, the method comprising: (a) a domain determination and verification step of analyzing an input user utterance and determining the domain of the user utterance based on the analysis result; and (b) searching a dialogue example database (DB) constructed from a dialogue corpus and providing a system utterance in response to the user utterance.

According to another aspect of the present invention, there is provided a dialogue management apparatus for a chat agent capable of reflecting the dialogue context using a discourse stack, the apparatus comprising: a dialogue analyzer that analyzes an input user utterance; a domain determiner that performs domain determination and verification to determine the domain of the user utterance based on the result of the dialogue analysis; and a dialogue expert that searches a dialogue example database (DB) constructed from a dialogue corpus and provides a system utterance in response to the user utterance.

The present invention is described below with reference to the accompanying drawings.

FIG. 1 shows the overall structure of the dialogue management system for a chat robot according to the present invention. In FIG. 1, the user utterance 110 is a user input sentence received through keyboard input, voice input, or the like. When the user utterance 110 arrives, the dialogue analyzer 120 performs dialogue analysis, extracting the features needed to determine the domain (topic) and the dialogue example through the linguistic feature extractor 121, the semantic feature extractor 122, and the keyword feature extractor 123. After this analysis, the domain determiner 130 performs the domain determination and verification step, which determines and verifies the domain (topic) of the user utterance. The domain determiner 130 then calls the dialogue expert 140 corresponding to the determined domain.

Dialogue experts 140 are divided by domain, and each has its own discourse history stack 150. The stack holds information such as the analysis results of previous user utterances and the chat robot's answers. The dialogue example selector 160 searches the dialogue example database (DB) for the dialogue examples most similar to the current dialogue situation. Here, the dialogue situation refers collectively to the current dialogue state information used to search for examples: the domain, the dialogue act and semantic analysis results, the discourse history, the scenario session information, and so on. Among the selected candidates, similarity to the user utterance is measured, the example with the highest similarity is finally selected, and the answer associated with that example is output as the system utterance 170.
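The relationship between the per-domain dialogue experts and their discourse history stacks can be sketched as follows. This is only an illustrative sketch: the class names `DialogueExpert` and `DiscourseEntry`, the field names, and the idea of reading the previous semantic information from the top of the stack are assumptions made for the example, not details given in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DiscourseEntry:
    """One turn kept on the stack (field names are hypothetical)."""
    dialogue_act: str
    main_act: str
    system_utterance: str


@dataclass
class DialogueExpert:
    """A domain-specific expert that owns its own discourse history stack."""
    domain: str
    stack: List[DiscourseEntry] = field(default_factory=list)

    def record(self, dialogue_act: str, main_act: str, system_utterance: str) -> None:
        # Push the analysis of the user utterance together with the robot's answer.
        self.stack.append(DiscourseEntry(dialogue_act, main_act, system_utterance))

    def previous_semantics(self) -> Optional[Tuple[str, str]]:
        # The most recent (dialogue act, main act) pair, or None at the first turn.
        if not self.stack:
            return None
        top = self.stack[-1]
        return (top.dialogue_act, top.main_act)
```

Because every expert holds a separate stack, switching to another domain does not overwrite the history of the domain that was left.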

In the present invention, various features are extracted from the user utterance and used to determine the domain (topic) of the utterance and to select a dialogue example. The domain is determined so that dialogue example scenarios can be indexed by domain: an appropriate answer is chosen from the answer candidates retrieved for that domain, and dialogue examples unrelated to the domain are excluded from the search candidates. Semantic features are extracted because the dialogue example DBs of existing chat robots are built on surface vocabulary, so that user utterances with the same meaning are handled by different examples, and a large number of dialogue examples is therefore needed to build the DB. The present invention instead builds the dialogue example DB on a semantic basis, so that semantically identical user utterances can be handled by the same example sentence, which reduces cost and improves efficiency. In addition, to take the dialogue context into account, a scenario-based chat dialogue corpus is used when building the dialogue example DB. Session information is recorded in the corpus for each scenario, and when dialogue examples are indexed, the semantic information of the previous user utterance (dialogue act and main act analysis results) and the session information are added to the current user utterance. This makes it possible to search for dialogue examples using the information in the discourse history stack 150.

FIG. 2 details the steps performed by the domain determiner 130 shown in FIG. 1. The user utterance undergoes morphological analysis and part-of-speech tagging (step 210), and linguistic features are extracted (step 220). The linguistic features may take various forms; typically they are features that help determine the domain and support language understanding, such as morpheme base forms, morpheme N-grams, and part-of-speech tags. Based on the linguistic features, the dialogue act and meaning of the user utterance are analyzed (step 230), and semantic features such as the user's current dialogue act, intention, and constituent elements are extracted (step 240). A keyword model is then used to extract the keywords contained in the user utterance and the weight of each keyword for each domain (step 250). The keyword features are the result of domain classification using TF*IDF values alone: the N keywords with the highest weights in the user utterance, and the N domains with the highest weights according to the keywords' TF*IDF values. TF*IDF, which is widely used in information retrieval, stands for Term Frequency and Inverse Document Frequency. TF is the frequency with which a word occurs, and IDF is the inverse of the number of documents, out of all documents, in which the word occurs, so that a word appearing in too many documents is not treated as a keyword. Particles and word endings, for example, cannot be keywords (see the related web page at http://abolapia.egloos.com/207027).
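The keyword step (step 250) can be illustrated with a small sketch. The specification does not give an exact weighting formula, so the code below assumes the common textbook form idf = log(N / df), treats the whole corpus of each domain as a single document, and assumes pre-tokenized input; the function names `build_idf` and `keyword_features` are hypothetical.

```python
import math
from collections import Counter


def build_idf(domain_docs):
    """domain_docs maps a domain name to its token list (one 'document' per domain)."""
    n_docs = len(domain_docs)
    df = Counter()
    for tokens in domain_docs.values():
        df.update(set(tokens))  # count each term once per document
    # Assumed textbook IDF: a term found in every document gets weight 0.
    return {term: math.log(n_docs / count) for term, count in df.items()}


def keyword_features(utterance_tokens, domain_docs, idf, top_n=2):
    """Return the top-N keywords of the utterance and the top-N domains by TF*IDF."""
    tf_utt = Counter(utterance_tokens)
    weights = {t: tf_utt[t] * idf.get(t, 0.0) for t in tf_utt}
    keywords = sorted(weights, key=weights.get, reverse=True)[:top_n]

    domain_scores = {}
    for domain, tokens in domain_docs.items():
        tf_dom = Counter(tokens)
        total = max(len(tokens), 1)
        # Sum the TF*IDF weight of each selected keyword within this domain.
        domain_scores[domain] = sum(tf_dom[k] / total * idf.get(k, 0.0) for k in keywords)
    domains = sorted(domain_scores, key=domain_scores.get, reverse=True)[:top_n]
    return keywords, domains
```

With this weighting, a function word that occurs in every domain receives an IDF of zero and so never ranks as a keyword, which matches the remark above about particles and word endings.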

When determining the domain, if the current user utterance belongs to the same domain as the immediately preceding one, the dialogue is continuing. In that case the utterance is treated as an extension of the previous dialogue, with no domain switch, and the previous dialogue expert continues to handle it. To verify whether the current utterance continues the previous one, the conditional probability of the previous and current dialogue acts and intentions is computed and compared with a preset threshold (step 260). If the probability is higher than the threshold, domain determination (step 270) is skipped and control passes directly to the previous dialogue expert 140. If it is lower, the linguistic features (step 220), semantic features (step 240), and keyword features (step 250) extracted above are combined, and a probability model is used to select the domain with the highest probability (step 270).
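The continuity check of step 260 and the fallback to step 270 might look like the sketch below. It assumes the conditional probability is estimated as a relative frequency of consecutive dialogue-act pairs observed in single-domain sessions, which is one plausible reading rather than the estimator stated in the specification. The threshold value, the function names, and the `classify` callback standing in for the feature-based probability model are all hypothetical.

```python
from collections import Counter


def train_transition_model(sessions):
    """sessions: list of dialogue-act sequences, each from a single-domain session."""
    pair_counts, prev_counts = Counter(), Counter()
    for acts in sessions:
        for prev, cur in zip(acts, acts[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1
    return pair_counts, prev_counts


def continuation_probability(model, prev_act, cur_act):
    """Relative-frequency estimate of P(cur_act | prev_act); 0.0 if prev_act is unseen."""
    pair_counts, prev_counts = model
    if prev_counts[prev_act] == 0:
        return 0.0
    return pair_counts[(prev_act, cur_act)] / prev_counts[prev_act]


def decide_domain(model, prev_domain, prev_act, cur_act, classify, threshold=0.5):
    """Keep the previous domain if the dialogue appears to continue (step 260);
    otherwise fall back to the full domain classifier (step 270)."""
    if prev_domain is not None:
        if continuation_probability(model, prev_act, cur_act) > threshold:
            return prev_domain
    return classify()
```

Passing the classifier as a callback means the more expensive combination of linguistic, semantic, and keyword features is evaluated only when the continuity check fails.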

FIG. 3 is a flowchart of the process of building the dialogue example DB described above. The user utterance is extracted from the scenario-based chat dialogue corpus 310 (step 320), and the corresponding scenario and utterance session information is extracted (step 330). The previous semantic information is then extracted (step 340); it combines the dialogue act and main act analysis results of the previous user utterance. Next, the current dialogue act information, the main act information, and the domain information to which the scenario belongs are extracted (step 350). Finally, the system utterance is extracted (step 360). The extracted information is then indexed to produce the scenario-based chat dialogue example database 370.
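The construction flow of FIG. 3 can be sketched as a single pass over an annotated corpus. The record layout, the `ExampleRecord` name, and the input format (a list of scenario dictionaries with pre-annotated dialogue acts and main acts) are assumptions for illustration; the specification does not define a concrete schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ExampleRecord:
    """One indexed dialogue example (hypothetical field names)."""
    scenario_id: str
    turn_index: int
    prev_semantics: Optional[Tuple[str, str]]  # (dialogue act, main act) of previous turn
    dialogue_act: str
    main_act: str
    domain: str
    user_utterance: str
    system_utterance: str


def build_example_db(corpus):
    """corpus: list of dicts with keys 'id', 'domain', and 'turns';
    each turn is a dict with 'user', 'dialogue_act', 'main_act', 'system'."""
    records = []
    for scenario in corpus:
        prev = None  # no previous semantic information at the start of a scenario
        for i, turn in enumerate(scenario["turns"]):
            records.append(ExampleRecord(
                scenario_id=scenario["id"],            # step 330: scenario/session info
                turn_index=i,
                prev_semantics=prev,                   # step 340: previous semantic info
                dialogue_act=turn["dialogue_act"],     # step 350: current acts and domain
                main_act=turn["main_act"],
                domain=scenario["domain"],
                user_utterance=turn["user"],           # step 320: user utterance
                system_utterance=turn["system"],       # step 360: system utterance
            ))
            prev = (turn["dialogue_act"], turn["main_act"])
    return records
```

Storing the previous turn's semantic information in each record is what later allows a lookup keyed on the contents of the discourse history stack.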

FIG. 4 shows example records of the scenario-based chat dialogue example DB indexed from the dialogue corpus. Dialogue examples are retrieved using the scenario session information, previous semantic information, current dialogue act information, current main act information, domain information, and so on among the fields shown, and the final example is selected by measuring the edit distance to the user utterance. When necessary, the search conditions are relaxed: if no example is found by an exact match, examples are retrieved by a partial match. For instance, if no dialogue example is found within the given scenario, the dialogue example DB is searched again without the scenario session information and the previous semantic information.
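The two-stage lookup described above, an exact match on all index fields followed by a relaxed match, can be sketched as follows. The records are plain dictionaries with hypothetical field names, the edit distance is a character-level Levenshtein distance, and the choice of which keys to drop when relaxing follows the example in the text; other details are assumptions.

```python
def edit_distance(a, b):
    """Character-level Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


# Full match uses every index field; the relaxed match drops the scenario
# session information and the previous semantic information.
FULL_KEYS = ("scenario_id", "prev_semantics", "dialogue_act", "main_act", "domain")
RELAXED_KEYS = ("dialogue_act", "main_act", "domain")


def select_example(db, situation, utterance):
    """Return the closest matching record, or None if even the relaxed search fails."""
    for keys in (FULL_KEYS, RELAXED_KEYS):
        candidates = [r for r in db if all(r[k] == situation[k] for k in keys)]
        if candidates:
            return min(candidates,
                       key=lambda r: edit_distance(r["user_utterance"], utterance))
    return None
```

The `system_utterance` field of the returned record would then serve as the robot's answer; a `None` result leaves the choice of a default reply to the caller.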

As described above, according to the present invention, a chat agent dialogue management apparatus can be built from a small number of dialogue examples by constructing a semantics-based dialogue example DB, and a chat robot dialogue management apparatus that reflects the dialogue context can be implemented by using scenario-based dialogue examples. Because the approach relies on a probability model and an indexed DB that are learned automatically from a dialogue corpus, it is trainable, and the performance of the chat robot can be improved continuously.

The chat robot analyzes user utterances in real time and responds appropriately, so it can be usefully employed as the chat robot of intelligent agents such as interactive short message services on mobile phones, intelligent robots, and home networks. Those of ordinary skill in the art to which the present invention pertains will understand that various modifications and equivalent embodiments are possible. The true scope of technical protection of the present invention should therefore be determined by the technical spirit of the appended claims.

Claims

A dialogue management method for a chat agent capable of reflecting the dialogue context using a discourse stack, the method comprising:

(a) a domain determination and verification step of analyzing an input user utterance and determining a domain of the user utterance based on the analysis result; and

(b) searching a dialogue example database (DB) constructed from a dialogue corpus and providing a system utterance in response to the user utterance.

The method of claim 1, wherein step (a) is a probability-based domain determination step that combines various features, comprising:

(a1) extracting linguistic features using natural language processing and semantic features through language understanding in order to determine the domain of the user utterance;

(a2) extracting keyword features using a keyword model; and

(a3) integrating the extracted features and applying them to a feature-based probability model to determine the domain.

The method of claim 2, wherein step (a) includes a domain continuity verification step based on semantic features, in which the conditional probability of the semantic features of the previous user utterance and the semantic features of the current utterance, expressed using the user's dialogue act or intention information, is used to verify that the dialogue continues in the same domain.

The method of claim 1, further comprising constructing the dialogue example database (DB) applied in step (b), wherein, in constructing the dialogue example DB, indexing is based on the semantic information of user utterances and comprises:

(c1) using the dialogue act, main act, and domain information as the semantic information, and indexing the dialogue example DB with the previous semantic information as an index key in order to reflect the dialogue context; and

(c2) constructing the dialogue example DB from the collected scenario-based chat dialogue corpus through a series of automatic steps: extracting the user utterance, extracting the scenario and utterance session, extracting the previous semantic information, extracting the current dialogue act, main act, and domain, and extracting the system utterance.

The method of claim 1, wherein step (b) reflects the dialogue context using a discourse stack, and comprises:

(d1) using a discourse stack that records the dialogue example session information, dialogue act analysis result, and main act analysis result of previous user utterances in order to represent the prior information of the user utterance; and

(d2) adding and indexing information such as the semantic information of the previous user utterance and scenario session information from a scenario-based dialogue corpus, in order to take the dialogue context into account when constructing the dialogue example DB.

A dialogue management apparatus for a chat agent capable of reflecting the dialogue context using a discourse stack, the apparatus comprising:

a dialogue analyzer that analyzes an input user utterance;

a domain determiner that performs domain determination and verification to determine the domain of the user utterance based on the result of the dialogue analysis; and

a dialogue expert that searches a dialogue example database (DB) constructed from a dialogue corpus and provides a system utterance in response to the user utterance.

The apparatus of claim 6, wherein the domain determiner comprises: a linguistic feature extractor and a semantic feature extractor that extract linguistic features using natural language processing and semantic features through language understanding in order to determine the domain of the user utterance; and

a keyword feature extractor that extracts keyword features using a keyword model, the domain determiner determining the domain by integrating the extracted features and applying them to a feature-based probability model.

The apparatus of claim 7, wherein the domain determiner performs domain continuity verification based on semantic features, using the conditional probability of the semantic features of the previous user utterance and the semantic features of the current utterance, expressed using the user's dialogue act or intention information, to verify that the dialogue continues in the same domain.

The apparatus of claim 6, wherein the dialogue example database (DB) uses the dialogue act, main act, and domain information as the semantic information and is indexed with the previous semantic information as an index key in order to reflect the dialogue context, and wherein the dialogue example DB is constructed from the collected scenario-based chat dialogue corpus through a series of automatic steps: extracting the user utterance, extracting the scenario and utterance session, extracting the previous semantic information, extracting the current dialogue act, main act, and domain, and extracting the system utterance.

The apparatus of claim 6, wherein the dialogue expert reflects the dialogue context using a discourse stack, and comprises:

a discourse stack that records the dialogue example session information, dialogue act analysis result, and main act analysis result of previous user utterances in order to represent the prior information of the user utterance; and

a dialogue example selector that adds and indexes information such as the semantic information of the previous user utterance and scenario session information from a scenario-based dialogue corpus, in order to take the dialogue context into account when constructing the dialogue example DB.