KR20230071673A

KR20230071673A - Method, computer device, and computer program for building open-domain dialogue system using language model

Info

Publication number: KR20230071673A
Application number: KR1020210194421A
Authority: KR
Inventors: 곽동현; 배상환; 강소영; 함동훈; 박우명
Original assignee: 네이버 주식회사
Priority date: 2021-11-16
Filing date: 2021-12-31
Publication date: 2023-05-23

Abstract

Disclosed are a method, computer device, and computer program for building an open-domain dialogue model using a language model. A prompt that serves as the input to the language model is configured from a role specification of a chatbot and example dialogues between the chatbot and a person. By inputting the prompt into the language model, new dialogue data having the role and dialogue pattern contained in the prompt can be generated through the language model as training data for the chatbot. The present invention can thereby provide a dialogue system with improved response quality.

Description

Method, computer device, and computer program for constructing an open domain dialog model using a language model

The description below relates to techniques for building an open-domain dialogue model.

A question and answer (QA) system is a dialogue system that, when a user inputs a question, provides an answer by referring to appropriate documents.

With the recent development of language models, the performance of dialogue systems is improving rapidly, and such systems are expanding beyond natural language into multi-modal domains such as images and speech.

For example, Korean Patent Publication No. 10-2002-0030545 (published on April 25, 2002) discloses a technology for providing answers to questions using artificial intelligence technology and natural language processing technology.

It is possible to build an open-domain dialogue model that can converse naturally with people while maintaining a consistent role.

A dialogue dataset that satisfies the role can be built through in-context few-shot learning of a language model.

Provided is a method executed on a computer device, wherein the computer device includes at least one processor configured to execute computer-readable instructions contained in a memory, the method comprising: configuring, by the at least one processor, a prompt that serves as an input to a language model from a role specification of a chatbot and example dialogues between the chatbot and a person; and generating, by the at least one processor, new dialogue data having the role and dialogue pattern contained in the prompt through the language model by inputting the prompt into the language model.

According to one aspect, the method may further include building, by the at least one processor, an open-domain dialogue model for the chatbot by supervised learning on the dialogue data.

According to another aspect, the method may further include filtering, by the at least one processor, the dialogue data through annotations.

According to another aspect, the filtering may include classifying, among the chatbot utterances contained in the dialogue data, the problematic utterances marked by the annotations as negative examples for training the chatbot.

According to another aspect, the method may further include collecting, by the at least one processor, dialogue data between the chatbot and a person as training data for the chatbot through an interface in which the chatbot and the person converse directly.

According to another aspect, the collecting may include, when a correction request is received for some of the chatbot utterances provided through the interface, generating an alternative utterance through the language model and correcting those utterances with it.

According to another aspect, the collecting may further include: classifying the utterance for which correction was requested as a negative example for training the chatbot; and classifying the utterance corrected with the alternative utterance as a positive example for training the chatbot.
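
The classification of corrected utterances described above can be illustrated with a minimal sketch. The class name `FeedbackCollector`, its fields, and the tuple layout of an example are hypothetical choices made for illustration and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

# One training example: (dialogue context, chatbot utterance).
# This layout is an assumption made for the sketch.
Example = Tuple[Tuple[str, ...], str]


@dataclass
class FeedbackCollector:
    """Stores examples gathered while a person chats with the chatbot."""
    positives: List[Example] = field(default_factory=list)
    negatives: List[Example] = field(default_factory=list)

    def record_correction(self, context: Sequence[str],
                          rejected: str, replacement: str) -> None:
        """Store the rejected utterance as a negative example and the
        alternative utterance that replaced it as a positive example."""
        ctx = tuple(context)
        self.negatives.append((ctx, rejected))
        self.positives.append((ctx, replacement))
```

Both examples share the same dialogue context, so a later training step could contrast the two utterances directly.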

According to another aspect, the building may include building the open-domain dialogue model using at least one of: an out-of-bounds detection model that detects utterances falling outside the scope of the role specification; a response selection model that, in a retrieve-then-rerank approach in which response candidates are retrieved and then reranked, filters the response candidates based on the perplexity (PPL) of the language model; and a response generation model that applies maximum likelihood estimation to the positive examples of the dataset and unlikelihood training to the negative examples of the dataset.
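
As a rough illustration of perplexity-based filtering of response candidates, the following sketch assumes that per-token log-probabilities for each candidate have already been obtained from a language model. The function names, the input layout, and the threshold `max_ppl` are hypothetical; the specification does not state how the PPL criterion is applied.

```python
import math
from typing import List, Sequence, Tuple


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean token log-probability
    (natural log)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def filter_by_ppl(candidates: List[Tuple[str, Sequence[float]]],
                  max_ppl: float) -> List[str]:
    """Keep candidates whose perplexity does not exceed max_ppl.
    Candidates without token scores are dropped."""
    return [text for text, logprobs in candidates
            if logprobs and perplexity(logprobs) <= max_ppl]
```

A lower perplexity means the language model finds the candidate more plausible in context, so a threshold of this kind removes candidates the model considers unlikely.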

Provided is a computer program stored in a computer-readable recording medium to cause the computer device to execute the method.

Provided is a computer device comprising at least one processor configured to execute computer-readable instructions contained in a memory, wherein the at least one processor configures a prompt that serves as an input to a language model from a role specification of a chatbot and example dialogues between the chatbot and a person, and generates new dialogue data having the role and dialogue pattern contained in the prompt through the language model by inputting the prompt into the language model.

According to embodiments of the present invention, a dialogue system with improved response quality can be provided by building an open-domain dialogue model that can converse naturally with people while maintaining a consistent role.

According to embodiments of the present invention, a dialogue dataset that satisfies a role can be created quickly and efficiently through a data collection framework that uses in-context few-shot learning of a language model.

FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an example of a computer device according to an embodiment of the present invention.
FIG. 3 illustrates a framework for building a role-specified open-domain dialogue model in an embodiment of the present invention.
FIG. 4 is a flowchart illustrating an example of an open-domain dialogue generation method that a computer device can perform according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram for explaining a process of generating dialogue data in an embodiment of the present invention.
FIG. 6 shows an example of filtering dialogue data in an embodiment of the present invention.
FIG. 7 is an exemplary diagram for explaining a process of collecting dialogue data between a chatbot and a person in an embodiment of the present invention.
FIG. 8 illustrates an example structure of an open-domain dialogue model in an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Embodiments of the present invention relate to techniques for building an open-domain dialogue model.

Embodiments, including those specifically disclosed in this specification, can produce an open-domain dialogue model that converses naturally with people while maintaining a consistent role.

The open-domain dialogue generation system according to embodiments of the present invention may be implemented by at least one computer device, and the open-domain dialogue generation method according to embodiments of the present invention may be performed by at least one computer device included in the open-domain dialogue generation system. A computer program according to an embodiment of the present invention may be installed and run on the computer device, and the computer device may perform the open-domain dialogue generation method according to embodiments of the present invention under the control of the running computer program. The computer program may be stored in a computer-readable recording medium so as to be combined with the computer device and cause the computer to execute the open-domain dialogue generation method.

FIG. 1 is a diagram illustrating an example of a network environment according to an embodiment of the present invention. The network environment of FIG. 1 shows an example including a plurality of electronic devices 110, 120, 130, and 140, a plurality of servers 150 and 160, and a network 170. FIG. 1 is an example for describing the invention, and the number of electronic devices or servers is not limited to that shown in FIG. 1. In addition, the network environment of FIG. 1 merely describes one example of the environments applicable to the present embodiments, and the applicable environment is not limited to the network environment of FIG. 1.

The plurality of electronic devices 110, 120, 130, and 140 may be fixed terminals or mobile terminals implemented as computer devices. Examples of the plurality of electronic devices 110, 120, 130, and 140 include a smartphone, a mobile phone, a navigation device, a computer, a laptop, a digital broadcast terminal, a PDA (Personal Digital Assistant), a PMP (Portable Multimedia Player), and a tablet PC. Although FIG. 1 shows the shape of a smartphone as an example of the electronic device 110, in embodiments of the present invention the electronic device 110 may refer to any of various physical computer devices capable of communicating with the other electronic devices 120, 130, and 140 and/or the servers 150 and 160 through the network 170 using a wireless or wired communication method.

The communication method is not limited and may include not only communication using a communication network that the network 170 may include (for example, a mobile communication network, wired Internet, wireless Internet, or a broadcasting network) but also short-range wireless communication between devices. For example, the network 170 may include any one or more of networks such as a PAN (personal area network), a LAN (local area network), a CAN (campus area network), a MAN (metropolitan area network), a WAN (wide area network), a BBN (broadband network), and the Internet. In addition, the network 170 may include, but is not limited to, any one or more network topologies, including a bus network, a star network, a ring network, a mesh network, a star-bus network, and a tree or hierarchical network.

Each of the servers 150 and 160 may be implemented as a computer device or a plurality of computer devices that communicate with the plurality of electronic devices 110, 120, 130, and 140 through the network 170 to provide commands, code, files, content, services, and the like. For example, the server 150 may be a system that provides a service (for example, a conversational bot service) to the plurality of electronic devices 110, 120, 130, and 140 connected through the network 170.

FIG. 2 is a block diagram illustrating an example of a computer device according to an embodiment of the present invention. Each of the plurality of electronic devices 110, 120, 130, and 140 and each of the servers 150 and 160 described above may be implemented by the computer device 200 shown in FIG. 2. For example, the open-domain dialogue generation system according to embodiments of the present invention may be implemented by the computer device 100 shown in FIG. 1.

As shown in FIG. 2, the computer device 200 may include a memory 210, a processor 220, a communication interface 230, and an input/output interface 240. The memory 210 is a computer-readable recording medium and may include RAM (random access memory), ROM (read only memory), and a permanent mass storage device such as a disk drive. A permanent mass storage device such as a ROM or a disk drive may also be included in the computer device 200 as a separate permanent storage device distinct from the memory 210. In addition, an operating system and at least one program code may be stored in the memory 210. These software components may be loaded into the memory 210 from a computer-readable recording medium separate from the memory 210. Such a separate computer-readable recording medium may include a floppy drive, a disk, a tape, a DVD/CD-ROM drive, a memory card, and the like. In another embodiment, the software components may be loaded into the memory 210 through the communication interface 230 rather than from a computer-readable recording medium. For example, the software components may be loaded into the memory 210 of the computer device 200 based on a computer program installed from files received through the network 170.

The processor 220 may be configured to process instructions of a computer program by performing basic arithmetic, logic, and input/output operations. The instructions may be provided to the processor 220 by the memory 210 or the communication interface 230. For example, the processor 220 may be configured to execute received instructions according to program code stored in a recording device such as the memory 210.

The communication interface 230 may provide a function for the computer device 200 to communicate with other devices (for example, the storage devices described above) through the network 170. For example, requests, commands, data, files, and the like that the processor 220 of the computer device 200 generates according to program code stored in a recording device such as the memory 210 may be transmitted to other devices through the network 170 under the control of the communication interface 230. Conversely, signals, commands, data, files, and the like from other devices may be received by the computer device 200 via the network 170 and the communication interface 230 of the computer device 200. Signals, commands, data, and the like received through the communication interface 230 may be passed to the processor 220 or the memory 210, and files and the like may be stored in a storage medium (the permanent storage device described above) that the computer device 200 may further include.

The input/output interface 240 may be a means for interfacing with the input/output device 250. For example, input devices may include a microphone, a keyboard, or a mouse, and output devices may include a display or a speaker. As another example, the input/output interface 240 may be a means for interfacing with a device in which input and output functions are integrated, such as a touchscreen. The input/output device 250 may be configured as a single device together with the computer device 200.

In other embodiments, the computer device 200 may include fewer or more components than those shown in FIG. 2. However, there is no need to clearly illustrate most conventional components. For example, the computer device 200 may be implemented to include at least some of the input/output devices 250 described above, or may further include other components such as a transceiver and a database.

Hereinafter, specific embodiments of a method and device for building an open-domain dialogue model using a language model will be described.

The large-scale language model used in the present invention is a language model trained on massive data and refers to a natural language generation model that can properly perform a task when given only few-shot samples.

In other words, a large-scale language model is an autoregressive model that can perform inference without fine-tuning by using methods such as few-shot learning, and it may have more than 10 times as many parameters as existing general language models (for example, 100 billion parameters or more). For example, large-scale language models such as GPT-3 (Generative Pre-trained Transformer 3) or HyperClova are strong few-shot learners that can be controlled through natural-language prompts, and they are capable of in-context learning, that is, the ability to understand a pattern from only a small amount of data given in the prompt and thereby solve NLP problems.

Large-scale pre-trained language models are being used to handle open-domain dialogue.

Typically, a model is pre-trained on large-scale social comment chain data to model conversational behavior and is then fine-tuned on various target datasets to improve engagingness and humanness. To avoid undesirable model behavior, including toxicity and bias found in human-to-human conversations, part of the training data may be excluded using automatic filtering based on predefined criteria.

Regarding synthetic dialogue generation, there are many approaches that generate synthetic dialogues to reduce the cost of collecting dialogues. These approaches generally define task schemas, rules, and templates to simulate specific scenarios in task-oriented dialogue. However, training data from a source domain is required in order to transfer to an unseen target domain.

In a task-oriented dialogue system, the system side plays a functional role that draws on an explicit knowledge base of a specific domain. For example, an agent acts as a booking assistant or an information provider in various domains such as restaurants and hotels. An explicit persona may be assigned to each dialogue agent to encourage the agent to give more specific and consistent responses in an open-domain dialogue setting. However, this merely grounds every dialogue session on the given persona, and a persona given by a few natural-language sentences is not sufficient to represent a specific role in a real scenario.

A dialogue system such as a chatbot may have system policies on whether certain types of utterances are allowed, even when no goal or task is specified other than actively engaging in the conversation.

The present embodiment builds a role-specified open-domain dialogue model, that is, a dialogue model with which the system can converse naturally with humans on open-domain topics while remaining consistent with a specific role.

To this end, specific conditions regarding persona, style, safety, and system policy must be satisfied, and a framework that allows the data to be scaled up can be provided.

The present embodiment can provide a human-AI collaborative data construction method for creating a scalable supervisory dataset for open-domain dialogue that satisfies a role. In particular, a large-scale language model may be used to generate entire dialogue sessions between a user and the system through in-context few-shot learning.

FIG. 3 illustrates a framework for building a role-specified open-domain dialogue model in an embodiment of the present invention.

FIG. 3 shows a framework for collecting supervisory data for building a role-specified open-domain dialogue model.

Referring to FIG. 3, (1) the chatbot developer provides a role specification of the desired chatbot and a few example dialogues.

(2) The large-scale language model generates the entire dialogue dataset, and the chatbot developer or crowd workers filter the utterances contained in the dialogues.

(3) The dialogue model corresponding to the chatbot agent is trained by supervised learning on the dialogue dataset.

(4) The chatbot developer or crowd workers can chat one-on-one with the chatbot and provide additional feedback on the chatbot's utterances.

The input to the framework is a role specification provided by the chatbot developer, which defines the constraints on the system's conversational interactions. It is assumed that a human-bot dialogue dataset for the given specification may be unavailable, because no dialogue system that interacts under those specific conditions exists and public dialogue data is scarce. It is also infeasible to write all dialogue cases by hand, because the scope of open-domain dialogue is very broad and diverse.

Therefore, the focus is on constructing the entire dialogue dataset from a minimal number of dialogue examples by using the in-context few-shot learning of a large-scale language model.

FIG. 4 is a flowchart illustrating an example of an open-domain dialogue generation method that a computer device can perform according to an embodiment of the present invention.

The open-domain dialogue generation method according to the present embodiment may be performed by the computer device 100 described above. In this case, the processor 120 of the computer device 100 may be implemented to execute control instructions according to the code of an operating system or the code of at least one program contained in the memory 110. Here, the processor 120 may control the computer device 100 so that the computer device 100 performs the steps (S410 to S430) of the open-domain dialogue generation method of FIG. 4 according to control instructions provided by the code stored in the computer device 100.

Referring to FIG. 4, in step S410 the processor 120 may generate the entire dialogue dataset using a prompt that serves as the input to the language model. In particular, the processor 120 may input into the language model a prompt containing a brief description of the chatbot's attributes and example dialogues exchanged between the chatbot and a person, and thereby generate dialogue data of a specific character through the language model.

The processor 120 may construct the input to the language model by attaching, to the prompt, several example dialogues that satisfy the given specification together with an appropriate system description.

FIG. 5 is an exemplary diagram for explaining a data construction process in an embodiment of the present invention.

In FIG. 5, (a) shows an example of the input prompt 500. The prompt 500 may consist of a brief description 510 of the chatbot's attributes and example dialogues 520 exchanged between the chatbot and a person.
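
The structure of the prompt 500 can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `Prompt`, the speaker labels, and the `###` separator between dialogues are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker label, utterance)


@dataclass
class Prompt:
    description: str            # brief description of chatbot attributes (510)
    examples: List[List[Turn]]  # example dialogues (520)
    separator: str = "###"      # hypothetical delimiter between dialogues

    def render(self) -> str:
        """Join the description and the example dialogues into one input
        string, ending with a separator so that a language model would
        continue with a new dialogue in the same pattern."""
        blocks = [self.description.strip()]
        for dialogue in self.examples:
            blocks.append("\n".join(f"{who}: {text}" for who, text in dialogue))
        joiner = f"\n\n{self.separator}\n\n"
        return joiner.join(blocks) + joiner
```

Ending the rendered text with a separator is one possible way to signal that a fresh dialogue should follow; other prompt layouts would serve equally well.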

The processor 120 may input the prompt 500 into the language model to generate dialogue data in natural language form. In other words, after inputting the prompt 500 into the language model, the processor 120 may obtain dialogue data as the output of the language model's generation or completion function. By inputting the prompt 500, the processor 120 has the language model analyze the natural language pattern of the dialogue examples included in the prompt 500 and obtains new dialogue data that follows that pattern.

That is, through the language model, the processor 120 may produce an entire dialogue session, covering both the designated system role and the user role, from the pattern of the prompt 500. In FIG. 5, (b) shows an example of a new dialogue generated by the language model following the pattern of the prompt in (a).
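The prompt structure described above (a brief description 510 followed by dialogue examples 520) can be sketched as follows. This is a minimal illustration, not the format used in the embodiment: the `Prompt` class, the `User`/`Bot` speaker tags, the blank-line separators, and the `parse_session` helper are all assumptions introduced here, and the call to the language model itself is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Speaker tags are illustrative; the source does not fix a prompt format.
USER, BOT = "User", "Bot"

Turn = Tuple[str, str]  # (speaker, utterance)


@dataclass
class Prompt:
    description: str            # brief description of the chatbot (510)
    examples: List[List[Turn]]  # example dialogues (520)

    def render(self) -> str:
        """Join the description and the example dialogues into one input text."""
        blocks = [self.description.strip()]
        for dialogue in self.examples:
            blocks.append("\n".join(f"{speaker}: {text}" for speaker, text in dialogue))
        # End with a blank line so a completion model continues with a new session.
        return "\n\n".join(blocks) + "\n\n"


def parse_session(completion: str) -> List[Turn]:
    """Parse 'Speaker: text' lines from a completion, stopping at the first blank line."""
    turns: List[Turn] = []
    for line in completion.strip().splitlines():
        if not line.strip():
            break  # a blank line is assumed to mark the end of one session
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() in (USER, BOT):
            turns.append((speaker.strip(), text.strip()))
    return turns
```

The rendered text would be passed to whatever completion interface the language model exposes, and the completion parsed back into turns for both roles.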

Accordingly, using the language model, the processor 120 may generate naturally patterned dialogue data on open-domain topics while remaining consistent with the designated role. The dialogue data generated through the language model may be used as data for training an open-domain dialogue model.

Referring again to FIG. 4, in step S420 the processor 120 may filter the dialogue data generated in step S410 through annotations by a chatbot developer (or crowd worker). It is difficult to include every detail of the specification in the prompt and have it reflected in data generation. Therefore, the dialogue data generated through the language model may be filtered using human annotation. The processor 120 may present each dialogue session generated through the language model to an annotator, such as a chatbot developer, and request that problem utterances falling outside the dialogue bounds be labeled by annotation.

FIG. 6 shows an example of filtering dialogue data generated by the language model. The chatbot developer may assign a separate label, by annotation, to those chatbot utterances in the generated dialogue data that are judged to be problem utterances. When reflecting this filtering result in the training data of the open-domain dialogue model, the processor 120 may use the annotated problem utterance as a negative example 601, and may use at least one utterance preceding the problem utterance corresponding to the negative example 601 as a positive example 602.

In the dialogue data generated by the language model, the context after a problem utterance annotated by the chatbot developer is likely to contain already corrupted content, so it is dropped and not used for training the open-domain dialogue model.
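The filtering rule of FIG. 6 can be sketched as a small function. This is only an illustration under stated assumptions: a session is assumed to be a list of (speaker, utterance) pairs, the annotation is assumed to be the index of the first flagged utterance, and the name `split_by_annotation` is hypothetical.

```python
from typing import List, Optional, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


def split_by_annotation(
    session: List[Turn], problem_index: Optional[int]
) -> Tuple[List[Turn], Optional[Turn]]:
    """Return (positive_turns, negative_turn) for one annotated session.

    problem_index is the position of the first annotated problem utterance,
    or None when the annotator flagged nothing.
    """
    if problem_index is None:
        return list(session), None
    if not 0 <= problem_index < len(session):
        raise IndexError("problem_index out of range")
    positive = list(session[:problem_index])  # turns before the problem utterance (602)
    negative = session[problem_index]         # the flagged utterance itself (601)
    # Turns after problem_index are dropped: they follow a corrupted context.
    return positive, negative
```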

Referring again to FIG. 4, in step S430 the processor 120 may collect, as training data for the open-domain dialogue model, dialogue data exchanged between the chatbot developer (or crowd worker) and the chatbot, with the chatbot developer participating directly in the conversation. In the dataset construction step, the processor 120 may have the open-domain dialogue model take part in multi-turn conversations with a person. The chatbot developer, acting as a system user, may take turns conversing with the chatbot and, during the conversation, may correct any inappropriate problem utterance among the chatbot's responses.

As one example, the processor 120 may provide an interface through which the chatbot developer can directly correct a problem utterance of the chatbot during the conversation between the chatbot developer and the chatbot.

As another example, when a correction request from the chatbot developer is received for a problem utterance of the chatbot during the conversation, the processor 120 may call the language model to generate a replacement utterance that can correct the problem utterance. At the chatbot developer's request, the processor 120 may replace the chatbot's problem utterance with the replacement utterance and then continue with the next turn; if another correction request is received from the chatbot developer, it may generate the replacement utterance again.

FIG. 7 is an exemplary diagram for explaining a process of collecting dialogue data between a chatbot and a person according to an embodiment of the present invention.

FIG. 7 shows a web-based dialogue collection interface screen 700.

As shown in FIG. 7, the processor 120 may provide the dialogue collection interface screen 700 to collect training data for the open-domain dialogue model. The chatbot developer may take turns conversing with the chatbot through the dialogue collection interface screen 700.

The dialogue collection interface screen 700 may include a 'Fix' button 701 for requesting correction of a problem utterance that falls outside the dialogue bounds during the conversation with the chatbot. When an utterance of the chatbot does not match the chatbot specification or role, the chatbot developer may select a problem type and press the 'Fix' button 701.

When the chatbot developer selects the 'Fix' button 701 for a problem utterance among the chatbot's utterances, the processor 120 may call the language model to generate a replacement utterance that can correct the problem utterance. If the replacement utterance is appropriate, the chatbot developer continues with the next turn; if not, the chatbot developer may repeatedly request correction through the 'Fix' button 701.

The dialogue collection interface screen 700 may further include a 'Save Dialogue' button 702 for storing the dialogue data between the chatbot developer and the chatbot. When the conversation ends with no out-of-bounds utterances, or with all of them corrected, the chatbot developer may save the entire dialogue session through the 'Save Dialogue' button 702.

Once any corrections have been made and the conversation has ended, the processor 120 may use the entire dialogue session exchanged between the chatbot developer and the chatbot as training data for the open-domain dialogue model. Here, an utterance corrected to a replacement utterance during the conversation may be used as a positive example, and the utterance before correction, that is, the problem utterance for which correction was requested with the 'Fix' button 701, may be used as a negative example.
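The collection loop with correction requests can be sketched as below. This is a hedged illustration rather than the interface of FIG. 7: the `DialogueCollector` class and its method names are hypothetical, and `generate` is a stand-in for the call to the language model, which the source does not specify.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


class DialogueCollector:
    """Records one human-bot session plus the utterances rejected along the way."""

    def __init__(self, generate: Callable[[List[Turn]], str]):
        self.generate = generate      # hypothetical stand-in for the language model call
        self.turns: List[Turn] = []   # accepted turns; bot turns here are positive examples
        self.negatives: List[Tuple[List[Turn], str]] = []  # (context, rejected utterance)

    def user_says(self, text: str) -> str:
        """Append the user's turn and the bot's reply to the session."""
        self.turns.append(("User", text))
        reply = self.generate(self.turns)
        self.turns.append(("Bot", reply))
        return reply

    def fix(self) -> str:
        """Handle a press of button 701: reject the last bot utterance and regenerate it."""
        if not self.turns or self.turns[-1][0] != "Bot":
            raise ValueError("no bot utterance to fix")
        _, rejected = self.turns.pop()
        self.negatives.append((list(self.turns), rejected))  # kept as a negative example
        reply = self.generate(self.turns)
        self.turns.append(("Bot", reply))
        return reply
```

Calling `fix()` repeatedly models repeated correction requests; each rejected utterance is stored with the context in which it was produced.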

Furthermore, the processor 120 may use the process of correcting the chatbot's problem utterances as an evaluation metric for the chatbot. Because pressing the 'Fix' button 701 means that an inappropriate utterance was returned, it can be used to compute the system error rate, that is, the proportion of corrected responses among all responses.
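The error rate described above is a simple ratio. In the sketch below, the function name and the choice to count every returned response (including regenerated ones) in the denominator are assumptions; the source only states that the metric is the proportion of corrected responses among all responses.

```python
def system_error_rate(num_fixed: int, num_responses: int) -> float:
    """Proportion of chatbot responses for which a correction was requested."""
    if num_responses <= 0:
        raise ValueError("num_responses must be positive")
    if not 0 <= num_fixed <= num_responses:
        raise ValueError("num_fixed must be between 0 and num_responses")
    return num_fixed / num_responses
```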

The architecture of the role-specified open-domain dialogue model is described as follows.

Various architectures may be applied to model a dialogue system that satisfies the role specification using the dialogue dataset generated above.

The open-domain dialogue model according to the present invention may include an out-of-bounds detection model that detects utterances falling outside the bounds.

The simplest way to restrict utterances according to the chatbot's role specification is to detect and discard utterances that fall outside the bounds. As one example, a binary classifier based on BERT (Bidirectional Encoder Representations from Transformers) may be used to classify positive and negative examples in the dialogue dataset. Because the classifier cannot carry on a conversation by itself, a two-stage model is assumed in which a response prediction model returns a response that is then screened by the classifier. When an out-of-bounds utterance is detected, one of a set of predefined questions on other, similar topics may be selected and returned; here, the language model may be used to select the question with the lowest perplexity (PPL).
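The two-stage arrangement with a perplexity-based fallback can be sketched as follows. The classifier, the response predictor, and the token scorer are passed in as plain callables because the source does not specify their interfaces; all function and parameter names are hypothetical. Perplexity is computed in the usual way, as the exponential of the negative mean token log-probability.

```python
import math
from typing import Callable, List, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """PPL = exp(-mean log-probability) over the scored tokens (natural log)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def respond(
    context: str,
    predict: Callable[[str], str],                     # response prediction model
    is_out_of_bounds: Callable[[str, str], bool],      # classifier over (context, response)
    fallback_questions: List[str],                     # predefined questions
    score_tokens: Callable[[str, str], Sequence[float]],  # LM token log-probs of a question
) -> str:
    """Return the predicted response unless the classifier rejects it."""
    candidate = predict(context)
    if not is_out_of_bounds(context, candidate):
        return candidate
    # Rejected: fall back to the predefined question the language model finds most natural.
    return min(fallback_questions, key=lambda q: perplexity(score_tokens(context, q)))
```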

The open-domain dialogue model according to the present invention may include a response selection model.

Another way to restrict utterances according to the chatbot's role specification is to filter the response candidates for the response selection model in advance. As one example, a two-stage retrieve-then-rerank approach may be used for the response selection model. A retriever with a poly-encoder architecture finds the top k responses among the response candidates, and a reranker with a cross-encoder architecture then reorders those top k responses.
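The retrieve-then-rerank flow can be sketched independently of the encoder architectures. In this illustration the two scorers are plain callables standing in for the poly-encoder retriever and the cross-encoder reranker; their signatures, the function name, and the default `k` are assumptions.

```python
import heapq
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]  # (context, response) -> relevance score


def retrieve_then_rerank(
    context: str,
    candidates: Sequence[str],
    retriever_score: Scorer,
    reranker_score: Scorer,
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top-k candidates with reranker scores, best first."""
    # Stage 1: the cheaper scorer (a poly-encoder in the source) narrows the pool to k.
    top_k = heapq.nlargest(k, candidates, key=lambda r: retriever_score(context, r))
    # Stage 2: the costlier scorer (a cross-encoder in the source) reorders those k.
    rescored = [(r, reranker_score(context, r)) for r in top_k]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

The split reflects the usual trade-off: the first stage must score every candidate, so it should be cheap, while the second stage only sees k items and can afford a more expensive comparison.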

Because filtering the response candidates has its limits, it is important to identify contexts that cannot be answered with the response candidates. One effective way to predict unanswerable contexts is to exploit the model's uncertainty; as one example, an approach based on a threshold may be used. If the scores of all retrieved responses are below a certain threshold, the context may be predicted to be unanswerable. Another approach uses the PPL of the language model: the dialogue context and the retrieved response are concatenated and input into the language model, the PPL of the response is measured, and the response is accepted or rejected based on a threshold.
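Both threshold rules can be sketched in a few lines. The function names, the use of `None` to signal an unanswerable context, and the inclusive comparisons are assumptions made for illustration; the source does not state how ties at the threshold are handled.

```python
from typing import List, Optional, Tuple


def select_by_score(scored: List[Tuple[str, float]], threshold: float) -> Optional[str]:
    """Return the best-scoring response, or None if every score is below the threshold."""
    if not scored:
        return None
    best, best_score = max(scored, key=lambda pair: pair[1])
    # If even the best score is below the threshold, all scores are: unanswerable.
    return best if best_score >= threshold else None


def accept_by_ppl(response_ppl: float, max_ppl: float) -> bool:
    """Accept a retrieved response only if its perplexity given the context is low enough."""
    return response_ppl <= max_ppl
```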

The open-domain dialogue model according to the present invention may include a response generation model.

Fine-tuning a language model on target data is effective for learning the characteristics of a task. A language model fine-tuned with maximum likelihood estimation (MLE) may be considered as the response generation model. Unlikelihood training, on the other hand, is effective in mitigating undesirable traits of generative models (for example, token repetition and logical inconsistency).

In this embodiment, maximum likelihood estimation is applied to the positive examples of the dataset so that the chatbot generates utterances with desirable traits, while unlikelihood training is applied to the negative examples so that the chatbot does not generate utterances with undesirable traits. The two types of training may be performed simultaneously.
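The two objectives can be illustrated numerically. The sketch below assumes the common token-level formulation, in which the likelihood term is the negative log of the probability the model assigns to each target token and the unlikelihood term is the negative log of one minus that probability; the source does not give the exact loss, and the weighting factor `alpha`, the clamping constant, and the function names are hypothetical.

```python
import math
from typing import Sequence

_EPS = 1e-12  # clamp to avoid log(0); an implementation detail assumed here


def mle_loss(token_probs: Sequence[float]) -> float:
    """Negative log-likelihood of a positive utterance: -sum(log p_t)."""
    return -sum(math.log(max(p, _EPS)) for p in token_probs)


def unlikelihood_loss(token_probs: Sequence[float]) -> float:
    """Penalty for a negative utterance: -sum(log(1 - p_t))."""
    return -sum(math.log(max(1.0 - p, _EPS)) for p in token_probs)


def combined_loss(
    positive_probs: Sequence[float],
    negative_probs: Sequence[float],
    alpha: float = 1.0,
) -> float:
    """Train on both example types at once: MLE on positives, unlikelihood on negatives."""
    return mle_loss(positive_probs) + alpha * unlikelihood_loss(negative_probs)
```

Here each list holds the probabilities the model assigns to the tokens of one utterance. The combined loss falls as the model becomes more confident in positive utterances and less confident in negative ones.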

The open-domain dialogue model according to the present invention may include a retrieve-fail-generate model.

In building the open-domain dialogue model, a retrieve-fail-generate pipeline may be constructed that combines the response selection model and the response generation model described above.

FIG. 8 illustrates the structure of a retrieve-fail-generate model according to an embodiment of the present invention.

Referring to FIG. 8, the retrieve-fail-generate model 800 may consist of a response selection model 810 that uses the two-stage retrieve-then-rerank approach, and a response generation model 820 trained to generate appropriate utterances while not generating inappropriate ones.

Here, the response selection model 810 attempts to select an appropriate response, and if the unanswerable-context prediction model rejects the selected response, the response generation model 820 returns a response for the given context. The response selection model 810 is relatively easy to control by managing the response candidates. Therefore, the response selection model 810 handles most responses, and the response generation model 820 is used only when response selection fails.
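The control flow of the retrieve-fail-generate model 800 can be sketched as a single function. The selection and generation models are passed in as callables; the convention that `select` returns `None` when the context is judged unanswerable, and the string labels in the return value, are assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple


def retrieve_fail_generate(
    context: str,
    select: Callable[[str], Optional[str]],  # response selection model (810)
    generate: Callable[[str], str],          # response generation model (820)
) -> Tuple[str, str]:
    """Return (source, response), preferring selection and generating only on failure."""
    selected = select(context)
    if selected is not None:
        return "selected", selected
    # Selection failed (context predicted unanswerable): fall back to generation.
    return "generated", generate(context)
```

Returning the source alongside the response makes it easy to measure how often the generator is actually used.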

These embodiments can create a dialogue dataset that satisfies a role through a data collection framework that leverages the in-context few-shot learning of a language model, and can expand the dialogue dataset through annotation-based data filtering and direct conversation between a person and the chatbot. Using this dialogue dataset as training data, a dialogue system that satisfies the role specification can be modeled.

As described above, according to embodiments of the present invention, a dialogue system with improved response quality can be provided by building an open-domain dialogue model that can maintain a consistent role while conversing naturally with a person.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device is sometimes described as a single device, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The medium may continuously store a computer-executable program, or may temporarily store it for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples of the medium include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software.

Although the embodiments have been described above with reference to limited embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims also fall within the scope of the claims that follow.

Claims

A method executed on a computer device,
wherein the computer device includes at least one processor configured to execute computer-readable instructions contained in a memory,
the method comprising:
configuring, by the at least one processor, a prompt that serves as an input text of a language model, the prompt comprising a role specification of a chatbot and an example of a conversation between the chatbot and a person; and
generating, by the at least one processor, new dialogue data having the role and the dialogue pattern included in the prompt through the language model by inputting the prompt into the language model.

The method of claim 1,
further comprising:
building, by the at least one processor, an open-domain dialogue model for the chatbot by supervised learning on the dialogue data.

The method of claim 1,
further comprising:
filtering, by the at least one processor, the dialogue data through annotations.

The method of claim 3,
wherein the filtering comprises:
classifying an annotated problem utterance, among the utterances of the chatbot included in the dialogue data, as a negative example for training the chatbot.

The method of claim 1,
further comprising:
collecting, by the at least one processor, dialogue data between the chatbot and a person as training data for the chatbot through an interface in which the chatbot and the person converse directly.

The method of claim 5,
wherein the collecting comprises:
when a correction request is received for an utterance of the chatbot provided through the interface, correcting the utterance by generating a replacement utterance through the language model.

The method of claim 6,
wherein the collecting further comprises:
classifying the utterance for which correction was requested as a negative example for training the chatbot; and
classifying the utterance corrected to the replacement utterance as a positive example for training the chatbot.

The method of claim 2,
wherein the building comprises:
building the open-domain dialogue model using at least one of: an out-of-bounds detection model that detects utterances outside the bounds of the role specification; a response selection model that filters response candidates based on the perplexity (PPL) of the language model in a retrieve-then-rerank approach applied after the response candidates are found; and a response generation model in which maximum likelihood estimation is applied to positive examples in the dataset and unlikelihood training is applied to negative examples in the dataset.

A computer program stored on a computer-readable recording medium to cause the computer device to execute the method of any one of claims 1 to 8.

A computer device comprising:
at least one processor configured to execute computer-readable instructions contained in a memory,
wherein the at least one processor is configured to:
configure a prompt that serves as an input text of a language model, the prompt comprising a role specification of a chatbot and an example of a conversation between the chatbot and a person; and
generate new dialogue data having the role and the dialogue pattern included in the prompt through the language model by inputting the prompt into the language model.

The computer device of claim 10,
wherein the at least one processor is configured to:
build an open-domain dialogue model for the chatbot by supervised learning on the dialogue data.

The computer device of claim 10,
wherein the at least one processor is configured to:
filter the dialogue data through annotations.

The computer device of claim 10,
wherein the at least one processor is configured to:
collect dialogue data between the chatbot and a person as training data for the chatbot through an interface in which the chatbot and the person converse directly.