KR20220123170A

KR20220123170A - Language Learning System and Method with AI Avatar Tutor

Info

Publication number: KR20220123170A
Application number: KR1020210027068A
Authority: KR
Inventors: 조지수
Original assignee: 조지수
Priority date: 2021-02-28
Filing date: 2021-02-28
Publication date: 2022-09-06
Also published as: WO2022182064A1

Abstract

The present invention provides a conversation learning method using an artificial intelligence (AI) avatar tutor that corrects the user's expressions or suggests better ones so that the user can learn conversation effectively. The method comprises: a step in which the AI avatar explains a conversation topic and situation to the user; a step in which the AI avatar opens the conversation by asking a question suited to the topic and situation; a step in which the AI avatar shows appropriate reactions while the user speaks; a step of converting the user's response into text, understanding it in the context of the current conversation, and generating and expressing the AI avatar's response and gesture; a step of generating a sentence that expresses the user's response better or corrects its grammatical errors; and a step of inserting an advertisement banner into the AI avatar's background according to the conversation topic and situation.

Description

Conversation learning system and method using AI Avatar Tutor {Language Learning System and Method with AI Avatar Tutor}

The present invention relates to a method by which a user can improve conversational skills by talking with an artificial intelligence (AI) avatar. More specifically, the AI avatar acts as a tutor displayed on the screen: it starts a conversation by presenting a specific topic and situation to the user, receives the user's voice input, understands its meaning, and plays an audio response that fits the context of the conversation. It carries on a continuous conversation with the user and corrects the user's expressions or suggests more suitable ones. To give the feel of a natural, human-like conversation, the AI avatar also shows gestures suited to the context both while listening and while speaking. The invention relates to a learning system and a conversation learning method that enable conversation learning in this way.

Advances in technology for learning foreign languages have brought conversation learning systems that use a photorealistic human tutor. Instead of an actual human tutor, the user is shown a photorealistic avatar that reproduces the tutor's voice and mouth shapes and, once a specific topic is chosen, is made to follow a predetermined dialogue. That is, given a fixed topic and dialogue script, the photorealistic avatar reads the script and changes its mouth shape to match the sound. The user sees the response lines of the script as text and learns by reading them aloud. No conversation outside the given script is possible. Moreover, because the user only reads along with a fixed script and cannot use expressions thought up on the spot, the user cannot receive correction of his or her own expressions.

Conventional photorealistic avatar technology focuses on correcting specific pronunciations rather than on the experience of conversing with the user. It checks whether the user pronounces the given script well, and when the avatar responds it concentrates on rendering mouth shapes that show the correct pronunciation.

A drawback of this technology is that conversation learning with a photorealistic avatar concentrates on mastering correct pronunciation and fixed expressions rather than on free talking, which limits how much it can improve the user's conversational ability.

The present invention is proposed to solve the above problems of the prior art. In the present invention, an AI avatar tutor explains a given topic and situation to the user, leads the conversation naturally, shows reactions with appropriate gestures and sounds while understanding the user's utterance in the context of the conversation, and, after the user finishes speaking, generates a matching response and plays it as audio together with a gesture. An object of the present invention is to provide a conversation learning system and a conversation learning method using an AI avatar tutor that let the user learn conversation effectively by correcting the user's expressions or suggesting better ones.

As a technical means for achieving the above object, an embodiment of the present invention includes: a step in which an AI avatar explains a conversation topic and situation to a user; a step in which the AI avatar, on starting the conversation, explains the conversation topic and situation and states a matching question or request; a step in which the AI avatar shows appropriate reactions and interjections while the user speaks; a step of converting the user's response into text, understanding it in the context of the current conversation, and generating and expressing the AI avatar's response and gesture; a step of generating a sentence that expresses the user's response better or corrects its grammatical errors; and a step of inserting an advertisement banner into the AI avatar's background according to the conversation topic and situation.

According to any one of the above means, a conversation learning system can be provided in which the AI avatar serves as the conversation tutor; leads a conversation suited to the given topic and situation; conveys the experience of a real conversation through gestures and interjections while talking with the user; understands the user's utterances based on the context of the current conversation and generates matching responses and gestures; suggests better expressions for the user's utterances; and generates a corrected sentence when there is a grammatical error. By overcoming the limitation of existing AI tutors, which teach only fixed response expressions or pronunciation, an AI avatar tutor capable of natural, unscripted conversation with the user on arbitrary topics helps improve the user's conversational ability.

FIG. 1 is a diagram for explaining a conversation learning system using an AI avatar tutor according to an embodiment of the present invention.
FIGS. 2, 3, 4, 5, and 6 are diagrams for explaining an example implementation of the conversation learning system using an AI avatar tutor according to an embodiment of the present invention.
FIG. 7 is a diagram showing how data is transmitted and received among the components of the conversation learning system of FIG. 1 according to an embodiment of the present invention.

Embodiments of the present invention are described in detail below so that a person of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described here. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be 'connected' to another part, this includes not only being 'directly connected' but also being connected with another module in between. When a part is said to 'include' a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them, and it should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Terms of degree such as 'about' and 'substantially', as used throughout the specification, mean at or near the stated value when manufacturing and material tolerances inherent in the stated meaning are presented, and are used to prevent an unscrupulous infringer from unfairly exploiting a disclosure in which exact or absolute values are stated to aid understanding of the present invention. As used throughout the specification, the term 'step of (doing) ~' or 'step of ~' does not mean 'step for ~'.

In this specification, a 'unit' includes a unit realized by hardware, a unit realized by software, and a unit realized using both. One unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware.

In this specification, some of the operations or functions described as being performed by a terminal, apparatus, or device may instead be performed by a server connected to that terminal, apparatus, or device. Likewise, some of the operations or functions described as being performed by a server may be performed by a terminal, apparatus, or device connected to that server.

In this specification, some of the operations or functions described as mapping or matching with a terminal may be interpreted as mapping or matching the terminal's unique number or personal identification information, which is the terminal's identifying data.

Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining a conversation learning system using an AI avatar tutor according to an embodiment of the present invention. Referring to FIG. 1, the conversation learning system 1 using an AI avatar tutor may include a client 100 through which a user accesses the system, a network 200, and a server 300 that provides the AI avatar tutor conversation learning service. FIG. 1 is only one embodiment of the present invention, and the present invention is not to be construed as limited by FIG. 1.

The components of FIG. 1 are generally connected through the network 200. Here, a network means a connection structure that allows information to be exchanged among nodes such as a plurality of terminals and servers. Examples of such a network include, but are not limited to, RF, a 3GPP (3rd Generation Partnership Project) network, an LTE (Long Term Evolution) network, a 5G network, a WiMAX (Worldwide Interoperability for Microwave Access) network, the Internet, a LAN (Local Area Network), a wireless LAN (Wireless Local Area Network), a WAN (Wide Area Network), a PAN (Personal Area Network), a Bluetooth network, an NFC network, a satellite broadcasting network, an analog broadcasting network, and a DMB (Digital Multimedia Broadcasting) network. It goes without saying that the term 'at least one' is defined to include both the singular and the plural, and that even where the term 'at least one' is absent, each component may exist in the singular or the plural and may denote the singular or the plural. Whether each component is provided singly or in plurality may vary by embodiment.

The client 100 may consist of: an avatar control unit 110 through which the AI avatar converses directly with the user; a content control unit 120 that provides the conversation topic and situation guide; a conversation management unit 130 that shows the conversation between the AI avatar and the user as text and shows improved or alternative expressions for the user's expressions as well as marked and corrected grammatical errors; a system control unit 140 that handles the connections among the three modules 110, 120, and 130 and the control between the terminal's internal system functions and the server 300; and a service management unit 150 for the user's account, conversation history, and settings.

In the avatar control unit 110, the AI avatar's whole body, or a part of the body including the face, is shown together with a specific background. AI avatars come in various forms and shapes depending on the character, and even the same AI avatar may look different depending on the current conversation topic and situation. The background changes as well: for example, for a topic about a summer vacation, the avatar may appear on a beach in a swimsuit drinking a mojito.

The AI avatar's gestures differ depending on whether it is listening to the user speak, speaking to the user, or waiting for the next input. While the user speaks, it understands context such as the user's situation and the meaning of the utterance and shows an appropriate reaction, which signals that it is listening closely and keeps the conversation flowing naturally. For example, when the user speaks happily, the avatar responds to that mood with bright gestures and interjections, empathizing with the user. When the AI avatar speaks to the user, it likewise shows gestures that fit the context, conveying meaning effectively and helping the user become immersed in the conversation. For example, when the context of the avatar's utterance is an attempt to persuade the user, its facial expression, gestures, and tone change accordingly from their usual appearance. When the avatar is waiting for the user to speak, or the user takes no action, the avatar enters a waiting state and still keeps the conversation going with appropriate gestures. For example, if the user does not respond for several seconds after the AI avatar asks a question, the avatar waits with a gesture that gives the user enough time to think of a response.
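The three gesture situations described above (listening, speaking, waiting) can be pictured as a lookup keyed by avatar state and conversational context. The following is only a minimal sketch: the names `AvatarState`, `GESTURES`, and `select_gesture`, the context labels, and the gesture identifiers are all hypothetical and do not come from the specification.

```python
from enum import Enum


class AvatarState(Enum):
    LISTENING = "listening"  # the user is speaking
    SPEAKING = "speaking"    # the avatar is speaking
    WAITING = "waiting"      # nobody is speaking yet


# Hypothetical gesture table; the specification names no gesture identifiers.
GESTURES = {
    (AvatarState.LISTENING, "joy"): "smile_and_nod",
    (AvatarState.LISTENING, "neutral"): "nod",
    (AvatarState.SPEAKING, "persuade"): "lean_forward_open_hands",
    (AvatarState.SPEAKING, "neutral"): "talk_idle",
    (AvatarState.WAITING, "neutral"): "patient_wait",
}


def select_gesture(state: AvatarState, context: str = "neutral") -> str:
    """Return the gesture for a state and context.

    Falls back to the state's neutral gesture when the context is unknown.
    """
    return GESTURES.get((state, context), GESTURES[(state, "neutral")])
```

In this sketch the context label would be supplied by whatever component interprets the user's utterance; how that interpretation is done is outside the scope of the example.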

The avatar control unit 110 can also manage the speaking turns of the AI avatar and the user automatically. For example, a few seconds after the AI avatar finishes speaking, the microphone turns on automatically so that the user can speak, and a sound is played and the microphone icon changes to a recording icon so that the user notices. When the user finishes speaking, the system detects this and shows the microphone icon in the off state. The user's utterance can also be entered manually. While the AI avatar is speaking, the status indicator shows that the avatar is speaking; when the avatar finishes, the status indicator changes to a microphone icon. If the user then presses the microphone button, the user can record an utterance. When the user finishes speaking and presses the recording icon again, recording ends and the microphone icon is shown in the off state.
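The automatic and manual turn-taking described above amounts to a small state machine over the status icon. The sketch below is an illustration under assumptions: the class `TurnController`, its method names, and the icon labels are hypothetical, and the delay of a few seconds before the microphone opens is left out for brevity.

```python
class TurnController:
    """Tracks the status icon for automatic or manual turn-taking."""

    def __init__(self, auto_mode: bool = True) -> None:
        self.auto_mode = auto_mode
        self.icon = "avatar_speaking"

    def on_avatar_finished(self) -> None:
        # Automatic mode: recording starts by itself.
        # Manual mode: show the microphone icon and wait for a button press.
        self.icon = "recording" if self.auto_mode else "microphone"

    def on_end_of_speech_detected(self) -> None:
        # Automatic mode only: the system notices that the user stopped talking.
        if self.auto_mode and self.icon == "recording":
            self.icon = "mic_off"

    def on_mic_button(self) -> None:
        # Manual mode only: first press starts recording, second press stops it.
        if self.auto_mode:
            return
        if self.icon == "microphone":
            self.icon = "recording"
        elif self.icon == "recording":
            self.icon = "mic_off"
```

Keeping both modes in one object means the rest of a client would only need to read `icon`, regardless of which mode the user chose.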

The avatar control unit 110 shows an advertisement banner naturally in the background together with the AI avatar. A background suited to the current conversation topic and situation appears, and the advertisement banner in this background is curated by the advertisement providing unit 360 of the server 300 to fit the current conversation topic and situation. For example, if the current topic is printers, the logo or a product of a printer manufacturer such as Xerox or Canon is inserted into the background so that the advertisement is exposed to the user naturally.
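The specification does not say how the advertisement providing unit 360 matches banners to topics. As one possible reading, the sketch below uses plain keyword matching; the `BANNERS` inventory, the asset file names, and the function `select_banner` are hypothetical.

```python
from typing import Optional

# Hypothetical inventory: topic keyword -> banner asset name.
BANNERS = {
    "printer": "printer_brand_logo.png",
    "summer vacation": "beach_resort_banner.png",
}


def select_banner(topic: str, situation: str = "") -> Optional[str]:
    """Return the first banner whose keyword occurs in the topic or situation."""
    text = f"{topic} {situation}".lower()
    for keyword, banner in BANNERS.items():
        if keyword in text:
            return banner
    return None  # no suitable banner: show the background without an ad
```

A real curation step could rank candidates instead of taking the first match; the keyword lookup is used here only because it is the simplest instance of matching an advertisement to a topic.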

The content control unit 120 provides the conversation topic and situation information for the AI avatar and the user. The conversation topic covers most areas, including daily life and specialized fields, and may be displayed as text and pictures. The situation information can be set in various ways, such as the roles of the AI avatar and the user, problem solving, helping, or explaining. A conversation may also start from a specific passage: after the passage is read, the AI avatar and the user begin a conversation on a topic and situation related to it.

The content control unit 120 provides a control button that lets the user move to a new topic with the AI avatar. The user can also ask to change the topic by speaking in the avatar control unit 110. For example, the user may suggest to the AI avatar that they talk about something else, or name a specific topic to talk about. When the topic changes in this way, the AI avatar explains the new topic and situation and then naturally continues the conversation.

When an utterance by the AI avatar or the user ends in the avatar control unit 110, the text of that utterance is shown in the conversation management unit 130. The input text is passed to the interaction processing unit 330 and the language correction processing unit 340 of the server 300, and the conversation management unit 130 receives the error correction and paraphrasing results for the user's text from the language correction processing unit 340 and displays them on its screen. For errors, the erroneous span is marked in the user's text and the corrected expression is shown as well. A paraphrase is a sentence that, based on the user's text, conveys the same meaning with a better or different expression; it may be displayed below the user's text with a separate mark. For example, when the user's language skills are limited and the input consists of a few words rather than a complete sentence, turning it into a complete sentence that fits the conversation context, as done by the language correction processing unit 340, can also be regarded as paraphrasing.
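To make the display described above concrete, the following sketch shows one way a correction result could be represented and rendered as text. The data classes `ErrorSpan` and `CorrectionResult`, the `render` function, and the `[wrong -> right]` and `*` markers are assumptions for illustration; the specification does not define a data format or a visual mark.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorSpan:
    start: int       # index of the first character of the error in the user text
    end: int         # index one past the last character of the error
    correction: str  # corrected expression for this span


@dataclass
class CorrectionResult:
    user_text: str
    errors: List[ErrorSpan] = field(default_factory=list)
    paraphrase: str = ""


def render(result: CorrectionResult) -> str:
    """Mark each error span as [wrong -> right]; put the paraphrase on its own line."""
    parts, pos = [], 0
    for err in sorted(result.errors, key=lambda e: e.start):
        parts.append(result.user_text[pos:err.start])
        parts.append(f"[{result.user_text[err.start:err.end]} -> {err.correction}]")
        pos = err.end
    parts.append(result.user_text[pos:])
    lines = ["".join(parts)]
    if result.paraphrase:
        lines.append(f"* {result.paraphrase}")  # separate mark below the user text
    return "\n".join(lines)
```

For the input "I goed to school" with one error span covering "goed", `render` yields the user's sentence with the span marked inline, followed by the paraphrase on the next line.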

The conversation management unit 130 also provides audio files for the conversation text of the AI avatar and the user. These include audio of the AI avatar's voice, audio of the user's own utterance, audio of the user's utterance text rendered in the AI avatar's voice, and audio of the paraphrased text. With this text and audio, the user can listen again to his or her own pronunciation and to the AI avatar's pronunciation. The conversation management unit 130 may also provide a function that lets the user send feedback directly to the system when the AI avatar misrecognizes the user's utterance or gives a response that does not fit the context.

The system control unit 140 handles the connections among the three modules 110, 120, and 130 and the control between the terminal's internal system functions and the server 300. More specifically, it first fetches the topic of the conversation with the user from the content management unit 320 of the server 300 over the network 200 and passes it to the content control unit 120. It also receives the AI avatar's persona and a background suited to the current topic from the avatar management unit 310, and an advertisement banner suited to the current topic from the advertisement providing unit 360, and passes them to the avatar control unit 110. It then triggers the avatar control unit 110 to start a conversation on the new topic, recognizes what the AI avatar has said, and passes it to the conversation management unit 130. Beyond the flow described above, it may cover all other connections and control between client and server modules.

The service management unit 150 manages the user's account, lets the user select the AI avatar to talk with, and provides the history of previous conversations as audio and text. For the account, it lets the user choose a pricing plan: a user who has newly subscribed to the service can use a certain number of conversation topics free of charge, and when the user changes the topic in the content control unit 120 or the allotted number of conversation turns is used up, a pop-up message lets the user choose a plan. Each conversation in the history is automatically tagged with hashtags for its topic and may be displayed as a thumbnail with a picture of the AI avatar or the topic's background.

The server 300 may consist of: an avatar management unit 310 that defines the personas and gestures of the AI avatars and manages AI avatars that vary with conversation topic and situation; a content management unit 320 that provides topics and situations for conversations between the user and the AI avatar; an interaction processing unit 330 that generates responses and appropriate gestures according to the AI avatar's persona, the conversation topic and situation, and the context of the current conversation; a language correction processing unit 340 that analyzes the text spoken by the user to generate better expressions that fit the context and to detect grammatical errors; an account management unit 350 that manages users registered through the client 100; and an advertisement providing unit 360 that inserts natural-looking advertisement banners into the AI avatar's background in the avatar control unit 110.

The avatar management unit 310 defines and manages the avatars that serve as AI tutors. An AI tutor may have a human-like personality, history, and experience, and the interaction processing unit 330 generates responses and gestures on that basis. An AI avatar may have a human-like form or may appear as any of various characters, such as an animal or a game character. An AI avatar can make various gestures depending on the conversation topic or context, and the gesture may differ from avatar to avatar even in the same context. The main gestures defined in this way are used by the interaction processing unit 330. An AI avatar may also use its own distinctive expressions, and these main expression styles are likewise defined in the avatar management unit 310. Each AI avatar also has its own voice. The speech synthesis model that generates this voice is trained in advance and registered in the avatar management unit 310, and it is downloaded to the client 100 when the user selects the AI avatar to talk with in the service management unit 150.
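The attributes that the avatar management unit 310 is said to define (persona, form, gestures per context, distinctive expressions, and a voice model) can be gathered into one record per avatar. The sketch below is illustrative only: `AvatarProfile`, `AvatarRegistry`, the field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AvatarProfile:
    name: str
    form: str     # e.g. "human", "animal", "game character"
    persona: str  # personality, history, and experience as free text
    signature_phrases: List[str] = field(default_factory=list)
    gestures: Dict[str, str] = field(default_factory=dict)  # context -> gesture id
    voice_model: str = ""  # identifier of a pre-trained speech synthesis model


class AvatarRegistry:
    """Holds registered avatars and returns the one the user selects."""

    def __init__(self) -> None:
        self._avatars: Dict[str, AvatarProfile] = {}

    def register(self, profile: AvatarProfile) -> None:
        self._avatars[profile.name] = profile

    def select(self, name: str) -> AvatarProfile:
        return self._avatars[name]


registry = AvatarRegistry()
registry.register(AvatarProfile("Mia", "human", "cheerful barista",
                                gestures={"agree": "nod"}, voice_model="mia_tts"))
registry.register(AvatarProfile("Rex", "animal", "laid-back dog",
                                gestures={"agree": "wag_tail"}, voice_model="rex_tts"))
```

The two sample avatars map the same context, "agree", to different gestures, which mirrors the statement that gestures may differ between avatars even in the same conversation context.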

The content management unit 320 provides the topics and situations for conversation between the user and the AI avatar. These topics and situations vary with the user's conversational ability. Topics may range from everyday matters to specialized knowledge, and a situation may assign the user and the AI avatar conditions such as role play, question and answer, problem solving, or explanation.

The interaction processing unit 330 generates responses and appropriate gestures according to the AI avatar's persona, the conversation topic and situation, and the context of the current conversation. To generate responses, a base model is first built by training on real human-to-human conversation data with deep learning, and a conversation model is then built by further training on conversation data that matches each AI avatar's persona for a specific topic. In addition, because the purpose is conversation learning rather than conversation between the AI avatar and the user for its own sake, responses to questions or requests that users commonly make are generated by a rule-based model. For greetings that open a conversation and for conversation topic guides, conversation phrases are generated from predefined templates.
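
As a minimal sketch of the three response paths described above (predefined templates for openings, rule-based replies for common requests, and a persona-tuned conversation model otherwise), the routing might look as follows in Python. Every name here (`generate_reply`, `RULES`, `GREETING_TEMPLATE`, `persona_model`) is hypothetical, and simple keyword matching stands in for the rule-based model, which the description does not detail.

```python
from typing import Callable, Dict

# Hypothetical template for opening a session (not taken from the patent).
GREETING_TEMPLATE = "Hi {name}! Today let's talk about {topic}."

# Hypothetical keyword rules for requests that learners commonly make.
RULES: Dict[str, str] = {
    "repeat": "Sure, let me say that again.",
    "slower": "Of course, I will speak more slowly.",
}


def generate_reply(
    utterance: str,
    stage: str,
    name: str,
    topic: str,
    persona_model: Callable[[str], str],
) -> str:
    """Route one turn to a template, a rule, or the conversation model."""
    # Openings and topic guides come from predefined templates.
    if stage == "opening":
        return GREETING_TEMPLATE.format(name=name, topic=topic)
    # Common questions or requests are answered by rules.
    lowered = utterance.lower()
    for keyword, reply in RULES.items():
        if keyword in lowered:
            return reply
    # Everything else goes to the persona-specific conversation model,
    # represented here by an arbitrary callable supplied by the caller.
    return persona_model(utterance)
```

For example, `generate_reply("Can you repeat that?", "dialogue", "Mina", "hobbies", model)` takes the rule path, so the model callable is never invoked for that turn.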

Three kinds of gestures are generated for the AI avatar: listening gestures made while the user is speaking; speaking gestures made while the avatar, having heard and interpreted the utterance, delivers its response; and idle gestures made while there is no user input and the avatar itself takes no action. While listening, the avatar shows that it is listening to the user, and when the user pauses mid-utterance it reacts based on what it has understood so far. For example, if the avatar finds what the user says convincing or persuasive, it may nod and respond with "That's right."

The AI avatar generates a response after understanding what the user said, and when emphasis, pointing, or the like is needed while the avatar speaks that response, the response may include gestures that use body language.

The language correction processing unit 340 generates a better expression, suited to the context, for the text spoken by the user, and generates a corrected expression when there is a grammatical error. For paraphrasing, the language correction processing unit 340 uses a language model that is pre-trained on a large text corpus and then fine-tuned so that, given an input sentence, it generates a sentence containing alternative expressions. For grammatical errors, a pre-trained language model determines whether the input sentence contains an error, locates the erroneous span, and replaces the phrase with a corrected expression.
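
One way to present detected error spans next to their corrections is to diff the user's sentence against the corrected sentence. The sketch below assumes the corrected sentence has already been produced by the grammar correction model; it uses Python's standard `difflib` as a stand-in for span detection, which the description attributes to the language model itself. The function name `error_spans` is hypothetical.

```python
import difflib
from typing import List, Tuple


def error_spans(original: str, corrected: str) -> List[Tuple[str, str]]:
    """Return (erroneous phrase, corrected phrase) pairs at word level.

    An empty first element means words were inserted; an empty second
    element means words were deleted.
    """
    src, dst = original.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=src, b=dst, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep only the stretches where the two sentences differ.
            spans.append((" ".join(src[i1:i2]), " ".join(dst[j1:j2])))
    return spans
```

A client could use these pairs to highlight the erroneous phrase in the user's utterance text and show the corrected phrase beside it.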

The account management unit 350 analyzes the user's account information and learning activity. It analyzes the user's conversation history with the AI avatar to identify the topics and expressions that need improvement. It then enables the content management unit 320 to periodically generate similar topics and situations as conversation guides until the user can handle those topics and expressions comfortably.
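
The analysis of conversation history can be illustrated with a simple per-topic error rate. This is only a sketch under stated assumptions: the history format (pairs of topic and error flag), the threshold value, and the function name `weak_topics` are all hypothetical, and the description does not specify how weak topics are actually scored.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def weak_topics(
    history: Iterable[Tuple[str, bool]], threshold: float = 0.3
) -> List[str]:
    """List topics whose error rate meets the threshold, worst first.

    history holds one (topic, had_error) pair per user utterance.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for topic, had_error in history:
        totals[topic] += 1
        errors[topic] += int(had_error)
    rates = {topic: errors[topic] / totals[topic] for topic in totals}
    weak = [topic for topic, rate in rates.items() if rate >= threshold]
    return sorted(weak, key=lambda topic: -rates[topic])
```

The resulting list could then be handed to the content side so that similar topics and situations are scheduled again.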

The advertisement providing unit 360 manages advertisement banners that fit naturally into the background shown in the avatar control unit 110 and, when the content management unit 320 sets a conversation topic and situation, has a banner included with that topic and situation. The avatar control unit 110 shows a different background for each conversation topic; the relevance between each such background and the advertisement banners is calculated, and the banner with the highest relevance is inserted into the background.
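
The selection of the most relevant banner can be sketched as an argmax over a relevance score. The description does not say how relevance is computed here, so the example below assumes that backgrounds and banners are each described by keyword tags and uses cosine similarity over tag counts as a stand-in. The names `cosine` and `pick_banner` and the tag data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of tags."""
    dot = sum(a[tag] * b[tag] for tag in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_banner(background_tags: List[str], banners: Dict[str, List[str]]) -> str:
    """Return the name of the banner most relevant to the background."""
    background = Counter(background_tags)
    return max(banners, key=lambda name: cosine(background, Counter(banners[name])))
```

With a background tagged `["home", "game", "sofa"]`, a banner tagged `["game", "home", "console"]` scores higher than one tagged `["cafe", "drink"]` and would be the one inserted.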

FIGS. 2 to 6 are diagrams illustrating an embodiment in which a conversation service is implemented using the conversation learning system with an AI avatar tutor according to an embodiment of the present invention.

Referring to FIG. 2, when the user starts a conversation session with the AI avatar, a greeting or ice-breaking topic is presented in the content control unit 120 and the conversation begins. While the AI avatar speaks, the AI avatar control unit 110 indicates that the avatar is speaking, and the avatar makes gestures appropriate to the context of what it is saying. When the avatar finishes speaking, what it said is displayed as text in the conversation management unit 130.

Referring to FIG. 3, a few seconds after the AI avatar finishes speaking, the microphone for receiving the user's utterance turns on automatically; a recording icon is displayed and a recording-start sound is played so that the user notices. The user then responds to the AI avatar, and when the user says to begin the conversation in earnest or presses the button in the content control unit 120 for moving to the next topic, the AI avatar continues the conversation with the next conversation topic and situation. The user's utterance can also be entered manually. That is, after the AI avatar finishes speaking, the user presses the microphone button in the AI avatar control unit 110 to enter an utterance; a recording indicator then appears, and pressing that indicator after finishing speaking turns the microphone off again.

Referring to FIG. 4, the content control unit 120 presents the topic and situation for the conversation between the AI avatar and the user. These topics and situations are curated by the content management unit 320 of the server 300. When a new conversation topic and situation are given, the AI avatar delivers an appropriate conversation guide to the user through speech and gestures in the AI avatar control unit 110, and the corresponding text is displayed in the conversation management unit 130. For the AI avatar to speak in that conversation session, the current conversation topic and situation information are sent through the system control unit 140 to the interaction processing unit 330 of the server 300, which generates the text and gestures for the avatar and returns them to the system control unit 140. In the AI avatar control unit 110, the avatar performs the gestures received from the interaction processing unit 330 while the text is converted into audio in the avatar's voice and spoken. When the avatar finishes speaking, what it said is displayed as text in the conversation management unit 130.

The background behind the AI avatar in the AI avatar control unit 110 displays visual material for the current conversation topic and situation, and an advertisement banner may be included within this visual material. For example, if the conversation topic in FIG. 3 is jobs and hobbies and the situation information sets the user as a software engineer who likes mobile games, the background may show a meeting at the office using Google Analytics and playing games at home on a PlayStation.

Referring to FIG. 5, after the AI avatar finishes speaking, the user enters a response by voice, and the conversation management unit 130 displays as text what the AI avatar recognized of the user's utterance. The user's utterance text and the conversation content of the current topic session are sent through the system control unit 140 to the language correction processing unit 340 of the server 300, which generates a sentence containing a better expression than the user's response or an expression with grammatical errors corrected and returns it to the system control unit 140. The revised sentence is displayed below the user's utterance text in the conversation management unit 130. In addition, spans containing grammatical errors are marked in the user's utterance text, and the corrected phrases are provided.

Referring to FIG. 6, the user's utterance text and the conversation content of the current topic session are also sent to the interaction processing unit 330 of the server 300, which interprets the user's utterance based on the conversation context and generates a matching response sentence and gestures. It identifies named entities, demonstrative pronouns, and the relations among them in the current conversation session to generate a response or follow-up question that keeps the conversation flowing. As described above, the interaction processing unit 330 also generates the gesture sequence to be shown while the AI avatar speaks the response or follow-up question.

Matters not described regarding the conversation service of FIGS. 2 to 6, which uses the conversation learning system with an AI avatar tutor, are the same as, or can be readily inferred from, what was described above with reference to FIG. 1 regarding the method of providing that conversation service, and further description is therefore omitted.

The method of providing a conversation learning system using an AI avatar tutor according to the embodiment described with reference to FIGS. 2 to 6 may also be implemented in the form of a recording medium containing computer-executable instructions, such as an application or program module executed by a computer. Computer-readable media may be any available media that can be accessed by a computer and include volatile and nonvolatile media and removable and non-removable media. Computer-readable media may also include computer storage media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.

FIG. 7 is a diagram showing the process by which data is exchanged among the components of the conversation learning system using the AI avatar tutor of FIG. 1 according to an embodiment of the present invention. An example of this process is described below with reference to FIG. 7, but the present application is not to be construed as limited to this embodiment, and it will be apparent to those skilled in the art that the data exchange process shown in FIG. 7 may be modified according to the various embodiments described above.

Referring to FIG. 7, the server 300 of the AI avatar tutor conversation learning system receives an AI avatar interaction model, a language correction model, and conversation topic and advertisement curation models from at least one model training server 400 (S1100), and loads these models into its engine when the system starts (S1100).

The user selects an AI avatar through the client 100 and enters a conversation session (S2000). The server 300 selects the topic and situation for the conversation between the user and the AI avatar, and sets the associated avatar appearance and a background that includes an advertisement (S2100). The client 100 receives the conversation topic, situation, avatar appearance, and background information from the server 300 (S2200) and displays them on the screen.

The AI avatar explains the conversation topic and guide to the user and starts the conversation (S3000). The user answers the AI avatar's question or request by speaking, and the client converts the utterance to text through speech recognition and sends the utterance text and the current conversation content to the server 300 (S3100). The server 300 interprets the conversation context, generates response text and gestures, generates a paraphrased sentence for the user's input text, and, after detecting error spans, generates corrected expressions (S3200).

The client 100 then receives from the server 300 the generated response and gestures, the paraphrased sentence, and the detected error spans with their corrected expressions (S3300), responds to the user through the AI avatar, and displays in the conversation text view the paraphrased sentence and the corrections to the user's input text (S3400). These steps of the conversation between the AI avatar and the user repeat in a loop until the session for the current conversation topic ends (S3500).
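
The per-topic loop of steps S3100 to S3500 can be sketched from the client's point of view as follows. This is an illustration under stated assumptions, not the disclosed implementation: `TurnResult`, `run_topic_session`, and the `server_turn` callable are hypothetical names, and the server call is abstracted as a plain function that returns everything produced in S3200.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TurnResult:
    """What the server returns for one user utterance (S3300)."""
    reply: str
    gesture: str
    paraphrase: str
    corrections: List[Tuple[str, str]] = field(default_factory=list)
    session_done: bool = False


def run_topic_session(
    utterances: List[str],
    server_turn: Callable[[str, List[Tuple[str, str]]], TurnResult],
) -> List[TurnResult]:
    """Run one conversation topic session and return what was shown."""
    history: List[Tuple[str, str]] = []
    shown: List[TurnResult] = []
    for text in utterances:
        # S3100: send the recognized utterance text and current history.
        # S3200/S3300: the server generates and returns the results.
        result = server_turn(text, history)
        history.append(("user", text))
        history.append(("avatar", result.reply))
        # S3400: the avatar replies; paraphrase and corrections are shown.
        shown.append(result)
        # S3500: stop when the session for this topic has ended.
        if result.session_done:
            break
    return shown
```

In this sketch the end-of-session decision is carried in the server's result; the description leaves open which side makes that decision.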

When the session for a given conversation topic ends (S3500), the server 300 selects the next conversation topic and situation based on the user's conversational tendencies and level. The steps described above repeat in a loop for each conversation topic, and each time a topic session ends, the conversation between the user and the AI avatar and the user's feedback are sent as an update to the model training server 400 (S4100).

The order of the steps described above (S1100 to S4100) is only an example and is not limiting. That is, the order of steps S1100 to S4100 may be changed, and some of the steps may be executed simultaneously or omitted.

Matters not described regarding the method of FIG. 7 for providing a conversation learning system using an AI avatar tutor are the same as, or can be readily inferred from, what was described above with reference to the other drawings, and further description is therefore omitted.

The method of providing a conversation learning system using an AI avatar tutor according to an embodiment of the present invention described above may be executed by an application installed by default on a terminal (which may include a program included in a platform or operating system preinstalled on the terminal), or by an application (that is, a program) that the user installs directly on the master terminal through an application providing server such as an application store server or a web server associated with the application or the service. In this sense, the method for providing a video conversation service using virtual-reality-based conversational artificial intelligence according to an embodiment of the present invention described above may be implemented as an application (that is, a program) installed by default on a terminal or installed directly by the user, and may be recorded on a computer-readable recording medium of the terminal or the like.

The foregoing description of the present invention is for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is indicated by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Claims

A method of providing a conversation learning system using an artificial intelligence avatar tutor, executed on a user terminal and a conversation learning system providing server, the method comprising:
collecting, from a dialogue data providing server, AI avatar dialogue modeling data, gesture change data associated with the dialogue, paraphrase data for dialogue text, and error span and corrected expression data for dialogue text;
training an interaction model using an artificial intelligence neural network algorithm on the collected dialogue modeling data and gesture change data, training a paraphrasing model using an artificial intelligence neural network algorithm on the dialogue modeling data and paraphrase data, and training a grammar error correction model on the dialogue text, error span, and corrected expression data;
accessing the conversation learning system using the AI avatar tutor from the user terminal;
running, on the client, the conversation topic and situation obtained from a conversation content providing server and the avatar persona and gesture model obtained from an AI avatar management server;
loading an avatar persona and gesture model from the AI avatar management server onto the client when a different AI avatar is selected on the user terminal;
generating, after the AI avatar understands the conversation topic and situation, sentences and gestures for explaining them, and explaining them to the user;
generating and speaking, by the AI avatar, a question and gestures based on its understanding of the current conversation topic and situation and of the conversation history with the user;
performing speech recognition when the user, having understood the question, responds, and making appropriate reactions and motions;
converting the response into text when the user's response is finished, and interpreting it according to the current conversation topic and situation and the conversation context;
generating and speaking an appropriate response and gestures based on that interpretation;
generating and displaying a sentence that expresses the user's response text better for the conversation context and a sentence in which grammatical errors are corrected;
evaluating the user's conversation level using an artificial intelligence neural network algorithm based on the user's conversation history with the AI avatar, and training a curation model that presents conversation topics and situations similar to the topics or expressions in which the user is weak;
training a relevance model using an artificial intelligence neural network algorithm on the collected dialogue modeling data and advertisement content collected from an advertisement banner providing server; and
inserting an advertisement banner into the background of the AI avatar according to the conversation topic and situation.

The method of claim 1, further comprising, before the step of accessing the conversation learning system using the AI avatar tutor from the user terminal:
defining at least one AI avatar persona and performing conversation modeling;
mapping and storing the conversation modeling data and gesture data used by the at least one AI avatar, the dialogue modeling data and paraphrase data, and the dialogue text data with the error span and corrected expression data;
storing user responses for each conversation topic and situation, data for evaluating those responses, and data mapping similar conversation topics and situations to one another; and
mapping and storing advertisement content related to a given conversation topic and situation.

The method of claim 1,
wherein the step of analyzing the input user voice through a natural language processing algorithm and then determining a response text and gesture for the input voice comprises:
extracting a feature (spectrogram) from the input voice signal and recognizing an emotion using the extracted feature,
wherein the emotion recognition collectively covers general emotions that a person can feel, such as joy, sadness, and pain.

The method of claim 1,
wherein the step of analyzing the input user voice through a natural language processing algorithm and then determining a response text and gesture for the input voice comprises:
partially understanding the user's response as it is input and providing a backchannel interjection (chuimsae) and a gesture reaction while the user is speaking,
wherein the partial understanding splits what the user is saying, mid-utterance, into segments with distinct meanings and shows an interjection and gesture suited to the context of each segment.

The method of claim 1,
wherein the step of speaking the generated response text together with the generated gesture of the AI avatar is performed by:
extracting pre-stored gesture animation clips corresponding to the determined gestures; and
generating and compositing intermediate frames that smooth the transitions between the extracted animation clips.

The method of claim 1,
wherein the step of generating and displaying a sentence that expresses the user's response text better for the conversation context and a sentence in which grammatical errors are corrected is performed by:
generating an alternative expression of the user's response with the paraphrasing model, using as inputs the current conversation context, the conversation history including the user's response text, and the conversation topic and situation; and
feeding the input user response text into the grammar error correction model, detecting error spans, and generating a corrected expression for each span.

The method of claim 1,
wherein the step of evaluating the user's conversation level using an artificial intelligence neural network algorithm based on the user's conversation history with the AI avatar is performed by:
storing data that maps component scores such as fluency, contextual consistency, and accuracy to conversation data for a specific topic and situation;
building a conversation level evaluation model from the stored data; and
calculating a conversation level score by feeding the user's conversation history with the AI avatar into the conversation level evaluation model.

The method of claim 1,
wherein the step of presenting similar conversation topics and situations with which the user can re-learn weak topics or expressions is performed by:
classifying all conversation topics and situations based on similarity;
analyzing the conversation topics and situations in which the user frequently makes errors or has a low conversation level score; and
selecting a conversation topic and situation belonging to a class similar to that of the low-scoring conversation topic and situation.

The method of claim 1,
wherein the step of inserting an advertisement banner into the background of the AI avatar according to the conversation topic and situation comprises:
displaying the advertisement banner embedded in the place where the AI avatar is shown on the screen;
displaying the advertisement banner floating over the place where the AI avatar is shown on the screen;
displaying the advertisement banner in a space on the screen separate from the place where the AI avatar is located; and
displaying the advertisement banner on items the AI avatar is wearing, such as shoes or clothing.