WO2022182064A1 - Conversation learning system using an artificial intelligence avatar tutor, and associated method - Google Patents

Conversation learning system using an artificial intelligence avatar tutor, and associated method

Info

Publication number
WO2022182064A1
Authority
WO
WIPO (PCT)
Prior art keywords
conversation
avatar
user
artificial intelligence
topic
Prior art date
Application number
PCT/KR2022/002362
Other languages
English (en)
Korean (ko)
Inventor
조지수
Original Assignee
조지수
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 조지수
Publication of WO2022182064A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G09B19/06 Foreign languages
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied

Definitions

  • The present invention relates to a method by which a user can improve his or her conversational skills by conversing with an artificial intelligence avatar. More particularly, the artificial intelligence avatar acts as a tutor displayed on the screen: it presents a specific topic and situation to the user, receives the user's voice input, understands its meaning, and plays an audio response appropriate to the context of the conversation. It maintains a continuous conversation with the user and suggests corrections to the user's expressions or more appropriate expressions. The invention further relates to a learning system and a conversation learning method in which the AI avatar shows gestures suited to each context while listening or speaking, so that the conversation feels as natural as one with a human.
  • Conventionally, a live-action avatar with the same voice and mouth shape as a tutor is shown to the user, and after a specific topic is selected, the user is instructed to follow a set conversation. That is, given a set topic and dialogue script, the live-action avatar reads the script and changes its mouth shape to match the sound. The user sees the response text of the set dialogue script and learns by reading along. The user cannot hold any conversation other than the given script. Moreover, because the user only reads from the set script, the user cannot use expressions thought of spontaneously during the conversation, so those expressions cannot be corrected.
  • Conventional live-action avatar technology focuses on correcting specific pronunciation rather than on giving the user the experience of having a conversation.
  • The user looks at the given script and checks whether his or her pronunciation is good, and when the live-action avatar responds, the user concentrates on the avatar's mouth shape in order to pronounce correctly.
  • Conversational learning using a live-action avatar is therefore limited in improving the user's conversational ability, because it focuses on learning correct pronunciation and fixed expressions rather than on free conversation.
  • The present invention has been proposed to solve the problems of the prior art described above.
  • The present invention provides an artificial intelligence avatar tutor that explains a given topic and situation to a user and leads the conversation naturally. While interpreting the user's utterance within the conversation context, it reacts with appropriate gestures and sounds; after the user's utterance is finished, it generates a matching response and plays that response as audio together with a gesture.
  • Another object of the present invention is to provide a conversation learning system and a conversation learning method using an artificial intelligence avatar tutor that allow the user to learn conversation effectively by correcting the user's expressions or suggesting better ones.
  • An embodiment of the present invention includes the steps of: the artificial intelligence avatar explaining a conversation topic and situation to the user; the artificial intelligence avatar starting the conversation by asking an appropriate question or making an appropriate request; the AI avatar showing appropriate reactions and movements while the user is speaking; converting the user's response into text and understanding it according to the context of the current conversation; generating and expressing a response and gestures; generating sentences with better expressions than the user's response or with grammatical errors corrected; and inserting advertisement banners into the background of the AI avatar according to the topic and situation of the conversation.
  • According to the present invention, the artificial intelligence avatar becomes a conversational learning tutor that leads a conversation suited to a given topic and situation, gives the user the experience of a real conversation through gestures and chuimsae (short interjections that show the listener is following), understands user utterances based on the context of the current conversation, generates corresponding responses and gestures, suggests better expressions for user utterances, and generates correct sentences for grammatical errors.
  • This overcomes the limitations of existing AI tutors, which teach only specific response expressions and pronunciations, so that a conversation learning system can be provided that helps users improve their conversational skills through natural conversations on arbitrary topics.
  • FIG. 1 is a diagram for explaining a conversation learning system using an artificial intelligence avatar tutor according to an embodiment of the present invention.
  • FIGS. 2, 3, 4, 5, and 6 are diagrams for explaining an embodiment in which a conversation learning system using an artificial intelligence avatar tutor according to an embodiment of the present invention is implemented.
  • FIG. 7 is a diagram illustrating a process in which data is transmitted/received between components included in the conversation learning system using the artificial intelligence avatar tutor of FIG. 1 according to an embodiment of the present invention.
  • A "part" includes a unit realized by hardware, a unit realized by software, and a unit realized using both.
  • One unit may be implemented using two or more pieces of hardware, and two or more units may be implemented by a single piece of hardware.
  • Mapping or matching with the terminal may be interpreted as mapping or matching the terminal's unique number or personal identification information, which is the identification data of the terminal.
  • A conversation learning system 1 using an artificial intelligence avatar tutor may include a client 100 through which a user accesses the system, a network 200, and a server 300 that provides the AI avatar tutor conversation learning service.
  • FIG. 1 is only one embodiment of the present invention, and the present invention is not to be interpreted as limited by FIG. 1.
  • Each component of FIG. 1 is generally connected through the network 200.
  • The network refers to a connection structure in which information can be exchanged between nodes such as a plurality of terminals and servers.
  • Examples of such networks include, but are not limited to, RF, 3rd Generation Partnership Project (3GPP) networks, Long Term Evolution (LTE) networks, 5GPP (5th Generation Partnership Project) networks, WIMAX (World Interoperability for Microwave Access) networks, the Internet, LAN (Local Area Network), Wireless LAN (Wireless Local Area Network), WAN (Wide Area Network), PAN (Personal Area Network), Bluetooth networks, NFC networks, satellite broadcasting networks, analog broadcasting networks, and DMB (Digital Multimedia Broadcasting) networks.
  • The term "at least one" is defined as including both the singular and the plural; even where the term "at least one" is absent, each element may exist in the singular or the plural and may mean either. Whether each component is provided in the singular or the plural may also vary by embodiment.
  • The client 100 may be composed of an avatar control unit 110 that allows the AI avatar to converse directly with the user; a content control unit 120 that provides a conversation topic and situation guide; a conversation management unit 130 that displays the conversation between the AI avatar and the user as text and shows improved or alternative expressions, or corrections of grammatical errors, for the user's expressions; a system control unit 140 that controls the connections among the three modules 110, 120, and 130, the internal system functions of the terminal, and the connection to the server 300; and a service management unit 150 through which the user's account, conversation history, and settings can be managed.
  • The avatar control unit 110 shows the artificial intelligence avatar's whole body, or a part of the body including the face, against a specific background.
  • AI avatars come in various forms and shapes depending on the character, and the appearance of the same AI avatar may vary with the current conversation topic and situation.
  • The background changes as well. For example, for a topic related to summer vacation, the avatar may appear in a swimsuit on the beach, drinking a mojito.
  • The AI avatar changes its gestures depending on whether it is listening to the user, speaking to the user, or waiting for the next input. While the user is speaking, it interprets the user's situation and the meaning of the utterance, reacts appropriately to the context, shows that it is listening attentively, and allows the conversation to continue naturally. For example, when the user speaks with joy, the avatar responds sympathetically with bright expressions and gestures that match that feeling. When the AI avatar speaks to the user, it uses gestures that fit the context, conveying the meaning effectively and helping the user become immersed in the conversation.
  • In such cases, the facial expression, gesture, and tone of the AI avatar differ from their usual appearance. While waiting for the user to speak, or when the user takes no action, the AI avatar waits with appropriate gestures so that the conversation can continue. For example, if the user does not respond for several seconds after the AI avatar asks a question, the avatar waits with a gesture that gives the user enough time to think of a response.
  • The avatar control unit 110 may automatically manage turn-taking between the artificial intelligence avatar's speech and the user's speech input. For example, a few seconds after the AI avatar finishes speaking, the microphone turns on automatically, a sound is played, and the microphone icon changes to a recording icon so that the user can recognize it. When the user has finished speaking, the system recognizes this and displays the microphone icon as off. The user's utterance input can also be performed manually.
  • In manual input, while the AI avatar is speaking, the status display indicates that the AI avatar is speaking, and when the AI avatar finishes speaking, the status display changes to a microphone icon. When the user then presses the microphone button, the user can record his or her utterance. When the user finishes speaking and presses the recording icon again, the recording ends and the microphone icon is turned off.
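  • The automatic and manual turn-taking described above can be read as a small state machine. The following is a minimal sketch under that reading, not the applicant's implementation; the names `TurnState` and `TurnController` and the individual event methods are hypothetical and do not appear in the source.

```python
from enum import Enum, auto


class TurnState(Enum):
    AVATAR_SPEAKING = auto()  # status display: the AI avatar is speaking
    MIC_READY = auto()        # microphone icon shown, not yet recording
    RECORDING = auto()        # recording icon shown, user is speaking
    MIC_OFF = auto()          # user's turn is over, microphone icon off


class TurnController:
    """Tracks whose turn it is; auto_mode selects automatic or manual input."""

    def __init__(self, auto_mode: bool = True):
        self.auto_mode = auto_mode
        self.state = TurnState.AVATAR_SPEAKING

    def avatar_finished(self) -> None:
        # When the avatar stops speaking, the microphone icon is shown.
        if self.state is TurnState.AVATAR_SPEAKING:
            self.state = TurnState.MIC_READY

    def delay_elapsed(self) -> None:
        # Automatic mode only: the microphone turns on a few seconds later.
        if self.auto_mode and self.state is TurnState.MIC_READY:
            self.state = TurnState.RECORDING

    def mic_button_pressed(self) -> None:
        # Manual mode only: the button starts and then stops the recording.
        if self.auto_mode:
            return
        if self.state is TurnState.MIC_READY:
            self.state = TurnState.RECORDING
        elif self.state is TurnState.RECORDING:
            self.state = TurnState.MIC_OFF

    def end_of_speech_detected(self) -> None:
        # Automatic mode only: the system recognizes the user has finished.
        if self.auto_mode and self.state is TurnState.RECORDING:
            self.state = TurnState.MIC_OFF

    def avatar_starts(self) -> None:
        # The avatar takes the next turn.
        self.state = TurnState.AVATAR_SPEAKING
```

  • Keeping the mode check inside each event handler means that events belonging to the other mode are simply ignored, so a stray button press in automatic mode cannot interrupt a recording.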
  • The avatar control unit 110 also displays an advertisement banner in the background, together with the artificial intelligence avatar, in a natural way.
  • A background suitable for the topic and situation of the current conversation appears, and the advertisement banner in the background is curated by the advertisement providing unit 360 of the server 300 according to that topic and situation. For example, if the current conversation is about printers, advertisements are exposed to the user naturally by placing logos or products of printer manufacturers such as Xerox and Canon in the background.
  • The content control unit 120 provides conversation topics and context information for the conversation between the artificial intelligence avatar and the user.
  • Conversation topics cover most areas, including daily life and professional fields, and may be displayed as text and pictures.
  • The context information may be set in various ways, such as roles for the AI avatar and the user, problem solving, help, and explanation. The content control unit 120 may also read a specific passage and initiate a conversation between the AI avatar and the user on a topic and situation related to that passage.
  • The content control unit 120 provides a control button so that the user can move on to a new topic with the AI avatar.
  • Alternatively, the user may ask to change the subject by speaking to the avatar control unit 110. For example, the user may suggest to the AI avatar that they talk about a different topic, or may name a specific topic to talk about. When the conversation topic changes in this way, the AI avatar explains the new topic and situation and then continues the conversation naturally.
  • The text of each utterance is displayed in the conversation management unit 130.
  • The input text is transmitted to the interaction processing unit 330 and the language proofing processing unit 340 of the server 300. The conversation management unit 130 receives the error corrections and paraphrasing results for the user input text from the language proofing processing unit 340 and displays them on its screen.
  • Error display and correction marks the erroneous section in the user input text and also displays the corrected expression.
  • Paraphrasing produces a sentence that conveys the same meaning as the user input text with a better or different expression; it may be displayed with a separate mark below the user input text. For example, when the user's input consists of only a few words rather than a complete sentence because of limited language skills, the language proofing processing unit 340 may treat creating a complete sentence suited to the conversation context as paraphrasing.
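  • As an illustration of how a client might render this feedback, the sketch below marks each error section inline with its corrected expression and places the paraphrase on a separate line underneath. The `Correction` structure, the `[wrong -> right]` notation, and the `(paraphrase)` mark are assumptions made for this example; the source does not specify a data format or a display style. Error sections are assumed not to overlap.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Correction:
    start: int        # index of the first character of the error section
    end: int          # index one past the last character of the section
    replacement: str  # corrected expression for that section


def render_feedback(text: str, corrections: List[Correction],
                    paraphrase: Optional[str] = None) -> str:
    """Mark each error section as [wrong -> right]; put the paraphrase below."""
    parts = []
    cursor = 0
    for c in sorted(corrections, key=lambda item: item.start):
        parts.append(text[cursor:c.start])                          # unchanged text
        parts.append(f"[{text[c.start:c.end]} -> {c.replacement}]")  # marked error
        cursor = c.end
    parts.append(text[cursor:])
    lines = ["".join(parts)]
    if paraphrase:
        lines.append(f"(paraphrase) {paraphrase}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_feedback("I goed to school",
                          [Correction(2, 6, "went")],
                          "I headed to school."))
```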
  • The conversation management unit 130 also provides audio for the conversation text of the artificial intelligence avatar and the user. This includes audio of the AI avatar's voice, audio of the user's utterance, audio of the user's utterance text synthesized in the AI avatar's voice, and audio of the paraphrased text. With these texts and audio, users can listen again to their own pronunciation and to the AI avatar's pronunciation. In addition, when the AI avatar misrecognizes the user's utterance or gives a response that does not fit the context, the conversation management unit 130 may provide a function for the user to send feedback directly to the system.
  • The system control unit 140 controls the connections among the three modules 110, 120, and 130, the internal system functions of the terminal, and the connection to the server 300. More specifically, it first receives a conversation topic for the user from the content management unit 320 of the server 300 through the network 200 and delivers it to the content control unit 120.
  • It also receives the persona of the artificial intelligence avatar and a background suitable for the current conversation topic from the avatar management unit 310, receives an advertisement banner suitable for the current conversation topic from the advertisement providing unit 360, and transmits them to the avatar control unit 110.
  • The avatar control unit 110 then triggers a conversation on the new topic, and what the AI avatar has said is delivered to the conversation management unit 130.
  • In addition, the system control unit 140 may handle all other connections and control between client and server modules.
  • The service management unit 150 manages the user's account, lets the user select the artificial intelligence avatar with which to talk, and provides a history of previous conversations in audio and text form.
  • The user can also select a rate plan. A user who has signed up for the service for the first time can use a certain number of conversation topics free of charge; when the topic is changed in the content control unit 120 or the allocated number of conversation turns is exhausted, a pop-up message lets the user select a plan.
  • The conversation history is automatically tagged with its conversation topic and may be displayed as a thumbnail along with a picture of the AI avatar or the conversation topic background.
  • The server 300 may be composed of an avatar management unit 310 that defines the persona and gestures of the artificial intelligence avatar and manages how the avatar varies with the topic and situation of the conversation; a content management unit 320 that provides the topic and situation for the conversation between the user and the artificial intelligence avatar; an interaction processing unit 330 that generates a response and an appropriate gesture according to the persona of the artificial intelligence avatar, the conversation topic and situation, and the context of the current conversation; a language proofing processing unit 340 that analyzes the text uttered by the user to generate a better expression suited to the context and to detect grammatical errors; an account management unit 350 that manages users registered through the client 100; and an advertisement providing unit 360 that inserts a natural advertisement banner into the background of the artificial intelligence avatar in the avatar control unit 110.
  • The avatar management unit 310 defines and manages the avatars that serve as AI tutors.
  • The AI tutor may have a human-like personality, history, or experience, on the basis of which the interaction processing unit 330 generates responses and gestures.
  • AI avatars can have human-like shapes or can appear as various characters, such as animals or game characters. An AI avatar can make various gestures depending on the topic or context of the conversation, and different avatars may use different gestures even in the same conversation context.
  • The main gestures defined in this way are used by the interaction processing unit 330.
  • Each AI avatar can also use its own characteristic expressions, and these are likewise defined in the avatar management unit 310.
  • Each AI avatar also has its own voice.
  • The voice synthesis model for generating that voice is trained in advance and registered in the avatar management unit 310, and it is downloaded to the client 100 when the user selects, in the service management unit 150, the artificial intelligence avatar with which to talk.
  • The content management unit 320 provides the topic and situation for a conversation between the user and the AI avatar. These topics and situations depend on the user's conversational ability. Conversation topics may range from daily life to knowledge in specialized fields, and conditions such as roles, questions and answers, problem solving, and explanations may be set between the user and the AI avatar.
  • The interaction processing unit 330 generates a response and an appropriate gesture according to the persona of the AI avatar, the topic and situation of the conversation, and the context of the current conversation.
  • A base model is created by training on actual conversation data between people through deep learning, and a conversation model is then created by additionally training on conversation data suited to the persona of each AI avatar for a specific topic.
  • Responses to questions or requests that users make frequently are generated using a rule-based model.
  • For these, predefined template-based conversation phrases are created and used.
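  • One way to combine the two response paths is to try the rule-based templates first and fall back to the learned conversation model when no rule matches. The sketch below assumes that ordering, which the source does not state; the regular-expression patterns, the template texts, and the `conversation_model` callable (a stand-in for the trained model) are all hypothetical.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical templates for frequently made questions or requests.
TEMPLATES: Dict[str, str] = {
    r"\b(repeat|say that again)\b": "Sure. {last_avatar_utterance}",
    r"\bchange (the )?topic\b": "Of course. What would you like to talk about?",
}


def rule_based_reply(utterance: str, last_avatar_utterance: str) -> Optional[str]:
    """Return a template reply if any rule matches the utterance, else None."""
    for pattern, template in TEMPLATES.items():
        if re.search(pattern, utterance, flags=re.IGNORECASE):
            return template.format(last_avatar_utterance=last_avatar_utterance)
    return None


def respond(utterance: str, last_avatar_utterance: str,
            conversation_model: Callable[[str], str]) -> str:
    """Use a template when a rule matches; otherwise ask the learned model."""
    reply = rule_based_reply(utterance, last_avatar_utterance)
    if reply is not None:
        return reply
    return conversation_model(utterance)
```

  • For example, `respond("Could you repeat that?", "How was your weekend?", model)` returns the template reply without calling `model`, while an utterance that matches no rule is passed to `model` unchanged.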
  • The gestures generated for the AI avatar are of three kinds: gestures for the listening situation, when the AI avatar hears the user speak; gestures the AI avatar makes while speaking the response it produced after interpreting what it heard; and gestures for when the user neither speaks nor takes any input action.
  • In a listening situation, the avatar shows that it is listening to what the user is saying, and if there is a pause in the conversation, it reacts based on what it has understood up to that point. For example, if the AI avatar is persuaded by the user's words, it can say yes while nodding its head.
  • In a speaking situation, the AI avatar generates a response after understanding the user's words, and when emphasis or instructions are needed while it speaks, gestures using body language may be included.
  • The language proofing processing unit 340 generates a better expression for the text spoken by the user according to the context, and generates a corrected expression when there is a grammatical error.
  • For paraphrasing, the language proofing processing unit 340 builds a language model in advance from a large amount of text data and uses a fine-tuned model to generate a sentence with a different expression when an input sentence arrives.
  • For grammatical errors, a pre-trained language model determines whether the input sentence contains errors, finds the erroneous section, and replaces it with the corrected expression.
  • The account management unit 350 manages the user's account information and analyzes the user's learning. By analyzing the history of conversations the user has had with the AI avatar, it identifies which topics and expressions need improvement. The content management unit 320 may then periodically generate similar topics and situations as conversation guides until the user is familiar with those topics and expressions.
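  • The source does not say how the conversation history is analyzed. As one possible reading, the sketch below ranks topics by the share of user turns that needed a correction and returns those above a threshold, which the content management unit could then revisit. The history layout, the correction-rate measure, and the 0.3 threshold are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each entry: (topic, number of user turns, number of turns that were corrected)
History = List[Tuple[str, int, int]]


def topics_needing_practice(history: History, threshold: float = 0.3) -> List[str]:
    """Return topics whose correction rate is at least `threshold`, worst first."""
    turns: Dict[str, int] = defaultdict(int)
    corrected: Dict[str, int] = defaultdict(int)
    for topic, n_turns, n_corrected in history:
        turns[topic] += n_turns
        corrected[topic] += n_corrected
    # Correction rate per topic, skipping topics with no recorded turns.
    rates = {t: corrected[t] / turns[t] for t in turns if turns[t] > 0}
    weak = [t for t, rate in rates.items() if rate >= threshold]
    return sorted(weak, key=lambda t: rates[t], reverse=True)
```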
  • The advertisement providing unit 360 manages advertisement banners that fit naturally into the background of the avatar control unit 110. When the content management unit 320 sets a conversation topic and situation, the degree of association between the background for that topic and situation and each advertisement banner is calculated, and the banner with the highest relevance is inserted into the background of the avatar control unit 110.
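  • The source does not define how the degree of association is computed. The sketch below uses Jaccard similarity between keyword sets as a stand-in measure and picks the banner with the highest score; the function names, the keyword representation, and the banner names in the usage example are hypothetical.

```python
from typing import Dict, Optional, Set


def relevance(context_keywords: Set[str], banner_keywords: Set[str]) -> float:
    """Jaccard similarity between two keyword sets (an assumed measure)."""
    union = context_keywords | banner_keywords
    if not union:
        return 0.0
    return len(context_keywords & banner_keywords) / len(union)


def select_banner(context_keywords: Set[str],
                  banners: Dict[str, Set[str]]) -> Optional[str]:
    """Return the name of the banner most relevant to the topic, if any."""
    if not banners:
        return None
    return max(banners, key=lambda name: relevance(context_keywords, banners[name]))


if __name__ == "__main__":
    banners = {
        "printer_brand": {"printer", "ink", "toner"},
        "beach_resort": {"beach", "vacation"},
    }
    print(select_banner({"printer", "office", "ink"}, banners))
```

  • A production system would more likely score relevance with a learned curation model, which the source mentions elsewhere; set overlap is used here only because it is easy to verify by hand.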
  • FIGS. 2 to 6 are diagrams for explaining an embodiment in which a conversation service using a conversation learning system using an artificial intelligence avatar tutor according to an embodiment of the present invention is implemented.
  • First, a greeting or ice-breaking topic is given to the content control unit 120 and the conversation begins.
  • When the AI avatar speaks, the avatar control unit 110 indicates that the AI avatar is speaking, and the avatar makes gestures appropriate to the speaking context.
  • When the AI avatar finishes speaking, what it said is displayed as text in the conversation management unit 130.
  • A few seconds later, the microphone for receiving the user's utterance turns on automatically, a recording icon is displayed so that the user can recognize it, and a recording start sound is played. When the user then responds appropriately to the AI avatar, asks to begin the conversation proper, or presses the button in the content control unit 120 to move to the next topic, the AI avatar continues with the next conversation topic and situation.
  • The user's speech input can also be performed manually. That is, when the AI avatar finishes speaking, the user presses the microphone button of the avatar control unit 110 to input an utterance, and a recording indicator is displayed.
  • The content control unit 120 is given the topic and situation on which the AI avatar and the user will converse. These topics and situations are curated by the content management unit 320 of the server 300.
  • The avatar control unit 110 delivers an appropriate conversation guide to the user through words and gestures, and the text is displayed in the conversation management unit 130.
  • To this end, the current conversation topic and context information are sent through the system control unit 140 to the interaction processing unit 330 of the server 300, where the text and gestures for the AI avatar are generated and transmitted back to the system control unit 140.
  • The AI avatar performs the gesture received from the interaction processing unit 330 and speaks the text after it is converted into audio in the AI avatar's voice.
  • The conversation management unit 130 displays what the AI avatar said as text.
  • In the AI avatar background of the avatar control unit 110, visual data on the current conversation topic and situation are displayed, and an advertisement banner may be included in that visual data.
  • For example, the conversation topic in FIG. 3 is occupation and hobbies. The context information sets up a conversation in which the user is a software engineer who likes mobile games, and the background can show a meeting using Google Analytics at the company and playing games on a PlayStation at home.
  • When the user speaks, the conversation management unit 130 displays the text that the AI avatar recognized from the user's utterance.
  • The user's utterance text and the conversation content of the current conversation topic session are sent through the system control unit 140 to the language proofing processing unit 340 of the server 300, where a sentence with a better expression than the user's response sentence, or with its grammatical errors corrected, is generated and sent back to the system control unit 140.
  • The corrected sentence is displayed below the user's utterance text in the conversation management unit 130.
  • Sections with grammatical errors are marked in the user's utterance text, and corrected phrases are also provided.
  • The user's utterance text and the conversation content of the current conversation topic session are also sent to the interaction processing unit 330 of the server 300, where the user's utterance is understood based on the context of the conversation and a corresponding response sentence and gesture are generated. The unit generates responses or follow-up questions that continue the flow of conversation by identifying entity names, referring pronouns, and related relationships in the current conversation session. The interaction processing unit 330 also generates the sequence of gestures to be displayed while the AI avatar speaks the response or follow-up question.
  • The method of providing a conversation learning system using an artificial intelligence avatar tutor can also be implemented in the form of a recording medium including instructions executable by a computer, such as an application or program module executed by a computer. Computer-readable media can be any available media that can be accessed by a computer and include both volatile and nonvolatile media, removable and non-removable media. Computer-readable media may also include all computer storage media. Computer storage media include both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.
  • FIG. 7 it is a diagram illustrating a process in which data is transmitted/received between components included in the conversation learning system using the artificial intelligence avatar tutor of FIG. 1 according to an embodiment of the present invention.
  • FIG. 7 an example of a process in which data is transmitted and received between each component will be described with reference to FIG. 7 , but the present application is not to be interpreted as such an embodiment, and the It is obvious to those in the technical field that the process in which data is transmitted and received can be changed.
  • the artificial intelligence avatar tutor conversation learning system server 300 is provided with an artificial intelligence avatar interaction model, a language correction model, a conversation topic and an advertisement curation model from at least one model learning server 400 , (S1100), as the system starts, the corresponding model is loaded into the engine (S1100).
  • The user selects an artificial intelligence avatar through the client 100 and enters a conversation session (S2000).
  • The server 300 selects a topic and situation in which the user and the artificial intelligence avatar will talk, and sets the appearance of the related avatar and a background including advertisements (S2100).
  • The client 100 receives the conversation topic, situation, avatar appearance, and background information from the server 300 (S2200) and displays them on the screen.
  • The artificial intelligence avatar explains the conversation topic and guidance to the user, and starts the conversation (S3000).
  • The user responds by speech to the AI avatar's question or request, and the client 100 converts the speech to text through speech recognition and transmits the user's utterance text and the current conversation content to the server 300 (S3100).
  • The server 300 understands the conversation context, generates response text and gestures, generates a paraphrased sentence for the user input text, and also generates a corrected expression after detecting an error section (S3200).
  • The client 100 receives the response, gesture, paraphrased sentence, detected error section, and corrected expression generated by the server 300 (S3300), responds to the user through the artificial intelligence avatar, and displays the paraphrased sentence and the corrected errors in the user input text in the conversation text view (S3400). Until the session of the current conversation topic ends (S3500), the above-described steps of the conversation between the artificial intelligence avatar and the user are repeated in a loop.
  • The server 300 then selects the next conversation topic and situation based on the user's conversation tendency or level.
  • The above-described steps are repeated in a loop for each conversation topic, and when each conversation topic session ends, the conversation content between the user and the AI and the user feedback are uploaded to the model learning server 400 as an update (S4100).
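  • The per-topic loop described in the steps above can be sketched as follows. This is a minimal illustration only: `StubServer`, `TurnResult`, `run_sessions`, the fixed "I goed" correction rule, and the canned avatar reply are hypothetical stand-ins for the server 300, its models, and the client logic, none of which are specified at this level of detail in the source. Speech recognition is assumed to have already produced the user texts, and a session is assumed to end when its input is exhausted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TurnResult:
    response_text: str        # avatar response
    gestures: List[str]       # gestures shown with the response
    paraphrase: str           # placeholder paraphrase of the user text
    corrections: List[str]    # detected error sections with corrected expressions


class StubServer:
    """Stand-in for server 300; fixed rules replace the real models."""

    def __init__(self, topics: List[str]) -> None:
        self.topics = list(topics)
        # Stands in for updates sent to the model learning server 400.
        self.uploaded: List[Tuple[str, List[str]]] = []

    def select_topic(self) -> Optional[str]:                      # S2100
        return self.topics.pop(0) if self.topics else None

    def process_turn(self, utterance: str) -> TurnResult:         # S3200
        corrections = []
        if "I goed" in utterance:
            corrections.append("I goed -> I went")
        paraphrase = utterance.replace("I goed", "I went")
        return TurnResult("I see. What happened next?", ["nod"],
                          paraphrase, corrections)

    def upload_session(self, topic: str, history: List[str]) -> None:  # S4100
        self.uploaded.append((topic, list(history)))


def run_sessions(server: StubServer,
                 utterances_per_topic: Dict[str, List[str]]) -> List[str]:
    """Drive one session per topic and return the corrections shown in the text view."""
    text_view: List[str] = []
    topic = server.select_topic()
    while topic is not None:
        history = [f"Avatar: Let's talk about {topic}."]          # S3000
        # The session ends (S3500) when the recognized inputs run out.
        for utterance in utterances_per_topic.get(topic, []):
            history.append(f"User: {utterance}")                   # S3100
            result = server.process_turn(utterance)                # S3200, S3300
            history.append(f"Avatar: {result.response_text}")
            text_view.extend(result.corrections)                   # S3400
        server.upload_session(topic, history)                      # S4100
        topic = server.select_topic()                              # next topic
    return text_view
```

  • Keeping topic selection, turn processing, and session upload as separate server methods reflects the division of steps in FIG. 7, so that each stub could in principle be replaced by a call to the corresponding model without changing the loop.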
  • The above-described method for providing a conversation learning system using an artificial intelligence avatar tutor may be executed by an application installed by default in a terminal (which may include a program included in a platform or an operating system installed by default in the terminal), or by an application (i.e., a program) that a user installs directly on the master terminal through an application providing server, such as an application store server or a web server related to the application or the corresponding service.
  • In this sense, the method for providing a video chat service using virtual reality-based interactive artificial intelligence according to an embodiment of the present invention described above may be implemented as an application (i.e., a program) installed by default in a terminal or installed directly by a user, and may be recorded on a computer-readable recording medium of a terminal or the like.
  • The present invention is used as the main system configuration of Daily Talk (entu.agiavatar.com), an AI-based English learning platform, for role-play pre-talking conversation between an AI and a user, paraphrasing into native-speaker expressions, and conversation evaluation.
  • The present invention can be applied as a service on various devices such as mobile phones, tablets, PCs, and TVs, and can be used industrially by providing the individual functions of its constituent modules as separate services.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Tourism & Hospitality (AREA)
  • Biophysics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a method for providing a conversation learning system using an artificial intelligence avatar tutor, comprising steps in which: an artificial intelligence avatar explains a conversation topic and situation to a user; the artificial intelligence avatar first asks a question suited to the topic and situation while starting a conversation; the artificial intelligence avatar expresses appropriate reactions while the user speaks; a response from the user is converted into text so as to be understood according to the context of the current conversation, and a response and a gesture of the artificial intelligence avatar are generated and expressed; a sentence is generated with a better expression than that of the user's response or with an expression corrected for grammatical errors; and an advertising banner is inserted into the artificial intelligence avatar background according to the conversation topic and situation.
PCT/KR2022/002362 2021-02-28 2022-02-17 Conversation learning system using artificial intelligence avatar tutor, and method therefor WO2022182064A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020210027068A KR20220123170A (ko) 2021-02-28 2021-02-28 인공지능 아바타 튜터를 활용한 회화 학습 시스템 및 그 방법
KR10-2021-0027068 2021-02-28

Publications (1)

Publication Number Publication Date
WO2022182064A1 true WO2022182064A1 (fr) 2022-09-01

Family

ID=83049399

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/002362 WO2022182064A1 (fr) 2021-02-28 2022-02-17 Conversation learning system using artificial intelligence avatar tutor, and method therefor

Country Status (2)

Country Link
KR (1) KR20220123170A (fr)
WO (1) WO2022182064A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117058286A (zh) * 2023-10-13 2023-11-14 北京蔚领时代科技有限公司 一种文字驱动数字人生成视频的方法和装置

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240067507A (ko) * 2022-11-09 2024-05-17 국립창원대학교 산학협력단 키워드를 이용한 패러프레이즈 문장 생성장치 및 방법
KR102671569B1 (ko) * 2023-02-24 2024-06-05 주식회사 구루미 인공지능 관리 프로바이더 기반의 교육 컨텐츠 제공방법
KR102607095B1 (ko) 2023-08-22 2023-11-29 장정완 인공지능에 기초한 맞춤형 대화 영어 학습 시스템

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101856940B1 (ko) * 2017-02-20 2018-05-14 주식회사 투윈글로벌 소셜 네트워크 서비스 시스템 및 이를 이용한 소셜 네트워크 서비스 방법
KR101992424B1 (ko) * 2018-02-06 2019-06-24 (주)페르소나시스템 증강현실용 인공지능 캐릭터의 제작 장치 및 이를 이용한 서비스 시스템
US20190238487A1 (en) * 2018-02-01 2019-08-01 International Business Machines Corporation Dynamically constructing and configuring a conversational agent learning model
KR20200058909A (ko) * 2018-11-20 2020-05-28 주식회사 토킹코리아 사용자 맞춤형 상황별 언어 학습 방법 및 장치와 그 시스템
KR20200064021A (ko) * 2018-11-28 2020-06-05 김훈 대화형 교육 시스템에 포함되는 사용자 장치와 교육 서버

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117058286A (zh) * 2023-10-13 2023-11-14 北京蔚领时代科技有限公司 一种文字驱动数字人生成视频的方法和装置
CN117058286B (zh) * 2023-10-13 2024-01-23 北京蔚领时代科技有限公司 一种文字驱动数字人生成视频的方法和装置

Also Published As

Publication number Publication date
KR20220123170A (ko) 2022-09-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22759982

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22759982

Country of ref document: EP

Kind code of ref document: A1