WO2016175354A1 - Artificial intelligence conversation device and method - Google Patents

Artificial intelligence conversation device and method

Info

Publication number
WO2016175354A1
WO2016175354A1 · PCT/KR2015/004347
Authority
WO
WIPO (PCT)
Prior art keywords
response
voice
user
conversation
question
Prior art date
Application number
PCT/KR2015/004347
Other languages
French (fr)
Korean (ko)
Inventor
이영근
김승곤
임완섭
임성환
김우현
이영호
김두호
Original Assignee
AKA Intelligence Co., Ltd. (주식회사 아카인텔리전스)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AKA Intelligence Co., Ltd. (주식회사 아카인텔리전스)
Priority to PCT/KR2015/004347 priority Critical patent/WO2016175354A1/en
Publication of WO2016175354A1 publication Critical patent/WO2016175354A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Abstract

The present invention relates to an artificial intelligence conversation device and method for supporting a conversation between a human and a robot. An artificial intelligence conversation device according to an aspect of the present invention comprises: an input answer analysis unit which analyzes an answer input from a user; a response control unit which selects at least one response scenario among preset scenarios according to the result of the analysis and transmits an output command for a response to the user's answer and a follow-up question; and an output unit which outputs silence or a conversation start voice and outputs a response voice and a question voice according to the output command from the response control unit.

Description

Artificial Intelligence Conversation Apparatus and Method

The present invention relates to an artificial intelligence dialogue apparatus and method for supporting a dialogue between a person and a robot.

Chat supports a conversation with another party over a network using a computer or portable terminal, and it is widely used in the form of messenger chat windows.

However, a chat between people cannot take place when no conversation partner is available, and this limitation led to the development of chat robots.

As the need for natural-language communication between humans and computers (robots) has grown in the field of intelligent agents, various chat robot technologies have been proposed.

A conversation engine according to the related art provides a preset answer corresponding to the text input by the user, so the subject of the conversation shifts abruptly with each user input.

Although a natural-language dialogue engine between humans and robots is arguably the most important factor in minimizing the sense of unnaturalness in talking to a robot and enabling natural dialogue, the conventional technology is a passive dialogue engine that simply answers based on the user's input. It not only feels unnatural to the user but also requires the user to drive the conversation, so the flow of the conversation and the user's interest in it drop sharply.

The present invention has been proposed to solve the above-described problems. By proceeding through the dialogue in the order of sending a question, receiving an answer, responding to the answer, and sending the next question, and by inducing the user toward the next exchange within the current subject, an object of the present invention is to provide an artificial intelligence conversation device and method that support a natural conversation with a user without straying from the topic.

According to an aspect of an exemplary embodiment, an artificial intelligence conversation device includes an input response analysis unit that analyzes an input user response; a response control unit that selects at least one response scenario among preset scenarios according to the analysis result and transmits an output command for a response to the user response and a follow-up question; and an output unit that outputs silence or a conversation start voice and outputs a response voice and a question voice according to the output command of the response control unit.

The artificial intelligence conversation apparatus and method according to the present invention actively advance a conversation in the order of question transmission, user response reception, and response to the user response based on a preset scenario. Rather than merely providing a predetermined answer to each user input, this leads to an active conversation, minimizing the user's sense of unnaturalness when conversing with the conversation engine and increasing interest in the conversation.

By classifying the answers received from users by type and organizing the components belonging to each answer, the reliability of user-answer analysis is improved, and by providing a response that matches the user's answer, the conversation can proceed flexibly to its next turn.

The effects of the present invention are not limited to those mentioned above, and other effects that are not mentioned will be clearly understood by those skilled in the art from the following description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an artificial intelligence conversation apparatus according to an embodiment of the present invention.

FIG. 2 is a flowchart illustrating an artificial intelligence conversation method according to an embodiment of the present invention.

DETAILED DESCRIPTION

The above and other objects, advantages, and features of the present invention, and methods of achieving them, will become apparent with reference to the embodiments described below in detail in conjunction with the accompanying drawings.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms. The following embodiments are provided merely so that those skilled in the art to which the present invention pertains can easily understand its configuration and effects; the scope of the present invention is defined by the claims.

Meanwhile, the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. In this specification, the singular also includes the plural unless the context clearly indicates otherwise. As used herein, "comprises" and/or "comprising" specifies the presence of the mentioned components, steps, operations, and/or elements, but does not preclude the presence or addition of one or more other components, steps, operations, and/or elements.

FIG. 1 is a block diagram illustrating an artificial intelligence conversation apparatus according to an embodiment of the present invention.

The artificial intelligence conversation apparatus according to an embodiment of the present invention includes an input unit 100 that receives a voice from the user's utterance; a Speech-To-Text (STT) unit 200 that converts the voice received by the input unit 100 into text; an input response analysis unit 300 that receives the STT conversion result and analyzes the user response; a response control unit 400 that selects at least one response scenario among preset scenarios according to the analysis result and transmits an output command for a response to the user response and a follow-up question; and an output unit 500 that outputs silence or a conversation start voice and outputs a response voice and a question voice according to the output command of the response control unit 400.
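
The five-unit pipeline described above (input unit 100 → STT unit 200 → input response analysis unit 300 → response control unit 400 → output unit 500) can be sketched as follows. This is an illustrative sketch only: the class names, the keyword-based analysis, and the scenario table are assumptions, and speech capture and recognition are stubbed with plain strings.

```python
class SttUnit:
    """STT unit 200: converts captured audio into a text string (stubbed)."""
    def convert(self, audio):
        return audio  # stub: assume `audio` is already the recognized text

class InputAnswerAnalyzer:
    """Input response analysis unit 300: classifies the user's answer."""
    def analyze(self, text):
        positives = {"yes", "sure", "i did"}
        return "positive" if text.lower() in positives else "other"

class ResponseController:
    """Response control unit 400: picks a scenario, issues an output command."""
    def __init__(self, scenarios):
        self.scenarios = scenarios
    def command(self, analysis):
        response, question = self.scenarios.get(analysis, self.scenarios["other"])
        return {"response": response, "question": question}

class OutputUnit:
    """Output unit 500: renders the response voice and the next question."""
    def emit(self, cmd):
        return f'{cmd["response"]} {cmd["question"]}'

def converse_once(audio, stt, analyzer, controller, output):
    """One turn: STT conversion -> analysis -> scenario command -> output."""
    text = stt.convert(audio)
    analysis = analyzer.analyze(text)
    return output.emit(controller.command(analysis))

scenarios = {
    "positive": ("Glad to hear it.", "What went well today?"),
    "other": ("I see.", "Tell me more?"),
}
reply = converse_once("yes", SttUnit(), InputAnswerAnalyzer(),
                      ResponseController(scenarios), OutputUnit())
```

Note that a response and a follow-up question are always emitted together, which is what keeps the device, rather than the user, driving the conversation.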

The input unit 100 according to an embodiment of the present invention receives the user's voice through the microphone of the artificial intelligence conversation device.

The artificial intelligence conversation apparatus according to an embodiment of the present invention performs, in order, the output of a question to the user, the input of an answer from the user, the output of a response to that answer, and the output of the next question following the response.

The question according to an embodiment of the present invention is provided at the beginning of a conversation with a user and is expressed as a conversation start voice.

Here, either the conversation start voice is the first question provided through the output unit, or, when silence is output instead, the user's first voice input starts the conversation, which then proceeds in the order of response voice output and question voice output.

That is, rather than only providing a preset answer matching a query input from the user, the apparatus asks the user a question based on a preset scenario, analyzes the user's response to it, and outputs a response and the next question, thereby supporting a natural dialogue between the user and the AI within a single conversation subject.

The output unit 500 according to an embodiment of the present invention outputs a conversation start voice, i.e., a question voice for starting a conversation, based on the application execution environment information before the conversation begins.

Here, in the first embodiment, a conversation start voice, which is a question voice, is output, and the conversation proceeds in the order of the user's answer, response voice output, and question voice output. In the second embodiment, silence is output, the user's voice input becomes the starting point of the conversation, and the conversation proceeds in the order of response voice output and question voice output.

In this case, the application execution environment information may be at least one of a structured scenario database, the user's personal information, the user's behavior pattern, a record of previous conversations, and surrounding environment information. For example, if the record of previous conversations concerns a project at the user's company, the apparatus outputs the question "How did the project go today?" when the application runs.

Also, if the application execution environment information indicates "weekend" and the weather information is "sunny," the output unit 500 outputs, as a conversation-starting question unrelated to the company, a question such as "It's the weekend; isn't the weather nice?"

That is, the output unit 500 according to an embodiment of the present invention does not merely provide a predetermined answer based on the user's input; it presents the user with a question on an appropriate topic when the application is executed, starting the conversation naturally and providing a customized conversation.
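
The selection of a conversation-start question from the application execution environment information, as in the weekday-project and sunny-weekend examples above, might be sketched like this. The dictionary keys and question strings are illustrative assumptions, not the patent's actual data model; returning `None` stands for the silence-output embodiment.

```python
def opening_question(env):
    """Pick a conversation-start question from execution environment info.

    `env` keys ("previous_topic", "day_type", "weather") are illustrative.
    Returns None when no suitable question is found, i.e. output silence
    and wait for the user's first utterance to start the conversation.
    """
    if env.get("previous_topic") == "company project":
        return "How did the project go today?"
    if env.get("day_type") == "weekend" and env.get("weather") == "sunny":
        return "It's the weekend and the weather is nice, isn't it?"
    return None

q1 = opening_question({"previous_topic": "company project"})
q2 = opening_question({"day_type": "weekend", "weather": "sunny"})
q3 = opening_question({})
```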

In addition, the response control unit 400 according to an embodiment of the present invention can not only command the selection and output of a response and question for the user response from a pre-stored list, but also generate and output a new response and question for the user response.

The input unit 100 according to an embodiment of the present invention receives the user's voice input in response to the conversation start voice, or after the silence output, and the STT unit 200 provides the result of converting the user's voice into a character string to the input response analysis unit 300.

The input response analysis unit 300 analyzes whether the user answer converted into a character string corresponds to one of the predetermined answer types.

The structured scenario database according to an embodiment of the present invention stores and manages answers by type: selective answers, general answers, repeat-request answers, and irrelevant answers.

A selective answer is a type in which the classification of the user's answer to the question is clearly defined by the choice made; examples include positive/negative and spring/summer/fall/winter.

A general answer, unlike a selective answer, is a type with ambiguity and many possible choices for a question, such as an answer to the question "What kind of exercise do you like?"

A repeat-request answer is one in which the user asks for the question to be output again. In this case, the output unit 500 re-outputs the question that was just output.

An irrelevant answer is one unrelated to the question, for example an answer having nothing to do with the question "What weather do you like?" In this case, the artificial intelligence conversation device according to an embodiment of the present invention may modify the scenario based on the user's answer and extract and provide responses and questions in sequence, or it may ask the user again a question in the category corresponding to the original question.

The input response analysis unit 300 determines which of the predetermined categories the input user answer belongs to and outputs the result. For example, when the answer corresponds to the selective type, sentence analysis determines whether the input user response is affirmative or negative with respect to the question.
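
The four answer types above (selective, general, repeat-request, irrelevant) could be distinguished roughly as follows. The keyword lists and the substring-based topic check are simplifying assumptions standing in for the patent's sentence analysis; they are not its actual rules.

```python
# Illustrative keyword sets (assumptions, not the patent's classification rules).
POSITIVE = {"yes", "yeah", "sure"}
NEGATIVE = {"no", "nope", "not really"}
REPEAT = {"what?", "pardon?", "say that again"}

def classify_answer(question_topic, answer):
    """Return (answer_type, detail) for a user answer to a question.

    answer_type is one of "selective", "general", "repeat", "irrelevant";
    detail is "positive"/"negative" for selective answers, else None.
    """
    a = answer.lower().strip()
    if a in REPEAT:
        return ("repeat", None)            # output unit re-outputs the question
    if a in POSITIVE:
        return ("selective", "positive")
    if a in NEGATIVE:
        return ("selective", "negative")
    if question_topic and question_topic in a:
        return ("general", None)           # open-ended but on-topic answer
    return ("irrelevant", None)            # unrelated: re-ask or switch scenario

r1 = classify_answer("exercise", "yes")
r2 = classify_answer("exercise", "I like exercise, mostly swimming")
r3 = classify_answer("exercise", "pardon?")
r4 = classify_answer("exercise", "I have a wedding this weekend")
```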

The response control unit 400 according to an embodiment of the present invention selects at least one response scenario from among the scenarios stored in the structured scenario database according to the analysis result, and generates an output command for a response and a question based on the response scenario matching the subject to which the user answer applies.

By extracting and providing responses and questions based on scenarios so as to lead into the next exchange according to the user's answer, the user can hold a natural conversation with the AI conversation device that corresponds to his or her answers, without a sense of unnaturalness.

When new voice data is received from the user while the output unit 500 is outputting the response voice and the question voice, the response control unit 400 transmits a pause command signal to the output unit 500 and then re-extracts the response and question according to the analysis result of the input response analysis unit 300 for the new voice data.

That is, the response scenario selected by the response control unit 400 according to an embodiment of the present invention is determined and modified in real time according to the user's response or comment, and the response scenario is appropriately corrected based on the structured scenario database.

For example, suppose the first question was "How was work today?" and, while the user was talking about what happened at the company, the user answered, "But I have to go to a wedding this weekend." If it is determined that the user intends to change the topic, the response control unit 400 changes the scenario (e.g., it asks a first question on the new topic, such as "Is your friend getting married? Where is the ceremony?", and continues the conversation into the specific event of attending the wedding).
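
The barge-in behaviour above, pausing output when new voice data arrives and re-extracting a response and question on a possibly new topic, might look like this in outline. The class name and the keyword-based topic-change detection are stand-in assumptions for whatever analysis the patent contemplates.

```python
class BargeInController:
    """Sketch of response control unit 400's barge-in handling (assumed names)."""
    def __init__(self):
        self.output_paused = False
        self.current_topic = "company"

    def on_new_voice(self, text):
        """Handle voice data arriving mid-output: pause, maybe switch topic."""
        self.output_paused = True       # pause command signal to output unit 500
        if "wedding" in text.lower():   # crude topic-change detection (assumption)
            self.current_topic = "wedding"
            return "Is your friend getting married? Where is the ceremony?"
        return "Go on, I'm listening."  # stay in the current scenario

ctrl = BargeInController()
reply = ctrl.on_new_voice("But I have to go to a wedding this weekend")
```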

The output unit 500 according to an embodiment of the present invention outputs text corresponding to the response voice and the question voice on the screen. Accordingly, even in a noisy environment where the user cannot properly hear the voice from the output unit 500, the user can recognize the response and the question from the text output on the screen and continue the conversation by uttering an answer.

FIG. 2 is a flowchart illustrating an artificial intelligence conversation method according to an embodiment of the present invention.

An AI conversation method according to an embodiment of the present invention includes outputting a conversation start voice, which may be silence; receiving a user response (S200); analyzing the user response and selecting at least one response scenario from among predetermined scenarios according to the result; extracting a response and a question based on the response scenario; and outputting a response voice and a question voice according to the extracted response and question (S400).

Step S100 according to an embodiment of the present invention outputs a question voice corresponding to a conversation start question, or silence, to begin the conversation. The conversation start question is extracted and output according to the application execution environment information, which is at least one of a structured scenario database, the user's personal information, the user's behavior pattern, and the record of previous conversations.

That is, based on environment information such as the user's personal information, the date and time, and the history of previous conversations, a first question corresponding to a conversation subject likely to interest the user is extracted and output, so that the apparatus actively starts the conversation.

Alternatively, when silence is output in step S100, the dialogue proceeds from the user's spoken answer input in step S200, in the order of response and question.

Step S200 according to an embodiment of the present invention converts the user's spoken answer into text and provides the sentence for analysis of the user's answer.

Step S300 according to an embodiment of the present invention performs analysis by determining which of the answer types the user answer corresponds to. According to the present invention, the dialogue proceeds in the order of first question, user's answer, response to the user's answer, and question following the response (or, when silence is output: user's voice input, response to the user's voice, and question following the response). Determining which of the preset answer types (e.g., selective answer, general answer, repeat-request answer, irrelevant answer) the user's answer falls into serves as the basis for this flow.

Step S400 according to an embodiment of the present invention outputs the text corresponding to the response voice and the question voice on the screen, providing the user with the response text and question text visually as well as audibly, thereby supporting more accurate recognition by the user.

Step S600 according to an embodiment of the present invention determines whether the conversation should end based on termination criteria: when the user is confirmed to have said goodbye, when the user does not answer for a predetermined time or longer, or when there is no reply from the user to the output unit's voice calling the user. When a termination criterion is met, the conversation is ended; otherwise, steps S200 through S500 are repeated.
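
The overall flow of FIG. 2 (steps S100 through S600) can be sketched as a loop with termination criteria. Here a missing answer within the timeout is modelled as `None`, and the greeting, fallback responses, and termination rules are simplifying assumptions.

```python
def run_conversation(answers, max_silence=2):
    """Run the S100-S600 loop over a scripted list of user answers.

    `answers` is an iterable of user utterances; None models "no answer
    within the timeout". Returns the transcript of device outputs.
    """
    transcript = ["Hello! How was your day?"]  # S100: conversation start voice
    silent = 0
    for ans in answers:                        # S200: receive user answer
        if ans is None:                        # no answer within the timeout
            silent += 1
            if silent >= max_silence:
                break                          # S600: termination criterion met
            transcript.append("Are you still there?")  # call out to the user
            continue
        silent = 0
        if "goodbye" in ans.lower():           # S600: user says goodbye
            transcript.append("Goodbye! Talk to you soon.")
            break
        # S300/S400: analyze, then output response voice + next question voice
        transcript.append("I see. What else happened?")
    return transcript

t = run_conversation(["It was fine", "goodbye"])
```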

The embodiments of the present invention have been described above. Those skilled in the art will appreciate that the present invention can be implemented in modified forms without departing from its essential features. Therefore, the disclosed embodiments should be considered in a descriptive sense only and not for purposes of limitation. The scope of the present invention is indicated by the claims rather than by the foregoing description, and all differences within that scope should be construed as being included in the present invention.

Claims (12)

  1. An artificial intelligence conversation device comprising:
    an input response analysis unit for analyzing an input user response;
    a response control unit for selecting at least one response scenario among preset scenarios according to an analysis result and transmitting an output command for a response to the user response and a question; and
    an output unit for outputting silence or a conversation start voice and outputting a response voice and a question voice according to the output command of the response control unit.
  2. The artificial intelligence conversation device of claim 1, wherein the output unit outputs a conversation start voice, which is a question voice for starting a conversation, extracted based on application execution environment information.
  3. The artificial intelligence conversation device of claim 2, wherein the output unit extracts the conversation start voice according to the application execution environment information, which is at least one of a structured scenario database, the user's personal information, the user's behavior pattern, and a record of previous conversations.
  4. The artificial intelligence conversation device of claim 1, wherein the input response analysis unit receives a result of converting into a character string the user's voice input in response to the conversation start voice output, or the user's voice input after the silence output, and performs an analysis by determining whether the user response corresponds to one of preset response types.
  5. The artificial intelligence conversation device of claim 4, wherein the response control unit selects at least one response scenario from among scenarios stored in a structured scenario database according to the analysis result, and transmits an output command for a response and a question based on the response scenario according to the subject to which the user answer corresponds.
  6. The artificial intelligence conversation device of claim 1, wherein the response control unit transmits a pause command signal to the output unit when new voice data is received from the user during output of the response voice and the question voice by the output unit, and re-extracts a response and a question according to the analysis result of the input response analysis unit for the new voice data.
  7. The artificial intelligence conversation device of claim 1, wherein the output unit outputs text corresponding to the response voice and the question voice on a screen.
  8. An artificial intelligence conversation method comprising:
    (a) outputting silence or a conversation start voice;
    (b) receiving a user response following the silence or conversation start voice output;
    (c) analyzing the user response, selecting at least one response scenario among preset scenarios based on the result, and extracting a response and a question based on the response scenario; and
    (d) outputting a response voice and a question voice according to the extracted response and question.
  9. The artificial intelligence conversation method of claim 8, wherein step (a) outputs a conversation start voice according to application execution environment information, which is at least one of a structured scenario database, the user's personal information, the user's behavior pattern, and a record of previous conversations.
  10. The artificial intelligence conversation method of claim 8, wherein step (b) converts the user response input by voice into text.
  11. The artificial intelligence conversation method of claim 8, wherein step (c) performs an analysis by determining which of the answer types the user answer corresponds to.
  12. The artificial intelligence conversation method of claim 8, wherein step (d) outputs text corresponding to the response voice and the question voice on a screen.
PCT/KR2015/004347 2015-04-29 2015-04-29 Artificial intelligence conversation device and method WO2016175354A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2015/004347 WO2016175354A1 (en) 2015-04-29 2015-04-29 Artificial intelligence conversation device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/KR2015/004347 WO2016175354A1 (en) 2015-04-29 2015-04-29 Artificial intelligence conversation device and method

Publications (1)

Publication Number Publication Date
WO2016175354A1 true WO2016175354A1 (en) 2016-11-03

Family

ID=57199748

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2015/004347 WO2016175354A1 (en) 2015-04-29 2015-04-29 Artificial intelligence conversation device and method

Country Status (1)

Country Link
WO (1) WO2016175354A1 (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020010226A (en) * 2000-07-28 2002-02-04 정명수 Internet Anything Response System
WO2014010879A1 (en) * 2012-07-09 2014-01-16 엘지전자 주식회사 Speech recognition apparatus and method
WO2014088377A1 (en) * 2012-12-07 2014-06-12 삼성전자 주식회사 Voice recognition device and method of controlling same
US20140222436A1 (en) * 2013-02-07 2014-08-07 Apple Inc. Voice trigger for a digital assistant


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KOREA CREATIVE CONTENT AGENCY: "This Month's Issue, Trend and Prospect of Speech Recognition Technology", CULTURE TECHNOLOGY(CT) IN-DEPTH STUDY, November 2011 (2011-11-01), Retrieved from the Internet <URL:https://www.kocca.kr/knowledge/publication/ct/ksFiles/afieldfile/2011/12/07/87NEmyIcVWMc.pdf> *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10417344B2 (en) 2014-05-30 2019-09-17 Apple Inc. Exemplar-based natural language processing
US10390213B2 (en) 2014-09-30 2019-08-20 Apple Inc. Social reminders
US10438595B2 (en) 2014-09-30 2019-10-08 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10529332B2 (en) 2015-03-08 2020-01-07 Apple Inc. Virtual assistant activation
US10474753B2 (en) 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10395654B2 (en) 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
WO2019000326A1 (en) * 2017-06-29 2019-01-03 Microsoft Technology Licensing, Llc Generating responses in automated chatting
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
US10504518B1 (en) 2018-06-03 2019-12-10 Apple Inc. Accelerated task performance

Similar Documents

Publication Publication Date Title
RU2352979C2 (en) Synchronous comprehension of semantic objects for highly active interface
US8719015B2 (en) Dialogue system and method for responding to multimodal input using calculated situation adaptability
US9715875B2 (en) Reducing the need for manual start/end-pointing and trigger phrases
RU2349969C2 (en) Synchronous understanding of semantic objects realised by means of tags of speech application
US10079013B2 (en) Sharing intents to provide virtual assistance in a multi-person dialog
US10096316B2 (en) Sharing intents to provide virtual assistance in a multi-person dialog
US20050192730A1 (en) Driver safety manager
JP2005321817A (en) Method and apparatus for obtaining combining information from speech signals for adaptive interaction in teaching and testing
JP5405672B2 (en) Foreign language learning apparatus and dialogue system
WO2013085320A1 (en) Method for providing foreign language acquirement and studying service based on context recognition using smart device
US8265933B2 (en) Speech recognition system for providing voice recognition services using a conversational language model
KR101860281B1 (en) Systems and methods for haptic augmentation of voice-to-text conversion
US20030130847A1 (en) Method of training a computer system via human voice input
WO2012115324A1 (en) Conversation management method, and device for executing same
EP1494499A2 (en) Ideal transfer of call handling from automated systems to human operators
US10068575B2 (en) Information notification supporting device, information notification supporting method, and computer program product
US8371857B2 (en) System, method and device for language education through a voice portal
US20020128840A1 (en) Artificial language
DE69935909T2 (en) Device for processing information
US9070369B2 (en) Real time generation of audio content summaries
US9495350B2 (en) System and method for determining expertise through speech analytics
JP2006171719A (en) Interactive information system
CN102439661A (en) Service oriented speech recognition for in-vehicle automated interaction
WO2005122145A1 (en) Speech recognition dialog management
KR100906136B1 (en) Information processing robot

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15890795

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14/02/2018)

122 Ep: pct application non-entry in european phase

Ref document number: 15890795

Country of ref document: EP

Kind code of ref document: A1