WO2018047932A1

WO2018047932A1 - Interactive device, robot, processing method, program

Info

Publication number: WO2018047932A1
Application number: PCT/JP2017/032410
Authority: WO
Inventors: 久美子高塚; 山賀　宏之; 伊藤　真由美; 康一森川
Original assignee: 日本電気株式会社 (NEC Corporation)
Priority date: 2016-09-12
Filing date: 2017-09-08
Publication date: 2018-03-15

Abstract

An interactive device equipped with: an interaction start condition determination unit for determining whether or not acquired first acquisition information matches an interaction start condition; an analysis unit for performing an analysis related to detection of a user on the basis of the first acquisition information or on the basis of information obtained from a sensor device when the first acquisition information matches the interaction start condition; and an interaction processing unit for outputting first interaction information related to interaction with the user when the user is detected on the basis of the user detection analysis result of the analysis.

Description

Dialogue device, robot, processing method, program

The present invention relates to a dialogue apparatus, a robot, a processing method, and a program.

Some portable terminals, such as tablet terminals, are aimed at elderly people who are unfamiliar with ICT (Information and Communications Technology) devices; they simplify functions and the GUI (Graphical User Interface) and display information in a large size. In such devices, usability is improved by an interactive UI (User Interface) that uses a character. Patent Documents 1 and 2 disclose related techniques.

Patent Document 1: JP 2006-119920 A
Patent Document 2: JP-A-11-259446

With the technologies described above, functions and GUIs are simplified and information is displayed in a large size. However, a user who is unaccustomed to operating ICT devices in the first place still cannot use such a device freely. There is therefore a need for an apparatus that assists people who are unaccustomed to ICT devices and who feel a large psychological burden when using them.

Therefore, an object of the present invention is to provide an interactive device, a robot, a processing method, and a program that solve the above-described problems.

According to the first aspect of the present invention, a dialog device includes: a dialog start condition determination unit that determines whether acquired first acquisition information matches a dialog start condition; an analysis unit that, when the first acquisition information matches the dialog start condition, performs analysis related to user detection based on the first acquisition information or on information obtained from a sensor device; and a dialogue processing unit that outputs first dialogue information related to a dialogue with the user when the user is detected based on the user detection analysis result of the analysis.

According to the second aspect of the present invention, a processing method includes: determining whether acquired first acquisition information matches a dialog start condition; performing, when the first acquisition information matches the dialog start condition, analysis related to user detection based on the first acquisition information or on information obtained from a sensor device; and outputting first dialogue information related to a dialogue with the user when the user is detected based on the user detection analysis result of the analysis.

According to the third aspect of the present invention, a program causes a computer to execute: a process of determining whether acquired first acquisition information matches a dialog start condition; a process of performing, when the first acquisition information matches the dialog start condition, analysis related to user detection based on the first acquisition information or on information obtained from a sensor device; and a process of outputting first dialogue information related to a dialogue with the user when the user is detected based on the user detection analysis result of the analysis.

According to the present invention, it is possible to assist a person who is unaccustomed to using an ICT device and has a large psychological burden in use.

FIG. 1 is a first diagram showing the dialogue apparatus according to the first embodiment and an image display example thereof.
FIG. 2 is a hardware configuration diagram of the dialogue apparatus according to the first embodiment.
FIG. 3 is a functional block diagram of the dialogue apparatus according to the first embodiment.
FIG. 4 is a second diagram showing the dialogue apparatus according to the first embodiment and an image display example thereof.
FIG. 5 is a diagram showing the processing flow of the dialogue apparatus according to the first embodiment.
FIG. 6 is a third diagram showing the dialogue apparatus according to the first embodiment and an image display example thereof.
FIG. 7 is a diagram showing the processing flow of the dialogue apparatus according to the second embodiment.
FIG. 8 is a diagram showing the processing flow of the dialogue apparatus according to the third embodiment.
FIG. 9 is a functional block diagram of the dialogue apparatus according to the fourth embodiment.
FIG. 10 is a diagram showing the processing flow of the dialogue apparatus according to the fourth embodiment.
FIG. 11 is a diagram showing a robot provided with the functions of the dialogue apparatus.
FIG. 12 is a diagram showing the minimum configuration of the dialogue apparatus.

(First embodiment)
Hereinafter, an interactive apparatus according to a first embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a first diagram illustrating an interactive apparatus and an image display example according to the first embodiment.
As shown in this figure, the interactive apparatus 1 has a display screen 16. The interactive device 1 is a tablet terminal, for example. A tablet terminal is one form of ICT device. The interactive apparatus 1 displays the character image 100 and the auxiliary image 101 on the display screen 16. It also displays, in the operation button display area 110 of the display screen 16, operation buttons that are simplified so that even a user unaccustomed to ICT devices, such as an elderly person, can operate them easily. In the present embodiment, only the icon images of three operation buttons are displayed in the operation button display area 110. The dialogue apparatus 1 includes a camera 18.

FIG. 2 is a hardware configuration diagram of the interactive apparatus according to the first embodiment.
The interactive apparatus 1 includes a CPU (Central Processing Unit) 11, a RAM (Random Access Memory) 12, a ROM (Read Only Memory) 13, an SSD (Solid State Drive) 14, a communication module 15, a display screen 16, an IF (interface) 17, a camera 18, and the like. The display screen 16 is configured by a liquid crystal monitor, a touch panel, or the like, and may have, in addition to a display function, an input function that lets the user enter operations by touching the touch panel.

FIG. 3 is a functional block diagram of the interactive apparatus according to the first embodiment.
When the power is turned on, the CPU 11 (FIG. 2) of the dialogue apparatus 1 starts a dialogue processing program recorded in the ROM 13 (FIG. 2) or the SSD 14 (FIG. 2). As a result, the CPU 11 of the dialogue apparatus 1 includes the functions of the control unit 111, the dialogue start condition determination unit 112, the analysis unit 113, the dialogue processing unit 114, the transmission processing unit 115, and the response information notification unit 116. In addition, the CPU 11 of the interactive apparatus 1 has the function of the communication application processing unit 117 by starting the communication application program.

The control unit 111 controls other functional units.
The dialog start condition determination unit 112 determines whether the acquired information acquired by the dialog device 1 matches the dialog start condition.
When the acquired information matches the dialog start condition, the analysis unit 113 analyzes the acquired information or information obtained from a sensor device, such as the camera 18 or the touch panel constituting the display screen 16. The analysis unit 113 performs analysis related to user detection based on that information.
When the user is detected based on the user detection analysis result by the analysis unit 113, the dialogue processing unit 114 performs an output process of dialogue information regarding the dialogue with the user. The dialogue information includes, for example, voice information or character information.

The transmission processing unit 115 transmits an acquired information analysis result, which is obtained by analyzing acquired information that is acquired, for example, based on a user action after the dialogue information is output.
The response information notification unit 116 notifies a predetermined transmission destination of the presence or absence of response information when the acquired information is acquired. The response information is information indicating the content of the user's response to the dialogue information.
The communication application processing unit 117 performs any one of processes such as a mail function, a message processing function, and an SNS (Social Networking Service) function.

FIG. 4 is a second diagram illustrating the interactive apparatus and its image display example according to the first embodiment.
As shown in FIG. 4, after the power is turned on, the interactive apparatus 1 displays the character image 100 and displays a plurality of operation buttons in a predetermined operation button display area 110 in the screen area. In principle, the dialogue apparatus 1 does not change the position of the operation button display area 110. This allows a user unfamiliar with ICT devices to operate the device without being confused by a large number of operations. The dialogue apparatus 1 may animate the character image 100 to show, for example, the character walking on the screen or making conversational gestures. Further, the interactive apparatus 1 may display an auxiliary image 101 representing the emotion of the character image 100; in FIG. 1, a heart mark is displayed as the auxiliary image 101. The character image 100 shown in FIG. 4 walks to the left and right, and is displayed so that the character walks back and forth between the positions of the character image 100a and the character image 100b.

FIG. 5 is a diagram showing a processing flow of the interactive apparatus according to the first embodiment.
Next, the processing flow of the interactive apparatus 1 will be described in order.
The dialogue processing unit 114 of the dialogue apparatus 1 displays the character image 100, the auxiliary image 101, and operation buttons after activation (step S501). The dialogue processing unit 114 controls the type (display type) and movement of the character image 100 and the auxiliary image 101. For example, the dialogue processing unit 114 displays an image that attracts the user's interest, such as the character indicated by the character image 100 moving across the screen or shaking its head. The dialogue processing unit 114 may also change the color of the auxiliary image 101 or move it.

The dialog start condition determination unit 112 is set to acquire the reception information (first acquisition information) when the communication application processing unit 117 receives the communication information. When receiving the communication information, the communication application processing unit 117 outputs the received information to the dialog start condition determining unit 112 based on the communication information.
It is assumed that the communication application processing unit 117 is a functional unit that performs application processing related to mail transmission / reception. In this case, the received information may include information such as a transmission source identifier such as a transmission source address or a transmission source user name, a face image of the transmission source user, a mail text, and attached data. The communication application processing unit 117 detects these pieces of information as received information.
When the communication application processing unit 117 is a functional unit that performs application processing related to an SNS or to message transmission and reception, the received information may include information such as a transmission source identifier such as a transmission source user name, a face image of the transmission source user, a message body, and attached data.
When the communication application processing unit 117 is a functional unit that performs application processing related to a call, the received information may include information such as a caller user name and a call instruction.
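The kinds of received information listed above can be pictured as a single record with optional fields. The following is only an illustrative sketch: the class name `ReceivedInfo`, its field names, the `app_kind` labels, and the two helper functions are hypothetical and do not appear in the original description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReceivedInfo:
    """Hypothetical container for the received information (first acquisition information)."""
    app_kind: str                               # "mail", "sns", "message" or "call" (labels are assumptions)
    sender_id: Optional[str] = None             # transmission source address or user name
    sender_face_image: Optional[bytes] = None   # face image of the transmission source user
    body: Optional[str] = None                  # mail text or message body
    attachments: List[bytes] = field(default_factory=list)
    call_instruction: Optional[str] = None      # only meaningful for a call


def from_mail(address: str, face: Optional[bytes], text: str, attachments=()) -> ReceivedInfo:
    """Build the record for a received mail."""
    return ReceivedInfo("mail", address, face, text, list(attachments))


def from_call(caller_name: str, instruction: str) -> ReceivedInfo:
    """Build the record for an incoming call, which carries no body or attachments."""
    return ReceivedInfo("call", sender_id=caller_name, call_instruction=instruction)
```

Keeping every field optional reflects that the description says the received information "may include" these items rather than requiring all of them.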

The dialog start condition determining unit 112 acquires the received information (step S502). Acquiring the received information is one mode in which a service function (communication application function) provided in the dialogue apparatus 1 acquires an event. When the received information is acquired, the dialog start condition determining unit 112 determines to start the dialog and instructs the dialog processing unit 114 to start the dialog (step S503). The dialogue processing unit 114 outputs a spoken call to the user (step S504). The dialogue processing unit 114 also displays, on the screen, information notifying the user that the event has been acquired (step S505). This information may be expressed through the movement of the character image 100 and the appearance of the auxiliary image 101. The control unit 111 of the interactive apparatus 1 detects the acquisition of the received information and activates the camera 18 (step S506).

The camera 18 is activated, for example, in a video shooting mode. The dialogue apparatus 1 is usually placed on a shelf or a desk, for example. In this state, when the dialog device 1 notifies the user that the communication application processing unit 117 has received communication information as described above, the user is assumed either to pick up the dialog device 1 and bring it close to the face, or to approach the dialog device 1 so that the face comes close to the display screen 16. As a result, the camera 18 captures the user's face. The camera 18 outputs each frame of the captured moving image to the analysis unit 113.

The analysis unit 113 determines whether or not a face image (second acquisition information) can be detected from the captured image (step S507). When the analysis unit 113 detects a face image, it compares the face image with a stored face image obtained by photographing the user's face in advance and determines whether the two match, as in a face authentication process. The analysis unit 113 determines whether the authentication of the face image is successful (step S508). When the face image matches the face image obtained by photographing the user's face in advance, the analysis unit 113 outputs a dialogue start instruction indicating successful authentication to the dialogue processing unit 114. Note that the analysis unit 113 may instead output a dialogue start instruction to the dialogue processing unit 114 as soon as a face image can be detected from the captured image, without performing face authentication. In this way, the dialogue processing unit 114 determines whether or not to output the dialogue information based on the detection information of the face image (second acquisition information).

The interactive apparatus 1 may perform authentication processing using voiceprint information instead of, or together with, the user's face image. In that case, the dialogue apparatus 1 is equipped with a microphone; the analysis unit 113 analyzes the voice information acquired from the microphone to generate voiceprint information, and authenticates whether it matches the user's voiceprint information stored in advance. Alternatively, the interactive apparatus 1 may perform authentication processing using the user's fingerprint information. In that case, the interactive device 1 is provided with a fingerprint sensor, and the analysis unit 113 analyzes the fingerprint information acquired from the fingerprint sensor and authenticates whether it matches the user's fingerprint information stored in advance. When the authentication is successful, the analysis unit 113 notifies the dialogue processing unit 114 of the successful authentication as described above. The interactive device 1 may also perform authentication processing based on iris information.
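The choice between using one biometric instead of another, or several together, can be sketched as follows. This is a simplified illustration under stated assumptions: the function `authenticate`, the `require_all` flag, and the modality keys are hypothetical, and the matcher used in the usage example compares bytes for equality, whereas a real biometric matcher would compute a similarity score.

```python
from typing import Callable, Dict

# A matcher compares a fresh sample with the enrolled template (hypothetical signature).
Matcher = Callable[[bytes, bytes], bool]


def authenticate(
    samples: Dict[str, bytes],
    enrolled: Dict[str, bytes],
    matchers: Dict[str, Matcher],
    require_all: bool = False,
) -> bool:
    """Check each supplied modality (e.g. "face", "voiceprint") against its template.

    require_all=False models "instead of" (any one modality suffices);
    require_all=True models "together with" (every supplied modality must match).
    """
    results = []
    for modality, sample in samples.items():
        if modality in enrolled and modality in matchers:
            results.append(matchers[modality](sample, enrolled[modality]))
    if not results:
        return False  # nothing could be checked, so authentication fails
    return all(results) if require_all else any(results)


# Usage with a stand-in matcher (exact equality is an assumption made for the example).
exact = lambda sample, template: sample == template
enrolled = {"face": b"F", "voiceprint": b"V"}
matchers = {"face": exact, "voiceprint": exact}
ok = authenticate({"face": b"F", "voiceprint": b"X"}, enrolled, matchers)
```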

When the dialogue processing unit 114 detects a successful authentication (YES in step S508), the dialogue processing unit 114 performs a dialogue process (step S509). In this dialogue process, the dialogue processing unit 114 performs display by adding a predetermined action to the character image 100 and the auxiliary image 101.
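The sequence of steps S501 to S509 described above can be summarized as a small control-flow sketch. It is not the implementation in the publication: the function name, its parameters, and the use of a string comparison in place of face authentication are all assumptions made for illustration. The function returns the list of steps that were reached.

```python
from typing import Callable, List, Optional


def run_first_embodiment_flow(
    received_info: Optional[dict],
    detect_face: Callable[[], Optional[str]],
    registered_face: str,
    require_authentication: bool = True,
) -> List[str]:
    """Return the step labels reached for one pass through the flow of FIG. 5."""
    steps = ["S501"]                      # show character image, auxiliary image, buttons
    if received_info is None:
        return steps                      # no event acquired yet, keep idling
    steps += ["S502", "S503", "S504", "S505", "S506"]  # acquire, start, call out, notify, camera on
    face = detect_face()                  # stand-in for analysing camera frames
    steps.append("S507")
    if face is None:
        return steps                      # no face detected in the frame
    if require_authentication:
        steps.append("S508")
        if face != registered_face:       # string equality stands in for face matching
            return steps
    steps.append("S509")                  # dialogue processing
    return steps
```

Setting `require_authentication=False` models the variation in which a dialogue start instruction is issued as soon as a face is detected, without face authentication.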

In the above processing flow, based on the fact that the communication application processing unit 117 has acquired the received information in step S502, the dialog start condition determining unit 112 instructs the dialog processing unit 114 to start a dialog. Then, in step S504, the dialogue processing unit 114 performs dialogue processing. However, instead of these processes, the following process may be performed. The control unit 111 of the dialogue apparatus 1 detects a predetermined time based on a timer, and the dialogue start condition determination unit 112 acquires information indicating the detection. Then, the dialog start condition determination unit 112 instructs the dialog processing unit 114 to start the dialog in response to the detection of the predetermined time, and the dialog processing unit 114 performs the dialog processing. In this case, the process in step S502 is replaced with a determination as to whether the predetermined time set by the timer has been detected. When the predetermined time is detected, the processing from step S503 onward is started, and the dialogue apparatus 1 performs the processes of steps S503 to S509. The processing from step S510 onward is omitted because no reception information is acquired. In this example, the dialog start condition determination unit 112 acquiring information (first acquisition information) indicating that the predetermined time has been detected corresponds to one mode in which the first acquisition information matches the dialog start condition.
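The difference between the reception-triggered flow and the timer-triggered variation can be sketched as below. The names `timer_condition_met` and `steps_for_trigger` are hypothetical, and treating the "predetermined time" as a time of day compared at minute resolution is an assumption; the description does not say whether it is a clock time or an elapsed interval.

```python
from datetime import datetime, time
from typing import List

# Steps shared by both triggers, as described for the first embodiment.
COMMON_STEPS = ["S503", "S504", "S505", "S506", "S507", "S508", "S509"]


def timer_condition_met(now: datetime, scheduled: time) -> bool:
    """True when the current hour and minute equal the scheduled hour and minute (assumed semantics)."""
    return (now.hour, now.minute) == (scheduled.hour, scheduled.minute)


def steps_for_trigger(trigger: str) -> List[str]:
    """List the steps that apply to each kind of dialog start trigger."""
    if trigger == "received":
        # Reception of communication information: full flow including display and reply.
        return ["S502"] + COMMON_STEPS + ["S510", "S511", "S512"]
    if trigger == "timer":
        # Timer detection replaces S502; S510 onward is skipped (nothing was received).
        return list(COMMON_STEPS)
    raise ValueError(f"unknown trigger: {trigger}")
```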

FIG. 6 is a third diagram illustrating the interactive apparatus according to the first embodiment and an image display example thereof.
In the dialogue processing in step S509, the dialogue processing unit 114 may display the character image 100 with its line of sight directed toward the front of the screen, blinking, or moving its mouth. For example, the dialogue processing unit 114 detects a pause in the user's utterance. The dialogue processing unit 114 then displays the character image 100 nodding or blinking during the pause. In this way, the dialogue processing unit 114 outputs to the display screen 16 a character image 100 that assists the dialogue with the user based on the dialogue information. In this dialogue process, the dialogue processing unit 114 displays content included in the received information, such as the transmission source user name, the transmission source user's face image 102, and the mail text or message text 103 (step S510). The display content may be displayed in any manner.

After displaying the display content in step S510, the dialog processing unit 114 may carry on a dialog with the user so that the reply to the communication information received by the communication application processing unit 117 is completed by conversation alone, without any manual operation by the user. The dialogue processing unit 114 detects the user's voice, analyzes the voice, and converts it into text (step S511). The dialogue processing unit 114 then notifies the communication application processing unit 117 of the character information obtained by analyzing the voice. The communication application processing unit 117 generates a mail or message with the character information written in the body, and transmits the generated communication information to the user who is the transmission source of the reception information, identified by the transmission source identifier, or to a user predetermined as the transmission destination (step S512). As described above, the second acquisition information includes voice information, and the communication application processing unit 117 transmits the character information obtained by analyzing the voice information to the transmission source of the first acquisition information.
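The voice reply path of steps S511 and S512 can be illustrated with a short sketch. Everything named here is hypothetical: the `Reply` class, the `build_reply` function, and the `transcribe` callable, which stands in for whatever speech recognition the device uses. Preferring the original sender over a predetermined destination is also an assumption; the description only says that either may be the destination.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    to: str     # destination identifier (address or user name)
    body: str   # text obtained from the user's voice


def build_reply(
    voice: bytes,
    transcribe: Callable[[bytes], str],
    sender_id: Optional[str],
    default_destination: Optional[str] = None,
) -> Reply:
    """Convert the user's voice to text (S511) and compose the reply to send (S512)."""
    text = transcribe(voice)                        # speech-to-text stand-in
    destination = sender_id or default_destination  # assumed priority: original sender first
    if destination is None:
        raise ValueError("no transmission destination available")
    return Reply(to=destination, body=text)


# Usage with a fake transcriber that simply decodes bytes (for illustration only).
fake_transcribe = lambda audio: audio.decode("utf-8")
reply = build_reply(b"I will visit on Sunday", fake_transcribe, "daughter@example.com")
```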

Through the above processing, the dialogue apparatus 1 can immediately notify the user, by a screen display or sound, that the communication application processing unit 117 has received communication information (reception information). In addition, based on the reception information received by the communication application processing unit 117, the dialogue apparatus 1 can immediately notify the user of the reception by screen display or sound when the reception information is from a predetermined transmission source. Even a user who is not familiar with ICT devices can view information such as the content of the received information and the face image of the sender simply by bringing the face close to the dialogue apparatus 1, and can use the functions of communication applications such as mail, SNS, and message applications provided in the dialog device 1 with almost no operation. Further, because the interactive device 1 displays and animates the character image, the user can be given the impression of conversing with the character. This lowers the user's psychological barrier to operating an ICT device.

(Second embodiment)
FIG. 7 is a diagram showing a processing flow of the interactive apparatus according to the second embodiment.
Next, the processing flow of the interactive apparatus 1 will be described in order.
The dialogue processing unit 114 of the dialogue apparatus 1 displays the character image 100, the auxiliary image 101, and operation buttons after activation (step S701). The dialogue processing unit 114 controls the type (display type) and movement of the character image 100 and the auxiliary image 101. For example, the dialogue processing unit 114 displays an image that attracts the user's interest, such as the character indicated by the character image 100 moving across the screen or shaking its head. The dialogue processing unit 114 may also change the color of the auxiliary image 101 or move it.

The dialog start condition determination unit 112 acquires the received information (step S702). Then, the dialog start condition determination unit 112 determines whether the received information matches the dialog start condition (step S703). Specifically, the dialog start condition determination unit 112 may determine that the dialog start condition is met when the reception information is simply acquired.
The dialog start condition determining unit 112 may extract predetermined information included in the received information and determine that the dialog start condition is met when that information matches the information indicated by the start condition. For example, the dialogue start condition determination unit 112 may determine that the dialog start condition is met when the transmission source address and the transmission source user name included in the reception information match a predetermined transmission source address and transmission source user name stored in advance.

Alternatively, the dialog start condition determination unit 112 may determine that the dialog start condition is met when sensing information is acquired from the IF 17 or the camera 18. The dialog start condition determination unit 112 may determine that the sensing information matches the dialog start condition when the sensing information matches predetermined sensing information stored in advance. For example, when the sensing information indicates that the display screen 16 has been touched, the dialog start condition determination unit 112 may determine that the dialog start condition is met upon detecting the touch. When the sensing information is a face image captured by the camera 18, the dialog start condition determination unit 112 may determine whether the face image is that of a predetermined user and, if so, determine that the dialog start condition is met. When the sensing information is voice information input to the IF 17, the dialogue start condition determination unit 112 may determine whether the voiceprint information based on the voice information matches the voiceprint information of the predetermined user and, if so, determine that the dialog start condition is met.
When the dialog start condition determining unit 112 determines that the received information or the acquired information (first acquired information) matches the dialog start condition, the dialog start condition determining unit 112 outputs the received information or the acquired information to the dialog processing unit 114. The acquired information is sensing information, detection information, image information, voiceprint information, and the like.
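The alternatives for the dialog start condition in step S703 can be gathered into one decision function. This is a minimal sketch, assuming a dictionary-based representation: the function name, the `kind` labels, and the dictionary keys are hypothetical, and string equality stands in for face and voiceprint matching.

```python
from typing import Optional, Set, Tuple


def matches_start_condition(
    info: dict,
    allowed_senders: Optional[Set[Tuple[str, str]]] = None,
    registered_face: Optional[str] = None,
    registered_voiceprint: Optional[str] = None,
) -> bool:
    """Decide whether first acquisition information satisfies the dialog start condition."""
    kind = info.get("kind")
    if kind == "received":
        if allowed_senders is None:
            return True  # simple case: merely acquiring received information suffices
        # Stricter case: both the source address and the user name must be pre-registered.
        return (info.get("address"), info.get("user_name")) in allowed_senders
    if kind == "touch":
        return bool(info.get("touched"))  # display screen was touched
    if kind == "face":
        return registered_face is not None and info.get("face") == registered_face
    if kind == "voice":
        return (
            registered_voiceprint is not None
            and info.get("voiceprint") == registered_voiceprint
        )
    return False  # unknown kinds never start a dialog
```

Guarding the face and voice branches with an explicit `is not None` check avoids treating a missing sample as a match against a missing template.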

The dialogue processing unit 114 performs dialogue processing based on the received information and acquired information. Specifically, the dialogue processing unit 114 notifies the user that the communication application processing unit 117 has received the reception information based on the reception information from the communication application processing unit 117 (step S704).
In this notification, the dialogue processing unit 114 changes the movement of the character image 100 to notify that the communication application processing unit 117 has received the reception information. Alternatively, the dialog processing unit 114 may output a predetermined sound, a voice of a character notifying that it has been received, or other sound from a speaker to notify that the communication application processing unit 117 has received the reception information. That is, the operation for notifying that the communication application processing unit 117 has received the reception information is an aspect of the dialogue processing. The character image 100 and the sound for notifying that the communication application processing unit 117 has received the reception information are one aspect of the dialogue promotion information. The dialogue promotion information is information that prompts the user to interact.

The control unit 111 of the interactive apparatus 1 detects that the dialogue processing unit 114 has acquired the received information or the acquired information, and activates the camera 18 (step S705). The camera 18 is activated in, for example, a moving image shooting mode. The dialogue apparatus 1 is usually placed on a shelf or a desk, for example. In this state, when the dialog device 1 notifies the user that the communication application processing unit 117 has received the reception information as described above, the user is assumed either to pick up the dialog device 1 and bring it close to the face, or to approach the dialog device 1 so that the face comes close to the display screen 16. As a result, the camera 18 captures the user's face. The camera 18 outputs each frame of the captured moving image to the analysis unit 113.

The analysis unit 113 determines whether or not a face image (second acquisition information) can be detected from the captured image. When the face image is detected, the analysis unit 113 determines whether or not the face image matches the face image obtained by photographing the user's face in advance, as in the face authentication process. The analysis unit 113 determines whether or not the face image has been successfully authenticated (step S706). When the face image matches the face image obtained by photographing the user's face in advance, the analysis unit 113 outputs a dialogue start instruction indicating successful authentication to the dialogue processing unit 114. Note that the analysis unit 113 may output a dialogue start instruction to the dialogue processing unit 114 when a face image can be detected from the captured image without performing face authentication.

The interactive device 1 may perform authentication processing using the user's voiceprint information. In that case, the dialogue apparatus 1 is equipped with a microphone; the analysis unit 113 analyzes the voice information (second acquisition information) acquired from the microphone to generate voiceprint information, and authenticates whether it matches the user's voiceprint information stored in advance. Alternatively, the interactive apparatus 1 may perform authentication processing using the user's fingerprint information. In that case, the interactive device 1 is provided with a fingerprint sensor, and the analysis unit 113 analyzes the fingerprint information acquired from the fingerprint sensor and authenticates whether it matches the user's fingerprint information stored in advance. When the authentication is successful, the analysis unit 113 notifies the dialogue processing unit 114 of the successful authentication as described above. Alternatively, the dialogue apparatus 1 may perform authentication processing using the user's iris information.

When the dialogue processing unit 114 detects a successful authentication (YES in step S706), the dialogue processing unit 114 performs a dialogue process (step S707). In this dialogue process, the dialogue processing unit 114 performs display by adding a predetermined action to the character image 100 and the auxiliary image 101.

As shown in FIG. 6, in the dialogue processing in step S707, the dialogue processing unit 114 may display the character image 100 with its line of sight directed toward the front of the screen, blinking, or moving its mouth. For example, the dialogue processing unit 114 detects a pause in the user's utterance. The dialogue processing unit 114 then displays the character image 100 nodding or blinking during the pause. In this dialogue processing, the dialogue processing unit 114 displays content included in the received information, such as the transmission source user name, the transmission source user's face image 102, and the mail text or message text 103. The display content may be displayed in any manner. As described above, the dialogue information includes the character information 103, and the dialogue processing unit 114 outputs the character information 103 together with the face image 102 of the transmission source user of the reception information (first acquisition information).

Through the above processing, the dialogue apparatus 1 can immediately notify the user, by screen display or sound, that the communication application processing unit 117 has received communication information (reception information). In addition, based on the reception information received by the communication application processing unit 117, the dialogue apparatus 1 can immediately notify the user of the reception by screen display or sound when the reception information is from a predetermined transmission source. Even a user who is not familiar with ICT devices can browse information such as the contents of the received information and the face image of the transmission source user simply by bringing his or her face close to the dialogue apparatus 1, and can use the functions of the communication applications provided in the dialogue apparatus 1, such as mail, SNS, and message applications, with almost no operation. Further, when the dialogue apparatus 1 displays and animates the character image 100, the user can be given the impression of conversing with the character. This can lower the user's psychological barrier to operating an ICT device.

When the authentication is determined to be successful in step S706 of the above-described processing flow, information indicating that the user of the dialogue apparatus 1 has interacted may be transmitted to the user who transmitted the reception information. In this case, the analysis unit 113 outputs the authentication success to the transmission processing unit 115. The transmission processing unit 115 acquires the reception information. The reception information includes a transmission source identifier such as a transmission source mail address, a transmission source user name, or a transmission source user ID. Using this transmission source identifier, the transmission processing unit 115 instructs the communication application processing unit 117 to transmit information indicating that the authentication has succeeded or that a dialogue has occurred. As a result, the communication application processing unit 117 transmits, to the transmission source identified by the transmission source identifier, information indicating that the authentication has succeeded, that a dialogue has occurred, or that no dialogue has occurred. This processing is one aspect in which the transmission processing unit 115 performs transmission control on the acquired information analysis result obtained by analyzing the second acquisition information acquired based on the user's action after the output of the dialogue information. This processing is also one aspect of notifying a predetermined transmission destination of the presence or absence of response information when the second acquisition information is acquired. Note that the face image of the user of the dialogue apparatus 1 may be included in the information, transmitted to the transmission destination, indicating that a dialogue has occurred.
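
The notification step described above can be sketched roughly as follows. This is only an illustrative Python sketch, not the implementation of the embodiment: the names `ReceivedInfo`, `Notification`, and `build_dialogue_notification`, the message wording, and the choice to attach the face image only when a dialogue occurred are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceivedInfo:
    """Reception information with its transmission source identifier (hypothetical layout)."""
    sender_address: str
    sender_name: str
    sender_id: str


@dataclass
class Notification:
    """Message handed to the communication application for transmission (hypothetical layout)."""
    destination: str
    body: str
    face_image: Optional[bytes] = None


def build_dialogue_notification(
    received: ReceivedInfo,
    authenticated: bool,
    face_image: Optional[bytes] = None,
) -> Notification:
    """Build the notice returned to the transmission source after step S706."""
    if authenticated:
        body = f"{received.sender_name}: the user has interacted with the dialogue apparatus."
    else:
        body = f"{received.sender_name}: the user has not interacted with the dialogue apparatus."
    # Assumption: the user's face image is attached only when a dialogue took place.
    attachment = face_image if authenticated else None
    return Notification(destination=received.sender_address, body=body, face_image=attachment)
```

In this sketch the transmission processing unit 115 would correspond to the caller of `build_dialogue_notification`, and the communication application processing unit 117 to whatever component actually sends the returned `Notification`.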

By such processing, it is possible to notify the transmission source that the user of the dialogue apparatus 1 has conducted a dialogue (whether there is dialogue or response information) without the user's operation. Thereby, if the user of the dialogue apparatus 1 is an elderly person or the like, the state of the elderly person can be notified to other users such as children of the elderly person.

In step S703 of the above-described processing flow, it is determined whether or not the received information matches the dialogue start condition. However, acquired information other than the received information may also be determined to match the dialogue start condition. Examples include, as described above, the case where a touch by the user on the display screen 16 is detected, the case where a face image captured by the camera 18 is determined to be the face image of a predetermined user, and the case where voiceprint information based on the user's voice detected by the microphone is determined to be that of a predetermined user. In these cases, based on the fact that the acquired information matches the dialogue start condition, information indicating that the user of the dialogue apparatus 1 has interacted may be transmitted to another user at a predetermined transmission destination. In this processing, the dialogue start condition determination unit 112 likewise outputs to the transmission processing unit 115 the fact that the acquired information matches the dialogue start condition. When the transmission processing unit 115 detects the match, it acquires an identifier of the predetermined transmission destination, such as a mail address, a user name, or a user ID, from a storage unit such as the SSD 14. Using this identifier, the transmission processing unit 115 instructs the communication application processing unit 117 to transmit information indicating that the user of the dialogue apparatus 1 has interacted. As a result, the communication application processing unit 117 transmits information indicating that the user of the dialogue apparatus 1 has interacted to the predetermined transmission destination using the identifier.
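
As a rough illustration of the determination described above, the following Python sketch treats each kind of acquired information as an enumeration member and looks up a stored destination when any of them matches. The names `Acquired`, `START_CONDITIONS`, `matches_start_condition`, and `notify_destination`, as well as the storage key `"notify_address"`, are hypothetical and do not appear in the embodiment.

```python
from enum import Enum, auto
from typing import Dict, Iterable, Optional


class Acquired(Enum):
    RECEIVED_INFO = auto()     # communication information was received (step S703)
    SCREEN_TOUCH = auto()      # a touch was detected on the display screen 16
    KNOWN_FACE = auto()        # the camera image matched a predetermined user's face
    KNOWN_VOICEPRINT = auto()  # the microphone audio matched a predetermined user's voiceprint


# Assumption: any one of these kinds of acquired information satisfies the start condition.
START_CONDITIONS = frozenset(Acquired)


def matches_start_condition(acquired: Iterable[Acquired]) -> bool:
    """Return True when at least one acquired item matches the dialogue start condition."""
    return any(item in START_CONDITIONS for item in acquired)


def notify_destination(acquired: Iterable[Acquired], storage: Dict[str, str]) -> Optional[str]:
    """Return the stored destination to notify, or None when no condition matched."""
    if not matches_start_condition(acquired):
        return None
    # Assumption: the predetermined destination is kept under this key in storage (e.g. the SSD 14).
    return storage.get("notify_address")
```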

By such processing, it is possible to notify a predetermined transmission destination that the user has reacted to the dialog of the dialog device 1 without an operation of the user of the dialog device 1. Thereby, if the user of the dialogue apparatus 1 is an elderly person or the like, the state of the elderly person can be notified to other users such as children of the elderly person.

In step S706 described above, authentication is performed as to whether or not the face image matches the face image of the predetermined user. However, whether or not to output the dialogue information may instead be determined based on other detection information of the face image, which is the second acquisition information. For example, the analysis unit 113 detects the size of the face image in the captured image and, when the size is equal to or larger than a predetermined size, determines that the user is approaching the dialogue apparatus 1. The analysis unit 113 treats this as successful authentication and instructs the dialogue processing unit 114 to start the dialogue processing. The size of the face image may be determined by the number of pixels in the image range recognized as the face in the captured image. Alternatively, the analysis unit 113 may detect, based on the face image, the orientation of the face, that is, the angle formed by the face direction and a line perpendicular to the screen plane, and determine whether or not the face is directly facing the display screen 16. When the face is determined to be directly facing the display screen 16, the analysis unit 113 determines that the user is looking at the dialogue apparatus 1, treats this as successful authentication, and instructs the dialogue processing unit 114 to start the dialogue processing. Alternatively, the analysis unit 113 may determine, from the movement speed of the face image, that the user is about to use the dialogue apparatus 1 when the face is moving more slowly than a predetermined speed, treat this as successful authentication, and instruct the dialogue processing unit 114 to start the dialogue processing.
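
The three alternative criteria described above (face size, face orientation, and face movement speed) might be expressed as in the following Python sketch. The threshold values, the `FaceObservation` structure, and the function names are assumptions chosen for illustration; the embodiment only states that predetermined values are used.

```python
from dataclasses import dataclass

MIN_FACE_PIXELS = 40_000      # assumed "predetermined size" in pixels
MAX_FACING_ANGLE_DEG = 15.0   # assumed tolerance for "directly facing" the screen
MAX_FACE_SPEED_PX_S = 50.0    # assumed "predetermined speed" in pixels per second


@dataclass
class FaceObservation:
    pixel_count: int    # pixels in the region recognized as the face
    angle_deg: float    # angle between the face direction and the screen normal
    speed_px_s: float   # movement speed of the face in the captured image


def is_close_enough(face: FaceObservation) -> bool:
    return face.pixel_count >= MIN_FACE_PIXELS


def is_facing_screen(face: FaceObservation) -> bool:
    return abs(face.angle_deg) <= MAX_FACING_ANGLE_DEG


def is_moving_slowly(face: FaceObservation) -> bool:
    return face.speed_px_s < MAX_FACE_SPEED_PX_S


CRITERIA = {
    "size": is_close_enough,
    "orientation": is_facing_screen,
    "speed": is_moving_slowly,
}


def should_start_dialogue(face: FaceObservation, criterion: str = "size") -> bool:
    """Apply one of the alternative criteria; True is treated as successful authentication."""
    return CRITERIA[criterion](face)
```

Each criterion is kept as a separate alternative here because the description presents them as independent options rather than as conditions that must all hold.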

The analysis unit 113 may output the analysis results of these face images to the dialogue processing unit 114. In this case, the dialogue processing unit 114 may change the movement of the character image 100 and the movement and type of the auxiliary image 101 according to analysis results such as the size, orientation, angle, and movement speed of the face. Specifically, when the analysis unit 113 determines from the analysis result that the user is about to use the dialogue apparatus 1, the dialogue processing unit 114 acquires information on the analysis result. The dialogue processing unit 114 then displays the character image 100 making a gesture (action) that prompts the user to speak.

The analysis unit 113 may calculate the distance between the dialogue apparatus 1 and the person in front of it from the size of the face detected in the face image, and determine that the user is about to use the dialogue apparatus 1 when the distance is equal to or less than a threshold value. The distance may be calculated from, for example, the size of the area occupied by the face image in the captured image.
The analysis unit 113 may detect the user's eyes, nose, and mouth based on the face image, estimate the position of the entire face, and estimate the angle of the face with respect to the dialogue apparatus 1 from the positions of the eyes, nose, and mouth within the entire face.
Even when a face having a size greater than or equal to the threshold value is detected, the dialogue processing unit 114 may refrain from outputting voice information, such as the character image 100 calling out to the user, if the face is determined to be moving fast.
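
One possible way to derive a distance from the apparent face size is the pinhole camera approximation, sketched below in Python. This approximation is substituted here for illustration and is not stated in the embodiment; the focal length, the assumed real face width, both thresholds, and all function names are hypothetical.

```python
ASSUMED_FACE_WIDTH_M = 0.15      # assumed typical real-world face width
ASSUMED_FOCAL_LENGTH_PX = 600.0  # assumed camera focal length expressed in pixels
DISTANCE_THRESHOLD_M = 0.5       # assumed distance at or below which use is imminent
FAST_SPEED_PX_S = 200.0          # assumed speed at or above which the face is "moving fast"


def estimate_distance_m(face_width_px: float) -> float:
    """Pinhole approximation: distance = focal length * real width / width in the image."""
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return ASSUMED_FOCAL_LENGTH_PX * ASSUMED_FACE_WIDTH_M / face_width_px


def is_about_to_use(face_width_px: float) -> bool:
    """True when the estimated distance is at or below the threshold."""
    return estimate_distance_m(face_width_px) <= DISTANCE_THRESHOLD_M


def may_call_out(face_width_px: float, face_speed_px_s: float) -> bool:
    """Allow the character's voice only for a nearby face that is not moving fast."""
    return is_about_to_use(face_width_px) and face_speed_px_s < FAST_SPEED_PX_S
```

A larger face in the image yields a smaller estimated distance, which matches the relationship between face size and proximity described above.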

For the analysis of the face image, the analysis unit 113 stores face images of a plurality of family members in advance. By comparing the acquired face image with the stored face images, the analysis unit 113 determines which member of the family living together the acquired face image belongs to. The analysis unit 113 may determine that the user is about to use the dialogue apparatus 1 only when the face image is determined to be that of a specific user.
The analysis unit 113 may perform control so that the character image 100 does not interact when a visitor (a person whose face image information has not been registered) is detected. In this case, the analysis unit 113 may determine that a visitor has been detected when it detects a face image that does not match any stored face image.
Consider further a case where a plurality of face images are detected in the captured image, and one of them is determined to be the face image of the predetermined user who owns the dialogue apparatus 1. In this case, when another face image that cannot be authenticated (does not match) is detected, the analysis unit 113 may determine that a visitor is present and that the user is not about to use the dialogue apparatus 1. The dialogue processing unit 114 then need not perform the dialogue processing. The dialogue processing unit 114 may also determine from voice analysis that a visitor other than the user is present, and need not perform the dialogue processing.
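
The visitor handling described above can be illustrated by the following Python sketch, which assumes that face matching has already been performed and that each detected face is represented either by the identifier of the registered family member it matched or by `None` when nothing matched. The function name `should_interact` and this representation are assumptions for illustration.

```python
from typing import Optional, Sequence


def should_interact(matched_ids: Sequence[Optional[str]], owner_id: str) -> bool:
    """Decide whether the character may start interacting.

    matched_ids holds one entry per face detected in the captured image:
    the registered identifier it matched, or None for an unregistered face.
    """
    if not matched_ids:
        return False  # no face detected at all
    if any(match is None for match in matched_ids):
        return False  # an unregistered face is treated as a visitor
    return owner_id in matched_ids  # interact only when the specific user is present
```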

The analysis unit 113 may also perform analysis processing based on second acquisition information other than the face image (fingerprint information, voice information, detection of a finger touch on the display screen 16, and the like) to determine that the user is about to use the dialogue apparatus 1. The analysis unit 113 may output this analysis result to the dialogue processing unit 114 in the same manner. Based on the analysis result, the dialogue processing unit 114 may display the character image 100 making a gesture (action) that prompts the user to speak.

In the dialogue processing in step S707, the dialogue processing unit 114 may converse with the user so that reply processing for the information received by the communication application processing unit 117 is completed by conversation alone, without any operation by the user. The dialogue processing in step S707 was described above with a specific example of displaying the contents of the received information, but control may additionally be performed to detect the user's voice, analyze the voice, and convert it into text. In this case, the dialogue processing unit 114 notifies the communication application processing unit 117 of the character information obtained by analyzing the voice. The communication application processing unit 117 then generates a mail or message whose body contains the character information. The communication application processing unit 117 may transmit the generated communication information, such as the mail or message, to the user who is the transmission source of the reception information based on the transmission source identifier, or to a user predetermined as the transmission destination.

In the dialogue processing in step S707, the dialogue processing unit 114 may detect character information in the captured image obtained from the camera 18 and pass the character information to the communication application processing unit 117. For example, instead of speaking, the user of the dialogue apparatus 1 writes a sentence on a sheet of paper during the dialogue processing and holds it in front of the camera 18. The camera 18 outputs the image information generated by photographing the sheet to the dialogue processing unit 114. The dialogue processing unit 114 analyzes the image information, extracts the character information, and notifies the communication application processing unit 117 of the character information. The communication application processing unit 117 then generates a mail or message whose body contains the character information. The communication application processing unit 117 may transmit the generated communication information, such as the mail or message, to the user who is the transmission source of the reception information based on the transmission source identifier, or to a user predetermined as the transmission destination. The image information may be attached to the mail or message and transmitted to the transmission destination.
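
Both reply paths (speech converted to text, and text read from a photographed sheet) share the same shape: raw input is converted to character information, which becomes the body of a reply. The Python sketch below illustrates this under stated assumptions. The text extractor is passed in as a caller-supplied function because the embodiment does not name a recognition method; `OutgoingMessage` and `compose_reply` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OutgoingMessage:
    destination: str
    body: str
    attachment: Optional[bytes] = None


def compose_reply(
    raw: bytes,
    extract_text: Callable[[bytes], str],
    sender_address: Optional[str],
    default_destination: Optional[str] = None,
    attach_raw: bool = False,
) -> Optional[OutgoingMessage]:
    """Turn recorded audio or a photographed sheet into a reply message.

    extract_text stands in for speech recognition or character recognition;
    it is supplied by the caller and is not defined by this sketch.
    """
    text = extract_text(raw).strip()
    # Prefer the transmission source of the received information,
    # otherwise fall back to the predetermined destination.
    destination = sender_address or default_destination
    if not text or destination is None:
        return None  # nothing to send, or nowhere to send it
    return OutgoingMessage(destination, text, raw if attach_raw else None)
```

Setting `attach_raw=True` corresponds to the case where the photographed image is attached to the mail or message.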

As shown in FIG. 6, it is desirable that the area where the character image 100 is displayed, the area 120 where the dialogue information is displayed, and the area 120 where the character information is displayed be fixed on the display screen 16 of the dialogue apparatus 1.

The dialogue processing unit 114 and the communication application processing unit 117 of the dialogue apparatus 1 display text representing what the character image 100 says, auxiliary operation buttons, and the like. So that the user does not need to learn an operation method, the display positions and sizes are not changed even when the operation step changes. Because the dialogue apparatus 1 fixes the display area according to the type of information to be displayed, even a user who is unfamiliar with ICT devices is less likely to be confused by an irregular display, becomes familiar with the dialogue apparatus 1, and can operate it easily.

The dialogue processing unit 114 of the dialogue apparatus 1 displays character information and the like with a wide display range in the horizontal direction of the screen so that the contents are easy to read and understand (preferably, each phrase is arranged to fit on the screen without a line break).
Further, the dialogue processing unit 114 of the dialogue apparatus 1 displays only a small number of operation buttons, for example about three. This helps keep a user such as an elderly person from getting lost in the operation. Reducing the number of buttons also allows the size and spacing of the buttons to be increased, which suppresses pressing mistakes.

(Third embodiment)
FIG. 8 is a diagram showing a processing flow of the interactive apparatus according to the third embodiment.
Next, the processing flow of the interactive apparatus 1 according to the third embodiment will be described in order.
The dialogue processing unit 114 of the dialogue apparatus 1 displays the character image 100, the auxiliary image 101, and operation buttons after activation (step S801). The dialogue processing unit 114 controls the display type and movement of the character image 100 and the auxiliary image 101. For example, the dialogue processing unit 114 displays an image that attracts the user's interest, such as moving the character indicated by the character image 100 on the screen or shaking the character's head. The dialogue processing unit 114 may change or move the color of the auxiliary image 101.

The dialogue start condition determination unit 112 determines whether or not the received information has been acquired (step S802). The determination of whether or not the received information has been acquired is one aspect of determining whether or not an event of a service function (communication application function) has been acquired. When the received information is acquired (YES in S802), the dialogue start condition determination unit 112 determines to start the first dialogue processing and instructs the dialogue processing unit 114 to start the first dialogue (step S803). The subsequent steps S804 to S812 are the same as steps S504 to S512 according to the first embodiment.

Even when the received information is not acquired in step S802 (NO in S802), the dialogue start condition determination unit 112 determines to start the second dialogue processing and instructs the dialogue processing unit 114 to start the second dialogue (step S813). The control unit 111 of the dialogue apparatus 1 detects the start of the second dialogue and activates the camera 18 (step S814).

The camera 18 is activated, for example, in a video shooting mode. The camera 18 photographs the user's face. The camera 18 outputs the captured image (each frame) included in the moving image to the analysis unit 113.

The analysis unit 113 determines whether or not a face image can be detected from the captured image (step S815). When the face image is detected, the analysis unit 113 determines whether or not the face image matches the face image obtained by photographing the user's face in advance, as in the face authentication process. The analysis unit 113 determines whether or not the face image has been successfully authenticated (step S816). When the face image matches the face image obtained by photographing the user's face in advance, the analysis unit 113 outputs a dialogue start instruction indicating successful authentication to the dialogue processing unit 114. Note that the analysis unit 113 may output a dialogue start instruction to the dialogue processing unit 114 when a face image can be detected from the captured image without performing face authentication.

When the dialogue processing unit 114 detects the authentication success, the dialogue processing unit 114 performs the second dialogue processing (step S817). In this dialogue process, the dialogue processing unit 114 performs display by adding a predetermined action to the character image 100 and the auxiliary image 101. This second interactive process is a process of directly interacting between the interactive apparatus 1 and the user. The dialogue processing unit 114 determines whether or not the user's voice is detected. When the user's voice is detected, the dialogue processing unit 114 outputs a character image 100 showing a motion of the character nodding.

Through the above processing, even when the event is not acquired in the service function such as the communication application processing unit 117 of the dialog device 1, the dialog processing between the dialog device 1 and the user is performed. Thereby, the dialog apparatus 1 can be made to communicate with users, such as an elderly person unfamiliar with an ICT apparatus.
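
The branch between the first and second dialogue in the third embodiment (steps S802 to S817) can be outlined as in the Python sketch below. The face detector and authenticator are passed in as caller-supplied functions, since the embodiment does not prescribe particular algorithms; `select_dialogue` and `second_dialogue_can_start` are hypothetical names.

```python
from typing import Callable, Iterable, Optional, TypeVar

Frame = TypeVar("Frame")
Face = TypeVar("Face")


def select_dialogue(received_info_acquired: bool) -> str:
    """Step S802: received information leads to the first dialogue, otherwise the second."""
    return "first" if received_info_acquired else "second"


def second_dialogue_can_start(
    frames: Iterable[Frame],
    detect_face: Callable[[Frame], Optional[Face]],
    authenticate: Callable[[Face], bool],
    require_auth: bool = True,
) -> bool:
    """Steps S815 and S816: scan camera frames until a usable face is found."""
    for frame in frames:
        face = detect_face(frame)
        if face is None:
            continue  # no face in this frame; keep watching
        # Face authentication may be skipped, in which case detection alone suffices.
        if not require_auth or authenticate(face):
            return True
    return False
```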

(Fourth embodiment)
FIG. 9 is a functional block diagram of the interactive apparatus according to the fourth embodiment.
As shown in FIG. 9, the dialogue apparatus 1 may have a function of the photographing application processing unit 118 instead of the communication application processing unit 117. Similar to FIG. 3, the CPU 11 of the dialogue apparatus 1 starts the dialogue processing program recorded in the ROM 13 or the SSD 14 when the power is turned on. As a result, the CPU 11 of the dialogue apparatus 1 includes the functions of the control unit 111, the dialogue start condition determination unit 112, the analysis unit 113, the dialogue processing unit 114, the transmission processing unit 115, and the response information notification unit 116. In addition, the CPU 11 of the interactive apparatus 1 has the function of the communication application processing unit 117 by starting the communication application program. Further, the CPU 11 of the interactive apparatus 1 further includes the function of the photographing application processing unit 118 by starting the photographing application program.

FIG. 10 is a diagram showing a processing flow of the interactive apparatus according to the fourth embodiment.
In addition to the processing of the first embodiment, the interactive device 1 may perform processing described below. As in the first embodiment, the dialogue processing unit 114 of the dialogue device 1 displays the character image 100, the auxiliary image 101, and operation buttons after activation (step S1001). The dialogue processing unit 114 controls the display type and movement of the character image 100 and the auxiliary image 101. For example, the dialogue processing unit 114 displays an image that attracts the user's interest, such as moving the character indicated by the character image 100 on the screen or shaking the character's head. The dialogue processing unit 114 may change or move the color of the auxiliary image 101. The display of the character image 100 and the auxiliary image 101 is an aspect of outputting dialogue promotion information.

The control unit 111 of the dialogue apparatus 1 activates the camera 18 while the dialogue apparatus 1 is operating (step S1002). The camera 18 is activated in, for example, a moving image shooting mode. The dialogue apparatus 1 is usually placed on a shelf or a desk, for example. In this state, in response to the dialogue promotion information output by the dialogue apparatus 1 as described above, the user is expected either to pick up the dialogue apparatus 1 and bring the display screen 16 close to his or her face, or to move close to the dialogue apparatus 1, so that the user's face approaches the display screen 16. As a result, the camera 18 captures the user's face. The camera 18 outputs each captured image (frame) of the moving image to the analysis unit 113.

The dialogue start condition determination unit 112 is set to acquire face detection information (first acquisition information) when the photographing application processing unit 118 is notified by the analysis unit 113 that a human face image has been detected. The analysis unit 113 continuously determines whether or not a face image (first acquisition information) can be detected from the captured image. When a face image is detected, the analysis unit 113 determines that the first acquisition information matches the dialogue start condition. When the face image is detected, the analysis unit 113 also determines, as in face authentication processing, whether or not the face image matches the user's face image captured and stored in advance. The analysis unit 113 determines whether or not the face image has been successfully authenticated (step S1003). When the face image matches the user's face image captured and stored in advance, the analysis unit 113 outputs a dialogue start instruction indicating successful authentication to the dialogue processing unit 114. In this way, the dialogue processing unit 114 outputs the dialogue information when the face image is detected to be that of the predetermined user. Note that the analysis unit 113 may output the dialogue start instruction to the dialogue processing unit 114 whenever a face image can be detected from the captured image, without performing face authentication.

When the dialogue processing unit 114 detects the authentication success, the dialogue processing unit 114 performs dialogue processing (step S1004). In this dialogue process, the dialogue processing unit 114 performs display by adding a predetermined action to the character image 100 and the auxiliary image 101. The dialogue processing is as described in the other embodiments.

Through the above processing, when the user approaches and the face is authenticated, the user can communicate with the character of the dialogue apparatus 1.

FIG. 11 is a diagram showing a robot having the function of an interactive device.
The robot 500 may have the function of the above-described dialogue apparatus 1. In this case, for example, the robot 500 may be provided with the display screen 16 shown by the interactive apparatus 1 on the front surface. Further, the interactive apparatus 1 provided in the robot 500 may control the robot 500 so that the robot 500 performs the operation of the character image 100 instead of displaying the character image 100. In this case, the dialogue apparatus 1 may control mechanical eye movements, mouth movements, foot movements, and the like included in the robot 500.

FIG. 12 is a diagram showing the minimum configuration of the interactive apparatus.
As shown in this figure, the dialogue apparatus 1 includes at least functions of a dialogue start condition determination unit 112, an analysis unit 113, and a dialogue processing unit 114. The dialog start condition determination unit 112 determines whether or not the acquired first acquisition information matches the dialog start condition. The analysis unit 113 analyzes the information obtained from the sensor device (such as a camera) when the first acquisition information matches the conversation start condition. When the dialogue processing unit 114 detects a user based on a user detection analysis result using information obtained from the sensor device, the dialogue processing unit 114 performs an output process of dialogue information related to the dialogue with the user.
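
The minimum configuration of FIG. 12 can be pictured as three cooperating units, as in the Python sketch below. The class names mirror the unit names in the description, but the method names, the use of caller-supplied functions for the start condition and user detection, and the greeting text are all assumptions made for illustration.

```python
from typing import Any, Callable, Optional


class DialogueStartConditionDeterminationUnit:
    def __init__(self, condition: Callable[[Any], bool]) -> None:
        self._condition = condition  # caller-supplied start condition

    def matches(self, first_acquisition_info: Any) -> bool:
        return self._condition(first_acquisition_info)


class AnalysisUnit:
    def __init__(self, detect_user: Callable[[Any], Optional[str]]) -> None:
        self._detect_user = detect_user  # caller-supplied user detection

    def analyze(self, sensor_info: Any) -> Optional[str]:
        """Return the detected user, or None when no user is detected."""
        return self._detect_user(sensor_info)


class DialogueProcessingUnit:
    def output(self, user: str) -> str:
        # Assumption: the dialogue information is a simple greeting text.
        return f"Hello, {user}."


class DialogueApparatus:
    def __init__(self, start_unit, analysis_unit, dialogue_unit) -> None:
        self.start_unit = start_unit
        self.analysis_unit = analysis_unit
        self.dialogue_unit = dialogue_unit

    def process(self, first_acquisition_info: Any, sensor_info: Any) -> Optional[str]:
        if not self.start_unit.matches(first_acquisition_info):
            return None  # start condition not met; sensor information is not analyzed
        user = self.analysis_unit.analyze(sensor_info)
        if user is None:
            return None  # no user detected; nothing is output
        return self.dialogue_unit.output(user)
```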

The dialogue apparatus 1 may also include a dialogue processing unit 114 that obtains response information from the user in response to the output of the dialogue information and analyzes the response information, and an application operation unit that outputs, to the dialogue processing unit 114, operation explanation information explaining the operation of a predetermined application according to the result of analyzing the response information. In this case, the dialogue processing unit 114 outputs dialogue information using the operation explanation information and a character image for assisting dialogue based on the dialogue information.

Note that the above-described dialogue apparatus 1 has a computer system inside. A program for causing the dialogue apparatus 1 to perform each of the above-described processes is stored in a computer-readable recording medium of the dialogue apparatus 1, and the above processing is performed when the computer of the dialogue apparatus 1 reads and executes the program. Here, the computer-readable recording medium means a magnetic disk, a magneto-optical disk, a CD-ROM, a DVD-ROM, a semiconductor memory, or the like. Alternatively, the computer program may be distributed to the computer via a communication line, and the computer that has received the distribution may execute the program.

The above program may realize only a part of the functions of each processing unit described above. Furthermore, it may be a so-called difference file (difference program), which realizes the above-described functions in combination with a program already recorded in the computer system.

A part or all of the above-described embodiment can be described as in the following supplementary notes, but is not limited thereto.
(Appendix 1)
A dialog start condition determining unit that determines whether or not the acquired first acquisition information matches the dialog start condition;
When the first acquisition information matches the dialog start condition, an analysis unit that performs analysis related to user detection based on the first acquisition information or information obtained from the sensor device;
A dialogue processing unit for outputting first dialogue information related to dialogue with the user when the user is detected based on a user detection analysis result of the analysis;
A dialogue apparatus comprising:

(Appendix 2)
The dialogue apparatus according to Supplementary Note 1, wherein the dialogue processing unit outputs, on a screen, a character image for assisting the dialogue with the user based on the first dialogue information.

(Appendix 3)
A transmission processing unit that transmits an acquisition information analysis result obtained by analyzing second acquisition information acquired based on the user's operation after the output of the first dialogue information;
The interactive apparatus according to Supplementary Note 1 or Supplementary Note 2, comprising:

(Appendix 4)
The second acquisition information includes audio information,
The dialogue apparatus according to Supplementary Note 3, wherein the transmission processing unit transmits the character information obtained by analyzing the audio information to the transmission source of the first acquisition information.

(Appendix 5)
The dialogue apparatus according to any one of Supplementary Note 1 to Supplementary Note 4, wherein the dialogue processing unit determines whether or not the user has been detected based on the user detection analysis result.

(Appendix 6)
The second acquisition information includes a face image,
The dialogue apparatus according to Supplementary Note 3 or Supplementary Note 4, wherein the dialogue processing unit determines whether to output second dialogue information based on the detection information of the face image.

(Appendix 7)
The dialog start condition determining unit determines that the first acquisition information matches the dialog start condition when the first acquisition information including a face image is acquired after the output of dialog promotion information that prompts the user to perform a dialog,
The analysis unit analyzes whether the face image included in the first acquisition information is a predetermined user,
The dialogue apparatus according to any one of notes 1 to 4, wherein the dialogue processing unit outputs the first dialogue information when it is detected that the face image is the predetermined user.

(Appendix 8)
The dialogue apparatus according to any one of notes 1 to 7, wherein the first dialogue information includes voice information or character information.

(Appendix 9)
When the dialogue processing unit has acquired the second acquisition information, a response information notification unit that notifies the predetermined transmission destination of the presence or absence of response information by the user with respect to the first dialogue information;
The interactive apparatus according to any one of appendices 3, 4, and 6.

(Appendix 10)
The first dialogue information includes character information,
The dialogue apparatus according to Supplementary Note 7, wherein the dialogue processing unit outputs the character information together with the face image of the user who transmitted the first acquisition information.

(Appendix 11)
A robot provided with the interactive device according to any one of appendix 1 to appendix 10.

(Appendix 12)
Determine whether the acquired first acquisition information matches the dialog start condition,
When the first acquisition information matches the dialog start condition, perform analysis related to user detection based on the first acquisition information or information obtained from the sensor device,
A processing method for outputting first dialogue information relating to a dialogue with the user when the user is detected based on a user detection analysis result of the analysis.

(Appendix 13)
A program that causes a computer to execute processing of:
determining whether or not the acquired first acquisition information matches the dialog start condition;
performing, when the first acquisition information matches the dialog start condition, analysis related to user detection based on the first acquisition information or information obtained from the sensor device; and
outputting, when the user is detected based on a user detection analysis result of the analysis, first dialogue information related to dialogue with the user.

(Appendix 14)
A dialogue processing unit for obtaining response information from the user according to output of dialogue information related to dialogue with the user, and analyzing the response information;
An application operation unit that outputs, to the dialogue processing unit, operation explanation information explaining an operation of a predetermined application according to a result of the analysis,
The dialogue processing unit outputs the dialogue information using the operation explanation information and a character image for assisting the dialogue based on the dialogue information.
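
The flow in this appendix (analyze the user's response, look up an operation explanation, and output it alongside a character image) can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the keyword-based `analyze_response`, the `OPERATION_EXPLANATIONS` table and its texts, the intent labels, and the image file name are not taken from the application.

```python
from typing import Dict, Optional

# Hypothetical table mapping an analyzed intent to operation explanation text.
OPERATION_EXPLANATIONS: Dict[str, str] = {
    "take_photo": "Press the round button to take a photo.",
    "send_message": "Speak your message, then say 'send'.",
}


def analyze_response(response: str) -> Optional[str]:
    """Toy keyword analysis standing in for the dialogue processing unit."""
    text = response.lower()
    if "photo" in text or "picture" in text:
        return "take_photo"
    if "message" in text or "reply" in text:
        return "send_message"
    return None


def explain_operation(intent: str) -> Optional[str]:
    """Stand-in for the application operation unit: return explanation text."""
    return OPERATION_EXPLANATIONS.get(intent)


def dialogue_output(response: str,
                    character_image: str = "character.png") -> Dict[str, str]:
    """Combine the explanation with a character image for presentation."""
    intent = analyze_response(response)
    explanation = explain_operation(intent) if intent else None
    # Fall back to a re-prompt when the response could not be analyzed.
    text = explanation or "Could you say that again?"
    return {"character_image": character_image, "text": text}
```

Keeping the explanation lookup separate from the response analysis mirrors the split between the application operation unit and the dialogue processing unit described above.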

This application claims priority based on Japanese Patent Application No. 2016-177296 filed in Japan on September 12, 2016, the entire disclosure of which is incorporated herein.

DESCRIPTION OF SYMBOLS
1 ... Dialog device
100 ... Character image
101 ... Auxiliary image
11 ... CPU
12 ... RAM
13 ... ROM
14 ... SSD
15 ... Communication module
16 ... Display screen
17 ... IF
18 ... Camera
111 ... Control unit
112 ... Dialogue start condition determination unit
113 ... Analysis unit
114 ... Dialogue processing unit
115 ... Transmission processing unit
116 ... Response information notification unit
117 ... Communication application processing unit
118 ... Shooting application processing unit

Claims

A dialog start condition determining unit that determines whether or not the acquired first acquisition information matches the dialog start condition;
When the first acquisition information matches the dialog start condition, an analysis unit that performs analysis related to user detection based on the first acquisition information or information obtained from the sensor device;
A dialogue processing unit for outputting first dialogue information related to dialogue with the user when the user is detected based on a user detection analysis result of the analysis;
A dialogue apparatus comprising:
The dialogue apparatus according to claim 1, wherein the dialogue processing unit outputs a character image for assisting the dialogue based on the first dialogue information with the user on a screen.
A transmission processing unit that transmits an acquisition information analysis result obtained by analyzing second acquisition information acquired based on the user's operation after the output of the first dialogue information;
An interactive apparatus according to claim 1 or 2, further comprising:
The second acquisition information includes audio information,
The dialogue apparatus according to claim 3, wherein the transmission processing unit transmits character information obtained by analyzing the voice information to a transmission source of the first acquisition information.
The dialogue apparatus according to any one of claims 1 to 4, wherein the dialogue processing unit determines whether or not the user is detected based on the user detection analysis result.
The second acquisition information includes a face image,
The dialogue apparatus according to claim 3, wherein the dialogue processing unit determines whether to output second dialogue information based on detection information of the face image.
The dialog start condition determining unit determines that the first acquisition information matches the dialog start condition when first acquisition information including a face image is acquired after dialog promotion information prompting the user to engage in a dialog has been output,
The analysis unit analyzes whether the face image included in the first acquisition information is that of a predetermined user,
The dialogue apparatus according to any one of claims 1 to 4, wherein the dialogue processing unit outputs the first dialogue information when it is detected that the face image is that of the predetermined user.
The dialogue apparatus according to any one of claims 1 to 7, wherein the first dialogue information includes voice information or character information.
The dialogue apparatus according to any one of claims 3, 4, and 6, further comprising:
A response information notification unit that, when the dialogue processing unit has acquired the second acquisition information, notifies a predetermined transmission destination of the presence or absence of response information from the user with respect to the first dialogue information.
The first dialogue information includes character information,
The dialogue apparatus according to claim 7, wherein the dialogue processing unit outputs the character information together with a face image of a user who has transmitted the first acquisition information.
A robot provided with the dialogue apparatus according to any one of claims 1 to 10.
A processing method comprising:
Determining whether acquired first acquisition information matches a dialog start condition,
Performing, when the first acquisition information matches the dialog start condition, analysis related to user detection based on the first acquisition information or on information obtained from a sensor device, and
Outputting first dialogue information related to a dialogue with the user when the user is detected based on a user detection analysis result of the analysis.
A program that causes a computer to execute processing comprising:
Determining whether acquired first acquisition information matches a dialog start condition,
Performing, when the first acquisition information matches the dialog start condition, analysis related to user detection based on the first acquisition information or on information obtained from a sensor device, and
Outputting first dialogue information related to a dialogue with the user when the user is detected based on a user detection analysis result of the analysis.
A dialogue processing unit for obtaining response information from the user according to output of dialogue information related to dialogue with the user, and analyzing the response information;
An application operation unit that outputs, to the dialogue processing unit, operation explanation information explaining an operation of a predetermined application according to a result of the analysis,
The dialogue processing unit outputs the dialogue information using the operation explanation information and a character image for assisting the dialogue based on the dialogue information.