JP2023176404A

JP2023176404A - Virtual assistant device and program for virtual assistant device

Info

Publication number: JP2023176404A
Application number: JP2022088667A
Authority: JP
Inventors: 一元宮嶋; Kazumoto Miyajima
Original assignee: NGK Spark Plug Co Ltd
Current assignee: Niterra Co Ltd
Priority date: 2022-05-31
Filing date: 2022-05-31
Publication date: 2023-12-13

Abstract

PROBLEM TO BE SOLVED: To provide a technique that can offer a topic to a user of a virtual assistant device and that makes it easier for a two-way conversation about the offered topic to continue appropriately.
SOLUTION: A virtual assistant device 10 includes: an input unit to which speech from a user is input; a display unit 15 that displays images; a control unit 11 that causes the display unit 15 to display an image of a character 70; and an output unit that outputs speech from the character 70. The control unit 11 displays, on the display unit 15, a topic image different from the image of the character 70 together with words related to the topic image. When speech including the words is input to the input unit after the topic image and the words have been displayed on the display unit 15, the control unit 11 causes the output unit to output speech related to the words as speech from the character 70.
SELECTED DRAWING: Figure 15

Description

The present invention relates to a virtual assistant device and to a program for a virtual assistant device.

Patent Document 1 discloses an audio output system. The audio output system of Patent Document 1 includes voice information acquisition means, output control means, conversation information acquisition means, and the like. The voice information acquisition means acquires voice information about a voice uttered by a speaker. The output control means causes an audio output device used by a user of a display medium to output text corresponding to the content displayed on the display medium, in a voice that imitates the speaker's voice based on the voice information. The conversation information acquisition means acquires conversation information about a conversation between the user and the speaker. When the text is uttered in the speaker's voice, an image imitating the speaker's appearance is displayed on the display medium.

Patent Document 1: JP2020-76885A

In a device in which a character is displayed on a display unit and the character converses with a user, it is desirable to provide the user with a topic and, once a conversation about the provided topic has started, for that conversation to continue in both directions.

When a device of this kind provides a topic, it is effective, for example, to display on the display unit an image that is likely to interest the user. However, when a topic is provided by displaying an image, it is difficult to predict accurately what the user will say after seeing the image, so the response returned by the device may deviate greatly from what the user said. When such a large deviation occurs, the user may feel that the conversation is not meshing or that the device is not really listening, and the two-way conversation may fail to continue appropriately.

One object of the present invention is to provide a technique that can offer a topic to a user of a virtual assistant device and that makes it easier for a two-way conversation about the offered topic to continue appropriately.

A virtual assistant device according to one aspect of the present invention is
a virtual assistant device including an input unit to which speech from a user is input, a display unit that displays images, a control unit that causes the display unit to display an image of a character, and an output unit that outputs speech from the character,
wherein the control unit displays, on the display unit, a topic image different from the image of the character and words related to the topic image, and, when speech including the words is input to the input unit after the topic image and the words have been displayed on the display unit, causes the output unit to output speech related to the words as speech from the character.

When the above virtual assistant device displays an image that can serve as a topic (a topic image different from the image of the character), it can display on the display unit not only the topic image but also words related to the topic image. With such a display, a user who sees the topic image is more likely to say something that includes the words. The virtual assistant device can therefore prepare response speech about the words in advance, on the assumption that the user will say something that includes them. Furthermore, when the user actually says something that includes the words after the topic image and the words have been displayed, the virtual assistant device can confirm from the information input to the input unit that the words are included and then output speech related to the words as speech from the character. Because the device operates in this way, responses in which the character replies with something unrelated to what the user said after viewing the topic image are readily suppressed, and the conversation is more easily kept appropriate.
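The behavior described above can be illustrated with a minimal sketch. It assumes that each displayed word has one reply prepared in advance and that "includes the word" is checked by a case-insensitive substring test; the names `Topic` and `respond`, the image file name, and the sample replies are all hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Topic:
    """A topic image, the words shown with it, and a reply prepared per word."""
    image: str  # hypothetical identifier of the topic image
    replies: Dict[str, str] = field(default_factory=dict)  # displayed word -> reply

    @property
    def words(self) -> List[str]:
        """Words to display next to the topic image."""
        return list(self.replies)


def respond(topic: Topic, utterance: str) -> Optional[str]:
    """Return the prepared reply for the first displayed word found in the utterance."""
    text = utterance.casefold()
    for word, reply in topic.replies.items():
        if word.casefold() in text:
            return reply
    return None  # no displayed word was spoken; the caller decides what to do


topic = Topic(
    image="cherry_blossoms.jpg",
    replies={
        "cherry blossoms": "Cherry blossoms are lovely in spring, aren't they?",
        "picnic": "A picnic under the trees sounds wonderful.",
    },
)
```

Calling `respond(topic, "We used to have a picnic there")` returns the reply prepared for "picnic", while an utterance containing none of the displayed words returns `None`, leaving the fallback behavior to the caller.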

In the above virtual assistant device, the control unit may operate to switch between an active mode, in which analysis of speech input to the input unit is enabled, and a release mode, in which the active mode is released. The control unit may further operate so that, when speech including the words is input to the input unit during the active mode, it causes the output unit to output speech related to the words as speech from the character. When the control unit has caused the output unit to output speech related to the words as speech from the character during the active mode, it may provide a period for accepting speech from the user while keeping the active mode on, and thereafter keep the active mode on until a predetermined end condition is satisfied.

Because the above virtual assistant device can switch between the active mode and the release mode, it can analyze speech during the active mode and reduce the processing load during the release mode. In addition, when the control unit has caused the output unit to output speech related to the words as speech from the character, it can provide a period for accepting speech from the user while keeping the active mode on, and can keep the active mode on until the end condition is satisfied. The conversation can therefore continue smoothly even after the speech related to the words has been provided.
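The mode switching described above can be sketched as a small state machine. The specification only says that the end condition is "predetermined"; the sketch below assumes, purely for illustration, that the condition is a fixed number of consecutive listening periods with no user speech. The class name `Assistant`, the parameter `max_silent_turns`, and the sample reply are hypothetical.

```python
from enum import Enum, auto
from typing import Dict, Optional


class Mode(Enum):
    ACTIVE = auto()    # analysis of input speech is enabled
    RELEASED = auto()  # analysis is disabled to reduce processing load


class Assistant:
    def __init__(self, replies: Dict[str, str], max_silent_turns: int = 2):
        self.replies = replies                    # displayed word -> prepared reply
        self.max_silent_turns = max_silent_turns  # assumed end condition
        self.mode = Mode.RELEASED
        self.silent_turns = 0

    def activate(self) -> None:
        """Enter the mode in which input speech is analyzed."""
        self.mode = Mode.ACTIVE
        self.silent_turns = 0

    def on_input(self, utterance: Optional[str]) -> Optional[str]:
        """Handle one listening period; None means the user said nothing."""
        if self.mode is Mode.RELEASED:
            return None  # speech is not analyzed while released
        if utterance:
            self.silent_turns = 0
            for word, reply in self.replies.items():
                if word in utterance:
                    return reply  # the mode stays ACTIVE so the user can answer
            return None
        self.silent_turns += 1
        if self.silent_turns >= self.max_silent_turns:
            self.mode = Mode.RELEASED  # assumed end condition satisfied
        return None
```

After a reply is returned, the object remains in `Mode.ACTIVE`, which corresponds to the period in which further user speech is accepted; only the assumed end condition moves it back to `Mode.RELEASED`.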

The control unit may operate to display only a single keyword on the display unit as the words related to the topic image.

When the words displayed in association with the topic image consist of a single keyword in this way, a user who sees the topic image and the keyword is more likely to say that keyword. This raises the likelihood that speech including the keyword is recognized by the virtual assistant device and that the conversation continues appropriately.

When multiple kinds of keywords are displayed in association with the topic image, it becomes more likely that they include a keyword that interests the user, and the user can more easily talk from a wider range of viewpoints. This virtual assistant device makes it easier for the user to talk and, when speech including any one of the keywords is uttered, can reply with speech corresponding to that keyword. It can thus achieve both ease of talking for the user and an appropriate conversation.

The control unit may display the topic image on the display unit together with one or more of the words in a first combination, and thereafter display the topic image on the display unit together with the words in a combination different from the first combination.

If the combination of words displayed in association with a topic image can be changed in this way, the user is less likely to tire of the conversation even when the same kind of topic image is used continuously or repeatedly, and use of the device is more readily encouraged. Even when the combination of topic image and words is changed, this virtual assistant device can reply with speech related to a displayed word whenever speech including that word is uttered. It can thus achieve both a conversation that does not become tiresome and an appropriate conversation.
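One way to vary the displayed words for the same topic image is sketched below. It assumes a fixed pool of words per image and cycles through every combination of a chosen size; the specification does not say how combinations are selected, so the function `display_plan`, the image name, and the word pool are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Showing = Tuple[str, Tuple[str, ...]]  # (topic image, words shown with it)


def display_plan(image: str, words: List[str], size: int, showings: int) -> List[Showing]:
    """Pair the same topic image with a different word combination per showing."""
    combos = list(combinations(words, size))
    if not combos:
        raise ValueError("size must be between 1 and the number of words")
    # Cycle through the combinations so consecutive showings differ
    # whenever more than one combination exists.
    return [(image, combos[i % len(combos)]) for i in range(showings)]


plan = display_plan("steam_train.jpg", ["train", "station", "journey"], size=2, showings=3)
```

With three words shown two at a time, the three showings in `plan` each use a different pair, so the first combination is followed by combinations that differ from it.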

A program for a virtual assistant device according to one aspect of the present invention is
a program used in a virtual assistant device including an input unit to which speech from a user is input, a display unit that displays images, a control unit that causes the display unit to display an image of a character, and an output unit that outputs speech from the character, the program including:
a step of causing the control unit to perform control for displaying, on the display unit, a topic image different from the image of the character and words related to the topic image; and
a step of causing the control unit to perform control for causing the output unit to output speech related to the words as speech from the character when speech including the words is input to the input unit after the topic image and the words have been displayed on the display unit.

When the above program for a virtual assistant device displays an image that can serve as a topic (a topic image different from the image of the character), it can display on the display unit not only the topic image but also words related to the topic image. With such a display, a user who sees the topic image is more likely to say something that includes the words. The virtual assistant device can therefore prepare response speech about the words in advance, on the assumption that the user will say something that includes them. Furthermore, when the user actually says something that includes the words after the topic image and the words have been displayed, the program can cause the device to confirm from the information input to the input unit that the words are included and then output speech related to the words as speech from the character. Because the device operates in this way, responses in which the character replies with something unrelated to what the user said after viewing the topic image are readily suppressed, and the conversation is more easily kept appropriate.

According to the present invention, a topic can be provided to a user of a virtual assistant device, and a two-way conversation about the provided topic is more likely to continue appropriately.

FIG. 1 is a block diagram schematically showing the electrical configuration of a virtual assistant system including the virtual assistant device of the first embodiment.
FIG. 2 is an explanatory diagram showing example 1 of the normal display in the virtual assistant device of the first embodiment.
FIG. 3 is an explanatory diagram conceptually showing an example of the data structure of user data stored in the management device.
FIG. 4 is an explanatory diagram conceptually showing an example of the data structure of content-related data stored in the management device.
FIG. 5 is a flowchart illustrating the flow of control in the virtual assistant device of the first embodiment.
FIG. 6 is an explanatory diagram showing example 2 of the normal display in the virtual assistant device of the first embodiment.
FIG. 7 is an explanatory diagram showing example 3 of the normal display in the virtual assistant device of the first embodiment.
FIG. 8 is an explanatory diagram showing example 4 of the normal display in the virtual assistant device of the first embodiment.
FIG. 9 is an explanatory diagram illustrating an example of the "predetermined notification" in the virtual assistant device of the first embodiment.
FIG. 10 is an explanatory diagram showing example 5 of the normal display in the virtual assistant device of the first embodiment, in which notification information is displayed.
FIG. 11 is an explanatory diagram showing example 1 of content provision by the virtual assistant device of the first embodiment.
FIG. 12 is an explanatory diagram showing example 2 of content provision by the virtual assistant device of the first embodiment, and is an example of the display that follows the display of FIG. 11.
FIG. 13 is an explanatory diagram showing example 3 of content provision by the virtual assistant device of the first embodiment, and is an example of the display that follows the display of FIG. 12.
FIG. 14 is an explanatory diagram showing example 4 of content provision by the virtual assistant device of the first embodiment, and is an example of the display that follows the display of FIG. 13.
FIG. 15 is an explanatory diagram illustrating specific example 1 of displaying a topic image together with words in the virtual assistant device of the first embodiment.
FIG. 16 is an explanatory diagram illustrating a data structure in which speech corresponding to each word is prepared in association with that word.
FIG. 17 is an explanatory diagram illustrating specific example 2 of displaying a topic image together with words in the virtual assistant device of the first embodiment.
FIG. 18 is an explanatory diagram illustrating specific example 3 of displaying a topic image together with words in the virtual assistant device of the first embodiment.
FIG. 19 is an explanatory diagram illustrating specific example 4 of displaying a topic image together with words in the virtual assistant device of the first embodiment.
FIG. 20 is an explanatory diagram illustrating specific example 5 of displaying a topic image together with words in the virtual assistant device of the first embodiment.
FIG. 21 is an explanatory diagram showing the display performed in a virtual assistant device of a comparative example.

<First embodiment>
1. Overview of the virtual assistant system
The virtual assistant system 1 shown in FIG. 1 includes a virtual assistant device 10 and a management device 90. In the following description, the virtual assistant system 1 is also referred to simply as the system 1. In the representative example described below, the virtual assistant device 10 functions as a virtual assistant device for elderly people and can be used by a user at, for example, a nursing care facility, the user's home, or a hospital.

2. Hardware configuration of the virtual assistant device
As shown in FIG. 1, the virtual assistant device 10 may be an information communication terminal obtained by installing an application program on a general-purpose information terminal, such as a tablet terminal, a smartphone, a personal computer, or a television configured to communicate with external devices, so that the program is stored and usable, or it may be a dedicated device capable of realizing the functions described later. The virtual assistant device 10 may be a portable information device with a communication function or a stationary information device with a communication function.

As shown in FIG. 1, the virtual assistant device 10 includes a control unit 11, a communication unit 12, an interface 13, and a storage unit 14. In the representative example described below, the virtual assistant device 10 is implemented by a tablet terminal as shown in FIG. 2.

The control unit 11 shown in FIG. 1 is configured, for example, as a known information processing device. The control unit 11 includes a known arithmetic unit such as a CPU, together with other peripheral circuits, and can perform various kinds of control and computation. The control unit 11 has a function of causing the display unit 15, which forms part of the interface 13, to display the character in embodied form.

The communication unit 12 shown in FIG. 1 is a device that can access a wide area communication network, either directly or indirectly via another device, using a known wired or wireless communication method. The communication unit 12 may be configured to communicate wirelessly with a base station (not shown) and access the wide area communication network (for example, the Internet) directly via the base station. The communication unit 12 may be configured to communicate wirelessly with an access point (not shown) and access the wide area communication network indirectly via the access point. The communication unit 12 may be configured to perform wired communication with a relay device (such as a router) and access the wide area communication network via the relay device.

The interface 13 shown in FIG. 1 is a device that accepts input from the user and provides output to the user. The interface 13 includes a display unit 15, an audio output unit 16, an operation unit 17, and an audio input unit 18.

The display unit 15 and the audio output unit 16 correspond to an example of the output unit and have a function of outputting information. The display unit 15 is configured as a known image display device, such as a liquid crystal display or an organic electroluminescence display, and has a function of displaying various images. In the representative example described below, the display unit 15 forms part of a touch-panel display device. The audio output unit 16 is configured by, for example, a known sound-producing device such as a speaker. The audio output unit 16 has a function of outputting various sounds in cooperation with the control unit 11.

The operation unit 17 and the audio input unit 18 function as input units for inputting information. The operation unit 17 corresponds to an example of a motion detection unit and is an input device that allows input operations by contact. A touch panel is a suitable example of the operation unit 17, which may also include buttons for inputting information. The audio input unit 18 is configured by, for example, a known audio input device such as a microphone. The audio input unit 18 has a function of converting input sound into an electrical signal and supplying it to the control unit 11. When the user utters a voice or makes another sound toward the audio input unit 18, the audio input unit 18 acquires an audio signal representing the user's voice. Specifically, when the user utters a voice that the audio input unit 18 can detect, the audio input unit 18 can acquire an audio signal representing the content of that voice and convert it into an electrical signal.

In the representative example described below, the display unit 15 and the operation unit 17 constitute a touch-panel display device 20, as shown in FIG. 2 and elsewhere. In the example of FIG. 2 and the like, the touch panel forming part or all of the operation unit 17 is configured to transmit light from the display unit 15 and covers the display unit 15 as a transparent panel through which images on the display unit 15 can be seen from outside.

The storage unit 14 has a function of storing various kinds of information. A known storage device such as a semiconductor memory, an HDD, or an SSD is used as the storage unit 14. The control unit 11 has a function of writing various kinds of information to the storage unit 14 and a function of reading various kinds of information stored in the storage unit 14. The storage unit 14 stores various programs, including the application program described later. The storage unit 14 also stores identification information (for example, a URL (Uniform Resource Locator)) for accessing, via the wide area communication network, the sites, information, programs, and the like managed by the management device 90, as well as other data.

3. Management device
The management device 90 shown in FIG. 1 has various information processing functions and various computation functions. The management device 90 is an external device provided outside the virtual assistant device 10. The management device 90 has, among others, a function of registering various kinds of information and a function of distributing various kinds of information. The management device 90 may be any device that has a communication function and an information processing function. The management device 90 is configured, for example, as a computer including a CPU, a storage medium, a communication device, and the like. In the example of FIG. 1, the management device 90 includes a control device 91, a communication unit 92, a display unit 93, an input unit 94, and a storage unit 95.

The control device 91 is configured, for example, as a known information processing device. The control device 91 includes a known arithmetic unit such as a CPU, together with other peripheral circuits, and can perform various kinds of control and computation.

The communication unit 92 is a device that can access a wide area communication network, either directly or indirectly via another device, using a known wired or wireless communication method. The communication unit 92 may be configured to communicate wirelessly with a base station (not shown) and access the wide area communication network directly via the base station. The communication unit 92 may be configured to communicate wirelessly with an access point (not shown) and access the wide area communication network indirectly via the access point. The communication unit 92 may be configured to perform wired communication with a relay device (such as a router) and access the wide area communication network via the relay device.

The display unit 93 is configured as a known image display device. The input unit 94 is configured as a known input device, such as a keyboard, a mouse, a touch panel, or an audio input unit, and allows information to be input by contact operations, voice input, and the like. The storage unit 95 is a storage device that stores various kinds of information. A database may be built in the storage unit 95.

4. Distribution of content from the management device
Information about the users to whom information is to be distributed is registered in the management device 90. FIG. 3 shows an example of the data structure of the user data stored in the storage unit 95 of the management device 90 shown in FIG. 1. As shown in FIG. 3, the user data stored in the management device 90 forms a database in which an ID, a registration type, and user information are stored in association with each user. The ID is identification information by which each user can be identified and specified. The registration type is information that registers the types of information the user wants or the types of information that are meaningful to the user. The user information is various information about the user. It may include personal information such as name, date of birth, e-mail address, hobbies, health-related information, place of residence, and past episodes, and it may include delivery destination information specifying where content is to be delivered. For example, in the user data of FIG. 3, once ID1 is specified as the user's ID, the types of information wanted by the user identified by "ID1" are specified by the "registration type 1" information, and that user's personal information and the like are specified by the "user information 1" information.

In the system 1 according to the present embodiment, various types can be registered as the "types of information the user wants" and the "information meaningful to the user". Specifically, multiple types relating to hobbies or purpose in life may be prepared in advance for selection, and the registration information may include one or more types chosen from among them. Various types relating to hobbies or purpose in life can be prepared, such as gardening, sports, recreation, gymnastics, music, art, cooking, nostalgic scenery and toys, travel, animals, vehicles, and shopping information. Alternatively, multiple types relating to illness may be prepared in advance for selection as the types of information the user wants, and the registration information may include one or more types chosen from among them. Specifically, multiple types relating to the prevention of geriatric syndromes, rehabilitation, and relaxation may be prepared in advance for selection, and the registration information may include one or more types chosen from among them.

For example, suppose that in the user data of FIG. 3, ID1 is the information "00001", registration type 1 is information specifying "gardening, gymnastics", and user information 1 is information specifying "Taro Yamada" as the user name and "address information 1" as the delivery destination address. In this case, by referring to the user data, the control device 91 can determine that the name of the user identified by the ID "00001" is "Taro Yamada", that the delivery destination address is "address information 1", and that the types desired by this user are "gardening, gymnastics".
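The user data of FIG. 3 can be pictured as a simple keyed record store. The following is only a minimal sketch under stated assumptions: the class name `UserRecord`, the field names, and the helper `look_up_user` are hypothetical and do not appear in the specification; only the example values ("00001", "Taro Yamada", "address information 1", "gardening, gymnastics") come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class UserRecord:
    """One row of the user data (hypothetical field names)."""
    user_id: str                  # the "ID" column
    registered_types: set[str]    # the "registration type" column
    user_info: dict[str, str] = field(default_factory=dict)  # "user information"


# Example values taken from the description of FIG. 3.
user_db = {
    "00001": UserRecord(
        user_id="00001",
        registered_types={"gardening", "gymnastics"},
        user_info={
            "name": "Taro Yamada",
            "delivery_address": "address information 1",
        },
    )
}


def look_up_user(db, user_id):
    """Return (name, delivery address, registered types) for the given ID."""
    record = db[user_id]
    return (
        record.user_info["name"],
        record.user_info["delivery_address"],
        record.registered_types,
    )
```

A dictionary keyed by ID is used here purely for illustration; the specification only states that the data constitutes a database and does not prescribe a storage format.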

The management device 90 is capable of registering and distributing content. FIG. 4 shows an example of the data structure of the content-related data stored in the storage unit 95 of the management device 90 shown in FIG. 1. As shown in FIG. 4, the content-related data stored in the management device 90 constitutes a database in which, for each content, content identification information, content data, a distribution date and time, and a content type are stored in association with one another. In this specification, content data corresponds to an example of content information and is also referred to simply as content.

The content identification information need only be information by which each content can be identified and specified; it may be an identification number or a specific name. The content data is the actual data of the content; it may include video data, still image data, audio data, and the like, and may include other data (for example, program data). The distribution date and time is information specifying the date and time at which the associated content (content data) is distributed. The type information is information specifying the type of the associated content (content data). The type information may specify a single type or a plurality of types.

The management device 90 distributes each content (content data) specified by each piece of identification information to the virtual assistant device 10 of each user to whom that content should be distributed (each user who has registered the type of that content), at the distribution date and time associated with that content. Although FIG. 1 illustrates only one virtual assistant device 10 possessed by one user, in the system 1 many users can possess virtual assistant devices 10, and the virtual assistant devices 10 possessed by other users are omitted from FIG. 1. When there are a plurality of users to whom a content is to be distributed, the management device 90 can distribute the content to the virtual assistant device 10 of every one of those users. In the present embodiment, the management device 90 identifies the users to whom each content should be distributed (the users who have registered the type of that content) and distributes the content (content data) to those users' virtual assistant devices 10, but content distribution is not limited to this. For example, a user's virtual assistant device 10 may request the management device 90 to distribute content on the basis of information stored in its own storage unit, and each content (content data) may be distributed to that device in response.

For example, suppose that in the content-related data of FIG. 4, "identification information 1" is the information "0000A", "data 1" is "predetermined video data relating to gardening", "date and time information 1" is "10:00 on January 1, 2021", and "type information 1" is information specifying "gardening". In this case, by referring to the content-related data of FIG. 4, the control device 91 can determine that the content data associated with "0000A" (data 1, the "predetermined video data relating to gardening") should be distributed at "10:00 on January 1, 2021" as data of the "gardening" type. The management device 90 then distributes the content data specified by identification information 1 (0000A) at "10:00 on January 1, 2021" to the virtual assistant devices 10 of the users who have registered the "gardening" type (the users whose registration type in the user data of FIG. 3 includes "gardening").
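The matching of content to recipients described here can be sketched as follows. This is an illustrative sketch only, not the implementation of the management device 90: the names `ContentRecord`, `recipients_for`, and `due_deliveries` are hypothetical, the second user "00002" is invented for the example, and the rule "deliver once the scheduled time has been reached" is an assumption about how the distribution date and time would be checked.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ContentRecord:
    """One row of the content-related data (hypothetical field names)."""
    content_id: str       # content identification information
    data: str             # stand-in for the actual content data
    deliver_at: datetime  # distribution date and time
    types: set[str]       # type information (one or more types)


def recipients_for(content, registered_types_by_user):
    """IDs of users whose registered types overlap the content's types."""
    return sorted(
        user_id
        for user_id, types in registered_types_by_user.items()
        if types & content.types
    )


def due_deliveries(contents, registered_types_by_user, now):
    """(content_id, user_id) pairs whose scheduled time has been reached."""
    return [
        (content.content_id, user_id)
        for content in contents
        if content.deliver_at <= now
        for user_id in recipients_for(content, registered_types_by_user)
    ]


# Example values from the description of FIG. 4; user "00002" is invented.
contents = [
    ContentRecord("0000A", "gardening video", datetime(2021, 1, 1, 10, 0), {"gardening"})
]
registered = {
    "00001": {"gardening", "gymnastics"},
    "00002": {"music"},
}
```

With these values, only user "00001" is selected for content "0000A", and nothing is due before 10:00 on January 1, 2021. A real scheduler would also need to remember which deliveries have already been made, which this sketch leaves out.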

5. Reception control of the virtual assistant device
5-1. Basic control
An application program is installed on the virtual assistant device 10. This application program is stored in the storage unit 14 and is read out and executed by the control unit 11.

As described above, the management device 90 distributes content (content data) to the virtual assistant device 10. The content distributed from the management device 90 to the virtual assistant device 10 may be content belonging to a type registered in advance by the user who possesses the virtual assistant device 10, or may be content selected by the management device 90 regardless of the user's registration. The virtual assistant device 10 receives, through the communication unit 12, the content distributed to it from the management device 90. When the communication unit 12 receives content (content data) distributed from the outside (the management device 90), the control unit 11 stores the content data in the storage unit 14 and executes control to give a "predetermined notification" via the interface 13. The "predetermined notification" may be given before the content data is stored in the storage unit 14.

Specifically, the application program causes the control unit 11 to perform control according to the flow shown in FIG. 5. The control unit 11 executes the application program when a predetermined start condition is satisfied (for example, when a predetermined start operation is performed on the operation unit 17, such as an operation on the touch-panel display device 20 to launch the application program), and in step S1 causes the display unit 15 to display the character 70 embodied as an image.

FIG. 2 shows a specific example of the display made in step S1. In FIG. 2, the character 70 is a virtual assistant (a personified figure) modeled on an ordinary person. The character 70 shown in FIG. 2 is merely an example; it may be a virtual assistant modeled on a person in a specific occupation, such as a certified care worker, a nurse, or a doctor. It is also not limited to a person and may be a virtual assistant modeled on an animal, a robot, or the like. The application program may include a program that implements a chatbot function so that the character 70 shown in FIG. 2 converses automatically. The content of the automatic conversation can be changed or selected according to attributes registered in advance by the user (gender, family composition, housing environment, birthday, hobbies, content difficulty level, and so on). The image of the character 70 shown in FIG. 2 can be realized as a still image, a video, or the like, and during periods in which the processing of steps S3 and S8 described later is not being executed, the facial expression, posture, movement, actions, and the like of the character 70 may be varied over time as shown in FIGS. 6, 7, and 8.

In steps S1, S3, S4, S8, and the like, the control unit 11 displays the character 70 as shown in FIG. 2 and FIGS. 6 to 8; in doing so, it may display the season, calendar, date, time, and the like on the display unit 15, and may display illustrations, photographs, computer graphics, or other images evoking the morning, daytime, evening, or night time period. For example, FIG. 6 shows an image for a predetermined daytime period, in which the background outside the character 70 is shown in a bright color (specifically, a predetermined light color) to indicate that it is daytime. FIG. 8, on the other hand, shows an image for a predetermined night period, in which the background outside the character 70 is shown in a dark color (a predetermined dark color) to indicate that it is night.

After step S1, the control unit 11 determines in step S2 whether a conversation start condition is satisfied. The conversation start condition is a condition predetermined as the condition under which the virtual assistant device 10 initiates a conversation. The conversation start condition may be that a predetermined voice has been input to the voice input unit 18 (for example, that a predetermined wake word has been input). In the representative example described below, the name of the character 70 is the wake word, and voice input of this wake word is one of the conversation start conditions. This is merely an example; a predetermined greeting (for example, the word "hello") may be the wake word, and voice input of that wake word may be one of the conversation start conditions. The conversation start condition is not limited to these examples and may be, for example, that a predetermined operation has been performed on the operation unit 17 (for example, a tap near the display unit 15). Alternatively, the conversation start condition may be that a preset scheduled time has arrived.

If the control unit 11 determines in step S2 that the conversation start condition is satisfied, it proceeds to step S3 and starts providing a conversation or content. The provision of conversation and content in step S3 can be performed in the same way as in step S8 described later. If, on the other hand, it determines in step S2 that the conversation start condition is not satisfied, it proceeds to step S4 and continues the normal display. While the No determination in step S2 and the No determination in step S5 are repeated, the character display as shown in FIG. 2 and FIGS. 6 to 8 is continued, and during that time the facial expression, posture, movement, actions, and the like of the character 70 are varied.

After step S3 or step S4, the control unit 11 determines in step S5 whether there has been a new distribution. If the control unit 11 determines that the communication unit 12 has not received content (content data) distributed from the management device 90, it determines No in step S5, returns the process to step S2, and performs the processing from step S2 onward again. If the control unit 11 determines that there has been a new distribution, it determines Yes in step S5 and performs the processing from step S6 onward.

5-2. When there is a new distribution
In step S5, the control unit 11 determines whether the communication unit 12 has received content (content data) distributed from the management device 90; if so, it causes the interface 13 to give the "predetermined notification" in step S6. The "predetermined notification" is, for example, the display or audio output of "transmission information conveying that the virtual assistant device 10 has received a new content distribution from the outside". The transmission information may be presented as a displayed or spoken message, as an image such as a symbol or picture conveying that new content has been received, or as a notification sound (such as an alarm or buzzer sound) conveying that new content has been received. The notification sound is preferably one familiar from daily life, such as a telephone ring or a chime. FIG. 9 shows an example of the "predetermined notification". In the example of FIG. 9, an image 72 of a message stating that the distribution identified by the distribution name has arrived is displayed on the display unit 15 as the "predetermined notification"; instead of or in combination with such a message display, a spoken message, an image such as a picture, a notification sound, or the like may be output. For example, the notification may be given through the character's speech.

In this way, when the communication unit 12 receives content data (content information) distributed from the outside (the management device 90), the application program causes the control unit 11 to execute, in step S6, "control to give a predetermined notification via the interface 13".

After causing the interface 13 to give the "predetermined notification" in step S6, the control unit 11 determines in step S7 whether a provision instruction has been given. Specifically, in step S7 the control unit 11 determines "whether an input instructing the provision of content has been received via the interface 13".

The "input instructing the provision of content" may be, for example, a predetermined operation on the operation unit 17 or a predetermined voice input. For example, when the "predetermined notification" shown in FIG. 9 is given by the processing of step S6, two selection buttons (button images labeled "Don't view" and "View") are displayed together with the notification message. In this case, the operation of selecting "View" (for example, clicking "View" in the image 72) corresponds to an example of the "input instructing the provision of content". Accordingly, if "View" is selected after the display shown in FIG. 9 is made in step S6, the control unit 11 determines in step S7 that an input instructing the provision of content has been received and executes the processing of step S8.

If the control unit 11 determines in step S7 that it has not received the "input instructing the provision of content", it accumulates the newly distributed content data in the storage unit 14 in step S9. For example, if the "input instructing the provision of content" is not made within a predetermined time after the "predetermined notification" is given in step S6, the control unit 11 causes the display unit 15 to display notification information indicating that content data (content information) has been distributed, as shown in FIG. 10. For example, when two new pieces of content data are distributed from the management device 90 and the "predetermined notification" is given in step S6 as shown in FIG. 9, if the "View" button is not pressed within the predetermined time from the start of the notification, the process proceeds to step S9, where the newly distributed content data is accumulated and an image such as that of FIG. 10 is displayed. In the image of FIG. 10, the number of newly distributed contents that have not yet been provided is displayed by an image 74 as the notification information. When the "Don't view" button is selected after the "predetermined notification" is given in step S6 as shown in FIG. 9 (that is, when an input instructing that provision of the content be deferred is made), the process likewise proceeds to step S9 and the same processing is performed.
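The branch between steps S8 and S9 can be illustrated with a small sketch. It is written under assumptions and is not the application program itself: `UserChoice`, `ReceivingController`, and `on_new_delivery` are hypothetical names, and the timeout is modeled as just another outcome of step S7 rather than with a real timer.

```python
from enum import Enum


class UserChoice(Enum):
    """Possible outcomes of step S7 (hypothetical names)."""
    VIEW = "view"            # "View" selected: provision is instructed
    DONT_VIEW = "dont_view"  # "Don't view" selected: provision is deferred
    TIMEOUT = "timeout"      # no input within the predetermined time


class ReceivingController:
    """Tracks which delivered contents were provided and which are pending."""

    def __init__(self):
        self.pending = []   # accumulated, not yet provided (step S9)
        self.provided = []  # provided to the user (step S8)

    def on_new_delivery(self, content, choice):
        """Apply steps S7 to S9 to one newly delivered content."""
        if choice is UserChoice.VIEW:
            self.provided.append(content)  # step S8: provide the content
            return "provide"
        # "Don't view" and timeout are both handled by step S9.
        self.pending.append(content)
        # The count corresponds to what image 74 shows in FIG. 10.
        return f"unviewed: {len(self.pending)}"
```

For instance, two deliveries that time out followed by one that is viewed leave two items pending, which matches the unviewed-count display described for FIG. 10.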

If the control unit 11 determines in step S7 that it has received the "input instructing the provision of content", it causes the display unit 15 to make a display providing conversation and content in step S8. When it determines Yes in step S7 and performs the processing of step S8, the control unit 11 causes the display unit 15 and the audio output unit 16 to output content based on the content data (content information) newly distributed from the management device 90 (for example, playing back a video, displaying a still image, or outputting audio included in the newly distributed content data).

The control performed in step S8 after a Yes determination in step S7 is control to output, via the interface 13, "a conversation relating to the content received by the communication unit 12". FIGS. 11 to 14 show an example of this control. In the example of FIGS. 11 to 14, content called "Lively Delivery" relating to "gardening" is newly distributed from the management device 90 and is provided in step S8; the distributed content data includes video and conversation data relating to "gardening". In this case, in step S8 the control unit 11 plays back the video while outputting speech for the conversation. For example, in FIG. 13, at the moment a sunflower field is displayed, the device says, "Sunflowers, the flower of summer, have such a cheerful image, don't they?" Displaying the character 70 while this line is spoken gives the impression that the character 70 is talking to the user. When speaking, the control unit 11 may address the user by a nickname registered in advance.

When the control unit 11 provides content in step S8 and addresses the user as shown in FIG. 13, it then attempts to "recognize a response from the user" on the basis of at least one of a motion detected by the motion detection unit and a voice input to the voice input unit. For example, a predetermined response operation on the operation unit 17 (for example, tapping the character 70 or another design element) or a predetermined voice input to the voice input unit 18 (for example, spoken words) is defined as a "response from the user". When there is a "response from the user" at the interface 13, the control unit 11 performs control to output a further conversation in reply to that response via the interface 13. The "further conversation in reply to the response" may be on the topic of the type of content being provided or on a different topic. It is desirable that the further conversation use both text display and audio output, but either one alone may be used. After the control unit 11 has output a "further conversation in reply to the response", it is desirable that it again attempt to "recognize a response from the user", and, if there is a "response from the user", that it output a further conversation in reply via the interface 13. By repeating the "further conversation in reply to the response" and the "recognition of a response from the user" in this way, the conversation can be continued. The fact that further conversation is possible may also be indicated by text, illustrations, audio, or a visual effect such as blinking. For example, displaying "continuous conversation" lets the user know that continuous conversation is possible, which is helpful to the user.
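The repetition of "recognize a response" and "output a further conversation" amounts to a simple loop. The sketch below is only an illustration under assumptions: `run_conversation`, `get_user_response`, `make_reply`, and the `max_turns` limit are hypothetical, and the recognition step is reduced to a callback that returns `None` when no response is recognized.

```python
def run_conversation(opening_line, get_user_response, make_reply, max_turns=10):
    """Alternate character lines and user responses.

    get_user_response() returns the recognized response, or None when no
    response is recognized; make_reply(response) returns the character's
    further line. max_turns is an assumed safety limit, not from the text.
    """
    transcript = [("character", opening_line)]
    for _ in range(max_turns):
        response = get_user_response()
        if response is None:  # no response recognized: stop continuing
            break
        transcript.append(("user", response))
        transcript.append(("character", make_reply(response)))
    return transcript


# Usage with canned inputs standing in for the input unit.
canned = iter(["I like sunflowers", None])
log = run_conversation(
    "Sunflowers have such a cheerful image, don't they?",
    lambda: next(canned),
    lambda response: f"You said: {response}",
)
```

Here the conversation ends after one exchange because the second recognition attempt yields no response; how the device actually decides that the exchange is over is described separately as the end condition of the conversation mode.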

In this way, after outputting the "conversation relating to the content" via the interface 13, the control unit 11, upon recognizing a "response from the user" via the interface 13, performs control to output a "conversation in reply to the response" via the interface 13. Step S8 is the step that causes the control unit 11 to perform such control.

When conducting a conversation in step S8, the control unit 11 may, in combination with voice input and output, display as text on the display unit 15 the content of the utterances input through the voice input unit 18 (the user's utterances) and the content of the utterances of the virtual assistant device 10 (for example, the utterances of the character 70).

Although FIGS. 11 to 14 show an example of providing content of a type registered in association with the virtual assistant device 10 (a type desired by the user), the content and conversation provided in step S8 are not limited to this example.

5-3. Details of normal operation during periods with no new distribution
The following description concerns the details of the normal operation performed during periods with no new distribution (periods in which No is determined in step S5). During such a period, the control unit 11 mainly switches between a release mode (standby mode) and a valid mode (conversation mode). In the present embodiment, the mode in which the processing of step S4 is continued during such a period is an example of the release mode, and the mode in which the processing of step S3 is continued during such a period is an example of the valid mode.

The valid mode (conversation mode) lasts from the Yes determination in step S2 until an end condition of the valid mode is satisfied. The end condition may be that the content executed in the valid mode has finished, that no predetermined input (for example, voice input or a touch panel operation) has been made to the interface for a certain time or longer, or that the user has made an input to the interface instructing termination (such as a voice input or touch operation instructing termination). Specifically, the valid mode is a mode in which the control unit 11 detects words other than the wake word. When a voice is input to the interface during the valid mode, the control unit 11 analyzes the voice and recognizes the words it represents by a known method.

The release mode (standby mode) is the mode during periods in which No is determined in step S2, that is, the mode in which the valid mode is released. In the release mode, the control unit 11 monitors whether the wake word has been input to the interface. When a voice is input to the interface during the release mode, the control unit 11 determines by a known method whether the voice is the wake word. However, the control unit 11 does not recognize anything other than the wake word in the voice input to the interface during the release mode. In other words, in the release mode the control unit 11 need not recognize speech other than the wake word and only has to determine whether the input is the wake word, so speech recognition can be simpler than in the valid mode.

As shown in the flowchart of FIG. 5, the control unit 11 is set to the release mode while the No determinations in steps S2 and S5 are repeated, and during the release mode it continuously attempts to detect the wake word. When the control unit 11 detects during the release mode that the wake word has been input by voice to the interface 13, it determines Yes in step S2, switches to the valid mode, and advances the process to step S3. It then remains in the valid mode until the end condition of the valid mode is satisfied. When the end condition is satisfied, the control unit 11 switches to the release mode and remains in it. During the release mode, the control unit 11 can also cause the display unit 15 to display an image in which the character performs a standby action.
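The switching between the release mode and the valid mode can be sketched as a two-state machine. This is a minimal sketch under assumptions, not the method of the specification: `Mode`, `ModeController`, `on_speech`, and `on_end_condition` are hypothetical names, the wake word "Hana" in the usage note is invented, and wake-word detection is reduced to an exact string comparison in place of the "known method" the text refers to.

```python
from enum import Enum


class Mode(Enum):
    RELEASE = "release"  # standby mode: only the wake word is checked
    VALID = "valid"      # conversation mode: other words are recognized


class ModeController:
    """Two-state sketch of the release/valid mode switching."""

    def __init__(self, wake_word):
        self.wake_word = wake_word
        self.mode = Mode.RELEASE

    def on_speech(self, text):
        """Return the recognized words, or None if the speech is ignored."""
        if self.mode is Mode.RELEASE:
            # Only decide whether the input is the wake word.
            if text == self.wake_word:
                self.mode = Mode.VALID
            return None
        # Valid mode: the speech is passed on as recognized words.
        return text

    def on_end_condition(self):
        """Called when an end condition of the valid mode is satisfied."""
        self.mode = Mode.RELEASE
```

With a controller created as `ModeController("Hana")`, speech other than "Hana" is ignored until the wake word arrives, after which every utterance is returned as recognized text until `on_end_condition()` puts the controller back into the release mode.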

The content provided by the virtual assistant device in steps S3 and S8 may be content that simply displays a still image or plays a video, or may be content that offers a simulated experience. It is desirable that the user be able to select which content is carried out in step S3. The user may select content, for example, by having the names of a plurality of types of content displayed on the touch-panel display device 20 and choosing the desired one; alternatively, the character 70 may propose a type of content during the conversation, for example by asking, "Shall we watch a travel video?", and that content may be selected when the user gives the corresponding instruction (such as a tap operation or voice input of an instructing word). Content selection is not limited to these methods and may be performed in other ways.

５－４．会話の具体例 5-4. Specific Example of Conversation
次の説明は、ステップＳ３、Ｓ８でなされる会話の具体例に関する。 The following explanation relates to a specific example of the conversation that takes place in steps S3 and S8.
以下で説明される具体例は、キャラクタ７０からの話を出力部が出力することが前提の例である。本実施形態では、音声出力部１６及び表示部１５が出力部の一例に相当し、キャラクタ７０からの話を出力する機能を有する。キャラクタ７０からの話を出力する動作は、キャラクタ７０が表示部１５に表示された状態でキャラクタ７０からの話を音声や文字表示などによって出力する動作であってもよい。或いは、キャラクタ７０からの話を出力する動作は、キャラクタ７０が表示部１５に表示された状態でキャラクタ７０からの話を音声によって出力する動作とキャラクタ７０が表示されずにキャラクタ７０からの話を音声によって出力する動作とが併用されてもよい。また、以下で説明される具体例は、利用者からの話が入力部に入力され得ることが前提の例である。本実施形態では、音声入力部１８及び操作部１７が入力部の一例に相当し、利用者からの話が入力される装置として機能する。制御部１１は、各種制御を行う機能を有し、例えば、表示部１５にキャラクタ７０の画像を表示させる機能、出力部に出力動作を行わせる機能、入力部に入力された情報を解析する機能、などを有する。 The specific examples described below assume that the output unit outputs a story from the character 70. In this embodiment, the audio output unit 16 and the display unit 15 correspond to an example of the output unit and have the function of outputting a story from the character 70. The operation of outputting a story from the character 70 may be an operation of outputting the story by voice, text display, or the like while the character 70 is displayed on the display unit 15. Alternatively, an operation of outputting the story by voice while the character 70 is displayed on the display unit 15 and an operation of outputting the story by voice without the character 70 being displayed may be used together. The specific examples described below also assume that a story from the user can be input to the input unit. In this embodiment, the voice input unit 18 and the operation unit 17 correspond to an example of the input unit and function as devices to which a story from the user is input. The control unit 11 has functions for performing various kinds of control, such as a function of displaying an image of the character 70 on the display unit 15, a function of causing the output unit to perform an output operation, and a function of analyzing information input to the input unit.

本実施形態では、上述のように、制御部１１は、有効モードと解除モードとを切り替えるように動作する。有効モードは、上記入力部に入力される話の解析を有効化するモードである。解除モードは、上記有効モードを解除したモードである。解除モード中には、「入力部に入力される話の解析」は行われない。 In this embodiment, as described above, the control unit 11 operates to switch between the valid mode and the cancel mode. The valid mode is a mode for validating the analysis of the story input to the input section. The cancellation mode is a mode in which the above-mentioned valid mode is canceled. During the release mode, "analysis of the story input to the input section" is not performed.

制御部１１は、図５の制御においてステップＳ３の処理で採用する会話として、様々な種類の会話を採用し得るが、ステップＳ３の処理を行う時点で予め定められた提供条件が成立している場合には、話題画像を表示させるように表示部１５を制御する。上記提供条件は、話題画像を提供する条件である。上記提供条件は、予め定められた時間条件が成立したことであってもよく、話題画像を含むコンテンツの配信がバーチャルアシスタント装置１０に対してなされたことであってもよく、その他の条件であってもよい。予め定められた時間条件は、予め定められた時刻、日にち、曜日のいずれかが到来したことであってもよく、前回の話題画像の提供終了から一定時間が経過したことであってもよく、その他の時間条件であってもよい。 The control unit 11 can adopt various types of conversation in the process of step S3 in the control of FIG. 5. If a predetermined provision condition is satisfied at the time step S3 is performed, the control unit 11 controls the display unit 15 to display a topic image. The provision condition is a condition for providing a topic image. It may be that a predetermined time condition is satisfied, that content including a topic image has been distributed to the virtual assistant device 10, or some other condition. The predetermined time condition may be that a predetermined time, date, or day of the week has arrived, that a certain amount of time has passed since provision of the previous topic image ended, or some other time condition.
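A provision-condition check along these lines might look like the following sketch. The function name, its parameters, and the choice to treat the listed conditions as alternatives joined by "or" are assumptions made for illustration; the specification only says that each of these may serve as the provision condition.

```python
from datetime import datetime, timedelta
from typing import Optional


def provision_condition_met(
    now: datetime,
    last_provision_end: Optional[datetime],
    min_interval: timedelta,
    scheduled_weekdays: frozenset = frozenset(),  # 0 = Monday ... 6 = Sunday
    content_delivered: bool = False,
) -> bool:
    """Return True if a topic image should be provided at step S3 (hypothetical rule set)."""
    # Content including a topic image has been distributed to the device.
    if content_delivered:
        return True
    # A predetermined day of the week has arrived.
    if now.weekday() in scheduled_weekdays:
        return True
    # A certain time has passed since provision of the previous topic image ended.
    if last_provision_end is not None and now - last_provision_end >= min_interval:
        return True
    return False
```

A real device would likely make the set of active conditions configurable; here they are simply checked in order.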

制御部１１は、ステップＳ３の処理において上記話題画像を表示するように表示部１５を制御する場合、表示する話題画像を決定する。話題画像は、キャラクタ７０の画像とは異なる画像であり、会話の話題となる画像である。「会話の話題となる」とは、少なくとも話題画像と対応付けられた言葉が利用者から発せられた場合に、その言葉に関連する話をバーチャルアシスタント装置１０が提供するようにして話題になることを含む。 When controlling the display unit 15 to display the topic image in the process of step S3, the control unit 11 determines the topic image to be displayed. The topic image is an image different from the image of the character 70 and serves as a topic of conversation. "Serving as a topic of conversation" includes at least the case where, when the user utters a word associated with the topic image, the virtual assistant device 10 provides a story related to that word so that it becomes a topic.

本実施形態では、複数種類の話題画像が予め用意されており、ステップＳ３の処理で話題画像を提供する場合には、いずれかの話題画像を選定して表示する。本実施形態では、図１６に示される話題画像Ａ、話題画像Ｂのように、複数の話題画像の画像データが予め用意されており、図１６には図示されていないが、話題画像Ｃ、話題画像Ｄ・・・などの多数の画像データも用意されている。用意される話題画像の種類は特に限定されないが、例えば、風景、旅、自然、植物、動物、建物、食べ物、イベント、活動の様子など、様々な画像が挙げられる。 In this embodiment, multiple types of topic images are prepared in advance, and when a topic image is provided in the process of step S3, one of them is selected and displayed. Image data for multiple topic images, such as topic image A and topic image B shown in FIG. 16, is prepared in advance; although not shown in FIG. 16, image data for many other topic images, such as topic image C and topic image D, is also prepared. The types of topic images are not particularly limited, and include, for example, images of landscapes, travel, nature, plants, animals, buildings, food, events, and activities.

更に、図１６の例では、複数の話題画像Ａ，Ｂ，Ｃ，Ｄ・・・の各々の画像データには各話題画像に関連する１以上の言葉のデータが対応付けられて用意されている。話題画像に対応付ける言葉は、１つの単語又は少数の単語の組み合わせからなるキーワードが望ましく、例えば短い語数の名詞が好適例である。但し、この例に限定されず、例えば短文などであってもよい。 Furthermore, in the example of FIG. 16, image data for each of the plurality of topic images A, B, C, D, etc. is prepared in association with data of one or more words related to each topic image. . The word to be associated with the topic image is preferably a keyword consisting of one word or a combination of a small number of words; for example, a noun with a short number of words is a suitable example. However, it is not limited to this example, and may be a short sentence, for example.

更に、図１６の例では、更に、各々の言葉のデータには、各言葉に関連する１以上の話のデータが対応付けられて用意されている。例えば、話題画像Ａに対応付けられて言葉Ａ１，Ａ２，Ａ３・・・が用意されており、言葉Ａ１に対応付けられて話Ａ１１，Ａ１２，Ａ１３・・・・が用意され、言葉Ａ２に対応付けられて話Ａ２１，Ａ２２，Ａ２３・・・・が用意されている。話題画像Ｂについても、同様の対応付けがなされている。話題画像に対応付ける言葉としては、風景、植物、動物、建物、食べ物、イベント、活動などの名称、画像の場所の地名などが挙げられる。言葉に関連する１以上の話は、その言葉が含まれた話であることが望ましい。例えば、「熱海」という言葉に関連する話としては、「熱海はいいですよね。」といった話のように、言葉の対象を称賛する話や、「私は熱海にいったことがないな。」といった話のように、言葉の対象に関連する経験の話であってもよく、「熱海は盛況のようですよ。」といった話のように、言葉の対象についての状況に関する話であってもよい。 Furthermore, in the example of FIG. 16, data for each word is associated with data for one or more stories related to that word. For example, words A1, A2, A3, and so on are prepared in association with topic image A; stories A11, A12, A13, and so on are prepared in association with word A1; and stories A21, A22, A23, and so on are prepared in association with word A2. The same kind of association is made for topic image B. Words associated with topic images include names of landscapes, plants, animals, buildings, foods, events, and activities, as well as place names of the locations shown in the images. It is desirable that each story related to a word contain that word. For example, a story related to the word "Atami" may praise the subject of the word, such as "Atami is nice, isn't it?"; may describe an experience related to the subject, such as "I've never been to Atami."; or may describe the situation of the subject, such as "Atami seems to be thriving."

各々の話題画像に対して１以上の言葉及び各言葉に関連する話を対応付け対応データが図１６のようなデータ構造で記憶部１４に記憶されているため、いずれかの話題画像が選定された場合には、当該話題画像のデータ、当該話題画像に対応付けられたいずれかの言葉のデータ、当該言葉に対応付けられたいずれかの話のデータを読み出すことができる。 Correspondence data that associates each topic image with one or more words, and each word with related stories, is stored in the storage unit 14 in a data structure like that of FIG. 16. Therefore, when a topic image is selected, the data of that topic image, the data of any word associated with it, and the data of any story associated with that word can be read out.
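The image-to-words-to-stories association of FIG. 16 can be pictured as a nested structure. The sketch below is one possible in-memory layout, not the storage format of the embodiment; the class names, field names, the file path, and the `attributes` field are hypothetical, while the sample words and stories are taken from the examples in the description.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    text: str                 # keyword shown with the topic image
    stories: list             # candidate stories prepared for this keyword


@dataclass
class TopicImage:
    image_id: str
    image_path: str           # hypothetical location of the image data
    words: list               # list of Word
    attributes: set = field(default_factory=set)  # e.g. {"travel"}


# Hypothetical contents corresponding to "topic image A" in FIG. 16.
TOPIC_IMAGES = {
    "A": TopicImage(
        image_id="A",
        image_path="images/atami_coast.png",
        words=[
            Word("熱海", ["熱海はいいですよね。", "熱海は盛況のようですよ。"]),
            Word("海水浴", ["いよいよ夏本番ですね。"]),
        ],
        attributes={"travel"},
    ),
}


def lookup_stories(image_id: str, word_text: str) -> list:
    """Read out the stories associated with a word of a given topic image."""
    image = TOPIC_IMAGES[image_id]
    for word in image.words:
        if word.text == word_text:
            return word.stories
    return []
```

Selecting topic image "A" and the word "熱海" then reads out the two stories registered for that word.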

図１６のように複数種類の話題画像を用意しておく場合、各々の話題画像がどの属性に属するかを識別する識別情報が付されていてもよい。例えば、話題画像Ａは、「旅行」の属性に属し、話題画像Ｂは、「スポーツ」の属性に属し、話題画像Ｃは、「花」「イベント」の属性に属するいったように、識別情報を付しておくことができる。各々の話題画像の属性を特定する識別情報は、１種類の属性のみを特定してもよく、２種類以上の属性を特定してもよい。 When multiple types of topic images are prepared as in FIG. 16, each topic image may be given identification information that identifies the attribute or attributes to which it belongs. For example, topic image A may be tagged as belonging to the "travel" attribute, topic image B to the "sports" attribute, and topic image C to the "flowers" and "event" attributes. The identification information for each topic image may specify only one attribute or two or more attributes.

制御部１１がステップＳ３で話題画像を選定する方法は、図１５のように複数用意された話題画像からランダムに選定する方法であってもよく、複数用意された話題画像を予め決められた順序で順番に選定してもよく、複数用意された話題画像の中から利用者が設定した条件に合致した話題画像を選定してもよい。例えば、上述された「利用者が希望する情報の種類」や「利用者に有意義な情報」として、１以上の種類が登録情報に含まれている場合に、登録された種類に属する話題画像を選定してもよい。例えば、「利用者が希望する情報の種類」や「利用者に有意義な情報」として「旅行」が登録されている場合に、「旅行」に属する話題画像を選定してもよい。 The control unit 11 may select a topic image in step S3 at random from the multiple topic images prepared as in FIG. 15, may select from them one after another in a predetermined order, or may select a topic image that matches a condition set by the user. For example, when one or more types are included in the registration information as the above-mentioned "type of information desired by the user" or "information meaningful to the user", a topic image belonging to a registered type may be selected. For example, if "travel" is registered as the "type of information desired by the user" or "information meaningful to the user", a topic image belonging to "travel" may be selected.
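The three selection methods named above (random, fixed order, and matching a user-registered type) can be sketched as follows. The function names and the dictionary layout of each topic image are assumptions for illustration only.

```python
import random
from itertools import cycle


def select_random(images, rng=random):
    """Pick one topic image at random."""
    return rng.choice(images)


def make_sequential_selector(images):
    """Return a function that yields topic images in a fixed, repeating order."""
    it = cycle(images)
    return lambda: next(it)


def select_by_attributes(images, registered_types):
    """Keep only topic images whose attributes overlap the user's registered types."""
    return [img for img in images if img["attributes"] & registered_types]


# Hypothetical catalogue mirroring the attribute example in the description.
IMAGES = [
    {"id": "A", "attributes": {"travel"}},
    {"id": "B", "attributes": {"sports"}},
    {"id": "C", "attributes": {"flowers", "event"}},
]
```

With "travel" registered, `select_by_attributes(IMAGES, {"travel"})` narrows the candidates to topic image A, after which either of the other two selectors could be applied to the filtered list.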

制御部１１は、いずれかの話題画像を選定した場合に、選定された話題画像に対応する複数の言葉の中からいずれかの言葉を選定するが、言葉の選定方法は、選定された話題画像に対応付けられた複数の言葉の中からランダムに選定する方法であってもよく、複数の言葉の中から予め決められた順序で順番に選定してもよく、複数の言葉の中から利用者が設定した条件に合致した話題画像を選定してもよい。選定された話題画像に対応する言葉を選定する場合、当該話題画像に対応付けられた複数の言葉の中から１つのみを選定してもよく、複数の言葉を選定してもよい。 When a topic image has been selected, the control unit 11 selects a word from the multiple words corresponding to the selected topic image. The word may be selected at random from the multiple words associated with the selected topic image, or may be selected from those words one after another in a predetermined order, or a topic image that matches a condition set by the user may be selected from among the multiple words. When selecting a word corresponding to the selected topic image, only one word may be selected from the multiple words associated with that topic image, or multiple words may be selected.

制御部１１は、ステップＳ３の処理で上記話題画像を表示する場合、上述のように選定された話題画像及び言葉を互いに関連付けて表示する。例えば、図１５のような熱海の海岸の風景を示す話題画像８２Ａが選定され、話題画像８２Ａに対応付けられた複数の言葉の中から「熱海」の言葉が選定された場合には、図１５のように、選定された話題画像８２Ａ及び言葉８４Ａを同時期に表示するように制御を行う。図１５の例では、キャラクタ７０の画像とともに、キャラクタ７０の背景画像として話題画像８２Ａを表示し、この話題画像８２Ａに重ねた形で「熱海」の言葉を同時期に表示している。話題画像８２Ａを表示する場合に話題画像８２Ａに対応する言葉８４Ａを表示する期間は、話題画像８２Ａが表示される全期間にわたって言葉８４Ａを表示してもよく、話題画像８２Ａが表示される一部期間に言葉８４Ａを表示してもよい。また、話題画像８２Ａと言葉８４Ａを関連付けて表示する方法は、両方を同時期に表示する方法でなくてもよい。例えば、「熱海」の言葉８４Ａを一定期間表示した後、「熱海」の言葉８４Ａの表示から話題画像８２Ａの表示に即座に切り替えるように表示を行ってもよい。 When displaying the topic image in the process of step S3, the control unit 11 displays the topic image and words selected as described above in association with each other. For example, if the topic image 82A showing the scenery of the coast of Atami as shown in FIG. 15 is selected and the word "Atami" is selected from among the plural words associated with the topic image 82A, Control is performed so that the selected topic image 82A and words 84A are displayed at the same time. In the example of FIG. 15, a topic image 82A is displayed as a background image of the character 70 together with the image of the character 70, and the word "Atami" is displayed at the same time as superimposed on the topic image 82A. When displaying the topic image 82A, the period for displaying the word 84A corresponding to the topic image 82A may be such that the word 84A is displayed for the entire period during which the topic image 82A is displayed, or for a portion of the period in which the topic image 82A is displayed. The word 84A may be displayed in the period. Further, the method of displaying the topic image 82A and the word 84A in association with each other does not necessarily have to be a method of displaying both at the same time. For example, after displaying the word "Atami" 84A for a certain period of time, the display may be performed such that the display of the word "Atami" 84A is immediately switched to the display of the topic image 82A.

制御部１１は、ステップＳ３の処理において図１５のように話題画像８２Ａの表示を行う場合、話題画像８２Ａとともに表示部１５に表示される言葉８４Ａをキャラクタ７０が読み上げるように音声出力部１６に音声を出力させてもよい。 When displaying the topic image 82A as in FIG. 15 in the process of step S3, the control unit 11 may cause the audio output unit 16 to output audio such that the character 70 reads aloud the word 84A displayed on the display unit 15 together with the topic image 82A.

上述の例では、図５の制御を実行するためのプログラムにおいて、ステップＳ３又はステップＳ８を実行するためのプログラムの一部が、「キャラクタ７０の画像とは異なる話題画像及び話題画像に関する言葉を表示部１５に表示させる制御を、制御部１１に行わせるステップ」の一例に相当する。制御部１１は、このプログラムに従って、図１５のようにキャラクタ７０の画像とは異なる話題画像及び当該話題画像に関する言葉を表示部１５に表示するように動作する。 In the above example, within the program for executing the control of FIG. 5, the part of the program for executing step S3 or step S8 corresponds to an example of "a step of causing the control unit 11 to perform control to display, on the display unit 15, a topic image different from the image of the character 70 and words related to the topic image". In accordance with this program, the control unit 11 operates to display a topic image different from the image of the character 70 and words related to that topic image on the display unit 15, as shown in FIG. 15.

上述の例では、図５の制御を実行するためのプログラムにおいて、ステップＳ３又はステップＳ８を実行するためのプログラムの一部が、「話題画像及び言葉が表示部１５に表示された後、上記言葉を含む話が入力部に入力された場合に、上記言葉に関連する話をキャラクタ７０からの話として出力部に出力させる制御を制御部１１に行わせるステップ」の一例に相当する。制御部１１は、このプログラムに従い、話題画像及び言葉が表示部１５に表示された後、上記言葉を含む話が入力部に入力された場合に、上記言葉に関連する話をキャラクタ７０からの話として出力部に出力させるように動作する。 In the above example, within the program for executing the control of FIG. 5, the part of the program for executing step S3 or step S8 also corresponds to an example of "a step of causing the control unit 11 to perform control to cause the output unit to output a story related to the words as a story from the character 70 when a story including the words is input to the input unit after the topic image and the words are displayed on the display unit 15". In accordance with this program, when a story including the words is input to the input unit after the topic image and the words are displayed on the display unit 15, the control unit 11 operates to cause the output unit to output a story related to the words as a story from the character 70.

制御部１１は、ステップＳ３又はＳ８において上述の話題画像及び言葉の表示を行った場合、その表示後、所定の終了条件が成立するまで有効モードを継続する。具体的には、上述の話題画像及び言葉の両方の表示がなされてから所定期間の間、上記言葉を含む話が入力部（例えば音声入力部１８）に入力されたか否かを確認する。「話題画像及び言葉の両方がなされてからの所定期間」は、例えば、「話題画像の表示が開始されたこと」及び「当該話題画像に対応付けられた言葉の表示が開始されたこと」の両条件を満たした時点からの一定期間であってもよく、上記両条件を満たした後、所定条件（例えば、上記言葉を含まない音声が入力されたこと、上記言葉を含まない音声が一定時間継続したこと等）が成立するまでの期間であってもよい。上述の終了条件（有効モードの終了条件）は、上記所定期間が経過したことであってもよく、上記所定期間の経過後、他の条件が成立したことであってもよい。他の条件を採用する場合、他の条件は、上記所定期間よりも長い規定期間が経過したことであってもよく、設定時刻が到来したことであってもよく、操作部に対して所定操作があったことであってもよい。 When the above-described topic image and words have been displayed in step S3 or S8, the control unit 11 continues the valid mode after the display until a predetermined end condition is satisfied. Specifically, for a predetermined period after both the topic image and the words are displayed, it checks whether a story including the words has been input to the input unit (for example, the voice input unit 18). The "predetermined period after both the topic image and the words are displayed" may be, for example, a fixed period starting from the point at which both of the conditions "display of the topic image has started" and "display of the words associated with the topic image has started" are satisfied, or may be the period from when both conditions are satisfied until a predetermined condition is satisfied (for example, that speech not including the words has been input, or that speech not including the words has continued for a certain time). The above end condition (the end condition of the valid mode) may be that the predetermined period has elapsed, or that another condition is satisfied after the predetermined period has elapsed. When another condition is adopted, it may be that a specified period longer than the predetermined period has elapsed, that a set time has arrived, or that a predetermined operation has been performed on the operation unit.

制御部１１は、有効モード中において話題画像及び当該話題画像に対応する言葉の両方の表示がなされてから上記所定期間の間に上記言葉を含む話が音声入力部１８に入力されたか否かを確認し、上記有効モード中の上記所定期間の間に上記言葉を含む話が入力部（例えば音声入力部１８）に入力された場合には、上記言葉に関連する話をキャラクタ７０からの話として出力部（例えば、音声出力部１６）に出力させるように動作する。この場合、上記言葉に対応付けて用意された複数の話（図１６のように用意された候補となる複数の話）の中からいずれかの話を選定して出力するが、話の選定方法は、上記言葉に対応付けられて用意された複数の話の中からランダムに選定する方法であってもよく、複数の話の中から予め決められた順序で順番に選定してもよく、複数の話の中から利用者が設定した条件に合致した話を選定してもよい。 The control unit 11 determines whether a story including the word has been input to the audio input unit 18 during the predetermined period after both the topic image and the word corresponding to the topic image are displayed in the valid mode. If this is confirmed, and a story including the word is input to the input section (for example, the voice input section 18) during the predetermined period in the valid mode, the story related to the word is read as a story from the character 70. It operates to cause the output section (for example, the audio output section 16) to output the signal. In this case, one of the stories is selected and output from among the multiple stories prepared in association with the above words (multiple candidate stories prepared as shown in Figure 16), but the story selection method may be selected at random from among a plurality of stories prepared in association with the above words, or may be selected sequentially from among a plurality of stories in a predetermined order; A story that matches the conditions set by the user may be selected from among the stories.

例えば、制御部１１は、図１５のように話題画像（熱海の海岸の画像）とともに当該話題画像に関連させた言葉として「熱海」という単一のキーワードを表示するように制御を行った場合、上記話題画像及び上記単一のキーワードの表示後において上記所定期間にわたって音声入力部１８に入力された音声を解析する。そして、制御部１１は、上記所定期間の間に上記単一のキーワードを含む音声が音声入力部１８に入力されたか否かを確認する。単一のキーワードを含む音声は、「熱海」のように当該単一のキーワードのみを発した音声であってもよく、「熱海だね」「熱海はいいね」などのように、当該単一のキーワードに他の語が加えられた音声であってもよい。図１５の例では、制御部１１は、上記所定期間の間に上記単一のキーワードを含む音声が音声入力部１８に入力されたと判定した場合、上記単一のキーワードに対応する話を音声出力部１６に出力させるように動作してもよい。例えば、上記所定期間の間に、「熱海だね」といった音声が音声入力部１８に入力され、制御部１１が、この音声の解析によって「熱海」のキーワードが含まれると判定した場合、制御部１１は、「熱海」の言葉に対応付けられて用意された複数の話の中からいずれかの話を選定し、表示部１５にキャラクタ７０を表示させながら、選定した話の音声を音声出力部１６によって出力する。 For example, suppose the control unit 11 performs control to display, together with a topic image (an image of the coast of Atami) as in FIG. 15, the single keyword "Atami" as a word related to that topic image. In that case, after the topic image and the single keyword are displayed, the control unit 11 analyzes the audio input to the voice input unit 18 over the predetermined period and checks whether speech including the single keyword was input to the voice input unit 18 during that period. Speech including the single keyword may consist of the keyword alone, such as "Atami", or may add other words to the keyword, such as "It's Atami" or "Atami is nice". In the example of FIG. 15, when the control unit 11 determines that speech including the single keyword was input to the voice input unit 18 during the predetermined period, it may cause the audio output unit 16 to output a story corresponding to the keyword. For example, if speech such as "It's Atami" is input to the voice input unit 18 during the predetermined period and the control unit 11 determines by analyzing this speech that the keyword "Atami" is included, the control unit 11 selects one story from the multiple stories prepared in association with the word "Atami" and outputs the audio of the selected story through the audio output unit 16 while displaying the character 70 on the display unit 15.
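The keyword check within the predetermined period can be sketched as follows. This is a simplified illustration: it assumes the speech has already been converted to text, uses a plain substring test in place of the speech analysis, and the function name, parameter names, and the 30-second window in the usage note are hypothetical.

```python
import random


def respond_to_utterance(utterance, displayed_word, stories,
                         elapsed_s, window_s, rng=random):
    """Return a story to output as the character's speech, or None.

    utterance      -- recognized text of the user's speech (assumed already transcribed)
    displayed_word -- keyword shown with the topic image, e.g. "熱海"
    stories        -- stories prepared in association with the keyword
    elapsed_s      -- seconds since both the image and the keyword were displayed
    window_s       -- length of the predetermined period, in seconds
    """
    if elapsed_s > window_s:
        return None                      # predetermined period has passed
    if displayed_word not in utterance:
        return None                      # keyword was not uttered
    return rng.choice(stories)           # random selection is one of the allowed methods
```

For example, "熱海だね" spoken 5 seconds after the display, with a 30-second window, returns one of the stories registered for "熱海", while "写真がどこなのか教えてよ。" returns `None` because the displayed keyword is absent.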

なお、図１５のように表示しても、「熱海」の言葉が発せられない懸念もある。そこで、図１５のような表示がなされた状態で、「写真がどこなのか教えてよ。」といった話（表示される言葉を含まない話）が利用者から発せられた場合、「気になりますよね。写真に書いてある文字情報を読み上げていただけませんか。」といった話を、キャラクタ７０からの話として音声出力し、画面に表示された文字情報の読み上げを促してもよい。その後、上記所定期間の間に、「「熱海」って書いてあるよ。」といった話（表示される言葉を含む話）が入力された場合に、「熱海」に対応付けられた話として、「東京からも新幹線で行けますし、近くて有名な観光地ですね。」といった話を音声出力してもよい。 Even with a display like FIG. 15, there is a concern that the word "Atami" may not be uttered. Therefore, if the user says something like "Tell me where this photo is." (a story that does not include the displayed word) while the display of FIG. 15 is shown, a story such as "You're curious, aren't you? Could you read out the text written on the photo?" may be output by voice as a story from the character 70 to prompt the user to read out the text displayed on the screen. Then, if a story such as "It says 'Atami'." (a story that includes the displayed word) is input during the predetermined period, a story such as "You can get there from Tokyo by Shinkansen, and it's a famous tourist destination close by." may be output by voice as a story associated with "Atami".

このように制御部１１は、言葉に関連する話をキャラクタ７０からの話として出力する。キャラクタ７０からの話として出力する方法は、キャラクタ７０を表示させながら話を音声として出力する方法でもよく、キャラクタ７０を表示せずにキャラクタ７０の声で話を音声として出力する方法でもよい。 In this way, the control unit 11 outputs a story related to words as a story from the character 70. The method of outputting the story as a story from the character 70 may be a method of outputting the story as audio while displaying the character 70, or a method of outputting the story as audio with the voice of the character 70 without displaying the character 70.

なお、制御部１１は、このように上記有効モード中に上記言葉に関連する話をキャラクタ７０からの話として出力部（例えば、音声出力部１６）に出力させた場合、その後、上記有効モードを継続しつつ上記利用者からの話を受け付ける期間を設けた後、予め定められた終了条件が成立するまで上記有効モードを継続するように動作してもよい。制御部１１は、上記言葉に関連する話を出力した後、有効モードを継続する場合、利用者から追加の話が音声入力部１８に入力された場合には、その話に対応する話を音声出力部１６によって出力するように動作してもよい。例えば、図１５の例において、「熱海」の言葉に対応付けられた話を選定し、キャラクタ７０を表示させながら選定した話の音声出力した後、利用者から「熱海」に関する追加の話が音声入力部１８に入力された場合には、「熱海」に対応付けられて用意された複数の話の中から既に出力した話以外の話を選定し、キャラクタ７０を表示させながら選定した話を音声出力してもよい。バーチャルアシスタント装置１０では、このような会話を有効モードの終了まで継続することができる。 When the control unit 11 has caused the output unit (for example, the audio output unit 16) to output a story related to the word as a story from the character 70 during the valid mode, it may thereafter provide a period for accepting a story from the user while continuing the valid mode, and then continue the valid mode until a predetermined end condition is satisfied. If the control unit 11 continues the valid mode after outputting a story related to the word and an additional story from the user is input to the voice input unit 18, it may cause the audio output unit 16 to output a story corresponding to that input. For example, in the example of FIG. 15, after a story associated with the word "Atami" is selected and output by voice while the character 70 is displayed, if an additional story about "Atami" is input from the user to the voice input unit 18, a story other than those already output may be selected from the multiple stories prepared in association with "Atami" and output by voice while the character 70 is displayed. The virtual assistant device 10 can continue such a conversation until the valid mode ends.
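Selecting "a story other than those already output" for a follow-up turn can be sketched as below. The function name and the use of a set to remember output stories are assumptions for illustration; the description does not say how the device tracks which stories were used.

```python
def next_unused_story(stories, already_output):
    """Return the first prepared story not yet output, recording it as used.

    stories        -- stories prepared in association with the keyword
    already_output -- set of stories output so far in this conversation (mutated)
    Returns None when every prepared story has been used.
    """
    for story in stories:
        if story not in already_output:
            already_output.add(story)
            return story
    return None
```

Each follow-up utterance about the same keyword then yields a different story until the prepared list is exhausted; what the device does at that point is left open here.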

なお、本実施形態では、図１６のように、いずれかの話題画像に対応させて複数の言葉が用意されている。このような例では、いずれかの話題画像を「複数種類のうちのいずれか１種以上の言葉」と共に第１の組み合わせで表示部１５に表示して上述のように会話を実現した後（具体的には、第１の組み合わせに含まれる言葉が音声入力された場合に、その言葉に対応する話を音声出力するように会話を実現した後）、上記話題画像を上記第１の組み合わせとは異なる組み合わせで当該話題画像に対応付けられた言葉と共に表示部１５に表示し、別観点の会話を実現してもよい。この場合、話題画像と言葉を第１の組み合わせとは異なる組み合わせで表示する時期（話題画像再表示時期）は、様々に設定することができ、例えば、第１の組み合わせで表示がなされてから、所定の経過時間（例えば、所定時間、所定日数等）が経過した後の時期であってもよく、所定の日にちや曜日が到来した時期であってもよく、その他の時期であってもよい。 In this embodiment, as shown in FIG. 16, multiple words are prepared in association with any given topic image. In such an example, a topic image may first be displayed on the display unit 15 together with "one or more of the multiple types of words" as a first combination, and a conversation realized as described above (specifically, a conversation in which, when a word included in the first combination is input by voice, a story corresponding to that word is output by voice). After that, the same topic image may be displayed on the display unit 15 together with words associated with it in a combination different from the first combination, to realize a conversation from a different perspective. The time at which the topic image and words are displayed in a combination different from the first combination (the topic image re-display time) can be set in various ways. For example, it may be a time after a predetermined elapsed time (a predetermined number of hours, days, or the like) has passed since the display in the first combination, a time when a predetermined date or day of the week has arrived, or another time.

例えば、図１５のような話題画像（熱海の海岸の画像）と言葉の組み合わせが第１の組み合わせである場合、図１５のように第１の組み合わせで表示部１５に表示し、「熱海」が音声入力されることに応じて、「熱海」に対応する話を音声出力するように、第１の組み合わせに関する会話を実現した後、上述の話題画像再表示時期が到来した場合に、図１７のように、図１５と同様の話題画像（熱海の海岸の画像）を上記第１の組み合わせとは異なる組み合わせ（第２の組み合わせ）で当該話題画像に対応付けられた言葉と共に表示部１５に表示してもよい。図１７の例では、上記話題画像（熱海の海岸の画像）と「海水浴」の組み合わせが「第１の組み合わせとは異なる組み合わせ」である。このように表示を行った場合でも、第１の組み合わせでなされた会話と同様に会話を行うことができ、制御部１１は、図１７のように表示を行った後、上記所定期間の間に「かいすいよく」を含む話（例えば、「海水浴、そういう季節だね。」といった話）が音声入力されることに応じて、「海水浴」に対応付けられた話（例えば、「いよいよ夏本番ですね。夏の浜辺でスイカ割りしたいなー。」といった話）を音声出力するように、制御を行えばよい。 For example, suppose the combination of the topic image (an image of the coast of Atami) and the word shown in FIG. 15 is the first combination. After the first combination is displayed on the display unit 15 as in FIG. 15 and a conversation about it is realized, in which a story corresponding to "Atami" is output by voice in response to voice input of "Atami", when the above-mentioned topic image re-display time arrives, the same topic image as in FIG. 15 (the image of the coast of Atami) may be displayed on the display unit 15, as in FIG. 17, together with a word associated with it in a combination different from the first combination (a second combination). In the example of FIG. 17, the combination of the topic image (the image of the coast of Atami) and "sea bathing" is the "combination different from the first combination". Even with this display, a conversation can be held in the same way as with the first combination. After displaying as in FIG. 17, the control unit 11 may perform control such that, in response to voice input during the predetermined period of a story including "kaisuiyoku" (sea bathing) (for example, "Sea bathing, it's that season, isn't it?"), a story associated with "sea bathing" (for example, "Summer is finally in full swing. I'd love to go watermelon splitting on a summer beach.") is output by voice.

更には、図１７のように第２の組み合わせで会話を行った後、上述の話題画像再表示時期が到来した場合には、図１８のように、図１５、図１７と同様の話題画像（熱海の海岸の画像）を第１及び第２の組み合わせとは異なる組み合わせ（第３の組み合わせ）で当該話題画像に対応付けられた言葉と共に表示部１５に表示してもよい。例えば、図１５、図１７、図１８のような画像を異なる日にそれぞれ表示すれば、例えば、同じ話題画像を用いつつ日によって違った話題を提供することができる。 Furthermore, when the above-mentioned topic image re-display time comes after the conversation is performed using the second combination as shown in FIG. 17, the same topic image as in FIGS. 15 and 17 ( Atami beach image) may be displayed on the display unit 15 in a combination (third combination) different from the first and second combinations together with words associated with the topic image. For example, if images such as those shown in FIGS. 15, 17, and 18 are displayed on different days, different topics can be provided depending on the day while using the same topic image.

図１８の例は、話題画像に対応付けて用意された複数の言葉を組み合わせた例である。例えば、図１８のような話題画像８２Ａ（熱海の海岸の画像）に対応付けて予め「熱海」「温泉」といった２種類の言葉が用意されている場合、図１８のように、話題画像８２Ａと共に上記２種類の言葉（「熱海」「温泉」）を同時期に表示するように話題画像及び言葉を提供してもよい。話題画像と対応付けて用意された複数種類の言葉を同時期に表示する方法は、図１８のように複数種類の言葉を１行で表示する方法であってもよく、図１９のように複数行で表示する方法であってもよい。或いは、複数種類の言葉を離間させて別々の場所に表示する方法であってもよい。図１９の例は、話題画像８２Ａ（熱海の海岸の画像）に対応付けて予め「熱海」「温泉」「サンビーチ」といった３種類の言葉が用意された例であり、話題画像８２Ａと関連させて３種類の言葉（「熱海」「温泉」「サンビーチ」）を同時期に複数行で表示する例である。 The example in FIG. 18 is an example in which a plurality of words prepared in association with topic images are combined. For example, if two types of words such as "Atami" and "hot spring" are prepared in advance in association with the topic image 82A (image of the coast of Atami) as shown in FIG. The topic images and words may be provided so that the two types of words (“Atami” and “hot spring”) are displayed at the same time. A method of simultaneously displaying multiple types of words prepared in association with topic images may be a method of displaying multiple types of words in one line as shown in Figure 18, or a method of displaying multiple types of words in one line as shown in Figure 19. It may also be displayed in rows. Alternatively, a method may be used in which multiple types of words are separated and displayed in different locations. The example in FIG. 19 is an example in which three types of words such as "Atami," "hot spring," and "sun beach" are prepared in advance in association with the topic image 82A (image of the coast of Atami). This is an example of displaying three types of words (``Atami'', ``hot spring'', and ``sun beach'') in multiple lines at the same time.

図１８、図１９の例では、制御部１１は、話題画像８２Ａに関する言葉として複数種類のキーワードを表示部１５に表示させ、これら複数種類のキーワードのうちのいずれかキーワードのみを含む話が入力部に入力された場合、入力されたキーワードに対応する話を出力部（例えば音声出力部１６）に出力させるように動作する。例えば、図１８のように第３の組み合わせで表示を行った後、上記所定期間の間に「熱海」「温泉」のうちの「熱海」のみを含む話が音声として音声入力部１８に入力された場合、入力されたキーワード（「熱海」）に対応する話を音声出力部１６に音声出力させるように動作する。 In the examples shown in FIGS. 18 and 19, the control unit 11 causes the display unit 15 to display a plurality of types of keywords as words related to the topic image 82A, and the input unit displays a story that includes only one of these keywords. When the input keyword is input, the output unit (for example, the audio output unit 16) operates to output a story corresponding to the input keyword. For example, after displaying the third combination as shown in FIG. 18, a story containing only "Atami" out of "Atami" and "Onsen" is input as audio to the audio input section 18 during the predetermined period. In this case, the voice output unit 16 is operated to output a voice corresponding to the input keyword (“Atami”).

或いは、図１９のように第４の組み合わせで表示を行った後、上記所定期間の間に「熱海」「温泉」「サンビーチ」のうちの「サンビーチ」のみを含む話（例えば、「サンビーチは知っているよ」といった話）が音声として音声入力部１８に入力された場合、入力されたキーワード（「サンビーチ」）に対応する話（例えば、「この写真は熱海サンビーチの写真です。外国のリゾートのようで雰囲気がとっても気に入ってます。」といった話）を音声出力部１６に音声出力させるように動作する。このようにすれば、確実に関心のある話題について深堀できて、会話がつながる。 Alternatively, after the fourth combination is displayed as in FIG. 19, if a story including only "Sun Beach" out of "Atami", "hot spring", and "Sun Beach" (for example, "I know Sun Beach") is input as voice to the voice input unit 18 during the predetermined period, the control unit 11 operates to cause the audio output unit 16 to output by voice a story corresponding to the input keyword ("Sun Beach") (for example, "This is a photo of Atami Sun Beach. It looks like an overseas resort, and I really like the atmosphere."). In this way, a topic of interest can be reliably explored in depth, and the conversation keeps going.

In the examples of FIGS. 18 and 19, as in the examples of FIGS. 15 and 16, it is desirable that one or more corresponding stories be prepared for each of the plural types of words prepared in association with the topic image. However, one or more corresponding stories may also be prepared for a combination of two or more words selected from those words. For example, in the example of FIG. 18, one or more stories corresponding to "Atami hot spring" may be prepared in addition to the stories corresponding to "Atami" and "hot spring" individually. In this case, when the display of FIG. 18 is shown and a voice containing both "Atami" and "hot spring" (for example, "Today, let's talk about Atami hot spring.") is input within the predetermined period, a story prepared in association with "Atami hot spring" (for example, "There are many hotels, inns, and famous sights around Atami Station. Have you ever been there?") may be output by voice. Similarly, in the example of FIG. 19, in addition to the stories corresponding to "Atami", "hot spring", and "Sun Beach" individually, one or more stories may be prepared in association with each of the following: the combination of "Atami" and "hot spring", the combination of "Atami" and "Sun Beach", the combination of "hot spring" and "Sun Beach", and the combination of "Atami", "hot spring", and "Sun Beach". In this case, when the display of FIG. 19 is shown and a voice containing both "Atami" and "Sun Beach" (for example, "That's Sun Beach in Atami, isn't it?") is input within the predetermined period, a story prepared in association with the combination of "Atami" and "Sun Beach" may be output by voice.
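The lookup by word combination described here can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the table `STORIES`, the function `select_story`, the abbreviated story texts, the use of plain substring matching in place of speech analysis, and the fallback to the largest prepared subset are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical correspondence data: each key is a combination of displayed
# words, each value is the list of stories prepared for that combination.
# Keywords: 熱海 = Atami, 温泉 = hot spring, サンビーチ = Sun Beach.
STORIES: dict[frozenset[str], list[str]] = {
    frozenset({"熱海"}): ["story for Atami"],
    frozenset({"温泉"}): ["story for hot spring"],
    frozenset({"サンビーチ"}): ["story for Sun Beach"],
    frozenset({"熱海", "温泉"}): ["story for Atami hot spring"],
    frozenset({"熱海", "サンビーチ"}): ["story for Atami + Sun Beach"],
}


def select_story(utterance: str, displayed_words: set[str]) -> Optional[str]:
    """Pick a prepared story for the displayed words found in the utterance."""
    # Substring matching stands in for real analysis of the input (assumption).
    spoken = frozenset(w for w in displayed_words if w in utterance)
    if not spoken:
        return None
    # Prefer a story prepared for exactly the spoken combination.
    if spoken in STORIES:
        return STORIES[spoken][0]
    # Otherwise fall back to the largest prepared subset (assumption).
    candidates = [key for key in STORIES if key <= spoken]
    if not candidates:
        return None
    return STORIES[max(candidates, key=len)][0]
```

Keying the table on a `frozenset` makes the order in which the user says the words irrelevant, which matches the way the text speaks of "combinations" rather than phrases.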

The examples of FIGS. 15 and 17 to 19 display "topic image 82A" and "words associated with topic image 82A", but the topic image can also be changed as in FIG. 20. When a new topic image such as that of FIG. 20 is displayed after a conversation has been held with the topic image 82A displayed as in FIG. 15, the display timing can be set in various ways. The trigger may be that a fixed time has elapsed since the previous topic image was displayed, that a given time of day, hour, day of the week, or the like has arrived, or that some other condition has been satisfied. For example, in the example of FIG. 20, if a voice such as "The photo has changed." is input while the image of FIG. 20 is displayed, a message such as "Here is a new topic. Read the text written on the photo aloud and let's talk." may be output by voice or the like. Announcing the change of photo or illustration in this way leads to fresh impressions and topics while prompting the user to read aloud the text information shown in the image. If a story including the word 84D (for example, "I want to go to the cherry blossom festival") is input by voice while the image of FIG. 20 is displayed, a story associated with the word 84D (for example, "Where is a famous cherry blossom spot near here?") is output by voice, which makes it possible to broaden the topic while keeping the conversation appropriate.
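As a rough illustration of the display-timing conditions mentioned above, the following sketch checks two of them: a fixed time elapsed since the previous topic image, and the arrival of a configured day of the week. The function name `should_show_new_topic` and its parameters are hypothetical, and the text leaves the actual conditions open ("other conditions" are also allowed).

```python
from datetime import datetime, timedelta


def should_show_new_topic(
    last_shown: datetime,
    now: datetime,
    min_interval: timedelta,
    scheduled_weekdays: frozenset[int] = frozenset(),
) -> bool:
    """Return True when one of the two example conditions holds."""
    # Condition 1: a fixed time has elapsed since the previous topic image.
    if now - last_shown >= min_interval:
        return True
    # Condition 2: a configured day of the week has arrived (0 = Monday),
    # and the previous topic image was shown on an earlier date.
    return now.weekday() in scheduled_weekdays and now.date() != last_shown.date()
```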

6. Examples of effects
When displaying an image that can serve as a topic (a topic image different from the image of the character 70), the virtual assistant device 10 can display on the display unit 15 not only the topic image but also a word related to the topic image. With such a display, a user who sees the topic image is more likely to say something that includes that word. The virtual assistant device 10 can therefore prepare response stories related to the word in advance, on the expectation that a story including the word will be spoken. Furthermore, when the user actually speaks a story including the word after the topic image and the word are displayed, the virtual assistant device 10 can confirm from the information input to the input unit that the word is included, and then output a story related to the word as a story from the character 70. Because of this operation, responses in which the character 70 returns an unrelated story to what the user said after seeing the topic image are readily suppressed, and the conversation is more easily kept appropriate.

For example, the comparative example of FIG. 21 displays a coastal landscape image as the background of the character. When an image is merely displayed as in FIG. 21, what the user says tends to diverge widely: the user may ask various questions such as "Where is that?" or "Where are you?", or may offer an impression such as "What a beautiful photo." Countless other candidate topics are conceivable. To continue the conversation in such a comparative example, the virtual assistant device must accurately recognize the response from the user who saw the image and, based on the recognition result, accurately identify the topic (content, gist, and so on) from among the countless conceivable candidates. However, accurately identifying the point of the conversation from countless conceivable topics is difficult. In practice there is no choice but to adopt an approach such as using a high-performance conversation device that holds a huge amount of data and conversing while accepting a certain degree of recognition error. In contrast, the virtual assistant device 10 described above exchanges speech using the words displayed on the display unit, which prevents the character from returning an unrelated story, and so the conversation can be conducted more appropriately.

In the example of FIG. 16, the correspondence data associates the topic image, the words corresponding to the topic image, and the stories corresponding to each word. However, the topic image need not be associated in the correspondence data. That is, data of an image containing the topic image and the words (for example, an image that displays the topic image and the words together, like the background image of FIG. 15) and correspondence data associating words with stories (for example, the data structure of FIG. 16 with the topic image data removed) may be prepared separately. In this case, as long as the virtual assistant device 10 is in an environment where it can use the correspondence data, a conversation can be established appropriately even if the image data (data of the image that displays the topic image and the words together) is independent of it. For example, in a system in which the image data is distributed independently, the virtual assistant device 10 may be unable to recognize what kind of image is displayed even though it acquires and displays the image data. In the example of FIG. 21, if the user says something like "Tell me where this photo was taken" and the virtual assistant device 10 does not know the location of the photo, the character can only answer "I'm sorry, I don't know," and the conversation does not continue or the topic does not broaden. By applying the present embodiment, however, the user is likely to say something that includes the word, a response anticipating the word can be given appropriately, and the topic readily broadens.

Because the virtual assistant device 10 can switch between the enabled mode (conversation mode) and the cancel mode, it can analyze stories during the enabled mode and reduce the processing load during the cancel mode. In addition, when the control unit 11 has caused the output unit to output a story related to the word as a story from the character 70, it can provide a period for accepting stories from the user while continuing the enabled mode, and can continue that mode until an end condition is satisfied. The conversation can therefore continue smoothly even after the story related to the word has been provided.
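The mode switching described in this paragraph can be pictured as a small state machine. The sketch below is an assumption-laden illustration: the names `Mode` and `Conversation`, the substring matching, and the `end_condition_met` flag (which stands in for whatever end condition the device actually uses) are hypothetical.

```python
import enum
from typing import Optional


class Mode(enum.Enum):
    ENABLED = "enabled"  # conversation mode: input is analyzed
    CANCEL = "cancel"    # input is not analyzed, which lowers processing load


class Conversation:
    def __init__(self, stories: dict[str, str]) -> None:
        self.mode = Mode.CANCEL
        self.stories = stories  # displayed word -> prepared story

    def enable(self) -> None:
        self.mode = Mode.ENABLED

    def handle(self, utterance: str, end_condition_met: bool = False) -> Optional[str]:
        """Analyze one input; return the character's story, if any."""
        if self.mode is Mode.CANCEL:
            return None  # no analysis while the enabled mode is cancelled
        if end_condition_met:
            self.mode = Mode.CANCEL  # end condition satisfied: leave enabled mode
            return None
        for word, story in self.stories.items():
            if word in utterance:
                # The mode stays ENABLED so that further input is accepted.
                return story
        return None
```

The point of the sketch is that returning a story does not change the mode: only the end condition moves the device back to the cancel mode.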

When, as in FIG. 15, the word 84A displayed in association with the topic image 82A is a single keyword, a user who sees the topic image 82A and the word 84A is more likely to say the word 84A (keyword). This raises the likelihood that a story including the word 84A (keyword) is recognized by the virtual assistant device 10 and, in turn, that the conversation continues appropriately.

In the example of FIG. 19, when a story containing only one of the plural types of keywords (words 84E, 84F, 84G) is spoken, the control unit 11 operates so as to cause the output unit to output a story corresponding to the spoken keyword. When plural types of keywords are displayed in association with the topic image, it becomes more likely that a keyword of interest to the user is included, and the user can more easily talk from a wider range of viewpoints. The virtual assistant device 10 makes it easier for the user to talk and, when a story including any of the keywords is spoken, can return a story corresponding to that keyword. It can therefore achieve both ease of talking for the user and an appropriate conversation.

The control unit 11 displays the topic image 82A on the display unit 15 together with the word 84A in a first combination as in FIG. 15, and then operates to display the topic image 82A on the display unit 15 together with the word 84B in a combination different from the first combination, as in FIG. 17. Alternatively, it operates to display the topic image 82A on the display unit 15 together with the word 84C in a combination different from the first combination, as in FIG. 18. If the combination of words displayed in association with the topic image 82A can be changed in this way, the user is less likely to tire of the conversation even when the same kind of topic image 82A is used continuously or repeatedly, which encourages use. Moreover, even when the combination of topic image and words is changed, the virtual assistant device 10 can return a story related to a displayed word whenever a story including that word is spoken. It can therefore keep the conversation both fresh and appropriate.
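One simple way to vary the word combination shown with the same topic image is to cycle through a prepared list, as in the sketch below. The class `TopicDisplay`, the cycling policy, and the placeholder words are assumptions for illustration; the text only requires that a later combination differ from the first.

```python
import itertools
from typing import Iterable


class TopicDisplay:
    """Pairs one topic image with a rotating list of word combinations."""

    def __init__(self, image_id: str, combinations: Iterable[tuple[str, ...]]) -> None:
        self.image_id = image_id
        self._cycle = itertools.cycle(list(combinations))
        self.current_words: tuple[str, ...] = ()

    def show_next(self) -> tuple[str, tuple[str, ...]]:
        """Advance to the next combination and return what would be displayed."""
        self.current_words = next(self._cycle)
        return self.image_id, self.current_words
```

A response routine would then match input only against `current_words`, so that the character answers about the words actually on screen.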

<Other embodiments>
The present invention is not limited to the embodiments described above and illustrated in the drawings; for example, the following embodiments are also included within the technical scope of the present invention. Further, the various features of the embodiments described above and below may be combined in any way that does not cause a contradiction.

In the embodiment described above, the voice input unit 18, to which a story can be input as voice, is given as an example of the input unit to which a story from the user is input, but the input unit is not limited to this example. For example, a story from the user may be input as text, in which case the operation unit 17 used for text input corresponds to an example of the input unit. In this case, any of various known input devices capable of text input may be used as the operation unit 17.

In the embodiment described above, when the user speaks a story that includes a word displayed together with the topic image, the control unit 11 performs control so that a story corresponding to the word is output by voice as a story from the character, but the output is not limited to this example. For example, when the user speaks a story that includes a word displayed together with the topic image, the control unit 11 may control the display unit 15 so that the story corresponding to the word is output as a story from the character by text display, or may control the display unit 15 and the voice output unit 16 so that both text display and voice are output.

In the embodiment described above, the operation unit 17, which corresponds to an example of a motion detection unit, is configured as a touch panel, but it is not limited to this example. The operation unit 17 may be another known input device (keyboard, mouse, touch pen, or the like). Alternatively, the motion detection unit may include another input device (for example, an input device that allows contactless input operation) instead of or in addition to the operation unit 17. Specifically, the motion detection unit may have a contactless sensor such as an imaging unit or a motion sensor. In this case, the contactless sensor and the control unit 11 may cooperate to detect the user's movements, gestures, and the like.

In the embodiment described above, the display unit 15 is configured as a display device that displays still images and moving images in two dimensions, but it is not limited to this example. The display unit 15 may be a three-dimensional display capable of three-dimensional display.

In the system 1 described above, the types of information desired by the user are registered in a registration unit (storage unit 95) provided outside the virtual assistant device 10, and the control unit 11 performs control to output, via the interface 13, conversation about content of the types registered in the registration unit. However, the configuration is not limited to this example. For example, a registration unit may be provided in the virtual assistant device 10, and the types of information desired by the user may be registered there. For example, the virtual assistant device 10 may receive from the management device 90 only content of the types registered in the virtual assistant device 10 (the types desired by the user). Alternatively, when the virtual assistant device 10 receives content from the management device 90, it may determine in step S2 that there has been a new delivery only if the received content is of a type registered in the virtual assistant device 10 (a type desired by the user). Then, in step S7, it may provide only content of the types registered in the virtual assistant device 10 (the types desired by the user).
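The device-side filtering by registered content type might look like the following sketch. The `Content` record, its `kind` field, and the two helper functions are hypothetical names; steps S2 and S7 are referred to only in the way the paragraph above describes them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    kind: str  # content type, e.g. a category the user registered
    body: str


def is_new_delivery(content: Content, registered_kinds: set[str]) -> bool:
    """Step S2 variant: count a delivery only if its type is registered."""
    return content.kind in registered_kinds


def contents_to_provide(received: list[Content], registered_kinds: set[str]) -> list[Content]:
    """Step S7 variant: provide only content of the registered types."""
    return [c for c in received if c.kind in registered_kinds]
```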

In the embodiment described above, the virtual assistant device 10 is configured as a virtual assistant device mainly for elderly people, but it is not limited to this example. For example, it may be intended for other categories of users, such as children.

In any of the examples in this specification, when a conversation is held by providing a topic image such as those of FIGS. 15, 17, 18, 19, and 20, the display or audio output of one or more of the character's facial expression, the character's motion, captions, sound effects, and icons may be generated or changed in the course of the conversation. For example, the control unit 11 may make the character 70 smile, or may cause the character 70 to perform a motion such as jumping or skipping. The facial expression used for the character is not limited to a smile; it may be changed to a downcast, angry, or sad expression, for example, and the character 70 may be made to perform a joyful motion, a crying motion, or the like.

In the embodiment described above, the storage unit 14 is provided in the virtual assistant device 10 and the correspondence information described above is stored in the storage unit 14, but the configuration is not limited to this example. For example, correspondence information similar to that stored in the storage unit 14 of the first embodiment may be stored in a device provided outside the virtual assistant device 10. For example, the storage unit 95 of the management device 90 may function as the storage unit that stores the correspondence information. Alternatively, the correspondence information may be held in both the storage unit 14 and the storage unit 95. When the correspondence information is stored in an external device (for example, the storage unit 95), the control unit 11 may receive, from the external device, the data of a topic image and the data of the words associated with that topic image before displaying the topic image. In addition, the control unit 11 may receive the data of the stories associated with the topic image from the external device before or after displaying the topic image, and store it in the storage unit 14.
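The order of operations when the correspondence information lives on an external device can be sketched as below. This is an illustrative sketch only: `TopicLoader`, the two injected fetch functions, and the in-memory dictionary standing in for the storage unit 14 are assumptions, and no real network API is implied.

```python
from typing import Callable


class TopicLoader:
    """Fetches topic data from an external device and caches stories locally."""

    def __init__(
        self,
        fetch_topic: Callable[[str], tuple[bytes, list[str]]],
        fetch_stories: Callable[[str], dict[str, list[str]]],
    ) -> None:
        self._fetch_topic = fetch_topic      # returns (image data, words)
        self._fetch_stories = fetch_stories  # returns word -> stories
        self.local_stories: dict[str, dict[str, list[str]]] = {}  # stands in for storage unit 14

    def prepare_display(self, topic_id: str) -> tuple[bytes, list[str]]:
        """Must run before the topic image is displayed."""
        return self._fetch_topic(topic_id)

    def cache_stories(self, topic_id: str) -> None:
        """May run before or after display; keeps the stories locally."""
        self.local_stories[topic_id] = self._fetch_stories(topic_id)
```

Splitting the two fetches reflects the text: the image and words are needed before display, while the stories only need to be available by the time the user responds.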

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is not limited to the embodiments disclosed herein, and is intended to include all modifications within the scope indicated by the claims or within a scope equivalent to the claims.

1 ... Virtual assistant system
10 ... Virtual assistant device
11 ... Control unit
12 ... Communication unit
13 ... Interface
14 ... Storage unit
15 ... Display unit (output unit)
16 ... Voice output unit (output unit)
17 ... Operation unit (input unit)
18 ... Voice input unit (input unit)
70 ... Character
82A ... Topic image
82B ... Topic image
84A ... Word
84B ... Word
84C ... Word
84D ... Word
84E ... Word
84F ... Word
84G ... Word

Claims

A virtual assistant device comprising: an input unit to which a story from a user is input; a display unit that displays an image; a control unit that causes the display unit to display an image of a character; and an output unit that outputs a story from the character,
wherein the control unit displays, on the display unit, a topic image different from the image of the character and a word related to the topic image, and, when a story including the word is input to the input unit after the topic image and the word are displayed on the display unit, causes the output unit to output a story related to the word as a story from the character.

The virtual assistant device according to claim 1, wherein the control unit:
switches between an enabled mode that enables analysis of the story input to the input unit and a cancel mode that cancels the enabled mode;
when a story including the word is input to the input unit during the enabled mode, causes the output unit to output a story related to the word as a story from the character; and
when having caused the output unit to output the story related to the word as a story from the character, provides a period for accepting a story from the user while continuing the enabled mode, and continues the enabled mode until a predetermined end condition is satisfied.

The virtual assistant device according to claim 1 or 2, wherein the control unit displays only a single keyword on the display unit as the word related to the topic image.

The virtual assistant device according to claim 1 or 2, wherein the control unit displays a plurality of types of keywords on the display unit as the word related to the topic image and, when a story including only one of the plurality of types of keywords is input to the input unit, causes the output unit to output a story corresponding to that keyword.

The virtual assistant device according to claim 1 or 2, wherein the control unit displays the topic image on the display unit together with one or more of the words in a first combination, and thereafter displays the topic image on the display unit together with the words in a combination different from the first combination.

A program for use in a virtual assistant device comprising: an input unit to which a story from a user is input; a display unit that displays an image; a control unit that causes the display unit to display an image of a character; and an output unit that outputs a story from the character, the program comprising:
a step of causing the control unit to perform control to display, on the display unit, a topic image different from the image of the character and a word related to the topic image; and
a step of causing the control unit to perform control such that, when a story including the word is input to the input unit after the topic image and the word are displayed on the display unit, the output unit outputs a story related to the word as a story from the character.