JP2023032649A - Information processing system and information processing method

Info

Publication number: JP2023032649A
Application number: JP2021138909A
Authority: JP
Inventors: 佳紀草柳; Yoshinori Kusayanagi; 拓良柳; Hiroyoshi Yanagi; 美友紀茂田; Miyuki Shigeta; 沙織小野; Saori Ono
Original assignee: Renault SAS; Nissan Motor Co Ltd
Current assignee: Renault SAS; Nissan Motor Co Ltd
Priority date: 2021-08-27
Filing date: 2021-08-27
Publication date: 2023-03-09

Abstract

PROBLEM TO BE SOLVED: To provide an information processing system and an information processing method capable of appropriate communication while using expressions that the user can easily understand. SOLUTION: An information processing system detects an expression mode of a user, determines an expression mode of an agent to be imitated, sets, for that expression mode, a degree of imitation of the agent's expression mode with respect to the detected expression mode of the user, and generates output data corresponding to the degree of imitation. The degree of imitation is set so that the agent's expression mode approaches the user's expression mode over time as the generated output data is output to the user. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing system and an information processing method.

An information processing apparatus is known that, when outputting response information using a voice response system, does not always respond with formal names or stock phrases. Instead, it recognizes designations used in the user's utterances, such as abbreviations, nicknames, and colloquial names, as utterance characteristics of the user, and outputs response information using designations that match those characteristics (see Patent Document 1).

Patent Document 1: International Publication No. 2017/199486 pamphlet

In the conventional information processing apparatus described above, the response information of the voice response system is generated using the designations used in the user's utterances. In a communication device using an agent, a user understands information more easily when it is presented in a manner similar to the expressions the user has used than when it is presented in a different manner. However, if the agent imitates the user's expressions excessively, the user may find the agent's expressions unnatural or become increasingly uncomfortable, which can hinder appropriate communication. A device using an agent is therefore expected to communicate appropriately while still using expressions that are easy for the user to understand.

The problem to be solved by the present invention is to provide an information processing system and an information processing method that can communicate appropriately while using expressions that are easy for the user to understand.

The present invention detects an expression mode of a user, determines an expression mode of an agent to be imitated, sets, for that expression mode, the degree to which the agent's expression mode imitates the detected expression mode of the user, and generates output data according to the degree of imitation. The above problem is solved by setting the degree of imitation so that the agent's expression mode approaches the user's expression mode over time as the generated output data is output to the user.

According to the present invention, it is possible to provide an information processing system and an information processing method that can communicate appropriately while using expressions that are easy for the user to understand. In particular, when information is provided through an agent, the degree to which the user's expression mode is imitated is changed appropriately, so the user can also be expected to feel attachment to the agent more readily.

FIG. 1 is a block diagram showing an embodiment of an information processing system according to the present invention.
FIGS. 2(a) and 2(b) are views of a vehicle interior, each showing an example of an installation location of the agent device of FIG. 1.
FIGS. 3(a) to 3(c) are diagrams each showing an example of an agent device including an anthropomorphic agent according to the present invention.
FIG. 4 is a diagram showing a configuration example of the expression mode database processed by the information processing device of FIG. 1.
FIG. 5 is a diagram showing a configuration example of the familiarity database processed by the information processing device of FIG. 1.
FIGS. 6(a) and 6(b) are diagrams (part 1) for explaining the degree of familiarity processed by the information processing device of FIG. 1.
FIG. 7 is a flowchart showing an example of an information processing procedure for the degree of familiarity based on the conversation content input by the user in FIG. 6.
FIGS. 8(a) to 8(c) are diagrams (part 2) for explaining the degree of familiarity processed by the information processing device of FIG. 1.
FIG. 9 is a flowchart showing an example of an information processing procedure for the degree of familiarity based on the reaction from the user in FIG. 8.
FIG. 10 is a diagram for explaining information processing, executed by the information processing device of FIG. 1, that determines the expression mode of the agent to be imitated.
FIGS. 11(a) to 11(c) are diagrams (part 1) for explaining information processing that determines the imitation target of the agent based on the expression modes of FIG. 10 and sets the degree of imitation of that expression mode.
FIGS. 12(a) to 12(c) are diagrams (part 2) for explaining information processing that determines the imitation target of the agent based on the expression modes of FIG. 10 and sets the degree of imitation of that expression mode.
FIGS. 13(a) to 13(c) are diagrams (part 3) for explaining information processing that determines the imitation target of the agent based on the expression modes of FIG. 10 and sets the degree of imitation of that expression mode.
FIG. 14 is a diagram for explaining another example of information processing, executed by the information processing device of FIG. 1, that sets the degree of imitation of the agent's expression mode.
FIG. 15 is a diagram (part 1) for explaining still another example of information processing, executed by the information processing device of FIG. 1, that sets the degree of imitation of the agent's expression mode.
FIG. 16 is a diagram (part 2) for explaining still another example of information processing, executed by the information processing device of FIG. 1, that sets the degree of imitation of the agent's expression mode.
FIG. 17 is a flowchart showing an example of an information processing procedure executed in the information processing system of FIG. 1.
FIG. 18 is a flowchart showing an example of a subroutine executed in step S8 of FIG. 17.

An embodiment of the present invention will be described below with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of an information processing system 1 according to the present invention, and FIGS. 2(a) and 2(b) are views of a vehicle interior, each showing an example of the installation location and operation of the agent device 5 shown in FIG. 1. The information processing system 1 according to the present invention uses an agent device 5 that provides information to a user, or interacts with the user, through an agent function performed by an anthropomorphic agent 52 (hereinafter also simply called the agent 52), specifically through media such as voice, images, motions of a character robot, and combinations of these.

The agent device 5 includes a speaker or other audio output unit for outputting voice and sound effects, and a display or other display unit for displaying images including text. It outputs communication information by providing the user with voice, sound effects, text, and other images together with the motions of the character robot. In this embodiment the agent 52 is a three-dimensional object such as a character robot, but it is not limited to this and may be a two-dimensional image shown on a display.

In the agent device 5 of this embodiment, as shown in FIGS. 1, 2(a), and 2(b), a character robot modeled on a human is provided as the agent 52 so that it can be raised from and retracted into a base 51 by an actuator (not shown). On receiving a control command from the output unit 28, the agent device 5 raises the character robot from the base 51 as shown in FIG. 2(b) when communicating with the user through the agent function. When the communication with the user ends, the character robot is stored in the base 51 as shown in FIG. 2(a).

The human-like character robot shown in FIGS. 1, 2(a), and 2(b) is one example of the anthropomorphic agent 52. The agent 52 need not be modeled on a human and may instead represent an animal, a plant, a predetermined character, an avatar, or an icon. The agent 52 may be provided as a physical object, or the form of a human, animal, plant, predetermined character, or the like serving as the agent 52 may be displayed as an image on a display.

An example in which the agent device 5 is installed in a vehicle is described below, but the form and installation location of the agent device 5 are not limited to this. The agent device 5 may be any electronic device that has an anthropomorphic agent and an agent function, for example a portable speaker-type electronic device as shown in FIG. 3(b) or an electronic device with a display. Furthermore, the audio output and video output functions of the agent device 5 described below may be installed in a mobile phone such as a smartphone, as shown in FIG. 3(c).

The information processing system 1 of this embodiment detects expression modes contained in the user's actions and conversation, and the anthropomorphic agent 52 communicates by imitating the user's expression modes, as shown in FIGS. 3(a) to 3(c). Imitating expressions, known as mirroring, makes it possible to provide information that is easy for the user to understand and to communicate well. On the other hand, if the agent 52 imitates the user's expression mode excessively, the user finds the agent 52's expressions unnatural or becomes increasingly uncomfortable. The information processing system 1 therefore sets the degree of imitation so that the expression mode of the agent 52 gradually approaches that of the user as communication is repeated. In this embodiment, the user is a person who uses the information processing system 1. The description below applies this to the driver of a vehicle, but the user may also be a passenger other than the driver (hereinafter, the driver and passengers are also simply called occupants).
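The gradual approach described above can be illustrated with a minimal sketch. This passage does not state how the degree of imitation is computed, so the linear schedule, the `step` parameter, the function names, and the numeric "speech rate" feature are all assumptions made only for illustration.

```python
def imitation_degree(interactions: int, step: float = 0.25) -> float:
    """Return a degree of imitation in [0.0, 1.0] that grows with each output.

    Assumption: a linear schedule with a fixed step per interaction.
    """
    if interactions < 0:
        raise ValueError("interactions must be non-negative")
    return min(1.0, interactions * step)


def blend(agent_default: float, user_value: float, degree: float) -> float:
    """Move a numeric expression feature from the agent's default toward the user's value."""
    return agent_default + (user_value - agent_default) * degree


if __name__ == "__main__":
    # Hypothetical feature: speech rate, agent default 100, user 140.
    for n in range(6):
        d = imitation_degree(n)
        print(n, d, blend(100.0, 140.0, d))
```

With a degree of 0.0 the agent keeps its own expression, and with 1.0 it fully matches the user. Capping the value keeps the agent from overshooting the user's expression.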

Returning to FIG. 1, the information processing system 1 of this embodiment consists of an information processing device 2, vehicle sensors 3, an input device 4, and the agent device 5 as an output device. The information processing device 2 includes a ROM (Read Only Memory) that stores programs for executing various processes, a CPU (Central Processing Unit) as an operating circuit that functions as the information processing device by executing the programs stored in the ROM, and a RAM (Random Access Memory) that functions as an accessible storage device. These devices are connected by, for example, a CAN (Controller Area Network) or other in-vehicle LAN and can exchange information with one another. In terms of the functions realized by executing the information processing program, the information processing device 2 includes, as shown in FIG. 1, an occupant identification unit 21, an expression mode detection unit 22, an expression mode database 23, a familiarity determination unit 24, a familiarity database 25, an expression mode setting unit 26, a data generation unit 27, and an output unit 28.

The occupant identification unit 21 identifies the driver D based on information acquired from the vehicle sensors 3 and the input device 4, and stores the result at least temporarily. The vehicle sensors 3 include, for example, an in-vehicle camera provided near the rear-view mirror that captures images of the vehicle interior, and the input device 4 is, for example, a microphone provided near a seat that accepts voice input. The occupant identification unit 21 identifies the driver D based on image information captured by the in-vehicle camera and voice information input to the in-vehicle microphone. The occupant identification unit 21 outputs the attribute information (identification information) of the identified driver D, the image information of the driver D acquired from the vehicle sensors 3, and the voice information of the driver D acquired from the input device 4 to the expression mode detection unit 22 and the familiarity determination unit 24.

The driver D may also be identified using input signals from a seating sensor provided inside the seat cushion or from a seat belt sensor. Alternatively, identification data of the driver may be stored in, for example, the wireless key of the vehicle, and the driver D may be identified by reading that identification data automatically or semi-automatically when the vehicle is unlocked or started with the wireless key.

The expression mode detection unit 22 detects the expression modes of the driver D based on the attribute information (identification information), image information, and voice information of the driver D received from the occupant identification unit 21, and accumulates them in the expression mode database 23. An expression mode of the driver D is an expression the driver D uses to convey intentions, emotions, thoughts, and the like through actions or language. Expression modes include action modes, which have visual features, and conversation modes, which have auditory features. The action modes and conversation modes of the driver D detected by the expression mode detection unit 22 are used to determine which expression mode the agent 52 imitates when it outputs communication information. When the agent 52 imitates expressions used by the driver D, it can communicate in a manner the driver D understands easily. In addition, because the agent 52 imitates expressions the driver D has actually used, the agent 52's expressions can be expected to be received favorably by the driver D.

Although not limited to these, the expression mode detection unit 22 may detect visual features as action modes from image information of the driver D. Such features include outward appearance such as clothing and hairstyle, physical appearance such as facial and bodily features, posture while driving the vehicle, facial expressions of emotions such as joy, anger, sorrow, and pleasure, and movements such as gestures. The driver D readily notices imitation of action modes detected from these visual features, so having the agent 52 imitate them gradually helps avoid giving the impression of excessive imitation.

Likewise, although not limited to these, the expression mode detection unit 22 may detect auditory features as conversation modes from voice information input by the driver D. Such features include characteristics of speaking style such as voice quality, intonation, phrasing (including use of the polite desu/masu style, honorifics, and buzzwords), language, and dialect, as well as conversation content including topics and preferences, and estimated age and gender. Imitation of conversation modes detected from these auditory features is harder for the driver D to notice than imitation of action modes, so gradual imitation by the agent 52 allows good communication.

The expression mode database 23 stores the attribute information (identification information) of the driver D in association with the expression modes of the driver D detected by the expression mode detection unit 22. FIG. 4 shows a configuration example of the expression mode database 23. The expression mode database 23 stores the attribute information of the driver D, the detected expression modes, and a detection history that includes the date and time at which each expression mode was detected and its cumulative number of detections. Although the expression mode database 23 is described here as part of the information processing device 2, it may instead store and retrieve information by communicating with an external server.

As shown in FIG. 4, the occupant attribute information records, for example, the father as user 1, together with the detection history of the expression modes the father has used. The record for 13:00 on March 18, 2020 shows that the conversation mode "Okay" (おっけー) was detected from voice information input by the father and the action mode <thumb pointed up> was detected from image information of the father, with a cumulative detection count of 108 for these expression modes. Similarly, the database records the conversation mode "Got it" (わかった) with the action mode <making a circle with the thumb and index finger> at 11:11 on January 11, 2020 (cumulative count 502), the conversation mode "Understood" (了解) with the action mode <saluting> at 8:06 on December 8, 2019 (cumulative count 49), and the conversation mode "Roger" (ラジャー) with the action mode <raising one hand and holding the other hand out sideways> at 7:59 on November 28, 2019 (cumulative count 9). Either a conversation mode or an action mode alone may be detected and stored, or both may be detected and stored in association with each other as shown in FIG. 4. The expression modes detected by the expression mode detection unit 22 may also be classified into items such as "acceptance" when stored.
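As a rough illustration of the records described for FIG. 4, the following sketch models one database row and a lookup ordered by cumulative detection count. The class name, field names, and the `records_for` helper are hypothetical, and tagging all four rows with the item "acceptance" is an assumption. Only the utterances, gestures, timestamps, and counts come from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class ExpressionRecord:
    """One row of the expression mode database (field names are hypothetical)."""
    user: str                    # occupant attribute information
    category: str                # classification item, e.g. "acceptance" (assumed here)
    conversation: Optional[str]  # conversation mode, if detected
    action: Optional[str]        # action mode, if detected
    detected_at: datetime        # date and time of detection
    total_count: int             # cumulative number of detections


# Rows transcribed from the description of FIG. 4.
RECORDS = [
    ExpressionRecord("user 1 (father)", "acceptance", "おっけー",
                     "thumb pointed up", datetime(2020, 3, 18, 13, 0), 108),
    ExpressionRecord("user 1 (father)", "acceptance", "わかった",
                     "circle with thumb and index finger", datetime(2020, 1, 11, 11, 11), 502),
    ExpressionRecord("user 1 (father)", "acceptance", "了解",
                     "salute", datetime(2019, 12, 8, 8, 6), 49),
    ExpressionRecord("user 1 (father)", "acceptance", "ラジャー",
                     "one hand raised, other hand sideways", datetime(2019, 11, 28, 7, 59), 9),
]


def records_for(user: str, category: str,
                records: List[ExpressionRecord] = RECORDS) -> List[ExpressionRecord]:
    """Return the user's records in one category, most frequently detected first."""
    rows = [r for r in records if r.user == user and r.category == category]
    return sorted(rows, key=lambda r: r.total_count, reverse=True)


if __name__ == "__main__":
    for row in records_for("user 1 (father)", "acceptance"):
        print(row.total_count, row.conversation, row.action)
```

Ordering by cumulative count is only one possible way to use the detection history. The text does not say at this point how the imitation target is selected.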

The familiarity determination unit 24 estimates the degree of familiarity between the driver D and the agent 52 and accumulates it in the familiarity database 25. A higher degree of familiarity indicates that the driver D accepts the agent 52 and is used to communicating with it, and a lower degree of familiarity indicates that the driver D is not used to communicating with the agent 52. The degree of familiarity of the driver D estimated by the familiarity determination unit 24 is used to determine the degree of imitation of the agent 52's expression mode when the agent 52 outputs communication information. The familiarity determination unit 24 and the familiarity database 25 are not essential to the present invention and may be omitted as needed.

The familiarity database 25 stores the attribute information (identification information) of the driver D in association with the degree of familiarity of the driver D estimated by the familiarity determination unit 24. FIG. 5 shows a configuration example of the familiarity database 25. In addition to the occupant attribute information and the degree of familiarity, the familiarity database 25 may store the riding history of the driver D acquired from the vehicle sensors 3 and the usage history of various in-vehicle devices such as the driving support device, the navigation device, and the agent device 5. As shown in FIG. 5, the occupant attribute information records, for example, the father as user 1, and for the father the database stores the degree of familiarity with the agent 52, the vehicle riding history, and the usage history of the various in-vehicle devices.

The familiarity determination unit 24 estimates the degree of familiarity between the driver D and the agent 52 based on, for example, the attribute information (identification information) and voice information of the driver D received from the occupant identification unit 21. Specifically, it classifies the conversation content contained in the voice information input by the driver D by topic field and depth, and converts the result to a number to calculate the degree of familiarity. The deeper a topic is, the more it involves specific matters and private events, and the shallower a topic is, the more abstract and general it is. Therefore, the deeper the conversation content in the driver D's voice information becomes, the more the driver D is presumed to accept the agent 52, and the higher the degree of familiarity can be judged to be.

The classification of topic field and depth is not particularly limited. As shown in FIG. 6(a), for example, the fields are classified as [1] hobbies and social views (opinions and attitudes), [2] daily life and studies, [3] personality, body, and appearance, [4] friendships and relationships with the opposite sex, and [5] financial matters. The depths are classified as (1) general topics such as news, (2) personal topics that include positive information about the speaker, and (3) personal topics that include negative information about the speaker. Personal topics are deeper than general topics such as news, and personal topics that include negative information are deeper than those that include positive information. The depths are therefore scored, for example, as 1 point for (1) general topics such as news, 3 points for (2) personal topics that include positive information about the speaker, and 5 points for (3) personal topics that include negative information about the speaker.

As shown in FIG. 6(b), the familiarity determination unit 24 scores the depth for each of fields [1] to [5] based on the conversation content contained in the voice information input by the driver D, and calculates the average. In the example of user 1's father, field [1] hobbies and social views (opinions and attitudes) scores 5 points because his conversation includes (3) personal topics with negative information about the speaker, field [2] daily life and studies scores 1 point because his conversation includes (1) general topics such as news, and so on. The average of the scores for fields [1] to [5], 3.8, is taken as the father's degree of familiarity. Calculating the degree of familiarity from the conversation content in the driver D's voice information in this way reflects the driver D's actual words and behavior toward the agent 52 in the degree of familiarity, which promotes appropriate communication between the driver D and the agent 52.
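The scoring and averaging just described can be sketched as follows. The 1/3/5 point values and the averaging over five fields follow the text. The names `DEPTH_SCORES` and `familiarity` are hypothetical. In the sample input, only the depths for fields [1] and [2] come from the text, and the depths for fields [3] to [5] are assumed values chosen so that the average matches the stated 3.8.

```python
from statistics import mean
from typing import Dict

# Points per depth level, as given in the text:
# (1) general topic -> 1, (2) personal/positive -> 3, (3) personal/negative -> 5.
DEPTH_SCORES: Dict[int, int] = {1: 1, 2: 3, 3: 5}


def familiarity(depth_by_field: Dict[int, int]) -> float:
    """Average the depth scores over the topic fields [1] to [5]."""
    if not depth_by_field:
        raise ValueError("at least one field is required")
    return mean(DEPTH_SCORES[depth] for depth in depth_by_field.values())


# Field number -> detected depth level for user 1's father.
# Fields 1 and 2 follow the text; fields 3 to 5 are illustrative assumptions.
father = {1: 3, 2: 1, 3: 3, 4: 2, 5: 3}

if __name__ == "__main__":
    print(familiarity(father))  # scores 5, 1, 5, 3, 5 -> average 3.8
```

Swapping `mean` for `statistics.median` would give the median-based variant that the description mentions as an alternative.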

In this embodiment, topics are classified into five fields and three depths for scoring, but the number of items is not limited to these. Also, topics of the same depth receive the same score in every field here, but because depth also increases from field [1] toward field [5], the scores may be set to increase from field [1] toward field [5]. The degree of familiarity is calculated here as the average over fields [1] to [5], but the median may be used instead, or the value may be calculated by an additive or subtractive method. The degree of familiarity may also be expressed in levels such as "high", "medium", and "low" rather than as a number. The familiarity determination unit 24 may update a previously calculated degree of familiarity as appropriate, based on the conversation content in newly acquired voice information of the driver D.

FIG. 7 is a flowchart showing an example of the information processing procedure, executed by the familiarity determination unit 24, for calculating the degree of familiarity from the conversation content contained in the voice information input by the driver D. The following description assumes that a degree of familiarity for the driver D is already stored in the familiarity database 25.

First, in step S101 of FIG. 7, the following information processing starts when the ignition switch of the vehicle is turned ON. In step S102, the occupant identification unit 21 identifies driver D based on the image information acquired from the vehicle sensors 3. Next, in step S103, the intimacy determination unit 24 acquires the voice information of driver D input via the input device 4.

In the subsequent step S104, the intimacy determination unit 24 classifies the conversation content included in the voice information of driver D according to topic field and depth, as shown in FIG. 6(a), and quantifies it as shown in FIG. 6(b) to calculate the degree of intimacy. In step S105, it is determined whether the calculated intimacy score is higher than the degree of intimacy of driver D recorded in the intimacy database 25.

If the determination in step S105 finds that the calculated degree of intimacy is higher than the degree of intimacy of driver D stored in the intimacy database 25, the process proceeds to step S106, where the degree of intimacy of driver D is raised and updated. If, on the other hand, step S105 finds that the calculated degree of intimacy is not higher than the degree of intimacy of driver D recorded in the intimacy database 25, the degree of intimacy of driver D stored in the intimacy database 25 is kept as it is.

The information processing from step S103 to step S107 is repeated until the ignition switch of the vehicle is turned OFF in step S108. The calculation of the degree of intimacy may be repeated at predetermined intervals while driver D is in the vehicle, or it may be executed once for each ride by driver D.
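
The comparison and update in steps S105 and S106 can be sketched as below. The description says only that the stored degree of intimacy is "raised and updated"; adopting the newly calculated score as the new stored value is an assumption made for this sketch, and the function names are hypothetical.

```python
def update_intimacy(stored: float, calculated: float) -> float:
    """One pass of the S105 comparison for a single calculated score."""
    if calculated > stored:   # S105: is the calculated score higher?
        return calculated     # S106: raise and update (assumed: adopt the new score)
    return stored             # otherwise keep the stored value unchanged


def run_trip(stored: float, scores: list[float]) -> float:
    """Repeat the comparison for each score calculated while the ignition is ON."""
    for score in scores:
        stored = update_intimacy(stored, score)
    return stored
```

Because the stored value only changes when the new score is higher, the degree of intimacy in this sketch never decreases during a trip, which matches the flow described for FIG. 7.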

The intimacy determination unit 24 may also estimate the degree of intimacy between driver D and agent 52 based on the attribute information (identification information), image information, and voice information of driver D received from the occupant identification unit 21. Specifically, after agent 52 outputs an expression mode that imitates driver D, the unit acquires image information and voice information of driver D. It then detects driver D's reaction contained in the image information and voice information, and compares the expression mode output by agent 52 with the expression mode with which driver D responded. The higher the similarity between the two expression modes, the higher the estimated degree of intimacy; the lower the similarity, the lower the estimated degree of intimacy. Estimating the degree of intimacy from driver D's reaction to the expression mode output by agent 52 in this way further promotes appropriate communication between driver D and agent 52.

For example, suppose that, as shown in FIG. 8(a), agent 52 outputs the conversation mode "Got it" (「わかった」) and the action mode <make a circle with the thumb and forefinger>. If driver D responds to this expression mode with the action mode <point the thumb up>, as shown in FIG. 8(b), the similarity between the action mode output by agent 52, <make a circle with the thumb and forefinger>, and the action mode of driver D's response, <point the thumb up>, is low, so the degree of intimacy is judged to be low. If, on the other hand, driver D responds with the action mode <make a circle with the thumb and forefinger>, as shown in FIG. 8(c), the similarity between the action mode output by agent 52 and the action mode of driver D's response is high, so the degree of intimacy is judged to be high. The same applies to conversation modes: the conversation mode output by agent 52 is compared with the conversation mode of driver D's response, and the degree of intimacy is judged from their similarity. The intimacy determination unit 24 may update a previously estimated degree of intimacy, as appropriate, based on newly acquired image information and voice information of driver D.

FIG. 9 is a flowchart showing an example of the information processing procedure for the degree of intimacy based on driver D's reaction. It shows the information processing, executed by the intimacy determination unit 24, that estimates the degree of intimacy based on driver D's response to the expression mode of agent 52. The following description assumes that a degree of intimacy for driver D is already stored in the intimacy database 25. The intimacy processing based on driver D's conversation content in FIG. 7 and the intimacy processing based on driver D's reaction in FIG. 9 may be combined as appropriate.

First, in step S111 of FIG. 9, the following information processing starts when the ignition switch of the vehicle is turned ON. In step S112, the occupant identification unit 21 identifies driver D based on the image information acquired from the vehicle sensors 3. Next, in step S113, the intimacy determination unit 24 determines whether agent 52 has output an expression mode that imitates driver D. If agent 52 has output, for example, the conversation mode "Got it" and the action mode <make a circle with the thumb and forefinger>, as shown in FIG. 8(a), the process proceeds to step S114. If agent 52 has not output an expression mode that imitates driver D, step S113 is repeated for a predetermined time.

If the determination in step S113 finds that agent 52 has output an expression mode that imitates driver D, then in step S114 the intimacy determination unit 24 acquires image information and/or voice information of driver D in order to detect driver D's reaction after agent 52 output the expression mode. In the subsequent step S115, the intimacy determination unit 24 determines, based on the image information and/or voice information of driver D, whether driver D has responded to the expression mode of agent 52. If it determines that driver D has responded, the process proceeds to step S116. If it determines that driver D has not responded, the process proceeds to step S118.

If the determination in step S115 finds that driver D has responded to the expression mode of agent 52, then in step S116 the expression mode output by agent 52 is compared with the expression mode of driver D's response to determine their similarity. For example, if agent 52 output the conversation mode "Got it" and the action mode <make a circle with the thumb and forefinger> in FIG. 8(a), and driver D responded with the action mode <make a circle with the thumb and forefinger> as shown in FIG. 8(c), the similarity is determined to be high and the process proceeds to step S117, where the degree of intimacy of driver D is raised and updated. If, on the other hand, driver D responded with the action mode <point the thumb up> as shown in FIG. 8(b), the similarity between the action mode output by agent 52 and the action mode of driver D's response is determined to be low, and the process proceeds to step S118.

If the determination in step S115 finds that driver D has not responded to the expression mode of agent 52, or if the determination in step S116 finds that the similarity between the expression mode output by agent 52 and the expression mode of driver D's response is low, the degree of intimacy of driver D is kept as it is in step S118.

The information processing from step S113 to step S118 is repeated until the ignition switch of the vehicle is turned OFF in step S119. The estimation of the degree of intimacy may be repeated at predetermined intervals while driver D is in the vehicle, or it may be executed once for each ride by driver D.
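
The branch structure of steps S115 to S118 can be sketched as follows. The description does not specify how similarity is computed or by how much the degree of intimacy is raised, so the exact-match `similarity` function, the `threshold`, and the `step` increment are all assumptions made for illustration.

```python
from typing import Optional


def similarity(agent_action: str, driver_action: str) -> float:
    """Hypothetical similarity: 1.0 for identical action labels, else 0.0."""
    return 1.0 if agent_action == driver_action else 0.0


def update_after_reaction(
    intimacy: float,
    agent_action: str,
    driver_action: Optional[str],
    threshold: float = 0.5,   # assumed boundary between "low" and "high" similarity
    step: float = 0.5,        # assumed increment applied in S117
) -> float:
    if driver_action is None:                       # S115: no response detected
        return intimacy                             # S118: keep as is
    if similarity(agent_action, driver_action) >= threshold:  # S116
        return intimacy + step                      # S117: raise and update
    return intimacy                                 # S118: keep as is


circle = "make a circle with the thumb and forefinger"
thumb_up = "point the thumb up"
after_match = update_after_reaction(3.0, circle, circle)       # FIG. 8(c) case
after_mismatch = update_after_reaction(3.0, circle, thumb_up)  # FIG. 8(b) case
```

A practical system would presumably replace the exact-match comparison with a graded similarity over gesture or speech features; the control flow around it would stay the same.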

Furthermore, the intimacy determination unit 24 may estimate the degree of intimacy between driver D and agent 52 based on driver D's usage history of various in-vehicle devices, acquired from the vehicle sensors 3. For example, when the usage frequency of in-vehicle devices such as the driving assistance device, the navigation device, and the agent device 5 is equal to or higher than a predetermined value, driver D's receptivity to in-vehicle devices can be judged to be high, so the degree of intimacy between driver D and agent 52 is estimated to be high. Conversely, when the usage frequency of the in-vehicle devices is equal to or lower than the predetermined value, receptivity to in-vehicle devices can be judged to be low, so the degree of intimacy between driver D and agent 52 is estimated to be low. The predetermined value is not particularly limited; it is a value suitable for judging receptivity to in-vehicle devices, such as three uses of the various in-vehicle devices per five recorded rides in the vehicle. Estimating the degree of intimacy from the usage of the various in-vehicle devices allows driver D's receptivity to in-vehicle devices to be reflected in the communication between driver D and agent 52.
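
A minimal sketch of the usage-based estimate, using the example threshold of three uses per five rides. The text assigns the boundary case to both "equal to or higher" and "equal to or lower"; this sketch treats a frequency exactly at the threshold as high, which is an assumption, as are the function and parameter names.

```python
def intimacy_from_usage(
    uses: int,
    rides: int,
    threshold_uses: int = 3,   # example value from the description
    per_rides: int = 5,        # example value from the description
) -> str:
    """Estimate "high" or "low" intimacy from in-vehicle device usage frequency."""
    if rides <= 0:
        raise ValueError("rides must be positive")
    # Compare uses/rides with threshold_uses/per_rides without floating point.
    if uses * per_rides >= threshold_uses * rides:
        return "high"
    return "low"
```

Cross-multiplying avoids rounding issues when the frequency sits exactly on the boundary.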

Returning to FIG. 1, the expression mode setting unit 26 determines the expression mode of agent 52 to be output to driver D. Specifically, when it detects a trigger expression in the image information of driver D acquired from the vehicle sensors 3 or in the voice information of driver D acquired from the input device 4, it refers to the expression modes of driver D stored in the expression mode database 23 and determines the expression mode of agent 52 that is to be the imitation target.

Furthermore, the expression mode setting unit 26 sets a degree of imitation for the determined expression mode of agent 52. The degree of imitation is the similarity of the expression mode of agent 52 to the expression mode of driver D. The higher the degree of imitation, the higher the similarity between the two and the closer the expression mode of agent 52 comes to that of driver D; the lower the degree of imitation, the lower the similarity between the two and the further the expression mode of agent 52 moves from that of driver D.

As described above, when agent 52 communicates in an expression mode similar to the one driver D used, it can provide information that is easy for driver D to understand. However, if agent 52 imitates driver D's expression mode too closely, for example by suddenly copying it exactly, driver D may find agent 52's expressions unnatural or become increasingly uncomfortable. The expression mode setting unit 26 therefore sets the degree of imitation so that the expression mode of agent 52 gradually approaches the user's expression mode over time, that is, as communication is repeated.

In this way, the expression mode used by driver D is taken as the imitation target, the expression mode of agent 52 is determined, and the degree of imitation is set so that the expression mode of agent 52 gradually approaches that of driver D as communication is repeated. This keeps driver D from finding agent 52's expressions unnatural or becoming increasingly uncomfortable, and enables appropriate communication using expressions that are easy for driver D to understand.

After setting the degree of imitation, the expression mode setting unit 26 determines the degree of imitation to be used when agent 52 outputs communication information. It then outputs to the data generation unit 27 the expression mode of agent 52 determined as the imitation target, together with information on the degree of imitation that agent 52 is to output. The processing by which the expression mode setting unit 26 determines the expression mode of agent 52 and sets the degree of imitation is described later.

When determining, after setting the degree of imitation, which degree of imitation agent 52 uses to output communication information, the expression mode setting unit 26 may use the degree of intimacy between driver D and agent 52. Specifically, it acquires the degree of intimacy between driver D and agent 52 stored in the intimacy database 25 via the intimacy determination unit 24 and, when the degree of intimacy is relatively high, decides to output an expression mode with a higher degree of imitation than when the degree of intimacy is relatively low.

The higher the degree of imitation, the closer the expression mode of agent 52 comes to that of driver D, so driver D is more likely to notice the imitation. Conversely, the lower the degree of imitation, the further the expression mode moves from that of driver D, so driver D is less likely to notice the imitation. When the degree of intimacy between driver D and agent 52 is low, driver D's receptivity to agent 52 is low, and driver D tends to feel that the imitation is excessive. An expression mode with a low degree of imitation is therefore output so that driver D is less likely to notice it as imitation.

On the other hand, when the degree of intimacy between driver D and agent 52 is relatively high, driver D's receptivity to agent 52 is high and driver D is unlikely to feel that the imitation is excessive, so an expression mode with a high degree of imitation may be output. This is because driver D becomes more likely to feel attached to agent 52 on noticing, as the degree of intimacy increases, that agent 52's expression mode has come to resemble their own. When the degree of intimacy is particularly high, an expression mode peculiar to driver D may be output (for example, the conversation mode "Roger" (「ラジャー」) and the action mode <raise one hand and turn the other hand sideways> shown in FIG. 4). Appropriately varying the degree to which agent 52 imitates driver D's expression mode in this way enables smooth communication.

The data generation unit 27 generates output data to be output to driver D based on the expression mode of agent 52 that is the imitation target and the information on the degree of imitation for that expression mode, both received from the expression mode setting unit 26. To generate output data for a conversation mode, it first generates text data (a character string), converts the text data into voice data by speech synthesis, and sends the result to the output unit 28 as the conversation-mode output data to be output by agent 52. To generate output data for an action mode, it generates digital data using a preset motion path or motion capture and sends the result to the output unit 28 as the action-mode output data to be output by agent 52. Known techniques can be applied to the speech synthesis, the motion capture, and the generation of digital data.

Furthermore, the data generation unit 27 may set the frequency with which the agent device 5 outputs the output data. The higher the output frequency, the more often driver D and agent 52 communicate, which promotes communication. The data generation unit 27 may therefore refer, for example, to the degree of intimacy between driver D and agent 52 and set a high output frequency for the expression mode of agent 52 when the degree of intimacy is low. Raising the output frequency can also be expected to give the impression that agent 52 is interested in driver D. The output frequency may be set in consideration of both the degree of intimacy and the degree of imitation: when the degree of intimacy is low, an expression mode with a low degree of imitation is output at high frequency; when the degree of intimacy is medium, an expression mode with a medium degree of imitation is output at medium frequency; and when the degree of intimacy is high, an expression mode with a high degree of imitation is output at low frequency. In addition to changing the output frequency of a single expression mode output by agent 52, when several expression modes are used, the output frequency may be set for each expression mode.
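
The combined setting of degree of imitation and output frequency described above can be written as a small lookup table. This is a sketch only: the three-row table follows the text, while the numeric boundaries used to bucket a numeric degree of intimacy into levels (below 2.0 is low, below 4.0 is medium) and all names are assumptions.

```python
# Intimacy level -> (degree of imitation, output frequency), as in the text.
OUTPUT_POLICY = {
    "low": ("low", "high"),
    "medium": ("medium", "medium"),
    "high": ("high", "low"),
}


def intimacy_level(score: float) -> str:
    """Bucket a numeric intimacy score; the boundaries are hypothetical."""
    if score < 2.0:
        return "low"
    if score < 4.0:
        return "medium"
    return "high"


def output_policy(score: float) -> tuple[str, str]:
    """Return (degree of imitation, output frequency) for an intimacy score."""
    return OUTPUT_POLICY[intimacy_level(score)]
```

Note that the degree of imitation rises with intimacy while the output frequency falls, so the table cannot be collapsed into a single monotonic setting.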

On receiving output data from the data generation unit 27, the output unit 28 outputs control signals to the operating unit of the agent device 5, to the speaker or other audio output unit, and to the display or other display unit, and outputs the generated output data through the agent function of the agent device 5.

Next, the processing by which the expression mode setting unit 26 determines the expression mode of agent 52 to be the imitation target, and the processing by which it sets the degree of imitation for that expression mode, are described. As stated above, when the expression mode setting unit 26 detects a trigger expression in the image information of driver D acquired from the vehicle sensors 3 or in the voice information of driver D acquired from the input device 4, it refers to the expression modes of driver D stored in the expression mode database 23 and determines the expression mode of agent 52 that is to be the imitation target. It then sets the degree of imitation for the determined expression mode.

FIG. 10 is a diagram for explaining the information processing that determines the expression mode of agent 52 to be the imitation target. As shown in FIG. 10, when the expression mode setting unit 26 detects, for example, the "acceptance" trigger expression "That's fine" (「それでいいよ」) in the voice information of driver D acquired via the input device 4, it extracts the expression modes related to "acceptance" from the expression modes of driver D stored in the expression mode database 23. From the extracted expression modes, for example the conversation mode "Okay" (「おっけー」) with the action mode <point the thumb up>, the conversation mode "Got it" (「わかった」) with the action mode <make a circle with the thumb and forefinger>, the conversation mode "Understood" (「了解」) with the action mode <salute>, and the conversation mode "Roger" (「ラジャー」) with the action mode <raise one hand and turn the other hand sideways>, it determines the expression mode of agent 52 that is to be the imitation target. After determining that expression mode, the expression mode setting unit 26 identifies the item to be varied in order to bring the expression mode of agent 52 closer to that of driver D, and sets the degree of imitation.

FIGS. 11 to 13 show examples of scenes in which the degree of imitation of the expression mode of agent 52 is set. FIGS. 11(a) to 11(c) each show a scene in which the degree of imitation is set using different expression modes. For example, when the imitation target of agent 52 is determined to be the action modes and conversation modes for "approval," the expression mode setting unit 26 identifies the number of detections of each expression mode, stored in the expression mode database 23, as the item to be varied. The more often an expression mode has been detected, the closer it is to the expression mode that driver D actually uses. The expression mode setting unit 26 therefore sets a higher degree of imitation the more often the action mode and conversation mode being used have been detected. Using the number of detections as the item to be varied also suppresses unnatural behavior in which agent 52 frequently imitates an expression mode with few detections, that is, one that driver D happened to use by chance.

In the scenes of FIGS. 11(a) to 11(c), the conversation mode "Understood" (「了解」) and the action mode <salute>, which have few detections, are given a low degree of imitation, as shown in FIG. 11(a). The conversation mode "Okay" (「おっけー」) and the action mode <point the thumb up>, which have a medium number of detections, are given a medium degree of imitation, as shown in FIG. 11(b). Similarly, the conversation mode "Got it" (「わかった」) and the action mode <make a circle with the thumb and forefinger>, which have many detections, are given a high degree of imitation, as shown in FIG. 11(c). The unit then determines the degree of imitation to be reflected in the output data generated this time, so that agent 52 outputs the expression modes of FIG. 11(a), FIG. 11(b), and FIG. 11(c) in that order over time, and transmits a command signal to the data generation unit 27. In this embodiment the degree of imitation has three levels, "low," "medium," and "high," but it is not limited to this; the degree of imitation may have two levels, or three or more levels.
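
The assignment of imitation levels by number of detections can be sketched as follows. The detection counts below are invented for illustration (the description only says "few," "medium," and "many"), and the rank-based split into three levels is one possible reading, not the method stated in the source.

```python
LEVELS = ("low", "medium", "high")


def assign_degrees(detection_counts: dict[str, int]) -> dict[str, str]:
    """Rank expression modes by detection count and map the ranks onto LEVELS."""
    ranked = sorted(detection_counts, key=detection_counts.get)
    n = len(ranked)
    return {mode: LEVELS[i * len(LEVELS) // n] for i, mode in enumerate(ranked)}


def output_order(detection_counts: dict[str, int]) -> list[str]:
    """Order in which the agent uses the modes over time: fewest detections first."""
    return sorted(detection_counts, key=detection_counts.get)


# Hypothetical detection counts for driver D's "approval" expression modes.
counts = {"Got it + circle": 9, "Understood + salute": 2, "Okay + thumb up": 5}
degrees = assign_degrees(counts)
order = output_order(counts)
```

With these counts the agent would start with the rarely detected "Understood" mode and move toward the frequently detected "Got it" mode, mirroring the progression from FIG. 11(a) to FIG. 11(c).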

FIGS. 12(a) to 12(c) each show a scene in which the degree of imitation is set using the same expression mode. For example, when the expression mode setting unit 26 determines the imitation target of agent 52 to be the action mode <make a circle with the thumb and forefinger> among the "approval" expression modes, it identifies the hand shape as the item to be varied in order to bring the action mode of agent 52 closer to that of driver D. The more closely the hand shape of agent 52 resembles the <make a circle with the thumb and forefinger> shape of driver D's action mode, the higher the degree of imitation is set.

In the scenes of FIGS. 12(a) to 12(c), the <raise one hand> shape, which has low similarity to the <make a circle with the thumb and forefinger> shape, is given a low degree of imitation, as shown in FIG. 12(a). The <make a circle with one hand> shape, which has medium similarity to the <make a circle with the thumb and forefinger> shape, is given a medium degree of imitation, as shown in FIG. 12(b). The <make a circle with the thumb and forefinger> shape itself, which has high similarity, is given a high degree of imitation, as shown in FIG. 12(c). The unit then determines the degree of imitation to be reflected in the output data generated this time, so that agent 52 outputs the action modes of FIG. 12(a), FIG. 12(b), and FIG. 12(c) in that order over time, and outputs a command signal to the data generation unit 27.

FIGS. 13(a) to 13(c) each show a scene in which the degree of imitation is set using a plurality of expression modes. The expression mode setting unit 26 may set the degree of imitation using a plurality of action modes and conversation modes. Using a plurality of expression modes allows the expressions used by agent 52 to be varied widely and in stages. For example, the expression mode setting unit 26 determines one imitation target of agent 52 to be the "approval" conversation mode and identifies the number of detections of the "approval" conversation modes as the item to be varied. It further determines another imitation target to be the action mode <make a circle with the thumb and forefinger> and identifies the hand shape as the item to be varied. The more often the conversation mode being used has been detected, and the more closely the hand shape of agent 52 resembles the <make a circle with the thumb and forefinger> shape of the driver's action mode, the higher the degree of imitation is set.

In the scenes shown in FIGS. 13(a) to 13(c), as shown in FIG. 13(a), the combination of the "Understood" conversation mode, which has a small number of detections, and the <raise one hand> shape, which has low similarity, is assigned a low degree of imitation. As shown in FIG. 13(b), the combination of the "Okay" conversation mode, which has a medium number of detections, and the <make a circle with one hand> shape, which has medium similarity, is assigned a medium degree of imitation. As shown in FIG. 13(c), the combination of the "Got it" conversation mode, which has a large number of detections, and the <make a circle with the thumb and index finger> shape, which has high similarity, is assigned a high degree of imitation. The degree of imitation to be reflected in the output data generated this time is then determined so that the agent 52 outputs the expression modes of FIG. 13(a) → FIG. 13(b) → FIG. 13(c) over time, and a command signal is output to the data generation unit 27.

Note that the expression mode setting unit 26 may use a plurality of action modes and conversation modes and increase the number of expression modes to be imitated over time. For example, the degree of imitation may be set low when one expression mode is used, medium when two expression modes are used, and high when three expression modes are used. In this case, the output data is generated so that the agent 52 imitates one expression mode → two expression modes → three expression modes over time.

For the conversation mode, the expression mode setting unit 26 may also determine the voice features, expression features, and conversation content features contained in the driver D's voice information as imitation targets and, in order to approach the driver D's voice information, specify items relating to each feature as change targets and set the degree of imitation. Imitation of voice features is hard for the driver D to notice, imitation of expression features is somewhat easier to notice, and imitation of conversation content features is easy to notice. The expression mode setting unit 26 treats imitation of a more noticeable feature as being closer to the driver D's voice information and sets the degree of imitation higher accordingly.

As shown in FIG. 14, when voice features such as speaking speed, tone, voice quality, and volume are the imitation target, the degree of imitation is set low. When expression features such as intonation, dialect, phrasing (polite desu/masu style, honorifics, use of buzzwords), and accent are the imitation target, the degree of imitation is set to medium. When conversation content features such as topics, preferences, and negative information about a specific person are the imitation target, the degree of imitation is set high. For negative information about a specific person, however, it is preferable to set a predetermined condition, for example making such information an imitation target only when the voice information input by the driver D itself contains negative information about a specific person. The expression mode setting unit 26 reflects these settings in the output data so that the agent 52 imitates voice features → expression features → conversation content features over time.
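The FIG. 14 assignment can be pictured as a small lookup, sketched below in Python. This is only an illustration: the names `Degree`, `CATEGORY_DEGREE`, and `may_imitate` are hypothetical, and the cumulative reading (a higher degree also permits the lower categories) is an assumption, since the description only states that imitation proceeds from voice features to expression features to conversation content features over time.

```python
from enum import IntEnum


class Degree(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Degree of imitation assigned to each feature category, following FIG. 14.
CATEGORY_DEGREE = {
    "voice": Degree.LOW,          # speed, tone, voice quality, volume
    "expression": Degree.MEDIUM,  # intonation, dialect, phrasing, accent
    "content": Degree.HIGH,       # topics, preferences, negative information
}


def may_imitate(category: str, allowed: Degree, feature: str = "",
                driver_voiced_negative: bool = False) -> bool:
    """Return True if a feature in `category` may be imitated at degree `allowed`.

    Assumption: categories at or below the allowed degree are all permitted.
    """
    if CATEGORY_DEGREE[category] > allowed:
        return False
    # Negative information about a specific person is imitated only when the
    # driver's own voice input contained such information.
    if feature == "negative_information":
        return driver_voiced_negative
    return True
```

With these assumptions, `may_imitate("content", Degree.HIGH, "negative_information")` stays false until the driver's own input contains such information.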

Furthermore, the expression mode setting unit 26 may identify conditions relating to time, place, and situation contained in the driver D's action modes and conversation modes, and set the degree of imitation by treating these conditions as change targets. Using these conditions allows imitation to be set in stages, from imitation that the driver D easily notices to imitation that is hard to notice, so that appropriate communication can be performed using diverse expressions.

FIG. 15 is a diagram showing an example in which conditions relating to time, place, and situation are the change targets for the driver D's action mode. The higher the degree of imitation, the closer the output is brought to the driver D's action mode and the more closely it reproduces the state in which that action mode was detected from the driver D.

When the time condition of an action mode is the change target, for example, the timing at which the agent 52 outputs the action mode or the length of time for which the action mode is output is changed. One scene shown in FIG. 15 takes the driver D's <yawn> action mode as the imitation target and the time condition as the change target, for the case where the agent 52 imitates an action mode made up of three motions: <close the eyes>, <put one hand to the mouth>, and <stretch>. In this scene, when the output timing is delayed by one motion and only the two motions <close the eyes> and <put one hand to the mouth> are output, the degree of imitation is set low. When the output timing is delayed by one motion and all three motions <close the eyes>, <put one hand to the mouth>, and <stretch> are output, the degree of imitation is set to medium. In contrast, when the output timing is aligned rather than delayed and all three motions are output, the degree of imitation is set high.
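The three time-condition cases for the <yawn> action mode can be written out as a minimal Python sketch. The function name `yawn_output` and the representation of the delay as a count of motions are illustrative assumptions, not part of the described system.

```python
YAWN_MOTIONS = ["close the eyes", "put one hand to the mouth", "stretch"]


def yawn_output(degree: str) -> tuple[int, list[str]]:
    """Return (delay in motions, motions to output) for the <yawn> action mode."""
    if degree == "low":
        # Delayed by one motion, and only the first two motions are output.
        return 1, YAWN_MOTIONS[:2]
    if degree == "medium":
        # Delayed by one motion, but all three motions are output.
        return 1, list(YAWN_MOTIONS)
    if degree == "high":
        # Timing aligned (no delay) and all three motions are output.
        return 0, list(YAWN_MOTIONS)
    raise ValueError(f"unknown degree: {degree}")
```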

Similarly, when the place condition of an action mode is the change target, for example, the body part or position that the agent 52 imitates is changed. Another scene shown in FIG. 15 takes the driver D's <extend the left arm> action mode as the imitation target and the place condition as the change target. In this scene, when the <extend the left leg> action mode, at a location away from the left arm, is output, the degree of imitation is set low. When the <extend the right arm> action mode, at the location opposite the left arm, is output, the degree of imitation is set to medium, and when the <extend the left arm> action mode, at the same location, is output, the degree of imitation is set high.

When the situation condition of an action mode is the change target, for example, the situation in which the agent 52 imitates is changed. A further scene shown in FIG. 15 takes the driver D's <yawn> action mode as the imitation target and the situation condition as the change target. Because the driver D's <yawn> action mode was detected in the morning, the situation condition is morning. In this scene, when the <yawn> action mode is output at night, a situation different from morning, the degree of imitation is set low. When it is output at midday, a situation somewhat close to morning, the degree of imitation is set to medium, and when it is output in the morning, the same situation, the degree of imitation is set high.

Note that, for the action mode, the expression mode setting unit 26 may set the degree of imitation with any one of the time, place, and situation conditions as the change target, or with a plurality of these conditions as change targets. The expression mode setting unit 26 may also set the degree of imitation by associating the time, place, and situation conditions to be changed with the intimacy of the driver D. For example, when the intimacy of the driver D is low, all of the time, place, and situation conditions are made change targets so that the driver D is less likely to notice the imitation. When the intimacy of the driver D is medium, two of the time, place, and situation conditions are made change targets. When the intimacy of the driver D is high, the situation condition, being a condition with high empathy, may be left unchanged and only one of time or place made the change target. The expression mode setting unit 26 reflects these settings in the output data so that the agent 52 moves over time from expression modes with a low degree of imitation to expression modes with a high degree of imitation.

FIG. 16 is a diagram showing an example in which conditions relating to time, place, and situation are the change targets for the driver D's conversation mode. The higher the degree of imitation, the more responsive the agent 52 is to the driver D's conversation mode and the more empathetic its response is set to be. As shown in FIG. 16, when the conversation content "National Route 129 is congested again today" is detected in the driver D's conversation mode, for example, the word "today" is extracted as the time condition, "National Route 129" as the place condition, and "is congested" as the situation condition, and these words are changed. Based on detection values acquired via the vehicle sensors 3, the change may also associate the time condition with current time information, the place condition with vehicle position information, and the situation condition with information on the vehicle's surroundings.

When the time condition of a conversation mode is the change target, for example, the word relating to time is changed, and the change is associated with current time information. In the scene shown in FIG. 16, the word "today" is changed as the time condition, and "commuting hours" is associated as the current time information. When a response such as "National Route 129 gets congested in the holiday season, doesn't it?" is used, which is conceptually distant from the time "today", the degree of imitation is set low. When a response such as "National Route 129 is congested again this week, isn't it?" is used, which is conceptually somewhat distant from the time "today", the degree of imitation is set to medium. In contrast, when a response such as "It's commuting hours, so National Route 129 is congested, isn't it?" is used, which stays within the time "today" and incorporates the current time information, the degree of imitation is set high.

Similarly, when the place condition of a conversation mode is the change target, for example, the word relating to place is changed, and the change is associated with the vehicle position information. In the scene shown in FIG. 16, the word "National Route 129" is changed as the place condition, and the vehicle position information "near the Atsugi Bypass" is associated with it. When a response such as "Arterial roads often get congested, don't they?" is used, which is conceptually distant from the place "National Route 129", the degree of imitation is set low. When a response such as "National routes often get congested, don't they?" is used, which is conceptually somewhat distant from the place "National Route 129", the degree of imitation is set to medium. In contrast, when a response such as "National Route 129 often gets congested near the Atsugi Bypass, doesn't it?" is used, which stays within the place "National Route 129" and incorporates the vehicle position information, the degree of imitation is set high.

When the situation condition of a conversation mode is the change target, for example, the word relating to the situation is changed, and the change is associated with information on the vehicle's surroundings. In the scene shown in FIG. 16, the word "is congested" is changed as the situation condition, and the surrounding information "near a lane restriction" is associated with it. When a response such as "National Route 129 is sometimes not congested, is it?" is used, which is conceptually distant from the situation "is congested", the degree of imitation is set low. When a response such as "National Route 129 is congested, isn't it?" is used, which is the same concept as the situation "is congested", the degree of imitation is set to medium. In contrast, when a response such as "There is a lane restriction ahead, so National Route 129 seems to be congested" is used, which adds the vehicle's surrounding information to the situation "is congested", the degree of imitation is set high.
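The FIG. 16 examples for the three conditions amount to a table indexed by changed condition and degree of imitation. The Python sketch below holds the example responses as fixed strings purely for illustration; how responses are actually generated is not specified at this level, and the names `RESPONSES` and `select_response` are hypothetical.

```python
# Example responses to "National Route 129 is congested again today",
# keyed by the changed condition and the degree of imitation (after FIG. 16).
RESPONSES = {
    "time": {
        "low": "National Route 129 gets congested in the holiday season, doesn't it?",
        "medium": "National Route 129 is congested again this week, isn't it?",
        "high": "It's commuting hours, so National Route 129 is congested, isn't it?",
    },
    "place": {
        "low": "Arterial roads often get congested, don't they?",
        "medium": "National routes often get congested, don't they?",
        "high": "National Route 129 often gets congested near the Atsugi Bypass, doesn't it?",
    },
    "situation": {
        "low": "National Route 129 is sometimes not congested, is it?",
        "medium": "National Route 129 is congested, isn't it?",
        "high": "There is a lane restriction ahead, so National Route 129 seems to be congested.",
    },
}


def select_response(condition: str, degree: str) -> str:
    """Look up the example response for one changed condition and one degree."""
    return RESPONSES[condition][degree]
```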

Note that, for the conversation mode, the expression mode setting unit 26 may set the degree of imitation with any one of the time, place, and situation conditions as the change target, or with a plurality of these conditions as change targets. The expression mode setting unit 26 may also set the degree of imitation by associating the time, place, and situation conditions to be changed with the intimacy of the driver D. For example, when the intimacy of the driver D is low, all of the time, place, and situation conditions are made change targets so that the driver D is less likely to notice the imitation. When the intimacy of the driver D is medium, two of the time, place, and situation conditions are made change targets. When the intimacy of the driver D is high, the situation condition, being a condition with high empathy, may be left unchanged and only one of time or place made the change target. The expression mode setting unit 26 reflects these settings in the output data so that the agent 52 moves over time from expression modes with a low degree of imitation to expression modes with a high degree of imitation.
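The intimacy-based choice of change targets, which is described in the same way for action modes and conversation modes, might be sketched in Python as follows. The description does not say which two conditions are changed at medium intimacy, nor whether time or place is chosen at high intimacy, so the specific picks below are assumptions, and `change_targets` is a hypothetical name.

```python
ALL_CONDITIONS = ("time", "place", "situation")


def change_targets(intimacy: str) -> list[str]:
    """Return the conditions to alter for a given intimacy level."""
    if intimacy == "low":
        # Change everything so the imitation is hard to notice.
        return list(ALL_CONDITIONS)
    if intimacy == "medium":
        # Assumption: the text says "two of" the conditions without naming them.
        return ["time", "place"]
    if intimacy == "high":
        # The situation condition is kept; assumption: time is picked over place.
        return ["time"]
    raise ValueError(f"unknown intimacy level: {intimacy}")
```

Fewer changed conditions means the agent's output stays closer to what the driver actually did or said, which corresponds to a higher degree of imitation.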

Next, the information processing procedure of the information processing system 1 of the present embodiment will be described with reference to FIGS. 17 and 18. FIG. 17 is a flowchart showing an example of the information processing, executed by the information processing system 1 of the present embodiment, for determining the expression mode of the agent 52 and setting the degree of imitation. FIG. 18 shows an example of the subroutine of step S8 in FIG. 17, and is a flowchart showing an example of the information processing for determining the degree of imitation that the agent 52 outputs. The following description refers to the scenes of FIG. 10 and FIGS. 11(a) to 11(c).

First, in step S1 of FIG. 17, the following information processing is executed when the ignition switch of the vehicle is turned ON. In step S2, the occupant identification unit 21 identifies the driver D based on image information acquired from an in-vehicle camera serving as one of the vehicle sensors 3. Next, in step S3, the expression mode setting unit 26 refers to the expression mode database 23 and acquires the expression modes stored for the driver D, the number of detections of each expression mode, and the intimacy between the driver D and the agent 52.

In step S4, the expression mode setting unit 26 determines whether a trigger expression has been detected, based on the image information of the driver D acquired from the vehicle sensors 3 and the voice information of the driver D acquired from the input device 4. For example, when the trigger expression "That's fine (acceptance)" shown in FIG. 10 is detected, the process proceeds to step S5. Otherwise, step S4 is repeated for a predetermined time until a trigger expression is detected.

When it is determined in step S4 that a trigger expression has been detected, the expression mode setting unit 26 determines, in step S5, the expression mode of the agent 52 to be the imitation target. As shown in FIG. 10, when the driver D's trigger expression "acceptance" is detected, the expression mode setting unit 26 refers to the expression mode database 23 and determines the "acceptance" action modes and conversation modes among the driver D's expression modes as the imitation target.

In the subsequent step S6, for the "acceptance" action modes and conversation modes determined as the imitation target, the expression mode setting unit 26 specifies the change target used to bring the expression mode of the agent 52 closer to the expression mode of the driver D. For example, the number of detections of the action modes and conversation modes shown in FIG. 10 is made the change target.

In the subsequent step S7, the expression mode setting unit 26 sets the degree of imitation of the expression mode of the agent 52. Here, the expression mode setting unit 26 makes the setting so that the expression mode of the agent 52 gradually approaches the expression mode of the driver D as communication is repeated. When the number of detections is the change target, an expression mode with a larger number of detections is closer to the expression modes that the driver D uses. Therefore, as shown in FIG. 11(a), for example, the "Understood" conversation mode and the <salute> action mode, which have a small number of detections, are assigned a low degree of imitation. As shown in FIG. 11(b), the "Okay" conversation mode and the <thumbs up> action mode, which have a medium number of detections, are assigned a medium degree of imitation. In contrast, as shown in FIG. 11(c), the "Got it" conversation mode and the <make a circle with the thumb and index finger> action mode, which have a large number of detections, are assigned a high degree of imitation.
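Step S7 with the number of detections as the change target can be illustrated by ranking the stored expressions by detection count. The Python sketch below assumes exactly three candidate expressions, and the counts used in the test are hypothetical; the description gives no numeric thresholds, and `assign_degrees` is an illustrative name.

```python
def assign_degrees(detection_counts: dict[str, int]) -> dict[str, str]:
    """Map three expressions to low/medium/high by ascending detection count.

    An expression detected more often is closer to what the driver actually
    uses, so it receives a higher degree of imitation.
    """
    if len(detection_counts) != 3:
        raise ValueError("this sketch expects exactly three expressions")
    ranked = sorted(detection_counts, key=detection_counts.get)
    return dict(zip(("low", "medium", "high"), ranked))
```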

In step S8, the expression mode setting unit 26 determines the degree of imitation of the expression mode that the agent 52 outputs this time, and the process proceeds to step S9. In the subsequent step S9, the data generation unit 27 generates output data for the driver D based on the information received from the expression mode setting unit 26, namely the expression mode of the agent 52 to be the imitation target and the degree of imitation of that expression mode.

In step S10, the output unit 28 outputs control signals to the motion unit of the agent device 5, to a speaker or other audio output unit, and to a display or other display unit, and the output data is output by the agent function of the agent device 5.

In step S11, the information processing from step S4 to step S10 is executed repeatedly until the ignition switch of the vehicle is turned OFF.
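The overall flow of steps S4 to S11 can be sketched as a simple loop in Python. Everything here is a simplification under stated assumptions: sensor input is replaced by a pre-recorded sequence of detected trigger labels (`None` when nothing was detected), the expression mode database is a plain dictionary, the intimacy level is fixed for the session, and `run_session` is a hypothetical name.

```python
from typing import Iterable, Optional


def run_session(observations: Iterable[Optional[str]],
                expression_db: dict[str, dict[str, str]],
                intimacy_level: str) -> list[str]:
    """Process one ignition-ON session and return the agent outputs in order."""
    outputs = []
    for trigger in observations:            # S11: repeat until ignition OFF
        if trigger not in expression_db:    # S4: no known trigger expression
            continue
        variants = expression_db[trigger]   # S5 to S7: target and its graded variants
        degree = intimacy_level             # S8: degree chosen from intimacy
        outputs.append(variants[degree])    # S9 and S10: generate and output
    return outputs
```

For example, with a database holding the "acceptance" variants "Understood", "Okay", and "Got it" under low, medium, and high, a medium-intimacy session answers every detected "acceptance" trigger with "Okay".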

In step S8 of FIG. 17, as shown in FIG. 18, information processing is executed to determine the degree of imitation of the expression mode that the agent 52 outputs this time. Once the degree of imitation of the expression modes has been set in step S7, the expression mode setting unit 26 acquires, in step S81 of FIG. 18, the intimacy between the driver D and the agent 52 stored in the intimacy database 25, via the intimacy determination unit 24.

When the intimacy of the driver D acquired in step S81 is low, the process proceeds to step S82, and the degree of imitation to be output this time is determined to be "low". In this case, as shown in FIG. 11(a), the "Understood" conversation mode and the <salute> action mode, which have a low degree of imitation, are output. When the intimacy of the driver D acquired in step S81 is medium, the process proceeds to step S83, and the degree of imitation to be output this time is determined to be "medium". In this case, as shown in FIG. 11(b), the "Okay" conversation mode and the <thumbs up> action mode, which have a medium degree of imitation, are output. When the intimacy of the driver D acquired in step S81 is high, the process proceeds to step S84, the degree of imitation to be output this time is determined to be "high", and, as shown in FIG. 11(c), the "Got it" conversation mode and the <make a circle with the thumb and index finger> action mode, which have a high degree of imitation, are output.
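The branch of steps S81 to S84 is a three-way decision on intimacy. The Python sketch below assumes that intimacy is stored as a score between 0 and 1 and uses thresholds of 1/3 and 2/3; both the numeric scale and the thresholds are assumptions for illustration, as the description only distinguishes low, medium, and high intimacy.

```python
def decide_degree(intimacy: float,
                  low_max: float = 1 / 3,
                  medium_max: float = 2 / 3) -> str:
    """Return the degree of imitation to output this time (steps S82 to S84)."""
    if not 0.0 <= intimacy <= 1.0:
        raise ValueError("intimacy is assumed to lie in [0, 1]")
    if intimacy < low_max:
        return "low"      # S82
    if intimacy < medium_max:
        return "medium"   # S83
    return "high"         # S84
```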

As described above, the information processing system 1 and the information processing method of the present embodiment detect the expression mode of the driver D (user), determine the expression mode of the agent 52 to be the imitation target, set, for that expression mode, the degree of imitation of the expression mode of the agent 52 relative to the detected expression mode of the driver D (user), and generate output data according to the degree of imitation. The degree of imitation of the expression mode of the agent 52 is set so that it approaches the expression mode of the driver D (user) over time as the generated output data is output to the driver D (user). This allows appropriate communication using expressions that are easy for the driver D (user) to understand. It also keeps the driver D (user) from finding the expressions of the agent 52 unnatural or from growing uncomfortable with them.

Further, the information processing system 1 and the information processing method of the present embodiment further include the intimacy determination unit 24, which determines the intimacy between the driver D (user) and the agent 52, and the expression mode setting unit 26 sets the degree of imitation closer to the expression mode of the driver D (user) when the intimacy between the driver D (user) and the agent 52 is relatively high than when it is relatively low. This allows the degree to which the agent 52 imitates the expression mode of the driver D (user) to be varied appropriately and enables smooth communication. The driver D (user) can also be expected to feel attached to the agent 52 more readily.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the intimacy determination unit 24 estimates the intimacy between the driver D (user) and the agent 52 based on the conversation content input by the driver D (user). This allows the driver D's (user's) actual words and behavior toward the agent 52 to be reflected in the intimacy, and promotes appropriate communication between the driver D (user) and the agent 52.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the intimacy determination unit 24 acquires the reaction of the driver D (user) to the expression mode output by the agent 52 and estimates the intimacy between the driver D (user) and the agent 52 based on the acquired reaction. This further promotes appropriate communication between the driver D (user) and the agent 52.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the agent device 5 is an in-vehicle device mounted on a vehicle, the user is an occupant of the vehicle, and the intimacy determination unit 24 estimates the intimacy between the driver D (user) and the agent 52 based on the driver D's (user's) usage of the in-vehicle devices mounted on the vehicle, including a driving support device. This allows the driver D's (user's) receptivity to in-vehicle devices to be reflected in the communication between the driver D (user) and the agent 52.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the expression modes of the driver D (user) include action modes detected from the visual features of the driver D (user) and conversation modes detected from the auditory features of the driver D (user), and the expression mode setting unit 26 determines the expression mode of the agent 52 to be the imitation target using the action modes and/or the conversation modes of the driver D (user). Because the agent 52 imitates expressions that the driver D (user) has actually used, communication can be performed in a manner that is easy for the driver D (user) to understand. The expressions of the agent 52 can also be expected to be received favorably by the driver D (user) for the same reason.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the visual features of the driver D (user) include at least one of the appearance, looks, posture, facial expression, and motion of the driver D (user) obtained from image information capturing the driver D (user). Because the driver D (user) readily notices imitation of an action mode detected from visual features, having the agent 52 imitate it gradually keeps the imitation from giving the impression of being excessive.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the auditory features of the driver D (user) include at least one of the voice quality, intonation, phrasing, language, dialect, conversation content, estimated age, and gender of the driver D (user) obtained from voice information of the driver D (user). Imitation of a conversation mode detected from auditory features is harder for the driver D (user) to notice than imitation of an action mode, so having the agent 52 imitate it gradually enables good communication.

Further, according to the information processing system 1 and the information processing method of the present embodiment, the expression mode setting unit 26 sets the degree of imitation so as to approach the expression mode of the driver D (user) by using a plurality of action modes and/or conversation modes of the driver D (user). This allows the expressions used by the agent 52 to be changed in diverse ways and in stages.
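The staged, per-mode adjustment of the degree of imitation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the embodiment: the class name `ImitationState`, the mode keys, the fixed step size of 0.25, and the linear `blend` function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ImitationState:
    """Degree of imitation per action/conversation mode, each in [0.0, 1.0].

    Hypothetical representation: 0.0 means the agent keeps its own default
    expression, 1.0 means it fully matches the user's detected expression.
    """
    degrees: dict[str, float] = field(default_factory=dict)

    def step(self, mode: str, increment: float = 0.25) -> float:
        """Raise the degree for one mode by one stage, capped at 1.0."""
        current = self.degrees.get(mode, 0.0)
        self.degrees[mode] = min(1.0, current + increment)
        return self.degrees[mode]


def blend(agent_value: float, user_value: float, degree: float) -> float:
    """Move a numeric expression parameter from the agent's default toward
    the user's detected value in proportion to the degree of imitation."""
    return agent_value + degree * (user_value - agent_value)


state = ImitationState()
# Each mode advances independently, so the agent's overall expression
# changes in several small stages rather than all at once.
state.step("speech_rate")
state.step("speech_rate")
state.step("nod_frequency")
rate = blend(agent_value=1.0, user_value=2.0, degree=state.degrees["speech_rate"])
```

Tracking one degree per mode is only one possible design; it is used here because it lets different modes approach the user's expression at different times.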

Further, according to the information processing system 1 and the information processing method of the present embodiment, the expression mode setting unit 26 identifies a condition, including at least one of time, place, and situation, contained in the action mode and/or the conversation mode of the driver D (user), and uses that condition to set the degree of imitation so as to approach the expression mode of the driver D (user). Using these conditions makes it possible to set the imitation in stages, from imitation that the driver D (user) readily notices to imitation that is hard to notice, so that appropriate communication can be carried out using diverse expressions.

Further, according to the information processing system 1 and the information processing method of the present embodiment, when the degree of intimacy between the driver D (user) and the agent 52 is relatively low, the frequency at which the expression mode of the agent 52 is output is made higher than when the degree of intimacy between the driver D (user) and the agent 52 is relatively high. This promotes communication between the driver D (user) and the agent 52.
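The relationship between the degree of intimacy and the output frequency can be sketched as follows. This is only an illustration under stated assumptions: the function name `output_interval_seconds`, the normalization of intimacy to the range 0.0 to 1.0, the linear mapping, and the 60-second and 600-second bounds are hypothetical and do not appear in the embodiment.

```python
def output_interval_seconds(
    intimacy: float,
    min_interval: float = 60.0,
    max_interval: float = 600.0,
) -> float:
    """Return the time between agent outputs for a given degree of intimacy.

    Assumes intimacy is normalized to [0.0, 1.0]. Lower intimacy gives a
    shorter interval, i.e. a higher output frequency.
    """
    clamped = max(0.0, min(1.0, intimacy))  # guard against out-of-range input
    return min_interval + clamped * (max_interval - min_interval)


# A user with low intimacy is addressed more often than one with high intimacy.
low = output_interval_seconds(0.2)
high = output_interval_seconds(0.8)
```

A linear mapping is chosen only for simplicity; any monotonic mapping in which the frequency falls as intimacy rises would match the behavior described above.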

Note that the embodiments described above are provided to facilitate understanding of the present invention, not to limit it. Accordingly, each element disclosed in the above embodiments is intended to include all design changes and equivalents that fall within the technical scope of the present invention.

DESCRIPTION OF SYMBOLS
1 … Information processing system
2 … Information processing device
21 … Occupant identification unit
22 … Expression mode detection unit
23 … Expression mode database
24 … Intimacy determination unit
25 … Intimacy database
26 … Expression mode setting unit
27 … Data generation unit
28 … Output unit
3 … Vehicle sensors
4 … Input device
5 … Agent device

Claims

1. An information processing system comprising: an agent device including an anthropomorphic agent that imitates an expression mode of a user; and an information processing device that sets the expression mode of the agent and generates output data for the user, the information processing system outputting the generated output data to the user, wherein
the information processing device includes:
a detection unit that detects the expression mode of the user;
a setting unit that determines an expression mode of the agent to be imitated, and sets, for the determined expression mode, a degree of imitation of the agent's expression mode with respect to the detected expression mode of the user; and
a data generation unit that generates output data according to the set degree of imitation, and
the setting unit sets the degree of imitation of the expression mode of the agent so as to approach the expression mode of the user over time as the generated output data is output to the user.

2. The information processing system according to claim 1, further comprising a determination unit that determines a degree of intimacy between the user and the agent, wherein, when the degree of intimacy between the user and the agent is relatively high, the setting unit sets the degree of imitation closer to the expression mode of the user than when the degree of intimacy between the user and the agent is relatively low.

3. The information processing system according to claim 2, wherein the determination unit estimates the degree of intimacy between the user and the agent based on conversation content input by the user.

4. The information processing system according to …, wherein the determination unit acquires a reaction of the user to the output expression mode of the agent, and estimates the degree of intimacy between the user and the agent based on the acquired reaction of the user.

5. The information processing system according to any one of claims 2 to 4, wherein the agent device is an in-vehicle device mounted on a vehicle, the user is an occupant of the vehicle, and the determination unit estimates the degree of intimacy between the user and the agent based on the user's usage of in-vehicle devices mounted on the vehicle, including a driving support device.

6. The information processing system according to any one of claims 1 to 5, wherein the expression mode of the user includes an action mode detected from visual features of the user and a conversation mode detected from auditory features of the user, and the setting unit determines the expression mode of the agent to be imitated by using the action mode and/or the conversation mode of the user.

7. The information processing system according to claim 6, wherein the visual features of the user include at least one of the user's appearance, looks, posture, facial expression, and motion obtained from image information capturing the user.

8. The information processing system according to claim 6 or 7, wherein the auditory features of the user include at least one of the user's voice quality, intonation, phrasing, language, dialect, conversation content, estimated age, and gender obtained from voice information of the user.

9. The information processing system according to any one of claims 6 to 8, wherein the setting unit sets the degree of imitation so as to approach the expression mode of the user by using a plurality of action modes and/or conversation modes of the user.

10. The information processing system according to any one of claims 6 to 9, wherein the setting unit identifies a condition, including at least one of time, place, and situation, contained in the action mode and/or the conversation mode of the user, and uses the condition to set the degree of imitation so as to approach the expression mode of the user.

11. The information processing system according to any one of claims 2 to 10, wherein, when the degree of intimacy between the user and the agent is relatively low, the frequency at which the expression mode of the agent is output is made higher than when the degree of intimacy between the user and the agent is relatively high.

12. An information processing method executed by a processor that sets an expression mode of an anthropomorphic agent that imitates an expression mode of a user, generates output data for the user, and outputs the generated output data to the user by using the agent, the method comprising, by the processor:
detecting the expression mode of the user;
determining an expression mode of the agent to be imitated, and setting, for the determined expression mode, a degree of imitation of the agent's expression mode with respect to the detected expression mode of the user; and
generating output data according to the set degree of imitation,
wherein the degree of imitation of the expression mode of the agent is set so as to approach the expression mode of the user over time as the generated output data is output to the user.