JP5223605B2 - Robot system, communication activation method and program

Info

Publication number: JP5223605B2
Application number: JP2008285993A
Authority: JP
Inventors: 亮平笹間; 睦夫佐野; 健三郎宮脇; 佳生寺本; 達也速水; 大輔山本; 謙太郎向井; 雅雄川野; 智治山口; 宏比古伊藤; 要祐坂尾; 麻以子鎌田
Original assignee: NEC Corp; Josho Gakuen Educational Foundation
Current assignee: NEC Corp; Josho Gakuen Educational Foundation
Priority date: 2008-11-06
Filing date: 2008-11-06
Publication date: 2013-06-26
Anticipated expiration: 2028-11-06
Also published as: JP2010110864A

Description

The present invention relates to a robot system that communicates with humans, a communication activation method using the robot system, and a program executed by a computer that controls the robot system.

In general, making conversational communication natural requires not only voice information (verbal information) but also information other than voice (non-verbal information), such as the other party's gaze, gestures, and nodding. Communication becomes natural only when both parties perceive with their five senses the verbal information of speech, such as its rhythm, timing, distribution, and intensity, together with non-verbal information such as the other party's gaze, breathing, heartbeat, gestures, movements, nods, backchannel responses, and blinks, and respond by adjusting the rhythm, timing, distribution, and intensity of their own actions, so that they come to share a bodily rhythm (communication rhythm).

Perceiving the other party's verbal and non-verbal information with the five senses and acting in response to it is called interaction behavior. Sharing a communication rhythm through interaction behavior is called communication synchronization. The phenomenon of being drawn into a conversation through communication synchronization is called the pull-in phenomenon.

In recent years, robot systems that can communicate with humans, for example through conversation, have appeared. In communication between a human and a robot system as well, bringing about the pull-in phenomenon is an important factor in activating that communication. Against this background, various technologies that lead to the appearance of the pull-in phenomenon have been disclosed for communication between a robot and a user and for communication between users mediated by a robot (see, for example, Patent Documents 1 to 6).

The embodied media communication system described in Patent Document 1 displays pseudo-personality robots for both parties on the screen of each communication terminal in order to realize intimate communication with a communication partner separated in time or space. The pseudo-personality robots operate based on the voice information and specific motion information of both parties. This gives both parties the sense of sharing a space.

The physical pull-in method and system described in Patent Document 2 provides non-verbal information to a speaker and a listener simultaneously through their sense of touch, without requiring either of them to shift their gaze. This allows them to share each other's communication rhythm. Patent Document 3 discloses an application of the physical pull-in method and system of Patent Document 2 to cases, such as presentations, in which the listeners are an unspecified number of people.

The automatic response toy described in Patent Document 4 determines the toy's emotion based on external stimuli such as the loudness of the user's voice, the magnitude of the user's facial movement, and the timing of the user's nods. For example, if the user's nod timing is detected many times, the toy interprets the conversation as lively and sets its emotion at that time to "happiness". The toy then performs a response action (interaction behavior) according to the determined emotion.

The intention transmission device described in Patent Document 5 comprises a voice transmission/reception unit, a shared robot, a listener control unit, and a speaker control unit. The voice transmission/reception unit transmits and receives voice signals such as conversation, and the shared robot responds to these voice signals by nodding its head, opening and closing its mouth, blinking its eyes, or gesturing with its body. The listener control unit determines the behavior of the shared robot as a listener from the voice signal transmitted through the transmission unit and operates the shared robot accordingly. The speaker control unit determines the behavior of the shared robot as a speaker from the voice signal received by the reception unit and operates the shared robot accordingly.

The rhythm control dialogue apparatus described in Patent Document 6 comprises multi-channel recognition means for recognizing a plurality of input data, including time information of voice signals and gestures, from data input means; time-stamping means for outputting time information; rhythm detection means for processing the recognition results output from the recognition means to detect the rhythm of the user's dialogue; history storage means for storing the rhythm history; dialogue management means for advancing the dialogue based on the rhythm recognized by the rhythm detection means; and output means for outputting output data. The response content is conveyed to the user by the output means.

JP 2003-108502 A
JP 2005-251133 A
JP 2005-250421 A
JP 2002-239256 A
JP 2000-349920 A
JP H10-111786 A

All six of the above technologies detect the verbal and non-verbal information of the communication partner (for example, by detecting silent intervals in which neither the user nor the robot speaks) and have the robot perform interaction behavior (interaction operations) based on the detected information, in the expectation that the user and the robot will come to share a communication rhythm and that the pull-in phenomenon will appear.

How people communicate varies from person to person, but in general anyone subtly changes how they communicate according to the degree of synchronization with the other party. Raising the probability that the pull-in phenomenon appears therefore requires a dynamic induction strategy that changes the interaction operation according to the developmental stage of the communication. With the six technologies above, however, it is difficult to communicate under such a dynamic induction strategy.

The present invention has been made in view of the above circumstances, and its object is to provide a robot system, a communication activation method, and a program that can raise the probability that the pull-in phenomenon appears and further activate communication.

To achieve the above object, a robot system according to a first aspect of the present invention comprises:
an output unit that performs interaction operations for a plurality of users;
a recognition unit that recognizes the communication rhythm of each of the plurality of users based on that user's verbal information and non-verbal information;
a synchronization degree calculation unit that calculates the degree of synchronization between the users based on the communication rhythms;
a rule database that stores rules, constructed under a dynamic induction strategy for the pull-in phenomenon according to the developmental stage of the communication, concerning the interaction operations that the output unit should perform for the users according to the communication rhythm and the degree of synchronization; and
an interaction control unit that refers to the rule database, searches, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controls the output unit based on the operation command found.

A communication activation method according to a second aspect of the present invention is
a communication activation method using a robot system that includes an output unit that performs interaction operations for a plurality of users, the method comprising:
a first step of recognizing the communication rhythm of each of the plurality of users based on that user's verbal information and non-verbal information;
a second step of calculating the degree of synchronization between the users based on the communication rhythms; and
a third step of referring to a rule database that stores rules, constructed under a dynamic induction strategy for the pull-in phenomenon according to the developmental stage of the communication, concerning the interaction operations that the output unit should perform for the users according to the communication rhythm and the degree of synchronization, searching, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controlling the output unit based on the operation command found.

A program according to a third aspect of the present invention causes
a computer that controls a robot system including an output unit that performs interaction operations for a plurality of users to execute:
a first procedure of recognizing the communication rhythm of each of the plurality of users based on that user's verbal information and non-verbal information;
a second procedure of calculating the degree of synchronization between the users based on the communication rhythms; and
a third procedure of referring to a rule database that stores rules, constructed under a dynamic induction strategy for the pull-in phenomenon according to the developmental stage of the communication, concerning the interaction operations that the output unit should perform for the users according to the communication rhythm and the degree of synchronization, searching, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controlling the output unit based on the operation command found.

According to the present invention, the probability that the pull-in phenomenon appears can be raised and communication can be further activated.

The best mode for carrying out the present invention will be described in detail with reference to the drawings. The robot system according to each embodiment described below recognizes the communication rhythms of a plurality of users and performs interaction operations for those users, thereby achieving communication synchronization between the robot and the users and inducing the pull-in phenomenon.

(First embodiment)
First, a first embodiment of the present invention will be described. FIG. 1 shows the schematic configuration of a robot system 100 according to this embodiment. As shown in FIG. 1, the robot system 100 includes a communication field recognition unit 1, a communication synchronization degree calculation unit 2, an interaction control unit 3, a social interaction rule database (hereinafter abbreviated as "SIRDB") 4, and an output unit 5.

The communication field recognition unit 1 has various sensors (not shown) such as a microphone, a camera, and a biometric sensor. The microphone picks up the voice of the user who is the communication partner of the robot system 100. The camera captures images of the user. The biometric sensor detects biometric information such as the user's pulse.

Based on the audio, image, and biometric information obtained from these sensors, that is, the verbal and non-verbal information, the communication field recognition unit 1 recognizes speech data such as the user's speech power and its period; body movement data such as the user's gestures, movements, nods, and backchannel responses; and biosensing data such as the user's gaze, breathing, heartbeat, and blinks. These speech data, body movement data, biosensing data, and so on are referred to as the communication rhythm (modal information). The recognized communication rhythm is output to the communication synchronization degree calculation unit 2.
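As an illustration only, the three kinds of data that make up the communication rhythm could be grouped in a record like the following. This is a minimal sketch: every class name, field name, and unit here is an assumption made for the example and does not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechData:
    """Speech data: speech power and its period (units are assumed)."""
    power: float
    period_s: float


@dataclass
class BodyMovementData:
    """Body movement data: counts observed in a time window (assumed)."""
    gestures: int = 0
    nods: int = 0
    backchannels: int = 0


@dataclass
class BioSensingData:
    """Biosensing data: gaze, breathing, heartbeat, blinks (assumed fields)."""
    gaze_target: str = "unknown"
    breaths_per_min: float = 0.0
    heart_rate_bpm: float = 0.0
    blinks: int = 0


@dataclass
class CommunicationRhythm:
    """Hypothetical container for one user's modal information."""
    user_id: str
    speech: SpeechData
    body: BodyMovementData = field(default_factory=BodyMovementData)
    bio: BioSensingData = field(default_factory=BioSensingData)
```

A record of this shape would be what the recognition unit hands to the synchronization degree calculation, one per user.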

The communication synchronization degree calculation unit 2 calculates the communication synchronization degree based on the communication rhythm (a plurality of pieces of modal information) output from the communication field recognition unit 1. The communication synchronization degree is an index value indicating how strongly the communication rhythm is shared with the user; the larger the value, the more readily the pull-in phenomenon appears in the user. The communication synchronization degree can be, for example, a linear weighted sum of the average of the user's speech power, the number of gaze exchanges with the user, the number of nods, and so on, that is, a linear weighted sum of the various communication rhythms. The calculated communication synchronization degree is output to the interaction control unit 3.
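The linear weighted sum mentioned above can be sketched as follows. The specification gives no weights or feature scaling, so the weight values and the names `RhythmFeatures` and `synchronization_degree` are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class RhythmFeatures:
    """Hypothetical per-window features derived from the communication rhythm."""
    mean_speech_power: float
    gaze_exchanges: int
    nods: int


# Assumed weights; the specification does not state any values.
DEFAULT_WEIGHTS = {
    "mean_speech_power": 0.5,
    "gaze_exchanges": 0.3,
    "nods": 0.2,
}


def synchronization_degree(features: RhythmFeatures, weights=None) -> float:
    """Return the linear weighted sum of the rhythm features."""
    w = DEFAULT_WEIGHTS if weights is None else weights
    return (
        w["mean_speech_power"] * features.mean_speech_power
        + w["gaze_exchanges"] * features.gaze_exchanges
        + w["nods"] * features.nods
    )
```

In practice the features would probably need to be normalized to comparable ranges before weighting; that step is omitted here because the specification does not describe it.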

In addition to the communication synchronization degree calculated by the communication synchronization degree calculation unit 2, the communication rhythm output from the communication field recognition unit 1 is also input to the interaction control unit 3. The interaction control unit 3 refers to the SIRDB 4 based on the input communication rhythm and communication synchronization degree.

The SIRDB 4 stores rules (interaction rules) concerning the interaction behavior that the output unit 5, described later, should perform for the user according to the communication rhythm and the communication synchronization degree. The interaction rules are built so that the system takes, as far as possible, the same actions that a human would ordinarily take in response to the verbal and non-verbal information perceived in human-to-human communication. More specifically, the interaction rules are built under a dynamic induction strategy for the pull-in phenomenon according to the developmental stage of the communication. Under these rules, the interaction operation differs between a state of low communication synchronization degree and a state of high communication synchronization degree, even when the communication rhythm is the same.

In accordance with the interaction rules, the interaction control unit 3 determines, based on the input communication rhythm and communication synchronization degree, a robot action command, which is an operation command corresponding to the operation to be performed for the user. The determined robot action command is output to the output unit 5.
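One way to picture the rule lookup is a table keyed by a rhythm event and a range of synchronization degrees, so that the same rhythm yields different commands at low and high synchronization. The sketch below assumes this table layout; the event names, thresholds, and command names are all invented for illustration and are not taken from the specification.

```python
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical interaction rule: applies when the event matches and
    min_degree <= degree < max_degree."""
    rhythm_event: str
    min_degree: float
    max_degree: float
    command: str


# Assumed contents standing in for the SIRDB.
SIRDB: List[Rule] = [
    Rule("silence", 0.0, 0.5, "offer_topic"),
    Rule("silence", 0.5, float("inf"), "nod_and_wait"),
    Rule("user_speaking", 0.0, 0.5, "gaze_at_user"),
    Rule("user_speaking", 0.5, float("inf"), "backchannel"),
]


def find_command(rules: List[Rule], rhythm_event: str, degree: float) -> Optional[str]:
    """Return the first matching robot action command, or None if no rule applies."""
    for rule in rules:
        if rule.rhythm_event == rhythm_event and rule.min_degree <= degree < rule.max_degree:
            return rule.command
    return None
```

With this table, a silent interval produces `offer_topic` while synchronization is still low and `nod_and_wait` once it is high.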

The output unit 5 is a display or a humanoid robot body. In the case of a display, one that shows on its screen a human figure (agent) created by CG (computer graphics) can be used. The agent or humanoid robot is modeled on a real human and has a face, hands, a torso, and so on, which it can move. On the face, the eyes, nose, mouth, and so on can also be moved. The output unit 5 realizes robot actions (interaction operations) by moving the face, hands, and torso, as well as the eyes, nose, mouth, and so on. Such interaction operations include, for example, changes of gaze, breathing, heartbeat, blinking, gestures, movements, nodding, and backchannel responses.

The output unit 5 also has a speaker (not shown) for outputting the robot's voice, and can speak by outputting voice from the speaker while moving the displayed mouth. It is thus desirable that the output unit 5 be able to perform various operations close to those of a human.

Under the control of the interaction control unit 3, the output unit 5 actually performs the interaction operation according to the input robot action command.

As shown in the communication process of FIG. 2, the robot system 100 performs the following in this order: recognition of the communication rhythm by the communication field recognition unit 1 (step S10), calculation of the communication synchronization degree by the communication synchronization degree calculation unit 2 (step S12), determination of the robot action command by the interaction control unit 3 (step S14), and the interaction operation by the output unit 5 (step S16).
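The order of steps S10 to S16 can be sketched as a simple pipeline. In this sketch the four units are passed in as plain callables; that interface, and the function names, are assumptions made for illustration rather than anything the specification defines.

```python
from typing import Any, Callable, Iterable, List


def communication_step(
    sensor_input: Any,
    recognize: Callable[[Any], Any],
    calc_degree: Callable[[Any], float],
    decide: Callable[[Any, float], Any],
    act: Callable[[Any], None],
) -> Any:
    """Run one pass of the communication process and return the command used."""
    rhythm = recognize(sensor_input)   # step S10: recognize the communication rhythm
    degree = calc_degree(rhythm)       # step S12: calculate the synchronization degree
    command = decide(rhythm, degree)   # step S14: determine the robot action command
    act(command)                       # step S16: perform the interaction operation
    return command


def run_communication(
    inputs: Iterable[Any],
    recognize: Callable[[Any], Any],
    calc_degree: Callable[[Any], float],
    decide: Callable[[Any, float], Any],
    act: Callable[[Any], None],
) -> List[Any]:
    """Repeat the step for each successive sensor input."""
    return [communication_step(x, recognize, calc_degree, decide, act) for x in inputs]
```

Passing the units as callables keeps the loop independent of how recognition or rule lookup is actually implemented.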

While watching this robot action, the user continues to communicate with the robot system 100 through speech, gestures, and the like. In response, the robot system 100 repeats recognition of the communication rhythm in the communication field recognition unit 1 (step S10), calculation of the communication synchronization degree in the communication synchronization degree calculation unit 2 (step S12), determination of the robot action command in the interaction control unit 3 (step S14), and the interaction operation in the output unit 5 (step S16). The user, watching each interaction operation, again continues to communicate with the robot system 100 through speech, gestures, and the like.

The user and the robot system 100 continue communication such as conversation while repeating these operations.

As the communication continues, the communication rhythm comes to be shared between the user and the robot system 100, and the communication synchronization degree rises. As a result, the pull-in phenomenon is induced in the user.

As described above, the robot system 100 according to this embodiment recognizes the communication rhythm based on verbal and non-verbal information (a plurality of pieces of modal information) and obtains the communication synchronization degree directly from that communication rhythm. The robot system 100 also performs interaction operations based on the communication synchronization degree, in accordance with interaction rules built on a dynamic induction strategy for the pull-in phenomenon according to the developmental stage of the communication. Because the robot system 100 can thus communicate with the user under such a dynamic induction strategy, it can raise the probability that the pull-in phenomenon appears and further activate communication.

(Second embodiment)
A second embodiment of the present invention will be described. FIG. 3 shows the schematic configuration of a robot system 101 according to this embodiment. As shown in FIG. 3, the robot system 101 differs from the robot system 100 of the first embodiment in that it further includes a communication mode determination unit 6, in that it includes a plurality of SIRDBs 41, 42, 43, ... in place of the SIRDB 4, and in the operation of the interaction control unit 3; it is otherwise the same. In this embodiment, therefore, components that also appear in the first embodiment are given the same reference numerals as in FIG. 1, and their detailed description is omitted.

The communication mode determination unit 6 determines the communication mode based on the communication synchronization degree. The communication mode indicates the state of the communication field. Various modes with different communication synchronization degrees can be set as communication modes, for example a first-meeting state, a topic-offering state, and a topic-enlivening state. By setting such communication modes, the robot system 101 can change the interaction rules according to the communication mode and build a task that changes the control state of the output unit 5 in a planned manner according to the communication mode, so as to raise the communication synchronization degree efficiently and make the pull-in phenomenon easier to induce. This makes it easier to devise a dynamic induction strategy for the pull-in phenomenon.

There are as many SIRDBs 41, 42, 43, ... as there are communication modes, and each corresponds to one of the communication modes.

The communication mode determination unit 6 determines the current communication mode based on the communication synchronization degree output from the communication synchronization degree calculation unit 2, and outputs it to the interaction control unit 3. The interaction control unit 3 selects, from among the plurality of SIRDBs 41, 42, 43, ..., the SIRDB corresponding to the determined communication mode. The interaction control unit 3 then refers to the selected SIRDB and, in accordance with the interaction rules stored in it, determines the robot action command based on the communication rhythm output from the communication field recognition unit 1 and the communication synchronization degree output from the communication synchronization degree calculation unit 2. The output unit 5 performs an interaction operation suited to the current communication mode according to that robot action command.
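The mode-dependent selection of an SIRDB can be sketched as below. The specification does not say how the communication mode is derived from the synchronization degree, so the threshold values are assumptions, as are the rule tables, event names, and command names. For brevity each per-mode table is keyed by the rhythm event only.

```python
from enum import Enum
from typing import Dict, Optional


class CommunicationMode(Enum):
    FIRST_MEETING = "first_meeting"
    TOPIC_OFFERING = "topic_offering"
    TOPIC_ENLIVENING = "topic_enlivening"


def determine_mode(degree: float, low: float = 0.3, high: float = 0.7) -> CommunicationMode:
    """Map a synchronization degree to a mode using assumed thresholds."""
    if degree < low:
        return CommunicationMode.FIRST_MEETING
    if degree < high:
        return CommunicationMode.TOPIC_OFFERING
    return CommunicationMode.TOPIC_ENLIVENING


# One hypothetical rule table per mode, standing in for SIRDBs 41, 42, 43, ...
SIRDB_BY_MODE: Dict[CommunicationMode, Dict[str, str]] = {
    CommunicationMode.FIRST_MEETING: {"silence": "greet", "user_speaking": "gaze_at_user"},
    CommunicationMode.TOPIC_OFFERING: {"silence": "offer_topic", "user_speaking": "nod"},
    CommunicationMode.TOPIC_ENLIVENING: {"silence": "ask_follow_up", "user_speaking": "backchannel"},
}


def decide_command(rhythm_event: str, degree: float) -> Optional[str]:
    """Select the SIRDB for the current mode, then look up the command."""
    rules = SIRDB_BY_MODE[determine_mode(degree)]
    return rules.get(rhythm_event)
```

Keeping one table per mode means the induction strategy for each stage of the conversation can be edited without touching the others.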

Thus, according to this embodiment, the induction strategy for the pull-in phenomenon can be changed dynamically according to the state of the communication field, so the probability that the pull-in phenomenon appears can be raised and communication can be further activated.

(Third embodiment)
A third embodiment of the present invention will be described. FIG. 4 shows the schematic configuration of a robot system 102 according to this embodiment. The robot system 102 differs from the robot system 100 of the first embodiment in that it further includes a user internal state estimation unit 7, in that it includes a plurality of SIRDBs 41, 42, 43, ... in place of the SIRDB 4, and in the operation of the interaction control unit 3; it is otherwise the same. In this embodiment, therefore, components that also appear in the first embodiment are given the same reference numerals as in FIG. 1, and their detailed description is omitted.

As shown in FIG. 4, the user internal state estimation unit 7 receives the communication rhythm output from the communication field recognition unit 1 and estimates the user's internal state based on it. The user's internal state is the user's mental state, such as being tense or feeling pleasant. The internal state can be divided into states such as (tense, pleasant), (tense, unpleasant), (relaxed, pleasant), and (relaxed, unpleasant).

There are as many SIRDBs 41, 42, 43, ... as there are user internal states, and each corresponds to one of those internal states. For example, one SIRDB can be prepared for each of the states (tense, pleasant), (tense, unpleasant), (relaxed, pleasant), and so on.

The user internal state estimation unit 7 outputs the estimated user internal state to the interaction control unit 3. The interaction control unit 3 selects, from among the plurality of SIRDBs 41, 42, 43, ..., the SIRDB corresponding to the user's internal state. The interaction control unit 3 then refers to the selected SIRDB and, in accordance with its interaction rules, determines the robot action command based on the communication rhythm output from the communication field recognition unit 1 and the communication synchronization degree output from the communication synchronization degree calculation unit 2. The output unit 5 performs an interaction operation suited to the user's current internal state according to that robot action command.
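Selecting an SIRDB by the user's internal state can be pictured as a lookup keyed by a pair of values. The sketch below covers only the selection, since the specification does not describe how the internal state is estimated. The axis names `Arousal` and `Valence`, the table contents, and the command names are assumptions made for the example.

```python
from enum import Enum
from typing import Dict, Tuple


class Arousal(Enum):
    TENSE = "tense"
    RELAXED = "relaxed"


class Valence(Enum):
    PLEASANT = "pleasant"
    UNPLEASANT = "unpleasant"


InternalState = Tuple[Arousal, Valence]

# One hypothetical rule table per internal state (rhythm event -> command).
SIRDB_BY_STATE: Dict[InternalState, Dict[str, str]] = {
    (Arousal.TENSE, Valence.PLEASANT): {"silence": "nod_and_wait"},
    (Arousal.TENSE, Valence.UNPLEASANT): {"silence": "speak_softly"},
    (Arousal.RELAXED, Valence.PLEASANT): {"silence": "offer_topic"},
    (Arousal.RELAXED, Valence.UNPLEASANT): {"silence": "change_topic"},
}


def select_sirdb(state: InternalState) -> Dict[str, str]:
    """Return the rule table prepared for the given internal state."""
    return SIRDB_BY_STATE[state]
```

The four keys correspond to the four example states listed in the description; a real system could define more or fewer.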

このように、本実施形態によれば、ユーザの内部状態に応じて引込現象の誘発戦略を動的に変更することができるので、引込現象の発現確率を高め、コミュニケーションをさらに活性化させることができる。 As described above, according to the present embodiment, the strategy for inducing the pull-in phenomenon can be changed dynamically according to the user's internal state, which raises the probability that the pull-in phenomenon occurs and further activates communication.

（第４の実施形態） (Fourth embodiment)
本発明の第４の実施形態について説明する。図５には、本実施形態に係るロボットシステム１０３の概略的な構成が示されている。ロボットシステム１０３は、発話マインド推定部８をさらに備える点と、インタラクション制御部３の動作とが、上記第３の実施形態に係るロボットシステム１０２と異なっており、その他の点は同じである。したがって、本実施形態では、上記第３の実施形態と重複する構成要素については、図４と同一の符号を付し、詳細な説明を省略する。 A fourth embodiment of the present invention will be described. FIG. 5 shows a schematic configuration of the robot system 103 according to the present embodiment. The robot system 103 differs from the robot system 102 according to the third embodiment in that it further includes an utterance mind estimation unit 8 and in the operation of the interaction control unit 3; it is otherwise the same. Therefore, in this embodiment, components that overlap with the third embodiment are denoted by the same reference numerals as in FIG. 4, and detailed description thereof is omitted.

図５に示されるように、発話マインド推定部８は、ユーザ内部状態推定部７から出力されるユーザの内部状態を入力する。発話マインド推定部８は、このユーザの内部状態に基づいて、ユーザが発話しようとする意思があるかないかを示す指標値（以下、「発話マインド」と呼ぶ）を、推定する。 As shown in FIG. 5, the utterance mind estimation unit 8 inputs the user's internal state output from the user internal state estimation unit 7. The utterance mind estimation unit 8 estimates an index value (hereinafter referred to as “utterance mind”) indicating whether or not the user has an intention to speak based on the internal state of the user.

ＳＩＲＤＢ４１、４２、４３…は、発話マインドが示す値の数だけ用意されており、それぞれが、いずれかの発話マインドの値に対応している。 There are as many SIRDBs 41, 42, 43, ... as there are possible values of the utterance mind, and each SIRDB corresponds to one of those values.

発話マインド推定部８は、ユーザ内部状態推定部７から出力されたユーザの内部状態に基づいて、発話マインドを推定する。発話マインドは、一般的に、ユーザがロボットシステム１０３（出力部５）に視線を向けて集中しているときや、緊張状態が高いときに、その値が高くなるように設定されている。例えば、発話しようとしていないとみられるときにはその値を０とし、発話しようとしているとみられるときには、その値を１とすることができる。 The utterance mind estimation unit 8 estimates the utterance mind based on the user's internal state output from the user internal state estimation unit 7. In general, the utterance mind is set so that its value is high when the user is looking at and concentrating on the robot system 103 (output unit 5), or when the user's tension is high. For example, the value can be set to 0 when the user does not appear to be about to speak, and to 1 when the user appears to be about to speak.
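As a rough illustration of the binary utterance mind described above, the sketch below returns 1 when the user is looking at the robot or the tension is high, and 0 otherwise. The function name, the numeric tension scale, and the default threshold of 0.7 are assumptions made for illustration; the patent gives no concrete estimation formula.

```python
def estimate_utterance_mind(gazing_at_robot: bool,
                            tension: float,
                            tension_threshold: float = 0.7) -> int:
    """Return 1 if the user appears about to speak, otherwise 0.

    `tension` is assumed to be normalized to [0, 1]; the threshold
    is a hypothetical value, not one given in the patent.
    """
    if gazing_at_robot or tension >= tension_threshold:
        return 1
    return 0
```

The returned value can then serve directly as the key for choosing among the SIRDBs prepared for each utterance mind value.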

発話マインドの推定結果は、インタラクション制御部３に出力される。インタラクション制御部３は、複数のＳＩＲＤＢ４１、４２、４３…の中から、ユーザの発話マインドに応じたＳＩＲＤＢを選択する。そして、インタラクション制御部３は、選択されたＳＩＲＤＢを参照し、そのインタラクションルールに従って、コミュニケーション場認識部１から出力されるコミュニケーションリズムと、コミュニケーション同調度合算出部２から出力されるコミュニケーション同調度合とに基づいて、ロボットアクションコマンドを決定する。出力部５は、そのロボットアクションコマンドに従って、そのときの発話マインドに応じたインタラクション動作を行う。 The estimated utterance mind is output to the interaction control unit 3. The interaction control unit 3 selects, from among the plurality of SIRDBs 41, 42, 43, ..., the SIRDB that corresponds to the user's utterance mind. The interaction control unit 3 then refers to the selected SIRDB and, in accordance with its interaction rules, determines a robot action command based on the communication rhythm output from the communication field recognition unit 1 and the communication synchronization degree output from the communication synchronization degree calculation unit 2. In accordance with that robot action command, the output unit 5 performs an interaction operation suited to the utterance mind at that time.

このように、本実施形態によれば、ユーザの発話マインドに応じて引込現象の誘発戦略を動的に変更することができるので、引込現象の発現確率を高め、コミュニケーションをさらに活性化させることができる。 As described above, according to the present embodiment, the strategy for inducing the pull-in phenomenon can be changed dynamically according to the user's utterance mind, which raises the probability that the pull-in phenomenon occurs and further activates communication.

（第５の実施形態） (Fifth embodiment)
本発明の第５の実施形態について説明する。このシステムは、複数のユーザを対象とし、ユーザ間のコミュニケーションの仲立ちをするために特に用いられる。図６には、本実施形態に係るロボットシステム１０４の概略的な構成が示されている。ロボットシステム１０４は、ユーザ間情報推定部９をさらに備える点と、インタラクション制御部３の動作とが、上記第３の実施形態に係るロボットシステム１０２と異なっており、その他の点は同じである。したがって、本実施形態では、上記第３の実施形態と重複する構成要素については、図３と同一の符号を付し、詳細な説明を省略する。 A fifth embodiment of the present invention will be described. This system is intended for a plurality of users and is used in particular to mediate communication between them. FIG. 6 shows a schematic configuration of the robot system 104 according to the present embodiment. The robot system 104 differs from the robot system 102 according to the third embodiment in that it further includes an inter-user information estimation unit 9 and in the operation of the interaction control unit 3; it is otherwise the same. Therefore, in this embodiment, components that overlap with the third embodiment are denoted by the same reference numerals as in FIG. 3, and detailed description thereof is omitted.

図６に示されるように、ユーザ間情報推定部９は、ユーザ内部状態推定部７から出力されるユーザの内部状態を入力する。ユーザ間情報推定部９は、ユーザの内部状態に基づいて、ユーザ間の社会的関係性を示すユーザ間情報を推定する。このようなユーザ間情報としては、例えば、ユーザ同士が親しい間柄であるか否かを示す指標値がある。例えば、ユーザが非常にリラックスしている場合には、相手が親しい間柄であると判断することができる。 As illustrated in FIG. 6, the inter-user information estimation unit 9 inputs the user internal state output from the user internal state estimation unit 7. The inter-user information estimation unit 9 estimates inter-user information indicating a social relationship between users based on the internal state of the user. Such inter-user information includes, for example, an index value that indicates whether or not the user has a close relationship. For example, when the user is very relaxed, it can be determined that the partner is close.

ＳＩＲＤＢ４１、４２、４３…は、ユーザ間情報に応じた数だけ用意されており、それぞれが、いずれかのユーザ間情報の状態に対応している。 There are as many SIRDBs 41, 42, 43, ... as there are states of the inter-user information, and each SIRDB corresponds to one of those states.

ユーザ間情報の推定結果は、インタラクション制御部３に出力される。インタラクション制御部３は、複数のＳＩＲＤＢ４１、４２、４３…の中から、ユーザ間情報に応じたＳＩＲＤＢを選択する。そして、インタラクション制御部３は、選択されたＳＩＲＤＢを参照し、そのインタラクションルールに従って、コミュニケーション場認識部１から出力されるコミュニケーションリズムと、コミュニケーション同調度合算出部２から出力されるコミュニケーション同調度合とに基づいて、インタラクション動作を決定する。出力部５は、そのロボットアクションコマンドに従って、そのときの発話マインドに応じたインタラクション動作を行う。 The estimated inter-user information is output to the interaction control unit 3. The interaction control unit 3 selects, from among the plurality of SIRDBs 41, 42, 43, ..., the SIRDB that corresponds to the inter-user information. The interaction control unit 3 then refers to the selected SIRDB and, in accordance with its interaction rules, determines the interaction operation based on the communication rhythm output from the communication field recognition unit 1 and the communication synchronization degree output from the communication synchronization degree calculation unit 2. In accordance with the robot action command, the output unit 5 performs an interaction operation suited to the utterance mind at that time.

このように、本実施形態によれば、ユーザ同士の関係に応じて引込現象の誘発戦略を動的に変更することができるので、引込現象の発現確率を高め、コミュニケーションをさらに活性化させることができる。 As described above, according to the present embodiment, the strategy for inducing the pull-in phenomenon can be changed dynamically according to the relationship between the users, which raises the probability that the pull-in phenomenon occurs and further activates communication.

（第６の実施形態） (Sixth embodiment)
本発明の第６の実施形態について説明する。図７には、本実施形態に係るロボットシステム１０５の概略的な構成が示されている。ロボットシステム１０５は、エピソード蓄積部１０と、エピソード記憶データベース（以下、「ＥＳＤＢ」と略述する）１１と、エピソード学習部１２と、をさらに備える点と、インタラクション制御部３の動作とが、上記第１の実施形態に係るロボットシステム１００と異なっており、その他の点は同じである。したがって、本実施形態では、上記第１の実施形態と重複する構成要素については、図１と同一の符号を付し、詳細な説明を省略する。 A sixth embodiment of the present invention will be described. FIG. 7 shows a schematic configuration of the robot system 105 according to the present embodiment. The robot system 105 differs from the robot system 100 according to the first embodiment in that it further includes an episode accumulation unit 10, an episode storage database (hereinafter abbreviated as "ESDB") 11, and an episode learning unit 12, and in the operation of the interaction control unit 3; it is otherwise the same. Therefore, in this embodiment, components that overlap with the first embodiment are denoted by the same reference numerals as in FIG. 1, and detailed description thereof is omitted.

図７に示されるように、エピソード蓄積部１０は、コミュニケーション場認識部１から出力されるコミュニケーションリズムと、コミュニケーション同調度合算出部２から出力されるコミュニケーション同調度合と、インタラクション制御部３から出力されるロボットアクションコマンドとを入力する。エピソード蓄積部１０は、コミュニケーションリズムと、コミュニケーション同調度合と、ロボットアクションコマンドとを、ＥＳＤＢ１１に蓄積する。 As shown in FIG. 7, the episode accumulation unit 10 receives the communication rhythm output from the communication field recognition unit 1, the communication synchronization degree output from the communication synchronization degree calculation unit 2, and the robot action command output from the interaction control unit 3. The episode accumulation unit 10 accumulates the communication rhythm, the communication synchronization degree, and the robot action command in the ESDB 11.

ＥＳＤＢ１１は、コミュニケーションリズム及びコミュニケーション同調度合と、ロボットアクションコマンドとの関係を記憶するデータベースである。より具体的には、ＥＳＤＢ１１は、コミュニケーションリズム及びコミュニケーション同調度合と、それらに基づいて探索されたロボットアクションコマンドと、を関連付けて記憶する。さらに、ＥＳＤＢ１１は、その動作命令に基づくインタラクション制御部３の下で行われた出力部５のインタラクション動作に対するユーザの反応としてのコミュニケーションリズム及びコミュニケーション同調度合と、を関連付けて記憶する。 The ESDB 11 is a database that stores the relationship between the communication rhythm and communication synchronization level and the robot action command. More specifically, the ESDB 11 stores the communication rhythm and the communication synchronization degree and the robot action commands searched based on them in association with each other. Further, the ESDB 11 stores the communication rhythm and the communication tuning degree as the user's reaction to the interaction operation of the output unit 5 performed under the interaction control unit 3 based on the operation command in association with each other.

例えば、ある時刻ｔ（ｔは、任意の正の実数）におけるインタラクション動作について考える。前提として、ロボットシステム１０５では、時刻ｔにおけるインタラクション動作は、時刻ｔ−ｂ（ｂは、正の実数）におけるコミュニケーションリズム及びコミュニケーション同調度合に基づいて決定されたロボットアクションコマンドによるものであるとする。また、時刻ｔにおけるインタラクション動作に対するユーザの反応は、時刻ｔ＋ａ（ａは、正の実数）におけるコミュニケーション場にて認識されるものであるとする。この場合、ＥＳＤＢ１１には、時刻ｔにおけるロボットアクションコマンドと、時刻ｔ＋ａにおけるコミュニケーションリズム及びコミュニケーション同調度合と、時刻ｔ−ｂにおけるコミュニケーションリズム及びコミュニケーション同調度合とが、関連づけて記憶される。 For example, consider the interaction operation at a certain time t (t is an arbitrary positive real number). As a premise, in the robot system 105, the interaction operation at time t results from a robot action command determined based on the communication rhythm and the communication synchronization degree at time t − b (b is a positive real number). The user's reaction to the interaction operation at time t is assumed to be recognized in the communication field at time t + a (a is a positive real number). In this case, the ESDB 11 stores, in association with one another, the robot action command at time t, the communication rhythm and communication synchronization degree at time t + a, and the communication rhythm and communication synchronization degree at time t − b.

エピソード学習部１２は、ＥＳＤＢ１１を参照し、ＳＩＲＤＢ４に記憶されたインタラクションルールを調整する。例えば、エピソード学習部１２は、時刻ｔのロボットアクションコマンドに関連づけられた時刻ｔ−ｂにおけるコミュニケーション同調度合に対して、時刻ｔ＋ａにおけるコミュニケーション同調度合が低下している場合には、他のインタラクション動作が決定されるように、ＳＩＲＤＢ４のインタラクションルールを変更する。 The episode learning unit 12 refers to the ESDB 11 and adjusts the interaction rules stored in the SIRDB 4. For example, when the communication synchronization degree at time t + a is lower than the communication synchronization degree at time t − b associated with the robot action command at time t, the episode learning unit 12 changes the interaction rules of the SIRDB 4 so that a different interaction operation is determined.
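The before/after comparison that drives this adjustment can be sketched as follows. The `Episode` record and the helper `commands_to_revise` are hypothetical names; the sketch only flags commands whose episodes show a drop in the synchronization degree and does not attempt to model how the rules in the SIRDB 4 are actually rewritten, which the patent leaves open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    command: str        # robot action command issued at time t
    sync_before: float  # communication synchronization degree at time t - b
    sync_after: float   # communication synchronization degree at time t + a


def commands_to_revise(episodes: List[Episode]) -> List[str]:
    """Return commands after which the synchronization degree decreased."""
    return [e.command for e in episodes if e.sync_after < e.sync_before]
```

For instance, an episode recorded as `Episode("ask_question", 0.6, 0.4)` would be flagged, while one whose degree rose or stayed the same would not.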

エピソード学習部１２は、このように、ＳＩＲＤＢ４のインタラクションルールを繰り返し変更する。この繰り返しの結果、コミュニケーションリズム及びコミュニケーション同調度合と、インタラクション動作との関係が学習され、コミュニケーション同調度合が効率良く高くなるように、ＳＩＲＤＢ４におけるインタラクションルールが最適化される。 In this way, the episode learning unit 12 repeatedly changes the interaction rule of the SIRDB 4. As a result of this repetition, the relationship between the communication rhythm and the communication tuning degree and the interaction operation is learned, and the interaction rule in the SIRDB 4 is optimized so that the communication tuning degree is efficiently increased.

なお、ユーザの緊張状態が推定可能であれば、エピソード学習部１２による学習が、ユーザの緊張状態が低下しているか否かを基準として行われるようにしてもよい。 If the user's tension state can be estimated, learning by the episode learning unit 12 may be performed based on whether or not the user's tension state is reduced.

このように、本実施形態によれば、実際のコミュニケーションの実績に基づいてインタラクションルールが最適化され、最適化されたインタラクションルールの下でコミュニケーションが行われる。これにより、引込現象の発現確率を高め、コミュニケーションをさらに活性化させることができる。 As described above, according to the present embodiment, the interaction rules are optimized based on the record of actual communication, and communication is carried out under the optimized interaction rules. This raises the probability that the pull-in phenomenon occurs and further activates communication.

（第７の実施形態） (Seventh embodiment)
本発明の第７の実施形態について説明する。図８には、本実施形態に係るロボットシステム１０６の概略的な構成が示されている。ロボットシステム１０６は、ユーザパーソナリティ情報データベース（以下、「ＵＰＩＤＢ」と略述する）１３をさらに備える点と、インタラクション制御部３の動作とが、上記第１の実施形態に係るロボットシステム１００と異なっており、その他の点は同じである。したがって、本実施形態では、上記第１の実施形態と重複する構成要素については、図１と同一の符号を付し、詳細な説明を省略する。 A seventh embodiment of the present invention will be described. FIG. 8 shows a schematic configuration of the robot system 106 according to the present embodiment. The robot system 106 differs from the robot system 100 according to the first embodiment in that it further includes a user personality information database (hereinafter abbreviated as "UPIDB") 13 and in the operation of the interaction control unit 3; it is otherwise the same. Therefore, in this embodiment, components that overlap with the first embodiment are denoted by the same reference numerals as in FIG. 1, and detailed description thereof is omitted.

ＵＰＩＤＢ１３には、ユーザの個人情報が蓄えられている。このような情報には、ユーザ個人の氏名、出身地、職歴、趣味といった個人情報や、ユーザの社会的スキルや心理分析結果といったユーザの能力や性格に関する情報などが含まれる。ＵＰＩＤＢ１３に蓄えられた情報は、インタラクション制御部３によって参照され、インタラクション制御部３がインタラクション動作を決定するために用いられる。 The UPIDB 13 stores user personal information. Such information includes personal information such as the name, birthplace, work history, and hobbies of the individual user, and information regarding the user's ability and personality such as the user's social skills and psychological analysis results. Information stored in the UPIDB 13 is referred to by the interaction control unit 3 and is used by the interaction control unit 3 to determine an interaction operation.

ＳＩＲＤＢ４におけるインタラクションルールは、ユーザの個人情報に応じてインタラクション動作が異なるようなルールとなっており、コミュニケーションリズム及びコミュニケーション同調度合が同じであっても、ユーザが異なっていれば、その結果行われるインタラクション動作は異なったものとなる可能性がある。 The interaction rule in SIRDB4 is a rule in which the interaction operation differs depending on the personal information of the user. Even if the communication rhythm and the communication synchronization degree are the same, if the user is different, the resulting interaction is performed. The behavior can be different.

このように、本実施形態によれば、ユーザの個人情報に応じて引込現象の誘発戦略を動的に変更することができるので、引込現象の発現確率を高め、コミュニケーションをさらに活性化させることができる。 As described above, according to the present embodiment, the strategy for inducing the pull-in phenomenon can be changed dynamically according to the user's personal information, which raises the probability that the pull-in phenomenon occurs and further activates communication.

次に、本発明のさらなる詳細な実施例について図面を参照して説明する。 Next, more detailed examples of the present invention will be described with reference to the drawings.
（第１の実施例） (First example)
まず、本発明の第１の実施例について説明する。本実施例は、上記第２の実施形態に係るロボットシステム１０１（図３参照）に対応するものである。 First, a first example of the present invention will be described. This example corresponds to the robot system 101 (see FIG. 3) according to the second embodiment.

前提として、本実施例に係るロボットシステム１０１が適用されるコミュニケーション場について説明する。図９（Ａ）に示されるように、このコミュニケーション場では、２人のユーザＨ１、Ｈ２が、テーブル３０を挟んで向かい合っており、会話できる状態となっている。本実施例に係るロボットシステム１０１は、このユーザＨ１、Ｈ２のコミュニケーションを円滑に進めるための支援を行う。 As a premise, a communication field to which the robot system 101 according to the present embodiment is applied will be described. As shown in FIG. 9A, in this communication field, two users H1 and H2 face each other across the table 30 and are in a state where they can talk. The robot system 101 according to the present embodiment provides support for smoothly promoting communication between the users H1 and H2.

このユーザＨ１、Ｈ２は初対面である。したがって、本実施例で、ロボットシステム１０１により実行されるのは、初対面紹介タスクともいうべきものである。 The users H1 and H2 are meeting for the first time. Accordingly, what the robot system 101 carries out in this example can be called a first meeting introduction task.

ロボットシステム１０１の出力部５は、ディスプレイである。この出力部５の画面上には、図９（Ｂ）に示されるような、人物像であるエージェントＲが表示されている。このエージェントＲは、ＣＧ（コンピュータグラフィックス）によって、様々なインタラクション動作を行うことができるようになっている。ユーザＨ１、Ｈ２は、出力部５の画面上に表示されたエージェントＲのインタラクション動作を見ることができる。 The output unit 5 of the robot system 101 is a display. On the screen of the output unit 5, an agent R that is a person image as shown in FIG. 9B is displayed. The agent R can perform various interaction operations by CG (computer graphics). The users H1 and H2 can see the interaction operation of the agent R displayed on the screen of the output unit 5.

図９（Ａ）に示されるように、ユーザＨ１、Ｈ２の胸元には、それぞれマイク３１が付けられ、その頭頂部には、加速度センサ３２が取り付けられている。また、テーブル上には、ユーザＨ１、Ｈ２を撮像するためのカメラ３３、３４がそれぞれ２台ずつ設置されている。マイク３１、加速度センサ３２、カメラ３３、３４によって、コミュニケーション場認識部１の一部が構成されている。 As shown in FIG. 9A, a microphone 31 is attached to each of the chests of the users H1 and H2, and an acceleration sensor 32 is attached to the top of the head. In addition, two cameras 33 and 34 for capturing images of the users H1 and H2 are installed on the table. The microphone 31, the acceleration sensor 32, and the cameras 33 and 34 constitute a part of the communication field recognition unit 1.

本実施例では、コミュニケーション場認識部１は、マイク３１の出力に基づいて、ユーザＨ１、Ｈ２の音声データを検出し、加速度センサ３２の出力に基づいて、ユーザＨ１、Ｈ２の頷きを検出し、カメラ３３、３４の出力画像に基づいて、ユーザＨ１、Ｈ２の顔や視線の向きなど、ユーザＨ１、Ｈ２の身体動作を検出する。コミュニケーション場認識部１は、これらのセンシング結果に基づいて、コミュニケーションリズムを認識する。 In this example, the communication field recognition unit 1 detects the voice data of the users H1 and H2 based on the output of the microphones 31, detects the nods of the users H1 and H2 based on the output of the acceleration sensors 32, and detects the body movements of the users H1 and H2, such as the orientation of their faces and gaze, based on the images output by the cameras 33 and 34. The communication field recognition unit 1 recognizes the communication rhythm based on these sensing results.

なお、本実施例では、２人のユーザＨ１、Ｈ２の頷き、視線、顔の向き、指示といった基本動作及び発話動作を、以下の関数に基づいて定義する。これらの関数の値は、その関数の右側に記載された動作（上記センシング結果より検出された動作）が行われれば１となり、動作が行われなければ０となる。本実施例では、これらの関数に基づいてコミュニケーションリズムが認識される。 In this example, the basic actions of the two users H1 and H2, such as nodding, gaze, face orientation, and pointing, as well as their utterance actions, are defined by the following functions. Each function takes the value 1 if the action described to its right (an action detected from the sensing results above) is performed, and 0 if it is not. In this example, the communication rhythm is recognized based on these functions.

・Nod(H1,t)：H1が時刻tに頷く。
・Utterance(H1,t)：H1が時刻tに発話する。
・Utterance(H1→H2,t)：H1がH2に対して時刻tに発話する。
・TerminateUtterance(H1, t)：H1が時刻tに発話を終了する。
・Gaze(H1→H2,t)：H1がH2に時刻tに視線を向けている。
・Face(H1→H2,t)：H1がH2に時刻tに顔を向けている。
・Gaze(H1⇔H2,t)：H1とH2が時刻tに同時に視線を向けている（視線一致状態）。
・Face(H1⇔H2,t)：H1とH2が時刻tに同時に顔を向けている（対面状態）。
・TurnGaze(R,H1→H2,t)：RがH1をH2の方に時刻tに視線を向かせる。
・TurnUtterance(R,H1→H2,t)：RがH1をH2の方に時刻tに発話させる。
・Direct(H1→H2,t)：H1がH2の方向を時刻tに指示する。
・SilentTime(H1,t)：H1の時刻tにおける無音区間
・UtterancePower(H1,t)：H1の時刻ｔにおける発生音の音量。

・Nod(H1,t): H1 nods at time t.
・Utterance(H1,t): H1 speaks at time t.
・Utterance(H1→H2,t): H1 speaks to H2 at time t.
・TerminateUtterance(H1,t): H1 finishes speaking at time t.
・Gaze(H1→H2,t): H1 is looking at H2 at time t.
・Face(H1→H2,t): H1 is facing H2 at time t.
・Gaze(H1⇔H2,t): H1 and H2 are looking at each other at time t (mutual gaze state).
・Face(H1⇔H2,t): H1 and H2 are facing each other at time t (face-to-face state).
・TurnGaze(R,H1→H2,t): R directs H1's gaze toward H2 at time t.
・TurnUtterance(R,H1→H2,t): R prompts H1 to speak to H2 at time t.
・Direct(H1→H2,t): H1 points in the direction of H2 at time t.
・SilentTime(H1,t): silent interval of H1 at time t.
・UtterancePower(H1,t): volume of the sound produced by H1 at time t.

上記各関数の引数は、その動作の主体とその動作が行われた時刻を示す。なお、エージェントＲのインタラクション動作についてもこの関数で表現することができる。 The argument of each function indicates the subject of the operation and the time when the operation was performed. Note that the interaction operation of the agent R can also be expressed by this function.
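The 0/1 functions listed above can be modeled as membership tests on a log of detected events. The sketch below assumes discrete integer time steps and an event tuple layout of `(kind, source, target, time)`; both are illustrative choices. It also assumes that the mutual state Gaze(H1⇔H2,t) holds exactly when both directional gazes hold at the same time step, which is one plausible reading of the definition rather than something the patent states as a formula.

```python
from typing import Set, Tuple

# (kind, source, target, time step); layout is a hypothetical convention.
Event = Tuple[str, str, str, int]


def gaze(events: Set[Event], src: str, dst: str, t: int) -> int:
    """Gaze(src→dst, t): 1 if src is looking at dst at time step t, else 0."""
    return int(("gaze", src, dst, t) in events)


def mutual_gaze(events: Set[Event], a: str, b: str, t: int) -> int:
    """Gaze(a⇔b, t): 1 only if both directional gazes hold at time step t."""
    return gaze(events, a, b, t) * gaze(events, b, a, t)
```

The same pattern extends to Face, Nod, and Utterance by changing the event kind; the agent R can appear as a source or target in the same way as a user.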

コミュニケーション場認識部１は、センシング結果に基づいて、上記各関数の値を求め、これらの関数に基づいて、コミュニケーションリズムを認識する。認識されたコミュニケーションリズムは、コミュニケーション同調度合算出部２及びインタラクション制御部３に出力される。 The communication field recognition unit 1 obtains the value of each function based on the sensing result, and recognizes the communication rhythm based on these functions. The recognized communication rhythm is output to the communication tuning degree calculation unit 2 and the interaction control unit 3.

コミュニケーション同調度合算出部２は、これらコミュニケーションリズムに基づいて、時刻ｔにおけるコミュニケーション同調度合としての評価関数Eval(t)を、算出する。コミュニケーション同調度合Eval(t)は、ユーザＨ１、Ｈ２の発話パワーの平均値、視線のやりとりの回数、頷き回数など、コミュニケーションリズムの線形加重和により表されるが、本実施例では、後述する４つのコミュニケーションモードに対応する４つの評価関数Eval(t)[１]〜Eval(t)[４]を算出する。 Based on these communication rhythms, the communication synchronization degree calculation unit 2 calculates an evaluation function Eval(t) as the communication synchronization degree at time t. The communication synchronization degree Eval(t) is expressed as a linear weighted sum of communication rhythm quantities such as the average utterance power of the users H1 and H2, the number of gaze exchanges, and the number of nods. In this example, four evaluation functions Eval(t)[1] to Eval(t)[4], corresponding to the four communication modes described later, are calculated.

ここで、γ、δは、正規化パラメータであり、式（１）と式（３）とで、γ、δの値は異なる。上記式（１）は、エージェントＲとユーザＨ１との間、エージェントＲとユーザＨ２との間で、それぞれの引込現象が発現されたか否かを評価するための評価関数である。上記式（２）は、ユーザＨ１、Ｈ２が向き合って対話を始めたか否かを評価するための評価関数である。上記式（３）、式（４）は、エージェントＲと２人のユーザＨ１、Ｈ２のスムーズな会話が確立されたか否かを評価するための評価関数である。

Here, γ and δ are normalization parameters, and the values of γ and δ are different between Expression (1) and Expression (3). The above equation (1) is an evaluation function for evaluating whether or not the respective pull-in phenomenon has occurred between the agent R and the user H1, and between the agent R and the user H2. The above equation (2) is an evaluation function for evaluating whether or not the users H1 and H2 have faced each other and started a conversation. The above formulas (3) and (4) are evaluation functions for evaluating whether a smooth conversation between the agent R and the two users H1 and H2 has been established.
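Equations (1) to (4) themselves are not reproduced in this text, so the following is not a reconstruction of them. It is only a generic sketch of a linear weighted sum over communication rhythm quantities, the general form the description attributes to Eval(t). The feature names and weight values are hypothetical.

```python
from typing import Dict


def eval_sync(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear weighted sum of communication rhythm features.

    Only features that have a weight contribute; names and weights
    are illustrative and do not come from the patent's equations.
    """
    return sum(weights[name] * features[name] for name in weights)


# Example with made-up values.
features = {"mean_utterance_power": 0.5, "gaze_exchanges": 4.0, "nods": 2.0}
weights = {"mean_utterance_power": 2.0, "gaze_exchanges": 0.25, "nods": 0.5}
score = eval_sync(features, weights)
```

In such a formulation, normalization parameters like γ and δ would play the role of weights that bring differently scaled quantities onto a comparable range.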

なお、後述するように、コミュニケーションモードが話題提供状態（話題を提供する会話の初期段階）となっているときには、ユーザＨ１、Ｈ２のお互いの反応を、詳細にチェックする必要があるため、コミュニケーション同調度合算出部２は、コミュニケーション同調度合Eval(t)[２]のほか、次式で示されるエージェントＲがユーザＨ２の話題情報をユーザＨ１に知らせたときの反応度React(H2→H1,t)と、ユーザＨ１の話題情報をユーザＨ２に知らせたときのユーザＨ２の反応度React(H1→H2,t)とを、同じくコミュニケーション同調度合として算出する。 As described later, when the communication mode is the topic providing state (the initial stage of conversation, in which a topic is provided), the reactions of the users H1 and H2 to each other must be checked in detail. Therefore, in addition to the communication synchronization degree Eval(t)[2], the communication synchronization degree calculation unit 2 also calculates, as communication synchronization degrees, the reactivity React(H2→H1,t) observed when the agent R informs the user H1 of topic information about the user H2, and the reactivity React(H1→H2,t) of the user H2 when the agent R informs the user H2 of topic information about the user H1, both given by the following equations.

ここで、α、βは、正規化パラメータである。
React(H2→H1,t)、React(H1→H2,t)は、エージェントＲによる話題提供が、ユーザＨ１、Ｈ２のコミュニケーションのきっかけとして成り得たか否かを評価するための評価関数である。

Here, α and β are normalization parameters.
React(H2→H1,t) and React(H1→H2,t) are evaluation functions for evaluating whether the topic provided by the agent R succeeded in serving as a trigger for communication between the users H1 and H2.

算出されたコミュニケーション同調度合Eval(t)は、コミュニケーションモード決定部６に出力される。 The calculated communication synchronization degree Eval (t) is output to the communication mode determination unit 6.

本実施例では、５つのコミュニケーションモードが用意されている。図１０には、５つのコミュニケーションモードの遷移図が示されている。この遷移図によって初対面紹介タスクが表現される。図１０に示されるように、本実施例では、初期状態に加え、挨拶／初対面状態、話題提供状態、話題掘り下げ状態、話題盛り上げ状態の４つのコミュニケーションモードが用意されている。 In this embodiment, five communication modes are prepared. FIG. 10 shows transition diagrams of five communication modes. This transition diagram represents the first meeting introduction task. As shown in FIG. 10, in this embodiment, in addition to the initial state, four communication modes are prepared: a greeting / first meeting state, a topic providing state, a topic digging state, and a topic excitement state.

初期状態は、エージェントＲと、初対面である２人のユーザＨ１、Ｈ２が、同じコミュニケーション場に集まる前のコミュニケーションモードである。 The initial state is a communication mode before the agent R and the two users H1 and H2 who meet for the first time gather in the same communication area.

挨拶／初対面状態は、初対面である２人のユーザＨ１、Ｈ２が互いに挨拶をかわし、会話を開始する際のコミュニケーションモードである。この状態では、コミュニケーション同調度合は低く、ほぼ０に近い状態である。 The greeting / first meeting state is the communication mode in which the two users H1 and H2, who are meeting for the first time, exchange greetings and begin a conversation. In this state, the communication synchronization degree is low, close to 0.

話題提供状態は、２人のユーザＨ１、Ｈ２が向き合って対話させることを目的として話題を提供し、会話の端緒を作り出すときの状態である。この状態では、挨拶／初対面状態よりも、コミュニケーション同調度合が少し高まっている。 The topic providing state is a state in which a topic is provided for the purpose of allowing the two users H1 and H2 to face each other and have a conversation, thereby creating the beginning of the conversation. In this state, the degree of communication synchronization is slightly higher than that in the greeting / first meeting state.

話題掘り下げ状態は、エージェントＲと２人のユーザＨ１、Ｈ２のスムーズな会話の発生を目指すために、提供された話題を掘り下げていくときの状態である。この状態では、話題提供状態よりも、コミュニケーション同調度合が高まっている。 The topic digging state is a state when the provided topic is digged down in order to aim for a smooth conversation between the agent R and the two users H1 and H2. In this state, the degree of communication synchronization is higher than the topic provision state.

話題盛り上げ状態は、掘り下げられた話題を掘り下げていった結果、コミュニケーション同調度合が極めて高くなり、コミュニケーションリズムが共有化された状態である。 The topic excitement state is the state in which, as a result of digging further into the topic, the communication synchronization degree has become extremely high and the communication rhythm is shared.

図１０に示されるように、コミュニケーションモードは、コミュニケーション同調度合が高まるにつれて、初期状態から挨拶／初対面状態に遷移し、さらに話題提供状態へと遷移する。その後、コミュニケーションモードは、コミュニケーション同調度合に応じて、話題提供状態と、話題掘り下げ状態と、話題盛り上げ状態との間を、遷移する。 As shown in FIG. 10, the communication mode changes from the initial state to the greeting / first meeting state and further to the topic providing state as the communication synchronization level increases. Thereafter, the communication mode transitions between a topic providing state, a topic digging state, and a topic excitement state according to the degree of communication synchronization.

初対面紹介タスクにおいて、最も望ましい流れは、コミュニケーションモードが、挨拶／初対面状態→話題提供状態→話題掘り下げ状態→話題盛り上げ状態と遷移する流れである。話題盛り上げ状態となり、その状態でタスク終了条件が満たされると、ロボットシステム１０１は、その役割が完了したものとして、初対面紹介タスクを終了させる。 In the initial meeting introduction task, the most desirable flow is a flow in which the communication mode transitions from greeting / first meeting state → topic providing state → topic digging state → topic excitement state. When the topic is in a lively state and the task end condition is satisfied in this state, the robot system 101 ends the initial meeting introduction task, assuming that the role has been completed.

コミュニケーションモード決定部６の動作について説明する。２人のユーザＨ１、Ｈ２が集まり、カメラ３３、３４により、両者の存在が検出されると、コミュニケーションモード決定部６は、コミュニケーションモードを、挨拶／初対面状態へと遷移させる。 The operation of the communication mode determination unit 6 will be described. When the two users H1 and H2 gather and the presence of both is detected by the cameras 33 and 34, the communication mode determination unit 6 changes the communication mode to the greeting / first meeting state.

その後、コミュニケーションモード決定部６は、コミュニケーション同調度合Eval(t)[１]〜Eval(t)[４]を、所定の閾値Th_Eval[１]、Th_Eval[２]、Th_Eval[３]、Th_Eval[４]と比較して、その比較結果に基づいて、コミュニケーションモードを決定する。これにより、コミュニケーションモードが図１０に示されるように遷移する。なお、それぞれの閾値の関係は、Th_Eval[４]＞Th_Eval[３]＞Th_Eval[２]＞Th_Eval[１]となっている。 Thereafter, the communication mode determination unit 6 compares the communication synchronization degrees Eval(t)[1] to Eval(t)[4] with predetermined thresholds Th_Eval[1], Th_Eval[2], Th_Eval[3], and Th_Eval[4], and determines the communication mode based on the comparison results. As a result, the communication mode transitions as shown in FIG. 10. The thresholds satisfy Th_Eval[4] > Th_Eval[3] > Th_Eval[2] > Th_Eval[1].

挨拶／初対面状態から、話題提供状態への遷移条件は、以下の式で示される。 The transition condition from the greeting / first meeting state to the topic providing state is expressed by the following equation.
Eval(t)[１]＝1（＝Th_Eval[１]） …（７）
この遷移条件が満たされたということは、上記式（１）に示されるように、エージェントＲとユーザＨ１との間、エージェントＲとユーザＨ２との間で、それぞれの引込現象が発現したことを示している。 This transition condition being satisfied means that, as shown in equation (1) above, the pull-in phenomenon has occurred both between the agent R and the user H1 and between the agent R and the user H2.

コミュニケーションモード決定部６は、React(H2→H1)及びReact(H1→H2)を、一定閾値Th_Reactと比較する。話題提供状態から話題掘り下げ状態への遷移条件は、以下の式のようになる。 The communication mode determination unit 6 compares React(H2→H1) and React(H1→H2) with a fixed threshold Th_React. The transition condition from the topic providing state to the topic digging state is as follows.
React(H2→H1)∧React(H1→H2)≧Th_ReactかつEval(t)[２]≧Th_Eval[２] …（８）
React(H2→H1) ∧ React(H1→H2) ≧ Th_React and Eval(t)[2] ≧ Th_Eval[2] …(8)

この遷移条件が満たされたということは、上記式（２）、式（５）、式（６）に示されるように、エージェントＲによる話題提供が成功し、２人のユーザＨ１、Ｈ２が向き合って対話を始めたことを示している。 The fact that the transition condition is satisfied means that the topic provision by the agent R succeeds and the two users H1 and H2 face each other as shown in the above formulas (2), (5), and (6). Indicates that the conversation has begun.

話題掘り下げ状態から話題盛り上げ状態への遷移条件は、以下の式のようになる。 The transition condition from the topic digging state to the topic excitement state is as follows.
Eval(t)[３]≧Th_Eval[３] …（９）

この遷移条件が満たされたということは、上記式（３）に示されるように、エージェントＲと２人のユーザＨ１、Ｈ２のスムーズな会話が確立されたことを示している。 The fact that the transition condition is satisfied indicates that a smooth conversation between the agent R and the two users H1 and H2 has been established as shown in the above equation (3).

初対面紹介タスク終了条件は、以下の式のようになる。 The end condition of the first meeting introduction task is as follows.
Eval(t)[４]≧Th_Eval[４] …（１０）
コミュニケーションモード決定部６は、このように、遷移条件が満たされたか否かを判定することにより、コミュニケーションモードを遷移させる。 In this way, the communication mode determination unit 6 changes the communication mode by determining whether each transition condition is satisfied.
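Conditions (7) to (10) can be sketched as a small state machine. This is an illustration under several assumptions: the names `Mode` and `next_mode` are hypothetical; condition (7) is tested with `>=` on the assumption that Eval(t)[1] never exceeds Th_Eval[1] = 1; the expression React(H2→H1) ∧ React(H1→H2) ≧ Th_React is read as requiring both reactivities to reach the threshold; and only the forward transitions are modeled, because the text gives no conditions for moving back between the topic states.

```python
from enum import Enum
from typing import Dict


class Mode(Enum):
    GREETING = "greeting / first meeting"
    TOPIC_PROVIDING = "topic providing"
    TOPIC_DIGGING = "topic digging"
    TOPIC_EXCITEMENT = "topic excitement"
    TASK_FINISHED = "task finished"


def next_mode(mode: Mode, ev: Dict[int, float], th: Dict[int, float],
              react_h2_h1: float, react_h1_h2: float, th_react: float) -> Mode:
    """Apply the forward transition conditions (7) to (10).

    ev[i] is Eval(t)[i] and th[i] is Th_Eval[i], for i = 1..4.
    """
    if mode is Mode.GREETING:
        # Condition (7): Eval(t)[1] reaches Th_Eval[1].
        return Mode.TOPIC_PROVIDING if ev[1] >= th[1] else mode
    if mode is Mode.TOPIC_PROVIDING:
        # Condition (8): both reactivities and Eval(t)[2] reach their thresholds.
        both_reacted = min(react_h2_h1, react_h1_h2) >= th_react
        return Mode.TOPIC_DIGGING if both_reacted and ev[2] >= th[2] else mode
    if mode is Mode.TOPIC_DIGGING:
        # Condition (9).
        return Mode.TOPIC_EXCITEMENT if ev[3] >= th[3] else mode
    if mode is Mode.TOPIC_EXCITEMENT:
        # Condition (10): the task end condition.
        return Mode.TASK_FINISHED if ev[4] >= th[4] else mode
    return mode
```

Calling `next_mode` once per evaluation cycle with the latest Eval and React values advances the mode by at most one step, which matches the one-stage-at-a-time flow described for the first meeting introduction task.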

続いて、インタラクション制御部３の動作について説明する。インタラクション制御部３は、決定されたコミュニケーションモードに対応するＳＩＲＤＢを選択する。そして、インタラクション制御部３は、選択されたＳＩＲＤＢのインタラクションルールに従って、出力部５のインタラクション動作を制御する。 Next, the operation of the interaction control unit 3 will be described. The interaction control unit 3 selects the SIRDB corresponding to the determined communication mode. Then, the interaction control unit 3 controls the interaction operation of the output unit 5 in accordance with the selected SIRDB interaction rule.

挨拶／初対面状態に対応するＳＩＲＤＢでは、エージェントＲとユーザＨ１、エージェントＲとユーザＨ２のスムーズな会話の発生を目指してエージェントＲが各種インタラクション動作を行うようなインタラクションルールが定められている。より具体的には、このインタラクションルールは、エージェントＲが、自発的にユーザＨ１、Ｈ２に話しかけるなどの発話誘導などを行い、会話リズムを生成させるように定められている。エージェントＲがこのような行動をとることより、エージェントＲとユーザＨ１、エージェントＲとユーザＨ２における１対１の引込現象が発現しやすくなり、コミュニケーションモードを話題提供状態に遷移させやすくなる。 In the SIRDB corresponding to the greeting / first meeting state, interaction rules are defined such that the agent R performs various interaction operations with the aim of smooth conversation between the agent R and the user H1, and between the agent R and the user H2. More specifically, the interaction rule is defined so that the agent R generates a conversation rhythm by performing utterance induction such as speaking to the users H1 and H2 voluntarily. Since the agent R takes such an action, a one-to-one pull-in phenomenon between the agent R and the user H1 and between the agent R and the user H2 is likely to occur, and the communication mode is easily shifted to the topic providing state.

In the SIRDB corresponding to the topic providing state, interaction rules are defined such that the agent R gives information about the user H1 to the user H2 and conveys information about the user H2 to the user H1. This SIRDB also defines interaction rules that guide the two users to face each other and talk, for example by having both state their opinions on the same topic or by having the agent R perform gaze guidance. This guidance makes it possible to solve the problem, common between users meeting for the first time, that communication stalls because there is nothing to start a conversation, and makes it easier for the communication mode to transition to the topic digging state.

In the SIRDB corresponding to the topic digging state, interaction rules are defined such that the agent R poses questions and joins in the topic that the users H1 and H2 are discussing. When the agent R acts in this way, it becomes easier for the communication mode to transition to the topic excitement state.

In the topic excitement state, interaction rules are defined such that the agent R attunes itself to the situation as a listener, nodding and giving backchannel responses as appropriate. Care is thereby taken that the agent R does not interfere excessively with the two users, whose conversation is already lively.

FIG. 11 shows a basic example of interaction rules. The interaction rules shown in FIG. 11 consist of the following three rules.
- [Rule 1] Nod synchronization rule: if the other party nods, nod immediately in response.
- [Rule 2] Utterance timing rule: speak if a silent interval continues for a fixed time (0.45 seconds) or longer and the last speech data is judged to be the end of a sentence.
- [Rule 3] Nod/utterance timing rule according to the other party's utterance: "if a silent interval continues for a fixed time (0.45 seconds) or longer and it is not the end of a sentence, nod with a probability of 20%", or "if a silent interval continues for a fixed time (0.45 seconds) or longer, speak with a probability of 80% even if it is not the end of a sentence".
Whether or not the last speech data is the end of a sentence can be judged by performing morphological analysis on it and detecting whether it contains a part of speech that often appears at the end of a sentence, such as a particle or a sentence-final particle.
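
Rules 1 to 3 can be sketched as a single decision function. This is only an illustration under stated assumptions: the function name `decide_action`, its parameters, and the priority given to Rule 1 are hypothetical, and the two halves of Rule 3 are read as one 20%/80% split between nodding and speaking. The sentence-end flag is assumed to come from a separate morphological analysis step that is not shown.

```python
import random

SILENCE_THRESHOLD_SEC = 0.45  # fixed silent interval named in Rules 2 and 3
NOD_PROBABILITY = 0.20        # Rule 3: chance of nodding when not at a sentence end


def decide_action(partner_nodded, silence_sec, is_sentence_end, rng=random):
    """Return "nod", "speak", or None (stay neutral) for one time step."""
    # Rule 1: nod back immediately; giving it priority is an assumption.
    if partner_nodded:
        return "nod"
    # Rules 2 and 3 both require a sufficiently long silent interval.
    if silence_sec < SILENCE_THRESHOLD_SEC:
        return None
    # Rule 2: long silence at the end of a sentence leads to an utterance.
    if is_sentence_end:
        return "speak"
    # Rule 3: long silence mid-sentence, nod (20%) or speak (80%).
    return "nod" if rng.random() < NOD_PROBABILITY else "speak"
```

Passing the random source as `rng` keeps the probabilistic branch of Rule 3 reproducible when the function is exercised with a seeded or stubbed generator.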

When no robot action command is input, the agent R of the output unit 5 stays in the neutral position shown in FIG. 12(A). When a robot action command is output from the interaction control unit 3, the agent R performs one of the interaction operations shown in FIGS. 12(B) to 12(D): an utterance, a nod, or a gesture.

FIGS. 13(A) to 13(E) show how the presence or absence of nods, the amount of speech, the degree of gaze matching, the communication tuning degree, and the communication mode change over time. (1) to (4) in FIG. 13(E) denote the greeting/first-meeting state, the topic providing state, the topic digging state, and the topic excitement state, respectively. As FIGS. 13(A) to 13(E) show when taken together, as time passes the number of nods increases, the speech power grows, and gazes match more frequently. As these increase, the communication tuning degree gradually rises. As a result, the communication mode transitions in the order greeting/first-meeting state → topic providing state → topic digging state → topic excitement state.

As described above, in the robot system 101 according to this example, the communication tuning degree is calculated on the basis of the communication rhythm, and the communication mode transitions according to the communication tuning degree, so that communication between the two users H1 and H2, who are meeting for the first time, can be further activated.

(Second example)
Next, a second example of the present invention will be described. This example corresponds to the robot system 102 according to the third embodiment.

As in the first example, this example is applied to the communication field shown in FIGS. 9(A) and 9(B).

As described in the third embodiment, the user internal state estimation unit 7 of the robot system 102 estimates the internal states of the users H1 and H2 on the basis of the communication rhythm. In this example, the user internal state estimation unit 7 estimates, as the internal states of the users H1 and H2, a tension state (whether the user is tense or relaxed) and a pleasure state (whether the user feels pleasant or unpleasant). The communication rhythm, that is, the sensing data on the gaze, blinking, and facial expressions of the users H1 and H2, generally contains sensing errors that occur stochastically. For this reason, a probabilistic state transition model of the user's internal state based on a dynamic Bayesian network, as shown in FIG. 14(A), is used to estimate the internal states of the users H1 and H2.

First, the method of estimating the tension state will be described. In general, if gaze matching is infrequent and blinking becomes more frequent, the tension of the users H1 and H2 is presumed to rise over time. In this example, therefore, the user internal state estimation unit 7 calculates the change over time of the user's tension state (tension degree q1, relaxation degree q2) at predetermined time intervals (..., t-1, t, ...) on the basis of the detection probability p1 of gaze matching (p1 for gaze matching detected, 1-p1 for no gaze matching) and the detection probability p2 of whether the number of blinks is at or above a certain threshold (p2 for blinking detected, 1-p2 for no blinking).
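
The time-stepped update of (q1, q2) from p1 and p2 might look like the following two-state recursive Bayesian filter. It is a minimal sketch and not the model of FIG. 14(A), whose structure and parameters are not given in the text: the function name `update_tension`, the transition probability `P_STAY`, and the conditional probabilities in `P_GAZE` and `P_BLINK` are all hypothetical values chosen for illustration. The detection probabilities p1 and p2 are used to weight the observation likelihoods, which is one possible way to account for uncertain sensing.

```python
# Hypothetical model parameters; the text gives no numeric values.
P_STAY = 0.9                              # chance the hidden state persists one step
P_GAZE = {"tense": 0.2, "relaxed": 0.7}   # P(gaze match | state)
P_BLINK = {"tense": 0.8, "relaxed": 0.3}  # P(frequent blinking | state)


def update_tension(q1_prev, p1, p2):
    """One filtering step; returns (q1, q2) = (tension, relaxation) at time t."""
    # Prediction: carry the belief from time t-1 through the transition model.
    prior = {"tense": q1_prev * P_STAY + (1.0 - q1_prev) * (1.0 - P_STAY)}
    prior["relaxed"] = 1.0 - prior["tense"]
    # Correction: weight each observation likelihood by its detection probability.
    posterior = {}
    for state in ("tense", "relaxed"):
        gaze = p1 * P_GAZE[state] + (1.0 - p1) * (1.0 - P_GAZE[state])
        blink = p2 * P_BLINK[state] + (1.0 - p2) * (1.0 - P_BLINK[state])
        posterior[state] = prior[state] * gaze * blink
    total = posterior["tense"] + posterior["relaxed"]
    return posterior["tense"] / total, posterior["relaxed"] / total
```

With these illustrative parameters, repeatedly feeding a low p1 (little gaze matching) and a high p2 (frequent blinking) makes q1 rise step by step, which matches the qualitative behaviour described above.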

Next, the method of estimating the pleasure state will be described. The user internal state estimation unit 7 estimates the pleasure state on the basis of the detection probability p2 of whether the number of blinks is at or above a certain threshold (p2 for blinking detected, 1-p2 for no blinking) and feature quantities indicating changes in facial expression. As these feature quantities, the feature quantities F1 to F6 are used, which are calculated from the positional relationships among the eyebrows, eyes, and mouth based on the FACS (Facial Action Coding System) model shown in FIG. 14(B). In this example, the distances F1 to F6 are obtained from the images captured by the cameras 33 and 34. For example, when a person laughs, the distance F1 between the eyes and the mouth is considered to become shorter. Here, for example, the detection probability p3 that the eye-to-mouth distance F1 is at or below a threshold Th, and the detection probability 1-p3 that the distance F1 is greater than the threshold Th, are obtained.

The user internal state estimation unit 7 performs emotion recognition learning on the basis of the joint probability of the blink detection probabilities (p2, 1-p2) and the detection probabilities of the facial expression feature quantity (p3, 1-p3), and calculates the change over time of the pleasure state (pleasure degree q3, displeasure degree q4) at predetermined time intervals (..., t-1, t, ...).

The detection probability p1 of gaze matching is also used as an index of the user's internal state in the estimation of the utterance mind in the next example (the third example, which corresponds to the fourth embodiment).

In this example, in addition to the interaction rules shown in FIG. 11, the rules referred to by the interaction control unit 3 include a rule that, when the agent R speaks to one of the two users H1 and H2, it speaks to the user whose pleasure degree, as output from the user internal state estimation unit 7, is lower. Under this rule, if the users H1 and H2 have the same pleasure degree, the interaction control unit 3 controls the output unit 5 so that the agent R speaks to the user with the lower tension degree. If the users H1 and H2 have the same pleasure degree and also the same tension degree, the user to whom the agent R speaks may be chosen at random.
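
The addressee rule above, with its two tie-breaks, can be sketched as follows. The function name `choose_addressee` and the representation of each user's state as a `(pleasure_degree, tension_degree)` pair are assumptions made for illustration; ties are detected by exact equality, which a real system would probably relax to a tolerance.

```python
import random


def choose_addressee(states, rng=random):
    """Pick the user to speak to; states maps user -> (pleasure, tension)."""
    # First criterion: the user with the lower pleasure degree.
    lowest_pleasure = min(pleasure for pleasure, _ in states.values())
    candidates = [u for u, (p, _) in states.items() if p == lowest_pleasure]
    # Tie on pleasure: prefer the user with the lower tension degree.
    if len(candidates) > 1:
        lowest_tension = min(states[u][1] for u in candidates)
        candidates = [u for u in candidates if states[u][1] == lowest_tension]
    # Tie on both: choose at random among the remaining users.
    return rng.choice(candidates)
```

For example, `choose_addressee({"H1": (0.3, 0.5), "H2": (0.6, 0.1)})` selects H1, because H1 has the lower pleasure degree regardless of tension.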

In this way, in this example, interaction operations take into account internal states of the users H1 and H2 such as the tension state and the pleasure state, so the operations are adjusted dynamically according to those internal states. Communication between the users H1 and H2, who are meeting for the first time, can thereby be further activated.

(Third example)
Next, a third example of the present invention will be described. This example corresponds to the robot system 103 according to the fourth embodiment.

The utterance mind estimation unit 8 estimates the utterance mind on the basis of the tension state and the pleasure state of the users H1 and H2 estimated by the user internal state estimation unit 7 (the internal states estimated with the model shown in FIG. 14(A)).

For example, when the users H1 and H2 are directing their gaze toward the agent R, an utterance mind is assumed to be present with probability p1. When the user's tension degree is at or above a certain threshold, an utterance mind is assumed to be present with probability q1. Further, when the user's pleasure degree is at or above a certain threshold, an utterance mind is assumed to be present with probability q3. The final utterance mind is the joint probability of these probabilities. The utterance mind estimation unit 8 outputs the estimated utterance mind to the interaction control unit 3.

In this example, in addition to the interaction rules shown in FIG. 11, the rules referred to by the interaction control unit 3 include a rule that, when the agent R speaks to one of the two users H1 and H2, it speaks to the user who, according to the output of the utterance mind estimation unit 8, has an utterance mind. Under this rule, if both users H1 and H2 have an utterance mind, or if neither does, the user to whom the agent R speaks may be chosen at random.
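
The utterance mind estimate and the resulting choice of addressee can be sketched together. Several points here are assumptions rather than statements from the text: the three cues are treated as independent so that the joint probability is a simple product, a hypothetical cut-off `MIND_THRESHOLD` turns the probability into a yes/no decision, and the function names are illustrative. The sketch covers only the two-user case described above.

```python
import random

MIND_THRESHOLD = 0.5  # hypothetical cut-off; the text gives no value


def utterance_mind(p1, q1, q3):
    """Joint probability of the three cues, assuming they are independent."""
    return p1 * q1 * q3


def choose_by_utterance_mind(minds, rng=random, threshold=MIND_THRESHOLD):
    """Pick the addressee for two users; minds maps user -> probability."""
    willing = [user for user, mind in minds.items() if mind >= threshold]
    # Exactly one user wants to speak: address that user.
    if len(willing) == 1:
        return willing[0]
    # Both or neither want to speak: choose at random.
    return rng.choice(sorted(minds))
```

A caller would compute `utterance_mind(p1, q1, q3)` per user from the outputs of the user internal state estimation and pass the resulting mapping to `choose_by_utterance_mind`.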

In this way, in this example, the agent R speaks to the users H1 and H2 taking into account each user's intention to speak. This reduces the burden felt by the users H1 and H2 and further activates communication between the users H1 and H2, who are meeting for the first time.

(Fourth example)
Next, a fourth example of the present invention will be described. The robot system according to this example corresponds to the robot system 104 according to the fifth embodiment.

The inter-user information estimation unit 9 estimates inter-user information on the basis of the estimated tension degrees and pleasure degrees of the users H1 and H2. In this example, when the tension degrees of the users H1 and H2 are both at or below a certain threshold and their pleasure degrees are both at or above a certain threshold, the relationship between the users is regarded as affinitive; in all other cases it is regarded as non-affinitive. The inter-user information estimation unit 9 outputs this inter-user information to the interaction control unit 3. In addition, the inter-user information estimation unit 9 may define the inter-user information using, for example, the total communication time among a plurality of users.
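
The affinitive/non-affinitive decision is a pair of threshold tests over both users, which can be sketched as below. The function name `classify_relationship` and the default threshold values are hypothetical; the text only says "a certain threshold" for each quantity.

```python
def classify_relationship(tension, pleasure, tension_max=0.4, pleasure_min=0.6):
    """Classify the users' relationship; inputs map user -> degree in [0, 1]."""
    # Every user must be at or below the tension threshold ...
    calm = all(t <= tension_max for t in tension.values())
    # ... and at or above the pleasure threshold.
    pleased = all(p >= pleasure_min for p in pleasure.values())
    return "affinitive" if calm and pleased else "non-affinitive"
```

Because both conditions must hold for both users, a single tense or displeased user is enough to classify the relationship as non-affinitive.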

The interaction control unit 3 controls the output unit 5 taking this inter-user information into account. For example, suppose that the robot system according to this example has communication modes, like the robot system 101 according to the second embodiment. When the communication mode is about to transition from the topic providing state to the topic digging state and the inter-user information output from the inter-user information estimation unit 9 indicates an affinitive relationship, the communication mode determination unit 6 may be made to transition, with a certain probability, to the topic excitement state instead of the topic digging state. If the transition to the topic excitement state takes place, the output unit 5 may be controlled in accordance with the interaction rules corresponding to the topic excitement state.
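
The probabilistic shortcut from the topic providing state can be sketched as follows. The function name, the mode labels, and the default `skip_probability` are hypothetical; the text specifies only "a certain probability".

```python
import random


def next_mode_after_topic_providing(relationship, skip_probability=0.3, rng=random):
    """Return the mode that follows the topic providing state."""
    # Affinitive users may skip topic digging with the given probability.
    if relationship == "affinitive" and rng.random() < skip_probability:
        return "topic_excitement"
    # Otherwise follow the ordinary transition order.
    return "topic_digging"
```

When the relationship is non-affinitive the random source is never consulted, so the ordinary transition order is preserved exactly.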

In this way, in this example, when the relationship between the users is affinitive, the agent R can avoid interventions that would put a damper on the communication between the users H1 and H2, so communication can be activated more efficiently.

(Fifth example)
Next, a fifth example of the present invention will be described. This example is based on the robot system 105 according to the sixth embodiment and, in addition to that configuration, further includes the user internal state estimation unit 7 and the inter-user information estimation unit 9, which are components of the robot system 104 according to the fifth embodiment. In other words, the robot system of this example combines the robot systems 104 and 105.

In the ESDB 11, the episode accumulation unit 10 stores, for example, information such as that shown in FIG. 15 in time series (0, 1, 2, ...). "User state" indicates the user's internal state (tension degree: high, medium, or low) at time t-b. "Inter-user information" indicates the relationship between the users (non-affinitive or affinitive) at time t-b. "Action of R" is the type of action taken by the agent R at time t (directing its gaze to H1, asking H1 for his or her name, giving backchannel responses to H1 and H2, etc.). "Evaluation" is the sum, over the users H1 and H2, of the decrease in each user's tension degree at time t+a relative to that user's tension degree at time t-b.
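
One possible in-memory shape for an ESDB record, together with the "Evaluation" computation, is sketched below. The class name `Episode`, its field names, and the use of numeric tension degrees for the evaluation are assumptions for illustration; FIG. 15 itself is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One hypothetical ESDB row, mirroring the four items described above."""
    user_state: dict        # user -> "high" / "medium" / "low" tension at t-b
    inter_user_info: str    # "affinitive" or "non-affinitive" at t-b
    action: str             # type of action taken by agent R at time t
    evaluation: float       # summed tension decrease between t-b and t+a


def evaluate(tension_before, tension_after):
    """Sum over users of (tension at t-b) minus (tension at t+a)."""
    return sum(tension_before[user] - tension_after[user] for user in tension_before)
```

A positive evaluation means that the users became less tense overall after the action, and a negative value means that tension increased.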

The episode learning unit 12 refers to the ESDB 11 and repeatedly modifies the interaction rules stored in the SIRDB 4, that is, the relationship between the communication rhythm and communication tuning degree on the one hand and the interaction operations performed by the output unit 5 under control based on them on the other, so that the tension degrees of the users H1 and H2 decrease. In this way, the episode learning unit 12 learns the optimum relationship between the communication rhythm and communication tuning degree and the interaction operations. As a result, the interaction rules stored in the SIRDB 4 are adjusted in a direction that raises the communication tuning degree efficiently.

During this learning, the episode learning unit 12 may change the learning rule on the basis of the communication tuning degree output from the communication tuning degree calculation unit 2. For example, if the communication tuning degree stays below a certain threshold for a certain period or longer, the number of parameters used for learning, such as the user's tension degree and the inter-user information, may be reduced by a fixed number in order to speed up convergence of the optimization (that is, to speed up learning).

(Sixth example)
Next, a sixth example of the present invention will be described. This example corresponds to the robot system 106 according to the seventh embodiment.

The UPIDB 13 stores in advance personal information such as the user's name, birthplace, work history, and hobbies, as well as information on the user's abilities and personality, such as the user's social skills and the results of psychological analysis. The information stored in the UPIDB 13 is referred to by the interaction control unit 3 and used by it to decide on interaction operations.

For example, when the agent R asks about the hobbies of the users H1 and H2, if the user personality information for the users H1 and H2 includes information on their hobbies, that information is reflected in the utterance.

The user's personal information is also taken into account when deciding how the agent R words its speech to the users H1 and H2. JICS is one evaluation index of social skills that is relevant to wording. For example, the robot's wording can be adjusted using the user's social skill in relationship adjustment (management of hierarchical relationships), which is one of the skills covered by JICS.

For example, assume that the degree of relationship adjustment of each user participating in the communication is known from the results of a questionnaire survey or the like, and that this degree is stored in the UPIDB 13 in advance. A user whose degree of relationship adjustment is higher than a certain threshold can be presumed to be conscious of hierarchical relationships between people. For such a user, the interaction control unit 3 therefore has the agent R use honorific language when speaking. Conversely, a user whose degree of relationship adjustment is lower than the threshold can be presumed to pay little attention to hierarchical relationships, so for such a user the interaction control unit 3 has the agent R not use honorific language when speaking.
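
The choice of speech style from the stored degree of relationship adjustment reduces to one comparison, sketched below. The function name `speech_style`, the score scale, and the default threshold are hypothetical. The text does not say what happens when the degree equals the threshold, so treating that case as plain speech is an assumption of this sketch.

```python
def speech_style(relationship_adjustment, threshold=0.5):
    """Return "honorific" or "plain" for a user's relationship-adjustment degree."""
    # Users who score above the threshold are assumed to care about hierarchy.
    if relationship_adjustment > threshold:
        return "honorific"
    # At or below the threshold (an assumption for the equal case): plain speech.
    return "plain"
```

The interaction control unit would look up the degree for the addressed user in the UPIDB 13 and select the wording of the utterance according to the returned style.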

In this way, the affinity between the agent R and the users H1 and H2 can be further increased.

Besides such evaluation indices of social skills, information on the user's personality estimated using, for example, the theory of personality in transactional analysis can be stored in the UPIDB 13 and used to adjust the content of what the agent R says to the users H1 and H2.

In this theory, for example, human personality is broadly divided into five parts: the critical parent, the nurturing parent, the rational adult, the innocent child, and the adapted child. According to this theory, a questionnaire survey on personality makes it possible to analyze tendencies such as which of the five is dominant in a given person, and the person's personality can be estimated to some extent from the result of this analysis.

For example, if a questionnaire survey of a certain user shows that the critical parent and the adapted child are dominant over the other three, that user's personality is presumed to be of the type that likes logical reasoning. Utterances directed at users of this type should therefore emphasize reasoning. Accordingly, for users of this type, the agent R adopts a dialogue strategy of giving reasons when prompting the user to act. This helps the conversation proceed smoothly.
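
The mapping from questionnaire results to a dialogue strategy can be sketched as below. The score dictionary, the function names, the strategy labels, and the decision to compare exactly the two highest-scoring parts are all assumptions for illustration; the text describes only the single case in which the critical parent and the adapted child are dominant.

```python
def dominant_states(scores, top_n=2):
    """Return the top_n highest-scoring personality parts as a set."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:top_n])


def dialogue_strategy(scores):
    """Choose a dialogue strategy label from questionnaire scores."""
    # Critical parent plus adapted child dominant: logic-loving type.
    if dominant_states(scores) == {"critical_parent", "adapted_child"}:
        return "give_reasons"
    # Other combinations are not covered by the text; fall back to a default.
    return "default"
```

Here `scores` is assumed to hold one numeric value for each of the five parts, for example as obtained from a questionnaire stored in the UPIDB 13.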

As in the fifth example, a robot system that combines the robot systems according to the above embodiments can be adopted. For example, an SIRDB may be prepared for each combination of at least some of the communication mode, the user's internal state, the utterance mind, and the inter-user information, with interaction rules prepared for each combination. In a system provided with a plurality of SIRDBs corresponding to such combinations, episode learning may be performed, and interaction operations may be changed on the basis of the user's personal information.

In the above examples, the agent R, a CG model, performs the interaction operations. However, a humanoid robot that is equipped with various actuators and can move its face, arms, hands, legs, and body may be used as the output unit 5. In that case too, the output unit 5 can effectively convey the robot's emotions, such as joy, anger, sorrow, and pleasure, and its object of attention to the user by changing the robot's facial expression or moving its arms, hands, legs, and body.

In this case, the output unit 5 may perform interaction operations such as flashing its eyes, blinking, waving its arms, shaking its head, extending and contracting its body, vibrating its body, and producing a heartbeat sound. The output unit 5 may also perform interaction operations such as shedding tears, snuggling up to the user's feet, approaching the user, and jumping. Further, as ways of conveying the object of attention, the output unit 5 may perform interaction operations such as gazing at the object, pointing at the object, and approaching the object.

Thus, the robot system may have a physical body; it may be of an agent type without a physical body, displayed on a projector screen or a display as in the above examples; or it may merely display text on a screen or produce speech. In short, the robot system need only be capable of performing operations that include at least one of utterance, body movement, and text display.

The components of each robot system, such as the communication field recognition unit 1, the communication tuning degree calculation unit 2, and the interaction control unit 3, may be implemented in hardware alone, but they are generally implemented through the cooperation of software programs and hardware. In that case, a CPU provided in the robot system executes a software program stored in a storage device such as a ROM in the same system, thereby realizing the function of each unit.

In this case, a general-purpose computer can be used as the robot system. The software program to be stored in the storage device of the computer may be distributed on a computer-readable recording medium such as a CD-ROM (Compact Disc Read Only Memory), a DVD (Digital Versatile Disc), an MO (Magneto-Optical disc), or a flexible disk, and installed in the robot system. Alternatively, the program may be stored in a server device on a communication network such as the Internet, downloaded to the computer, and installed in the robot system.

FIG. 1 is a block diagram showing the basic configuration of a robot system according to a first embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the robot system of FIG. 1.
FIG. 3 is a block diagram showing the basic configuration of a robot system according to a second embodiment of the present invention.
FIG. 4 is a block diagram showing the basic configuration of a robot system according to a third embodiment of the present invention.
FIG. 5 is a block diagram showing the basic configuration of a robot system according to a fourth embodiment of the present invention.
FIG. 6 is a block diagram showing the basic configuration of a robot system according to a fifth embodiment of the present invention.
FIG. 7 is a block diagram showing the basic configuration of a robot system according to a sixth embodiment of the present invention.
FIG. 8 is a block diagram showing the basic configuration of a robot system according to a seventh embodiment of the present invention.
FIG. 9(A) is a diagram showing an example of a communication field, and FIG. 9(B) is a diagram showing an example of the agent.
FIG. 10 is a transition diagram of the communication modes.
FIG. 11 is a diagram showing an example of interaction rules.
FIG. 12(A) is a diagram showing the neutral position of the agent, FIG. 12(B) is a diagram showing an utterance operation, FIG. 12(C) is a diagram showing a nodding operation, and FIG. 12(D) is a diagram showing a gesture operation.
FIG. 13(A) is a graph showing the change over time in the presence or absence of nods, FIG. 13(B) is a graph showing the change over time in the amount of speech, FIG. 13(C) is a graph showing the change over time in gaze matching, FIG. 13(D) is a graph showing the change over time in the communication tuning degree, and FIG. 13(E) is a diagram showing the change over time in the communication mode.
FIG. 14(A) is a diagram showing an example of a probabilistic state transition model of the user's internal state, and FIG. 14(B) is a diagram for explaining the feature quantities calculated from the positional relationships among the eyebrows, eyes, and mouth based on the FACS model.
FIG. 15 is a diagram showing an example of the information stored in the episode memory database.

Explanation of symbols

1 Communication field recognition unit
2 Communication synchronization degree calculation unit
3 Interaction control unit
4, 41, 42, 43 Social interaction rule database (SIRDB)
5 Output unit
6 Communication mode determination unit
7 User internal state estimation unit
8 Utterance mind estimation unit
9 Inter-user information estimation unit
10 Episode storage unit
11 Episode memory database (ESDB)
12 Episode learning unit
13 User personality information database (UPIDB)
30 Table
31 Microphone
32 Acceleration sensor
33, 34 Camera
100, 101, 102, 103, 104, 105, 106 Robot system
H1, H2 User
R Agent

Claims

A robot system comprising:
an output unit that performs interaction operations for a plurality of users;
a recognition unit that recognizes the communication rhythm of the users based on verbal information and non-verbal information of each of the plurality of users;
a synchronization degree calculation unit that calculates a degree of synchronization between the users based on the communication rhythm;
a rule database that stores rules concerning the interaction operation that the output unit should perform for the users in accordance with the communication rhythm and the degree of synchronization, the rules being constructed under a strategy of dynamically inducing a pull-in phenomenon according to the stage of development of the communication; and
an interaction control unit that refers to the rule database, searches, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controls the output unit based on the operation command found.
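
The chain of units recited above (recognition, synchronization degree calculation, rule search, output control) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the specification: the `Rhythm` fields, the averaging formula in `synchronization_degree`, the thresholds, and the command names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Rhythm:
    """Hypothetical per-user communication rhythm observed over one time window."""
    speech_amount: float    # fraction of the window spent speaking, 0..1
    nodding: bool           # True if a nod was observed
    gaze_on_partner: bool   # True if the user's gaze met a conversation partner


def synchronization_degree(rhythms: Dict[str, Rhythm]) -> float:
    """Toy score in [0, 1] (assumed formula): mean of nod rate, gaze rate and speech balance."""
    users = list(rhythms.values())
    nod_rate = sum(r.nodding for r in users) / len(users)
    gaze_rate = sum(r.gaze_on_partner for r in users) / len(users)
    amounts = [r.speech_amount for r in users]
    balance = 1.0 - (max(amounts) - min(amounts))
    return (nod_rate + gaze_rate + balance) / 3.0


@dataclass
class Rule:
    """One rule: a condition on (rhythms, degree) and the command to issue when it holds."""
    condition: Callable[[Dict[str, Rhythm], float], bool]
    command: str


# Hypothetical rule database; the first matching rule wins.
RULES: List[Rule] = [
    Rule(lambda rh, d: d < 0.4 and min(r.speech_amount for r in rh.values()) < 0.2,
         "prompt_quiet_user"),
    Rule(lambda rh, d: d >= 0.7, "nod"),
    Rule(lambda rh, d: True, "wait"),
]


def search_command(rules: List[Rule], rhythms: Dict[str, Rhythm],
                   degree: float) -> Optional[str]:
    """Return the command of the first rule whose condition holds, or None."""
    for rule in rules:
        if rule.condition(rhythms, degree):
            return rule.command
    return None
```

In this sketch the output unit would simply be handed the returned command string; a real system would map it to speech, nodding, or gesture motions of the agent.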

The robot system according to claim 1, further comprising
a mode determination unit that determines, based on the degree of synchronization, a communication mode, which is the state of the communication field,
wherein a plurality of the rule databases are provided according to the communication mode, and
the interaction control unit
selects, from among the plurality of rule databases, the rule database corresponding to the communication mode determined by the mode determination unit, and
refers to the selected rule database, searches, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controls the output unit based on the operation command found.
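
As a rough illustration of selecting one rule database per communication mode, the sketch below first maps the degree of synchronization to a mode and then searches only the database for that mode. The mode names, the thresholds, and the rule contents are hypothetical; the claim does not name the modes or fix any values.

```python
from typing import Dict, List, Tuple

# Each rule is (requires_silent_user, command). A rule matches when it does not
# require a silent user, or when a silent user is actually present.
RuleDb = List[Tuple[bool, str]]

# Hypothetical rule databases, one per assumed communication mode.
RULE_DATABASES: Dict[str, RuleDb] = {
    "cold": [(True, "address_silent_user"), (False, "offer_topic")],
    "warming": [(True, "look_at_silent_user"), (False, "nod")],
    "active": [(False, "stay_neutral")],
}


def determine_mode(degree: float) -> str:
    """Map the degree of synchronization to a mode (assumed thresholds)."""
    if degree < 0.3:
        return "cold"
    if degree < 0.7:
        return "warming"
    return "active"


def select_command(degree: float, silent_user_present: bool) -> Tuple[str, str]:
    """Pick the rule database for the current mode, then return (mode, first matching command)."""
    mode = determine_mode(degree)
    for requires_silent_user, command in RULE_DATABASES[mode]:
        if not requires_silent_user or silent_user_present:
            return mode, command
    raise LookupError(f"no rule matched in mode {mode!r}")
```

The same selection pattern would apply if the databases were keyed by an estimated internal state or another estimated value instead of the mode.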

The robot system according to claim 1 or 2, further comprising
a user internal state estimation unit that estimates the internal state of each user based on the communication rhythm of each of the plurality of users,
wherein a plurality of the rule databases are provided according to the internal state, and
the interaction control unit
selects, from among the plurality of rule databases, the rule database corresponding to the internal state estimated by the user internal state estimation unit, and
refers to the selected rule database, searches, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controls the output unit based on the operation command found.

The robot system according to claim 3, further comprising
an utterance mind estimation unit that estimates, based on the internal state of each of the plurality of users, an index value indicating whether or not the user intends to speak,
wherein a plurality of the rule databases are provided according to the index value, and
the interaction control unit
selects, from among the plurality of rule databases, the rule database corresponding to the index value estimated by the utterance mind estimation unit, and
refers to the selected rule database, searches, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controls the output unit based on the operation command found.

The robot system according to claim 3, further comprising
an inter-user information estimation unit that estimates, based on the internal state of each of the plurality of users, inter-user information indicating the social relationship between the users,
wherein a plurality of the rule databases are provided according to the inter-user information, and
the interaction control unit
selects, from among the plurality of rule databases, the rule database corresponding to the inter-user information estimated by the inter-user information estimation unit, and
refers to the selected rule database, searches, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controls the output unit based on the operation command found.

The robot system according to any one of claims 1 to 5, further comprising:
an episode memory database that stores the communication rhythm and the degree of synchronization in association with the operation command searched for based on them, and that also stores, in association with that operation command, the communication rhythm and the degree of synchronization observed as each user's response to the interaction operation that the output unit performed under the control of the interaction control unit based on that operation command;
an episode storage unit that stores the communication rhythm, the degree of synchronization, and the operation command in the episode memory database; and
an episode learning unit that refers to the episode memory database and adjusts the rules stored in the rule database by learning the optimal relationship between the communication rhythm and the degree of synchronization on the one hand and the interaction operation on the other.
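
One simple way to picture the episode memory and the learning step is sketched below. The claim does not specify a learning method; the average-gain criterion used here is a stand-in chosen for illustration, and the `Episode` fields, the class names, and the command names are hypothetical. Each episode records the degree of synchronization before a command, the command, and the degree observed afterwards as the users' response.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Episode:
    """Hypothetical record: state before a command, the command, and the observed response."""
    degree_before: float
    command: str
    degree_after: float


@dataclass
class EpisodeMemory:
    """Stands in for the episode memory database."""
    episodes: List[Episode] = field(default_factory=list)

    def store(self, episode: Episode) -> None:
        """Role of the episode storage unit: append one episode."""
        self.episodes.append(episode)


def learn_best_command(memory: EpisodeMemory, low: float, high: float) -> Optional[str]:
    """Among episodes whose degree_before lies in [low, high), return the command
    with the highest mean gain in synchronization, or None if no episode qualifies."""
    gains: Dict[str, List[float]] = defaultdict(list)
    for e in memory.episodes:
        if low <= e.degree_before < high:
            gains[e.command].append(e.degree_after - e.degree_before)
    if not gains:
        return None
    return max(gains, key=lambda c: sum(gains[c]) / len(gains[c]))
```

The returned command could then replace the command of the rule covering that range of synchronization degrees, which is one possible reading of "adjusting" a rule.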

The robot system according to claim 1, further comprising
a user personality information database in which personal information of the users is stored,
wherein the interaction control unit controls the output unit based on the information stored in the user personality information database.

A communication activation method using a robot system including an output unit that performs interaction operations for a plurality of users, the method comprising:
a first step of recognizing the communication rhythm of the users based on verbal information and non-verbal information of each of the plurality of users;
a second step of calculating a degree of synchronization between the users based on the communication rhythm; and
a third step of referring to a rule database that stores rules concerning the interaction operation that the output unit should perform for the users in accordance with the communication rhythm and the degree of synchronization, the rules being constructed under a strategy of dynamically inducing a pull-in phenomenon according to the stage of development of the communication, searching, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controlling the output unit based on the operation command found.

A program that causes a computer that controls a robot system including an output unit that performs interaction operations for a plurality of users to execute:
a first procedure of recognizing the communication rhythm of the users based on verbal information and non-verbal information of each of the plurality of users;
a second procedure of calculating a degree of synchronization between the users based on the communication rhythm; and
a third procedure of referring to a rule database that stores rules concerning the interaction operation that the output unit should perform for the users in accordance with the communication rhythm and the degree of synchronization, the rules being constructed under a strategy of dynamically inducing a pull-in phenomenon according to the stage of development of the communication, searching, in accordance with the rules and using the communication rhythm and the degree of synchronization, for an operation command to be performed for the users, and controlling the output unit based on the operation command found.