JP3792882B2 - Emotion generation device and emotion generation method

Info

Publication number: JP3792882B2
Application number: JP06731098A
Authority: JP
Inventors: 薫鈴木
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-03-17
Filing date: 1998-03-17
Publication date: 2006-07-05
Anticipated expiration: 2018-03-17
Also published as: JPH11265239A

Description

【０００１】
【発明の属する技術分野】
本発明は、インタフェースエージェントの感情表現を制御する感情生成装置及び感情生成方法に関する。
【０００２】
【従来の技術】
従来より、システムとユーザとを結ぶヒューマンインタフェースとして、スイッチとメーターの並んだ計器盤、キーボードから命令を打ち込むコマンドラインインタフェース、ポインティングデバイスとアイコンから成るグラフィカル・ユーザ・インタフェース（ＧＵＩ）などが提案され、かつ実用に供されている。また最近では、音声による命令を受け付け、音声で応答を返す音声インタフェースも研究段階から実用段階に移りつつある。さらに、近年ではシステム内部とユーザとの間に介在させるインタフェースエージェント（自律的なインタフェース）として、執事／秘書／店員／愛犬などを模した生物的メタファ（仮想的な人間／ロボット／動物）を実現する技術が研究されている。
【０００３】
インタフェースエージェントとして執事や愛犬などの生物的メタファを採用することは、無機的なシステムに対するユーザの抵抗感を軽減して親しみを覚えさせる効果を生む。特に、電子フレンドや電子ペットのようなアプリケーションでは、ユーザとインタフェースエージェントとの感情を交えたやりとりによる両者の心理的結び付きを演出することがシステムの本質となる。したがって、このようなアプリケーションでは、インタフェースエージェントの知的有能性より感情面でのリアリティを向上させることが重要である。
【０００４】
感情は生物の目に見えない内部状態の一種であるが、表情、声の調子、言葉遣い／鳴き声、動作の様子、態度、行動などから我々はそれを伺い知ることができる。特に感情に対応した人間の表情を画像化する技術に関しては、文献“A muscle model for animating three-dimensional facial expression”（Computer Graphics, vol. 4, 1987年）に開示される３次元ＣＧによる表情合成の研究に始まり、これまでにコンピュータグラフィックスやヒューマンインタフェースの分野で多くの成果が成されている。しかしながら、これらの研究は感情表現そのものをリアルに生成することを主眼とするものであり、感情表現の推移を与えるタイムテーブルや、特定の条件と特定の感情表現のマッピングによる直接的な感情表現生成を行うのみで、その中間に位置するはずの感情そのものを模擬するものではなかった。
【０００５】
感情そのものに踏み込んだ研究例としては、文献“ＣＧアニメーションのための人間行動シミュレーション”（ＮＩＣＯＧＲＡＰＨ論文集、１９９２年）に開示される人間行動シミュレーションや、文献“ニューロベビー感情モデルを持つ表情合成システム”（画像ラボ、Ｓｅｐ．，１９９２）に開示されるニューロベビーがある。
【０００６】
前者の人間行動シミュレーションは、内部状態として魅力度というパラメータを導入して、人が他の人や自動車との衝突を避けながら距離を保って移動するＣＧアニメーションを生成可能にしている。魅力度の与え方によって保とうとする距離が変化する効果は、人間が嫌いな相手を回避し、好きな相手に接近する行動をよく再現しており、魅力度とは相手に対する好悪感情に相当するものであると考えることができる。この例では、好悪感情とも見なすことのできる魅力度というパラメータにより行動（接近行動／回避行動）を制御しているが、感情表現として表情などは扱っていない。
【０００７】
後者のニューロベビーは、独自の感情パラメータを持ち、怒鳴ったりなだめたりするユーザの音声に応じて自己の感情パラメータを変化させ、その感情パラメータ値に基づいて喜怒哀楽の様子をＣＧキャラクタの表情、顔色、声、動作に乗せて演じることができる。他の多くの従来技術と異なり、ニューロベビーは、入力に基づいて内部状態である感情パラメータを操作し、次に該感情パラメータに基づいて感情表現を生成するという２段階を経て、入力から感情表現を生成する。
【０００８】
しかしながら、感情パラメータを持つニューロベビーは、ユーザとのやりとりの中で感情パラメータを変化させることは可能であっても、特定のユーザとのやりとりの積み重ねに応じて、感情パラメータの変化のしかたや、同じ感情パラメータ値に対する表現のしかたを変えることはできなかった。その結果、システムの示す感情パラメータの変化と、それに基づく感情表現は時と相手を選ばず固定的かつ反復的であり、あるユーザがあるインタフェースエージェントとどう接してきたかという歴史を、そのユーザに対するそのインタフェースエージェント独自の感情表現として生成することはできなかった。これは、従来例の人間行動シミュレーションでも同様である。魅力度は予め与えられるが、それが経験により変化するという機能と効果については一切示唆されていない。すなわち、電子フレンドや電子ペットでは、ユーザとインタフェースエージェントとの対話の積み重ねの歴史により、両者の心理的結び付きが醸成可能であることがシステムの大きな目的となり得るにも関わらず、従来技術によるシステムではそのような機能を実現することができなかった。
【０００９】
【発明が解決しようとする課題】
以上説明してきたように、従来の技術では、インタフェースエージェントには、予め定められた入力に対して予め定められた感情的反応を示させることしかできなかった。
【００１０】
また、特定のユーザと特定のインタフェースエージェントとの間での過去のやり取り（かかわり）を反映させた形で、当該特定のインタフェースエージェントにおける感情表現を制御することができなかった。
【００１１】
また、複数のインタフェースエージェント間における感情表現に関する特性に個体差を持たせることや、あるインタフェースエージェントに付与した個性を別の個性に変更することが困難であった。
【００１２】
本発明は、上記事情を考慮してなされたもので、実際の運用場面において、所定の感情を発生させる状況に特有に現われる予測不可能な付帯条件を学習し、学習された付帯条件を満たす新たな状況下で該所定の感情を想起させることが可能な感情生成装置及び感情生成方法を提供することを目的とする。
【００１３】
また、本発明は、状況が満たす付帯条件とそれに伴い想起される感情として、特定の人物や特定の事物への好悪感情を学習でき、電子フレンドや電子ペットなど、ユーザとの心理的関係構築を重視するインタフェースエージェントに必要な機能、すなわちユーザ毎に異なる感情的応答を示し、この応答をユーザがインタフェースエージェントと如何に接するかで調整可変である感情生成装置及び感情生成方法を提供することを目的とする。
【００１４】
また、本発明は、感情の現われ方に関して、インタフェースエージェント毎の性格付けを容易に行うことのできる感情生成装置及び感情生成方法を提供することを目的とする。
【００１５】
【課題を解決するための手段】
本発明に係る感情生成装置は、周囲の状況を認識して複数種類の情報から成る状況情報を生成する状況認識手段と、前記状況情報を現在から過去にさかのぼる所定期間分まとめた状況情報列を生成して保持する状況記述手段と、前記状況情報列から予め定められた種類の情報を検出すると、該情報に応じた反応感情情報を生成する反応感情生成手段と、前記反応感情情報の強度が所定閾値以上となった場合に、該反応感情情報と前記状況情報列とを関連付けて状況感情対情報として記憶する感情的記憶記述手段と、状況情報列の入力に対して、前記記憶された状況感情対情報に関連付けられた前記状況情報列との一致の度合いに応じて、該状況感情対情報に関連付けられた反応感情情報を想起感情情報として想起する想起感情生成手段と、新たに想起された前記想起感情情報と新たに生成された前記反応感情情報とを合成して自己感情情報を生成する自己感情記述手段と、前記自己感情情報に応じた信号を生成出力する感情表現手段とを具備し、前記状況情報が、前記反応感情情報を生成させる前記予め定められた種類の情報とそれ以外の種類の情報とから成るか、または、前記予め定められた種類の情報を複数種類含むことを特徴とする。
また、本発明に係る感情生成方法は、周囲の状況を認識して複数種類の情報から成る状況情報を生成するステップと、前記状況情報を現在から過去にさかのぼる所定期間分まとめた状況情報列を生成し記憶手段に保持するステップと、前記状況情報列から予め定められた種類の情報を検出すると、該情報に応じた最新の反応感情情報を生成するステップと、前記反応感情情報の強度が所定閾値以上となった場合に、該反応感情情報と前記状況情報列とを関連付けて状況感情対情報として学習し、この学習結果を記憶手段に保持するステップと、状況情報列の入力に対して、前記学習された状況感情対情報に関連付けられた前記状況情報列との一致の度合いに応じて、該状況感情対情報に関連付けられた反応感情情報を想起した想起感情情報を生成するステップと、新たに生成された前記想起感情情報と新たに生成された前記反応感情情報とを合成して自己感情情報を生成するステップと、前記自己感情情報に応じた信号を生成出力するステップとを有し、前記状況情報が、前記反応感情情報を生成させる前記予め定められた種類の情報とそれ以外の種類の情報とから成るか、または、前記予め定められた種類の情報を複数種類含むことを特徴とする。
【００２０】
なお、装置に係る本発明は方法に係る発明としても成立し、方法に係る本発明は装置に係る発明としても成立する。
また、装置または方法に係る本発明は、コンピュータに当該発明に相当する手順を実行させるための（あるいはコンピュータを当該発明に相当する手段として機能させるための、あるいはコンピュータに当該発明に相当する機能を実現させるための）プログラムを記録したコンピュータ読取り可能な記録媒体としても成立する。
【００２１】
本発明では、予測可能な条件に照らして状況を評価して装置自身の感情を発生させる機構を与え、該機構により実際に経験した感情とそのときの状況から、該状況に特有な予測不可能の付帯条件を学習させる。学習した付帯条件を満たす新しい状況の入力に対しては、該付帯条件に一致する記憶された感情を想起することにより、実際にその状況に至らずとも、付帯条件の検出のみで感情を変化させることを可能にする。この結果、付帯条件としてユーザやユーザの行動を学習して感情的に応答できる、すなわちユーザ毎に感情表現を変え、ユーザの接し方でこれを調整可変な（生成される感情が経験により変化する）インタフェースエージェントを実現することができる。また、感情の現われ方に関して、インタフェースエージェント毎の性格付けを容易に行うことができる。
【００２２】
【発明の実施の形態】
以下、図面を参照しながら発明の実施の形態を説明する。
本実施形態に係る感情生成装置は、概略的には、ユーザとの対話機能もしくはユーザへの情報提示機能を有しかつその手段として擬人的（もしくは擬生物的）な形態を有するインタフェースエージェント（例えば、擬人的な外観がＣＧなどにより表現されるインタフェースエージェント、外観自体を擬人化したロボット、など）における主として視聴覚的に感知され得る出力（例えば、インタフェースエージェントを表現する画像表示や音声出力、ロボットの動きや音声出力、など）に対して感情表現を付与するためのものである。
【００２３】
そして、本実施形態は、その感情表現をより高度に擬人化させるためのものである。また、その感情表現の生成の仕組みをより自律化させ、かつ人間で言う個性に相当するような、インタフェースエージェントの感情表現に関する特性の個体差、を容易に設定・変更可能とするものである。
【００２４】
本実施形態では、感情生成装置は、インタフェースエージェントに個別に搭載するものとして説明する。従って、例えば、以下で装置独自と言った場合には、本感情生成装置独自であり、かつ、当該インタフェースエージェント独自である。また例えば、以下で装置周囲と言った場合には、本感情生成装置の周囲であり、かつ、当該インタフェースエージェントの位置する場所の周囲である。
【００２５】
図１に、本発明の一実施形態に係る感情生成装置の基本構成を示す。図１に示されるように、この感情生成装置は、外部状況認識部１、状況記述部２、反応感情生成部３、感情的記憶生成部４、感情的記憶記述部５、想起感情生成部６、自己感情記述部７、感情表現部８、制御情報生成部９を備えている。
【００２６】
外部状況認識部１は、概略的には、画像や音声やその他の情報を入力し、これを解析して装置周囲の現在の状況を表わす外部状況情報を逐次生成する。
状況記述部２は、概略的には、外部状況認識部１にて生成された外部状況情報（人物ＩＤコード、人物表情コード、発話コード、明るさなど）を定期的に読み出して、読み出し時刻とともに所定期間分の最新の外部状況情報を状況情報列として保持する。
【００２７】
反応感情生成部３は、概略的には、状況記述部２による指定期間分の状況情報列に直接反応して変化する装置独自の感情（反応感情情報）を生成出力する。
感情的記憶生成部４は、概略的には、反応感情生成部３による反応感情情報と状況記述部２による指定期間内の状況情報列とを対応付けた状況感情対情報を生成して、これを記憶する感情的記憶記述部５に受け渡す（ただし、後述するように、そのときの反応感情情報が十分な強度を持つ場合にのみ状況感情対情報を生成するものとする）。
【００２８】
感情的記憶記述部５は、概略的には、感情的記憶生成部４による（強い）状況感情対情報を、状況記述部２が保持できるよりも長期間保持する。
想起感情生成部６は、概略的には、状況記述部２から指定期間内の状況情報列を読み出し、該状況情報列に対応する感情情報を感情的記憶記述部５から検索して想起感情情報として出力する。
【００２９】
自己感情記述部７は、概略的には、反応感情生成部３による反応感情情報と、想起感情生成部６による想起感情情報とを合成して得られる感情情報を現在の自己感情情報として保持する。
【００３０】
感情表現部８は、概略的には、自己感情記述部７に記述される現在の自己感情情報にしたがって、インタフェースエージェント等の感情表現を、例えば画像や音声などにより出力する。
【００３１】
制御情報生成部９は、概略的には、感受性制御情報を反応感情生成部３に、記銘制御閾値情報を感情的記憶生成部４に、学習強度制御情報を感情的記憶記述部５に、想起強度制御情報を想起感情生成部６に、感情表出制御情報を感情表現部８に各々供給する。なお、これらの情報は、インタフェースエージェント等の感情面での個性や性格を決定するパラメータである。
【００３２】
さて以下では、外部状況認識部１から制御情報生成部９のそれぞれについて順番に詳細に説明していく。
最初に、外部状況認識部１について説明する。
【００３３】
外部状況認識部１は、ＴＶカメラやマイクロフォンやその他のセンサを通じて、画像や音声やその他の情報を入力し、これを解析して装置周囲の現在の状況を表わす外部状況情報を逐次生成する。
【００３４】
なお、本実施形態では、画像および音声およびその他の情報（例えば、温度など）を扱うものとして説明するが、これに限定されず、画像や音声を含めて扱うべき物理量（インタフェースエージェント等にとっての外界からの刺激となるもの）の組み合わせは適宜修正可能である。
【００３５】
図２に、外部状況認識部１の構成例を示す。この外部状況認識部１は、画像情報入力部１１、人物画像検出部１２、人物認識部１３、人物表情認識部１４、人物動作認識部１５、音声情報入力部１６、人物音声検出部１７、発声内容認識部１８、語気認識部１９、変化速度検出部２０、明るさ検出部２１、その他情報入力部２２、外部状況情報出力部２３を含む。
【００３６】
画像情報入力部１１は、ＴＶカメラなどにより画像を取り込んで画像データとして出力する。
人物画像検出部１２は、該画像データ中から人物の映っている画像領域を検出して出力する。この検出は、予め記憶されている、顔らしい画像特徴を記述した顔テンプレートを画像データ中で走査しつつ照合し、所定の基準値以上の類似度を有する領域を顔のある箇所と認定し、さらに該顔のある箇所を含むその周囲を人物領域として抽出する。人物領域として抽出された部分画像データは人物画像データとして出力される。
【００３７】
人物認識部１３は、該人物画像データの顔領域を、予め記憶されている、人物別の顔テンプレートと照合して該人物が誰であるのかを特定し、この人物を表わす人物ＩＤコード（既知のテンプレートに該当しない人物の場合は、未知人物を表わす特別なＩＤコード）を外部状況情報出力部２３の所定のバッファ２４−１に出力する。なお、人物画像データが検出されない場合には、人物ＩＤ情報として人物なしを表わす特別なＩＤコードを出力する。
【００３８】
人物表情認識部１４は、上記の人物画像データの顔領域を、予め記憶されている、表情別のテンプレートと照合して該表情の種別（例えば、平常、笑う、怒る、悲しむなど）を特定し、この種別を表わす人物表情コード（既知のテンプレートに該当しない表情の場合は、未知表情を表わす特別なＩＤコード）を外部状況情報出力部２３の所定のバッファ２４−２に出力する。なお、人物画像データが検出されない場合には、人物表情コードとして表情なしを表わす特別なＩＤコードを出力する。
【００３９】
人物動作認識部１５は、上記の人物画像データの全域をオプティカルフロー解析して得た動きベクトルをもとめ、予め記憶されている、動作別のテンプレートと照合して該動作の種別（例えば、発話動作、頷き動作、手招き動作など）を特定し、この種別を表わす人物動作コード（既知のテンプレートに該当しない動作の場合は、未知動作を表わす特別なＩＤコード）を外部状況情報出力部２３の所定のバッファ２４−３に出力する。なお、人物画像データが検出されない場合には、人物動作コードとして動作なしを表わす特別なＩＤコードを出力する。
【００４０】
音声情報入力部１６は、マイクロフォンなどにより音声を入力して音声データとして出力する。
人物音声検出部１７は、人物動作認識部１５と音声情報入力部１６の出力を監視し、音声情報入力部１６の音声データが大きなパワーを有し、かつ、人物動作認識部１５が人物の発話動作（口が開閉しているなど）を検出した場合に、該音声データを人物音声データとして検出出力する。
【００４１】
発声内容認識部１８は、該人物音声データを、予め記憶されている、単語音声別のテンプレートと照合して該音声中の単語を特定し、この単語を表わす発話内容コード（既知のテンプレートに該当しない単語の場合は、未知単語を表わす特別なＩＤコード）を外部状況情報出力部２３の所定のバッファ２４−４に出力する。
【００４２】
語気認識部１９は、上記の人物音声データをスペクトル解析して得た音声パラメータを、予め記憶されている、語気別のテンプレートと照合して語気の種別（例えば、優しい、怒鳴っているなど）を特定し、この種別を表わす語気コード（既知のテンプレートに該当しない語気の場合は、未知語気を表わす特別なＩＤコード）を外部状況情報出力部２３の所定のバッファ２４−５に出力する。
【００４３】
変化速度検出部２０は、上記の画像データの時間差分の最大絶対値と上記の音声データの時間差分の最大絶対値の合計値を計算し、状況変化速度として外部状況情報出力部２３の所定のバッファ２４−６に出力する。
【００４４】
明るさ検出部２１は、上記の画像データの画像全体の明るさを計算して、外部状況情報出力部２３の所定のバッファ２４−７に出力する。なお、明るさは、照度計などのセンサにより得るようにしてもよい。
【００４５】
その他情報入力部２２は、温度センサやスイッチ類などの他の入力手段による入力データから温度やその他の情報を抽出し、その他情報として外部状況情報出力部２３の所定のバッファ２４−８に出力する。
【００４６】
外部状況情報出力部２３は、以上の人物認識部１３〜その他情報入力部２２の各ブロックにより生成された各種情報を保持するバッファ（２４−１〜２４−８）の集合からなり、必要に応じて各バッファの値を一括して外部状況情報として出力する。一般に、外部状況情報は、人物の有無、人物の別、人物の動作、人物の発話内容、人物の発話語気、状況の急な変化、人物以外の例えば部屋の明るさや温度などから構成されるマルチトラック情報であり、画像処理や音声処理に要する時間がトラック毎に異なる。したがって、バッファ（２４−１〜２４−８）への各トラックの情報更新は非同期的に行われるが、バッファ（２４−１〜２４−８）の先では、常に次の更新前の最新の情報を読み出すことができる。
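The buffer arrangement of paragraph [0046] can be illustrated with a minimal sketch: each track has one slot that its recognizer overwrites at its own pace, and a reader takes all slots at once. The class name `LatestValueBuffers`, the track names, and the use of a lock are assumptions made for illustration and do not come from the specification.

```python
import threading

class LatestValueBuffers:
    """One slot per track; writers update asynchronously, readers snapshot all slots."""

    def __init__(self, tracks):
        self._lock = threading.Lock()
        self._values = {name: None for name in tracks}

    def update(self, track, value):
        # Called by each recognizer whenever it finishes, at its own pace.
        with self._lock:
            self._values[track] = value

    def snapshot(self):
        # Returns the most recent value of every track in one consistent read.
        with self._lock:
            return dict(self._values)


buffers = LatestValueBuffers(["person_id", "expression", "brightness"])
buffers.update("person_id", "A")
buffers.update("brightness", 0.8)
buffers.update("brightness", 0.3)  # a faster track may update several times
state = buffers.snapshot()
```

A track that has not produced a value yet simply reads as `None` here; the specification instead assigns special codes such as "no person".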
【００４７】
なお、以上で使用する各種テンプレートは、予め多数の実サンプルを収集し、これを統計的に解析（例えば、主成分分析など）するなどして生成する。
また、外部状況認識部１の生成する外部状況情報の種類は以上の例に限定されず、必要に応じて種々選択／拡張可能であり、それに応じて外部状況情報出力部２３の持つバッファの数と種類も種々選択／拡張可能である。
【００４８】
また、人物音声検出部１７に、人物認識部１３から出力される人物ＩＤコードを与え、人物音声検出部１７では人物ＩＤコードに応じた特定話者音声検出処理を行うようにしてもよい。同様に、発声内容認識部１８に人物ＩＤコードを与え、発声内容認識部１８では人物ＩＤコードに応じた特定話者発話内容認識処理を行うようにしてもよい。同様に、語気認識部１９に人物ＩＤコードを与え、語気認識部１９では人物ＩＤコードに応じた特定話者語気認識処理を行うようにしてもよい。このようにすることで、話者が既知の場合には、各認識処理の精度や速度を向上させることができる。
【００４９】
次に、状況記述部２について説明する。
状況記述部２は、外部状況認識部１にて生成された外部状況情報（人物ＩＤコード、人物表情コード、発話内容コード、明るさなど）を定期的に読み出して、読み出し時刻とともに所定期間分の最新の外部状況情報を状況情報列として保持する。また、状況記述部２は、保持する状況情報列が必要とされる他のブロック（反応感情生成部３、感情的記憶生成部４、想起感情生成部６）に対して、該ブロックからの期間指定情報に応じた期間の状況情報列を出力することができる。
【００５０】
図３に、状況記述部２の構成例を示す。この状況記述部２は、外部状況情報取得部３１、状況情報蓄積部３２、状況情報列出力部３３を含む。
外部状況情報取得部３１は、外部状況認識部１の外部状況情報出力部２３の各バッファ（２４−１〜２４−８）に保持される外部状況情報を定期的に一括して読み出し、これに該取得時点の時刻を付加して出力する。
【００５１】
状況情報蓄積部３２は、内蔵するリングバッファ３４の最も古い記憶を削除して、そこに新たに取得された外部状況情報および取得時刻の最新値を書き込む。このようにすることで、状況情報蓄積部３２は、最新一定期間分（図３の例ではＴ−０からＴ−１１まで）の状況情報の列とその取得時刻とを保持することが可能になる（Ｔ−０を最新時刻としている）。
【００５２】
状況情報列出力部３３は、外部から入力される期間指定情報の定める期間分だけの状況情報と取得時刻とをリングバッファ３４から読み出し、列バッファ３５に状況情報列として編集出力する。
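The ring buffer of paragraphs [0051] and [0052] might be sketched as follows. The capacity of 12 entries (T-0 to T-11) and the three example windows follow the figures quoted in the specification; the class name `SituationHistory`, its methods, and the dictionary layout of a situation entry are assumptions for illustration.

```python
from collections import deque

class SituationHistory:
    def __init__(self, capacity=12):
        # A deque with maxlen drops the oldest entry automatically when full.
        self._entries = deque(maxlen=capacity)

    def push(self, timestamp, situation):
        self._entries.append((timestamp, situation))

    def window(self, newest, oldest):
        """Return entries from T-newest back to T-oldest, newest first.

        Index 0 is the latest entry (T-0). Entries not yet recorded are skipped.
        """
        newest_first = list(reversed(self._entries))
        return newest_first[newest:oldest + 1]


history = SituationHistory(capacity=12)
for t in range(20):
    history.push(t, {"person_id": "A" if t % 2 else "nobody"})

reaction_window = history.window(0, 4)   # T-0 .. T-4
memory_window = history.window(1, 8)     # T-1 .. T-8
recall_window = history.window(4, 11)    # T-4 .. T-11
```

Passing the window bounds as arguments corresponds to the period designation information that each consuming block sends to the situation description unit.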
【００５３】
状況情報列を利用する他のブロックが状況情報列を読み出す際に期間指定情報を与えられるようにすることで、状況情報列を読み出すブロックがどの期間に注目して処理を行うかを調整可変とすることができる。これを処理の時間感度と呼ぶことにする。一般に、状況に迅速に応じる必要のあるブロックでは指定期間は現在に近く、やや過去に遡った状況を利用するブロックでは指定期間は過去にシフトしている。
【００５４】
なお、上記では、一定期間分の外部状況情報および取得時刻の保持にリングバッファを用いたが、その代わりにＦＩＦＯバッファを用いてもよい。この場合、ＦＩＦＯバッファを外部状況情報の各トラック毎に設け、最新の外部状況情報を取得したならば、各ＦＩＦＯバッファの先頭のデータを破棄し、最新の外部状況情報を各ＦＩＦＯバッファの最後尾にそれぞれ投入する。
【００５５】
次に、反応感情生成部３について説明する。
反応感情生成部３は、状況記述部２による指定期間分の状況情報列に直接反応して変化する装置独自の感情（反応感情情報）を生成出力する。反応感情情報の強さは別に与える感受性制御情報の大きさに応じて制御可能である。この感受性制御情報は装置が状況に対してどれくらい感情的に反応しやすいかを決定する制御パラメータであり、制御情報生成部９により与えられる。なお、反応感情生成部３は、状況の変化に迅速に応答するために、時間感度として例えば図３のＴ−０からＴ−４までの期間のような最新の比較的短期間を指定する期間指定情報を状況記述部２に対して出力する。
【００５６】
図４に、反応感情生成部３の一構成例を示す。この反応感情生成部３は、反応感情用状況情報列取得部４１、状況感情変換部４２、反応感情スケール変換部４３を含む。
【００５７】
反応感情用状況情報列取得部４１は、期間指定情報に対応する状況情報列を状況記述部２より読み出す。読み出された状況情報列には、各時刻における人物ＩＤコード（例えば、Ａさんを示す人物番号、など）や発話内容コード（例えば、「おはよう」を示す単語番号、など）や人物動作コード（例えば、手招きを示す動作番号、など）等のコード量と、明るさや状況変化速度等の数値量とが含まれる。
【００５８】
状況感情変換部４２は、入力層４４と中間層４５と出力層４６の３層からなる階層型ニューラルネットワークを用いて構成され、各層（４４、４５、４６）はそれぞれ所定数のユニット（４７）により形成されている。入力層４４の全てのユニットからはそれぞれ中間層４５の全てのユニットに荷重結合（４８）が張られ、該当ユニットの出力値に荷重をかけた値が結合先のユニットに入力される。同様に中間層４５の全てのユニットからもそれぞれ出力層４６の全てのユニットに荷重結合が張られ、入力層４４のユニットの出力値は各々の経路にしたがった荷重をかけられて出力層４６に伝播される。
【００５９】
入力層４４には外部状況情報のトラックに対応したユニット群があり、各ユニット群には必要な数のユニットが用意されている。
例えば、人物ＩＤに対応するユニット群の場合、装置が「Ａさん」、「Ｂさん」、「Ｃさん」の３人の人物を見分けられる（すなわち、この３人の顔テンプレートが登録されている）とするならば、当該ユニット群に必要なユニットの数は、「Ａさん」「Ｂさん」「Ｃさん」「未知人物」「誰もいない」の５つのコードに対応した５個に、指定期間数（状況情報列の列の長さ；例えば図３のＴ−０からＴ−４までの期間を指定する場合には５）を乗じて得た個数となる。
【００６０】
もし、状況情報列の「ある時刻」に人物ＩＤとして「Ａさん」があれば、『対応する時刻のＡさんのユニット』を活性化（１．０などの所定値を代入）させる。逆に、該当するコードのないユニットについては、これを非活性化（０．０を代入）させる。
【００６１】
また、明るさなどの数値量に対応するユニット群に必要なユニットの数は、該数値を入れる１個に、指定期間数（状況情報列の列の長さ）を乗じて得た個数となる。
【００６２】
このように、状況情報列は、時刻とトラックと、場合によってはコードに応じた入力層４４の所定のユニットに代入される。代入された値はそのままユニットの出力値となる。
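The assignment of a situation information sequence to input-layer units described in paragraphs [0059] to [0062] can be sketched as a flattening step. The five person codes follow the example in the text; the function name `encode_sequence`, the dictionary keys, and the restriction to two tracks are assumptions for illustration.

```python
PERSON_CODES = ["A", "B", "C", "unknown", "nobody"]

def encode_sequence(sequence):
    """Flatten a situation sequence into input-unit activations.

    Each time step contributes one unit per person code (1.0 if that code
    was observed at that time, else 0.0) plus one unit holding brightness.
    """
    units = []
    for step in sequence:
        units.extend(1.0 if step["person_id"] == code else 0.0
                     for code in PERSON_CODES)
        units.append(float(step["brightness"]))
    return units


sequence = [
    {"person_id": "A", "brightness": 0.9},       # T-0
    {"person_id": "nobody", "brightness": 0.7},  # T-1
]
units = encode_sequence(sequence)
```

With two time steps, the person track uses 5 × 2 units and the brightness track 1 × 2 units, which matches the unit counts given in the text.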
【００６３】
入力層４４の各ユニットの値は荷重結合を経て中間層４５のユニットに入力される。中間層４５のユニットでは入力層４４の全てのユニットから入力された値の総和を求め、この値に対応したシグモイド関数の値を出力する。中間層４５の各ユニットの出力値はさらに荷重結合を経て出力層４６のユニットに入力される。そして、出力層４６のユニットでも中間層４５の全てのユニットから入力された値の総和を求め、この値に対応したシグモイド関数の値を出力する。
【００６４】
出力層４６の各ユニットは、本実施形態では、それぞれ、「幸福」、「怒り」、「悲しみ」、「嫌悪」、「驚き」、「恐れ」の６つの感情パラメータに対応するものとしている。そのため、状況感情変換部４２の階層型ニューラルネットワークは、状況情報列を入力とし、６つの感情パラメータの強度を出力とするパタン変換器として機能する。
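The propagation described in paragraphs [0063] and [0064] amounts to two weighted sums, each followed by a sigmoid. The following is a minimal sketch of that forward pass, not the network of the embodiment: the weight values and layer sizes are toy assumptions, and no bias term is used because the text mentions none.

```python
import math

EMOTIONS = ["happiness", "anger", "sadness", "disgust", "surprise", "fear"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # weights[j][i] is the weight from input unit i to unit j of this layer.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def forward(inputs, w_hidden, w_output):
    hidden = layer(inputs, w_hidden)
    outputs = layer(hidden, w_output)
    return dict(zip(EMOTIONS, outputs))


# Toy weights: 3 input units, 2 hidden units, 6 output units.
w_hidden = [[0.5, -0.5, 0.0], [0.0, 1.0, -1.0]]
w_output = [[0.0, 0.0]] * 6
emotions = forward([1.0, 0.0, 0.5], w_hidden, w_output)
```

In practice the weights would come from offline training on sample input/output pairs, as the specification goes on to describe.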
【００６５】
もちろん、感情パラメータのバリエーションはこの例に限定されるものではないが、以下、この６種類の感情パラメータを例に説明を続ける。
ニューラルネットワークにこのようなパタン変換能力を与えるために、所定の入出力関係を満たすサンプルデータをニューラルネットワークに与え、バックプロパゲーションアルゴリズムなどによりオフラインで学習させる。例えば、装置周囲が暗かったり（明るさを参照）、見知らぬ人物が検出された（人物ＩＤコードを参照）ならば、恐れパラメータに所定強度を与えたり、あるいは大きな音声が入力されたり（状況変化速度を参照）、怒鳴られたり（発話内容コードも加味）したならば、驚きパラメータに所定強度を与えたり、あるいは見知った人物がいたり（人物ＩＤコードを参照）、優しく声を掛けられた（語気コードを参照）ならば、幸福パラメータに所定強度を与えたり、あるいはまた発話内容コードによって怒りや悲しみや嫌悪のパラメータにも所定強度を与えたりした入出力サンプルデータを与える。
【００６６】
反応感情スケール変換部４３は、出力層４６のユニットに現われる感情パラメータを感受性制御情報の定めるゲイン（感情パラメータ毎に設定可能）をかけて出力する。この結果、ニューラルネットワークの学習後であっても、最終的に出力される反応感情情報の大きさとパラメータ間のバランスが制御可能になる。
【００６７】
この感受性制御情報は、同じニューラルネットワークを持ちかつ同じ状況情報列を与えられた各々の装置（各々の個別のインタフェースエージェント等）に、互いに異なる強さとバランスの反応感情を生成させる制御パラメータである、と考えることができる。そして、ゲインを調整することにより、所望の個性を演出することが可能になる。例えば、ゲインの絶対値を大きく設定すればより感動し易いエージェントを、小さく設定すればより冷めたエージェントをそれぞれ演出できる。また、悲しみや恐れのバランスを大きくすれば、めそめそしたり、おどおどしたエージェントを、幸福のバランスを大きくすればニコニコしたエージェントをそれぞれ演出できる。
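The scale conversion of paragraphs [0066] and [0067] is a per-parameter multiplication by a gain. A small sketch follows, in which the function name `scale_emotions` and all numeric values are illustrative assumptions.

```python
def scale_emotions(emotions, gains):
    """Multiply each emotion parameter by its own gain (default gain 1.0)."""
    return {name: value * gains.get(name, 1.0) for name, value in emotions.items()}


raw = {"happiness": 0.2, "fear": 0.6, "sadness": 0.4}
timid_gains = {"fear": 1.5, "sadness": 1.5, "happiness": 0.5}
cool_gains = {"fear": 0.25, "sadness": 0.25, "happiness": 0.25}

timid = scale_emotions(raw, timid_gains)
cool = scale_emotions(raw, cool_gains)
```

The same raw network output thus yields a timid agent or a cool one depending only on the gains, without retraining the network.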
【００６８】
ところで、従来技術に係るニューロベビーも、ニューラルネットワークを用いて状況に関する情報の系列から感情情報を誘導するような構成を持つが、得られた感情情報に任意のスケール変換をかけられないため、ニューロベビーの個性を個別に演出させるためには、ニューラルネットワークの入出力関係を該個性に適うように再学習させる必要があり、そのための学習データを別途用意する手間がかかる。また、従来技術に係るニューロベビーでは、この学習作業を装置（ニューロベビー）の運用以前に予め済ませておく必要があり、本実施形態のように運用中に感受性制御情報を変更するだけで直ちに個性を変更させられるような柔軟性を持たない。
【００６９】
以上、状況感情変換部にニューラルネットワークを用いた反応感情生成部３の一構成例を示したが、以下では、状況感情変換部にｉｆ−ｔｈｅｎルールを用いた反応感情生成部３の他の構成例について説明する。
【００７０】
図５に、反応感情生成部３の他の構成例を示す。この反応感情生成部３は、反応感情用状況情報列取得部１１１、状況感情変換部１１２、反応感情スケール変換部１１３を含む。
【００７１】
反応感情用状況情報列取得部１１１、反応感情スケール変換部１１３はそれぞれ前述の反応感情用状況情報列取得部４１、反応感情スケール変換部４３と同じである。
【００７２】
状況感情変換部１１２は、前述の状況感情変換部４２と同じ機能を果たすものであるが、内部構成が相違する。
ここでは、この状況感情変換部１１２を中心に説明する。
【００７３】
図５に示されるように、状況感情変換部１１２は、変換規則格納部１１４、規則照合部１１５、反応感情合成部１１６を含む。
変換規則格納部１１４は、状況に基づいて感情パラメータをどのように決定すべきかを定めた変換規則を格納する。
【００７４】
変換規則の例としては以下のようなものがある。これらは怒鳴られた場合と周囲が暗い場合の規則の例であり、各規則は条件部と述部とを持つｉｆ−ｔｈｅｎルールの形式で記述される。

規則照合部１１５は、変換規則格納部１１４に格納される変換規則の各々を、入力される状況情報列により与えられる所定期間の状況と照合して、その一致の度合を計算するとともに、当該規則が定める感情パラメータ値に該一致度をかけた感情パラメータ値を出力する。一致度の計算は、各規則の条件部が成立する累計期間長を求め、状況情報列の期間長に対するこの累計期間長の割合として求められる。このようにすることで、条件部に示される条件が所定期間の状況情報列に現れ始めてから消え去るまでの間、その現れている期間に応じた強さの感情パラメータ値が出力される。
【００７５】
反応感情合成部１１６は、規則照合部１１５が出力する一致度に応じた感情パラメータ値を、上記の変換規則格納部１１４が擁する全ての変換規則について加え合わせた結果を、入力された状況情報列に対応した反応感情情報として出力する。
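The rule matching of paragraphs [0074] and [0075] can be sketched as follows: the degree of match of a rule is the fraction of the window in which its condition holds, and the outputs of all rules are added per emotion parameter. The two rules, their thresholds, and their emotion values are hypothetical stand-ins chosen for illustration; they are not the rules of the embodiment.

```python
def shouted(step):
    return step.get("tone") == "shouting"

def dark(step):
    return step.get("brightness", 1.0) < 0.2

# Hypothetical rules: (condition, emotion values produced on a full match).
RULES = [
    (shouted, {"surprise": 0.8, "fear": 0.6}),
    (dark, {"fear": 0.5}),
]

def reaction_emotion(sequence, rules=RULES):
    total = {}
    for condition, emotions in rules:
        # Match degree: fraction of the window in which the condition holds.
        match = sum(1 for step in sequence if condition(step)) / len(sequence)
        for name, value in emotions.items():
            total[name] = total.get(name, 0.0) + match * value
    return total


window = [
    {"tone": "shouting", "brightness": 0.1},
    {"tone": "shouting", "brightness": 0.9},
    {"tone": "gentle", "brightness": 0.9},
    {"tone": "gentle", "brightness": 0.9},
]
result = reaction_emotion(window)
```

Because the match degree falls as the condition leaves the window, the emotion fades gradually rather than switching off at once.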
【００７６】
なお、図４や図５の反応感情生成部３は、怒鳴られれば恐れ驚くというように、装置の運用以前から知られている一般的な条件（例えば、怒鳴られた）から反応的に発生する感情を誘導する（例えば、怒鳴られた場合は恐れ驚くという事実）機能を担うものである。したがって、誰が怒鳴ったとか、いつ怒鳴ったなどという、怒鳴られたという条件に付帯する運用以降に判明する詳細な他の条件については関知しない。
【００７７】
一方、後述する感情的記憶生成部４と感情的記憶記述部５と想起感情生成部６による感情想起の仕組みは、過去に怒鳴られた状況に付帯する、「誰が」とか「いつ」などの詳細な条件を記憶から思い出し、状況情報列中にそのような付帯条件（いつぞや怒鳴ったあの人がいるなど）が現れると、怒鳴られていなくても恐れの感情（驚きの感情は刹那的なものなので、想起の対象からはずしている）を発生させるのである。この仕組みは、将来似たような状況に遭遇した装置（インタフェースエージェント）が、過去の経験に基づいてより強く迅速な感情的反応を示すために必要なだけでなく、たった今怒鳴られたばかりの状況にあっても、従来とは異なった効果を奏することになる。すなわち、反応感情生成部３が対象とする状況情報列から「怒鳴られた」という条件が消え去っても、該怒鳴った人物が目前にいれば想起感情生成部６が恐れの感情を維持してくれる。このような反応は従来技術に係るニューロベビーでは達成されておらず、該従来技術では、怒鳴られたという条件が消え去るとともに、直ちに恐れの感情も消え去ってしまい、妙に立ち直りの早いエージェントが演出されてしまうことになる（これによって天真爛漫なベビーを演出することができたとしても、その他の個性を演出することはできない）。
【００７８】
次に、感情的記憶生成部４について説明する。
感情的記憶生成部４は、反応感情生成部３による反応感情情報と状況記述部２による指定期間内の状況情報列とを対応付けた状況感情対情報を生成して、これを記憶する感情的記憶記述部５に受け渡す。ただし、感情的記憶生成部４はそのときの反応感情パラメータの少なくとも１つが十分な強度を持つ場合にのみ状況感情対情報を生成する。これは、強い感情を覚えた状況だけを記憶し、それ以外の些末な状況を記憶しないためである。
【００７９】
図６に、感情的記憶生成部４の構成例を示す。この感情的記憶生成部４は、感情強度評価部５１、状況感情対情報生成部５２を含む。
感情強度評価部５１は、反応感情生成部３から反応感情情報を読み出し、その擁する感情パラメータのいずれかが、制御情報生成部９による記銘制御閾値情報の指定する強度以上の値を持つか否かを評価する。もし、十分な強度を持つパラメータが検出されれば、当該反応感情情報は状況とともに記憶されるべきと判断して、検出情報をオンにし、次段の状況感情対情報生成部５２に検出情報を送出する。
【００８０】
状況感情対情報生成部５２は、記憶すべき反応感情情報が検出された場合（検出情報がオンの場合）に、感情強度評価部５１から当該反応感情情報を取得するとともに、期間指定情報を状況記述部２に送出して指定期間分の状況情報列を読み出し、取得した当該反応感情情報と当該状況情報列とを組み合わせた状況感情対情報として、複合バッファ５３の状況情報列用バッファ５４と反応感情情報用バッファ５５に格納出力する。なお、状況感情対情報生成部５２は、反応感情の変化が起こる以前の比較的長い状況情報列が得られるように、時間感度として例えば図３のＴ−１からＴ−８までの期間のような、現在時刻よりもやや過去に遡ってからの比較的長期間を指定する期間指定情報を状況記述部２に対して出力する。
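The gating of paragraphs [0079] and [0080] can be sketched in a few lines: a situation/emotion pair is produced only when at least one emotion parameter reaches the memorization threshold. The function names and the threshold value of 0.7 are assumptions for illustration.

```python
def should_memorize(reaction, threshold):
    """True if at least one emotion parameter reaches the threshold."""
    return any(value >= threshold for value in reaction.values())

def make_pair(reaction, sequence, threshold):
    # Returns a (situation sequence, emotion) pair, or None for weak emotions.
    if not should_memorize(reaction, threshold):
        return None
    return (list(sequence), dict(reaction))


sequence = [{"person_id": "A"}] * 8          # e.g. the T-1 .. T-8 window
strong = {"fear": 0.9, "surprise": 0.3}
weak = {"fear": 0.2, "surprise": 0.1}

stored = make_pair(strong, sequence, threshold=0.7)
skipped = make_pair(weak, sequence, threshold=0.7)
```

Raising the threshold makes the agent remember only its most intense experiences, which is one of the personality controls listed for the control information generation unit.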
【００８１】
次に、感情的記憶記述部５について説明する。
感情的記憶記述部５は、感情的記憶生成部４による（強い）状況感情対情報を、状況記述部２が保持できるよりも長期間保持する。
【００８２】
図７に、感情的記憶記述部５の一構成例を示す。この感情的記憶記述部５は、状況感情対記憶部６１、状況感情対更新部６２、状況感情対検索部６３を含む。
状況感情対記憶部６１は、図４に例示した状況感情変換部４２と同様の構造を持つ階層型ニューラルネットワークを用いて構成され、ここでの階層型ニューラルネットワークは、状況感情対情報中の状況情報列に対応する入力層６４と、状況感情対情報中の感情情報に対応する出力層６６と、中間層６５の３層により形成され、隣接する階層間の各ユニット（６７）同士が荷重結合（６８）で結ばれている。
【００８３】
図４の状況感情変換部４２における入力層（４４）や出力層（４６）と同様に、図７の状況感情対記憶部６１における入力層６４のユニットは、状況情報列の時刻、トラック、コード別に用意されており、出力層６６のユニットは、装置が感じる感情パラメータ（本例では、６つの感情パラメータ）に対応して用意されている。各荷重結合の初期値は、いかなる入力に対しても出力を出さないように０に設定されている。
【００８４】
状況感情対更新部６２は、感情的記憶生成部４による状況感情対情報を受け取ると、これを記銘用状況情報列と記銘用感情情報とに分解し、記銘用状況情報列に対応する入力層６４のユニットと記銘用感情情報に対応する出力層６６のユニットを活性化／非活性化させたり、数値を代入したりする。この結果入力層６４と出力層６６に与えられる活性値のパタンは、装置が実際に抱いた強い反応感情とその原因となった（比較的長期の）状況を示していることになる。本感情的記憶記述部５は、この活性値パタンをサンプルデータとして、ニューラルネットワークが与えられた入出力関係を満たす方向に結合荷重を調整する。この調整の大きさは別に与えられる学習強度制御情報に比例しており、この値が大きければ学習は進み、小さければあまり進まない。そのため、学習強度制御情報は学習の強さを与えるパラメータであると言える。この調整の結果、ニューラルネットワークは以後、記銘用状況情報列に似た状況に対して、記銘用感情情報に似た反応感情を出力するようになる。
【００８５】
状況感情対検索部６３は、外部から与えられる想起用状況情報列を状況感情対記憶部６１の入力層６４に与え、その結果出力層６６に現われる感情パラメータ値を想起感情情報として外部に出力する（状況感情対検索部６３は、いわば記憶検索機構として機能する）。
【００８６】
以上、状況感情対記憶部にニューラルネットワークを用いた感情的記憶記述部５の一構成例を示したが、以下では、状況感情対記憶部にパターン・マッチング的手法を用いた感情的記憶記述部５の他の構成例について説明する。
【００８７】
図８に、感情的記憶記述部５の他の構成例を示す。この感情的記憶記述部５は、状況感情対記憶部１２１、状況感情対更新部１２２、状況感情対検索部１２３を含む。
【００８８】
状況感情対更新部１２２、状況感情対検索部１２３はそれぞれ前述の状況感情対更新部６２、状況感情対検索部６３と同じである。
状況感情対記憶部１２１は、前述の状況感情対記憶部６１と同じ機能を果たすものであるが、内部構成が相違する。
【００８９】
ここでは、この状況感情対記憶部１２１を中心に説明する。
図８に示されるように、状況感情対記憶部１２１は、状況感情対バッファ部１２４、バッファ更新部１２５、状況情報列照合部１２６、想起感情合成部１２７を含む。
【００９０】
状況感情対バッファ部１２４は、状況情報列とそのときの感情情報とを組にして記憶する複合バッファ（１２８）を所定数擁している記憶手段である。なお、各複合バッファは、図６の状況感情対情報生成部５２におけるものと同様、それぞれ状況情報列用バッファ１２９と感情情報用バッファ１３０から構成される。
【００９１】
バッファ更新部１２５は、状況感情対更新部１２２からの記銘用状況情報列と、記銘用感情情報とを組にして、状況感情対バッファ部１２４の空いている複合バッファに書き込む。このとき、空いている複合バッファがなければ、最も古い時刻の情報を保持する複合バッファの内容を棄却してこれに新しい情報を上書き更新する。また、バッファ更新部１２５は、記銘用感情情報の各感情パラメータ値をそのまま書き込むのではなく、学習強度制御情報の示すゲインでスケール変換した感情パラメータ値を書き込む。この結果、全ての状況感情対情報は一旦記憶されるものの、記憶される感情パラメータ値は学習強度制御情報によってその大きさとバランスが調整可能になる。
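The buffer update of paragraph [0091] might look like the sketch below: a fixed number of slots, replacement of the oldest entry when no slot is free, and scaling of the stored emotion values by the learning strength gains. The class name `PairStore`, the tuple layout, and the gain values are assumptions for illustration.

```python
class PairStore:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pairs = []  # each item: (stored_at, sequence, emotions)

    def memorize(self, stored_at, sequence, emotions, learning_gains):
        scaled = {name: value * learning_gains.get(name, 1.0)
                  for name, value in emotions.items()}
        if len(self.pairs) >= self.capacity:
            # No free buffer: discard the pair with the oldest timestamp.
            oldest = min(range(len(self.pairs)), key=lambda i: self.pairs[i][0])
            del self.pairs[oldest]
        self.pairs.append((stored_at, list(sequence), scaled))


store = PairStore(capacity=2)
gains = {"fear": 0.5}
store.memorize(1, [{"person_id": "A"}], {"fear": 0.8}, gains)
store.memorize(2, [{"person_id": "B"}], {"fear": 1.0}, gains)
store.memorize(3, [{"person_id": "C"}], {"happiness": 0.6}, gains)
```

After the third call the pair stored at time 1 has been overwritten, and the fear values held in the store are half of what was experienced.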
【００９２】
状況情報列照合部１２６は、状況感情対検索部１２３からの想起用状況情報列を受け、この状況情報列と各複合バッファ１２８に記憶される状況情報列とを照合して、その一致の度合を計算するとともに、当該複合バッファに記憶される感情パラメータ値に該一致度をかけた感情パラメータ値を出力する。一致度の計算は、まず想起用状況情報列の各時刻の各トラック値と、記憶される状況情報列の各時刻の対応するトラック値の差分（コード量なら一致する場合に０／不一致の場合に１、数値量なら値の差の絶対値を正規化した値）を求める。数値量に対する正規化は差の絶対値が０から１の間に収まるようなスケール変換である。この結果、全てのトラックについて、その差分は０から１の間に収まるようになる。このような差分値を想起用状況情報列と記憶される状況情報列の全時刻と全トラックについて合計してさらに正規化する。この正規化は、該差分の合計値を、想起用状況情報列の期間長×記憶される状況情報列の期間長×トラック数、で割ることで行われる。この結果、差分の合計値は０から１の間の数値となる。一致度は、この正規化された差分の合計値を１から差し引いて得た値とすると、最も一致した場合に１、全く一致しない場合に０となる。
【００９３】
最後に想起感情合成部１２７は、状況情報列照合部１２６が出力する感情パラメータ値を上記の状況感情対バッファ部１２４が擁する全ての複合バッファについて加え合わせた結果を、状況情報列に対応した想起感情情報として出力する。
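The match degree of paragraph [0092] and the summation of paragraph [0093] can be sketched as below. The sketch compares every time step of the query with every time step of the stored sequence; that reading is inferred from the stated normalization (query length × stored length × number of tracks) and is an assumption, as are the track names and the premise that numeric values are already scaled to the range 0 to 1. A time-aligned comparison would divide by period length × number of tracks instead.

```python
CODE_TRACKS = ["person_id"]
NUMERIC_TRACKS = ["brightness"]  # assumed already scaled to the range 0..1

def match_degree(query, stored):
    """1.0 when every compared value agrees, 0.0 when none do."""
    tracks = CODE_TRACKS + NUMERIC_TRACKS
    total = 0.0
    for q in query:
        for s in stored:
            for track in CODE_TRACKS:
                total += 0.0 if q[track] == s[track] else 1.0
            for track in NUMERIC_TRACKS:
                total += abs(q[track] - s[track])
    return 1.0 - total / (len(query) * len(stored) * len(tracks))

def recall(query, pairs):
    # Sum the stored emotions of every pair, each weighted by its match degree.
    recalled = {}
    for stored_sequence, emotions in pairs:
        m = match_degree(query, stored_sequence)
        for name, value in emotions.items():
            recalled[name] = recalled.get(name, 0.0) + m * value
    return recalled


scolded = [{"person_id": "A", "brightness": 1.0}] * 2
pairs = [(scolded, {"fear": 0.8})]
same = recall([{"person_id": "A", "brightness": 1.0}] * 2, pairs)
other = recall([{"person_id": "B", "brightness": 0.0}] * 2, pairs)
```

Seeing person A again in the same lighting recalls the stored fear in full, while a different person in the dark recalls none of it.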
【００９４】
なお、反応感情生成部３が感情を発生させる短期的な状況に反応するのに対して、図７や図８の感情的記憶記述部５では、感情的記憶生成部４が比較的長い時間感度に基づく状況感情対情報を生成することから、強い感情が起こる前の比較的長期間の状況を学習する。
【００９５】
次に、想起感情生成部６について説明する。
想起感情生成部６は、状況記述部２から指定期間内の状況情報列を読み出し、該状況情報列に対応する感情情報を感情的記憶記述部５から検索して想起感情情報として出力する。
【００９６】
図９に、想起感情生成部６の構成例を示す。この想起感情生成部６は、想起感情用状況情報列取得部７１、想起感情スケール変換部７２を含む。
想起感情用状況情報列取得部７１は、状況記述部２に期間指定情報を送出して指定期間分の状況情報列を読み出し、さらに、感情的記憶記述部５の状況感情対検索部６３にこれを送り出す。
【００９７】
状況感情対検索部６３では、受け取った状況情報列を想起用状況情報列として状況感情対記憶部６１の入力層６４に入力し、これに呼応して出力層６６に現われる記憶された感情パラメータ値を送り返す。
【００９８】
想起感情スケール変換部７２は、状況感情対検索部６３により返される感情パラメータ値を受け取り、想起強度制御情報の定めるゲイン（感情パラメータ毎に設定可能）をかけ、想起感情として出力する。
【００９９】
この結果、出力される想起感情情報の大きさとパラメータ間のバランスが制御可能になり、どのような感情を強く思い出すか、あるいはどのような感情をあまり思い出さないかという、エージェントの想起上の性格が演出可能になる。
【０１００】
なお、このとき、驚きのような反応感情生成部３のみで生成されるべき刹那的な感情を想起しないように、驚きに対する想起強度制御情報のゲインを低く設定しておくのが好ましい。
【０１０１】
ところで、従来技術に係るニューロベビーでは、例えば怒鳴られて怖かったというように、状況が確定した場合に対応する感情を生成する。これは大声で怒鳴られれば怖いという自然かつ生得的な反応を実現するが、誰がよく怒鳴る人なのかを予め教えておくことはできない。人が皆怒鳴ってくるのであれば、そのことを予め学習させても良いが、実際には怒鳴る人は一部である。したがって、従来技術に係るニューロベビーでは誰に怒鳴られたのかあるいは誰がよく怒鳴る人なのかなどといった情報を扱うことも、そのような情報にニューロベビーを反応させることもできない。
【０１０２】
これに対して、本実施形態における感情的記憶生成部４と感情的記憶記述部５による記憶の仕組みでは、実際にＡさんに怒鳴られて驚いたり恐ろしかった経験から、怒鳴られたときの状況として「Ａさんが居た」という付帯条件を学習する。さらに、想起感情生成部６は、過去に怒鳴られた状況の付帯条件であるＡさんを検出するだけで、怒鳴られそうな状況を察知するかのごとく恐れの感情を思い出すのである。これは、Ａさんに対する好悪感情を学習したものと看做せる。また、例えば、周囲が暗くなり、雷の大音響が鳴り響いて恐ろしかったという場合も考えられるが、このような場合には、周囲の暗さという人物以外の付帯条件に対する好悪感情を学習することも可能である。
【０１０３】
繰り返しになるが、反応感情生成部３は、装置に予め与えておくことができるよくわかった感情的反応を生成するための機構であるのに対して、感情的記憶生成部４と感情的記憶記述部５と想起感情生成部６からなる記憶と想起の仕組みは、反応感情生成部３による生得的な感情的反応を拠り所にしつつ、その感情的反応が生まれた状況に対応するさらに細かい条件を学習して、将来これに反応するための機構である。
【０１０４】
なお、反応感情生成部３が注目する状況情報列の期間（時間感度）は現在から遡る短期間（例えば図３のＴ−０からＴ−４まで）であり、感情的記憶生成部４が注目する状況情報列の期間は現在よりやや遡った比較的長期間の過去（例えば図３のＴ−１からＴ−８まで）であることは既に述べた通りである。想起感情生成部６の目的は実際に状況が確定する前に付帯条件を評価し、来たるべき状況を予見した感情状態を装置に作ることである。したがって、想起感情生成部６が注目する状況情報列の期間は、感情的記憶生成部４の期間よりもさらに過去に遡った同じ長さの期間（例えば図３のＴ−４からＴ−１１まで）とするのが妥当である。
【０１０５】
次に、自己感情記述部７について説明する。
自己感情記述部７は、反応感情生成部３による反応感情情報と、想起感情生成部６による想起感情情報とを合成して得られる感情情報を現在の自己感情情報として保持する。
【０１０６】
図１０に、自己感情記述部７の構成例を示す。この自己感情記述部７は、自己感情合成部８１、自己感情保持部８２を含む。
自己感情合成部８１は、反応感情生成部３による反応感情情報と、想起感情生成部６による想起感情情報とを入力し、両者を合成して自己感情情報として出力する。合成は、例えば、同じ感情パラメータ毎にその値を加え合わせて自己感情情報の感情パラメータ値とすることで行われる。
【０１０７】
自己感情保持部８２は、自己感情合成部８１により求められた自己感情情報のパラメータ値を対応する内蔵のバッファ（８３−１〜８３−６）に格納保持する。
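The combination described in paragraph [0106] is a per-parameter addition of the reaction emotion and the recalled emotion, with one result per emotion parameter (six in this embodiment, matching buffers 83-1 to 83-6). A minimal sketch, with illustrative values and an assumed function name `combine`:

```python
EMOTIONS = ["happiness", "anger", "sadness", "disgust", "surprise", "fear"]

def combine(reaction, recalled):
    """Add reaction and recalled values for each emotion parameter."""
    return {name: reaction.get(name, 0.0) + recalled.get(name, 0.0)
            for name in EMOTIONS}


reaction = {"surprise": 0.5, "fear": 0.25}
recalled = {"fear": 0.5}
self_emotion = combine(reaction, recalled)
```

Addition is the example the specification gives; other combination rules are not excluded by the text.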
【０１０８】
次に、感情表現部８について説明する。
感情表現部８は、自己感情記述部７に記述される現在の自己感情情報にしたがって、インタフェースエージェントの感情表現を画像や音声等により出力する。
【０１０９】
図１１に、感情表現部８の構成例を示す。この感情表現部８は、反応生成部９１、行動生成部９２、エージェント合成部９３を含む。
反応生成部９１は、自己感情情報（心理反応源）と外部状況情報（生理反応源）に応じたエージェントの表情反応、身体反応などを表わす反応情報を生成する。表情反応や身体反応とは、例えば、嬉しいときに笑顔になったり、暑いときに発汗したり、恐ろしいときに青ざめた表情になったりする、自己感情や外部状況に応じて非意図的に起こる反応のことである。生成される反応情報は、例えば、笑顔７０％、発汗２０％、青顔色４０％などのように、後段のエージェント合成部９３が制御可能なパラメータのコードとその強度等で表現される。
【０１１０】
行動生成部９２は、自己感情情報（心理的動機）と外部状況情報（行動制約条件）に応じたエージェントの行動を表わす行動情報を生成する。ここで言う行動とは、例えば、嫌な相手に愛想笑い（幸福感から笑うのとは別物）をしたり、好きな相手に近づいたりする、自己感情や、外部状況に応じて意図的に起こされる行動のことである。生成される行動情報は、反応情報と同様、距離２ｍ、笑顔２０％などのように、後段のエージェント合成部９３が制御可能なパラメータのコードとその強度等で表現される。
【０１１１】
反応は心理反応源と生理反応源とにより自動的に発生するが、行動は意図的に為されるものであるから必ず動機が必要である。感情パラメータは、概ね、快（幸福）、不快（悲しみ、嫌悪、驚き、恐れ）、不定（怒り）に大別される。このうち、不快に分類される自己感情が強い場合には、装置はその状況を回避する行動を起こす。また、快に分類される自己感情が強い場合には、装置はその状況を維持する行動を起こす。したがって、自己感情は行動のための心理的動機であると言える。
【０１１２】
行動生成部９２は予め幾つかの行動パタンを与えられている。各行動パタンには、適用可能な感情状態および外部状況と、試行する優先順位とが、行動適用規則として与えられている。動機が発生すると、行動生成部９２は該行動適用規則を用いて行動パタンを１つ選択する。選択された行動パタンには当該行動パタンを実現するのに必要な行動情報が付加されているので、行動生成部９２は該行動情報をエージェント合成部９３に出力すれば良い。行動を行って所定期間経過しても状況が改善されない場合には、行動生成部９２は次の順位を与えられている適用可能な行動パタンを試す。
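The selection procedure of paragraph [0112] can be sketched as a priority-ordered search over behavior patterns, with a fallback to the next applicable pattern when the first one does not improve the situation. The pattern names, the applicability conditions, and the `already_tried` argument are hypothetical; the action values of 2 m and 20 % smile echo the examples in paragraph [0110].

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class BehaviorPattern:
    name: str
    priority: int                       # lower number is tried first
    applicable: Callable[[dict, dict], bool]
    action: dict                        # parameters for the agent synthesizer

PATTERNS = [
    BehaviorPattern("back_away", 1,
                    lambda emo, sit: emo.get("fear", 0) > 0.5 and sit.get("person_present", False),
                    {"distance_m": 2.0}),
    BehaviorPattern("forced_smile", 2,
                    lambda emo, sit: emo.get("fear", 0) > 0.5,
                    {"smile": 0.2}),
]

def select_behavior(emotion, situation, already_tried=()):
    """Pick the highest-priority applicable pattern that was not tried yet."""
    candidates = [p for p in PATTERNS
                  if p.name not in already_tried and p.applicable(emotion, situation)]
    return min(candidates, key=lambda p: p.priority, default=None)


emotion = {"fear": 0.8}
situation = {"person_present": True}
first = select_behavior(emotion, situation)
second = select_behavior(emotion, situation, already_tried=("back_away",))
```

When no pattern applies, the function returns `None`, which corresponds to the case where there is no motive for action.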
【０１１３】
エージェント合成部９３は、以上のようにして前段から与えられる反応情報と行動情報を受け、これらに応じたエージェントの形状、色彩、ポーズ、動作軌道、声の調子などを計算し、エージェントの姿（映像あるいはロボットの身体）と声を実現する。
【０１１４】
なお、制御情報生成部９により外部から与えられる感情表出制御情報は、反応生成部９１と行動生成部９２の両方に働きかけ、反応や行動の現われる大きさ、すなわち、エージェント合成部９３が制御可能なパラメータのゲインを調節する。この結果、同じ条件であっても、感情表出制御情報を様々に調整することで、例えば、顔には出ないが行動に出るとか、顔にすぐ出るがなかなか行動しないというような、エージェントの性格を演出することが可能となる。
【０１１５】
次に、制御情報生成部９について説明する。
制御情報生成部９は、感受性制御情報を反応感情生成部３に、記銘制御閾値情報を感情的記憶生成部４に、学習強度制御情報を感情的記憶記述部５に、想起強度制御情報を想起感情生成部６に、感情表出制御情報を感情表現部８に各々供給する。これらの情報は、インタフェースエージェントの感情面での個性や性格を決定するパラメータである。すなわち、このパラメータを調整することにより、インタフェースエージェントに付与する感情面での個性や性格を設定・変更することができる。
【０１１６】
次に、本感情生成装置の処理手順について説明する。
図１２に、本感情生成装置の処理手順の一例を示す。図１２の手順例では、外部状況認識処理Ｓ１と、状況情報列更新処理Ｓ２と、反応感情更新処理Ｓ３と、感情的記憶更新処理Ｓ４と、想起感情更新処理Ｓ５と、自己感情更新処理Ｓ６と、感情表現処理Ｓ７が実行される。
【０１１７】
外部状況認識処理Ｓ１は、外部状況認識部１における処理に対応しており、画像、音声、その他の観測データ等に基づいて外部状況情報を生成する処理である。
【０１１８】
状況情報列更新処理Ｓ２は、状況記述部２における処理に対応しており、外部状況認識処理Ｓ１の処理結果である最新の外部状況情報を受け、状況情報列から最も古い時刻の外部状況情報を破棄して、最新の外部状況情報に置き換える処理である。
【０１１９】
反応感情更新処理Ｓ３は、反応感情生成部３における処理に対応しており、状況情報列更新処理Ｓ２による最新の状況情報列から所定の期間の状況情報列を取り出し、これに対する最新の反応感情情報を生成する処理である。
【０１２０】
感情的記憶更新処理Ｓ４は、感情的記憶生成部４と感情的記憶記述部５における処理に対応しており、新しく生成された反応感情情報を記憶すべきか否かを、該感情の強さによって決定し、記憶すべき十分な強さを有するときには、これを所定期間の状況情報列とともに記憶する処理である。
【０１２１】
想起感情更新処理Ｓ５は、想起感情生成部６における処理に対応しており、状況情報列更新処理Ｓ２において更新された所定期間の状況情報列に類似した過去の状況情報列に対する最新の想起感情情報を想起する処理である。
【０１２２】
自己感情更新処理Ｓ６は、自己感情記述部７における処理に対応しており、最新の反応感情情報と最新の想起感情情報とを合成した最新の自己感情情報を生成する処理である。
【０１２３】
感情表現処理Ｓ７は、感情表現部８における処理に対応しており、最新の自己感情情報に応じて反応と行動を表現する画像や音声などの信号を出力する処理である。
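The order of steps S1 to S7 can be illustrated with a heavily simplified loop. Everything in it is a toy assumption: the only emotion is fear, the reaction rule is "fraction of recent steps with shouting", and the memory is keyed by person ID instead of by a full situation sequence. It is meant only to show how recall keeps fear alive after the shouting has left the reaction window, as discussed in paragraph [0077].

```python
from collections import deque

def run_step(observation, history, memory, threshold=0.7):
    # S1/S2: take the recognized situation and push it into the history.
    history.appendleft(observation)
    # S3: reaction emotion from the short, most recent window (toy rule).
    recent = list(history)[:5]
    fear = sum(1 for s in recent if s["shouted"]) / len(recent)
    reaction = {"fear": fear}
    # S4: memorize who was present when the reaction was strong.
    if fear >= threshold:
        memory[observation["person_id"]] = dict(reaction)
    # S5: recall the emotion associated with the person currently present.
    recalled = memory.get(observation["person_id"], {"fear": 0.0})
    # S6: combine both into the self emotion.
    self_emotion = {"fear": reaction["fear"] + recalled["fear"]}
    # S7: emit an expression signal (here just a label).
    return "frightened" if self_emotion["fear"] > 0.5 else "calm"


history = deque(maxlen=12)
memory = {}
quiet_a = {"person_id": "A", "shouted": False}
outputs = [run_step({"person_id": "A", "shouted": True}, history, memory)]
outputs += [run_step(dict(quiet_a), history, memory) for _ in range(5)]
outputs.append(run_step({"person_id": "B", "shouted": False}, history, memory))
```

In this run the agent stays frightened while A remains present, even after A has stopped shouting, and becomes calm once B appears.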
【０１２４】
さて、従来技術に係るニューロベビーは、所定の期間内（可変期間ではなく最新の固定期間）の状況に関する情報の系列（２０ｍ秒の間隔をあけてとられる１０ｍ秒間のユーザ音声の最大振幅とゼロ交差回数を１０周期分採取した計２０個のパラメータ；語気に関する情報）を抽出し、これを入力層２０、中間層２４、出力層２の階層型ニューラルネットワークに入力して２個の感情に関するパラメータを求め、このパラメータに基づいてニューロベビーの映像と音声を制御している。
【０１２５】
一方、本実施形態では、現在の外界からの刺激に対する直接的な感情的反応を実現するだけでなく、過去に経験した状況／感情についての記憶との関連で想起される感情的反応をも実現することを可能としている。すなわち、感情的記憶生成部４と感情的記憶記述部５と想起感情生成部６とにより、状況記述部２の保持期間より過去に装置が経験した状況と感情を状況感情対情報として記憶・想起可能にすることで、強い感情を伴う状況が満たしている付帯条件を経験により学習可能としている。特に、状況が満たす付帯条件とそれに伴い想起される感情として、特定の人物や特定の事物もしくは事象への好悪感情を学習できることが、ユーザとの心理的関係構築を重視するインタフェースエージェントに必要な機能、すなわちユーザ毎に異なる感情的応答を示し、この応答をユーザがエージェントと如何に接するかで調整可変であるという性質を実現する。
【０１２６】
また、本実施形態では、制御情報生成部９により各種制御情報を供給可能とすることで、反応感情情報の強さとバランス、状況感情対情報の選択的記憶ならびに想起、経験による学習の強さ、想起感情情報の強さとバランス、感情表現の強さを制御可能にしている。この結果、インタフェースエージェントの個性として、例えば、感情的に反応しやすいか否か（感受性制御情報による）、懲りるたちか否かもしくは過去の経験が感情の想起に反映され易いか否か（記銘制御閾値情報、学習強度制御情報、想起強度制御情報による）、感情を表に出すたちか否か（感情表出制御情報による）等を調整可変とすることができる。
【０１２７】
なお、本実施形態に係る感情生成装置および感情生成方法は以上の例に限定されるものではない。
例えば、状況記述部２および状況情報列更新処理Ｓ２において生成される状況情報列として、外部状況情報に加えて、自己感情情報や反応情報、行動情報などの装置自身の内部状態（内部状況情報）を記述するようにしても良い。このようにすることで、感情を生成する予測不可能な付帯条件を、外部状況情報の他に内部状況情報をも手がかりにして学習することが可能になる。
【０１２８】
また、図４または図５の反応感情生成部３の構成と、図７または図８の感情的記憶記述部５の構成は、任意の組み合わせで実施可能である。
なお、以上の各機能は、ソフトウェアとしても実現可能である。
【０１２９】
また、本実施形態は、コンピュータに所定の手順を実行させるための（あるいはコンピュータを所定の手段として機能させるための、あるいはコンピュータに所定の機能を実現させるための）プログラムを記録したコンピュータ読取り可能な記録媒体として実施することもできる。
【０１３０】
例えば、図１３に例示するように、本発明に係る感情生成装置および感情生成方法を実現する情報（例えばプログラム）を記録媒体１０４に記録し、該記録した情報を該記録媒体１０４を経由して装置１０１や装置１０３に適用したり、通信回線１０５や１０６を経由して、装置１０２や１０３に適用することも可能である。
本発明は、上述した実施の形態に限定されるものではなく、その技術的範囲において種々変形して実施することができる。
【０１３１】
【発明の効果】
本発明によれば、過去に経験した感情とそのときの状況から該状況に特有な予測不可能の付帯条件を学習し、学習した付帯条件を満たす新しい状況の入力に対しては該付帯条件に一致する記憶された感情を想起することにより、実際にその状況に至らずとも、付帯条件の検出のみで感情を変化させることを可能にする。この結果、付帯条件としてユーザやユーザの行動を学習して感情的に応答できる、すなわちユーザ毎に感情表現を変え、ユーザの接し方でこれを調整可変なインタフェースエージェントを実現することができる。また、感情の現われ方に関して、インタフェースエージェント毎の性格付けを容易に行うことができる。
【図面の簡単な説明】
【図１】本発明の一実施形態に係る感情生成装置の基本構成を示す図
【図２】同実施形態に係る感情生成装置の外部状況認識部の構成例を示す図
【図３】同実施形態に係る感情生成装置の状況記述部の構成例を示す図
【図４】同実施形態に係る感情生成装置の反応感情生成部の構成例を示す図
【図５】同実施形態に係る感情生成装置の反応感情生成部の他の構成例を示す図
【図６】同実施形態に係る感情生成装置の感情的記憶生成部の構成例を示す図
【図７】同実施形態に係る感情生成装置の感情的記憶記述部の構成例を示す図
【図８】同実施形態に係る感情生成装置の感情的記憶記述部の他の構成例を示す図
【図９】同実施形態に係る感情生成装置の想起感情生成部の構成例を示す図
【図１０】同実施形態に係る感情生成装置の自己感情記述部の構成例を示す図
【図１１】同実施形態に係る感情生成装置の感情表現部の構成例を示す図
【図１２】同実施形態に係る感情生成装置における処理手順の一例を示すフローチャート
【図１３】本発明を記録媒体等により実施する場合について説明するための図
【符号の説明】
１…外部状況認識部
２…状況記述部
３…反応感情生成部
４…感情的記憶生成部
５…感情的記憶記述部
６…想起感情生成部
７…自己感情記述部
８…感情表現部
９…制御情報生成部
１１…画像情報入力部
１２…人物画像検出部
１３…人物認識部
１４…人物表情認識部
１５…人物動作認識部
１６…音声情報入力部
１７…人物音声検出部
１８…発声内容認識部
１９…語気認識部
２０…変化速度検出部
２１…明るさ検出部
２２…その他情報入力部
２３…外部状況情報出力部
３１…外部状況情報取得部
３２…状況情報蓄積部
３３…状況情報列出力部
４１…反応感情用状況情報列取得部
４２…状況感情変換部
４３…反応感情スケール変換部
５１…感情強度評価部
５２…状況感情対情報生成部
６１…状況感情対記憶部
６２…状況感情対更新部
６３…状況感情対検索部
７１…想起感情用状況情報列取得部
７２…想起感情スケール変換部
８１…自己感情合成部
８２…自己感情保持部
９１…反応生成部
９２…行動生成部
９３…エージェント合成部
１０１〜１０３…装置
１０４…記録媒体
１０５，１０６…通信回線
１１１…反応感情用状況情報列取得部
１１２…状況感情変換部
１１３…反応感情スケール変換部
１１４…変換規則格納部
１１５…規則照合部
１１６…反応感情合成部
１２１…状況感情対記憶部
１２２…状況感情対更新部
１２３…状況感情対検索部
１２４…状況感情対バッファ部
１２５…バッファ更新部
１２６…状況情報列照合部
１２７…想起感情合成部
[0001]
[Technical Field of the Invention]
The present invention relates to an emotion generation device and an emotion generation method for controlling the emotional expression of an interface agent.
[0002]
[Prior art]
Human interfaces that connect a system to its user have long been proposed and put to practical use, including instrument panels lined with switches and meters, command-line interfaces in which commands are typed on a keyboard, and graphical user interfaces (GUIs) built from a pointing device and icons. More recently, voice interfaces that accept spoken commands and reply by voice have been moving from the research stage to practical use. In addition, techniques are now being studied for realizing biological metaphors (virtual humans, robots, or animals) modeled on a butler, secretary, shop clerk, pet dog, and the like, as interface agents (autonomous interfaces) interposed between the inside of the system and the user.
[0003]
Adopting a biological metaphor such as a butler or a pet dog as the interface agent reduces the user's resistance to an impersonal system and fosters a sense of familiarity. In applications such as electronic friends and electronic pets in particular, the essence of the system is to create a psychological bond between the user and the interface agent through emotionally colored exchanges. In such applications it is therefore more important to improve the emotional realism of the interface agent than its intellectual competence.
[0004]
Emotion is a kind of internal state of a living being that cannot be seen directly, but we can infer it from facial expression, tone of voice, choice of words or cries, manner of movement, attitude, behavior, and so on. In particular, for techniques that render human facial expressions corresponding to emotions as images, many results have been obtained in the fields of computer graphics and human interfaces, beginning with the research on facial expression synthesis by three-dimensional CG disclosed in the document "A muscle model for animating three-dimensional facial expression" (Computer Graphics, vol. 4, 1987). However, this research aims at generating the emotional expression itself realistically. It only generates emotional expressions directly, either from a timetable that specifies how the expression changes over time or from a mapping between specific conditions and specific expressions, and does not simulate the emotion itself, which should lie between the two.
[0005]
Examples of research that addresses emotion itself include the human behavior simulation disclosed in the document "Human Behavior Simulation for CG Animation" (NICOGRAPH Proceedings, 1992) and the Neurobaby disclosed in the document "Neurobaby: A Facial Expression Synthesis System with an Emotion Model" (Image Lab, Sep. 1992).
[0006]
The former, human behavior simulation, introduces a parameter called attractiveness as an internal state, which makes it possible to generate CG animation in which a person moves while keeping a distance from, and avoiding collisions with, other people and automobiles. The way the distance to be kept varies with the assigned attractiveness reproduces well the human behavior of avoiding disliked people and approaching liked ones, so attractiveness can be regarded as corresponding to a feeling of like or dislike toward the other party. In this example, behavior (approach or avoidance) is controlled by the attractiveness parameter, which can be regarded as a like/dislike emotion, but facial expressions and the like are not handled as emotional expression.
[0007]
The latter, Neurobaby, has its own emotion parameters, changes them in response to the voice of a user who shouts at it or soothes it, and, based on the emotion parameter values, acts out joy, anger, sorrow, and pleasure through the facial expression, complexion, voice, and movement of a CG character. Unlike much of the other prior art, Neurobaby generates an emotional expression from an input in two stages: it first manipulates the emotion parameters, which are an internal state, based on the input, and then generates the emotional expression based on those emotion parameters.
[0008]
However, although Neurobaby can change its emotion parameters during an exchange with a user, it cannot change how those parameters change, or how a given emotion parameter value is expressed, according to the accumulated exchanges with a particular user. As a result, the changes in the emotion parameters shown by the system, and the emotional expression based on them, are fixed and repetitive regardless of time or partner, and the history of how a given user has treated a given interface agent cannot be turned into emotional expression that is unique to that agent toward that user. The same is true of the conventional human behavior simulation: attractiveness is given in advance, and nothing suggests a function or effect in which it changes with experience. In other words, although fostering a psychological bond between the user and the interface agent through the accumulated history of their dialogue can be a major purpose of systems such as electronic friends and electronic pets, systems based on the prior art could not realize such a function.
[0009]
[Problems to be solved by the invention]
As described above, in the conventional technique, the interface agent can only cause a predetermined emotional response to a predetermined input.
[0010]
In addition, the emotional expression in the specific interface agent cannot be controlled in a manner that reflects past exchanges (relationships) between the specific user and the specific interface agent.
[0011]
In addition, it is difficult to give individual differences in characteristics related to emotional expression among a plurality of interface agents, and to change the individuality assigned to one interface agent to another individuality.
[0012]
The present invention has been made in view of the above circumstances. An object of the present invention is to provide an emotion generation device and an emotion generation method that can learn unpredictable incidental conditions that appear specifically in the situations in which a predetermined emotion arises during actual operation, and can recall the predetermined emotion in a new situation that satisfies the learned incidental conditions.
[0013]
Another object of the present invention is to provide an emotion generation device and an emotion generation method that can learn a liking for a specific person or a specific thing as a pair consisting of the incidental conditions that the situation must satisfy and the emotion recalled from them, and that thereby realize a function necessary for interface agents in which building a psychological relationship with the user is emphasized, such as electronic friends and electronic pets: namely, emotional responses that differ for each user and that can be adjusted by how the user treats the interface agent.
[0014]
It is another object of the present invention to provide an emotion generation device and an emotion generation method that make it easy to give each interface agent its own individuality in how emotions appear.
[0015]
[Means for Solving the Problems]
An emotion generation device according to the present invention comprises: situation recognition means for recognizing the surrounding situation and generating situation information composed of a plurality of types of information; situation description means for generating and holding a situation information sequence that collects the situation information for a predetermined period from the present back into the past; reaction emotion generation means for generating, when a predetermined type of information is detected in the situation information sequence, reaction emotion information corresponding to that information; emotional memory description means for associating the reaction emotion information with the situation information sequence and storing them as situation emotion pair information when the intensity of the reaction emotion information is equal to or greater than a predetermined threshold; recall emotion generation means for recalling, for an input situation information sequence, the reaction emotion information associated with stored situation emotion pair information as recall emotion information, according to the degree of coincidence between the input and the situation information sequence associated with that situation emotion pair information; self-emotion description means for generating self-emotion information by synthesizing the recalled recall emotion information with the newly generated reaction emotion information; and emotion expression means for generating and outputting a signal corresponding to the self-emotion information. The situation information includes the predetermined type of information for generating the reaction emotion information together with other types of information, or includes a plurality of types of the predetermined type of information.
An emotion generation method according to the present invention comprises the steps of: recognizing the surrounding situation and generating situation information composed of a plurality of types of information; generating a situation information sequence that collects the situation information for a predetermined period from the present back into the past, and storing it in storage means; detecting a predetermined type of information in the situation information sequence and generating the latest reaction emotion information corresponding to that information; when the intensity of the reaction emotion information is equal to or greater than a predetermined threshold, associating the reaction emotion information with the situation information sequence, learning them as situation emotion pair information, and storing the learning result in storage means; generating, for an input situation information sequence, recall emotion information by recalling the reaction emotion information associated with learned situation emotion pair information, according to the degree of coincidence between the input and the situation information sequence associated with that situation emotion pair information; generating self-emotion information by synthesizing the newly generated recall emotion information with the newly generated reaction emotion information; and generating and outputting a signal corresponding to the self-emotion information. The situation information includes the predetermined type of information for generating the reaction emotion information together with other types of information, or includes a plurality of types of the predetermined type of information.
[0020]
The present invention relating to the apparatus is also established as an invention relating to a method, and the present invention relating to a method is also established as an invention relating to an apparatus.
Further, the present invention relating to the device or the method can also be realized as a computer-readable recording medium on which is recorded a program for causing a computer to execute the procedures corresponding to the invention (or for causing a computer to function as the means corresponding to the invention, or for causing a computer to realize the functions corresponding to the invention).
[0021]
In the present invention, a mechanism is provided that evaluates the situation against predictable conditions and generates the device's own emotion, and unpredictable incidental conditions peculiar to a situation are learned from the emotion actually experienced through this mechanism and the situation at that time. When a new situation that satisfies the learned incidental conditions is input, the stored emotion matching those conditions is recalled, so that the emotion can be changed merely by detecting the incidental conditions, without the situation itself actually being reached. As a result, it is possible to realize an interface agent that learns the user and the user's behavior as incidental conditions and responds emotionally, that is, one whose emotional expression differs for each user and can be adjusted by how the user treats it (the generated emotion changes with experience). In addition, each interface agent can easily be given its own individuality in how emotions appear.
[0022]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the invention will be described with reference to the drawings.
The emotion generation device according to the present embodiment is, in general terms, applied to an interface agent that has an anthropomorphic (or quasi-biological) form and has a function of interacting with the user or of presenting information to the user, together with the means for doing so (for example, an interface agent whose anthropomorphic appearance is rendered in CG, or a robot whose appearance itself is anthropomorphic), and it controls the emotional expression of that agent (through motion, voice output, and the like).
[0023]
And this embodiment is for making the emotional expression more personified. In addition, the mechanism of generating the emotion expression is made more autonomous, and individual differences in the characteristics related to the emotion expression of the interface agent, which correspond to the personality in humans, can be easily set and changed.
[0024]
In the present embodiment, the emotion generation device is described as being installed individually in each interface agent. Therefore, when something is called device-specific below, it is specific to the emotion generation device and hence to the interface agent. Likewise, "around the device" in the following description means around the emotion generation device, that is, around the place where the interface agent is located.
[0025]
FIG. 1 shows a basic configuration of an emotion generation device according to an embodiment of the present invention. As shown in FIG. 1, this emotion generation device includes an external situation recognition unit 1, a situation description unit 2, a reaction emotion generation unit 3, an emotional memory generation unit 4, an emotional memory description unit 5, a recall emotion generation unit 6, a self-emotion description unit 7, an emotion expression unit 8, and a control information generation unit 9.
[0026]
In general, the external situation recognition unit 1 receives images, sounds, and other information, analyzes them, and sequentially generates external situation information representing the current situation around the apparatus.
In general, the situation description unit 2 periodically reads out the external situation information (person ID code, person facial expression code, utterance content code, brightness, etc.) generated by the external situation recognition unit 1, and holds the latest external situation information for a predetermined period, together with the readout times, as a situation information sequence.
[0027]
In general, the reaction emotion generation unit 3 generates and outputs a device-specific emotion (reaction emotion information) that changes in direct response to the situation information sequence for a designated period held by the situation description unit 2.
In general, the emotional memory generation unit 4 generates situation emotion pair information in which the reaction emotion information from the reaction emotion generation unit 3 is associated with the situation information sequence for a designated period held by the situation description unit 2 (however, as will be described later, the situation emotion pair information is generated only when the reaction emotion information at that time has sufficient strength).
[0028]
In general, the emotional memory description unit 5 holds the (strong) situation emotion pair information by the emotional memory generation unit 4 for a longer period than the situation description unit 2 can hold.
In general, the recall emotion generation unit 6 reads the situation information sequence for a designated period from the situation description unit 2, searches the emotional memory description unit 5 for the emotion information corresponding to that situation information sequence, and outputs it as recall emotion information.
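The interplay of units 4, 5, and 6 can be illustrated with a minimal Python sketch. It is not the implementation described in this specification: the class `EmotionalMemory`, the function `match_degree`, the use of the largest parameter as the "strength" of an emotion, and the fraction-of-equal-time-steps matching measure are all assumptions made only for illustration.

```python
from dataclasses import dataclass, field

EMOTIONS = ("happiness", "anger", "sadness", "disgust", "surprise", "fear")


def match_degree(seq_a, seq_b):
    """Assumed measure: fraction of time steps whose situation records are equal."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        return 0.0
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b) / n


@dataclass
class EmotionalMemory:
    threshold: float                       # memory control threshold (assumed scalar)
    pairs: list = field(default_factory=list)

    def maybe_store(self, situation_seq, reaction):
        """Keep a situation/emotion pair only when the reaction is strong enough."""
        if max(reaction.values()) >= self.threshold:
            self.pairs.append((list(situation_seq), dict(reaction)))
            return True
        return False

    def recall(self, situation_seq):
        """Recall stored emotions, weighted by how well the situations match."""
        recalled = {name: 0.0 for name in EMOTIONS}
        for stored_seq, emotion in self.pairs:
            weight = match_degree(stored_seq, situation_seq)
            for name in EMOTIONS:
                candidate = weight * emotion.get(name, 0.0)
                recalled[name] = max(recalled[name], candidate)
        return recalled
```

Taking the maximum over stored pairs is one possible way to combine several memories; a weighted average would be an equally plausible choice under these assumptions.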
[0029]
In general, the self-emotion description unit 7 holds, as the current self-emotion information, the emotion information obtained by synthesizing the reaction emotion information from the reaction emotion generation unit 3 and the recall emotion information from the recall emotion generation unit 6.
[0030]
In general, the emotion expression unit 8 outputs the emotional expression of the interface agent or the like, for example as images or sound, in accordance with the current self-emotion information described in the self-emotion description unit 7.
[0031]
In general, the control information generation unit 9 supplies sensitivity control information to the reaction emotion generation unit 3, memory control threshold information to the emotional memory generation unit 4, learning intensity control information to the emotional memory description unit 5, recall intensity control information to the recall emotion generation unit 6, and emotion expression control information to the emotion expression unit 8. These pieces of information are parameters that determine the emotional character and individuality of the interface agent or the like.
[0032]
Now, each of the external situation recognition unit 1 to the control information generation unit 9 will be described in detail in order.
First, the external situation recognition unit 1 will be described.
[0033]
The external situation recognition unit 1 inputs images, sounds, and other information through a TV camera, a microphone, and other sensors, analyzes them, and sequentially generates external situation information representing the current situation around the apparatus.
[0034]
In the present embodiment, the description assumes that images, sounds, and other information (such as temperature) are handled. However, the present invention is not limited to this, and the combination of physical quantities to be handled, including images and sounds (that is, stimuli from the outside world as seen by the interface agent or the like), can be modified as appropriate.
[0035]
FIG. 2 shows a configuration example of the external situation recognition unit 1. The external situation recognition unit 1 includes an image information input unit 11, a person image detection unit 12, a person recognition unit 13, a person facial expression recognition unit 14, a person motion recognition unit 15, a voice information input unit 16, a person voice detection unit 17, an utterance content recognition unit 18, a speech tone recognition unit 19, a change speed detection unit 20, a brightness detection unit 21, an other information input unit 22, and an external situation information output unit 23.
[0036]
The image information input unit 11 captures an image with a TV camera or the like and outputs it as image data.
The person image detection unit 12 detects and outputs the image area of the image data in which a person appears. This detection is performed by scanning the image data while matching it against a face template that describes facial image features stored in advance; a region whose similarity is equal to or higher than a predetermined reference value is recognized as the face location, and the surrounding area including the face is extracted as the person area. The partial image data extracted as the person area is output as person image data.
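The template scan described above can be sketched as follows. This is a simplified illustration, not the detection method of the specification: the function names, the similarity measure (one minus the mean absolute pixel difference, with pixels assumed to lie in [0, 1]), and the default reference value are assumptions.

```python
def similarity(image, template, top, left):
    """Assumed measure: 1 minus mean absolute pixel difference (pixels in [0, 1])."""
    h, w = len(template), len(template[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += abs(image[top + y][left + x] - template[y][x])
    return 1.0 - total / (h * w)


def find_face(image, template, reference=0.9):
    """Scan the template over the image and return (top, left) of the best
    position whose similarity reaches the reference value, or None."""
    h, w = len(template), len(template[0])
    best_pos, best_score = None, reference
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            score = similarity(image, template, top, left)
            if score >= best_score:
                best_pos, best_score = (top, left), score
    return best_pos
```

A caller would then enlarge the returned position into a surrounding rectangle to obtain the person area; how much margin to add is left open here.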
[0037]
The person recognition unit 13 compares the face area of the person image data with face templates stored in advance for each person, identifies who the person is, and outputs a person ID code (or, for a person who does not correspond to any known template, a special ID code representing an unknown person) to a predetermined buffer 24-1 of the external situation information output unit 23. When no person image data is detected, a special ID code indicating that no person is present is output as the person ID code.
[0038]
The person facial expression recognition unit 14 identifies the type of facial expression (for example, normal, laughing, angry, sad, etc.) by matching the face area of the above person image data against templates stored in advance for each facial expression. It then outputs a person facial expression code representing this type (or, for a facial expression that does not correspond to any known template, a special ID code representing an unknown facial expression) to a predetermined buffer 24-2 of the external situation information output unit 23. If no person image data is detected, a special ID code indicating no facial expression is output as the person facial expression code.
[0039]
The person motion recognition unit 15 compares the motion vectors obtained by optical flow analysis of the entire area of the above person image data with templates stored in advance for each motion, identifies the type of motion (for example, a speaking motion, a whispering motion, a beckoning motion, etc.), and outputs a person action code representing this type (or, for a motion that does not correspond to any known template, a special ID code representing an unknown action) to a predetermined buffer 24-3 of the external situation information output unit 23. If no person image data is detected, a special ID code indicating no action is output as the person action code.
[0040]
The voice information input unit 16 inputs voice with a microphone or the like and outputs it as voice data.
The person voice detection unit 17 monitors the outputs of the person motion recognition unit 15 and the voice information input unit 16. When the voice data from the voice information input unit 16 has large power and, at the same time, a speaking motion (such as opening and closing of the mouth) is detected, it treats that voice data as a person's voice and outputs it as person voice data.
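As a rough illustration of this two-condition check, the sketch below gates the audio on both signal power and the reported action code. The power measure (mean squared amplitude), the threshold value, and the string `"speaking"` used as the action code are assumptions, not values taken from the specification.

```python
def mean_power(samples):
    """Mean squared amplitude of an audio frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def detect_person_voice(samples, action_code,
                        power_threshold=0.01, speaking_code="speaking"):
    """Return the frame as person voice data only when it is loud enough
    and a speaking motion is reported at the same time; otherwise None."""
    if mean_power(samples) >= power_threshold and action_code == speaking_code:
        return samples
    return None
```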
[0041]
The utterance content recognition unit 18 compares the person voice data with templates stored in advance for each spoken word, identifies the word in the voice, and outputs an utterance content code representing this word (or, for a word that does not correspond to any known template, a special ID code representing an unknown word) to a predetermined buffer 24-4 of the external situation information output unit 23.
[0042]
The speech tone recognition unit 19 compares the voice parameters obtained by spectrum analysis of the above person voice data with templates stored in advance for each tone of speech, identifies the type of tone (for example, gentle, shouting, etc.), and outputs a speech tone code representing this type (or, for a tone that does not correspond to any known template, a special ID code representing an unknown tone) to a predetermined buffer 24-5 of the external situation information output unit 23.
[0043]
The change speed detection unit 20 calculates the sum of the maximum absolute value of the temporal difference of the image data and the maximum absolute value of the temporal difference of the audio data, and outputs it as the situation change speed to a predetermined buffer 24-6 of the external situation information output unit 23.
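The calculation above is simple enough to state directly in code. In this sketch, frames and audio blocks are assumed to be flat sequences of numbers of equal length between consecutive time steps; the function names are hypothetical.

```python
def max_abs_diff(previous, current):
    """Largest absolute sample-wise difference between two time steps."""
    return max(abs(c - p) for p, c in zip(previous, current))


def situation_change_speed(prev_image, cur_image, prev_audio, cur_audio):
    """Sum of the maximum temporal differences of the image and audio data."""
    return max_abs_diff(prev_image, cur_image) + max_abs_diff(prev_audio, cur_audio)
```

Because the two terms are added without normalization, the image and audio values would need to be on comparable scales for the sum to be meaningful; the specification does not state how this is arranged.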
[0044]
The brightness detection unit 21 calculates the brightness of the entire image of the image data and outputs it to a predetermined buffer 24-7 of the external situation information output unit 23. The brightness may be obtained by a sensor such as an illuminometer.
[0045]
The other information input unit 22 extracts temperature and other information from the input data of other input means such as a temperature sensor and switches, and outputs the extracted information as other information to a predetermined buffer 24-8 of the external situation information output unit 23.
[0046]
The external situation information output unit 23 includes a set of buffers (24-1 to 24-8) that hold the various types of information generated by the blocks described above, from the person recognition unit 13 to the brightness detection unit 21, and outputs the values of the buffers collectively as external situation information. In general, the external situation information is multi-track information consisting of the presence or absence of a person, the identity of the person, the person's action, the content of the person's utterance, the tone of the person's utterance, sudden changes in the situation, and non-person conditions such as the brightness and temperature of the room, and the time required for image processing and sound processing differs from track to track. Therefore, the information of each track is written to the buffers (24-1 to 24-8) asynchronously, but the latest information as of the most recent update can always be read out through the buffers (24-1 to 24-8).
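One way to picture the asynchronous per-track buffers is a small latest-value store guarded by a lock. This is only a sketch: the class name, the track names, and the use of a lock are assumptions, and the specification does not say how concurrent access is handled.

```python
import threading


class ExternalSituationBuffers:
    """Holds the most recent value of each track; writers update asynchronously."""

    def __init__(self, tracks):
        self._lock = threading.Lock()
        self._values = {name: None for name in tracks}

    def update(self, track, value):
        """Called by a recognizer whenever it finishes processing its track."""
        with self._lock:
            self._values[track] = value

    def read_all(self):
        """Return a snapshot of the latest value of every track."""
        with self._lock:
            return dict(self._values)
```

A reader that calls `read_all()` periodically sees, for each track, whatever was written last, regardless of how long each recognizer takes.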
[0047]
The various templates used above are generated by collecting a large number of real samples in advance and statistically analyzing them (for example, principal component analysis).
The types of external situation information generated by the external situation recognition unit 1 are not limited to the above examples and can be selected or extended as necessary, and the number of buffers held by the external situation information output unit 23 can be selected or extended accordingly.
[0048]
The person ID code output from the person recognition unit 13 may be supplied to the person voice detection unit 17, which may then perform speaker-specific voice detection according to the person ID code. Similarly, the person ID code may be supplied to the utterance content recognition unit 18, which may then perform speaker-specific utterance content recognition according to the person ID code, and to the speech tone recognition unit 19, which may then perform speaker-specific tone recognition according to the person ID code. In this way, when the speaker is known, the accuracy and speed of each recognition process can be improved.
[0049]
Next, the situation description unit 2 will be described.
The situation description unit 2 periodically reads the external situation information (person ID code, person facial expression code, utterance content code, brightness, etc.) generated by the external situation recognition unit 1, and holds the latest external situation information for a predetermined period, together with the readout times, as a situation information sequence. In addition, for the other blocks that require the held situation information sequence (the reaction emotion generation unit 3, the emotional memory generation unit 4, and the recall emotion generation unit 6), the situation description unit 2 can output the situation information sequence for the period specified by period designation information supplied by that block.
[0050]
FIG. 3 shows a configuration example of the situation description unit 2. The situation description unit 2 includes an external situation information acquisition unit 31, a situation information storage unit 32, and a situation information string output unit 33.
The external situation information acquisition unit 31 periodically reads out, in a batch, the external situation information held in the buffers (24-1 to 24-8) of the external situation information output unit 23 of the external situation recognition unit 1, attaches the time of acquisition, and outputs the result.
[0051]
The status information storage unit 32 deletes the oldest storage in the built-in ring buffer 34 and writes the newly acquired external status information and the latest value of the acquisition time therein. In this way, the status information storage unit 32 can hold the status information sequence and the acquisition time for the latest fixed period (from T-0 to T-11 in the example of FIG. 3). (T-0 is the latest time).
[0052]
The status information string output unit 33 reads the status information and the acquisition time for the period determined by the period designation information input from the outside from the ring buffer 34 and edits and outputs the status information to the column buffer 35 as a status information string.
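The ring buffer and the period-designated readout can be sketched with `collections.deque`. The class and method names are hypothetical, and the capacity of 12 simply mirrors the T-0 to T-11 example of FIG. 3.

```python
from collections import deque


class SituationStore:
    """Keeps the latest `capacity` (time, info) entries, oldest dropped first."""

    def __init__(self, capacity=12):
        self._ring = deque(maxlen=capacity)

    def write(self, time, info):
        """Append the newest entry; deque discards the oldest when full."""
        self._ring.append((time, info))

    def read(self, newest=0, oldest=4):
        """Return the entries from T-oldest to T-newest, oldest first."""
        items = list(self._ring)
        n = len(items)
        start = max(0, n - 1 - oldest)
        stop = max(0, n - newest)
        return items[start:stop]
```

For example, `read(0, 4)` returns the five most recent entries, while `read(2, 6)` returns a window of the same length shifted two steps into the past.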
[0053]
Because each block that uses the situation information sequence can supply period designation information when reading it, the period of the situation on which that block bases its processing can be adjusted. This is referred to as the time sensitivity of the processing. In general, a block that needs to respond quickly to the situation designates a period close to the present, and a block that uses a situation going back somewhat into the past designates a period shifted toward the past.
[0054]
In the above description, a ring buffer is used to hold the external situation information and the acquisition times for a certain period, but FIFO buffers may be used instead. In this case, a FIFO buffer is provided for each track of the external situation information, and when the latest external situation information is acquired, the data at the head of each FIFO buffer is discarded and the latest external situation information is stored at the tail of each FIFO buffer.
[0055]
Next, the reaction emotion generation unit 3 will be described.
The reaction emotion generation unit 3 generates and outputs the device-specific emotion (reaction emotion information) that changes in direct response to the situation information sequence for the designated period held by the situation description unit 2. The strength of the reaction emotion information can be controlled according to the magnitude of separately supplied sensitivity control information. This sensitivity control information is a control parameter that determines how emotionally the device reacts to the situation, and is supplied by the control information generation unit 9. So that it can respond quickly to changes in the situation, the reaction emotion generation unit 3 outputs to the situation description unit 2, as its time sensitivity, period designation information that designates a comparatively short, most recent period, such as the period from T-0 to T-4 in FIG. 3.
[0056]
FIG. 4 shows a configuration example of the reaction emotion generation unit 3. The reaction emotion generation unit 3 includes a reaction emotion situation information sequence acquisition unit 41, a situation emotion conversion unit 42, and a reaction emotion scale conversion unit 43.
[0057]
The reaction emotion situation information sequence acquisition unit 41 reads the situation information sequence corresponding to the period designation information from the situation description unit 2. The situation information sequence that is read includes code quantities, such as a person ID code (for example, a person number indicating Mr. A), an utterance content code (for example, a word number indicating "good morning"), and a person action code (for example, an action number indicating beckoning), as well as numerical quantities such as brightness and situation change speed.
[0058]
The situation emotion conversion unit 42 is configured as a hierarchical neural network with three layers, an input layer 44, an intermediate layer 45, and an output layer 46, and each layer (44, 45, 46) is formed of a predetermined number of units (47). Every unit in the input layer 44 has a weighted connection (48) to every unit in the intermediate layer 45, and the output value of each source unit, multiplied by the connection weight, is input to the destination unit. Similarly, every unit in the intermediate layer 45 has a weighted connection to every unit in the output layer 46, so that the output values of the units in the input layer 44 are propagated to the output layer 46 while being weighted along each path.
[0059]
The input layer 44 has a unit group corresponding to the track of the external situation information, and a necessary number of units are prepared for each unit group.
For example, in the case of the unit group corresponding to the person ID, if the device can distinguish three persons, "Mr. A", "Mr. B", and "Mr. C" (that is, the face templates of these three persons are registered), the number of units required for the unit group is five, corresponding to the five codes "Mr. A", "Mr. B", "Mr. C", "unknown person", and "nobody", multiplied by the number of time steps in the designated period (the length of the situation information sequence; for example, 5 when the period from T-0 to T-4 in FIG. 3 is designated).
[0060]
If there is “Mr. A” as the person ID at “a certain time” in the status information string, “Mr. A's unit at the corresponding time” is activated (a predetermined value such as 1.0 is substituted). On the contrary, for a unit having no corresponding code, it is deactivated (0.0 is substituted).
[0061]
The number of units required for a unit group corresponding to a numerical quantity such as brightness is one (to hold the numerical value) multiplied by the number of time steps in the designated period (the length of the situation information sequence).
[0062]
In this way, the status information sequence is assigned to a predetermined unit of the input layer 44 corresponding to the time, the track, and in some cases, the code. The assigned value becomes the output value of the unit as it is.
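The assignment of a situation information sequence to input units can be illustrated as below. The sketch covers only two tracks (person ID and brightness), the code labels are shortened, and the function names are hypothetical; the layout of units within the input vector is an assumption.

```python
# Five codes for the person-ID track, as in the three-person example above.
PERSON_CODES = ("A", "B", "C", "unknown", "nobody")


def encode_person_track(codes):
    """One unit per code per time step: 1.0 for the observed code, else 0.0."""
    units = []
    for code in codes:
        units.extend(1.0 if code == c else 0.0 for c in PERSON_CODES)
    return units


def encode_numeric_track(values):
    """One unit per time step, holding the numerical value as is."""
    return [float(v) for v in values]


def encode_sequence(person_codes, brightness):
    """Concatenate the unit groups of both tracks into one input vector."""
    return encode_person_track(person_codes) + encode_numeric_track(brightness)
```

With a five-step period, the person-ID group contributes 5 x 5 = 25 units and the brightness group 1 x 5 = 5 units, matching the counts given in the text.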
[0063]
The value of each unit of the input layer 44 is input to the units of the intermediate layer 45 through the weighted connections. Each unit of the intermediate layer 45 calculates the sum of the values input from all the units of the input layer 44, and outputs the value of the sigmoid function for this sum. The output value of each unit of the intermediate layer 45 is in turn input to the units of the output layer 46 through the weighted connections. Each unit of the output layer 46 likewise calculates the sum of the values input from all the units of the intermediate layer 45, and outputs the value of the sigmoid function for this sum.
[0064]
In this embodiment, each unit of the output layer 46 corresponds to six emotion parameters of “happiness”, “anger”, “sadness”, “disgust”, “surprise”, and “fear”. Therefore, the hierarchical neural network of the situation emotion conversion unit 42 functions as a pattern converter that receives a situation information string and outputs six emotion parameter intensities.
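The forward propagation just described (weighted sums followed by a sigmoid, with no bias term mentioned in the text) can be written compactly. The weight layout `weights[j][i]` and the function names are assumptions for this sketch, and the weights themselves would come from offline training.

```python
import math

EMOTIONS = ("happiness", "anger", "sadness", "disgust", "surprise", "fear")


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights):
    """weights[j][i] is the weight from input unit i to unit j of this layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def situation_to_emotion(inputs, w_hidden, w_output):
    """Propagate the input-layer values through the intermediate layer to the
    six output units and label them with the emotion parameter names."""
    hidden = layer(inputs, w_hidden)
    outputs = layer(hidden, w_output)
    return dict(zip(EMOTIONS, outputs))
```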
[0065]
Of course, variations of emotion parameters are not limited to this example, but the description will be continued with these six types of emotion parameters as examples.
In order to give the neural network such a pattern conversion capability, sample data satisfying a predetermined input/output relationship is given to the neural network, which is trained offline by the back-propagation algorithm or the like. For example, input/output sample data is prepared in which: a predetermined intensity is given to the fear parameter if the surroundings of the device are dark (see brightness) or an unknown person is detected (see the person ID code); a predetermined intensity is given to the surprise parameter if a loud sound is input (see the situation change speed) or the device is shouted at (see the utterance content code); a predetermined intensity is given to the happiness parameter if a known person (see the person ID code) speaks gently (see the speech tone code); and predetermined intensities are given to the anger, sadness, and disgust parameters according to the utterance content code.
[0066]
The reaction emotion scale conversion unit 43 multiplies each emotion parameter appearing at the units of the output layer 46 by a gain determined by the sensitivity control information (which can be set for each emotion parameter) and outputs the result. As a result, even after the neural network has been trained, the overall magnitude of the reaction emotion information that is finally output, and the balance among its parameters, can be controlled.
[0067]
This sensitivity control information can be regarded as a control parameter that causes devices (individual interface agents and the like) having the same neural network and given the same situation information sequence to generate reaction emotions of different strength and balance. A desired individuality can therefore be produced by adjusting the gains. For example, a more emotionally excitable agent can be produced by setting the absolute values of the gains large, and a more detached agent by setting them small. Likewise, increasing the share of sadness or fear produces a tearful or timid agent, and increasing the share of happiness produces a cheerful agent.
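A per-parameter gain of this kind amounts to a single multiplication per emotion. In the sketch below the function name, the default gain of 1.0 for unlisted parameters, and the example gain values are all hypothetical.

```python
def scale_reaction(emotions, gains):
    """Multiply each emotion parameter by its gain (1.0 if none is given)."""
    return {name: gains.get(name, 1.0) * value for name, value in emotions.items()}


# Hypothetical gain sets illustrating two "personalities".
TIMID_GAINS = {"fear": 2.0, "sadness": 1.5, "happiness": 0.5}
CHEERFUL_GAINS = {"happiness": 2.0, "fear": 0.5}
```

Applying `TIMID_GAINS` and `CHEERFUL_GAINS` to the same network output yields differently balanced reaction emotions without retraining the network, which is the point the paragraph makes.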
[0068]
Incidentally, Neuro Baby according to the prior art also derives emotion information from a sequence of situation information using a neural network. However, because the obtained emotion information cannot be scaled arbitrarily, giving each Neuro Baby its own individuality requires retraining the input/output relationship of the neural network to suit that individuality, and preparing separate training data for this purpose takes time and effort. Furthermore, in Neuro Baby according to the prior art, this training must be completed before the device (Neuro Baby) is put into operation, so it lacks the flexibility of the present embodiment, in which the individuality can be changed immediately during operation by changing the sensitivity control information.
[0069]
The above has described one configuration example of the reaction emotion generation unit 3, in which a neural network is used for the situation emotion conversion unit. Another configuration example of the reaction emotion generation unit 3, in which if-then rules are used for the situation emotion conversion unit, is described below.
[0070]
FIG. 5 shows another configuration example of the reaction emotion generation unit 3. The reaction emotion generation unit 3 includes a reaction emotion situation information sequence acquisition unit 111, a situation emotion conversion unit 112, and a reaction emotion scale conversion unit 113.
[0071]
The reaction emotion situation information sequence acquisition unit 111 and the reaction emotion scale conversion unit 113 are the same as the reaction emotion situation information sequence acquisition unit 41 and the reaction emotion scale conversion unit 43, respectively.
[0072]
The situation emotion conversion unit 112 performs the same function as the situation emotion conversion unit 42 described above, but the internal configuration is different.
Here, the situation emotion conversion unit 112 will be mainly described.
[0073]
As shown in FIG. 5, the situation emotion conversion unit 112 includes a conversion rule storage unit 114, a rule matching unit 115, and a reaction emotion synthesis unit 116.
The conversion rule storage unit 114 stores a conversion rule that defines how emotion parameters should be determined based on the situation.
[0074]
Examples of conversion rules are as follows. These are example rules for the case of being yelled at and the case where the surroundings are dark, and each is written as an if-then rule having a condition part and a conclusion part.

The rule matching unit 115 compares each conversion rule stored in the conversion rule storage unit 114 with the situation over the predetermined period given by the input situation information sequence, calculates a degree of matching, and outputs the emotion parameter values defined by the rule multiplied by that degree of matching. The degree of matching is obtained by computing the cumulative length of time during which the condition part of the rule holds and taking the ratio of that cumulative length to the period length of the situation information sequence. In this way, from the time the condition stated in the condition part first appears in the situation information sequence of the predetermined period until it disappears from it, emotion parameter values with a strength corresponding to how long the condition has been present are output.
[0075]
The reaction emotion synthesis unit 116 sums the emotion parameter values, weighted by degree of matching, that the rule matching unit 115 outputs over all conversion rules held in the conversion rule storage unit 114, and outputs the sum as the reaction emotion information corresponding to the input situation information sequence.
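The rule matching and synthesis steps can be sketched as below. This is an illustrative sketch under stated assumptions: the situation information sequence is modeled as a list of per-time-step dictionaries, each rule as a (condition, emotion values) pair, and the two rules and their numbers are hypothetical stand-ins, not the rules of the original description.

```python
def matching_degree(condition, sequence):
    """Fraction of time steps in the sequence for which the condition part holds."""
    if not sequence:
        return 0.0
    return sum(1 for step in sequence if condition(step)) / len(sequence)


def react(rules, sequence):
    """Sum, over all rules, each rule's emotion values weighted by its matching degree."""
    total = {}
    for condition, emotions in rules:
        degree = matching_degree(condition, sequence)
        for name, value in emotions.items():
            total[name] = total.get(name, 0.0) + degree * value
    return total


# Hypothetical rules: (condition part, emotion values of the conclusion part).
rules = [
    (lambda s: s.get("voice") == "yelling", {"fear": 0.8, "surprise": 0.6}),
    (lambda s: s.get("brightness", 1.0) < 0.2, {"fear": 0.4}),
]

# Hypothetical situation information sequence covering four time steps.
sequence = [
    {"voice": "quiet", "brightness": 0.9},
    {"voice": "yelling", "brightness": 0.9},
    {"voice": "yelling", "brightness": 0.1},
    {"voice": "yelling", "brightness": 0.1},
]

reaction = react(rules, sequence)
```

In this example the yelling condition holds for three of the four steps and the darkness condition for two, so the two fear contributions are weighted by 0.75 and 0.5 before being added.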
[0076]
Note that the reaction emotion generation unit 3 of FIG. 4 and FIG. 5 is responsible for inducing the emotion (for example, being frightened and surprised when yelled at) that arises in response to a general condition known before the device goes into operation (for example, being yelled at). It therefore knows nothing about the other, more detailed conditions that accompany the condition of being yelled at and become known only after operation begins, such as who yelled and when.
[0077]
In contrast, the emotion recall mechanism formed by the emotional memory generation unit 4, the emotional memory description unit 5, and the recall emotion generation unit 6, which are described later, memorizes incidental conditions such as "who" and "when" that accompanied a past situation of being yelled at. When such an incidental condition appears in the situation information sequence (for example, when the person who yelled is present), it recalls the emotion of fear even if the agent is not being yelled at (the emotion of surprise, being momentary, is excluded from recall). This mechanism is necessary for a device (interface agent) that later encounters a similar situation to show a stronger and quicker emotional reaction based on past experience, and it also produces an effect not found in the prior art in a situation where the agent has only just been yelled at. That is, even after the condition "being yelled at" disappears from the situation information sequence that the reaction emotion generation unit 3 looks at, the recall emotion generation unit 6 sustains the emotion of fear as long as the person who yelled is still in front of the agent. The prior-art Neuro Baby cannot achieve such a reaction: as soon as the condition of being yelled at disappears, the emotion of fear disappears too, producing an agent that recovers strangely quickly (this may suffice to portray an innocent baby, but other personalities cannot be portrayed).
[0078]
Next, the emotional memory generation unit 4 will be described.
The emotional memory generation unit 4 generates situation emotion pair information that associates the reaction emotion information from the reaction emotion generation unit 3 with the situation information sequence for a designated period from the situation description unit 2, and passes it to the emotional memory description unit 5. However, the emotional memory generation unit 4 generates the situation emotion pair information only when at least one of the reaction emotion parameters at that time has sufficient strength. This is so that only situations in which a strong emotion was felt are memorized, and other trivial situations are not.
[0079]
FIG. 6 shows a configuration example of the emotional memory generation unit 4. The emotional memory generation unit 4 includes an emotion strength evaluation unit 51 and a situation emotion pair information generation unit 52.
The emotion strength evaluation unit 51 reads the reaction emotion information from the reaction emotion generation unit 3 and evaluates whether any of the emotion parameters it contains has a value equal to or greater than the strength specified by the memorization control threshold information from the control information generation unit 9. If a parameter of sufficient strength is detected, the unit judges that the reaction emotion information should be stored together with the situation, turns the detection information ON, and sends it to the situation emotion pair information generation unit 52 in the next stage.
[0080]
The situation emotion pair information generation unit 52 acquires the reaction emotion information from the emotion strength evaluation unit 51 and, when reaction emotion information to be stored has been detected (when the detection information is ON), sends period designation information to the situation description unit 2 and reads out the situation information sequence for the designated period. It then stores the obtained reaction emotion information and situation information sequence, combined as situation emotion pair information, in the situation information sequence buffer 54 and the reaction emotion information buffer 55 of the composite buffer 53, and outputs it. So that a relatively long situation information sequence preceding the change in reaction emotion is obtained, the situation emotion pair information generation unit 52 outputs to the situation description unit 2, as its time sensitivity, period designation information that designates a relatively long period starting slightly before the current time, such as the period from T-1 to T-8 in FIG. 3.
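The threshold check and the pairing step can be sketched as follows. This is a sketch under assumptions: `history[k]` is taken to hold the situation at time T-k, the threshold is a single number applied to every emotion parameter, and the function names are hypothetical.

```python
def should_memorize(reaction, threshold):
    """Detection information: ON if any emotion parameter reaches the threshold."""
    return any(value >= threshold for value in reaction.values())


def make_pair(reaction, history, threshold, start=1, end=8):
    """Build situation emotion pair information, or return None for weak emotions.

    history[k] is assumed to be the situation at time T-k, so the default
    window T-1 .. T-8 is the slice history[1:9].
    """
    if not should_memorize(reaction, threshold):
        return None
    return {
        "situation": list(history[start:end + 1]),
        "emotion": dict(reaction),
    }
```

A pair is produced only for strong emotions, so trivial situations never reach the emotional memory description unit.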
[0081]
Next, the emotional memory description unit 5 will be described.
The emotional memory description unit 5 holds the situation emotion pair information (that accompanied by strong emotion) generated by the emotional memory generation unit 4 for a period longer than the situation description unit 2 can hold it.
[0082]
FIG. 7 shows a configuration example of the emotional memory description unit 5. The emotional memory description unit 5 includes a situation emotion pair storage unit 61, a situation emotion pair update unit 62, and a situation emotion pair search unit 63.
The situation emotion pair storage unit 61 is configured using a hierarchical neural network with the same structure as the situation emotion conversion unit 42 illustrated in FIG. 4. This hierarchical neural network consists of three layers: an input layer 64 corresponding to the situation information sequence in the situation emotion pair information, an output layer 66 corresponding to the emotion information in the situation emotion pair information, and an intermediate layer 65. The units (67) of adjacent layers are connected by weighted connections (68).
[0083]
As with the input layer (44) and output layer (46) of the situation emotion conversion unit 42 in FIG. 4, the units of the input layer 64 of the situation emotion pair storage unit 61 in FIG. 7 are prepared separately, and the units of the output layer 66 are prepared to correspond to the emotion parameters felt by the device (six emotion parameters in this example). The initial value of each connection weight is set to 0 so that nothing is output for any input.
[0084]
When the situation emotion pair update unit 62 receives situation emotion pair information from the emotional memory generation unit 4, it decomposes the information into a situation information sequence for memorization and emotion information for memorization. It then activates or deactivates, or assigns numerical values to, the units of the input layer 64 corresponding to the situation information sequence for memorization and the units of the output layer 66 corresponding to the emotion information for memorization. As a result, the pattern of activation values given to the input layer 64 and the output layer 66 represents a strong reaction emotion actually held by the device and the (relatively long-term) situation that caused it. The emotional memory description unit 5 uses this activation pattern as sample data and adjusts the connection weights of the neural network in the direction that satisfies the given input/output relationship. The magnitude of this adjustment is proportional to separately supplied learning intensity control information: if this value is large, learning proceeds, and if it is small, learning proceeds little. The learning intensity control information can therefore be regarded as a parameter that gives the strength of learning. As a result of this adjustment, the neural network thereafter outputs a reaction emotion similar to the emotion information for memorization for situations similar to the situation information sequence for memorization.
[0085]
The situation emotion pair search unit 63 gives a recall situation information sequence supplied from outside to the input layer 64 of the situation emotion pair storage unit 61, and outputs the emotion parameter values appearing at the output layer 66 to the outside as recall emotion information (the situation emotion pair search unit 63 functions as a memory search mechanism).
[0086]
The above has described a configuration example of the emotional memory description unit 5 in which a neural network is used for the situation emotion pair storage unit. Another configuration example of the emotional memory description unit 5, in which a pattern matching method is used for the situation emotion pair storage unit, is described below.
[0087]
FIG. 8 shows another configuration example of the emotional memory description unit 5. The emotional memory description unit 5 includes a situation emotion pair storage unit 121, a situation emotion pair update unit 122, and a situation emotion pair search unit 123.
[0088]
The situation emotion pair update unit 122 and the situation emotion pair search unit 123 are the same as the situation emotion pair update unit 62 and the situation emotion pair search unit 63 described above, respectively.
The situation emotion pair storage unit 121 performs the same function as the situation emotion pair storage unit 61 described above, but the internal configuration is different.
[0089]
Here, the situation emotion pair storage unit 121 will be mainly described.
As shown in FIG. 8, the situation emotion pair storage unit 121 includes a situation emotion pair buffer unit 124, a buffer update unit 125, a situation information sequence collation unit 126, and a recall emotion synthesis unit 127.
[0090]
The situation emotion pair buffer unit 124 is a storage means having a predetermined number of composite buffers (128), each of which stores a situation information sequence paired with the emotion information at that time. Each composite buffer includes a situation information sequence buffer 129 and an emotion information buffer 130, as in the situation emotion pair information generation unit 52 of FIG. 6.
[0091]
The buffer update unit 125 pairs the situation information sequence for memorization and the emotion information for memorization received from the situation emotion pair update unit 122 and writes them to a free composite buffer of the situation emotion pair buffer unit 124. If there is no free composite buffer, the contents of the composite buffer holding the oldest information are discarded and overwritten with the new information. In addition, the buffer update unit 125 does not write the emotion parameter values of the emotion information for memorization as they are, but writes values scale-converted with the gain indicated by the learning intensity control information. As a result, although every piece of situation emotion pair information is stored for the time being, the magnitude and balance of the stored emotion parameter values can be adjusted by the learning intensity control information.
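The buffer update behavior can be sketched with a bounded queue. This is a sketch, not the patented structure: the class name `PairBuffer` and the per-emotion gain dictionary are assumptions, and `collections.deque` with `maxlen` is used here simply because it discards the oldest entry when a new one is appended to a full queue.

```python
from collections import deque


class PairBuffer:
    """Fixed number of composite buffers; the oldest pair is overwritten when full."""

    def __init__(self, capacity):
        # A deque with maxlen drops the entry at the opposite end on append.
        self.slots = deque(maxlen=capacity)

    def write(self, situation, emotion, learning_gain):
        """Store the pair with emotion values scaled by the learning intensity gain.

        learning_gain -- hypothetical per-emotion gains; a missing gain defaults to 1.0
        """
        scaled = {
            name: value * learning_gain.get(name, 1.0)
            for name, value in emotion.items()
        }
        self.slots.append((list(situation), scaled))
```

Setting a gain to zero for a given emotion means pairs are still stored, but contribute nothing for that emotion when recalled.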
[0092]
The situation information sequence collation unit 126 receives the recall situation information sequence from the situation emotion pair search unit 123, collates it with the situation information sequence stored in each composite buffer 128, calculates a degree of matching, and outputs the emotion parameter values stored in that composite buffer multiplied by the degree of matching. To calculate the degree of matching, the difference between each track value at each time in the recall situation information sequence and the corresponding track value at each time in the stored situation information sequence is first calculated. For code quantities the difference is 0 if the values match and 1 if they do not; for numerical quantities it is the normalized absolute value of the difference. Normalization for numerical quantities is a scale conversion such that the absolute value of the difference falls between 0 and 1. As a result, the difference lies between 0 and 1 for every track. These difference values are summed over all times and all tracks of the recall situation information sequence and the stored situation information sequence, and the sum is then normalized by dividing it by (period length of the recall situation information sequence) × (period length of the stored situation information sequence) × (number of tracks). As a result, the total difference is a value between 0 and 1. The degree of matching is the value obtained by subtracting this normalized total difference from 1; it is 1 for the best match and 0 for no match at all.
[0093]
Finally, the recall emotion synthesis unit 127 sums the emotion parameter values output by the situation information sequence collation unit 126 over all composite buffers held in the situation emotion pair buffer unit 124, and outputs the sum as the recall emotion information corresponding to the recall situation information sequence.
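The collation and synthesis steps can be sketched as follows. The sketch reads the stated normalization literally, that is, it assumes every time step of the recall sequence is compared with every time step of the stored sequence, which is why the divisor is the product of both period lengths and the track count. The dictionary-per-time-step representation, the string test for code quantities, and the `value_range` parameter are assumptions of the sketch.

```python
def track_difference(a, b, value_range=1.0):
    """Difference in [0, 1]: 0/1 for code (string) values, scaled |a - b| for numbers."""
    if isinstance(a, str) or isinstance(b, str):
        return 0.0 if a == b else 1.0
    return min(abs(a - b) / value_range, 1.0)


def coincidence(recall_seq, stored_seq):
    """1 minus the normalized total difference; both sequences must be non-empty."""
    tracks = list(recall_seq[0].keys())
    total = sum(
        track_difference(r[t], s[t])
        for r in recall_seq
        for s in stored_seq
        for t in tracks
    )
    return 1.0 - total / (len(recall_seq) * len(stored_seq) * len(tracks))


def recall(recall_seq, buffers):
    """Sum stored emotion values over all buffers, each weighted by its coincidence."""
    out = {}
    for stored_seq, emotion in buffers:
        degree = coincidence(recall_seq, stored_seq)
        for name, value in emotion.items():
            out[name] = out.get(name, 0.0) + degree * value
    return out
```

With two tracks, a recall sequence that matches a stored one on the person track but differs fully on the brightness track yields a degree of matching of 0.5, so half of the stored fear value is recalled.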
[0094]
Whereas the reaction emotion generation unit 3 reacts to the short-term situation in which an emotion arises, the emotional memory description unit 5 of FIGS. 7 and 8 learns the relatively long-term situation preceding the occurrence of a strong emotion, because the emotional memory generation unit 4 generates the situation emotion pair information with a relatively long time sensitivity.
[0095]
Next, the recall emotion generation unit 6 will be described.
The recall emotion generation unit 6 reads the status information sequence within the specified period from the status description unit 2, searches the emotional memory description unit 5 for emotion information corresponding to the status information sequence, and outputs it as recall emotion information.
[0096]
FIG. 9 shows a configuration example of the recall emotion generation unit 6. The recall emotion generation unit 6 includes a recall emotion situation information sequence acquisition unit 71 and a recall emotion scale conversion unit 72.
The recall emotion situation information sequence acquisition unit 71 sends period designation information to the situation description unit 2, reads the situation information sequence for the designated period, and forwards it to the situation emotion pair search unit 63 of the emotional memory description unit 5.
[0097]
The situation emotion pair search unit 63 inputs the received situation information sequence, as a recall situation information sequence, to the input layer 64 of the situation emotion pair storage unit 61, and sends back the stored emotion parameter values that appear at the output layer 66 in response.
[0098]
The recall emotion scale conversion unit 72 receives the emotion parameter values returned by the situation emotion pair search unit 63, multiplies them by the gain defined by the recall intensity control information (which can be set for each emotion parameter), and outputs the result as the recall emotion information.
[0099]
As a result, the magnitude of the recall emotion information and the balance among its parameters can be controlled, which makes it possible to give the agent an individuality with respect to recall, namely which emotions it recalls strongly and which it hardly recalls.
[0100]
Here, it is preferable to set the gain for surprise in the recall intensity control information low, so that momentary emotions such as surprise, which should be generated only by the reaction emotion generation unit 3, are not recalled.
[0101]
Incidentally, the prior-art Neuro Baby generates the corresponding emotion only once a situation has actually occurred, for example becoming frightened when yelled at. This achieves the natural, innate reaction of being frightened when yelled at loudly, but who will do the yelling cannot be known in advance. If everyone yelled, this could be taught in advance, but in reality some people yell and others do not. The prior-art Neuro Baby therefore cannot handle information such as who yelled or who often yells, nor can it react to such information.
[0102]
In contrast, the memory mechanism formed by the emotional memory generation unit 4 and the emotional memory description unit 5 of the present embodiment learns, from the experience of actually being yelled at by Mr. A and being surprised or frightened, the incidental condition "Mr. A" as part of the situation of being yelled at. Furthermore, the recall emotion generation unit 6 recalls the emotion of fear merely upon detecting Mr. A, who is an incidental condition of a past situation of being yelled at, as if it had detected a situation in which it is likely to be yelled at. This can be regarded as having learned a like/dislike feeling toward Mr. A. Similarly, there may be a case where, for example, the surroundings grew dark and thunder roared; in such a case it is also possible to learn a like/dislike feeling toward an incidental condition other than a person, such as the surroundings being dark.
[0103]
To repeat, the reaction emotion generation unit 3 is a mechanism for generating well-known emotional reactions that can be given to the device in advance. In contrast, the memory and recall mechanism composed of the emotional memory generation unit 4, the emotional memory description unit 5, and the recall emotion generation unit 6 builds on the innate emotional reactions of the reaction emotion generation unit 3: it learns the further incidental conditions that accompanied the situations in which those emotional reactions arose, and reacts to them thereafter.
[0104]
Note that the period (time sensitivity) of the situation information sequence that the reaction emotion generation unit 3 looks at is a short period going back from the present (for example, from T-0 to T-4 in FIG. 3), and the period that the emotional memory generation unit 4 looks at is, as described above, a relatively long period starting slightly before the present (for example, from T-1 to T-8 in FIG. 3). The purpose of the recall emotion generation unit 6 is to evaluate incidental conditions before a situation is actually confirmed and to create in the device an emotional state that anticipates that situation. It is therefore reasonable for the period of the situation information sequence that the recall emotion generation unit 6 looks at to be a period of the same length set further back in the past than that of the emotional memory generation unit 4 (for example, from T-4 to T-11 in FIG. 3).
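The three time windows can be written down as slices over a history of situations. This is only an illustrative sketch: it assumes `history[k]` holds the situation at time T-k, and the window bounds are the example values given for FIG. 3; the names `WINDOWS` and `window` are hypothetical.

```python
# (start, end) offsets into the past, inclusive, taken from the FIG. 3 examples.
WINDOWS = {
    "reaction": (0, 4),   # reaction emotion generation unit 3: T-0 .. T-4
    "memory": (1, 8),     # emotional memory generation unit 4: T-1 .. T-8
    "recall": (4, 11),    # recall emotion generation unit 6:   T-4 .. T-11
}


def window(history, unit):
    """Return the part of the history that the given unit looks at.

    history[k] is assumed to be the situation at time T-k.
    """
    start, end = WINDOWS[unit]
    return history[start:end + 1]
```

The memory and recall windows have the same length (eight steps); only their offset into the past differs.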
[0105]
Next, the self emotion description unit 7 will be described.
The self-emotion description unit 7 holds emotion information obtained by synthesizing the reaction emotion information from the reaction emotion generation unit 3 and the recall emotion information from the recall emotion generation unit 6 as current self emotion information.
[0106]
FIG. 10 shows a configuration example of the self-emotion description unit 7. The self-emotion description unit 7 includes a self-emotion synthesis unit 81 and a self-emotion holding unit 82.
The self-emotion synthesis unit 81 receives the reaction emotion information from the reaction emotion generation unit 3 and the recall emotion information from the recall emotion generation unit 6, synthesizes the two, and outputs the result as self emotion information. The synthesis is performed, for example, by adding together the values of the same emotion parameter in both inputs and taking the sum as the corresponding emotion parameter value of the self emotion information.
[0107]
The self-emotion holding unit 82 stores and holds the parameter values of the self-emotion information obtained by the self-emotion synthesis unit 81 in the corresponding built-in buffers (83-1 to 83-6).
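The additive synthesis given as an example above can be sketched in a few lines. The function name and the dictionary representation are assumptions; an emotion missing from one input is treated as 0.

```python
def synthesize_self_emotion(reaction, recalled):
    """Add reaction and recall values of the same emotion parameter."""
    names = set(reaction) | set(recalled)
    return {
        name: reaction.get(name, 0.0) + recalled.get(name, 0.0)
        for name in names
    }
```

With this rule, fear can remain in the self emotion through the recall term even when the reaction term has already dropped to zero.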
[0108]
Next, the emotion expression unit 8 will be described.
The emotion expression unit 8 outputs the emotion expression of the interface agent as an image or sound according to the current self emotion information described in the self emotion description unit 7.
[0109]
FIG. 11 shows a configuration example of the emotion expression unit 8. The emotion expression unit 8 includes a reaction generation unit 91, an action generation unit 92, and an agent synthesis unit 93.
The reaction generation unit 91 generates reaction information representing the agent's facial expression reactions, body reactions, and the like according to the self emotion information (the psychological source of reactions) and the external situation information (the physiological source of reactions). Facial expression reactions and body reactions are reactions that occur unintentionally depending on the self emotion or the external situation, such as smiling when happy, sweating when hot, or turning pale when frightened. The generated reaction information is expressed as codes of parameters that the agent synthesis unit 93 in the subsequent stage can control, together with their strengths, for example smile 70%, sweat 20%, pale complexion 40%.
[0110]
The action generation unit 92 generates action information representing the agent's actions according to the self emotion information (the psychological motive) and the external situation information (the constraints on action). Actions here are those performed intentionally depending on the self emotion or the external situation, such as forcing a smile at a disliked partner (a smile like the one that arises from happiness) or approaching a liked partner. Like the reaction information, the generated action information is expressed as codes of parameters that the agent synthesis unit 93 in the subsequent stage can control, together with their strengths, for example distance 2 m, smile 20%.
[0111]
A reaction is generated automatically from its psychological and physiological sources, but an action is performed intentionally and therefore always requires a motive. The emotion parameters are broadly classified into pleasant (happiness), unpleasant (sadness, disgust, surprise, fear), and indeterminate (anger). When a self emotion classified as unpleasant is strong, the device acts to escape the situation. When a self emotion classified as pleasant is strong, the device acts to maintain the situation. In this way, self emotion serves as the psychological motive for action.
[0112]
The action generation unit 92 is given several action patterns in advance. Each action pattern is assigned, as its action application rule, the emotional state and external situation to which it applies and the priority with which it should be tried. When a motive arises, the action generation unit 92 selects one action pattern using the action application rules. Because the action information needed to realize it is attached to the selected action pattern, the action generation unit 92 need only output that action information to the agent synthesis unit 93. If the situation has not improved after a predetermined period has passed since the action was taken, the action generation unit 92 tries the applicable action pattern with the next-highest priority.
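Priority-based selection with fallback can be sketched as below. This is a sketch under assumptions: the `ActionPattern` fields, the convention that a smaller priority number is tried first, and the `already_tried` bookkeeping are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ActionPattern:
    name: str
    priority: int                           # smaller number = tried earlier (assumption)
    applies: Callable[[dict, dict], bool]   # (self emotion, external situation) -> bool
    action_info: dict                       # parameters handed to the agent synthesis stage


def select_action(patterns, emotion, situation, already_tried=()):
    """Return the highest-priority applicable pattern not yet tried, or None."""
    for pattern in sorted(patterns, key=lambda p: p.priority):
        if pattern.name in already_tried:
            continue
        if pattern.applies(emotion, situation):
            return pattern
    return None
```

A caller would invoke `select_action` again with the failed pattern's name added to `already_tried` once the predetermined period has passed without improvement.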
[0113]
The agent synthesis unit 93 receives the reaction information and action information supplied from the preceding stage as described above, calculates the agent's shape, color, pose, motion trajectory, voice tone, and so on accordingly, and realizes the agent's figure (video or robot body) and voice.
[0114]
Note that the emotion expression control information supplied by the control information generation unit 9 acts on both the reaction generation unit 91 and the action generation unit 92 and adjusts the magnitude with which reactions and actions are expressed, that is, the gains of the various parameters that the agent synthesis unit 93 can control. As a result, even under identical conditions, adjusting the emotion expression control information in various ways makes it possible to portray an expressive individuality for the agent, for example one whose emotions do not show on its face but do show in its actions, or one whose emotions show on its face immediately but who is slow to act.
[0115]
Next, the control information generation unit 9 will be described.
The control information generation unit 9 supplies sensitivity control information to the reaction emotion generation unit 3, memorization control threshold information to the emotional memory generation unit 4, learning intensity control information to the emotional memory description unit 5, recall intensity control information to the recall emotion generation unit 6, and emotion expression control information to the emotion expression unit 8. These pieces of information are parameters that determine the emotional individuality and personality of the interface agent. That is, by adjusting these parameters, the emotional individuality and personality given to the interface agent can be set and changed.
[0116]
Next, a processing procedure of the emotion generation apparatus will be described.
FIG. 12 shows an example of the processing procedure of the emotion generation apparatus. In the procedure example of FIG. 12, the external situation recognition process S1, the situation information sequence update process S2, the reaction emotion update process S3, the emotional memory update process S4, the recall emotion update process S5, the self-emotion update process S6, and the emotion expression process S7 are executed.
[0117]
The external situation recognition process S1 corresponds to the process in the external situation recognition unit 1, and is a process for generating external situation information based on images, sounds, other observation data, and the like.
[0118]
The situation information sequence update process S2 corresponds to the processing in the situation description unit 2. It receives the latest external situation information produced by the external situation recognition process S1, discards the external situation information of the oldest time from the situation information sequence, and adds the latest external situation information in its place.
[0119]
The reaction emotion update process S3 corresponds to the processing in the reaction emotion generation unit 3. It takes out the situation information sequence for a predetermined period from the latest situation information sequence produced by the situation information sequence update process S2 and generates the latest reaction emotion information corresponding to it.
[0120]
The emotional memory update process S4 corresponds to the processing in the emotional memory generation unit 4 and the emotional memory description unit 5. It judges from the strength of the emotion whether the newly generated reaction emotion information should be stored and, when the emotion is strong enough to be stored, stores it together with the situation information sequence for a predetermined period.
[0121]
The recall emotion update process S5 corresponds to the processing in the recall emotion generation unit 6. It recalls the latest recall emotion information for past situation information sequences similar to the situation information sequence of a predetermined period updated in the situation information sequence update process S2.
[0122]
The self-emotion update process S6 corresponds to the process in the self-emotion description unit 7, and is a process for generating the latest self-emotion information obtained by synthesizing the latest reaction emotion information and the latest recall emotion information.
[0123]
The emotion expression process S7 corresponds to the process in the emotion expression unit 8, and is a process for outputting a signal such as an image or a sound that expresses a reaction and action according to the latest self-emotion information.
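One pass through S1 to S7 can be sketched as a single function. This is a structural sketch only: the units are passed in as plain callables under hypothetical keys, the history is a bounded `deque` whose index 0 is the newest situation, and the self emotion is formed by simple addition as in the example given for the self-emotion synthesis unit 81.

```python
from collections import deque


def run_cycle(observation, history, units, memory_threshold):
    """Execute S1..S7 once and return whatever the expression stage produces.

    history -- deque(maxlen=N); appendleft on a full deque drops the oldest entry
    units   -- dict of callables standing in for the units (hypothetical interface)
    """
    situation = units["recognize"](observation)                 # S1
    history.appendleft(situation)                               # S2
    reaction = units["react"](list(history))                    # S3
    if any(v >= memory_threshold for v in reaction.values()):   # S4
        units["memorize"](list(history), reaction)
    recalled = units["recall"](list(history))                   # S5
    self_emotion = {                                            # S6
        name: reaction.get(name, 0.0) + recalled.get(name, 0.0)
        for name in set(reaction) | set(recalled)
    }
    return units["express"](self_emotion)                       # S7
```

Calling `run_cycle` repeatedly with a shared `history` reproduces the loop of FIG. 12 at the level of data flow; the real units would replace the callables.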
[0124]
The prior-art Neuro Baby extracts a sequence of information about the situation within a predetermined period (the most recent fixed period, not a variable one), namely a total of 20 parameters obtained by collecting, over 10 cycles, the maximum amplitude and the number of zero crossings of 10 milliseconds of user voice sampled at 20-millisecond intervals (information about vocabulary), and inputs it to a hierarchical neural network with 20 input-layer units, 24 intermediate-layer units, and 2 output-layer units to obtain parameters for two emotions. The video and audio of Neuro Baby are controlled based on these parameters.
[0125]
In contrast, the present embodiment realizes not only direct emotional reactions to current stimuli from the outside world but also emotional reactions recalled in connection with memories of situations and emotions experienced in the past. That is, the emotional memory generation unit 4, the emotional memory description unit 5, and the recall emotion generation unit 6 make it possible to store and recall, as situation emotion pair information, situations and emotions that the device experienced further in the past than the holding period of the situation description unit 2, so that the incidental conditions satisfied by situations accompanied by strong emotion can be learned through experience. In particular, because the device can learn like/dislike feelings toward specific people, things, or events as incidental conditions satisfied by a situation and its accompanying emotion, it provides a function needed by interface agents that place importance on building a psychological relationship with the user: it shows a different emotional reaction to each user, and this reaction can be shaped by how the user treats the agent.
[0126]
In the present embodiment, the control information generator 9 can supply various types of control information, so that the strength and balance of the reaction emotion information, the selective storage and recall of situation emotion pair information, the strength of learning through experience, the strength and balance of the recalled emotion information, and the strength of emotional expression can be controlled. As a result, the personality of the interface agent can be adjusted, for example, as to whether it reacts emotionally with ease (by sensitivity control information), whether it is disciplinary or whether past experience is easily reflected in emotional recall (by memory control threshold information, learning intensity control information, and recall intensity control information), and whether emotions are readily shown outwardly (by emotion expression control information).
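How such control information might shape a personality can be sketched as follows. The field names mirror the kinds of control information listed above, but the class `Personality`, the function `step`, the scalar emotions, and the use of simple multiplication and addition are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Personality:
    sensitivity: float = 1.0         # scales the reaction emotion
    memory_threshold: float = 0.5    # minimum strength worth storing
    learning_intensity: float = 1.0  # scales what is stored
    recall_intensity: float = 1.0    # scales the recalled emotion
    expression_gain: float = 1.0     # scales the outward expression


def step(p, raw_reaction, raw_recall):
    """Apply the control values to one reaction/recall pair.

    Returns (self_emotion, stored, expressed); stored is None when the
    reaction is too weak to be memorized.
    """
    reaction = p.sensitivity * raw_reaction
    recall = p.recall_intensity * raw_recall
    self_emotion = reaction + recall          # assumed combination rule
    if abs(reaction) >= p.memory_threshold:
        stored = p.learning_intensity * reaction
    else:
        stored = None
    expressed = p.expression_gain * self_emotion
    return self_emotion, stored, expressed


default_agent = Personality()
stoic_agent = Personality(sensitivity=0.5, expression_gain=0.2)
default_result = step(default_agent, -0.8, -0.2)
stoic_result = step(stoic_agent, -0.8, -0.2)
```

Given the same stimulus, the agent with lower sensitivity and expression gain feels less, stores nothing, and shows far less, so two agents with identical mechanisms can present different personalities.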
[0127]
Note that the emotion generation apparatus and the emotion generation method according to the present embodiment are not limited to the above examples.
For example, as the situation information sequence generated in the situation description unit 2 and the situation information sequence update process S2, the internal state of the device itself (internal situation information), such as self-emotion information, reaction information, and behavior information, may be described in addition to the external situation information. Doing so makes it possible to learn unpredictable incidental conditions for generating an emotion using the internal situation information as a clue in addition to the external situation information.
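A situation information sequence that carries internal situation information alongside external situation information can be sketched as follows. The class name `SituationSequence`, the dictionary representation, the `internal:` key prefix, and a holding period counted in entries rather than in time are hypothetical choices for illustration only.

```python
from collections import deque


class SituationSequence:
    """Holds the most recent entries, discarding older ones automatically."""

    def __init__(self, holding_period=5):
        # deque with maxlen drops the oldest entry when a new one arrives.
        self.entries = deque(maxlen=holding_period)

    def update(self, external, internal=None):
        """Append one entry that merges external and internal information."""
        entry = dict(external)
        for key, value in (internal or {}).items():
            entry[f"internal:{key}"] = value   # prefix avoids key clashes
        self.entries.append(entry)

    def snapshot(self):
        """Return the held entries from oldest to newest."""
        return list(self.entries)


sequence = SituationSequence(holding_period=3)
for t in range(4):
    sequence.update({"time": t, "user": "A"}, {"self_emotion": 0.1 * t})
history = sequence.snapshot()
```

Because each entry records the device's own state next to what it perceived, a later matching step could use either kind of information as a clue.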
[0128]
Further, the configuration of the reaction emotion generation unit 3 in FIG. 4 or 5 and the configuration of the emotional memory description unit 5 in FIG. 7 or 8 can be implemented in any combination.
The above functions can also be realized as software.
[0129]
The present embodiment can also be implemented as a computer-readable recording medium on which is recorded a program for causing a computer to execute a predetermined procedure (or for causing a computer to function as predetermined means, or for causing a computer to realize a predetermined function).
[0130]
For example, as illustrated in FIG. 13, information (for example, a program) for realizing the emotion generation apparatus and the emotion generation method according to the present invention is recorded on the recording medium 104, and the recorded information can be applied to the apparatus 101 and the apparatus 103 via the recording medium 104, or to the apparatuses 102 and 103 via the communication lines 105 and 106.
The present invention is not limited to the embodiment described above, and can be implemented with various modifications within the technical scope thereof.
[0131]
[Effect of the invention]
According to the present invention, unpredictable incidental conditions peculiar to a situation are learned from emotions experienced in the past and the situation at that time, and when a new situation satisfying a learned incidental condition is input, the emotion stored with that incidental condition is recalled. This makes it possible to change the emotion merely by detecting the incidental condition, without actually reaching the situation itself. As a result, it is possible to realize an interface agent that learns the user and the user's behavior as incidental conditions and responds emotionally, that is, one whose emotional expression differs for each user and can be adjusted by how the user treats it. In addition, the personality of each interface agent regarding how its emotions appear can be easily set.
[Brief description of the drawings]
FIG. 1 is a diagram showing a basic configuration of an emotion generation device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a configuration example of an external situation recognition unit of the emotion generation device according to the embodiment.
FIG. 3 is a diagram showing a configuration example of a situation description unit of the emotion generation device according to the embodiment.
FIG. 4 is a diagram illustrating a configuration example of a reaction emotion generation unit of the emotion generation device according to the embodiment.
FIG. 5 is a diagram showing another configuration example of the reaction emotion generation unit of the emotion generation device according to the embodiment.
FIG. 6 is a diagram showing a configuration example of an emotional memory generation unit of the emotion generation device according to the embodiment.
FIG. 7 is a diagram showing a configuration example of an emotional memory description unit of the emotion generation device according to the embodiment.
FIG. 8 is a diagram showing another configuration example of the emotional memory description unit of the emotion generation device according to the embodiment.
FIG. 9 is a diagram showing a configuration example of a recall emotion generation unit of the emotion generation device according to the embodiment.
FIG. 10 is a diagram showing a configuration example of a self-emotion description unit of the emotion generation device according to the embodiment.
FIG. 11 is a diagram showing a configuration example of an emotion expression unit of the emotion generation device according to the embodiment.
FIG. 12 is a flowchart showing an example of a processing procedure in the emotion generation device according to the embodiment.
FIG. 13 is a diagram for explaining a case where the present invention is implemented using a recording medium or the like.
[Explanation of symbols]
1 ... External situation recognition unit
2 ... Situation description unit
3 ... Reaction emotion generation unit
4 ... Emotional memory generation unit
5 ... Emotional memory description unit
6 ... Recall emotion generation unit
7 ... Self-emotion description unit
8 ... Emotion expression unit
9 ... Control information generation unit
11 ... Image information input unit
12 ... Person image detection unit
13 ... Person recognition unit
14 ... Human facial expression recognition unit
15 ... Human motion recognition unit
16 ... Voice information input unit
17 ... Person voice detection unit
18 ... Speech recognition unit
19 ... Word recognition unit
20 ... Change rate detection unit
21 ... Brightness detection unit
22 ... Other information input unit
23 ... External situation information output unit
31 ... External situation information acquisition unit
32 ... Situation information storage unit
33 ... Situation information sequence output unit
41 ... Situation information sequence acquisition unit for reaction emotion
42 ... Situation emotion conversion unit
43 ... Reaction emotion scale conversion unit
51 ... Emotion strength evaluation unit
52 ... Situation emotion pair information generation unit
61 ... Situation emotion pair storage unit
62 ... Situation emotion pair update unit
63 ... Situation emotion pair search unit
71 ... Situation information sequence acquisition unit for recalled emotion
72 ... Recalled emotion scale conversion unit
81 ... Self-emotion synthesis unit
82 ... Self-emotion holding unit
91 ... Reaction generation unit
92 ... Action generation unit
93 ... Agent composition unit
101-103 ... Apparatus
104 ... Recording medium
105, 106 ... Communication line
111 ... Situation information sequence acquisition unit for reaction emotion
112 ... Situation emotion conversion unit
113 ... Reaction emotion scale conversion unit
114 ... Conversion rule storage unit
115 ... Rule matching unit
116 ... Reaction emotion synthesis unit
121 ... Situation emotion pair storage unit
122 ... Situation emotion pair update unit
123 ... Situation emotion pair search unit
124 ... Situation emotion pair buffer
125 ... Buffer update unit
126 ... Situation information sequence matching unit
127 ... Recalled emotion synthesis unit

Claims

An emotion generation device comprising:
situation recognition means for recognizing surrounding situations and generating situation information consisting of multiple types of information;
situation description means for generating and holding a situation information sequence in which the situation information is collected for a predetermined period going back from the present to the past;
reaction emotion generation means for generating, when a predetermined type of information is detected from the situation information sequence, reaction emotion information according to that information;
emotional memory description means for associating the reaction emotion information with the situation information sequence and storing them as situation emotion pair information when the intensity of the reaction emotion information is equal to or greater than a predetermined threshold;
recall emotion generation means for recalling, in response to input of the situation information sequence, the reaction emotion information associated with stored situation emotion pair information as recalled emotion information, according to the degree of coincidence with the situation information sequence associated with that situation emotion pair information;
self-emotion description means for generating self-emotion information by combining the newly generated reaction emotion information and the newly recalled recalled emotion information; and
emotion expression means for generating and outputting a signal corresponding to the self-emotion information,
wherein the situation information either includes the predetermined type of information for generating the reaction emotion information together with other types of information, or includes a plurality of types of the predetermined type of information.

The emotion generation device according to claim 1, wherein the emotion expression means includes means for selecting, from among behavior patterns given in advance, the behavior pattern that suits the self-emotion information and the situation information and has the highest priority given to it, and for converting and outputting the behavior information given to that behavior pattern.

The emotion generation device according to claim 2, wherein the emotion expression means selects the behavior pattern and outputs the behavior information given to that behavior pattern when the self-emotion information is classified as pleasant, and lowers the priority of that behavior pattern when the self-emotion information transitions to a state not classified as pleasant within a predetermined period after the output.

The emotion generation device according to claim 2, wherein the emotion expression means selects the behavior pattern and outputs the behavior information given to that behavior pattern when the self-emotion information is classified as unpleasant, and lowers the priority of that behavior pattern when the self-emotion information does not transition to a state classified as pleasant within a predetermined period after the output.

The emotion generation device according to claim 1, further comprising means for adjusting, as individuality, at least one of the strength of the reaction emotion information generated by the reaction emotion generation means, the lower limit of the strength of the reaction emotion information to be stored by the emotional memory description means, the strength of the recalled emotion information generated by the recall emotion generation means, and the strength of the signal output by the emotion expression means.

An emotion generation method comprising:
recognizing surrounding situations and generating situation information consisting of multiple types of information;
generating a situation information sequence in which the situation information is collected for a predetermined period going back from the present to the past, and holding it in a storage means;
generating, when a predetermined type of information is detected from the situation information sequence, the latest reaction emotion information according to that information;
when the intensity of the reaction emotion information is equal to or greater than a predetermined threshold, associating the reaction emotion information with the situation information sequence, learning them as situation emotion pair information, and storing the learning result in a storage means;
generating, in response to input of the situation information sequence, recalled emotion information by recalling the reaction emotion information associated with learned situation emotion pair information according to the degree of matching with the situation information sequence associated with that situation emotion pair information;
generating self-emotion information by combining the newly generated reaction emotion information and the newly generated recalled emotion information; and
generating and outputting a signal corresponding to the self-emotion information,
wherein the situation information either includes the predetermined type of information for generating the reaction emotion information together with other types of information, or includes a plurality of types of the predetermined type of information.