JP3848076B2

JP3848076B2 - Virtual biological system and pattern learning method in virtual biological system

Info

Publication number: JP3848076B2
Application number: JP2000350737A
Authority: JP
Inventors: 薫鈴木
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-11-18
Filing date: 2000-11-17
Publication date: 2006-11-22
Anticipated expiration: 2020-11-17
Also published as: JP2001209779A

Description

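[0001]
[Technical field of the invention]
The present invention relates to a virtual biological system that recognizes input patterns from the outside, and more particularly to a virtual biological system that learns patterns spontaneously and to a pattern learning method in such a virtual biological system.
[0002]
[Prior art]
In recent years, various software applications featuring characters modeled on living creatures, and various robots modeled on living creatures, have been proposed. These software applications and robots, which may be called virtual biological systems, are built to output responses, such as expressions of emotion, according to input information from the user. Some of them can also recognize pattern information such as the user's face image or voice waveform, and can respond using that pattern recognition information as input.
[0003]
For example, the "helper robot R100" disclosed at http://www.incx.nec.co.jp/robot/ identifies who a user is from the user's face image and, based on the recognition result, can perform actions specific to that user.
[0004]
As another example, the software application "Pichamin" disclosed at http://www.lares.dti.ne.jp/~p-chamin/ is a parakeet-type virtual creature that captures audio signals from a microphone connected to a personal computer, memorizes them so that it can tell them apart, and can utter a waveform signal produced by processing the audio signal as its own voice. Pichamin also learns the order in which separately input sounds occur, and can produce its own utterances in the same order.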
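[0005]
The basic framework of pattern recognition is to represent, in some form, the features of the patterns belonging to a class to be recognized, and to evaluate by matching how well the features of an unknown input pattern agree with the features of that class. The information representing the features of a class is called a dictionary, and the measure of the degree of agreement is called the similarity. When there is only one class to be recognized (this is called the identification problem), the unknown input pattern is judged to belong to the class if the similarity is at or above a predetermined threshold.
[0006]
When there are two or more classes to be recognized (this is called the classification problem), the unknown input pattern is judged to belong to the class that obtained the highest similarity, provided that similarity is at or above the predetermined threshold. In either case, if no class reaches the predetermined threshold, the class of the unknown input pattern is left undetermined; that is, the pattern is regarded as belonging to a new class that the system cannot recognize.
The computational mechanism of pattern recognition is briefly described here.
[0007]
In general, a pattern p is data made up of N scalar quantities. Assigning each scalar quantity to one axis and taking its value as the coordinate on that axis, the pattern p can be regarded as a point (or a position vector v from the origin) in an N-dimensional hyperspace S (a space with N mutually orthogonal coordinate axes). For an image pattern, the pixel values of an image normalized in size and brightness can serve as these scalar quantities. For a speech pattern, the spectral components at each time, normalized in duration and intensity, can serve as the scalar quantities. This is the pattern vectorization method used in many pattern recognition devices.
[0008]
How closely two patterns resemble each other can then be evaluated by computing the similarity between them. The similarity between two patterns can be defined in various ways; for example, it can be defined as the cosine between the length-normalized position vectors (called feature vectors) that represent the patterns, that is, the length of the projection of one onto the other. In this case, the larger the similarity, the more alike the two patterns are.
[0009]
In general, when multiple known patterns belonging to a class (called teaching patterns) are given together with one unknown input pattern, the likelihood that the unknown input pattern belongs to the class can be evaluated simply by computing the two-pattern similarity described above. In this case, the similarity between the unknown input pattern and each known pattern is computed individually and, for example, the largest of these is taken as the similarity to the class. This approach stores every known pattern individually as the dictionary representing the features of the class.
[0010]
With this approach, however, the cost of similarity computation and the memory space needed to store the known patterns grow without bound as the number of known patterns increases. Moreover, because some known patterns resemble one another, similarity is wastefully computed many times against nearly identical patterns. It therefore becomes necessary to describe the set of known patterns, that is, the features of the class, with less information instead of storing each known pattern.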
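[0011]
One way of representing the features of a class is as follows. Given K known patterns belonging to a class, that is, K N-dimensional feature vectors {v: v1, ..., vK} (corresponding to the known patterns above), principal component analysis is applied to these feature vectors, and the M-dimensional hyperspace L whose basis consists of the M (M < N) orthonormal vectors {e: e1, ..., eM} with the largest contribution ratios is used as the dictionary. Because M < N, this M-dimensional hyperspace L is a subspace of dimension M of the N-dimensional hyperspace S, and it is called a dictionary subspace of dimension M.
[0012]
With this method, even if the number K of known patterns is far larger than N, the dictionary needs to store at most M (< N) basis vectors. When one unknown feature vector v (corresponding to the unknown pattern above) is given, its similarity to the dictionary subspace L is defined as the sum of the squares of the lengths of the projections of v onto each of the basis vectors {e: e1, ..., eM} of L. This is known as the subspace method.
[0013]
Detailed information on dictionary representations and similarity definitions, including the subspace method, is given in reference [1] (Erkki Oja, translated by Hidemitsu Ogawa et al., "Pattern Recognition and Subspaces", Sangyo Tosho, 1986) and reference [2] (Taizo Iijima, "Pattern Recognition Theory", Morikita Publishing, 1989).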
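[0014]
As illustrated above, a system that recognizes patterns of some kind generally must be given a dictionary for each class to be recognized. Because it is usually difficult to create such a dictionary synthetically, independent of actual patterns, actual patterns belonging to the target class (the teaching patterns above) are collected and the dictionary is generated, for example, by the principal component analysis used in the subspace method described above.
[0015]
To generate a dictionary from actual patterns, at least the teaching patterns used to build the dictionary and information on the class to which they belong are required. The teaching patterns are face image patterns for face recognition, character image patterns for character recognition, and word speech waveform patterns for spoken word recognition. The class information is the person's name (a character code string) or identification number for personal identification by face, the name of the expression (a character code string) or identification number for facial expression recognition, the character code of the character for character recognition, and the character code string of the word for spoken word recognition; in general, it is a number or symbol (string) by which the pattern can be individually identified.
[0016]
Only when these two kinds of information, the teaching patterns and the information on the class to which they belong, are provided can the required dictionary for the required class be generated.
[0017]
Hereinafter, generating dictionaries in this way, that is, adding new dictionaries or updating existing ones as needed while the pattern recognition system is in operation so as to improve its pattern recognition ability, is referred to as pattern learning.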
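[0018]
Conventionally, these two kinds of information, the teaching patterns and the class information, had to be supplied to the system by a human. In particular, in a pattern recognition system capable of pattern learning during operation, pattern learning proceeds as follows: a human collects teaching patterns using the system's pattern input function for pattern recognition, manually enters the class to which they belong, and generates the dictionary with the system's help. Such a system was therefore given, in addition to its primary function of inputting and recognizing patterns, the following dictionary generation functions: an operation input function for invoking a mode that collects teaching patterns, a function for storing input patterns as teaching patterns in that mode, a function for accepting operation input of class information, and a function for generating the dictionary of the specified class from the collected and stored teaching patterns.
[0019]
As is clear from this, conventional pattern learning required human intervention at three stages: inputting the teaching timing ("learn the pattern that is about to be input"), deliberately presenting the pattern to the system for teaching, and inputting the class information ("assign the taught pattern to this class").
[0020]
However, if patterns have to be taught by hand each time in this way, it is impossible to portray the lifelike learning ability of a virtual creature that picks things up unwittingly as it is used, and this is a major obstacle to providing a more lifelike artificial system.
[0021]
In the R100 mentioned above, too, faces must be taught explicitly, and manual teaching work is needed for it to learn a new face.
[0022]
Likewise, in Pichamin, when the user teaches a sound that the user wants learned, the system has to be notified explicitly of the pattern to learn; for example, the user clicks a microphone button displayed on the screen with the mouse and then plays the teaching sound. Thus, conventional pattern recognition systems capable of pattern learning have required deliberate human work performed solely for the purpose of teaching.
[0023]
Pichamin is, however, an exception in one respect: it also has a function for autonomously learning sounds picked up by the microphone, so pattern learning that requires no explicit work by the user is also possible. When Pichamin detects a new pattern that resembles none of the speech patterns it has learned so far, it presumably issues new class information (probably a new identification number) for that pattern automatically, thereby generating the class information internally, and automatically generates a dictionary for recognizing the new pattern. With this function, Pichamin succeeds in portraying a virtual creature that spontaneously learns words (in practice, sounds) without the user doing anything.
[0024]
Even in Pichamin, which has this spontaneous learning ability, the means for deciding which of the sounds picked up by the microphone should be learned is inadequate. As a result, it often learns ambient noise on its own and comes to utter meaningless sounds. This means there is no appropriate means for selecting the patterns to be learned, and this is the problem with the prior art.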
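[0025]
[Problems to be solved by the invention]
The present invention has been made in view of the above problems in the learning of conventional virtual creatures, and its object is to provide a virtual biological system that can perform spontaneous pattern learning while excluding inappropriate patterns, and a pattern learning method in such a virtual biological system.
[0026]
[Means for Solving the Problems]
The present invention is characterized in that it recognizes input pattern information and updates an internal state based on the result.
[0027]
To achieve the above object, claim 1 of the present invention provides a virtual biological system comprising input means for inputting pattern information from a plurality of types of signal sources, pattern recognition means for recognizing the input pattern information based on dictionary information, state updating means for updating an emotional state based on the recognition result of the pattern recognition means, and output means for outputting a response according to the emotional state, the system further comprising learning means for learning, when the emotional state satisfies a predetermined condition, new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state.
[0028]
Input patterns are thus learned selectively according to a condition on the internal state, yielding a virtual biological system that learns only appropriate input patterns and produces reasonable output.
[0029]
Claim 2 of the present invention provides a pattern learning method in a virtual biological system, the method having input processing for inputting pattern information from a plurality of types of signal sources, pattern recognition processing for recognizing the input pattern information based on dictionary information, state update processing for updating an emotional state based on the recognition result of the pattern recognition, and output processing for outputting a response according to the emotional state, the method further having pattern learning processing for learning, when the emotional state satisfies a predetermined condition, new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state.
[0030]
Input patterns are learned selectively according to a condition on the internal state, yielding a reasonable pattern learning method that learns only appropriate input patterns.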
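[0031]
Claim 3 of the present invention provides the virtual biological system of claim 1, wherein the predetermined condition is that the intensity of the emotional state exceeds a predetermined threshold.
[0032]
Claim 4 of the present invention provides the pattern learning method of claim 2, wherein the predetermined condition is that the intensity of the emotional state exceeds a predetermined threshold.
[0033]
Claim 5 of the present invention provides a virtual biological system comprising: input means for inputting pattern information from one or more signal sources; pattern recognition means for recognizing the pattern information input by the input means, group by group, based on dictionary information classified into a plurality of groups; pattern learning means for learning new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state; state updating means for updating the emotional state based on the recognition result of the pattern recognition means; learning command means for issuing a learning command to the pattern learning means when the emotional state satisfies a predetermined condition; and output means for outputting a response according to the emotional state.
[0034]
Claim 6 of the present invention provides a virtual biological system comprising: input means for inputting pattern information from one or more signal sources; pattern recognition means for recognizing the pattern information input by the input means, group by group, based on dictionary information classified into a plurality of groups; pattern learning means for learning new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state; state updating means for updating an emotional state represented by a plurality of numerical values by changing those values based on the recognition result of the pattern recognition means; learning command means for issuing a learning command to the pattern learning means when the plurality of numerical values representing the emotional state reach predetermined magnitudes; and output means for outputting a response according to the emotional state.
[0035]
Claim 7 of the present invention provides a virtual biological system comprising: input means for inputting pattern information from one or more signal sources; pattern recognition means for recognizing a plurality of pieces of pattern information input by the input means within a predetermined time, group by group, based on dictionary information classified into a plurality of groups; pattern learning means for learning new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state; state updating means for updating an emotional state represented by a plurality of numerical values by changing those values based on the recognition result of the pattern recognition means; learning command means for issuing a learning command to the pattern learning means when the plurality of numerical values representing the emotional state reach predetermined magnitudes; and output means for outputting a response according to the emotional state.
[0036]
Here, the predetermined time within which the plurality of pieces of pattern information are input by the input means is normally about the time a user takes to perform a series of actions as one continuous behavior. According to this invention, the emotional state is changed by a plurality of pieces of pattern information, so a virtual biological system that behaves even more reasonably is obtained.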
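[0037]
Claim 8 of the present invention provides the virtual biological system of claim 6 or claim 7, wherein the plurality of numerical values representing the emotional state are values representing at least happiness, arousal, and liking.
[0038]
Claim 9 of the present invention provides the virtual biological system of claim 6 or claim 7, wherein the pattern information is recognized for at least one of the person and vocabulary groups and at least one of the facial expression, tone of voice, and touch groups.
[0039]
Claim 10 of the present invention provides the virtual biological system of claim 6 or claim 7, wherein the pattern information is divided into person, facial expression, vocabulary, tone of voice, and touch groups, and the happiness, arousal, and liking values representing the emotional state are changed according to whether the stimuli recognized in the facial expression, tone of voice, and touch groups are pleasant or unpleasant.
[0040]
Claim 11 of the present invention provides a pattern learning method in a virtual biological system, having: an input step of inputting pattern information from one or more signal sources; a pattern recognition step of recognizing a plurality of pieces of pattern information input within a predetermined time in the input step, group by group, based on dictionary information classified into a plurality of groups; a state update step of updating an emotional state represented by a plurality of numerical values by changing those values based on the recognition result of the pattern recognition step; a learning command step of issuing, when at least one of the plurality of numerical values representing the emotional state reaches a predetermined magnitude, a command to learn new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state; a pattern learning step of learning the new dictionary information and the update amount in accordance with the command issued in the learning command step; and an output step of outputting a response according to the emotional state.
[0041]
Claim 12 of the present invention provides the pattern learning method of claim 11, wherein the pattern information is divided into person, facial expression, vocabulary, tone of voice, and touch groups, and the happiness, arousal, and liking values representing the emotional state are changed according to whether the stimuli recognized in the facial expression, tone of voice, and touch groups are pleasant or unpleasant.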
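[0042]
[Embodiments of the invention]
An embodiment of the virtual biological system according to the present invention is described below with reference to the drawings.
[0043]
FIG. 1 shows the functional block configuration of a virtual biological system according to an embodiment of the present invention. The system consists of an input unit 1 that receives information from the outside world, a pattern recognition unit 2 that recognizes the input information as patterns, a state update unit 3 that updates the internal state of the system, an output unit 4 that expresses that state outwardly as behavior, a learning command generation unit 5 that directs learning of the patterns in that information, and a pattern learning unit 6 that learns patterns under that direction.
[0044]
FIG. 2 shows the flow of processing in this virtual biological system. Processing in this embodiment consists of an input processing step S1, a pattern recognition processing step S2, a state update processing step S3, an output processing step S4, a learning decision processing step S5, and a pattern learning processing step S6.
[0045]
The input unit 1 has image input means such as a television camera for inputting the user's face image pattern, audio input means such as a microphone for inputting the user's speech waveform pattern, and touch input means such as a pressure sensor, placed on the exterior of the device, for inputting the contact force pattern from the user. It is a means for inputting pattern information (called input patterns) from a plurality of types of signal sources, for example an audio signal and a force signal, or an audio signal and an image signal.
[0046]
FIG. 3 shows a specific example of the arrangement of the input means in an embodiment in which the invention is applied to a robot 11 modeled on an animal. The robot 11 has a television camera 12 serving as image input means at the position corresponding to the eyes, a microphone 13 serving as audio input means at the position corresponding to the ears, and a pressure sensor 14 serving as touch input means at the position corresponding to the top of the head.
[0047]
The virtual biological system of the invention is not limited to a physical robot as shown in FIG. 3; it may be an entity rendered in computer graphics (CG) on a computer screen.
[0048]
FIG. 4 shows an example of the arrangement of the input means when the invention is applied to a displayed virtual creature 22 rendered in CG on the display device of a computer 21. In this example, a television camera 23 serving as image input means is placed in front of the computer 21, and a microphone 24 serving as audio input means is placed beside the computer 21.
[0049]
The touch input means is the left button 26 of a mouse 25. Moving the mouse cursor 27 back and forth over the virtual creature 22 while holding down the left button 26 gives a stroke input, which signifies stroking the displayed virtual creature 22. Briefly clicking the left button 26 on the displayed virtual creature 22 gives a hit input, which signifies hitting it. Holding the left button 26 down on the displayed virtual creature 22 gives a press input, which signifies pressing down on it.
[0050]
Input processing is performed by the means described above in step S11 of FIG. 2.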
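[0051]
The pattern recognition unit 2 in FIG. 1 holds dictionary information for pattern recognition. It matches the input pattern from the input unit 1 against the dictionary information of each class using the subspace method, detects the class showing the highest similarity at or above a predetermined threshold, judges that class to be the class to which the input pattern belongs, and outputs the identification information of that class. This pattern recognition processing is performed in step S12 of FIG. 2.
[0052]
If the pattern recognition unit 2 finds no class to assign, it outputs special identification information indicating that the input pattern belongs to an unlearned class.
[0053]
The pattern recognition unit 2 can recognize the person and the facial expression from the user's face image pattern, the vocabulary and the tone of voice from the user's speech spectrum pattern, and the manner of touch from the user's contact pressure pattern. As shown in FIG. 5, it can therefore hold dictionary information classified into five groups (person, facial expression, vocabulary, tone of voice, and touch) and can output a recognition result for each group.
[0054]
In FIG. 5, the initial dictionary count is the number of dictionaries the system has from the start; a value of 0 means that the group initially has no dictionaries.
[0055]
For the facial expression, tone of voice, and touch groups, the initial dictionary count is 2 or 3; the corresponding dictionary information is prepared in advance because the system is innately able to recognize these. For the person and vocabulary groups, the initial dictionary count is 0: recognition ability is to be acquired during operation, so no dictionary information is prepared and the groups are left blank. Classes that can be recognized innately are called initial classes.
[0056]
In this embodiment, the format of the identification information obtained as the recognition result is defined for each group as shown in FIG. 6. The identification information for the person, facial expression, vocabulary, tone of voice, and touch groups is expressed by the symbols FI, FE, VI, VE, and TI, respectively, followed by a serial number starting from 1. For every group, the identification information is the string formed by appending the serial number of the class within the group to the symbol of the group.
[0057]
For an unlearned pattern, the serial number is 0 in every group, appended to the symbol of the relevant group.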
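The recognition rule and identification format described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `recognize`, the dictionary of per-class similarities, and the threshold value used in the usage note are assumptions, not part of the patent text.

```python
def recognize(similarities, group_symbol, threshold):
    """Return the identification string for one group.

    similarities: dict mapping a class serial number (1, 2, ...) to the
                  similarity of the input pattern to that class (assumed
                  to be computed elsewhere, e.g. by the subspace method).
    group_symbol: one of "FI", "FE", "VI", "VE", "TI".
    threshold:    minimum similarity for a class to be accepted.
    """
    best_serial = 0      # serial number 0 marks an unlearned pattern
    best_score = None
    for serial, score in sorted(similarities.items()):
        # Accept only classes at or above the threshold, keep the highest.
        if score >= threshold and (best_score is None or score > best_score):
            best_serial, best_score = serial, score
    return f"{group_symbol}{best_serial}"
```

For example, with a hypothetical threshold of 0.7, `recognize({1: 0.4, 2: 0.9}, "FE", 0.7)` yields the angry-face identifier `FE2`, while a group with no dictionaries, `recognize({}, "FI", 0.7)`, yields `FI0`.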
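[0058]
FIG. 7 shows the classes that the pattern recognition unit 2 can recognize in its initial state at the start of operation (the initial classes), their identification information, and the teaching patterns used to build their dictionary information. The initial classes are danger signals that appeal to the virtual creature's survival instinct and pleasant tactile stimuli that appeal to its desire for acceptance; this rests on the assumption that the virtual creature of the invention has evolved to recognize such signals and stimuli innately.
[0059]
FIG. 7 shows that the facial expression group has two initial classes, smiling face (FE1) and angry face (FE2); the tone of voice group has two, gentle tone (VE1) and angry tone (VE2); and the touch group has three, stroke (TI1), hit (TI2), and press (TI3).
[0060]
The state update unit 3 in FIG. 1 holds the internal state information of the virtual creature and updates the internal state, for example the internal state governing the creature's emotional state, according to the identification information from the pattern recognition unit 2. The internal state is updated in step S13 of FIG. 2.
[0061]
In the virtual biological system of this embodiment, three parameters are used as the internal state governing the emotional state, as shown for example in FIG. 8: happiness H, arousal A, and liking L. The happiness parameter H ranges from -1.0 to 1.0, and its value represents a state of terror, fear, anxiety, ease, joy, or ecstasy, that is, the degree of the creature's anxiety or ease. In the absence of a particularly strong stimulus, it converges naturally to ease.
[0062]
The arousal parameter A ranges from 0.0 to 1.0, and its value represents the degree of calm or excitement; in the absence of a particularly strong stimulus, it converges naturally to 0.0 (calm).
[0063]
The liking parameter L ranges from -1.0 to 1.0, and its value represents the degree from dislike to like.
[0064]
The internal state parameters shown in FIG. 8 change when the pattern recognition unit 2 detects one of the initial classes described above, according to the class, as shown for example in FIG. 9. Each class is classified, as illustrated, as either a pleasant stimulus of the reward type or an unpleasant stimulus of the punishment type: smiling face, gentle tone, and stroke are assigned to the pleasant reward stimuli, and angry face, angry tone, hit, and press to the unpleasant punishment stimuli. For example, the smiling face class (FE1), a pleasant stimulus corresponding to an expression of affinity, gradually increases the values of happiness H, liking L, and arousal A in proportion to the duration of the stimulus. The angry tone class (VE2), an unpleasant stimulus corresponding to a strong scolding, sharply decreases happiness H and liking L and sharply increases arousal A in proportion to the number of occurrences of the stimulus.
[0065]
In this way, the values of happiness H, arousal A, and liking L change as pleasant or unpleasant stimuli are received. The information defining how the internal state should be changed for each class is called update information, and the state update unit 3 holds it for every recognizable class.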
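As a rough illustration of how per-class update information might drive the three parameters, the following Python sketch applies a fixed change per detection and clamps each parameter to the range given above. The numeric deltas, the names `EmotionState`, `UPDATE_INFO`, and `apply_update`, and the per-detection (rather than duration-proportional) update are all assumptions made for the example; the actual values of FIG. 9 are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class EmotionState:
    happiness: float = 0.0  # H, range -1.0 to 1.0
    arousal: float = 0.0    # A, range 0.0 to 1.0
    liking: float = 0.0     # L, range -1.0 to 1.0


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


# Hypothetical update information: class ID -> (dH, dA, dL) per detection.
UPDATE_INFO = {
    "FE1": (0.05, 0.02, 0.05),     # smiling face: gentle increase
    "VE2": (-0.30, 0.30, -0.30),   # angry tone: sharp change
}


def apply_update(state, class_id, update_info=UPDATE_INFO):
    """Return a new state after one detection of class_id."""
    # Classes without update information (e.g. unlearned "FI0") change nothing.
    dh, da, dl = update_info.get(class_id, (0.0, 0.0, 0.0))
    return EmotionState(
        clamp(state.happiness + dh, -1.0, 1.0),
        clamp(state.arousal + da, 0.0, 1.0),
        clamp(state.liking + dl, -1.0, 1.0),
    )
```

Returning a new state instead of mutating the old one makes it easy to compare the state before and after an update, which the learning step described later relies on.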
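[0066]
The output unit 4 outputs a response, in the form of emotional expression and attitude display, based on the internal state updated by the state update unit 3 as described above and on the pattern input result. This is processed in step S14.
[0067]
The manner of response in the output unit 4 is a trait the virtual creature possesses innately, and is set so that the response can be determined from the internal state and the pattern input result, as shown in FIG. 10. For example, as emotional expression, when happiness reaches the ecstasy range the creature becomes docile while letting out an enraptured sound, and in the joy range it rejoices with a happy sound; as attitude display, it tries to make eye contact when it detects a person while feeling favorably, and tries to turn its face away when it detects a person while feeling unfavorably.
[0068]
With the configuration described above, the virtual biological system of this embodiment, by recognizing the initial classes, functions as a virtual biological system that can respond, for example, by becoming frightened when shouted at (angry tone) or enraptured when stroked (stroke).
[0069]
Next, the mechanism by which the system of this embodiment learns patterns spontaneously, and its characteristics, are described.
[0070]
The learning command generation unit 5 in FIG. 1 generates and outputs a learning command when the internal state, updated as needed by the state update unit 3, satisfies a predetermined condition; specifically, when a pleasant or unpleasant stimulus causes the absolute value of happiness H, arousal A, or liking L to exceed the predetermined threshold set separately for positive and for negative values. The command targets all observed input patterns, so that the input patterns making up the situation that caused the emotional state are learned for the first time or relearned. Accordingly, in step S15 of FIG. 2 it is examined whether learning should take place; when any of the values of happiness H, arousal A, and liking L exceeds its predetermined threshold, it is decided that learning should take place, and processing moves to the pattern learning processing of step S16.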
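The learning decision of step S15 can be sketched as a simple threshold test. The sketch below is a hedged illustration: the threshold values and the names `THRESHOLDS` and `should_learn` are assumptions, and the text does not state the actual thresholds.

```python
# Hypothetical thresholds: parameter -> (threshold for negative values,
#                                        threshold for positive values).
# Arousal A never goes negative, so it has no negative-side threshold.
THRESHOLDS = {
    "H": (0.8, 0.8),
    "A": (None, 0.8),
    "L": (0.8, 0.8),
}


def should_learn(state, thresholds=THRESHOLDS):
    """state: dict with keys "H", "A", "L" holding the current values.

    Returns True when the absolute value of any parameter exceeds the
    threshold that applies to its current sign.
    """
    for name, value in state.items():
        neg_limit, pos_limit = thresholds[name]
        limit = pos_limit if value >= 0 else neg_limit
        if limit is not None and abs(value) > limit:
            return True
    return False
```

With these assumed thresholds, `should_learn({"H": 0.9, "A": 0.1, "L": 0.0})` is true (strong happiness), and a strongly negative happiness such as `{"H": -0.85, "A": 0.0, "L": 0.0}` also triggers learning, so both pleasant and frightening situations lead to a learning command.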
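[0071]
Comparing the invention with the prior art described earlier, Pichamin lacks the function described above of "learning the multiple input patterns observed in the situation that caused a given emotional state." Pichamin therefore cannot selectively learn input patterns that, from the standpoint of the virtual creature the system is supposed to act as, are worth detecting (have biological value), for example those that signal a situation bringing about heightened emotion.
[0072]
For example, suppose a user, A, uses the virtual biological system of this embodiment and, while smiling, gently says "yoshi yoshi" ("good, good") and strokes the pressure sensor.
[0073]
The patterns observed in this situation are A's face image pattern, the speech spectrum pattern of "yoshi yoshi," and the pressure pattern representing a stroke. Assuming the system has not learned A's face or the word "yoshi yoshi," it can recognize only the smiling face FE1, gentle tone VE1, and stroke TI1 shown in FIG. 7. That is, the identification information obtained from the pattern recognition unit 2 is "FI0, FE1, VI0, VE1, TI1".
[0074]
Of these, based on "FE1, VE1, TI1", which excludes the unlearned patterns FI0 and VI0, the system has received the pleasant stimuli shown in FIG. 9, so the state update unit 3 puts the internal state into a state of increased happiness H and liking L, and the output unit 4 outputs an expression of ecstasy, joy, or ease. When such stimuli act cumulatively and happiness H or liking L exceeds the predetermined threshold, the learning command generation unit 5 generates a learning command.
[0075]
The pattern learning unit 6 learns the observed input patterns in accordance with the learning command from the learning command generation unit 5; when step S15 decides that learning should take place, the input patterns are learned in step S16. When step S15 does not decide that learning should take place, processing returns to step S11 to await input.
[0076]
Learning of an input pattern in step S16 is performed in one of the following two ways.
[0077]
When the identification information from the pattern recognition unit 2 indicates an unlearned class (FI0 and VI0 in the example above), it is taken that no dictionary information capable of recognizing the input pattern is stored in the pattern recognition unit 2, and dictionary information for recognizing the pattern is newly generated from it using principal component analysis. At the same time, new identification information is issued and stored in the pattern recognition unit 2.
[0078]
Newly issued identification information is given the number following the serial numbers of the existing classes in the corresponding group. For example, FI1 is issued for the new class identifying A as a person, and VI1 for the new class identifying the word "yoshi yoshi." When the new class is issued, the change in the internal state parameter values brought about at that time by the known classes "FE1, VE1, TI1" is stored in the state update unit 3 as the amount by which detection of the new class increases or decreases each parameter, that is, as the update information for the new class.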
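The issuing of a new identifier and the storing of its update information can be sketched as below. This is an illustrative sketch only: the function names, the list and dictionary used as storage, and the example state values are assumptions, and dictionary generation by principal component analysis is deliberately left out.

```python
def issue_class_id(group_symbol, known_ids):
    """Return the next identifier in a group, e.g. "FI1" for an empty group."""
    serials = [
        int(cid[len(group_symbol):])
        for cid in known_ids
        if cid.startswith(group_symbol) and cid[len(group_symbol):].isdigit()
    ]
    return f"{group_symbol}{max(serials, default=0) + 1}"


def register_new_class(group_symbol, known_ids, update_info,
                       state_before, state_after):
    """Issue a new class ID and store the observed (dH, dA, dL) for it.

    state_before / state_after: (H, A, L) tuples taken before and after the
    known classes changed the internal state.
    """
    new_id = issue_class_id(group_symbol, known_ids)
    delta = tuple(after - before
                  for before, after in zip(state_before, state_after))
    known_ids.append(new_id)
    update_info[new_id] = delta
    return new_id
```

Starting from the seven initial classes, registering a person yields `FI1`, and the stored delta means that detecting `FI1` later moves the emotional state in the same direction as the stimuli that accompanied it when it was learned.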
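[0079]
For this purpose, the pattern learning unit 6 constantly monitors the difference in the internal state parameter values before and after the learning command is generated.
[0080]
When, on the other hand, the identification information from the pattern recognition unit 2 indicates a known class, the dictionary information of that class stored in the pattern recognition unit 2 is reinforced using the learning subspace method disclosed in reference [1], so that the input pattern can be recognized even better, and is stored again in the pattern recognition unit 2.
[0081]
As a result, the next time a similar pattern is input, the pattern recognition unit 2 can recognize it more reliably and the system can respond appropriately.
[0082]
Taking the example of A's actions above, when the absolute value of an emotion parameter exceeds the predetermined threshold, the learning command generation unit 5 generates a learning command, and the pattern learning unit 6 learns A's face and the word "yoshi yoshi." As a result the system becomes able to recognize A's face, which has biological value as "a person likely to make me happy," and "yoshi yoshi," which has biological value as "a word that makes me happy." Conversely, however unlearned a face or a word may be, the faces and words of people who provide no biological value, such as pleasing or frightening the creature, are not learned.
[0083]
With the learning command generation unit 5 and the pattern learning unit 6, the virtual biological system of this embodiment can portray a virtual creature that spontaneously learns unlearned patterns and develops its pattern recognition ability. In particular, by making the trigger for learning the detection of a biologically valuable situation, that is, the occurrence of a strong emotion such as "happy" or "scared," the invention achieves learning that is far more biologically plausible than learning patterns blindly or at random.
[0084]
As a result, the virtual biological system of this embodiment not only shows innate responses such as being frightened when shouted at or enraptured when stroked, but also, after gaining sufficient experience, functions as a system that responds to input patterns with behavior such as rejoicing when it detects the face of an acquaintance who treats it well, or becoming dejected when it detects scolding words.
[0085]
The essence of the invention is that the system can decide for itself whether to perform pattern learning, based on an emotional state driven by the recognition results of other, already learned patterns input at about the same time. This function, in which pattern learning proceeds according to the recognition results of other already learned patterns input from outside, is absent from Pichamin, described earlier as prior art, because Pichamin can recognize only one group of patterns, namely spoken vocabulary.
[0086]
The functions described above can be implemented not only in hardware combining components having the respective functions, but also in software.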
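[0087]
The invention can also be embodied as a computer-readable recording medium storing a program for causing a computer to execute predetermined procedures, a program for causing a computer to function as predetermined means, or a program for causing a computer to realize predetermined functions.
[0088]
As shown in FIG. 11, information expressing the pattern recognition method and pattern registration method of the invention, for example a program, can be recorded on a recording medium 31 and applied to a computer device 32 via the recording medium 31, or further applied to a computer device 34 via a communication line 33.
The invention is not limited to the embodiment described above and can be practiced with various modifications within the scope of its technical idea.
[0089]
[Effects of the invention]
As described above, the invention, by having a more appropriate means of deciding when a pattern should be learned, provides a virtual biological system capable of spontaneous pattern learning that excludes inappropriate patterns, and a pattern learning method in such a virtual biological system.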
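[Brief description of the drawings]
[FIG. 1] A diagram showing an example of the functional block configuration of a virtual biological system according to an embodiment of the invention.
[FIG. 2] A diagram showing the flow of processing in the virtual biological system according to an embodiment of the invention.
[FIG. 3] A diagram showing an example arrangement of the input means in a robot-type virtual biological system according to an embodiment of the invention.
[FIG. 4] A diagram showing an example arrangement of the input means in a screen-display-type virtual biological system according to an embodiment of the invention.
[FIG. 5] A diagram explaining the groups recognizable by the virtual biological system according to an embodiment of the invention and the dictionaries that make them up.
[FIG. 6] A diagram explaining the groups recognizable by the virtual biological system according to an embodiment of the invention and the format of their identification information.
[FIG. 7] A diagram explaining the classes recognizable in the initial state by the virtual biological system according to an embodiment of the invention and their teaching patterns.
[FIG. 8] A diagram explaining the internal state parameters governing the emotional state in the virtual biological system according to an embodiment of the invention.
[FIG. 9] A diagram explaining how stimuli change the internal state parameters in the virtual biological system according to an embodiment of the invention.
[FIG. 10] A diagram explaining the responses according to the internal state parameters in the virtual biological system according to an embodiment of the invention.
[FIG. 11] A diagram explaining another embodiment in which the virtual biological system of the invention is transferred and realized by means of a recording medium.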
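[Explanation of reference signs]
1: input unit, 2: pattern recognition unit, 3: state update unit, 4: output unit, 5: learning command generation unit, 6: pattern learning unit, 11: robot, 12: television camera, 13: microphone, 14: pressure sensor, 21: computer, 22: displayed virtual creature, 23: television camera, 24: microphone, 25: mouse, 26: left button, 31: recording medium, 32, 34: computer device.
[0001]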
[Technical field of the invention]
The present invention relates to a virtual biological system that recognizes an input pattern from the outside, and more particularly to a virtual biological system that spontaneously learns a pattern and a pattern learning method in the virtual biological system.
[0002]
[Prior art]
In recent years, various software applications in which characters imitating organisms appear and robots imitating organisms have been proposed. These software applications and robots, which should be called virtual biological systems, are designed to output responses such as expressing emotions according to input information from the user. Some of these software applications and robots have a capability of recognizing pattern information such as a user's face image and voice waveform, and can respond to such pattern recognition information as input information.
[0003]
For example, the "helper robot R100" disclosed at http://www.incx.nec.co.jp/robot/ identifies who a user is from the user's face image and, based on the recognition result, can perform actions specific to that user.
[0004]
As another example, the software application "Pichamin" disclosed at http://www.lares.dti.ne.jp/~p-chamin/ is a parakeet-type virtual creature that captures audio signals from a microphone connected to a personal computer, memorizes them so that it can tell them apart, and can utter a waveform signal produced by processing the audio signal as its own voice. Pichamin also learns the order in which separately input sounds occur, and can produce its own utterances in the same order.
[0005]
The basic framework of pattern recognition is to represent, in some form, the features of the patterns belonging to a class to be recognized, and to evaluate by matching how well the features of an unknown input pattern agree with the features of that class. The information representing the features of a class is called a dictionary, and the measure of the degree of agreement is called the similarity. When there is only one class to be recognized (this is called the identification problem), the unknown input pattern is judged to belong to the class if the similarity is at or above a predetermined threshold.
[0006]
When there are two or more classes to be recognized (this is called the classification problem), the unknown input pattern is judged to belong to the class that obtained the highest similarity, provided that similarity is at or above the predetermined threshold. In either case, if no class reaches the predetermined threshold, the class of the unknown input pattern is left undetermined; that is, the pattern is regarded as belonging to a new class that the system cannot recognize.
Here, the calculation mechanism of pattern recognition will be briefly described.
[0007]
In general, a pattern p is data made up of N scalar quantities. Assigning each scalar quantity to one axis and taking its value as the coordinate on that axis, the pattern p can be regarded as a point (or a position vector v from the origin) in an N-dimensional hyperspace S (a space with N mutually orthogonal coordinate axes). For an image pattern, the pixel values of an image normalized in size and brightness can serve as these scalar quantities. For a speech pattern, the spectral components at each time, normalized in duration and intensity, can serve as the scalar quantities. This is the pattern vectorization method used in many pattern recognition devices.
[0008]
How closely two patterns resemble each other can then be evaluated by computing the similarity between them. The similarity between two patterns can be defined in various ways; for example, it can be defined as the cosine between the length-normalized position vectors (called feature vectors) that represent the patterns, that is, the length of the projection of one onto the other. In this case, the larger the similarity, the more alike the two patterns are.
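The cosine-based similarity described in this paragraph can be sketched in a few lines of Python. The function name `cosine_similarity` is illustrative, and the patterns are assumed to be already vectorized as equal-length lists of numbers.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two pattern vectors of equal dimension.

    Dividing by both norms is equivalent to normalizing each vector to unit
    length first and then taking the projection of one onto the other.
    """
    if len(u) != len(v):
        raise ValueError("patterns must have the same dimension N")
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("a zero vector has no direction")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
```

Two patterns that differ only in overall intensity, such as `[1, 2, 3]` and `[2, 4, 6]`, have a similarity of about 1, while orthogonal patterns have a similarity of 0.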
[0009]
In general, when multiple known patterns belonging to a class (called teaching patterns) are given together with one unknown input pattern, the likelihood that the unknown input pattern belongs to the class can be evaluated simply by computing the two-pattern similarity described above. In this case, the similarity between the unknown input pattern and each known pattern is computed individually and, for example, the largest of these is taken as the similarity to the class. This approach stores every known pattern individually as the dictionary representing the features of the class.
[0010]
With this approach, however, the cost of similarity computation and the memory space needed to store the known patterns grow without bound as the number of known patterns increases. Moreover, because some known patterns resemble one another, similarity is wastefully computed many times against nearly identical patterns. It therefore becomes necessary to describe the set of known patterns, that is, the features of the class, with less information instead of storing each known pattern.
[0011]
One way of representing the features of a class is as follows. Given K known patterns belonging to a class, that is, K N-dimensional feature vectors {v: v1, ..., vK} (corresponding to the known patterns above), principal component analysis is applied to these feature vectors, and the M-dimensional hyperspace L whose basis consists of the M (M < N) orthonormal vectors {e: e1, ..., eM} with the largest contribution ratios is used as the dictionary. Because M < N, this M-dimensional hyperspace L is a subspace of dimension M of the N-dimensional hyperspace S, and it is called a dictionary subspace of dimension M.
[0012]
With this method, even if the number K of known patterns is far larger than N, the dictionary needs to store at most M (< N) basis vectors. When one unknown feature vector v (corresponding to the unknown pattern above) is given, its similarity to the dictionary subspace L is defined as the sum of the squares of the lengths of the projections of v onto each of the basis vectors {e: e1, ..., eM} of L. This is known as the subspace method.
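The subspace-method similarity can be sketched as follows, assuming the orthonormal basis of the dictionary subspace has already been obtained (the principal component analysis step is not shown). The function name `subspace_similarity` is illustrative, and normalizing the input to unit length inside the function is an assumption that mirrors the length-normalized feature vectors described earlier.

```python
import math


def subspace_similarity(v, basis):
    """Sum of squared projections of the unit-length v onto each basis vector.

    v:     unknown pattern as a list of N numbers.
    basis: list of M orthonormal vectors (each a list of N numbers) spanning
           the dictionary subspace L.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("a zero vector has no direction")
    unit = [x / norm for x in v]
    return sum(sum(a * b for a, b in zip(unit, e)) ** 2 for e in basis)
```

With the basis `[[1, 0, 0], [0, 1, 0]]`, a pattern lying in the subspace such as `[3, 4, 0]` scores about 1, a pattern orthogonal to it such as `[0, 0, 5]` scores 0, and `[1, 0, 1]` scores about 0.5.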
[0013]
Detailed information on dictionary representations and similarity definitions, including the subspace method, is given in reference [1] (Erkki Oja, translated by Hidemitsu Ogawa et al., "Pattern Recognition and Subspaces", Sangyo Tosho, 1986) and reference [2] (Taizo Iijima, "Pattern Recognition Theory", Morikita Publishing, 1989).
[0014]
As illustrated above, a system that recognizes patterns of some kind generally must be given a dictionary for each class to be recognized. Because it is usually difficult to create such a dictionary synthetically, independent of actual patterns, actual patterns belonging to the target class (the teaching patterns above) are collected and the dictionary is generated, for example, by the principal component analysis used in the subspace method described above.
[0015]
To generate a dictionary from actual patterns, at least the teaching patterns used to build the dictionary and information on the class to which they belong are required. The teaching patterns are face image patterns for face recognition, character image patterns for character recognition, and word speech waveform patterns for spoken word recognition. The class information is the person's name (a character code string) or identification number for personal identification by face, the name of the expression (a character code string) or identification number for facial expression recognition, the character code of the character for character recognition, and the character code string of the word for spoken word recognition; in general, it is a number or symbol (string) by which the pattern can be individually identified.
[0016]
In this way, a necessary dictionary of a necessary class can be generated only by providing two types of information, that is, a teaching pattern and information on a class to which the teaching pattern belongs.
[0017]
It should be noted that generating such a dictionary, that is, adding a new dictionary or updating an existing dictionary at any time during operation of the pattern recognition system and improving the pattern recognition ability of the system will be referred to as pattern learning hereinafter.
[0018]
Conventionally, it has been necessary for humans to provide two types of information, that is, teaching patterns and class information, to the system. In particular, in a pattern recognition system that can perform pattern learning during operation, a human collects teaching patterns while using the pattern input function for pattern recognition, and manually inputs the class to which the pattern belongs. Pattern learning proceeds in the procedure of creating a dictionary with the help of the system. Therefore, in addition to the original function of inputting and recognizing a pattern in such a system, an operation input function for calling a mode for collecting a teaching pattern as a dictionary generation function, and an input pattern in the mode as a teaching pattern A function of storing, a function of accepting an operation input of class information, and a function of generating a dictionary of a specified class from a collected and stored teaching pattern are provided.
[0019]
As is clear from this, conventional pattern learning required human intervention at three stages: inputting the teaching timing ("learn the pattern that is about to be input"), deliberately presenting the pattern to the system for teaching, and inputting the class information ("assign the taught pattern to this class").
[0020]
However, if patterns have to be taught by hand each time in this way, it is impossible to portray the lifelike learning ability of a virtual creature that picks things up unwittingly as it is used, and this is a major obstacle to providing a more lifelike artificial system.
[0021]
Also in the above-mentioned “R100”, face teaching must be performed explicitly, and manual teaching work is necessary to learn a new face.
[0022]
Likewise, in Pichamin, when the user teaches a sound that the user wants learned, the system has to be notified explicitly of the pattern to learn; for example, the user clicks a microphone button displayed on the screen with the mouse and then plays the teaching sound. Thus, conventional pattern recognition systems capable of pattern learning have required deliberate human work performed solely for the purpose of teaching.
[0023]
Pichamin is, however, an exception in one respect: it also has a function for autonomously learning sounds picked up by the microphone, so pattern learning that requires no explicit work by the user is also possible. When Pichamin detects a new pattern that resembles none of the speech patterns it has learned so far, it presumably issues new class information (probably a new identification number) for that pattern automatically, thereby generating the class information internally, and automatically generates a dictionary for recognizing the new pattern. With this function, Pichamin succeeds in portraying a virtual creature that spontaneously learns words (in practice, sounds) without the user doing anything.
[0024]
Even in Pichamin, which has this spontaneous learning ability, the means for deciding which of the sounds picked up by the microphone should be learned is inadequate. As a result, it often learns ambient noise on its own and comes to utter meaningless sounds. This means there is no appropriate means for selecting the patterns to be learned, and this is the problem with the prior art.
[0025]
[Problems to be solved by the invention]
The present invention has been made in view of the above problems in the learning of conventional virtual creatures, and its object is to provide a virtual biological system that can perform spontaneous pattern learning while excluding inappropriate patterns, and a pattern learning method in such a virtual biological system.
[0026]
[Means for Solving the Problems]
The present invention is characterized in that the input pattern information is recognized and the internal state is updated based on the result.
[0027]
To achieve the above object, claim 1 of the present invention provides a virtual biological system comprising input means for inputting pattern information from a plurality of types of signal sources, pattern recognition means for recognizing the input pattern information based on dictionary information, state updating means for updating an emotional state based on the recognition result of the pattern recognition means, and output means for outputting a response according to the emotional state, the system further comprising learning means for learning, when the emotional state satisfies a predetermined condition, new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state.
[0028]
Therefore, the input pattern is selectively learned according to the condition of the internal state, and a virtual biological system that learns only an appropriate input pattern and performs a reasonable output can be obtained.
[0029]
Claim 2 of the present invention provides a pattern learning method in a virtual biological system, the method having input processing for inputting pattern information from a plurality of types of signal sources, pattern recognition processing for recognizing the input pattern information based on dictionary information, state update processing for updating an emotional state based on the recognition result of the pattern recognition, and output processing for outputting a response according to the emotional state, the method further having pattern learning processing for learning, when the emotional state satisfies a predetermined condition, new dictionary information for recognizing the pattern information and the amount by which detection of the pattern information updates the emotional state.
[0030]
The input pattern is selectively learned according to the condition of the internal state, and a rational pattern learning method for learning only an appropriate input pattern is obtained.
[0031]
According to claim 3 of the present invention, in the virtual biological system according to claim 1, The predetermined condition is that the intensity of the emotional state exceeds a predetermined threshold value Provide virtual biological systems.
[0032]
According to claim 4 of the present invention, in the pattern learning method according to claim 2, the predetermined condition is: Emotion Provided is a pattern learning method in a virtual biological system, characterized in that the intensity of a state exceeds a predetermined threshold.
[0033]
According to claim 5 of the present invention, input means for inputting pattern information from one or more signal sources, and pattern information input by the input means are divided into a plurality of groups. For each group based on dictionary information categorized into Pattern recognition means for recognizing New dictionary information for recognizing the pattern information, and an amount of updating the emotional state by detecting the pattern information. Based on the pattern learning means for learning and the recognition result by the pattern recognition means Emotion State update means for updating the state; and Emotion When the state satisfies a predetermined condition, a learning command unit that gives a learning command to the pattern learning unit; Emotion Output means for outputting a response according to the state Do A virtual biological system characterized by the above is provided.
[0034]
According to claim 6 of the present invention, there is provided a virtual biological system comprising: input means for inputting pattern information from one or more signal sources; pattern recognition means for recognizing the pattern information input by the input means, for each group, based on dictionary information classified into a plurality of groups; pattern learning means for learning new dictionary information for recognizing the pattern information and an amount by which the emotional state is updated upon detection of the pattern information; state update means for updating the emotional state, which is represented by a plurality of numerical values, by changing those numerical values based on the recognition result by the pattern recognition means; learning command means for giving a learning command to the pattern learning means when the plurality of numerical values representing the emotional state have reached a predetermined size; and output means for outputting a response according to the emotional state.
[0035]
According to claim 7 of the present invention, there is provided a virtual biological system comprising: input means for inputting pattern information from one or more signal sources; pattern recognition means for recognizing a plurality of pieces of pattern information input within a predetermined time by the input means, for each group, based on dictionary information classified into a plurality of groups; pattern learning means for learning new dictionary information for recognizing the pattern information and an amount by which the emotional state is updated upon detection of the pattern information; state update means for updating the emotional state, which is represented by a plurality of numerical values, by changing those numerical values based on the recognition result by the pattern recognition means; learning command means for giving a learning command to the pattern learning means when the plurality of numerical values representing the emotional state have reached a predetermined size; and output means for outputting a response according to the emotional state.
[0036]
Here, the “predetermined time within which a plurality of pieces of pattern information are input by the input means” usually refers to the time during which the user performs a series of actions. According to the present invention, because the emotional state is changed by a plurality of pieces of pattern information, a virtual biological system that behaves more rationally can be obtained.
[0037]
According to claim 8 of the present invention, there is provided the virtual biological system according to claim 6 or 7, wherein the plurality of numerical values representing the emotional state are numerical values representing at least happiness, excitement, and likability.
[0038]
According to claim 9 of the present invention, there is provided the virtual biological system according to claim 6 or 7, wherein the pattern information is recognized for at least one of the person and vocabulary groups and at least one of the facial expression, tone of voice, and force sense groups.
[0039]
According to claim 10 of the present invention, there is provided the virtual biological system according to claim 6 or 7, wherein the pattern information is divided into the groups of person, facial expression, vocabulary, tone of voice, and force sense, and the numerical values of happiness, excitement, and likability representing the emotional state are changed according to whether the stimulus recognition results for the facial expression, tone of voice, and force sense groups are pleasant or unpleasant.
[0040]
According to claim 11 of the present invention, there is provided a pattern learning method in a virtual biological system, comprising: a step of inputting pattern information from one or more signal sources; a pattern recognition step of recognizing a plurality of pieces of pattern information input within a predetermined time by the inputting step, for each group, based on dictionary information classified into a plurality of groups; a state update step of updating the emotional state, which is represented by a plurality of numerical values, by changing those numerical values based on the recognition result of the pattern recognition step; a learning command step of issuing, when at least one of the plurality of numerical values representing the emotional state reaches a predetermined size, a command to learn new dictionary information for recognizing the pattern information and an amount by which the emotional state is updated upon detection of the pattern information; a pattern learning step of learning the new dictionary information and the update amount in accordance with the command of the learning command step; and an output step of outputting a response according to the emotional state.
[0041]
According to claim 12 of the present invention, there is provided the pattern learning method in a virtual biological system according to claim 11, wherein the pattern information is divided into the groups of person, facial expression, vocabulary, tone of voice, and force sense, and the numerical values of happiness, excitement, and likability representing the emotional state are changed according to whether the stimulus recognition results for the facial expression, tone of voice, and force sense groups are pleasant or unpleasant.
[0042]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of a virtual biological system according to the present invention will be described with reference to the drawings.
[0043]
FIG. 1 is a diagram showing a functional block configuration of a virtual biological system according to an embodiment of the present invention. This virtual biological system includes an input unit 1 that receives information from the outside world, a pattern recognition unit 2 that recognizes the input information as a pattern, a state update unit 3 that updates the internal state of the system, an output unit 4 that acts on the outside world, a learning command generation unit 5 that instructs learning of the pattern of the information, and a pattern learning unit 6 that performs pattern learning based on this instruction.
[0044]
FIG. 2 is a diagram showing the flow of processing in this virtual biological system. The processing in the virtual biological system of this embodiment includes an input processing step S1, a pattern recognition processing step S2, a state update processing step S3, an output processing step S4, a learning determination processing step S5, and a pattern learning processing step S6.
[0045]
The input unit 1 includes image input means such as a TV camera for inputting the user's face image pattern, voice input means such as a microphone for inputting the user's voice waveform pattern, and force sense input means such as a pressure sensor, disposed on the exterior of the apparatus, for inputting the force pattern of contact from the user. It is a means for inputting pattern information (referred to as input patterns) from a plurality of types of signal sources, such as sound signals and force signals, or sound signals and image signals.
[0046]
FIG. 3 shows a specific arrangement example of various input means of the embodiment when the present invention is applied to a robot 11 simulating an animal. This robot 11 has a television camera 12 as an image input means at a part corresponding to the eyes, a microphone 13 as a voice input means at a part corresponding to the ears, and a force sense input means at a part corresponding to the top of the head. A pressure sensor 14 is provided.
[0047]
The virtual biological system in the present invention is not limited to a robot that specifically exists as shown in FIG. 3, but may be a computer graphic (CG) that appears on a computer screen.
[0048]
FIG. 4 shows an arrangement example of various input means when applied to the display virtual creature 22 by CG displayed on the display device of the computer 21 in this way. In this example, a television camera 23 serving as an image input unit is installed in front of the computer 21, and a microphone 24 serving as a voice input unit is installed beside the computer 21.
[0049]
The force sense input means is the left button 26 of the mouse 25. By moving the mouse cursor 27 back and forth over the display virtual creature 22 while pressing the left button 26, the user can give a stroking input, meaning an action of stroking the display virtual creature 22. By clicking the left button 26 briefly on the display virtual creature 22, the user can give a hitting input, meaning an action of hitting the display virtual creature 22. Furthermore, by holding down the left button 26 on the display virtual creature 22, the user can give a pressing input, meaning an action of pressing down on the display virtual creature 22.
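The mapping from mouse operations to the three force sense inputs described above can be sketched as follows. This is only an illustrative simplification: the function name and both thresholds (`click_max_seconds`, `stroke_min_travel_px`) are assumptions not given in the specification, and the embodiment itself recognizes force patterns through the pattern recognition unit 2 rather than through fixed rules.

```python
def classify_force_input(press_seconds, cursor_travel_px,
                         click_max_seconds=0.3, stroke_min_travel_px=40.0):
    """Classify one left-button press over the creature (hypothetical rule).

    press_seconds     -- how long the left button was held down
    cursor_travel_px  -- total cursor movement while the button was held
    Both thresholds are assumed values chosen only for illustration.
    """
    if cursor_travel_px >= stroke_min_travel_px:
        return "stroke"  # cursor moved back and forth while pressed
    if press_seconds <= click_max_seconds:
        return "hit"     # short click
    return "press"       # button held down in place
```

For instance, a one-second drag over 120 pixels would be treated as a stroke, while a 0.1-second click without movement would be treated as a hit.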
[0050]
The input process is performed in step S11 of FIG. 2 by the means described above.
[0051]
The pattern recognition unit 2 in FIG. 1 is a means that holds dictionary information for pattern recognition, collates the input pattern from the input unit 1 with the dictionary information of each class using a subspace method, detects the class showing the highest similarity at or above a predetermined threshold, certifies that class as the class to which the input pattern belongs, and outputs the identification information of the certified class. This pattern recognition processing is performed in step S12 of FIG. 2.
[0052]
When the pattern recognition unit 2 cannot find a class to be recognized, the pattern recognition unit 2 outputs special identification information indicating that the input pattern is an unlearned class.
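The recognition rule described above (highest similarity at or above a threshold, otherwise an unlearned class) can be sketched as follows. The sketch assumes one common formulation of subspace-method similarity, namely the sum of squared projections of the length-normalized pattern onto an orthonormal basis of each class; the function names, the data layout of `dictionaries`, and the threshold value 0.8 are assumptions for illustration and are not taken from the specification.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero vector cannot be normalized")
    return [x / norm for x in v]


def subspace_similarity(pattern, basis):
    """Sum of squared projections of the unit-length pattern onto an
    orthonormal basis (list of unit vectors) spanning a class subspace."""
    p = normalize(pattern)
    return sum(sum(a * b for a, b in zip(p, axis)) ** 2 for axis in basis)


def recognize(pattern, dictionaries, group_symbol, threshold=0.8):
    """Return the identification information of the best class in one group.

    dictionaries maps identification information (e.g. "FE1") to a basis.
    If no class reaches the threshold, the special unlearned identifier
    (group symbol followed by 0) is returned.
    """
    best_id, best_sim = None, threshold
    for class_id, basis in dictionaries.items():
        sim = subspace_similarity(pattern, basis)
        if sim >= best_sim:
            best_id, best_sim = class_id, sim
    return best_id if best_id is not None else f"{group_symbol}0"
```

With an empty dictionary, as assumed initially for the person group, every input is reported as unlearned.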
[0053]
The pattern recognition unit 2 can recognize the user's face image pattern by person and by facial expression, the user's voice spectrum pattern by vocabulary and by tone of voice, and the user's contact pressure pattern by type of touch. Therefore, as shown in FIG. 5, the pattern recognition unit 2 can hold dictionary information classified into five groups, namely person, facial expression, vocabulary, tone of voice, and force sense, and can output a recognition result for each group.
[0054]
In FIG. 5, the initial number of dictionaries is the number of dictionaries the system holds from the start. If this number is 0, the group has no dictionary at first.
[0055]
For the facial expression, tone of voice, and force sense groups, the initial number of dictionaries is 2 or 3, and the corresponding dictionary information is prepared in advance so that the system can recognize these classes innately. For the person and vocabulary groups, on the other hand, the initial number of dictionaries is 0: no dictionary information is prepared in advance, the groups start out blank, and the ability to recognize them must be acquired during operation. A class that can be recognized innately is called an initial class.
[0056]
In this embodiment, the format of the identification information obtained as a recognition result is defined for each group as shown in FIG. 6. That is, the identification information of the person, facial expression, vocabulary, tone of voice, and force sense groups is represented by the symbols FI, FE, VI, VE, and TI, respectively, followed by a serial number starting from 1. For each group, the identification information is thus a character string consisting of the symbol indicating the group followed by the serial number of a class belonging to that group.
[0057]
For an unlearned pattern, the serial number is common to all groups: 0 is appended after the symbol of the group.
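The identification information format described above can be illustrated with a small sketch. The group symbols and the use of 0 for unlearned patterns come from the text; the dictionary keys and function names are hypothetical, and VE is assumed here to denote the tone-of-voice group.

```python
# Group symbols as given in the text; the key names are illustrative only.
GROUP_SYMBOLS = {
    "person": "FI",
    "facial_expression": "FE",
    "vocabulary": "VI",
    "tone_of_voice": "VE",
    "force_sense": "TI",
}

UNLEARNED = 0  # serial number shared by all groups for unlearned patterns


def make_id(group, serial=UNLEARNED):
    """Build identification information such as 'FE1' or 'FI0'."""
    if serial < 0:
        raise ValueError("serial number must be 0 (unlearned) or positive")
    return f"{GROUP_SYMBOLS[group]}{serial}"


def parse_id(ident):
    """Split identification information into (group symbol, serial number)."""
    symbol, serial = ident[:2], int(ident[2:])
    if symbol not in GROUP_SYMBOLS.values():
        raise ValueError(f"unknown group symbol: {symbol}")
    return symbol, serial


def is_unlearned(ident):
    """True if the identification information denotes an unlearned class."""
    return parse_id(ident)[1] == UNLEARNED
```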
[0058]
FIG. 7 shows the classes (initial classes) that the pattern recognition unit 2 can recognize in the initial state at the start of operation, their identification information, and the teaching patterns used to construct their dictionary information. The initial classes are signals and tactile stimuli that appeal to the virtual organism's survival instinct or desire for acceptance, chosen on the assumption that the virtual organism in the present invention has evolved to recognize such signals and stimuli innately.
[0059]
The facial expression group has two initial classes: smile (FE1) and angry face (FE2). The tone of voice group has two initial classes: gentle tone (VE1) and angry tone (VE2). The force sense group has three initial classes: stroke (TI1), hit (TI2), and press (TI3).
[0060]
The state update unit 3 in FIG. 1 is a means that holds the internal state information of the virtual organism and updates the internal state, which governs, for example, the emotional state of the virtual organism, according to the identification information from the pattern recognition unit 2 described above. This internal state update processing is performed in step S13 of FIG. 2.
[0061]
In the virtual biological system according to this embodiment of the present invention, three parameters, namely happiness level H, excitement level A, and likability L, are used as the internal state that governs the emotional state, as shown in FIG. 8. The happiness parameter H ranges from -1.0 to 1.0, and its value within this range represents a state on the scale of fear, anxiety, relief, joy, and jealousy, that is, the virtual creature's degree of anxiety or relief. In the absence of a particularly large stimulus, it converges naturally.
[0062]
The degree of excitement parameter A is in the range of 0.0 to 1.0, and this value represents the degree of calm and excitement. If there is no extra large stimulus, it naturally converges to 0.0 (calm).
[0063]
Moreover, the parameter L of likability is in the range of −1.0 to 1.0, and the value within this range represents the degree of likes from dislike.
[0064]
The internal state parameters shown in FIG. 8 change, for example, as shown in FIG. 9, according to each class when one of the initial classes described above is detected by the pattern recognition unit 2. Each class is classified as either a reward-type pleasant stimulus or a punishment-type unpleasant stimulus, as shown in the figure: smiles, gentle tone, and stroking are assigned to pleasant stimuli, while angry faces, angry tone, hitting, and pressing are assigned to unpleasant stimuli. For example, the smile class (FE1), a pleasant stimulus corresponding to an expression of affinity, gradually increases the happiness level H, the likability L, and the excitement level A in proportion to the duration of the stimulus. The angry tone class (VE2), an unpleasant stimulus corresponding to strong reproach, rapidly decreases the happiness level H and the likability L in proportion to the number of stimuli, and rapidly increases the excitement level A.
[0065]
Thus, the values of the happiness level H, the excitement level A, and the likability L change as a result of receiving pleasant or unpleasant stimuli. Information defining how the internal state should be changed for each class is referred to as update information, and the state update unit 3 holds it for every recognizable class.
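The internal state and its per-class update information can be sketched as follows. The parameter ranges (H and L in -1.0 to 1.0, A in 0.0 to 1.0) follow the text; the class name `InternalState`, the table `UPDATE_INFO`, and every numeric increment are assumed values for illustration only, since the actual amounts of FIG. 9 are not reproduced here. Each call is treated as one unit of stimulus (one time step of a sustained stimulus, or one occurrence of a discrete one).

```python
from dataclasses import dataclass


def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


@dataclass
class InternalState:
    happiness: float = 0.0   # H, kept within [-1.0, 1.0]
    excitement: float = 0.0  # A, kept within [0.0, 1.0]
    likability: float = 0.0  # L, kept within [-1.0, 1.0]

    def apply(self, dh, da, dl):
        """Add one update (dH, dA, dL) and keep each value in its range."""
        self.happiness = clamp(self.happiness + dh, -1.0, 1.0)
        self.excitement = clamp(self.excitement + da, 0.0, 1.0)
        self.likability = clamp(self.likability + dl, -1.0, 1.0)


# Hypothetical update information (dH, dA, dL) per unit of stimulus.
UPDATE_INFO = {
    "FE1": (0.05, 0.02, 0.05),    # smile: gradual increase
    "VE2": (-0.30, 0.40, -0.30),  # angry tone: rapid change
}


def update_state(state, detected_ids, update_info=UPDATE_INFO):
    """Apply the update information of every detected class that has one.

    Identifiers without update information (e.g. unlearned "FI0") are ignored.
    """
    for ident in detected_ids:
        if ident in update_info:
            state.apply(*update_info[ident])
    return state
```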
[0066]
The output unit 4 is means for outputting a response by emotion expression and attitude expression based on the internal state and pattern input result updated by the state update unit 3 as described above, and is processed in step S14.
[0067]
The manner of response in the output unit 4 is a property that the virtual creature possesses innately, and is set so that the response is determined according to the internal state and the pattern input result, as shown in FIG. 10. For example, when the happiness level reaches the region of jealousy the virtual creature utters an enchanted voice, and in the region of joy it expresses emotion with a joyful voice. When it detects a person toward whom it has good feelings it tries to meet that person's gaze, and when it detects a person toward whom it has bad feelings it tries to turn its face away.
[0068]
With the configuration described above, the virtual biological system according to this embodiment of the present invention recognizes the initial classes and functions as a virtual biological system that responds, for example, by cowering when it is shouted at (angry tone) or hit (hitting).
[0069]
Next, a mechanism and features of the system of this embodiment for spontaneously learning a pattern will be described.
[0070]
The learning command generation unit 5 in FIG. 1 is a means that generates and outputs a learning command when the internal state, as updated by the state update unit 3, satisfies a predetermined condition. Specifically, the condition is that the absolute value of the happiness level H, the excitement level A, or the likability L, driven to a positive or negative value by a pleasant or unpleasant stimulus, exceeds a predetermined threshold. The learning command covers all observed input patterns, so that the input patterns constituting the situation that produced the emotional state are newly learned or learned again. Accordingly, in step S15 of FIG. 2 it is determined whether or not to learn; when any of the happiness level H, the excitement level A, and the likability L exceeds the predetermined threshold, it is determined that learning should be performed, and the process moves to the pattern learning process of step S16.
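The learning decision of step S15 can be sketched as a simple threshold test on the absolute values of the three parameters. The function name and the threshold value 0.7 are assumptions for illustration; the specification only states that a predetermined threshold is used.

```python
def should_learn(happiness, excitement, likability, threshold=0.7):
    """Return True when any parameter's absolute value exceeds the threshold.

    threshold is a hypothetical value; a separate threshold per parameter
    would be an equally plausible reading of the text.
    """
    return any(abs(v) > threshold for v in (happiness, excitement, likability))
```

Because the absolute value is tested, both strongly pleasant and strongly unpleasant situations trigger learning.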
[0071]
Comparing the present invention with the conventional example described above, “Pichamin” lacks the function of learning the plurality of input patterns observed in a situation that caused a predetermined emotional state. Therefore, “Pichamin” cannot selectively learn input patterns that indicate the occurrence of a situation having value (biological value) from the standpoint of the virtual creature the system is meant to portray, for example a situation that stirs its emotions.
[0072]
For example, suppose that a user, Mr. A, uses the virtual biological system according to the embodiment of the present invention and gently strokes the pressure sensor while smiling and saying “Yoshiyoshi”.
[0073]
At this time, the patterns observed from the situation are the face image pattern of Mr. A, the voice spectrum pattern of “Yoshiyoshi”, and the pressure pattern representing the stroke. Assuming that Mr. A's face and the word “Yoshiyoshi” have not been learned, the virtual biological system can recognize only the smile FE1, the gentle tone VE1, and the stroke TI1 shown in the figure. That is, “FI0, FE1, VI0, VE1, TI1” is obtained as the identification information from the pattern recognition unit 2.
[0074]
Based on “FE1, VE1, TI1”, that is, excluding the unlearned patterns FI0 and VI0, the happiness level H and the likability L increase as for the pleasant stimuli shown in the figure, and the output unit 4 outputs an expression of joy or relief. When such stimuli act cumulatively and the happiness level H or the likability L exceeds the predetermined threshold, the learning command generation unit 5 generates a learning command.
[0075]
The pattern learning unit 6 is a means for learning the observed input patterns in accordance with the learning command from the learning command generation unit 5; it learns the input patterns in step S16 when step S15 determines that learning should be performed. If step S15 determines that learning should not be performed, the process returns to step S11 to wait for input.
[0076]
The input pattern learning in step S16 is performed by one of the following two methods.
[0077]
If the identification information from the pattern recognition unit 2 represents an unlearned class (FI0 and VI0 in the above example), the pattern recognition unit 2 is taken to hold no dictionary information capable of recognizing the input pattern, so dictionary information for recognizing it is newly generated from the pattern using principal component analysis. At the same time, identification information is newly issued and stored in the pattern recognition unit 2.
[0078]
The newly issued identification information is given the number following the serial numbers of the existing classes in the corresponding group. For example, FI1 is issued for the new class identifying Mr. A as a person, and VI1 is issued for the new class identifying the word “Yoshiyoshi”. When a new class is issued, the change in the internal state parameter values brought about at that time by the known classes “FE1, VE1, TI1” is stored in the state update unit 3 as the update information for the new class, that is, as the amount by which each parameter is to be increased or decreased when the new class is detected.
[0079]
For this purpose, the pattern learning unit 6 constantly monitors the difference in the internal state parameter values before and after the generation of the learning command.
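The issuing of new identification information and the recording of update information for a new class can be sketched as follows. Dictionary generation by principal component analysis is deliberately left out; the function names, the set and dict used as stores, and the representation of the internal state as an (H, A, L) tuple are assumptions made for illustration.

```python
def next_class_id(group_symbol, known_ids):
    """Issue the identifier following the existing serial numbers in a group."""
    serials = [int(i[len(group_symbol):]) for i in known_ids
               if i.startswith(group_symbol)]
    return f"{group_symbol}{max(serials, default=0) + 1}"


def register_new_classes(recognized_ids, known_ids, update_info,
                         state_before, state_after):
    """Create a class for every unlearned identifier (serial number 0).

    state_before / state_after are (H, A, L) tuples observed around the
    learning command; their difference is stored as the update information
    of each new class. known_ids (set) and update_info (dict) are modified
    in place. Returns the list of newly issued identifiers.
    """
    delta = tuple(after - before
                  for before, after in zip(state_before, state_after))
    issued = []
    for ident in recognized_ids:
        symbol, serial = ident[:2], int(ident[2:])
        if serial != 0:
            continue  # known class: handled by dictionary reinforcement
        new_id = next_class_id(symbol, known_ids)
        known_ids.add(new_id)
        update_info[new_id] = delta
        issued.append(new_id)
    return issued
```

Applied to the example of Mr. A, the recognition result “FI0, FE1, VI0, VE1, TI1” would yield the new identifiers FI1 and VI1, each associated with the same recorded change in H, A, and L.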
[0080]
On the other hand, when the identification information from the pattern recognition unit 2 represents a known class, the dictionary information of that class stored in the pattern recognition unit 2 is reinforced using the learning subspace method disclosed in the above-mentioned document [1], so that the input pattern can be recognized better, and is stored again in the pattern recognition unit 2.
[0081]
As a result, when a similar pattern is input from the next time, the pattern recognition unit 2 can recognize such a pattern more reliably and can respond accurately.
[0082]
In the example of Mr. A's actions above, when the absolute value of an emotion parameter exceeds the predetermined threshold, the learning command generation unit 5 generates a learning command, and Mr. A's face and the word “Yoshiyoshi” are learned. As a result, the system becomes able to recognize the biologically valuable face of Mr. A, a person who makes it happy, and the biologically valuable word “Yoshiyoshi”, a word that makes it happy. Conversely, however unlearned a face or word may be, the face or words of a person who provides no biological value, neither pleasing nor frightening the system, will not be learned.
[0083]
With the learning command generation unit 5 and the pattern learning unit 6 described above, the virtual biological system according to this embodiment of the present invention can realize a virtual creature that spontaneously learns unlearned patterns and develops its pattern recognition ability. In particular, because learning is triggered by the detection of biologically valuable situations, that is, by the arousal of strong emotions such as “joyful” or “scary”, the present invention realizes a learning ability that is far more biologically plausible than learning patterns indiscriminately or at random.
[0084]
As a result, the virtual biological system of this embodiment of the present invention not only gives innate responses, such as cowering when shouted at or becoming entranced when stroked, but also functions as a virtual biological system that, through experience gained afterwards, comes to respond according to the input pattern, for example by being pleased when it detects an acquaintance's face or becoming dejected when it detects a particular utterance.
[0085]
The gist of the present invention is that the system can determine autonomously whether or not to perform pattern learning, based on the emotional state driven by the recognition results of other, already learned patterns input at almost the same time. This function, whereby pattern learning proceeds in accordance with the recognition results of other learned patterns input from the outside, is not present in “Pichamin” described above as a conventional example, because “Pichamin” recognizes only one group of patterns, namely speech vocabulary.
[0086]
The functions described above can be realized not only by hardware in which components having respective functions are combined but also as software.
[0087]
The present invention can also be implemented as a computer-readable recording medium storing a program for causing a computer to execute a predetermined procedure, a program for causing a computer to function as predetermined means, or a program for causing a computer to realize a predetermined function.
[0088]
As shown in FIG. 11, information representing the pattern recognition method and pattern registration method according to the present invention, for example a program, may be recorded on a recording medium 31, and the recorded information may be applied to the computer device 32 via the recording medium 31 or to the computer device 34 via the communication line 33.
The present invention is not limited to the embodiments described above, and can be implemented with various modifications within the scope of the technical idea.
[0089]
[Effect of the invention]
As described above, according to the present invention, by providing a more appropriate means of determining when to learn a pattern, a virtual biological system capable of spontaneous pattern learning that excludes inappropriate patterns, and a pattern learning method in such a virtual biological system, can be obtained.
[Brief description of the drawings]
FIG. 1 is a diagram showing a functional block configuration example of a virtual biological system according to an embodiment of the present invention.
FIG. 2 is a diagram showing a processing flow in the virtual biological system according to the embodiment of the present invention.
FIG. 3 is a view showing an arrangement example of various input means in the robot-type virtual biological system according to the embodiment of the present invention.
FIG. 4 is a view showing an arrangement example of various input means in the virtual biological system in the screen display format according to the embodiment of the present invention.
FIG. 5 is a view for explaining groups that can be recognized and their constituent dictionaries in the virtual biological system according to the embodiment of the present invention;
FIG. 6 is a diagram for explaining groups that can be recognized in the virtual biological system according to the embodiment of the present invention and formats of their identification information.
FIG. 7 is a diagram for explaining classes and their teaching patterns that can be recognized in the initial state in the virtual biological system according to the embodiment of the present invention.
FIG. 8 is a diagram for explaining an internal state parameter governing an emotion state in the virtual biological system according to the embodiment of the present invention.
FIG. 9 is a diagram for explaining changes in internal state parameters by stimulation in the virtual biological system according to the embodiment of the present invention.
FIG. 10 is a diagram for explaining a response according to an internal state parameter in the virtual biological system according to the embodiment of the present invention.
FIG. 11 is a view for explaining another embodiment in the case where the virtual biological system of the present invention is realized by being transferred by a recording medium.
[Explanation of symbols]
1 ... Input unit, 2 ... Pattern recognition unit, 3 ... State update unit, 4 ... Output unit, 5 ... Learning command generation unit, 6 ... Pattern learning unit, 11 ... Robot, 12 ... TV camera, 13 ... Microphone, 14 ... Pressure sensor, 21 ... Computer, 22 ... Display virtual creature, 23 ... TV camera, 24 ... Microphone, 25 ... Mouse, 26 ... Left button, 31 ... Recording medium, 32, 34 ... Computer device.

Claims

An input unit for inputting pattern information from a plurality of types of signal sources, a pattern recognition unit for recognizing the input pattern information based on dictionary information, and a state for updating an emotional state based on a recognition result by the pattern recognition unit In a virtual biological system comprising update means and output means for outputting a response according to the emotional state,
the virtual biological system further comprising learning means for learning, when the emotional state satisfies a predetermined condition, new dictionary information for recognizing the pattern information and an amount by which the emotional state is updated upon detection of the pattern information.

An input process for inputting pattern information from a plurality of types of signal sources, a pattern recognition process for recognizing the input pattern information based on dictionary information, and a state for updating an emotional state based on a recognition result by the pattern recognition means A pattern learning method in a virtual biological system, comprising: an update process; and an output process for outputting a response according to the emotion state ,
It further has a pattern learning process for learning new dictionary information for recognizing the pattern information and an amount of updating the emotional state by detecting the pattern information when the emotion state satisfies a predetermined condition. A pattern learning method in a virtual biological system characterized by the above.

The virtual biological system according to claim 1, wherein the predetermined condition is that the intensity of the emotional state exceeds a predetermined threshold.

The pattern learning method in the virtual biological system according to claim 2, wherein the predetermined condition is that the intensity of the emotional state exceeds a predetermined threshold.

Input means for inputting pattern information from one or more signal sources;
Pattern recognition means for recognizing the pattern information input by the input means for each group based on dictionary information classified into a plurality of groups;
Pattern learning means for learning new dictionary information for recognizing the pattern information, and an amount of updating the emotional state by detecting the pattern information ;
State update means for updating the emotion state based on the recognition result by the pattern recognition means;
A learning command means for giving a learning command to the pattern learning means when the emotional state satisfies a predetermined condition;
Virtual biological systems, characterized by comprising output means for outputting a response in accordance with the emotional state.

Input means for inputting pattern information from one or more signal sources;
Pattern recognition means for recognizing the pattern information input by the input means for each group based on dictionary information classified into a plurality of groups;
Pattern learning means for learning new dictionary information for recognizing the pattern information, and an amount of updating the emotional state by detecting the pattern information ;
State update means for updating the emotional state by changing these numerical values of the emotional state represented by a plurality of numerical values based on the recognition result by the pattern recognition means;
A learning instruction means for issuing a learning instruction to the pattern learning means when a plurality of numerical values representing the emotional state have reached a predetermined size;
A virtual biological system comprising output means for outputting a response according to the emotional state.

Input means for inputting pattern information from one or more signal sources;
Pattern recognition means for recognizing a plurality of pattern information input within a predetermined time by the input means for each group based on dictionary information classified into a plurality of groups;
Pattern learning means for learning new dictionary information for recognizing the pattern information, together with an update amount by which the emotional state is updated when the pattern information is detected;
State update means for updating the emotional state, which is represented by a plurality of numerical values, by changing those numerical values based on the recognition result by the pattern recognition means;
A learning instruction means for issuing a learning instruction to the pattern learning means when a plurality of numerical values representing the emotional state have reached a predetermined size;
A virtual biological system, characterized by comprising output means for outputting a response in accordance with the emotional state.

The virtual biological system according to claim 6 or 7, wherein the plurality of numerical values representing the emotional state are numerical values representing at least happiness, excitement, and likability.

The virtual biological system according to claim 6, wherein the pattern information is recognized for at least one of the groups of persons and vocabulary and for at least one of the groups of facial expressions, vocabulary, and force sense.

The virtual biological system according to claim 6 or 7, wherein the pattern information is recognized separately for the groups of persons, facial expressions, vocabulary, and force sense, and the numerical values of happiness, excitement, and likability that represent the emotional state are changed according to whether the recognition results for the facial expression, vocabulary, and force sense groups are pleasant or unpleasant stimuli.

An input step of inputting pattern information from one or more signal sources;
A pattern recognition step of recognizing, for each group, a plurality of pieces of pattern information input within a predetermined time in the input step, based on dictionary information classified into a plurality of groups;
A state update step of updating the emotional state, which is represented by a plurality of numerical values, by changing those numerical values based on the recognition result of the pattern recognition step;
A learning command step of issuing, when at least one of the plurality of numerical values representing the emotional state reaches a predetermined size, a command to learn new dictionary information for recognizing the pattern information and an update amount by which the emotional state is updated when the pattern information is detected;
A pattern learning step of learning the new dictionary information and the update amount in accordance with the command issued in the learning command step;
A pattern learning method in a virtual biological system, characterized by comprising an output step of outputting a response in accordance with the emotional state.

The pattern learning method in the virtual biological system according to claim 11, wherein the pattern information is recognized separately for the groups of persons, facial expressions, vocabulary, and force sense, and the numerical values of happiness, excitement, and likability that represent the emotional state are changed according to whether the recognition results for the facial expression, vocabulary, and force sense groups are pleasant or unpleasant stimuli.
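
For illustration only, the steps of the pattern learning method claimed above (input, per-group recognition, state update, learning command, pattern learning, output) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation disclosed in the specification. The names `Entry`, `VirtualCreature`, `sim_threshold`, and `learn_threshold` are hypothetical, as are the rule that derives a new update amount from the current emotional state, the use of the absolute value to test for the "predetermined size", and the choice of the dominant emotion as the response. Cosine similarity is used because the description gives it as one example of a similarity measure.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Entry:
    """One dictionary entry: a reference pattern plus its update amount."""
    label: str
    vector: list
    delta: dict  # amount added to each emotional value when detected


@dataclass
class VirtualCreature:
    dictionaries: dict = field(default_factory=dict)  # group -> list of Entry
    emotion: dict = field(default_factory=lambda: {
        "happiness": 0.0, "excitement": 0.0, "likability": 0.0})
    sim_threshold: float = 0.9    # assumed recognition threshold
    learn_threshold: float = 1.0  # assumed "predetermined size"

    def recognize(self, group, vector):
        """Pattern recognition step: best entry of the group at or above threshold."""
        best, best_sim = None, self.sim_threshold
        for entry in self.dictionaries.get(group, []):
            sim = cosine(vector, entry.vector)
            if sim >= best_sim:
                best, best_sim = entry, sim
        return best

    def update_state(self, entry):
        """State update step: apply the recognized entry's update amount."""
        if entry is not None:
            for name, amount in entry.delta.items():
                self.emotion[name] += amount

    def should_learn(self):
        """Learning command step: at least one value reached the threshold."""
        return any(abs(v) >= self.learn_threshold for v in self.emotion.values())

    def learn(self, group, vector):
        """Pattern learning step: store the pattern and an update amount.

        Assumption: the update amount is a fixed fraction of the current
        emotional state; the claims do not specify how it is derived.
        """
        delta = {name: 0.1 * v for name, v in self.emotion.items()}
        entries = self.dictionaries.setdefault(group, [])
        entries.append(Entry(f"{group}-{len(entries)}", list(vector), delta))

    def step(self, inputs):
        """Process (group, vector) pairs received within one time window."""
        results = [(g, v, self.recognize(g, v)) for g, v in inputs]
        for _, _, entry in results:
            self.update_state(entry)
        if self.should_learn():
            for group, vector, entry in results:
                if entry is None:  # only unrecognized patterns are learned
                    self.learn(group, vector)
        # Output step: report the dominant emotional value as the response.
        return max(self.emotion, key=lambda k: abs(self.emotion[k]))
```

In this sketch, a pattern that is not yet recognized (for example, an unknown person's face) is learned only when it arrives in the same time window in which a recognized stimulus (for example, a smile) has pushed one of the emotional values to the threshold, so learning is triggered by the emotional state rather than by an explicit instruction from the user.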