JP4032492B2 - Agent device

Info

Publication number: JP4032492B2
Application number: JP09538698A
Authority: JP
Inventors: 智氣窪田; 孝二堀; 松田　　学; 和英足立; 康二向井
Original assignee: Equos Research Co Ltd
Current assignee: Equos Research Co Ltd
Priority date: 1998-03-23
Filing date: 1998-03-23
Publication date: 2008-01-16
Anticipated expiration: 2018-03-23
Also published as: JPH11272639A

Description

【０００１】
【発明の属する技術分野】
本発明は、エージェント装置に係り、例えば、擬人化されたエージェントを相手に車両内での会話等が可能なコミュニケーション機能を備えたエージェント装置に関する。
【０００２】
【従来の技術】
従来車両内において、運転者による走行環境を向上させるようにしたものとして、ラジオやカセットテーププレーヤが搭載されている。
また、車両に搭載したアマチュア無線機や携帯電話等の無線通信機器を使用して、車両外の知人等との会話を楽しむことで、走行環境を向上させるようにした車両もある。
【０００３】
【発明が解決しようとする課題】
上述のような従来の車両におけるラジオ等では運転者に対して一方向の情報提示にすぎず、双方向の会話等をすることができなかった。
一方、携帯電話等による場合には会話をすることができるが、コール待ち、ダイヤル等によって通話相手を捜さなければならなかった。たとえ、通話相手が見つかったとしても、車両の状況といった運転者の一方的な都合にあわせた、適切な会話をしてくれるわけではなかった。
このように、従来の車両には、車両の過去の状態などの履歴・運転者の状態に応じて、擬人化されたエージェントが存在しないため、車両が愛着のわかないただの乗り物としての道具でしか役割を持たない場合もあった。
【０００４】
なお、運転者に対する情報の伝達を、人間の表情や動作などにより行うようにした技術が特開平９−１０２０９８号公報において提示されている。
しかし、この公報に記載された技術は、過去の運転者の応答等の履歴や性別、年齢等のユーザ情報などに基づいて表示が変わるわけではなく、同一の状況が生じた場合には常に同一の表示がされるものである。すなわち、限られたセンサ出力に対して常に同一の表示を行うものであり、視認性が向上された従来の計器類の範疇に入るべきものである。また、車両においては、運転者が認知すべき情報は車内車外を含め多種多様に存在し、かつ安全性等からこれらの情報は一見して確実に把握する必要がある。しかし、人間の表情や動作により良好な視認性で種々の情報を伝達するには、情報量に限界がある。
【０００５】
本発明は、擬人化されたエージェントによる行為とエージェントの背景とにより種々の情報を伝達し、且つ運転者とのコミュニケーションをはかることが可能なエージェント装置を提供することを目的とする。
【０００６】
【課題を解決するための手段】
請求項１に記載した発明では、擬人化されたエージェント及び該エージェントの背景を表示する画像表示装置と、前記画像表示装置に表示されるエージェントの行為を決定する行為決定手段と、前記画像表示装置に表示される前記背景を決定する背景決定手段と、前記行為決定手段で決定された行為を行うエージェントと前記背景決定手段により決定された背景とを前記画像表示装置に表示させる画像表示手段と、前記画像表示装置に表示されたエージェントの行為に対する、ユーザの音声による応答を音声認識する音声認識手段と、前記音声認識手段による、肯定を表す複数の単語からなる肯定単語グループと、否定を表す複数の単語からなる否定単語グループの各単語に対する音声認識率を算出する認識率算出手段と、を備え、前記背景決定手段は、ユーザからの音声による応答として、肯定を表す単語又は否定を表す単語の音声入力を待機する場合、音声認識可能な前記肯定単語グループの単語と前記否定単語グループの単語のうち、ユーザに音声入力を推奨する単語として、前記肯定単語グループの中で最も認識率が高い単語を表示したプラカードと、前記否定単語グループの中で最も認識率が高い単語を表示したプラカードを背景として決定する、ことを特徴とするエージェント装置を提供する。
請求項２記載の発明では、音声出力装置を備え、前記音声認識手段による音声認識が、前記音声出力装置によるエージェントの音声出力中や音声認識結果の判定中のために音声認識できない状態か否かを判断する音声認識状態判断手段と、をさらに備え、前記画像表示装置は、擬人化されたエージェント及び、該エージェントの表示画面の縁部に表示される枠を含めた背景を表示し、前記背景決定手段は、前記音声認識できない状態と判断された場合と、音声認識出来る状態と判断された場合とで、前記背景の枠の色を異なる色に決定する、ことを特徴とする請求項１に記載のエージェント装置を提供する。
【００１０】
【発明の実施の形態】
以下、本発明のエージェント装置における好適な実施の形態について、図１から図１１を参照して詳細に説明する。
（１）実施形態の概要
本実施形態のエージェント装置では、擬人化されたエージェントとその背景とを画像（平面的画像、ホログラフィ等の立体的画像等）により車両内の表示装置に表示させる。車両自体、運転者、同乗者、対向車等を含む車両の状況（運転者の応答や反応等も含む）の判断を行い、各時点での車両状況に基づいて、運転者や車両に対して様々なバリエーションをもった対応（行為＝行動と音声）をする。更に、背景も、エージェントの行為と同様に、車両状況に基づいて決定され表示される。これにより運転者は、自分固有のエージェントと車両内でつき合う（コミュニケーションする）ことが可能になり、車両内での環境を快適にすることができる。また、エージェントと背景との両方により、多くの情報を、はっきりと区別して取得でき、車両内での環境をより快適にすることができる。
ここで、本実施形態において擬人化されたエージェントとは、特定の人間、生物、漫画のキャラクター等との同一性があり、その同一性のある生物が、同一性・連続性を保つようなある傾向の出力（動作、音声により応答）を行うものである。また、同一性・連続性は特有の個性を持つ人格として表現され、電子機器内の一種の疑似生命体としてもとらえることができる。車両内に出現させる本実施形態のエージェントは、人間と同様に判断する疑似人格化（仮想人格化）された主体である。
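In this embodiment, the agent also judges the situation of the vehicle, including the vehicle itself and the driver, and performs various operations such as route guidance and equipment operation on the driver's behalf or assists with them. It further learns the vehicle situation and the driver's responses, and acts on judgments that include these learning results. Consequently, even in the same vehicle situation, the content of the communication differs according to what has been learned in the past. Occasionally the agent makes errors of judgment within a range that does not affect the running of the vehicle, and may give an unnecessary (clumsy) response as a result. The agent then determines from the driver's response whether it made an error of judgment, and learns from it.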
また更に、本実施形態中においては、背景についても車両の状況や運転者の応答等の学習結果を含めた判断により表示が決定され、運転者に合わせた背景を表示することにより、より一層効果的なコミュニケーションや良好な走行環境を提供できるようになっている。
【００１１】
（２）実施形態の詳細
図１は、本実施形態におけるエージェント装置の構成を示すブロック図である。
本実施形態では、コミュニケーション機能全体を制御する全体処理部１を備えている。この全体処理部１は、設定した目的地までの経路を探索して音声や画像表示により案内するナビゲーション処理部１０、エージェント処理部１１、ナビゲーション処理部１０とエージェント処理部１１に対するＩ／Ｆ部１２、エージェント画像や地図画像等の画像出力や入力画像を処理する画像処理部１３、エージェント音声や経路案内音声等の音声出力や入力される音声を制御する音声制御部１４、及び車両や運転者に関する各種状況の検出データを処理する状況情報処理部１５を有している。
エージェント処理部１１は、車両の状況から、車両内に出現させるエージェントの行為と背景とを決定し、車両の状況や運転者による過去の応対等を学習して適切な会話や制御を運転者に応じて行うようになっている。
【００１２】
ナビゲーション処理部１０とエージェント処理部１１は、データ処理及び各部の動作の制御を行うＣＰＵ（中央処理装置）と、このＣＰＵにデータバスや制御バス等のバスラインで接続されたＲＯＭ、ＲＡＭ、タイマ等を備えている。両処理部１０、１１はネットワーク接続されており、互いの処理データを取得することができるようになっている。
ＲＯＭはＣＰＵで制御を行うための各種データやプログラムが予め格納されたリードオンリーメモリであり、ＲＡＭはＣＰＵがワーキングメモリとして使用するランダムアクセスメモリである。
【００１３】
本実施形態のナビゲーション処理部１０とエージェント処理部１１は、ＣＰＵがＲＯＭに格納された各種プログラムを読み込んで各種処理を実行するようになっている。なお、ＣＰＵは、記録媒体駆動装置２３にセットされた外部の記録媒体からコンピュータプログラムを読み込んで、エージェント記憶装置２９やナビゲーションデータ記憶装置、図示しないハードディスク等のその他の記憶装置に格納（インストール）し、この記憶装置から必要なプログラム等をＲＡＭに読み込んで（ロードして）実行するようにしてもよい。また、必要なプログラム等を記録媒体駆動装置２３からＲＡＭに直接読み込んで実行するようにしてもよい。
【００１４】
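The navigation processing unit 10 is connected to the current position detection device 21 and the navigation data storage device 30; the agent processing unit 11 is connected to the agent data storage device 29; the I/F unit 12 is connected to the input device 22, the recording medium drive device 23, and the communication control device 24; the image processing unit 13 is connected to the display device 27 and the imaging device 28; the voice control unit 14 is connected to the voice output device 25 and the microphone 26; and the situation information processing unit 15 is connected to the situation sensor unit 40.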
【００１５】
現在位置検出装置２１は、車両の絶対位置（緯度、経度による）を検出するためのものであり、人工衛星を利用して車両の位置を測定するＧＰＳ（Global Positioning System)受信装置２１１と、方位センサ２１２と、舵角センサ２１３と、距離センサ２１４と、路上に配置されたビーコンからの位置情報を受信するビーコン受信装置２１５等が使用される。
ＧＰＳ受信装置２１１とビーコン受信装置２１５は単独で位置測定が可能であるが、ＧＰＳ受信装置２１１やビーコン受信装置２１５による受信が不可能な場所では、方位センサ２１２と距離センサ２１４の双方を用いた推測航法によって現在位置を検出するようになっている。
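The dead-reckoning fallback described above (heading from the direction sensor 212 combined with distance from the distance sensor 214) can be sketched as follows. This is only an illustration under stated assumptions: a flat local x/y plane in metres, headings measured clockwise from north, and the hypothetical function name `dead_reckon`; the patent does not specify the actual computation.

```python
import math

def dead_reckon(x, y, heading_deg, distance_m):
    """Advance a position (metres, east = +x, north = +y) by one sensor step.

    heading_deg: heading from the direction sensor, clockwise from north.
    distance_m:  distance travelled since the last step, from the distance sensor.
    """
    heading = math.radians(heading_deg)
    return (x + distance_m * math.sin(heading),
            y + distance_m * math.cos(heading))

# Start from the last GPS/beacon fix (taken as the origin here)
# and accumulate sensor steps while no fix is available.
position = (0.0, 0.0)
for heading_deg, distance_m in [(0.0, 10.0), (90.0, 5.0)]:
    position = dead_reckon(*position, heading_deg, distance_m)
```

Errors accumulate with every step in such a scheme, which is consistent with the text using it only where GPS and beacon reception is unavailable.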
方位センサ２１２は、例えば、地磁気を検出して車両の方位を求める地磁気センサ、車両の回転角速度を検出しその角速度を積分して車両の方位を求めるガスレートジャイロや光ファイバジャイロ等のジャイロ、左右の車輪センサを配置しその出力パルス差（移動距離の差）により車両の旋回を検出することで方位の変位量を算出するようにした車輪センサ、等が使用される。
舵角センサ２１３は、ステアリングの回転部に取り付けた光学的な回転センサや回転抵抗ボリューム等を用いてステアリングの角度αを検出する。
距離センサ２１４は、例えば、車輪の回転数を検出して計数し、または加速度を検出して２回積分するもの等の各種の方法が使用される。
【００１６】
入力装置２２は、車両の状況としての、ユーザに関する情報（年齢、性別、趣味、性格など）を入力したり、エージェントからの問い合わせに対して運転者が応答するための手段である。なお、ユーザに関する情報は、入力装置２２からユーザが入力する場合に限らず、例えば、プロ野球が好きか否か、好きな球団名等に関する各種問い合わせをエージェントがユーザに行い、ユーザの回答内容から取得するようにしてもよい。
また、入力装置２２は、ナビゲーション処理における走行開始時の現在地（出発地点）や目的地（到達地点）、情報提供局へ渋滞情報等の情報の請求を発信したい車両の所定の走行環境（発信条件）、車両内で使用される携帯電話のタイプ（型式）などを入力するためのものでもある。
入力装置２２には、タッチパネル（スイッチとして機能）、キーボード、マウス、ライトペン、ジョイスティック、赤外線等によるリモコン、音声認識装置などの各種の装置が使用可能である。また、赤外線等を利用したリモコンと、リモコンから送信される各種信号を受信する受信部を備えてもよい。リモコンには、画面上に表示されたカーソルの移動操作等を行うジョイスティックの他、メニュー指定キー（ボタン）、テンキー等の各種キーが配置される。
【００１７】
記録媒体駆動装置２３は、ナビゲーション処理部１０やエージェント処理部１１が各種処理を行うためのコンピュータプログラムを外部の記録媒体から読み込むのに使用される駆動装置である。記録媒体に記録されているコンピュータプログラムには、各種のプログラムやデータ等が含まれる。
ここで、記録媒体とは、コンピュータプログラムが記録される記録媒体をいい、具体的には、フロッピーディスク、ハードディスク、磁気テープ等の磁気記録媒体、メモリチップやＩＣカード等の半導体記録媒体、ＣＤ−ＲＯＭやＭＯ、ＰＤ（相変化書換型光ディスク）等の光学的に情報が読み取られる記録媒体、紙カードや紙テープ、文字認識装置を使用してプログラムを読み込むための印刷物等の用紙（および、紙に相当する機能を持った媒体）を用いた記録媒体、その他各種方法でコンピュータプログラムが記録される記録媒体が含まれる。
【００１８】
記録媒体駆動装置２３は、これらの各種記録媒体からコンピュータプログラムを読み込む他に、記録媒体がフロッピーディスクやＩＣカード等のように書き込み可能な記録媒体である場合には、ナビゲーション処理部１０やエージェント処理部１１のＲＡＭや記憶装置２９、３０のデータ等をその記録媒体に書き込むことが可能である。
例えば、ＩＣカードに、エージェント機能に関する学習内容（学習項目データ、応答データ）や、ユーザに関する情報等を記憶させ、他の車両を運転する場合でもこれらのデータを記憶させたＩＣカードを使用することで、自分の好みに合わせて（過去の応対の状況に応じて）学習されたエージェントとコミュニケーションすることが可能になる。これにより、車両毎のエージェントではなく、運転者に固有のエージェントを車両内に出現させることが可能になる。
【００１９】
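The communication control device 24 is designed to be connected to a mobile phone or other wireless communication equipment. In addition to calls over the telephone line, the communication control device 24 can communicate with an information providing station that supplies traffic information such as road congestion and traffic restrictions, and with an information providing station that supplies karaoke data for online karaoke in the vehicle.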
また、通信制御装置２４を介して、エージェント機能に関する学習データやユーザに関する情報を送受信することも可能である。
【００２０】
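The voice output device 25 consists of a plurality of speakers arranged in the vehicle and outputs sound controlled by the voice control unit 14, for example guidance voice for spoken route guidance and voices or sounds that accompany the agent's actions. The voice output device 25 may double as the audio speakers. In response to tuning instructions from the driver, the voice control unit 14 can control the timbre, accent, and so on of the voice output from the voice output device 25.
The microphone 26 functions as voice input means for inputting speech to be recognized by the voice control unit 14, for example spoken input of a destination in navigation processing and the driver's conversation with the agent (responses and the like). The microphone 26 may double as the microphone used for karaoke such as online karaoke, or a dedicated directional microphone may be used to pick up the driver's voice accurately.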
音声出力装置２５とマイク２６とでハンズフリーユニットを形成させて、携帯電話を介さずに、電話通信における通話を行えるようにしてもよい。
【００２１】
表示装置２７には、ナビゲーション処理部１０の処理による経路案内用の道路地図や各種画像情報が表示されたり、エージェント処理部１１によるエージェントの各種行動（動画）が表示されたりするようになっている。また、エージェントが表示される場合には、同時にエージェントの背景が表示されるようになっている。また、撮像装置２８で撮像された車両内外の画像も画像処理部１３で処理された後に表示されるようになっている。
表示装置２７は、液晶表示装置、ＣＲＴ等の各種表示装置が使用される。
なお、この表示装置２７は、例えばタッチパネル等の、前記入力装置２２としての機能を兼ね備えたものとすることができる。
【００２２】
撮像装置２８は、画像を撮像するためのＣＣＤ（電荷結合素子）を備えたカメラで構成されており、運転者を撮像する車内カメラの他、車両前方、後方、右側方、左側方を撮像する各車外カメラが配置されている。撮像装置２８の各カメラにより撮像された画像は、画像処理部１３に供給され、画像認識等の処理が行われ、各認識結果をエージェント処理部１１によるプログラム番号の決定にも使用するようになっている。
【００２３】
エージェントデータ記憶装置２９は、本実施形態によるエージェント機能を実現するために必要な各種データ（プログラムを含む）が格納される記憶装置である。このエージェントデータ記憶装置２９には、例えば、フロッピーディスク、ハードディスク、ＣＤ−ＲＯＭ、光ディスク、磁気テープ、ＩＣカード、光カード等の各種記録媒体と、その駆動装置が使用される。
この場合、例えば、学習項目データ２９２、応答データ２９３を持ち運びが容易なＩＣカードやフロッピーディスクで構成し、その他のデータをハードディスクで構成するというように、複数種類の異なる記録媒体と駆動装置で構成し、駆動装置としてそれらの駆動装置を用いるようにしてもよい。
【００２４】
エージェントデータ記憶装置２９には、エージェントプログラム２９０、プログラム選択テーブル２９１、学習項目データ２９２、応答データ２９３、図４に例示したエージェントの容姿や行動、背景を画像表示するための画像データ２９４、背景選択テーブル２９６、応答認識データ２９８、その他のエージェントのための処理に必要な各種のデータが格納されている。
【００２５】
エージェントプログラム２９０には、エージェント機能を実現するためのエージェント処理プログラムや、エージェントと運転者とがコミュニケーションする場合の細かな行動を背景と共に表示装置２７に画像表示し、またその行動に対応した会話を音声出力装置２５から出力するためのコミュニケーションプログラムがプログラム番号順に格納されている。
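For the voice of each program number, the agent program 290 stores multiple types of voice data, and the driver can select the voice from the input device 22 or the like together with the agent's appearance. The agent's voices include a male voice, a female voice, a child's voice, a mechanical voice, an animal-like voice, the voice of a particular voice actor or actor, and the voice of a particular character, from which the driver selects as desired. The selected voice can be changed at any time.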
【００２６】
プログラム選択テーブル２９１は、エージェントプログラム２９０に格納されているコミュニケーションプログラムを選択するためのテーブルである。
図２はプログラム選択テーブル２９１を表したものであり、図３はプログラム選択テーブル２９１で選択される各プログラム番号に対応した、エージェントの行為（行動と発声）内容を表したものである。
この図２、図３で示されているプログラム番号は、エージェントプログラム２９０に格納されている各コミュニケーションプログラムの番号と一致している。
【００２７】
図４は、図２、図３のプログラム番号００００１〜００００２により表示装置２７に表示されるエージェントの「かしこまってお辞儀」行動についての数画面を表したものである。
この図４に示されるように、エージェントＥは、口元を引き締めると共に手を膝に当てながら、お辞儀をすることでかしこまったお辞儀であることが表現されている。この行動と共にエージェントＥが話す言葉（発声）は、車両状況や学習状況、エージェントの性格等によって変えられる。
【００２８】
エンジンの冷却水温度が低い場合には、エンジンの調子に合わせて行動「眠そうに…」が選択される。眠そうな表現として、瞼が下がった表情にしたり、あくびや伸びをした後に所定の行動（お辞儀等）をしたり、最初に目をこすったり、動きや発声を通常よりもゆっくりさせたりすることで表すことができる。これらの眠そうな表現は、常に同一にするのではなく、行動回数等を学習することで適宜表現を変更する。
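For example, the agent rubs its eyes once in every three times (action A), yawns once in every ten times (action B), and otherwise shows drooping eyelids (action C). These variations are realized by combining the additional programs for action B and action C with the basic program for action A. Which actions to combine is decided by counting, as a learning item, the number of times the basic action A program has been executed and combining additional programs according to that count.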
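The count-based variation of the sleepy expression could look roughly like the sketch below. The function name and the return labels are hypothetical, and the priority given to the rarer yawn when a count is a multiple of both 3 and 10 is an assumption; the text does not say how such a collision is resolved.

```python
def sleepy_expression(run_count):
    """Pick the sleepy expression for the run_count-th execution (1-based)."""
    if run_count % 10 == 0:
        return "yawn"            # action B: once in every ten times
    if run_count % 3 == 0:
        return "rub_eyes"        # action A: once in every three times
    return "droopy_eyelids"      # action C: all other times

# Expressions shown over the first ten executions.
expressions = [sleepy_expression(n) for n in range(1, 11)]
```

The only state needed is the execution counter, which matches the text's description of counting program executions as a learning item.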
また、行動「元気よく」を表現する場合には、音声の抑揚を大きくしたり、エージェントＥを走りながら画面に登場させたりすることで表現する。
【００２９】
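Each item shown in FIG. 2 is a selection condition for choosing a program number. Some items are determined from the various vehicle and driver situations detected by the situation sensor unit 40 (time, start location, coolant temperature, shift position, accelerator opening, and so on), and others are determined from the learned content stored in the learning item data 292 and the response data 293 (today's ignition-ON count, time elapsed since the previous shutdown, total number of starts, and so on).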
プログラム選択テーブル２９１中で、これら全項目を満足するプログラムは必ず一義的に決定するようになっている。なお、テーブル中で「○」印は、そのプログラム番号が選択されるために満たす必要がある項目を示し、「−」印、「無印」はそのプログラムの選択には考慮されない項目を示している。
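As a rough illustration of how such a selection table might be evaluated, the sketch below stores only the required ("○") cells of each row and ignores every other item. The item names and the row contents are hypothetical and are not taken from FIG. 2; only the matching rule follows the text.

```python
# Hypothetical rows: program number -> items that must hold (the "○" cells).
# Items absent from a row correspond to "-" or blank cells and are ignored.
PROGRAM_TABLE = {
    "00001": {"first_start_today": True, "coolant_low": False},
    "00002": {"first_start_today": True, "coolant_low": True},
}

def select_program(situation, table=PROGRAM_TABLE):
    """Return the program number whose required items all match, else None."""
    matches = [number for number, required in table.items()
               if all(situation.get(item) == value
                      for item, value in required.items())]
    # The text says a match is unique; treat anything else as a table error.
    if len(matches) > 1:
        raise ValueError(f"ambiguous table rows: {matches}")
    return matches[0] if matches else None

chosen = select_program({"first_start_today": True, "coolant_low": True,
                         "shift_position": "P"})
```

Here `shift_position` is present in the situation but not referenced by either row, so it plays the role of an item that is not considered for selection.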
【００３０】
図２、図３では、イグニッションをＯＮにした場合のコミュニケーション（挨拶）に関連する行為と選択条件について記載しているが、その他各種行為（行動と発声）を規定するプログラムを選択するためのプログラム番号と選択条件も種々規定されている。
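For example, programs are also defined in which, on condition that the brakes are applied suddenly, the agent falls on its backside, staggers forward, or cries out in surprise. The agent's choice among these actions changes as it learns about sudden braking: for example, it falls on its backside for the first through third sudden braking events, staggers for the fourth through tenth, and from the tenth onward merely braces itself by putting one foot forward, so that the agent gets used to sudden braking in stages. If one week passes after the last sudden braking event, the agent moves back one stage.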
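The staged habituation to sudden braking might be modelled as in the following sketch. Several points are assumptions rather than statements from the text: the class and label names are hypothetical, the third stage is taken to start at the eleventh event (the text's ranges overlap at the tenth), and "moving back one stage" is interpreted as restarting the previous stage from its beginning.

```python
from datetime import datetime, timedelta

# (events already experienced when the stage begins, reaction shown)
STAGES = [(0, "fall_on_backside"), (3, "stagger"), (10, "brace_one_step")]
REGRESS_AFTER = timedelta(weeks=1)

class BrakeHabituation:
    def __init__(self):
        self.count = 0          # sudden-braking events counted so far
        self.last_event = None  # time of the previous event

    def _stage(self):
        return max(i for i, (start, _) in enumerate(STAGES) if self.count >= start)

    def react(self, now):
        """Register one sudden-braking event and return the agent's reaction."""
        if self.last_event is not None and now - self.last_event >= REGRESS_AFTER:
            # Assumption: "one stage back" restarts the previous stage.
            self.count = STAGES[max(self._stage() - 1, 0)][0]
        reaction = STAGES[self._stage()][1]
        self.count += 1
        self.last_event = now
        return reaction

start = datetime(1998, 3, 23, 9, 0)
agent = BrakeHabituation()
reactions = [agent.react(start + timedelta(minutes=i)) for i in range(11)]
after_gap = agent.react(start + timedelta(days=30))
```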
【００３１】
学習項目データ２９２及び応答データ２９３は、運転者の運転操作や応答によってエージェントが学習した結果のデータである。従って、学習項目データ２９２と応答データ２９３は、各運転者毎にそのデータが格納・更新（学習）されるようになっている。
【００３２】
学習項目データ２９２と応答データ２９３は共にエージェントの学習により格納、更新されるデータであり、その内容がそれぞれ図５、図６に概念的に示されている。
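As shown in FIG. 5, the learning item data 292 stores the selection condition items used to select a communication program in the program selection table 291 (FIG. 2): the total number of starts, the previous end date and time, today's ignition-ON count, the remaining fuel at the last five refuelings, the audio equipment operating conditions and the equipment operated at that time, and so on. It also stores the number and date/time of rests, used to decide whether to launch a program selected by the selection conditions (that is, whether to take a rest), together with default values and other data.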
【００３３】
通算起動回数には、イグニッションを起動した通算回数が格納され、イグニッションがＯＮされる毎にカウントアップされる。
前回終了日時には、イグニッションをＯＦＦにする毎にその日時が格納される。
今日のイグニッションＯＮ回数には、その日におけるイグニッションＯＮの回数と、１日の終了時間が格納される。イグニッションがＯＮされる毎にカウントアップされるが、１日が終了するとデータが”０”に初期化される。１日の終了時間はデフォルト値として２４：００が格納されている、この時間はユーザ（運転者）の生活パターンによって変更することが可能である。時間が変更された場合には、変更後の時間が格納される。
【００３４】
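The "remaining fuel at the last five refuelings" field stores the remaining fuel detected immediately before each refueling. Each time the vehicle is refueled, the entries are shifted to the left (the oldest, leftmost entry is deleted) and the level just before the current refueling is stored at the far right.
This data is used as follows: when the detected value G1 of the fuel detection sensor 415, described later, falls to or below the average G2 of the remaining fuel at the last five refuelings (G1 ≤ G2), the agent E appears on the display device 27 with an action urging the driver to refuel, and a voice such as "I'm getting hungry! I'd like some gasoline!" is output from the voice output device 25.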
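A minimal sketch of this five-entry history and the G1 ≤ G2 check is shown below. The class name, the use of litres as the unit, and the choice not to prompt before any history exists are assumptions made for illustration.

```python
from collections import deque

class RefuelAdvisor:
    def __init__(self):
        # Remaining fuel measured just before each of the last five refuelings.
        self.levels_before_refuel = deque(maxlen=5)

    def record_refuel(self, remaining_litres):
        """Store the level detected just before refueling; the oldest value drops out."""
        self.levels_before_refuel.append(remaining_litres)

    def should_prompt(self, current_litres):
        """True when the current level G1 is at or below the average G2."""
        if not self.levels_before_refuel:
            return False  # assumption: no prompt until some history exists
        g2 = sum(self.levels_before_refuel) / len(self.levels_before_refuel)
        return current_litres <= g2

advisor = RefuelAdvisor()
for level in [12.0, 8.0, 10.0, 9.0, 11.0, 10.0]:
    advisor.record_refuel(level)
```

A `deque` with `maxlen=5` reproduces the shift-and-discard behaviour described in the text: after the sixth refueling the first value (12.0) is no longer part of the average.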
オーディオ類作動条件は、ラジオ、ＣＤ、ＭＤ、カセットテーププレーヤ、テレビ等のオーディオ類のスイッチがＯＮにされた時の時間帯及び場所であり、ラジオ及びテレビの場合には更に選択された局が該当する。また、作動機器は、ラジオ、ＣＤ、ＭＤ、カセットテーププレーヤ等のオーディオ類である。このオーディオ類作動条件と作動機器は、オーディオ類がスイッチＯＮされた過去５回分について記憶される。
【００３５】
お休み回数／日時には、該当するコミュニケーションプログラムが選択されたとしても実行せずにお休みした回数等が各プログラム番号毎に格納される。このお休み回数／日時は、例えば後述するエアコンの停止を提案するエージェントの行為（プログラム番号００１２３）のように、学習項目としてお休み項目が設定されているエージェント行為について格納される。
エージェントの提案や会話に対する運転者の応答が、拒否（拒絶）であった場合や無視（又は無応答）であった場合、コミュニケーションプログラムに応じて選択的に「お休み」が設定される。
【００３６】
デフォルト値には、時間、回数、温度、車速、日時等の各項目に対する初期設定値が格納されており、前記した１日の終了時間のように学習項目の中で変更された値を初期値に戻す場合に使用される。
【００３７】
学習項目データ２９２に格納されるその他のデータとしては、例えば、運転者やその関係者の誕生日（これはユーザ入力項目である）、祭日とその言われ、クリスマス、バレンタインデー、ホワイトデー等のイベント日などが格納される。各イベント日に応じた特別メニューのコミュニケーションプログラムも用意されており、例えば、クリスマスイブにはサンタクロースに変装したエージェントが現れる。
【００３８】
図６の応答データ２９３には、エージェントの行為に対するユーザの応答の履歴が、ユーザ応答を学習項目とする各コミュニケーションプログラム番号毎に格納される。ユーザ応答データは、図６（Ａ）のコミュニケーションプログラム番号００１２３、００１２５のように最新の応答日時と応答内容が所定回分（プログラム番号００１２３は２回分）格納されるものと、プログラム番号００１２４のように最新の応答内容のみが１回分格納される（従って応答がある毎に更新される。）ものと、最新の応答内容のみが所定回分格納されるものと、最新の日時と応答内容が一回分格納されるものと、最新の日時だけが１回分または所定回分格納されるもの等がある。
図６（Ａ）中に表示された記号Ａ、Ｂ、Ｃは応答内容を表すもので、同図（Ｂ）に示すように、記号Ａが無視された場合、記号Ｂが拒絶された場合、記号Ｃが受容された場合を表す。運転者の応答内容については、マイク２６から入力される運転者の音声に対する音声認識の結果や、入力装置２２による入力結果から判断される。
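In this embodiment the driver's responses are classified into three patterns (ignored, rejected, accepted), but "strongly rejected", "scolded", and "pleased" may be added. In that case, the learning item data 292 (for example, the number of rests) and the response data 293 are added to or changed according to the newly added responses.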
【００３９】
図１に示すエージェントデータ記憶装置２９の画像データ２９４には、エージェントプログラム２９０のコミュニケーションプログラムのプログラム番号の行動に対して、複数種類のエージェントの容姿それぞれと各背景との組合わさった画像が格納されている。エージェントの容姿は、運転者の好みによって入力装置２２等から選択することができるようになっており、この選択されたエージェントの容姿が、各種センサ等により取得された状況をもとに背景選択テーブル２９６により決定された背景とともに画像表示されるようになっている。なお、このエージェントの容姿の選択は、音声と同様に、適時変更することが可能である。
画像データ２９４に格納されるエージェントの容姿としては、人間（男性、女性）的な容姿である必要はなく、例えば、ひよこや犬、猫、カエル、ネズミ等の動物自体の容姿や人間的に図案化（イラスト化）した動物の容姿であってもよく、更にロボット的な容姿や、特定のキャラクタの容姿等であってもよい。またエージェントの年齢としても一定である必要がなく、エージェントの学習機能として、最初は子供の容姿とし、時間の経過と共に成長していき容姿が変化していく（大人の容姿に変化し、更に老人の容姿に変化していく）ようにしてもよい。
【００４０】
画像データ２９４に格納される背景の画像としては、例えば、日の出や星空等の時間帯を表す風景、海や雪山、紅葉等の季節を表す風景、ゴルフコースや海等の目的地を表す風景、エージェントが「はい」「いいえ」の応答を待機している時の「はい」と「いいえ」のプラカード等の持ち物、ラジオやＣＤ等で音楽を聞いているときの音符の模様や、エージェントが音声の応答を認識可能か否かの応答認識状態を色別に表す各色の枠、等が挙げられる。
【００４１】
図７は、背景選択テーブル２９６を表したものであり、表示装置２７に表示されるエージェントの背景を選択するためのテーブルである。このテーブル左側に示されるように、背景としては、エージェントのバックに表示される風景や模様の画像、エージェントが持つ持ち物の画像、及び表示装置２７の表示画面の内枠に沿って表示される枠画像がある。これらの背景は、図７に示すように、時間帯、季節、走行状態、作動している機器、エージェントの状態、カーナビゲーションに設定された目的地といった各種項目に基づいて決定されるようになっている。
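These items include those determined from various situations such as the vehicle's driving state detected by the situation sensor unit 40, and those determined by also taking into account the learned content stored in the learning item data 292 and the response data 293 (for example, whether the agent holds 「はい」 (yes) and 「いいえ」 (no) placards or "YES" and "NO" placards, or which background to select when audio equipment is operating).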
そして、選択条件に基づいて、１つ又は複数の背景が選択されるようになっている。複数の背景が選択される場合には、バック、持ち物、枠それぞれのうちからは重複して選択されないようになっている。なお、テーブル中で「○」印は、その背景が選択されるために満たす必要がある項目を示し、「無印」はその背景が選択されるためには満たしてはいけない項目を示している。
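The rule that at most one background is chosen from each of the three categories (back, belongings, frame) can be sketched as follows. The table rows and item names are hypothetical, and only the required ("○") cells are modelled; the handling of blank cells in FIG. 7 is not reproduced here.

```python
# Hypothetical rows: (category, background, items that must all hold).
BACKGROUND_TABLE = [
    ("back", "sunset", {"time_15_to_18"}),
    ("back", "starry_sky", {"night"}),
    ("belongings", "yes_no_placards", {"awaiting_yes_no"}),
    ("belongings", "radio_cassette", {"radio_on"}),
    ("frame", "red", {"recognition_unavailable"}),
    ("frame", "yellow", {"awaiting_yes_no"}),
    ("frame", "green", {"recognition_available"}),
]

def select_background(active_items, table=BACKGROUND_TABLE):
    """Pick at most one background per category (back, belongings, frame)."""
    chosen = {}
    for category, background, required in table:
        # setdefault keeps the first match, so a category is never chosen twice.
        if required <= active_items:
            chosen.setdefault(category, background)
    return chosen

background = select_background({"time_15_to_18", "awaiting_yes_no"})
```

Row order acts as a priority within a category in this sketch, which is a design choice of the example and not something the text specifies.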
【００４２】
図８（ａ）、（ｂ）、（ｃ）及び（ｄ）は、本実施形態において、上述の背景選択テーブル２９６により選択された背景をエージェントとともに表示装置に表示した一例を示すものである。
図８（ａ）は、エージェントが発話中や音声認識結果の判定中のために運転者の音声を認識できない状態の背景であり、背景として、音声認識不可能を示す赤色の枠Ｒが表示されている。尚、本表示（ａ）においてはエージェントが座って表示されているが、これは、車両が停車中であることをエージェントの姿勢で表現するものである。
図８（ｂ）は、運転者からの応答として、肯定又は否定のワードを待機している状態であり、背景として肯定及び否定のワードのみが認識可能であることを示す黄色の枠Ｙが表示されている。また、エージェントの持ち物（背景）には、音声認識可能な応答のうちの推奨ワードとして、「はい」及び「いいえ」のプラカードが表示されている。尚、本表示（ｂ）〜（ｄ）においてはエージェントが立って表示されているが、これは、車両が走行中であることをエージェントの姿勢で表現するものである。
図８（ｃ）は、ナビゲーションシステムにおいて目的地の都道府県の音声入力に対して待機している状態の画面であり、背景として音声認識可能状態を表す緑色の枠Ｇが表示されている。また、エージェントの持ち物（背景）には、音声認識可能な応答のうちの推奨ワードとして、「都道府県」のプラカードが表示されている。
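FIG. 8(d) shows the state in which ordinary conversational speech such as "Play a CD" or "Open the window" can be recognized without particular restriction; as the background, the green frame G indicating that speech recognition is possible is displayed.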
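The correspondence between the speech recognition state and the frame colour in FIG. 8 can be summarised in a small sketch. The function and parameter names are hypothetical; only the three colours and their meanings come from the description.

```python
def frame_colour(agent_busy, awaiting_yes_no):
    """Map the speech recognition state to the frame colour of FIG. 8.

    agent_busy:      the agent is speaking or a recognition result is being judged.
    awaiting_yes_no: only affirmative or negative words are being waited for.
    """
    if agent_busy:
        return "red"      # FIG. 8(a): recognition not possible
    if awaiting_yes_no:
        return "yellow"   # FIG. 8(b): only affirmative/negative words recognizable
    return "green"        # FIG. 8(c), (d): recognition possible
```

Checking the busy state first reflects the idea that no input is accepted while the agent is speaking, regardless of what it will wait for afterwards.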
【００４３】
応答認識データ２９８は、前述の学習項目データ２９２や応答データ２９３と同様に、運転者からの応答によってエージェントが学習した結果のデータであり、各運転者毎にそのデータが格納・更新（学習）される。
【００４４】
図９は、応答認識データを表したものである。
この図９に示されるように、応答認識データ２９８には、応答認識結果と、この応答認識結果から求められた応答認識率、及び、肯定のワードのグループ及び否定のワードのグループそれぞれにおける最高応答認識率ワードが、運転者別に格納される。
応答認識結果は、「はい」と「いいえ」、「ＹＥＳ」と「ＮＯ」、「はい」と「ＮＯ」、「うん」と「やだよ」等の、背景として肯定及び否定のワードをプラカードで表示した場合に、運転者からのワードによる応答を正しく認識できたかどうかのデータである。応答が正しく認識できたかどうかは、取得した応答を認識した結果に基づいてエージェントが制御等を行った場合の運転者の反応から、認識結果が正しいか否かを判断し、各応答「はい」、「いいえ」、「ＹＥＳ」、「ＮＯ」、…について１０回ずつ格納する。
【００４５】
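The response recognition rate is calculated from the response recognition results above using Equation 1 below, and is stored.
【００４６】
【数１】
Response recognition rate = (number of times the response was correctly recognized / number of responses acquired) × 100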
【００４７】
最高認識率ワードは、「はい」や「ＹＥＳ」等の肯定のワードのグループ、及び、「いいえ」や「ＮＯ」等の否定のワードのグループそれぞれにおいて最も認識率の大きいワードを格納する。そして、肯定と否定の２つのプラカードを持った背景を表示させる場合に、背景選択テーブルにおいてこれらの最高認識率ワードが選択されるようになっている。
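The calculation of the response recognition rate and the choice of the highest-rate word in each group might be sketched as below. The class and method names are hypothetical, the word lists contain only the examples named in the text, and the recorded results are made-up illustration data, not measurements.

```python
from collections import deque

AFFIRMATIVE = ("はい", "YES", "うん")
NEGATIVE = ("いいえ", "NO", "やだよ")
HISTORY_LENGTH = 10  # the text stores ten results per word

class RecognitionStats:
    def __init__(self):
        self.results = {w: deque(maxlen=HISTORY_LENGTH)
                        for w in AFFIRMATIVE + NEGATIVE}

    def record(self, word, recognized_correctly):
        self.results[word].append(bool(recognized_correctly))

    def rate(self, word):
        """Equation 1: correct recognitions / responses acquired x 100."""
        history = self.results[word]
        return 100.0 * sum(history) / len(history) if history else 0.0

    def placard_words(self):
        """Highest-rate word of each group, shown on the two placards."""
        return (max(AFFIRMATIVE, key=self.rate), max(NEGATIVE, key=self.rate))

# Illustration data: number of correct recognitions out of ten per word.
stats = RecognitionStats()
for word, correct in [("はい", 9), ("YES", 5), ("いいえ", 6), ("やだよ", 8)]:
    for i in range(10):
        stats.record(word, i < correct)
```

With this data the recommended pair is 「はい」 and 「やだよ」, the same combination the description later uses in its radio example.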
【００４８】
図１０は、ナビゲーションデータ記憶装置３０（図１）に格納されるデータファイルの内容を表したものである。
図１０に示されるように、ナビゲーションデータ記憶装置３０には経路案内等で使用される各種データファイルとして、通信地域データファイル３０１、描画地図データファイル３０２、交差点データファイル３０３、ノードデータファイル３０４、道路データファイル３０５、探索データファイル３０６、写真データファイル３０７が格納されるようになっている。
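The navigation data storage device 30 uses, for example, any of various recording media such as a floppy disk, hard disk, CD-ROM, optical disk, magnetic tape, IC card, or optical card, together with its drive device.
The navigation data storage device 30 may be composed of several different types of recording media and drive devices. For example, the search data file 306 may be held on a readable and writable recording medium (for example, flash memory) and the other files on a CD-ROM, with the corresponding drive devices used for each.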
【００４９】
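The communication area data file 301 stores, for each type of mobile phone, communication area data used to show on the display device 27 the areas in which a mobile phone used in the vehicle (whether or not connected to the communication control device 24) can communicate, and to take those areas into account during route search. The communication area data for each phone type is numbered for easy lookup. Because a communicable area can be expressed as the inside of a closed curve, the curve is divided into short line segments and specified by the position data of its bend points. Alternatively, the communicable area may be divided into rectangular areas of various sizes, each encoded as the coordinates of two diagonally opposite points.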
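Given an area stored as bend points of a closed curve, or as two diagonal corners of a rectangle, a check of whether the vehicle is inside the communicable area could be written as in this sketch. The ray-casting test is a standard technique chosen for the example, not something the text prescribes, and the function names and coordinates are hypothetical.

```python
def in_polygon(point, vertices):
    """Ray-casting test: is point inside the closed curve given by its bend points?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        # Count edges that a horizontal ray from the point towards +x crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def in_rectangle(point, corner_a, corner_b):
    """Alternative form: area stored as two diagonally opposite corners."""
    (xa, ya), (xb, yb) = corner_a, corner_b
    return (min(xa, xb) <= point[0] <= max(xa, xb)
            and min(ya, yb) <= point[1] <= max(ya, yb))

# Hypothetical service area (longitude, latitude) for one phone type.
AREA = [(139.0, 35.0), (140.0, 35.0), (140.0, 36.0), (139.0, 36.0)]
```

The rectangle form is cheaper to test and to store, which may be why the text offers it as an alternative encoding.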
通信地域データファイル３０１に格納される内容は、携帯電話の使用可能な地域の拡大や縮小に伴って、更新できるのが望ましく、このために、携帯電話と通信制御装置２４を使用することにより、情報提供局との間で通信を行なって、通信地域データファイル３０１の内容を最新のデータと更新できるように構成されている。なお、通信地域データファイル３０１をフロッピーディスク、ＩＣカード等で構成し、最新のデータと書換えを行うようにしても良い。
描画地図データファイル３０２には、表示装置２７に描画される描画地図データが格納されている。この描画地図データは、階層化された地図、例えば最上位層から日本、関東地方、東京、神田といった階層ごとの地図データが格納されている。各階層の地図データは、それぞれ地図コードが付されている。
【００５０】
交差点データファイル３０３には、各交差点を特定する交差点番号、交差点名、交差点の座標（緯度と経度）、その交差点が始点や終点になっている道路の番号、および信号の有無などが交差点データとして格納されている。
ノードデータファイル３０４には、各道路における各地点の座標を指定する緯度、経度などの情報からなるノードデータが格納されている。すなわち、このノードデータは、道路上の一地点に関するデータであり、ノード間を接続するものをアークと呼ぶと、道路は、複数のノード列のそれぞれの間をアークで接続することによって表現される。
道路データファイル３０５には、各道路を特定する道路番号、始点や終点となる交差点番号、同じ始点や終点を持つ道路の番号、道路の太さ、進入禁止等の禁止情報、後述の写真データの写真番号などが格納されている。
交差点データファイル３０３、ノードデータファイル３０４、道路データファイル３０５にそれぞれ格納された交差点データ、ノードデータ、道路データからなる道路網データは、経路探索に使用される。
【００５１】
探索データファイル３０６には、経路探索により生成された経路を構成する交差点列データ、ノード列データなどが格納されている。交差点列データは、交差点名、交差点番号、その交差点の特徴的風景を写した写真番号、曲がり角、距離等の情報からなる。また、ノード列データは、そのノードの位置を表す東経、北緯などの情報からなる。
写真データファイル３０７には、各交差点や直進中に見える特徴的な風景等を撮影した写真が、その写真番号と対応してディジタル、アナログ、またはネガフィルムの形式で格納されている。
【００５２】
図１１は、状況センサ部４０を構成する各種センサを表したものである。
図１１に示すように状況センサ部４０は、イグニッションセンサ４０１、車速センサ４０２、アクセルセンサ４０３、ブレーキセンサ４０４、サイドブレーキ検出センサ４０５、シフト位置検出センサ４０６、ウィンカー検出センサ４０７、ワイパー検出センサ４０８、ライト検出センサ４０９、シートベルト検出センサ４１０、ドア開閉検出センサ４１１、同乗者検出センサ４１２、室内温度検出センサ４１３、室外温度検出センサ４１４、燃料検出センサ４１５、水温検出センサ４１６、ＡＢＳ検出センサ４１７、エアコンセンサ４１８、体重センサ４１９、前車間距離センサ４２０、後車間距離センサ４２１、体温センサ４２２、心拍数センサ４２３、発汗センサ４２４、脳波センサ４２５、アイトレーサー４２６、赤外線センサ４２７、その他のセンサ（タイヤの空気圧低下検出センサ、ベルト類のゆるみ検出センサ、窓の開閉状態センサ、クラクションセンサ、室内湿度センサ、室外湿度センサ、油温検出センサ、油圧検出センサ等）４２８等の車両状況や運転者状況、車内状況等を検出する各種センサを備えている。
これら各種センサは、それぞれのセンシング目的に応じた所定の位置に配置されている。
なお、これらの各センサは独立したセンサとして存在しない場合には、他のセンサ検出信号から間接的にセンシングする場合を含む。例えば、タイヤの空気圧低下検出センサは、車輪速センサの信号の変動により間接的に空気圧の低下を検出する。
【００５３】
イグニッションセンサ４０１は、イグニッションのＯＮとＯＦＦを検出する。
車速センサ４０２は、例えば、スピードメータケーブルの回転角速度又は回転数を検出して車速を算出するもの等、従来より公知の車速センサを特に制限なく用いることができる。
アクセルセンサ４０３は、アクセルペダルの踏み込み量を検出する。
ブレーキセンサ４０４は、ブレーキの踏み込み量を検出したり、踏み込み力や踏む込む速度等から急ブレーキがかけられたか否かを検出する。
サイドブレーキ検出センサ４０５は、サイドブレーキがかけられているか否かを検出する。
シフト位置検出センサ４０６は、シフトレバー位置を検出する。
ウィンカー検出センサ４０７は、ウィンカの点滅させている方向を検出する。
ワイパー検出センサ４０８は、ワイパーの駆動状態（速度等）を検出する。
ライト検出センサ４０９は、ヘッドランプ、テールランプ、フォグランプ、ルームランプ等の各ランプの点灯状態を検出する。
シートベルト検出センサ４１０は、運転者、及び同乗者（補助席、後部座席）がシートベルトを着用しているか否かを検出する。着用していない場合には適宜（嫌われない程度に）エージェントが現れ、警告、注意、コメント等（学習により程度を変更する）を行う。
【００５４】
ドア開閉検出センサ４１１は、ドアの開閉状態を検出し、いわゆる半ドアの場合には、エージェントがその旨を知らせる。ドア開閉検出センサ４１１は、運転席ドア、助手席ドア、後部運転席側ドア、後部助手席側ドア等の、車種に応じた各ドア毎の開閉を検出できるようになっている。
同乗者検出センサ４１２は、助手席や後部座席に同乗者が乗っているか否かを検出するセンサで、撮像装置２８で撮像された車内の画像から検出し、または、補助席等に配置された圧力センサや、体重計により検出する。
室内温度検出センサ４１３は室内の気温を検出し、室外温度検出センサ４１４は車両外の気温を検出する。
燃料検出センサ４１５は、ガソリン、軽油等の燃料の残量を検出する。給油時直前における過去５回分の検出値が学習項目データ２９２に格納され、その平均値になった場合にエージェントが給油時期であることを知らせる。
【００５５】
水温検出センサ４１６は、冷却水の温度を検出する。イグニッションＯＮ直後において、この検出温度が低い場合には、エージェントが眠そうな行為をする場合が多い。逆に水温が高すぎる場合にはオーバーヒートする前に、エージェントが「だるそう」な行動と共にその旨を知らせる。
ＡＢＳ検出センサ４１７は、急ブレーキによるタイヤのロックを防止し操縦性と車両安定性を確保するＡＢＳが作動したか否かを検出する。
エアコンセンサ４１８は、エアコンの操作状態を検出する。例えば、エアコンのＯＮ・ＯＦＦ、設定温度、風量等が検出される。
体重センサ４１９は、運転者の体重を検出するセンサである。この体重から、または、体重と撮像装置２８の画像から運転者を特定し、その運転者との関係で学習したエージェントを出現させるようにする。すなわち、特定した運転者に対してエージェントが学習した、学習項目データ２９２と応答データ２９３を使用することで、その運転者専用のエージェントを出現させるようにする。
前車間距離センサ４２０は車両前方の他車両や障害物との距離を検出し、後車間距離センサ４２１は後方の他車両や障害物との距離を検出する。
【００５６】
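The body temperature sensor 422, the heart rate sensor 423, and the perspiration sensor 424 detect the driver's body temperature, heart rate, and perspiration state, respectively; for example, the sensors are arranged on the surface of the steering wheel and detect these from the state of the driver's hands. Alternatively, the body temperature sensor 422 may detect the temperature distribution over the driver's body by thermography using infrared detection elements.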
脳波センサ４２５は、運転者の脳波を検出するセンサで、例えばα波やβ波等を検出して運転者の覚醒状態等を調べる。
アイトレーサー４２６は、ユーザの視線の動きを検出し、通常運転中、車外の目的物を捜している、車内目的物をさがしている、覚醒状態等を判断する。
赤外線センサ４２７は、ユーザの手の動きや顔の動きを検出する。
【００５７】
次に、以上のように構成された本実施形態の動作について説明する。
図１２は本実施形態のエージェントによる処理のメイン動作を表したフローチャートである。
エージェント処理部１１は、イグニッションがＯＮされたことがイグニッションセンサ４０１で検出されると、まず最初に初期設定を行う（ステップ１１）。初期設定としては、ＲＡＭのクリア、各処理用のワークエリアをＲＡＭに設定、プログラム選択テーブル２９１（図２）のＲＡＭへのロード、フラグの０設定、等の処理が行われる。なお、本実施形態のエージェント処理では、その処理の開始をイグニッションＯＮとしたが、例えばドア開閉検出センサ４１１によりいずれかのドアの開閉が検出された場合に処理を開始するようにしてもよい。
【００５８】
次に、エージェント処理部１１は、運転者の特定を行う（ステップ１２）。すなわち、エージェント処理部１１は、運転者から先に挨拶がかけられたときにはその声を分析して運転者を特定したり、撮像した画像を分析することで運転者を特定したり、体重センサ４１９で検出した体重から運転者を特定したり、設定されたシート位置やルームミラーの角度から運転者を特定したりする。なお、特定した運転者については、後述のエージェントの処理とは別個に、「○○さんですか？」等の問い合わせをする特別のコミュニケーションプログラムが起動され、運転者の確認が行われる。
【００５９】
運転者が特定されると、次にエージェント処理部１１は、現在の状況を把握する（ステップ１３）。
すなわち、エージェント処理部１１は、状況情報処理部１５に状況センサ部４０の各センサから供給される検出値や、撮像装置２８で撮像した画像の処理結果や、現在位置検出装置２１で検出した車両の現在位置等のデータを取得して、ＲＡＭの所定エリアに格納し、格納したデータから車両の状態等の現在の状況の把握する。例えば、水温検出センサ４１６で検出された冷却水の温度がｔ１である場合、エージェント処理部１１は、この温度ｔ１をＲＡＭに格納すると共に、ｔ１が所定の閾値ｔ２以下であれば、車両の現在の状態として冷却水温（図２参照）は低い状態であると把握する。
現在の状況としては、他にマイク２６からの入力に基づいて音声認識した運転者の要求、例えば、「○○○番に電話をしてくれ。」や「この辺のレストランを表示してくれ。」や「ＣＤをかけてくれ。」等の要求も現在の状況として把握される。この場合、認識した音声に含まれるワード「ＣＤ」「かけて」等がプログラム選択テーブル２９１（図２）の選択条件（横軸項目）になる。
さらにエージェント処理部１１は、現在状況の把握として、エージェントデータ記憶装置２９の学習項目データ２９２と応答データ２９３をチェックすることで、エージェントがこれまでに学習してきた状態（学習データ）を把握する。
【００６０】
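Once it has grasped the current situation, the agent processing unit 11 performs agent processing according to that situation, as described in detail later with reference to FIG. 13 (step 14).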
ここでのエージェントの処理としては、エージェントによる判断、行為（行動＋発声）、制御、学習、検査等の各種処理が含まれるが、把握した現在の状況によっては何も動作しない場合も含まれる。
【００６１】
次に、エージェント処理部１１は、メイン動作の処理を終了するか否かを判断し（ステップ１５）、終了でない場合には（ステップ１５；Ｎ）、ステップ１３に戻って処理を繰り返す。
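On the other hand, when processing is to end, that is, when the ignition sensor 401 detects that the ignition has been turned OFF (step 13) and termination processing such as turning off the interior light (step 14) has completed (step 15; Y), the main processing ends.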
【００６２】
図１３は、把握した状況に応じたエージェントの処理動作を表したフローチャートである。
エージェント処理部１１は、把握済みの現在の状況（起動回数、現在の天気、時間等）から、図２に示したプログラム選択テーブル２９１に基づいて、現在の状態で起動可能なコミュニケーションプログラム（の番号）があるか否かを判断し（ステップ２１）、該当プログラムが無ければ（ステップ２１；Ｎ）、メインルーチンにリターンする。
一方、起動可能なコミュニケーションプログラムがある場合（ステップ２１；Ｙ）、そのプログラム番号を決定する。そして、決定したプログラム番号に対する運転者の応答履歴を応答データ２９３から確認し、当該プログラム番号のコミュニケーションプログラムの起動を、お休みすべき状態か否かを確認する（ステップ２２）。
【００６３】
お休み状態ではない場合（ステップ２２；Ｎ）、エージェント処理部１１は、起動するコミュニケーションプログラムが、車両や車両に搭載された各種機器等の制御を行う制御プログラムかどうか確認し（ステップ２３）、制御プログラムの場合（ステップ２３；Ｙ）はそのままこのコミュニケーションプログラムを起動し、プログラムに従った制御を行う（ステップ２９）。
【００６４】
コミュニケーションプログラムが制御プログラムでない場合（ステップ２３；Ｎ）には、把握済みの現在の状況（現在の時間、季節、作動機器、エージェントの音声認識状況、カーナビゲーションシステムにおける目的地、等）及び、エージェントデータ記憶装置２９の応答認識データ２９８から、背景選択テーブル２９６に従って、エージェントの背景を決定する（ステップ２４）。
続いて、選択されているエージェントの容姿と決定された背景の画像のコミュニケーションプログラムを起動することで、図８（ａ）から（ｄ）に示されるような、エージェントの行為（行動と音声）に従った画像を表示装置２７に表示すると共に、音声出力装置２５から音声出力する（ステップ２５）。
【００６５】
そして、このコミュニケーションプログラムが運転者からの応答を取得する応答取得プログラムでない場合（ステップ２６；Ｎ）には、メインのルーチンにリターンする。また、このコミュニケーションプログラムが応答取得プログラムの場合（ステップ２６；Ｙ）には、エージェント処理部１１は、コミュニケーションプログラムの起動によるエージェント行為に対する運転者の応答を、マイク２６からの入力に基づく音声認識結果や、入力装置２２からの入力結果から取得する（ステップ２７）。そして、エージェント処理部１１は、今回のコミュニケーションプログラムに関するデータを蓄積することで、エージェントに学習をさせ（ステップ２８）、メインルーチンにリターンする。
データの蓄積としては、例えば、コミュニケーションプログラムの起動がお休みである場合には（ステップ２２；Ｙ）、学習項目データ２９２の該当プログラム番号の回数欄をカウントアップさせる。ただし、学習項目データ２９２のお休み回数／日時欄に格納されている回数をＫａ回とし、当該プログラム番号に対する前回までの応答データ２９３の履歴から決まるお休み回数をＫｂ回とした場合、Ｋａ＝Ｋｂ−１であれば、今回のお休みで規定回数休んだことになる。そこで、学習項目データ２９２及び応答データ２９３の当該プログラム番号欄の（該当する位置に格納されている）データをクリアする。
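The rest-count bookkeeping with Ka and Kb can be sketched as a single function. The function and parameter names are hypothetical, and how Kb is derived from the response history is not modelled here because the text does not give the rule.

```python
def register_rest(rests_taken, rests_required):
    """Handle one skipped launch of a communication program.

    rests_taken    (Ka): rests already recorded in the learning item data.
    rests_required (Kb): rests prescribed by the response history.
    Returns (new rests_taken, whether the rest period is now over).
    """
    if rests_taken == rests_required - 1:
        # This rest completes the prescribed number: clear the stored data.
        return 0, True
    return rests_taken + 1, False

# Example with Kb = 3: the third skipped launch ends the rest period.
history = []
ka = 0
for _ in range(3):
    ka, finished = register_rest(ka, 3)
    history.append((ka, finished))
```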
【００６６】
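In the other cases (after step 27 or after step 29), if the current situation grasped in step 13 includes a learning item, the corresponding value in the learning item data 292 is updated, and if the program number is one whose responses are to be kept as history, the response acquired in step 27 is stored in the response data 293 (FIG. 6). If the history already holds the prescribed number of entries for that program number, the oldest entry is discarded and the new one is stored. Furthermore, if the response acquired in step 27 is a reply to an answer-back about a previously acquired affirmative or negative word, whether that word was recognized correctly is obtained from the response, the result is stored in the response recognition data 298, and the response recognition rate and, where necessary, the highest-recognition-rate word are rewritten.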
【００６７】
次に、以上説明したエージェント処理による具体的な行為として、ラジオを作動させる場合について説明する。
図１４は、イグニッションＯＮ後における具体的なエージェント処理の内容を概念的に表したものである。
この図１４（Ａ）に示すように、エージェント処理部１１は、現在の状況として、現在時刻が１７時、現在位置検出装置２１で検出された現在位置（緯度、経度）から求めた現在位置が「東京」、等の状況がステップ１３において把握済みであるものとする。また、学習項目データ２９２と応答データ２９３についてチェックした学習データとしては、オーディオ類動作条件として、過去５回のオーディオ作動条件のうちに１７時から１８時の間に東京においてはラジオが作動され、また東京においてはラジオを聞く場合にはＪ−ｗａｖｅが最も多く選局されているとチェック済みであるものとする。
【００６８】
以上の把握状態に基づいて、エージェント処理部１１は、プログラム選択テーブル２９１から対応するコミュニケーションプログラムを選択する。すなわち、プログラム番号００５０１のコミュニケーションプログラム（ラジオをかける提案をするプログラム）が選択されたものとする（ステップ２１；Ｙ）。そしてこのコミュニケーションプログラムがお休み対象で無いことを確認（ステップ２２；Ｎ）する。
このコミュニケーションプログラムは制御プログラムではない（ステップ２３；Ｎ）ので、エージェント処理部１１は、続いて、背景選択テーブル２９６から背景を決定する（ステップ２４）。この場合、時間帯が１５時から１８時であること、起動されるコミュニケーションプログラムが発話を伴い音声認識状態が不可能となること、等から、バックが夕焼けであり、枠が赤色である背景が決定される。
そして、当該番号のコミュニケーションプログラムを起動し（ステップ２５）、図１４（Ｂ）に示すように、選択されたエージェントと決定された背景（夕焼けのバックＢＫと赤色の枠Ｒ）を表示装置２７に画像表示し、「ラジオでもつけよっか？」との発声による問い合わせを行う。このコミュニケーションプログラムは応答取得プログラムでない（ステップ２６；Ｎ）ことから、そのままメインのルーチンにリターンする。
【００６９】
続いて、今度は、現在状況として、プログラム番号００５０１により問い合わせが行われたことが新たに把握され、エージェント処理においては、プログラム番号００×××番の、肯定または否定の応答を取得するコミュニケーションプログラムが有ると判断される（ステップ２１；Ｙ）。そして、お休みでなく（ステップ２２；Ｎ）、制御プログラムでない（ステップ２３；Ｎ）ことから、エージェント処理部１１は、背景選択テーブル２９６及び応答認識データ２９８の最高認識ワードから背景を決定する（ステップ２４）。
この場合、時間帯が１５時から１８時であること、起動されるコミュニケーションプログラムが肯定及び否定ワードのみ認識可能となること、最高認識ワードが「はい」と「やだよ」であること、等から、バックが夕焼けであり、持ち物が「はい」と「やだよ」のプラカードであり、枠が黄色である背景が決定される。
【００７０】
そして、エージェント処理部１１は、このコミュニケーションプログラムを起動し、図１４（Ｃ）に示すように、選択されたエージェントと決定された背景（夕焼けのバックＢＫとプラカードＰＣと黄色の枠Ｙ）を表示装置２７に画像表示し（ステップ２５）、応答取得プログラムとして（ステップ２６；Ｙ）応答を取得する（ステップ２７）。ここでは「はい」との応答を取得したものとする。
そして、エージェントの学習として、プログラム番号００５０１番の応答データを更新し（ステップ２８）、メインのルーチンへリターンする。
【００７１】
今度は、現在状況として、プログラム番号００×××によりラジオＯＮについて肯定の応答を取得したことが新たに把握され、エージェント処理においては、プログラム番号００△△△番の、ラジオをかけるコミュニケーションプログラムが有ると判断される（ステップ２１；Ｙ）。そして、お休みでなく（ステップ２２；Ｎ）、制御プログラムである（ステップ２３；Ｙ）ことから、プログラムが起動され、ラジオをかける制御が行われる。このとき、学習項目２９２データから、現在地である東京において最も多く選択された選択局が取得され、この局に自動チューニングされる。
そして、学習項目データ２９２のオーディオ類作動条件及び作動オーディオ機器が書き換えられる。また、運転者が「違うよ」「やめて」等、ラジオをかけることについて中断する応答をしないことから、「はい」との応答認識が正しかったことを把握し、応答認識データ２９８の「はい」の認識結果、書き換え後のデータに基づく「はい」の応答認識率の取得と書き換え、及び必要に応じて最高認識率ワードの書き換えを行い（ステップ２８）、メインのルーチンへリターンする。
【００７２】
今度は、現在状況として、ラジオが作動中であることが新たに把握され、エージェント処理としては、プログラム番号００▽▽▽番の、車両状況表示のコミュニケーションプログラムが有ると判断される（ステップ２１；Ｙ）。そして、お休みでなく（ステップ２２；Ｎ）、制御プログラムでないことから、背景選択テーブルに基づいて背景が選択される。
この場合、時間帯が１５時から１８時であること、ラジオが作動していること、任意の発話音声が認識可能となること、等から、バックが夕焼けであり、持ち物がラジカセであり、枠が緑色である背景が決定される。
【００７３】
続いて、エージェント処理部１１は、このコミュニケーションプログラムを起動し、選択されたエージェントと決定された背景（夕焼けのバックＢＫと持ち物ラジカセと緑色の枠）を表示装置２７に画像表示する（ステップ２５）。そして、このプログラムは応答取得プログラムでない（ステップ２６；Ｎ）ことから、そのままメインのルーチンへリターンする。
【００７４】
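As described above, according to this embodiment, displaying the agent's background in addition to the display and voice of the anthropomorphic agent makes it possible to present varied screens that carry a large amount of information without reducing the visibility of that information, to convey much information to the driver, and to establish rich communication suited to the driver and the situation.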
本実施形態によれば、エージェントにより伝達される情報と、背景により伝達される情報を区分することにより、より視認性を良好に情報の伝達することができる。
本実施形態によれば、背景として、エージェントのバックと、エージェントの持ち物とを表示し、且つ、これらの伝達する情報の種類を区分しているので、より視認性良好に情報を伝達することができる。
【００７５】
本実施形態によれば、現在の車両・運転者の状況だけでなく、過去の履歴等に基づく学習結果から擬人化されたエージェントが状況に合わせた行為をし、運転者とのコミュニケーションをはかることができ、車内での運転環境を快適にすることができる。
本実施形態によれば、現在の車両・運転者の状況だけでなく、過去の履歴等に基づく学習結果から背景が決定され、情報が提供されるので、車内での運転環境を快適にすることができる。
【００７６】
本実施形態によれば、認識率の高い肯定ワード及び否定ワードが学習により取得され、背景としてのプラカードにより認識率の高いワードが提示されるので、運転者がこれに従った応答を行うことにより、エージェントが正確に応答を行うことができ、運転者とエージェントとの効率的かつ良好なコミュニケーションが可能となる。
本実施形態によれば、背景としての枠によりエージェントの音声認識状態が表されるので、運転者がこれに従ったタイミングで応答を行うことにより、運転者からの応答をエージェントが認識可能となり、運転者とエージェントとの効率的且つ良好なコミュニケーションが可能となる。
【００７７】
尚、本発明のエージェント装置は、上述の実施形態に限定されるものではなく、本発明の趣旨を逸脱しない限りにおいて適宜変更可能である。
例えば、上述の各実施形態においては、画像データ２９４として、選択可能なエージェントの容姿それぞれと各背景との合成された画像データが格納されているが、エージェントの容姿の画像データと、背景の画像データとを別個に格納し、選択された容姿と決定された背景それぞれのデータを合成して表示装置に表示するようにしてもよい。
背景は上述のものに限られるものではなく、例えば、エージェントのバックとして風景以外に、オーディオを作動させているときに音符の模様や作動機器の絵等を表示してもよい。
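As a further background, a tree may be displayed and shown bending to the left according to vehicle speed, or a road may be displayed under the feet of an agent shown in profile: when the vehicle is on an uphill slope, the side ahead of the agent is raised so that the agent appears to climb, and when the vehicle is on a downhill slope, the side ahead of the agent is lowered so that the agent appears to descend. Road conditions of this kind are easily overlooked on instruments yet matter for safety, so showing them clearly as a background together with the agent is a significant advantage.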
【００７８】
本実施形態では、応答の認識結果に基づいた制御に対する運転者の反応から、応答の認識結果が正しいか否かを判断しているが、これに限られるものではなく、認識結果について音声によるアンサーバックを行い運転者からの回答（入力操作）により判断したり、持っているプラカードのうち認識したグループと同じものを高く上げる等の画像によるアンサーバックに対し、運転者からの回答から判断してもよい。
表示された背景に対する運転者の嗜好度をエージェントの問いかけによる応答や状況センサから取得する背景嗜好度学習手段を具備し、他の同一条件下において選択可能な複数の背景を用意し、運転者に応じて嗜好度の高い背景を選択表示させるようにしてもよい。
【００７９】
【発明の効果】
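According to the agent device of the present invention, a large amount of information is conveyed to the driver with good visibility by the anthropomorphic agent and the agent's background, so rich communication suited to the vehicle situation can be established.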
【図面の簡単な説明】
【図１】本発明の一実施形態におけるコミュニケーション機能を実現するための構成を示すブロック図である。
【図２】An explanatory diagram conceptually showing the contents of the program selection table in the same embodiment.
【図３】同上、実施形態において、各プログラム番号に対応するエージェントの行為（行動と音声）を表した説明図である。
【図４】同上、実施形態におけるプログラム番号００００１〜００００２の起動により表示装置に表示されるエージェントの「かしこまってお辞儀」行動についての数画面を表した説明図である。
【図５】同上、実施形態における学習項目データの内容を概念的に表した説明図である。
【図６】同上、実施形態における応答データの内容を概念的に表した説明図である。
【図７】同上、実施形態における背景選択テーブルの内容を概念的に表した説明図である。
【図８】同上、実施形態において表示装置に表示されるエージェント及び背景の一例を示す図であり、（ａ）は、音声認識不可能状態における表示、（ｂ）は肯定ワードと否定ワードのみの音声を認識可能な状態における表示、（ｃ）は都道府県の入力を要求している状態の表示、（ｄ）は任意の音声を認識可能な状態における表示を示す。
【図９】同上、実施形態における応答認識データの内容を概念的に表した説明図である。
【図１０】同上、実施形態におけるナビゲーションデータ記憶装置に格納されるデータファイルの内容を概念的に表した説明図である。
【図１１】同上、実施形態における状況センサ部を構成する各種センサを表した説明図である。
【図１２】同上、実施形態においてエージェントによるメイン動作を表したフローチャートである。
【図１３】同上、実施形態によるエージェント処理の動作を表したフローチャートである。
【図１４】同上、実施形態において、イグニッションＯＮ後における具体的なエージェント処理の内容を概念的に表した説明図である。
【符号の説明】
１全体処理部
１０ナビゲーション処理部
１１エージェント処理部
１２Ｉ／Ｆ部
１３画像処理部
１４音声制御部
１５状況情報処理部
２１現在位置検出装置
２２入力装置
２３記憶媒体駆動装置
２４通信制御装置
２５音声出力装置
２６マイク
２７表示装置
２８撮像装置
２９エージェントデータ記憶装置
３０ナビゲーションデータ記憶装置
４０状況センサ部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an agent device, and more particularly, to an agent device having a communication function that enables conversation within a vehicle with an anthropomorphic agent.
[0002]
[Prior art]
Conventionally, a radio and a cassette tape player are installed in a vehicle as a means for improving a driving environment by a driver.
In addition, there is a vehicle in which a traveling environment is improved by enjoying a conversation with an acquaintance outside the vehicle using a wireless communication device such as an amateur radio or a mobile phone mounted on the vehicle.
[0003]
[Problems to be solved by the invention]
The conventional radio or the like in the vehicle as described above only provides one-way information to the driver, and cannot perform a two-way conversation or the like.
On the other hand, a mobile phone or the like does allow conversation, but the driver has to find someone to talk to by waiting for a call or by dialing. Even when someone is found, that person will not necessarily hold a conversation suited to the driver's own circumstances, such as the situation of the vehicle.
As described above, conventional vehicles have no anthropomorphic agent that responds to the vehicle's history, such as its past states, or to the driver's state, so in some cases the vehicle served only as a tool, a mere means of transport to which the driver felt no attachment.
[0004]
Japanese Patent Laid-Open No. 9-102098 discloses a technique for transmitting information to the driver by human facial expressions and actions.
However, the technology described in this publication does not change the display based on the history of the driver's past responses or on user information such as gender and age; whenever the same situation occurs, the same display is always shown. That is, the same display is always produced for a limited set of sensor outputs, so the technology belongs to the category of conventional instruments with improved visibility. In addition, the information a driver should be aware of, inside and outside the vehicle, is highly varied, and for safety and other reasons it must be grasped reliably at a glance. However, the amount of information that can be conveyed with good visibility by human facial expressions and actions alone is limited.
[0005]
An object of the present invention is to provide an agent device that conveys various kinds of information through the actions of an anthropomorphic agent and the agent's background, and that can communicate with the driver.
[0006]
[Means for Solving the Problems]
The invention according to claim 1 provides an agent device comprising: an image display device that displays an anthropomorphic agent and a background of the agent; action determination means for determining an action of the agent displayed on the image display device; background determination means for determining the background displayed on the image display device; image display means for causing the image display device to display the agent performing the action determined by the action determination means together with the background determined by the background determination means; voice recognition means for recognizing the user's spoken response to the action of the agent displayed on the image display device; and recognition rate calculation means for calculating the voice recognition rate achieved by the voice recognition means for each word of an affirmative word group consisting of a plurality of words expressing affirmation and of a negative word group consisting of a plurality of words expressing negation, wherein, when waiting for spoken input of a word expressing affirmation or negation as the user's spoken response, the background determination means determines as the background, as the words recommended to the user for spoken input from among the recognizable words of the affirmative word group and the negative word group, a placard displaying the word with the highest recognition rate in the affirmative word group and a placard displaying the word with the highest recognition rate in the negative word group.
The invention according to claim 2 provides the agent device according to claim 1, further comprising a voice output device and voice recognition state determination means for determining whether voice recognition by the voice recognition means is unavailable because the agent's voice is being output by the voice output device or because a voice recognition result is being judged, wherein the image display device displays the anthropomorphic agent and a background including a frame displayed at the edge of the display screen of the agent, and the background determination means sets the frame of the background to different colors depending on whether voice recognition is determined to be unavailable or available.
[0010]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, a preferred embodiment of the agent device of the present invention will be described in detail with reference to FIGS.
(1) Outline of the embodiment
In the agent device of the present embodiment, an anthropomorphic agent and its background are displayed as an image (a planar image or a stereoscopic image such as a hologram) on a display device in the vehicle. The device judges the situation of the vehicle, including the vehicle itself, the driver, passengers, and oncoming vehicles (and including the driver's responses and reactions), and based on the vehicle situation at each point in time the agent responds to the driver and the vehicle with varied behavior (actions and voice). The background is likewise determined and displayed based on the vehicle situation, in the same way as the agent's behavior. As a result, the driver can interact (communicate) with their own agent in the vehicle, and the environment in the vehicle can be made comfortable. In addition, because information is conveyed by both the agent and the background, the driver can take in a large amount of information in a clearly distinguishable form, which makes the environment in the vehicle more comfortable still.
Here, the anthropomorphic agent in the present embodiment has an identity like that of a specific human, creature, or cartoon character, and behaves (responds through actions and voice) in a way that preserves that identity and its continuity. Identity and continuity are expressed as an individual personality, so the agent can be regarded as a kind of pseudo-life form within the electronic equipment. The agent that appears in the vehicle in this embodiment is a pseudo-personified (virtually personified) subject that makes judgments in the same way as a human.
Further, in the present embodiment, the agent judges the situation of the vehicle, including the vehicle itself and the driver, and performs various operations such as route guidance and device operation on behalf of the driver or assists the driver in them. It also learns the vehicle situation, the driver's responses, and the like, and behaves in various ways according to judgments that take the learning results into account. Therefore, even in the same vehicle situation, the content of communication differs depending on what has been learned in the past. The agent may sometimes misjudge, within a range that does not interfere with the operation of the vehicle, and such a misjudgment may produce an unnecessary (mistaken) response. Whether a judgment was mistaken is determined from the driver's response, and the agent learns from it.
Furthermore, in the present embodiment, the background to be displayed is also determined by judgments that take into account learning results such as the vehicle state and the driver's responses. Displaying a background suited to the driver allows more effective communication and a better driving environment to be provided.
[0011]
(2) Details of the embodiment
FIG. 1 is a block diagram showing the configuration of the agent device in this embodiment.
In the present embodiment, an overall processing unit 1 that controls the entire communication function is provided. The overall processing unit 1 comprises a navigation processing unit 10 that searches for a route to a set destination and provides guidance by voice or image display, an agent processing unit 11, an I/F unit 12 for the navigation processing unit 10 and the agent processing unit 11, an image processing unit 13 that processes image output and input images such as agent images and map images, an audio control unit 14 that controls audio output and input audio such as the agent's voice and route guidance voice, and a situation information processing unit 15 that processes detection data on various situations of the vehicle and the driver.
The agent processing unit 11 determines, from the vehicle situation, the behavior and background of the agent that appears in the vehicle. It also learns the vehicle situation, the driver's past responses, and the like, so that the agent can respond to the driver with appropriate conversation and control.
[0012]
The navigation processing unit 10 and the agent processing unit 11 each include a CPU (central processing unit) that controls data processing and the operation of each unit, together with ROM, RAM, a timer, and the like connected to the CPU via bus lines such as a data bus and a control bus. The two processing units 10 and 11 are connected over a network and can acquire each other's processing data.
The ROM is a read-only memory in which various data and programs for controlling by the CPU are stored in advance, and the RAM is a random access memory used by the CPU as a working memory.
[0013]
In the navigation processing unit 10 and the agent processing unit 11 of the present embodiment, the CPU reads the various programs stored in the ROM and executes various processes. Alternatively, the CPU may read a computer program from an external recording medium set in the recording medium driving device 23, store (install) it in the agent data storage device 29, the navigation data storage device 30, or another storage device such as a hard disk (not shown), and then load the necessary programs from that storage device into the RAM and execute them. The necessary programs may also be read directly from the recording medium driving device 23 into the RAM and executed.
[0014]
The current position detection device 21 and the navigation data storage device 30 are connected to the navigation processing unit 10; the agent data storage device 29 is connected to the agent processing unit 11; the input device 22, the recording medium driving device 23, and the communication control device 24 are connected to the I/F unit 12; the display device 27 and the imaging device 28 are connected to the image processing unit 13; the voice output device 25 and the microphone 26 are connected to the audio control unit 14; and the status sensor unit 40 is connected to the situation information processing unit 15.
[0015]
The current position detection device 21 detects the absolute position (latitude and longitude) of the vehicle. It uses a GPS (Global Positioning System) receiver 211 that measures the position of the vehicle using artificial satellites, a direction sensor 212, a steering angle sensor 213, a distance sensor 214, a beacon receiver 215 that receives position information from beacons placed along the road, and the like.
The GPS receiver 211 and the beacon receiver 215 can each measure the position on their own, but in places where neither can receive a signal, the current position is detected by dead reckoning using both the direction sensor 212 and the distance sensor 214.
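The dead reckoning step described above can be illustrated with a minimal sketch. It assumes a flat local coordinate system in meters, a heading measured clockwise from north, and one (heading, distance) sample per update; the function names and the sample format are hypothetical and are not taken from the patent.

```python
import math

def dead_reckon(x, y, heading_deg, distance):
    """Advance (x, y) by `distance` along `heading_deg` (0 = north, clockwise).

    x grows to the east and y grows to the north (an assumed convention).
    """
    heading = math.radians(heading_deg)
    return x + distance * math.sin(heading), y + distance * math.cos(heading)

def track(start, samples):
    """Apply successive (heading_deg, distance) samples to a start position."""
    x, y = start
    for heading_deg, distance in samples:
        x, y = dead_reckon(x, y, heading_deg, distance)
    return x, y
```

For instance, `track((0.0, 0.0), [(0, 10), (90, 5)])` moves 10 m north and then 5 m east. In a real device the heading would come from the direction sensor 212 and the distance from the distance sensor 214, and errors accumulate until a GPS or beacon fix is available again.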
The direction sensor 212 is, for example, a geomagnetic sensor that obtains the azimuth of the vehicle by detecting geomagnetism; a gyroscope, such as a gas rate gyroscope or an optical fiber gyroscope, that detects the rotational angular velocity of the vehicle and integrates it to obtain the azimuth; or wheel sensors that detect turning of the vehicle from the difference between their output pulses (the difference in distance traveled) and calculate the change in azimuth.
The steering angle sensor 213 detects the steering angle α using an optical rotation sensor, a rotation resistance volume, or the like attached to the rotating portion of the steering.
For the distance sensor 214, for example, various methods are used such as detecting and counting the number of rotations of the wheel, or detecting acceleration and integrating twice.
[0016]
The input device 22 is a means for inputting user-related information (age, sex, hobbies, personality, and so on) as part of the vehicle situation, and for the driver to respond to inquiries from the agent. User-related information need not be entered by the user through the input device 22; for example, the agent may ask the user various questions, such as whether they like professional baseball or the name of their favorite team, and acquire the information from the user's answers.
The input device 22 is also used to input the current location (departure point) and destination (arrival point) at the start of travel for navigation processing, the predetermined travel environment (transmission condition) of the vehicle under which a request for information such as traffic information is to be sent to an information provider, and the type (model) of mobile phone used in the vehicle.
Various devices can be used as the input device 22, such as a touch panel (functioning as switches), a keyboard, a mouse, a light pen, a joystick, an infrared remote control, and a voice recognition device. A remote control using infrared rays or the like, together with a receiving unit that receives the various signals transmitted from it, may also be provided. In addition to a joystick for moving the cursor displayed on the screen, the remote control carries various keys such as menu designation keys (buttons) and numeric keys.
[0017]
The recording medium driving device 23 is a driving device used to read a computer program for the navigation processing unit 10 and the agent processing unit 11 to perform various processes from an external recording medium. The computer program recorded on the recording medium includes various programs and data.
Here, a recording medium means any medium on which a computer program is recorded. Specifically, it includes magnetic recording media such as floppy disks, hard disks, and magnetic tape; semiconductor recording media such as memory chips and IC cards; optical recording media such as CD-ROM, MO, and PD (phase-change rewritable optical disc); recording media that use paper (or a medium with an equivalent function), such as paper cards, paper tape, and printed matter from which a program is read using a character recognition device; and other recording media on which a computer program is recorded by various methods.
[0018]
In addition to reading computer programs from these various recording media, when the recording medium is writable, such as a floppy disk or an IC card, the recording medium driving device 23 can write the contents of the RAM of the navigation processing unit 10 and the agent processing unit 11, and the data in the storage devices 29 and 30, to the recording medium.
For example, by storing the learning contents of the agent function (learning item data and response data), user-related information, and the like on an IC card and using that card when driving another vehicle, the driver can communicate with an agent that has already learned his or her preferences (that is, one shaped by past interactions). This makes it possible for an agent specific to the driver, rather than an agent specific to each vehicle, to appear in the vehicle.
[0019]
The communication control device 24 is connected to a mobile phone or other wireless communication device. In addition to calls over the telephone line, the communication control device 24 can communicate with an information providing station that provides traffic information such as road congestion and traffic regulations, and with an information providing station that provides karaoke data for online karaoke in the vehicle.
In addition, learning data related to the agent function and information related to the user can be transmitted and received via the communication control device 24.
[0020]
The voice output device 25 consists of a plurality of speakers arranged in the vehicle. Under the control of the audio control unit 14, it outputs sound such as guidance voice during spoken route guidance and the voice and sounds that accompany the agent's actions. The voice output device 25 may double as the car audio speakers. The audio control unit 14 can control the timbre, accent, and other qualities of the voice output from the voice output device 25 in accordance with tuning instructions entered by the driver.
The microphone 26 functions as voice input means for inputting speech to be recognized by the audio control unit 14, such as a spoken destination for navigation processing or the driver's conversation with the agent (responses and so on). The microphone 26 may double as the microphone for karaoke such as online karaoke, or a dedicated directional microphone may be used in order to pick up the driver's voice accurately.
A hands-free unit may be formed by the audio output device 25 and the microphone 26 so that a telephone call can be made without using a mobile phone.
[0021]
The display device 27 displays road maps for route guidance and various image information produced by the navigation processing unit 10, as well as the various actions (moving images) of the agent produced by the agent processing unit 11. When the agent is displayed, the agent's background is displayed at the same time. Images of the inside and outside of the vehicle captured by the imaging device 28 are also displayed after being processed by the image processing unit 13.
As the display device 27, various display devices such as a liquid crystal display device and a CRT are used.
The display device 27 may have a function as the input device 22 such as a touch panel.
[0022]
The imaging device 28 consists of cameras equipped with CCDs (charge-coupled devices) for capturing images: an in-vehicle camera that images the driver, and exterior cameras that image the front, rear, right side, and left side of the vehicle. The images captured by each camera are supplied to the image processing unit 13, where processing such as image recognition is performed, and each recognition result is also used by the agent processing unit 11 in determining the program number.
[0023]
The agent data storage device 29 is a storage device that stores various data (including programs) necessary for realizing the agent function according to the present embodiment. As the agent data storage device 29, for example, various recording media such as a floppy disk, a hard disk, a CD-ROM, an optical disk, a magnetic tape, an IC card, an optical card, and a driving device thereof are used.
In this case, for example, the learning item data 292 and the response data 293 may be held on an easily portable IC card or floppy disk and the other data on a hard disk; that is, the storage may consist of several different types of recording media, with the corresponding drive devices used as the driving device.
[0024]
The agent data storage device 29 stores an agent program 290, a program selection table 291, learning item data 292, response data 293, image data 294 for displaying the appearance and actions of the agent and its background as illustrated in the drawings, a background selection table 296, response recognition data 298, and various other data needed for agent processing.
[0025]
The agent program 290 stores, in order of program number, an agent processing program for realizing the agent function and communication programs that display the detailed actions of the agent during communication with the driver on the display device 27 together with a background and output the speech corresponding to those actions from the voice output device 25.
The agent program 290 stores several types of audio data for the voice of each program number, and the driver can select the voice, along with the agent's appearance, from the input device 22 or the like. Available agent voices include male voices, female voices, children's voices, mechanical voices, animal voices, the voices of particular voice actors and actors, and the voices of particular characters; the driver chooses from among these. The voice selection can be changed whenever desired.
[0026]
The program selection table 291 is a table for selecting a communication program stored in the agent program 290.
FIG. 2 shows the program selection table 291. FIG. 3 shows the contents of the actions (actions and utterances) of the agent corresponding to each program number selected in the program selection table 291.
The program numbers shown in FIGS. 2 and 3 coincide with the numbers of the communication programs stored in the agent program 290.
[0027]
FIG. 4 shows several screens about the “slow bow” action of the agent displayed on the display device 27 by the program numbers 00001 to 00002 in FIGS.
As shown in FIG. 4, agent E expresses a respectful bow by bowing with the mouth firmly closed and the hands placed on the knees. The words (speech) spoken by agent E along with this action change depending on the vehicle situation, the learning state, the personality of the agent, and the like.
[0028]
When the engine coolant temperature is low, a “sleepy” behavior is selected to match the state of the engine. Sleepiness can be expressed by a face with drooping eyelids, by yawning and stretching before performing the prescribed action (a bow, for example), by rubbing the eyes first, or by making movements and speech slower than usual. These sleepy expressions are not always the same; they are varied as appropriate by learning the number of times the action has been performed.
For example, the agent rubs its eyes once every three times (action A), yawns once every ten times (action B), and otherwise shows a face with drooping eyelids (action C). These variations are realized by combining an additional program for action B or action C with the basic program for action A. Which action to combine is decided by counting the number of executions of the basic action A program as a learning item and combining the additional program according to that count.
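The count-based variation described above could be sketched as follows. This is only an illustration: the function name and the returned labels are hypothetical, and the rule that the once-in-ten action takes priority when a count is a multiple of both 3 and 10 is an assumption, since the text does not say how such a collision is resolved.

```python
def sleepy_variation(execution_count: int) -> str:
    """Pick the sleepy expression for the n-th execution (1-based count).

    Assumption: the once-in-ten action wins when both rules apply.
    """
    if execution_count % 10 == 0:
        return "yawn"              # once every ten times
    if execution_count % 3 == 0:
        return "rub_eyes"          # once every three times
    return "drooping_eyelids"      # all other times
```

The execution count plays the role of the learning item: it is incremented each time the basic program runs and is the only state the selection needs.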
Similarly, an “energetic” behavior is expressed by exaggerating the intonation of the voice or by having agent E run onto the screen.
[0029]
Each item shown in FIG. 2 is a selection condition for selecting a program number. Some items are determined from the various states of the vehicle and the driver detected by the status sensor unit 40 (time, starting location, coolant temperature, shift position, accelerator opening, and so on), and others are determined from the learning contents stored in the learning item data 292 and the response data 293 (the number of times the ignition has been turned on today, the time elapsed since the previous end, the total number of startups, and so on).
In the program selection table 291, the program that satisfies all of these items is always uniquely determined. In the table, “◯” marks an item that must be satisfied for that program number to be selected, while “−” and a blank mark items that are not considered in selecting that program.
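A table lookup of this kind might be sketched as below. The table contents and condition names are hypothetical placeholders rather than the actual entries of FIG. 2, and each row lists only the conditions that must be satisfied; conditions that are not considered are simply left out of the row.

```python
# Hypothetical table: program number -> conditions that must all hold.
PROGRAM_TABLE = {
    "00001": {"ignition_on", "coolant_cold"},
    "00002": {"ignition_on", "coolant_warm"},
}

def select_program(table, situation):
    """Return the one program number whose required conditions all hold.

    `situation` is the set of condition names that are currently true.
    """
    matches = [number for number, required in table.items()
               if required <= situation]  # subset test: every required item holds
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]
```

Raising an error when zero or several rows match is a design choice of this sketch; it makes a table that violates the uniqueness property stated above fail loudly instead of picking a row arbitrarily.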
[0030]
FIGS. 2 and 3 show the behaviors and selection conditions for communication (greetings) when the ignition is turned on, but program numbers and selection conditions for selecting programs that prescribe various other behaviors (actions and utterances) are also defined.
For example, programs are also defined in which, on condition that the brakes are applied suddenly, the agent falls on its backside (“shirimochi”), staggers (“tatara”), or cries out in surprise. Which of these the agent does changes through learning about sudden braking: from the first sudden braking to the third it falls on its backside, from the fourth to the tenth it staggers, and thereafter it merely braces itself by putting one foot a step forward, so that the agent gets used to sudden braking in stages. If a week passes since the last sudden braking, the agent moves back one stage.
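The staged habituation to sudden braking can be sketched as follows. The stage labels, the function names, and the use of `datetime` values are illustrative assumptions. The sketch also assumes that every occurrence after the tenth uses the final "brace" stage, and that the one-week rule lowers the stage by one for the current reaction only; the text does not spell out either point.

```python
from datetime import datetime, timedelta

STAGES = ["fall_on_backside", "stagger", "brace_one_step"]

def stage_for_count(count):
    """Map the 1-based sudden-braking count to a stage index."""
    if count <= 3:
        return 0
    if count <= 10:
        return 1
    return 2  # assumed: everything after the tenth occurrence

def braking_reaction(count, last_brake, now):
    """Choose the agent's reaction, stepping back one stage after a week's gap."""
    stage = stage_for_count(count)
    if last_brake is not None and now - last_brake >= timedelta(weeks=1):
        stage = max(stage - 1, 0)
    return STAGES[stage]
```

For example, a fifth sudden braking that comes eight days after the fourth falls back from "stagger" to "fall_on_backside".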
[0031]
The learning item data 292 and the response data 293 are data obtained as a result of the agent learning by the driver's driving operation and response. Therefore, the learning item data 292 and the response data 293 are stored / updated (learned) for each driver.
[0032]
Both the learning item data 292 and the response data 293 are data that is stored and updated by learning of the agent, and the contents thereof are conceptually shown in FIGS. 5 and 6, respectively.
As shown in FIG. 5, the learning item data 292 stores the items used as selection conditions for communication programs in the program selection table 291 (FIG. 2): the total number of activations, the previous end date and time, the number of times the ignition has been turned on today, the fuel remaining at each of the last five refuelings, the audio operating conditions and the equipment operated at that time, and so on. It also stores the number of rests and their dates and times, default values, and other data used to decide whether to actually start a program selected by the selection conditions (or to rest instead).
[0033]
The total number of activations stores the total number of times the ignition has been activated, and is counted up each time the ignition is turned on.
The previous end date and time is stored every time the ignition is turned off.
The number of times the ignition is turned on today stores the number of times the ignition is turned on that day and the end time of the day. The count is incremented every time the ignition is turned on, but the data is initialized to “0” when the day ends. The end time of the day is stored as 24:00 as a default value. This time can be changed according to the life pattern of the user (driver). When the time is changed, the changed time is stored.
[0034]
The remaining amounts at the last five refuelings hold the fuel remaining as detected immediately before each refueling (gasoline). Each time the vehicle is refueled, the entries are shifted to the left (the oldest, leftmost entry is deleted) and the amount remaining immediately before the new refueling is stored in the rightmost position.
This data is used as follows: when the detection value G1 of the fuel detection sensor 415, described later, falls to or below the average G2 of the remaining amounts for all five refuelings (G1 ≦ G2), agent E appears on the display device 27 and performs a behavior prompting the driver to refuel, and speech such as “I'm hungry! I want gasoline!” is output from the voice output device 25.
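A minimal sketch of this refueling rule is shown below. The class and method names are hypothetical; a bounded `deque` stands in for the five-slot table, so appending a sixth value drops the oldest one, which corresponds to the shift-left behavior described above. Returning `False` when no history exists yet is an assumption of the sketch.

```python
from collections import deque

class RefuelHistory:
    """Keeps the fuel remaining just before each of the last five refuelings."""

    def __init__(self):
        self.remaining = deque(maxlen=5)  # oldest entry is dropped automatically

    def record_refuel(self, remaining_before):
        self.remaining.append(remaining_before)

    def should_prompt(self, g1):
        """True when the current fuel level G1 is at or below the average G2."""
        if not self.remaining:
            return False  # assumed: no prompt until some history exists
        g2 = sum(self.remaining) / len(self.remaining)
        return g1 <= g2
```

Because G2 is learned from when this particular driver actually refuels, the prompt adapts to the driver's habits instead of using a fixed low-fuel threshold.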
The audio operating conditions are the time of day and the place at which audio equipment such as the radio, CD, MD, cassette tape player, or TV was switched on; for the radio and TV, the selected station is also included. The operating equipment is the audio device concerned (radio, CD, MD, cassette tape player, and so on). The audio operating conditions and the operating equipment are stored for the last five times the audio was switched on.
[0035]
The number of times / date of rest stores, for each program number, the number of times of rest without executing even if the corresponding communication program is selected. The number of times / date of rest is stored for an agent action in which the rest item is set as a learning item, such as an agent act (program number 00123) that proposes to stop the air conditioner described later.
When the driver's response to the agent's proposal or conversation is rejection (rejection) or disregard (or no response), “rest” is selectively set according to the communication program.
[0036]
The default values hold the initial settings for items such as times, counts, temperatures, vehicle speeds, and dates, and are used to return a value that has been changed through learning, such as the end time of the day described above, to its initial setting.
[0037]
Other data stored in the learning item data 292 includes, for example, the birthdays of the driver and of people close to the driver (a user input item), and the dates of national holidays and of events such as Christmas, Valentine's Day, and White Day. A special-menu communication program is also provided for each event day; for example, an agent dressed as Santa Claus appears on Christmas Eve.
[0038]
In the response data 293 of FIG. 6, a history of the user's responses to the agent's behavior is stored for each communication program number for which the user response is a learning item. What is stored differs by communication program: the latest response dates and times and response contents for a predetermined number of times (two for program number 00123), as with communication program numbers 00123 and 00125 in the figure; only the latest response content, stored once (and therefore updated each time there is a response); only the latest response contents for a predetermined number of times; the latest date and time and response content, stored once; or only the latest date and time, stored once or a predetermined number of times.
Symbols A, B, and C shown in FIG. 6 (A) represent the response contents. As shown in FIG. 6 (B), symbol A means the agent's behavior was ignored, symbol B means it was rejected, and symbol C means it was accepted. The content of the driver's response is determined from the result of voice recognition of the driver's speech input from the microphone 26 and from the input made through the input device 22.
In this embodiment the driver's response is classified into three patterns (ignored, rejected, and accepted), but “strongly rejected”, “angry”, and “pleased” may be added. In that case, the learning item data 292 (for example, the number of rests) and the response data 293 are additionally updated according to the newly added responses.
[0039]
The image data 294 of the agent data storage device 29 shown in FIG. 1 stores images of the appearances of several types of agents and of each background, to be composited for the behavior of each communication program number in the agent program 290. The driver can select the agent's appearance from the input device 22 or the like according to preference, and the selected agent is displayed together with the background determined by the background selection table 296 on the basis of the situation obtained from the various sensors and other sources. As with the voice, the selection of the agent's appearance can be changed at any time.
The appearance of the agent stored in the image data 294 need not be human (male or female). It may be, for example, the appearance of an animal as such (a chick, dog, cat, frog, mouse, and so on), a human-like stylized (illustrated) animal, a robot, or a particular character. Nor does the agent's age need to be fixed: as part of the agent's learning function, it may start with the appearance of a child and grow over time, changing its appearance (for example, to that of an adult).
[0040]
Examples of background images stored in the image data 294 include landscapes that represent the time of day, such as a sunrise or a starry sky; landscapes that represent the season, such as the sea, snowy mountains, or autumn leaves; landscapes that represent the destination, such as a golf course or the sea; “Yes” and “No” placards for when the agent is waiting for a “Yes” or “No” response; a pattern of musical notes for when the radio or a CD is playing; and frames in different colors, where the color indicates the response recognition state, that is, whether a spoken response can currently be recognized.
[0041]
FIG. 7 shows the background selection table 296 for selecting the agent background displayed on the display device 27. As shown on the left side of the table, the background comprises images of landscapes and patterns displayed behind the agent, images of belongings held by the agent, and images of frames displayed along the inner edge of the display screen of the display device 27. As shown in FIG. 7, these backgrounds are determined on the basis of various items such as the time of day, the season, the running state, the operating equipment, the agent state, and the destination set in the car navigation system.
Some of these items are determined from various situations such as the vehicle running state detected by the status sensor unit 40, and others are determined in association with the learning contents stored in the learning item data 292 and the response data 293 (for example, the choice between “Yes”/“No” placards and “YES”/“NO” placards, or the choice of background while audio is operating).
One or more backgrounds are then selected based on these selection conditions. When several are selected, no more than one each of backdrop, belongings, and frame is chosen. In the table, “◯” marks an item that must be satisfied for that background to be selected, and no mark indicates an item that must not be satisfied for that background to be selected.
[0042]
FIGS. 8A, 8B, 8C, and 8D show an example in which the background selected by the above-described background selection table 296 is displayed on the display device together with the agent in the present embodiment.
FIG. 8A shows the state in which the driver's voice cannot be recognized because the agent is speaking or a voice recognition result is being evaluated; a red frame R indicating that voice recognition is unavailable is displayed as the background. In display (a), the agent is shown seated, and this posture indicates that the vehicle is stopped.
FIG. 8B shows the state of waiting for an affirmative or negative word as a response from the driver; a yellow frame Y indicating that only affirmative and negative words can be recognized is displayed as the background. In addition, “Yes” and “No” placards are displayed as the agent's belongings (background), as the recommended words among the responses that can be recognized. In displays (b) to (d), the agent is shown standing, and this posture indicates that the vehicle is traveling.
FIG. 8C shows the screen while waiting for voice input of the destination prefecture for the navigation system; a green frame G indicating that voice recognition is available is displayed as the background. In addition, a “prefecture” placard is displayed as the agent's belongings (background), as the recommended word among the responses that can be recognized.
FIG. 8D shows the state in which ordinary conversational speech such as “play a CD” or “open the window” can be recognized without particular restriction; a green frame G is displayed as the background.
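The mapping from recognition state to frame color in FIGS. 8A to 8D can be summarized in a short sketch. The enum, the function names, and the three boolean inputs are hypothetical; only the three colors and the conditions under which they appear come from the description above.

```python
from enum import Enum

class RecognitionState(Enum):
    UNAVAILABLE = "red"      # agent speaking or recognition result being evaluated
    YES_NO_ONLY = "yellow"   # only affirmative and negative words are accepted
    AVAILABLE = "green"      # voice recognition is available

def recognition_state(agent_speaking, judging_result, awaiting_yes_no):
    """Derive the recognition state from three (assumed) status flags."""
    if agent_speaking or judging_result:
        return RecognitionState.UNAVAILABLE
    if awaiting_yes_no:
        return RecognitionState.YES_NO_ONLY
    return RecognitionState.AVAILABLE

def frame_color(state):
    """Return the frame color shown as part of the background."""
    return state.value
```

Checking the "unavailable" condition first reflects the idea that nothing can be recognized while the agent is talking, even if a yes/no answer is expected next.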
[0043]
Like the learning item data 292 and response data 293 described above, the response recognition data 298 is data that the agent obtains as a result of the driver's responses, and it is stored and updated (learned) for each driver.
[0044]
FIG. 9 shows the response recognition data.
As shown in FIG. 9, the response recognition data 298 stores, for each driver, the response recognition results, the response recognition rate calculated from those results, and the highest recognition rate word in each of the positive word group and the negative word group.
The response recognition results are data indicating whether the driver's spoken response was correctly recognized when affirmative and negative words such as “Yes” and “No”, “YES” and “NO”, “Yes” and “NO”, or “Ye” and “Yadayo” were displayed on placards as the background. Whether a response was correctly recognized is judged from the driver's reaction when the agent performs control or the like based on the recognized response. The results are stored for 10 occurrences of each word (“Yes”, “No”, “YES”, “NO”, ...).
[0045]
The response recognition rate is calculated from each of the above response recognition results by Equation 1 below and stored.
[0046]
[Expression 1]
Response recognition rate = (number of times a response has been correctly recognized / number of responses acquired) × 100
[0047]
The highest recognition rate word is the word with the highest recognition rate in each group: the positive word group, such as “Yes” and “YES”, and the negative word group, such as “No” and “NO”. When a background with two placards, one positive and one negative, is displayed, these highest recognition rate words are selected through the background selection table.
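Equation 1 and the choice of the highest recognition rate word can be illustrated with a minimal sketch. The function names and the sample results are hypothetical; the patent does not specify how ties are broken, so this sketch simply keeps the first word encountered.

```python
def recognition_rate(results: list[bool]) -> float:
    """Equation 1: (correctly recognized / responses acquired) x 100."""
    if not results:
        return 0.0  # assumption: no data yet is treated as 0 %
    # Multiplying first is mathematically equivalent and avoids rounding noise.
    return 100.0 * sum(results) / len(results)


def highest_rate_word(group: dict[str, list[bool]]) -> str:
    """Return the word in a group with the highest recognition rate."""
    return max(group, key=lambda word: recognition_rate(group[word]))


# Hypothetical stored results (True = recognized correctly), ten per word.
positive_group = {
    "Yes": [True] * 7 + [False] * 3,
    "YES": [True] * 9 + [False] * 1,
}
negative_group = {
    "No": [True] * 6 + [False] * 4,
    "NO": [True] * 5 + [False] * 5,
}

placard_words = (highest_rate_word(positive_group),
                 highest_rate_word(negative_group))
```

With these sample values the placards would show "YES" and "No", the words this particular driver is recognized saying most reliably.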
[0048]
FIG. 10 shows the contents of a data file stored in the navigation data storage device 30 (FIG. 1).
As shown in FIG. 10, the navigation data storage device 30 stores the various data files used for route guidance: a communication area data file 301, a drawing map data file 302, an intersection data file 303, a node data file 304, a road data file 305, a search data file 306, and a photo data file 307.
As the navigation data storage device 30, various recording media such as a floppy disk, a hard disk, a CD-ROM, an optical disk, a magnetic tape, an IC card, or an optical card are used together with their drive devices.
The navigation data storage device 30 may be composed of a plurality of different types of recording media and drive devices. For example, the search data file 306 may be held on a readable and writable recording medium (for example, a flash memory) and the other files on a CD-ROM, with the corresponding drive devices used for each.
[0049]
The communication area data file 301 stores, for each type of mobile phone, communication area data used to display on the display device 27 the area in which a mobile phone can communicate, whether the phone is connected to the communication control device 24 or used in the vehicle without being connected, or to take that area into account during route search. The communication area data for each type of mobile phone is numbered and managed so that it can be searched easily. Because a communicable area can be expressed as the interior of a closed curve, it is specified by the position data of the curve's bending points. Alternatively, the communication area data may be generated by dividing the communicable area into large and small rectangular areas and using the coordinate data of two diagonally opposite points of each.
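Both representations of a communicable area lend themselves to a simple containment test, sketched here. The patent does not name an algorithm; the ray-casting test and all function names are assumptions chosen for illustration, and coordinates are treated as plain planar (x, y) pairs.

```python
Point = tuple[float, float]


def in_closed_curve(p: Point, bending_points: list[Point]) -> bool:
    """Ray-casting test: is p inside the polygon given by its bending points?

    Points exactly on the boundary are not handled specially in this sketch.
    """
    x, y = p
    inside = False
    n = len(bending_points)
    for i in range(n):
        x1, y1 = bending_points[i]
        x2, y2 = bending_points[(i + 1) % n]
        # Does this edge cross the horizontal line through p?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_rectangle(p: Point, corner_a: Point, corner_b: Point) -> bool:
    """Containment test for an area stored as two diagonally opposite points."""
    x, y = p
    (ax, ay), (bx, by) = corner_a, corner_b
    return min(ax, bx) <= x <= max(ax, bx) and min(ay, by) <= y <= max(ay, by)


def in_any_rectangle(p: Point, rectangles: list[tuple[Point, Point]]) -> bool:
    """An area divided into several rectangles contains p if any one does."""
    return any(in_rectangle(p, a, b) for a, b in rectangles)
```

The rectangle form trades accuracy for a cheaper test and smaller data, which may be why the description offers it as an alternative.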
The contents of the communication area data file 301 are preferably updatable as the area in which the mobile phone can be used expands or shrinks. For this purpose, the communication area data file 301 can be updated with the latest data by communicating with an information providing station using the mobile phone and the communication control device 24. Alternatively, the communication area data file 301 may be held on a floppy disk, an IC card, or the like and rewritten with the latest data.
The drawing map data file 302 stores the drawing map data to be drawn on the display device 27. The drawing map data is hierarchical and stores map data for each level, for example Japan, the Kanto region, Tokyo, and Kanda, from the top level down. Map data is attached to each level of map data.
[0050]
The intersection data file 303 stores, as intersection data, the intersection number, the intersection name, the intersection coordinates (latitude and longitude), the numbers of the roads that start or end at the intersection, and the presence or absence of a traffic signal.
The node data file 304 stores node data consisting of information such as the latitude and longitude that specify the coordinates of each point on each road. In other words, node data describes a single point on a road. If a line connecting two nodes is called an arc, a road is expressed by connecting the nodes of each node sequence with arcs.
The road data file 305 stores a road number that identifies each road, the intersection numbers of its start and end points, the numbers of roads having the same start and end points, the road width, prohibition information such as no entry, photo numbers, and the like.
Road network data composed of intersection data, node data, and road data stored in the intersection data file 303, node data file 304, and road data file 305, respectively, is used for route search.
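How such road network data could feed a route search is sketched below. The record fields follow the description of the road data file, but the class and function names are hypothetical, and the breadth-first search (fewest roads) is a stand-in: the patent does not state which search algorithm is used.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Road:
    number: int             # road number identifying the road
    start: int              # intersection number of the start point
    end: int                # intersection number of the end point
    no_entry: bool = False  # prohibition information


def build_network(roads: list[Road]) -> dict[int, list[int]]:
    """Map each intersection to the intersections reachable by one road."""
    network = defaultdict(list)
    for road in roads:
        if not road.no_entry:
            network[road.start].append(road.end)
    return network


def search_route(network: dict[int, list[int]], origin: int, goal: int):
    """Return an intersection sequence from origin to goal, or None."""
    previous = {origin: None}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        if current == goal:
            route = []
            while current is not None:
                route.append(current)
                current = previous[current]
            return route[::-1]
        for nxt in network.get(current, []):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None
```

The returned list plays the role of the intersection sequence data that the search data file holds for a generated route.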
[0051]
The search data file 306 stores intersection sequence data, node sequence data, and the like constituting the route generated by the route search. The intersection string data includes information such as an intersection name, an intersection number, a photo number showing a characteristic landscape of the intersection, a turning angle, and a distance. The node string data includes information such as east longitude and north latitude indicating the position of the node.
In the photo data file 307, photographs taken of characteristic scenery or the like that can be seen at each intersection or straight ahead are stored in a digital, analog, or negative film format corresponding to the photograph number.
[0052]
FIG. 11 shows various sensors constituting the situation sensor unit 40.
As shown in FIG. 11, the situation sensor unit 40 includes an ignition sensor 401, a vehicle speed sensor 402, an accelerator sensor 403, a brake sensor 404, a side brake detection sensor 405, a shift position detection sensor 406, a blinker detection sensor 407, a wiper detection sensor 408, a light detection sensor 409, a seat belt detection sensor 410, a door open/close detection sensor 411, a passenger detection sensor 412, an indoor temperature detection sensor 413, an outdoor temperature detection sensor 414, a fuel detection sensor 415, a water temperature detection sensor 416, an ABS detection sensor 417, an air conditioner sensor 418, a weight sensor 419, a front inter-vehicle distance sensor 420, a rear inter-vehicle distance sensor 421, a body temperature sensor 422, a heart rate sensor 423, a sweat sensor 424, an electroencephalogram sensor 425, an eye tracer 426, an infrared sensor 427, and other sensors 428 (a tire pressure drop detection sensor, a belt looseness detection sensor, a window open/close state sensor, a horn sensor, an indoor humidity sensor, an outdoor humidity sensor, an oil temperature detection sensor, a hydraulic pressure detection sensor, and so on). These sensors detect the vehicle situation, the driver situation, the in-vehicle situation, and the like.
These various sensors are arranged at predetermined positions according to the respective sensing purposes.
These sensors need not exist as independent sensors; a quantity may also be sensed indirectly from the detection signal of another sensor. For example, the tire pressure drop detection sensor detects a drop in air pressure indirectly from a change in the signal of a wheel speed sensor.
[0053]
The ignition sensor 401 detects ON and OFF of the ignition.
As the vehicle speed sensor 402, a conventionally known vehicle speed sensor such as one that calculates the vehicle speed by detecting the rotational angular speed or the number of rotations of the speedometer cable can be used without particular limitation.
The accelerator sensor 403 detects the amount of depression of the accelerator pedal.
The brake sensor 404 detects the amount of depression of the brake, and detects whether or not a sudden brake is applied based on the depression force, the depression speed, and the like.
The side brake detection sensor 405 detects whether or not the side brake is applied.
The shift position detection sensor 406 detects the shift lever position.
The blinker detection sensor 407 detects the blinking direction of the blinker.
The wiper detection sensor 408 detects the driving state (speed, etc.) of the wiper.
The light detection sensor 409 detects the lighting state of each lamp such as a head lamp, tail lamp, fog lamp, and room lamp.
The seat belt detection sensor 410 detects whether the driver and the passengers (front passenger seat, rear seats) are wearing seat belts. If a belt is not fastened, the agent appears as appropriate (to an extent that does not annoy the occupants) and gives warnings, cautions, or comments (the degree is adjusted by learning).
[0054]
The door open / close detection sensor 411 detects the open / closed state of the door, and in the case of a so-called half-door, the agent notifies the fact. The door opening / closing detection sensor 411 can detect opening / closing of each door according to the vehicle type, such as a driver's seat door, a passenger seat door, a rear driver seat side door, and a rear passenger seat side door.
The passenger detection sensor 412 detects whether a passenger is seated in the front passenger seat or a rear seat. Detection is made from an in-vehicle image captured by the imaging device 28, or by a pressure sensor or scale arranged in the front passenger seat or the like.
The indoor temperature detection sensor 413 detects the indoor air temperature, and the outdoor temperature detection sensor 414 detects the air temperature outside the vehicle.
The fuel detection sensor 415 detects the remaining amount of fuel such as gasoline or light oil. The values detected immediately before each of the past five refuelings are stored in the learning item data 292, and when the remaining amount reaches their average, the agent announces that it is time to refuel.
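The refueling reminder amounts to a rolling average over the last five pre-refueling readings. A minimal sketch follows; the class and method names are hypothetical, and "reached" is interpreted here as the current level being at or below the average.

```python
from collections import deque


class RefuelAdvisor:
    """Learns the fuel level at which this driver usually refuels."""

    def __init__(self, history_size: int = 5):
        # Only the most recent readings are kept; older ones drop out.
        self._levels_before_refuel = deque(maxlen=history_size)

    def record_refuel(self, level_before: float) -> None:
        """Store the fuel level detected immediately before a refueling."""
        self._levels_before_refuel.append(level_before)

    def should_notify(self, current_level: float) -> bool:
        """True when the current level has fallen to the learned average."""
        if not self._levels_before_refuel:
            return False  # nothing learned yet
        levels = self._levels_before_refuel
        return current_level <= sum(levels) / len(levels)


advisor = RefuelAdvisor()
for level in (10.0, 12.0, 8.0, 10.0, 10.0):  # hypothetical liters remaining
    advisor.record_refuel(level)
```

Because the threshold comes from the driver's own habits, a driver who refuels early is reminded earlier than one who runs the tank low.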
[0055]
The water temperature detection sensor 416 detects the temperature of the cooling water. If the detected temperature is low immediately after the ignition is turned on, the agent often behaves sleepily. Conversely, if the water temperature is too high, the agent reports this with a “listless” behavior before the engine overheats.
The ABS detection sensor 417 detects whether or not the ABS is activated to prevent the tire from being locked due to a sudden brake and to ensure the maneuverability and vehicle stability.
The air conditioner sensor 418 detects the operation state of the air conditioner. For example, ON / OFF of the air conditioner, set temperature, air volume, etc. are detected.
The weight sensor 419 is a sensor that detects the weight of the driver. A driver is identified from this weight or from the weight and the image of the imaging device 28, and an agent learned in relation to the driver is caused to appear. That is, by using the learning item data 292 and the response data 293 learned by the agent for the specified driver, an agent dedicated to the driver appears.
The front inter-vehicle distance sensor 420 detects the distance to other vehicles and obstacles in front of the vehicle, and the rear inter-vehicle distance sensor 421 detects the distance to other vehicles and obstacles behind.
[0056]
The body temperature sensor 422, the heart rate sensor 423, and the sweat sensor 424 detect the driver's body temperature, heart rate, and sweating state, respectively. For example, each sensor is arranged on the surface of the steering wheel and detects from the state of the driver's hands. Alternatively, as the body temperature sensor 422, the temperature distribution of each part of the driver's body may be detected by thermography using an infrared detection element.
The electroencephalogram sensor 425 is a sensor that detects a driver's brain wave, and detects, for example, an α wave, a β wave, or the like to check the driver's arousal state.
The eye tracer 426 detects the movement of the user's line of sight and determines whether the user is looking for an object outside the vehicle, looking for an object inside the vehicle, or in an alert state during normal driving.
The infrared sensor 427 detects the movement of the user's hand and the movement of the face.
[0057]
Next, the operation of the present embodiment configured as described above will be described.
FIG. 12 is a flowchart showing the main operation of processing by the agent of this embodiment.
When the ignition sensor 401 detects that the ignition is turned on, the agent processing unit 11 first performs initial setting (step 11). As initial settings, processing such as clearing the RAM, setting the work area for each process in the RAM, loading the program selection table 291 (FIG. 2) into the RAM, and setting the flag to 0 is performed. In the agent processing of this embodiment, the start of the processing is set to ignition ON. However, for example, the processing may be started when any door opening / closing is detected by the door opening / closing detection sensor 411.
[0058]
Next, the agent processing unit 11 identifies the driver (step 12). That is, the agent processing unit 11 identifies the driver by analyzing the voice when the driver first gives a greeting, by analyzing the captured image, from the weight detected by the weight sensor 419, or from the set seat position and the angle of the rearview mirror. For the identified driver, a special communication program that asks “Are you Mr. XX?” is activated separately from the agent processing described later, and the driver is confirmed.
[0059]
When the driver is specified, the agent processing unit 11 next grasps the current situation (step 13).
That is, the agent processing unit 11 acquires the detection values supplied from each sensor of the situation sensor unit 40 to the situation information processing unit 15, the processing result of the image captured by the imaging device 28, and the vehicle position detected by the current position detection device 21, stores them in a predetermined area of the RAM, and grasps the current situation, such as the state of the vehicle, from the stored data. For example, when the cooling water temperature detected by the water temperature detection sensor 416 is t1, the agent processing unit 11 stores t1 in the RAM and, if t1 is equal to or less than a predetermined threshold t2, grasps that the cooling water temperature (see FIG. 2) is low.
The driver's requests recognized by voice from the input of the microphone 26, for example “Please call XXX”, “Show me restaurants in this area”, or “Please play the CD”, are also grasped as the current situation. In this case, the words “CD” and “play” (“kake”) contained in the recognized speech become the selection conditions (horizontal-axis items) of the program selection table 291 (FIG. 2).
Further, the agent processing unit 11 checks the learning item data 292 and the response data 293 in the agent data storage device 29 as a grasp of the current situation, thereby grasping the state (learning data) that the agent has learned so far.
[0060]
Having grasped the current situation, the agent processing unit 11 performs the agent processing that corresponds to the grasped situation, as described in detail later with reference to FIG. 13 (step 14).
The processing of the agent here includes various processing such as judgment, action (behavior + utterance), control, learning, inspection, etc. by the agent, but also includes a case where no operation is performed depending on the grasped current situation.
[0061]
Next, the agent processing unit 11 determines whether or not to end the process of the main operation (step 15). If not ended (step 15; N), the process returns to step 13 and repeats the process.
When the process is to be ended, that is, after the ignition sensor 401 has detected that the ignition was turned off (step 13) and termination processing such as turning off the interior lamp has been completed (step 14), the main processing operation is terminated (step 15; Y).
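The main operation of FIG. 12 can be read as a simple loop over steps 13 to 15, sketched below. All names are hypothetical, the callbacks stand in for the driver identification and agent processing described in the text, and the initial setting of step 11 (RAM clearing, table loading) is omitted.

```python
from typing import Callable, Iterable


def run_main_operation(
    situations: Iterable[dict],
    identify_driver: Callable[[], str],
    agent_process: Callable[[str, dict], str],
) -> list[str]:
    """Run steps 12 to 15 and return the actions taken, in order."""
    driver = identify_driver()                 # step 12: identify the driver
    actions = []
    for situation in situations:               # step 13: grasp the situation
        # Step 14: agent processing (also performs termination processing
        # once the ignition is found to be off).
        actions.append(agent_process(driver, situation))
        if not situation["ignition_on"]:       # step 15: end of main operation?
            break
    return actions


def demo_process(driver: str, situation: dict) -> str:
    """Hypothetical stand-in for the agent processing of step 14."""
    return "agent action" if situation["ignition_on"] else "termination processing"
```

Each element of `situations` represents one pass through step 13, so the loop ends on the first pass that reports the ignition as off.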
[0062]
FIG. 13 is a flowchart showing the processing operation of the agent according to the grasped situation.
Based on the program selection table 291 shown in FIG. 2, the agent processing unit 11 determines from the grasped current situation (number of activations, current weather, time, etc.) whether there is a communication program that can be started in the current state, and how many (step 21). If there is no corresponding program (step 21; N), the process returns to the main routine.
On the other hand, if there is a communication program that can be started (step 21; Y), its program number is determined. The driver's response history for the determined program number is then checked in the response data 293 to confirm whether activation of that communication program should currently be rested (step 22).
[0063]
When not in a rest state (step 22; N), the agent processing unit 11 confirms whether the communication program to be started is a control program for controlling the vehicle and various devices mounted on the vehicle (step 23). In the case of a control program (step 23; Y), this communication program is started as it is, and control according to the program is performed (step 29).
[0064]
If the communication program is not a control program (step 23; N), the background of the agent is determined according to the background selection table 296 (step 24) from the grasped current situation (current time, season, operating devices, the agent's voice recognition status, the destination set in the car navigation system, etc.) and the response recognition data 298 in the agent data storage device 29.
Subsequently, the communication program is started for the selected agent appearance and the determined background image, so that, as shown in the figures, an image conforming to the agent's action (behavior and voice) is displayed on the display device 27 and the voice is output from the audio output device 25 (step 25).
[0065]
When this communication program is not a response acquisition program for acquiring a response from the driver (step 26; N), the process returns to the main routine. When it is a response acquisition program (step 26; Y), the agent processing unit 11 acquires the driver's response to the agent action produced by the communication program, either from the voice recognition result based on the input from the microphone 26 or from the input to the input device 22 (step 27). The agent processing unit 11 then accumulates data on the current communication program so that the agent learns (step 28), and returns to the main routine.
As for the accumulation of data, when activation of the communication program is rested (step 22; Y), for example, the count column of the corresponding program number in the learning item data 292 is incremented. However, let Ka be the count stored in the count/date column of the learning item data 292 and Kb be the number of rests determined from the history in the response data 293 for that program number. If Ka = Kb-1, the prescribed number of rests is reached with this rest, so the data stored at the corresponding positions in the program number columns of the learning item data 292 and the response data 293 is cleared.
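The rest counting with Ka and Kb can be sketched as follows. The function name and the use of plain dictionaries for the learning item data and response data are assumptions made for illustration.

```python
def record_rest(
    rest_counts: dict[str, int],
    response_history: dict[str, list[str]],
    program_no: str,
    required_rests: int,
) -> bool:
    """Count one rest of a program; return True when the rest period ends.

    rest_counts holds Ka per program number, required_rests is Kb.
    """
    ka = rest_counts.get(program_no, 0)
    if ka == required_rests - 1:
        # This rest is the last one required: clear the stored data so the
        # program can be started again next time.
        rest_counts.pop(program_no, None)
        response_history.pop(program_no, None)
        return True
    rest_counts[program_no] = ka + 1
    return False


counts: dict[str, int] = {}
history = {"00501": ["ignored", "ignored"]}  # hypothetical response history
outcomes = [record_rest(counts, history, "00501", 3) for _ in range(3)]
```

With Kb = 3, the first two calls only increment the count and the third clears both records, so the program is rested exactly three times.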
[0066]
In the other cases (after step 27 and after step 29), if the grasped current situation (step 13) contains a learning item, the value in the learning item data 292 is updated, and if the program number is one whose response content should be kept as a history, the response content acquired in step 27 is stored in the response data 293 (FIG. 6). For the response history, when the predetermined number of entries defined for each program number has already been stored, the oldest entry is discarded and the new one is stored. If the response acquired in step 27 relates to an answerback for a previously acquired positive or negative word, whether that word was recognized correctly is obtained from this response, the recognition result is stored in the response recognition data 298, and the response recognition rate and, if necessary, the highest recognition rate word are rewritten.
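The fixed-length response history, in which the oldest entry is discarded once the per-program limit is reached, maps naturally onto a bounded queue. This is a minimal sketch with hypothetical names and capacities, not the stored format of the response data 293.

```python
from collections import deque


class ResponseHistory:
    """Keeps the most recent responses for each program number."""

    def __init__(self, capacity_by_program: dict[str, int]):
        # One bounded queue per program number; deque(maxlen=n) drops the
        # oldest entry automatically when a new one is appended at capacity.
        self._history = {
            program_no: deque(maxlen=capacity)
            for program_no, capacity in capacity_by_program.items()
        }

    def store(self, program_no: str, response: str) -> None:
        self._history[program_no].append(response)

    def entries(self, program_no: str) -> list[str]:
        """Return the stored responses, oldest first."""
        return list(self._history[program_no])


log = ResponseHistory({"00501": 3})
for answer in ("accepted", "ignored", "refused", "accepted"):
    log.store("00501", answer)
```

After the fourth response, the first "accepted" has been discarded and only the three most recent entries remain.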
[0067]
Next, a case where the radio is operated will be described as a specific action by the agent processing described above.
FIG. 14 conceptually shows the contents of specific agent processing after the ignition is turned on.
As shown in FIG. 14A, assume that in step 13 the agent processing unit 11 has grasped, as the current situation, that the current time is 17:00 and that the current location, obtained from the current position (latitude, longitude) detected by the current position detection device 21, is Tokyo. Assume also that the learning data checked in the learning item data 292 and the response data 293 shows that, in the past five audio operations, the radio was operated in Tokyo between 17:00 and 18:00, and that J-wave was the station selected most often when listening to the radio.
[0068]
Based on the situation grasped above, the agent processing unit 11 selects the corresponding communication program from the program selection table 291. Here, assume that the communication program with program number 00501 (a program that proposes turning on the radio) is selected (step 21; Y). It is then confirmed that this communication program is not subject to rest (step 22; N).
Since this communication program is not a control program (step 23; N), the agent processing unit 11 subsequently determines a background from the background selection table 296 (step 24). In this case, because the time zone is 15:00 to 18:00, the communication program to be started involves speech, and voice recognition is therefore unavailable, a background with a sunset back and a red frame is determined.
The communication program of that number is then started (step 25), an image of the selected agent and the determined background (sunset back BK and red frame R) is displayed on the display device 27 as shown in the figure, and the agent asks “Shall I turn on the radio?” Since this communication program is not a response acquisition program (step 26; N), the process returns directly to the main routine.
[0069]
Next, it is newly grasped as the current situation that the inquiry of program number 00501 has been made, and in the agent processing it is determined that there is a communication program, program number 00xxx, that acquires a positive or negative response (step 21; Y). Since it is not subject to rest (step 22; N) and is not a control program (step 23; N), the agent processing unit 11 determines the background from the background selection table 296 and the highest recognition rate words in the response recognition data 298 (step 24).
In this case, because the time zone is 15:00 to 18:00, the communication program to be started can recognize only positive and negative words, and the highest recognition rate words are “Yes” and “Yadayo”, a background with a sunset back, placards reading “Yes” and “Yadayo”, and a yellow frame is determined.
[0070]
The agent processing unit 11 then starts this communication program and, as shown in FIG. 14C, displays an image of the selected agent and the determined background (sunset back BK, placards PC, and yellow frame Y) on the display device 27 (step 25). Because this is a response acquisition program (step 26; Y), it acquires the response (step 27). Here, assume that the response “Yes” has been acquired.
Then, as learning of the agent, the response data of program number 00501 is updated (step 28), and the process returns to the main routine.
[0071]
Next, it is newly grasped as the current situation that a positive response to turning on the radio has been acquired by program number 00xxx, and in the agent processing it is determined that there is a communication program, program number 00△△△, that turns on the radio (step 21; Y). Since it is not subject to rest (step 22; N) and is a control program (step 23; Y), the program is started and control that turns on the radio is performed. At this time, the station most frequently selected in Tokyo, the current location, is acquired from the learning item data 292, and the radio is automatically tuned to it.
The audio operation conditions and the operating audio device in the learning item data 292 are then rewritten. In addition, because the driver gives no response that would stop the radio, such as “No” or “Stop”, the recognition of “Yes” is judged to have been correct. The recognition result for “Yes” in the response recognition data 298 is rewritten accordingly, the response recognition rate of “Yes” is recalculated from the rewritten data, and the highest recognition rate word is rewritten as necessary (step 28). The process then returns to the main routine.
[0072]
Next, it is newly grasped as the current situation that the radio is operating, and in the agent processing it is determined that there is a communication program, program number 00▽▽▽, that displays the vehicle status (step 21; Y). Since it is not subject to rest (step 22; N) and is not a control program, a background is selected based on the background selection table.
In this case, because the time zone is 15:00 to 18:00, the radio is operating, and any utterance can be recognized, a background is determined in which the back is a sunset, the belongings are a radio cassette player, and the frame is green.
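The three background determinations in this walkthrough follow the same pattern: the back comes from the time zone, the frame from the recognition state, and the belongings from the placard words or the operating device. The sketch below is a hypothetical reading of that pattern; the function name, state strings, and the "unspecified" back for other time zones are assumptions, since only the 15:00 to 18:00 zone is described here.

```python
from typing import Optional


def determine_background(
    hour: int,
    recognition_state: str,            # "unavailable", "yes_no_only", or "open"
    best_positive: Optional[str] = None,
    best_negative: Optional[str] = None,
    operating_device: Optional[str] = None,
) -> dict:
    """Compose the back, frame, and belongings of the agent's background."""
    # Only the 15:00-18:00 zone appears in the example; others are left open.
    back = "sunset" if 15 <= hour < 18 else "unspecified"
    frame = {"unavailable": "red", "yes_no_only": "yellow", "open": "green"}[
        recognition_state
    ]
    belongings = []
    if recognition_state == "yes_no_only":
        # Placards carry the highest recognition rate word of each group.
        belongings = [f"placard: {best_positive}", f"placard: {best_negative}"]
    elif operating_device == "radio":
        belongings = ["radio cassette player"]
    return {"back": back, "frame": frame, "belongings": belongings}


inquiry = determine_background(17, "unavailable")
waiting = determine_background(17, "yes_no_only", "Yes", "Yadayo")
playing = determine_background(17, "open", operating_device="radio")
```

The three calls correspond to the inquiry, the wait for a positive or negative answer, and the display after the radio has been turned on.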
[0073]
Subsequently, the agent processing unit 11 starts this communication program and displays an image of the selected agent and the determined background (sunset back BK, radio cassette player as belongings, and green frame) on the display device 27 (step 25). Since this program is not a response acquisition program (step 26; N), the process returns directly to the main routine.
[0074]
As described above, according to the present embodiment, the agent's background is displayed in addition to the anthropomorphized agent's image and voice. A screen containing a large amount of information can therefore be displayed without reducing its visibility, much information can be conveyed to the driver, and rich communication suited to the driver and the situation can be established.
According to the present embodiment, by separating information transmitted by the agent from information transmitted by the background, information can be transmitted with better visibility.
According to the present embodiment, the agent's back and the agent's belongings are displayed as the background, and the types of information to be transmitted are classified between them, so that information can be transmitted with better visibility.
[0075]
According to the present embodiment, the anthropomorphized agent acts in a manner suited to the situation and communicates with the driver based not only on the current situation of the vehicle and driver but also on learning results from the past history and the like, which makes the driving environment in the vehicle comfortable.
According to the present embodiment, the background is likewise determined, and information provided, based not only on the current situation of the vehicle and driver but also on learning results from the past history and the like, which makes the driving environment in the vehicle comfortable.
[0076]
According to the present embodiment, the positive and negative words with high recognition rates are acquired by learning and presented on placards as the background. When the driver responds with these words, the agent can respond accurately, enabling efficient and good communication between the driver and the agent.
According to the present embodiment, the agent's voice recognition state is represented by a frame as the background. When the driver responds at a timing that follows this indication, the agent can recognize the response, enabling efficient and good communication between the driver and the agent.
[0077]
The agent device of the present invention is not limited to the above-described embodiment, and can be changed as appropriate without departing from the spirit of the present invention.
For example, in the embodiment described above, image data combining each selectable agent appearance with each background is stored as the image data 294. Instead, the agent appearance image data and the background image data may be stored separately, and the selected appearance and the determined background may be combined when displayed on the display device.
The background is not limited to the above. For example, in addition to a landscape, a musical note pattern or a picture of the operating device may be displayed as the agent's back while the audio is operating.
As further examples, a tree may be displayed as the background and made to sway to the left according to the vehicle speed, or a road may be displayed under the feet of an agent shown in side view. When the vehicle is on an uphill, the road in front of the agent may be raised so that the agent appears to be climbing, and when the vehicle is on a downhill, the road in front of the agent may be lowered so that the agent appears to be descending. Such road conditions are easily overlooked on instruments yet are important for safety, so displaying them with good visibility as a background together with the agent is a great advantage in a vehicle.
[0078]
In this embodiment, whether the response recognition result is correct is determined from the driver's reaction to the control based on that result. However, the present invention is not limited to this. The determination may instead be made from the driver's answer (voice or input operation) to a spoken answerback of the recognition result, or from the driver's answer to an image-based answerback, such as the agent raising the placard of the recognized group.
A background preference learning means may also be provided that acquires the driver's preference for the displayed background from responses to the agent's inquiries or from the situation sensors. A plurality of backgrounds selectable under otherwise identical conditions is prepared, and a background with a high degree of preference is selectively displayed for each driver.
[0079]
[Effect of the invention]
According to the agent device of the present invention, a large amount of information is conveyed to the driver with high visibility by the anthropomorphized agent and the agent's background, so that rich communication suited to the situation of the vehicle can be established.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration for realizing a communication function in an embodiment of the present invention.
FIG. 2 is an explanatory diagram conceptually showing the contents of a program selection table in the embodiment.
FIG. 3 is an explanatory diagram showing an agent's action (action and voice) corresponding to each program number in the embodiment.
FIG. 4 is an explanatory diagram showing a number of screens about the “slow bow” action of the agent displayed on the display device by starting program numbers 00001 to 00002 in the embodiment.
FIG. 5 is an explanatory diagram conceptually showing the contents of learning item data in the embodiment.
FIG. 6 is an explanatory diagram conceptually showing the contents of response data in the embodiment.
FIG. 7 is an explanatory diagram conceptually showing the contents of a background selection table in the embodiment.
FIG. 8 is a diagram showing examples of the agent and the background displayed on the display device in the embodiment, where (a) shows the display in a state where voice recognition is not possible, (b) shows the display in a state where only affirmative and negative words can be recognized, (c) shows a display requesting input of a prefecture, and (d) shows the display in a state where any speech can be recognized.
FIG. 9 is an explanatory diagram conceptually showing the contents of response recognition data in the embodiment.
FIG. 10 is an explanatory diagram conceptually showing the contents of a data file stored in the navigation data storage device in the embodiment.
FIG. 11 is an explanatory diagram showing various sensors constituting the situation sensor unit in the embodiment.
FIG. 12 is a flowchart showing the main operation by the agent in the embodiment.
FIG. 13 is a flowchart showing the operation of an agent process according to the embodiment.
FIG. 14 is an explanatory diagram conceptually showing the contents of specific agent processing after the ignition is turned on in the embodiment.
[Explanation of symbols]
1 Overall processing unit
10 Navigation processing unit
11 Agent processing unit
12 I/F unit
13 Image processing unit
14 Voice control unit
15 Situation information processing unit
21 Current position detection device
22 Input device
23 Storage medium drive
24 Communication control device
25 Audio output device
26 Microphone
27 Display device
28 Imaging device
29 Agent data storage device
30 Navigation data storage device
40 Situation sensor unit

Claims

An image display device for displaying an anthropomorphized agent and a background of the agent;
Action determining means for determining an action of an agent displayed on the image display device;
Background determining means for determining the background displayed on the image display device;
Image display means for causing the image display device to display an agent that performs the action determined by the action determination means and the background determined by the background determination means;
Voice recognition means for recognizing a voice response of the user to the agent's action displayed on the image display device;
Recognition rate calculating means for calculating the speech recognition rate, by the voice recognition means, for each word of an affirmative word group consisting of a plurality of words representing affirmation and of a negative word group consisting of a plurality of words representing negation;
Wherein, when waiting for voice input of a word representing affirmation or a word representing negation as a voice response from the user, the background determining means determines as the background, as the words recommended to the user for voice input from among the speech-recognizable words of the affirmative word group and the negative word group, a placard displaying the word with the highest recognition rate in the affirmative word group and a placard displaying the word with the highest recognition rate in the negative word group,
An agent device characterized by that.
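As an illustration only, the placard selection recited above can be sketched as follows. The sketch assumes that the recognition rate of a word is the fraction of its past recognitions that were judged correct. The function names, the example words, and the counts are hypothetical.

```python
def recognition_rate(correct, attempts):
    """Fraction of attempts judged correct; 0.0 when there is no history."""
    return correct / attempts if attempts else 0.0


def choose_placard_words(affirmative_stats, negative_stats):
    """Return the best-recognized word of each group for the two placards.

    Each argument maps a word to a (correct, attempts) pair.
    """
    def best(stats):
        return max(stats, key=lambda word: recognition_rate(*stats[word]))

    return best(affirmative_stats), best(negative_stats)


# Illustrative history for one user (assumed data).
affirmative = {"yes": (9, 10), "ok": (6, 10), "sure": (0, 0)}
negative = {"no": (7, 10), "nope": (8, 10)}
print(choose_placard_words(affirmative, negative))  # ('yes', 'nope')
```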

An audio output device; and
Voice recognition state determination means for determining whether or not the voice recognition means is in a state where voice recognition cannot be performed because the agent's voice is being output by the audio output device or because a voice recognition result is being determined;
The image display device displays an anthropomorphic agent and a background including a frame displayed at the edge of the display screen of the agent,
The background determination means determines the color of the background frame to be different between the case where it is determined that voice recognition is impossible and the case where it is determined that voice recognition is possible,
The agent device according to claim 1.
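As an illustration only, the frame color determination recited above can be sketched as follows. The claim requires only that the two states use different colors. The specific colors, the enum, and the function names below are assumptions made for the example.

```python
from enum import Enum


class RecognitionState(Enum):
    UNAVAILABLE = "unavailable"  # agent is speaking or a result is being judged
    AVAILABLE = "available"      # the user may speak now


# Hypothetical colors; any two distinct colors would satisfy the condition.
FRAME_COLORS = {
    RecognitionState.UNAVAILABLE: "red",
    RecognitionState.AVAILABLE: "yellow",
}


def recognition_state(agent_speaking, judging_result):
    """Voice recognition is unavailable while either condition holds."""
    if agent_speaking or judging_result:
        return RecognitionState.UNAVAILABLE
    return RecognitionState.AVAILABLE


def frame_color(agent_speaking, judging_result):
    """Color of the frame drawn at the edge of the agent's display screen."""
    return FRAME_COLORS[recognition_state(agent_speaking, judging_result)]


print(frame_color(agent_speaking=True, judging_result=False))   # red
print(frame_color(agent_speaking=False, judging_result=False))  # yellow
```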