JP3677833B2

JP3677833B2 - Navigation device, navigation method, and automobile

Info

Publication number: JP3677833B2
Application number: JP26754795A
Authority: JP
Inventors: 和夫石井; 英二山本; 幸田中; 弘史角田; 康治浅野; 浩明小川; 雅則表; 活樹南野
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1995-10-16
Filing date: 1995-10-16
Publication date: 2005-08-03
Anticipated expiration: 2015-10-16
Also published as: JPH09114491A

Description

【０００１】
【発明の属する技術分野】
本発明は、例えば自動車に搭載させて道路地図などを表示させるナビゲーション装置及びナビゲート方法、並びにこれらの装置が搭載された自動車に関する。
【０００２】
【従来の技術】
従来、自動車などに搭載させるナビゲーション装置が各種開発されている。このナビゲーション装置は、例えば道路地図データが記憶されたＣＤ−ＲＯＭなどの大容量データ記憶手段と、現在位置の検出手段と、検出した現在位置の近傍の道路地図を、データ記憶手段から読出したデータに基づいて表示させるディスプレイ装置とで構成される。この場合、現在位置の検出手段としては、ＧＰＳ（Global Positioning System ）と称される測位用の人工衛星を使用した測位システムを使用したものや、車両の走行方向，走行速度などの情報に基づいて出発地点から現在位置の変化を追跡する自律航法によるものなどがある。
【０００３】
また、ディスプレイ装置に表示される地図としては、キー操作などを行うことで、現在位置の近傍だけでなく、地図データが用意されている限りは、所望の位置の地図を表示させることができるようにしてある。
【０００４】
このようなナビゲーション装置の場合には、例えば自動車用の場合、運転席の近傍にディスプレイ装置を設置して、運転者が走行中や信号停止などの一時停止中に現在位置の近傍の地図を見れるようにするのが一般的である。
【０００５】
【発明が解決しようとする課題】
ところで、このようなナビゲーション装置は、自動車の運転などを邪魔しないで操作できるようにする必要があり、例えば走行中は複雑な操作を禁止するようにしてある。即ち、このようなナビゲーション装置を車両に設置する場合には、何らかの走行状態検出部（例えば自動車のパーキングブレーキスイッチ）と接続して、この検出部の状態により車両が停止していることが検出されるときだけ、全ての操作ができるように設定し、停止してない状態（即ち走行中）には、複雑なキー操作を禁止するように設定してある。
【０００６】
ところが、このように走行中に表示地図を切換える等の操作ができないのは不便であり、走行中であっても、運転を邪魔することなく、高度な操作ができるようにすることが要請されている。
【０００７】
本発明はかかる点に鑑み、自動車の運転などを邪魔することなく、ナビゲーション装置などの各種装置の高度な操作が簡単にできるようにすることを目的とする。
【０００８】
【課題を解決するための手段】
本発明のナビゲーション装置は、音声処理部で認識した認識対象語の候補を表示する映像信号を作成する映像信号作成手段と、この映像信号作成手段で認識対象語の候補の映像信号を作成させる際に、その候補となる認識対象語の区分が、地名を指示する言葉と、指令を行う言葉とで区分し、この区分毎に、異なる表示態様で表示させる映像信号とする表示制御手段とを備えたものである。
【００１３】
本発明のナビゲーション装置によると、認識対象語の候補の表示状態が、候補の対象語の区分毎に異なる表示態様になるので、同じ区分毎の候補が判り易くなり、見やすい表示状態となる。
【００１４】
また本発明のナビゲート方法は、音声認識した認識対象語の候補を表示させる場合に、その候補となる認識対象語の区分が、地名を指示する言葉と、指令を行う言葉とで区分し、この区分毎に、異なる表示態様で表示させるようにしたものである。
【００１５】
本発明のナビゲート方法によると、認識対象語の候補の表示状態が、候補の対象語の区分毎に異なる表示態様になるので、同じ区分毎の候補が判り易くなり、見やすい表示状態となる。
【００１６】
また本発明の自動車は、車内の所定位置に配された表示手段に、入力した音声の認識に基づいて地図を表示させる装置を備えた自動車において、音声処理部で認識した認識対象語の候補を表示する映像信号を作成して表示手段に供給する映像信号作成手段と、この映像信号作成手段で認識対象語の候補の映像信号を作成させる際に、その候補となる認識対象語の区分が、地名を指示する言葉と、指令を行う言葉とで区分し、この区分毎に、異なる表示態様で表示させる映像信号とする表示制御手段とを備えたものである。
【００１７】
本発明の自動車によると、認識対象語の候補の表示状態が、候補の対象語の区分毎に異なる表示態様になるので、同じ区分毎の候補が判り易くなり、見やすい表示状態となる。
【００１８】
【発明の実施の形態】
以下、本発明の一実施例を、添付図面を参照して説明する。
【００１９】
本例においては、自動車に搭載されるナビゲーション装置に適用したもので、まず図２，図３を参照して本例の装置の自動車への設置状態を説明する。図２に示すように、自動車５０は、ハンドル５１が運転席５２の前方に取付けられ、基本的には、運転席５２に着席した運転者がナビゲーション装置の操作を行うようにしたものである。但し、この自動車５０内の他の同乗者が操作する場合もある。そして、ナビゲーション装置の本体２０及びこのナビゲーション装置本体２０に接続された音声認識装置１０は、自動車５０内の任意の空間（例えば後部のトランク内）に設置され、後述する測位信号受信用アンテナ２１が車体の外側（或いはリアウィンドウの内側などの車内）に取付けてある。
【００２０】
そして、図３に運転席の近傍を示すように、ハンドル５１の脇には、後述するトークスイッチ１８やナビゲーション装置の操作キー２７が配置され、これらのスイッチやキーは、運転中に操作されても支障がないように配置してある。また、ナビゲーション装置に接続されたディスプレイ装置４０が、運転者の前方の視界を妨げない位置に配置してある。また、ナビゲーション装置２０内で音声合成された音声信号を出力させるスピーカ３２が、運転者に出力音声が届く位置（例えばディスプレイ装置４０の脇など）に取付けてある。
【００２１】
また、本例のナビゲーション装置は音声入力ができるようにしてあり、そのためのマイクロフォン１１が、運転席５２の前方のフロントガラス上部に配されたサンバイザ５３に取付けてあり、運転席５２に着席した運転者の話し声を拾うようにしてある。
【００２２】
また、本例のナビゲーション装置本体２０は、この自動車のエンジン制御用コンピュータ５４と接続してあり、エンジン制御用コンピュータ５４から車速に比例したパルス信号が供給されるようにしてある。
【００２３】
次に、本例のナビゲーション装置の内部の構成について図１を参照して説明すると、本例においては、音声認識装置１０をナビゲーション装置２０と接続して構成させたもので、音声認識装置１０は、マイクロフォン１１が接続してある。このマイクロフォン１１としては、例えば指向性が比較的狭く設定されて、自動車の運転席に着席した者の話し声だけを良好に拾うようなものを使用し、例えば後述するトークスイッチ１８が押されてオン状態となっている間だけ電源が投入されて音声を拾う動作を行うようにしてある。
【００２４】
そして、このマイクロフォン１１が拾って得た音声信号を、アナログ／デジタル変換器１２に供給し、所定のサンプリング周波数のデジタル音声信号に変換する。そして、このアナログ／デジタル変換器１２が出力するデジタル音声信号を、ＤＳＰ（デジタル・シグナル・プロセッサ）と称される集積回路構成のデジタル音声処理回路１３に供給する。このデジタル音声処理回路１３では、帯域分割，フィルタリングなどの処理で、デジタル音声信号をベクトルデータとし、このベクトルデータを音声認識回路１４に供給する。
【００２５】
この音声認識回路１４には音声認識データ記憶用ＲＯＭ１５が接続され、デジタル音声処理回路１３から供給されるベクトルデータとの所定の音声認識アルゴリズム（例えばＨＭＭ：隠れマルコフモデル）に従った認識動作を行い、ＲＯＭ１５に記憶された音声認識用音韻モデルから候補を複数選定し、その候補の中で最も一致度の高い音韻モデルに対応して記憶された文字データを読出す。なお、本例の音声認識回路１４は、音声認識装置１０内の各部の処理の制御を行う制御手段としても機能するようにしてあり、後述するトークスイッチ１８の操作についても、この音声認識回路１４が判断するようにしてある。
【００２６】
ここで、本例の音声認識データ記憶用ＲＯＭ１５のデータ記憶状態について説明すると、本例の場合には、地名と、ナビゲーション装置の操作を指示する言葉だけを認識するようにしてあり、地名としては、図４に記憶エリアの設定状態を示すように、国内の都道府県と、市区町村の名前だけを登録させてあり、各都道府県と市区町村毎に、その地名の文字コードと、地名を音声認識させるためのデータである音韻モデルが記憶させてある。
【００２７】
なお、例えば日本国内の場合には、全国の市区町村の数は約３５００であり、この約３５００の地名が記憶されることになる。但し、「××町」の地名の場合には、「××マチ」と発音した場合のデータと、「××チョウ」と発音した場合のデータとの双方が記憶させてある。同様に、「××村」の地名の場合には、「××ソン」と発音した場合のデータと、「××ムラ」と発音した場合のデータとの双方が記憶させてある。
【００２８】
また、都道府県の境界に隣接した位置の市区町村などのように、都道府県名を間違えて覚える可能性の高い市区町村名については、間違えやすい都道府県名を付与させて登録させてある。即ち、例えば正しい例である「カナガワケンカワサキシ（神奈川県川崎市）」と登録させると共に、間違った例である隣接した都道府県名を付与させた「トウキョウトカワサキシ（東京都川崎市）」としても登録させる。
【００２９】
また、ナビゲーション装置の操作を指示する言葉としては、「目的地」，「出発地」，「経由地」，「自宅」などの表示位置を指示する言葉や、「今何時」（現在時刻を聞く指令），「今どこ」（現在位置を聞く指令），「次は」（次の交差点を聞く指令），「あとどれくらい」（目的地までの距離を聞く指令），「速度は」（現在速度を聞く指令），「高度は」（現在の高度を聞く指令），「進行方向は」（進行方向を聞く指令），「一覧表」（認識できる指令の一覧表をディスプレイに表示させるための指令）等のその他の各種操作指令を行う言葉の文字コードと、その言葉に対応する音韻モデルが記憶させてある。
【００３０】
そして、音声認識回路１４で、入力ベクトルデータから、所定の音声認識アルゴリズムを経て得られた認識結果に一致する、音韻モデルに対応した文字コードが、地名の文字コードである場合には、この文字コードを、ＲＯＭ１５から読出す。そして、この読出された文字コードを、経緯度変換回路１６に供給する。この経緯度変換回路１６には経緯度変換データ記憶用ＲＯＭ１７が接続され、音声認識回路１４から供給される文字データに対応した経緯度データ及びその付随データをＲＯＭ１７から読出す。
【００３１】
なお、本例の音声認識回路１４には、認識結果を一時的に記憶するメモリ（図示せず）が備えられ、このメモリ内に認識結果を履歴リストとして記憶させるようにしてある。また、認識処理時に、最も一致度が高い音声から順にある程度まで一致する音声についてまでのデータを、候補リストとして記憶させるようにしてある。この履歴リストや候補リストは、記憶されてからある程度の時間が経過すると消去される。
【００３２】
ここで、本例の経緯度変換データ記憶用ＲＯＭ１７のデータ記憶状態について説明すると、本例の場合には、音声認識データ記憶用ＲＯＭ１５に記憶された地名の文字コードと同じ文字コード毎に記憶エリアが設定され、図５に示すように、各文字コード毎に、その文字で示される地名の緯度と経度のデータと、付随するデータとして表示スケールのデータとが記憶させてある。また、音声認識データ記憶用ＲＯＭ１５から読出された文字コードとしては、カタカナによる文字コードとしてあるが、この経緯度変換データ記憶用ＲＯＭ１７には、発音を文字列で示すカタカナによる文字コードの他に、表示用の漢字，平仮名，カタカナ等を使用した文字コードについても記憶させてある。
【００３３】
なお、本例の場合には、地名毎の緯度と経度のデータとしては、その地名で示される地域の役所（市役所，区役所，町役場，村役場）の所在地の絶対位置を示す緯度と経度のデータとしてある。また、付随データとして、表示用の文字コードと表示スケールのデータを、緯度と経度のデータと共に出力するようにしてある。この表示スケールのデータとしては、その地名で示される地域の大きさに応じて設定された表示スケールのデータとしてあり、例えば数段階に表示スケールを指示するデータとしてある。
【００３４】
そして、経緯度変換データ記憶用ＲＯＭ１７から読出された経緯度データ及びその付随データを、音声認識装置１０の出力として出力端子１０ａに供給する。また、音声認識回路１４で一致が検出された入力音声の文字コードのデータを、音声認識装置１０の出力として出力端子１０ｂに供給する。この出力端子１０ａ，１０ｂに得られるデータは、ナビゲーション装置２０に供給する。
【００３５】
なお、本例の音声認識装置１０には、ロックされない開閉スイッチ（即ち押されたときだけオン状態になるスイッチ）であるトークスイッチ１８が接続され、このトークスイッチ１８が少なくとも３００ｍ秒以上継続して押されている間に、マイクロフォン１１が拾った音声信号だけを、アナログ／デジタル変換器１２から経緯度変換回路１６までの回路で上述した処理を行うようにしてある。この音声認識装置１０内での処理は、音声認識回路１４の制御に基づいて行われ、トークスイッチ１８の状態についても、音声認識回路１４が判断するようにしてある。
【００３６】
そして本例においては、音声認識回路１４で所定時間以内（例えば１０秒以内）に、再度入力した音声の認識処理が行われた場合において、このとき認識した音声が、音声認識回路１４内のメモリに記憶された履歴リストに記憶されているとき、この認識音声を履歴リストから削除し、削除された履歴リストの最も高い順位に記憶された音声を、音声認識したと判断するようにしてある。また、このような処理が複数回（例えば５回）連続して行われたときには、候補となる認識音声のデータを候補リストから読出して、ナビゲーション装置２０側に供給し、ナビゲーション装置２０に接続されたディスプレイ装置４０に候補リストを表示させるようにしてある。これらの処理の詳細については、後述する。
【００３７】
また、本例の音声認識装置１０内の音声認識回路１４からは、端子１０ｂを介してナビゲーション装置２０側に上述した文字コード以外の各種制御データについても伝送できるようにしてあり、例えば音声出力処理や地図データの作成処理を中断させる制御データをナビゲーション装置２０側に送ることもある。
【００３８】
次に、音声認識装置１０と接続されたナビゲーション装置２０の構成について説明する。このナビゲーション装置２０は、ＧＰＳ用アンテナ２１を備え、このアンテナ２１が受信したＧＰＳ用衛星からの測位用信号を、現在位置検出回路２２で受信処理し、この受信したデータを解析して、現在位置を検出する。この検出した現在位置のデータとしては、そのときの絶対的な位置である緯度と経度のデータである。
【００３９】
そして、この検出した現在位置のデータを、演算回路２３に供給する。この演算回路２３は、ナビゲーション装置２０による動作を制御するシステムコントローラとして機能する回路で、道路地図データが記憶されたＣＤ−ＲＯＭ（光ディスク）がセットされて、このＣＤ−ＲＯＭの記憶データを読出すＣＤ−ＲＯＭドライバ２４と、データ処理に必要な各種データを記憶するＲＡＭ２５と、このナビゲーション装置が搭載された車両の動きを検出する車速センサ２６と、操作キー２７とが接続させてある。そして、現在位置などの経緯度の座標データが得られたとき、ＣＤ−ＲＯＭドライバ２４にその座標位置の近傍の道路地図データを読出す制御を行う。そして、ＣＤ−ＲＯＭドライバ２４で読出した道路地図データをＲＡＭ２５に一時記憶させ、この記憶された道路地図データを使用して、道路地図を表示させるための表示データを作成する。このときには、自動車内の所定位置に配置された操作キー２７の操作などにより設定された表示スケール（縮尺）で地図を表示させるような表示データとする。
【００４０】
そして、演算回路２３で作成された表示データを、映像信号生成回路２８に供給し、この映像信号生成回路２８で表示データに基づいて所定のフォーマットの映像信号を生成させ、この映像信号を出力端子２０ｃに供給する。
【００４１】
そして、この出力端子２０ｃから出力される映像信号を、ディスプレイ装置４０に供給し、このディスプレイ装置４０で映像信号に基づいた受像処理を行い、ディスプレイ装置４０の表示パネルに道路地図などを表示させる。
【００４２】
そして、このような現在位置の近傍の道路地図を表示させる他に、操作キー２７の操作などで指示された位置の道路地図なども、演算回路２３の制御に基づいて表示できるようにしてある。また、操作キー２７の操作などに基づいて、「目的地」，「出発地」，「経由地」，「自宅」などの特定の座標位置を登録することができるようにしてある。この特定の座標位置を登録した場合には、その登録した座標位置のデータ（経度と緯度のデータ）をＲＡＭ２５に記憶させる。
【００４３】
また、車速センサ２６が自動車の走行を検出したときには、演算回路２３が操作キー２７の操作の内の比較的簡単な操作以外の操作を受け付けないようにしてある。
【００４４】
また、このナビゲーション装置２０は、自律航法部２９を備え、自動車側のエンジン制御用コンピュータ等から供給される車速に対応したパルス信号に基づいて、自動車の正確な走行速度を演算すると共に、自律航法部２９内のジャイロセンサの出力に基づいて進行方向を検出し、速度と進行方向に基づいて決められた位置からの自律航法による現在位置の測位を行う。例えば現在位置検出回路２２で位置検出ができない状態になったとき、最後に現在位置検出回路２２で検出できた位置から、自律航法による測位を行う。
【００４５】
また、演算回路２３には音声合成回路３１が接続させてあり、演算回路２３で音声による何らかの指示が必要な場合には、音声合成回路３１でこの指示する音声の合成処理を実行させ、音声合成回路３１に接続されたスピーカ３２から音声を出力させるようにしてある。例えば、「目的地に近づきました」，「進行方向は左です」などのナビゲーション装置として必要な各種指示を音声で行うようにしてある。また、この音声合成回路３１では、音声認識装置１０で認識した音声を、供給される文字データに基づいて音声合成処理して、スピーカ３２から音声として出力させるようにしてある。その処理については後述する。
【００４６】
ここで、このナビゲーション装置２０は、音声認識装置１０の出力端子１０ａ，１０ｂから出力される経緯度データとその付随データ及び文字コードのデータが供給される入力端子２０ａ，２０ｂを備え、この入力端子２０ａ，２０ｂに得られる経緯度データとその付随データ及び文字コードのデータを、演算回路２３に供給する。
【００４７】
そして、演算回路２３では、この経緯度データなどが音声認識装置１０側から供給されるとき、その経度と緯度の近傍の道路地図データをＣＤ−ＲＯＭドライバ２４でディスクから読出す制御を行う。そして、ＣＤ−ＲＯＭドライバ２４で読出した道路地図データをＲＡＭ２５に一時記憶させ、この記憶された道路地図データを使用して、道路地図を表示させるための表示データを作成する。このときには、供給される経度と緯度が中心に表示される表示データとすると共に、経緯度データに付随する表示スケールで指示されたスケール（縮尺）で地図を表示させるような表示データとする。
【００４８】
そして、この表示データに基づいて、映像信号生成回路２８で映像信号を生成させ、ディスプレイ装置４０に、音声認識装置１０から指示された座標位置の道路地図を表示させる。
【００４９】
また、音声認識装置１０の出力端子１０ｂからナビゲーション装置の操作を指示する言葉の文字コードが供給される場合には、その操作を指示する言葉の文字コードを演算回路２３で判別すると、対応した制御を演算回路２３が行うようにしてある。この場合、「目的地」，「出発地」，「経由地」，「自宅」などの表示位置を指示する言葉の文字コードである場合には、この表示位置の座標がＲＡＭ２５に登録されているか否か判断した後、登録されている場合には、その位置の近傍の道路地図データをＣＤ−ＲＯＭドライバ２４でディスクから読出す制御を行う。
【００５０】
また、演算回路２３に音声認識装置１０から、認識した音声の発音を示す文字コードのデータが供給されるときには、その文字コードで示される言葉を、音声合成回路３１で合成処理させ、音声合成回路３１に接続されたスピーカ３２から音声として出力させるようにしてある。例えば、音声認識装置１０側で「トウキョウトブンキョウク（東京都文京区）」と音声認識したとき、この認識した発音の文字列のデータに基づいて「トウキョウトブンキョウク」と発音させる音声信号を生成させる合成処理を、音声合成回路３１で行い、その生成された音声信号をスピーカ３２から出力させる。
【００５１】
この場合、本例においては音声認識装置１０で音声認識を行った場合に、ナビゲーション装置２０の端子２０ａに経度，緯度のデータが供給されるのと、端子２０ｂに認識した音声の発音を示す文字コードのデータが供給されるのが、ほぼ同時であるが、演算回路２３では最初に音声合成回路３１で認識した言葉を音声合成させる処理を実行させ、次に経度，緯度のデータに基づいた道路地図の表示データの作成処理を実行させるようにしてある。
【００５２】
次に、本例の音声認識装置１０とナビゲーション装置２０を使用して、道路地図表示などを行う場合の動作を説明する。まず、音声認識装置１０での音声認識動作を、図６のフローチャートに示すと、最初にトークスイッチ１８がオンか否か判断し（ステップ１０１）、このトークスイッチ１８がオンとなったことを判別した場合には、そのオンとなった期間にマイクロフォン１１が拾った音声信号を、アナログ／デジタル変換器１２でサンプリングさせ、デジタル音声処理回路１３で処理させて、ベクトルデータ化させる（ステップ１０２）。そして、このベクトルデータに基づいて音声認識回路１４で音声認識処理させる（ステップ１０３）。
【００５３】
ここで、音声認識データ記憶用ＲＯＭ１５に記憶された地名（即ち予め登録された地名）の音声を認識したか否か判断し（ステップ１０４）、登録された地名の音声を認識した場合には、認識した地名を発音させるための文字データをＲＯＭ１５から読出して出力端子１０ｂから出力させる（ステップ１０５）と共に、認識した地名の経度，緯度のデータを経緯度変換回路１６に接続された経緯度変換データ記憶用ＲＯＭ１７から読出す（ステップ１０６）。ここでの地名の音声認識としては、本例のＲＯＭ１５に登録された地名が、国内の都道府県と、市区町村の名前であるので、例えば「××県 ××市」と言う音声や、「××市 ××区」（ここでは区の場合には都道府県を省略しても認識できるようにしてある）と言う音声を認識する。
【００５４】
そして、認識した音声に基づいて読出した経度，緯度のデータと付随データとを、出力端子１０ａから出力させる（ステップ１０７）。
【００５５】
そして、ステップ１０４で、登録された地名の音声を認識できなかった場合には、地名以外の登録された特定の音声を認識したか否か判断する（ステップ１０８）。ここで、地名以外の登録された特定の音声を認識した場合には、識別した音声に対応した文字コードを判別し（ステップ１０９）、その判別した文字コードを出力端子１０ｂから出力させる（ステップ１１０）。
【００５６】
また、ステップ１０８で地名以外の登録された特定の音声も認識できなかった場合には、このときの処理を終了する。或いは、音声認識できなかったことを、ナビゲーション装置２０側に指示し、音声合成回路３１での音声合成又はディスプレイ装置４０で表示される文字などで警告する。
【００５７】
次に、ナビゲーション装置２０側での動作を、図７のフローチャートに示すと、まず演算回路２３では現在位置の表示モードが設定されているか否か判断する（ステップ２０１）。そして、現在位置の表示モードが設定されていると判断したときには、現在位置検出回路２２で現在位置の測位を実行させ（ステップ２０２）、その測位した現在位置の近傍の道路地図データをＣＤ−ＲＯＭから読出させ（ステップ２０３）、その読出した道路地図データに基づいた道路地図の表示処理を行い、ディスプレイ装置４０に対応した座標位置の道路地図を表示させる（ステップ２０４）。
【００５８】
そして、ステップ２０１で現在位置の表示モードが設定されてないと判断したとき、或いはステップ２０４での現在位置の道路地図の表示処理が終了し、その道路地図が表示された状態となっているときに、音声認識装置１０から入力端子２０ａ，２０ｂを介して経度，緯度データなどが供給されるか否か判断する（ステップ２０５）。ここで、経度，緯度データとそれに付随する文字データなどが供給されたことを判別したときには、まず端子２０ｂを介して供給される発音用の文字コードを音声合成回路３１に供給して、音声認識装置１０で認識した音声を音声合成させてスピーカ３２から出力させる（ステップ２０６）。続いて、経度，緯度データで示される位置の近傍の道路地図データをＣＤ−ＲＯＭから読出させ（ステップ２０７）、その読出した道路地図データに基づいた道路地図の表示処理を行い、ディスプレイ装置４０に対応した座標位置の道路地図を表示させる（ステップ２０８）。
【００５９】
そして、ステップ２０５で音声認識装置１０から経度，緯度データが供給されないと判断したとき、或いはステップ２０８での指定された地名の道路地図の表示処理が終了し、その道路地図が表示された状態となっているときに、音声認識装置１０から入力端子２０ｂを介して表示位置を直接指示する文字コードが供給されるか否か判断する（ステップ２０９）。そして、端子２０ｂから文字コードが供給されたと判断したときには、その文字コードを音声合成回路３１に供給して、音声認識装置１０で認識した音声をスピーカ３２から出力させる（ステップ２１０）。そして次に、ステップ２０９で表示位置を直接指示する文字コード（即ち「目的地」，「出発地」，「経由地」，「自宅」などの言葉）を判別したときには、これらの文字で指示された座標位置がＲＡＭ２５に登録されているか否か判断し（ステップ２１１）、登録されている場合には、その登録された座標位置である経度，緯度データで示される位置の近傍の道路地図データをＣＤ−ＲＯＭから読出させ（ステップ２１２）、その読出した道路地図データに基づいた道路地図の表示処理を行い、ディスプレイ装置４０に対応した座標位置の道路地図を表示させ（ステップ２１３）、この表示が行われた状態で、ステップ２０１の判断に戻る。
【００６０】
そして、ステップ２０９で表示位置を直接指示する文字コードが音声認識装置１０から供給されないと判断したときには、操作キー２７の操作により、表示位置を指定する操作があるか否か演算回路２３で判断する（ステップ２１４）。そして、この表示位置を指定する操作がある場合には、車速センサ２６の検出データを判断して、現在車両が走行中か否か判断する（ステップ２１５）。そして、走行中であると演算回路２３が判断したときには、そのときの操作を無効とし、ステップ２０１の判断に戻る（このとき何らかの警告を行うようにしても良い）。
【００６１】
そして、車両が走行中でないと判断したときに、ステップ２１１に移り、登録された座標があるか否か判断した後、登録された座標位置がある場合には、その位置の道路地図の表示処理（ステップ２１２，２１３）を行った後、ステップ２０１の判断に戻る。
【００６２】
そして、ステップ２１１で「目的地」，「出発地」，「経由地」，「自宅」などの対応した位置の座標の登録がない場合には、音声合成回路３１での音声合成又はディスプレイ装置４０での文字表示で、未登録を警告させ（ステップ２１６）、ステップ２０１の判断に戻る。
【００６３】
なお、この図７のフローチャートでは、地図表示に関係する処理について説明したが、音声認識装置１０側から地図表示以外の操作を指示する音声を認識した結果による文字コードが供給される場合には、演算回路２３の制御に基づいて、対応した処理を行うようにしてある。例えば、「イマナンジ」などと認識して文字コードが供給されるとき、演算回路２３の制御に基づいて、現在時刻を発音させる音声を音声合成回路３１で合成させてスピーカ３２から出力させるようにしてある。その他の指令についても、回答の音声を音声合成回路３１で合成させてスピーカ３２から出力させるか、或いは該当する表示をディスプレイ装置４０で行うように処理する。
【００６４】
以上のように処理されることで、音声入力により表示位置を全国どこでも自由に設定することができ、簡単に所望の位置の道路地図を表示させることができる。即ち、例えば操作者がトークスイッチ１８を押しながら、マイクロフォン１１に向かって「××県 ××市」や「××市 ××区」と話すだけで、その音声が認識されて、その地域の道路地図が表示されるので、キー操作で位置の指示などを行う必要がなく、例えばキー操作が困難な状況であっても、ナビゲーション装置の操作ができる。この場合、本例においては音声認識装置１０で認識する地名の音声を、国内の都道府県と、市区町村の名前に限定したので、認識する音声の数が比較的少ない数（約３５００）に制限され、音声認識装置１０内の音声認識回路１４で比較的少ない処理量による短時間での音声認識処理で、地名を認識でき、入力した音声により指示された地図が表示されるまでの時間を短縮することができると共に、認識する地名の数が限定されることで、認識率自体も向上する。
【００６５】
ここで本例においては、以上説明した音声入力があって認識処理が行われた後に、再度音声入力があったとき、その認識処理時に過去の認識結果を参照するようにしてある。以下、その処理を図８のフローチャートに示す。
【００６６】
まず、前回の音声認識処理から充分な時間（例えば数分）が経過している場合には、音声認識回路１４内の履歴リストをクリアし（ステップ４０１）、その後発話が開始、即ちトークスイッチ１８がオン状態になったか否か判断し（ステップ４０２）、発話が開始されたと判断すると、前回の発話から所定時間Ｔｈ（ここでは１０秒）が経過しているか否か判断し（ステップ４０３）、経過している場合には音声認識回路１４内の履歴リストをクリアする（ステップ４０４）。そして、前回の発話から所定時間Ｔｈが経過してない場合には、履歴リストをクリアしない。
【００６７】
そして次に、音声認識回路１４の制御に基づいて、入力された音声の認識処理を行う（ステップ４０５）。そして、この認識結果で得られた候補の音声データと、履歴リストにある音声データとを照合し、履歴リストに同じデータがある場合には、そのデータを認識された候補の中から削除する（ステップ４０６）。そして次に、履歴リストの項目数がＮ個（ここでは５個）以上か否か判断する（ステップ４０７）。そして、Ｎ個以上でない場合（即ちＮ回連続して発話がされてない場合）には、ステップ４０８に移って、このときの残りの候補のデータの中で、最も認識度（一致度）が高かったデータを、認識された結果として、ナビゲーション装置２０の音声合成回路３１に供給し、スピーカ３２から音声として出力させる。そして、この認識された結果が地域を示す音声（即ち本例の場合には都道府県名及び市区町村名）である場合には、その市区町村を表示させる地図を、ナビゲーション装置２０内での処理でディスプレイ装置４０に表示させる（ステップ４０９）。そして、このとき認識された結果を、履歴リストに追加し（ステップ４１０）、ステップ４０２に戻り、次の発話開始まで待機する。
【００６８】
そして、ステップ４０７で履歴リストの項目数がＮ個であると判断された場合（即ちＮ回連続して発話がされた場合）には、ステップ４１１に移って、候補リストの表示処理を行う。即ち、ここまでの認識処理で認識された候補のデータを、音声認識回路１４内の候補リスト用メモリから読出し、このデータをナビゲーション装置２０に供給して、ナビゲーション装置２０内の映像信号生成回路２８で候補リストの映像信号を生成させ、その映像信号をディスプレイ装置４０に供給して、候補リストをディスプレイ装置４０に表示させる。
【００６９】
このときの候補リストは、例えば図９に示すように表示される。即ち、最も一致度が高かった順に、一位の候補から五位程度までの候補まで表示させる（スクロール操作などでより下位の候補まで表示させるようにしても良い）。このとき、地名の候補と、コマンドの候補とは異なる態様で表示する（例えば文字の表示色を変える）ようにしてある。図９の例では、字体を変えて表示させてある。
【００７０】
そして、この候補リストが表示された最初の段階では、このリスト内の候補の内の一位の候補に、選択されたことを示す印ａを付与するようにしてある。この選択する候補を示す印ａは、操作キー２７の操作によるスクロール操作で、移動させることができるが、次にこのスクロール操作が行われたか否か判断する（ステップ４１２）。ここで、スクロール操作が行われた場合には、選択される候補に付与する印ａの位置を移動させる（ステップ４１３）。
【００７１】
この状態で、操作キー２７の中の決定用のボタンが押されたか否か判断する（ステップ４１４）。この決定用のボタンが押されたと判断したときには、そのとき印ａで示された候補が選択されたと判断し、その候補に関するデータ（経緯度のデータ，音声出力用の文字データなど）の読出しを音声認識装置１０側に指示し、その読出されたデータをナビゲーション装置２０側に供給させる。そして、その供給されたデータに基づいて、音声合成回路３１で、音声合成処理を行って、地名をスピーカ３２から音声として出力させる（ステップ４１５）。そして、供給された経緯度のデータに基づいて、該当する位置の道路地図を表示させる映像信号を作成させ、ディスプレイ装置４０に選択された候補の地図を表示させ（ステップ４１６）。そして、このとき選択された結果を、履歴リストに追加し（ステップ４１７）、ステップ４０２に戻り、次の発話開始まで待機する。
【００７２】
そして、ステップ４１４で決定用のボタンが押さないと判断された場合には、その後発話が開始、即ちトークスイッチ１８がオン状態になったか否か判断し（ステップ４１８）、発話が開始されたと判断すると、候補リストの表示を中止させて、ステップ４０３の処理に戻る。そして、ステップ４１８で発話が開始されないと判断した場合には、ステップ４１１での候補リストの表示が開始されてから、所定時間Ｔｄ（このＴｄは例えば１０秒程度の時間）が経過したか否か判断し（ステップ４１９）、この時間Ｔｄが経過してない場合には、ステップ４１２の処理に戻り、候補リストが表示された状態を継続させる。そして、ステップ４１９で所定時間Ｔｄが経過したと判断したときには、ステップ４１２でスクロール操作が行われたか否か判断し（ステップ４２０）、スクロール操作が行われた場合には、ステップ４１２の処理に戻り、候補リストが表示された状態を継続させる。
【００７３】
そして、ステップ４２０でスクロール操作が行われてないと判断したときには、ステップ４０８に移って、候補リストの一位の結果を音声で出力させ、この一位の地名の地図を表示させる。
【００７４】
このように制御されることで、発話を一定時間内（例えば１０秒以内）に続けて行われたときには、言い直されたと見なされて、前回の認識結果の一位候補が認識対象語から外れることになり、言い直しても間違った地名が再度認識されて、所望の地名が認識されない事故を防止できる。例えば、似た地名として「横浜市神奈川区」と「横浜市金沢区」が存在するが、音声入力をした者が「横浜市神奈川区」と話した場合に、「横浜市金沢区」と誤認識されたとする。このとき、同じ発音を繰り返すことで、なにも対処しない場合には再度「横浜市金沢区」と誤認識される可能性が高いが、ここでは二回目の音声入力時には履歴リストに「横浜市金沢区」の発音が既にあるので、この「横浜市金沢区」が認識対象語から外れることになる。そして、二位の候補に「横浜市神奈川区」があったとき、この「横浜市神奈川区」が一位の候補に繰り上がることになり、「横浜市神奈川区」が認識されたと判断され、結果として言い直した場合には誤認識が防止されたことになり、それだけ認識率を向上させることができる。
【００７５】
そして、短時間に所定回（ここでは５回）繰り返し音声入力があった場合には、このときの連続的な入力音声信号により認識された認識対象語を、認識度が高い順に一覧表示され、そのときの認識状態が容易に判断できるようになると共に、その一覧表示された中から言葉を選択できるので、音声入力による認識が困難な場合の対処が簡単な操作で容易にできるようになる。
【００７６】
そして本例においては、このときの認識対象語の候補の一覧表示として、その認識対象語が、地名の音声の場合の表示状態（図９では通常の文字による表示）と、何らかの指令などのコマンドの場合の表示状態（図９では白抜きの文字による表示）とを変えるようにしたので、それぞれの種類の音声が迅速に表示から判断できるようになる。なお、図９の例では文字の状態を変えるようにしたが、例えば地名の候補の場合の文字（又は文字の周囲）の表示色と、コマンドの候補の場合の文字（又は文字の周囲）の表示色とを変えるようにしても良い。
【００７７】
また、このように地名とコマンドで表示状態を変える他に、地名を地域毎に区分分けして、その区分毎に表示状態を変えるようにしても良い。即ち、例えば都道府県毎に表示色を変えたり、或いは関東地方，東北地方のような地域毎に表示色を変えるようにしても良い。
【００７８】
なお、図８のフローチャートでは、選択された候補が地名であり、その地名に基づいて地図表示が行われる場合について説明したが、選択された候補が何らかの指令（コマンド）である場合には、地図表示の代わりに対応した指令を実行させるものである。
【００７９】
また、候補リストを図９に示すように一覧表示させた場合には、この一覧表示された認識対象語を、音声合成回路３１での音声合成処理で、順にスピーカ３２から音声として出力させるようにしても良い。このようにすることで、ディスプレイ装置４０の表示を見なくても、認識対象語の候補が判り、ナビゲーション装置としての使い勝手が向上する。
【００８０】
なお、上述実施例では音声認識装置で認識する地名を、国内の都道府県と、市区町村の名前に限定したが、より細かい地名や目標物の名前などまで認識するようにしても良い。但し、認識できる地名などを多くすると、それだけ音声認識に必要な処理量と処理時間が多く必要になり、認識率を高くするためからも、市区町村の名前程度に限定するのが最も好ましい。
【００８１】
また、上述実施例では各地名毎の中心の座標を、その地域の役所（市役所，区役所，町役場，村役場）の所在地の絶対位置を示す緯度と経度のデータとしたが、その他の位置を示す緯度と経度のデータとしても良い。例えば、単純にその地域（市区町村）の中心の緯度と経度のデータとしても良い。
【００８２】
また、このように中心の緯度と経度のデータを記憶させる代わりに、その地域の東西南北の端部の座標位置のデータを記憶させるようにしても良い。この場合には、東西の経度と南北の緯度の４つのデータがあれば良い。
【００８３】
また、上述実施例では音声認識装置内の音声認識回路１４で、認識した音声を文字コードに変換してから、この文字コードを経緯度変換回路１６で経度，緯度のデータに変換するようにしたが、認識した音声より直接経度，緯度のデータに変換するようにしても良い。また、このように直接経度，緯度のデータに変換させない場合でも、これらの変換データを記憶するＲＯＭ１５とＲＯＭ１７は、同一のメモリで構成させて、例えば地名の記憶エリアを共用するようにしても良い。
【００８４】
また、上述実施例ではＧＰＳと称される測位システムを使用したナビゲーション装置に適用したが、他の測位システムによるナビゲーション装置にも適用できることは勿論である。
【００８５】
【発明の効果】
本発明のナビゲーション装置によると、認識対象語の候補の表示状態が、候補の対象語の区分毎に異なる表示態様になるので、同じ区分毎の候補が判り易くなり、見やすい表示状態となる。例えば、地図を表示させるための地名の表示と、動作などを指示するためのコマンドの表示とを、異なる態様で表示することで、認識対象語の候補の表示から、地名やコマンドなどの必要とする候補を探すことが容易にできるようになり、ナビゲーション装置としての使い勝手が向上する。
【００８８】
また本発明のナビゲート方法によると、認識対象語の候補の表示状態が、候補の対象語の区分毎に異なる表示態様になるので、同じ区分毎の候補が判り易くなり、見やすい表示状態となる。例えば、地図を表示させるための地名の表示と、動作などを指示するためのコマンドの表示とを、異なる態様で表示することで、認識対象語の候補の表示から、地名やコマンドなどの必要とする候補を探すことが容易にできるようになり、ナビゲーション装置としての使い勝手が向上する。
【００８９】
また本発明の自動車によると、認識対象語の候補の表示状態が、候補の対象語の区分毎に異なる表示態様になるので、同じ区分毎の候補が判り易くなり、見やすい表示状態となり、例えば自動車の運転状況などにより表示を長時間見るのが困難な場合でも、必要とする候補を探すことが容易にできるようになり、自動車の運転の安全性を確保した上での良好な操作が可能になる。
【図面の簡単な説明】
【図１】本発明の一実施例を示す構成図である。
【図２】一実施例の装置を自動車に組み込んだ状態を示す斜視図である。
【図３】一実施例の装置を自動車に組み込んだ場合の運転席の近傍を示す斜視図である。
【図４】一実施例による音声認識用メモリの記憶エリア構成を示す説明図である。
【図５】一実施例による経緯度変換用メモリの記憶エリア構成を示す説明図である。
【図６】一実施例の音声認識による処理を示すフローチャートである。
【図７】一実施例のナビゲーション装置での表示処理を示すフローチャートである。
【図８】一実施例の音声認識を複数回実行したときの処理を示すフローチャートである。
【図９】一実施例による候補リストの表示例を示す説明図である。
【符号の説明】
１０音声認識装置
１１マイクロフォン
１２アナログ／デジタル変換器
１３デジタル音声処理回路（ＤＳＰ）
１４音声認識回路
１５音声認識データ記憶用ＲＯＭ
１６経緯度変換回路
１７経緯度変換データ記憶用ＲＯＭ
１８トークスイッチ
２０ナビゲーション装置
２３演算回路
２４ＣＤ−ＲＯＭドライバ
２５ＲＡＭ
２６車速センサ
２７操作キー
２８映像信号生成回路
３１音声合成回路
３２スピーカ
４０ディスプレイ装置
５０自動車
[0001]
[Technical field to which the invention belongs]
The present invention relates to a navigation device and a navigation method that are mounted on, for example, an automobile and display a road map or the like, and to an automobile equipped with such a device.
[0002]
[Prior art]
Conventionally, various navigation devices to be mounted on automobiles and the like have been developed. Such a navigation device comprises, for example, large-capacity data storage means such as a CD-ROM storing road map data, current position detection means, and a display device that displays a road map of the vicinity of the detected current position based on data read from the data storage means. The current position detection means may use a positioning system based on positioning satellites, called GPS (Global Positioning System), or autonomous navigation, which tracks the change in position from the starting point based on information such as the traveling direction and traveling speed of the vehicle.
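The autonomous navigation mentioned above can be illustrated with a minimal dead-reckoning sketch. The function names, the flat east/north coordinate frame in meters, and the convention that heading is measured clockwise from north are assumptions made for this illustration; the specification does not state them.

```python
import math

def dead_reckon(east_m, north_m, speed_mps, heading_deg, dt_s):
    """Advance a position by one step of known speed and heading.

    heading_deg is measured clockwise from north (an assumption of this sketch).
    """
    distance = speed_mps * dt_s
    heading = math.radians(heading_deg)
    return (east_m + distance * math.sin(heading),
            north_m + distance * math.cos(heading))

def track(start, steps):
    """Accumulate (speed, heading, duration) steps from a starting point."""
    east, north = start
    for speed_mps, heading_deg, dt_s in steps:
        east, north = dead_reckon(east, north, speed_mps, heading_deg, dt_s)
    return east, north
```

Each step adds only a displacement, so errors in speed or heading accumulate over time; this is why such a method is typically started from a known position.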
[0003]
Moreover, by key operations and the like, the display device can show a map of any desired position, not only the vicinity of the current position, as long as map data for it is available.
[0004]
With such a navigation device, for example one for an automobile, the display device is generally installed near the driver's seat so that the driver can see a map of the vicinity of the current position while driving or during a temporary stop such as at a traffic light.
[0005]
[Problems to be solved by the invention]
Such a navigation device must be operable without interfering with the driving of the automobile, and complicated operations are therefore prohibited while traveling, for example. That is, when such a navigation device is installed in a vehicle, it is connected to some kind of traveling state detection unit (for example, the parking brake switch of the automobile) and set so that all operations are available only when the state of that unit indicates that the vehicle is stopped, while complicated key operations are prohibited when the vehicle is not stopped (that is, while it is traveling).
[0006]
However, it is inconvenient that operations such as switching the displayed map cannot be performed while traveling, and there is a demand for advanced operations to be possible, without disturbing driving, even while traveling.
[0007]
In view of the above, an object of the present invention is to make it possible to easily perform advanced operations of various devices such as a navigation device without interfering with the driving of an automobile.
[0008]
[Means for Solving the Problems]
The navigation device of the present invention comprises video signal creation means that creates a video signal for displaying candidates for recognition target words recognized by a speech processing unit, and display control means that, when the video signal creation means creates the video signal of the candidates, classifies the candidate recognition target words into words indicating place names and words giving commands, and produces a video signal that displays each category in a different display mode.
[0013]
According to the navigation device of the present invention, since the display state of the recognition target word candidates is different for each category of the candidate target words, the candidates for the same category are easy to understand and the display state is easy to see.
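As a minimal sketch of this idea, the candidates can be tagged with a category and rendered with a style chosen per category. The names `Category`, `STYLE`, and `render_candidates` and the text-based style tags are hypothetical; the "normal" versus "outline" distinction follows the FIG. 9 example described elsewhere in the specification, and an actual device would produce a video signal rather than strings.

```python
from enum import Enum

class Category(Enum):
    PLACE = "place"      # word indicating a place name
    COMMAND = "command"  # word giving a command

# One display mode per category (illustrative tags only).
STYLE = {Category.PLACE: "normal", Category.COMMAND: "outline"}

def render_candidates(candidates):
    """Return one display line per candidate, best match first.

    candidates: list of (text, Category) pairs already ordered by match.
    """
    lines = []
    for rank, (text, category) in enumerate(candidates, start=1):
        lines.append(f"{rank}. [{STYLE[category]}] {text}")
    return lines
```

Changing the display color instead of the typeface would only require different values in `STYLE`.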
[0014]
In the navigation method of the present invention, when candidates for recognition target words obtained by speech recognition are displayed, the candidate recognition target words are classified into words indicating place names and words giving commands, and each category is displayed in a different display mode.
[0015]
According to the navigation method of the present invention, since the display state of the recognition target word candidates is different for each candidate target word category, the candidates for the same category are easily understood and the display state is easy to see.
[0016]
The automobile of the present invention is an automobile provided with a device that displays a map, based on recognition of input speech, on display means arranged at a predetermined position in the vehicle. It comprises video signal creation means that creates a video signal for displaying candidates for recognition target words recognized by a speech processing unit and supplies it to the display means, and display control means that, when the video signal creation means creates the video signal of the candidates, classifies the candidate recognition target words into words indicating place names and words giving commands, and produces a video signal that displays each category in a different display mode.
[0017]
According to the automobile of the present invention, since the display state of the recognition target word candidate is different for each category of the candidate target word, the candidates for the same category are easily understood and the display state is easy to see.
[0018]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of the present invention will be described with reference to the accompanying drawings.
[0019]
In this example, the present invention is applied to a navigation device mounted on an automobile. First, how the device of this example is installed in the automobile will be described with reference to FIGS. 2 and 3. As shown in FIG. 2, the automobile 50 has a steering wheel 51 mounted in front of a driver's seat 52, and basically the driver seated in the driver's seat 52 operates the navigation device. Other passengers in the automobile 50 may, however, also operate it. The navigation device main body 20 and the voice recognition device 10 connected to it are installed in any available space in the automobile 50 (for example, in the rear trunk), and a positioning signal receiving antenna 21, described later, is attached to the outside of the vehicle body (or inside the vehicle, for example on the inner side of the rear window).
[0020]
As FIG. 3 shows for the vicinity of the driver's seat, a talk switch 18 and operation keys 27 of the navigation device, both described later, are arranged beside the steering wheel 51, positioned so that operating them while driving causes no hindrance. A display device 40 connected to the navigation device is arranged at a position that does not obstruct the driver's forward view. A speaker 32, which outputs the voice signal synthesized in the navigation device 20, is attached at a position from which the output sound reaches the driver (for example, beside the display device 40).
[0021]
In addition, the navigation device of this example accepts voice input. The microphone 11 for this purpose is attached to a sun visor 53 disposed at the upper part of the windshield in front of the driver's seat 52, and picks up the voice of the driver seated in the driver's seat 52.
[0022]
The navigation device body 20 of this example is connected to the engine control computer 54 of the automobile, and a pulse signal proportional to the vehicle speed is supplied from the engine control computer 54.
[0023]
Next, the internal configuration of the navigation device of this example will be described with reference to FIG. 1. In this example, the voice recognition device 10 is connected to the navigation device 20, and the microphone 11 is connected to the voice recognition device 10. The microphone 11 is, for example, one whose directivity is set relatively narrow so that it picks up well only the voice of the person seated in the driver's seat of the automobile. It is powered and picks up sound only while, for example, the talk switch 18 described later is pressed and in the on state.
[0024]
The audio signal picked up by the microphone 11 is supplied to the analog / digital converter 12 and converted into a digital audio signal having a predetermined sampling frequency. The digital audio signal output from the analog / digital converter 12 is supplied to a digital audio processing circuit 13 having an integrated circuit configuration called a DSP (digital signal processor). In the digital voice processing circuit 13, the digital voice signal is converted into vector data by processing such as band division and filtering, and this vector data is supplied to the voice recognition circuit 14.
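The specification does not say how band division turns the digital audio signal into vector data. As one possible illustration only, the sketch below computes the energy of a frame in a few equal-width frequency bands using a naive discrete Fourier transform; the function name, the number of bands, and the use of band energies as the vector are all assumptions of this sketch.

```python
import cmath
import math

def band_energies(frame, n_bands=4):
    """Return the energy of `frame` in n_bands equal-width frequency bands.

    A naive O(n^2) DFT is used for clarity; only bins below the Nyquist
    frequency are considered.
    """
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        power.append(abs(acc) ** 2)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]
```

A sequence of such vectors, one per short frame, is the kind of input an HMM-based recognizer could compare against stored models.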
[0025]
A speech recognition data storage ROM 15 is connected to the speech recognition circuit 14. The circuit performs a recognition operation on the vector data supplied from the digital speech processing circuit 13 according to a predetermined speech recognition algorithm (for example, HMM: hidden Markov model), selects a plurality of candidates from the phoneme models for speech recognition stored in the ROM 15, and reads out the character data stored for the phoneme model with the highest degree of match among those candidates. The voice recognition circuit 14 of this example also functions as control means that controls the processing of each unit in the voice recognition device 10, and it is also this circuit that judges the operation of the talk switch 18 described later.
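The selection step can be sketched as a ranking over match scores. The scoring itself (for example by HMMs) is outside this sketch; `scores` is assumed to already map each stored reading to a degree of match, and the names `Candidate`, `rank_candidates`, and `best_reading` are hypothetical.

```python
from typing import NamedTuple, Optional

class Candidate(NamedTuple):
    reading: str   # character data stored with the phoneme model
    score: float   # degree of match reported by the recognizer (higher is better)

def rank_candidates(scores: dict, limit: int = 5) -> list:
    """Return up to `limit` candidates ordered from best to worst match."""
    ranked = sorted((Candidate(r, s) for r, s in scores.items()),
                    key=lambda c: c.score, reverse=True)
    return ranked[:limit]

def best_reading(scores: dict) -> Optional[str]:
    """Return the reading with the highest degree of match, if any."""
    ranked = rank_candidates(scores)
    return ranked[0].reading if ranked else None
```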
[0026]
Here, the data storage state of the voice recognition data storage ROM 15 of this example will be described. In this example, only place names and words that instruct operation of the navigation device are recognized. As place names, only the names of the prefectures and municipalities (cities, wards, towns, and villages) of Japan are registered, as shown by the storage area layout in FIG. 4, and for each prefecture and municipality the character code of the place name and a phoneme model, which is the data used to recognize the spoken place name, are stored.
[0027]
For example, in Japan the number of municipalities nationwide is about 3,500, and about 3,500 place names are stored. For a place name written “XX町” (town), however, both the data for the pronunciation “XX machi” and the data for the pronunciation “XX chō” are stored. Similarly, for a place name written “XX村” (village), both the data for the pronunciation “XX son” and the data for the pronunciation “XX mura” are stored.
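Registering both readings of the town and village suffixes can be sketched as follows. The table and function names are hypothetical, and the stem “××” is the placeholder used in the original text rather than a real place name.

```python
# Readings registered for each administrative suffix. Towns (町) and
# villages (村) get both possible readings, as described above.
SUFFIX_READINGS = {
    "市": ("シ",),
    "区": ("ク",),
    "町": ("マチ", "チョウ"),
    "村": ("ムラ", "ソン"),
}

def reading_variants(stem_reading: str, suffix: str) -> list:
    """Return every reading to register for a stem plus its suffix."""
    return [stem_reading + ending for ending in SUFFIX_READINGS[suffix]]
```

Every variant would point at the same place name, so either pronunciation leads to the same map.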
[0028]
In addition, for municipality names whose prefecture is likely to be remembered incorrectly, such as municipalities located next to a prefectural border, an entry with the easily mistaken prefecture name attached is also registered. For example, the correct form “Kanagawa-ken Kawasaki-shi” (Kawasaki City, Kanagawa Prefecture) is registered, and the incorrect form with the neighboring prefecture's name attached, “Tōkyō-to Kawasaki-shi” (Kawasaki City, Tokyo), is registered as well.
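One way to picture this double registration is an alias table in which the mistaken form resolves to the correct entry. The function name and the dictionary layout are assumptions of this sketch; the readings are the katakana forms given in the original text.

```python
def register_place(table, prefecture_reading, city_reading,
                   confusable_prefectures=()):
    """Register a place under its correct prefecture and under any
    neighboring prefecture that speakers are likely to say by mistake.

    Every key maps to the canonical (correct) reading.
    """
    canonical = prefecture_reading + city_reading
    table[canonical] = canonical
    for wrong_prefecture in confusable_prefectures:
        table[wrong_prefecture + city_reading] = canonical
    return table
```

With this layout a speaker who says the wrong prefecture still reaches the map of the intended city.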
[0029]
The words that instruct operation of the navigation device include words that indicate a display position, such as “destination”, “departure point”, “waypoint”, and “home”, as well as words for various other operation commands: “What time is it now?” (asks for the current time), “Where am I now?” (asks for the current position), “What's next?” (asks for the next intersection), “How much farther?” (asks for the distance to the destination), “Speed?” (asks for the current speed), “Altitude?” (asks for the current altitude), “Heading?” (asks for the direction of travel), and “List” (displays a list of the recognizable commands on the display). The character code of each such word and the phoneme model corresponding to it are stored.
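The two kinds of operation words can be sketched as a small lookup. The English keys stand in for the stored character codes, and the action labels and the function name `classify_command` are hypothetical names chosen for this illustration.

```python
# Words that select a registered display position.
POSITION_WORDS = {"destination", "departure point", "waypoint", "home"}

# Words that ask the device for information (labels are illustrative).
QUERY_WORDS = {
    "what time is it now": "current_time",
    "where am i now": "current_position",
    "what's next": "next_intersection",
    "how much farther": "distance_to_destination",
    "speed": "current_speed",
    "altitude": "current_altitude",
    "heading": "direction_of_travel",
    "list": "show_command_list",
}

def classify_command(word):
    """Return (kind, detail) for a recognized operation word, else None."""
    if word in POSITION_WORDS:
        return ("show_position", word)
    if word in QUERY_WORDS:
        return ("answer", QUERY_WORDS[word])
    return None
```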
[0030]
In the speech recognition circuit 14, when the character code corresponding to the phoneme model that matches the recognition result obtained from the input vector data through the predetermined speech recognition algorithm is the character code of a place name, that character code is read from the ROM 15. The read character code is supplied to the longitude-latitude conversion circuit 16. A longitude-latitude conversion data storage ROM 17 is connected to the longitude-latitude conversion circuit 16, and the latitude and longitude data corresponding to the character data supplied from the speech recognition circuit 14, together with its accompanying data, is read from the ROM 17.
[0031]
Note that the speech recognition circuit 14 of this example is provided with a memory (not shown) for temporarily storing recognition results, and recognition results are stored in this memory as a history list. In addition, during recognition processing, the data for the speech items from the one with the highest degree of match down to those that match to a certain extent are stored, in order, as a candidate list. The history list and the candidate list are erased once a certain amount of time has passed since they were stored.
[0032]
Here, the data storage state of the longitude-latitude conversion data storage ROM 17 of this example will be described. In this example, a storage area is set for each character code identical to a place-name character code stored in the voice recognition data storage ROM 15. As shown in FIG. 5, for each character code, the latitude and longitude data of the place name indicated by those characters are stored, along with display scale data as accompanying data. The character codes read from the voice recognition data storage ROM 15 are katakana character codes, but the longitude-latitude conversion data storage ROM 17 stores, in addition to the katakana character codes that represent the pronunciation as a character string, character codes for display that use kanji, hiragana, katakana, and so on.
[0033]
In this example, the latitude and longitude data for each place name are the latitude and longitude indicating the absolute position of the local government office (city hall, ward office, town hall, or village hall) of the area indicated by the place name. As accompanying data, the display character code and display scale data are output together with the latitude and longitude data. The display scale data is set according to the size of the area indicated by the place name; for example, it indicates the display scale in several stages.
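As an illustration only, the storage layout described for the ROM 17 can be sketched as a lookup table keyed by the katakana pronunciation code. The record type, field names, table name, coordinate values, and scale stage numbers below are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PlaceRecord:
    display_name: str   # display character code (kanji, hiragana, katakana)
    latitude: float     # latitude of the local government office
    longitude: float    # longitude of the local government office
    scale_stage: int    # display scale, expressed as one of several stages


# Hypothetical contents; the coordinates are rough illustrative values only.
CONVERSION_TABLE: Dict[str, PlaceRecord] = {
    "トウキョウトブンキョウク": PlaceRecord("東京都文京区", 35.708, 139.752, 2),
    "ヨコハマシカナガワク": PlaceRecord("横浜市神奈川区", 35.477, 139.629, 2),
}


def lookup_place(katakana_code: str) -> Optional[PlaceRecord]:
    """Return the record for a recognized place name, or None if unregistered."""
    return CONVERSION_TABLE.get(katakana_code)
```

In this sketch a single lookup yields both the coordinates sent to the terminal 10a and the display text, which mirrors the idea of storing accompanying data next to the coordinates.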
[0034]
Then, the latitude / longitude data read from the longitude / latitude conversion data storage ROM 17 and its accompanying data are supplied to the output terminal 10 a as the output of the speech recognition device 10. Further, the character code data of the input speech whose match is detected by the speech recognition circuit 14 is supplied to the output terminal 10 b as the output of the speech recognition device 10. Data obtained at the output terminals 10 a and 10 b is supplied to the navigation device 20.
[0035]
The speech recognition device 10 of this example is connected to a talk switch 18, which is a non-locking open/close switch (that is, a switch that is on only while it is pressed). Only the audio signal picked up by the microphone 11 while the talk switch 18 is held down continuously for at least 300 milliseconds is processed as described above by the circuits from the analog/digital converter 12 to the longitude-latitude conversion circuit 16. The processing in the voice recognition device 10 is performed under the control of the voice recognition circuit 14, and the voice recognition circuit 14 also determines the state of the talk switch 18.
[0036]
In this example, when the speech recognition circuit 14 performs recognition processing on input speech again within a predetermined time (for example, within 10 seconds), and the speech recognized at this time is already stored in the history list in the memory of the speech recognition circuit 14, that speech is removed from the recognition candidates, and the highest-ranked of the remaining candidates is judged to be the recognized speech. In addition, when such processing has been performed a plurality of times in succession (for example, five times), the candidate recognition data is read from the candidate list and supplied to the navigation device 20 side, and the candidate list is displayed on the display device 40 connected to the navigation device 20. Details of these processes will be described later.
[0037]
In addition, various kinds of control data other than the character codes described above can be transmitted from the voice recognition circuit 14 in the voice recognition apparatus 10 of this example to the navigation apparatus 20 via the terminal 10b. Control data for interrupting the map data creation process may be sent to the navigation device 20 side.
[0038]
Next, the configuration of the navigation device 20 connected to the voice recognition device 10 will be described. The navigation device 20 includes a GPS antenna 21. A current position detection circuit 22 receives and processes the positioning signals from GPS satellites picked up by the antenna 21, analyzes the received data, and detects the current position. The detected current position data is latitude and longitude data representing the absolute position at that time.
[0039]
The detected current position data is supplied to the arithmetic circuit 23. The arithmetic circuit 23 functions as a system controller that controls the operation of the navigation device 20. Connected to it are a CD-ROM driver 24, in which a CD-ROM (optical disc) storing road map data is set and which reads out the data stored on the CD-ROM; a RAM 25 for storing various data necessary for data processing; a vehicle speed sensor 26 for detecting the movement of the vehicle on which the navigation device is mounted; and operation keys 27. When coordinate data of longitude and latitude, such as the current position, is obtained, the arithmetic circuit 23 controls the CD-ROM driver 24 to read road map data in the vicinity of that coordinate position. The road map data read by the CD-ROM driver 24 is temporarily stored in the RAM 25, and display data for displaying the road map is created using the stored road map data. At this time, the display data is created so that the map is displayed at the display scale set by operating the operation keys 27 arranged at a predetermined position in the automobile.
[0040]
Then, the display data created by the arithmetic circuit 23 is supplied to the video signal generation circuit 28. The video signal generation circuit 28 generates a video signal of a predetermined format based on the display data, and the video signal is output from the output terminal 20c.
[0041]
The video signal output from the output terminal 20c is supplied to the display device 40, and the display device 40 performs an image receiving process based on the video signal to display a road map or the like on the display panel of the display device 40.
[0042]
In addition to displaying a road map in the vicinity of the current position, a road map at a position instructed by operating the operation key 27 can be displayed based on the control of the arithmetic circuit 23. In addition, based on the operation of the operation key 27 and the like, specific coordinate positions such as “Destination”, “Departure Point”, “Route”, and “Home” can be registered. When the specific coordinate position is registered, the registered coordinate position data (longitude and latitude data) is stored in the RAM 25.
[0043]
Further, when the vehicle speed sensor 26 detects the traveling of the automobile, the arithmetic circuit 23 does not accept an operation other than a relatively simple operation among the operations of the operation keys 27.
[0044]
In addition, the navigation device 20 includes an autonomous navigation unit 29. The unit calculates an accurate traveling speed of the automobile from a pulse signal corresponding to the vehicle speed, supplied from an engine control computer or the like on the automobile side, and detects the traveling direction from the output of a gyro sensor in the autonomous navigation unit 29. Based on the speed and traveling direction, it measures the current position by autonomous navigation from a previously determined position. For example, when the current position detection circuit 22 becomes unable to detect the position, positioning by autonomous navigation is performed starting from the position last detected by the current position detection circuit 22.
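The position update performed by autonomous navigation can be illustrated with a simple dead-reckoning step. This is a minimal sketch under stated assumptions, not the method of the autonomous navigation unit 29: it uses a flat-earth approximation over a short time step, a spherical earth radius, and a heading measured clockwise from north. The function name and parameters are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean earth radius, spherical approximation


def dead_reckon(lat_deg: float, lon_deg: float, speed_mps: float,
                heading_deg: float, dt_s: float) -> Tuple[float, float]:
    """Advance a position by speed * dt along the heading (0 = north, 90 = east)."""
    distance_m = speed_mps * dt_s
    heading = math.radians(heading_deg)
    d_north = distance_m * math.cos(heading)
    d_east = distance_m * math.sin(heading)
    new_lat = lat_deg + math.degrees(d_north / EARTH_RADIUS_M)
    # A degree of longitude shrinks with the cosine of the latitude.
    new_lon = lon_deg + math.degrees(
        d_east / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return new_lat, new_lon
```

Starting from the last position obtained by the current position detection circuit 22, a step like this would be repeated with each new speed and direction sample; errors accumulate over time, which is why a position fix is taken as the starting point.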
[0045]
Also, a speech synthesis circuit 31 is connected to the arithmetic circuit 23. When the arithmetic circuit 23 needs to give some instruction by voice, it causes the speech synthesis circuit 31 to execute speech synthesis processing for that instruction, and the voice is output from the speaker 32 connected to the speech synthesis circuit 31. For example, various instructions necessary for a navigation device, such as “Approaching the destination” and “The direction of travel is left”, are given by voice. The speech synthesis circuit 31 also synthesizes the speech recognized by the speech recognition device 10, based on the supplied character data, and outputs it from the speaker 32 as speech. This process will be described later.
[0046]
Here, the navigation device 20 includes input terminals 20a and 20b, to which the longitude and latitude data with its accompanying data and the character code data output from the output terminals 10a and 10b of the voice recognition device 10 are supplied. The longitude and latitude data, the accompanying data, and the character code data obtained at the input terminals 20a and 20b are supplied to the arithmetic circuit 23.
[0047]
When the longitude and latitude data and the like are supplied from the voice recognition device 10 side, the arithmetic circuit 23 controls the CD-ROM driver 24 to read road map data in the vicinity of that longitude and latitude from the disc. The road map data read by the CD-ROM driver 24 is temporarily stored in the RAM 25, and display data for displaying the road map is created using the stored road map data. At this time, the display data is created so that the supplied longitude and latitude are at the center of the display and the map is shown at the scale indicated by the display scale data accompanying the longitude and latitude data.
[0048]
Based on this display data, the video signal generation circuit 28 generates a video signal and causes the display device 40 to display a road map at a coordinate position instructed from the voice recognition device 10.
[0049]
Further, when a character code of a word instructing an operation of the navigation device is supplied from the output terminal 10b of the voice recognition device 10, the arithmetic circuit 23 identifies the character code and performs the corresponding control. In this case, if the character code is that of a word designating a display position, such as “destination”, “departure place”, “route”, or “home”, the arithmetic circuit 23 determines whether the coordinates of that display position are registered in the RAM 25 and, if they are, controls the CD-ROM driver 24 to read the road map data in the vicinity of that position from the disc.
[0050]
When character code data indicating the pronunciation of the recognized speech is supplied from the speech recognition device 10 to the arithmetic circuit 23, the speech synthesis circuit 31 synthesizes the words indicated by the character code, and the result is output as sound from the speaker 32 connected to the speech synthesis circuit 31. For example, when the speech recognition device 10 recognizes the speech “Tokyo Bunkyo (Bunkyo-ku, Tokyo)”, the speech synthesis circuit 31 performs synthesis processing based on the recognized character string data to generate a speech signal pronounced “Tokyo Bunkyo”, and the generated speech signal is output from the speaker 32.
[0051]
In this example, when speech recognition is performed by the speech recognition device 10, the longitude and latitude data supplied to the terminal 20a of the navigation device 20 and the character code data indicating the pronunciation of the recognized speech supplied to the terminal 20b arrive almost simultaneously. The arithmetic circuit 23 first has the speech synthesis circuit 31 synthesize the recognized words as speech, and then executes the process of creating road map display data based on the longitude and latitude data.
[0052]
Next, the operation when displaying a road map or the like using the voice recognition device 10 and the navigation device 20 of this example will be described. The voice recognition operation in the voice recognition device 10 is shown in the flowchart of FIG. 6. First, it is determined whether or not the talk switch 18 is on (step 101). If the talk switch 18 is determined to be on, the audio signal picked up by the microphone 11 during the on period is sampled by the analog/digital converter 12, processed by the digital audio processing circuit 13, and converted into vector data (step 102). Based on this vector data, the speech recognition circuit 14 performs speech recognition processing (step 103).
[0053]
Here, it is determined whether or not the voice of a place name stored in the voice recognition data storage ROM 15 (that is, a place name registered in advance) has been recognized (step 104). When the voice of a registered place name is recognized, the character data for pronouncing the recognized place name is read from the ROM 15 and output from the output terminal 10b (step 105), and the longitude and latitude data of the recognized place name is read from the longitude-latitude conversion data storage ROM 17 connected to the longitude-latitude conversion circuit 16 (step 106). The place names registered in the ROM 15 of this example are the names of the prefectures and municipalities in the country. For example, the voice “XX prefecture XX city” or the voice “XX city XX ward” is recognized (in the case of a ward, it can be recognized even if the prefecture is omitted).
[0054]
Then, the longitude and latitude data read out based on the recognized voice and the accompanying data are output from the output terminal 10a (step 107).
[0055]
If the voice of a registered place name cannot be recognized in step 104, it is determined whether or not a registered specific voice other than a place name has been recognized (step 108). When a registered specific voice other than a place name is recognized, the character code corresponding to the recognized voice is determined (step 109), and the determined character code is output from the output terminal 10b (step 110).
[0056]
If no registered specific voice other than a place name can be recognized in step 108, the process at this time is terminated. Alternatively, the navigation device 20 is notified that voice recognition has failed, and a warning is given by speech synthesized in the speech synthesis circuit 31 or by characters displayed on the display device 40.
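The branching of FIG. 6 (steps 101 and 104 to 110) can be summarized in a short sketch. It assumes that sampling, vector conversion, and recognition (steps 102 and 103) have already produced a character code; the table contents, the command codes, and all names are hypothetical placeholders rather than the stored data of the ROM 15 or ROM 17.

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical stand-ins for the registered place names and command words.
PLACE_NAMES = {"トウキョウトブンキョウク": (35.708, 139.752, 2)}  # code -> (lat, lon, scale)
COMMANDS = {"モクテキチ", "イマナンジ"}


class RecognizerOutput(NamedTuple):
    terminal_10a: Optional[Tuple[float, float, int]]  # coordinates and accompanying data
    terminal_10b: Optional[str]                       # character code


def process_utterance(talk_switch_on: bool,
                      recognized_code: Optional[str]) -> RecognizerOutput:
    if not talk_switch_on:                            # step 101
        return RecognizerOutput(None, None)
    if recognized_code in PLACE_NAMES:                # step 104
        # steps 105 to 107: character code on 10b, coordinates on 10a
        return RecognizerOutput(PLACE_NAMES[recognized_code], recognized_code)
    if recognized_code in COMMANDS:                   # step 108
        return RecognizerOutput(None, recognized_code)  # steps 109 and 110
    return RecognizerOutput(None, None)               # recognition failed
```

A place name produces output on both terminals, while a command word produces only a character code, which is how the navigation side can tell the two cases apart.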
[0057]
Next, the operation on the navigation device 20 side is shown in the flowchart of FIG. 7. The arithmetic circuit 23 first determines whether or not the current position display mode is set (step 201). When it is determined that the current position display mode is set, the current position detection circuit 22 measures the current position (step 202), the road map data in the vicinity of the measured current position is read from the CD-ROM (step 203), road map display processing based on the read road map data is performed, and the road map at the corresponding coordinate position is displayed on the display device 40 (step 204).
[0058]
When it is determined in step 201 that the current position display mode is not set, or when the road map display processing for the current position in step 204 has finished and the road map is being displayed, it is next determined whether or not longitude and latitude data and the like are supplied from the speech recognition device 10 via the input terminals 20a and 20b (step 205). When it is determined that the longitude and latitude data and the accompanying character data are supplied, the character code for speech synthesis supplied via the terminal 20b is first supplied to the speech synthesis circuit 31, and the speech recognized by the speech recognition device 10 is synthesized and output from the speaker 32 (step 206). Subsequently, road map data in the vicinity of the position indicated by the longitude and latitude data is read from the CD-ROM (step 207), road map display processing based on the read road map data is performed, and the road map at the corresponding coordinate position is displayed on the display device 40 (step 208).
[0059]
Then, when it is determined in step 205 that longitude and latitude data are not supplied from the voice recognition device 10, or when the display processing of the road map of the designated place name in step 208 has finished and the road map is being displayed, it is determined whether or not a character code that directly designates a display position is supplied from the voice recognition device 10 via the input terminal 20b (step 209). When it is determined that the character code is supplied from the terminal 20b, the character code is supplied to the speech synthesis circuit 31, and the speech recognized by the voice recognition device 10 is output from the speaker 32 (step 210). Next, when the character code determined in step 209 directly designates a display position (that is, words such as “Destination”, “Departure”, “Via”, and “Home”), it is determined whether or not the coordinate position designated by these characters is registered in the RAM 25 (step 211). If it is registered, the road map data in the vicinity of the position indicated by the registered longitude and latitude data is read from the CD-ROM (step 212), road map display processing based on the read road map data is performed, and the road map at the corresponding coordinate position is displayed on the display device 40 (step 213). With the map displayed, the process returns to the determination in step 201.
[0060]
When it is determined in step 209 that the character code that directly designates the display position is not supplied from the speech recognition apparatus 10, the arithmetic circuit 23 determines whether or not there is an operation for designating the display position by operating the operation key 27. (Step 214). If there is an operation for designating the display position, the detection data of the vehicle speed sensor 26 is determined to determine whether or not the vehicle is currently traveling (step 215). When the arithmetic circuit 23 determines that the vehicle is traveling, the operation at that time is invalidated and the process returns to the determination in step 201 (at this time, some warning may be performed).
[0061]
When it is determined that the vehicle is not traveling, the process proceeds to step 211, where it is determined whether or not there is a registered coordinate position. If there is, the road map at that position is displayed (steps 212 and 213), and the process returns to the determination in step 201.
[0062]
If the coordinates of the corresponding position, such as “Destination”, “Departure”, “Via”, or “Home”, are not registered in step 211, a warning that the position is unregistered is given by speech synthesized in the speech synthesis circuit 31 or by characters displayed on the display device 40 (step 216), and the process returns to the determination in step 201.
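The handling of a request for a registered position (steps 209 to 216 of FIG. 7) can be sketched as follows. This is an illustrative reading of the flowchart, with hypothetical names: the `source` argument distinguishes voice input from key input, and the returned strings merely label the three outcomes.

```python
from typing import Dict, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)


def handle_position_request(name: str, source: str, vehicle_moving: bool,
                            registered: Dict[str, Coord]
                            ) -> Tuple[str, Optional[Coord]]:
    """Decide what to do with a request such as "Home" or "Destination"."""
    # Step 215: a key operation made while the vehicle is traveling is invalidated.
    if source == "key" and vehicle_moving:
        return ("ignored", None)
    # Step 211: check whether the coordinates are registered (in the RAM 25).
    coord = registered.get(name)
    if coord is None:
        return ("warn_unregistered", None)   # step 216
    return ("show_map", coord)               # steps 212 and 213
```

The point of the sketch is that the travel check applies only to key input, so the same request made by voice is still accepted while the vehicle is moving.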
[0063]
The flowchart of FIG. 7 describes the processing related to map display. When a character code based on the recognition of a voice instructing an operation other than map display is supplied from the voice recognition device 10 side, the corresponding processing is performed under the control of the arithmetic circuit 23. For example, when the character code is supplied as a result of recognizing “Ima nanji” (“What time is it now?”) or the like, the speech synthesis circuit 31 synthesizes speech announcing the current time under the control of the arithmetic circuit 23 and outputs it from the speaker 32. For other commands as well, a spoken answer is synthesized by the speech synthesis circuit 31 and output from the speaker 32, or a corresponding display is shown on the display device 40.
[0064]
By processing as described above, the display position can be freely set anywhere in the country by voice input, and a road map of a desired position can be displayed easily. That is, the operator simply speaks “XX prefecture XX city” or “XX city XX ward” into the microphone 11 while pressing the talk switch 18; the voice is recognized and the road map of that area is displayed. There is therefore no need to designate the position by key operation, and the navigation device can be operated even in situations where key operation is difficult. In this example, the place names recognized by the speech recognition device 10 are limited to the names of prefectures and municipalities in the country, so the number of recognized voices is relatively small (about 3500). Consequently, the speech recognition circuit 14 in the speech recognition device 10 can recognize a place name in a short time with a relatively small amount of processing, the time until the map designated by the input speech is displayed can be shortened, and the recognition rate itself is improved by limiting the number of place names to be recognized.
[0065]
Here, in this example, when voice input is performed again after the voice input described above, past recognition results are referred to during the recognition process. This process is shown in the flowchart of FIG. 8.
[0066]
First, when a sufficient time (for example, several minutes) has elapsed since the previous speech recognition process, the history list in the speech recognition circuit 14 is cleared (step 401). It is then determined whether or not an utterance has started, that is, whether or not the talk switch 18 has been turned on (step 402). If it is determined that an utterance has started, it is determined whether a predetermined time Th (here, 10 seconds) has elapsed since the previous utterance (step 403). If it has elapsed, the history list in the speech recognition circuit 14 is cleared (step 404). If the predetermined time Th has not elapsed since the previous utterance, the history list is not cleared.
[0067]
Then, under the control of the voice recognition circuit 14, the input voice is recognized (step 405). The candidate data obtained as a result of the recognition is collated with the data in the history list, and any data that also exists in the history list is deleted from the recognized candidates (step 406). Next, it is determined whether or not the number of items in the history list is N (here, 5) or more (step 407). If it is less than N (that is, if utterances have not been made N times in succession), the process moves to step 408, and the data with the highest recognition degree (matching degree) among the remaining candidate data is supplied to the speech synthesis circuit 31 of the navigation device 20 as the recognized result and output from the speaker 32 as speech. If the recognized result is a voice indicating a region (in this example, a prefecture name and a city name), the navigation device 20 performs the processing for displaying a map of that city, and the map is displayed on the display device 40 (step 409). The result recognized at this time is then added to the history list (step 410), and the process returns to step 402 and waits until the next utterance starts.
[0068]
When it is determined in step 407 that the number of items in the history list is N or more (that is, when utterances have been made N times in succession), the process proceeds to step 411 to display a candidate list. That is, the candidate data recognized in the recognition processing so far is read from the candidate list memory in the speech recognition circuit 14 and supplied to the navigation device 20, the video signal generation circuit 28 in the navigation device 20 generates a video signal of the candidate list, the video signal is supplied to the display device 40, and the candidate list is displayed on the display device 40.
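The history handling of steps 403 to 411 can be sketched as a small class. This is a minimal illustration under assumptions: the class and method names are hypothetical, candidates are plain strings already ranked by matching degree, and the list returned for display is simply the ranked candidates of the current utterance. The two ward names are used only as sample data.

```python
from typing import List, Optional, Tuple

TH_SECONDS = 10.0  # predetermined time Th
N_REPEATS = 5      # predetermined count N


class RephraseFilter:
    def __init__(self) -> None:
        self.history: List[str] = []
        self.last_time: Optional[float] = None

    def handle(self, ranked: List[str], now: float) -> Tuple[str, object]:
        # Steps 403 and 404: clear the history if Th has elapsed since the last utterance.
        if self.last_time is None or now - self.last_time >= TH_SECONDS:
            self.history.clear()
        self.last_time = now
        # Step 406: drop candidates that were already output as results.
        remaining = [c for c in ranked if c not in self.history]
        # Steps 407 and 411: after N consecutive utterances, show the list instead.
        if len(self.history) >= N_REPEATS:
            return ("show_list", list(ranked))
        if not remaining:
            return ("no_result", None)
        best = remaining[0]        # step 408: highest remaining matching degree
        self.history.append(best)  # step 410
        return ("recognized", best)


f = RephraseFilter()
cands = ["Yokohama City Kanazawa Ward", "Yokohama City Kanagawa Ward"]
first = f.handle(cands, now=0.0)    # first attempt
second = f.handle(cands, now=5.0)   # rephrased within Th
```

In this usage the second call skips the result already given for the first utterance and returns the next-ranked candidate, while a call made after Th has elapsed starts again from an empty history.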
[0069]
The candidate list at this time is displayed as shown in FIG. 9, for example. That is, the first to fifth candidates are displayed in descending order of the degree of coincidence (the lower candidates may be displayed by scrolling or the like). At this time, the place name candidates and the command candidates are displayed differently (for example, the display color of the characters is changed). In the example of FIG. 9, the font is changed and displayed.
[0070]
Then, at the first stage when this candidate list is displayed, a mark a indicating that it has been selected is assigned to the first candidate among the candidates in this list. The mark a indicating the candidate to be selected can be moved by a scroll operation by operating the operation key 27. Next, it is determined whether or not this scroll operation has been performed (step 412). Here, when the scroll operation is performed, the position of the mark a given to the candidate to be selected is moved (step 413).
[0071]
In this state, it is determined whether or not the determination button among the operation keys 27 has been pressed (step 414). When it is determined that the determination button has been pressed, the candidate indicated by the mark a is taken as selected, the voice recognition device 10 side is instructed to read out the data relating to that candidate (longitude and latitude data, character data for voice output, and so on), and the read data is supplied to the navigation device 20 side. Then, based on the supplied data, the speech synthesis circuit 31 performs speech synthesis processing to output the place name as speech from the speaker 32 (step 415). Based on the supplied longitude and latitude data, a video signal for displaying the road map at the corresponding position is created, and the map of the selected candidate is displayed on the display device 40 (step 416). The result selected at this time is added to the history list (step 417), and the process returns to step 402 to wait until the next utterance starts.
[0072]
If it is determined in step 414 that the determination button has not been pressed, it is then determined whether or not an utterance has started, that is, whether or not the talk switch 18 has been turned on (step 418). If it is determined that an utterance has started, the display of the candidate list is stopped and the process returns to step 403. If it is determined in step 418 that no utterance has started, it is determined whether or not a predetermined time Td (here, about 10 seconds) has elapsed since the display of the candidate list was started in step 411 (step 419). If the time Td has not elapsed, the process returns to step 412 and the candidate list continues to be displayed. If it is determined in step 419 that the predetermined time Td has elapsed, it is determined whether or not a scroll operation was detected in step 412 (step 420). If a scroll operation has been performed, the process returns to step 412 and the candidate list continues to be displayed.
[0073]
When it is determined in step 420 that the scroll operation is not performed, the process proceeds to step 408, where the first result of the candidate list is output by voice, and the map of the first place name is displayed.
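The candidate list behavior of steps 411 to 420 can be illustrated with a small state holder. The sketch below is an assumption-laden reading of the flowchart: the class name, the method names, and the clamping of the mark at the ends of the list are hypothetical, and a new utterance (step 418) is left to the caller.

```python
from dataclasses import dataclass
from typing import List, Optional

TD_SECONDS = 10.0  # predetermined display time Td


@dataclass
class CandidateListView:
    candidates: List[str]
    shown_at: float          # time at which the list was displayed (step 411)
    mark: int = 0            # position of mark a; starts on the first candidate
    scrolled: bool = False

    def scroll(self, step: int = 1) -> None:
        """Steps 412 and 413: move mark a, clamped to the list bounds."""
        self.mark = max(0, min(len(self.candidates) - 1, self.mark + step))
        self.scrolled = True

    def decide(self) -> str:
        """Step 414: the determination button selects the marked candidate."""
        return self.candidates[self.mark]

    def poll(self, now: float) -> Optional[str]:
        """Steps 419 and 420: fall back to the first candidate after Td if
        no scroll operation was made; otherwise keep displaying (None)."""
        if now - self.shown_at >= TD_SECONDS and not self.scrolled:
            return self.candidates[0]
        return None
```

A design consequence visible here is that a single scroll operation cancels the timeout, so the list then stays on screen until the user decides or speaks again.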
[0074]
By being controlled in this way, when utterances are made in succession within a certain time (for example, within 10 seconds), the later utterance is regarded as a rephrasing, and the first candidate of the previous recognition result is excluded from the recognition target words. This prevents the situation in which the same wrong place name is recognized again despite the rephrasing and the desired place name is never recognized. For example, “Yokohama City Kanagawa Ward” and “Yokohama City Kanazawa Ward” exist as similar place names. Suppose that “Yokohama City Kanagawa Ward” is spoken but is mistakenly recognized as “Yokohama City Kanazawa Ward”. If nothing were done and the same pronunciation were simply repeated, there is a high possibility that it would again be mistakenly recognized as “Yokohama City Kanazawa Ward”. In this example, however, because “Yokohama City Kanazawa Ward” has already been recognized once, it is excluded from the recognition target words. If “Yokohama City Kanagawa Ward” was the second candidate, it moves up to the first candidate and is judged to be the recognized result. Erroneous recognition on rephrasing is thus prevented, and the recognition rate improves accordingly.
[0075]
When voice input is repeated a predetermined number of times (here, 5 times) within a short time, the recognition target words recognized from the successively input voice signals are displayed as a list in descending order of recognition degree. The recognition state at that time can thus be judged easily, and a word can be selected from the displayed list, so that cases where recognition by voice input is difficult can be handled with a simple operation.
[0076]
In this example, in the list display of recognition target word candidates, the display state used when the recognition target word is a place name (normal characters in FIG. 9) differs from the display state used when it is a command of some kind (white characters in FIG. 9), so the type of each candidate can be determined quickly from the display. In the example of FIG. 9 the style of the characters is changed, but, for example, the display color of the characters (or of the area around the characters) may instead differ between place name candidates and command candidates.
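The category-dependent display can be sketched as a mapping from candidate category to display style. The enum, the style names, and the row structure below are hypothetical; they only illustrate giving place names and commands different display modes and showing up to five candidates with mark a on one of them.

```python
from enum import Enum
from typing import List, NamedTuple, Tuple


class Category(Enum):
    PLACE_NAME = "place_name"
    COMMAND = "command"


# Hypothetical style names; FIG. 9 uses normal versus white characters.
STYLE_BY_CATEGORY = {Category.PLACE_NAME: "normal", Category.COMMAND: "white"}


class Row(NamedTuple):
    rank: int
    text: str
    style: str
    marked: bool  # True for the candidate carrying mark a


def build_rows(candidates: List[Tuple[str, Category]],
               mark_index: int = 0, max_rows: int = 5) -> List[Row]:
    """Build display rows in rank order, styled by candidate category."""
    return [Row(i + 1, text, STYLE_BY_CATEGORY[category], i == mark_index)
            for i, (text, category) in enumerate(candidates[:max_rows])]
```

Replacing the two-entry style table with one keyed by prefecture or region would give the per-region variation of display state in the same way.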
[0077]
In addition to changing the display state by the place name and the command as described above, the place name may be divided for each region and the display state may be changed for each division. That is, for example, the display color may be changed for each prefecture, or the display color may be changed for each region such as the Kanto region and the Tohoku region.
[0078]
The flowchart of FIG. 8 describes the case where the selected candidate is a place name and a map is displayed based on that place name. When the selected candidate is a command, the corresponding command is executed instead of displaying a map.
[0079]
When the candidate list is displayed as shown in FIG. 9, the recognition target words in the list may also be output sequentially as speech from the speaker 32 by speech synthesis processing in the speech synthesis circuit 31. In this way, the recognition target word candidates can be known without looking at the display device 40, which improves the usability of the navigation device.
[0080]
In the above-described embodiment, the place names recognized by the speech recognition apparatus are limited to the names of prefectures and municipalities in the country. However, more detailed place names and target names may be recognized. However, if the number of place names that can be recognized is increased, the amount of processing and processing time required for speech recognition are increased, and it is most preferable to limit the name to the name of the municipality in order to increase the recognition rate.
[0081]
In the above-described embodiment, the center coordinates for each place name are the latitude and longitude data indicating the absolute position of the local government office (city hall, ward office, town hall, or village hall), but latitude and longitude data indicating other positions may be used. For example, the latitude and longitude of the center of the area (city, town, or village) may simply be used.
[0082]
Further, instead of storing the latitude and longitude data of the center as described above, the coordinate position data of the east, west, north, and south ends of the area may be stored. In this case, it is only necessary to have four data of east-west longitude and north-south latitude.
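As a hedged illustration of this variation, the four stored values can be held in one record from which a display center is derived when needed. Taking the midpoint as the center, and the names used here, are assumptions of this sketch and are not specified in the embodiment.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AreaBounds:
    west_lon: float   # longitude of the western end
    east_lon: float   # longitude of the eastern end
    south_lat: float  # latitude of the southern end
    north_lat: float  # latitude of the northern end

    def center(self) -> Tuple[float, float]:
        """Midpoint of the stored extent, as (latitude, longitude)."""
        return ((self.south_lat + self.north_lat) / 2,
                (self.west_lon + self.east_lon) / 2)

    def contains(self, lat: float, lon: float) -> bool:
        """True if the point lies within the stored extent."""
        return (self.south_lat <= lat <= self.north_lat
                and self.west_lon <= lon <= self.east_lon)
```

Storing the extent in this form would also let the display scale be chosen from the size of the area rather than from separately stored scale data, although the embodiment does not state this.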
[0083]
In the above embodiment, the recognized voice is converted into a character code by the voice recognition circuit 14 in the voice recognition device, and the character code is converted into longitude and latitude data by the longitude-latitude conversion circuit 16. However, the recognized voice may be converted directly into longitude and latitude data. Even when it is not converted directly, the ROM 15 and the ROM 17 for storing the conversion data may be configured as a single memory, for example, so that the storage area for place names is shared.
[0084]
In the above embodiment, the present invention is applied to a navigation device using a positioning system called GPS, but it is needless to say that the present invention can also be applied to a navigation device using other positioning systems.
[0085]
[Effects of the invention]
According to the navigation device of the present invention, the recognition target word candidates are displayed in a different display mode for each category of candidate word, so that candidates of the same category are easy to identify and the display is easy to read. For example, place names for displaying a map and commands for instructing operations are displayed in different modes, so the needed candidate, whether a place name or a command, can be found easily in the displayed recognition target word candidates, and the usability of the navigation device improves.
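The category-dependent display can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the command vocabulary, the color names, and the function names `classify` and `render_candidates` are hypothetical, and display color is used here as just one possible display mode.

```python
from enum import Enum
from typing import List, Tuple


class Category(Enum):
    PLACE_NAME = "place_name"  # words indicating a place name
    COMMAND = "command"        # words giving a command


# Hypothetical command vocabulary; any other candidate is treated as a place name.
COMMAND_WORDS = {"zoom in", "zoom out"}

# Hypothetical display mode per category (here, a display color).
DISPLAY_COLOR = {Category.PLACE_NAME: "white", Category.COMMAND: "yellow"}


def classify(word: str) -> Category:
    """Assign a candidate word to one of the two categories."""
    return Category.COMMAND if word in COMMAND_WORDS else Category.PLACE_NAME


def render_candidates(words: List[str]) -> List[Tuple[str, str]]:
    """Pair each candidate with the display color of its category."""
    return [(word, DISPLAY_COLOR[classify(word)]) for word in words]


candidate_rows = render_candidates(["Place A", "zoom in", "Place B"])
```

Keeping the mode table separate from the classifier means a different character type could be substituted for color without changing the classification step.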
[0088]
In addition, according to the navigation method of the present invention, the recognition target word candidates are displayed in a different display mode for each category of candidate word, so that candidates of the same category are easy to identify and the display is easy to read. For example, place names for displaying a map and commands for instructing operations are displayed in different modes, so the needed candidate, whether a place name or a command, can be found easily in the displayed recognition target word candidates, and usability for navigation improves.
[0089]
Further, according to the automobile of the present invention, the recognition target word candidates are displayed in a different display mode for each category of candidate word, so that candidates of the same category are easy to identify and the display is easy to read. Even when driving conditions make it difficult to look at the display for long, the needed candidate is easy to find, and the device can be operated well while the safety of driving is maintained.
[Brief description of the drawings]
FIG. 1 is a configuration diagram showing an embodiment of the present invention.
FIG. 2 is a perspective view showing a state in which the apparatus of one embodiment is incorporated in an automobile.
FIG. 3 is a perspective view showing the vicinity of a driver's seat when the device of one embodiment is incorporated in an automobile.
FIG. 4 is an explanatory diagram showing a storage area configuration of a speech recognition memory according to an embodiment.
FIG. 5 is an explanatory diagram showing a storage area configuration of a longitude / latitude conversion memory according to an embodiment.
FIG. 6 is a flowchart illustrating processing by speech recognition according to an embodiment.
FIG. 7 is a flowchart showing display processing in the navigation device according to the embodiment.
FIG. 8 is a flowchart showing processing when voice recognition is executed a plurality of times in one embodiment.
FIG. 9 is an explanatory diagram illustrating a display example of a candidate list according to an embodiment.
[Explanation of symbols]
10 Voice recognition device
11 Microphone
12 Analog / digital converter
13 Digital audio processing circuit (DSP)
14 Speech recognition circuit
15 Voice recognition data storage ROM
16 Longitude and latitude conversion circuit
17 Longitude and latitude conversion data storage ROM
18 Talk switch
20 Navigation device
23 Arithmetic circuit
24 CD-ROM driver
25 RAM
26 Vehicle speed sensor
27 Operation keys
28 Video signal generation circuit
31 Speech synthesis circuit
32 Speaker
40 Display device
50 Automobile

Claims

A navigation device comprising:
audio signal input means;
a speech processing unit that recognizes, from the speech signal input to the audio signal input means, the speech of a plurality of predetermined recognition target words including the speech of a specific place name;
a conversion unit that converts the data of a specific place name recognized by the speech processing unit into absolute coordinate position data indicated by that place name;
storage means for storing map data;
video signal creating means that reads out from the storage means the map data for the position indicated by the coordinate position data converted by the conversion unit to create a map display video signal, and that creates a video signal displaying the recognition target word candidates recognized by the speech processing unit; and
display control means that, when the video signal creating means creates the video signal of the recognition target word candidates, classifies the candidate recognition target words into words indicating a place name and words giving a command, and makes the video signal display each category in a different display mode.

The navigation device according to claim 1, wherein the display mode that differs for each category is display in different character types.

The navigation device according to claim 1, wherein the display mode that differs for each category is display in different display colors.

A navigation method comprising:
recognizing, from an input voice signal, the speech of a plurality of recognition target words including speech indicating a specific place name;
converting the recognized place name data into absolute coordinate position data indicated by that place name;
displaying a map of the position indicated by the converted coordinate position data;
displaying the recognized recognition target word candidates; and
when displaying the recognition target word candidates, classifying the recognition target words into words indicating a place name and words giving a command, and displaying each category in a different display mode.

The navigation method according to claim 4, wherein the display mode that differs for each category is display in different character types.

The navigation method according to claim 4, wherein the display mode that differs for each category is display in different display colors.

An automobile equipped with a device for displaying a map on display means arranged at a predetermined position in the vehicle, the automobile comprising:
audio signal input means;
a speech processing unit that recognizes, from the speech signal input to the audio signal input means, the speech of a plurality of predetermined recognition target words including the speech of a specific place name;
a conversion unit that converts the data of a specific place name recognized by the speech processing unit into absolute coordinate position data indicated by that place name;
storage means for storing map data;
video signal creating means that reads out from the storage means the map data for the position indicated by the coordinate position data converted by the conversion unit to create a map display video signal, creates a video signal displaying the recognition target word candidates recognized by the speech processing unit, and supplies these video signals to the display means; and
display control means that, when the video signal creating means creates the video signal of the recognition target word candidates, classifies the candidate recognition target words into words indicating a place name and words giving a command, and makes the video signal display each category in a different display mode.