JP4232943B2

JP4232943B2 - Voice recognition device for navigation

Info

Publication number: JP4232943B2
Application number: JP2001182663A
Authority: JP
Inventors: 教明大谷; 武史黒澤
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2001-06-18
Filing date: 2001-06-18
Publication date: 2009-03-04
Anticipated expiration: 2021-06-18
Also published as: JP2003004470A

Description

【０００１】
【発明の属する技術分野】
本発明は、利用者がマイクから入力した音声を認識し、ナビゲーション装置の目的地設定や各種検索を行うことができるようにしたナビゲーション装置用音声認識装置に関し、特に利用者が入力した地名や施設等の音声に対応するデータを高速で検索することができるようにした音声認識辞書を備えているナビゲーション用音声認識装置に関する。
【０００２】
【従来の技術】
例えば図１５に示すような従来のナビゲーション装置３０においては、地図を描画するための地図データ及び各種情報を記録した地図・情報データ３３、後述するように入力した地名や施設名に対応した地点の緯度・経度データを記録した地名施設名対応地点データ３４等の種々のデータを記録している、ＣＤ−ＲＯＭ、ＤＶＤ−ＲＯＭ等の地図・情報記憶媒体３２を備えている。この地図・情報記憶媒体３２のデータのうち、地図データは地図描画部３５でシステム制御部３１から指示された地点を中心とするものが読み出され、画像合成装置３６で他の各種画像と合成され、画像表示装置３７に３Ｄ等の種々の態様で地図及びそれに関連する情報を描画している。
【０００３】
システム制御部３１には手動操作入力部４０が接続し、リモコン、キースイッチ、タッチパネル等の利用者が操作する手動操作部からの操作信号を入力している。また、近年の音声認識技術の進歩により、ナビゲーション装置にも音声認識装置が組み込まれるようになっており、図１５のナビゲーション装置３０においてはマイク５１からの利用者の音声を認識する音声認識装置５０を備えている。それにより後述するような音声認識処理を行うことによって、認識結果出力部５８から認識した利用者の音声をナビゲーション装置３０のシステム制御部３１に出力し、前記手動操作入力部４０と同様に利用者の指示信号として利用することができるようになっている。
【０００４】
このような利用者の各種操作指示信号により、目的地経由地設定部４４では利用者が指示した地点を目的地、或いは経由地に設定する。また、現在地については、例えばＧＰＳ信号を用い、また車速センサや走行方向センサを用いて正確な現在位置を検出しており、この現在地を中心とする地図データを前記地図・情報記憶媒体３２に記録されている地図データから読み出し、これを画像表示装置３７に表示し、車両の現在位置を重ねて表示することにより、車両が現在どこを走行しているか一目でわかるようにしている。
【０００５】
システム制御部３１には誘導経路演算部４２が接続し、前記のような現在地、及び前記目的地や経由地に基づいて、地図・情報記憶媒体３２に記録されているリンクデータを用い、これらの地点を結ぶ経路の内、時間、距離、料金等の条件を加味して最も適切な経路を自動探索し、その探索した経路を誘導経路として設定し、この誘導経路は地図画像上の道路の色を変えて太く描画する等により画面表示することができるようにしている。
【０００６】
また、上記のように設定された誘導経路に沿って運転者が確実に走行することができるように、誘導経路案内部４３において、例えば車両が誘導経路上の進路を変更すべき交差点に一定距離以内に近づいたときに、交差点を拡大表示し、進路を変更すべき方向を示す矢印等を描画して画面表示したり、音声で右左折の誘導を行うことで、ユーザを目的地まで案内することができるようにしている。
【０００７】
【発明が解決しようとする課題】
上記目的地、あるいは経由地を指定するに際して、或いは特定の地点を検索して画面上に表示するに際しては、従来から各種の手段が採用されており、例えば地名を広域側から狭域側に順にリスト化したデータを予め地図・情報記憶媒体３２に記憶しておき、利用者は通常の住所表示と同じ方式で都道府県から順に画面上のキーボード等を用いて入力することによりその地名を特定し、或いは画面に広域側から順に表示される地名をカーソル指示入力等によって選択していくことにより、最終的に特定の地名を絞り込む方式も採用されている。
【０００８】
また、地図・情報記憶媒体３２に駅や市役所等の公共施設、或いは交差点やインターチェンジ等の道路の施設等、更にはランドマーク的な建物やコンビニ、ガソリンスタンド、レストラン等の施設名を記憶しておき、利用者がこの施設名を直接入力、或いは候補リスト表示を行い、これをカーソル指示入力等によって選択することにより、特定の施設を絞り込むことも行われる。このように設定された特定の地名、或いは特定の施設名は図１５における検索用地名施設名入力部４５に入力され、地点データ検索部４６ではこの地名や施設名に対応する緯度・経度を示す地点データを地図・情報記憶媒体３２の地名施設名対応地点データ３４から検索する。このデータは目的地や経由地の地点データとされ、或いは地図画面上にこの地点を表すデータとされる。
【０００９】
なお、上記のような地名や施設名を手動で検索することを、運転者が車両の走行中行うことは好ましくない。その対策として前記のような音声認識装置５０で行うことができるように、音声認識用辞書部５５に地名施設名辞書部５６を備え、マイク５１から利用者による地名や施設名の入力があったときにはこれを音声入力部５２から音声特徴パラメータ抽出部５３に入力し、ここで得られたその音声の特徴的なパラメータを音声認識部５４に出力し、音声認識部５４ではこのパラメータに最も適合するパラメータを備えたものを地名施設名辞書部５６から検索し、これを認識結果出力部５８から検索用地名施設名入力部４５に出力する。ここに入力された音声認識による地名や施設名は、前記利用者による手動入力と同様に地点データ検索部４６に出力し、前記と同様に特定の地名や施設名に対応する地点データを得て目的地設定や地図上への表示等に用いる。
【００１０】
このように音声によって地名や施設名を直接検索することができるが、その音声入力に際して、前記手動操作による入力と同様に地名を都道府県から順に発声することにより、選択する地名候補を絞り込んで音声認識処理負担を減少することも考えられる。しかしながらその際には、多数回の音声入力を行わなければならず面倒である。したがってできる限り最も詳細な地名部分を発声しただけでこれを設定することができるようにし、また、このとき複数の地点が存在するときにはその候補の地点をリスト表示し、或いは地図上に表示する等の作動を行うことが好ましい。
【００１１】
そのためには、例えば地名として「よしま」を、また施設名として「ツインタワー」のような特定の地名や施設名を検索する際に、地名施設名辞書部５６に記録した日本全国の全ての地名や施設名の中から検索を行う必要がある。このように莫大なデータの中から一つの地名や施設名を検索することはこの演算処理を行う処理装置に大きな負担をかけ、処理時間が長くなると共に認識率も低下せざるを得ず、高性能な演算処理装置を用いると高価なものとならざるを得ない。
【００１２】
その対策として、地名施設名辞書部５６のデータを県単位等の所定の地域に分割しておき、特定の地名や施設名の検索に際しては予め地域を特定する入力を行い、或いはナビゲーション装置の現在地のデータにより、現在車両が存在する地点を含む地域を特定し、音声認識に際しては県単位等の特定の地域に含まれる音声認識辞書のみを用いることも考えられる。
【００１３】
しかしながら、このように特定の地域を対象とする地名施設名辞書のみを参照するだけでは、例えば茨城県と福島県の県境にいる人が各県の間を往来する際にこのナビゲーション装置を使う時のように、地域をまたがって往来するときには隣接する地方に存在する地名の音声認識が行われず、極めて使用しにくい装置とならざるを得ない。
【００１４】
その対策として、例えば現在車両の存在する地点を含む県に隣接する県を含むように音声認識辞書を選択することによって、前記のような問題点を解決することが考えられる。その際には、例えば現在地が福島県にあるときには、茨城県、栃木県、新潟県、山形県、宮城県が選択され、極めて広い面積が選択されることとなる。また、例えば現在地が東京都にあるときには、神奈川県、山梨県、埼玉県、千葉県が選択され、比較的面積は狭いものの人口密度が極めて高いため、その地域内に含まれる地名、施設名もそれに比例して大量なものとなる。また、例えば現在地が島根県にあるときには、山口県、広島県、鳥取県が選択されるだけであり、比較的狭い範囲が選択されると共に、人口密度が比較的低いこともあり、この地域の地名数、施設名数は比較的少ないものとなる。
【００１５】
このように、現在車両が存在する地点に応じて地名施設名の音声認識辞書が対象とする範囲、及び地名施設名の数が大きくばらつくこととなる。したがって音声認識装置においては全体の中間程度のデータ量の地域に合わせておくことも考えられるが、その際には前記のように現在東京都にいる車両でこの音声認識装置を利用するときにはその処理速度が不適切に遅くなり、利用者をいらいらさせることとなる。また、東京都に車両が存在するときでも、高速で正確に音声認識処理を行うことができる程度の処理能力を備えた高性能の演算処理装置を使用するときには、他のほとんどの地方においては過剰の処理能力となってしまう。
【００１６】
上記のような問題は日本に限らず、例えば米国においても同様であり、米国の地名を音声認識するために、例えば隣接する「州（Ｓｔａｔｅ）」を対象にするように音声認識辞書を設定する場合にも、州によって隣接する州を含む地名施設名の数について、また対象とされる地域の面積についても前記日本の例と同様に大きな相違が生じる。
【００１７】
したがって本発明は、地名や施設名を地域毎に設定した音声認識辞書を用いて音声認識処理を行うに際して、適切な範囲の音声認識辞書を設定し、また適切にこれを選択することにより、安価なデータ処理装置を用いて高速に、且つ正確に音声認識を行うことができるようにしたナビゲーション用音声認識装置を提供することを主たる目的とする。
【００１８】
【課題を解決するための手段】
本発明に係るナビゲーション用音声認識装置は、上記課題を解決するため、緯度と経度で分割したブロックに含まれる地名と施設名の音素データとその関連データを、各ブロック毎に記録してなる複数のブロック単位音声認識辞書を備えたブロック単位音声認識辞書蓄積部と、現在位置が所属する現在位置所属ブロックと、その周囲の所定範囲の周辺ブロックとを選定する、現在位置対応ブロック群選定部と、前記現在位置対応ブロック群選定部で選定したブロック群のブロック単位音声認識辞書を、前記ブロック単位音声認識辞書蓄積部から選定した現在位置対応ブロック群音声認識辞書とを備え、前記現在位置対応ブロック群選定部は、前記現在位置が所属するブロックの周辺ブロックについて、地名と施設名の量ができる限り均等になるように、現在位置の移動と共に変化させ、入力した地名または施設名の音声に対応する音素データを、前記現在位置対応ブロック群音声認識辞書から検索し出力する音声認識処理部を備えたものである。
【００１９】
また、本発明に係る他のナビゲーション用音声認識装置は、前記緯度と経度で分割したブロックが、各ブロックに含まれる地名と施設名の量ができる限り均等になるようにブロックの大きさを変えて分割されているものである。
【００２０】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置が所属するブロックの周辺ブロックを、現在位置所属ブロックに隣接するブロックとしたものである。
【００２１】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置が所属するブロックの周辺ブロックを、現在位置所属ブロックに隣接するブロックと、更にその周囲の所定範囲のブロックを含むようにしたものである。
【００２３】
また、本発明に係る他のナビゲーション用音声認識装置は、前記音声認識処理部が、現在位置対応ブロック群音声認識辞書に含まれる地名または施設名について、現在位置に近いブロックに存在するほど音声近似度を大きく設定したものである。
【００２４】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置対応ブロック群音声認識辞書が、現在位置対応ブロック群のブロックを記録したリストからなり、前記音声認識処理部は、入力した音声に対応する音素データを検索する際、前記リストに記録されたブロックの辞書を選択して検索するようにしたものである。
【００２５】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置所属ブロック選定部において、現在位置の移動により現在位置が所属するブロックが変更し新たな現在位置所属ブロックを選定したとき、周辺ブロックの選定処理を行うようにしたものである。
【００２６】
また、本発明に係る他のナビゲーション用音声認識装置は、複数候補絞込処理部を備え、前記複数候補絞込処理部は、前記現在位置対応ブロック群音声認識辞書による音声認識処理の結果複数の候補が存在するときに、絞り込み処理を行うようにしたものである。
【００２７】
また、本発明に係る他のナビゲーション用音声認識装置は、地名と施設名をカテゴリ別に記録したカテゴリ別音声認識辞書を備え、前記複数候補絞込処理部は、利用者が指示したカテゴリに対応する前記カテゴリ別音声認識辞書を用いて音声認識処理を行うようにしたものである。
【００２８】
また、本発明に係る他のナビゲーション用音声認識装置は、前記複数候補絞込処理部には、音声認識処理の結果得られた複数候補について各々現在地からの距離を演算し、演算結果により距離順に配列する候補地点距離順配列部を備え、前記候補地点距離順配列部の出力により表示部に複数候補を順にリスト表示し、利用者がこれにより選択を行うようにしたものである。
【００２９】
また、本発明に係る他のナビゲーション用音声認識装置は、音声認識処理の結果得られた複数候補について各々現在地からの距離を演算し、最も距離の近い候補を音声認識結果として出力するようにしたものである。
【００３０】
【発明の実施の形態】
本発明の実施の形態を図面に沿って説明する。図１は本発明によるナビゲーション用音声認識装置の主要機能部とそれらの相互の関係を示す機能ブロック図であり、前記図１５に示す従来の音声認識装置５０部分において、本発明の音声認識処理を行う機能ブロックを示したものである。同図においてマイク１からの音声は音声入力部２から入力し、音声特徴パラメータ抽出部３で入力した音声の特徴を抽出し、音声認識部４に出力する。音声認識部４では音声認識用辞書部５の各種辞書を適宜選択し、選択した辞書の音素データの中で前記抽出した音声の特徴パラメータに適合するものを検索する。
【００３１】
音声認識用辞書部５には、前記図１５に示す従来の地名施設名辞書部５６と同様に地名・施設名辞書部６を備え、目的地や経由地を設定するために地名を入力する際、或いは目的地や経由地設定時に目安となる施設を入力する際に、更には車両走行中に特定の施設を利用する際にこの辞書を用いて音声認識処理を行うことができるようにしている。また、このような地名や施設名以外に、例えば表示している地図の縮尺の拡大や縮小、３Ｄ表示等の地図表示の変更等の、ナビゲーション装置の各種機能を音声操作するため、操作機能等その他の辞書７を備えている。なお、この辞書の選択は、入力した音声の中に、予め定められたナビゲーション装置の機能を操作するための音声が含まれているか否かを操作機能等その他の辞書７を用いて判別し、含まれていないときには地名や施設名に関する音声であると判別するために用いることもできる。
【００３２】
地名・施設名辞書部６には、図５（ａ）に示すように、一例として緯度ａ１、ａ２、・・・で分割される緯度分割領域と、経度ｂ１、ｂ２、・・・で分割される経度分割領域が交差する地域のブロックについて、この地域に含まれる地名及び施設名の音声認識用データと、その地名や施設名が存在する地点の位置データ等を記録した音声認識用辞書である緯度・経度分割ブロック音声認識辞書１３を備えている。図５（ａ）に示す例においては、現在地が存在する地点が所属するブロック（ｎ，ｍ）は、緯度ａ３とａ４の間の緯度分割領域であるｎ領域と、経度ｂ３とｂ４の間の経度分割領域であるｍ領域とが交差する地域のブロックとして示される。
【００３３】
図５（ａ）の例においては、緯度分割領域であるｎ領域を中心にｎ＋１，ｎ＋２及びｎ−１，ｎ−２の各領域を示し、また経度分割領域であるｍ領域を中心にｍ＋１，ｍ＋２及びｍ−１，ｍ−２の各領域を示し、これらの領域が交差するブロックとして例えば（ｎ＋２，ｍ−２）、（ｎ＋１，ｍ−２）、（ｎ，ｍ−２）、・・・（ｎ＋２，ｍ−１）、（ｎ＋１，ｍ−１）、・・・・・（ｎ−２，ｍ＋２）等が存在している。
【００３４】
図１の地名施設名辞書部６は、ナビゲーション装置２８が備えている、例えば図１５における現在位置検出部４１のような車両の現在位置を検出する機能部の現在位置信号を、現在位置入力部１０から入力している。現在位置所属ブロック選定部１１はこの現在位置入力部１０から入力した現在位置信号により、緯度・経度分割ブロック音声認識辞書１３に蓄積された、前記のように各ブロック毎に分割された音声認識辞書の中から、現在位置が所属するブロックを選定する。図５（ａ）に示す例においては、車両の現在地は前記のようにブロック（ｎ，ｍ）に存在するので、このブロックが選定されることとなる。
【００３５】
このようにして現在位置所属ブロックが選定されると、そのデータを周辺ブロック選定部１２に出力し、周辺ブロック選定部１２では、例えば現在地の存在するブロックに隣接して取り囲む全ての周辺のブロックを選定するように、予め定められた周辺のブロックを選定する。図５（ｂ）にはそのときの例を示しており、周辺ブロックとして現在位置所属ブロック（ｎ，ｍ）に隣接して取り囲む（ｎ＋１，ｍ−１）、（ｎ，ｍ−１）、（ｎ−１，ｍ−１）、（ｎ＋１，ｍ）、（ｎ−１，ｍ）、（ｎ＋１，ｍ＋１）、（ｎ，ｍ＋１）、（ｎ−１，ｍ＋１）の合計８個のブロックが選定される。
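The neighbor selection described in paragraph 【００３５】 can be sketched as follows. This is only an illustrative sketch, not the implementation disclosed in the patent: the names `neighbor_blocks` and `block_group` and the `radius` parameter are assumptions introduced here, and blocks are identified by integer pairs (n, m) as in Fig. 5.

```python
def neighbor_blocks(n, m, radius=1):
    """Blocks within `radius` rows/columns of (n, m), excluding (n, m) itself.

    The iteration order follows the listing in the text:
    (n+1, m-1), (n, m-1), (n-1, m-1), (n+1, m), (n-1, m), ...
    """
    return [
        (n + dn, m + dm)
        for dm in range(-radius, radius + 1)
        for dn in range(radius, -radius - 1, -1)
        if (dn, dm) != (0, 0)
    ]


def block_group(center, radius=1):
    """Current-position block followed by its surrounding blocks."""
    n, m = center
    return [center] + neighbor_blocks(n, m, radius)
```

With `radius=1` the group holds 9 blocks, as in Fig. 5(b). Calling `block_group((n, m + 1))` after the vehicle crosses boundary point Q gives the shifted group of Fig. 5(d).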
【００３６】
図１において、上記のように現在位置所属ブロック選定部１１で選定した１個のブロック、及び周辺ブロック選定部１２で選定した８個のブロックの合計９個のブロックに対応するブロック単位音声認識辞書を抽出し、現在位置対応ブロック群音声認識辞書１４に蓄積する。音声認識部４においては、この現在位置対応ブロック群音声認識辞書１４を用いて地名及び施設名の音声認識処理を行う。
【００３７】
現在位置所属ブロック選定部１１は、車両の移動と共に変化する現在位置信号に応じて、現在位置が先に選定した現在位置所属ブロックの範囲に存在するか否かを常に検出しており、そのブロックから他のブロックに移動したときには直ちに現在位置が所属する新しいブロックを選定する。図５（ｃ）にはこの状態を示しており、車両がブロック（ｎ，ｍ）の道路Ｌ上の地点Ｐから、ブロック（ｎ，ｍ）とブロック（ｎ，ｍ＋１）の境界点Ｑを通過し、ブロック（ｎ，ｍ＋１）に入り、地点Ｒの方向に走行したとき、境界点Ｑを通過した時点で現在位置所属ブロック選定部１１は現在位置所属ブロックがブロック（ｎ，ｍ＋１）に変更になったことを検出し、このブロックを選定する。
【００３８】
この選定結果を周辺ブロック選定部１２に出力し、前記と同様に周辺ブロック選定部１２は現在位置所属ブロックの周辺の８個のブロックを選定する。このような選定の結果図５（ｄ）に示すように、現在位置所属ブロック（ｎ，ｍ＋１）とその周囲の、（ｎ＋１，ｍ）、（ｎ，ｍ）、（ｎ−１，ｍ）、（ｎ＋１，ｍ＋１）、（ｎ−１，ｍ＋１）、（ｎ＋１，ｍ＋２）、（ｎ，ｍ＋２）、（ｎ−１，ｍ＋２）の合計８個のブロックが選定される。
【００３９】
上記のように選択された９個のブロックに対応する音声認識辞書が、現在位置対応ブロック群音声認識辞書１４に蓄積され、例えば地点Ｑにおいて利用者がセブンイレブン小山乙女店を知りたいと思ったとき、マイクから「セブンイレブン小山乙女店」と音声を入力すると、音声認識部４では入力音声に近い施設名について現在位置対応ブロック群音声認識辞書１４内の施設名を検索し、例えば「セブンイレブン〇〇店」の他に、「セブンイーグル」（スポーツ店）や「セブンイースト」（旅行店）等、車両の近くに存在する施設名を探し出すことができるようにしている。
【００４０】
図５（ａ）に示すような緯度と経度で分割するブロックの設定に際して、その緯度と経度は任意に設定することができるが、これを狭い地域に設定した場合には、そのときのブロック群内に含まれる地名や施設名の数は少なくなる。そのため、もしもこのデータの中に所望の地名や施設名が存在するときには、音声認識部４で検索する全体のデータ量が少ないため、同じ処理速度のＣＰＵを用いても高速で音声認識処理を行うことができる。しかしながら、音声認識の対象とする範囲が限られてしまうため、そのブロック群内に存在しない可能性も高くなり、その点では音声認識処理が適切に行われないこととなる。
【００４１】
また逆に、緯度と経度で分割するブロックの設定を広い地域に設定した場合には、そのときのブロック群内に含まれる地名や施設名の数は多くなり、音声認識部４で検索する全体のデータ量が多くなるため、同じ処理速度のＣＰＵを用いた場合には音声認識処理を行うのに多くの時間を要することとなる。但し、そのときには音声認識の対象とする範囲が広くなるため、利用者が意図する地名や地点がそのブロック群内に存在する可能性が高くなり、その点では音声認識処理が適切に行われることとなる。
【００４２】
このように、音声認識の対象とする範囲を適切に設定するに際して、例えば誘導経路を設定する際に現在地から遠くに存在する目的地の地名を音声入力する場合と、誘導経路に沿って走行しているとき買い物を行うために近くのセブンイレブンを探す場合とでは、探す対象地域が大きく異なるため、音声認識を行う状況に応じて周辺ブロックの選択範囲を狭くする場合と広くする場合とで任意に選択することができるように予め設定しておいても良い。その際には図１における周辺ブロック選定部１２に選定範囲調整部を設け、利用者により任意に選択し、或いは誘導経路設定のための音声入力と、誘導経路に沿って走行中の音声入力とを識別し、自動的にその範囲を変更するように設定してもよい。
【００４３】
図６（ａ）には上記のような周辺ブロックの選定範囲調整を行うときの例を示しており、現在位置が緯度ａ５とａ６間の緯度分割領域と、経度がｂ５とｂ６の間の経度分割領域が交差するブロックＳ１に存在するとき、周辺ブロックの範囲を狭く設定した場合には緯度ａ４とａ７の間で経度がｂ４とｂ７の間に含まれる９個のブロック群Ｓ２が周辺ブロックとして選定され、周辺ブロックの範囲を広く設定した場合には緯度ａ２とａ９の間で経度がｂ２とｂ９の間に含まれる４９個のブロック群Ｓ３が設定される。このようなブロック群の設定は、全領域で常に同じ広さに設定する以外に、地名や施設名の多い地域においては狭い範囲に設定し、少ない地域においてはこれを広く設定することを自動的に変化させることもできる。
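Paragraph 【００４３】 notes that the surrounding range can be narrowed automatically where place and facility names are dense and widened where they are sparse. One way this could work is sketched below. The selection rule (take the widest range whose total entry count stays within a budget), the names `group_blocks`, `choose_radius`, `name_counts` and `max_names`, and any counts passed in are assumptions made for illustration; the patent does not specify a rule.

```python
def group_blocks(center, radius):
    """All blocks within `radius` of `center`, including `center` itself."""
    n, m = center
    return [
        (n + dn, m + dm)
        for dn in range(-radius, radius + 1)
        for dm in range(-radius, radius + 1)
    ]


def choose_radius(center, name_counts, max_names, max_radius=3):
    """Pick the widest radius whose block group stays within the budget.

    name_counts: hypothetical mapping block -> number of dictionary entries.
    max_names:   hypothetical number of entries the recognizer can handle.
    Falls back to radius 1 even if that group already exceeds the budget.
    """
    best = 1
    for radius in range(1, max_radius + 1):
        total = sum(name_counts.get(b, 0) for b in group_blocks(center, radius))
        if total <= max_names:
            best = radius
        else:
            break
    return best
```

Radius 1 gives the 9-block group S2 and radius 3 the 49-block group S3 of Fig. 6(a).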
【００４４】
また、上記のような音声認識辞書の所定の範囲の設定に際して、例えば前記図６（ａ）に示すように緯度と経度のメッシュを細かく設定し、その範囲を広く合計４９個のブロック群Ｓ３を選定したものに対して、同図（ｂ）に示すように緯度と経度のメッシュを大きく設定し、緯度ｃ３とｃ４の間で経度がｄ３とｄ４の間の現在位置所属ブロックＴ１の周辺の９個のブロックを周辺ブロック群Ｔ２として選定したものと、実質的に同じ音声認識辞書として選定することができ、このような調整によっても音声認識辞書群を任意に選定することができる。
【００４５】
なお、前記図６（ａ）に示すように、現在位置所属ブロックＳ１に隣接するブロック群Ｓ２と、更にその周囲の適宜設定した範囲のブロック群Ｓ３の設定が行われる場合には、上記のように任意にブロック群の選択を切り替える以外に、現在位置所属ブロックＳ１に近いほど、ブロックに含まれる地名及び施設名について、音声認識処理における入力音声との近似度演算処理の重み付け係数を大きくし、利用者が意図している名称の可能性が高いものと判別するように設定することもできる。このような設定を行うことにより、より正確な音声認識が可能となるが、例えば全ての地名や施設名について、現在位置からの距離を演算して各候補の重み付けを行うものよりも、上記のようにブロック毎にまとめて演算を行う方がより音声認識処理の負担を軽減することができる。
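The block-level weighting of paragraph 【００４５】 might look like the following sketch. It assumes the recognizer already yields an acoustic similarity score per candidate. The weight values, the use of block "rings" around the current block, and the names `ring_distance`, `weighted_scores` and `ring_weights` are hypothetical; the patent says only that blocks nearer the current position receive larger weighting coefficients.

```python
def ring_distance(block, current):
    """Distance in blocks: 0 for the current block, 1 for adjacent blocks, etc."""
    return max(abs(block[0] - current[0]), abs(block[1] - current[1]))


def weighted_scores(candidates, current, ring_weights):
    """Apply one weight per block ring instead of one per place name.

    candidates:   list of (name, block, acoustic_score) tuples.
    ring_weights: hypothetical weights indexed by ring distance, largest first;
                  rings beyond the end of the list reuse the last weight.
    Returns (name, weighted_score) pairs, best match first.
    """
    last = len(ring_weights) - 1
    scored = [
        (name, score * ring_weights[min(ring_distance(block, current), last)])
        for name, block, score in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Because the weight depends only on the block, no per-candidate distance calculation is needed, which is the load reduction the paragraph describes.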
【００４６】
更に、音声認識辞書の範囲の設定に際しては、音声認識装置の処理能力を考慮して各ブロックに含まれる地名や施設名の地点数が所定の範囲となるように設定することが好ましい。その際には、このようにしてブロック化された結果得られるブロック群において、最も地点数が多いブロック群でも充分にこの音声認識装置で対応することができる範囲に設定する。
【００４７】
地名や施設名の地点数が多いブロックに対応するに際して、前記のように最も地名や施設名が多いブロックでも、使用する音声認識装置の処理能力で十分に対応することができる程度の大きさに全ての範囲で細分化する場合には、他の地域では逆に地名や施設名の地点数が極めて少なくなり、音声認識辞書の範囲としては好ましくない場合が生じる。
【００４８】
その対策として、例えば図７（ａ）に示すように、地名・施設名の多い地域を含むブロックの範囲Ａを細分化し、緯度ａ２とａ４の間にブロック分割用の緯度ａ３を設定し、緯度ａ４とａ６の間にブロック分割用の緯度ａ５を設定している。同様に経度ｂ３とｂ５の間にｂ４を、経度ｂ５とｂ７の間にｂ６を設定しており、それによりブロックの範囲Ａについては他のブロックの４分の１の面積のブロックを形成し、１つのブロックに含まれる地名や施設名の地点数をできるだけ均等化する。更に必要があるときには例えば図７（ｂ）に示すように、前記図７（ａ）のブロックの範囲Ａにおいて、特に地名・施設名の多いブロックの範囲Ｂで、各ブロックを４分割した大きさのブロックを形成し、各ブロックに含まれる地名・施設名の数を均等化することもできる。
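The subdivision of paragraph 【００４８】 splits a dense block into four blocks of one quarter the area, and may split again as in Fig. 7(b). A recursive sketch is given below. The threshold-driven rule and the names `subdivide`, `max_points` and `max_depth` are assumptions for illustration; the patent shows fixed subdivided ranges A and B rather than an algorithm.

```python
def subdivide(block, points, max_points, max_depth=2):
    """Quarter a block until it holds at most `max_points` entries.

    block:  (lat_min, lat_max, lon_min, lon_max), treated as half-open,
            so points are assumed to satisfy min <= value < max.
    points: list of (lat, lon) positions of place/facility names.
    max_depth=2 allows the two levels of Fig. 7(a) and Fig. 7(b).
    Returns a list of (block, points_in_block) leaves.
    """
    lat_min, lat_max, lon_min, lon_max = block
    if len(points) <= max_points or max_depth == 0:
        return [(block, points)]
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    leaves = []
    for la0, la1 in ((lat_min, lat_mid), (lat_mid, lat_max)):
        for lo0, lo1 in ((lon_min, lon_mid), (lon_mid, lon_max)):
            inside = [p for p in points if la0 <= p[0] < la1 and lo0 <= p[1] < lo1]
            leaves.extend(
                subdivide((la0, la1, lo0, lo1), inside, max_points, max_depth - 1)
            )
    return leaves
```

A leaf can still exceed `max_points` once `max_depth` is exhausted, so the counts are only approximately equalized, in line with the text's "as evenly as possible".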
【００４９】
このようなきめ細かな分割を行うことにより、例えば１つの国の全体を緯度と経度で分割した音声認識辞書用のブロックを形成するに際して、例えば日本における関東地方のように、首都の東京都がある地域の人口密度が高いことによりその地域には多くの地名や施設名が存在するため、この地域においては細分化したブロックの範囲Ａを形成し、更にその中の東京都では特に多くの地名・施設名が存在するので、この部分を細分化したブロックの範囲Ｂを形成する、というようにこれを利用する。
【００５０】
例えば前記図７（ａ）のように音声認識辞書用のブロックを分割した際には、車の移動に伴って例えば図８（ａ）（ｂ）（ｃ）のように現在位置対応ブロック群が選定される。即ち図８（ａ）においてブロックＦ１部分に現在位置が存在しており、そのブロックの周囲に存在するブロック群を含んだ現在位置対応ブロック群Ｇ１が選定される。このとき現在位置所属ブロックＦ１がブロック細分化範囲に隣接しているため、ブロック細分化範囲部分では狭い範囲が周辺ブロックとして選定されている。
【００５１】
次いで、現在位置が前記ブロックＦ１から同図（ｂ）のブロックＦ２に移動したときには、現在位置対応ブロック群Ｇ２が選定される。このとき、現在位置所属ブロックＦ２はブロック細分化範囲のブロックであるため、その周囲のブロックはより狭い範囲となっている。更に現在位置が同図（ｃ）に示すようにブロックＦ３に移動したときには、周囲のブロックが全てブロック細分化範囲のため、現在位置対応ブロック群Ｇ３は最も狭い範囲となる。このように、地名・施設名の数が多いブロックを細分化した際には、音声認識辞書に含まれる地名・施設名の数を常に略均等に保った状態で音声認識処理を行うことができる。なお、図７（ｂ）に示すようなブロックの細分化を行った場合も同様に作用する。
【００５２】
上記のような種々の態様の音声認識辞書を用いて、図１における音声認識部４は地名や施設名の音声認識を行うものであるが、その認識の結果、入力した音声に対応する地名或いは施設名が一つだけの場合はこれを出力確認部２６に直接出力し、操作信号入力部２５からの利用者の確認信号が入力したときにはこれを認識結果出力部２７からナビゲーション装置２８に出力する。
【００５３】
このとき、地名や施設名の候補が複数存在するときには、この実施例においてはそのデータを複数候補リスト表示出力部２３に出力し、ナビゲーション装置の表示部２４にそのリストを表示する。また、前記複数の候補リストは複数候補絞込処理部２０のカテゴリ別音声認識部２１に出力し、それらの候補の中から利用者の意図する地名や施設名を検索する際に、カテゴリ別に音声認識を行うことができるようにする。そのため、この実施例においては地名・施設名辞書部６に、地名・施設名カテゴリ別音声認識辞書１５を備え、カテゴリに基づいた音声認識を行うことができるようにしている。なお、上記の例においては、音声認識部４での認識処理の結果複数の候補が存在するとき、この複数の候補を表示部２４にリスト表示する例を示したが、これを表示することなく、直ちにジャンルを選択する指示表示を行い、利用者にジャンルの選択を行うように促すようにしても良い。
【００５４】
利用者は表示部２４に別途表示されるカテゴリのリストを参考にし、操作信号入力部２５におけるリモコンのカーソルキー操作等によって希望するカテゴリを選択し、或いは別途カテゴリ名を音声入力しこれを音声認識させることによって、その信号をカテゴリ別音声認識部２１に出力する。なお、このときの音声認識処理に際しては、音声認識辞書部５における操作機能等その他の辞書７が用いられる。カテゴリ別音声認識部２１は、利用者が指示したカテゴリの信号により、前記地名・施設名カテゴリ別辞書１５の対応するカテゴリの辞書を検索し、そのカテゴリ別辞書に記録されている地名や施設名の音素データの中から、利用者が入力した音声に適合するものを検索し絞り込みを行う。
【００５５】
カテゴリ別音声認識部２１における上記のような音声認識処理によって、前記のような複数の候補が絞り込まれ、その結果１つだけの候補だけとなったときにはこれを出力確認部２６に直接出力し、前記と同様に利用者の確認を求め、確認されたときにはこれを認識結果出力部２７からナビゲーション装置２８に出力する。このとき、未だ候補が複数存在するときには、これを候補地点距離順配列絞込部２２に出力し、各候補について現在地からの距離を計算し、距離の近いものから順に配列したリストを作成する。そのリストは複数候補リスト表示出力部２３を介して表示部２４に出力しリストの表示を行う。
【００５６】
利用者はこのリスト表示を参考にし、操作信号入力部２５から適当と思われるものを選択して候補地点距離順配列絞込部２２に出力する。候補地点距離順配列絞込部２２においては、指示された候補を出力確認部２６に出力し、出力確認部２６は前記と同様に表示部２４に出力し、これを見た利用者からの確認信号が操作信号入力部から入力したときには、認識結果出力部２７を介してナビゲーション装置２８に出力する。なお、前記実施例においては、複数の候補について距離順に配列したリストを表示する例を示したが、リスト表示することなく、最も近い候補を出力確認部２６に出力し、利用者の確認を求めるように設定することもできる。
【００５７】
図１に示した上記のような機能を行う機能ブロックからなるナビゲーション用音声認識装置においては、例えば図２及び図３に示すような作動フローによって順に作動させることができる。以下、前記図１の機能ブロック図を参照しつつ説明する。この音声認識処理に際して最初は図２に示すように利用者からの認識用音声の入力が行われる（ステップＳ１）。次いでこの音声の特徴パラメータの抽出処理を行い（ステップＳ２）、入力した音声の中に地名や施設名に関する音声を含むか否かの判別を行う（ステップＳ３）。
【００５８】
この判別に際しては、図１の音声入力部２で入力したマイク１からの音声の特徴パラメータを音声特徴パラメータ抽出部３において抽出し、音声認識部４で音声認識用辞書部５における操作機能等その他の辞書７の中に前記抽出した音声特徴パラメータに対応するものが存在するか否かを検索し判別することにより行うことができる。この判別は音声認識用辞書部５において、地名・施設名辞書部６のデータは例えば数百万ＰＯＩ（Point of interest)のように莫大なデータが存在する場合があるのに対して、操作機能等その他の辞書７には数百程度の言葉が存在するのみであることが通常であるので、操作機能等その他の辞書７を用いることにより上記のような判別を容易に行うことができる。但し、このような判別手法以外に、利用者が目的地設定入力操作を行っている途中の音声入力であることを検出することにより、入力した音声は地名・施設名に関する音声であると判別することができ、また、「近くのコンビニ」のように施設名を検索する際の予め決められた特定の用語である「近くの」の言葉を認識したとき、次に続く言葉は施設名であるとして判別することもできる。
【００５９】
この判別において、利用者が発声した音声には地名や施設名が含まれていると判別されたときには、現在位置に対応した地名・施設名の音声認識辞書の設定処理を行う（ステップＳ４）。その処理フローは図３に示しており、後に詳述するが、このステップにおいては前記のように、このナビゲーション装置が対象としている国、或いは地域を緯度と経度で分割した複数の地域のブロックに含まれる地名や施設名を１つの単位音声認識辞書として形成し、その集合として地名・施設名音声認識辞書が形成されているので、その中で現在地が存在するブロックを選定し、更にそのブロックの周囲に存在する所定範囲のブロックを選定することにより、現在位置対応ブロック群音声認識辞書の設定を行っている。
【００６０】
このようにして設定した音声認識辞書を用い、入力した音声の特徴パラメータに対応する音声データの検索処理を行い（ステップＳ５）、その音声データに対応する地名や施設名を求め、更にその地名や施設名と共に記録されているそれらが存在する地点の位置データ等の必要な種々のデータを読出す。上記音声認識辞書の検索処理は従来から用いられている各種の検索手法を用いることができるので、ここでの説明は省略する。
【００６１】
上記のように行われた検索処理の結果、得られた地名や施設名に対応する地点が１つだけであるか否かを判別する（ステップＳ６）。ここで１つだけではない、即ち複数存在すると判別されたときには、複数候補の絞り込み処理を行う（ステップＳ７）。この処理フローは図４に示しており、後に詳述するが、前記のように図１の複数候補絞込処理部２０においてカテゴリによる絞り込みを行い、更に必要な場合には複数の候補地点を距離順に配列し、利用者がこれを参考にして選択する絞り込み処理を行う。
【００６２】
このような絞込処理を行った後、得られた音声認識結果を表示部等に出力し、利用者に確認を促す（ステップＳ８）。次いで利用者によって認識結果が適切なものであるか否かが判断され（ステップＳ９）、認識結果が適切であると判断されたときには音声入力を終了するか否かを判別し（ステップＳ１０）、利用者がその後所定期間の間に新たな音声入力を行わない等により、音声入力を終了したものと判別されたときには、この作動フローを終了する（ステップＳ１３）。また、前記ステップＳ６において音声認識辞書の検索処理の結果、候補は１つだけであると判別したときには、直ちにステップＳ８に進み、得られた候補は利用者の意図するものであるかの確認を行う。
【００６３】
一方、前記ステップＳ３において、利用者が入力した音声の中には地名や施設名に関する音声を含んでいないと判別したときには、操作機能等その他の音声認識を行うための辞書を選択し（ステップＳ１１）、その辞書を用いて入力した音声に対応する言葉を検索する。その音声認識によって得られた言葉については前記と同様に、これを表示部に表示し、或いは特定の機能を行うが良いか、という問い合わせの音声を発声する等の出力を行い、利用者の確認を促す。
【００６４】
また、前記ステップＳ１０において、再び利用者が音声入力を行ったときのように、音声入力が終了していないと判別されたときにはステップＳ１に戻り、前記と同様の作動を繰り返す。また、前記ステップＳ９において、音声認識の結果の確認出力に対して利用者が判断し、その認識結果が利用者の意図するものではないと判断したときにはその旨の信号を入力することによりステップＳ１に戻り、再度認識用の音声入力を行い、以降同様の作動を行う。
【００６５】
前記ステップＳ４における現在位置に対応した地名・施設名の音声認識辞書の設定処理に際しては、例えば図３に示す作動フローによって処理することができる。この処理の最初に、現在位置のデータの取り込みを行う（ステップＳ２１）。この作動は、図１における現在位置入力部１０が、ナビゲーション装置２８の現在位置検出部から信号を取り込むことによって行う。
【００６６】
この実施例においては次いで、これから行う音声認識辞書の選択処理が最初のものであるか否かを判別する（ステップＳ２２）。ここで最初の音声認識辞書の選択処理であると判別したときには、現在位置所属ブロックの選定を行う（ステップＳ２３）。この処理は、図１の緯度・経度分割ブロック単位音声認識辞書１３の中に、例えば前記図５（ａ）に示すような緯度で分割される緯度分割領域と、経度で分割される経度分割領域が互いに交差した部分に形成されるブロック部分に、予め識別番号を付与したリストを形成しておき、現在位置が入力されたときその緯度と経度データにより特定のブロックを選定することによって求めることができる。
【００６７】
実際のブロック分けに際しては、例えば図９に示す日本の例のように、１度単位で緯度と経度を分割し、緯度分割領域をＡ１〜Ａ１６とし、経度分割領域をＢ１〜Ｂ１７とすることにより、その交差する部分を緯度・経度分割ブロックとする。このようなブロック分けにより、例えば東京都のほとんどはブロック（Ａ６，Ｂ１１）に含まれる。また、このようなブロック分けに際しては、各ブロックに含まれる地名や施設名のデータ量に応じて適宜細分化し、或いはまとめても良い。
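The block lookup of paragraphs 【００６６】 and 【００６７】 can be sketched with a binary search over the boundary values. The grid origin used here (30°N, 129°E) is an assumption: the patent gives only the band counts A1 to A16 and B1 to B17 and the 1-degree step, and this origin was chosen because it reproduces the text's examples of Tokyo in (A6, B11) and Yamanashi in (A6, B10). The names `make_edges` and `block_of` are likewise hypothetical.

```python
from bisect import bisect_right


def make_edges(start_deg, count, step_deg=1.0):
    """Boundary values for `count` bands starting at `start_deg`."""
    return [start_deg + i * step_deg for i in range(count + 1)]


def block_of(lat, lon, lat_edges, lon_edges):
    """Return the 1-based (latitude band, longitude band) for a position.

    Works for unevenly spaced edges as well, as long as they are sorted.
    Returns None when the position lies outside the grid.
    """
    n = bisect_right(lat_edges, lat)
    m = bisect_right(lon_edges, lon)
    if not (1 <= n < len(lat_edges) and 1 <= m < len(lon_edges)):
        return None
    return n, m


# Assumed grid: A1..A16 from 30N, B1..B17 from 129E, 1-degree bands.
LAT_EDGES = make_edges(30.0, 16)
LON_EDGES = make_edges(129.0, 17)
```

For an approximate central Tokyo position, `block_of(35.68, 139.69, LAT_EDGES, LON_EDGES)` yields band 6 and band 11 under this assumed origin.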
【００６８】
次いで上記のようにして得られた現在位置所属ブロックを元に、そのブロックの所定の範囲の周辺ブロックを選定する（ステップＳ２４）。この作動は図１の周辺ブロック選定部１２が、前記現在位置所属ブロック選定部１１で選定したデータに基づき、そこで選定した現在位置ブロックの周囲における予め定められた範囲のブロックを選定することにより行う。
【００６９】
実際の周辺ブロックの選定に際して、図１０（ａ）に示す例のように選定することができ、この例においては、前記図９のようにして緯度と経度で分割した場合において、例えば現在位置が山梨県に位置することによってブロック（Ａ６，Ｂ１０）が現在位置所属ブロックとして選定され、その周囲のブロック（Ａ７，Ｂ９）、（Ａ６，Ｂ９）、（Ａ５，Ｂ９）、（Ａ７，Ｂ１０）、（Ａ５，Ｂ１０）、（Ａ７，Ｂ１１）、（Ａ６，Ｂ１１）、（Ａ５，Ｂ１１）の合計８個のブロックを周辺ブロックとして選定している。
【００７０】
その後、上記のようにステップＳ２３で得られた現在位置所属ブロックと、ステップＳ２４で得られた現在位置所属ブロックの所定範囲の周辺ブロックについて、それらのブロックに対応する音声認識辞書を緯度・経度分割ブロック音声認識辞書１３の中から選定して、ブロック対応音声認識辞書群として選択し、これをまとめることにより現在位置対応ブロック群音声認識辞書１４とする（ステップＳ２６）。その後再びステップＳ２１に戻り、以降同様の作動を繰り返す。
【００７１】
上記のような辞書の作成に際しては、実際にこれらのデータを全てまとめて記憶部に一時的に記録するほか、単に緯度経度分割ブロック音声認識辞書１３に存在する各辞書のうち、現在音声認識辞書として使用するものをリスト化して記録するのみでも良い。このようにリスト化したデータを作成するのみの場合には、音声認識処理に際して、入力した音声に対応する音素データを検索する際、緯度・経度分割ブロック音声認識辞書１３の中で、前記リストに存在するブロックの辞書のみを選択して検索を行うことにより音声認識処理を実行する。
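Paragraph 【００７１】 describes keeping only a list of the active blocks instead of copying their dictionaries. A minimal sketch of that idea follows. The class name `BlockGroupDictionary`, its methods, and the exact-match lookup on a phoneme string are assumptions made to keep the example short; a real recognizer would perform acoustic matching rather than a dictionary key lookup.

```python
class BlockGroupDictionary:
    """Current-position block-group dictionary held as a list of block ids."""

    def __init__(self, store):
        # store: block id -> {phoneme string: place/facility record}.
        # This stands in for the per-block dictionary storage (13).
        self._store = store
        self._active = []  # only block ids are recorded here (14)

    def set_active(self, block_ids):
        """Record which blocks are currently in use; no entries are copied."""
        self._active = [b for b in block_ids if b in self._store]

    def lookup(self, phonemes):
        """Search only the dictionaries of the listed blocks."""
        hits = []
        for block_id in self._active:
            record = self._store[block_id].get(phonemes)
            if record is not None:
                hits.append((block_id, record))
        return hits
```

Switching the active group when the vehicle moves then rewrites only a short list of ids, which is the reduced rewrite cost the paragraph points to.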
【００７２】
一方、前記ステップＳ２２において、これから行う音声認識辞書の選択処理が最初の処理ではないと判別したとき、即ち既に音声認識辞書の選択が行われているときには、現在位置所属ブロックが変わったか否かの判別を行う（ステップＳ２５）。即ち、車両の移動に伴い現在位置が変化するとき、現在位置が先に選定した現在位置所属ブロックの範囲から出たか否かを検出する。その結果、現在位置所属ブロックが変わっていないと判別されたときには再びステップＳ２１に戻り、現在位置データの取り込みを継続し同様の作動を繰り返す。
【００７３】
前記ステップＳ２５において、現在位置所属ブロックが変わったと判別したときにおいては、ステップＳ２３に進み、前記と同様に現在位置所属ブロックの選定、次いでステップＳ２４においてその現在位置所属ブロックの周囲における所定周辺ブロックの選定を行い、以下同様の作動を繰り返す。この実施例においてはステップＳ２５において現在位置所属ブロックが変わったと判別したときのみステップＳ２３以降のブロック選定処理を行うようにしているので、現在位置データを取り込むたび毎に常にステップＳ２３以降の処理を行うことがないようにし、地名・施設名辞書選定処理における処理負担を軽減している。
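The flow of Fig. 3 (steps S21 to S26) reselects blocks only when the current block changes. A sketch of that control logic is shown below. The class `DictionarySelector` and the injected `locate` and `neighbors` callables are hypothetical names; how positions map to blocks and which neighbors are chosen is left to the caller.

```python
class DictionarySelector:
    """Re-run block selection only when the current block changes."""

    def __init__(self, locate, neighbors):
        self._locate = locate        # (lat, lon) -> block id
        self._neighbors = neighbors  # block id -> list of surrounding block ids
        self.current_block = None    # None means no selection has been made yet
        self.active_blocks = []
        self.rebuild_count = 0

    def update(self, lat, lon):
        """Process one position sample (S21); return True if blocks were reselected."""
        block = self._locate(lat, lon)
        # S22 / S25: first selection, or has the current block changed?
        if block == self.current_block:
            return False
        self.current_block = block                             # S23
        self.active_blocks = [block] + self._neighbors(block)  # S24, S26
        self.rebuild_count += 1
        return True
```

Position samples that stay inside the same block return early, so the selection work of S23 onward is skipped for them.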
【００７４】
上記のような現在位置の移動による音声認識辞書の選定処理の結果、例えば前記図１０（ａ）に示す現在位置所属ブロック、及び周辺ブロックからなる現在位置対応ブロック群においては、現在位置が山梨県から東京都に入った場合には、現在位置所属ブロックは同図（ｂ）に示すようにブロック（Ａ６，Ｂ１１）となり、それに伴って周辺ブロックも図示するように移動する。更に現在地が東京都から埼玉県に入り、緯度分割領域Ａ６からＡ７に入ったとき、現在位置所属ブロックは同図（ｃ）に示すようにブロック（Ａ７，Ｂ１１）に移動し、それに伴って周辺ブロックも図示するように移動する。このように、現在位置の移動に応じて、音声認識処理を行う辞書も変化させることができる。
【００７５】
前記図２のステップＳ７における複数候補の絞り込み処理に際しては、例えば図４に示す作動フローに従って順に処理することができる。即ち図２のステップＳ５において音声認識辞書の検索処理が行われ、その結果ステップＳ６において検索結果が１つだけではない、即ち複数の候補が存在すると判別されたとき、図４の実施例においてはこれらの複数の候補を利用者が確認できるように画面に表示する（ステップＳ３１）。なお、このような画面表示を行うことなく、直ちに次のステップＳ３２に進むように設定することもでき、また候補が所定数以内の時のみ表示を行うように設定することもできる。
【００７６】
ステップＳ３２においては、利用者が意図する地名や施設名がどのようなカテゴリに属するものであるかを入力するに際して、その入力の便宜のためカテゴリのリストを表示し、カテゴリ別音声認識部２１においてはカテゴリ別辞書１５の音素データを検索する（ステップＳ３３）。
【００７７】
その検索結果得られた候補が１つだけであるか否かを判別し（ステップＳ３４）、１つだけではない、即ち複数の候補が存在すると判別されたときには、これらの複数の候補について現在位置からの距離を演算し（ステップＳ３５）、距離の近い順に並べ替え、これをリスト表示する（ステップＳ３６）。
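Steps S35 and S36 compute each candidate's distance from the current position and list the candidates nearest first. The patent does not say how the distance is computed; the sketch below uses the haversine great-circle formula as one possible choice. The names `haversine_km` and `sort_by_distance` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def sort_by_distance(current, candidates):
    """Order candidates by distance from the current position (S35, S36).

    current:    (lat, lon) of the vehicle.
    candidates: list of (name, lat, lon) tuples.
    Returns (name, distance_km) pairs, nearest first.
    """
    lat0, lon0 = current
    ranked = [(name, haversine_km(lat0, lon0, lat, lon)) for name, lat, lon in candidates]
    return sorted(ranked, key=lambda item: item[1])
```

Taking the first element of the sorted list corresponds to the variant in paragraph 【００２９】, where the nearest candidate is output directly without showing a list.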
【００７８】
それにより、例えば図１０（ｃ）の現在位置において、セブンイレブンで買い物をしたいと思ったとき、カテゴリとしてコンビニのセブンイレブンを指定した際には、例えば図１１に示すようなリストを表示する。利用者はこのようなリストを見ることにより、最も近い所にセブンイレブン稲葉郷店が存在することがわかる。なお、上記のようなリストに代えて、現在地を中心とした地図表示を行い、その地図上に上記のような施設を表示しても良い。利用者は上記のようなリスト表示或いは地図表示を見ながら希望する施設名等を選択し（ステップＳ３７）、その選択結果を出力する（ステップＳ３８）。
【００７９】
本発明によるナビゲーション用音声認識装置は上記のように作動するものであるが、前記図９、図１０に示すような日本で用いるナビゲーション装置以外にも世界中の国で同様に利用することができ、特に米国においては広大な国土に多くの地名や施設名が存在するので、これを１つの音声認識辞書にまとめて利用するにはあまりにも音声認識処理装置の負担が大きいため、米国で使用されるナビゲーション装置の音声認識装置として本発明を利用すると特にその効果が大きい。その際には、例えば図１２に示すように音声認識用辞書を緯度と経度で分割する。この例においては、緯度分割領域ｎ１〜ｎ１０と経度分割領域ｗ１〜ｗ２４とを用い、これらが交差する地域を各々音声認識用辞書を区分するブロック単位とする。なお、米国においては特に州の行政区画が緯度及び経度に沿う部分が多いので、本発明による緯度と経度でブロックを区分する手法が適合しやすくなる。
【００８０】
それにより、図１３に示す例においては、ブロック（ｎ７，ｗ１５）に現在地が存在するときに、周辺ブロックを前記日本の例と同様に８個選定し、合計９個のブロックの音声認識辞書を用いて音声認識処理を行う。その結果、図１３（ａ）から現在位置が同図（ｂ）、（ｃ）のように移動するにつれて音声認識辞書も変更される。なお図示実施例においては、緯度、経度共に２．５度単位に設定した例を示しているが、例えば図１４に示すように１度単位で細かなブロックに区分しても良い。その際には同図に示すように、米国の西海岸の人口密度の高い部分については１度毎に細かなブロックで区分し、他の部分は２度毎のブロックで区分することもできる。また、米国の内陸中央部のように人口密度の少ない地域は、更に大きなブロックで分割するように設定しても良い。
【００８１】
本発明は上記のように種々の態様で実施することができるが、更に種々の態様で緯度と経度で分割した音声認識辞書を用い、種々の態様の周辺ブロックの選定を行うことができる。また、音声認識の結果複数の候補が存在するときの絞込処理も、上記実施例の他、他の態様で絞り込みを行うこともできる。
【００８２】
【発明の効果】
本発明は、上記のように構成したので、本発明に係るナビゲーション用音声認識装置は、上記課題を解決するため、緯度と経度で分割したブロックに含まれる地名と施設名の音素データとその関連データを、各ブロック毎に記録してなる複数のブロック単位音声認識辞書を備えたブロック単位音声認識辞書蓄積部と、現在位置が所属する現在位置所属ブロックと、その周囲の所定範囲の周辺ブロックとを選定する、現在位置対応ブロック群選定部と、前記現在位置対応ブロック群選定部で選定したブロック群のブロック単位音声認識辞書を、前記ブロック単位音声認識辞書蓄積部から選定した現在位置対応ブロック群音声認識辞書とを備え、入力した地名または施設名の音声に対応する音素データを、前記現在位置対応ブロック群音声認識辞書から検索し出力する音声認識処理部を備えたので、地名や施設名を地域毎に設定した音声認識辞書を用いて音声認識処理を行うに際して、音声認識処理装置にとって大きな処理負担にならず、しかも適切な音声認識処理を行うことができ、小さな音声認識用辞書を用いながら、あたかも大量のデータの中から音声認識処理を行っているものと同様の処理を行うことができる。それにより安価なデータ処理装置を用いて高速に、且つ正確に音声認識を行うことができる。
【００８３】
また、本発明に係る他のナビゲーション用音声認識装置は、前記緯度と経度で分割したブロックが、各ブロックに含まれる地名と施設名の量ができる限り均等になるようにブロックの大きさを変えて分割されているので、現在地が存在する地域に応じて音声認識処理負担が変化する程度を少なくすることができ、安価なデータ処理装置を用いて高速に、且つ正確に音声認識を行うことができる。
【００８４】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置が所属するブロックの周囲のブロックを、現在位置所属ブロックに隣接するブロックとしたので、音声認識の対象とする地域の選定に際して、その選定を容易に行うことができる。
【００８５】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置が所属するブロックの周囲のブロックを、現在位置所属ブロックに隣接するブロックと、更にその周囲の所定範囲のブロックを含むようにしたので、音声認識の対象とする地域を任意の範囲に設定することができ、音声認識処理装置の処理能力等により適宜その範囲を選択して設定することができる等、種々の状況に応じて音声認識の対象とする範囲を適切に選定することができる。
【００８６】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置が所属するブロックの周囲のブロックを、地名と施設名の量ができる限り均等になるように、現在位置の移動と共に変化させるようにしたので、現在位置が存在する地域によって地名や施設名が多い場合と少ない場合があっても、音声認識の対象とするブロック群に含まれる全体の地名と施設名の量を均等化することができ、音声認識の処理負担を均等化し、全体として適切な音声認識処理を行うことができる。
【００８７】
また、本発明に係る他のナビゲーション用音声認識装置は、前記音声認識処理部が、現在位置対応ブロック群音声認識辞書に含まれる地名または施設名について、現在位置に近いブロックに存在するほど音声近似度を大きく設定したので、例えば近くのコンビニを探すようなときに、利用者が意図する施設を適切に検索することができる。また、特にブロック単位で近似度を求めているため、州や都道府県等の単位で地名や施設名毎に求めるものよりも、メモリ容量が少なくてよく、演算処理が容易になり、音声認識処理負担が軽減する。
【００８８】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置対応ブロック群音声認識辞書が、現在位置対応ブロック群のブロックを記録したリストからなり、前記音声認識処理部は、入力した音声に対応する音素データを備えた地名または施設名を、前記ブロックを記録したリストに基づき、前記ブロック単位音声認識辞書蓄積部の前記リストに記録されたブロックの辞書データから検索するようにしたので、現在位置対応ブロック群音声認識辞書に記録するデータがリストデータのみの極めて小さなもので良く、また、データの書き換え処理負担も軽減でき、安価な音声認識装置とすることができる。
【００８９】
また、本発明に係る他のナビゲーション用音声認識装置は、前記現在位置所属ブロック選定部において、現在位置の移動により現在位置が所属するブロックが変更し新たな現在位置所属ブロックを選定したとき、周囲ブロックの選定処理を行うようにしたので、現在位置が移動するたびに周囲ブロックの選定処理を行う必要が無く、音声認識用辞書の選定処理を簡略化することができる。
【００９０】
また、本発明に係る他のナビゲーション用音声認識装置は、複数候補絞込処理部を備え、前記複数候補絞込処理部は、前記現在位置対応ブロック群音声認識辞書による音声認識処理の結果複数の候補が存在するときに、絞り込み処理を行うようにしたので、現在位置対応ブロック群音声認識辞書を用いて音声認識処理を行った結果複数の候補が存在するとき、その中から適切なもののみを絞り込んで出力することができ、使用しやすい音声認識装置とすることができる。
【００９１】
また、本発明に係る他のナビゲーション用音声認識装置は、地名と施設名をカテゴリ別に記録したカテゴリ別音声認識辞書を備え、前記複数候補絞込処理部は、利用者が指示したカテゴリに対応する前記カテゴリ別音声認識辞書を用いて音声認識処理を行うようにしたので、現在位置対応ブロック群音声認識辞書を用いて音声認識処理を行った結果複数の候補が存在するとき、別途用意したカテゴリ別音声認識辞書を用いて、利用者の指示したカテゴリに対応して検索することができ、より正確な音声認識処理を行うことができる。
【００９２】
また、本発明に係る他のナビゲーション用音声認識装置は、前記複数候補絞込処理部には、音声認識処理の結果得られた複数候補について各々現在地からの距離を演算し、前記候補地点距離順配列部の出力により表示部に複数候補を順にリスト表示し、利用者がこれにより選択を行うようにしたので、複数の候補のうち現在位置により近いもの、或いは他の条件を加味して最も近いものの選択等を利用者が任意に行うことができ、利用者の意図する地名または施設名を正確に認識することができる。
【００９３】
また、本発明に係る他のナビゲーション用音声認識装置は、音声認識処理の結果得られた複数候補について各々現在地からの距離を演算し、最も距離の近い候補を音声認識結果として出力するようにしたので、複数の候補のうち最も適切と思われるものを自動的に出力することができ、利用者の手を煩わせることのない、利用し易い音声認識装置とすることができる。
【図面の簡単な説明】
【図１】本発明の実施例の機能ブロック図である。
【図２】同実施例の基本作動フロー図である。
【図３】同基本作動フロー図における、現在位置対応地名・施設名音声認識辞書設定処理を行う作動フロー図である。
【図４】同基本作動フロー図における、複数候補絞り込み処理を行う作動フロー図である。
【図５】本発明における音声認識辞書を緯度と経度で分割し、現在位置所属ブロックと周辺ブロックを選定する例を示す図であり、（ａ）は緯度経度分割ブロックの例を示し、（ｂ）は現在位置所属ブロックと周辺ブロックの選定例を示し、（ｃ）は現在位置がブロック間を移動する例を示し、（ｄ）は現在位置所属ブロックが変更したときの音声認識辞書を変更する例を示す図である。
【図６】本発明における周辺ブロックの選定例を示す図であり、（ａ）は周辺ブロックを複数の態様で設定する例を示し、（ｂ）は他の態様で設定する例を示す図である。
【図７】本発明における音声認識辞書を緯度と経度で分割する際に特定の地域を細分化する例を示す図であり、（ａ）は１種類の態様で分割する例を示し、（ｂ）は２種類の態様で分割する例を示す図である。
【図８】本発明における音声認識辞書を緯度と経度で分割する際に特定の地域を細分化した際の、現在位置の移動によって現在位置対応ブロック群が変化する例を示す図である。
【図９】本発明における緯度と経度で分割する音声認識辞書作成方法を、日本全国の地名・施設名の音声認識辞書作成に適用した実施例を示す図である。
【図１０】同日本の実施例において、音声認識用の現在位置対応ブロック群を示す図であり、（ａ）（ｂ）（ｃ）は各々現在位置の移動に対応して現在位置対応ブロック群が変化する例を示す図である。
【図１１】音声認識処理の結果得られた複数の候補を特定のカテゴリで絞り、現在地からの距離によって配列して画面に表示した例を示す図である。
【図１２】本発明における緯度と経度で分割する音声認識辞書作成方法を、米国の地名・施設名の音声認識辞書作成に適用した実施例を示す図である。
【図１３】同米国の実施例において、音声認識用の現在位置対応ブロック群を示す図であり、（ａ）（ｂ）（ｃ）は各々現在位置の移動に対応して現在位置対応ブロック群が変化する例を示す図である。
【図１４】同実施例において、地名や施設名が特に多い西海岸部分について、音声認識辞書を細分化する例を示す図である。
【図１５】従来から用いられ、本発明を適用するナビゲーション用音声認識装置の例を示す機能ブロック図である。
【符号の説明】
１マイク
４音声認識部
５音声認識用辞書部
６地名・施設名辞書部
７操作機能等その他の辞書
９現在位置対応ブロック群選定部
１０現在位置入力部
１１現在位置所属ブロック選定部
１２周辺ブロック選定部
１３緯度・経度分割ブロック単位音声認識辞書蓄積部
１４現在位置対応ブロック群音声認識辞書
１５地名・施設名カテゴリ別音声認識辞書
２０複数候補絞込処理部
２１カテゴリ別音声認識部
２２候補地点距離順配列絞込部
２３複数候補リスト表示出力部
２６出力確認部
[0001]
[Technical field of the invention]
The present invention relates to a speech recognition device for a navigation device that recognizes speech input by a user through a microphone so that destination setting and various searches of the navigation device can be performed by voice. More particularly, it relates to a navigation speech recognition device having a speech recognition dictionary that enables data corresponding to spoken place names, facility names, and the like to be searched at high speed.
[0002]
[Prior art]
For example, a conventional navigation device 30 as shown in FIG. 15 includes a map/information storage medium 32, such as a CD-ROM or DVD-ROM, which records various data. These include map/information data 33, containing map data for drawing maps together with various information, and place-name/facility-name point data 34, which records the latitude and longitude of the point corresponding to an input place name or facility name, as described later. From the data on the map/information storage medium 32, the map drawing unit 35 reads the map data centered on the point specified by the system control unit 31, the image synthesis device 36 combines it with various other images, and the map and related information are drawn on the image display device 37 in various modes such as 3D.
[0003]
A manual operation input unit 40 is connected to the system control unit 31 and receives operation signals from manual controls operated by the user, such as a remote controller, key switches, or a touch panel. In addition, with recent advances in speech recognition technology, speech recognition devices have come to be built into navigation devices, and the navigation device 30 of FIG. 15 includes a speech recognition device 50 that recognizes the user's speech from a microphone 51. By performing the speech recognition processing described later, the recognized user speech is output from the recognition result output unit 58 to the system control unit 31 of the navigation device 30, where it can be used as a user instruction signal in the same way as input from the manual operation input unit 40.
[0004]
In response to these operation instruction signals from the user, the destination/waypoint setting unit 44 sets the point designated by the user as a destination or a waypoint. The current position is detected accurately using, for example, GPS signals together with a vehicle speed sensor and a heading sensor. Map data centered on the current position is read from the map data recorded on the map/information storage medium 32 and displayed on the image display device 37 with the vehicle's current position superimposed, so that the user can see at a glance where the vehicle is currently traveling.
[0005]
A guidance route calculation unit 42 is connected to the system control unit 31. Based on the current position and the destination and waypoints described above, it uses the link data recorded on the map/information storage medium 32 to search automatically, among the routes connecting these points, for the most suitable route in light of conditions such as time, distance, and tolls, and sets the route found as the guidance route. The guidance route can be displayed on the screen by, for example, drawing the corresponding roads on the map image thicker and in a different color.
[0006]
Further, so that the driver can reliably follow the guidance route set as described above, the route guidance unit 43 guides the user to the destination. For example, when the vehicle comes within a certain distance of an intersection where it should change course along the guidance route, the unit displays an enlarged view of the intersection with an arrow or the like indicating the direction to take, and gives right-turn or left-turn guidance by voice.
[0007]
[Problems to be solved by the invention]
Various means have conventionally been used to designate a destination or waypoint, or to search for a specific point and display it on the screen. In one such method, data listing place names in order from wide area to narrow area is stored in advance on the map/information storage medium 32. The user then specifies a place name by entering it from the prefecture downward, in the same way as a normal address is written, using an on-screen keyboard or the like. Alternatively, the user narrows down to a specific place name by selecting, with cursor input or the like, from place names displayed on the screen in order from the wide-area side.
[0008]
In addition, the map/information storage medium 32 stores the names of public facilities such as stations and city halls, road facilities such as intersections and interchanges, landmark buildings, and facilities such as convenience stores, gas stations, and restaurants. The user narrows down to a specific facility by entering the facility name directly, or by displaying a candidate list and selecting from it with cursor input or the like. The specific place name or facility name set in this way is input to the search place-name/facility-name input unit 45 in FIG. 15, and the point data search unit 46 retrieves the point data indicating the latitude and longitude corresponding to that place name or facility name from the place-name/facility-name point data 34 on the map/information storage medium 32. This data is used as the point data of a destination or waypoint, or as data representing the point on the map screen.
[0009]
It is undesirable for the driver to search for place names and facility names manually as described above while the vehicle is moving. As a countermeasure, so that the search can be performed with the speech recognition device 50, the speech recognition dictionary unit 55 is provided with a place-name/facility-name dictionary unit 56. When the user speaks a place name or facility name into the microphone 51, the speech is passed from the speech input unit 52 to the speech feature parameter extraction unit 53, and the characteristic parameters obtained there are output to the speech recognition unit 54. The speech recognition unit 54 searches the place-name/facility-name dictionary unit 56 for the entry whose parameters best match, and the result is output from the recognition result output unit 58 to the search place-name/facility-name input unit 45. The place name or facility name obtained by speech recognition is then output to the point data search unit 46 in the same way as manual input, and the point data for that place name or facility name is obtained and used for destination setting, display on the map, and so on.
[0010]
Place names and facility names can thus be searched for directly by voice. One conceivable way to reduce the speech recognition processing load is to have the user speak the place name in order from the prefecture downward, as with manual input, so that the candidate place names are narrowed down step by step. However, this requires many separate voice inputs, which is cumbersome. It is therefore preferable that, as far as possible, the user be able to set a location by speaking only the most specific part of the place name, and that, when multiple matching points exist, the candidate points be displayed in a list or on the map.
[0011]
For that purpose, when searching for a specific name such as the place name “Yoshima” or the facility name “Twin Tower”, the search must cover every place name and facility name in Japan recorded in the place name/facility name dictionary unit 56. Searching for one place name or facility name in such an enormous body of data places a heavy load on the processing device that performs the computation, so the processing time becomes long and the recognition rate inevitably falls. Using a high-performance processing device instead inevitably makes the apparatus expensive.
[0012]
As a countermeasure, the data in the place name/facility name dictionary unit 56 can be divided into predetermined regions such as prefectures. When searching for a specific place name or facility name, the region is either specified by an input made in advance or determined from the current position data of the navigation device as the region containing the point where the vehicle is located, and only the speech recognition dictionary for that specific region, such as one prefecture, is used for speech recognition.
[0013]
However, when only the place name/facility name dictionary of one specific region is referred to in this way, place names in an adjacent region are not recognized when the user travels across regions, for example when a person living near the border between Ibaraki and Fukushima prefectures uses the navigation device while traveling between the two prefectures. The device then becomes extremely inconvenient to use.
[0014]
As a countermeasure, it is conceivable to select the speech recognition dictionaries so that they also cover the prefectures adjacent to the prefecture containing the point where the vehicle is currently located. In that case, when the current location is in Fukushima Prefecture, for example, Ibaraki, Tochigi, Niigata, Yamagata, and Miyagi prefectures are also selected, so that an extremely large area is covered. When the current location is in Tokyo, Kanagawa, Yamanashi, Saitama, and Chiba prefectures are also selected; although the area is relatively small, the population density is extremely high, and the number of place names and facility names is correspondingly large. When the current location is in Shimane Prefecture, on the other hand, only Yamaguchi, Hiroshima, and Tottori prefectures are also selected; the range is relatively narrow and the population density relatively low, so the number of place names and facility names is relatively small.
[0015]
As described above, the range covered by the place name/facility name speech recognition dictionary and the number of names it contains vary greatly with the location of the vehicle. The processing capability of the speech recognition apparatus could be matched to a region with an intermediate amount of data, but then, when the apparatus is used in a vehicle located in Tokyo as in the example above, processing becomes unacceptably slow and frustrates the user. Conversely, if a high-performance arithmetic processing unit capable of fast and accurate speech recognition even when the vehicle is in Tokyo is used, its processing capability is excessive in most other regions.
[0016]
These problems are not limited to Japan. The same applies in the United States, for example: even if the speech recognition dictionary for recognizing place names in the United States is set to cover the current state and its adjacent states, the number of place names and facility names including those of the neighboring states, and the area of the target region, vary greatly from state to state, as in Japan.
[0017]
Therefore, the main object of the present invention is to provide a navigation speech recognition apparatus that, when performing speech recognition processing using speech recognition dictionaries in which place names and facility names are set for each region, sets the speech recognition dictionary to an appropriate range and selects it appropriately, and can thereby perform speech recognition quickly and accurately using an inexpensive data processing device.
[0018]
[Means for Solving the Problems]
In order to solve the above problems, a navigation speech recognition apparatus according to the present invention comprises: a block unit speech recognition dictionary storage unit having a plurality of block unit speech recognition dictionaries, each of which records, for one of the blocks divided by latitude and longitude, the phoneme data of the place names and facility names included in that block together with related data; a current position corresponding block group selection unit that selects the current position belonging block, to which the current position belongs, and peripheral blocks in a predetermined range around the current position belonging block; a current position corresponding block group speech recognition dictionary formed by selecting, from the block unit speech recognition dictionary storage unit, the block unit speech recognition dictionaries of the block group selected by the current position corresponding block group selection unit; and a speech recognition processing unit that searches the current position corresponding block group speech recognition dictionary for the phoneme data corresponding to the input speech of a place name or facility name and outputs the result. The current position corresponding block group selection unit changes the peripheral blocks of the current position belonging block as the current position moves, so that the amount of place names and facility names is as even as possible.
[0019]
In another navigation speech recognition apparatus according to the present invention, the blocks divided by latitude and longitude are formed with varying block sizes so that the amount of place names and facility names included in each block is as uniform as possible.
[0020]
In another navigation speech recognition apparatus according to the present invention, the peripheral blocks of the current position belonging block are the blocks adjacent to the current position belonging block.
[0021]
In another navigation speech recognition apparatus according to the present invention, the peripheral blocks of the current position belonging block include the blocks adjacent to the current position belonging block and blocks in a predetermined range around them.
[0023]
In another navigation speech recognition apparatus according to the present invention, the speech recognition processing unit sets a larger speech approximation degree for a place name or facility name included in the current position corresponding block group speech recognition dictionary the closer the block containing it is to the current position.
[0024]
In another navigation speech recognition apparatus according to the present invention, the current position corresponding block group speech recognition dictionary comprises a list in which the blocks of the current position corresponding block group are recorded, and the speech recognition processing unit, when searching for the phoneme data corresponding to the input speech, selects and searches the dictionaries of the blocks recorded in the list.
[0025]
In another navigation speech recognition apparatus according to the present invention, when the current position belonging block selection unit selects a new current position belonging block because the block to which the current position belongs has changed with the movement of the current position, the peripheral block selection processing is performed.
[0026]
Another navigation speech recognition apparatus according to the present invention further includes a plural candidate narrowing-down processing unit, which performs narrowing-down processing when the speech recognition processing using the current position corresponding block group speech recognition dictionary yields a plurality of candidates.
[0027]
Another navigation speech recognition apparatus according to the present invention includes category-specific speech recognition dictionaries in which place names and facility names are recorded by category, and the plural candidate narrowing-down processing unit performs speech recognition processing using the category-specific speech recognition dictionary corresponding to the category designated by the user.
[0028]
In another navigation speech recognition apparatus according to the present invention, the plural candidate narrowing-down processing unit includes a candidate point distance order arrangement unit that calculates the distance from the current location for each of the plurality of candidates obtained by the speech recognition processing and arranges the candidates in order of distance; the candidates are displayed on the display unit in that order on the basis of the output of the candidate point distance order arrangement unit, and the user selects from them.
[0029]
Another navigation speech recognition apparatus according to the present invention calculates the distance from the current location for each of the plurality of candidates obtained by the speech recognition processing and outputs the candidate with the shortest distance as the speech recognition result.
[0030]
[Detailed Description of the Invention]
Embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a functional block diagram showing the main functional units of a navigation speech recognition apparatus according to the present invention and their mutual relationship; it shows the functional blocks that perform the speech recognition processing of the present invention in the part corresponding to the conventional speech recognition apparatus 50 shown in FIG. 15. In the figure, the voice from the microphone 1 is input via the voice input unit 2, its features are extracted by the voice feature parameter extraction unit 3, and the result is output to the voice recognition unit 4. The voice recognition unit 4 selects the appropriate dictionary from among the various dictionaries in the speech recognition dictionary unit 5 and searches the selected dictionary for phoneme data that matches the extracted voice feature parameters.
[0031]
The speech recognition dictionary unit 5 includes a place name/facility name dictionary unit 6 similar to the conventional place name/facility name dictionary unit 56 shown in FIG. 15. This dictionary is used for voice recognition processing when a place name is input to set a destination or waypoint, when a facility serving as a landmark for setting a destination or waypoint is input, and when the user wants to use a specific facility while the vehicle is traveling. In addition to the place names and facility names, other dictionaries 7 are provided, such as an operation function dictionary for operating various functions of the navigation device by voice, for example enlarging or reducing the scale of the displayed map or switching the map display mode to 3D display. To select a dictionary, the other dictionaries 7 such as the operation function dictionary can be used to determine whether the input voice contains a voice command for operating a predetermined navigation function; when it does not, the voice can be judged to relate to a place name or facility name.
[0032]
As shown in FIG. 5A, the place name/facility name dictionary unit 6 includes a latitude/longitude division block speech recognition dictionary 13. The target region is divided into latitude division areas bounded by latitudes a1, a2, ... and longitude division areas bounded by longitudes b1, b2, ..., and each block is the area where one latitude division area and one longitude division area intersect. For each block, this speech recognition dictionary records the speech recognition data of the place names and facility names included in the block and the position data of the points where they exist. In the example shown in FIG. 5A, the block (n, m) containing the current location is the block where area n, the latitude division area between latitudes a3 and a4, intersects area m, the longitude division area between longitudes b3 and b4.
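The lookup of the block (n, m) from a position can be sketched as follows. This is only an illustrative Python sketch, not part of the disclosed embodiment: the function name `find_block`, the 0-based indexing, and the boundary values in degrees are assumptions made for the example.

```python
from bisect import bisect_right

def find_block(lat, lon, lat_bounds, lon_bounds):
    """Return (n, m), the 0-based indices of the latitude and longitude
    division areas that contain the point.

    lat_bounds and lon_bounds are ascending lists of dividing latitudes
    (a1, a2, ...) and longitudes (b1, b2, ...). Area n lies between
    lat_bounds[n] (inclusive) and lat_bounds[n + 1] (exclusive).
    """
    n = bisect_right(lat_bounds, lat) - 1
    m = bisect_right(lon_bounds, lon) - 1
    if not (0 <= n < len(lat_bounds) - 1 and 0 <= m < len(lon_bounds) - 1):
        raise ValueError("point lies outside the gridded region")
    return n, m

# Hypothetical dividing latitudes a1..a5 and longitudes b1..b5, in degrees.
lat_bounds = [35.0, 35.5, 36.0, 36.5, 37.0]
lon_bounds = [139.0, 139.5, 140.0, 140.5, 141.0]

# A point between a3 and a4 and between b3 and b4 falls in block (2, 2).
block = find_block(36.2, 140.1, lat_bounds, lon_bounds)
```

A binary search over the sorted boundaries is used here so that unevenly spaced dividing lines are handled in the same way as evenly spaced ones.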
[0033]
In the example of FIG. 5A, latitude division areas n+1, n+2 and n-1, n-2 are shown on either side of area n, and longitude division areas m+1, m+2 and m-1, m-2 on either side of area m. The blocks where these areas intersect include, for example, (n+2, m-2), (n+1, m-2), (n, m-2), ..., (n+2, m-1), (n+1, m-1), ..., (n-2, m+2).
[0034]
In the place name/facility name dictionary unit 6 in FIG. 1, the current position signal from a functional unit of the navigation device 28 that detects the current position of the vehicle, such as the current position detection unit 41 in FIG. 15, is input to the current position input unit 10. On the basis of the current position signal input from the current position input unit 10, the current position belonging block selection unit 11 selects the block to which the current position belongs from the speech recognition dictionaries stored block by block in the latitude/longitude division block speech recognition dictionary 13 as described above. In the example shown in FIG. 5A, the current location of the vehicle is in block (n, m) as described above, so this block is selected.
[0035]
When the current position belonging block has been selected in this way, the result is output to the peripheral block selection unit 12, which selects peripheral blocks in a predetermined range, for example all of the blocks adjacent to the block containing the current location. FIG. 5B shows an example: a total of eight blocks surrounding the current position belonging block (n, m), namely (n+1, m-1), (n, m-1), (n-1, m-1), (n+1, m), (n-1, m), (n+1, m+1), (n, m+1), and (n-1, m+1), are selected as the peripheral blocks.
[0036]
In FIG. 1, the block unit speech recognition dictionaries corresponding to a total of nine blocks, namely the one block selected by the current position belonging block selection unit 11 and the eight blocks selected by the peripheral block selection unit 12, are extracted and stored in the current position corresponding block group speech recognition dictionary 14. The voice recognition unit 4 performs voice recognition processing of place names and facility names using this current position corresponding block group speech recognition dictionary 14.
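The selection of the eight adjacent blocks and the assembly of the nine block unit dictionaries into one working dictionary can be sketched as follows. This is a minimal Python sketch under stated assumptions: the per-block dictionaries are represented as plain lists of names keyed by (n, m), and the names `peripheral_blocks`, `build_group_dictionary`, and the sample entries are hypothetical.

```python
def peripheral_blocks(current):
    """Return the eight blocks adjacent to the current position belonging block."""
    n, m = current
    return [(n + dn, m + dm)
            for dn in (1, 0, -1)
            for dm in (-1, 0, 1)
            if (dn, dm) != (0, 0)]

def build_group_dictionary(block_dicts, current):
    """Collect the entries of the current block and its eight neighbors.

    block_dicts maps (n, m) to the list of names recorded for that block
    (a stand-in for the block unit speech recognition dictionaries).
    """
    group = [current] + peripheral_blocks(current)
    entries = []
    for block in group:
        entries.extend(block_dicts.get(block, []))  # empty blocks add nothing
    return group, entries

# Hypothetical block unit dictionaries.
block_dicts = {
    (5, 5): ["City Hall"],
    (5, 6): ["Central Station"],
    (9, 9): ["Far Away Park"],   # outside the 3 x 3 group around (5, 5)
}
group, entries = build_group_dictionary(block_dicts, (5, 5))
```

Here `group` holds nine blocks and `entries` contains only the names from blocks inside that group, so the recognizer never has to compare the input against "Far Away Park".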
[0037]
The current position belonging block selection unit 11 constantly checks, from the current position signal that changes as the vehicle moves, whether the current position is still within the previously selected current position belonging block. When the current position moves from that block into another block, the new current position belonging block is selected immediately. FIG. 5C shows this situation: the vehicle travels from point P on road L in block (n, m), passes the boundary point Q between block (n, m) and block (n, m+1), enters block (n, m+1), and continues toward point R. When the vehicle passes the boundary point Q, the current position belonging block selection unit 11 detects that the current position belonging block has changed to block (n, m+1) and selects this block.
[0038]
The selection result is output to the peripheral block selection unit 12, which selects the eight blocks around the new current position belonging block in the same manner as described above. As a result, as shown in FIG. 5D, the current position belonging block (n, m+1) and a total of eight surrounding blocks, namely (n+1, m), (n, m), (n-1, m), (n+1, m+1), (n-1, m+1), (n+1, m+2), (n, m+2), and (n-1, m+2), are selected.
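The reselection that takes place when the vehicle crosses a block boundary can be sketched as follows. This Python sketch is an illustration only; the class name `BlockGroupTracker` is hypothetical, and reporting which blocks enter and leave the group (so that only those dictionaries need to be loaded or discarded) is an assumption of the example rather than something stated in the description.

```python
class BlockGroupTracker:
    """Keeps the 3 x 3 block group in step with the current position."""

    def __init__(self):
        self.current = None
        self.group = set()

    def update(self, block):
        """Feed the block containing the latest position fix.

        Returns (added, dropped): the blocks that entered and left the group.
        Both sets are empty while the vehicle stays in the same block.
        """
        if block == self.current:
            return set(), set()
        n, m = block
        new_group = {(n + dn, m + dm)
                     for dn in (-1, 0, 1)
                     for dm in (-1, 0, 1)}
        added = new_group - self.group
        dropped = self.group - new_group
        self.current, self.group = block, new_group
        return added, dropped

tracker = BlockGroupTracker()
tracker.update((0, 0))                     # first fix: nine blocks are added
added, dropped = tracker.update((0, 1))    # crossing from (n, m) to (n, m+1)
```

With (n, m) = (0, 0), the move to (0, 1) adds the column m+2 and drops the column m-1, which matches the change from FIG. 5B to FIG. 5D.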
[0039]
The speech recognition dictionaries corresponding to the nine blocks selected as described above are accumulated in the current position corresponding block group speech recognition dictionary 14. For example, when a user at point Q wants to find the 7-Eleven Koyama Otome store and speaks “Seven-Eleven Koyama Otome” into the microphone, the voice recognition unit 4 searches the current position corresponding block group speech recognition dictionary 14 for facility names close to the input voice. Names such as “Seven-Eleven 00 store” are retrieved, and similar-sounding facility names such as “Seven Eagle” (sports store) and “Seven East” (travel store) may also be found.
[0040]
When setting the blocks divided by latitude and longitude as shown in FIG. 5A, the latitude and longitude intervals can be chosen freely. If they are set so that each block covers a narrow area, the number of place names and facility names contained in the resulting block group is small. Consequently, provided the desired place name or facility name exists in this data, the total amount of data searched by the voice recognition unit 4 is small, and voice recognition can be performed quickly even with a CPU of the same processing speed. However, because the range subject to speech recognition is limited, there is a higher possibility that the desired name is not in the block group, in which case speech recognition is not performed properly.
[0041]
Conversely, when the blocks divided by latitude and longitude are set to cover a wide area, the number of place names and facility names contained in the block group increases, and so does the total amount of data searched by the voice recognition unit 4. With a CPU of the same processing speed, voice recognition processing therefore takes longer. However, because the range subject to speech recognition is wider, there is a higher possibility that the place name or point intended by the user is in the block group, and in that respect speech recognition is performed properly.
[0042]
When setting the target range for voice recognition appropriately, the range to be searched differs considerably between, for example, entering the name of a destination far from the current location when setting a guidance route and searching for a nearby 7-Eleven for shopping while traveling along the guidance route. The apparatus may therefore be configured in advance so that the selection range of the peripheral blocks can be narrowed or widened according to the situation in which voice recognition is used. In that case, the peripheral block selection unit 12 in FIG. 1 is provided with a selection range adjustment unit; the range may be selected freely by the user, or the apparatus may distinguish voice input for setting a guidance route from voice input during travel along the guidance route and change the range automatically.
[0043]
FIG. 6A shows an example of adjusting the selection range of the peripheral blocks in this way. The current position is in block S1, where the latitude division area between latitudes a5 and a6 intersects the longitude division area between longitudes b5 and b6. When the range of peripheral blocks is set narrow, the block group S2 of nine blocks lying between latitudes a4 and a7 and between longitudes b4 and b7 is selected; when the range is set wide, the block group S3 of 49 blocks lying between latitudes a2 and a9 and between longitudes b2 and b9 is selected. Besides always using the same size in every region, the block group setting can also be changed automatically, to a narrow range in regions with many place names and facility names and to a wide range in regions with few.
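One possible way to change the range automatically is sketched below in Python. It is an assumption-laden illustration, not the disclosed method: the function names, the per-block entry counts, and the rule "use the widest range whose total stays within a fixed budget `max_entries`" are all hypothetical choices made for the example.

```python
def blocks_within(current, radius):
    """Blocks within `radius` steps of the current block.

    radius=1 gives the 9-block group (S2); radius=3 gives the 49-block group (S3).
    """
    n, m = current
    return [(n + dn, m + dm)
            for dn in range(-radius, radius + 1)
            for dm in range(-radius, radius + 1)]

def choose_radius(current, counts, max_entries, max_radius=3):
    """Pick the widest radius whose block group holds at most max_entries names.

    counts maps (n, m) to the number of names in that block. The radius never
    drops below 1, so the adjacent blocks are always included.
    """
    chosen = 1
    for radius in range(1, max_radius + 1):
        total = sum(counts.get(b, 0) for b in blocks_within(current, radius))
        if total > max_entries:
            break
        chosen = radius
    return chosen

# Hypothetical counts: a dense region and a sparse region on a 12 x 12 grid.
dense = {(n, m): 1000 for n in range(12) for m in range(12)}
sparse = {(n, m): 100 for n in range(12) for m in range(12)}
narrow = choose_radius((5, 5), dense, max_entries=20000)   # stays at radius 1
wide = choose_radius((5, 5), sparse, max_entries=20000)    # grows to radius 3
```

In the dense region the 25-block group would already exceed the budget, so the 9-block group is kept; in the sparse region even the 49-block group fits.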
[0044]
Further, when setting the predetermined range of the speech recognition dictionary as described above, instead of setting a fine latitude and longitude mesh and widening the range to select the block group S3 of 49 blocks in total as shown in FIG. 6 (a), a coarse latitude and longitude mesh can be set as shown in FIG. 6 (b) and the nine blocks centered on the current position belonging block T1, which lies between latitudes c3 and c4 and between longitudes d3 and d4, selected as the block group T2. This yields a speech recognition dictionary covering substantially the same range, and the speech recognition dictionary group can be chosen freely by such adjustment.
[0045]
As shown in FIG. 6A, when the block group S2 adjacent to the current position belonging block S1 and the block group S3 covering an appropriately set range around it are both set, the block groups can be switched freely as described above. Alternatively, the weighting coefficient used in the approximation calculation against the input voice can be made larger for place names and facility names in blocks closer to the current position belonging block S1, so that such names are judged more likely to be the name the user intends. This setting enables more accurate speech recognition. Compared with calculating the distance from the current position for every place name and facility name and weighting each candidate individually, performing the weighting block by block as described above reduces the load of the voice recognition processing.
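Block-by-block weighting can be sketched as follows. The Python code is illustrative only: the acoustic scores, the weight values in `ring_weights`, and the use of the block-step (Chebyshev) distance to decide how close a block is are assumptions of the example, since the description does not specify the coefficients.

```python
def block_ring(block, current):
    """Distance in block steps: 0 for the current block, 1 for adjacent blocks, and so on."""
    return max(abs(block[0] - current[0]), abs(block[1] - current[1]))

def rank_candidates(candidates, current, ring_weights=(1.0, 0.9, 0.8, 0.7)):
    """Order candidates by acoustic score multiplied by a per-block weight.

    candidates is a list of (name, block, acoustic_score), where a higher
    score means a closer match to the input voice. Blocks farther away than
    the last listed ring reuse the last weight.
    """
    scored = []
    for name, block, score in candidates:
        ring = min(block_ring(block, current), len(ring_weights) - 1)
        scored.append((score * ring_weights[ring], name))
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical candidates: the distant one matches slightly better acoustically.
candidates = [
    ("Central Park", (8, 5), 0.90),   # three blocks away
    ("Central Bank", (5, 5), 0.85),   # in the current block
]
ranking = rank_candidates(candidates, current=(5, 5))
```

Because the weight depends only on the block, it is looked up once per block rather than computed from a distance for every name, which is the saving the paragraph above describes.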
[0046]
Furthermore, when setting the range of the speech recognition dictionary, it is preferable to keep the number of place names and facility names included in each block within a predetermined range, taking the processing capability of the speech recognition apparatus into account. In that case, the blocks are formed so that even the block group containing the largest number of points remains within the range that the speech recognition apparatus can handle comfortably.
[0047]
However, if the entire region is divided finely enough that the blocks with the most place names and facility names fall within the processing capability of the voice recognition device used, the number of place names and facility names in blocks elsewhere becomes extremely small, which may be undesirable as a speech recognition dictionary range.
[0048]
As a countermeasure, as shown in FIG. 7A for example, the block range A containing an area with many place names and facility names is subdivided: a dividing latitude a3 is set between latitudes a2 and a4, and a dividing latitude a5 between latitudes a4 and a6. Similarly, a dividing longitude b4 is set between longitudes b3 and b5, and b6 between longitudes b5 and b7. Blocks of smaller area are thus formed so that the number of place names and facility names in one block is as uniform as possible. If further subdivision is needed, as shown in FIG. 7 (b), each block in a range B within the block range A of FIG. 7 (a) that contains especially many place names and facility names is divided into four, which further equalizes the number of place names and facility names in each block.
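The subdivision of crowded blocks can be sketched as a recursive split into four. This Python sketch is an illustration under assumptions: the threshold `max_points`, the depth limit, and the half-open treatment of block boundaries are choices made for the example and are not taken from the description.

```python
def subdivide(bounds, points, max_points, max_depth=2):
    """Split a block into four equal sub-blocks while it holds too many names.

    bounds is (lat_min, lat_max, lon_min, lon_max); each block includes its
    lower boundaries and excludes its upper ones. points is a list of
    (lat, lon) positions of place names and facility names.
    Returns a list of (bounds, points) for the resulting blocks.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    if len(points) <= max_points or max_depth == 0:
        return [(bounds, points)]
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    leaves = []
    for la0, la1 in ((lat_min, lat_mid), (lat_mid, lat_max)):
        for lo0, lo1 in ((lon_min, lon_mid), (lon_mid, lon_max)):
            inside = [p for p in points
                      if la0 <= p[0] < la1 and lo0 <= p[1] < lo1]
            leaves.extend(
                subdivide((la0, la1, lo0, lo1), inside, max_points, max_depth - 1))
    return leaves

# Hypothetical points: five clustered in one corner, one elsewhere.
points = [(0.1, 0.1), (0.2, 0.2), (0.6, 0.6), (0.7, 0.7), (0.8, 0.8), (1.5, 1.5)]
leaves = subdivide((0.0, 2.0, 0.0, 2.0), points, max_points=3)
```

With these points the crowded corner is split twice, as in the step from range A to range B, while the sparse quadrants are split only once; no resulting block holds more than three points.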
[0049]
Such subdivision is used, for example, when forming speech recognition dictionary blocks by dividing an entire country by latitude and longitude. In Japan, for instance, the Kanto region contains the capital, Tokyo, and has a high population density, so it contains many place names and facility names; a subdivided block range A is therefore formed in this region. Where part of that range contains a particularly large number of place names and facility names, that part is subdivided further to form a block range B.
[0050]
For example, when the speech recognition dictionary blocks are divided as shown in FIG. 7 (a), the current position corresponding block groups shown in FIGS. 8 (a), (b), and (c) are selected as the vehicle moves. In FIG. 8A, the current position is in block F1, and the current position corresponding block group G1 consisting of that block and the blocks around it is selected. Because the current position belonging block F1 adjoins the subdivided block range, the peripheral blocks on the side of the subdivided range cover a narrower area.
[0051]
Next, when the current position moves from block F1 to block F2 as shown in FIG. 8 (b), the current position corresponding block group G2 is selected. Because the current position belonging block F2 lies within the subdivided block range, the peripheral blocks cover a still narrower area. When the current position then moves to block F3 as shown in FIG. 8 (c), all of the peripheral blocks lie within the subdivided block range, and the current position corresponding block group G3 covers the narrowest area. In this way, when blocks containing many place names and facility names are subdivided, voice recognition processing can be performed with the number of place names and facility names in the speech recognition dictionary kept roughly constant at all times. Note that the same effect is obtained when the blocks are subdivided as shown in FIG.
[0052]
The voice recognition unit 4 in FIG. 1 performs voice recognition of place names and facility names using the speech recognition dictionaries of the various forms described above. When the recognition yields only one place name or facility name corresponding to the input voice, it is output directly to the output confirmation unit 26, and when a confirmation signal from the user is input from the operation signal input unit 25, it is output from the recognition result output unit 27 to the navigation device 28.
[0053]
When there are a plurality of candidate place names or facility names, in this embodiment the data is output to the multiple candidate list display output unit 23 and the list is displayed on the display unit 24 of the navigation device. The candidate list is also output to the category-specific speech recognition unit 21 of the plural candidate narrowing-down processing unit 20, so that category-based voice recognition can be used to find the place name or facility name the user intends among these candidates. For this purpose, in this embodiment the place name/facility name dictionary unit 6 is provided with a place name/facility name category-specific speech recognition dictionary 15. In the above example, the plurality of candidates resulting from the recognition processing in the voice recognition unit 4 are displayed as a list on the display unit 24; alternatively, a prompt for selecting a genre may be displayed immediately so that the user is asked to choose a genre.
[0054]
The user refers to the list of categories displayed separately on the display unit 24 and either selects the desired category by operating the cursor keys of the remote controller via the operation signal input unit 25 or speaks the category name, which is then recognized; the resulting signal is output to the category-specific speech recognition unit 21. The voice recognition of the category name uses the other dictionaries 7, such as the operation function dictionary, in the speech recognition dictionary unit 5. On the basis of the category signal designated by the user, the category-specific speech recognition unit 21 selects the dictionary of the corresponding category in the place name/facility name category-specific speech recognition dictionary 15 and narrows down the candidates by searching the place names and facility names recorded in that category dictionary for phoneme data matching the voice input by the user.
[0055]
When the candidates have been narrowed down by the voice recognition processing in the category-specific speech recognition unit 21 and only one candidate remains, it is output directly to the output confirmation unit 26; the user is asked to confirm it in the same manner as described above, and once confirmed it is output from the recognition result output unit 27 to the navigation device 28. If a plurality of candidates still remain, they are output to the candidate point distance order arrangement narrowing unit 22, which calculates the distance from the current location for each candidate and creates a list arranged from nearest to farthest. The list is output to the display unit 24 via the multiple candidate list display output unit 23 and displayed.
[0056]
The user refers to the displayed list and selects the appropriate entry via the operation signal input unit 25; the selection is output to the candidate point distance order arrangement narrowing unit 22. The candidate point distance order arrangement narrowing unit 22 outputs the designated candidate to the output confirmation unit 26, which outputs it to the display unit 24 in the same manner as described above. When the user, having seen it, inputs a confirmation signal from the operation signal input unit 25, the candidate is output to the navigation device 28 via the recognition result output unit 27. In the embodiment above, a list of the candidates arranged in order of distance is displayed; alternatively, the apparatus can be set to output the nearest candidate to the output confirmation unit 26 without displaying the list and to ask the user for confirmation.
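The distance-order arrangement of the remaining candidates can be sketched as follows. This Python sketch is illustrative: the description does not say how the distance is calculated, so the haversine great-circle formula with a mean Earth radius of 6371 km is an assumption, as are the function names and the sample coordinates.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this example

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def order_by_distance(current, candidates):
    """Sort candidates, given as (name, (lat, lon)), from nearest to farthest."""
    return sorted(candidates, key=lambda c: distance_km(current, c[1]))

# Hypothetical candidates that share the same recognized name.
current = (36.0, 140.0)
candidates = [
    ("Store B", (36.5, 140.0)),
    ("Store A", (36.1, 140.0)),
]
ordered = order_by_distance(current, candidates)
nearest = ordered[0]   # the variant that skips the list and proposes only this one
```

The ordered list corresponds to the list shown to the user, and taking its first element corresponds to the alternative in which only the nearest candidate is passed on for confirmation.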
[0057]
The navigation speech recognition apparatus consisting of the functional blocks shown in FIG. 1 can be operated, for example, according to the operation flows shown in FIGS. 2 and 3, which are described below with reference to the functional block diagram of FIG. 1. In the voice recognition process, a voice to be recognized is first input by the user as shown in FIG. 2 (step S1). Next, voice feature parameter extraction is performed (step S2), and it is determined whether the input voice contains a voice relating to a place name or facility name (step S3).
[0058]
For this determination, the voice feature parameter extraction unit 3 extracts the feature parameters of the voice input from the microphone 1 via the voice input unit 2 in FIG. 1, and the voice recognition unit 4 searches the other dictionaries 7, such as the operation function dictionary, in the speech recognition dictionary unit 5 to determine whether they contain an entry corresponding to the extracted feature parameters. Within the speech recognition dictionary unit 5, the place name/facility name dictionary unit 6 may hold an enormous amount of data, such as millions of points of interest (POI), whereas the other dictionaries 7 usually hold only a few hundred words, so the determination is easy to make using the other dictionaries 7. Other methods of determination are also possible: the input voice can be judged to relate to a place name or facility name when the user is detected to be speaking during a destination setting input operation, and, if the word “near” in a facility search phrase such as “near convenience store” is recognized as a specific term, the word that follows can be judged to be a facility name.
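The determination of step S3 can be sketched as follows. This Python sketch is a simplification made for illustration: it works on already recognized text rather than on voice feature parameters, and the operation vocabulary in `OPERATION_WORDS`, the function name, and the order in which the three rules are applied are all hypothetical.

```python
# Hypothetical contents of the small operation function dictionary.
OPERATION_WORDS = {"zoom in", "zoom out", "3d display"}

def classify_utterance(text, destination_input_active=False):
    """Decide whether an utterance is an operation command or a place/facility name.

    Returns ("operation", phrase) or ("place", phrase).
    """
    phrase = " ".join(text.lower().split())
    # Rule 1: the specific term "near" marks what follows as a facility name.
    if phrase.startswith("near "):
        return "place", phrase[len("near "):]
    # Rule 2: outside destination input, check the small operation dictionary first.
    if not destination_input_active and phrase in OPERATION_WORDS:
        return "operation", phrase
    # Rule 3: anything else is treated as a place name or facility name.
    return "place", phrase

kind, phrase = classify_utterance("near convenience store")
```

Checking the few hundred operation words first is cheap, so the large place name and facility name dictionary is consulted only for utterances that are not operation commands.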
[0059]
If this determination finds that the voice uttered by the user includes a place name or facility name, the speech recognition dictionary setting process for place names / facility names corresponding to the current position is performed (step S4). The processing flow is shown in FIG. 3 and is described in detail later. In this step, as described above, the country or region covered by the navigation device is divided by latitude and longitude into a plurality of blocks, the place names and facility names contained in each block form one unit speech recognition dictionary, and the place name / facility name speech recognition dictionary is the set of these unit dictionaries. From it, the block to which the current position belongs and a predetermined range of blocks around it are selected to set the current position corresponding block group speech recognition dictionary.
[0060]
Using the speech recognition dictionary set in this way, the speech data corresponding to the input speech feature parameters is searched for (step S5), the place name or facility name corresponding to that speech data is obtained, and the various necessary data recorded together with the place name or facility name, such as the position data of the point, are read out. Since the search of the speech recognition dictionary can use any of the conventionally used search methods, its description is omitted here.
[0061]
As a result of the search performed as described above, it is determined whether there is only one point corresponding to the obtained place name or facility name (step S6). If there is not only one, that is, if there are a plurality of candidates, the candidates are narrowed down (step S7). This processing flow is shown in FIG. 4 and is described in detail later. As described above, the plural candidate narrowing processing unit 20 in FIG. 1 narrows the candidates by category and, if necessary, arranges the candidate points in order of distance so that the user can select one with reference to that order.
[0062]
After this narrowing process, the obtained speech recognition result is output to the display unit or the like to prompt the user to confirm it (step S8). Next, it is determined whether the user judges the recognition result to be appropriate (step S9). When the result is judged appropriate, it is determined whether the voice input has ended (step S10). If the voice input is determined to have ended, for example because the user makes no further voice input for a predetermined period, the operation flow ends (step S13). If it is determined in step S6 that the speech recognition dictionary search yielded only one candidate, the process proceeds directly to step S8 to confirm whether that candidate is the one the user intended.
[0063]
On the other hand, when it is determined in step S3 that the voice input by the user does not include voice related to a place name or facility name, a dictionary for other voice recognition, such as that for operation functions, is selected (step S11), and the words corresponding to the input voice are searched for using that dictionary. The words obtained by the voice recognition are displayed on the display unit as described above, or a voice inquiry asking whether a specific function should be performed is output, to prompt the user to confirm.
[0064]
In step S10, when it is determined that the voice input has not ended, for example because the user performs voice input again, the process returns to step S1 and the same operation as described above is repeated. In step S9, when the user responds to the confirmation output and judges that the recognition result is not what was intended, a signal to that effect is input, the process returns to step S1, voice input for recognition is performed again, and the same operation follows.
[0065]
The place name / facility name speech recognition dictionary setting process corresponding to the current position in step S4 can be performed, for example, according to the operation flow shown in FIG. 3. At the beginning of this process, the current position data is fetched (step S21). This is done by the current position input unit 10 in FIG. 1, which takes in a signal from the current position detection unit of the navigation device 28.
[0066]
Next, in this embodiment, it is determined whether the speech recognition dictionary selection process about to be performed is the first one (step S22). When it is determined to be the first, the current position belonging block is selected (step S23). In the latitude / longitude division block unit speech recognition dictionary 13 of FIG. 1, latitude division areas divided by latitude and longitude division areas divided by longitude are defined as shown, for example, in FIG. 5A, and identification numbers are assigned in advance, in a list, to the blocks formed where the two intersect. When the current position is input, the specific block can therefore be selected from its latitude and longitude data.
[0067]
In actual block division, for example in the Japanese example shown in FIG. 9, latitude and longitude are each divided in units of one degree, giving latitude division areas A1 to A16 and longitude division areas B1 to B17, and each portion where they intersect is defined as a latitude / longitude division block. With this division, most of Tokyo, for example, is included in block (A6, B11). When dividing into blocks, the blocks may also be subdivided or combined as appropriate according to the amount of place name and facility name data each contains.
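The mapping from a position to its block can be sketched as follows. This is a minimal illustration, not the patented implementation: the grid origin (30 degrees north, 129 degrees east) is an assumption chosen only so that Tokyo lands in block (A6, B11) as the text states, and the function names `block_for_position` and `block_label` are hypothetical.

```python
import math

# Assumed grid origin: picked so that Tokyo falls in (A6, B11) as in the text.
# The actual boundaries are defined by FIG. 9, which is not reproduced here.
LAT_ORIGIN_DEG = 30.0   # assumed southern edge of latitude area A1
LON_ORIGIN_DEG = 129.0  # assumed western edge of longitude area B1
BLOCK_SIZE_DEG = 1.0    # one-degree blocks, as in the Japanese example
NUM_LAT_AREAS = 16      # A1 to A16
NUM_LON_AREAS = 17      # B1 to B17


def block_for_position(lat_deg, lon_deg):
    """Return the (A, B) indices of the block containing the position.

    Returns None when the position lies outside the divided region.
    """
    a = math.floor((lat_deg - LAT_ORIGIN_DEG) / BLOCK_SIZE_DEG) + 1
    b = math.floor((lon_deg - LON_ORIGIN_DEG) / BLOCK_SIZE_DEG) + 1
    if 1 <= a <= NUM_LAT_AREAS and 1 <= b <= NUM_LON_AREAS:
        return (a, b)
    return None


def block_label(block):
    """Format block indices in the notation used in the text."""
    a, b = block
    return f"(A{a}, B{b})"


# Central Tokyo is at roughly 35.68 N, 139.77 E.
print(block_label(block_for_position(35.68, 139.77)))
```

A fixed-size grid keeps the lookup to two divisions. The subdivided or combined blocks mentioned above would need a lookup table instead of this arithmetic.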
[0068]
Next, based on the current position belonging block obtained as described above, the peripheral blocks in a predetermined range around that block are selected (step S24). This is done by the peripheral block selection unit 12 in FIG. 1, which selects the blocks in a predetermined range around the current position belonging block selected by the current position belonging block selection unit 11.
[0069]
In practice, the peripheral blocks can be selected as in the example shown in FIG. 10A. In this example, with latitude and longitude divided as shown in FIG. 9, the current position is located in Yamanashi Prefecture, so block (A6, B10) is selected as the current position belonging block, and the eight surrounding blocks (A7, B9), (A6, B9), (A5, B9), (A7, B10), (A5, B10), (A7, B11), (A6, B11), and (A5, B11) are selected as peripheral blocks.
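The eight-neighbour selection in this example can be sketched as below. The function name `peripheral_blocks` and the `radius` parameter are illustrative assumptions. The text only requires "a predetermined range", and a radius of 1 reproduces the eight blocks listed for (A6, B10).

```python
def peripheral_blocks(block, radius=1, num_lat=16, num_lon=17):
    """Return the blocks within `radius` of `block`, excluding `block` itself.

    Blocks are (A, B) index pairs. Neighbours that would fall outside the
    A1..A16 / B1..B17 grid are dropped.
    """
    a0, b0 = block
    result = []
    for a in range(a0 - radius, a0 + radius + 1):
        for b in range(b0 - radius, b0 + radius + 1):
            if (a, b) == (a0, b0):
                continue  # the current position belonging block is not a peripheral block
            if 1 <= a <= num_lat and 1 <= b <= num_lon:
                result.append((a, b))
    return result


# Current position in Yamanashi Prefecture: block (A6, B10).
print(peripheral_blocks((6, 10)))
```

At the edge of the grid fewer than eight neighbours exist, which this sketch handles by simply omitting them.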
[0070]
Then, for the current position belonging block obtained in step S23 and the peripheral blocks in the predetermined range obtained in step S24, the speech recognition dictionaries corresponding to these blocks are selected from the latitude / longitude division block unit speech recognition dictionary 13 as a block-corresponding speech recognition dictionary group, and these are assembled into the current position corresponding block group speech recognition dictionary 14 (step S26). The process then returns to step S21, and the same operation is repeated.
[0071]
When creating the dictionary as described above, all of these data may actually be collected and recorded in a storage unit. Alternatively, it is possible to record only a list indicating which of the dictionaries in the latitude / longitude division block unit speech recognition dictionary 13 are currently in use. When only such list data is created, the search for phoneme data corresponding to the input speech during speech recognition is executed by selecting, from the latitude / longitude division block unit speech recognition dictionary 13, only the dictionaries of the blocks recorded in the list.
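The list-only variant can be illustrated with the sketch below. Everything in it is a simplifying assumption: the per-block storage is a plain in-memory mapping, the entries are invented sample data, and matching is exact string equality on a phoneme string, whereas an actual recognizer would score feature parameters approximately.

```python
# Hypothetical stand-in for the block unit dictionary storage (unit 13):
# block index -> {phoneme string: place or facility name}. Sample data only.
BLOCK_DICTIONARIES = {
    (6, 10): {"kofu eki": "Kofu Station"},
    (6, 11): {"tokyo eki": "Tokyo Station"},
    (14, 13): {"sapporo eki": "Sapporo Station"},
}


def search_active_blocks(phonemes, active_blocks, storage):
    """Search only the dictionaries of the blocks named in `active_blocks`.

    `active_blocks` plays the role of the list held as the current position
    corresponding block group dictionary (unit 14). No dictionary data is
    copied; the list merely says which stored dictionaries to consult.
    """
    hits = []
    for block in active_blocks:
        entry = storage.get(block, {}).get(phonemes)
        if entry is not None:
            hits.append((block, entry))
    return hits


active = [(6, 10), (6, 11)]
print(search_active_blocks("tokyo eki", active, BLOCK_DICTIONARIES))
print(search_active_blocks("sapporo eki", active, BLOCK_DICTIONARIES))
```

Because only the short list changes when the vehicle moves, updating the active dictionary amounts to rewriting a handful of block indices rather than copying dictionary data.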
[0072]
On the other hand, when it is determined in step S22 that the speech recognition dictionary selection process about to be performed is not the first, that is, when a speech recognition dictionary has already been selected, it is determined whether the current position belonging block has changed (step S25). In other words, when the current position changes as the vehicle moves, it is detected whether the current position has left the range of the previously selected current position belonging block. If the current position belonging block is determined not to have changed, the process returns to step S21, the current position data continues to be fetched, and the same operation is repeated.
[0073]
If it is determined in step S25 that the current position belonging block has changed, the process proceeds to step S23, where the current position belonging block is selected as described above. Then, in step S24, the peripheral blocks in the predetermined range around it are selected, and the same operation is repeated. In this embodiment, the block selection process from step S23 onward is performed only when step S25 determines that the current position belonging block has changed. Compared with always performing the process from step S23 onward every time current position data is fetched, this reduces the processing burden of the place name / facility name dictionary selection process.
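The change check of step S25 can be sketched as a small tracker that rebuilds the block group only when the belonging block changes. The class name `BlockGroupTracker`, the helper functions, and the grid origin are all illustrative assumptions. The origin is again chosen only so that Tokyo maps to (A6, B11).

```python
import math


def locate_block(lat_deg, lon_deg, lat_origin=30.0, lon_origin=129.0, size_deg=1.0):
    """Map a position to (A, B) block indices (origin values are assumed)."""
    a = math.floor((lat_deg - lat_origin) / size_deg) + 1
    b = math.floor((lon_deg - lon_origin) / size_deg) + 1
    return (a, b)


def adjacent_blocks(block):
    """Return the eight blocks surrounding `block` (grid edges ignored here)."""
    a0, b0 = block
    return [(a0 + da, b0 + db)
            for da in (-1, 0, 1) for db in (-1, 0, 1)
            if (da, db) != (0, 0)]


class BlockGroupTracker:
    """Keeps the active block group and rebuilds it only on a block change."""

    def __init__(self):
        self.current_block = None   # nothing selected yet (first pass, step S22)
        self.active_blocks = []
        self.rebuild_count = 0

    def update(self, lat_deg, lon_deg):
        """Process one position fix (step S21). Return True if rebuilt."""
        block = locate_block(lat_deg, lon_deg)
        if block == self.current_block:
            return False            # step S25: block unchanged, skip S23 and S24
        self.current_block = block  # step S23
        self.active_blocks = [block] + adjacent_blocks(block)  # steps S24, S26
        self.rebuild_count += 1
        return True


tracker = BlockGroupTracker()
for fix in [(35.66, 138.57), (35.60, 138.90), (35.68, 139.77)]:
    print(fix, tracker.update(*fix), tracker.current_block)
```

Position fixes typically arrive far more often than a vehicle crosses a one-degree boundary, so nearly all updates take the early-return path.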
[0074]
As a result of the speech recognition dictionary selection process following movement of the current position as described above, the current position corresponding block group, consisting of the current position belonging block and the peripheral blocks, changes as follows. Starting from the block group shown in FIG. 10A, when the current position enters Tokyo, the current position belonging block becomes block (A6, B11) as shown in FIG. 10B, and the peripheral blocks move as illustrated. When the current position then enters Saitama Prefecture from Tokyo, passing from latitude division area A6 into A7, the current position belonging block moves to block (A7, B11) as shown in FIG. 10C, and the peripheral blocks again move as illustrated. In this way, the dictionary used for speech recognition can be changed in accordance with the movement of the current position.
[0075]
The narrowing-down of a plurality of candidates in step S7 of FIG. 2 can be carried out sequentially according to the operation flow shown in FIG. 4, for example. That is, when the speech recognition dictionary search is performed in step S5 of FIG. 2 and step S6 determines that there is more than one search result, that is, a plurality of candidates, the embodiment of FIG. 4 first displays the candidates on the screen so that the user can check them (step S31). The device can also be set to proceed directly to step S32 without this screen display, or to display the candidates only when their number is within a predetermined limit.
[0076]
In step S32, the user inputs the category to which the intended place name or facility name belongs. A list of categories is displayed for convenience of input. The category-specific speech recognition unit 21 then searches the phoneme data in the category-specific dictionary 15 (step S33).
[0077]
It is then determined whether the search yielded only one candidate (step S34). When there is not only one, that is, when a plurality of candidates exist, the distances of these candidates from the current position are calculated (step S35), and the candidates are rearranged in order of distance and displayed as a list (step S36).
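Steps S35 and S36 can be sketched as follows. The text does not say how distance is computed, so the great-circle (haversine) formula used here is an assumption, as are the function names and the sample coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def sort_by_distance(current, candidates):
    """Return (distance_km, name) pairs ordered from nearest to farthest.

    `current` is (lat, lon); `candidates` is a list of (name, lat, lon).
    """
    with_distance = [
        (haversine_km(current[0], current[1], lat, lon), name)
        for name, lat, lon in candidates
    ]
    return sorted(with_distance, key=lambda pair: pair[0])


# Invented sample candidates around an invented current position.
here = (35.0, 139.0)
stores = [
    ("Branch C", 36.0, 140.0),
    ("Branch A", 35.0, 139.01),
    ("Branch B", 35.1, 139.0),
]
for distance, name in sort_by_distance(here, stores):
    print(f"{name}: {distance:.1f} km")
```

Taking the first element of the sorted result corresponds to the variant in which the nearest candidate is output directly without showing the list.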
[0078]
Accordingly, when the user is at the current location in FIG. 10C and wants to shop at a 7-Eleven, for example, designating the convenience store 7-Eleven as the category produces a list such as that shown in FIG. 11. From such a list the user can see that the nearest store is the 7-Eleven Inabago branch. Instead of a list, a map centered on the current location may be displayed with these facilities shown on it. The user selects the desired facility name or the like while viewing the list or map display (step S37), and the selection result is output (step S38).
[0079]
The navigation speech recognition apparatus according to the present invention operates as described above, and it can be used not only in navigation apparatuses for Japan, as in the example described above, but also in other countries around the world. In the United States in particular, the vast land area contains so many place names and facility names that the processing load on a speech recognition processing device would be too heavy if a single speech recognition dictionary were used, so the present invention is particularly effective as a voice recognition device for navigation devices there. In that case, the speech recognition dictionary is divided by latitude and longitude as shown in FIG. 12, for example. In this example, latitude division areas n1 to n10 and longitude division areas w1 to w24 are used, and each area where they intersect is a block unit for dividing the speech recognition dictionary. Since many state boundaries in the United States run along lines of latitude and longitude, the method of partitioning blocks by latitude and longitude according to the present invention is easily adapted.
[0080]
Accordingly, in the example shown in FIG. 13, when the current location is in block (n7, w15), the eight neighboring blocks are selected in the same manner as in the Japanese example, and speech recognition is performed using the speech recognition dictionaries of the nine blocks in total. The speech recognition dictionary then changes as the current position moves from the state of FIG. 13A, as shown in the remaining parts of FIG. 13. In the illustrated embodiment both latitude and longitude are divided in units of 2.5 degrees. However, as shown for example in FIG. 14, finer blocks in units of 1 degree may be used. In that case, as shown in the figure, the densely populated portion of the west coast of the United States can be divided into fine blocks of one degree and the other parts into blocks of two degrees. Areas with low population density, such as the inland central part of the United States, may also be divided into still larger blocks.
[0081]
The present invention can be implemented in various modes as described above. The speech recognition dictionary may be divided by latitude and longitude in various ways, and the peripheral blocks may likewise be selected in various ways. The narrowing-down process used when speech recognition yields a plurality of candidates may also be carried out in ways other than those of the above embodiment.
[0082]
[Effects of the invention]
Since the present invention is configured as described above, the navigation speech recognition apparatus according to the present invention solves the above-described problem. It comprises: a block unit speech recognition dictionary storage unit having a plurality of block unit speech recognition dictionaries in which phoneme data of the place names and facility names included in blocks divided by latitude and longitude, and their related data, are recorded for each block; a current position corresponding block group selection unit that selects the current position belonging block, to which the current position belongs, and peripheral blocks in a predetermined range around it; a current position corresponding block group speech recognition dictionary formed by selecting, from the block unit speech recognition dictionary storage unit, the block unit speech recognition dictionaries of the block group selected by the current position corresponding block group selection unit; and a speech recognition processing unit that searches the current position corresponding block group speech recognition dictionary for phoneme data corresponding to the speech of an input place name or facility name and outputs the result. Consequently, when speech recognition is performed using speech recognition dictionaries in which place names and facility names are set for each region, appropriate processing can be performed without placing an excessive load on the speech recognition processing device, and processing comparable to recognition against a large amount of data can be performed using a small speech recognition dictionary. Voice recognition can thereby be performed accurately and at high speed using an inexpensive data processing apparatus.
[0083]
In addition, in another navigation speech recognition apparatus according to the present invention, the blocks divided by latitude and longitude are given different sizes so that the amount of place names and facility names included in each block is as uniform as possible. This reduces the variation in speech recognition processing load from one area to another as the current location changes, so that voice recognition can be performed accurately and at high speed using an inexpensive data processing device.
[0084]
In another navigation speech recognition apparatus according to the present invention, the peripheral blocks of the block to which the current position belongs are the blocks adjacent to the current position belonging block, so the selection can be performed easily.
[0085]
In another navigation speech recognition apparatus according to the present invention, the peripheral blocks of the block to which the current position belongs include the blocks adjacent to the current position belonging block and blocks in a predetermined range around them. The target area for speech recognition can therefore be set to an arbitrary range and chosen as appropriate for the processing capability of the speech recognition processing device.
[0086]
Further, another navigation speech recognition apparatus according to the present invention changes the peripheral blocks of the block to which the current position belongs as the current position moves, so that the amount of place names and facility names is as uniform as possible. Even if the number of place names or facility names varies with the area in which the current location lies, the total amount of place names and facility names in the block group targeted for speech recognition is equalized. The processing load of speech recognition is thereby leveled, and appropriate speech recognition processing can be performed as a whole.
[0087]
Further, in another navigation speech recognition apparatus according to the present invention, the speech recognition processing unit sets the speech approximation degree larger for place names or facility names in the current position corresponding block group speech recognition dictionary that are located in blocks closer to the current position. When searching for a nearby convenience store, for example, the facility intended by the user can therefore be found appropriately. In addition, since the degree of approximation is calculated in units of blocks, less memory is needed than when it is held for each place name or facility name in units of states, prefectures, and the like, and the processing burden is reduced.
[0088]
In another navigation speech recognition apparatus according to the present invention, the current position corresponding block group speech recognition dictionary consists of a list in which the blocks of the current position corresponding block group are recorded, and the speech recognition processing unit, based on that list, searches the dictionary data of the listed blocks in the block unit speech recognition dictionary storage unit for the place name or facility name whose phoneme data corresponds to the input speech. The data recorded in the current position corresponding block group speech recognition dictionary can therefore be extremely small, consisting only of list data, and the data rewriting load is reduced, so that an inexpensive speech recognition apparatus can be obtained.
[0089]
Further, another navigation speech recognition apparatus according to the present invention performs the peripheral block selection process when the block to which the current position belongs changes due to movement of the current position and the current position belonging block selection unit selects a new current position belonging block. It is therefore unnecessary to perform the peripheral block selection process every time the current position moves, and the speech recognition dictionary selection process is simplified.
[0090]
In addition, another navigation speech recognition apparatus according to the present invention includes a plural candidate narrowing processing unit, which performs a narrowing-down process when the speech recognition processing using the current position corresponding block group speech recognition dictionary yields a plurality of candidates. Only the appropriate candidate among them can thus be narrowed down and output, giving a speech recognition apparatus that is easy to use.
[0091]
Further, another navigation speech recognition apparatus according to the present invention includes a category-specific speech recognition dictionary in which place names and facility names are recorded by category, and the plural candidate narrowing processing unit performs speech recognition processing using the category-specific speech recognition dictionary corresponding to the category designated by the user. When the speech recognition processing using the current position corresponding block group speech recognition dictionary yields a plurality of candidates, a search corresponding to the user-designated category can therefore be made with the category-specific speech recognition dictionary, and more accurate speech recognition processing can be performed.
[0092]
In another navigation speech recognition apparatus according to the present invention, the plural candidate narrowing processing unit includes a candidate point distance order arrangement unit that calculates the distance from the current location for each of the plurality of candidates obtained as a result of the speech recognition processing and arranges them in order of distance. The candidates are displayed in that order as a list on the display unit according to the output of the arrangement unit, and the user makes a selection from it. The user can thus freely select the candidate closest to the current position, or one chosen in view of other conditions, and the place name or facility name intended by the user can be recognized accurately.
[0093]
In addition, another navigation speech recognition apparatus according to the present invention calculates the distance from the current location for each of a plurality of candidates obtained as a result of the speech recognition processing and outputs the closest candidate as the speech recognition result. As a result, the candidate considered most appropriate among the plurality of candidates can be output automatically, providing an easy-to-use speech recognition apparatus that does not trouble the user.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of an embodiment of the present invention.
FIG. 2 is a basic operation flowchart of the embodiment.
FIG. 3 is an operation flowchart for performing the current position corresponding place name / facility name speech recognition dictionary setting process in the basic operation flowchart.
FIG. 4 is an operation flowchart for performing a plurality of candidate narrowing-down processes in the basic operation flowchart.
FIG. 5 is a diagram showing an example of dividing the speech recognition dictionary according to the present invention by latitude and longitude and selecting a current position belonging block and peripheral blocks: (a) shows an example of latitude and longitude divided blocks, (b) shows an example of selecting the current position belonging block and peripheral blocks, (c) shows an example in which the current position moves between blocks, and (d) shows an example in which the speech recognition dictionary changes when the current position belonging block changes.
FIG. 6 is a diagram showing examples of selecting peripheral blocks in the present invention: (a) shows an example of setting peripheral blocks in a plurality of modes, and (b) shows an example of setting them in other modes.
FIG. 7 is a diagram showing examples of subdividing a specific area when the speech recognition dictionary according to the present invention is divided by latitude and longitude: (a) shows an example of division in one mode, and (b) shows an example of division in two modes.
FIG. 8 is a diagram illustrating an example in which a current position corresponding block group is changed by movement of a current position when a specific area is subdivided when the speech recognition dictionary according to the present invention is divided by latitude and longitude.
FIG. 9 is a diagram showing an embodiment in which the speech recognition dictionary creating method for dividing by latitude and longitude according to the present invention is applied to creating a speech recognition dictionary for place names / facility names all over Japan.
FIG. 10 is a diagram showing the current position corresponding block group for voice recognition in the same Japanese embodiment, where (a), (b), and (c) show examples in which the current position corresponding block group changes with the movement of the current position.
FIG. 11 is a diagram showing an example in which a plurality of candidates obtained as a result of the speech recognition process are narrowed down by a specific category, arranged according to the distance from the current location, and displayed on the screen.
FIG. 12 is a diagram showing an embodiment in which the speech recognition dictionary creation method for dividing by latitude and longitude according to the present invention is applied to creation of a speech recognition dictionary for place names / facility names in the United States.
FIG. 13 is a diagram showing the current position corresponding block group for speech recognition in the United States embodiment, where (a) and (b) show examples in which the current position corresponding block group changes with the movement of the current position.
FIG. 14 is a diagram showing an example of subdividing the speech recognition dictionary for the west coast portion of the same embodiment, where place names and facility names are particularly numerous.
FIG. 15 is a functional block diagram showing an example of a navigation speech recognition apparatus that has been used conventionally and to which the present invention is applied.
[Explanation of symbols]
1 Microphone
4 Voice recognition unit
5 Voice recognition dictionary unit
6 Place name / facility name dictionary unit
7 Other dictionary (operation functions, etc.)
9 Current position corresponding block group selection unit
10 Current position input unit
11 Current position belonging block selection unit
12 Peripheral block selection unit
13 Latitude / longitude division block unit speech recognition dictionary storage unit
14 Current position corresponding block group speech recognition dictionary
15 Category-specific speech recognition dictionary for place names / facility names
20 Plural candidate narrowing processing unit
21 Category-specific speech recognition unit
22 Candidate point distance order arrangement narrowing unit
23 Plural candidate list display output unit
26 Output confirmation unit

Claims

A navigation speech recognition apparatus comprising:
a block unit speech recognition dictionary storage unit comprising a plurality of block unit speech recognition dictionaries in which phoneme data of place names and facility names included in blocks divided by latitude and longitude, and related data, are recorded for each block;
a current position corresponding block group selection unit that selects a current position belonging block, to which the current position belongs, and peripheral blocks in a predetermined range around the current position belonging block;
a current position corresponding block group speech recognition dictionary formed by selecting, from the block unit speech recognition dictionary storage unit, the block unit speech recognition dictionaries of the block group selected by the current position corresponding block group selection unit; and
a speech recognition processing unit that searches the current position corresponding block group speech recognition dictionary for phoneme data corresponding to the speech of an input place name or facility name and outputs the result,
wherein the current position corresponding block group selection unit changes the peripheral blocks of the current position belonging block with the movement of the current position so that the amount of place names and facility names is as even as possible.

The navigation speech recognition apparatus, wherein the blocks divided by latitude and longitude are divided with block sizes varied so that the amount of place names and facility names included in each block is as uniform as possible.

The navigation speech recognition apparatus according to claim 1, wherein the peripheral blocks of the block to which the current position belongs are the blocks adjacent to the current position belonging block.

The navigation speech recognition apparatus according to claim 1, wherein the peripheral blocks of the block to which the current position belongs include a block adjacent to the current position belonging block and a block in a predetermined range around the block.

The navigation speech recognition apparatus, wherein the speech recognition processing unit sets a speech approximation degree that is larger the closer to the current position the block containing a place name or facility name in the current position corresponding block group speech recognition dictionary is located.

The current position corresponding block group speech recognition dictionary consists of a list in which blocks of the current position corresponding block group are recorded,
The navigation speech recognition apparatus according to claim 1, wherein the speech recognition processing unit selects and searches the dictionaries of the blocks recorded in the list when searching for phoneme data corresponding to the input speech.

The navigation speech recognition apparatus, wherein the peripheral block selection process is performed when the block to which the current position belongs changes due to movement of the current position and the current position belonging block selection unit selects a new current position belonging block.

The navigation speech recognition apparatus according to claim 1, further comprising a plural candidate narrowing processing unit that performs a narrowing-down process when a plurality of candidates exist as a result of the speech recognition processing using the current position corresponding block group speech recognition dictionary.

A category-specific speech recognition dictionary that records place names and facility names by category,
The navigation speech recognition apparatus according to claim 8, wherein the plural candidate narrowing processing unit performs speech recognition processing using the category-specific speech recognition dictionary corresponding to a category designated by a user.

The plural candidate narrowing processing unit includes a candidate point distance order arrangement unit that calculates the distance from the current location for each of the plurality of candidates obtained as a result of the speech recognition processing and arranges them in order of distance according to the calculation result,
The navigation speech recognition apparatus according to claim 8, wherein the plurality of candidates are displayed in that order as a list on the display unit according to the output of the candidate point distance order arrangement unit, and the user makes a selection based on the list.

The navigation speech recognition apparatus according to claim 8, wherein the distance from the current location is calculated for each of a plurality of candidates obtained as a result of the speech recognition processing, and the closest candidate is output as the speech recognition result.