JP2005114964A

JP2005114964A - Method and processor for speech recognition

Info

Publication number: JP2005114964A
Application number: JP2003348143A
Authority: JP
Inventors: Zenichi Hirayama; 善一平山
Original assignee: Xanavi Informatics Corp
Current assignee: Faurecia Clarion Electronics Co Ltd
Priority date: 2003-10-07
Filing date: 2003-10-07
Publication date: 2005-04-28

Abstract

PROBLEM TO BE SOLVED: To reduce malfunctions of an on-vehicle navigation device that accepts place names by voice input, by improving its speech recognition accuracy.

SOLUTION: A speech recognition part processes an input voice, outputs a plurality of recognition results, and holds them as candidate data in decreasing order of matching probability. Before a result is output to other processing parts as the speech recognition result, the held candidate data are checked against map data in order from the highest matching probability, and when candidate data matching the map data are found, those candidate data are output to other functions as the speech recognition result.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a technique for recognizing an address by voice.

A vehicle-mounted navigation device is known that calculates the current position of a vehicle, displays it on a screen such as a liquid crystal display, displays on the screen the road the vehicle should follow toward a specified destination, and performs route guidance.

In recent years, techniques for entering instructions by voice, instead of entering commands by manual operation, have also been adopted in vehicle-mounted navigation devices.

For example, there is a navigation device that lets the user enter, by natural voice, information on which navigation is based, such as place names and control command names (see, for example, Patent Document 1). This navigation device includes a speech recognition decoder and takes as input a candidate word selected from dictionary information prepared in advance as being close to the word represented by the speech input through a microphone.

Patent Document 1: JP 09-292255 A

However, in the navigation device disclosed in Patent Document 1, a place name made up of candidate words that the speech recognition decoder judged to be close to the input speech is output as the recognition result regardless of whether that place name exists on the map. Consequently, when a place name that does not exist on the map is output as the recognition result, subsequent processing such as route search cannot be performed.

The present invention has been made to solve this problem, and its object is to provide a speech recognition processing technique that, when recognizing a place name input by voice, reliably outputs only place names that exist on the map as the recognition result.

The present invention provides a speech recognition method for processing address data input by voice, in which only addresses that actually exist in the map data are output as recognition results.

For example, a speech recognition method of the present invention is a method for recognizing address data input by voice in a device having map data storage means that stores map data including address data specifying points on a map. The method comprises: a voice input receiving step of receiving voice input from an operator; a candidate data generation step of performing speech recognition processing on the voice received in the voice input receiving step and storing one or more speech recognition results in a candidate database as candidate data, in descending order of consistency; and a recognition result determination step of extracting the candidate data stored in the candidate database in descending order of consistency, judging whether address data matching the extracted candidate data exists in the map data, outputting the extracted candidate data as the final recognition result when a match exists, repeating the extraction and judgment with the candidate of next-highest consistency when no match exists until all candidate data stored in the candidate database have been extracted, and, when none of the candidate data matches the address data in the map data, outputting data indicating no match in place of a final recognition result.
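The recognition result determination step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `determine_result`, the `NO_MATCH` sentinel, and the representation of the map's address data as a Python set of strings are all hypothetical.

```python
from typing import Iterable, Optional, Set

# Hypothetical sentinel standing in for the "data indicating no match".
NO_MATCH = None


def determine_result(candidates: Iterable[str], map_addresses: Set[str]) -> Optional[str]:
    """Return the first candidate that exists in the map address data.

    `candidates` is assumed to be ordered from highest to lowest
    consistency, as stored by the candidate data generation step.
    """
    for candidate in candidates:
        if candidate in map_addresses:
            return candidate      # final recognition result
    return NO_MATCH               # every candidate failed the map check


# Illustrative data only: the top-ranked candidate is not on the map.
ranked = ["25-chome 6-banchi 35-go", "2-chome 6-banchi 35-go"]
on_map = {"2-chome 6-banchi 35-go", "5-chome 7-banchi 40-go"}
result = determine_result(ranked, on_map)
```

Because the loop stops at the first hit, a lower-ranked candidate is only ever returned when every higher-ranked one is absent from the map.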

Another speech recognition method of the present invention is a method for recognizing address data input by voice in a device having map data storage means that stores map data including address data specifying points on a map. The method comprises: a voice input receiving step of receiving voice input from an operator; a speech processing step of performing speech recognition processing on the voice received in the voice input receiving step; and a recognition result determination step of comparing the speech recognition result obtained in the speech processing step with a recognition candidate database, in which address data extracted from a specific range of the map data are registered, and outputting the most consistent entry of the recognition candidate database as the final recognition result.
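This second method reverses the order of the checks: the candidates come from the map first, so any output necessarily exists on the map. The sketch below illustrates the idea only. The function names, the dictionary layout of the map data, and the use of `difflib.SequenceMatcher` as a similarity score are assumptions; the text does not specify how consistency is measured.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def build_recognition_candidates(map_data: Dict[str, List[str]], area: str) -> List[str]:
    """Extract the address data for one area of the map (the "specific range")."""
    return list(map_data.get(area, []))


def best_match(recognized: str, candidates: List[str]) -> str:
    """Pick the candidate most similar to the raw recognition output.

    SequenceMatcher.ratio() is only a stand-in for the recognizer's own
    consistency measure. `candidates` must be non-empty.
    """
    return max(candidates, key=lambda c: SequenceMatcher(None, recognized, c).ratio())


# Hypothetical map data: one area with three registered addresses.
map_data = {"example-oaza": ["2-6-35", "5-7-40", "3-1-2"]}
registered = build_recognition_candidates(map_data, "example-oaza")
final = best_match("25-6-35", registered)
```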

According to the present invention, a device that accepts place names by voice input can be provided with a speech recognition processing technique that reliably outputs only place names existing on the map as recognition results.

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a hardware configuration diagram of the navigation device 100 of this embodiment. As shown in the figure, the navigation device 100 includes a CPU 101, a ROM 102, a RAM 103, a display device 104, an input/output interface device 105, a recording medium drive 106, a GPS (Global Positioning System) receiver 111, a direction sensor 112, a vehicle speed sensor 113, an A/D conversion unit 121, a D/A conversion unit 122, an input device 123, a voice input unit 124 such as a microphone, and a voice output unit 125 such as a speaker.

The GPS receiver 111 receives signals from GPS satellites. The direction sensor 112 consists of a geomagnetic sensor, an optical gyro, or the like, and acquires the signals from which the traveling direction of the vehicle is detected. The vehicle speed sensor 113 generates pulses corresponding to the rotational speed of the wheels.

The voice input unit 124 consists of a microphone or the like and receives voice input. The A/D conversion unit 121 performs A/D conversion on the voice received by the voice input unit 124. The voice output unit 125 consists of a speaker or the like and outputs the audio signal that has undergone D/A conversion in the D/A conversion unit 122. The utterance switch 123a, once pressed, sends out a signal that enables voice input for a predetermined period.

The input/output interface device 105 sends the signals acquired from the GPS receiver 111, the direction sensor 112, the vehicle speed sensor 113, the utterance switch 123a, the A/D conversion unit 121, and the D/A conversion unit 122 onto the bus line 110.

The recording medium drive 106 reads data from a recording medium, such as a CD-ROM or DVD, on which digital map data is recorded and which is loaded in a recording medium mounting unit (not shown). The digital map data recorded on the medium contains road data in a format suitable for navigation processing such as general route guidance, together with address data stored as information that specifies positions on the map.

The display device 104 displays the various data processed for display by the navigation device 100, such as the current position of the vehicle, a road map of the surrounding area, route information to the destination, and guidance information.

The input device 123 receives commands from the operator, such as input of information to be displayed on the display device 104 and screen switching. It consists of switches, a touch panel that uses the display screen of the display device 104, a joystick, and the like. The input device 123 also includes the utterance switch 123a, which accepts an instruction to switch the navigation device 100 into a mode in which voice input from the voice input unit 124 is accepted for a predetermined time.

The CPU 101 controls the entire navigation device 100 by executing a program stored in the ROM 102, which is a nonvolatile memory. The RAM 103 is a volatile memory and serves as a work data area.

The bus line 110 interconnects the components of the navigation device 100, such as the CPU 101.

Next, the functional configuration of the navigation device 100 of this embodiment is described. FIG. 2 is a functional configuration diagram of the navigation device 100.

As shown in the figure, the navigation device 100 includes a control unit 201, an operation/input unit 211, a display/output unit 212, a current position calculation unit 221, a route search unit 222, a recognition result determination unit 223, a speech recognition unit 224, a speech synthesis unit 225, a map database 231, a recognition dictionary 232, and a candidate database 233.

The current position calculation unit 221 uses the signals obtained from the GPS receiver 111, the direction sensor 112, and the vehicle speed sensor 113, together with the map data read from the recording medium via the recording medium drive 106, to calculate the information needed for navigation processing, such as the current position and traveling direction of the vehicle.

The speech recognition unit 224 performs speech recognition processing on the voice signal input from the voice input unit 124, compares the result against the dictionary data in the recognition dictionary 232, generates a predetermined number of candidate data items ranked in descending order of consistency, and stores them in the candidate database 233. It then notifies the recognition result determination unit 223 that processing is complete.

As described later, the candidate data stored in the candidate database 233 is compared against the data stored as address data in the map database 231. The candidate data is therefore converted into a format that permits this comparison before it is stored in the candidate database 233.

In this embodiment, an address is divided into a prefecture name, a municipality name, an oaza (district) name, and a chome-banchi-go (block, lot, and house number). The description takes as an example a configuration in which input is prompted for each component, each received input is processed by the speech recognition unit 224, and the result is confirmed by the recognition result determination unit 223 described later. The candidate data is therefore stored in the candidate database 233 per address component.

The method of entering address data by voice and the recognition processing method are not limited to this and may be chosen as appropriate. For example, all components of the address may be received at once, stored in memory, and recognized in suitable units; or input may be received in two parts, the part containing numbers (the chome-banchi-go) and the remainder (referred to as the prefecture-municipality-oaza name), with recognition carried out on each part.

FIG. 3 shows an example of candidate data processed by the speech recognition unit 224 and stored in the candidate database 233. The example shows the data stored when, among the components of the address data, the chome-banchi-go has been input.

As shown in the figure, the candidate database 233 holds priority data 233a indicating the priority order, candidate data 233b, and confirmed part data 233c indicating the address components confirmed so far.
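One way to picture a row of the candidate database 233 is as a small record with the three fields named above. The sketch below is an assumption about layout, not the format used in FIG. 3: the class name, field names, and the placeholder confirmed-part string are hypothetical, while the two candidate strings are the ones used in the worked example of this description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateRecord:
    priority: int          # priority data 233a (1 = most consistent)
    candidate: str         # candidate data 233b
    confirmed_part: str    # confirmed part data 233c


# Placeholder for the components confirmed so far (hypothetical names).
CONFIRMED = "Example-ken Example-shi Example-oaza"

records: List[CandidateRecord] = [
    CandidateRecord(2, "2-chome 6-banchi 35-go", CONFIRMED),
    CandidateRecord(1, "25-chome 6-banchi 35-go", CONFIRMED),
]

# Process candidates in priority order, highest priority first.
by_priority = sorted(records, key=lambda r: r.priority)
```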

Although the candidate data is shown here as character strings, it is not limited to this form. As noted above, the candidate data may take any form that can be compared against the information in the map data.

The recognition result determination unit 223 compares the candidates, in the priority order stored in the candidate database 233, against the data in the map database 231 and outputs the one that matches as the result.

On receiving the processing completion notification from the speech recognition unit 224, the recognition result determination unit 223 accesses the map database 231, reads the confirmed part data 233c from the candidate database 233, and extracts from the map database the data for the next-level component within the address determined by the confirmed part data 233c. The next-level component is the municipality name if the address is confirmed up to the prefecture name, the oaza name if it is confirmed up to the municipality name, and the chome-banchi-go if it is confirmed up to the oaza name.
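The lookup of the next-level component can be illustrated with a nested dictionary standing in for the map database. This is a sketch under assumptions: the level names, both function names, and the tree-shaped map layout are hypothetical and chosen only to mirror the prefecture, municipality, oaza, chome-banchi-go hierarchy described above.

```python
from typing import Dict, List, Optional

# Address levels in the order used by this embodiment.
LEVELS = ["prefecture", "municipality", "oaza", "chome_banchi_go"]


def next_level(confirmed: List[str]) -> Optional[str]:
    """Name of the component that follows the confirmed ones, or None if complete."""
    return LEVELS[len(confirmed)] if len(confirmed) < len(LEVELS) else None


def next_level_entries(map_tree: Dict, confirmed: List[str]) -> List[str]:
    """List the map entries one level below the confirmed part of the address."""
    node = map_tree
    for component in confirmed:
        node = node[component]      # descend through the confirmed components
    return list(node)


# Hypothetical map data as a nested dict; the last level maps to empty dicts.
map_tree = {
    "Example-ken": {"Example-shi": {"Example-oaza": {"2-6-35": {}, "5-7-40": {}}}}
}
entries = next_level_entries(map_tree, ["Example-ken", "Example-shi", "Example-oaza"])
```

Restricting the comparison to `entries` keeps each check small, since only addresses under the already confirmed part need to be examined.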

Next, the candidate data 233b in the candidate database 233 is compared, in descending order of priority, against the data extracted from the map database 231. As soon as a match is found, it is output as the recognition result, and the confirmed part with that component appended is registered in the candidate database 233 as the new confirmed part data 233c.

For example, suppose the address has been confirmed up to the oaza name and the highest chome-banchi-go in that oaza is 5-chome 7-banchi 40-go. When the candidate data 233b shown in FIG. 3 is compared, the first-priority candidate, 25-chome 6-banchi 35-go, is judged not to match, and processing moves on to the candidate of the next priority. The second-priority candidate, 2-chome 6-banchi 35-go, is then extracted and compared, judged to match, and output as the recognition result.

When a match is found, the device may be configured to output the result to the display device 104 and/or the voice output unit 125 and obtain the operator's approval. If the operator does not approve, the comparison continues in the same way from the candidate of the next priority, and each result is output in turn until approval is obtained. Once approved, the result is output as confirmed.

If all candidate data have been processed as above and none matches, or none is approved, the recognition result determination unit 223 outputs data indicating an unsuccessful match from the display device 104 and/or the voice output unit 125 to inform the operator.
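The optional approval step adds a second condition to the loop over candidates. The following sketch shows one possible shape for it. The function name and the `approve` callback, which stands in for presenting a result on the display or by voice and reading the operator's answer, are assumptions for illustration.

```python
from typing import Callable, Iterable, Optional, Set


def determine_with_approval(
    candidates: Iterable[str],
    map_entries: Set[str],
    approve: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that is on the map and approved by the operator.

    Returns None when no candidate both matches and is approved, which
    corresponds to reporting an unsuccessful match.
    """
    for candidate in candidates:          # highest priority first
        if candidate not in map_entries:
            continue                      # no match: try the next candidate
        if approve(candidate):            # present the match and ask the operator
            return candidate
    return None


# Illustrative run: the operator rejects the first match and accepts the second.
answers = iter([False, True])
confirmed = determine_with_approval(
    ["9-9-9", "2-6-35", "2-6-36"], {"2-6-35", "2-6-36"}, lambda _c: next(answers)
)
```

Note that the operator is only asked about candidates that already passed the map check, so a nonexistent address is never offered for approval.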

The route search unit 222 performs route search processing according to the instruction received from the operator via the input device 123, using the current position, the destination, and the data in the map database 231, and outputs the requested result.

For example, when an instruction to perform route guidance is input via the input device 123 or the voice input unit 124, the route search unit 222 uses the current position data calculated by the current position calculation unit 221 and the destination data, which is either input via the input device 123 or input via the voice input unit 124 and processed by the speech recognition unit 224 and the recognition result determination unit 223. It acquires from the map database 231 the road data connecting the current position and the destination and the other data needed for route guidance, determines a guidance route from the current position to the destination using Dijkstra's algorithm or the like, and sends the result to the display/output unit 212.
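Dijkstra's algorithm, named above as one option for route determination, can be sketched as follows. The road network, node names, and link costs are invented for illustration, and the adjacency-list representation is an assumption; the text does not describe the actual road data format.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def dijkstra_route(graph: Graph, start: str, goal: str) -> Tuple[float, List[str]]:
    """Return (total cost, node list) of a cheapest route, or (inf, []) if none."""
    best = {start: 0.0}
    previous: Dict[str, str] = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue                       # stale queue entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:               # walk back from goal to start
        path.append(previous[path[-1]])
    return best[goal], path[::-1]


# Hypothetical road network: nodes are intersections, weights are link costs.
roads: Graph = {
    "current": [("A", 2.0), ("B", 5.0)],
    "A": [("B", 1.0), ("destination", 6.0)],
    "B": [("destination", 2.0)],
}
cost, route = dijkstra_route(roads, "current", "destination")
```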

The speech synthesis unit 225 converts the results processed by each functional unit into speech data for utterance, using existing speech synthesis software, so that they can be output as voice from the display/output unit 212. The converted speech data is output by the display/output unit 212, described later, via the D/A conversion unit 122 and the voice output unit 125.

The operation/input unit 211 receives signals input via the input device 123 and signals input via the voice input unit 124 and converted to digital form by the A/D conversion unit 121, and sends them to each functional unit via the control unit 201.

The display/output unit 212 outputs the results processed by each functional unit to the display device 104 and the voice output unit 125.

The control unit 201 controls the functional operation of the entire navigation device 100. In this embodiment, for example, when the utterance switch 123a of the input device 123 is pressed, the control unit 201 performs control so that voice input is accepted via the voice input unit 124 for a predetermined time.

Each of these functional units is realized by the CPU 101 executing a program stored in the ROM 102. The candidate database 233 is stored in the RAM 103. The map database 231 and the recognition dictionary 232 may remain on the recording medium, or may be read by the recording medium drive 106 when the recording medium is loaded and stored temporarily in the RAM 103.

The processing procedure of the navigation device 100 of this embodiment, for the case where a route from the current position to a destination is searched using a destination address input by voice, is described below with reference to the drawings.

FIG. 4 shows the overall processing flow in the navigation device 100. As shown in the figure, on receiving a route search instruction via the input device 123 or the like (step 401), the control unit 201 prompts for input of the destination and has the speech recognition unit 224 and the recognition result determination unit 223 process the input, confirming each component of the destination address data in turn (steps 402 to 405).
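The component-by-component confirmation of the destination can be pictured as a loop over the address levels. The sketch below is illustrative only: the `confirm_component` callback is a hypothetical stand-in for prompting, recognition, and the map check of one component, and the text does not state that steps 402 to 405 map one-to-one onto the four levels assumed here.

```python
from typing import Callable, List, Optional

LEVELS = ["prefecture", "municipality", "oaza", "chome_banchi_go"]


def enter_destination(
    confirm_component: Callable[[str, List[str]], Optional[str]],
) -> Optional[List[str]]:
    """Confirm the destination address one component at a time.

    `confirm_component(level, confirmed_so_far)` returns the confirmed
    value for that level, or None when recognition fails.
    """
    confirmed: List[str] = []
    for level in LEVELS:
        value = confirm_component(level, confirmed)
        if value is None:
            return None            # recognition failed for this component
        confirmed.append(value)
    return confirmed               # complete address, ready for route search


# Stub callback with canned answers, for illustration only.
canned = {
    "prefecture": "Example-ken",
    "municipality": "Example-shi",
    "oaza": "Example-oaza",
    "chome_banchi_go": "2-6-35",
}
address = enter_destination(lambda level, _so_far: canned[level])
```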

On being notified by the recognition result determination unit 223 that the destination has been confirmed, the current position calculation unit 221 calculates the current position (step 406) and notifies the route search unit 222.

On receiving the destination from the recognition result determination unit 223 and the current position from the current position calculation unit 221, the route search unit 222 uses the map database 231 to search for a route from the current position to the destination and calculates the guidance route to be shown on the display device via the display/output unit 212 (step 407).

Next, the processing that confirms each component of the address data within the above flow is described, taking confirmation of the chome-banchi-go as an example.

FIG. 5 shows the processing flow by which the speech recognition unit 224 and the recognition result determination unit 223 confirm the chome-banchi-go from the input voice. Essentially the same processing is performed when confirming the other components of the address data.

Once the recognition result determination unit 223 has confirmed the address up to the oaza name, the control unit 201 causes a display or output that prompts for input of the chome-banchi-go (step 501). Specifically, the display/output unit 212 either displays on the display device 104 a screen requesting input of the destination's chome-banchi-go, or synthesizes a voice prompt and outputs it from the voice output unit 125 via the D/A conversion unit 122.

The control unit 201 then waits to receive, from the operation/input unit 211, a voice input permission instruction issued through the utterance switch 123a. On receiving the instruction, the control unit 201 controls the input/output interface device 105 so that voice input is accepted and waits for input from the voice input unit 124.

On receiving voice input via the voice input unit 124 and the A/D conversion unit 121, the operation/input unit 211 sends the received voice signal to the speech recognition unit 224 (step 502).

The speech recognition unit 224 performs speech recognition processing on the received voice signal, uses the recognition dictionary 232 to extract candidate data for the chome-banchi-go in descending order of consistency, and stores it in the candidate database 233. It then notifies the recognition result determination unit 223 that storage is complete (step 503).

On receiving the completion notification, the recognition result determination unit 223 takes the candidate data 233b stored in the candidate database 233 in priority order and compares it against the data extracted from the map database 231 (steps 504, 505, 507). If a candidate matches, it is output to the display device 104 or the voice output unit 125 to obtain approval (steps 505, 506).

If approval is obtained, the recognition result determination unit 223 appends the candidate data, as the result, to the confirmed part data 233c in the candidate database 233. If the confirmed component is the last component of the address data, that is, the chome-banchi-go, the unit sends the confirmed part data 233c and data indicating the end of processing to the route search unit 222 and the current position calculation unit 221. For any other component, it notifies the control unit 201 to display a screen prompting for input of the next-level component of the address data.

If there is no match in step 505, or approval is not obtained in step 506, the recognition result determination unit 223 extracts the candidate data 233b of the next priority and repeats the comparison of step 504.

If all candidate data have been compared and none matches, or none is approved, data signifying "speech recognition failure" is displayed on the display device 104 or output from the voice output unit 125, and processing ends.

Alternatively, instead of ending processing at this point, the device may return to step 501 and prompt again for input of the chome-banchi-go.

As described above, the navigation device 100 of the first embodiment holds, for address data input by voice, the plural candidate data obtained by speech recognition, ranked in descending order of consistency at recognition time, and outputs to subsequent processing functions, as the final result, the highest-priority candidate that matches the map data. The result of processing voice-input address data is therefore always an address that exists in the map data.

Consequently, the navigation device 100 of this embodiment never passes a nonexistent address, produced by misrecognizing a voice-input address, to subsequent processing functions as the recognition result, so address data absent from the map no longer disrupts subsequent processing.

In the first embodiment, the speech recognition unit 224 stores the speech recognition results in the candidate database 233 ranked in descending order of consistency, and the recognition result determination unit 223 checks for a match in that priority order, but the method is not limited to this. For example, the speech recognition unit 224 may store all results at or above a predetermined consistency in the candidate database 233 without ranking them, and the recognition result determination unit 223 may present the matching ones to the operator and take the one that is approved as the final result.
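The unranked variant can be sketched as a filter rather than a loop that stops at the first hit. The function name, the score dictionary, and the 0.0 to 1.0 score scale are assumptions made for the example; the text only speaks of a "predetermined consistency".

```python
from typing import Dict, List, Set


def matches_above_threshold(
    scored_results: Dict[str, float], map_entries: Set[str], threshold: float
) -> List[str]:
    """Return all results scoring at least `threshold` that exist in the map.

    No priority order is assigned; the returned matches would all be shown
    to the operator, who approves one of them.
    """
    return [
        text
        for text, score in scored_results.items()
        if score >= threshold and text in map_entries
    ]


# Illustrative scores; the 0.0 to 1.0 scale is an assumption.
scored = {"25-6-35": 0.91, "2-6-35": 0.88, "2-6-36": 0.86, "5-7-40": 0.40}
to_present = matches_above_threshold(scored, {"2-6-35", "2-6-36", "5-7-40"}, 0.8)
```

Here "25-6-35" is dropped because it is not on the map and "5-7-40" because its score is below the threshold, leaving two matches for the operator to choose from.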

Next, a second embodiment of the present invention will be described.

The hardware configuration of the navigation device 100 of this embodiment is basically the same as that of the first embodiment.

FIG. 6 is a functional configuration diagram of the navigation device 100 of this embodiment. As shown in the figure, the navigation device 100 includes a control unit 301, an operation/input unit 311, a display/output unit 312, a current position calculation unit 321, a route search unit 322, a candidate data generation unit 323, a speech recognition unit 324, a speech synthesis unit 325, a map database 331, a conversion database 332, and a recognition candidate database 333.

Of these functional units, all except the candidate data generation unit 323, the speech recognition unit 324, and the recognition candidate database 333 have basically the same functions as the identically named units of the first embodiment. As in the first embodiment, each functional unit is implemented by the CPU 101 executing a program stored in the ROM 102, and the map database 331 and the conversion database 332 are recorded on a recording medium and are either read via the recording medium drive as needed, or read when the recording medium is loaded and held temporarily in the RAM 103. The recognition candidate database 333 is stored in the RAM 103.

The candidate data generation unit 323, the speech recognition unit 324, the conversion database 332, and the recognition candidate database 333 are described below.

The candidate data generation unit 323 extracts candidate data from the map database 331. Specifically, it accesses the map database 331 and, for the address made up of the components already confirmed and held by the speech recognition unit 324 (described later), extracts the components at the next level below the confirmed ones. It then uses the conversion database 332 to convert the extracted data into phonetic character string data representing its reading, and registers that data in the recognition candidate database 333 as candidate data.

As in the first embodiment, the "next-level component" is defined as follows. Suppose address data consists of the components prefecture name, municipality name, oaza (district) name, and chome-banchi-go (the numeric block, lot, and building number). If the address is confirmed up to the prefecture name, the next level is the municipality name; if confirmed up to the municipality name, it is the oaza name; and if confirmed up to the oaza name, it is the chome-banchi-go.
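The level ordering just described can be written down as a small lookup. The sketch below is illustrative only; the names `LEVELS` and `next_level`, and the choice to represent the confirmed components as a plain sequence, are assumptions made for the example.

```python
from typing import Optional, Sequence

# Component levels from coarsest to finest, in the order given in the text.
LEVELS = ("prefecture", "municipality", "oaza", "chome-banchi-go")


def next_level(confirmed: Sequence[str]) -> Optional[str]:
    """Return the level to request next, or None if the address is complete.

    confirmed: the component values confirmed so far, coarsest first
    (hypothetical representation).
    """
    if len(confirmed) >= len(LEVELS):
        return None
    return LEVELS[len(confirmed)]


print(next_level(["Kanagawa"]))
```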

The conversion database 332 is described here. It stores a correspondence table for converting the address data stored in the map database 331 into phonetic character string data representing its reading.

The recognition candidate database 333 stores the candidate data, consisting of phonetic character strings, generated by the candidate data generation unit 323 as described above.
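The interplay of the map database, the conversion database, and the recognition candidate database described above might look roughly like the following. This is a minimal sketch under loud assumptions: the dictionary layouts, the sample place names and readings, and the function name `generate_candidates` are all hypothetical and chosen only to make the data flow visible.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical map database: confirmed components -> next-level components.
MAP_DB: Dict[Tuple[str, ...], List[str]] = {
    (): ["東京都", "神奈川県"],
    ("神奈川県",): ["座間市", "横浜市"],
    ("神奈川県", "座間市"): ["広野台", "相模が丘"],
}

# Hypothetical conversion database: written form -> phonetic string (kana).
CONVERSION_DB: Dict[str, str] = {
    "東京都": "とうきょうと",
    "神奈川県": "かながわけん",
    "座間市": "ざまし",
    "横浜市": "よこはまし",
    "広野台": "ひろのだい",
    "相模が丘": "さがみがおか",
}


def generate_candidates(confirmed: Sequence[str]) -> List[str]:
    """Build the phonetic candidate list for the level after `confirmed`."""
    # Extract the next-level components under the confirmed part of the address.
    next_components = MAP_DB.get(tuple(confirmed), [])
    # Convert each written form to its reading; skip entries with no reading.
    return [CONVERSION_DB[c] for c in next_components if c in CONVERSION_DB]


# The returned list plays the role of the recognition candidate database.
print(generate_candidates(["神奈川県"]))
```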

The speech recognition unit 324 performs speech recognition processing on the address name spoken into the voice input unit 124, compares the result against the candidate data stored in the recognition candidate database 333, and outputs the best-matching candidate as the result. When address data is confirmed component by component, it also holds the components already confirmed.

In this embodiment, the best-matching candidate is the one with the largest number of matching phonetic characters, taking the order of the characters in the string into account. This criterion can be chosen as appropriate.
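One possible reading of "largest number of matching phonetic characters, in order" is sketched below using the matching blocks of Python's standard `difflib.SequenceMatcher`. This is a stand-in chosen for the example, not the measure used in the patent, which leaves the criterion open; the function names `matching_characters` and `best_candidate` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def matching_characters(recognized: str, candidate: str) -> int:
    """Count characters that match between the two strings, in order."""
    matcher = SequenceMatcher(None, recognized, candidate, autojunk=False)
    # Matching blocks are non-overlapping and order-preserving.
    return sum(block.size for block in matcher.get_matching_blocks())


def best_candidate(recognized: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the candidate with the most in-order matching characters."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: matching_characters(recognized, c))


# A recognition result with one wrong character still selects the right entry.
print(best_candidate("ひろのたい", ["ひろのだい", "さがみがおか"]))
```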

In this embodiment, the speech recognition unit 324 compares phonetic character strings with each other, but the comparison is not limited to this. For example, the conversion database 332 may store data for converting data in the same format as the address data of the map database 331 into predetermined voice patterns. The candidate data generation unit 323 then uses the conversion database 332 to generate voice patterns for the address data in the map database 331. The speech recognition unit 324 performs speech recognition processing on the input speech to generate a voice pattern of the same kind as those registered in the recognition candidate database 333, and compares the two voice patterns.

Likewise, although the best-matching candidate is output as the result, this is not limiting. For example, all candidates at or above a predetermined degree of match may be output, and the one the operator approves may be selected as the final result.

Next, the processing procedure of the navigation device 100 of this embodiment for searching a route from the current location to a destination whose address is input by voice is described with reference to the drawings.

The overall flow, in which an instruction to perform a route search is received via the input device 123 or the like, the destination and current location are determined, and the route is searched, is the same as in the first embodiment and is not described here. What is described here is the processing that confirms each component of the address data within the flow shown in FIG. 4 of the first embodiment, taking the confirmation of the chome-banchi-go as an example.

FIG. 7 shows the processing flow by which the speech recognition unit 324 and the candidate data generation unit 323 confirm the chome-banchi-go from input speech. Basically the same processing is performed when confirming the other components of the address data.

Once the speech recognition unit 324 has confirmed the address up to the oaza name, the control unit 301 produces a display or output prompting input of the chome-banchi-go (step 701). Specifically, the display/output unit 312 displays on the display device 104 a screen instructing the operator to input the chome-banchi-go of the destination, or synthesizes speech prompting input of the destination and outputs it from the audio output unit 125 via the D/A conversion unit 122.

The control unit 301 then waits to receive, from the operation/input unit 311, a voice input permission instruction issued via the utterance switch 123a, and at the same time instructs the candidate data generation unit 323 to generate candidate data for the chome-banchi-go.

On receiving the instruction to generate candidate data, the candidate data generation unit 323 accesses the map database 331 and extracts from it all chome-banchi-go data belonging to the oaza already confirmed by the speech recognition unit 324 (step 702). It converts the extracted chome-banchi-go data into candidate data using the conversion database 332 (step 703) and stores the candidate data in the recognition candidate database 333 (step 704).

Meanwhile, on receiving the voice input permission instruction via the utterance switch 123a, the control unit 301 controls the input/output interface device 105 so that voice input can be accepted, and waits for input from the voice input unit 124.

On receiving voice input via the voice input unit 124 and the A/D conversion unit 121, the operation/input unit 311 sends the received voice signal to the speech recognition unit 324.

The speech recognition unit 324 performs speech recognition processing on the received voice signal (step 706), compares the result against the candidate data stored in the recognition candidate database 333, outputs the best-matching candidate to the display device 104 or the audio output unit 125 as the recognition result (step 707), and waits for approval to be input (step 708).

On receiving an approval instruction, the speech recognition unit 324 adds the approved recognition result to the already confirmed component data that it holds.

If the component just confirmed is the last component of the address data, that is, the chome-banchi-go, the speech recognition unit 324 notifies the route search unit 322 and the current position calculation unit 321 of the address data made up of the confirmed components, together with data indicating the end of processing. For any other component, it notifies the control unit 301 to display a screen prompting input of the next-level component of the address data, and notifies the candidate data generation unit 323 of the components confirmed so far.

If approval is not obtained in step 708, the speech recognition unit 324 causes the display device 104 or the audio output unit 125 to display or output data meaning "speech recognition failed" and ends the processing. Alternatively, instead of ending the processing, the flow may return to step 701 and prompt input of the chome-banchi-go again. The flow may also return to step 706, adopt the candidate with the next-highest degree of match as the recognition result, and carry out the processing from step 707 onward.
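The last variant above, falling back to the next-best candidate when the operator withholds approval, can be sketched as a simple loop. This is an illustration only: the function name `confirm_with_operator` and the `approve` callback, which stands in for the display or speech output and the approval input, are assumptions.

```python
from typing import Callable, List, Optional


def confirm_with_operator(
    ranked_candidates: List[str],
    approve: Callable[[str], bool],
) -> Optional[str]:
    """Offer candidates best-first until the operator approves one.

    ranked_candidates: candidates already sorted by degree of match,
    best first (hypothetical input).
    approve: presents one candidate and returns True if it is approved
    (hypothetical stand-in for the approval input).
    Returns None if every candidate is rejected, i.e. recognition failed.
    """
    for candidate in ranked_candidates:
        if approve(candidate):
            return candidate
    return None


# Simulated operator who rejects the first proposal and accepts the second.
answers = iter([False, True])
print(confirm_with_operator(["1-2-9", "1-2-3", "7-2-3"], lambda c: next(answers)))
```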

In this embodiment, the address is input and approved one component at a time, but this is not limiting. Several consecutive components, such as the prefecture name and the municipality name, may be combined and speech recognition performed on each such unit. In that case, candidate data is generated and held for each unit.

For a component defined by numbers, such as chome-banchi-go data, it is not necessary to extract all chome-banchi-go data for the relevant oaza from the map database 331 and generate candidate data from it. Instead, the maximum values of the chome, banchi, and go may each be extracted, and candidate data prepared that allows any chome, banchi, or go up to its maximum to be recognized.
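The maximum-value variant above can be sketched as follows. The tuple layout (chome, banchi, go), the function name `numeric_ranges`, and the assumption that each number starts at 1 are all hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (chome, banchi, go) for one oaza.
Address = Tuple[int, int, int]


def numeric_ranges(existing: Iterable[Address]) -> Dict[str, range]:
    """Return the recognizable range of each field, from 1 up to its maximum."""
    rows = list(existing)
    if not rows:
        return {}
    return {
        "chome": range(1, max(r[0] for r in rows) + 1),
        "banchi": range(1, max(r[1] for r in rows) + 1),
        "go": range(1, max(r[2] for r in rows) + 1),
    }


ranges = numeric_ranges([(1, 2, 3), (3, 1, 10)])
print({name: (r.start, r.stop - 1) for name, r in ranges.items()})
```

Because only the maxima are kept, a combination accepted this way is not guaranteed to exist in the map data, so a follow-up existence check may still be worthwhile; such a check is not part of the variant described in the text.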

As described above, the navigation device 100 of the second embodiment extracts the candidate data for speech recognition from the map data in advance and confirms the recognition result from among those candidates. Because the candidates are restricted, the recognition rate improves. Moreover, because only data that exists in the map data becomes candidate data, the address data output as the recognition result always exists in the map data. A result absent from the map data is therefore never handed to subsequent processing functions, and malfunctions from that cause are avoided.

The first and second embodiments were described using an in-vehicle navigation device as an example, but the navigation devices to which the invention applies are not limited to in-vehicle ones. It is also applicable to hand-carried navigation devices, for example. Nor is it limited to navigation devices: it applies generally to devices, such as map display devices, to which an address is input in order to specify a place.

FIG. 1 is a hardware configuration diagram of the navigation device of the first embodiment.
FIG. 2 is a functional configuration diagram of the navigation device of the first embodiment.
FIG. 3 shows an example of the data stored in the comparison candidate database of the first embodiment.
FIG. 4 is the processing flow of the navigation device of the first embodiment.
FIG. 5 is the processing flow for confirming the chome-banchi-go from input speech in the first embodiment.
FIG. 6 is a functional configuration diagram of the navigation device of the second embodiment.
FIG. 7 is the processing flow for confirming the chome-banchi-go from input speech in the second embodiment.

Explanation of symbols

100: navigation device; 101: CPU; 102: ROM; 103: RAM; 104: display device; 105: input/output interface device; 106: storage medium drive; 110: bus; 111: GPS receiver; 112: direction sensor; 113: vehicle speed sensor; 121: A/D conversion unit; 122: D/A conversion unit; 123: input device; 124: voice input unit; 125: audio output unit; 201: control unit; 211: operation/input unit; 212: display/output unit; 221: current position calculation unit; 222: route search unit; 223: recognition result determination unit; 224: speech recognition unit; 225: speech synthesis unit; 231: map database; 232: recognition dictionary; 233: comparison candidate database

Claims

A speech recognition method for recognizing address data input by voice in an apparatus comprising map data storage means that stores map data including address data for specifying a point on a map, the method comprising:
a voice input receiving step of receiving voice input from an operator;
a candidate data generation step of performing speech recognition processing on the voice received in the voice input receiving step, and storing one or more speech recognition results in a candidate database in descending order of degree of match; and
a recognition result confirmation step of extracting the candidate data stored in the candidate database in descending order of degree of match and determining whether address data matching the extracted candidate data exists in the map data; if a match exists, outputting the extracted candidate data as the final recognition result; if no match exists, repeating the extraction of the candidate data with the next-highest degree of match and the determination of a match until all candidate data stored in the candidate database have been extracted; and, if none of the candidate data matches address data in the map data, outputting data indicating no match in place of the final recognition result.

A speech recognition method, used in an apparatus comprising map data storage means that stores map data including address data consisting of prefecture, municipality, and oaza name data and chome-banchi-go data for specifying a point on a map, for recognizing the chome-banchi-go data of address data whose prefecture, municipality, and oaza name data has already been confirmed, the method comprising:
a voice input receiving step of receiving voice input from an operator;
a candidate data generation step of performing speech recognition processing on the voice received by the voice input receiving means, and storing one or more speech recognition results in a candidate database in descending order of degree of match; and
a recognition result confirmation step of extracting the candidate data stored in the candidate database in descending order of degree of match and determining whether chome-banchi-go data matching the extracted candidate data exists among the chome-banchi-go data within the address determined by the already confirmed prefecture, municipality, and oaza name data; if a match exists, outputting the extracted candidate data as the final recognition result; if no match exists, repeating the extraction of the candidate data with the next-highest degree of match and the determination of a match until all candidate data stored in the candidate database have been extracted; and, if none of the candidate data matches chome-banchi-go data within the address determined by the already confirmed prefecture, municipality, and oaza name data, outputting data indicating no match in place of the final recognition result.

A speech recognition method for recognizing address data input by voice in an apparatus comprising map data storage means that stores map data including address data for specifying a point on a map, the method comprising:
a voice input receiving step of receiving voice input from an operator;
a voice processing step of performing speech recognition processing on the voice received in the voice input receiving step; and
a recognition result confirmation step of comparing a recognition candidate database, in which address data within a specific range extracted from the map data is registered, with the speech recognition processing result obtained in the voice processing step, and outputting the entry with the highest degree of match from the candidate database as the final recognition result.

A speech recognition method, used in an apparatus comprising map data storage means that stores map data including address data consisting of prefecture, municipality, and oaza name data and chome-banchi-go data for specifying a point on a map, for recognizing the chome-banchi-go data of address data whose prefecture, municipality, and oaza name data has already been confirmed, the method comprising:
a voice input receiving step of receiving voice input from an operator;
a voice processing step of performing speech recognition processing on the voice received by the voice input receiving means; and
a recognition result confirmation step of comparing a recognition candidate database, in which the chome-banchi-go data within the address determined by the already confirmed prefecture, municipality, and oaza name data is extracted from the map data and registered, with the speech recognition processing result obtained in the voice processing step, and outputting the entry with the highest degree of match from the candidate database as the final recognition result.

A speech recognition processing device mounted in an apparatus comprising map data storage means that stores map data including address data for specifying a point on a map, the device comprising:
voice input receiving means for receiving voice input from an operator;
candidate data generating means for performing speech recognition processing on the voice received by the voice input receiving means, and storing one or more speech recognition results in a candidate database as candidate data in descending order of degree of match; and
recognition result confirmation means for extracting the candidate data stored in the candidate database in descending order of degree of match and determining whether address data matching the extracted candidate data exists in the map data; if a match exists, outputting the extracted candidate data as the final recognition result; if no match exists, repeating the extraction of the candidate data with the next-highest degree of match and the determination of a match until all candidate data stored in the candidate database have been extracted; and, if none of the candidate data matches address data in the map data, outputting data indicating no match in place of the final recognition result.

A speech recognition processing device, mounted in a navigation device comprising map data storage means that stores map data including address data consisting of prefecture, municipality, and oaza name data and chome-banchi-go data for specifying a point on a map, for recognizing the chome-banchi-go data of address data whose prefecture, municipality, and oaza name data has already been confirmed, the device comprising:
voice input receiving means for receiving voice input from an operator;
candidate data generating means for performing speech recognition processing on the voice received by the voice input receiving means, and storing one or more speech recognition results in a candidate database as candidate data in descending order of degree of match; and
recognition result confirmation means for extracting the candidate data stored in the candidate database in descending order of degree of match and determining whether chome-banchi-go data matching the extracted candidate data exists among the chome-banchi-go data within the address determined by the already confirmed prefecture, municipality, and oaza name data; if a match exists, outputting the extracted candidate data as the final recognition result; if no match exists, repeating the extraction of the candidate data with the next-highest degree of match and the determination of a match until all candidate data stored in the candidate database have been extracted; and, if none of the candidate data matches chome-banchi-go data within the address determined by the already confirmed prefecture, municipality, and oaza name data, outputting data indicating no match in place of the final recognition result.

A speech recognition processing device mounted in a navigation device comprising map data storage means that stores map data including address data for specifying a point on a map, the device comprising:
voice input receiving means for receiving voice input from an operator;
voice processing means for performing speech recognition processing on the voice received by the voice input receiving means;
candidate database generating means for extracting address data within a specific range from the map data and generating a recognition candidate database in which the extracted address data is registered; and
recognition result confirmation means for comparing the result obtained by the voice processing means with the candidate database, extracting the entry with the highest degree of match from the candidate database as the final recognition result, and outputting it.

A speech recognition processing device, mounted in a navigation device comprising map data storage means that stores map data including address data consisting of prefecture, municipality, and oaza name data and chome-banchi-go data for specifying a point on a map, for recognizing the chome-banchi-go data of address data whose prefecture, municipality, and oaza name data has already been confirmed, the device comprising:
voice input receiving means for receiving voice input from an operator;
voice processing means for performing speech recognition processing on the voice received by the voice input receiving means;
candidate database generating means for extracting from the map data the chome-banchi-go data within the address determined by the already confirmed prefecture, municipality, and oaza name data, and generating a recognition candidate database in which the extracted chome-banchi-go data is registered; and
recognition result confirmation means for comparing the result obtained by the voice processing means with the candidate database, extracting the entry with the highest degree of match from the candidate database as the final recognition result, and outputting it.