JP2001005480A

JP2001005480A - User uttering discriminating device and recording medium

Info

Publication number: JP2001005480A
Application number: JP11176813A
Authority: JP
Inventors: Shinji Aono; 真司青野
Original assignee: Denso Corp
Current assignee: Denso Corp
Priority date: 1999-06-23
Filing date: 1999-06-23
Publication date: 2001-01-12

Abstract

PROBLEM TO BE SOLVED: To inform a user that the user's own pronunciation is causing misrecognition of the user's voice, by comparing a recognized character string with a set character string and determining whether the two match. SOLUTION: When a pronunciation correction mode is selected, a control circuit 2 reads a character string of a specific name from an external memory 7, displays it on a display device 8, and displays a screen prompting the user to speak. At the same time, the control circuit 2 synthesizes a voice corresponding to the read character string and outputs it from a speaker 9, so that the user can hear the standard pronunciation. When the user utters the character string, the voice is input to the control circuit 2 through a microphone 17 and sent to a voice processing unit 10, which performs voice recognition on the utterance to recognize the name. The control circuit 2 then compares the uttered character string with the set character string. If they do not match, a message indicating that the words are different is displayed on the display device 8.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a user pronunciation determination device suitable for incorporation into various devices having a voice recognition function (for example, a car navigation system), and to a recording medium.

[0002]

[Prior art] In a car navigation system having a voice recognition function, when the user wants, for example, to enter a place name to display a map of that place or to enter an operation command, the user can make the input simply by uttering the place name or command. Because troublesome key operations become unnecessary, the system is very convenient for the user.

[0003]

[Problem to be solved by the invention] With the above configuration, words uttered by the user are sometimes misrecognized. In that case, a map of a place the user did not want may be displayed, the system may report that no corresponding map exists, or a command operation the user did not intend may be executed. One cause of such misrecognition is that the user's pronunciation is not standard (for example, the user speaks with a regional accent or pronounces words unclearly).

[0004] However, in the conventional configuration, even when a misrecognition arises from the user's pronunciation, the only result is that a map of an unwanted place name is displayed or that no corresponding map is found; the cause of the misrecognition is not indicated at all. Consequently, even when the user's pronunciation was the cause, the user often concluded that the voice recognition function of the car navigation system performed poorly.

[0005] Accordingly, an object of the present invention is to provide a user pronunciation determination device and a recording medium that can inform the user of the cause when a voice is misrecognized because of the user's pronunciation.

[0006]

[Means for solving the problem] The invention of claim 1 comprises voice recognition means that performs voice recognition processing on the voice produced when a user reads and utters a set character string and recognizes a character string corresponding to that voice, display means that displays the character string recognized by the voice recognition means, and determination means that compares the recognized character string with the set character string and determines whether they match. With this configuration, when a voice is misrecognized because of the user's pronunciation, the user can easily see that the problem lies in their own pronunciation.

[0007] According to the invention of claim 2, the determination means is configured to indicate where the recognized character string differs from the set character string, so the user can clearly see which characters were not recognized. The user can therefore correct the pronunciation of those characters.

[0008] According to the invention of claim 3, the determination means is configured to indicate the degree of coincidence between the recognized character string and the set character string, so the user can clearly know that degree of coincidence and use it to correct their pronunciation.

[0009] According to the invention of claim 4, the process in which the user utters the set character string is repeated until the recognized character string matches the set character string, so the user's pronunciation can be reliably corrected and misrecognition of the voice can be further prevented.

[0010]

[Embodiments of the invention] An embodiment in which the present invention is applied to a car navigation system is described below with reference to the drawings. FIG. 2 is a block diagram showing the overall schematic configuration of the car navigation system 1 of this embodiment. As shown in FIG. 2, the car navigation system 1 comprises a control circuit 2, a position detector 3, a map data input device 4, an operation switch group 5, a communication device 6, an external memory 7, a display device (display means) 8, a speaker 9, and a voice processing unit 10.

[0011] The control circuit 2 controls the overall operation of the car navigation system 1 and is configured as an ordinary computer (for example, a microcomputer). That is, the control circuit 2 comprises a CPU, a ROM, a RAM, an I/O, and a bus connecting them (none of which are shown). The control circuit 2 provides the functions of the character string setting means, the determination means, and the repetition means of the present invention.

[0012] The position detector 3 comprises a GPS (Global Positioning System) receiver 11, a gyroscope 12, a distance sensor 13, and a geomagnetic sensor 14. It detects the current position of the vehicle while the four sensors 11 to 14 complement one another, and thus has a highly accurate position detection function. Where such accuracy is not required, the position detector 3 may consist of any of the four sensors 11 to 14. The position detector 3 may also be configured by combining a steering rotation sensor, sensors on the individual rolling wheels, and the like.

[0013] The map data input device 4 is a reader that reads a recording medium such as a DVD-ROM, and is used to input map data, map matching data, landmark data, HTML information (Internet information), and the like. A CD-ROM, a memory card, or the like may also be used as the recording medium.

[0014] The display device 8 is, for example, a liquid crystal display capable of color display, and has a display screen 15 (see FIGS. 4 to 7) on which maps, characters, images, and the like can be displayed clearly. The display screen 15 can display, superimposed on one another, a current position mark of the vehicle, map data, and additional data such as a guidance route shown on the map.

[0015] The operation switch group 5 comprises touch switches (a touch panel) provided on the surface of the display screen 15 of the display device 8, mechanical push switches provided around the display screen 15, and the like. The communication device 6 transmits and receives, for example, infrastructure data and also receives VICS (Vehicle Information & Communication System) information. The speaker 9 outputs voice guidance for various operating procedures, route guidance, and the like, as well as the standard pronunciation of the character string that the user is to utter.

[0016] The control circuit 2 of the car navigation system 1 has a function of automatically setting the optimal route (guidance route) from the current position to a destination when the user sets that destination by operating the operation switch group 5 or the remote controller 18, a function of executing map matching processing that places the current position on the map, a function of browsing general-purpose information such as HTML information (Internet information), and so on. Dijkstra's algorithm, for example, is a known method for automatically setting the optimal route.
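
For reference, the Dijkstra method named above can be sketched as follows. This is a generic textbook illustration and not the route-setting code of the embodiment, which the text does not disclose; the function name `dijkstra`, the adjacency-list format, and the sample road graph are all assumptions made for the example.

```python
import heapq


def dijkstra(graph, start, goal):
    """Return (cost, path) of a cheapest route, or (inf, []) if unreachable.

    graph: dict mapping node -> list of (neighbor, non-negative cost).
    """
    queue = [(0, start, [start])]  # entries are (cost so far, node, path so far)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue  # a cheaper route to this node was already expanded
        visited.add(node)
        for neighbor, weight in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical road graph: intersections A, B, C with travel costs.
roads = {"A": [("B", 1), ("C", 4)], "B": [("C", 2)], "C": []}
cost, route = dijkstra(roads, "A", "C")
```

In this hypothetical graph the route through B is cheaper than the direct link from A to C, so the sketch returns the path A, B, C.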

[0017] The voice processing unit 10 comprises a voice recognition unit 16 and a microphone 17 for inputting the voice uttered by the user. As shown in FIG. 3, the voice recognition unit 16 comprises a recognition unit 18, consisting of a dictionary unit 18a and a collation unit 18b, and a storage unit 19 that stores learning results based on past collations and external conditions. The voice processing unit 10 functions as the voice recognition means of the present invention.

[0018] The external conditions stored in the storage unit 19 include information on the current location of the vehicle notified by the control circuit 2, information on the display state of the display device 8, recognition rules set by the user (for example, reserved words to be recognized as commands during voice recognition), and the like. The dictionary unit 18a stores the recognition target vocabulary, including words registered by the user, and a structure representing the relationships among the vocabulary items.

[0019] When voice is input through the microphone 17, the recognition unit 18b performs collation processing that selects from the dictionary unit 18a the character (character string) closest to the voice input, based on past learning results. This collation processing is executed using a known collation program (algorithm). The recognition unit 18b then compares the collation result with the reserved words stored in the storage unit 19; if the result matches a reserved word, it is output to the control circuit 2 as the corresponding command, and otherwise it is output as kana character data.
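
The output rule at the end of paragraph [0019] can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed program: the names `RESERVED_WORDS` and `classify_recognition`, the sample reserved words, and the command identifiers are hypothetical and do not appear in the source.

```python
# Hypothetical reserved words mapped to command identifiers (illustrative only).
RESERVED_WORDS = {
    "げんざいち": "SHOW_CURRENT_POSITION",
    "かくだい": "ZOOM_IN",
}


def classify_recognition(recognized, reserved=RESERVED_WORDS):
    """Return ("command", id) for a reserved word, else ("kana", text)."""
    if recognized in reserved:
        return ("command", reserved[recognized])
    # Not a reserved word: pass the text through as kana character data.
    return ("kana", recognized)
```

Under these assumptions, a recognized reserved word is forwarded as a command, while any other collation result, such as a place name, reaches the control circuit 2 as kana text.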

[0020] Next, the operation of the above configuration, specifically the operation of the pronunciation correction mode (pronunciation correction operation) used to correct the user's pronunciation, is described with reference also to FIG. 1 and FIGS. 4 to 7. The flowchart of FIG. 1 shows the part of the control program stored in the control circuit 2 that corresponds to the pronunciation correction mode.

[0021] First, the pronunciation correction mode is selected, for example on a menu selection screen (not shown) of the car navigation system 1, by operating a selection switch or inputting a selection command by voice. When this selection is made, the pronunciation correction mode starts, the process takes the "YES" branch at step S100, and the control circuit 2 reads a character string of a specific name from the external memory 7 (step S110). This character string is the string for pronunciation correction, that is, the preset string that the user is asked to utter; in this embodiment, "あいちけんとよたし" (Aichi-ken Toyota-shi) is used. Another character string may be used instead, the string may be selected as appropriate from a plurality of character strings, or the user may be allowed to select it.

[0022] The process then proceeds to step S120, where the control circuit 2 displays the read character string on the display device 8 in the form shown in FIG. 4, as a screen prompting the user to speak. In step S130, the control circuit 2 also synthesizes a voice corresponding to the read character string and outputs it from the speaker 9. The voice synthesis is performed, for example, by the voice processing unit 10. By listening to the voice output from the speaker 9, the user can hear the standard pronunciation of the character string of the specific name.

[0023] The system then waits for the user to read the character string of the specific name aloud (step S140). When the user does so, the voice is input to the control circuit 2 through the microphone 17 and sent to the voice processing unit 10. The process then takes the "YES" branch at step S140 and moves to step S150, where the voice processing unit 10 performs voice recognition processing on the user's voice and thereby recognizes (identifies) the corresponding character string, that is, the name. The resulting character string (the uttered character string) is sent to the control circuit 2.

[0024] The process then proceeds to step S160, where the control circuit 2 compares the uttered character string with the set character string (the character string of the specific name). If the two match, the process takes the "match" branch at step S160 and moves to step S180, where the display device 8 shows that the utterance was correct, that is, that the user's pronunciation was standard and was recognized correctly. In this case, as shown in FIG. 5, the recognized character string is displayed together with the message "あっています" ("That is correct"); this screen is the utterance match screen. The pronunciation correction mode then ends.

[0025] If, on the other hand, the uttered character string and the set character string do not match at step S160, the process takes the "mismatch" branch and moves to step S170. In step S170, as shown in FIG. 6, the display device 8 shows the mismatch result, namely a message that the words are different, together with the recognized (uttered) character string. Characters in the recognized string that differ from the set character string are, for example, shown in reverse video. The message that the words are different may also be output by voice to inform the user.

[0026] To find the differing characters, the two strings can simply be compared character by character when they have the same number of characters. When the numbers of characters differ, it is preferable to use a suitable algorithm that finds as many matching portions (substrings) of the two strings as possible and then to display (point out) the differences, such as differing or missing characters. The degree of coincidence between the recognized character string and the set character string may also be shown on the display screen (in the case of FIG. 6, for example, 8 of the 9 characters match, so the degree of coincidence is 8/9, or about 89%). Instead of being displayed, the degree of coincidence may be output by voice to inform the user.
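
The comparison described in this paragraph can be sketched as follows. The text only calls for "a suitable algorithm"; the sketch below assumes Python's standard `difflib.SequenceMatcher` as one possible choice, and the function name `compare_strings` and the sample recognized strings are hypothetical (the actual string shown in FIG. 6 is not reproduced in the text). The degree of coincidence is taken here as matched characters divided by the length of the set string, which reproduces the 8/9 example above.

```python
from difflib import SequenceMatcher


def compare_strings(recognized, target):
    """Return (differences, match_degree) for a recognized vs. set string.

    differences: list of (tag, recognized_part, target_part) for each
    non-matching region, where tag is 'replace', 'delete' or 'insert'.
    match_degree: matched characters divided by the length of the set string.
    """
    matcher = SequenceMatcher(None, recognized, target, autojunk=False)
    differences = []
    matched = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            matched += i2 - i1
        else:
            differences.append((tag, recognized[i1:i2], target[j1:j2]))
    degree = matched / len(target) if target else 1.0
    return differences, degree


# Hypothetical misrecognition: one character of the 9-character string differs.
diffs, degree = compare_strings("あいちけんとよはし", "あいちけんとよたし")
print(diffs, f"{degree:.0%}")
```

The same function also handles strings of different lengths: a dropped character shows up as an 'insert' region holding the missing character of the set string, which is the kind of difference the display could highlight.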

[0027] After the recognition result has been displayed as shown in FIG. 6 for a predetermined time (for example, several seconds to several tens of seconds), the process returns to step S120, where a message prompting the user to speak again (the utterance prompt screen) is displayed as shown in FIG. 7. The processing from step S130 onward is then repeated as described above. Step S130, in which the car navigation system 1 utters the set character string with standard pronunciation, may be omitted from the second time onward or may be repeated each time.

[0028] In this embodiment, the recognition result is displayed for a predetermined time in step S170. Instead, the recognition result may be displayed together with a query asking the user whether to continue the pronunciation correction mode. In that case, the process moves to step S120 if the user answers to continue, and the pronunciation correction mode ends if the user answers to end it.
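
The flow of steps S120 to S180 in FIG. 1 can be summarized in a short sketch. It is an illustration under assumptions, not the control program of the embodiment: `recognize`, `show`, and `speak` are hypothetical callables standing in for the voice processing unit 10, the display device 8, and the speaker 9, the message texts are placeholders, and the optional `max_attempts` limit is an addition that the source does not describe. The sketch adopts the variant in which step S130 is performed only on the first pass.

```python
def run_pronunciation_correction(target, recognize, show, speak, max_attempts=None):
    """Repeat prompt/recognize/compare until the utterance matches.

    Returns the number of attempts on a match, or None if the optional
    (hypothetical) attempt limit is reached first.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        show(f"Please say: {target}")                    # S120: prompt screen
        if attempts == 0:
            speak(target)                                # S130: first pass only
        recognized = recognize()                         # S140-S150: wait, recognize
        attempts += 1
        if recognized == target:                         # S160: compare
            show(f"{recognized}: correct")               # S180: match screen
            return attempts
        show(f"{recognized}: the words are different")   # S170: mismatch screen
    return None
```

A caller would pass the set string read in step S110 together with device-specific callables; replacing the fixed loop condition with a user query would give the variant of paragraph [0028].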

[0029] In this embodiment configured as described above, the voice produced when the user reads and utters the set character string undergoes voice recognition processing, the character string corresponding to that voice is recognized and displayed, and the recognized character string is compared with the set character string to determine whether they match. With this configuration, when a voice is misrecognized because of the user's pronunciation, the user can easily recognize that the problem lies in their own pronunciation.

[0030] In the above embodiment, as shown in FIG. 6, the points where the recognized character string differs from the set character string are displayed (indicated), so the user can clearly see which characters were not recognized and can correct the pronunciation of those characters toward the standard pronunciation. Furthermore, the degree of coincidence between the recognized character string and the set character string is displayed (indicated), so the user can clearly know that degree of coincidence.

[0031] Moreover, in the above embodiment, the process in which the user utters the set character string is repeated until the recognized character string matches the set character string, so the user's pronunciation can be reliably corrected and misrecognition of the voice can be further prevented.

[0032] In the above embodiment, the pronunciation correction mode is selected and executed by operating a selection switch or the like on, for example, the menu selection screen of the car navigation system 1, but the invention is not limited to this. For example, the mode may be made selectable and executable by operating a specific operation switch, or a specific plurality of operation switches, on the map display screen or another navigation screen.

[0033] In the above embodiment, the program for operating the car navigation system 1 (that is, the program implementing the functions of the character string setting means, the determination means, the repetition means, and the voice recognition means) is stored in the ROM of the control circuit 2. In this configuration, the ROM storing the program may be made replaceable on the printed wiring board constituting the control circuit 2. Alternatively, the program may be stored on a recording medium such as a CD-ROM or DVD-ROM, a rewritable nonvolatile memory such as a flash memory may be provided in the control circuit 2, and the program may be transferred from the recording medium to the nonvolatile memory.

[0034] In the above embodiment, the user pronunciation determination device of the present invention is applied to the car navigation system 1, but the invention is not limited to this; it may be applied to a portable navigation system or to various electrical devices having a voice recognition function.

[Brief description of the drawings]

FIG. 1 is a flowchart showing an embodiment of the present invention.

FIG. 2 is a block diagram of the car navigation system.

FIG. 3 is a block diagram of the voice recognition unit.

FIG. 4 is a diagram showing the utterance prompt screen.

FIG. 5 is a diagram showing the utterance match screen.

FIG. 6 is a diagram showing the utterance mismatch screen.

FIG. 7 is a diagram showing the utterance prompt screen for the second and subsequent attempts.

[Explanation of symbols]

1: car navigation system; 2: control circuit (character string setting means, determination means, and repetition means); 4: map data input device; 5: operation switch group; 8: display device; 9: speaker; 10: voice processing unit (voice recognition means); 15: display screen; 16: voice recognition unit; 17: microphone; 18: recognition unit; 18a: dictionary unit; 18b: collation unit; 19: storage unit.

Claims

1. A user pronunciation determination device comprising: character string setting means for setting a character string to be uttered by a user; voice recognition means for performing voice recognition processing on a voice produced when the user reads and utters the set character string, and recognizing a character string corresponding to the voice; display means for displaying the character string recognized by the voice recognition means; and determination means for comparing the recognized character string with the set character string and determining whether they match.

2. The user pronunciation determination device according to claim 1, wherein the determination means is configured to indicate where the recognized character string differs from the set character string.

3. The user pronunciation determination device according to claim 1, wherein the determination means is configured to indicate a degree of coincidence between the recognized character string and the set character string.

4. The user pronunciation determination device according to any one of claims 1 to 3, further comprising repetition means for repeatedly executing the process in which the user utters the set character string until the recognized character string matches the set character string.

5. A recording medium on which is recorded a program for operating a user pronunciation determination device, the program implementing: a function as character string setting means for setting a character string to be uttered by a user; a function as voice recognition means for performing voice recognition processing on a voice produced when the user reads and utters the set character string, and recognizing a character string corresponding to the voice; a function as display means for displaying the character string recognized by the voice recognition means; and a function as determination means for comparing the recognized character string with the set character string and determining whether they match.