JP2002140094A

JP2002140094A - Device and method for voice recognition, and computer-readable recording medium with voice recognizing program recorded thereon

Info

Publication number: JP2002140094A
Application number: JP2000334857A
Authority: JP
Inventors: Hirotaka Goi; 啓恭伍井; Yoshiharu Abe; 芳春阿部; Yuzo Maruta; 裕三丸田; Shinobu Arai; 忍新井
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2000-11-01
Filing date: 2000-11-01
Publication date: 2002-05-17

Abstract

PROBLEM TO BE SOLVED: Correction operations must be instructed word by word, so correction becomes cumbersome when word units are segmented incorrectly. SOLUTION: The device comprises: a voice recognition part 2 that recognizes voice from a microphone 1 to generate a plurality of syllable strings, generates a plurality of word strings from the syllable strings using an n-gram dictionary in an n-gram dictionary storage part 4, sorts them into an initial candidate and a plurality of next candidates, and stores the initial candidate in a display area of a RAM 3; a display part 5 that displays the word string stored in the display area of the RAM 3 to a user; and a candidate retrieval part 10 that, when misrecognized character strings are specified through a keyboard 6 or with a mouse 7, retrieves from the next candidates a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and stores it in the display area of the RAM 3.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech recognition apparatus that is applied to a speech word processor or the like and has a function for correcting continuous speech recognition results, to a speech recognition method, and to a computer-readable recording medium on which a speech recognition program is recorded.

[0002]

2. Description of the Related Art

Text input by a speech recognition apparatus is useful as a means of creating documents. Because text input by continuous speech recognition does not always achieve 100% accuracy, a correction function is generally provided.

[0003] For example, in Japanese continuous speech recognition, the word string of the recognition result is divided into units such as words, and for each unit, next candidates that differ from the initial candidate are presented to the user so that the user can select the correct word string.

[0004] In this case, the segmentation is not necessarily recognized correctly in the first solution. A speech recognition apparatus that changes the segmentation has therefore been proposed, as disclosed in Japanese Unexamined Patent Publication No. Sho 62-180462.

[0005] FIG. 11 is a diagram showing the configuration of a conventional speech recognition apparatus. In FIG. 11, reference numeral 101 denotes a microphone that converts input speech into an electric signal; 102, a speech recognition unit that performs speech recognition processing on the electric signal from the microphone 101; 103, a readable and writable random access memory (hereinafter, RAM); 104, a word dictionary storage unit that stores a word dictionary used when generating word strings from syllable strings; 105, a display unit that displays word strings and syllable strings; 106, a keyboard for performing correction operations; and 108, a word correction unit that corrects the initial candidate according to the correction operation.

[0006] Next, the operation will be described. First, the case in which no segmentation change is needed is described. When the user speaks into the microphone 101, speech recognition processing starts. Consider the user's utterance example A, 「おんせいにんしきそうち（音声認識装置）」 ("onsei ninshiki souchi", speech recognition apparatus).

[0007] When the microphone 101 converts the input speech into an electric signal, the speech recognition unit 102 A/D-converts and quantizes the electric signal from the microphone 101, performs spectrum analysis, and stores in the RAM 103 plausible syllable strings separated into syllable units. To explain the correction operation of the conventional speech recognition apparatus, assume that the syllable string 「おんせい＿にんち＿そうち」 ("onsei_ninchi_souchi") has been stored for utterance example A. The symbol 「＿」 denotes a word boundary.

[0008] Next, the speech recognition unit 102 refers to the word dictionary in the word dictionary storage unit 104, which stores the words corresponding to syllable strings, and generates a word string for each syllable string stored in the RAM 103. From among these, it outputs the maximum-likelihood word string 「音声＿認知＿装置」 (speech_cognition_apparatus) as the initial candidate.

[0009] The initial-candidate syllable string 「おんせい＿にんち＿そうち」 and word string 「音声＿認知＿装置」 output by the speech recognition unit 102 are transferred to the word correction unit 108. The display unit 105 displays the word string 「音声＿認知＿装置」 held by the word correction unit 108.

[0010] The user looks at the initial candidate 「音声＿認知＿装置」 on the display unit 105, specifies the misrecognized word with the keyboard 106, and performs a correction operation by candidate selection or by segmentation change. In this example, the word boundaries have been recognized correctly, and the second word 「認知」 (cognition) is the misrecognized word.

[0011] Accordingly, when the user specifies the misrecognized word 「認知」 as shown on display unit 105A in FIG. 12, the word correction unit 108 displays the next-candidate words for 「認知」. Since 「認識」 (recognition), 「認証」 (authentication), 「認可」 (authorization), and so on are displayed as next candidates, the user selects 「認識」 from among them with the keyboard 106. The word correction unit 108 displays on the display unit 105 the word string 「音声＿認識＿装置」 (speech_recognition_apparatus), corrected according to the user's candidate selection (display unit 105B in FIG. 12).

[0012] Thus, when no segmentation change is needed, performing candidate selection on the misrecognized word 「認知」 yields the next-candidate word 「認識」, and the misrecognized word string 「音声＿認知＿装置」 can easily be corrected to the correctly recognized word string 「音声＿認識＿装置」.

[0013] Next, the case in which a segmentation change is needed is described. Consider utterance example B, 「くるまでによういする（来るまでに用意する）」 ("kuru made ni youi suru", to prepare before [someone] comes), and correct the misrecognized word string shown below into the correctly recognized word string. In this misrecognized word string, the word boundaries have been recognized incorrectly.

[0014] Misrecognized word string: 「車＿で＿尿意＿する」 (car_by_urge-to-urinate_do)

[0015] The correction is performed using the syllable string 「くるま＿で＿にょうい＿する」 that corresponds to the misrecognized word string. First, the character 「ょ」 (small yo) in the syllable string 「くるま＿で＿にょうい＿する」 is specified and corrected to the character 「よ」 (full-size yo), giving corrected syllable string 1.

[0016] Corrected syllable string 1: 「くるま＿で＿にようい＿する」

[0017] Next, because the boundary between the first word 「くるま」 and the second word 「で」 of corrected syllable string 1 is wrong, these are re-segmented into 「くる」 and 「まで」, giving corrected syllable string 2.

[0018] Corrected syllable string 2: 「くる＿まで＿にようい＿する」

[0019] Furthermore, because the boundary between the second word 「まで」 and the third word 「にようい」 of corrected syllable string 2 is wrong, these are re-segmented into 「までに」 and 「ようい」, giving corrected syllable string 3.

[0020] Corrected syllable string 3: 「くる＿までに＿ようい＿する」

[0021] After confirming that all word boundaries have been changed correctly, the user finally converts corrected syllable string 3 into kanji and obtains the correctly recognized word string.

[0022] Correctly recognized word string: 「来る＿までに＿用意＿する」 (come_by-the-time_preparation_do)

[0023]

PROBLEMS TO BE SOLVED BY THE INVENTION

Because the conventional speech recognition apparatus is configured as described above, correction operations must be instructed word by word. As a result, the correction operation tends to become cumbersome when the word boundaries are wrong.

[0024] That is, as in utterance example B, there are cases in which the correction must be made by changing the segmentation word by word. The user has to keep the word boundaries of the input speech in mind and change the segmentation one place at a time, so the correction operation easily becomes cumbersome.

[0025] In correction by segmentation change, it is hard to tell which boundary should be changed. In particular, when 「来るまで」 is registered in the word dictionary only as a single word, the correctly recognized word string cannot be obtained no matter how the segmentation is changed.

[0026] In addition, when there are multiple misrecognized words, the conventional speech recognition apparatus requires multiple correction operations, which places a heavy burden on the user.

[0027] The present invention has been made to solve the problems described above. Its object is to provide a speech recognition apparatus, a speech recognition method, and a computer-readable recording medium recording a speech recognition program with which a misrecognized word string can be corrected easily in a single correction operation, without the user having to be aware of word boundaries, even when there are multiple misrecognized portions.

[0028]

SUMMARY OF THE INVENTION

A speech recognition apparatus according to the present invention comprises: speech recognition means that recognizes input speech to generate syllable strings, generates a plurality of word strings corresponding to the syllable strings using an n-gram dictionary, sorts them into an initial candidate and a plurality of next candidates, and stores the initial candidate in a display area; display means that displays the word string stored in the display area to the user; and candidate search means that, when a misrecognized character string in the initial candidate displayed on the display means is specified, searches the next candidates for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and stores it in the display area.

[0029] In the speech recognition apparatus according to the present invention, the candidate search means extends the definition of a character change so that it applies whenever the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string.

[0030] The speech recognition apparatus according to the present invention comprises display change means that displays the misrecognized character string in one of the following display forms: shaded display, inverted display, luminance-change display, color-change display, or animation display.

[0031] The speech recognition apparatus according to the present invention comprises: next-candidate character string display means that extracts, from the next candidate found by the candidate search means, the next-candidate character string at the character position of the misrecognized character string, and displays it on the display unit together with the initial candidate, aligned with the misrecognized character string; and next-candidate character string selection means that corrects the misrecognized character string of the initial candidate to the next-candidate character string according to the user's candidate selection.

[0032] A speech recognition method according to the present invention comprises: a speech recognition step of recognizing input speech to generate syllable strings, generating a plurality of word strings corresponding to the syllable strings using an n-gram dictionary, sorting them into an initial candidate and a plurality of next candidates, and storing the initial candidate in a display area; a display step of displaying the word string stored in the display area to the user; a misrecognized character string designation step in which a misrecognized character string in the initial candidate displayed in the display step is specified; and a candidate search step of searching the next candidates for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and storing it in the display area.

[0033] In the speech recognition method according to the present invention, the candidate search step extends the definition of a character change so that it applies whenever the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string.

[0034] The speech recognition method according to the present invention comprises a display change step of displaying the misrecognized character string in one of the following display forms: shaded display, inverted display, luminance-change display, color-change display, or animation display.

[0035] The speech recognition method according to the present invention comprises: a next-candidate character string display step of extracting, from the next candidate found in the candidate search step, the next-candidate character string at the character position of the misrecognized character string, and displaying it on the display unit together with the initial candidate, aligned with the misrecognized character string; and a next-candidate character string selection step of correcting the misrecognized character string of the initial candidate to the next-candidate character string according to the user's candidate selection.

[0036] A computer-readable recording medium recording a speech recognition program according to the present invention comprises: a speech recognition procedure of recognizing input speech to generate syllable strings, generating a plurality of word strings corresponding to the syllable strings using an n-gram dictionary, sorting them into an initial candidate and a plurality of next candidates, and storing the initial candidate in a display area; a display procedure of displaying the word string stored in the display area to the user; a misrecognized character string designation procedure in which a misrecognized character string in the initial candidate displayed in the display procedure is specified; and a candidate search procedure of searching the next candidates for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and storing it in the display area.

[0037] In the computer-readable recording medium recording a speech recognition program according to the present invention, the candidate search procedure extends the definition of a character change so that it applies whenever the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string.

[0038] The computer-readable recording medium recording a speech recognition program according to the present invention comprises a display change procedure of displaying the misrecognized character string in one of the following display forms: shaded display, inverted display, luminance-change display, color-change display, or animation display.

[0039] The computer-readable recording medium recording a speech recognition program according to the present invention comprises: a next-candidate character string display procedure of extracting, from the next candidate found in the candidate search procedure, the next-candidate character string at the character position of the misrecognized character string, and displaying it on the display unit together with the initial candidate, aligned with the misrecognized character string; and a next-candidate character string selection procedure of correcting the misrecognized character string of the initial candidate to the next-candidate character string according to the user's candidate selection.

[0040]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the present invention is described below.

Embodiment 1.

FIG. 1 is a diagram showing the configuration of a speech recognition apparatus according to Embodiment 1 of the present invention. In FIG. 1, reference numeral 1 denotes a microphone (speech recognition means) that converts input speech into an electric signal; 2, a speech recognition unit (speech recognition means) that performs speech recognition processing on the electric signal from the microphone 1; 3, a readable and writable RAM (speech recognition means); 4, an n-gram dictionary storage unit (speech recognition means) that stores an n-gram dictionary used when generating word strings from syllable strings; and 5, a display unit (display means) that displays the word string in the display area of the RAM 3.

[0041] Reference numerals 6 and 7 denote a keyboard and a mouse, respectively, for performing correction operations; 8, a range designation unit that stores in the RAM 3 the character positions of the misrecognized character strings specified by the correction operation; 9, a display change unit (display change means) that changes the display form of the misrecognized character strings; and 10, a candidate search unit (candidate search means) that refers to the initial candidate in which misrecognized character strings have been specified and searches the next candidates for a word string satisfying search condition 1, described later.

[0042] Next, the operation will be described. FIG. 2 is a flowchart showing a speech recognition method according to Embodiment 1 of the present invention. When the user speaks into the microphone 1, speech recognition processing starts (step ST1). Here, consider the utterance example B described earlier, 「くるまでによういする（来るまでに用意する）」.

[0043] A word string is recognized by finding the W that maximizes the word string probability P(W|Y). The word string probability P(W|Y) is given by expression (1).

[0044] P(W|Y) = P(Y|W) · P(W) / P(Y)   (1)

[0045] Here, W is the uttered word string, Y is the syllable string, P(Y|W) is the probability of syllable string Y appearing given word string W, and P(W) is the appearance probability of word string W. Since it suffices to find the word string W that maximizes P(W|Y), the term P(Y) on the right-hand side of expression (1), which is common to all word strings W, can be omitted, and it suffices to find the word string W that maximizes P(Y|W) · P(W).
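The simplification in paragraph [0045] can be written out as a short derivation. The symbol \hat{W} for the selected word string is notation introduced here for illustration and does not appear in the original text.

```latex
\hat{W} = \arg\max_{W} P(W \mid Y)
        = \arg\max_{W} \frac{P(Y \mid W)\,P(W)}{P(Y)}
        = \arg\max_{W} P(Y \mid W)\,P(W)
```

The last step holds because P(Y) does not depend on W, so dividing by it does not change which W attains the maximum.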

[0046] At times t = 1, 2, …, L, when the syllable string Y corresponding to word string W is given by expression (2), P(Y|W) can be calculated from the individual syllable probabilities P(Y_1), P(Y_2), …, P(Y_L) by expression (3).

[0047] Y = Y_1, Y_2, …, Y_L   (2)

(Equation 1) [expression (3); the equation image is not reproduced in this text]
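Because the image for expression (3) is not available here, the following is only one form that is consistent with the surrounding description, assuming the per-syllable probabilities are combined as a simple product. It should not be read as the patent's exact expression.

```latex
P(Y \mid W) = \prod_{t=1}^{L} P(Y_t) \tag{3}
```

Under this assumption, a syllable string is as likely as the product of its individual syllable scores, which matches the statement that P(Y|W) is calculated from P(Y_1), P(Y_2), …, P(Y_L).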

[0048] When the m-word string W is given by expression (4), the appearance probability P(W) of word string W is obtained from expression (5) (word n-gram information), independently of the syllable probabilities.

[0049] W = w_1, w_2, …, w_m   (4)

(Equation 2) [expression (5); the equation image is not reproduced in this text]
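The image for expression (5) is likewise unavailable. The text describes it as word n-gram information, so the standard n-gram factorization is shown below as an assumption about its form, not as a transcription of the original.

```latex
P(W) = \prod_{i=1}^{m} P\left(w_i \mid w_{i-n+1}, \ldots, w_{i-1}\right) \tag{5}
```

For n = 2, the case used in Embodiment 1, each factor reduces to P(w_i | w_{i-1}); for n = 1 it reduces to P(w_i).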

[0050] By the calculation described above, for those syllable strings whose corresponding word sequence exists in the dictionary, the word string W that maximizes the word string probability P(W|Y) is calculated. The combinatorial calculation may be performed at high speed using, for example, the Viterbi method or stack decoding, and the probabilities may be computed as sums by using logarithmic probabilities. The appearance probability P(W) of each word is stored in advance in the n-gram dictionary storage unit 4.
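As an illustration of the kind of search paragraph [0050] mentions, the following is a minimal Viterbi-style sketch that segments a syllable string into dictionary words and scores each segmentation with summed log probabilities. It uses a unigram (n = 1) dictionary for brevity; the dictionary entries, the probabilities, and the names `UNIGRAM` and `best_segmentation` are hypothetical and are not taken from FIG. 3.

```python
import math

# Hypothetical unigram dictionary: syllable string -> (word, probability).
# Entries and probabilities are illustrative assumptions only.
UNIGRAM = {
    "くるま": ("車", 0.02),
    "くる": ("来る", 0.05),
    "で": ("で", 0.10),
    "までに": ("までに", 0.04),
    "まで": ("まで", 0.06),
    "に": ("に", 0.10),
    "ようい": ("用意", 0.03),
    "する": ("する", 0.08),
}


def best_segmentation(syllables):
    """Return (log probability, words) of the best dictionary segmentation.

    best[i] holds the best-scoring result for the prefix syllables[:i].
    """
    best = [(-math.inf, [])] * (len(syllables) + 1)
    best[0] = (0.0, [])
    for end in range(1, len(syllables) + 1):
        for start in range(end):
            entry = UNIGRAM.get(syllables[start:end])
            if entry is None or best[start][0] == -math.inf:
                continue  # no such word, or the prefix cannot be segmented
            word, prob = entry
            # Adding logs is equivalent to multiplying probabilities.
            score = best[start][0] + math.log(prob)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[-1]


log_prob, words = best_segmentation("くるまでによういする")
```

Working in log space avoids underflow when many small probabilities are multiplied, which is presumably why the text mentions computing the probabilities as sums. Extending the sketch to n = 2 would require keeping the previous word as part of the search state.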

[0051] FIG. 3 shows an example of the n-gram dictionary storage unit 4 for n = 1 (reference numeral 4A) and n = 2 (reference numeral 4B). As shown in FIG. 3, the n-gram dictionary storage unit 4 stores, as one record, a syllable string, the word string corresponding to that syllable string, and the appearance probability P(W) of the final word. Embodiment 1 handles word chains of up to n = 2, but n may of course be 3 or more.

[0052] When the microphone 1 converts the input speech into an electric signal (step ST2), the speech recognition unit 2 A/D-converts and quantizes the electric signal, performs spectrum analysis, and stores in the RAM 3, in order of likelihood, plausible syllable strings separated into syllable units (step ST3).

[0053] Here, assume that the following two syllable strings are stored in the RAM 3 together with their likelihoods P(Y|W).

[0054]
Syllable string 1: 「くるまでにょういする」  P(Y|W) = 0.6
Syllable string 2: 「くるまでによういする」  P(Y|W) = 0.4

[0055] Next, the speech recognition unit 2 generates the word strings corresponding to syllable strings 1 and 2 and stores them in the RAM 3 together with their likelihoods (step ST4). Here, the following word strings are stored together with the corresponding likelihoods P(Y|W) · P(W).

[0056]
Word string 1: 「車で尿意する」  P(Y|W) · P(W) = 0.3
Word string 2: 「来るまでに用意する」  P(Y|W) · P(W) = 0.16
Word string 3: 「来るまで尿意する」  P(Y|W) · P(W) = 0.06

[0057] The candidate search unit 10 stores the maximum-likelihood word string among word strings 1, 2, and 3 held in the RAM 3 in the display area of the RAM 3 as the initial candidate (step ST5). In the case above, word string 1, 「車で尿意する」, is the maximum-likelihood word string and is stored in the display area of the RAM 3 as the initial candidate. Steps ST1 to ST5 above constitute the speech recognition step.
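The selection of the initial candidate and the ordering of the next candidates can be sketched as a sort on P(Y|W) · P(W). The P(W) values below are not stated in the text; they are back-computed from the listed products under the assumption that word strings 1 and 3 derive from syllable string 1 (0.6) and word string 2 from syllable string 2 (0.4). The names `Hypothesis` and `rank` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    words: str          # word string W
    p_y_given_w: float  # syllable string likelihood P(Y|W)
    p_w: float          # word string probability P(W), back-computed (assumption)

    @property
    def score(self) -> float:
        # P(Y) is common to all hypotheses, so it is left out.
        return self.p_y_given_w * self.p_w


def rank(hypotheses):
    """Return (initial candidate, next candidates in descending score order)."""
    ordered = sorted(hypotheses, key=lambda h: h.score, reverse=True)
    return ordered[0], ordered[1:]


HYPOTHESES = [
    Hypothesis("来るまで尿意する", 0.6, 0.1),
    Hypothesis("車で尿意する", 0.6, 0.5),
    Hypothesis("来るまでに用意する", 0.4, 0.4),
]

initial, next_candidates = rank(HYPOTHESES)
```

With these values the initial candidate is 「車で尿意する」, and the next candidates follow in the order 「来るまでに用意する」, 「来るまで尿意する」, which agrees with the ordering given in the text.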

[0058] Word string 2, 「来るまでに用意する」, and word string 3, 「来るまで尿意する」, are kept in the RAM 3 as the first next candidate and the second next candidate, respectively. The display unit 5 displays the initial-candidate word string 「車で尿意する」 stored in the display area of the RAM 3 (step ST6, display step).

[0059] Initial candidate: 「車で尿意する」

[0060] When the user looks at the output of the display unit 5 and sees a result that differs from what was intended, the user specifies one or more misrecognized character strings in the initial candidate using the keyboard 6 or the mouse 7 (step ST7, misrecognized character string designation step). In Embodiment 1, the misrecognized portions are designated in units of character strings, using the initial-candidate word string 「車で尿意する」 as it is, with no need to change the segmentation.

[0061] Because all misrecognized character strings are designated at this point, the user can perform the correction operation easily. In the example above, 「車」 and 「尿」 are wrong, so the user designates 「車」 and 「尿」 with the keyboard 6 or the mouse 7 by dragging toward the end of the sentence, as indicated by the arrows on display unit (display means) 5A in FIG. 4 (in the text, ＜＞ marks the designated range of a misrecognized character string).

[0062] Initial candidate: 「＜車＞で＜尿＞意する」

[0063] The range designation unit 8 stores the start position and end position of each designated misrecognized character string in the RAM 3 (step ST8). That is, the character positions of ＜車＞ and ＜尿＞ in 「＜車＞で＜尿＞意する」 are stored in the RAM 3 in the form {start position, end position}, as {1, 1} and {3, 3}, respectively. (An end position smaller than the start position indicates that characters are missing before the start character; this is designated by dragging toward the beginning of the sentence with the keyboard 6 or the mouse 7.)
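The {start position, end position} records can be illustrated with a small sketch that stores 1-based inclusive ranges and renders the ＜＞ notation used in the text. The function name `mark_ranges` is hypothetical, and the sketch handles only forward ranges (start ≤ end); the reversed ranges that signal missing characters are not modeled.

```python
def mark_ranges(text, ranges):
    """Wrap each 1-based inclusive (start, end) range of `text` in ＜＞ markers."""
    out = []
    for pos, ch in enumerate(text, start=1):
        if any(start == pos for start, _ in ranges):
            out.append("＜")
        out.append(ch)
        if any(end == pos for _, end in ranges):
            out.append("＞")
    return "".join(out)


INITIAL = "車で尿意する"
RANGES = [(1, 1), (3, 3)]  # {1, 1} and {3, 3} from the text

marked = mark_ranges(INITIAL, RANGES)
# Recover the designated strings from the stored positions.
designated = [INITIAL[start - 1:end] for start, end in RANGES]
```

Storing positions rather than the strings themselves keeps the designation independent of word boundaries, which is the point of designating by character string instead of by word.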

[0064] The display change unit 9 reads the designated ranges of the misrecognized character strings from the RAM 3 and updates the display area of the RAM 3 so that they are displayed in a form different from the rest of the text. Following the update of the display area of the RAM 3, the display unit 5 displays the misrecognized character strings in the changed display form (step ST9, display change step).

【００６５】On the display unit (display means) 5B in FIG. 4, the character positions {1, 1} and {3, 3} of ＜車＞ and ＜尿＞ are read from the RAM 3 and the misrecognized character strings are shown shaded. The display form is of course not limited to shading; reverse video, a change in brightness, a change in color, animation, or the like may also be used. In this way, misrecognized character strings can be displayed according to the user's preference and purpose.

【００６６】The candidate search unit 10 reads the next candidates from the RAM 3, searches them for a word string in which only the character strings at the character positions corresponding to the specified misrecognized character strings differ (have been changed) while the character strings at all other character positions are identical (have not been changed), and stores that word string in the display area of the RAM 3 (step ST10, candidate search step). In Embodiment 1, the candidate search unit 10 reads the following next candidates from the RAM 3.

【００６７】First next candidate: 「来るまでに用意する」
Second next candidate: 「来るまで尿意する」

【００６８】Search condition 1, which the candidate search unit 10 uses to search the next candidates, is shown below (here + is the operator for character string concatenation).

【００６９】Search condition 1: (a character string in which ＜車＞ is not at the same character position) + 「で」 + (a character string in which ＜尿＞ is not at the same character position) + 「意する」

【００７０】For this search, in the case of Japanese speech recognition, character positions are determined by referring to the number of characters in the syllable string. The candidate search unit 10 selects from the RAM 3, as the next candidate, a word string in which all of the characters at the character positions corresponding to the misrecognized character strings ＜車＞ and ＜尿＞ of the initial candidate have been changed and none of the characters at the positions corresponding to 「で」 and 「意する」 have been changed.

【００７１】First, it is determined whether the first next candidate satisfies search condition 1. If there is more than one way to match the pattern, every matching pattern is checked against search condition 1. Comparing the initial candidate, with its misrecognized character strings specified, against the first next candidate gives the following (spaces are inserted for comparison).

【００７２】Initial candidate: 「＜車＞で＜尿＞意する」
First next candidate: 「来るまでに用意する」

【００７３】The comparison above shows that, in the first next candidate, the specified misrecognized character strings ＜車＞ and ＜尿＞ have been changed to 「来るま」 and 「に用意」, respectively, and that the character strings other than the misrecognized ones, 「で」 and 「意する」, have not been changed. The first next candidate therefore satisfies search condition 1.

【００７４】Next, it is determined whether the second next candidate satisfies search condition 1. Comparing the initial candidate, with its misrecognized character strings specified, against the second next candidate gives the following (spaces are inserted for comparison).

【００７５】Initial candidate: 「＜車＞で＜尿＞意する」
Second next candidate: 「来るまで尿意する」

【００７６】The comparison above shows that, in the second next candidate, one misrecognized character string, ＜車＞, has been changed to 「来るま」 and the character strings other than the misrecognized ones, 「で」 and 「意する」, have not been changed, but the other misrecognized character string, ＜尿＞, has not been changed. The second next candidate therefore does not satisfy search condition 1.
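The two checks above can be reproduced with a minimal sketch. It rests on several assumptions that are not in the specification: each displayed character is paired with a kana reading supplied by hand, positions are counted in kana, and only candidates whose readings have the same total length are aligned. The names `cells`, `range_cells` and `satisfies_condition_1` are hypothetical.

```python
def cells(tokens):
    """Repeat each displayed character once per kana of its reading."""
    return [ch for ch, reading in tokens for _ in reading]


def range_cells(tokens, ranges):
    """Map each 1-based, inclusive character range to its kana positions."""
    starts, pos = [], 0
    for _, reading in tokens:
        starts.append(pos)
        pos += len(reading)
    starts.append(pos)
    return [range(starts[s - 1], starts[e]) for s, e in ranges]


def satisfies_condition_1(initial, ranges, candidate):
    a, b = cells(initial), cells(candidate)
    if len(a) != len(b):
        return False  # this sketch only aligns readings of equal length
    marked = range_cells(initial, ranges)
    marked_positions = {i for r in marked for i in r}
    # Every position outside the marked ranges must be unchanged.
    if any(a[i] != b[i] for i in range(len(a)) if i not in marked_positions):
        return False
    # Every position inside a marked range must be changed.
    return all(a[i] != b[i] for r in marked for i in r)


# Hand-written (character, reading) pairs; the readings are assumptions.
tail = [("意", "い"), ("す", "す"), ("る", "る")]
head = [("来", "く"), ("る", "る"), ("ま", "ま"), ("で", "で")]
initial = [("車", "くるま"), ("で", "で"), ("尿", "にょう")] + tail
first = head + [("に", "に"), ("用", "よう")] + tail
second = head + [("尿", "にょう")] + tail

ranges = [(1, 1), (3, 3)]  # ＜車＞ and ＜尿＞
print(satisfies_condition_1(initial, ranges, first))
print(satisfies_condition_1(initial, ranges, second))
```

Under these assumptions the first next candidate is accepted and the second is rejected, matching the comparisons in the text. Handling several possible alignments, as the text requires when more than one matching pattern exists, is left out of the sketch.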

【００７７】The word string that matches search condition 1 is thus determined to be the first next candidate 「来るまでに用意する」. The candidate search unit 10 stores the first next candidate in the display area of the RAM 3 and also stores in the RAM 3 the character ranges {1, 3} and {5, 6}, which correspond to the specified ranges of the misrecognized character strings. The display unit 5 then displays the first next candidate 「来るまでに用意する」 from the display area, and the display changing unit 9 changes the display form of the portions 「来るま」 and 「に用意」 that correspond to {1, 3} and {5, 6} stored in the RAM 3.

【００７８】Correction result: 「＜来るま＞で＜に用＞意する」

【００７９】When the user looks at the correction result on the display unit 5 and confirms that the correctly recognized word string has been obtained, the user proceeds to the next speech input.

【００８０】As described above, in Embodiment 1 the misrecognized portions are specified all at once, at several places and in units of character strings rather than words, and the next candidates are searched under search condition 1 for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed. A misrecognized word string can therefore be corrected to the correctly recognized word string easily, without tedious changes to word boundaries or repeated correction operations.

【００８１】To make the operation of the candidate search unit 10 under search condition 1 clearer, another utterance example is described below.

【００８２】Consider correcting the initial-candidate word string 「妻できた金をくれ」, which is a misrecognition of utterance example C 「つまできたかねおくれ（津まで来た金送れ）」. Here the misrecognized character strings are specified in units of character strings as 「＜妻＞で＜き＞た金＜をく＞れ」. The next candidates stored in the RAM 3 at this time are the following four word strings.

【００８３】First next candidate: 「妻出来た金をくれ」
Second next candidate: 「津まで来た金をくれ」
Third next candidate: 「津まで北風送れ」
Fourth next candidate: 「津まで来た金送れ」

【００８４】Initial candidate: 「＜妻＞で＜き＞た金＜をく＞れ」
First next candidate: 「妻出来た金をくれ」
In the first next candidate, ＜き＞ has been changed to 「来」. However, neither ＜妻＞ nor ＜をく＞ has been changed, and 「で」, which lies outside the misrecognized character strings, has been changed to 「出」. Search condition 1 is therefore not satisfied, and the candidate search unit 10 rejects the first next candidate.

【００８５】Initial candidate: 「＜妻＞で＜き＞た金＜をく＞れ」
Second next candidate: 「津まで来た金をくれ」
In the second next candidate, ＜妻＞ and ＜き＞ have been changed to 「津ま」 and 「来」, respectively. However, the remaining ＜をく＞ has not been changed. Search condition 1 is therefore not satisfied, and the candidate search unit 10 rejects the second next candidate.

【００８６】Initial candidate: 「＜妻＞で＜き＞た金＜をく＞れ」
Third next candidate: 「津まで北風送れ」
In the third next candidate, ＜妻＞ and ＜をく＞ have been changed to 「津ま」 and 「送」, respectively. However, the remaining misrecognized character string ＜き＞ has been changed to 「北」 in a way that also takes in the following 「た」, and 「金」, which lies outside the misrecognized character strings, has been changed to 「風」. Search condition 1 is therefore not satisfied, and the candidate search unit 10 rejects the third next candidate.

【００８７】Initial candidate: 「＜妻＞で＜き＞た金＜をく＞れ」
Fourth next candidate: 「津まで来た金送れ」
In the fourth next candidate, ＜妻＞, ＜き＞ and ＜をく＞ have been changed to 「津ま」, 「来」 and 「送」, respectively, and all of the character strings other than the misrecognized ones, 「で」, 「た金」 and 「れ」, are kept unchanged. Search condition 1 is therefore satisfied, and the candidate search unit 10 stores the fourth next candidate in the display area of the RAM 3.

【００８８】If more than one next candidate satisfies search condition 1, those candidates may be displayed in descending order of likelihood.
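Ordering by likelihood could look like the following sketch. The candidate labels and scores are placeholders invented for illustration, and the specification does not say how likelihood is represented; a larger number is assumed to mean a more likely candidate.

```python
def order_by_likelihood(candidates):
    """Return the word strings sorted from highest to lowest likelihood."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked]


# (word string, likelihood) pairs; both values are hypothetical.
matches = [("candidate A", -15.0), ("candidate B", -12.5), ("candidate C", -20.0)]
ordered = order_by_likelihood(matches)
print(ordered)
```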

【００８９】As described above, Embodiment 1 comprises: a speech recognition unit 2 that recognizes speech from the microphone 1 to generate a plurality of syllable strings, generates a plurality of word strings from the syllable strings using the n-gram dictionary in the n-gram dictionary storage unit 4, sorts them into an initial candidate and a plurality of next candidates, and stores the initial candidate in the display area of the RAM 3; a display unit 5 that displays the word string stored in the display area of the RAM 3 to the user; and a candidate search unit 10 that, when misrecognized character strings are specified with the keyboard 6 or the mouse 7, searches the next candidates for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and stores it in the display area of the RAM 3. As a result, a misrecognized word string can be corrected easily with a single correction operation, even when it contains several misrecognized character strings, without the user having to be aware of word boundaries.

【００９０】Embodiment 1 also comprises a display changing unit 9 that displays the misrecognized character strings in one of the following display forms: shading, reverse video, a change in brightness, a change in color, or animation. As a result, misrecognized character strings can be displayed according to the user's preference and purpose.

【００９１】Embodiment 2. In Embodiment 1, only the misrecognized character strings were specified, but in practice it is difficult to specify exactly the misrecognized character strings and nothing else. In utterance example B, for instance, the user should specify only ＜尿＞ as the misrecognized range but often specifies ＜尿意＞, which includes the correct character 「意」, as shown on the display unit (display means) 5C in FIG. 5. With search condition 1 of Embodiment 1, the correctly recognized word string is then not obtained, because 「意」 is at the same character position. Embodiment 2 addresses this case and describes a method of obtaining the correctly recognized word string with search condition 2, which is an extension of search condition 1.

【００９２】FIG. 6 is a diagram showing the configuration of a speech recognition apparatus according to Embodiment 2 of the present invention. Components identical or equivalent to those in FIG. 1 are given the same reference numerals. In FIG. 6, 11 denotes a candidate search unit (candidate search means) that searches for a next-candidate word string according to search condition 2, described later.

【００９３】FIG. 7 is a flowchart showing a speech recognition method according to Embodiment 2 of the present invention. Operations identical or equivalent to those in FIG. 2 are given the same reference symbols. When the operations from step ST1 to step ST9 have been performed as in Embodiment 1, the candidate search unit 11 reads the next-candidate word strings from the RAM 3 and, according to search condition 2, searches for a word string in which each misrecognized character string differs by one or more characters and all character strings other than the misrecognized character strings are identical, and stores it in the display area of the RAM 3 (step ST11, candidate search step).

【００９４】This is explained concretely using utterance example B. In the case of the display unit 5C in FIG. 5, the specified misrecognized character strings are {1, 1} and {3, 4}; only ＜尿＞ should have been specified, but ＜尿意＞ was specified by mistake.

【００９５】Modified initial candidate: 「＜車＞で＜尿意＞する」

【００９６】The next-candidate word strings at this time are the same as in Embodiment 1. Because 「意」 is at the same character position, search condition 1 does not yield the correctly recognized word string. Instead, the candidate search unit 11 searches these next candidates for one that satisfies the following search condition 2, which is an extension of search condition 1.

【００９７】Search condition 2: (a character string not identical to ＜車＞) + 「で」 + (a character string not identical to ＜尿意＞) + 「する」

【００９８】Search condition 2 means that, even when some of the same characters appear at the same character positions as in the misrecognized character string, the string is regarded as changed as long as at least one character has been changed. Compared with the initial candidate, the first next candidate is as follows: ＜尿意＞ contains the correct character 「意」, but ＜尿意＞ and 「に用意」 are not identical character strings, and the portions other than the misrecognized character strings are identical, so search condition 2 is satisfied.

【００９９】Initial candidate: 「＜車＞で＜尿意＞する」
First next candidate: 「来るまでに用意する」

【０１００】Compared with the initial candidate, the second next candidate is as follows: ＜尿意＞ is a completely identical character string, so search condition 2 is not satisfied.

【０１０１】Initial candidate: 「＜車＞で＜尿意＞する」
Second next candidate: 「来るまで尿意する」

【０１０２】The candidate search unit 11 therefore stores the first next candidate in the display area of the RAM 3, and the display unit 5 displays it. Because the candidate search unit 11 searches the next candidates under search condition 2, a misrecognized word string can be corrected robustly to the correctly recognized word string even when the specified misrecognized character string contains a correct character string.
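Search condition 2, as written in paragraph 【００９７】, can be sketched as a pattern match on the displayed characters. This is a simplification under stated assumptions: the initial candidate is given as a list of (text, marked) segments, every replacement is at least one character long, and the syllable-based position check described for Embodiment 1 is not modeled. The function name `matches_condition_2` is hypothetical.

```python
def matches_condition_2(segments, candidate):
    """Try every split of candidate against the fixed and marked segments."""
    if not segments:
        return candidate == ""
    text, marked = segments[0]
    rest = segments[1:]
    if not marked:
        # A fixed segment must appear unchanged at this point.
        return candidate.startswith(text) and matches_condition_2(
            rest, candidate[len(text):]
        )
    # A marked segment may be replaced by any string that is not identical.
    return any(
        candidate[:n] != text and matches_condition_2(rest, candidate[n:])
        for n in range(1, len(candidate) + 1)
    )


# 「＜車＞で＜尿意＞する」 from utterance example B.
example_b = [("車", True), ("で", False), ("尿意", True), ("する", False)]
print(matches_condition_2(example_b, "来るまでに用意する"))
print(matches_condition_2(example_b, "来るまで尿意する"))
```

Because the marked segments are tried at every length, all matching patterns are examined, which corresponds to the requirement in Embodiment 1 that every pattern be checked when more than one exists.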

【０１０３】Utterance example C can be handled in the same way. That is, even when the misrecognized character strings are specified in the initial-candidate word string as shown below, the following comparison shows that search condition 2 finds the fourth next candidate.

【０１０４】Initial candidate: 「＜妻＞で＜きた＞金＜をくれ＞」
First next candidate: 「妻出来た金をくれ」
Second next candidate: 「津まで来た金をくれ」
Third next candidate: 「津まで北風送れ」
Fourth next candidate: 「津まで来た金送れ」

【０１０５】As described above, in Embodiment 2 the definition of a character change is extended, as search condition 2, to cover the case where the next-candidate character string at the character position of a misrecognized character string is not identical to that misrecognized character string. As a result, a misrecognized word string can be corrected robustly even when the specified misrecognized character string contains a correct character string.

【０１０６】Embodiment 3. FIG. 8 is a diagram showing the configuration of a speech recognition apparatus according to Embodiment 3 of the present invention. Components identical or equivalent to those in FIGS. 1 and 6 are given the same reference numerals. In FIG. 8, 12 denotes a next-candidate character string display unit (next-candidate character string display means) that extracts, from a next candidate satisfying search condition 2, the next-candidate character strings corresponding to the character positions of the misrecognized character strings and displays each of them aligned with the corresponding misrecognized character string of the initial candidate; 13 denotes a next-candidate character string selection unit (next-candidate character string selection means) that, upon candidate selection, corrects the misrecognized character strings of the initial candidate to the next-candidate character strings.

【０１０７】FIG. 9 is a flowchart showing a speech recognition method according to Embodiment 3 of the present invention. Operations identical or equivalent to those in FIGS. 2 and 7 are given the same reference symbols. When the operations from step ST1 to step ST11 have been performed as in Embodiments 1 and 2, the next-candidate character string display unit 12 displays on the display unit only the next-candidate character strings that correspond to the misrecognized character strings, taken from the next candidate found by the candidate search unit 11 (step ST12, next-candidate character string display step).

【０１０８】That is, the next-candidate character string display unit 12 reads the next-candidate character strings 「来るま」 14A and 「に用意」 14B, which are the changed portions for the respective misrecognized character strings, from the first next candidate in the RAM 3 and displays each of them aligned with its misrecognized character string, as on the display unit (display means) 5D in FIG. 10. When the displayed next-candidate character strings 「来るま」 14A and 「に用意」 14B are selected with the mouse or keyboard, the next-candidate character string selection unit 13 corrects the initial candidate, as on the display unit (display means) 5E in FIG. 10 (step ST13, next-candidate character string selection step).
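Steps ST12 and ST13 can be sketched as an extraction followed by a substitution. The sketch is illustrative only: the function names are hypothetical, the ranges are 1-based and inclusive and assumed to be sorted, the initial-candidate ranges {1, 1} and {3, 4} are those given for utterance example B in Embodiment 2, and the next-candidate ranges (1, 3) and (5, 7) are assumptions chosen to yield the strings 14A and 14B.

```python
def extract(text, ranges):
    """Return the substring of text for each 1-based, inclusive range."""
    return [text[start - 1:end] for start, end in ranges]


def apply_selection(initial, initial_ranges, replacements):
    """Replace each marked range of the initial candidate in order."""
    out, prev = [], 0
    for (start, end), replacement in zip(initial_ranges, replacements):
        out.append(initial[prev:start - 1])  # unchanged text before the range
        out.append(replacement)
        prev = end
    out.append(initial[prev:])  # unchanged text after the last range
    return "".join(out)


# Step ST12: pull the next-candidate character strings out of the candidate.
next_strings = extract("来るまでに用意する", [(1, 3), (5, 7)])
# Step ST13: the user selects them, and the initial candidate is corrected.
corrected = apply_selection("車で尿意する", [(1, 1), (3, 4)], next_strings)
print(next_strings, corrected)
```

Keeping extraction separate from substitution mirrors the split between the display unit 12 and the selection unit 13, so a user could in principle accept one replacement and not another.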

【０１０９】As described above, in Embodiment 3 the next-candidate character string display unit 12 displays the next-candidate character strings 「来るま」 14A and 「に用意」 14B, which are the changed portions for the respective misrecognized character strings, each aligned with its misrecognized character string, and the next-candidate character string selection unit 13 corrects the initial candidate when the next-candidate character strings are selected. As a result, the correction operation can be performed while comparing the misrecognized character strings with the next-candidate character strings. The method of Embodiment 3 is particularly effective when several word strings satisfying the search condition are found among the next candidates.

【０１１０】Embodiment 3 uses the candidate search unit 11 of Embodiment 2, but the candidate search unit 10 of Embodiment 1 may be used instead.

【０１１１】The steps of the speech recognition method shown in each embodiment may also be replaced by procedures, implemented as a program, and recorded on a computer-readable recording medium, with the same effects.

【０１１２】[0112]

【発明の効果】[Effects of the Invention] According to the present invention, the apparatus comprises: speech recognition means that recognizes input speech to generate a syllable string, generates a plurality of word strings corresponding to the syllable string using an n-gram dictionary, sorts them into an initial candidate and a plurality of next candidates, and stores the initial candidate in a display area; display means that displays the word string stored in the display area to the user; and candidate search means that, when misrecognized character strings in the initial candidate displayed on the display means are specified, searches the next candidates for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and stores it in the display area. As a result, a misrecognized word string can be corrected easily with a single correction operation, even when it contains several misrecognized character strings, without the user having to be aware of word boundaries.

【０１１３】According to the present invention, the candidate search means extends the definition of a character change to cover the case where the next-candidate character string at the character position of a misrecognized character string is not identical to that misrecognized character string. As a result, a misrecognized word string can be corrected robustly even when the specified misrecognized character string contains a correct character string.

【０１１４】According to the present invention, the apparatus comprises display changing means that displays the misrecognized character strings in one of the following display forms: shading, reverse video, a change in brightness, a change in color, or animation. As a result, misrecognized character strings can be displayed according to the user's preference and purpose.

【０１１５】According to the present invention, the apparatus comprises: next-candidate character string display means that extracts, from the next candidate found by the candidate search means, the next-candidate character strings corresponding to the character positions of the misrecognized character strings and displays them, aligned with the misrecognized character strings, together with the initial candidate on the display unit; and next-candidate character string selection means that corrects the misrecognized character strings of the initial candidate to the next-candidate character strings when the user selects a candidate. As a result, the correction operation can be performed while visually checking the correspondence between the misrecognized character strings and the next-candidate character strings.

【０１１６】According to the present invention, the method comprises: a speech recognition step of recognizing input speech to generate a syllable string, generating a plurality of word strings corresponding to the syllable string using an n-gram dictionary, sorting them into an initial candidate and a plurality of next candidates, and storing the initial candidate in a display area; a display step of displaying the word string stored in the display area to the user; a misrecognized character string specification step in which misrecognized character strings in the initial candidate displayed in the display step are specified; and a candidate search step of searching the next candidates for a word string in which only the character strings at the character positions of the misrecognized character strings have all been changed, and storing it in the display area. As a result, a misrecognized word string can be corrected easily with a single correction operation, even when it contains several misrecognized character strings, without the user having to be aware of word boundaries.

【０１１７】According to the present invention, the candidate search step extends the definition of a character change to cover the case where the next-candidate character string at the character position of a misrecognized character string is not identical to that misrecognized character string. As a result, a misrecognized word string can be corrected robustly even when the specified misrecognized character string contains a correct character string.

【０１１８】According to the present invention, the method comprises a display change step of displaying the misrecognized character strings in one of the following display forms: shading, reverse video, a change in brightness, a change in color, or animation. As a result, misrecognized character strings can be displayed according to the user's preference and purpose.

【０１１９】According to the present invention, the method comprises: a next-candidate character string display step of extracting, from the next candidate found in the candidate search step, the next-candidate character strings corresponding to the character positions of the misrecognized character strings and displaying them, aligned with the misrecognized character strings, together with the initial candidate on the display unit; and a next-candidate character string selection step of correcting the misrecognized character strings of the initial candidate to the next-candidate character strings when the user selects a candidate. As a result, the correction operation can be performed while visually checking the correspondence between the misrecognized character strings and the next-candidate character strings.

[0120] According to the present invention, there are provided: a speech recognition procedure that recognizes the input speech to generate a syllable string, generates a plurality of word strings corresponding to the syllable string using an n-gram dictionary, sorts them into an initial candidate and a plurality of next candidates, and stores the initial candidate in a display area; a display procedure that displays the word string stored in the display area to the user; a misrecognized character string designation procedure in which a misrecognized character string in the initial candidate displayed by the display procedure is designated; and a candidate search procedure that searches the next candidates for a word string in which only the character strings at the character positions of the misrecognized character string have all been changed, and stores that word string in the display area. The user therefore does not need to be aware of word boundaries, and even when there are several misrecognized character strings, the misrecognized word string can be corrected easily with a single correction operation.
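The candidate search described above can be sketched as follows. This is a minimal illustration under a simplifying assumption that is not stated in the source: the initial candidate and every next candidate are taken to have the same length and to align one character to one character. The names `only_selected_changed` and `search_next_candidate`, and the half-open span representation, are hypothetical.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open [start, end) character range (hypothetical representation)


def only_selected_changed(initial: str, candidate: str, spans: List[Span]) -> bool:
    """Return True if every character inside the designated spans differs
    and every character outside them is identical."""
    if len(initial) != len(candidate):
        return False  # assumption: candidates align character by character
    selected = {i for start, end in spans for i in range(start, end)}
    for i, (a, b) in enumerate(zip(initial, candidate)):
        # Inside a span the characters must differ; outside they must match.
        if (i in selected) == (a == b):
            return False
    return True


def search_next_candidate(
    initial: str, next_candidates: List[str], spans: List[Span]
) -> Optional[str]:
    """Return the first next candidate (in ranked order) that changes only the designated spans."""
    for candidate in next_candidates:
        if only_selected_changed(initial, candidate, spans):
            return candidate
    return None
```

With the toy strings `initial = "abcdefg"`, `spans = [(1, 2), (4, 6)]`, and next candidates `["abXdefg", "aXcdYZg", "XXcdYZg"]`, the search returns `"aXcdYZg"`: the first candidate leaves a designated position unchanged, and the third changes a character outside the designated spans. Both designated spans are handled in the same pass, which corresponds to correcting several misrecognized character strings with one operation.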

[0121] According to the present invention, in the candidate search procedure, the definition of a character change is extended to cover the case where the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string. Therefore, even when the designated misrecognized character string happens to include a correctly recognized character string, the misrecognized word string can still be corrected robustly.
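The difference between the two definitions of a character change can be sketched as below. The sketch assumes, for simplicity, that the initial candidate and the next candidate align one character to one character, which the source does not state; the function names and the half-open span are hypothetical.

```python
from typing import Tuple

Span = Tuple[int, int]  # half-open [start, end) character range (hypothetical representation)


def changed_strict(initial: str, candidate: str, span: Span) -> bool:
    """Strict reading: every character in the designated span must differ."""
    start, end = span
    return all(a != b for a, b in zip(initial[start:end], candidate[start:end]))


def changed_extended(initial: str, candidate: str, span: Span) -> bool:
    """Extended reading: the span counts as changed if it is not identical as a whole."""
    start, end = span
    return initial[start:end] != candidate[start:end]
```

Suppose the user designates `"bcd"` in `"abcdefg"` (span `(1, 4)`) although only `"bc"` was misrecognized. For the next candidate `"aXYdefg"`, `changed_strict` returns `False` because `"d"` is unchanged, whereas `changed_extended` returns `True`, so the candidate is still accepted even though the designation included a correct character.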

[0122] According to the present invention, a display change procedure is provided that displays the misrecognized character string in any one of the following display modes: shaded display, inverted display, luminance change display, color change display, or animation display. The misrecognized character string can therefore be displayed in a manner suited to the user's preference and intended use.

[0123] According to the present invention, there are provided a next candidate character string display procedure, which extracts the next candidate character string corresponding to the character position of the misrecognized character string from the next candidate found in the candidate search procedure and displays it on the display unit together with the initial candidate, aligned with the misrecognized character string, and a next candidate character string selection procedure, which corrects the misrecognized character string of the initial candidate to the next candidate character string when the user selects that candidate. The correction operation can therefore be performed while visually checking the correspondence between the misrecognized character string and the next candidate character string.
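One way to pair each misrecognized character string with its next candidate character string is sketched below. The source does not say how character positions are aligned when the strings differ in length, so this sketch makes an assumption: the undesignated text of the initial candidate is used as fixed anchors, and whatever the next candidate contains between those anchors is taken as the replacement. The function names are hypothetical, and the example sentence is invented around the strings named for FIG. 10.

```python
import re
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open [start, end) range; spans assumed sorted and non-overlapping


def extract_replacements(
    initial: str, candidate: str, spans: List[Span]
) -> Optional[List[Tuple[str, str]]]:
    """Pair each designated string with the candidate text found between the anchors."""
    pattern, pos = "", 0
    for start, end in spans:
        # Undesignated text must match literally; designated text may be anything non-empty.
        pattern += re.escape(initial[pos:start]) + "(.+)"
        pos = end
    pattern += re.escape(initial[pos:])
    match = re.fullmatch(pattern, candidate, flags=re.DOTALL)
    if match is None:
        return None  # the candidate does not preserve the undesignated text
    return [(initial[s:e], g) for (s, e), g in zip(spans, match.groups())]


def apply_replacements(initial: str, spans: List[Span], replacements: List[str]) -> str:
    """Build the corrected string once the user has selected the replacements."""
    parts, pos = [], 0
    for (start, end), new_text in zip(spans, replacements):
        parts.append(initial[pos:start])
        parts.append(new_text)
        pos = end
    parts.append(initial[pos:])
    return "".join(parts)
```

For the invented initial candidate `"車で尿意して"` with spans `[(0, 1), (2, 4)]` and the next candidate `"来るまでに用意して"`, `extract_replacements` returns `[("車", "来るま"), ("尿意", "に用意")]`, and applying those replacements yields `"来るまでに用意して"`. When an anchor string occurs more than once in the candidate, the regular expression settles on one of several possible alignments, so a real system would presumably rely on position information from the recognizer instead.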

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of a speech recognition apparatus according to Embodiment 1 of the present invention.

FIG. 2 is a flowchart showing a speech recognition method according to Embodiment 1 of the present invention.

FIG. 3 is a diagram showing an example of the n-gram dictionary for the cases n = 1 and n = 2.

FIG. 4 is a diagram in which the character strings 「車」 ("car") and 「尿」 ("urine") are designated as misrecognized character strings.

FIG. 5 is a diagram in which the character strings 「車」 ("car") and 「尿意」 ("urge to urinate") are designated as misrecognized character strings.

FIG. 6 is a diagram showing the configuration of a speech recognition apparatus according to Embodiment 2 of the present invention.

FIG. 7 is a flowchart showing a speech recognition method according to Embodiment 2 of the present invention.

FIG. 8 is a diagram showing the configuration of a speech recognition apparatus according to Embodiment 3 of the present invention.

FIG. 9 is a flowchart showing a speech recognition method according to Embodiment 3 of the present invention.

FIG. 10 is a diagram in which the next candidate character strings 「来るま」 and 「に用意」 are displayed aligned with the misrecognized character strings <車> and <尿意>, respectively.

FIG. 11 is a diagram showing the configuration of a conventional speech recognition apparatus.

FIG. 12 is a diagram showing how a conventional speech recognition apparatus corrects the misrecognized word 「認知」 ("cognition") to the correctly recognized word 「認識」 ("recognition").

[Explanation of symbols]

1: microphone (speech recognition means); 2: speech recognition unit (speech recognition means); 3: random access memory (speech recognition means); 4: n-gram dictionary storage unit (speech recognition means); 5, 5A, 5B, 5C, 5D, 5E: display unit (display means); 6: keyboard; 7: mouse; 8: range designation unit; 9: display change unit (display change means); 10, 11: candidate search unit (candidate search means); 12: next candidate character string display unit (next candidate character string display means); 13: next candidate character string selection unit (next candidate character string selection means).

Continued from the front page
(72) Inventor: Yuzo Maruta, c/o Mitsubishi Electric Corporation, 2-2-3 Marunouchi, Chiyoda-ku, Tokyo
(72) Inventor: Shinobu Arai, c/o Mitsubishi Electric Corporation, 2-2-3 Marunouchi, Chiyoda-ku, Tokyo
F-terms (reference): 5D015 HH12 HH16 KK03 LL04 LL08; 5D045 AB02

Claims

[Claims]

1. A speech recognition apparatus that displays to a user a word string that is the speech recognition result of input speech and corrects a misrecognized character string included in the word string, the apparatus comprising: speech recognition means for recognizing the input speech to generate a syllable string, generating a plurality of word strings corresponding to the syllable string using an n-gram dictionary, sorting them into an initial candidate and a plurality of next candidates, and storing the initial candidate in a display area; display means for displaying the word string stored in the display area to the user; and candidate search means for, when a misrecognized character string in the initial candidate displayed on the display means is designated, searching the next candidates for a word string in which only the character strings at the character positions of the misrecognized character string have all been changed, and storing that word string in the display area.

2. The speech recognition apparatus according to claim 1, wherein the candidate search means extends the definition of a character change to cover the case where the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string.

3. The speech recognition apparatus according to claim 1, further comprising display change means for displaying the misrecognized character string in any one of shaded display, inverted display, luminance change display, color change display, and animation display.

4. The speech recognition apparatus according to claim 1 or claim 2, further comprising: next candidate character string display means for extracting the next candidate character string corresponding to the character position of the misrecognized character string from the next candidate found by the candidate search means, and displaying it on the display unit together with the initial candidate, aligned with the misrecognized character string; and next candidate character string selection means for correcting the misrecognized character string of the initial candidate to the next candidate character string when the user selects that candidate.

5. A speech recognition method for displaying to a user a word string that is the speech recognition result of input speech and correcting a misrecognized character string included in the word string, the method comprising: a speech recognition step of recognizing the input speech to generate a syllable string, generating a plurality of word strings corresponding to the syllable string using an n-gram dictionary, sorting them into an initial candidate and a plurality of next candidates, and storing the initial candidate in a display area; a display step of displaying the word string stored in the display area to the user; a misrecognized character string designation step in which a misrecognized character string in the initial candidate displayed in the display step is designated; and a candidate search step of searching the next candidates for a word string in which only the character strings at the character positions of the misrecognized character string have all been changed, and storing that word string in the display area.

6. The speech recognition method according to claim 5, wherein, in the candidate search step, the definition of a character change is extended to cover the case where the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string.

7. The speech recognition method according to claim 5, further comprising a display change step of displaying the misrecognized character string in any one of shaded display, inverted display, luminance change display, color change display, and animation display.

8. The speech recognition method according to claim 5, further comprising: a next candidate character string display step of extracting the next candidate character string corresponding to the character position of the misrecognized character string from the next candidate found in the candidate search step, and displaying it on the display unit together with the initial candidate, aligned with the misrecognized character string; and a next candidate character string selection step of correcting the misrecognized character string of the initial candidate to the next candidate character string when the user selects that candidate.

9. A computer-readable recording medium on which is recorded a speech recognition program for displaying to a user a word string that is the speech recognition result of input speech and correcting a misrecognized character string included in the word string, the program comprising: a speech recognition procedure of recognizing the input speech to generate a syllable string, generating a plurality of word strings corresponding to the syllable string using an n-gram dictionary, sorting them into an initial candidate and a plurality of next candidates, and storing the initial candidate in a display area; a display procedure of displaying the word string stored in the display area to the user; a misrecognized character string designation procedure in which a misrecognized character string in the initial candidate displayed in the display procedure is designated; and a candidate search procedure of searching the next candidates for a word string in which only the character strings at the character positions of the misrecognized character string have all been changed, and storing that word string in the display area.

10. The computer-readable recording medium on which the speech recognition program according to claim 9 is recorded, wherein, in the candidate search procedure, the definition of a character change is extended to cover the case where the next candidate's character string at the character position of the misrecognized character string is not identical to the misrecognized character string.

11. The computer-readable recording medium on which the speech recognition program according to claim 9 or claim 10 is recorded, the program further comprising a display change procedure of displaying the misrecognized character string in any one of shaded display, inverted display, luminance change display, color change display, and animation display.

12. The computer-readable recording medium on which the speech recognition program according to claim 9 or claim 10 is recorded, the program further comprising: a next candidate character string display procedure of extracting the next candidate character string corresponding to the character position of the misrecognized character string from the next candidate found in the candidate search procedure, and displaying it on the display unit together with the initial candidate, aligned with the misrecognized character string; and a next candidate character string selection procedure of correcting the misrecognized character string of the initial candidate to the next candidate character string when the user selects that candidate.