JP2005241829A

JP2005241829A - System and method for speech information processing, and program

Info

Publication number: JP2005241829A
Application number: JP2004049749A
Authority: JP
Inventors: Hisayoshi Nagae; 尚義永江; Yukihiro Fukunaga; 幸弘福永
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2004-02-25
Filing date: 2004-02-25
Publication date: 2005-09-08
Anticipated expiration: 2024-02-25
Also published as: JP4189336B2

Abstract

PROBLEM TO BE SOLVED: To provide a speech recognition system that spares the user from repeating the same correction, even when the user does not register words.

SOLUTION: A speech input part 101 receives a spoken character string. A speech recognition part 201 converts the character string into a mixed kana-kanji sentence (kana: Japanese syllabary; kanji: Chinese characters). The mixed kana-kanji sentence is displayed on a display screen, and corrections by the user are accepted. A correction result generation part 102 generates correction result information, a corrected word generation part 103 generates a corrected word consisting of the input speech and the correction result information, and the corrected word is registered in a corrected word dictionary 106. The speech recognition part 201 performs recognition processing using a recognition vocabulary dictionary 107 in combination with the corrected word dictionary 106.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a speech information processing system, a speech information processing method, and a program for entering sentences by voice.

In recent years, Japanese dictation systems that allow Japanese sentences to be entered by voice have been put into practical use and are beginning to be deployed in various fields. The vocabulary such a system can recognize has also grown considerably, to roughly tens of thousands to over a hundred thousand words. In practice, however, it is impossible to register in advance every proper noun, every word specific to an individual user, and every newly coined word in the system dictionary. As long as these words remain unregistered, the speech recognition system cannot recognize them correctly no matter how carefully the user speaks. Current systems deal with such unregistered words by additionally registering them, as user words, in the recognition vocabulary dictionary that the speech recognition system refers to.

A user word is usually registered in the dictionary by having the user enter or specify its "notation", "reading (or pronunciation)", and "part of speech" and then press a dictionary registration button. In general, this triple of information is entered for each word, and user words are registered one at a time.

Conventional speech recognition technology is disclosed in detail in, for example, Non-Patent Document 12.
"Journal of the Information Processing Society of Japan" (情報処理学会誌), April 2000 issue (Vol. 41, No. 04), pp. 436-439, special feature: "Signposts" (道しるべ), title: "Speech recognition technology: how far it has come" (ここまできた音声認識技術), author: Tatsuya Kawahara (Graduate School of Informatics, Kyoto University)

The performance of speech recognition systems improves year by year, but the recognition rate is not 100%. When the system misrecognizes something, the user therefore has to utter the misrecognized portion again or correct it with a keyboard or other input device. The main cause of misrecognition is that the word the user entered is not registered in the recognition vocabulary dictionary. Registering the misrecognized word in the dictionary as a user word therefore prevents the same misrecognition from then on. However, interrupting text entry and the polishing of the text to register a user word every time a misrecognition occurs, in the middle of composing the text, is very troublesome. As a result, most users currently skip user word registration and simply correct the misrecognition on the spot with a keyboard or the like. In that case the word the user wanted to enter remains unregistered, so when the user utters the same word later, the speech recognition system makes the same misrecognition again and the user has to repeat the same correction.

The present invention has been made in view of the above circumstances, and its object is to provide a speech information processing system, a speech information processing method, and a program that spare the user from repeating the same correction even when the user does not register words.

A speech information processing system according to the present invention comprises: a recognition vocabulary dictionary in which a plurality of first dictionary data items are registered, each including information on the kana reading of a vocabulary item to be recognized and information on its kana-kanji notation; means for inputting speech; means for generating a kana character string from the input speech; kana-kanji character string generation means for generating, based on the recognition vocabulary dictionary, a kana-kanji character string for the generated kana character string; display means for displaying the generated kana-kanji character string on a display screen; accepting means for accepting a correction to the displayed kana-kanji character string; dictionary data generation means for generating second dictionary data including the kana character string from which the corrected kana-kanji character string was generated and information on the content of the correction; and registration means for registering the generated second dictionary data in a specific dictionary different from the recognition vocabulary dictionary, wherein the kana-kanji character string generation means performs the generation based also on the specific dictionary.

In the present invention, the correction operation performed by the user on the kana-kanji character string generated by recognizing the input speech is monitored, and dictionary data including the corresponding kana character string and information on the correction is registered in the specific dictionary. Even if the user does not register the word, this dictionary data can then be used together with the ordinary recognition vocabulary dictionary in subsequent language processing. The system therefore recognizes the input correctly when the same kana-kanji character string is entered again, and the user no longer has to repeat the same correction operation.

The present invention relating to an apparatus also holds as an invention relating to a method, and the present invention relating to a method also holds as an invention relating to an apparatus.
The present invention relating to an apparatus or a method also holds as a program for causing a computer to execute means corresponding to the invention (or for causing a computer to function as means corresponding to the invention, or for causing a computer to realize functions corresponding to the invention), and as a computer-readable recording medium on which the program is recorded.

According to the present invention, the user does not have to repeat the same correction even without registering words.

Embodiments of the present invention are described below with reference to the drawings.

(First Embodiment)
FIG. 1 shows a configuration example of a speech recognition system (natural language processing system) according to the first embodiment of the present invention.

As shown in FIG. 1, the speech recognition system includes a speech input unit 101, a correction result generation unit 102, a corrected word generation unit 103, a corrected word dictionary registration unit 104, a corrected-word-dictionary-combined speech recognition unit 105, a corrected word dictionary 106, and a recognition vocabulary dictionary 107.

The recognition vocabulary dictionary 107 holds a plurality of registered word entries, each consisting of a set of information such as "notation", "reading", and "part of speech".

The speech input unit 101 receives input speech data (200) from the user (100).

The corrected-word-dictionary-combined speech recognition unit 105 generates a speech recognition result (201), as described later. Here, the speech recognition result is a mixed kana-kanji sentence.

The generated mixed kana-kanji sentence is displayed on the display screen of a predetermined display device (not shown). Correction operations by the user on the displayed sentence are accepted through a predetermined input device (not shown), and the correction result is displayed on the display screen of the display device.

The correction result generation unit 102 identifies the correction range from the speech recognition result (201) output by the corrected-word-dictionary-combined speech recognition unit 105 and the correction operation (202) that the user performed on that result, and generates correction result information (203) consisting of the correction position within the speech recognition result and the correction result character string. The correction operation is not limited to re-entry by voice: any input device for entering character information, such as a keyboard, mouse, or pen, can be used, and several of these devices may be combined in a single correction operation.
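One way to picture how the correction range might be identified is to compare the recognition result with the text left after the user's edits. The sketch below is an illustration only and is not taken from the patent: it assumes a single contiguous correction and locates it by trimming the common prefix and suffix of the two strings. The names CorrectionResult and make_correction_result are hypothetical.

```python
from typing import NamedTuple, Optional


class CorrectionResult(NamedTuple):
    position: int   # start index of the corrected span in the recognition result
    original: str   # misrecognized text that was replaced
    corrected: str  # text that the user ended up with


def make_correction_result(recognized: str, edited: str) -> Optional[CorrectionResult]:
    """Return the single differing span, or None if nothing was corrected.

    Assumption: the user made one contiguous correction.
    """
    if recognized == edited:
        return None
    limit = min(len(recognized), len(edited))
    # Length of the common prefix.
    prefix = 0
    while prefix < limit and recognized[prefix] == edited[prefix]:
        prefix += 1
    # Length of the common suffix, not overlapping the prefix.
    suffix = 0
    while suffix < limit - prefix and recognized[-1 - suffix] == edited[-1 - suffix]:
        suffix += 1
    return CorrectionResult(
        position=prefix,
        original=recognized[prefix:len(recognized) - suffix],
        corrected=edited[prefix:len(edited) - suffix],
    )
```

A character-level comparison like this does not know about word boundaries; a real system would more likely work with the recognizer's own segmentation.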

The corrected word generation unit 103 generates a corrected word (204) that pairs the pronunciation string of the input speech (200) received by the speech input unit 101 with the correction result information (203) generated by the correction result generation unit 102. Although the unit generated here is called a word, the registered character string may also be a phrase or a short sentence; no restriction is placed on the registered character string.

The corrected word dictionary registration unit 104 registers the corrected word (204) generated by the corrected word generation unit 103 in the corrected word dictionary 106.

The corrected-word-dictionary-combined speech recognition unit 105 performs speech recognition using the corrected word dictionary 106 and the recognition vocabulary dictionary 107 together (it recognizes the input speech 200 to generate a kana character string, and generates the mixed kana-kanji character string 201 from that kana character string). When the reading of a corrected word (the pronunciation string of the input speech) coincides with that of a word in the recognition vocabulary dictionary, any existing method may be adopted, for example giving priority to the corrected word, or presenting both words and letting the user choose.
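The two conflict-handling options mentioned above (prefer the corrected word, or present both candidates) can be sketched as a lookup over two dictionaries keyed by reading. This is a minimal illustration under assumed data structures: the function name lookup, the prefer_corrected flag, and the dict-of-lists layout are hypothetical and are not specified in the patent.

```python
from typing import Dict, List


def lookup(
    reading: str,
    corrected_words: Dict[str, List[str]],
    vocabulary: Dict[str, List[str]],
    prefer_corrected: bool = True,
) -> List[str]:
    """Return candidate notations for a kana reading from both dictionaries."""
    from_corrections = corrected_words.get(reading, [])
    from_vocabulary = vocabulary.get(reading, [])
    if prefer_corrected and from_corrections:
        # Option 1: the corrected word takes priority.
        return list(from_corrections)
    # Option 2: present both, corrected words first, without duplicates,
    # so that the user can choose.
    merged = list(from_corrections)
    for notation in from_vocabulary:
        if notation not in merged:
            merged.append(notation)
    return merged
```

With "さいしん" mapped to "最新" in the vocabulary and to "砕身" in the corrected word dictionary, the first option returns only "砕身", while the second returns both candidates in that order.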

FIG. 2 shows an example of the general procedure for recognition processing and correction processing in the speech recognition system of this embodiment.

Input from the user is accepted (step S1). If the input is to be recognized (step S2), recognition processing is performed (step S3) and the recognition result is output (step S4). If instead the input is a correction (step S2), correction information (in this embodiment, the correction result information) is generated (step S5), and dictionary registration (in this embodiment, registration of the corrected word in the corrected word dictionary) is performed based on the correction information (step S6).
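The branching in steps S1 to S6 can be sketched as a small dispatcher. This is only an illustration of the control flow: the class DictationSession, the "speech" and "correction" input kinds, and the injected recognize function are all hypothetical, and the correction information is reduced here to a (recognized text, corrected text) pair.

```python
from typing import Callable, List, Optional, Tuple


class DictationSession:
    def __init__(self, recognize: Callable[[str], str]) -> None:
        self.recognize = recognize  # stand-in for the speech recognition unit
        self.last_result = ""
        self.registered: List[Tuple[str, str]] = []  # stand-in for the dictionary

    def handle(self, kind: str, payload: str) -> Optional[str]:
        """S1: accept one input; S2: branch on whether it is speech or a correction."""
        if kind == "speech":
            self.last_result = self.recognize(payload)  # S3: recognition
            return self.last_result                     # S4: output the result
        if kind == "correction":
            info = (self.last_result, payload)          # S5: correction information
            self.registered.append(info)                # S6: dictionary registration
            self.last_result = payload
            return None
        raise ValueError(f"unknown input kind: {kind}")
```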

In a conventional speech recognition system, an unregistered word continues to be misrecognized unless the user additionally registers it in the recognition vocabulary dictionary. In the speech recognition system of this embodiment, by contrast, once the user performs a correction operation after a misrecognition and the result is automatically registered in the corrected word dictionary as a corrected word, the correction the user made earlier is applied, and input that would keep being misrecognized with the conventional recognition vocabulary dictionary alone is recognized correctly.

This embodiment is described below using a specific example, with reference to FIG. 3.

FIG. 3 shows a specific example of the operation of the correction result generation unit 102, the corrected word generation unit 103, and the corrected word dictionary registration unit 104, together with the correction result information (203) and the corrected word (204) created in the process.

This example considers entering the sentence "粉骨砕身努力します。" (roughly, "I will spare no effort"). The user utters "ふんこつさいしんどりょくします" (funkotsu saishin doryoku shimasu). Assume that the word "砕身" (saishin) is not registered in the recognition vocabulary dictionary 107.

In FIG. 3, (a) is the first time "ふんこつさいしんどりょくします" is uttered (before correction and registration), and (b) is the second time it is uttered (after correction and registration).

In this case, the speech recognition system makes the following misrecognition (S11).
Input speech: "ふんこつさいしんどりょくします"
Recognition result: "粉骨最新努力します。" (the homophone "最新", saishin, "latest", is output in place of "砕身")
To correct this misrecognition, the user performs the following correction operation (A).
Correction operation (A):
(i) Move the cursor to the right of "最新"
(ii) Delete the two preceding characters, "最新"
(iii) Utter "くだくしんたい" (kudaku shintai)
(iv) Delete "く" and "体" from the recognition result "砕く身体"
(v) Move the cursor to the end of the sentence to prepare for the next utterance
In this embodiment, voice input is used to enter the character string during the correction operation, but character input from a keyboard or the like may be used instead.
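Steps (i) to (v) of correction operation (A) can be traced on a simple text buffer. The sketch below is illustrative only: the TextBuffer class and its method names are hypothetical, and the recognition of the utterance "くだくしんたい" as "砕く身体" in step (iii) is taken as given rather than simulated.

```python
class TextBuffer:
    """A minimal editable text with a cursor (hypothetical helper)."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.cursor = len(text)

    def move_after(self, target: str) -> None:
        # Place the cursor immediately to the right of the first occurrence.
        self.cursor = self.text.index(target) + len(target)

    def backspace(self, count: int) -> None:
        # Delete `count` characters to the left of the cursor.
        start = self.cursor - count
        self.text = self.text[:start] + self.text[self.cursor:]
        self.cursor = start

    def insert(self, value: str) -> None:
        self.text = self.text[:self.cursor] + value + self.text[self.cursor:]
        self.cursor += len(value)

    def remove_chars(self, span: str, chars: str) -> None:
        # Remove each character in `chars` once from the first occurrence of `span`.
        start = self.text.index(span)
        kept = span
        for ch in chars:
            kept = kept.replace(ch, "", 1)
        self.text = self.text[:start] + kept + self.text[start + len(span):]
        self.cursor = start + len(kept)

    def move_to_end(self) -> None:
        self.cursor = len(self.text)


buffer = TextBuffer("粉骨最新努力します。")
buffer.move_after("最新")               # (i)
buffer.backspace(2)                     # (ii)
buffer.insert("砕く身体")               # (iii) recognized text of "くだくしんたい"
buffer.remove_chars("砕く身体", "く体")  # (iv)
buffer.move_to_end()                    # (v)
print(buffer.text)                      # 粉骨砕身努力します。
```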

The result of the above correction operation is as follows (S12).
Correction result: "粉骨砕身努力します。"
From the result of this correction operation, the correction result generation unit 102 detects that the character string at the position of "最新" in the recognition result has been corrected to "砕身", and outputs "correction position: '最新' in the recognition result; correction result character string: 砕身" as the correction result information (203).

The corrected word generation unit 103 then uses the input speech and the correction result information (203) generated by the correction result generation unit 102 to associate "utterance string at the correction position in the input speech: さいしん" with "correction result character string: 砕身", and generates the corrected word 204, "notation: 砕身, reading: さいしん".
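The pairing of the reading at the correction position with the corrected notation can be sketched as follows. The sketch assumes that the recognizer exposes its result as (notation, reading) segments, which the patent does not state; the names make_corrected_word and register, and the dictionary layout, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Segment = Tuple[str, str]  # (notation, reading), an assumed recognizer output


def make_corrected_word(
    segments: List[Segment], original: str, corrected: str
) -> Optional[Dict[str, str]]:
    """Pair the reading of the misrecognized segment with the corrected notation."""
    for notation, reading in segments:
        if notation == original:
            return {"notation": corrected, "reading": reading}
    return None  # the corrected span does not match a single segment


def register(dictionary: Dict[str, List[str]], word: Dict[str, str]) -> None:
    """Add the corrected word to a reading-keyed dictionary, skipping duplicates."""
    notations = dictionary.setdefault(word["reading"], [])
    if word["notation"] not in notations:
        notations.append(word["notation"])


segments = [
    ("粉骨", "ふんこつ"),
    ("最新", "さいしん"),
    ("努力", "どりょく"),
    ("します", "します"),
]
word = make_corrected_word(segments, "最新", "砕身")
corrected_word_dictionary: Dict[str, List[str]] = {}
if word is not None:
    register(corrected_word_dictionary, word)
```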

The corrected word dictionary registration unit 104 registers this corrected word in the corrected word dictionary 106 (S13).

From then on, the corrected-word-dictionary-combined speech recognition unit 105 performs speech recognition using the corrected word dictionary 106, in which this corrected word is registered, together with the recognition vocabulary dictionary 107. As a result, the next time the user enters "ふんこつさいしん", the corrected word dictionary is consulted for the "さいしん" portion and the notation "砕身" is displayed (S14).

With the conventional method, the misrecognition "粉骨最新" is repeated every time unless the user registers the word "砕身: さいしん" in the dictionary. According to this embodiment, once the user has made the correction a single time, the same misrecognition no longer occurs. Because correction is something the user does naturally whenever the speech recognition system misrecognizes, it imposes no additional effort or burden on the user.

(Second Embodiment)
FIG. 4 shows a configuration example of a speech recognition system (natural language processing system) according to the second embodiment of the present invention.

As shown in FIG. 4, the speech recognition system includes a speech input unit 101, a recognition vocabulary dictionary 107, a correction procedure generation unit 108, a correction macro generation unit 109, a correction macro dictionary registration unit 110, a correction-macro-dictionary-combined speech recognition unit 111, and a correction macro dictionary 112. Parts that are the same as in FIG. 1 are given the same reference numerals.

An example of the general procedure for recognition processing and correction processing in the speech recognition system of this embodiment is the same as in FIG. 2.

The following description focuses on the differences from the first embodiment.

The correction-macro-dictionary-combined speech recognition unit 111 generates a speech recognition result (201), as described later. Here, the speech recognition result is a mixed kana-kanji sentence.

The correction procedure generation unit 108 identifies the correction range from the speech recognition result (201) output by the correction-macro-dictionary-combined speech recognition unit 111 and the correction operation (202) that the user performed on that result, and generates a correction procedure (206) representing the sequence of correction operations.

The correction macro generation unit 109 generates a correction macro (207) that pairs the pronunciation string of the input speech (200) received by the speech input unit 101 with the correction procedure (206) generated by the correction procedure generation unit 108.

The correction macro dictionary registration unit 110 registers the correction macro (207) generated by the correction macro generation unit 109 in the correction macro dictionary 112.

The correction-macro-dictionary-combined speech recognition unit 111 performs speech recognition using the correction macro dictionary 112 and the recognition vocabulary dictionary 107 together (it recognizes the input speech 200 to generate a kana character string, and generates the mixed kana-kanji character string 201 from that kana character string). When the reading of a correction macro (the pronunciation string of the input speech) coincides with that of a word in the recognition vocabulary dictionary, any existing method may be adopted, for example giving priority to the correction macro, or presenting both the word produced by the correction macro and the word from the recognition vocabulary dictionary and letting the user choose.

This embodiment is described below using a specific example, with reference to FIG. 5.

FIG. 5 shows a specific example of the operation of the correction procedure generation unit 108, the correction macro generation unit 109, and the correction macro dictionary registration unit 110, together with the correction procedure (206) and the correction macro (207) created in the process.

This example considers entering the sentence "粉骨砕身努力します。" (roughly, "I will spare no effort"). The user utters "ふんこつさいしんどりょくします" (funkotsu saishin doryoku shimasu). Assume that the word "砕身" (saishin) is not registered in the recognition vocabulary dictionary.

In FIG. 5, (a) is the first time "ふんこつさいしんどりょくします" is uttered (before correction and registration), and (b) is the second time it is uttered (after correction and registration).

In this case, the speech recognition system makes the following misrecognition (S21).
Input speech: "ふんこつさいしんどりょくします"
Recognition result: "粉骨最新努力します。"
To correct this misrecognition, the user performs correction operation (A) (see the first embodiment).

The result of this correction operation is as follows (S22).
Correction result: "粉骨砕身努力します。"
From the result of this correction operation, the correction procedure generation unit 108 detects that correction operation (A) was performed on the character string at the position of "最新" in the recognition result, and outputs, as the correction procedure 206, "correction position: '最新' in the recognition result; correction operation: delete '最新' → utter 'くだくしんたい' → delete 'く' and '体'".

The correction macro generation unit 109 then uses the input speech and the correction procedure 206 generated by the correction procedure generation unit 108 to associate "utterance string at the correction position in the input speech: さいしん" with "correction operation: delete '最新' → utter 'くだくしんたい' → delete 'く' and '体'", and generates the correction macro 207, "operation: delete '最新' → utter 'くだくしんたい' → delete 'く' and '体'; reading: さいしん".

The correction macro dictionary registration unit 110 registers this correction macro in the correction macro dictionary 112 (S23).

From then on, the correction-macro-dictionary-combined speech recognition unit 111 performs speech recognition using the correction macro dictionary 112, in which this correction macro is registered, together with the recognition vocabulary dictionary 107. As a result, the next time the user enters "ふんこつさいしん", the correction macro dictionary is consulted for the "さいしん" portion, the correction operation "delete '最新' → utter 'くだくしんたい' → delete 'く' and '体'" is executed automatically, and the notation "砕身" is finally displayed.
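Automatic execution of a recorded correction procedure can be sketched as replaying a list of edit operations on the recognition result. This is an illustration under stated assumptions, not the patent's implementation: the ("delete", text) / ("insert", text) encoding and the name run_macro are hypothetical, and the "utter 'くだくしんたい'" step is replayed by inserting the text "砕く身体" that the utterance produced when the macro was recorded.

```python
from typing import Dict, List, Optional, Tuple

Operation = Tuple[str, str]  # ("delete", text) or ("insert", text)


def run_macro(text: str, operations: List[Operation]) -> str:
    """Replay recorded edit operations on a recognition result.

    Deletions search from the start of the region this macro first touched,
    so they do not hit identical characters earlier in the sentence.
    Raises ValueError if a text to delete is not found.
    """
    cursor = 0
    anchor: Optional[int] = None
    for kind, value in operations:
        if kind == "delete":
            start = text.index(value, anchor if anchor is not None else 0)
            text = text[:start] + text[start + len(value):]
            cursor = start
            if anchor is None:
                anchor = start
        elif kind == "insert":
            if anchor is None:
                anchor = cursor
            text = text[:cursor] + value + text[cursor:]
            cursor += len(value)
        else:
            raise ValueError(f"unknown operation: {kind}")
    return text


# Correction macro dictionary keyed by reading (assumed layout).
correction_macros: Dict[str, List[Operation]] = {
    "さいしん": [
        ("delete", "最新"),
        ("insert", "砕く身体"),  # stored result of uttering "くだくしんたい"
        ("delete", "く"),
        ("delete", "体"),
    ]
}

print(run_macro("粉骨最新努力します。", correction_macros["さいしん"]))  # 粉骨砕身努力します。
```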

The "reading" assigned to a correction macro need not be the utterance string of the corrected portion of the input speech before correction. For example, the utterance string entered during the correction operation may be assigned as the "reading". In the example above, consider assigning "くだくしんたい", the utterance string used during the correction operation. Entering "ふんこつさいしんどりょくします" then still yields the misrecognition "粉骨最新努力します。", but when the user next utters "くだくしんたい", the correction macro is executed and replaces the character string "最新" in the immediately preceding recognition result with "砕身". Conventionally the user had to move the cursor and delete the surplus characters; this embodiment eliminates that effort.

In the above example, correction macros and ordinary speech recognition words are handled without distinction, but a rule may be adopted under which a reserved word (for example, the word "訂正マクロ", "correction macro") is uttered before or after the correction macro. In the above example, the character string "最新" in the immediately preceding recognition result is then replaced with "砕身" only when the user utters "訂正マクロくだくしんたい". This prevents a correction macro from being executed by mistake during ordinary voice input.
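The reserved-word rule can be sketched as a check on the kana reading of each utterance before any macro is looked up. Everything named here is an assumption for illustration: the function handle_utterance, the hiragana form "ていせいまくろ" of the reserved word, the choice to require it as a prefix, and the reduction of a macro to a (wrong, right) replacement pair.

```python
from typing import Dict, Optional, Tuple

RESERVED_WORD = "ていせいまくろ"  # assumed kana reading of "訂正マクロ"


def handle_utterance(
    reading: str,
    previous_result: str,
    macros: Dict[str, Tuple[str, str]],
    require_reserved_word: bool = True,
) -> Optional[str]:
    """Return the corrected previous result, or None to treat the input as dictation."""
    if require_reserved_word:
        if not reading.startswith(RESERVED_WORD):
            return None  # ordinary voice input: never triggers a macro
        reading = reading[len(RESERVED_WORD):]
    macro = macros.get(reading)
    if macro is None:
        return None
    wrong, right = macro
    if wrong not in previous_result:
        return None  # nothing to correct in the preceding result
    return previous_result.replace(wrong, right, 1)


macros = {"くだくしんたい": ("最新", "砕身")}
previous = "粉骨最新努力します。"
print(handle_utterance("ていせいまくろくだくしんたい", previous, macros))  # 粉骨砕身努力します。
print(handle_utterance("くだくしんたい", previous, macros))                # None
```

Setting require_reserved_word to False corresponds to the variant in which the macro reading alone triggers the replacement.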

(Third Embodiment)
FIG. 6 shows a configuration example of a speech recognition system (natural language processing system) according to the third embodiment of the present invention.

As shown in FIG. 6, the speech recognition system includes a speech input unit 101, a correction result generation unit 102, a corrected word generation unit 103, a recognition vocabulary dictionary 107, a user word dictionary registration automatic activation unit 113, a user-word-dictionary-combined speech recognition unit 114, and a user dictionary 115. Parts that are the same as in FIG. 1 are given the same reference numerals.

Like the recognition vocabulary dictionary 107, the user dictionary 115 holds a plurality of registered word entries, each consisting of a set of information such as "notation", "reading", and "part of speech". As in conventional systems, the recognition vocabulary dictionary 107 is a general-purpose dictionary, whereas the user dictionary 115 is the dictionary in which the user registers user words (108) as needed.

The user-word-dictionary-combined speech recognition unit 114 generates a speech recognition result (201), as described later. Here, the speech recognition result is a mixed kana-kanji sentence.

The user word dictionary registration automatic activation unit 113 carries out the work needed to register the corrected word (204) generated by the corrected word generation unit 103 in the user dictionary 115 as a user word (108). For example, it displays the user word registration screen with the "notation", "reading", and "part of speech" required for registration already generated from the corrected word (204) and filled in. Various methods are possible for the "part of speech" filled in for the user word, such as using the same part of speech as the original word that was corrected, or uniformly using "noun". The user can check the registration content at this point. If there is no problem, the user only has to press the registration button; if changes are needed, the user makes them and then registers the user word in the dictionary.
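Pre-filling the registration screen from a corrected word, including the two part-of-speech options described above, might look like the following sketch. The RegistrationForm type, the prefill_form function, and the use of "名詞" (noun) as the uniform default are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RegistrationForm:
    notation: str
    reading: str
    part_of_speech: str


def prefill_form(
    corrected_word: Dict[str, str],
    original_part_of_speech: Optional[str] = None,
) -> RegistrationForm:
    """Build the form shown to the user before the registration button is pressed.

    If the part of speech of the originally recognized word is known, reuse it;
    otherwise fall back uniformly to "名詞" (noun).
    """
    return RegistrationForm(
        notation=corrected_word["notation"],
        reading=corrected_word["reading"],
        part_of_speech=original_part_of_speech or "名詞",
    )


form = prefill_form({"notation": "砕身", "reading": "さいしん"})
```

The user would still be free to edit any of the three fields before confirming the registration.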

ユーザ単語辞書併用型音声認識部１１４は、ユーザ辞書１１５と認識語彙辞書１０７とを併用して音声認識する（入力音声２００を音声認識して仮名文字列を生成し、該仮名文字列をもとに仮名漢字混じり文字列２０１を生成する）。なお、ユーザ単語の読み（入力音声の発音列）が認識語彙辞書の単語と重複するものについては、例えば、ユーザ単語を優先する、両方の単語を提示してユーザに選択させるなど、既存の方法を採用して構わない。 The user word dictionary combined speech recognition unit 114 recognizes speech by using the user dictionary 115 and the recognition vocabulary dictionary 107 together (recognizes the input speech 200 to generate a kana character string, and based on the kana character string A character string 201 mixed with kana and kanji is generated. For the case where the reading of the user word (the pronunciation string of the input speech) overlaps with the word in the recognition vocabulary dictionary, for example, the user word is given priority, or both words are presented to the user to select. May be adopted.

以下では、図７を参照しながら具体例を用いて本実施形態につき説明する。 Hereinafter, this embodiment will be described using a specific example with reference to FIG.

図７は、訂正結果生成部１０２、訂正単語生成部１０３、ユーザ単語辞書登録自動起動部１１３の動作とその過程で作成される訂正結果情報（２０３）、訂正単語（２０４）の具体例を示したものである。 FIG. 7 shows specific examples of operations of the correction result generation unit 102, the correction word generation unit 103, and the user word dictionary registration automatic activation unit 113, correction result information (203) and correction word (204) created in the process. It is a thing.

なお、図７において（ａ）は「ふんこつさいしんどりょくします」と発声する１回目のケース（訂正・登録の前のケース）であり、（ｂ）は「ふんこつさいしんどりょくします」と発声する２回目のケース（訂正・登録の後のケース）である。 In FIG. 7, (a) is the first case in which the user utters "funkotsu saishin doryoku shimasu" (the case before correction and registration), and (b) is the second case in which the user utters "funkotsu saishin doryoku shimasu" (the case after correction and registration).

このとき、本音声認識システムは、次のような誤認識をすることになる（Ｓ３１）。
入力音声：「ふんこつさいしんどりょくします」
認識結果：「粉骨最新努力します。」
この誤認識を訂正するために、ユーザは訂正操作（Ａ）を実施する（第１の実施形態参照）。
At this time, the speech recognition system makes the following recognition error (S31).
Input speech: "funkotsu saishin doryoku shimasu"
Recognition result: "粉骨最新努力します。" ("saishin" is rendered as 最新, "latest", instead of the intended 砕身)
To correct this misrecognition, the user performs correction operation (A) (see the first embodiment).

この操作による訂正結果は、次のようになる（Ｓ３２）。
訂正結果：「粉骨砕身努力します。」
ユーザのこの訂正操作結果から、訂正結果生成部１０２は、「最新」という認識結果の位置の文字列を「砕身」に訂正したことを検出し、訂正結果情報２０３として「訂正位置：認識結果中の『最新』、訂正結果文字列：砕身」を出力する。
The result of this correction operation is as follows (S32).
Correction result: "粉骨砕身努力します。"
From this correction operation, the correction result generation unit 102 detects that the character string "最新" in the recognition result has been corrected to "砕身", and outputs, as correction result information 203, "correction position: '最新' in the recognition result; corrected character string: 砕身".
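The detection step described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the correction is recovered by comparing the recognition result with the corrected text; the description derives it from the user's correction operations, so the diff-based approach and the function name `detect_corrections` are hypothetical.

```python
from difflib import SequenceMatcher


def detect_corrections(recognized: str, corrected: str):
    """Return (position, original, replacement) for each span that differs.

    Assumption: a character-level diff stands in for the correction
    result generation unit 102; this is not the method in the description.
    """
    matcher = SequenceMatcher(None, recognized, corrected, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # position is the character offset in the recognition result
            changes.append((i1, recognized[i1:i2], corrected[j1:j2]))
    return changes


print(detect_corrections("粉骨最新努力します。", "粉骨砕身努力します。"))
# → [(2, '最新', '砕身')]
```

The single tuple corresponds to the correction result information 203: the corrected position in the recognition result and the corrected character string.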

そして、訂正単語生成部１０３は、入力音声と訂正結果生成部１０２が生成した訂正結果情報２０３から、「入力音声の訂正位置に該当する発声列：さいしん」と「訂正結果文字列：砕身」とを対応付けて、「表記：砕身、読み：さいしん」という訂正単語２０４を生成する。 Then, from the input speech and the correction result information 203 generated by the correction result generation unit 102, the correction word generation unit 103 associates "utterance string at the corrected position of the input speech: saishin" with "corrected character string: 砕身" and generates the correction word 204 "notation: 砕身, reading: さいしん (saishin)".

ユーザ単語辞書登録自動起動部１１３は、この訂正単語をユーザ単語２０８としてユーザ単語辞書１１５に登録する（Ｓ３４）。ここで、ユーザ単語辞書に登録する前にユーザに確認画面を出し、登録内容を修正できるようにすることも可能である（Ｓ３３）。 The user word dictionary registration automatic activation unit 113 registers the corrected word as the user word 208 in the user word dictionary 115 (S34). Here, before registering in the user word dictionary, a confirmation screen may be displayed to the user so that the registered content can be corrected (S33).

ユーザ単語辞書併用型音声認識部１１４は、このユーザ単語が登録されたユーザ単語辞書１１５と認識語彙辞書１０７とを併用して音声認識する。この結果、ユーザが次回「ふんこつさいしん」と入力したときに「さいしん」の入力部分でユーザ単語辞書が参照されることにより、「砕身」という表記が表示される。 The user word dictionary combined speech recognition unit 114 performs speech recognition using both the user word dictionary 115, in which this user word is registered, and the recognition vocabulary dictionary 107. As a result, the next time the user inputs "funkotsu saishin", the user word dictionary is consulted for the "saishin" portion and the notation "砕身" is displayed.

従来の方法では、ユーザが「砕身：さいしん」というユーザ単語を辞書登録するためには、「表記：砕身」「読み：さいしん」「品詞：名詞」をすべて指定しなければならなかった。本実施形態によれば、ユーザの訂正操作の内容からユーザ単語登録に必要な情報を自動的に抽出することができるため、簡便にユーザ単語を登録することが可能になる。これにより、ユーザは「表記」「読み」「品詞」をすべて１から入力し直す手間から解放され、通常のユーザ単語登録を実施する場合と比較して、ユーザ単語登録の煩わしさが大幅に低減される。 In the conventional method, to register the user word "砕身: saishin" in the dictionary, the user had to specify all of "notation: 砕身", "reading: さいしん", and "part of speech: noun". According to this embodiment, the information needed for user word registration is extracted automatically from the user's correction operation, so user words can be registered easily. The user is freed from entering the notation, reading, and part of speech from scratch, and the burden of user word registration is greatly reduced compared with ordinary user word registration.
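The registration flow of this embodiment can be sketched in Python as follows. The sketch is illustrative only: the names `WordEntry`, `prefill_user_word`, and `register`, the dictionary layout (reading mapped to a list of entries), and the `confirm` callback standing in for the confirmation screen are all assumptions, not part of the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class WordEntry:
    surface: str  # notation (表記)
    reading: str  # reading (読み)
    pos: str      # part of speech (品詞)


def prefill_user_word(surface: str, reading: str,
                      original_pos: Optional[str] = None) -> WordEntry:
    """Build a registration candidate from a correction word.

    Reuses the part of speech of the corrected word when it is known,
    otherwise falls back to a uniform "名詞" (noun), the two options
    mentioned in the text.
    """
    return WordEntry(surface, reading, original_pos or "名詞")


def register(user_dict: Dict[str, List[WordEntry]], entry: WordEntry,
             confirm: Callable[[WordEntry], Optional[WordEntry]] = lambda e: e
             ) -> Optional[WordEntry]:
    """Show the candidate to the user (hypothetical callback) and store it.

    `confirm` may return an edited entry, or None to cancel registration.
    """
    reviewed = confirm(entry)
    if reviewed is not None:
        user_dict.setdefault(reviewed.reading, []).append(reviewed)
    return reviewed


user_dict: Dict[str, List[WordEntry]] = {}
register(user_dict, prefill_user_word("砕身", "さいしん"))
print(user_dict["さいしん"][0])
```

Keying the dictionary by reading is a design choice of the sketch: it makes the later lookup by recognized kana string a single dictionary access.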

（第４の実施形態） (Fourth embodiment)
図８に、本発明の第４の実施形態に係る音声認識システム（自然言語処理システム）の構成例を示す。 FIG. 8 shows a configuration example of a speech recognition system (natural language processing system) according to the fourth embodiment of the present invention.

図８に示されるように、本音声認識システムは、音声入力部１０１、訂正結果生成部１０２、訂正単語生成部１０３、訂正単語辞書登録部１０４、訂正単語辞書１０６、認識語彙辞書１０７、前後関係抽出部１２０、訂正単語前後関係表登録部１２１、訂正単語辞書及び前後関係表併用型音声認識部１２２、訂正単語前後関係表１２３を備えている。なお、図１と同様の部分には同じ符号を付してある。 As shown in FIG. 8, the speech recognition system includes a speech input unit 101, a correction result generation unit 102, a correction word generation unit 103, a correction word dictionary registration unit 104, a correction word dictionary 106, a recognition vocabulary dictionary 107, a context extraction unit 120, a correction word context table registration unit 121, a correction word dictionary and context table combined speech recognition unit 122, and a correction word context table 123. Parts that are the same as in FIG. 1 are given the same reference numerals.

訂正単語辞書及び前後関係表併用型音声認識部１２２は、後述するように、音声認識結果（２０１）を生成する。ここでは、音声認識結果は、仮名漢字混じり文である。 The corrected word dictionary and the context table combined speech recognition unit 122 generates a speech recognition result (201) as described later. Here, the speech recognition result is a sentence mixed with kana and kanji.

前後関係抽出部１２０は、訂正単語辞書及び前後関係表併用型音声認識部１２２が出力した音声認識結果（２０１）と、音声認識結果に対してユーザが実施した訂正操作（２０２）とから、訂正範囲を特定し、訂正箇所の前後関係の情報（２２０）を抽出する。 The context extraction unit 120 corrects the speech recognition result (201) output from the corrected word dictionary and context table combined speech recognition unit 122 and the correction operation (202) performed by the user on the speech recognition result. The range is specified, and the contextual information (220) of the correction part is extracted.

訂正単語前後関係表登録部１２１は、訂正単語辞書登録部１０４が登録した訂正単語２０４と前後関係抽出部１２０が生成した訂正箇所の前後関係の情報（２２０）とを組にして訂正単語の前後関係の情報（２２１）を生成し、訂正単語前後関係表１２３に登録する。 The correction word context table registration unit 121 sets the correction word 204 registered by the correction word dictionary registration unit 104 and the context information (220) of the correction portion generated by the context extraction unit 120 as a pair before and after the correction word. Relation information (221) is generated and registered in the corrected word context table 123.

訂正単語辞書及び前後関係表併用型音声認識部１２２は、訂正単語辞書１０６及び訂正単語前後関係表１２３と認識語彙辞書１０７とを併用して音声認識する（入力音声２００を音声認識して仮名文字列を生成し、該仮名文字列をもとに仮名漢字混じり文字列２０１を生成する）。 The corrected word dictionary and context table combined speech recognition unit 122 recognizes speech by using the corrected word dictionary 106 and the corrected word context table 123 and the recognition vocabulary dictionary 107 together (recognizing the input speech 200 and performing kana characters A string is generated, and a kana-kanji mixed character string 201 is generated based on the kana character string).

第１の実施形態では、訂正単語の読みが認識語彙辞書の単語と重複する場合がある。例えば、この場合に常に訂正単語を優先させる方法を採用すると、一例として「最新の部署では粉骨砕身努力します。」という文章を入力するために、「さいしんのぶしょではふんこつさいしんどりょくします」と発声したとき、「さいしん」のところで必ず訂正単語が採用され、「砕身の部署では粉骨砕身努力します。」と誤認識してしまう、というようなケースが生じ得る。 In the first embodiment, the reading of a correction word may coincide with that of a word in the recognition vocabulary dictionary. If, for example, the correction word is always given priority in that case, then when the user utters "saishin no busho de wa funkotsu saishin doryoku shimasu" in order to input the sentence "最新の部署では粉骨砕身努力します。", the correction word is adopted at every "saishin", and the sentence can be misrecognized as "砕身の部署では粉骨砕身努力します。".

そこで、本実施形態では、訂正単語の辞書登録時に訂正単語の前後関係を抽出し、訂正単語前後関係表として管理する。そして、入力音声が訂正単語と認識語彙との双方の読みと一致したときは、訂正単語の前後の単語と訂正単語前後関係表とを比較して訂正単語、認識語彙のいずれか適切な方を選択する。 Therefore, in the present embodiment, the correction word context is extracted and registered as a correction word context table when the correction word dictionary is registered. When the input speech matches the readings of both the corrected word and the recognized vocabulary, the words before and after the corrected word are compared with the corrected word context table to determine which one of the corrected word and the recognized vocabulary is appropriate. select.

以下では、図９を参照しながら具体例を用いて本実施形態につき説明する。 Hereinafter, the present embodiment will be described using a specific example with reference to FIG.

図９は、前後関係抽出部１２０、訂正単語前後関係表登録部１２１、訂正単語及び前後関係表併用型音声認識部１２２の動作とその過程で作成される訂正箇所の前後関係の情報（２２０）、訂正単語の前後関係の情報（２２１）の具体例を示したものである。 FIG. 9 shows the operation of the context extraction unit 120, the correction word context table registration unit 121, the correction word and context table combined speech recognition unit 122, and the context information (220) of the correction part created in the process. The specific example of the contextual information (221) of a correction word is shown.

なお、図９において（ａ）は「ふんこつさいしんどりょくします」と発声する１回目のケース（訂正・登録の前のケース）であり、（ｂ）は「ふんこつさいしんどりょくします」と発声する２回目のケース（訂正・登録の後のケース）である。 In FIG. 9, (a) is the first case in which the user utters "funkotsu saishin doryoku shimasu" (the case before correction and registration), and (b) is the second case in which the user utters "funkotsu saishin doryoku shimasu" (the case after correction and registration).

このとき、本音声認識システムは、次のような誤認識をすることになる（Ｓ４１）。
入力音声：「ふんこつさいしんどりょくします」
認識結果：「粉骨最新努力します。」
この誤認識を訂正するために、ユーザは訂正操作（Ａ）を実施する（第１の実施形態参照）。
At this time, the speech recognition system makes the following recognition error (S41).
Input speech: "funkotsu saishin doryoku shimasu"
Recognition result: "粉骨最新努力します。"
To correct this misrecognition, the user performs correction operation (A) (see the first embodiment).

この操作による訂正結果は、次のようになる（Ｓ４２）。
訂正結果：「粉骨砕身努力します。」
ユーザのこの訂正操作結果から、訂正結果生成部１０２は「最新」という認識結果の位置の文字列を「砕身」に訂正したことを検出し、訂正結果情報２０３として「訂正位置：認識結果中の『最新』、訂正結果文字列：砕身」を出力する。
The result of this correction operation is as follows (S42).
Correction result: "粉骨砕身努力します。"
From this correction operation, the correction result generation unit 102 detects that the character string "最新" in the recognition result has been corrected to "砕身", and outputs, as correction result information 203, "correction position: '最新' in the recognition result; corrected character string: 砕身".

これと同時に、ユーザの訂正操作２０２と音声認識結果２０１から、前後関係抽出部１２０は、「最新」という認識結果の位置の文字列に対して訂正操作（Ａ）を実施したことを検出し、その操作箇所の前後の単語として「粉骨」「努力」を検出し、訂正箇所の前後関係２２０として「訂正位置：前＝『粉骨』、後＝『努力』」を出力する。 At the same time, from the user's correction operation 202 and the speech recognition result 201, the context extraction unit 120 detects that correction operation (A) was performed on the character string "最新" in the recognition result, detects "粉骨" and "努力" as the words before and after the corrected portion, and outputs "correction position: before = '粉骨', after = '努力'" as the context 220 of the corrected portion.
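As a rough illustration of the context extraction just described, the following Python sketch takes a word-segmented recognition result and the index of the corrected word and returns the neighbouring words. The function name `extract_context`, the `width` parameter, and the assumption that the recognizer supplies a word segmentation are hypothetical.

```python
from typing import Sequence, Tuple


def extract_context(words: Sequence[str], index: int,
                    width: int = 1) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Return the `width` words before and after the corrected word.

    Assumption: `words` is the recognition result already split into
    words, and `index` points at the word the user corrected.
    """
    before = tuple(words[max(0, index - width):index])
    after = tuple(words[index + 1:index + 1 + width])
    return before, after


recognized = ["粉骨", "最新", "努力", "します", "。"]
print(extract_context(recognized, 1))
# → (('粉骨',), ('努力',))
```

A `width` larger than 1 would cover the variation, mentioned for this embodiment, of keeping longer word strings as context.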

そして、訂正単語生成部１０３は、入力音声と訂正結果生成部１０２が生成した訂正結果情報２０３から、「入力音声の訂正位置に該当する発声列：さいしん」と「訂正結果文字列：砕身」とを対応付けて、「表記：砕身、読み：さいしん」という訂正単語２０４を生成する。訂正単語辞書登録部１０４は、この訂正単語を訂正単語辞書１０６に登録する（Ｓ４３）。 Then, the correction word generation unit 103 determines from the input speech and the correction result information 203 generated by the correction result generation unit 102 that “the utterance string corresponding to the correction position of the input speech: Saishin” and “correction result character string: shatter”. Are associated with each other to generate a correction word 204 of “notation: shattered, reading: saishin”. The corrected word dictionary registration unit 104 registers this corrected word in the corrected word dictionary 106 (S43).

一方、訂正単語前後関係表登録部１２１は、訂正単語辞書登録部１０４が登録した訂正単語２０４の「表記：砕身、読み：さいしん」と訂正箇所の前後関係２２０の「訂正位置：前＝『粉骨』、後＝『努力』」から、「表記：砕身、読み：さいしん」：前＝『粉骨』、後＝『努力』という訂正単語前後関係２２１を生成し、訂正単語前後関係表１２３に登録する（Ｓ４４）。 On the other hand, the correction word context table registration unit 121 sets “notation: shatter, reading: saishin” of the correction word 204 registered by the correction word dictionary registration unit 104 and “correction position: previous =“ From the word “powder” and after = “effort” ”, a corrected word context 221 of“ notation: shattered, reading: saishin ”: front =“ powder ”, back =“ effort ”is generated, and the correction word context table. 123 is registered (S44).

訂正単語辞書及び前後関係表併用型音声認識部１２２は、この訂正単語辞書１０６と訂正単語前後関係表１２３と認識語彙辞書１０７とを併用して音声認識する。この結果、ユーザが次回「さいしんのぶしょではふんこつさいしんどりょくします」と入力したとき、１番目の「さいしん」の部分では、前後関係が訂正単語登録時と異なるために、訂正単語辞書ではなく従来通り認識語彙辞書が参照されることになり、「最新」という表記が表示される（Ｓ４５，Ｓ４６）。一方、２番目の「さいしん」の部分では、前後の単語が訂正単語前後関係表の単語と一致しているため、訂正単語辞書の方が参照されることになり、「砕身」という表記が表示される（Ｓ４５，Ｓ４６）。この結果、「最新の部署では粉骨砕身努力します。」と正しく認識される（Ｓ４７）。 The correction word dictionary and context table combined speech recognition unit 122 performs speech recognition using the correction word dictionary 106, the correction word context table 123, and the recognition vocabulary dictionary 107 together. As a result, the next time the user inputs "saishin no busho de wa funkotsu saishin doryoku shimasu", the context of the first "saishin" differs from the context at the time the correction word was registered, so the recognition vocabulary dictionary is consulted as usual instead of the correction word dictionary, and the notation "最新" is displayed (S45, S46). For the second "saishin", the preceding and following words match those in the correction word context table, so the correction word dictionary is consulted and the notation "砕身" is displayed (S45, S46). The sentence is thus correctly recognized as "最新の部署では粉骨砕身努力します。" (S47).

このように、訂正単語辞書を使用する場合に登録時の訂正単語の前後関係を考慮することにより、適切な箇所にだけ訂正単語を当てはめることができるようになる。 As described above, when the correction word dictionary is used, the correction word can be applied only to an appropriate place by considering the context of the correction word at the time of registration.
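The selection between the correction word and the recognition vocabulary can be sketched in Python as below. This is a minimal sketch under stated assumptions: the table layout (reading mapped to tuples of corrected notation, preceding word, following word), the exact-match rule, and the names `choose_surface` and `render` are illustrative, and the input is assumed to be already segmented into (reading, vocabulary notation) pairs.

```python
from typing import Dict, List, Optional, Tuple

# reading -> [(corrected notation, word before, word after)]
context_table: Dict[str, List[Tuple[str, str, str]]] = {
    "さいしん": [("砕身", "粉骨", "努力")],
}


def choose_surface(reading: str, prev_word: Optional[str],
                   next_word: Optional[str], vocabulary_surface: str,
                   table: Dict[str, List[Tuple[str, str, str]]]) -> str:
    """Use the correction word only when both neighbours match the table."""
    for surface, before, after in table.get(reading, []):
        if before == prev_word and after == next_word:
            return surface
    return vocabulary_surface


def render(tokens: List[Tuple[str, str]],
           table: Dict[str, List[Tuple[str, str, str]]]) -> str:
    """tokens: (reading, notation from the recognition vocabulary) pairs."""
    surfaces = [s for _, s in tokens]
    out = []
    for i, (reading, surface) in enumerate(tokens):
        prev_word = surfaces[i - 1] if i > 0 else None
        next_word = surfaces[i + 1] if i + 1 < len(surfaces) else None
        out.append(choose_surface(reading, prev_word, next_word, surface, table))
    return "".join(out)


tokens = [("さいしん", "最新"), ("の", "の"), ("ぶしょ", "部署"), ("では", "では"),
          ("ふんこつ", "粉骨"), ("さいしん", "最新"), ("どりょく", "努力"),
          ("します", "します")]
print(render(tokens, context_table))
# → 最新の部署では粉骨砕身努力します
```

Only the second "さいしん" has 粉骨 before it and 努力 after it, so only that occurrence takes the correction word.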

なお、本実施形態では、前後関係として訂正単語の前、後ろを使用したが、どちらか一方だけを使用するようにしても構わない。また、本実施形態では、前後関係として使用する範囲が前、後ろともに１単語であったが、これも２単語以上のより長い単語列を使用しても構わない。また、本実施形態では、前後関係として保持する対象が単語になっているが、単語の代わりに品詞等の単語に付随した情報を使用しても構わない。また、前後関係表との一致度を判定する方法として、前後関係表中に訂正単語の前後の単語が存在するか否かだけでなく、確率値を使用する判定法でも構わない。 In the present embodiment, the front and rear of the correction word are used as the context, but only one of them may be used. In the present embodiment, the range used as the context is one word for both the front and rear, but a longer word string of two or more words may also be used. In the present embodiment, the word to be stored as the context is a word, but information attached to a word such as a part of speech may be used instead of the word. In addition, as a method for determining the degree of coincidence with the context table, not only whether or not the words before and after the correction word exist in the context table, but also a determination method using a probability value may be used.

（第５の実施形態） (Fifth embodiment)
本発明の第５の実施形態は、第４の実施形態の前後関係の情報を考慮する構成を、第２の実施形態に適用したものである。第４の実施形態で示した、前後関係の情報を考慮する構成に関する効果やバリエーションは、本実施形態にも妥当する。 In the fifth embodiment of the present invention, the configuration of the fourth embodiment that takes context information into account is applied to the second embodiment. The effects and variations described in the fourth embodiment for that configuration also apply to this embodiment.

図１０に、本発明の第５の実施形態に係る音声認識システム（自然言語処理システム）の構成例を示す。 FIG. 10 shows a configuration example of a speech recognition system (natural language processing system) according to the fifth embodiment of the present invention.

図１０に示されるように、本音声認識システムは、音声入力部１０１、認識語彙辞書１０７、訂正手順生成部１０８、訂正マクロ生成部１０９、訂正マクロ辞書登録部１１０、訂正マクロ辞書１１２、前後関係抽出部１２０、訂正マクロ前後関係表登録部１２４、訂正マクロ辞書及び前後関係表併用型音声認識部１２５、訂正マクロ前後関係表１２６を備えている。なお、図４と同様の部分には同じ符号を付してある。 As shown in FIG. 10, the speech recognition system includes a speech input unit 101, a recognition vocabulary dictionary 107, a correction procedure generation unit 108, a correction macro generation unit 109, a correction macro dictionary registration unit 110, a correction macro dictionary 112, a context extraction unit 120, a correction macro context table registration unit 124, a correction macro dictionary and context table combined speech recognition unit 125, and a correction macro context table 126. Parts that are the same as in FIG. 4 are given the same reference numerals.

以下では、第２の実施形態と相違する点を中心に説明する。 Below, it demonstrates centering on the point which is different from 2nd Embodiment.

訂正マクロ辞書及び前後関係表併用型音声認識部１２５は、後述するように、音声認識結果（２０１）を生成する。ここでは、音声認識結果は、仮名漢字混じり文である。 The corrected macro dictionary and the context table combined speech recognition unit 125 generates a speech recognition result (201) as described later. Here, the speech recognition result is a sentence mixed with kana and kanji.

前後関係抽出部１２０は、訂正マクロ辞書及び前後関係表併用型音声認識部１２５が出力した音声認識結果（２０１）と、音声認識結果に対してユーザが実施した訂正操作（２０２）とから、訂正範囲を特定し、訂正箇所の前後関係の情報（２２０）を抽出する。 The context extraction unit 120 corrects from the speech recognition result (201) output by the correction macro dictionary and the context table combined speech recognition unit 125 and the correction operation (202) performed by the user on the speech recognition result. The range is specified, and the contextual information (220) of the correction part is extracted.

訂正マクロ前後関係表登録部１２４は、訂正マクロ辞書登録部１１０が登録した訂正マクロ（２０７）と前後関係抽出部１２０が生成した訂正箇所の前後関係の情報（２２０）とを組にして訂正マクロの前後関係の情報（２２２）を生成し、訂正マクロ前後関係表１２６に登録する。 The correction macro context table registration unit 124 sets the correction macro (207) registered by the correction macro dictionary registration unit 110 and the context information (220) of the correction part generated by the context extraction unit 120 as a pair. The context information (222) is generated and registered in the correction macro context table 126.

訂正マクロ辞書及び前後関係表併用型音声認識部１２５は、訂正マクロ辞書１１２及び訂正マクロ前後関係表１２６と認識語彙辞書１０７とを併用して音声認識する（入力音声２００を音声認識して仮名文字列を生成し、該仮名文字列をもとに仮名漢字混じり文字列２０１を生成する）。 The corrected macro dictionary and context table combined speech recognition unit 125 recognizes speech by using the correction macro dictionary 112 and the corrected macro context table 126 and the recognition vocabulary dictionary 107 together (speech recognition of the input speech 200 and kana characters). A string is generated, and a kana-kanji mixed character string 201 is generated based on the kana character string).

以下では、図１１を参照しながら具体例を用いて本実施形態につき説明する。 Hereinafter, the present embodiment will be described using a specific example with reference to FIG.

図１１は、前後関係抽出部１２０、訂正マクロ前後関係表登録部１２４、訂正マクロ及び前後関係表併用型音声認識部１２５の動作とその過程で作成される訂正箇所の前後関係の情報（２２０）、訂正マクロの前後関係の情報（２２２）の具体例を示したものである。 FIG. 11 shows the operation of the context extraction unit 120, the correction macro context table registration unit 124, the correction macro and context table combined speech recognition unit 125, and the context information (220) of the correction part created in the process. The specific example of the information (222) of the context of the correction macro is shown.

なお、図１１において（ａ）は「ふんこつさいしんどりょくします」と発声する１回目のケース（訂正・登録の前のケース）であり、（ｂ）は「ふんこつさいしんどりょくします」と発声する２回目のケース（訂正・登録の後のケース）である。 In FIG. 11, (a) is the first case in which the user utters "funkotsu saishin doryoku shimasu" (the case before correction and registration), and (b) is the second case in which the user utters "funkotsu saishin doryoku shimasu" (the case after correction and registration).

このとき、本音声認識システムは、次のような誤認識をすることになる（Ｓ５１）。
入力音声：「ふんこつさいしんどりょくします」
認識結果：「粉骨最新努力します。」
この誤認識を訂正するために、ユーザは訂正操作（Ａ）を実施する（第１の実施形態参照）。
At this time, the speech recognition system makes the following recognition error (S51).
Input speech: "funkotsu saishin doryoku shimasu"
Recognition result: "粉骨最新努力します。"
To correct this misrecognition, the user performs correction operation (A) (see the first embodiment).

この操作による訂正結果は次のようになる（Ｓ５２）。
訂正結果：「粉骨砕身努力します。」
ユーザのこの訂正操作結果から、訂正手順生成部１０８は「最新」という認識結果の位置の文字列に対して訂正操作（Ａ）を実施したことを検出し、訂正手順２０６として「訂正位置：認識結果中の『最新』、訂正操作：「最新」を削除→「くだくしんたい」と発声→「く」「体」を削除」を出力する。
The result of this correction operation is as follows (S52).
Correction result: "粉骨砕身努力します。"
From this correction operation, the correction procedure generation unit 108 detects that correction operation (A) was performed on the character string "最新" in the recognition result, and outputs, as correction procedure 206, "correction position: '最新' in the recognition result; correction operation: delete '最新' → utter 'kudaku shintai' → delete 'く' and '体'".

これと同時に、ユーザの訂正操作２０２と音声認識結果２０１とから前後関係抽出部１２０は、「最新」という認識結果の位置の文字列に対して訂正操作（Ａ）を実施したことを検出し、その操作箇所の前後の単語として「粉骨」「努力」を検出し、訂正箇所の前後関係２２０として「訂正位置：前＝『粉骨』、後＝『努力』」を出力する。 At the same time, from the user's correction operation 202 and the speech recognition result 201, the context extraction unit 120 detects that correction operation (A) was performed on the character string "最新" in the recognition result, detects "粉骨" and "努力" as the words before and after the corrected portion, and outputs "correction position: before = '粉骨', after = '努力'" as the context 220 of the corrected portion.

そして、訂正マクロ生成部１０９は、入力音声と訂正手順生成部１０８が生成した訂正手順２０６から、「入力音声の訂正位置に該当する発声列：さいしん」と「訂正操作：「最新」を削除→「くだくしんたい」と発声→「く」「体」を削除」とを対応付けて、「操作：「最新」を削除→「くだくしんたい」と発声→「く」「体」を削除、読み：さいしん」という訂正マクロ２０７を生成する。訂正マクロ辞書登録部１１０は、この訂正マクロを訂正マクロ辞書１１２に登録する（Ｓ５３）。 Then, from the input speech and the correction procedure 206 generated by the correction procedure generation unit 108, the correction macro generation unit 109 associates "utterance string at the corrected position of the input speech: saishin" with "correction operation: delete '最新' → utter 'kudaku shintai' → delete 'く' and '体'" and generates the correction macro 207 "operation: delete '最新' → utter 'kudaku shintai' → delete 'く' and '体'; reading: さいしん (saishin)". The correction macro dictionary registration unit 110 registers this correction macro in the correction macro dictionary 112 (S53).

一方、訂正マクロ前後関係表登録部１２４は、訂正マクロ辞書登録部１１０が登録した訂正マクロ２０７の「操作：「最新」を削除→「くだくしんたい」と発声→「く」「体」を削除、読み：さいしん」と訂正箇所の前後関係２２０の「訂正位置：前＝『粉骨』、後＝『努力』」から、「操作：「最新」を削除→「くだくしんたい」と発声→「く」「体」を削除、読み：さいしん」：前＝『粉骨』、後＝『努力』という訂正マクロ前後関係２２２を生成し、訂正マクロ前後関係表１２６に登録する（Ｓ５４）。 Meanwhile, the correction macro context table registration unit 124 combines the correction macro 207 registered by the correction macro dictionary registration unit 110, "operation: delete '最新' → utter 'kudaku shintai' → delete 'く' and '体'; reading: さいしん (saishin)", with the context 220 of the corrected portion, "correction position: before = '粉骨', after = '努力'", to generate the correction macro context 222, which pairs that macro with before = '粉骨' and after = '努力', and registers it in the correction macro context table 126 (S54).

訂正マクロ辞書及び前後関係表併用型音声認識部１２５は、この訂正マクロ辞書１１２と訂正マクロ前後関係表１２６と認識語彙辞書１０７を併用して音声認識する。その結果、次回、ユーザが「さいしんのぶしょではふんこつさいしんどりょくします」と入力したとき、１番目の「さいしん」の部分では、前後関係が訂正マクロ登録時と異なるために、訂正マクロ辞書ではなく従来通り認識語彙辞書が参照されることになり、「最新」という表記が表示される（Ｓ５５，Ｓ５６）。一方、２番目の「さいしん」の部分では、前後の単語が訂正マクロ前後関係表の単語と一致しているため、訂正マクロ辞書の方が参照されることになり、『「最新」を削除→「くだくしんたい」と発声→「く」「体」を削除』という訂正操作が自動実行され、最終的に「砕身」という表記が表示される（Ｓ５５，Ｓ５６）。この結果、「最新の部署では粉骨砕身努力します。」と正しく認識される（Ｓ５７）。 The correction macro dictionary and context table combined speech recognition unit 125 performs speech recognition using the correction macro dictionary 112, the correction macro context table 126, and the recognition vocabulary dictionary 107 together. As a result, the next time the user inputs "saishin no busho de wa funkotsu saishin doryoku shimasu", the context of the first "saishin" differs from the context at the time the correction macro was registered, so the recognition vocabulary dictionary is consulted as usual instead of the correction macro dictionary, and the notation "最新" is displayed (S55, S56). For the second "saishin", the preceding and following words match those in the correction macro context table, so the correction macro dictionary is consulted, the correction operation "delete '最新' → utter 'kudaku shintai' → delete 'く' and '体'" is executed automatically, and the notation "砕身" is finally displayed (S55, S56). The sentence is thus correctly recognized as "最新の部署では粉骨砕身努力します。" (S57).

このように、訂正マクロ辞書を使用する場合に登録時の訂正マクロの前後関係を考慮することにより、適切な箇所にだけ訂正マクロを当てはめることができるようになる。 As described above, when the correction macro dictionary is used, the correction macro can be applied only to an appropriate portion by considering the context of the correction macro at the time of registration.
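Replaying a stored correction macro can be sketched in Python as follows. The representation is an assumption made for illustration: each macro step is an (operation, argument) pair, and the re-utterance "くだくしんたい" is stored together with its assumed recognition result "砕く身体" rather than being passed through the recognizer again. The function name `replay_macro` and the operation names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical encoding of the macro
# 「最新」を削除 → 「くだくしんたい」と発声 → 「く」「体」を削除.
# The "utter" step carries the assumed recognition result of the re-utterance.
macro: List[Tuple[str, str]] = [
    ("delete", "最新"),
    ("utter", "砕く身体"),
    ("delete", "く"),
    ("delete", "体"),
]


def replay_macro(span: str, operations: List[Tuple[str, str]]) -> str:
    """Apply the recorded correction steps to the misrecognized span."""
    text = span
    for op, arg in operations:
        if op == "delete":
            # remove the first occurrence of the recorded string
            text = text.replace(arg, "", 1)
        elif op == "utter":
            # append the text produced by the recorded re-utterance
            text += arg
        else:
            raise ValueError(f"unknown operation: {op}")
    return text


print(replay_macro("最新", macro))
# → 砕身
```

In a fuller sketch the macro would be looked up by reading and applied only where the context matches, in the same way as the context check for correction words.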

（第６の実施形態） (Sixth embodiment)
本発明の第６の実施形態は、第４の実施形態の前後関係の情報を考慮する構成を、第３の実施形態に適用したものである。第４の実施形態で示した、前後関係の情報を考慮する構成に関する効果やバリエーションは、本実施形態にも妥当する。 In the sixth embodiment of the present invention, the configuration of the fourth embodiment that takes context information into account is applied to the third embodiment. The effects and variations described in the fourth embodiment for that configuration also apply to this embodiment.

図１２に、本発明の第６の実施形態に係る音声認識システム（自然言語処理システム）の構成例を示す。 FIG. 12 shows a configuration example of a speech recognition system (natural language processing system) according to the sixth embodiment of the present invention.

図１２に示されるように、本音声認識システムは、音声入力部１０１、訂正結果生成部１０２、訂正単語生成部１０３、認識語彙辞書１０７、ユーザ単語辞書登録自動起動部１１３、ユーザ辞書１１５、前後関係抽出部１２０、ユーザ単語前後関係表登録部１２７、ユーザ単語辞書及び前後関係表併用型音声認識部１２８、ユーザ単語前後関係表１２９を備えている。なお、図６と同様の部分には同じ符号を付してある。 As shown in FIG. 12, the speech recognition system includes a speech input unit 101, a correction result generation unit 102, a correction word generation unit 103, a recognition vocabulary dictionary 107, a user word dictionary registration automatic activation unit 113, a user dictionary 115, a context extraction unit 120, a user word context table registration unit 127, a user word dictionary and context table combined speech recognition unit 128, and a user word context table 129. Parts that are the same as in FIG. 6 are given the same reference numerals.

以下では、第３の実施形態と相違する点を中心に説明する。 Below, it demonstrates focusing on the point which is different from 3rd Embodiment.

ユーザ単語辞書及び前後関係表併用型音声認識部１２８は、後述するように、音声認識結果（２０１）を生成する。ここでは、音声認識結果は、仮名漢字混じり文である。 The user word dictionary and the context table combined speech recognition unit 128 generates a speech recognition result (201) as described later. Here, the speech recognition result is a sentence mixed with kana and kanji.

前後関係抽出部１２０は、ユーザ単語辞書及び前後関係表併用型音声認識部１２８が出力した音声認識結果（２０１）と、音声認識結果に対してユーザが実施した訂正操作（２０２）とから、訂正範囲を特定し、訂正箇所の前後関係の情報（２２０）を抽出する。 The context extraction unit 120 corrects the speech recognition result (201) output from the user word dictionary and context table combined speech recognition unit 128 and the correction operation (202) performed by the user on the speech recognition result. The range is specified, and the contextual information (220) of the correction part is extracted.

ユーザ単語前後関係表登録部１２７は、ユーザ単語辞書登録自動起動部１１３が登録したユーザ単語（２０８）と前後関係抽出部１２０が生成した訂正箇所の前後関係の情報（２２０）とを組にしてユーザ単語の前後関係の情報（２２３）を生成し、ユーザ単語前後関係表１２９に登録する。 The user word context table registration unit 127 pairs the user word (208) registered by the user word dictionary registration automatic activation unit 113 with the context information (220) of the corrected portion generated by the context extraction unit 120 to generate the context information (223) of the user word, and registers it in the user word context table 129.

ユーザ単語辞書及び前後関係表併用型音声認識部１２８は、ユーザ単語辞書１１５及びユーザ単語前後関係表１２９と認識語彙辞書１０７と併用して音声認識する（入力音声２００を音声認識して仮名文字列を生成し、該仮名文字列をもとに仮名漢字混じり文字列２０１を生成する）。 The user word dictionary and context table combined speech recognition unit 128 recognizes speech using the user word dictionary 115 and user word context table 129 and the recognition vocabulary dictionary 107 (recognizes the input speech 200 and recognizes the kana character string). And a character string 201 mixed with kana-kanji based on the kana character string).

以下では、図１３を参照しながら具体例を用いて本実施形態につき説明する。 Hereinafter, this embodiment will be described using a specific example with reference to FIG.

図１３は、前後関係抽出部１２０、ユーザ単語前後関係表登録部１２７、ユーザ単語及び前後関係表併用型音声認識部１２８の動作とその過程で作成される訂正箇所の前後関係（２２０）、ユーザ単語の前後関係の情報（２２３）の具体例を示したものである。 FIG. 13 shows the operation of the context extraction unit 120, the user word context table registration unit 127, the user word and context table combined speech recognition unit 128, and the context (220) of the correction points created in the process, The example of the information (223) of the context of a word is shown.

なお、図１３において（ａ）は「ふんこつさいしんどりょくします」と発声する１回目のケース（訂正・登録の前のケース）であり、（ｂ）は「ふんこつさいしんどりょくします」と発声する２回目のケース（訂正・登録の後のケース）である。 In FIG. 13, (a) is the first case in which the user utters "funkotsu saishin doryoku shimasu" (the case before correction and registration), and (b) is the second case in which the user utters "funkotsu saishin doryoku shimasu" (the case after correction and registration).

このとき、本音声認識システムは、次のような誤認識をすることになる（Ｓ６１）。 At this time, the voice recognition system performs the following erroneous recognition (S61).

入力音声：「ふんこつさいしんどりょくします」
認識結果：「粉骨最新努力します。」
この誤認識を訂正するために、ユーザは訂正操作（Ａ）を実施する（第１の実施形態参照）。
Input speech: "funkotsu saishin doryoku shimasu"
Recognition result: "粉骨最新努力します。"
To correct this misrecognition, the user performs correction operation (A) (see the first embodiment).

この操作による訂正結果は次のようになる（Ｓ６２）。
訂正結果：「粉骨砕身努力します。」
ユーザのこの訂正操作結果から、訂正結果生成部１０２は、「最新」という認識結果の位置の文字列を「砕身」に訂正したことを検出し、訂正結果情報２０３として「訂正位置：認識結果中の『最新』、訂正結果文字列：砕身」を出力する。
The result of this correction operation is as follows (S62).
Correction result: "粉骨砕身努力します。"
From this correction operation, the correction result generation unit 102 detects that the character string "最新" in the recognition result has been corrected to "砕身", and outputs, as correction result information 203, "correction position: '最新' in the recognition result; corrected character string: 砕身".

これと同時に、ユーザの訂正操作２０２と音声認識結果２０１から、前後関係抽出部１２０は「最新」という認識結果の位置の文字列に対して訂正操作（Ａ）を実施したことを検出し、その操作箇所の前後の単語として「粉骨」「努力」を検出し、訂正箇所の前後関係２２０として「訂正位置：前＝『粉骨』、後＝『努力』」を出力する。 At the same time, from the user's correction operation 202 and the speech recognition result 201, the context extraction unit 120 detects that correction operation (A) was performed on the character string "最新" in the recognition result, detects "粉骨" and "努力" as the words before and after the corrected portion, and outputs "correction position: before = '粉骨', after = '努力'" as the context 220 of the corrected portion.

そして、訂正単語生成部１０３は、入力音声と訂正結果生成部１０２が生成した訂正結果情報２０３から、「入力音声の訂正位置に該当する発声列：さいしん」と「訂正結果文字列：砕身」とを対応付けて、「表記：砕身、読み：さいしん」という訂正単語２０４を生成する。ユーザ単語辞書登録自動起動部１１３は、この訂正単語をユーザ単語２０８としてユーザ単語辞書１１５に登録する（Ｓ６４）。ここで、ユーザ単語辞書に登録する前にユーザに確認画面を出し、登録内容を修正できるようにすることも可能である（Ｓ６３）。 Then, the correction word generation unit 103 determines from the input speech and the correction result information 203 generated by the correction result generation unit 102 that “the utterance string corresponding to the correction position of the input speech: Saishin” and “correction result character string: shatter”. Are associated with each other to generate a correction word 204 of “notation: shattered, reading: saishin”. The user word dictionary registration automatic activation unit 113 registers the corrected word as the user word 208 in the user word dictionary 115 (S64). Here, before registering in the user word dictionary, a confirmation screen may be displayed to the user so that the registered contents can be corrected (S63).

Meanwhile, the user word context table registration unit 127 combines the user word 208 registered by the user word dictionary registration automatic activation unit 113 ("notation: 砕身, reading: さいしん") with the context 220 of the corrected location ("correction position: before = '粉骨', after = '努力'") to generate the user word context 223, "notation: 砕身, reading: さいしん: before = '粉骨', after = '努力'", and registers it in the user word context table 129 (S65).
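Context extraction and registration in the context table (S65) can be sketched as follows. The table layout, keyed by (notation, reading) and holding (before, after) pairs, is an assumption for illustration; the patent only states which pieces of information are associated.

```python
from typing import Dict, List, Optional, Tuple

Context = Tuple[Optional[str], Optional[str]]   # (word before, word after)
ContextTable = Dict[Tuple[str, str], List[Context]]


def extract_context(surfaces: List[str], index: int) -> Context:
    """Return the words immediately before and after position `index`."""
    before = surfaces[index - 1] if index > 0 else None
    after = surfaces[index + 1] if index + 1 < len(surfaces) else None
    return before, after


def register_context(table: ContextTable, notation: str, reading: str,
                     context: Context) -> None:
    """Store the context under the user word it belongs to (S65)."""
    table.setdefault((notation, reading), []).append(context)


# Assumed word sequence of the recognition result; index 5 is the corrected word.
surfaces = ["最新", "の", "部署", "では", "粉骨", "最新", "努力", "します"]
context = extract_context(surfaces, 5)

context_table: ContextTable = {}
register_context(context_table, "砕身", "さいしん", context)
```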

The user word dictionary and context table combined speech recognition unit 128 performs speech recognition using the user word dictionary 115, the user word context table 129, and the recognition vocabulary dictionary 107 together. As a result, the next time the user says "さいしんのぶしょではふんこつさいしんどりょくします" (saishin no busho de wa funkotsu saishin doryoku shimasu), the first "さいしん" has a context different from the one recorded when the user word was registered, so the recognition vocabulary dictionary is consulted as before instead of the user word dictionary, and the notation "最新" is displayed (S66, S67). For the second "さいしん", the preceding and following words match those in the user word context table, so the user word dictionary is consulted and the notation "砕身" is displayed (S66, S67). The utterance is thus correctly recognized as "最新の部署では粉骨砕身努力します。" ("In the newest department, I will spare no effort.") (S68).

In this way, taking the context at the time of registration into account when the user word dictionary is used allows a user word to be applied only where it is appropriate.
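The context-dependent choice between the two dictionaries (S66, S67) can be illustrated with a toy converter. This is a sketch under strong simplifying assumptions, not the recognizer of the patent: the input is taken to be already segmented into kana words, each dictionary is a plain mapping, and the function name `convert` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Context = Tuple[Optional[str], Optional[str]]


def convert(readings: List[str],
            vocabulary: Dict[str, str],
            user_words: Dict[str, List[str]],
            context_table: Dict[Tuple[str, str], List[Context]]) -> str:
    """Convert kana words to kana-kanji text, applying user words only in a registered context."""
    # First pass: recognition vocabulary dictionary only (unknown words stay in kana).
    surfaces = [vocabulary.get(r, r) for r in readings]
    result = list(surfaces)
    # Second pass: apply a user word only where its registered neighbours match.
    for i, reading in enumerate(readings):
        before = surfaces[i - 1] if i > 0 else None
        after = surfaces[i + 1] if i + 1 < len(surfaces) else None
        for notation in user_words.get(reading, []):
            if (before, after) in context_table.get((notation, reading), []):
                result[i] = notation
                break
    return "".join(result)


vocabulary = {"さいしん": "最新", "の": "の", "ぶしょ": "部署", "では": "では",
              "ふんこつ": "粉骨", "どりょく": "努力", "します": "します"}
user_words = {"さいしん": ["砕身"]}
context_table = {("砕身", "さいしん"): [("粉骨", "努力")]}

readings = ["さいしん", "の", "ぶしょ", "では", "ふんこつ", "さいしん", "どりょく", "します"]
text = convert(readings, vocabulary, user_words, context_table)
```

With an empty context table the same call falls back to the recognition vocabulary dictionary for both occurrences, which corresponds to the behaviour before any correction was registered.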

In the first to sixth embodiments, the voice input unit performs recognition processing on the voice data input by the user and outputs a kana character string, and each recognition unit generates a kana-kanji mixed character string from that kana character string. The present invention is also applicable when, instead, an input unit receives a kana character string from the user through a kana input device such as a keyboard or a so-called soft key, and each recognition unit generates a kana-kanji mixed character string from that kana character string. Voice input and kana character input may also be used together.

Each of the functions described above can also be realized by describing it as software and having it processed by a computer with an appropriate mechanism.
The present embodiment can also be implemented as a program for causing a computer to execute predetermined means, for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions. It can additionally be implemented as a computer-readable recording medium on which the program is recorded.

The present invention is not limited to the embodiments exactly as described above; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can be formed by appropriately combining the constituent elements disclosed in the above embodiments. For example, some constituent elements may be omitted from the full set shown in an embodiment. Constituent elements from different embodiments may also be combined as appropriate.

A diagram showing a configuration example of the speech recognition system according to the first embodiment of the present invention.
A flowchart showing an example of the processing procedure in the first to sixth embodiments of the present invention.
A diagram for explaining the operation of the correction result generation unit, the correction word generation unit, and the correction word dictionary registration unit of the speech recognition system according to the first embodiment of the present invention.
A diagram showing a configuration example of the speech recognition system according to the second embodiment of the present invention.
A diagram for explaining the operation of the correction procedure generation unit, the correction macro generation unit, and the correction macro dictionary registration unit of the same speech recognition system.
A diagram showing a configuration example of the speech recognition system according to the third embodiment of the present invention.
A diagram for explaining the operation of the correction result generation unit, the correction word generation unit, and the user word dictionary registration automatic activation unit of the same speech recognition system.
A diagram showing a configuration example of the speech recognition system according to the fourth embodiment of the present invention.
A diagram for explaining the operation of the context extraction unit, the correction word context table registration unit, and the correction word and context table combined speech recognition unit of the same speech recognition system.
A diagram showing a configuration example of the speech recognition system according to the fifth embodiment of the present invention.
A diagram for explaining the operation of the context extraction unit, the correction macro context table registration unit, and the correction macro and context table combined speech recognition unit of the same speech recognition system.
A diagram showing a configuration example of the speech recognition system according to the sixth embodiment of the present invention.
A diagram for explaining the operation of the context extraction unit, the user word context table registration unit, and the user word and context table combined speech recognition unit of the same speech recognition system.

Explanation of symbols

101: voice input unit
102: correction result generation unit
103: correction word generation unit
104: correction word dictionary registration unit
105: correction word dictionary combined speech recognition unit
106: correction word dictionary
107: recognition vocabulary dictionary
108: correction procedure generation unit
109: correction macro generation unit
110: correction macro dictionary registration unit
111: correction macro dictionary combined speech recognition unit
112: correction macro dictionary
113: user word dictionary registration automatic activation unit
114: user word dictionary combined speech recognition unit
115: user dictionary
120: context extraction unit
121: correction word context table registration unit
122: correction word dictionary and context table combined speech recognition unit
123: correction word context table
124: correction macro context table registration unit
125: correction macro dictionary and context table combined speech recognition unit
126: correction macro context table
127: user word context table registration unit
128: user word dictionary and context table combined speech recognition unit
129: user word context table

Claims

1. A voice information processing system comprising:
a recognition vocabulary dictionary in which a plurality of first dictionary data are registered, each including information on the kana reading of a vocabulary item to be processed and information on its notation in kana and kanji;
means for inputting voice;
means for generating a kana character string based on the input voice;
kana-kanji character string generation means for generating a kana-kanji character string from the generated kana character string based on the recognition vocabulary dictionary;
display means for displaying the generated kana-kanji character string on a display screen;
accepting means for accepting a correction to the displayed kana-kanji character string;
dictionary data generation means for generating second dictionary data including the kana character string from which the kana-kanji character string subject to the correction was generated and information on the content of the correction; and
registration means for registering the generated second dictionary data in a specific dictionary different from the recognition vocabulary dictionary,
wherein the kana-kanji character string generation means performs the generation based on the specific dictionary.

2. The voice information processing system according to claim 1, wherein the dictionary data generation means includes, in the second dictionary data, at least the corrected kana-kanji character string resulting from the correction as the information on the content of the correction.

3. The voice information processing system according to claim 2, wherein the specific dictionary is a user dictionary in which a user can register dictionary data for a desired vocabulary item.

4. The voice information processing system according to claim 1, wherein the dictionary data generation means includes, in the second dictionary data, information indicating the operation procedure of the correction as the information on the content of the correction.

5. The voice information processing system according to claim 1, further comprising:
reference data generation means for generating, based on the generated kana-kanji character string and the correction to that kana-kanji character string, reference data serving as a criterion for determining whether to apply the second dictionary data related to the correction when the kana-kanji character string generation means generates a kana-kanji character string; and
storage means for storing the generated reference data in association with the second dictionary data,
wherein the kana-kanji character string generation means performs the generation based on the reference data.

6. The voice information processing system according to claim 5, wherein the reference data indicates that the second dictionary data is to be applied only when a specific kana-kanji character string exists at a location having a specific positional relationship with the kana character string related to the correction.

7. A voice information processing method in a language processing apparatus having a recognition vocabulary dictionary in which a plurality of first dictionary data are registered, each including information on the kana reading of a vocabulary item to be processed and information on its notation in kana and kanji, the method comprising:
a voice input step of inputting voice;
a kana character string generation step of generating a kana character string based on the input voice;
a kana-kanji character string generation step of generating a kana-kanji character string from the generated kana character string based on the recognition vocabulary dictionary;
a display step of displaying the generated kana-kanji character string on a display screen;
an accepting step of accepting a correction to the displayed kana-kanji character string;
a dictionary data generation step of generating second dictionary data including the kana character string from which the kana-kanji character string subject to the correction was generated and information on the content of the correction; and
a registration step of registering the generated second dictionary data in a specific dictionary different from the recognition vocabulary dictionary,
wherein, in the kana-kanji character string generation step, the generation is performed based on the specific dictionary.

8. A program for causing a computer to function as a voice information processing system including a recognition vocabulary dictionary in which a plurality of first dictionary data are registered, each including information on the kana reading of a vocabulary item to be processed and information on its notation in kana and kanji, the program causing the computer to execute:
a voice input step of inputting voice;
a kana character string generation step of generating a kana character string based on the input voice;
a kana-kanji character string generation step of generating a kana-kanji character string from the generated kana character string based on the recognition vocabulary dictionary;
a display step of displaying the generated kana-kanji character string on a display screen;
an accepting step of accepting a correction to the displayed kana-kanji character string;
a dictionary data generation step of generating second dictionary data including the kana character string from which the kana-kanji character string subject to the correction was generated and information on the content of the correction; and
a registration step of registering the generated second dictionary data in a specific dictionary different from the recognition vocabulary dictionary,
wherein, in the kana-kanji character string generation step, the generation is performed based on the specific dictionary.