JPS59123926A - Voice input japanese processing system - Google Patents

Voice input japanese processing system

Info

Publication number
JPS59123926A
Authority
JP
Japan
Prior art keywords
kana
kanji
candidate
voice
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57230287A
Other languages
Japanese (ja)
Inventor
Yutaka Kamiyanagi
上柳 裕
Takahiko Ogita
荻田 隆彦
Takashi Hamada
浜田 隆史
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP57230287A priority Critical patent/JPS59123926A/en
Publication of JPS59123926A publication Critical patent/JPS59123926A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)

Abstract

PURPOSE: To obtain correct mixed kanji (Chinese character) and kana (Japanese syllabary) text by preparing, besides the first-candidate kana string closest to the input voice, another approximate candidate, performing a grammatical check, maximum-likelihood evaluation, etc. on the morphemized kana strings, and outputting the mixed kana-kanji text.
CONSTITUTION: When a Japanese sentence to be printed on a printer 26 is spoken into a microphone 10, a speech recognition unit 14 recognizes it syllable by syllable and passes it, as kana, to a kana-kanji conversion unit 16. Word-dictionary lookup and decisions on whether the grammatical connections, and hence the morphemized words, are proper are performed there, converting the string into mixed kana-kanji text. A document editing unit 18 then performs editing operations such as line changes, page changes, and sentence replacement, and the edited kana-kanji text is displayed on a display device 20 through a speech synthesis/output control unit 24 and further printed on the printer 26 by an instruction from a console 12, etc. The input voice is recognized so that an approximate candidate is prepared in addition to the candidate closest to the correct answer, and this too is checked, so that mixed kana-kanji text close to the correct answer is output.

Description

DETAILED DESCRIPTION OF THE INVENTION

Technical Field of the Invention

The present invention relates to a voice-input Japanese processing system in which Japanese is entered by voice and a word processor converts it into text containing a mixture of kana and kanji.

Prior Art and Problems

Conventionally, document creation has relied mainly on keyboard-input word processors, which naturally require the operator to be proficient at the keyboard. Systems have also been developed that bring voice, the most natural means of human communication, into the man-machine interface. In such systems, however, the speech recognition result is displayed in kana on a display (since misrecognition is unavoidable), the speaker is asked to confirm it and, where necessary, correct it by selecting the next candidate; the kana string obtained in this way is then converted to kana-kanji text just as in an ordinary word processor, with the operator again confirming and correcting the result. Because this approach involves the extra steps of confirming and correcting the kana, it has not reached the point of being easier to use than a keyboard-input word processor.

Object of the Invention

The present invention goes through the process of converting voice input into a kana string and then converting that kana string into a sentence containing kana and kanji, but it aims to realize voice-input Japanese processing in which the speaker does not need to confirm the kana string.

Structure of the Invention

The present invention is a voice-input Japanese processing system that recognizes input speech, converts it into a kana string, and converts that kana string into a mixed kanji-kana sentence. In addition to the first-candidate kana string closest to the input speech, a second-candidate kana string is prepared which collects kana whose degree of similarity does not differ much from that of the first-candidate kana and, where any kana in the first-candidate string has a pronunciation differing from its intrinsic reading, that kana as well. These kana strings are morphemized by word-dictionary lookup, one or more of a grammar check, a maximum-likelihood evaluation, and a linguistic evaluation is applied to the variously morphemized kana strings, and the morphemized mixed kanji-kana sentence scoring highest in these evaluations is used as the output data. This is explained in detail below with reference to the drawings.

Embodiment of the Invention

FIG. 1 shows an outline of a voice-input word processor: 10 is a microphone, 12 is an operator console, 14 is a speech recognition unit, 16 is a kana-kanji conversion unit, 18 is a document editing unit, 20 is a CRT display, 22 is a speaker, 24 is a speech synthesis/output control unit, and 26 is a printer. When the Japanese text to be printed on the printer 26 is spoken into the microphone 10, the speech recognition unit 14 recognizes it syllable by syllable (that is, kana by kana) and passes it to the kana-kanji conversion unit 16 as a kana string. There, word-dictionary lookup, checking of grammatical connections, and judgment of whether the candidate words are appropriate are carried out, and the string is converted into a sentence containing kana and kanji. The document editing unit 18 performs editing such as line breaks, page breaks, and rearrangement of text, and the edited kana-kanji text is displayed on the display 20 via the speech synthesis/output control unit 24 and printed out on the printer 26 in response to instructions from the console 12 or the like. In the conventional method the syllables (kana) recognized by the speech recognition unit 14 are displayed on the display 20 and the speaker is asked to confirm and correct them; in the method of the present invention this is not necessarily required, so what is displayed is the output of the document editing unit 18 after kana-kanji conversion. The basic components of a voice-input Japanese processing apparatus are the speech recognition unit and the kana-kanji conversion unit, and the apparatus can be built by combining them. Both the speech recognition unit and the kana-kanji conversion unit are in themselves fully at a practical stage, or sufficiently so if the target domain is narrowed slightly.

Note that some speech recognition units perform word recognition in addition to monosyllable recognition; in this example, monosyllable recognition is used.

Kana-kanji conversion matches the input kana string against a word dictionary, generally decomposes it into parts of speech by the longest-match method, and assigns kanji or kana (generally hiragana) to the words corresponding to the kana string. In the present invention this is extended as follows.

That is, in the present invention, in addition to the first-candidate kana string closest to the correct answer (the one whose distance from the reference pattern is shortest), other candidates are also taken up whose inter-candidate distance, compared against a threshold (this gives the similarity information), lies within that threshold. Misrecognition is inherent in speech recognition, so multiple candidates are used to cope with it; but as the number of candidates grows, the number of combinations multiplies and becomes extremely troublesome. Even a two-word sentence, for example, yields 2 x 2 = 4 possible sentences if each word has a first and a second candidate. To avoid this complexity, the degree of similarity (distance) is taken into account: only candidates of high similarity are adopted, and those of low similarity are dropped. Other candidates that are adopted include the following. For a kana string uttered without the small-kana designation, the corresponding small kana is added (to enter a small 「ッ」, for example, the speaker would normally designate the small letter by a console operation and then pronounce 「ツ」; if this is not done, the small 「ッ」 is added as another candidate for the full-size 「ツ」 in the recognized string); since the kana that take a small form are specific ones such as ツ, ユ, and ヨ, whenever such a kana appears in the string its small form is made an additional candidate. Likewise, for characters whose written form and pronunciation differ or cannot be distinguished, the counterpart is prepared: the particles 「は」, 「を」, and 「へ」 appear in speech as わ, お, and え, so 「は」, 「を」, and 「へ」 are prepared as additional candidates for these, and since ず/づ and じ/ぢ cannot be distinguished by pronunciation, each is made an additional candidate for the other. For each such additional candidate an analysis route and a priority are assigned. For each route, the character string, divided into units of morphemes (corresponding to dictionary headwords) or larger, is matched against the word dictionary and split into individual morphemes, and the validity of the morpheme sequence is checked linguistically (for example, 「本発明は」 as against an erroneous split such as 「本発面は」: even when the grammar check between immediately adjacent morphemes allows the connection, some sequences of morphemes cannot occur in Japanese, and these are eliminated). After the routes have been pruned, the corresponding kanji and kana are assigned, and the route priorities are further revised by other linguistic processing (for example, the routes are evaluated and re-ranked according to the presence or absence of unused words, the number of morphemes, and so on); finally the mixed kana-kanji string of the first-priority route is output. Furthermore, if necessary, the mixed kana-kanji string is converted back into a kana string by kanji-kana conversion and compared with the input source of the route to verify its validity (the two should, by rights, coincide). If the desired mixed kana-kanji sentence is not obtained in a single pass, reprocessing is carried out with a reconversion command.
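To make the candidate-selection rule concrete, the following Python sketch shows one way it could be realized. It is only an illustration: the threshold value, the distance figures, and the alternate-kana tables are assumptions for this example, not values taken from the specification.

    # Minimal sketch of the multi-candidate rule described above.
    # All distances, the threshold, and the alternate tables are hypothetical.

    THRESHOLD = 0.15  # largest allowed gap between 1st- and 2nd-candidate distances

    PARTICLE_ALTERNATES   = {"ワ": "ハ", "オ": "ヲ", "エ": "ヘ"}   # spoken -> written particle
    SMALL_KANA_ALTERNATES = {"ツ": "ッ", "ユ": "ュ", "ヨ": "ョ"}   # full-size -> small kana
    HOMOPHONE_ALTERNATES  = {"ズ": "ヅ", "ジ": "ヂ"}               # indistinguishable pairs

    def candidates_for_syllable(first, second, d_first, d_second):
        """Collect the candidate set for one recognized syllable."""
        cands = [first]
        # Keep the 2nd recognition candidate only if it is nearly as close
        # to the reference pattern as the 1st candidate.
        if second is not None and (d_second - d_first) <= THRESHOLD:
            cands.append(second)
        # Rule-based alternates are added regardless of distance, because these
        # confusions are systematic rather than acoustic.
        for table in (PARTICLE_ALTERNATES, SMALL_KANA_ALTERNATES, HOMOPHONE_ALTERNATES):
            if first in table:
                cands.append(table[first])
        return cands

    # Example: five syllables of a recognized string, as (1st, 2nd, d1, d2).
    recognized = [
        ("ソ", "オ", 0.10, 0.80),
        ("ウ", "フ", 0.12, 0.75),
        ("ガ", "カ", 0.09, 0.70),
        ("ウ", "ク", 0.20, 0.26),   # 2nd candidate is close enough -> kept
        ("オ", "ホ", 0.11, 0.77),
    ]

    print([candidates_for_syllable(*r) for r in recognized])
    # -> [['ソ'], ['ウ'], ['ガ'], ['ウ', 'ク'], ['オ', 'ヲ']]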

FIG. 2 is an explanatory block diagram giving an overview of the above processing. Input from the speech recognition unit 14, such as a kana string, is first subjected to command analysis: whether the command attached to the input is a document creation command is checked, and if so the input is sent to conversion route A, while if it is a document check it is sent to reconversion route B. In conversion route A, input processing (reading in the data) is performed first, followed in sequence by multi-route setup based on the similarity evaluation of the candidates, kana-kanji conversion of the multiple routes together with linguistic checking, reordering of the routes according to the linguistic check, and verification of the best route by kanji-kana conversion, after which the result is output. In reconversion route B, processing enters directly into reordering of the route priorities by linguistic checking, passes through verification of the best route by kanji-kana conversion, and is then output.
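The dispatch between the two routes of FIG. 2 can be pictured as two fixed pipelines selected by the command attached to the input. The sketch below shows only that control flow; every stage function is a placeholder stub standing in for the processing described elsewhere in this specification.

    # Control-flow sketch of FIG. 2. The stage functions are identity stubs;
    # only the dispatch between route A and route B is meant to be illustrative.

    def read_input(x):                   return x
    def set_multi_routes(x):             return [x]
    def kana_kanji_convert(route):       return route
    def linguistic_check(route):         return route
    def reorder_by_linguistic_score(rs): return rs
    def verify_by_back_conversion(rs):   return rs[0]

    def process(command, payload):
        if command == "create":                        # conversion route A
            data   = read_input(payload)               # input processing
            routes = set_multi_routes(data)            # similarity-based multi-route setup
            routes = [linguistic_check(kana_kanji_convert(r)) for r in routes]
            routes = reorder_by_linguistic_score(routes)
            return verify_by_back_conversion(routes)   # kanji -> kana verification, then output
        if command == "recheck":                       # reconversion route B
            routes = reorder_by_linguistic_score(payload)
            return verify_by_back_conversion(routes)

    print(process("create", "ソウガクオ、ケイサンスル"))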

FIG. 3 is a flowchart outlining the processing of voice kana-kanji conversion. The input character string, that is, the kana string produced by speech recognition, is read in, processing units are cut out at punctuation marks and the like, and multiple routes are also set up from the recognition information. Next, morpheme extraction, word-dictionary access, grammatical connection checking, and maximum-evaluation checking are performed; this is done for all routes, the routes are evaluated, and the appropriate one is output. In steps (e) and (f), backtracking is performed when no valid morpheme is found. The steps marked with an asterisk (*) are the parts added by the present invention. A concrete example of the processing is explained next with reference to FIG. 4.

In FIG. 4 it is assumed that 「総額を計算する」 ("calculate the total amount") has been input by voice, that the first candidate of the speech recognition result was 「ソウガウオ、ケイサンスル」, and that the second candidate was 「オフカクホ、ヘイハクウス」. Each character of the second candidate corresponds to the character in the same position of the first candidate and is the character with the second-highest similarity. Since voice input is made syllable by syllable, the sentence is entered by uttering 「ソ」, 「ウ」, 「ガ」, ... in turn, and the comma 「、」 is entered by uttering the word 「カンマ」 ("comma"). Word input is limited to punctuation marks and other specific words (control words and the like), so its recognition rate is high and no second candidate is needed for it.

Next, the conversion units are cut out and the first kana string delimited by the comma is taken in. The similarity information between the candidates is then evaluated. In this example, the similarity of the second candidates to the first candidates ソ, ウ, ガ, オ is low: the differences ℓ1−L1, ℓ2−L2, ... between the distances L1, L2, L3, L4 of the first candidates ソ, ウ, ガ, オ and the distances ℓ1, ℓ2, ℓ3, ℓ4 of the corresponding second candidates オ, フ, カ, ホ are at or above the threshold, so those second candidates are dropped; only the second candidate ク for the first candidate ウ, whose similarity is high, is retained.

Next comes the setting of routes and the assignment of priorities: in this example 「ソウガウオ」 is given rank 1 and 「ソウガクオ」 rank 2. The rank-1 kana string is led to route A, and the rank-2 kana string to route B. In both routes A and B, kana-kanji conversion and its checks are carried out.
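A route here is simply one way of choosing a candidate for each syllable. Using the same hypothetical distances as above, the routes and their initial priorities could be enumerated as in this sketch; a smaller total distance means the string is closer to the input speech.

    from itertools import product

    # Retained candidates per syllable for the segment before the comma,
    # with hypothetical recognition distances.
    candidates = [
        [("ソ", 0.10)],
        [("ウ", 0.12)],
        [("ガ", 0.09)],
        [("ウ", 0.20), ("ク", 0.26)],   # the syllable whose 2nd candidate was kept
        [("オ", 0.11)],
    ]

    routes = []
    for combo in product(*candidates):
        kana = "".join(c for c, _ in combo)
        total = sum(d for _, d in combo)
        routes.append((kana, total))

    routes.sort(key=lambda r: r[1])      # closer to the input speech first
    for rank, (kana, total) in enumerate(routes, start=1):
        print(rank, kana, round(total, 2))
    # -> 1 ソウガウオ 0.62  (route A)
    #    2 ソウガクオ 0.68  (route B)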

The processing in routes A and B is the same. First the word dictionary is consulted, and matching words are selected in order of decreasing confidence by the longest-match method; since this is longest match, the dictionary headword (morpheme) whose characters coincide with the kana string from the head as far toward the tail as possible is taken to be the most probable. The morphemes selected in this way in this example were, as illustrated, 草画, うお, 爪芽, 相, and そ for route A, and 総額 and を for route B. A grammar check is then applied to them: the word dictionary records a part of speech for each morpheme, together with connection information stating which parts of speech the morpheme can be joined to, and from these it is checked whether the morphemized units can follow one another. Next, a maximum-likelihood evaluation is performed, giving high scores to longer morphemes (those with more characters), to morphemes used more frequently, and so on.
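A minimal sketch of longest-match segmentation with a part-of-speech connection check is given below. The toy dictionary, readings, and connection table are invented solely to segment the example string 「ソウガクオ」; they are not the patent's dictionary.

    # Toy word dictionary: reading -> (surface form, part of speech).
    DICTIONARY = {
        "ソウガク": ("総額", "noun"),
        "ソウガ":   ("爪芽", "noun"),
        "ソウ":     ("相",   "noun"),
        "ソ":       ("そ",   "pronoun"),
        "ウオ":     ("うお", "noun"),
        "オ":       ("を",   "particle"),
    }

    # Tiny stand-in for the connection information stored with each entry.
    CAN_FOLLOW = {
        ("noun", "particle"): True,
        ("pronoun", "noun"):  True,
        ("noun", "noun"):     False,
    }

    def segment(kana, prev_pos=None):
        """Longest-match segmentation, backtracking when a connection fails."""
        if not kana:
            return []
        for length in range(len(kana), 0, -1):   # longest match first
            head = kana[:length]
            if head not in DICTIONARY:
                continue
            surface, pos = DICTIONARY[head]
            # Grammar connection check against the previous morpheme.
            if prev_pos is not None and not CAN_FOLLOW.get((prev_pos, pos), False):
                continue
            rest = segment(kana[length:], pos)
            if rest is not None:
                return [(surface, pos)] + rest
        return None   # dead end -> the caller falls back to a shorter match

    print(segment("ソウガクオ"))   # [('総額', 'noun'), ('を', 'particle')]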

Next, a cutoff evaluation is performed. The kana string is morphemized by the longest-match method and the like, but the longest match is not necessarily correct, and once an erroneous morphemization has been made the subsequent morphemes no longer connect. When such a dead-end branch is entered, morphemization along that branch is abandoned, the morphemization one, two, ... steps back is changed to an alternative, and the analysis is redone, continuing the effort to reach the correct answer; this evaluation provides the criterion for deciding where to make the cutoff. Each branch is evaluated by its depth, reading length, and so on, and a branch whose evaluation value exceeds a set value is cut off. The linguistic evaluation that comes next has already been mentioned; to add to it, the evaluation exploits characteristics of Japanese such as the fact that four or more single-character hiragana morphemes, or three or more single-character kanji morphemes that cannot combine into a single morpheme, do not occur in succession. For example, a single hiragana character can form one morpheme by itself, and many hiragana may be joined together, but when they join they merge into a single morpheme as an ordinary hiragana word does; it is therefore judged that four or five independent single-character morphemes that do not merge into one cannot follow one another. Likewise 日, 本, and 国 are each dictionary headwords and hence each a one-morpheme word, and there is also the word 日本国 formed by combining them; but there 「日本」 forms a single morpheme, so it is judged that three or four independent single-character morphemes that do not merge into one morpheme never occur in succession. For example, if 「すべくどりょくする」 (すべく努力する) is morphemized as 「すべ・句・度・力・する」, the single-kanji morphemes 「句」, 「度」, and 「力」 occur three in a row and cannot be combined into one morpheme; this is judged to be unnatural, and the morphemization is judged to be erroneous.
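The run-length constraint on independent single-character morphemes can be checked mechanically, as in the sketch below; the run limits are assumptions chosen only to match the examples above.

    # Reject a morphemization containing an implausibly long run of independent
    # single-character morphemes. The limits are illustrative assumptions:
    # a run of more than 3 hiragana (i.e. 4 or more) or more than 2 kanji
    # (i.e. 3 or more) is rejected.
    MAX_RUN = {"hiragana": 3, "kanji": 2}

    def char_class(morpheme):
        ch = morpheme[0]
        if "\u3041" <= ch <= "\u309f":
            return "hiragana"
        if "\u4e00" <= ch <= "\u9fff":
            return "kanji"
        return "other"

    def plausible(morphemes):
        """Return False when single-character morphemes of one class run too long."""
        run, run_class = 0, None
        for m in morphemes:
            cls = char_class(m)
            if len(m) == 1 and cls in MAX_RUN:
                run = run + 1 if cls == run_class else 1
                run_class = cls
                if run > MAX_RUN[cls]:
                    return False
            else:
                run, run_class = 0, None
        return True

    # 「すべくどりょくする」 wrongly split as すべ・句・度・力・する:
    print(plausible(["すべ", "句", "度", "力", "する"]))   # False: three single kanji in a row
    print(plausible(["す", "べく", "努力", "する"]))        # True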

Next the highest-priority route is determined, that is, in this example, whether route A or route B is better. Here it is checked whether the route contains unused words, how many morphemes it has, and so on. If route A's 「草画」 is a word that has never been used in this system, route A is marked as containing an unused word, whereas if route B's 「総額」 has been used before, route B contains no unused word, and the latter receives the higher evaluation. As for the number of morphemes, route A's 草画うお has 3 while route B's 総額を has 2, and the smaller number of morphemes is preferred. As a result, although route A's kana string was closer to the input speech in terms of similarity, the analysis within the chain-line frame of FIG. 4 gives route B's kana string the higher evaluation, so the priorities are changed: route B's 「総額を」 becomes priority 1 and route A's 「草画うお」 becomes priority 2. The priority-1 string 「総額を」 is then converted back into kana as 「ソウガクオ」; since this equals the input data, it is judged that no error was introduced during the analysis, and 「総額を」 becomes the output data. Note that 「を」 was judged to be a particle once 「総額」 had been determined, and 「お」 was accordingly changed to 「を」.
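The final selection combines the unused-word check with the morpheme count, and the winner is verified by converting its kanji back into kana. The sketch below illustrates that scoring and verification; the reading table and the set of already-used words are invented for the illustration.

    # Each analysed route: the kana it started from and its chosen morphemes.
    READINGS   = {"総額": "ソウガク", "を": "オ", "草画": "ソウガ", "うお": "ウオ"}
    USED_WORDS = {"総額", "を", "うお"}          # vocabulary this system has used before

    routes = [
        {"kana": "ソウガウオ", "morphemes": ["草画", "うお"]},   # route A
        {"kana": "ソウガクオ", "morphemes": ["総額", "を"]},     # route B
    ]

    def score(route):
        unused = sum(1 for m in route["morphemes"] if m not in USED_WORDS)
        # Fewer unused words and fewer morphemes are both better, so sort ascending.
        return (unused, len(route["morphemes"]))

    best = min(routes, key=score)

    # Verification: converting the chosen morphemes back to kana must reproduce
    # the kana string this route started from.
    back = "".join(READINGS[m] for m in best["morphemes"])
    assert back == best["kana"], "an error was introduced during the analysis"
    print("output:", "".join(best["morphemes"]))   # output: 総額を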

For example, if 「ダイヨウリョウジキバブル」 is converted from kana to kanji, the result is 「代用領事着バブル」. When this is converted back from kanji to kana, the morphological analysis gives 代用 (ダイヨウ), 領事 (リョウジ), 着 (チャク), バブル (バブル), so the result is 「ダイヨウリョウジチャクバブル」, which does not match the input. For cases of this kind, the check by kanji-kana conversion is effective.

If, at the time of dictionary lookup and morphemization, the input kana string could be delimited by part of speech (noun, particle, verb, and so on), indexing would be easier, the recognition rate would rise, and the time required for recognition would fall. Requiring the speaker to input by part of speech, however, would spoil the simplicity and naturalness of voice input.

It is therefore effective to detect the phrase breaks that the speaker makes naturally and to use them as segmentation information. Concretely, the timing of the input speech is monitored, and if the interruption of the speech lasts longer than a set value, it is treated as a phrase break. In the example above, the speaker would have uttered 「ソウガクオ」 in one run rather than in fragments such as 「ソウガ」「ク」「オ」, so the string up to 「ソウガクオ」 is treated as a single unit (a phrase-break code being attached after the 「オ」), and taking the longest match then finds 「総額を」 easily with high probability.

In this example a comma was present, so the phrase-break code is not needed, but the phrase-break code is effective for kana strings in which many kana characters occur before the punctuation mark.
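The pause-based phrase break amounts to comparing the silence between successive syllables with a time threshold. In the sketch below the timestamps and the threshold are invented; a break code is inserted wherever the silence is long enough.

    # Insert a phrase-break marker where the silence between recognized syllables
    # exceeds a threshold. Timestamps (in seconds) and the threshold are invented.
    BREAK_THRESHOLD = 0.6
    BREAK_CODE = "/"

    # (syllable, start_time, end_time) as they might come from the recognizer.
    syllables = [
        ("ソ", 0.00, 0.30), ("ウ", 0.35, 0.60), ("ガ", 0.65, 0.90),
        ("ク", 0.95, 1.20), ("オ", 1.25, 1.50),
        ("ケ", 2.40, 2.70),   # long pause before this syllable -> phrase break
        ("イ", 2.75, 3.00), ("サ", 3.05, 3.30), ("ン", 3.35, 3.60),
        ("ス", 3.65, 3.90), ("ル", 3.95, 4.20),
    ]

    out, prev_end = [], None
    for kana, start, end in syllables:
        if prev_end is not None and start - prev_end >= BREAK_THRESHOLD:
            out.append(BREAK_CODE)          # the speaker paused: mark a phrase break
        out.append(kana)
        prev_end = end

    print("".join(out))   # ソウガクオ/ケイサンスル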

Effect of the Invention

As described above, the present invention sets up multiple routes and outputs a correct mixed kanji-kana sentence by checking the suitability of each route, so a voice-input kana string containing misrecognitions can be converted to kana-kanji text with high accuracy, and the mixed kanji-kana sentence can be shown on the display immediately for confirmation without the speaker having to check the kana string. In setting up the multiple routes, candidates of low similarity are dropped while items that are easily mistaken or indistinguishable are actively added as candidates, so an appropriate number of routes can be secured. Furthermore, since the presence or absence of unused words and the number of morphemes have been added as check items in the suitability check of each route, an improvement in the probability of selecting the correct route can be expected.

Brief Description of the Drawings

FIG. 1 is an explanatory diagram showing an outline of a voice-input Japanese word processor, and FIGS. 2 to 4 are flow diagrams explaining the processing method of the present invention. In the drawings, 10 is a microphone, 12 is an operator console, 14 is a speech recognition unit, 16 is a kana-kanji conversion unit, and 20 is a display.

Applicant: Fujitsu Limited
Agent: Minoru Aoyagi, Patent Attorney

Claims (1)

[Claims]

A voice-input Japanese processing system which recognizes input speech, converts it into a kana string, and converts the kana string into a mixed kanji-kana sentence, characterized in that: in addition to the first-candidate kana string closest to the input speech, a second-candidate kana string is prepared which collects kana whose degree of similarity does not differ much from that of the first-candidate kana and, where any kana in the first-candidate string has a pronunciation differing from its intrinsic reading, that kana as well; these kana strings are morphemized by word-dictionary lookup; one or more of a grammar check, a maximum-likelihood evaluation, and a linguistic evaluation is performed on the variously morphemized kana strings; and the morphemized mixed kanji-kana sentence scoring highest in these evaluations is used as the output data.
JP57230287A 1982-12-29 1982-12-29 Voice input japanese processing system Pending JPS59123926A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57230287A JPS59123926A (en) 1982-12-29 1982-12-29 Voice input japanese processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57230287A JPS59123926A (en) 1982-12-29 1982-12-29 Voice input japanese processing system

Publications (1)

Publication Number Publication Date
JPS59123926A true JPS59123926A (en) 1984-07-17

Family

ID=16905443

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57230287A Pending JPS59123926A (en) 1982-12-29 1982-12-29 Voice input japanese processing system

Country Status (1)

Country Link
JP (1) JPS59123926A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6146557A (en) * 1984-08-13 1986-03-06 Nec Corp Speech word processor


Similar Documents

Publication Publication Date Title
US7302640B2 (en) Language input architecture for converting one text form to another text form with tolerance to spelling, typographical, and conversion errors
US6490563B2 (en) Proofreading with text to speech feedback
US7165019B1 (en) Language input architecture for converting one text form to another text form with modeless entry
Greenbaum et al. The international corpus of English (ICE) project
KR100259407B1 (en) Keyboard for a system and method for processing chinese language text
US6760695B1 (en) Automated natural language processing
JP5231698B2 (en) How to predict how to read Japanese ideograms
Moreno Sandoval et al. Morphosyntactic tagging of the Spanish C-ORAL-ROM corpus: Methodology, tools and evaluation
JPH11238051A (en) Chinese input conversion processor, chinese input conversion processing method and recording medium stored with chinese input conversion processing program
JPS634206B2 (en)
JPS59123926A (en) Voice input japanese processing system
Tolegen et al. A finite state transducer based morphological analyzer for the kazakh language
Tukur et al. Parts-of-speech tagging of Hausa-based texts using hidden Markov model
KR20080028655A (en) Method and apparatus for part-of-speech tagging
EP0553745A2 (en) Character recognition apparatus
Salton et al. An approach to the segmentation problem in speech analysis and language translation
Rodrigues et al. Arabic data science toolkit: An api for arabic language feature extraction
JP2004206659A (en) Reading information determination method, device, and program
KR100268297B1 (en) System and method for processing chinese language text
JPH09288494A (en) Voice recognition device and voice recognizing method
JPS59119434A (en) Processing device for japanese language inputted by voice
JPH08272780A (en) Processor and method for chinese input processing, and processor and method for language processing
JPH0546612A (en) Sentence error detector
Vittorelli Linguistic analysis of the European languages
Brasser Enhancing Traditional Input Methods Through Part-of-Speech Tagging