JPH07120253B2

JPH07120253B2 - Text input device by voice

Info

Publication number: JPH07120253B2
Application number: JP60019941A
Authority: JP
Inventors: 信夫畑岡; 義典北原; 明雄天野; 熹市川
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1985-02-06
Filing date: 1985-02-06
Publication date: 1995-12-20
Anticipated expiration: 2010-12-20
Also published as: JPS61180329A

Description

[Field of the Invention]

The present invention relates to a text input, processing, and conversion device using voice input (hereinafter, "word processor"), and in particular to such a device suited to improving the efficiency of confirmation and correction and reducing the time they require.

[Background of the Invention]

In conventional word processors, confirmation and correction have relied mainly on visually checking the input result shown on the screen (hereinafter, "display") and then correcting it after manually moving the cursor to the relevant position. FIG. 1 shows the processing flow of this conventional confirmation and correction method. The user first visually checks the input result shown on the display; if there is an input or recognition error, the user moves the cursor to the error position by manually operating the cursor keys and then performs editing such as correction, deletion, or insertion. Visual confirmation is tedious because the user must look back and forth between the manuscript and the display, and it is also time-consuming and inefficient. Moving the cursor by pressing cursor keys is likewise slow and inefficient.

[Object of the Invention]

An object of the present invention is to solve the above problems and to provide a word processor that allows efficient confirmation and correction and reduces the time they require.

[Summary of the Invention]

To achieve the above object, the present invention detects uncertain places in the recognition result of the input voice, displays those places differently from the rest of the recognition result, and positions the correction cursor at the uncertain place.

Specifically, the voice-based text input device according to the present invention comprises: input means for inputting voice; voice recognition means for analyzing and recognizing the input voice; display means for displaying the result recognized by the voice recognition means; and a control unit that, when the recognition result of the voice is uncertain, performs control so that the uncertain place is displayed distinguishably within the recognition result shown on the display means and the correction cursor is positioned at that place.

An uncertain place in the voice recognition result is detected in either of two ways. The voice recognition means may hold a predetermined matching-distance criterion and judge the result uncertain when its matching distance is equal to or greater than that criterion. Alternatively, among the matching distances obtained by the voice recognition means, the result is judged uncertain when the ratio between the first-ranked and second-ranked matching distances is less than a predetermined criterion.

A related pattern recognition system concerning the manner of display is described, for example, in JP-A-57-211200.

[Embodiments of the Invention]

The principle of the present invention is described in detail below. FIG. 2 shows the processing flow of a text input method incorporating the display, confirmation, and correction method according to the present invention. Voice input, for example phrase by phrase, is recognized, and the recognition result is shown on the screen. In doing so, places where the recognition result is uncertain and places that are prone to misrecognition are displayed in a way that distinguishes them from the other places. FIG. 3 shows an example of this display method. It assumes that two places in the input voice /onsei ni yoru/ ("by voice") were misrecognized (/i/ → /hi/ and /yo/ → /o/). The misrecognized places are detected as uncertain parts from, for example, the matching distance values of the recognition result, and they are given a special display, such as a different color or a mark. The displayed result is then checked, focusing mainly on the marked places; if the input result contains no errors, processing proceeds to the next step, such as kana-kanji conversion. If the input result contains an error, the user works with the correction cursor (FIG. 3), which has already been positioned automatically at the place suspected of being wrong: if the place is really an error, the result is corrected; if not, the cursor moves to the next position and the same process continues. As a result, errors are easy to find and the correction cursor moves quickly, so confirmation and correction are efficient. Possible correction methods are, in order, displaying the next recognition candidate, re-entering by voice, and finally entering by key. When the input result contains no errors, kana-kanji conversion is performed, followed by confirmation and correction of its result. This processing continues until the entire text has been entered. The display and correction method according to the present invention can likewise be incorporated after kana-kanji conversion.
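The display and correction flow described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Unit` class, the bracket marking in `render`, and the `corrections` mapping (standing in for next-candidate display, voice re-entry, or key entry) are hypothetical names introduced only for illustration, and the syllables are written in romaji for the /onsei ni yoru/ example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Unit:
    """One recognized syllable (hypothetical structure for this sketch)."""
    text: str
    uncertain: bool  # set by the recognizer when the result is uncertain


def render(units: List[Unit]) -> str:
    """Show uncertain places distinguishably; brackets stand in for color or marks."""
    return "".join(f"[{u.text}]" if u.uncertain else u.text for u in units)


def next_uncertain(units: List[Unit], start: int = 0) -> Optional[int]:
    """Return the index where the correction cursor should be placed next."""
    for i in range(start, len(units)):
        if units[i].uncertain:
            return i
    return None


def review(units: List[Unit], corrections: Dict[int, str]) -> str:
    """Visit each uncertain place in order; correct it if it is really wrong.

    `corrections` maps an index to the corrected text. Places not listed
    are accepted as they are and the cursor simply moves on.
    """
    pos = next_uncertain(units)
    while pos is not None:
        if pos in corrections:
            units[pos].text = corrections[pos]
        units[pos].uncertain = False
        pos = next_uncertain(units, pos + 1)
    return "".join(u.text for u in units)


# /onsei ni yoru/ recognized with /i/ -> /hi/ and /yo/ -> /o/
recognized = [Unit("o", False), Unit("n", False), Unit("se", False),
              Unit("hi", True), Unit("ni", False), Unit("o", True),
              Unit("ru", False)]
print(render(recognized))                     # onse[hi]ni[o]ru
print(review(recognized, {3: "i", 5: "yo"}))  # onseiniyoru
```

The cursor never has to be moved by hand in this sketch: `next_uncertain` jumps directly from one flagged place to the next, which is the source of the time saving the text describes.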

FIG. 4 shows an example of the information used to detect uncertain places in the recognition result: frequency distributions of the matching distance of the first-ranked recognition candidate and of the ratio of the second-ranked to the first-ranked matching distance. White bars indicate correct results and hatched bars indicate errors. Reliably correct results are obtained when the first-ranked matching distance (absolute value) is less than 0.1 and the ratio of the second-ranked to the first-ranked distance (relative value) is 1.4 or more. In other words, errors can be reliably rejected by treating the recognition result as uncertain when the first-ranked matching distance is 0.1 or more, or when the ratio of the second-ranked to the first-ranked distance is less than 1.4.
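The rejection rule stated for FIG. 4 can be written as a small Python sketch. The thresholds 0.1 and 1.4 come from the text; the function name `is_uncertain` and its parameters are assumptions made for illustration, and the ratio test is expressed as a multiplication so that a first-ranked distance of zero (an exact match) needs no division.

```python
ABS_THRESHOLD = 0.1    # first-ranked matching distance (absolute value)
RATIO_THRESHOLD = 1.4  # second-ranked / first-ranked distance (relative value)


def is_uncertain(first: float, second: float,
                 abs_threshold: float = ABS_THRESHOLD,
                 ratio_threshold: float = RATIO_THRESHOLD) -> bool:
    """Return True when the recognition result should be treated as uncertain.

    first  -- matching distance of the first-ranked candidate
    second -- matching distance of the second-ranked candidate
    """
    # Absolute test: the best match is itself too far from the reference.
    if first >= abs_threshold:
        return True
    # Relative test: second / first < ratio_threshold, rewritten without
    # division so that first == 0 is handled and counts as certain.
    return second < ratio_threshold * first


print(is_uncertain(0.05, 0.10))  # False: close match, clear margin over runner-up
print(is_uncertain(0.05, 0.06))  # True: runner-up is nearly as close
print(is_uncertain(0.12, 0.50))  # True: best match is too distant
```

In practice the two thresholds would presumably be tuned to the recognizer and the distance measure in use; the values here only mirror the example in the text.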

Places in the recognition result where misrecognition is possible can be determined from the error tendencies observed in a large number of recognition experiments (for example, syllables containing the plosives /p, t, k, b, d, g/).
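As a rough illustration of flagging error-prone places from a known error tendency, the following Python sketch marks syllables that contain one of the plosives named in the text. It assumes syllables are given in romaji and that a letter match approximates the phoneme, which is a simplification (for example, the affricate in "tsu" is also matched); the function name is hypothetical.

```python
PLOSIVES = frozenset("ptkbdg")  # the plosives /p, t, k, b, d, g/ named in the text


def is_error_prone(syllable: str) -> bool:
    """Return True if a romaji syllable contains a plosive consonant letter."""
    return any(ch in PLOSIVES for ch in syllable.lower())


syllables = ["ka", "n", "ji", "to", "su"]
print([is_error_prone(s) for s in syllables])  # [True, False, False, True, False]
```

A real system would derive such a list from measured error rates rather than from a fixed set of consonants.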

Furthermore, when the method of the present invention is incorporated after kana-kanji conversion, uncertain parts of the conversion are detected using a measure of how plausible the converted text is as text (for example, as Japanese). Possible measures include frequency information for independent words, the plausibility of the combination of an independent word and its ancillary words as a phrase (bunsetsu), and judgments such as the least-bunsetsu method.

An embodiment of the present invention is described in detail below with reference to FIG. 5. FIG. 5 is a block diagram of one embodiment of a voice-input text input, processing, and conversion device using the present invention. In the input/recognition unit 1, text is entered by voice through a voice input means such as a microphone 11. The input voice is analyzed by the voice recognition unit 12 and then recognized through processing such as matching against reference voice patterns. At this point, the uncertain parts of the recognition result are obtained using the criteria described above, based on the first-ranked matching distance, the ratio of the second-ranked to the first-ranked matching distance, and so on. Because the configuration of the voice recognition unit is not the focus of the present invention, a specific embodiment is omitted; it can be realized with any conventional method. Next, the recognition result or text processing and conversion result of the input voice, together with the uncertain parts, is shown on the display 21 of the display/output unit 2. The speaker 22 is used for interaction between the user and the system and for voice output of the recognition result. The control unit (CPU) 3 consists of, for example, an Intel 8086 microprocessor and is responsible for controlling the input/recognition unit, the display/output unit, and other units, and for executing Japanese-language processing and conversion. The external file 4 consists of, for example, a mini floppy disk (FD) 41 and is used to store the created (edited) Japanese text containing both kana and kanji, as well as programs for Japanese information processing and conversion. The printer 5 consists of a type printer 51 and outputs the edited text and the like in print.

[Effect of the Invention]

According to the present invention, review that focuses on the places where input errors are likely, combined with automatic movement of the correction cursor, reduces the time and effort required for confirmation and correction, enabling efficient text editing.

[Brief description of drawings]

FIG. 1 is a diagram showing the processing flow of the conventional confirmation and correction method. FIG. 2 is a diagram showing the processing flow of a text input method incorporating the display, confirmation, and correction method according to the present invention. FIG. 3 is a diagram showing an example of display on the display according to the present invention. FIG. 4 is a diagram showing an example of the information used to detect uncertain places in the recognition result. FIG. 5 is a diagram showing an embodiment of the present invention.

1: input/recognition unit; 2: display/output unit.

Continuation of front page

(72) Inventor: Akio Amano, c/o Central Research Laboratory, Hitachi, Ltd., 1-280 Higashi-Koigakubo, Kokubunji, Tokyo
(72) Inventor: Akira Ichikawa (市川熹), c/o Central Research Laboratory, Hitachi, Ltd., 1-280 Higashi-Koigakubo, Kokubunji, Tokyo
(56) References: JP-A-56-162142 (JP, A); JP-A-59-99532 (JP, A)

Claims

1. A voice-based text input device comprising: input means for inputting voice; voice recognition means for analyzing and recognizing the input voice; display means for displaying the result recognized by the voice recognition means; and a control unit that, when the recognition result of the voice is uncertain, performs control so that the uncertain place is displayed distinguishably within the recognition result shown on the display means and a correction cursor is positioned at the uncertain place.

2. The voice-based text input device according to claim 1, wherein the voice recognition means has a predetermined matching-distance criterion and treats the voice recognition result as uncertain when the recognition result indicates a value equal to or greater than the criterion.

3. The voice-based text input device according to claim 1, wherein the voice recognition means obtains matching distances for the input voice and treats the voice recognition result as uncertain when the ratio between the first-ranked matching distance and the second-ranked matching distance is less than a predetermined criterion.