JPS607528A

JPS607528A - Voice-controlled word processor

Info

Publication number: JPS607528A
Application number: JP58116355A
Authority: JP
Inventors: Sakae Inoue; 井上　榮
Original assignee: NEC Corp; Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-06-28
Filing date: 1983-06-28
Publication date: 1985-01-16

Abstract

PURPOSE: To increase the input speed of a voice-controlled word processor by making its voice input more natural and, at the same time, reducing the number of operating steps for KANJI (Chinese character) conversion. CONSTITUTION: The operator's voice is supplied to a voice recognizer 11 through a microphone 12. The recognition result is coded and sent to a processor 13, which converts it into the corresponding characters and displays them on a display device 16. To convert the KANA (Japanese syllabary) entered by the operator into KANJI, the operator presses a conversion key on a keyboard 17; the processor 13 then selects the candidate KANJI corresponding to the entered KANA from a KANJI dictionary 14 and displays them on the display device 16. The finally entered character string is stored in a document memory 15 and is printed by a printer 18 when a print key on the keyboard 17 is pressed.

Description

[Detailed description of the invention]

The present invention relates to a voice word processor in which characters are input by voice instead of from a keyboard, and which is used to enter, proofread, edit, and print text.

Conventionally, in this type of voice word processor, the operator's utterances of single syllables such as "a" (あ), "i" (い), "u" (う), and "e" (え) were converted into character codes, and these character codes were held by the processing device.

In a word processor of this kind, which accepts input one syllable, that is, one kana character, at a time, the speech recognition device must recognize isolated monosyllabic utterances. A monosyllabic utterance carries little acoustic information, however, and is therefore easily misrecognized, in which case the operator must repeat the same utterance until it is recognized correctly. This drawback is especially pronounced for the character "n" (ん), which is hard to utter in isolation and has little vocal energy. The operator must also break the text into single syllables before speaking, which not only makes input awkward but also makes it slow.

Furthermore, to convert kana entered as monosyllables into kanji, the operator first pressed the "kanji key" on the word processor's keyboard to tell the processing device that kanji conversion was intended, then entered the kana by voice, and, once the kana corresponding to the intended kanji had been entered, pressed the "conversion key" to request kanji conversion of the kana entered since the "kanji key" was pressed. The kana-to-kanji conversion procedure was therefore complicated, and it lowered the character input speed.

Accordingly, an object of the present invention is to raise the input speed of a voice word processor by making its voice input more natural and by reducing the number of operating steps for kanji conversion.

According to the present invention, there is provided a voice word processor comprising: a speech recognition device in which multi-syllable readings of kanji are registered as recognizable speech and which converts input speech into a recognition code and outputs it; a kanji dictionary from which the kanji corresponding to a kana character code are accessed; and a processing device that converts the recognition code from the speech recognition device into a kana character code, accesses the kanji dictionary, and obtains the kanji corresponding to the kana character code.

According to surveys by various organizations, providing about 2,000 character types for use in Japanese documents is enough to represent more than 90% of Japanese text, and it is reported that adding further character types brings the coverage only slowly toward 100%. Providing the Jōyō kanji or the Tōyō kanji as the kanji set is therefore considered to pose almost no obstacle to composing Japanese text.

We further found that, as shown in FIG. 1, many of the Tōyō kanji share the same reading, and that in the case of on-readings roughly 100 readings cover about 80% of the Tōyō kanji. When speech patterns are registered in the speech recognition device, therefore, readings of kanji such as "in" (イン), "ei" (エイ), "eki" (エキ), and "shō" (ショウ) are registered as words in addition to the monosyllabic sounds registered so far, so that kanji can be accessed directly from the recognition result. For example, by registering "shō" (ショウ), the operator's multi-syllable utterance "shō" is converted directly into kana, which gives access to as many as 56 Tōyō kanji such as 「小」, 「升」, 「昇」, and so on.

Next, the present invention will be described in detail with reference to the drawings, which show one embodiment of the invention.

In FIG. 2, speech uttered by the operator is input to the speech recognition device 11 through the microphone 12. The recognition result of the recognition device 11 is encoded and sent to the processing device 13. The processing device 13 contains a table for converting the code supplied by the recognition device 11 into a kana character code; it converts the recognition result into the corresponding characters and displays them on the display device 16. The type of each displayed character (kana, alphabetic, and so on) is specified by the operator before voice input using a group of auxiliary input keys on the keyboard 17, but the device is normally in kana mode. When the operator wishes to convert the entered kana into kanji, the operator presses the "conversion key" on the keyboard 17; the processing device 13 then accesses the kanji dictionary 14, selects the candidate kanji for the kana that have been entered and displayed, and displays them one after another on the display device 16 at the operator's instruction, that is, each time the "conversion key" is pressed again. The character string finally entered is stored in the document storage device 15; to print it on paper, the operator presses the print key on the keyboard 17, whereupon it is output to the printer 18 and printed.

The above is the general configuration and operation of the voice word processor. In this embodiment, about 100 multi-syllable sounds such as "an" (アン), "in" (イン), "ei" (エイ), and so on are registered in the speech recognition device 11 as recognizable speech. For these registered sounds, the operator can therefore speak the whole multi-syllable reading instead of entering it one syllable at a time.
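The conversion-key behaviour of this embodiment can be illustrated with a short sketch. It is not taken from the patent: the class name `CandidateCycler`, the dictionary layout, and the wrap-around after the last candidate are assumptions made for illustration, and the dictionary holds only the three kanji that the description names for the reading "ショウ".

```python
# Illustrative dictionary entry: the description names 小, 升 and 昇 among
# the 56 kanji read "ショウ"; the remaining entries are omitted here.
KANJI_DICTIONARY = {"ショウ": ["小", "升", "昇"]}


class CandidateCycler:
    """Steps through kanji candidates, one per conversion-key press."""

    def __init__(self, dictionary, kana):
        # Copy the candidate list so the dictionary itself is never modified.
        self.candidates = list(dictionary.get(kana, []))
        self.position = -1  # nothing displayed yet

    def press_conversion_key(self):
        """Return the next candidate, or None if the reading has none."""
        if not self.candidates:
            return None
        # Wrapping around at the end is an assumption of this sketch.
        self.position = (self.position + 1) % len(self.candidates)
        return self.candidates[self.position]


cycler = CandidateCycler(KANJI_DICTIONARY, "ショウ")
shown = [cycler.press_conversion_key() for _ in range(4)]
print(shown)
```

Each call to `press_conversion_key` stands in for one press of the "conversion key"; a real device would also need a way to confirm the displayed candidate, which the sketch leaves out.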

The method of registering speech in the speech recognition device 11 will now be described with reference to FIG. 3.

The memory area of the speech recognition device 11 that stores the recognition codes is arranged as a matrix, and each location is assigned coordinates consisting of a segment number and a word number.

The number of segments and the number of words are set arbitrarily in advance. In this embodiment, kana characters are assigned to segment 1, alphabetic characters to segment 2, digits to segment 3, symbols to segment 4, and kanji readings to segment 5.

To register his or her own voice in the speech recognition device 11, the operator first presses the "voice registration" key on the keyboard 17. The processing device 13 passes this request on to the speech recognition device 11, and when the speech recognition device 11 is ready, it returns a ready response to the processing device 13. The processing device 13 then causes the display device 16 to display the character "a" (あ), which is to be registered in the area specified by segment 1, word 1. Prompted by this display, the operator utters "a"; the speech recognition device 11 processes the utterance into a voiceprint and registers it in the area specified by segment 1, word 1. Next, the processing device 13 instructs the recognition device 11 to register "i" (い) in the area specified by segment 1, word number 2. Registration continues in the same way, and in segment 5 the kanji readings "an" (アン), "in" (イン), "en" (エン), and so on are registered in turn. By repeating this procedure, all recognizable sounds are stored in the memory area of the recognition device 11 that holds the recognition codes.
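The registration procedure can be sketched as a loop over a matrix keyed by segment and word number. This is a minimal sketch under stated assumptions: `register_all`, `PROMPTS`, and the `capture` callback are hypothetical names, the prompt lists contain only the sounds the description mentions, and the stored "voiceprint" is whatever object `capture` returns.

```python
# Prompts per segment, limited to the sounds named in the description.
# Segments 2 to 4 (letters, digits, symbols) are omitted for brevity.
PROMPTS = {
    1: ["あ", "い"],            # kana
    5: ["アン", "イン", "エン"],  # kanji readings
}


def register_all(prompts, capture):
    """Fill the registration matrix, one utterance per (segment, word) cell.

    `capture(text)` is a hypothetical callback that displays `text`,
    records the operator's utterance, and returns its voiceprint.
    """
    memory = {}
    for segment in sorted(prompts):
        # Word numbers start at 1, as in the description.
        for word, text in enumerate(prompts[segment], start=1):
            memory[(segment, word)] = capture(text)
    return memory


# Stand-in for real audio capture, used only to show the resulting layout.
memory = register_all(PROMPTS, capture=lambda text: f"voiceprint:{text}")
print(memory[(1, 1)], memory[(5, 1)])
```

Keying the dictionary by the `(segment, word)` pair mirrors the coordinates of the matrix-shaped memory area, so the recognition result code can later be used directly as a key.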

Once registration is complete, when speech is input by the operator, the speech recognition device 11 converts it into an electrical signal, compares it with the speech stored in the recognition code table, and outputs the segment and word number of the matching or most similar area as the recognition result code. For example, if the operator utters "AN", segment 5, word number 1 is output to the processing device 13 as the recognition result code. On receiving this code, the processing device 13 converts it into the kana character string "an" (アン).
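The matching step, which returns the segment and word number of the most similar registered sound, can be sketched as a nearest-template search. The patent does not say how similarity is measured; the numeric feature vectors and the Euclidean distance used here are assumptions chosen only to make the sketch runnable, and `recognize` is a hypothetical name.

```python
import math


def recognize(memory, features):
    """Return the (segment, word) key of the closest registered template.

    `memory` maps (segment, word) to a feature vector; both the vector
    form and the Euclidean distance are assumptions of this sketch.
    """
    return min(memory, key=lambda key: math.dist(memory[key], features))


# Two made-up templates: one for "あ" and one for "アン".
templates = {
    (1, 1): (0.0, 0.0),
    (5, 1): (1.0, 1.0),
}

result = recognize(templates, (0.9, 1.1))
print(result)
```

The returned pair plays the role of the recognition result code that is passed to the processing device 13.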

The method of this conversion within the processing device 13 will now be described with reference to FIG. 4.

The processing device 13 contains a table for converting the segment number and word number sent from the speech recognition device 11 into kana codes. For this table, the upper 4 bits of the address are formed from the segment number and the lower 8 bits from the word number. The segment number is converted to hexadecimal as it is to form the upper 4 bits of the address, whereas the word number is multiplied by 4 and then converted to hexadecimal to form the lower 8 bits. For example, when segment 5, word number 1 is input, the processing device 13 converts these numbers into the address 010100000100 in binary, that is, 0504 in hexadecimal (the leading 0 is a dummy digit), and accesses the table. Consecutive segment and word numbers therefore correspond to every fourth address in the table. One segment number and word number thus select one address, but the contents of the three addresses that follow it are read out of the table as well. Since the area specified by one address holds a one-character kana code, an access using the single address formed from one segment number and word number reads up to four kana codes from the table.
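The address rule (segment number in the upper 4 bits, word number multiplied by 4 in the lower 8 bits) can be worked through in a few lines. The function name `table_address` and the range checks are assumptions added for illustration; the bit layout itself follows the description.

```python
def table_address(segment: int, word: int) -> int:
    """Build the 12-bit table address from a segment and word number.

    Upper 4 bits: the segment number as it is.
    Lower 8 bits: the word number multiplied by 4.
    """
    if not 0 <= segment <= 0xF:
        raise ValueError("segment number must fit in 4 bits")
    if not 0 <= word * 4 <= 0xFF:
        raise ValueError("word number times 4 must fit in 8 bits")
    return (segment << 8) | (word * 4)


address = table_address(5, 1)
print(f"{address:04X}")   # hexadecimal with the leading dummy digit
print(f"{address:012b}")  # the 12 address bits
```

Because each word number advances the address by 4, segment 5, word 2 lands on 0508, which leaves room for up to four one-character kana codes per entry.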

In FIG. 4, for example, the area specified by address 0504 holds the one-byte kana code for "a" (ア), and the area specified by address 0505 holds the one-byte kana code for "n" (ン). Addresses 0506 and 0507 are empty. When the recognition result code for segment 5, word number 1 is input from the speech recognition device 11, the processing device 13 converts this code to address 0504 and accesses the table with this address. The contents of addresses 0504, 0505, 0506, and 0507, that is, the two kana codes for "a" (ア) and "n" (ン), are thereby read out.
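The four-address read described for FIG. 4 can be sketched as follows. The table is modelled as a Python dictionary from address to a single kana character, which is an assumption of this sketch (the patent stores one-byte kana codes whose encoding is not given), and `read_kana` is a hypothetical name.

```python
# Table contents taken from the FIG. 4 example; 0x506 and 0x507 are empty.
KANA_TABLE = {
    0x504: "ア",
    0x505: "ン",
}


def read_kana(table, segment, word):
    """Read the addressed entry and the three entries that follow it."""
    base = (segment << 8) | (word * 4)
    # Empty areas contribute nothing, so a reading may have 1 to 4 kana.
    return "".join(table.get(base + offset, "") for offset in range(4))


reading = read_kana(KANA_TABLE, 5, 1)
print(reading)
```

The resulting kana string is what the processing device would then use as the key into the kanji dictionary 14.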

The processing device 13 can then access the kanji dictionary 14 using the kana codes that were read out and obtain the corresponding 2-byte kanji code.

Because speech is registered in the speech recognition device 11 in units of kanji on-readings in this way, the operator can obtain one kanji with a single continuous utterance.

As described above, by registering multi-syllable readings of kanji in the speech recognition device 11, the present invention reduces the operations the operator must perform to enter kanji and improves the input speed.

[Brief explanation of drawings]

FIG. 1 is a diagram showing some of the Tōyō kanji that share the same reading, FIG. 2 is a block diagram of one embodiment of the present invention, FIG. 3 is a diagram showing the speech registration area of the speech recognition device, and FIG. 4 is a diagram showing the kana code storage area of the processing device.

Claims

[Claims]

A voice word processor comprising: a speech recognition device in which multi-syllable readings of kanji are registered as recognizable speech and which converts input speech into a recognition code and outputs it; a kanji dictionary from which the corresponding kanji are accessed by a kana character code; and a processing device that converts the recognition code from the speech recognition device into a kana character code, accesses the kanji dictionary, and obtains the kanji corresponding to the kana character code.