JPS62119595A

JPS62119595A - Japanese language processor

Info

Publication number: JPS62119595A
Application number: JP60260733A
Authority: JP
Inventors: 潤一郎藤本; 林　大川; 山岸　美奈; 中谷　奉文
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1985-11-20
Filing date: 1985-11-20
Publication date: 1987-05-30

Abstract

(57) [Summary] This publication consists of application data from before electronic filing, so no abstract data is recorded.

Description

Detailed Description of the Invention

The present invention relates to a Japanese language processing device, and more particularly to a Japanese document creation device equipped with a voice input device.

Background Art

In recent years, Japanese document creation devices, so-called word processors, have come into increasingly common use for preparing documents. Keyboard input, however, requires a certain amount of training, and this has hindered their wider adoption. As one way of dealing with this, word processors equipped with a voice input device have been developed, allowing text to be entered without searching for characters on the keyboard. However, ordinary voice word processors use a monosyllable input method, so that to enter, for example, 「本発明」 ("the present invention"), the user must break the utterance into syllables, as in 「ほ・ん・は・つ・め・い」 (ho, n, ha, tsu, me, i).

To eliminate this unnaturalness, word input, phrase input, sentence input, and the like are needed, and such attempts have been made. However, a word processor requires a vocabulary of at least 500 words, and for recognizing that many words the speaker-dependent pattern matching method is the one that is currently practical. In the speaker-dependent method the user must register the word utterances in his or her own voice. In this case the word processor cannot be used until each of the 500 or more words has been uttered at least once, which places an enormous burden on the user.

Purpose

The present invention was made in view of the circumstances described above. Its particular object is to provide a Japanese speech information processing device in which patterns for Japanese text can be registered with little burden on the user.

Configuration

To achieve the above object, the present invention is a Japanese speech processing device comprising a voice input section, an analysis section, a recognition section, a display section that displays the results output from the recognition section, and an instruction section that gives instructions regarding the displayed results. It is characterized in that it has a section that temporarily holds the input speech and that, for a displayed result, the pattern in the holding section corresponding to the Japanese text indicated from the instruction section is registered as a standard pattern for speech recognition. The invention is described below on the basis of embodiments.

FIG. 1 is a configuration diagram for explaining one embodiment of the present invention. In the figure, 1 is a microphone, 2 is a voice section extraction unit, 3 is a feature conversion unit, 4 is a register, 5 is a collation unit, 6 is a standard pattern unit, 7 is a kana-kanji conversion unit, 8 is a display unit, and 9 is a keyboard. It is assumed that a small number of standard patterns have been registered in the standard pattern unit 6 in advance.

Suppose now that an unknown utterance is input from the microphone 1. The speech portion of the input is converted into features such as a frequency spectrum and stored temporarily in the register 4, and at the same time the feature pattern of the unknown utterance is displayed on the display unit 8 as a recognition result. The display unit 8 may be an ordinary CRT, and a kana-kanji conversion unit may be placed ahead of the display unit so that the result is shown in kanji. The display unit 8 has a keyboard attached, which is used to correct the text while viewing the CRT.

For example, suppose the user wants to produce the sentence shown in FIG. 2, 「本日は誠にお日柄も良く・・・」 ("Today is truly an auspicious day..."), and enters it word by word as 「本日・は・誠・に・・・」 (honjitsu, wa, makoto, ni, ...), but 「誠」 (makoto) has not been registered as a standard pattern. In that case, as shown in the figure, the already registered word most similar to 「誠」, namely 「まるごと」 (marugoto), is output.
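The behavior in this example, where an unregistered word comes out as the most similar registered word, can be sketched as nearest-pattern matching. The Python below is only an illustration: the patent does not specify the feature format or the distance measure, so the three-element feature vectors, the Euclidean distance, and the names `recognize` and `standard_patterns` are all assumptions.

```python
import math


def recognize(features, standard_patterns):
    """Return the registered word whose pattern is closest to `features`.

    Euclidean distance is an assumed stand-in for the collation unit 5;
    the patent does not say which measure is used.
    """
    if not standard_patterns:
        raise ValueError("no standard patterns registered")
    return min(
        standard_patterns,
        key=lambda word: math.dist(features, standard_patterns[word]),
    )


# Hypothetical feature vectors for the words registered in advance.
standard_patterns = {
    "honjitsu": [0.9, 0.1, 0.3],
    "wa": [0.1, 0.8, 0.2],
    "marugoto": [0.5, 0.5, 0.9],
    "ni": [0.2, 0.2, 0.1],
}

# Made-up features for an utterance of the unregistered word "makoto".
unknown = [0.6, 0.4, 0.8]
result = recognize(unknown, standard_patterns)
print(result)
```

With these made-up values the closest registered pattern is the one for "marugoto", which mirrors the misrecognition shown in FIG. 2.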

Speech entered word by word in this way is converted into features and then stored in the register in order. The user now moves the cursor to the place where the wrong word 「まるごと」 (marugoto) is displayed and corrects it to 「まこと」 (makoto) from the keyboard. This shows that the third entry in the register, 「まこと」, was misrecognized as 「まるごと」, and the pattern for 「まこと」 held in the register is newly registered as a standard pattern. With this method the user can register patterns without being conscious of registering standard patterns for the speech recognition device.

In the preceding example, however, it is not known whether 「まこと」 was misrecognized because its pattern had not been registered or because the word simply happened to be uttered poorly. One conceivable method is therefore to keep a list of the words registered as standard patterns and to register a new pattern only when the word corrected from the keyboard is not in that list.
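The correction-driven registration described above, including the check against the list of registered words, might look like the following Python sketch. It is a simplified illustration under stated assumptions: the function name `apply_correction`, the list-based register, and the feature values are hypothetical, and the keys of the `standard_patterns` dictionary stand in for the registered word list.

```python
def apply_correction(register, position, corrected_word, standard_patterns):
    """Handle a keyboard correction of the word displayed at `position`.

    The held pattern is registered under `corrected_word` only when that
    word is not yet registered. If it is already registered, the error is
    treated as a poor utterance and nothing is stored.
    Returns True when a new standard pattern was registered.
    """
    if not 0 <= position < len(register):
        raise IndexError("no held pattern at that position")
    if corrected_word in standard_patterns:  # word list check
        return False
    standard_patterns[corrected_word] = register[position]
    return True


# Hypothetical patterns registered in advance.
standard_patterns = {
    "honjitsu": [0.9, 0.1, 0.3],
    "wa": [0.1, 0.8, 0.2],
    "marugoto": [0.5, 0.5, 0.9],
    "ni": [0.2, 0.2, 0.1],
}

# Features held in utterance order: honjitsu, wa, makoto, ni.
register = [
    [0.9, 0.1, 0.3],
    [0.1, 0.8, 0.2],
    [0.6, 0.4, 0.8],
    [0.2, 0.2, 0.1],
]

# The third word was displayed as "marugoto"; the user types "makoto".
first = apply_correction(register, 2, "makoto", standard_patterns)
# A later identical correction finds "makoto" already registered.
second = apply_correction(register, 2, "makoto", standard_patterns)
print(first, second)
```

The first correction registers the held pattern, and the second leaves the standard patterns unchanged, so the user adds vocabulary simply by fixing text.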

FIG. 3 shows one embodiment of an electrical block diagram for carrying out the above method. This embodiment is the same as the embodiment shown in FIG. 1 except that it has a registered word list unit 10. If patterns are registered one after another in this way, however, the number of easily confused words grows. To prevent this, a method that introduces word similarity can be considered.

FIG. 4 shows one embodiment in which the above word similarity is introduced. In this embodiment, as in the previous examples, the input is converted into features and recognized in the order in which it was entered, and the results are shown on the display unit to build up the Japanese text. The register is arranged to hold the patterns of several input utterances; when it becomes full, patterns are erased starting with the oldest so that new patterns can be stored.

As in the example of FIG. 2, errors are corrected on the display unit after several utterances have been recognized. Suppose the error is, for example, that the user intended to enter 「大系的に・・・」 and uttered 「たいけい・てき・に・・」 (taikei, teki, ni, ...), but 「会計的に・・・」 (kaikei-teki ni) was displayed by mistake. 「大系」 (taikei) and 「会計」 (kaikei) are very much alike, so the similarity between the input 「大系」 and the standard pattern 「会計」 is also high. Registering a standard pattern for 「大系」 would therefore increase the number of errors between 「大系」 and 「会計」. For this reason the similarity recognition unit 11 compares the similarity, and the input pattern is registered as a standard pattern only when the similarity determination unit 12 determines that the similarity is smaller than a certain reference value, that is, only when the standard patterns contain no easily confused word.

This prevents similar patterns from being registered.
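The two mechanisms of this embodiment, a register that discards its oldest patterns when full and a similarity check before registration, can be sketched in Python as follows. This is an assumption-laden illustration rather than the patented circuit: the similarity formula `1 / (1 + distance)`, the threshold of 0.9, the register size of 3, and all names and feature values are hypothetical. It uses `math.dist`, available in Python 3.8 and later.

```python
import math
from collections import deque


def similarity(a, b):
    """Assumed similarity measure: 1.0 for identical patterns, falling toward 0."""
    return 1.0 / (1.0 + math.dist(a, b))


def register_if_distinct(word, features, standard_patterns, threshold):
    """Register `features` under `word` unless a registered pattern is too similar.

    Returns True when the pattern was registered.
    """
    for pattern in standard_patterns.values():
        if similarity(features, pattern) >= threshold:
            return False  # an easily confused word already exists
    standard_patterns[word] = features
    return True


# A register holding at most three patterns; the oldest is dropped when full.
register = deque(maxlen=3)
for held in ([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]):
    register.append(held)

standard_patterns = {"kaikei": [0.50, 0.50]}
THRESHOLD = 0.9  # hypothetical reference value

# "taikei" sounds almost like "kaikei", so its pattern is rejected.
taikei_registered = register_if_distinct(
    "taikei", [0.52, 0.49], standard_patterns, THRESHOLD
)
# "makoto" is far from every registered pattern, so it is accepted.
makoto_registered = register_if_distinct(
    "makoto", [3.0, 3.0], standard_patterns, THRESHOLD
)
print(list(register), taikei_registered, makoto_registered)
```

Using `deque(maxlen=...)` gives the oldest-first erasure directly, and keeping the threshold as a parameter reflects the text's "certain reference value" without committing to a specific number.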

Effect

As is clear from the above description, the present invention makes it possible to realize a Japanese language processing device in which speech standard patterns can be registered with a reduced burden on the user.

[Brief explanation of drawings]

FIG. 1 is an electrical block diagram for explaining one embodiment of the Japanese language processing device according to the present invention, FIG. 2 is a diagram for explaining the operation of the present invention, and FIGS. 3 and 4 are diagrams each explaining another embodiment of the present invention.

1: microphone, 2: voice section extraction unit, 3: feature conversion unit, 4: register, 5: collation unit, 6: standard pattern unit, 7: kana-kanji conversion unit, 8: display unit, 9: keyboard, 10: registered word list unit, 11: similarity recognition unit, 12: similarity determination unit.

Patent applicant: Ricoh Co., Ltd.

[FIG. 2. Utterance: 「ほんじつ」「は」「まこと」「に」; register: ほんじつ, は, まこと, に; display: 本日 は まるごと に; correction: まこと. FIGS. 1, 3 and 4 are not reproduced.]

Claims

[Claims]

(1) A Japanese language processing device, being a Japanese speech processing device comprising a voice input section, an analysis section, a recognition section, a display section that displays the results output from the recognition section, and an instruction section that gives instructions regarding the displayed results, characterized in that it has a section that temporarily holds the input speech and that, for a displayed result, the pattern in the holding section corresponding to the Japanese text indicated from the instruction section is registered as a standard pattern for speech recognition.

(2) The Japanese language processing device according to claim (1), characterized in that it determines whether or not the character string indicated from the instruction section is registered among the speech standard patterns, and registers it if it is not registered.

(3) The Japanese language processing device according to claim (1), characterized in that it has a section that temporarily holds the similarity of the input speech, and registers patterns for which the similarity of the recognition result is less than a certain value.