JPS6148032A

JPS6148032A - Speech input type Japanese document processor

Info

Publication number: JPS6148032A
Application number: JP59169568A
Authority: JP
Inventors: Takeshi Yoshii; 健吉井; Fumio Togawa; 外川　文雄
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1984-08-14
Filing date: 1984-08-14
Publication date: 1986-03-08

Abstract

PURPOSE: To reduce the capacity of the buffer that holds the transition matrix used to judge the suitability of candidate syllable strings, by making a preliminary judgment with the transition matrix of order (M-1), whose exponent is one lower, instead of applying the transition matrix of order M directly. CONSTITUTION: Speech information entering a monosyllable recognition section 1 is compared with the patterns stored in a standard pattern memory 2; the standard patterns closest to the input speech are selected as candidates in order of closeness, and candidate strings formed from combinations of these candidates are stored in a candidate string buffer 4. The candidate strings stored in buffer 4 are then judged for suitability using a first-order transition matrix held in a transition matrix buffer 5, and next using a second-order transition matrix; those that remain are stored in a candidate string creating section 6. A phrase analysis section 7 then performs higher-order processing such as dictionary collation. Finally, the selected phrase is stored in a recognition result memory 8 and output as appropriate.

Description

DETAILED DESCRIPTION OF THE INVENTION

<Industrial Application Field>
The present invention relates to a method of recognizing input speech in a voice-input Japanese document processing device.

<Prior Art>
As a method of recognizing, syllable by syllable, speech uttered in units such as phrases, it has already been proposed to use a transition matrix that describes the transition relations representing the connections between the syllable units to be recognized, to exclude candidate syllable strings (hereinafter simply "candidate strings") containing combinations for which a transition between syllable units is impossible, and to perform the subsequent processing only on the remaining candidate strings (see, for example, Japanese Unexamined Patent Publication No. Sho 59-58493).

The aim of that method is to raise the accuracy with which phrases and the like are correctly recognized and, as a result, to reduce the amount of higher-order processing. It can be extended from first-order transitions to second-order transitions and further to M-th order transitions; the larger the order M, the more strongly the candidate strings are restricted and the higher the recognition accuracy that can be achieved.

<Problem to Be Solved by the Invention>
There are 111 Japanese syllable units to be recognized, and the buffer capacity needed to describe the transition relations between these syllables is a power of 112 (= 111 + 1). Consequently, an M-th order transition matrix for candidate strings consisting of (M+1) syllable units requires 112 to the (M+1)-th power bits. Although a larger order M gives a stronger restriction and a greater effect, it brings with it the problem that the buffer into which the transition matrix is written becomes enormous.
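As a rough illustration of the growth described above, the following sketch tabulates 112^(M+1) bits for a few orders M. The function name and the chosen range of M are assumptions made for illustration; only the symbol count 112 (= 111 + 1) and the exponent (M+1) come from the text.

```python
SYMBOLS = 112  # 111 syllable units + 1, as stated in the text


def direct_matrix_bits(order: int) -> int:
    """Bits for an order-M matrix holding 1 bit per (M+1)-symbol combination."""
    return SYMBOLS ** (order + 1)


for m in (1, 2, 3):
    bits = direct_matrix_bits(m)
    print(f"M={m}: {bits:,} bits (about {bits / 8 / 1024:.1f} KiB)")
```

Each step up in order multiplies the requirement by 112, which is the growth the invention sets out to contain.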

The present invention solves this problem, and its object is to reduce the required capacity of the buffer into which the transition matrix is written.

<Means for Solving the Problem>
To achieve the above object, when judging the suitability of a candidate string consisting of (M+1) syllable units, the present invention first applies the (M-1)-th order transition matrix to exclude candidate strings containing combinations for which a transition is impossible, and applies the M-th order transition matrix only to the remaining candidate strings.

For example, consider the first-order transition matrix, that is, the matrix describing the transition relations that represent the connection between two consecutive syllables. This matrix is subsumed in the second-order transition matrix, that is, the matrix describing the transition relations that represent the connections among three consecutive syllables. In other words, if the first-order transition relation does not hold, the second-order transition relation cannot hold either. The present invention focuses on this point: only those syllable combinations whose transition has been affirmed by the (M-1)-th order transition matrix are extended to the M-th order and judged for suitability.
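The two-stage judgment described above can be sketched for M = 2 as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the transition matrices are stood in for by plain Python sets of permitted pairs and triples, the syllable lattice is a list of per-position candidate lists, and the function name `prune` and the romanized syllables are hypothetical.

```python
from itertools import product


def prune(lattice, allowed_pairs, allowed_triples):
    """Keep candidate strings that pass the order-1 check, then the order-2 check."""
    survivors = []
    for cand in product(*lattice):
        # Stage 1 (order M-1 = 1): reject if any adjacent pair is impossible.
        if any((a, b) not in allowed_pairs for a, b in zip(cand, cand[1:])):
            continue
        # Stage 2 (order M = 2): only pair-valid strings reach the triple check.
        if all(t in allowed_triples for t in zip(cand, cand[1:], cand[2:])):
            survivors.append(cand)
    return survivors


# Illustrative data only (assumed, not from the patent).
lattice = [["ka", "ga"], ["n", "mu"], ["ji", "chi"]]
allowed_pairs = {("ka", "n"), ("ga", "n"), ("n", "ji"), ("n", "chi")}
allowed_triples = {("ka", "n", "ji")}

print(prune(lattice, allowed_pairs, allowed_triples))  # -> [('ka', 'n', 'ji')]
```

Because every permitted triple necessarily consists of permitted pairs, the first stage never discards a string the second stage would have accepted, so the result matches a direct second-order check.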

<Operation>
In this way, the M-th order transition matrix is not applied at once. A preliminary judgment is first made with the (M-1)-th order transition matrix, whose exponent is one lower, so fewer candidate strings become subject to the M-th order transition matrix and the required capacity of the buffer into which the transition matrix is written is greatly reduced. Moreover, the recognition accuracy is equal to that obtained when the M-th order transition matrix is applied directly.

<Embodiment>
The present invention is described concretely below with reference to the embodiment shown in the drawings.

In Fig. 1, speech information input to the monosyllable recognition section 1 is compared with the standard patterns stored in the standard pattern memory 2. Standard patterns close to the input speech are selected as candidates in order of closeness and stored in time series in the syllable lattice buffer 3, and candidate strings formed from combinations of these candidates are stored in the candidate string buffer 4. Next, the plural candidate strings stored in buffer 4 are judged for suitability using the first-order transition matrix (hereinafter MATRIX-1) in the transition matrix buffer 5, and then using the second-order transition matrix (hereinafter MATRIX-2); those that remain are stored in the candidate string creation section 6. Next, the phrase analysis section 7 performs higher-order processing such as dictionary matching, and the finally selected phrase is stored in the recognition result memory 8 and output as appropriate.

Reference numeral 9 denotes a CPU that controls these operations.

Next, MATRIX-1 and MATRIX-2 are described. A basic explanation of transition matrices is given in detail in the patent publication cited above and is therefore omitted here.

Fig. 2 shows MATRIX-1 and Fig. 3 shows MATRIX-2. MATRIX-1 records block numbers of MATRIX-2 in 2-byte units; in this respect it differs from the transition matrix of the cited prior art, which describes the transition relations between syllable units in 1-bit units. When the transition relation of a candidate string is checked using MATRIX-1 and the first-order transition is impossible, the block number <0000>HEX is given; when the transition is possible, the block number of MATRIX-2 to be referred to next is given.

MATRIX-2 is divided into blocks of 112 bits, each with a block number. Block number 0 negates all first-order transition relations and is a block in which all 112 bits are "0". For example, when the transition relation between certain syllable units in MATRIX-1 is <0000>HEX, processing jumps to block 0 of MATRIX-2. Block 0 thus expresses that the first-order transition did not hold and, at the same time, denies the second-order transition. When the transition is possible and a block number other than 0 is indicated, processing jumps to the block with that number and examines the 112 bits from the head of the block. Here each bit corresponds to one syllable unit. Note that when the leading bit of a MATRIX-2 block is "1" and the remaining 111 bits are all "0", this denotes a suffix.
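The lookup through MATRIX-1 and MATRIX-2 might be sketched as below. The in-memory layout is an assumption: MATRIX-1 is modeled as a 112 x 112 table of integer block numbers (standing in for the 2-byte entries), MATRIX-2 as a list of 112-element bit lists, and index k of a block is assumed to correspond to symbol k. The function name `triple_allowed` and all table contents are hypothetical.

```python
N = 112  # symbols per block: 111 syllable units + 1, per the text


def triple_allowed(matrix1, matrix2, first, second, third):
    """Check a three-symbol transition via MATRIX-1, then MATRIX-2."""
    block_no = matrix1[first][second]      # stands in for the 2-byte block number
    if block_no == 0:                      # <0000>HEX: first-order transition impossible
        return False
    return matrix2[block_no][third] == 1   # one bit per following syllable unit


# Tiny hand-made tables (illustrative values only).
matrix1 = [[0] * N for _ in range(N)]
matrix2 = [[0] * N]          # block 0: all 112 bits are 0
matrix1[5][7] = 1            # pair (5, 7) is possible: refer to block 1
block1 = [0] * N
block1[9] = 1                # after (5, 7), symbol 9 may follow
matrix2.append(block1)

print(triple_allowed(matrix1, matrix2, 5, 7, 9))  # True
print(triple_allowed(matrix1, matrix2, 5, 7, 3))  # False
print(triple_allowed(matrix1, matrix2, 2, 4, 9))  # False: the pair itself is rejected
```

The early return for block number 0 is a shortcut; following the jump to block 0 as the text describes would give the same answer, since every bit of that block is 0.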

In this way, the suitability of the transition relations is judged for M = 2, that is, for candidate strings consisting of three syllable units. The number of bits required when the second-order transition matrix is applied directly, as in the cited prior art, and when the first-order transition matrix is applied first and the second-order transition matrix next, as in the present invention, is as follows.

(a) Prior art:
    112^3 = 1,404,928 [bits] ... (A)

(b) Present invention:
    MATRIX-1: 16 x 112^2 = 200,704 [bits] ... (B)
    MATRIX-2: 112^3 x 1/2 = 702,464 [bits] ... (C)
    (calculated taking the share of "1" entries, i.e. permitted transitions, in MATRIX-1 as 1/2 of the total)

Therefore the ratio of (b) to (a) is (B + C) / A = 903,168 / 1,404,928 = 0.6428, and the present invention obtains an equivalent effect with a buffer capacity of about 65 [%] of that of the prior art.
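The figures in this comparison can be re-derived with a few lines of arithmetic. The variable names below are chosen for illustration; the 16-bit entry width follows from the 2-byte block numbers of MATRIX-1, and the factor 1/2 is the occupancy assumed in the text.

```python
N = 112                  # 111 syllable units + 1

A = N ** 3               # prior art: second-order matrix at 1 bit per triple
B = 16 * N ** 2          # MATRIX-1: one 2-byte (16-bit) block number per pair
C = N ** 3 // 2          # MATRIX-2: blocks needed for half of the pairs

ratio = (B + C) / A
print(f"A = {A:,}  B = {B:,}  C = {C:,}  B + C = {B + C:,}")
print(f"(B + C) / A = {ratio:.6f}")
```

The exact ratio is 9/14, which the text quotes truncated to four decimal places.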

The present invention is effective in reducing the buffer capacity needed to describe the transition relations between arbitrary syllables, and the buffer capacity can be reduced further by combining it with the following means. These means are also applicable to the prior art and have some effect even when used on their own.

The first is to additionally use a transition matrix that describes only the connection relations between specific, limited syllable units among the predetermined Japanese syllable units to be recognized; for example, a transition matrix representing the connections between the geminate marker "っ" (sokuon), which is a special syllable unit, and the units before and after it.

The second is to group the predetermined Japanese syllable units to be recognized in advance according to categories that are meaningful for speech recognition, and to use a transition matrix that describes the connection relations between those groups. For example, phonologically similar units, such as (a) the pa, ta, and ka rows, (b) the ba, da, and ga rows, (c) the sa and ha rows, and (d) the ma, na, and ra rows, are each treated as a single kind of syllable unit even though each group contains different syllable units, and a transition matrix is used that describes the transition relations, that is, the connection relations, among these groups and the remaining syllable units.
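A minimal sketch of the grouping idea follows. It assumes the four groups are the pa/ta/ka, ba/da/ga, sa/ha, and ma/na/ra rows, which is a reading of the damaged source text, and it works at the level of kana rows rather than individual syllables. The class labels "A" to "D" and the function name `unit_class` are hypothetical.

```python
# Assumed grouping of kana rows into merged classes (a reading of the source).
GROUPS = {
    "pa": "A", "ta": "A", "ka": "A",   # group (a)
    "ba": "B", "da": "B", "ga": "B",   # group (b)
    "sa": "C", "ha": "C",              # group (c)
    "ma": "D", "na": "D", "ra": "D",   # group (d)
}


def unit_class(row: str) -> str:
    """Map a kana row to its merged class; ungrouped rows stay as themselves."""
    return GROUPS.get(row, row)


rows = ["ka", "ta", "wa", "na", "ra"]
print([unit_class(r) for r in rows])   # -> ['A', 'A', 'wa', 'D', 'D']
```

A transition matrix indexed by these classes has fewer rows and columns than one indexed by every syllable unit, which is where the further saving in buffer capacity would come from.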

<Effects of the Invention>
As is clear from the description of the embodiment above, the present invention makes it possible to reduce the capacity of the buffer into which the transition matrix used for judging the suitability of candidate syllable strings is written, and thereby to achieve a smaller device, lower cost, shorter processing time, and the like.

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing the configuration of one embodiment of the present invention, Fig. 2 is a diagram showing an example of the first-order transition matrix of the same embodiment, and Fig. 3 is a diagram showing an example of the second-order transition matrix of the same embodiment.

1 ... monosyllable recognition section
2 ... standard pattern memory
3 ... syllable lattice buffer
4 ... candidate string buffer
5 ... transition matrix buffer
6 ... candidate string creation section
7 ... phrase analysis section
8 ... recognition result memory
9 ... CPU

Claims

[Claims]

1. A voice-input Japanese document processing device which recognizes, syllable by syllable, speech uttered in units such as phrases, creates candidate syllable strings in order of the most reliable combinations from the plural candidates recognized for each syllable unit, uses a transition matrix describing the transition relations that represent the connections between the predetermined Japanese syllable units to be recognized to exclude candidate syllable strings having combinations for which a transition between syllable units is impossible, performs subsequent processing such as dictionary matching on the remaining candidate syllable strings, and outputs recognition results for phrases and the like, characterized in that, when the suitability of a candidate syllable string consisting of (M+1) syllable units is judged, the (M-1)-th order transition matrix is first applied to exclude candidate syllable strings having combinations for which a transition is impossible, and the M-th order transition matrix is applied only to the remaining candidate syllable strings to judge their suitability.