JPS6148032A - Speech input type japanese document processor - Google Patents
- Publication number
- JPS6148032A
- Authority
- JP
- Japan
- Prior art keywords
- candidate
- transition matrix
- syllable
- order
- stored
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Document Processing Apparatus (AREA)
Description
【Detailed Description of the Invention】
〈Industrial Application Field〉
The present invention relates to a method for recognizing input speech in a speech-input Japanese document processor.
〈Prior Art〉
As a method for recognizing a stretch of speech such as a phrase syllable by syllable, it has already been proposed to use a transition matrix, which describes the transition relations representing the connections between the syllable units to be recognized, to exclude candidate syllable strings (hereinafter simply "candidate strings") that contain a combination in which a transition between syllable units is impossible, and to perform the subsequent processing only on the remaining candidate strings (see, for example, Japanese Unexamined Patent Publication No. 59-58493).
The aim is to raise the accuracy with which phrases and the like are recognized correctly and, as a result, to reduce the amount of higher-level processing. The approach can be extended from first-order transitions to second-order transitions and further to M-th-order transitions; the larger the order M, the stronger the restriction on the candidate strings and the higher the recognition accuracy that can be achieved.
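The first-order case of this filtering can be sketched as follows. This is a minimal illustration, not the cited publication's implementation: the four-symbol inventory, the names `ALLOWED`, `is_possible` and `filter_candidates`, and the specific forbidden pairs are all assumptions made up for the example.

```python
from itertools import product

# Hypothetical toy inventory (the patent's real inventory has 111 syllables).
SYLLABLES = ["ka", "ki", "n", "tsu_small"]

# ALLOWED[(a, b)] is True when syllable b may directly follow syllable a.
# The forbidden pairs below are invented purely for illustration.
ALLOWED = {(a, b): True for a, b in product(SYLLABLES, repeat=2)}
ALLOWED[("n", "n")] = False
ALLOWED[("tsu_small", "tsu_small")] = False
ALLOWED[("tsu_small", "n")] = False


def is_possible(candidate):
    """Return True if every adjacent pair in the candidate string is allowed."""
    return all(ALLOWED[(a, b)] for a, b in zip(candidate, candidate[1:]))


def filter_candidates(candidates):
    """Drop candidate strings that contain at least one impossible transition."""
    return [c for c in candidates if is_possible(c)]
```

Only the strings returned by `filter_candidates` would be passed on to the more expensive later stages such as dictionary matching.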
〈Problems to Be Solved by the Invention〉
There are 111 Japanese syllables to be recognized, and the buffer capacity needed to describe the transition relations between them is a power of 112 (= 111 + 1). Accordingly, creating an M-th-order transition matrix for candidate strings of (M+1) syllable units requires 112 to the power (M+1) bits. Although a larger order M gives a stronger restriction and a greater effect, it brings the problem that the buffer needed for writing the transition matrix becomes enormous.
The present invention was made to solve this problem, with the aim of reducing the required capacity of the transition-matrix writing buffer.
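To make the growth concrete, the short sketch below evaluates 112 to the power (M+1) for a few orders. The function name `transition_matrix_bits` is an assumption for illustration; the only figures taken from the text are the base of 112 and the exponent (M+1).

```python
def transition_matrix_bits(order, symbols=112):
    """Bits needed for a full transition matrix of the given order.

    A matrix of order M covers strings of (M + 1) syllable units,
    with one bit per possible string, hence symbols ** (M + 1).
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    return symbols ** (order + 1)


if __name__ == "__main__":
    for m in (1, 2, 3):
        bits = transition_matrix_bits(m)
        print(f"M={m}: {bits:,} bits ({bits / 8 / 1024:,.1f} KiB)")
```

Each increase of the order by one multiplies the required capacity by 112.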
〈Means for Solving the Problems〉
To achieve the above object, in the present invention, when judging the suitability of a candidate string of (M+1) syllable units, the (M-1)-th-order transition matrix is applied first to exclude candidate strings containing combinations for which a transition is impossible, and the M-th-order transition matrix is applied only to the remaining candidate strings.
For example, consider the first-order transition matrix, which describes the transition relations representing the connections between two consecutive syllables. This matrix is contained in the second-order transition matrix, which describes the transition relations representing the connections between three consecutive syllables. In other words, if a first-order transition relation does not hold, the second-order transition relation does not hold either. The present invention focuses on this point: only the syllable combinations whose transition is affirmed by the (M-1)-th-order transition matrix are extended to the M-th order and judged for suitability.
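The two-stage judgment for M = 2 can be sketched as below. This is an illustrative sketch under stated assumptions, not the patent's implementation: transition relations are held in plain Python sets rather than bit matrices, and the names `first_order_ok`, `second_order_ok` and `two_stage_filter` are hypothetical.

```python
def first_order_ok(candidate, allowed_pairs):
    """True if every pair of consecutive syllables is an allowed transition."""
    return all(pair in allowed_pairs for pair in zip(candidate, candidate[1:]))


def second_order_ok(candidate, allowed_triples):
    """True if every run of three consecutive syllables is allowed."""
    triples = zip(candidate, candidate[1:], candidate[2:])
    return all(t in allowed_triples for t in triples)


def two_stage_filter(candidates, allowed_pairs, allowed_triples):
    """Apply the first-order check, then the second-order check to survivors only."""
    survivors = [c for c in candidates if first_order_ok(c, allowed_pairs)]
    return [c for c in survivors if second_order_ok(c, allowed_triples)]


if __name__ == "__main__":
    pairs = {("a", "b"), ("b", "c"), ("b", "a")}
    triples = {("a", "b", "c")}  # every pair inside an allowed triple is also allowed
    cands = [("a", "b", "c"), ("a", "b", "a"), ("a", "c", "b")]
    print(two_stage_filter(cands, pairs, triples))
```

As long as every allowed triple is made of allowed pairs, which is the containment property described in the text, the result is the same as applying the second-order check directly to all candidates.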
〈Operation〉
Because the M-th-order transition matrix is not applied at once, and a preliminary judgment is instead made with the (M-1)-th-order transition matrix, whose order is one lower, fewer candidate strings are subjected to the M-th-order transition matrix. The required capacity of the transition-matrix writing buffer is greatly reduced, yet the recognition accuracy is equivalent to that obtained by applying the M-th-order transition matrix directly.
〈Embodiment〉
The present invention is described concretely below with reference to an embodiment shown in the drawings.
In FIG. 1, the speech input to the monosyllable recognition unit 1 is compared with the standard patterns stored in the standard pattern memory 2. The standard patterns closest to the input speech are selected as candidates in order of closeness and stored in time series in the syllable lattice buffer 3, and the candidate strings formed by combining them are stored in the candidate string buffer 4. Next, the candidate strings stored in buffer 4 are judged for suitability using the first-order transition matrix (hereinafter MATRIX-1) in the transition matrix buffer 5, then judged using the second-order transition matrix (hereinafter MATRIX-2), and those that remain are stored in the candidate string creation unit 6. The phrase analysis unit 7 then performs higher-level processing such as dictionary matching, and the finally selected phrase is stored in the recognition result memory 8 and output as appropriate. Reference numeral 9 denotes the CPU that controls these operations.
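The step from the syllable lattice to an ordered list of candidate strings can be sketched as follows. The source does not say how candidates are scored or combined, so this is only an assumption-laden illustration: each lattice slot holds hypothetical `(syllable, distance)` pairs, a smaller distance means a closer match, and a string's score is simply the sum of its distances.

```python
from itertools import product


def candidate_strings(lattice):
    """Enumerate candidate strings from a syllable lattice, best first.

    lattice: one list per time slot, each holding (syllable, distance)
    pairs. Strings are ordered by total distance, smallest first.
    """
    scored = []
    for combo in product(*lattice):
        syllables = tuple(s for s, _ in combo)
        total = sum(d for _, d in combo)
        scored.append((total, syllables))
    scored.sort()
    return [syllables for _, syllables in scored]


if __name__ == "__main__":
    lattice = [
        [("ka", 1), ("ga", 3)],
        [("n", 2)],
        [("ji", 1), ("chi", 4)],
    ]
    for s in candidate_strings(lattice):
        print(s)
```

Exhaustive enumeration grows quickly with the number of slots and candidates per slot, which is one reason an early transition-matrix check on the candidate strings is useful.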
Next, MATRIX-1 and MATRIX-2 are described. A basic explanation of transition matrices is given in detail in the patent publication cited above and is omitted here.
FIG. 2 shows MATRIX-1 and FIG. 3 shows MATRIX-2. MATRIX-1 stores, in 2-byte units, block numbers of MATRIX-2. In this respect it differs from the transition matrix of the prior art cited above, which describes the transition relation between syllable units in 1-bit units. When a candidate string is checked with MATRIX-1 and its first-order transition is impossible, the block number <0000>HEX is returned; when the transition is possible, the block number of MATRIX-2 to be consulted next is returned.
MATRIX-2 has a block number for every 112 bits. Block number 0 is the block that negates all first-order transition relations, and all 112 of its bits are "0". For example, when the transition relation between a pair of syllable units in MATRIX-1 is <0000>HEX, processing jumps to block 0 of MATRIX-2. Block 0 thus denies the second-order transition at the same time as the first-order transition fails. When the transition is possible and a nonzero block number is given, processing jumps to the block with that number and examines the 112 bits from the head of the block. Here each bit corresponds to one syllable unit. If the leading bit of a MATRIX-2 block is "1" and the remaining 111 bits are all "0", the block indicates a suffix.
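The two-level lookup can be sketched as a small data structure. This is a toy sketch under several assumptions: the inventory is shrunk from 112 to `N = 4` symbols, syllables are represented by integer indices, each MATRIX-2 block is held as a Python integer used as a bit row, bit `c` is counted from the least significant end (the source does not specify the bit order), and the names `make_tables` and `triple_allowed` are hypothetical.

```python
N = 4  # toy inventory size; the patent uses 112 symbols per block


def make_tables(allowed_triples):
    """Build MATRIX-1 (block numbers) and MATRIX-2 (bit rows).

    allowed_triples: iterable of (a, b, c) index triples that may occur.
    matrix1[a][b] is 0 when the pair (a, b) is impossible, otherwise the
    number of the MATRIX-2 block listing which c may follow (a, b).
    """
    matrix2 = [0]  # block 0: all bits zero, so it rejects every third syllable
    matrix1 = [[0] * N for _ in range(N)]
    for a, b, c in sorted(allowed_triples):
        if matrix1[a][b] == 0:
            matrix2.append(0)
            matrix1[a][b] = len(matrix2) - 1
        matrix2[matrix1[a][b]] |= 1 << c
    return matrix1, matrix2


def triple_allowed(matrix1, matrix2, a, b, c):
    """Look up the block for (a, b), then test the bit for c in that block."""
    block = matrix1[a][b]  # 0 sends the lookup to the all-zero block
    return bool((matrix2[block] >> c) & 1)
```

Because an impossible pair maps to block 0, no special case is needed in `triple_allowed`: the all-zero block rejects every continuation. In the layout described in the text, block numbers are stored in 2 bytes, so at most 65,536 distinct blocks could be addressed.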
In this way the suitability of the transition relations is judged for M = 2, that is, for candidate strings of three syllable units. The number of bits required when the second-order transition matrix is applied directly, as in the prior art cited above, and when the first-order transition matrix is applied first and the second-order transition matrix second, as in the present invention, is as follows.
(a) Prior art
112^3 = 1,404,928 [bits] ... (A)
(b) Present invention
MATRIX-1: 16 × 112^2 = 200,704 [bits] ... (B) (2 bytes = 16 bits per entry)
MATRIX-2: 112^3 × 1/2 = 702,464 [bits] ... (C)
(calculated on the assumption that "1" bits occupy 1/2 of MATRIX-1 as a whole)
Accordingly, the ratio of (b) to (a) is
(B + C) / A = 903,168 / 1,404,928 = 0.6428,
so the present invention obtains an equivalent effect with about 65% of the buffer capacity of the prior art.
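The figures above can be checked with a few lines of arithmetic. The variable names are chosen here for illustration; the values 112, 16 and the factor 1/2 are those given in the text.

```python
from fractions import Fraction

SYMBOLS = 112

A = SYMBOLS ** 3            # prior art: full second-order matrix, in bits
B = 16 * SYMBOLS ** 2       # MATRIX-1: one 16-bit block number per syllable pair
C = SYMBOLS ** 3 // 2       # MATRIX-2: half of the pairs assumed to need a block

ratio = Fraction(B + C, A)  # exact ratio of the invention's size to the prior art's

if __name__ == "__main__":
    print(f"A = {A:,}  B = {B:,}  C = {C:,}  B + C = {B + C:,}")
    print(f"ratio = {ratio} = {float(ratio):.6f}")
```

The exact ratio is 9/14, or 0.642857..., which the text truncates to 0.6428. It follows directly from 16/112 + 1/2.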
The present invention is effective in reducing buffer capacity when describing the transition relations between arbitrary syllables, and the buffer capacity can be reduced further by combining it with the following measures. These measures are also applicable to the prior art and have some effect even when used alone.
The first is to use, in combination, a transition matrix that describes only the connection relations between specific, limited syllable units among the predetermined Japanese syllable units to be recognized; for example, a transition matrix representing the connections of the special syllable unit "っ" (the geminate consonant) with what precedes and follows it.
The second is to group the predetermined Japanese syllable units to be recognized in advance into categories that are meaningful for speech recognition, and to use a transition matrix that describes the connection relations between those groups. For example, phonologically similar units are grouped as (a) the pa, ta and ka rows, (b) the ba, da and ga rows, (c) the sa and ha rows, and (d) the ma, na and ra rows; each such syllable group, which contains different syllable units, is treated as a single kind of syllable unit, and a transition matrix is used that describes the transition relations, that is, the connection relations, among these groups and the remaining syllable units.
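The grouping idea can be sketched as a simple mapping applied before the transition check. Everything concrete here is an assumption: the row names follow one plausible reading of groups (a) to (d) in the text, each syllable is represented only by the name of its kana row, and `ROW_TO_GROUP`, `group_of` and `group_sequence` are hypothetical names.

```python
# Assumed reading of the four groups; keys are kana row names, values are group labels.
ROW_TO_GROUP = {
    "pa": "A", "ta": "A", "ka": "A",
    "ba": "B", "da": "B", "ga": "B",
    "sa": "C", "ha": "C",
    "ma": "D", "na": "D", "ra": "D",
}


def group_of(row):
    """Map a row to its group; rows outside the groups stay as their own unit."""
    return ROW_TO_GROUP.get(row, row)


def group_sequence(rows):
    """Rewrite a candidate string (as row names) into group-level units."""
    return tuple(group_of(r) for r in rows)


if __name__ == "__main__":
    print(group_sequence(("ka", "a", "na")))
```

A transition matrix indexed by group-level units has fewer rows and columns than one indexed by every syllable, at the cost of no longer distinguishing the members of a group.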
〈Effects of the Invention〉
As is clear from the above description of the embodiment, the present invention makes it possible to reduce the capacity of the buffer into which the transition matrices used to judge the suitability of candidate syllable strings are written, and thereby to make the device smaller, lower its cost, and shorten its processing time.
FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention; FIG. 2 shows an example of the first-order transition matrix of the same embodiment; and FIG. 3 shows an example of the second-order transition matrix of the same embodiment.
1 ... monosyllable recognition unit
2 ... standard pattern memory
3 ... syllable lattice buffer
4 ... candidate string buffer
5 ... transition matrix buffer
6 ... candidate string creation unit
7 ... phrase analysis unit
8 ... recognition result memory
9 ... CPU
Claims (1)
1. A speech-input Japanese document processor that recognizes, syllable by syllable, speech uttered in units such as phrases; creates candidate syllable strings, in order of the most reliable combinations, from the multiple candidates recognized for each syllable unit; uses a transition matrix describing the transition relations that represent the connections between the predetermined Japanese syllable units to be recognized to exclude candidate syllable strings containing combinations in which a transition between syllable units is impossible; and performs subsequent processing such as dictionary matching on the remaining candidate syllable strings to output the recognition result for the phrase or the like, characterized in that, when judging the suitability of a candidate syllable string of (M+1) syllable units, the (M-1)-th-order transition matrix is first applied to exclude candidate syllable strings containing combinations for which a transition is impossible, and the M-th-order transition matrix is applied only to the remaining candidate syllable strings to judge their suitability.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP59169568A JPS6148032A (en) | 1984-08-14 | 1984-08-14 | Speech input type japanese document processor |
Publications (1)
Publication Number | Publication Date |
---|---|
JPS6148032A (en) | 1986-03-08 |
Family
ID=15888879
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP59169568A (Pending; published as JPS6148032A) | Speech input type japanese document processor | 1984-08-14 | 1984-08-14 |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS6148032A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH01260493A (en) * | 1988-04-12 | 1989-10-17 | Matsushita Electric Ind Co Ltd | Voice recognizing method |
KR100408524B1 (en) * | 2001-08-22 | 2003-12-06 | 삼성전자주식회사 | Speech recognition method and the apparatus thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS58208846A (en) * | 1982-05-31 | 1983-12-05 | Nec Corp | Priority deciding system for kana (japanese syllabary) letter train |
JPS58208847A (en) * | 1982-05-31 | 1983-12-05 | Nec Corp | Deciding system for kana (japanese syllabary) letter train |
JPS5958493A (en) * | 1982-09-28 | 1984-04-04 | 電子計算機基本技術研究組合 | Recognition system |
JPS59132039A (en) * | 1983-01-17 | 1984-07-30 | Nec Corp | Evaluating method of kana character string |