JPS6235120Y2

Info

Publication number: JPS6235120Y2
Application number: JP1986071824U
Authority: JP
Priority date: 1986-05-15
Filing date: 1986-05-15
Publication date: 1987-09-07
Also published as: JPS61203800U

Description

[Detailed description of the invention] The present invention relates to a speech recognition device.

In recent years, interest in speech recognition devices has been growing; in the computer field, for example, they are expected to replace conventional keyboard input devices. It goes without saying that if data spoken by an operator could be processed directly by a computer, the efficiency of computer use would increase dramatically. A speech recognition device is made up of many functional parts, and the present invention concerns the speech input part among them. The other functional parts include, among others, a sentence analysis unit. Normally, an analog speech signal input through a microphone or the like undergoes processing such as A/D (analog/digital) conversion and sampling, is temporarily stored in a data memory, and is then passed to a speech recognition unit. Once the individual sounds have been recognized there, the structure of the spoken sentence is analyzed. This is the task of the sentence analysis unit mentioned above. For a machine such as a computer, analyzing sentences is an extremely difficult operation. That is, it is extremely difficult to recognize the divisions of a sentence, such as bunsetsu (phrase) breaks, clause breaks, and sentence ends. For this reason, the speaker (operator) is generally forced to insert a pause at each such break, which makes the presence of the breaks clear to the machine. However, speaking in this way is unnatural for the speaker, and such speech recognition devices should be improved.

Accordingly, an object of the present invention is to propose a speech recognition device that is less unnatural for the speaker and has high recognition accuracy.

In accordance with the above object, the present invention is characterized in that, at each break in the speech uttered by the speaker, a break signal generated by the speaker is input to the sentence analysis unit.

The present invention is described below with reference to the drawing.

In general, Japanese sentences are built around keywords such as the particles "ga", "wa", and "no". The sentence analysis unit therefore often recognizes sentence structure by detecting these keywords. Other languages, of course, have counterparts to these keywords, which can likewise be detected to recognize sentence structure. Accordingly, if so-called break signals can be supplied at the divisions of a sentence, these keywords included, namely at bunsetsu breaks, clause breaks, sentence ends, and, if necessary, phrase or word breaks, they will obviously be a major clue for the sentence analysis unit in working out the structure of the sentence. It is also clear that the accuracy of sentence analysis improves dramatically. In the present invention, therefore, the sentence analysis unit receives a break signal generated by the speaker at each break in the speaker's speech.

This is illustrated in the attached drawing. In the drawing, 11 is a microphone that receives the analog speech signal. The output signal from microphone 11 passes through amplifier 12 and is digitized by A/D converter 13. The digital data is temporarily stored in memory 14 and then supplied to speech recognition unit 15, where it is recognized as syllables made up of consonants and vowels. Each syllable is then input to sentence analysis unit 16, which works out the structure of the sentence. For example, the sentence "Kyou no tenki wa kaisei desu." ("Today's weather is clear.") is broken down and understood as "KYOUNO TENKIWA KAISEIDESU". In actual speech, however, no such distinct breaks exist, so in the present invention the information is given to sentence analysis unit 16 in the form "KYOUNO・TENKIWA・KAISEIDESU・". Each ・ mark denotes a signal indicating a break; concretely, the break signal is produced by, for example, switch 17. Switch 17 is pressed by the speaker each time the speaker's speech reaches a break, and the press is reported to, for example, speech recognition unit 15. There, among the recognized syllables, a break signal is attached to each syllable that coincides with a press of switch 17. Sentence analysis unit 16 looks for syllables carrying a break signal and, when one is detected, treats it as one break in the sentence.

In this way, adding break signals dramatically increases the efficiency of sentence analysis. The speaker is required to press a switch, but this is a far more natural and easier operation than the unnatural one described earlier of inserting a pause at every break. With the present invention the speaker may forget to press the switch or may press it by mistake. However, the break signal is strictly an auxiliary aid to sentence analysis, intended to take over part of the hardware processing in the sentence analysis unit and thereby allow faster and more accurate analysis. Even if such lapses occur, the sentence analysis unit can simply rely on its own original functions, so no inconvenience or trouble arises.
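The tagging and splitting described above can be sketched in software as follows. This is only an illustrative sketch, not the circuit of the utility model: the `Syllable` record, the millisecond timestamps, and the rule that a press is matched to the last syllable that started at or before it are all assumptions made for the example, since the text does not say how a switch press is matched to a syllable.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Syllable:
    """A recognized syllable (hypothetical record, not defined in the source)."""
    text: str               # e.g. "KYO"
    start_ms: int           # assumed start time of the syllable
    end_ms: int             # assumed end time of the syllable
    is_break: bool = False  # True when a break signal is attached


def attach_break_signals(syllables: Sequence[Syllable],
                         press_times_ms: Sequence[int]) -> List[Syllable]:
    """Return copies of the syllables with break signals attached.

    Assumption: each switch press flags the last syllable that started
    at or before the press. Presses before the first syllable are ignored.
    """
    tagged = [Syllable(s.text, s.start_ms, s.end_ms, s.is_break) for s in syllables]
    for t in press_times_ms:
        candidates = [s for s in tagged if s.start_ms <= t]
        if candidates:
            candidates[-1].is_break = True
    return tagged


def split_at_breaks(syllables: Sequence[Syllable]) -> List[str]:
    """Close a segment after every syllable that carries a break signal."""
    segments: List[str] = []
    current: List[str] = []
    for s in syllables:
        current.append(s.text)
        if s.is_break:
            segments.append("".join(current))
            current = []
    if current:  # trailing syllables with no break signal stay together
        segments.append("".join(current))
    return segments


if __name__ == "__main__":
    texts = ["KYO", "U", "NO", "TE", "N", "KI", "WA",
             "KA", "I", "SE", "I", "DE", "SU"]
    # Assume every syllable lasts 100 ms, purely for illustration.
    stream = [Syllable(t, i * 100, (i + 1) * 100) for i, t in enumerate(texts)]
    # Switch pressed while "NO", "WA" and "SU" are being spoken.
    tagged = attach_break_signals(stream, [250, 650, 1250])
    print(split_at_breaks(tagged))  # ['KYOUNO', 'TENKIWA', 'KAISEIDESU']
```

If no press is registered, `split_at_breaks` returns the whole utterance as a single segment. This mirrors the point made above that a missed press simply leaves the sentence analysis unit to rely on its own functions.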

As described above, the present invention achieves fast and highly accurate sentence analysis in a speech recognition device by an extremely simple technique.

[Brief description of the drawing]

The attached drawing is a block diagram illustrating the device of the present invention. In the drawing, 15 is the speech recognition unit, 16 is the sentence analysis unit, and 17 is the switch.

Claims

[Scope of claim for utility model registration]
1. A speech recognition device comprising: a speech input unit that inputs speech uttered by a speaker; a speech recognition unit that receives the output of the speech input unit, decomposes the speech into syllables, and recognizes it; a sentence analysis unit that takes as input each of the syllables output by the speech recognition unit and analyzes the sentences of the speech uttered by the speaker; and a switch operated by the speaker to output a break signal at each break in the sentences, the speech recognition device being characterized in that, in addition to each of the syllables, the break signal is also input.
2. The speech recognition device according to claim 1, wherein the break signal is generated at any one of a bunsetsu break, a clause break, and the end of a sentence.