JPS6177100A

JPS6177100A - Voice section detecting circuit

Info

Publication number: JPS6177100A
Application number: JP59199212A
Authority: JP
Inventors: 河本　俊毅
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1984-09-21
Filing date: 1984-09-21
Publication date: 1986-04-19
Also published as: JPH0573034B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

Technical field

The present invention relates to a voice section detection circuit in a speech recognition device.

Prior art

In general, when a speech recognition device handles input speech with a good signal-to-noise ratio, extracting the section in which speech is present is relatively easy. In the environments where such devices are actually used, however, various kinds of noise are present, and the speech is input superimposed on that noise. Because this noise changes from moment to moment, a method that cuts out the voice section with a fixed threshold cannot detect the section stably, which contributes to misrecognition. With such a fixed threshold, under high noise the section is cut out with noise attached before and after what should be the voice section.

Object

The present invention has been made to solve the problems described above. In particular, its object is to detect voice sections stably regardless of the level of the ambient steady noise, and thereby to secure a stable recognition rate.

Constitution

To achieve the above object, the present invention is a voice section detection device that extracts the speech signal power and sets a threshold within the silent section to cut out the voice section. It is characterized by having sample-and-hold means that samples and holds the noise level at that moment, using a pulse generated when a predetermined time (t seconds) has elapsed from the end of the speech and a pulse generated when a predetermined time (S seconds) has elapsed from the start of the speech, and by detecting the voice section using the held value as the threshold for cutting out the voice section. The invention is described below based on embodiments.

The present invention holds the noise level at the point where a predetermined time (t seconds) has elapsed from the end of a voice section, and the noise level at the point where a predetermined time (S seconds) has elapsed from the start of a voice section, and uses the held value as the threshold for voice section detection. The value of t is based on the fact that a word containing a geminate consonant (sokuon) includes a silent interval of 200 to 400 ms; it is chosen so that, if the next speech is input within this time, the speech before and after the gap is processed as a single word. In addition, if the environmental noise level rises during an utterance, the end may become undetectable. Therefore, when the end cannot be detected even after S seconds have elapsed from the start, that point is forcibly treated as the end of the voice section, and the noise level at that time is held.
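The pulse logic described above can be sketched in software. The following Python sketch is an illustration under stated assumptions, not the circuit of the embodiment: it works on a sequence of average-level values taken at a fixed frame rate, expresses t and S as frame counts (`t_frames`, `s_frames`), and needs an initial threshold, which the source does not specify. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class AdaptiveThresholdDetector:
    """Frame-based sketch of the sample-and-hold threshold scheme (assumed model)."""
    threshold: float  # held value; initial value is an assumption, not given in the source
    t_frames: int     # t seconds expressed as a number of frames
    s_frames: int     # S seconds expressed as a number of frames
    in_word: bool = False
    start: int = 0
    last_active: int = 0
    sections: List[Tuple[int, int]] = field(default_factory=list)  # (start, end), end exclusive

    def step(self, i: int, level: float) -> None:
        active = level > self.threshold  # level comparison against the held value
        if active:
            if not self.in_word:
                # Rising edge that opens a new voice section.
                self.in_word, self.start = True, i
            self.last_active = i
        if not self.in_word:
            return
        if i - self.last_active >= self.t_frames:
            # End pulse: the level has stayed below the threshold for t frames.
            self.sections.append((self.start, self.last_active + 1))
            self.threshold = level  # sample and hold the current noise level
            self.in_word = False
        elif i - self.start >= self.s_frames:
            # Forced pulse: no end pulse within S frames of the start.
            self.sections.append((self.start, i))
            self.threshold = level  # sample and hold the current noise level
            self.in_word = False


def detect(levels: Sequence[float], threshold: float,
           t_frames: int, s_frames: int) -> AdaptiveThresholdDetector:
    """Run the detector over a whole sequence of average-level values."""
    det = AdaptiveThresholdDetector(threshold, t_frames, s_frames)
    for i, level in enumerate(levels):
        det.step(i, level)
    return det
```

In this sketch a gap shorter than `t_frames` does not close the section, so the speech on both sides of the gap is reported as one section, which mirrors the treatment of the silent interval inside a word with a geminate consonant. The forced pulse closes a section that would otherwise stay open when the noise level rises above the old threshold.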

FIG. 1 is an electrical block diagram illustrating one embodiment of the voice section detection circuit according to the present invention, and FIG. 2 is a time chart. In FIG. 1, 1 is an input terminal, 2 is a detection circuit, 3 is a smoothing circuit, 4 is a sample-and-hold circuit, 5 is a level comparison circuit, 6 is a voice section discrimination circuit, and 7 is an output terminal. In FIG. 2, (a) shows an example of the average signal level of the input signal, with T1 and T2 marking the points at which the threshold switches. (b) is the voice section signal detected with the threshold shown in (a). (c) is the voice section end pulse, generated t seconds after the falling edge of the voice section signal (b) if the section signal is low at that time. (d) is the pulse generated when the end pulse (c) has not occurred within S seconds of the rising edge of the voice section signal (b). The pulses (c) and (d) trigger the sample-and-hold circuit and switch the threshold.

In FIG. 1, the input signal from the input terminal 1 passes through the detection circuit 2 and the smoothing circuit 3, which together extract its average signal level. This level is input to the sample-and-hold circuit 4 and the level comparison circuit 5. The output signal of the level comparison circuit 5 is input to the voice section discrimination circuit 6.
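The detection and smoothing stages can be pictured with a short numerical sketch. The Python below is a rough software analogue under assumptions the source does not state: the detection circuit is modelled as full-wave rectification and the smoothing circuit as a one-pole low-pass filter. The function name and the coefficient `alpha` are hypothetical.

```python
from typing import Iterable, List


def average_level(samples: Iterable[float], alpha: float = 0.1) -> List[float]:
    """Rectify each sample, then smooth the result with a one-pole low-pass filter."""
    out: List[float] = []
    y = 0.0
    for x in samples:
        rectified = abs(x)            # detection stage, assumed to be full-wave rectification
        y += alpha * (rectified - y)  # smoothing stage, assumed to behave like an RC filter
        out.append(y)
    return out
```

A smaller `alpha` gives a smoother level that reacts more slowly to the start and end of speech; the source gives no time constant, so any value used here is purely illustrative.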

The voice section end pulse generated there, or, when the end cannot be detected, the pulse generated S seconds after the start, is input to the sample-and-hold circuit 4, and the signal level at that moment is held. The held value and the average signal level are compared in the level comparison circuit 5, and the resulting voice section signal is output to the output terminal 7. In the method described so far, the noise level at the moment of the end pulse, or of the pulse generated when the end cannot be detected, is used directly as the next threshold. For more stable operation, a method is also conceivable in which a fixed value is added to this level and the sum is used as the threshold for voice section detection.

FIG. 3 is an electrical block diagram showing an embodiment of this second method, in which a fixed value is added to the held threshold and the sum is used as the threshold for voice section detection. In the figure, 8 is a reference voltage source and 9 is an adder; parts that function as in FIG. 1 carry the same reference numerals as in FIG. 1. This embodiment thus has a reference voltage source 8 and an adder 9. In the adder 9, a constant-level voltage Vs from the reference voltage source 8 is added to the output of the sample-and-hold circuit 4, and the sum becomes the new threshold of the level comparison circuit 5.
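The threshold of this embodiment can be written as a one-line relation. Only Vs comes from the source; the symbols V_hold (output of the sample-and-hold circuit 4), V_th (threshold applied to the level comparison circuit 5), and V_avg (average signal level) are labels introduced here for illustration.

```latex
V_{\mathrm{th}} = V_{\mathrm{hold}} + V_s,
\qquad
\text{voice section signal is high} \iff V_{\mathrm{avg}}(t) > V_{\mathrm{th}}
```

With purely illustrative numbers, a held noise level of 0.20 V and Vs = 0.05 V give a threshold of 0.25 V, so noise fluctuations of less than 0.05 V above the held level no longer trigger a voice section.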

Effect

As is clear from the above description, the present invention makes it possible to determine an optimal threshold and detect the voice section signal with it.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an electrical block diagram illustrating one embodiment of the present invention, FIG. 2 is a time chart, and FIG. 3 is an electrical block diagram showing another embodiment of the present invention.

1: input terminal, 2: detection circuit, 3: smoothing circuit, 4: sample-and-hold circuit, 5: level comparison circuit, 6: voice section discrimination circuit, 7: output terminal, 8: reference voltage source, 9: adder.

Claims

(1) A voice section detection circuit for a voice section detection device that extracts the speech signal power and sets a threshold within the silent section to cut out the voice section, characterized by having sample-and-hold means that samples and holds the noise level at that moment, using a pulse generated when a predetermined time (t seconds) has elapsed from the end of the speech and a pulse generated when a predetermined time (S seconds) has elapsed from the start of the speech, and by detecting the voice section using the held value as the threshold for cutting out the voice section.

(2) The voice section detection circuit according to claim 1, comprising means for adding a fixed value to the threshold, wherein the voice section is detected using the resulting value as the threshold for cutting out the voice section.