JPH1091184A

JPH1091184A - Sound detection device

Info

Publication number: JPH1091184A
Application number: JP8241458A
Authority: JP
Inventors: Shinichi Kawada; 眞一川田; Yoichiro Hosokawa; 洋一郎細川; Kenichi Aratatsu; 健一新立
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1996-09-12
Filing date: 1996-09-12
Publication date: 1998-04-10

Abstract

PROBLEM TO BE SOLVED: To prevent voice detection from remaining stuck in the voiced state when ambient noise increases suddenly and the higher noise level persists. SOLUTION: A voice detection circuit 4 compares the power level of an input signal S_in with the level of an adaptive threshold S_T set by an adaptive threshold control circuit 3, and determines voiced when the level of S_in is higher than the threshold and silent when it is lower. The device further includes a timer circuit 5 that measures how long the voiced state has continued after the voice detection circuit 4 has determined voiced. When the determination signal C has remained '1' continuously for a fixed time T_0 or longer, the timer circuit 5 outputs a command signal 6 to a power calculation circuit 2, instructing it to multiply the calculated average noise power P by k (> 1.0) and output the result.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a voice detection device that detects a voice signal based on the power level difference between a noise signal and a voice signal contained in an input signal.

[0002]

DESCRIPTION OF THE RELATED ART

A voice detector is used in voice processing equipment such as the echo canceller of a video conferencing system or a voice codec (encoder/decoder) for voice communication such as telephony. Its function is to detect the voice signal in an input signal that contains both a noise signal and a voice signal.

[0003] When detecting such a voice signal, it is possible to decide whether the input signal contains voice (voiced) or only noise (silent) solely on the basis of whether the power level of the input signal is above or below a predetermined threshold. However, the power level of the noise signal varies depending on whether the voice processing equipment is installed indoors or outdoors, and even indoors the noise level differs depending on whether an air conditioner or the like is running. Consequently, a voice detection method with a fixed threshold has the drawback that, when the noise level at the installation site changes, noise is mistakenly detected as voice.
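The drawback of a fixed threshold can be illustrated with a minimal sketch. The frame powers, the threshold value, and the function name `fixed_threshold_vad` are hypothetical and chosen only for illustration; the patent does not specify units or numeric values.

```python
def fixed_threshold_vad(powers, threshold):
    """Return 1 (voiced) where the frame power exceeds a fixed threshold, else 0 (silent)."""
    return [1 if p > threshold else 0 for p in powers]


# Hypothetical per-frame powers in arbitrary linear units.
quiet_room = [1.0, 1.2, 0.9, 8.0, 7.5, 1.1]     # voice in frames 3 and 4
noisy_room = [5.0, 5.2, 4.9, 12.0, 11.5, 5.1]   # same voice with an air conditioner running

THRESHOLD = 3.0  # assumed fixed threshold, tuned for the quiet room

print(fixed_threshold_vad(quiet_room, THRESHOLD))  # [0, 0, 0, 1, 1, 0]
print(fixed_threshold_vad(noisy_room, THRESHOLD))  # [1, 1, 1, 1, 1, 1]
```

In the noisy room every frame exceeds the threshold, so noise-only frames are reported as voiced.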

[0004] An effective way to avoid this drawback is the adaptive threshold method: the ambient noise level around the voice processing equipment is calculated, the threshold is set to a level that is always a fixed amount above that noise level, and signals at or above the threshold level are detected.

[0005] FIG. 4 shows an example of a conventional voice detection device using the adaptive threshold method.

[0006] In FIG. 4, reference numeral 1 denotes a microphone that receives the voice signal and the noise signal, and 2 denotes a power calculation circuit. The power calculation circuit 2 calculates the short-time average power P (a moving average) of the input signal S_in over the immediately preceding interval and outputs it to the threshold calculation circuit 3. The threshold calculation circuit 3 sets an adaptive threshold S_T by adding a constant value to the average power P and outputs it to the voice detection circuit 4. The voice detection circuit 4 makes the voiced/silent determination from the input signal S_in: it compares the power level of S_in with the level of the adaptive threshold S_T set by the threshold calculation circuit 3, and determines voiced if the level of S_in is higher than S_T and silent if it is lower. As a result of this determination, the voice detection circuit 4 outputs a voiced/silent determination signal C. The signal C is also input to the power calculation circuit 2 and the threshold calculation circuit 3, and while the determination is voiced it halts the calculation of the average power P and the updating of the adaptive threshold S_T.
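The conventional scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit itself: the frame-based processing, the window length, the starting noise estimate `initial_p`, and all numeric values are hypothetical, and the hangover time is left out.

```python
from collections import deque


def conventional_adaptive_vad(powers, initial_p, delta_p, window=2):
    """Sketch of the conventional detector of FIG. 4 (hangover omitted).

    powers    : per-frame power of S_in (arbitrary linear units, assumed)
    initial_p : starting noise estimate P (assumed to be known)
    delta_p   : constant added to P to form the threshold S_T
    window    : number of silent frames in the moving average (assumed)
    """
    history = deque([initial_p], maxlen=window)
    avg_p = initial_p
    decisions = []
    for s_in in powers:
        s_t = avg_p + delta_p          # threshold calculation circuit 3
        c = 1 if s_in > s_t else 0     # voice detection circuit 4
        if c == 0:
            # power calculation circuit 2 updates P only on silent frames
            history.append(s_in)
            avg_p = sum(history) / len(history)
        decisions.append(c)
    return decisions, avg_p


frames = [1.0, 1.5, 2.0, 9.0, 2.5, 3.0]   # slowly rising noise, voice in frame 3
decisions, final_p = conventional_adaptive_vad(frames, initial_p=1.0, delta_p=2.0)
print(decisions)  # [0, 0, 0, 1, 0, 0]
print(final_p)    # 2.75
```

Because the noise rises slowly, the threshold tracks it and only the voice frame is reported as voiced.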

[0007] FIG. 5 is a signal waveform diagram illustrating the operation of this voice detection device.

[0008] The voice shown in FIG. 5(a) is picked up by the microphone 1 together with the gradually increasing ambient noise shown in FIG. 5(b), and becomes the input signal S_in.

[0009] FIG. 5(c) shows the average power P calculated from the input signal S_in, in which the voice and the noise are combined, together with the adaptive threshold S_T. Based on the voiced/silent determination signal C, the power calculation circuit 2 calculates the average power of S_in only during silent intervals and outputs this short-time average power P to the threshold calculation circuit 3. The adaptive threshold S_T is controlled to a level higher than the average power P by the difference ΔP and is output to the voice detection circuit 4.

[0010] FIG. 5(d) shows the waveform of the voiced/silent determination signal C output from the voice detection circuit 4. The voice detection circuit 4 determines voiced when the level of the input signal S_in rises above the adaptive threshold S_T, and silent when it is below S_T. The signal C is output at logic level 1 (hereinafter simply "1") for voiced and at logic level 0 (hereinafter simply "0") for silent.

[0011] At a transition from voiced to silent, however, the calculation of the average power P of the ambient noise resumes, and the voiced/silent determination signal C of the voice detection circuit 4 becomes "0", only after silence has continued for a fixed time, here for example T_H. This time T_H, set in the power calculation circuit 2, is the so-called hangover time. Even when, for example on a telephone, the voice level drops temporarily below the threshold S_T in the middle of a conversation, the output of the voice detection circuit 4 is held at "1" for the duration of T_H.
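The hangover behavior can be sketched on a frame basis. The function name `apply_hangover` and the idea of counting T_H in frames are assumptions for illustration; the patent only specifies that the output is held at "1" for the time T_H.

```python
def apply_hangover(raw_decisions, hangover_frames):
    """Hold the output at 1 for `hangover_frames` frames after the last above-threshold frame.

    raw_decisions   : per-frame result of the comparison S_in > S_T (1 or 0)
    hangover_frames : hangover time T_H expressed in frames (assumed unit)
    """
    out = []
    remaining = 0                       # hangover time still left
    for raw in raw_decisions:
        if raw == 1:
            remaining = hangover_frames  # re-arm the hangover on every voiced frame
            out.append(1)
        elif remaining > 0:
            remaining -= 1               # below threshold, but still inside T_H
            out.append(1)
        else:
            out.append(0)                # T_H expired: report silent
    return out


raw = [0, 1, 0, 1, 0, 0, 0, 0]          # brief dip in frame 2 during speech
print(apply_hangover(raw, hangover_frames=2))  # [0, 1, 1, 1, 1, 1, 0, 0]
```

The one-frame dip in the middle of the speech does not interrupt the voiced output, and the output falls to 0 only after two further below-threshold frames.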

[0012] FIG. 6 is a signal waveform diagram illustrating operation when the ambient noise increases abruptly.

[0013] As shown in FIG. 6, the input signal S_in may change abruptly at some point, for example when a room air conditioner starts running or when a door that had been closed is opened and noise from outside enters. If the change exceeds the difference ΔP between the adaptive threshold S_T and the average power P, the level of S_in becomes higher than S_T. The voiced/silent determination signal C output by the voice detection circuit 4 then changes from "0" to "1". At that point the calculation of the average power P in the power calculation circuit 2 stops, and so does the updating of the threshold S_T. Because only the noise level has risen and the adaptive threshold S_T can no longer be updated, the signal C never returns to "0" and the voiced determination continues indefinitely, even though the input signal S_in contains nothing but noise.
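A short simulation makes the lock-up concrete. It is a sketch under assumptions: a recursive average with weight `alpha` stands in for the short-time moving average, hangover is omitted, and all numeric values are hypothetical.

```python
def conventional_vad(powers, initial_p, delta_p, alpha=0.5):
    """Conventional adaptive-threshold detector; P is frozen while the decision is voiced.

    A recursive average (weight `alpha`) is an assumed stand-in for the
    short-time moving average of the power calculation circuit 2.
    """
    avg_p = initial_p
    decisions = []
    for s_in in powers:
        c = 1 if s_in > avg_p + delta_p else 0
        if c == 0:
            avg_p = (1 - alpha) * avg_p + alpha * s_in   # update P only when silent
        decisions.append(c)
    return decisions


# Noise only: level 1.0 for five frames, then an abrupt jump to 6.0 (larger than ΔP = 2.0).
noise_only = [1.0] * 5 + [6.0] * 20
decisions = conventional_vad(noise_only, initial_p=1.0, delta_p=2.0)
print(decisions[:5])       # [0, 0, 0, 0, 0]
print(set(decisions[5:]))  # {1}
```

After the jump, P stays frozen at 1.0 and the threshold at 3.0, so every later noise frame is reported as voiced no matter how long the input runs.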

[0014]

PROBLEM TO BE SOLVED BY THE INVENTION

As described above, in a conventional voice detection device using the adaptive threshold method, when the ambient noise level rises abruptly enough to exceed the threshold, the voiced/silent determination signal C, which had been "0", changes to "1" and the input is determined to be voiced. Once the input has been determined to be voiced, the threshold S_T of the voice detection device no longer changes, so if the same noise level persists the voiced determination continues.

[0015] The present invention has been made to solve this problem. Its object is to provide a voice detection device that prevents the voice detection result from remaining in the voiced state when the ambient noise increases abruptly and the higher noise level persists.

[0016]

MEANS FOR SOLVING THE PROBLEM

The voice detection device according to claim 1 detects a voice signal based on the power level difference between a noise signal and a voice signal contained in an input signal, and comprises: adaptive threshold control means for setting, from the power level of the input signal, a threshold for detecting the voice signal; voice detection means for comparing the threshold with the input signal to determine voiced or silent; and timing means for measuring the time for which the voice detection means has continuously determined voiced. While the determination is silent, the adaptive threshold control means sets the threshold to a value corresponding to the power level of the input signal; when voiced has been detected for longer than a time preset in the timing means, it increases the threshold by a predetermined amount.

[0017] The adaptive threshold control means of the voice detection device according to claim 2 repeatedly increases the threshold by a predetermined amount each time voiced has been detected for longer than the time preset in the timing means.

[0018] The adaptive threshold control means of the voice detection device according to claim 3 does not raise the threshold to or above a predetermined upper limit.

[0019]

EMBODIMENTS OF THE INVENTION

Embodiments of the present invention are described below with reference to the accompanying drawings.

[0020] FIG. 1 is a block diagram showing an example of the voice detection device of the present invention. The voice detection circuit 4 makes the voiced/silent determination from the input signal S_in: it compares the power level of S_in with the level of the adaptive threshold S_T set by the threshold calculation circuit 3, and determines voiced if the level of S_in is higher than S_T and silent if it is lower.

[0021] The voice detection device further includes a timer circuit 5 that measures how long the voiced state has continued after the voice detection circuit 4 has determined voiced. When the voiced/silent determination signal C has remained "1" continuously for a fixed time (T_0) or longer, the timer circuit 5 outputs a command signal 6 to the power calculation circuit 2, instructing it to multiply the already calculated average noise power P by k (> 1.0) and output the result. In FIG. 1, blocks and signals bearing the same reference numerals as in the conventional device of FIG. 4 denote the same or corresponding blocks and signals. The threshold control means 7 refers to the circuit configuration comprising the power calculation circuit 2 and the threshold calculation circuit 3.

[0022] Next, the operation of the voice detection device of FIG. 1 is described.

[0023] In FIG. 1, the voiced/silent determination signal C is input to the power calculation circuit 2 and the threshold calculation circuit 3. When the determination is voiced, calculation of the average power P stops and the average noise power P from the silent period continues to be output to the threshold calculation circuit 3. As a result, updating of the adaptive threshold S_T also stops while the determination is voiced.

[0024] The command signal 6 from the newly added timer circuit 5 controls the power calculation circuit 2 so that, when the voiced/silent determination signal C has remained "1" continuously for the fixed time (T_0) or longer, the average noise power P from immediately before the voiced determination is multiplied and output to the threshold calculation circuit 3. Thus, once the fixed time T_0 has elapsed since the voiced interval was detected, the average power P output from the power calculation circuit 2 to the threshold calculation circuit 3 takes the new value k × P.
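The role of the timer circuit 5 can be sketched as a small frame counter. The class name `VoicedTimer`, the frame-based unit for T_0, and the reset after each command are assumptions for illustration; the reset mirrors the timer reset described for the flowchart of FIG. 2.

```python
class VoicedTimer:
    """Counts consecutive voiced frames and signals when T_0 has been reached."""

    def __init__(self, t0_frames):
        self.t0_frames = t0_frames   # fixed time T_0 expressed in frames (assumed unit)
        self.count = 0

    def tick(self, c):
        """Advance one frame with decision c; return True when command signal 6 should fire."""
        if c == 0:
            self.count = 0           # a silent decision ends the voiced run
            return False
        self.count += 1
        if self.count >= self.t0_frames:
            self.count = 0           # start timing the next T_0 period
            return True
        return False


timer = VoicedTimer(t0_frames=3)
signal_c = [0, 1, 1, 1, 1, 0, 1, 1, 1]
fired = [i for i, c in enumerate(signal_c) if timer.tick(c)]
print(fired)  # [3, 8]
```

On each frame where `tick` returns True, the power calculation circuit would replace P with k × P.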

[0025] If necessary, a voice output S_out is output during voiced intervals, together with the voiced/silent determination signal C from the voice detection circuit 4, to a downstream voice processing device (not shown).

[0026] The operation of the voice detection device is further described with reference to the flowchart of FIG. 2.

[0027] The voice and background noise picked up by the microphone 1 are input as the input signal S_in to the power calculation circuit 2 and the voice detection circuit 4, and the voice detection circuit 4 performs the voiced/silent determination (step ST1). When the voice detection circuit 4 detects an input signal S_in larger than the threshold S_T, the voiced/silent determination signal C becomes "1" and the input is determined to be voiced (step ST2). The process then proceeds to step ST3, where the timer circuit 5 starts timing, and the hangover time T_H is set (step ST4).

[0028] When the voice detection circuit 4 determines that the input signal S_in is smaller than the threshold S_T, the process proceeds to step ST5, where it is determined whether the hangover time T_H has expired. If T_H has expired (T_H ≤ 0), the voiced/silent determination signal C is set to "0" (step ST6) and the power calculation circuit 2 calculates the average power P (step ST7). If T_H has not expired (T_H > 0), C is set to "1" (step ST8), T_H is reduced by the unit time ΔT_H (step ST9), and the process returns to step ST1.

[0029] The timer circuit 5 checks every Δt whether the fixed time T_0 has elapsed (step ST10); if it has not, the process returns to step ST1. The voiced/silent determination described above is then executed, and the process loops from step ST10 back through steps ST1 to ST4 until the timer circuit 5 determines that T_0 has elapsed. If, however, S_in < S_T persists in the voice detection circuit 4 for the hangover time T_H or longer during this period, so that its output, the voiced/silent determination signal C, becomes "0", the process proceeds to step ST7 and calculation of the average power P in the power calculation circuit 2 resumes.

[0030] When the fixed time T_0 has elapsed in the timer circuit 5 at step ST10, the timer circuit 5 outputs the command signal 6 to the power calculation circuit 2. The power calculation circuit 2 then outputs the average power P it has calculated multiplied by k (step ST11). The average power P used here is the value calculated in step ST7 immediately before the voiced determination.

[0031] As a result, the value of the new average power P output to the threshold calculation circuit 3 rises, and the threshold calculation circuit 3 updates the threshold S_T using the new average power P (step ST12). The timer circuit 5 is then reset (step ST13), and the process returns to step ST1, where the voice detection circuit 4 performs the voiced/silent determination.

[0032] In this way, an updated threshold S_T is set based on the new average power P and is input to the voice detection circuit 4, so the voice detection circuit 4 performs the voiced/silent determination against a different reference than before. As before, the end-of-voiced-interval determination is then repeated on the basis of the updated threshold S_T until the fixed time T_0 elapses.

[0033] After a silent determination in step ST1, calculation of the average power P in the power calculation circuit 2 resumes in step ST7, the threshold S_T is updated on the basis of the new average power P (step ST12), the timer circuit 5 is reset (step ST13), and the process returns to step ST1 to repeat the voiced/silent determination.
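The flowchart steps ST1 to ST13 can be sketched as a frame-based loop. This is an illustration under stated assumptions rather than the circuit itself: a recursive average with weight `alpha` stands in for the moving average, T_0 and T_H are counted in frames, the timer is assumed to keep counting during hangover frames, and all numeric values are hypothetical.

```python
def detect(powers, initial_p, delta_p, k, t0_frames, hangover_frames, alpha=0.5):
    """Frame-based sketch of the flowchart of FIG. 2 (steps ST1 to ST13)."""
    avg_p = initial_p              # average noise power P
    s_t = avg_p + delta_p          # adaptive threshold S_T
    hang = 0                       # remaining hangover time T_H
    timer = 0                      # timer circuit 5
    decisions = []
    for s_in in powers:
        if s_in > s_t:                         # ST1, ST2: voiced
            c = 1
            timer += 1                         # ST3: timer keeps counting
            hang = hangover_frames             # ST4: set hangover time
            if timer >= t0_frames:             # ST10: T_0 elapsed?
                avg_p *= k                     # ST11: output k x P
                s_t = avg_p + delta_p          # ST12: update threshold
                timer = 0                      # ST13: reset timer
        elif hang > 0:                         # ST5: hangover not yet expired
            c = 1                              # ST8
            hang -= 1                          # ST9
            timer += 1                         # assumed: time passes while C stays 1
        else:
            c = 0                              # ST6: silent
            avg_p = (1 - alpha) * avg_p + alpha * s_in   # ST7: resume averaging
            s_t = avg_p + delta_p              # ST12
            timer = 0                          # ST13
        decisions.append(c)
    return decisions


# Noise only: level 1.0, then an abrupt and sustained jump to 6.0.
noise_only = [1.0] * 3 + [6.0] * 12
decisions = detect(noise_only, initial_p=1.0, delta_p=2.0, k=2.0,
                   t0_frames=3, hangover_frames=1)
print(decisions)  # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```

With these values P is doubled twice (1.0 to 2.0 to 4.0), which raises S_T from 3.0 to 6.0. The decision then returns to silent after one hangover frame, and P resumes tracking the new noise level.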

[0034] FIG. 3 is a signal waveform diagram showing a specific example of this sequence of operations of the voice detection device.

[0035] In FIG. 3, operation up to time t_1, when the voiced/silent determination signal C of the voice detection circuit 4 changes from "0" (silent) to "1" (voiced), is exactly the same as that of the conventional voice detection device (FIG. 4). In the present invention, when the signal C becomes "1", the timer circuit 5 starts timing. Until the fixed time (T_0) set in the timer circuit 5 has elapsed, the average power P and the adaptive threshold S_T hold constant levels.

[0036] At time t_2, the power calculation circuit 2 inputs the value (k × P), obtained by multiplying the most recently calculated average power P by k (> 1.0), to the threshold calculation circuit 3 as the new average power, and the adaptive threshold S_T is updated to a correspondingly higher value. When the updated threshold S_T becomes higher than the level of the input signal S_in, as illustrated, the voiced/silent determination signal C of the voice detection circuit 4 changes from "1" to "0" at time t_3, after the hangover time T_H has elapsed. The power calculation circuit 2 therefore resumes updating the average power P in line with the input signal S_in, and updating of the adaptive threshold S_T resumes with it.

[0037] The example of FIG. 3 assumes ambient noise for which the voiced determination persists because of the noise, and for which a single multiplication of the calculated power value by k is enough to raise the adaptive threshold S_T above the level of the input signal S_in. Depending on the noise level and the setting of the constant k, however, a single multiplication may not raise S_T above the level of S_in. In that case, as shown in the flowchart of FIG. 2, the calculated power value is raised step by step each time the fixed time T_0 elapses.
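How many multiplications are needed depends on P, ΔP, k, and the new noise level. The helper below is a hypothetical sketch that counts them, assuming the voiced decision is released as soon as k^n × P + ΔP is no longer below a steady noise level.

```python
def boosts_needed(avg_p, delta_p, noise_level, k):
    """Number of multiplications by k before S_T = k**n * P + ΔP reaches the noise level.

    Assumes a steady noise level and S_in > S_T as the voiced condition,
    so the decision is released once S_T >= noise_level.
    """
    if avg_p <= 0 or k <= 1.0:
        raise ValueError("requires avg_p > 0 and k > 1.0")
    n = 0
    while avg_p * k ** n + delta_p < noise_level:
        n += 1
    return n


print(boosts_needed(1.0, 2.0, 2.5, 2.0))  # 0: noise is still below the threshold
print(boosts_needed(1.0, 2.0, 6.0, 2.0))  # 2: 1.0 -> 2.0 -> 4.0, so S_T = 6.0
print(boosts_needed(1.0, 2.0, 6.0, 1.5))  # 4: a smaller k needs more steps
```

Under these assumptions the stuck voiced state lasts roughly n × T_0 plus the hangover time T_H, so a larger k shortens recovery at the cost of a coarser threshold step.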

[0038] Ordinary human speech, even when someone talks continuously, always contains pauses and breaks, so when the voice signal is examined closely at the power level, its signal level is intermittent. Therefore, if the fixed time T_0 of the timer circuit 5 in FIG. 1 is set to a suitable value in the range of, for example, about 1 to 10 seconds, there is little risk of the threshold S_T being updated because of a voice signal. Even if S_T were once updated because of a voice signal, the threshold is updated again from noise alone as soon as the speech pauses, so it can return to an appropriate value.

[0039] As described above, the voice detection device changes the adaptive threshold S_T after a fixed time has elapsed, even once the determination has become voiced. Therefore, even when the ambient noise changes abruptly and noise is mistakenly detected as voice, the device repeatedly adapts to the ambient noise and can reliably make the voiced/silent determination.

[0040] As a modification of the embodiment described above, an upper limit may be placed on either the calculated average power P of the input signal S_in or the adaptive threshold S_T, so that voice detection is performed with these values kept from reaching or exceeding a fixed level. In such a device, even when voice is input continuously, as with an input signal S_in whose voice signal is a music performance such as background music, voice detection can be guaranteed provided the upper limit of the adaptive threshold S_T is set higher than the power values that ordinary ambient noise can take.
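The upper limit can be sketched as a simple clamp on the boosted threshold. The function name `capped_threshold`, the limit `s_t_max`, and the numeric values are hypothetical; the clamp is applied to S_T here, although the text equally allows limiting P.

```python
def capped_threshold(avg_p, delta_p, k, boosts, s_t_max):
    """Threshold after `boosts` multiplications of P by k, limited to s_t_max (assumed name)."""
    s_t = avg_p * k ** boosts + delta_p
    return min(s_t, s_t_max)


MUSIC_LEVEL = 30.0   # continuous input such as background music (hypothetical level)
S_T_MAX = 20.0       # assumed limit, above any ordinary ambient-noise power

for boosts in range(6):
    s_t = capped_threshold(1.0, 2.0, 2.0, boosts, S_T_MAX)
    print(boosts, s_t, MUSIC_LEVEL > s_t)
```

Without the clamp, the fifth boost would raise the threshold to 34.0 and the music would be classified as silent. With the clamp, the threshold stops at 20.0 and the continuous input stays voiced.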

[0041] Although the power calculation circuit 2 described above calculates the new power value by multiplying the average power P by a predetermined coefficient k, it may instead add a predetermined value ΔP.

[0042] Furthermore, although the command signal 6 of the timer circuit 5 is a command to the power calculation circuit 2 in the device described above, the command signal 6 may instead be supplied to the threshold calculation circuit 3, for example, so that after the determination becomes voiced the average power P is held unchanged and only the adaptive threshold S_T is raised step by step each time the fixed time T_0 elapses.

[0043] Accordingly, the phrase "increases the threshold by a predetermined amount" in claim 1 should be understood to cover "an increase by a predetermined ratio", "an increase by a predetermined value", and combinations of the two.
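The three ways of raising the threshold discussed in this part of the description can be placed side by side. The function names and numeric values below are hypothetical; each function returns the pair (P, S_T) after one boost, assuming S_T = P + ΔP in the two variants that act on P.

```python
def boost_scale_power(avg_p, delta_p, k):
    """Main embodiment: multiply P by k, then S_T = k * P + ΔP."""
    new_p = avg_p * k
    return new_p, new_p + delta_p


def boost_add_power(avg_p, delta_p, step):
    """Additive variant: add a predetermined value to P instead of scaling it."""
    new_p = avg_p + step
    return new_p, new_p + delta_p


def boost_threshold_only(avg_p, s_t, step):
    """Threshold variant: hold P unchanged and raise only S_T."""
    return avg_p, s_t + step


print(boost_scale_power(2.0, 1.0, 1.5))     # (3.0, 4.0)
print(boost_add_power(2.0, 1.0, 0.5))       # (2.5, 3.5)
print(boost_threshold_only(2.0, 3.0, 0.5))  # (2.0, 3.5)
```

The additive and threshold-only variants give the same S_T here but leave different values of P, which matters once averaging resumes from P after the decision returns to silent.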

[0044]

EFFECTS OF THE INVENTION

Because the voice detection device of the present invention is configured as described above, even when the ambient noise changes abruptly it updates the threshold in accordance with the new ambient noise and can reliably make the voiced/silent determination.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an example of the voice detection device of the present invention.

FIG. 2 is a flowchart showing the operation of the voice detection device of FIG. 1.

FIG. 3 is a signal waveform diagram showing a specific example of the operation of the voice detection device of FIG. 1.

FIG. 4 is a block diagram showing an example of a conventional voice detection device using the adaptive threshold method.

FIG. 5 is a signal waveform diagram illustrating the operation of the voice detection device of FIG. 4.

FIG. 6 is a signal waveform diagram illustrating operation when the ambient noise increases abruptly.

[Explanation of symbols]

1: microphone, 2: power calculation circuit, 3: threshold calculation circuit, 4: voice detection circuit, 5: timer circuit, 6: command signal, 7: threshold control means, S_in: input signal, S_T: adaptive threshold, P: average noise power, C: voiced/silent determination signal.

Claims

1. A voice detection device for detecting a voice signal based on a power level difference between a noise signal and a voice signal contained in an input signal, comprising: adaptive threshold control means for setting, from the power level of the input signal, a threshold for detecting the voice signal; voice detection means for comparing the threshold with the input signal to determine voiced or silent; and timing means for measuring the time for which the voice detection means has continuously determined voiced, wherein the adaptive threshold control means sets the threshold to a value corresponding to the power level of the input signal while the determination is silent, and increases the threshold by a predetermined amount when voiced has been detected for longer than a time preset in the timing means.

2. The voice detection device according to claim 1, wherein the adaptive threshold control means repeatedly increases the threshold by a predetermined amount each time voiced has been detected for longer than the time preset in the timing means.

3. The voice detection device according to claim 2, wherein the adaptive threshold control means does not raise the threshold to or above a predetermined upper limit.