JPH01123297A - Voice detection circuit - Google Patents

Voice detection circuit

Info

Publication number
JPH01123297A
Authority
JP
Japan
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP62281753A
Other languages
Japanese (ja)
Inventor
Katsumi Hosoya
細谷 克美
Keiichi Shiro
代 啓一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp

Abstract

PURPOSE: To detect voice contained in an input signal under conditions of differing noise levels by updating the threshold used for the voiced/silent decision in accordance with the noise level. CONSTITUTION: A voice signal is divided into unit-time intervals, the average power Pi of each interval is calculated, and a constant value α is added to it to obtain (Pi + α). The average power Pi is compared with a threshold Ti, and a decision result X0 indicating which is larger is output. The value (Pi + α) is also compared with Ti, and a decision result X1 indicating which is larger is output. Based on X1, the threshold Ti+1 of the next interval is set to Ti if (Pi + α) is equal to or larger than Ti, and to (Pi + α) if (Pi + α) is smaller than Ti. The threshold is set to a predetermined constant value T0 at the start of operation and is then updated in this way each time a unit time elapses, with the decision result X0 output for each interval.

Description

[Detailed Description of the Invention] "Field of Industrial Application" This invention relates to a voice detection circuit which, in devices that transmit or store speech uttered by a person and in devices that are controlled by such speech, detects that speech is present, the time at which it occurs, that it has ended, and so on.

"Prior Art" Methods for configuring voice detection circuits used in devices that transmit or store speech have been proposed for some time. Applications include systems that transmit only the voiced portions to reduce transmission cost, or that compress silent portions before storage to make speech storage more efficient (for example, Yoshikawa et al., "Silence compression effect in voice storage services," 1981 (Showa 56) IEICE National Convention on Information and Systems, 473). The basic principle of these voice detection circuits is to use analysis parameters of the input signal, such as its amplitude, frequency components, number of zero crossings, and linear analysis, and to make the decision from their magnitudes and their changes over time (for example, Maruta, R., et al.: "Design and Performance of a DSI Terminal for Domestic Applications", IEEE Trans. Commun., COM-29, pp. 337-344 (March 1981)). In every one of these methods, however, the fundamental parameter is the average power of the input signal (corresponding to the amplitude envelope), and the other parameters merely supplement it.

In the voiced/silent decision method based on average power, an interval is judged voiced if its short-time average power (the mean of the squared amplitude of the speech signal) exceeds a predetermined threshold, and silent if it falls below the threshold.

Some circuits additionally incorporate an algorithm that smooths this result over time, correcting momentary silent intervals within speech to voiced and correcting sudden noises that occur during silence (for example, Yoshikawa and Ueda, "Silent-interval compression processing in voice communication systems that tolerate delay," Trans. IEICE, Showa 58 (1983), Ron 319 (B-105), pp. 908-915).

If the speech level is sufficiently high and the noise level is low, a fixed value can be used as the decision threshold. However, in devices that detect speech over a telephone line, the noise level varies considerably with the length of the line, its transmission characteristics, and so on. One way to detect speech in the input signal under such differing noise levels is to measure the noise immediately after the line is connected, set the threshold for the voiced/silent decision from that measurement, and make all subsequent decisions with it (for example, Hama and Ochiai, "A method for an adaptive-threshold voice detection circuit," 1973 (Showa 48) IEICE National Convention).

However, measuring noise at a specific time raises the question of when and for how long to measure, and speech or sudden noise occurring at the time of measurement causes errors in all subsequent decisions. Moreover, choosing a measurement time at which the speaker is expected not to be talking must be thought through for each application, which makes system design difficult.

"Means for Solving the Problems" According to this invention, a continuously input speech signal is divided into unit-time intervals (the description below focuses on the i-th interval). The average power Pi of the interval is calculated, and a constant value α is added to it to obtain (Pi + α). Pi is compared with a threshold Ti, and a decision result X0 indicating which is larger is output. The value (Pi + α) is also compared with Ti, and a decision result X1 indicating which is larger is output. Using X1, the threshold Ti+1 of the next interval is set equal to Ti when (Pi + α) is greater than or equal to Ti, and is set to (Pi + α) when (Pi + α) is smaller than Ti. At the start of operation the threshold Ti is set to a predetermined constant value T0; thereafter, each time a unit time elapses, the threshold is updated by the above procedure and the decision result X0 is output.

In this way, the invention does not measure the noise at a specific time. Instead, it repeats the same operation interval by interval, automatically determines a threshold that matches the noise level, and uses it to decide whether speech is present.

"Embodiment" An input signal x(t), such as speech, is converted by AD converter 1 into a digital signal Xv sampled at a fixed period (for example, every 125 µs). The samples are squared and accumulated by squaring circuit 2, adder circuit 3, and delay circuit 4 (a delay of 125 µs). Timing generator 5 resets the contents of delay circuit 4 to zero at fixed intervals (one frame, for example 32 ms), so adder circuit 3 delivers an output proportional to the total power of the input signal in that interval. This output is converted by logarithm circuit 6 into the log average power Pi of the interval.

AD converter 1, adder circuit 3, and timing generator 5 can each be realized with known techniques. Squaring circuit 2 and logarithm circuit 6 can be realized, for example, by storing the output value for each input in a read-only memory (ROM) or the like. The log average power Pi of each interval is thus obtained at fixed intervals and is fed to the section described below, which decides whether speech is present.
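
The accumulate-and-reset computation described above can be sketched in software as follows. This is a minimal illustration rather than the circuit itself: the function name `frame_log_powers`, the 256-sample frame (32 ms at one sample per 125 µs), the decibel scaling, and the small floor that avoids taking the logarithm of zero are all assumptions that do not appear in the original text.

```python
import math


def frame_log_powers(samples, frame_len=256, floor=1e-12):
    """Return the log average power (in dB, an assumed scale) of each complete frame."""
    powers = []
    acc = 0.0  # running sum of squares (role of adder 3 and delay circuit 4)
    for n, x in enumerate(samples, start=1):
        acc += x * x  # squaring circuit 2 feeding the accumulator
        if n % frame_len == 0:  # frame boundary (role of timing generator 5)
            mean_power = acc / frame_len
            # Logarithm circuit 6; the floor guards against log10(0).
            powers.append(10.0 * math.log10(max(mean_power, floor)))
            acc = 0.0  # reset the accumulator for the next frame
    return powers


if __name__ == "__main__":
    # Two frames of a constant signal with amplitude 0.1.
    print(frame_log_powers([0.1] * 512))
```

Samples left over after the last complete frame are ignored in this sketch.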

A constant value α is added to the log average power Pi in adder circuit 7, and the sum is compared with the threshold Ti in comparator circuit 8; the decision result X1 controls selection circuit 9. Comparator circuit 8 outputs
  1 when Pi + α is greater than or equal to Ti,
  0 when Pi + α is smaller than Ti.
Selection circuit 9 passes one of its input terminals D0 and D1 to output terminal Q according to the input at control terminal S1, using the following logic:
  If S1 = 0, then Q = D1.
  If S1 = 1, then Q = D0.

This output is fed to delay circuit 10 (a delay of 32 ms) and is used as the threshold Ti+1 of the next interval. Delay circuit 10 is loaded with a predetermined constant value when the detection circuit starts operating, and its contents are updated at fixed intervals (every 32 ms) thereafter. Meanwhile, comparator circuit 11 compares the log average power Pi with the threshold Ti and outputs 1 (indicating voiced) when Pi is greater than or equal to Ti, and 0 (indicating silent) when Pi is smaller than Ti.

FIG. 2 illustrates the operating principle of this voice detection circuit. For each interval (32 ms), the circuit calculates the average power Pi of the input signal in the interval and compares it with the threshold Ti:
  voiced when Pi ≥ Ti,
  silent when Pi < Ti.
When Pi is small, the threshold Ti+1 used for the decision in the next interval is updated by the following logic:
  Ti+1 = Ti (unchanged) when Pi ≥ Ti - α,
  Ti+1 = Pi + α (updated) when Pi < Ti - α.
At the start of operation, Ti is set to the predetermined value T0.
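
The decision and update rules of FIG. 2 can be sketched as a short loop. This is only an illustrative software rendering under stated assumptions: the function name `detect_voice` and the representation of the input as a list of per-interval log powers are hypothetical, and the numeric values in the usage example are invented for demonstration.

```python
def detect_voice(log_powers, t0, alpha):
    """Return a list of (x0, threshold_used) pairs, one per interval.

    x0 is 1 for voiced (Pi >= Ti) and 0 for silent (Pi < Ti).
    """
    results = []
    t = t0  # threshold starts at the predetermined value T0
    for p in log_powers:
        x0 = 1 if p >= t else 0  # decision for the current interval
        results.append((x0, t))
        # Threshold for the next interval: lowered only when Pi + alpha < Ti.
        if p + alpha < t:
            t = p + alpha
    return results


if __name__ == "__main__":
    # Quiet, quiet, loud, quiet (hypothetical log powers).
    for x0, t in detect_voice([-50, -50, -20, -50], t0=-10, alpha=6):
        print(x0, t)
```

In this example the first interval pulls the threshold down from -10 to -44, after which the loud interval is judged voiced and the quiet ones silent.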

FIG. 3 schematically shows how the threshold Ti changes over time when the voice detection circuit operates. The value of Ti changes with time and finally settles at a value slightly above the noise level. For the circuit to operate stably, the noise level after the start of operation must be roughly constant, with fluctuations that are not extremely large. Stable voice detection is obtained if α is set slightly larger than the fluctuation width of the noise level and the initial threshold T0 is set higher than the expected noise level.
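
The settling behavior can be checked with a small sketch. The update rule (keep Ti when Pi + α ≥ Ti, otherwise take Pi + α) is equivalent to taking the minimum of Ti and Pi + α, so the threshold tracks the lowest value of Pi + α seen so far. The function name `threshold_trace` and the noise values below are hypothetical and chosen only for illustration.

```python
def threshold_trace(log_powers, t0, alpha):
    """Return [T0, T1, ..., Tn] for the basic update rule."""
    trace = [t0]
    for p in log_powers:
        # Same as: keep Ti if Pi + alpha >= Ti, else replace it by Pi + alpha.
        trace.append(min(trace[-1], p + alpha))
    return trace


if __name__ == "__main__":
    # Hypothetical noise fluctuating over a 2-unit range, with alpha = 3.
    noise = [-50, -49, -51, -50.5] * 3
    trace = threshold_trace(noise, t0=-20, alpha=3)
    print(trace[-1])  # lowest noise value plus alpha
```

With α larger than the fluctuation width, the settled threshold lies above every noise value in this example, so none of the noise intervals would be judged voiced once the threshold has settled.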

This voice detection circuit operates stably even when the background noise level differs greatly. In the following cases, however, decision errors may become large:
(1) when the noise is extremely low;
(2) when sudden noise occurs;
(3) when the noise level increases over time.
Means of dealing with these problems are described below.

(1) In this case, the fluctuation width of the noise level on the logarithmic axis becomes relatively large. If it exceeds the predetermined value α, errors in which noise is judged voiced increase sharply. Because the noise level is sufficiently low here, speech can be detected adequately without making Ti very small. Such unstable operation can therefore be prevented by imposing a lower limit on Ti and controlling it so that it never falls below that limit.

(2) In this case, the error arises because any interval whose average power exceeds the threshold is judged voiced, even if the excess is only a sudden burst. A practical way to prevent this is to smooth the output of the voice detection circuit over time. Algorithms that make the final decision under constraints such as the minimum duration allowed for a voiced interval and the minimum duration allowed for a silent interval have already been published and can be used for this smoothing (for example, the aforementioned Yoshikawa and Ueda, Trans. IEICE, Showa 58 (1983), Ron 319 (B-105)).

(3) In this case, the problem is that the threshold Ti changes by seeking the minimum of the average power: it can follow a noise level that decreases over time, but not one that increases. This can be solved by setting the next threshold, whenever the value of Ti is not updated, to Ti+1 = Ti + ε instead of Ti+1 = Ti, with ε chosen appropriately. While such intervals continue for a long time, Ti rises little by little and can follow a gradual increase in noise.
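
The temporal smoothing mentioned under (2) is not specified in this document. As one hypothetical illustration (not the cited algorithm), the sketch below applies two minimum-duration constraints to a sequence of per-interval decisions: voiced runs shorter than `min_voiced` are turned silent, then silent gaps shorter than `min_silent` are turned voiced. The function name, parameter names, default values, and the choice to leave runs at the start and end of the sequence untouched are all assumptions.

```python
def smooth_decisions(flags, min_voiced=2, min_silent=2):
    """Smooth a list of 0/1 decisions using minimum run lengths."""

    def runs(seq):
        # Run-length encode: list of [value, length].
        out = []
        for v in seq:
            if out and out[-1][0] == v:
                out[-1][1] += 1
            else:
                out.append([v, 1])
        return out

    def flip_short(seq, value, min_len):
        # Flip interior runs of `value` that are shorter than `min_len`.
        encoded = runs(seq)
        result = []
        for idx, (v, n) in enumerate(encoded):
            interior = 0 < idx < len(encoded) - 1
            if v == value and n < min_len and interior:
                v = 1 - v
            result.extend([v] * n)
        return result

    without_bursts = flip_short(flags, 1, min_voiced)  # drop short bursts
    return flip_short(without_bursts, 0, min_silent)   # bridge short gaps


if __name__ == "__main__":
    raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
    print(smooth_decisions(raw))
```

In the example, the isolated one-interval burst is removed and the one-interval gap inside the longer voiced stretch is bridged.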

FIG. 4 shows an example configuration that provides both the lower limit on Ti described in (1) and the function for increasing Ti described in (3). The added parts are comparator circuit 12 and adder circuit 13. Selection circuit 14 is controlled by the outputs of both comparator circuits 8 and 12 and passes one of its input terminals D0, D1, and D2 to its output. Input terminal D0 receives the sum of Ti and ε formed by adder circuit 13, and input terminals D1 and D2 receive (Pi + α) and Tmin, respectively. Comparator circuit 12 compares (Pi + α) with Tmin; its output S0 is 1 when (Pi + α) ≥ Tmin and 0 when (Pi + α) < Tmin. The selection circuit operates according to the following logic.

If S0 = S1 = 0, then Q = D2.
If S0 = 1 and S1 = 0, then Q = D1.
If S0 = S1 = 1, then Q = D0.
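
The selection logic of FIG. 4 can be sketched as a function that returns the threshold for the next interval. The function and parameter names are hypothetical. The text lists only three of the four combinations of S0 and S1; the handling of the remaining one (S0 = 0, S1 = 1, which requires Ti to be below Tmin) is an assumption made here so that the function is total.

```python
def next_threshold(p, t, alpha, t_min, eps):
    """Return the next threshold from Pi (p) and Ti (t), following FIG. 4."""
    s1 = 1 if p + alpha >= t else 0      # comparator circuit 8
    s0 = 1 if p + alpha >= t_min else 0  # comparator circuit 12
    if s0 == 1 and s1 == 1:
        return t + eps    # D0: no update, so let the threshold creep upward
    if s0 == 1 and s1 == 0:
        return p + alpha  # D1: follow the lower power level
    if s0 == 0 and s1 == 0:
        return t_min      # D2: clamp at the lower limit
    # s0 == 0 and s1 == 1 is not covered by the text.
    # Assumption: clamp at the lower limit in this case as well.
    return t_min


if __name__ == "__main__":
    print(next_threshold(-50, -10, alpha=6, t_min=-60, eps=0.5))  # D1 branch
    print(next_threshold(-20, -44, alpha=6, t_min=-60, eps=0.5))  # D0 branch
    print(next_threshold(-80, -44, alpha=6, t_min=-60, eps=0.5))  # D2 branch
```

If T0 is chosen at or above Tmin, the threshold never drops below Tmin under the three listed cases, so the unlisted combination should not arise in normal operation.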

Each of the functions described in the above embodiments can be realized as a program running on an electronic computer or microprocessor, or can be realized with a digital signal processor (DSP).

"Effects of the Invention" As explained above, the voice detection circuit of this invention makes it possible to detect speech contained in an input signal under conditions of differing noise levels. Unlike conventional devices, it does not measure the noise at a specific time; instead, it updates the threshold according to the noise level while making the voiced/silent decisions. This saves the time required for noise measurement and eliminates the problem that speech or sudden noise occurring at the time of measurement causes errors in subsequent decisions.

In particular, in devices that detect speech over a telephone line, the noise level varies considerably with the line distance, the connection route, and so on. With the voice detection circuit of this invention, subsequent decisions no longer go badly wrong even if a person is speaking at the moment the voiced/silent decision operation starts, or if sudden noise occurs.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing a first embodiment of the invention, FIG. 2 is a diagram showing the operating principle of the circuit of the invention, FIG. 3 is a diagram schematically illustrating the operation of the circuit of the invention, and FIG. 4 is a block diagram showing a second embodiment of the invention. Patent applicant: Nippon Telegraph and Telephone Corporation

Claims (3)

[Claims]
(1) A voice detection circuit comprising: means for dividing a continuously input speech signal into unit-time intervals (the description below focuses on the i-th interval) and calculating the average power of the interval (hereinafter P_i); means for calculating a value (P_i + α) obtained by adding a constant value α to the average power P_i; means for comparing the average power P_i with a threshold (hereinafter T_i) and outputting a decision result X_0 indicating which is larger; means for comparing the calculated value (P_i + α) with the threshold T_i and outputting a decision result X_1 indicating which is larger; and means for receiving the decision result X_1 and setting the threshold T_{i+1} of the next interval equal to T_i when (P_i + α) is greater than or equal to T_i, and to (P_i + α) when (P_i + α) is smaller than T_i; wherein the threshold T_i is set equal to a predetermined constant value T_0 at the start of operation, the threshold T_i is changed by said means each time a unit time elapses, and the decision result X_0 is output.
(2) The voice detection circuit according to claim 1, comprising means for receiving the decision result X_1 and setting the threshold T_{i+1} of the next interval equal to T_i when (P_i + α) is greater than or equal to T_i; to (P_i + α) when (P_i + α) is smaller than T_i and (P_i + α) is larger than a constant value T_min; and to T_min when (P_i + α) is smaller than T_i and (P_i + α) is smaller than the constant value T_min.
(3) The voice detection circuit according to claim 1, comprising means for receiving the decision result X_1 and setting the threshold T_{i+1} of the next interval to a value larger than T_i by a constant value ε when (P_i + α) is greater than or equal to T_i.

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP62281753A JPH01123297A (en) 1987-11-06 1987-11-06 Voice detection circuit

Publications (1)

Publication Number Publication Date
JPH01123297A true JPH01123297A (en) 1989-05-16

Family

ID=17643495

Family Applications (1)

Application Number Title Priority Date Filing Date
JP62281753A Pending JPH01123297A (en) 1987-11-06 1987-11-06 Voice detection circuit

Country Status (1)

Country Link
JP (1) JPH01123297A (en)

Similar Documents

Publication Publication Date Title
JP4282659B2 (en) Voice section detection apparatus and method for voice signal processing apparatus
JP4236726B2 (en) Voice activity detection method and voice activity detection apparatus
US5687285A (en) Noise reducing method, noise reducing apparatus and telephone set
JP3297346B2 (en) Voice detection device
JPS62274941A (en) Audio coding system
US5809460A (en) Speech decoder having an interpolation circuit for updating background noise
EP0972283A1 (en) Vocoder system and method for performing pitch estimation using an adaptive correlation sample window
JP2000172283A (en) System and method for detecting sound
US7254532B2 (en) Method for making a voice activity decision
JPH10200351A (en) Digital audio processor
JPH01123297A (en) Voice detection circuit
JP3555490B2 (en) Voice conversion system
JP2002198918A (en) Adaptive noise level adaptor
JPH1091184A (en) Sound detection device
JP3375655B2 (en) Sound / silence determination method and device
JP2981044B2 (en) Digital automatic gain controller
Long et al. An improved method for robust speech endpoint detection
JPH1195785A (en) Voice segment detection system
JPH03241400A (en) Voice detector
JPS6058707A (en) Automatic gain control circuit
JPH0832526A (en) Voice detector
JP2793520B2 (en) Sound determination circuit
JPH02272837A (en) Voice section detection system
JPH03141740A (en) Sound detector
JP2001134299A (en) Speech speed converter