JPH07336289A

JPH07336289A - Vox control communication device

Info

Publication number: JPH07336289A
Application number: JP6126104A
Authority: JP
Inventors: Hideaki Yoshida; 秀明吉田; Toshiyuki Okamura; 俊幸岡村
Original assignee: Japan Radio Co Ltd
Current assignee: Japan Radio Co Ltd
Priority date: 1994-06-08
Filing date: 1994-06-08
Publication date: 1995-12-22

Abstract

PURPOSE: To eliminate clipping of the beginning of speech and achieve high quality in a communication device that employs VOX control. CONSTITUTION: A speech signal is converted into a digital signal by a PCM encoder 1, and the average power and reflection coefficients of the current frame are calculated by a speech encoder 2 and input to a speech detector 3. The speech detector 3 calculates the amount of change in average power between the silent section and the current frame. A final threshold is then determined as the weighted average of a threshold calculated from the average power of the silent section and a threshold calculated from the average prediction gain of the silent section and the prediction gain of the current frame. This final threshold is compared with the amount of change to decide whether the current frame is voiced or silent. The average power and average prediction gain of the silent section are updated successively by a weighted average with the average power and prediction gain of each current frame judged to be silent.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a VOX control communication device, and more particularly to VOX control used in digital mobile communication equipment and the like.

[0002]

[Prior Art] VOX (Voice Operated Transmitter) has long been known as one power-saving technique for radio equipment during transmission. VOX detects the input speech signal, turns the transmitter power ON only while a speech signal is present, and turns it OFF when no speech signal is present.

[0003] In mobile communication equipment such as mobile phones in particular, power is supplied by a battery, so VOX is widely used as a power-saving technique.

[0004] In VOX, if the rise time from detection of the speech signal until the transmitter power is turned ON is too long, so-called beginning-of-speech clipping occurs, and the speech becomes hard to understand on the receiving side. Detecting the presence or absence of a speech signal is therefore extremely important.

[0005] In conventional VOX, the voiced/silent decision has been made by setting a fixed threshold level for the input speech level and comparing the two values.
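For comparison with the later paragraphs, the conventional fixed-threshold decision can be sketched as follows. This is only an illustration: the function names, the dB scale, and the threshold of -40 dB are assumptions and do not come from the source.

```python
import math

def frame_level_db(samples):
    """Mean-square level of one frame in dB (assumed scale, samples in [-1, 1])."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log(0)

def is_voiced_fixed(samples, threshold_db=-40.0):
    """Conventional VOX decision: compare the input level with a fixed threshold."""
    return frame_level_db(samples) >= threshold_db

loud_vowel = [0.5] * 160        # about -6 dB, clearly above the threshold
weak_consonant = [0.005] * 160  # about -46 dB, falls below the fixed threshold
print(is_voiced_fixed(loud_vowel), is_voiced_fixed(weak_consonant))
```

With these assumed values the weak frame is classified as silent, which is the kind of misjudgment the following paragraphs address.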

[0006]

[Problem to Be Solved by the Invention] However, in a configuration that decides between voiced and silent using such a fixed threshold, a low-level speech signal such as a consonant may be misjudged as falling below the fixed threshold. As a result, the beginning of speech is lost and the sound quality deteriorates.

[0007] The present applicant therefore previously proposed, in Japanese Patent Application No. Hei 6-411, a configuration that determines the threshold based on the average power of the silent section and decides between voiced and silent by comparison with this variable threshold. Determining the threshold for the voiced/silent decision according to the average power of the silent section makes it possible to detect low-level speech signals reliably even under high noise.

[0008] Even when the threshold is determined according to the average power of the silent section in this way, however, a misjudgment may still occur when certain consonants (for example, those of the Japanese sa row) come at the beginning of speech. The approach was therefore not sufficient to prevent beginning-of-speech clipping for all speech signals.

[0009] The present invention has been made in view of the above problems of the prior art, and its object is to provide a VOX control communication device that reliably prevents clipping of the beginning of speech under high noise and performs highly efficient VOX control with few malfunctions.

[0010]

[Means for Solving the Problem] To achieve the above object, the VOX control communication device according to claim 1 is a VOX control communication device that determines whether the current frame is voiced or silent when encoding analog input speech and transmits only when voiced, the device comprising: change amount calculating means for calculating the amount of change of the average power of the current frame relative to the average power of the silent section; prediction gain calculating means for calculating a prediction gain based on at least the first- and second-order reflection coefficients; average calculating means for calculating the average prediction gain of the silent section; first threshold calculating means for calculating a first threshold based on the average power of the silent section; second threshold calculating means for calculating a second threshold based on the average prediction gain and the prediction gain of the current frame; third threshold calculating means for calculating a third threshold as a weighted average of the first and second thresholds; determining means for determining whether the current frame is voiced or silent by comparing the amount of change with the third threshold; and updating means for, when the determining means determines that the current frame is silent, updating the average power of the silent section with a weighted average of the average power of the silent section and the average power of the current frame determined to be silent, and updating the average prediction gain of the silent section with a weighted average of the average prediction gain of the silent section and the prediction gain of the current frame determined to be silent.

[0011] To achieve the above object, the VOX control communication device according to claim 2 is the VOX control communication device according to claim 1, characterized in that the prediction gain calculating means calculates the prediction gain based on the first- to fourth-order reflection coefficients.

[0012] To achieve the above object, the VOX control communication device according to claim 3 is the VOX control communication device according to claim 1 or claim 2, characterized in that the first threshold calculating means calculates the first threshold so as to be inversely proportional to the average power of the silent section.

[0013] Furthermore, to achieve the above object, the VOX control communication device according to claim 4 is the VOX control communication device according to claim 1, claim 2, or claim 3, characterized in that the second threshold calculating means calculates the second threshold $T_H$, where $R_{av}$ is the average prediction gain, $R$ is the prediction gain, and $A$ and $B$ are positive constants, by

[Equation 1] $T_H = (A / R_{av}^3) \cdot R^3 - B$

[0014]

[Operation] A voiced/silent decision that relies only on the threshold calculated from the average power of the silent section (the first threshold) can miss consonants. To prevent this, the VOX control communication device of the present invention determines the final threshold as a weighted average of the first threshold and a threshold based on the prediction gain calculated from the reflection coefficients (the second threshold), and makes the voiced/silent decision by comparing this final threshold with the amount of change. The function used to determine the second threshold is adjusted based on the average prediction gain of the silent section, so the second threshold is ultimately determined from both the average prediction gain of the silent section and the prediction gain of the current frame.

[0015] The invention focuses on the fact that the prediction gain calculated from the reflection coefficients falls when a consonant is input. Determining a threshold (the second threshold) based on the prediction gain therefore allows consonants to be detected reliably. If the second threshold were simply given a positive correlation with the prediction gain, however, it could grow unnecessarily as the prediction gain increases. A coefficient that is negatively correlated with the average prediction gain of the silent section is therefore used as the coefficient of the function that determines the second threshold, which keeps the second threshold at an appropriate value. Meanwhile, determining a threshold (the first threshold) based on the average power of the silent section makes it possible to respond to minute power fluctuations even under high noise. Determining the final threshold (the third threshold) as the weighted average of the first and second thresholds thus makes it possible to detect consonants reliably even under high noise.

[0016] The first threshold is set so as to be inversely proportional to the average power of the silent section, that is, so that the threshold is lowered when the noise is high and the average power of the silent section is large. The prediction gain used in calculating the second threshold is computed from at least the first- and second-order reflection coefficients, and preferably from the first- to fourth-order reflection coefficients. This allows the threshold to be set to an optimum value in a short time without responding excessively to fluctuations in the reflection coefficients.

[0017]

[Embodiment] An embodiment of the VOX control communication device of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of the configuration of this embodiment. A speech signal input from a microphone (not shown) is input to the PCM encoder 1 and converted into a digital signal. The digital signal from the PCM encoder 1 is input to the speech encoder 2, which calculates the average power and reflection coefficients of the current frame. The calculated average power and reflection coefficients are input to the speech detector 3. The speech detector 3 performs the calculations and magnitude comparison described below to determine whether the current frame is voiced or silent, and outputs a control signal to the radio unit 4. Based on the control signal from the speech detector 3, the radio unit 4 supplies the digital signal to the antenna 5 and transmits when the current frame is voiced.

[0018] FIG. 2 is a flowchart of the processing performed by the speech detector 3. As described above, the speech detector 3 receives the average power of the current frame from the speech encoder 2. It calculates the amount of change between this average power and the previously calculated average power of the silent section (S101). The speech encoder 2 calculates the average power by computing the sum of squares of the digital signal samples input from the PCM encoder 1. The average power of the silent section is calculated from the average power of sections already judged to be silent and is updated successively, as detailed later. After calculating the amount of change of the current frame's average power relative to the average power of the silent section, the speech detector 3 calculates the prediction gain based on the first- to fourth-order reflection coefficients input from the speech encoder 2 (S102). With the reflection coefficients denoted $r_i$ ($i = 1, 2, 3, 4$), the prediction gain $R$ is calculated by

[Equation 2]

If the prediction gain is calculated using only the first-order reflection coefficient, it fluctuates excessively in response to changes in the reflection coefficient, making it difficult to determine the threshold optimally. If reflection coefficients of the fifth order and higher are also taken into account, efficiency drops because the higher-order coefficients contribute almost nothing and calculating the prediction gain takes a long time. The prediction gain is therefore best calculated using reflection coefficients up to at least the second order, and preferably the first to fourth orders.
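Steps S101 and S102 might be sketched as below. Several points are assumptions rather than statements of the source: the power is expressed in dB, the amount of change is taken as a simple difference, and, because Equation 2 is not reproduced in this text, the prediction gain uses the textbook linear-prediction relation $1 / \prod_i (1 - r_i^2)$, which may differ from the patent's own definition. All function names are hypothetical.

```python
import math

def average_power_db(frame):
    """Frame power from the sum of squares of the PCM samples, in dB (assumed scale)."""
    sum_sq = sum(x * x for x in frame)
    return 10.0 * math.log10(max(sum_sq / len(frame), 1e-12))  # floor avoids log(0)

def power_change(frame_power_db, silent_avg_power_db):
    """S101: change of the current frame power relative to the silent-section
    average (assumed here to be a difference in dB)."""
    return frame_power_db - silent_avg_power_db

def prediction_gain(reflection_coeffs):
    """S102: textbook prediction gain 1 / prod(1 - r_i^2), an assumed stand-in
    for Equation 2. Requires |r_i| < 1 for every coefficient used."""
    gain = 1.0
    for r in reflection_coeffs[:4]:  # first- to fourth-order coefficients only
        gain /= (1.0 - r * r)
    return gain

print(power_change(-30.0, -50.0))
print(prediction_gain([0.9, -0.5, 0.2, 0.1]))  # strongly correlated frame, higher gain
print(prediction_gain([0.1, 0.05, 0.0, 0.0]))  # noise-like frame, gain close to 1
```

Under this assumed definition, a frame with small reflection coefficients yields a gain near 1, which is consistent with the text's remark that the prediction gain falls for consonants.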

[0019] After the prediction gain has been calculated from the first- to fourth-order reflection coefficients, the speech detector 3 determines threshold 1 (the first threshold) based on the average power of the silent section (S103). Threshold 1 is determined so as to be inversely proportional to the average power of the silent section; in this embodiment it is determined by

[Equation 3] $T_{H1} = -0.0833\,(P_0 + 6)$

where $T_{H1}$ is threshold 1 and $P_0$ is the average power of the silent section. FIG. 3 shows the relationship between the average power $P_0$ of the silent section and threshold 1.
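Equation 3 can be evaluated directly, as in the short sketch below. The function name and the sample values of $P_0$ are assumptions; the text does not state the unit of $P_0$, so the inputs here are purely illustrative.

```python
def threshold_1(silent_avg_power):
    """Equation 3: TH1 = -0.0833 * (P0 + 6)."""
    return -0.0833 * (silent_avg_power + 6.0)

# A larger silent-section power (more background noise) gives a lower threshold 1.
for p0 in (-42.0, -30.0, -18.0):
    print(p0, threshold_1(p0))
```

The loop shows the behaviour described in the text: as the average power of the silent section rises, threshold 1 falls.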

[0020] After threshold 1 has been determined from the average power of the silent section, the speech detector 3 determines the function for calculating threshold 2 based on the average of the prediction gains calculated in S102 for past silent sections (S104). In this embodiment, threshold 2 is determined by a cubic function with a positive correlation, so that it decreases as the prediction gain decreases and increases as the prediction gain increases, and the cubic coefficient of that function is determined from the average prediction gain of the silent section. Specifically, with the average prediction gain of the silent section denoted $R_{av}$, the cubic coefficient $C_{oe}$ is determined by

[Equation 4] $C_{oe} = 6.5 / R_{av}^3$

Once the function for calculating threshold 2 (the second threshold) has been determined in this way, the speech detector 3 determines threshold 2 from the prediction gain $R$ of the current frame, using the cubic function with the coefficient $C_{oe}$ determined in S104, by

[Equation 5] $T_{H2} = C_{oe} \cdot R^3 - 1.5$

FIG. 4 shows the relationship between the prediction gain and threshold 2. The larger the average prediction gain of the silent section, the smaller the cubic coefficient $C_{oe}$ and hence the smaller threshold 2; conversely, the smaller the average prediction gain of the silent section, the larger $C_{oe}$ and hence the larger threshold 2.
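The two formulas for threshold 2 can be combined into a small sketch. The function and parameter names are hypothetical; the constants 6.5 and 1.5 are those of the embodiment, and the sample gain values are assumed for illustration.

```python
def cubic_coefficient(avg_silent_gain, a=6.5):
    """Equation 4: Coe = 6.5 / Rav^3."""
    return a / avg_silent_gain ** 3

def threshold_2(gain, avg_silent_gain, a=6.5, b=1.5):
    """Equation 5: TH2 = Coe * R^3 - 1.5, with Coe taken from Equation 4."""
    return cubic_coefficient(avg_silent_gain, a) * gain ** 3 - b

print(threshold_2(2.0, 2.0))  # R equal to Rav
print(threshold_2(1.0, 2.0))  # lower gain than the silent average: smaller threshold 2
print(threshold_2(2.0, 4.0))  # larger silent average: smaller threshold 2 for the same R
```

One consequence of the two formulas is that threshold 2 depends only on the ratio $R / R_{av}$: when $R = R_{av}$ it equals $6.5 - 1.5 = 5$ whatever the value of $R_{av}$, so the function is effectively normalized by the background.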

[0021] After threshold 1 and threshold 2 have been calculated, the final threshold is calculated as their weighted average (S106). The calculation uses weights of 2:1:

[Equation 6] $T_H = (2T_{H1} + T_{H2}) / 3$
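The weighted average of step S106 is a one-line computation. In the sketch below the function name and the option to pass other weights are assumptions; the defaults reproduce the 2:1 weighting of the embodiment.

```python
def final_threshold(th1, th2, w1=2.0, w2=1.0):
    """Equation 6 with the 2:1 weighting: TH = (2*TH1 + TH2) / 3."""
    return (w1 * th1 + w2 * th2) / (w1 + w2)

print(final_threshold(3.0, 0.0))
```

Because threshold 1 carries twice the weight, a drop in threshold 2 for a low-gain frame lowers the final threshold by one third of that drop.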

[0022] After the final threshold $T_H$ has been determined, the amount of change calculated in S101 is compared with $T_H$ (S107). If the amount of change is equal to or greater than the threshold, the current frame is judged to be voiced (S108) and a voiced signal is output to the radio unit 4. If the amount of change is smaller than the threshold, a weighted average of the previously calculated average power of the silent section and the average power of the current frame is calculated and used to update the average power of the silent section (S109), and the average prediction gain of the silent section is updated by a weighted average of the average prediction gain of the silent section and the prediction gain of the current frame (S110). The current frame is then judged to be silent, and a silent signal is output to the radio unit 4 (S111).
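The flow from S101 to S111 can be put together for a single frame as in the following sketch. It is not the patent's implementation: the class and function names are hypothetical, the amount of change is assumed to be a simple difference, and the smoothing weight `alpha` is an assumed value because the source does not give the weights used for the updates.

```python
from dataclasses import dataclass

@dataclass
class SilenceState:
    avg_power: float  # P0: average power of the silent section
    avg_gain: float   # Rav: average prediction gain of the silent section

def detect_frame(state, frame_power, frame_gain, alpha=0.9):
    """Process one frame; return True when it is judged voiced.

    alpha is a hypothetical weight for the running averages.
    """
    change = frame_power - state.avg_power                      # S101 (assumed: difference)
    th1 = -0.0833 * (state.avg_power + 6.0)                     # S103, Equation 3
    th2 = (6.5 / state.avg_gain ** 3) * frame_gain ** 3 - 1.5   # S104 and Equations 4, 5
    th = (2.0 * th1 + th2) / 3.0                                # S106, Equation 6
    if change >= th:                                            # S107
        return True                                             # S108: voiced, averages unchanged
    # S109 and S110: fold the silent frame into the running averages
    state.avg_power = alpha * state.avg_power + (1.0 - alpha) * frame_power
    state.avg_gain = alpha * state.avg_gain + (1.0 - alpha) * frame_gain
    return False                                                # S111: silent

state = SilenceState(avg_power=-30.0, avg_gain=2.0)
print(detect_frame(state, frame_power=-30.0, frame_gain=2.0))   # background-like frame
state = SilenceState(avg_power=-30.0, avg_gain=2.0)
print(detect_frame(state, frame_power=-28.5, frame_gain=1.0))   # weak, low-gain frame
```

With these assumed numbers, the second frame's change of 1.5 is below threshold 1 alone (about 2.0) but above the combined threshold (about 1.1), so the low prediction gain is what allows the weak frame to be judged voiced.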

[0023] In this embodiment, the speech detector 3 thus makes the voiced/silent decision from the amount of change between the average power of the silent section and the average power of the current frame, and successively updates the average power of the silent section, so it can follow subtle power fluctuations under noise and prevent malfunctions. In addition, the threshold used for the voiced/silent decision is calculated as a weighted average of a threshold determined from the average power of the silent section (threshold 1) and a threshold determined from the average prediction gain of the silent section and the prediction gain of the current frame (threshold 2). Consonants are therefore detected well even under colored noise, loss of the beginning of speech is effectively prevented, and high-quality communication becomes possible. That is, threshold 1 is made negatively correlated with the average power of the silent section to ease detection under high noise; threshold 2 is made positively correlated with the prediction gain to ease detection of consonants; and the coefficient of the function that determines threshold 2 is made negatively correlated with the average prediction gain of the silent section to suppress unnecessary growth of threshold 2 under noise. Together these optimize the threshold.

[0024]

[Effect of the Invention] As described above, the VOX control communication device according to claims 1 to 4 can optimize the threshold used for the voiced/silent decision even under noise, and can perform high-quality, highly efficient VOX control with few malfunctions.

[Brief description of drawings]

FIG. 1 is a block diagram of the configuration of an embodiment of the present invention.

FIG. 2 is a processing flowchart of the embodiment of the present invention.

FIG. 3 is a graph showing the relationship between the average power of the silent section and the first threshold in the embodiment of the present invention.

FIG. 4 is a graph showing the relationship between the prediction gain and the second threshold in the embodiment of the present invention.

[Description of Reference Numerals]
1 PCM encoder
2 Speech encoder
3 Speech detector
4 Radio unit
5 Antenna

Claims

1. A VOX control communication device that determines whether the current frame is voiced or silent when encoding analog input speech and transmits only when voiced, the device comprising: change amount calculating means for calculating the amount of change of the average power of the current frame relative to the average power of the silent section; prediction gain calculating means for calculating a prediction gain based on at least the first- and second-order reflection coefficients; average calculating means for calculating the average prediction gain of the silent section; first threshold calculating means for calculating a first threshold based on the average power of the silent section; second threshold calculating means for calculating a second threshold based on the average prediction gain and the prediction gain of the current frame; third threshold calculating means for calculating a third threshold as a weighted average of the first and second thresholds; determining means for determining whether the current frame is voiced or silent by comparing the amount of change with the third threshold; and updating means for, when the determining means determines that the current frame is silent, updating the average power of the silent section with a weighted average of the average power of the silent section and the average power of the current frame determined to be silent, and updating the average prediction gain of the silent section with a weighted average of the average prediction gain of the silent section and the prediction gain of the current frame determined to be silent.

2. The VOX control communication device according to claim 1, wherein the prediction gain calculating means calculates the prediction gain based on the first- to fourth-order reflection coefficients.

3. The VOX control communication device according to claim 1 or claim 2, wherein the first threshold calculating means calculates the first threshold so as to be inversely proportional to the average power of the silent section.

4. The VOX control communication device according to claim 1, claim 2, or claim 3, wherein the second threshold calculating means calculates the second threshold $T_H$ by $T_H = (A / R_{av}^3) \cdot R^3 - B$, where $R_{av}$ is the average prediction gain of the silent section, $R$ is the prediction gain, and $A$ and $B$ are positive constants.