JPH04119028A - Voice detector

Info

Publication number: JPH04119028A
Application number: JP23940590A
Authority: JP
Inventors: Shin Sugano; 伸菅野; Tsukasa Tsujimura; 辻村　司; Shigeaki Suzuki; 茂明鈴木
Original assignee: Mitsubishi Electric Corp; Nippon Telegraph and Telephone Corp
Current assignee: Mitsubishi Electric Corp; Nippon Telegraph and Telephone Corp
Priority date: 1990-09-10
Filing date: 1990-09-10
Publication date: 1992-04-20
Anticipated expiration: 2011-03-29
Also published as: JPH0834458B2

Abstract

PURPOSE: To always obtain an appropriate background noise power by receiving a received-packet presence signal from a packet disassembly unit and, during periods in which a received packet is present, holding the result of an estimated silent power calculation unit at the value calculated in the preceding period. CONSTITUTION: A received-packet presence signal applied to an input terminal 24 is fed to a switch 25. The switch selects the output of an interval power calculation unit 15 when no received voice packet is present, and the output of the estimated silent power calculation unit 20 when one is present, and passes the selected value to the calculation unit 20. The calculation unit 20 therefore operates as in a conventional system when no received packet is present, but when a packet is present the silent power is held at the value calculated in the preceding period, regardless of the interval power. As a result, even when a signal containing a completely silent period, such as the output signal of an echo canceller 5, is input, the calculation unit 20 can always obtain an appropriate background noise power, yielding a voice detector that does not misjudge background noise as speech.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application] The present invention relates to a voice detector that is used in a packet assembly/disassembly device for voice signals and determines whether a voice signal is present.

(Prior Art) With the recent diversification of media, demand is growing for multimedia communication that integrates different kinds of media information such as voice, data, and images. One approach to media integration is multimedia communication based on packet switching, and research on packetizing voice signals, one such kind of media information, is progressing accordingly.

FIG. 6 shows an example configuration of a voice packet assembly/disassembly device, taken from the Electronic Information and Communication Handbook (Ohmsha, March 1988). In the figure, (1) is a subscriber and (2) is the voice packet assembly/disassembly device. The device (2) comprises: a hybrid circuit (3) that receives the call signal from the subscriber (1); an A/D converter (4) that converts the analog signal from the hybrid circuit (3) into a PCM signal; an echo canceller (5) that removes the echo component contained in the signal from the A/D converter (4); an encoding unit (6) that receives the output of the echo canceller (5), applies high-efficiency coding to the voice signal, and outputs the result to a packet assembly unit (8); a voice detector (7) that detects whether the voice signal is present and outputs the detection signal to the packet assembly unit (8); and the packet assembly unit (8), which uses the detection signal from the voice detector (7) to packetize the coded voice signal from the encoding unit (6) only when speech is present, and outputs the packets to the packet network. It further comprises: a packet disassembly unit (10) that removes header information from received packets arriving from a packet switch (9); a silence insertion unit (11) that inserts silence during times when no received packet exists; a jitter absorption buffer (12) that absorbs variations in packet transmission delay in the packet communication network; a decoding unit (13) that decodes the coded voice signal into a PCM signal and outputs it to the echo canceller (5) and a D/A converter (14); and the D/A converter (14), which converts the PCM signal into an analog signal and outputs it to the subscriber (1) via the hybrid circuit (3). With this configuration, the voice packet assembly/disassembly device (2) makes efficient use of the line through high-efficiency voice coding and silence compression.

Silence compression requires a voice detector (7), and a detector such as the one shown in FIG. 7 has been proposed for this purpose. It is disclosed in Japanese Unexamined Patent Publication No. S60-117838. The detector receives a sampled and quantized input signal and outputs a decision indicating whether the input signal is speech or silence. The decision output changes once per minimum decidable time interval (hereinafter "interval"), for example about 6 ms.

That is, the illustrated voice detector (7) comprises: an interval power calculation unit (15) that sums the power of the input signal within an interval; a zero-crossing counting unit (16) that counts the zero crossings within the interval, a crossing being detected when the product of the current input and the previous input is negative, and outputs the count; a one-interval delay unit (17) that holds the power of the preceding interval; a determination unit (18) based on the ratio to the previous power, which sets its output to 1, taking this as a sign of speech, if the output of the interval power calculation unit (15) is about twice or more the output of the one-interval delay unit (17); a determination unit (19) based on the ratio of the absolute level to the interval power, which sets its output to 1 if the output of the interval power calculation unit (15) is at or above a level that can be judged to be definitely speech; an estimated silent power calculation unit (20) that updates the silent power value every interval from the output of the interval power calculation unit (15) while the speech/silence determination unit (23) described later is in the silence determination state or the hangover state, increasing the silent power at about the slope of a normal speech power level when the interval power is larger than the silent power, and replacing the silent power with the interval power when it is smaller; a determination unit (21) based on the ratio of the silent power to the interval power, which sets its output to 1 if the output of the interval power calculation unit (15) is about three or more times the output of the estimated silent power calculation unit (20); a determination unit (22) based on zero crossings, which sets its output to 1 if the output of the zero-crossing counting unit (16) is a count regarded as indicating unvoiced sound; and a speech/silence determination unit (23) that outputs the speech or silence decision based on the outputs of the determination unit (18) based on the ratio to the previous power, the determination unit (21) based on the ratio of the silent power to the interval power, the determination unit (19) based on the ratio of the absolute level to the interval power, and the determination unit (22) based on zero crossings.
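The per-interval quantities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the cited publication: the function names, the fixed increment `rise_step`, and the `absolute_level` parameter are hypothetical, and the ratios 2 and 3 are the approximate values given in the text. How unit (23) combines the flags is not specified here, so it is left out.

```python
def interval_power(samples):
    """Unit (15): sum of squared samples over one interval."""
    return sum(s * s for s in samples)


def zero_crossings(samples, prev=0):
    """Unit (16): count samples whose product with the previous sample is negative."""
    count = 0
    for s in samples:
        if s * prev < 0:
            count += 1
        prev = s
    return count


def update_silent_power(silent_power, power, rise_step=1.0):
    """Unit (20): rise slowly when the interval is louder, otherwise track it.

    rise_step is a hypothetical fixed increment; the text only says the
    estimate rises at about the slope of a normal speech power level.
    """
    if power > silent_power:
        return silent_power + rise_step
    return power


def at_least(power, ratio, reference):
    """Ratio test written as a multiplication so a zero reference is safe."""
    return int(power > 0 and power >= ratio * reference)


def power_flags(power, prev_power, silent_power, absolute_level):
    """Units (18), (19), (21): power-based flags, 1 meaning a sign of speech."""
    return {
        "vs_previous": at_least(power, 2, prev_power),   # unit (18), ratio about 2
        "vs_absolute": int(power >= absolute_level),     # unit (19), assumed level
        "vs_silent": at_least(power, 3, silent_power),   # unit (21), ratio about 3
    }
```

For example, `power_flags(90.0, 40.0, 20.0, 1000.0)` raises the flags for units (18) and (21) but not for unit (19).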

Next, consider the operation when the conventional voice detector (7) described above is used in the voice packet assembly/disassembly device (2) shown in FIG. 6. In FIG. 6, the input signal of the voice detector (7) is the output of the echo canceller (5). The echo canceller (5) serves to cancel the echo signal produced when the received voice signal, output from the D/A converter (14) after passing through the packet disassembly unit (10), the silence insertion unit (11), the jitter absorption buffer (12), and the decoding unit (13), leaks back through the hybrid circuit (3).

FIG. 8 illustrates the operation of the echo canceller (5): FIGS. 8(b) and 8(c) show the output signals obtained when the signal in FIG. 8(a) is the near-end input to the echo canceller. An echo canceller (5) normally includes an NLP (Non-Linear Processing) function that completely suppresses residual echo it could not cancel. FIG. 8(b) is the output signal when the NLP function is disabled, and FIG. 8(c) is the output signal when it is enabled. Taking FIG. 8(a) as the near-end input signal of the echo canceller (5) in FIG. 6, the low-level signal in ranges (I) and (III) is background noise from the subscriber (1), and the high-level signal in range (II) is that background noise superimposed with the echo signal produced when the received voice signal leaks back through the hybrid circuit (3). Speech from the subscriber (1) is not included, since it is not needed for the explanation that follows; the subscriber (1) is assumed to be silent throughout ranges (I), (II), and (III).

When such a signal is input to the echo canceller (5) with the NLP function disabled, the echo signal in range (II) is suppressed and its level is reduced, as shown in FIG. 8(b). With the NLP function enabled, as shown in FIG. 8(c), the output in range (II) is completely silent, because the residual echo that could not be cancelled is completely suppressed.

However, suppose the output signal of the echo canceller (5) shown in FIG. 8(c) is applied to the conventional voice detector (7) of FIG. 7. If range (II) is sufficiently long compared with the interval, which is the detector's unit of operating time, then because range (II) is completely silent, the interval power calculation unit (15) calculates zero, and the estimated silent power calculation unit (20) consequently also calculates zero as the silent power. When range (III) then begins and background noise starts entering the voice detector (7), the interval power calculation unit (15) calculates and outputs the power of the background noise. As a result, the ratio between the output of the estimated silent power calculation unit (20) and the output of the interval power calculation unit (15) becomes large, the determination unit (21) based on the ratio of the silent power to the interval power sets its output to 1, and the input is misjudged as speech even though it is only background noise.
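The failure can be reproduced with a toy simulation of ranges (I), (II), and (III). The sketch below is a simplification under several assumptions: only the ratio test of unit (21) is used as the speech decision, hangover is ignored, the power values and the increment `rise_step` are made up, and a zero-power interval is never treated as speech.

```python
def update_silent_power(silent_power, power, rise_step=1.0):
    """Unit (20): rise slowly when the interval is louder, otherwise track it."""
    if power > silent_power:
        return silent_power + rise_step  # hypothetical slow rise
    return power


def simulate_conventional(powers, ratio=3.0):
    """Return the per-interval speech decisions and the final silent power."""
    silent_power = powers[0]
    decisions = []
    for power in powers:
        # Unit (21): speech if the interval power is about 3x the silent power.
        speech = power > 0 and power >= ratio * silent_power
        decisions.append(speech)
        # The silent power is only updated while the detector reports silence.
        if not speech:
            silent_power = update_silent_power(silent_power, power)
    return decisions, silent_power


# Range (I): noise, range (II): complete silence after NLP, range (III): noise.
powers = [100.0] * 3 + [0.0] * 3 + [100.0] * 3
decisions, final_silent_power = simulate_conventional(powers)
```

In this run the silent power collapses to zero during range (II), so every interval of range (III) is flagged as speech although it contains the same noise as range (I).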

(Problem to be Solved by the Invention) As described above, when the conventional voice detector (7) is given an input signal containing a completely silent interval, such as the output signal of the echo canceller (5), the estimated silent power calculation unit (20) calculates zero during the completely silent interval. Because this value differs from the true background noise power, the signal that follows the completely silent interval is misjudged as speech even though it is background noise.

The present invention was made to solve this problem. Its object is to obtain a voice detector that does not judge background noise to be speech even when given an input signal containing a completely silent interval, such as the output signal of an echo canceller.

[Means to solve the problem]

The voice detector according to the present invention is used as the voice presence detection means in a packet assembly/disassembly device. The device converts the input voice signal received from a subscriber via a hybrid circuit into digital form, removes the echo component contained in that input signal with an echo canceller, and, based on a voice presence detection signal, packetizes the voice signal and outputs it to a packet network only when speech is present. The device also removes header information from received packets arriving from the packet network by means of a packet disassembly unit, inserts silence during times when no received packet exists, absorbs variations in packet transmission delay, converts the result to analog form, and outputs the voice signal to the subscriber via the hybrid circuit. The voice detector comprises first power calculation means for calculating the power of the input signal from the echo canceller; second power calculation means for calculating the background noise power during silence based on the output of the first power calculation means; and determination means for determining whether voice is present by comparing the power calculated by the first power calculation means with the power calculated by the second power calculation means. It further comprises control means, provided between the first and second power calculation means, that receives received-packet presence information from the packet disassembly unit and, during times when a received packet is present, causes the second power calculation means to output the same value as the silent background noise power it calculated previously.

[Effect]

In the present invention, a received-packet presence signal is input from the packet disassembly unit, and the control means makes the result of the estimated silent power calculation unit, during intervals in which a received packet is present, identical to the result held in the preceding interval. In range (II) of FIG. 8(c), where the output signal of the echo canceller is completely zero, an echo signal produced by the received voice signal leaking back through the hybrid circuit exists, so the received-packet presence signal output by the packet disassembly unit indicates that a packet is present. Therefore, by making the result of the estimated silent power calculation unit in intervals with a received packet identical to the result held in the preceding interval, the estimated silent power calculation unit can retain an appropriate background noise power even in intervals where the output signal of the echo canceller is zero.

[Example]

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows a voice packet assembly/disassembly device used to explain the invention. In the figure, (1) to (6) and (8) to (14) are the same as those shown in FIG. 6, but the voice detector (7) is configured to receive the received-packet presence signal from the packet disassembly unit (10).

FIG. 2 is a block diagram of a voice detector according to an embodiment of the present invention; (15) to (23) are the same as in the conventional example shown in FIG. 7. (24) is an input terminal that receives the received-packet presence signal from the packet disassembly unit (10). (25) is a switch whose contact is changed according to whether a received packet is present; it constitutes the control means that, during times when a received packet is present, causes the second power calculation means to output the same value as the silent background noise power it calculated previously.

FIG. 1 is an example in which the embodiment of the present invention shown in FIG. 2 is applied in a voice packet assembly/disassembly device. In the figure, (1) to (6) and (8) to (14) are the same as those shown in FIG. 6, but the voice detector (7) is configured to receive the received-packet presence signal from the packet disassembly unit (10).

Next, the operation of the embodiment shown in FIG. 2 is described. Units (15) to (19) and (21) to (23) operate as in the conventional example, so their description is omitted. The received-packet presence signal from the input terminal (24) is applied to the switch (25). When no received voice packet is present, the switch (25) selects the output of the interval power calculation unit (15); when one is present, it selects the output of the estimated silent power calculation unit (20). The selected value is passed to the estimated silent power calculation unit (20). Consequently, the estimated silent power calculation unit (20) operates as in the conventional example when no received packet is present, but when a packet is present the silent power calculated in the preceding interval is held, regardless of the interval power.
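The effect of switch (25) can be illustrated by feeding the estimator either the interval power or its own previous output. The following Python sketch rests on the same simplifications as any toy model of this detector: only the ratio test of unit (21) decides speech, hangover is ignored, and the names `estimator_step`, `switch_25`, `run` and the increment `rise_step` are hypothetical.

```python
def estimator_step(silent_power, value_in, rise_step=1.0):
    """Unit (20): rise slowly if the input is larger, otherwise take the input."""
    if value_in > silent_power:
        return silent_power + rise_step  # hypothetical slow rise
    return value_in


def switch_25(interval_power, silent_power, packet_present):
    """Switch (25): pass the held silent power while a received packet is present."""
    return silent_power if packet_present else interval_power


def run(powers, packet_flags, ratio=3.0):
    """Return the per-interval speech decisions and the final silent power."""
    silent_power = powers[0]
    decisions = []
    for power, present in zip(powers, packet_flags):
        speech = power > 0 and power >= ratio * silent_power  # unit (21)
        decisions.append(speech)
        if not speech:
            selected = switch_25(power, silent_power, present)
            # When a packet is present the estimator sees its own output,
            # so the silent power is left unchanged.
            silent_power = estimator_step(silent_power, selected)
    return decisions, silent_power


# Noise, then complete silence while packets are being received, then noise.
powers = [100.0] * 3 + [0.0] * 3 + [100.0] * 3
packets = [False] * 3 + [True] * 3 + [False] * 3

with_switch = run(powers, packets)
without_switch = run(powers, [False] * len(powers))
```

With the packet flags applied, the silent power stays at the noise level and no interval is flagged; with the flags forced to "absent" the sketch behaves like the conventional detector and flags the returning noise as speech.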

Therefore, even when a signal containing a completely silent interval, such as the output signal of the echo canceller (5), is input, the estimated silent power calculation unit (20) can always obtain an appropriate background noise power, and a voice detector that does not misjudge background noise as speech is obtained.

The present invention is effective not only for a voice detector having means (21) that determines the presence of voice from the ratio between the output of the estimated silent power calculation unit (20) and the interval power, as in the conventional example of FIG. 7, but also for a voice detector having means that generates a threshold based on the output of the estimated silent power calculation unit (20) and determines the presence of voice by comparing this threshold with the interval power.

For example, consider a voice detector that, in place of the determination unit (21) of FIG. 7 based on the ratio of the silent power to the interval power, has a determination unit configured as shown in FIG. 3. In the figure, (211) is a threshold generation unit that generates a threshold based on the output of the estimated silent power calculation unit, and (212) is a determination unit that compares the threshold with the interval power; the interval power calculation unit (15) and the estimated silent power calculation unit (20), which are related to their operation, are also shown. The threshold generation unit (211) outputs as the threshold a level that depends on the output of the estimated silent power calculation unit (20), for example a level higher than it by a fixed amount. The determination unit (212) sets its output to 1 when the interval power is greater than the threshold output by the threshold generation unit (211). When the echo canceller output signal of FIG. 8(c) is applied as the input, the output of the estimated silent power calculation unit becomes zero in range (II), as in the conventional example of FIG. 7, so the output of the threshold generation unit (211) becomes small. When background noise then begins to be input again in range (III), the interval power exceeds the output of the threshold generation unit (211), and the determination unit (212) outputs 1. Background noise is therefore misjudged as speech, just as with the conventional voice detector of FIG. 7.
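The threshold variant of FIG. 3 reduces to two small functions. The sketch below assumes a linear power scale and an arbitrary fixed `margin`; the text only states that the threshold is a fixed amount above the estimated silent power, so both choices are illustrative.

```python
def threshold_211(silent_power, margin=50.0):
    """Unit (211): threshold a fixed amount above the estimated silent power."""
    return silent_power + margin  # margin is a hypothetical value


def decide_212(interval_power, threshold):
    """Unit (212): output 1 when the interval power exceeds the threshold."""
    return int(interval_power > threshold)


noise = 100.0

# Silent power still equal to the noise level: the noise stays below the threshold.
healthy = decide_212(noise, threshold_211(100.0))

# Silent power collapsed to zero after a completely silent range:
# the same noise now exceeds the lowered threshold.
collapsed = decide_212(noise, threshold_211(0.0))
```

The comparison itself is unchanged between the two cases; only the silent power fed to unit (211) differs, which is why holding that value during received packets also protects this variant.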

FIG. 4 shows an embodiment in which the present invention is applied to solve this problem. In the figure, the switch (25) operates in the same way as the switch (25) of the embodiment shown in FIG. 2. The estimated silent power calculation unit (20) can therefore always calculate an appropriate background noise power, and background noise is no longer misjudged as speech.

Although FIG. 1 shows an example in which the voice detector according to the present invention is applied to a voice packet assembly/disassembly device, the voice detector can also be applied, as shown in FIG. 5, to a voice packet assembly/disassembly device that does not use a high-efficiency encoder and decoder.

(Effects of the Invention) As described above, according to the present invention, the received-packet presence signal is input from the packet disassembly unit and, in intervals where a received packet is present, the result of the estimated silent power calculation unit is made identical to the result held in the preceding interval. Consequently, even when a signal containing a completely silent interval, such as an echo canceller output signal, is input, the estimated silent power calculation unit can always obtain an appropriate background noise power, yielding a voice detector that does not misjudge background noise as speech.

[Brief explanation of the drawing]

FIG. 1 is a block diagram of the voice detector according to the present invention applied to a voice packet assembly/disassembly device. FIG. 2 is a block diagram of a voice detector according to an embodiment of the present invention. FIG. 3 is a block diagram of a determination unit, and its related parts, that decides speech or silence by comparing the interval power with a threshold generated from the output of the estimated silent power calculation unit. FIG. 4 is a block diagram of a second embodiment in which the present invention is applied to the determination unit shown in FIG. 3. FIG. 5 is a block diagram of an example in which the voice detector according to the present invention is applied to a voice packet assembly/disassembly device that does not use a high-efficiency voice encoder and decoder. FIG. 6 is a block diagram of a voice packet assembly/disassembly device. FIG. 7 is a block diagram of a conventional voice detector. FIG. 8 is an explanatory diagram of example input and output signals of an echo canceller.

In the figures, (1) is a subscriber, (2) is a voice packet assembly/disassembly device, (3) is a hybrid circuit, (4) is an A/D converter, (5) is an echo canceller, (7) is a voice detector, (8) is a packet assembly unit, (9) is a packet switch, (10) is a packet disassembly unit, (11) is a silence insertion unit, (12) is a jitter absorption buffer, (14) is a D/A converter, (15) is an interval power calculation unit, (20) is an estimated silent power calculation unit, (21) is a determination unit based on the ratio of the silent power to the interval power, (24) is a received-packet presence signal input terminal, (25) is a switch, (211) is a threshold generation unit, and (212) is a determination unit that compares the threshold with the interval power.

In the figures, identical reference numerals denote identical or corresponding parts.

Claims

A voice detector used as voice presence detection means in a packet assembly/disassembly device that converts an input voice signal received from a subscriber via a hybrid circuit into digital form, removes an echo component contained in the input signal with an echo canceller, and, based on a voice presence detection signal, packetizes the voice signal and outputs it to a packet network only when speech is present, and that also removes header information from received packets arriving from the packet network by means of a packet disassembly unit, inserts silence during times when no received packet exists, absorbs variations in packet transmission delay, converts the result to analog form, and outputs the voice signal to the subscriber via the hybrid circuit, the voice detector comprising: first power calculation means for calculating the power of the input signal from the echo canceller; second power calculation means for calculating the background noise power during silence based on the output of the first power calculation means; and determination means for determining whether voice is present by comparing the power calculated by the first power calculation means with the power calculated by the second power calculation means; characterized in that the voice detector further comprises control means, provided between the first and second power calculation means, that receives received-packet presence information from the packet disassembly unit and, during times when a received packet is present, controls the second power calculation means to output the same value as the silent background noise power it calculated previously.