JPS63237100A

JPS63237100A - Voice detector

Info

Publication number: JPS63237100A
Application number: JP62070188A
Authority: JP
Inventors: 孝夫鈴木; 白木　裕一; 庄司　保夫
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1987-03-26
Filing date: 1987-03-26
Publication date: 1988-10-03

Abstract

(57) [Abstract] This publication contains application data filed before the introduction of electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] (Field of Industrial Application) This invention relates to a voice detector, and in particular to a voice detector well suited to digital speech interpolation systems and voice packet systems in the field of digital communications.

(Description of the Prior Art) Voice detection technology of this kind is disclosed in, for example, Document I: "Proceedings of the 1976 (Showa 51) General National Conference of the Institute of Electronics and Communication Engineers of Japan, J, 1753 (A voice detection method using zero-crossing frequency), 荒開 卓 and 落合 相離".

Before the voice detector of this invention is described, this conventionally known voice detector is explained first.

FIG. 2 is a block diagram of the voice detector disclosed in Document I, which performs voice detection on every sample. The input sample signal received at input terminal 21 of this voice detector is fed to an amplitude detection section 22 for signals of relatively large amplitude, such as vowels. Meanwhile, the input sample signal has its offset removed by a direct current (DC) suppression circuit 23; an adder 24 then adds an arbitrary constant a, and only the sign bit of the result is extracted and fed to a zero-crossing detection section 25 for fricative consonants.

In the amplitude detection section 22, a comparator circuit 26 first makes a counter 27 count up or down according to whether or not the absolute value of the input sample signal exceeds a constant (e). A threshold circuit 28 compares the count with a fixed threshold THv, and when the count exceeds THv an output αv of logical value "1" is generated.

Meanwhile, in the zero-crossing detection section 25, the incoming sign bit is sent to a delay circuit 29 and an OR circuit 30, and the two circuits 29 and 30 combine it with the sign bit of the preceding sample. Depending on whether or not the result indicates a mismatch, a counter 31 counts down or up, thereby counting the number of times the input crosses (-a). A threshold circuit 32 compares the count with a predetermined threshold THz, and when the count exceeds THz an output αz of logical value "1" is generated.
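The sign-bit comparison described above can be sketched as follows. This is only an illustration, not the circuit of Document I: the function name `count_level_crossings` is hypothetical, the up/down counter 31 and the threshold THz are left out, and the mismatch test is written as an exclusive OR of adjacent sign bits, which is an assumption about how the disagreement is detected.

```python
def count_level_crossings(samples, a):
    """Count how often the input crosses the level -a.

    Adding the constant a and keeping only the sign bit turns a crossing
    of -a into a change of sign bit between adjacent samples.
    """
    # Sign bit: 1 for negative values, 0 otherwise.
    sign_bits = [1 if (x + a) < 0 else 0 for x in samples]
    crossings = 0
    for prev, cur in zip(sign_bits, sign_bits[1:]):
        # Assumed mismatch test: XOR of the current bit and the
        # one-sample-delayed bit.
        crossings += prev ^ cur
    return crossings
```

With `a = 0` this reduces to an ordinary zero-crossing count; a non-zero `a` shifts the reference level so that small fluctuations around zero are not counted.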

The logical sum α of the two outputs αv and αz is then output from an OR circuit 33 and supplied to a hangover control circuit 34, which appends a hangover time, and the voice detector output αout is delivered from output terminal 35. The period in which αout is "1" is speech, and the period in which it is "0" is silence.

The hangover control used in this conventional voice detection technique works as follows: even when the signal α produced by the speech/silence decision stage preceding the hangover control section 34 changes from "1" to "0", the detector output αout is held at "1" for a fixed time. The purpose of holding the output at "1" in this way is to append the hangover time to the original duration of the "1" output and thereby prevent the loss of speech portions that the voice detector fails to detect.

Conventionally, this hangover time has been as long as about 150 to 200 ms.
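The hangover behaviour described above can be sketched as a small hold-off counter. This is a minimal illustration under stated assumptions: the function name `apply_hangover` and the parameter `hangover_frames` are hypothetical, and the hangover time is expressed as a number of decision intervals (samples or frames) rather than in milliseconds.

```python
def apply_hangover(decisions, hangover_frames):
    """Hold the output at 1 for `hangover_frames` decisions after each 1 -> 0 drop."""
    out = []
    remaining = 0  # decisions left in the current hangover period
    for d in decisions:
        if d == 1:
            remaining = hangover_frames  # re-arm on every speech decision
            out.append(1)
        elif remaining > 0:
            remaining -= 1
            out.append(1)  # still inside the hangover period
        else:
            out.append(0)
    return out
```

For example, `apply_hangover([0, 1, 0, 0, 0, 0], 2)` returns `[0, 1, 1, 1, 0, 0]`: the single speech decision is stretched by two intervals. The longer the hangover, the more of the channel is reported as active, which is the trade-off the following paragraphs discuss.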

Accordingly, if the detection sensitivity of the voice detector can be made high in the first place, this hangover time can be shortened, and voice detection can be performed accurately and in a short time.

In digital speech interpolation systems and voice packet systems in particular, a voice detector that can shorten the hangover time without degrading speech quality is desired.

In systems of this kind, as is well known from various experimental results that need not be cited here, a longer hangover time generally tends to lengthen the burst length, that is, the length of time for which the voice detector continues to judge that speech is present. In other words, the proportion of the total call time occupied by speech-present periods (called the activity of the voice detector) rises. Such an increase in activity is undesirable for the systems mentioned above, which use the transmission path efficiently by carrying other speech or data during silent portions. For this reason, a voice detector with low activity that keeps the degradation of speech quality to a minimum is desired. To this end as well, it is desirable to raise the sensitivity of the voice detector and, as a result, shorten the hangover time.

(Problems to Be Solved by the Invention) However, the zero-crossing method used in the conventional voice detection technique described above merely counts the number of times the input samples cross a fixed value during a sample period, so it does not give an accurate estimate of the spectral distribution of the input sample sequence. For example, even if line noise and a speech signal have the same zero-crossing count, their spectral distributions do not necessarily match; conversely, even when the spectral distributions differ in an audibly significant way, the zero-crossing counts may be identical.

For these reasons, conventional voice detection techniques have not produced technically satisfactory results in cutting out speech segments. In addition, conventional voice detection based on zero-crossing counts cannot shorten the hangover time substantially.

The object of this invention is to eliminate both the loss of speech portions that results from the limits of spectral analysis by zero-crossing count and the fact that zero-crossing counts do not allow the hangover time to be shortened substantially, and to provide a voice detector that discriminates well between line noise and speech and can cut out speech segments properly even with a short hangover time.

(Means for Solving the Problems) To achieve this object, the voice detector of this invention comprises: a linear prediction analysis section that performs, frame by frame, linear prediction analysis based on autocorrelation coefficients on the received input sample signal and outputs the average power within the frame, the autocorrelation coefficients, and the linear prediction coefficients; a linear prediction coefficient distance calculation section that calculates and outputs a linear prediction coefficient distance from the linear prediction coefficients, the autocorrelation coefficients, and a standard vector prepared in advance; a first comparison section that compares the average power with a line noise judgment threshold and a speech/silence judgment threshold and outputs a first comparison result; a second comparison section that compares the linear prediction coefficient distance with a white noise judgment threshold and outputs a second comparison result; and a judgment section that makes an overall speech/silence judgment based on the first and second comparison results.

(Operation) Because the configuration described above uses the linear prediction coefficient distance (referred to as the LPC distance), malfunctions caused by line noise can be avoided while speech signals are detected with high sensitivity. The hangover time can therefore be shortened without degrading speech quality.

(Embodiment) An embodiment of the voice detector of this invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the voice detector of this invention. Reference numeral 10 denotes the input terminal of the voice detector and 11 the output terminal. The input sample signal x(i) received at input terminal 10 (where i indicates the position of the sample within the LPC analysis frame described later) is supplied to a linear prediction analysis section (hereinafter simply the LPC analysis section) 12. The LPC analysis section 12 is configured to perform, frame by frame, linear prediction (LPC) analysis based on autocorrelation coefficients on the input sample signal and to output the average power within the frame, the autocorrelation coefficients, and the linear prediction coefficients.

The LPC analysis procedure is described with reference to the flow chart of FIG. 3. In the following description and in the drawings, processing steps are denoted by S.

First, after LPC analysis starts, the input sample signal x(i) is taken in (S1) and divided into LPC analysis frames, each a block of a predetermined frame length, for example several tens of samples (S2).

Next, for each frame, a total of (p+1) autocorrelation coefficients (R0, R1, R2, ..., Rp) are calculated (S3). The subscript denotes the order; for example, R3 is called the third-order autocorrelation coefficient. The set of autocorrelation coefficients is represented by the autocorrelation coefficient matrix R. The zeroth-order autocorrelation coefficient R0 is the average power of the input signal x(i) within each frame and is output separately for later processing.
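Steps S2 and S3 can be sketched as follows, using the definition of the r-th order coefficient as (1/N) times the sum of x(n)·x(n+r) over the frame. This is a minimal sketch under assumptions: the function names are hypothetical, frames are taken as consecutive and non-overlapping, no window is applied, and the summation runs only over sample pairs that lie inside the frame.

```python
def split_into_frames(samples, frame_length):
    """Cut the sample stream into consecutive, non-overlapping frames (S2)."""
    usable = len(samples) - len(samples) % frame_length
    return [samples[i:i + frame_length] for i in range(0, usable, frame_length)]


def autocorrelation(frame, p):
    """Return [R_0, ..., R_p] with R_r = (1/N) * sum_n x(n) * x(n + r) (S3)."""
    n_samples = len(frame)
    coeffs = []
    for r in range(p + 1):
        # Sum only over the pairs that lie inside the frame (assumed limits).
        total = sum(frame[n] * frame[n + r] for n in range(n_samples - r))
        coeffs.append(total / n_samples)
    return coeffs
```

The first element of the returned list is R0, the average power of the frame, which is the quantity passed on for the power comparison.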

Next, using these autocorrelation coefficients, a total of (p+1) linear prediction (LPC) coefficients (1, α1, α2, ..., αp) are calculated (S4). Here too, the set of LPC coefficients is represented by the LPC coefficient vector α.

Next, it is checked whether all (p+1) autocorrelation coefficients and LPC coefficients have been obtained (S5). If not, the processing from step S4 is repeated.
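The text does not say how the LPC coefficients are obtained from the autocorrelation coefficients. One standard choice is the Levinson-Durbin recursion, sketched below as an assumption rather than as the method of this publication. The function name is hypothetical, and the sign convention (prediction error e(n) = x(n) + a_1 x(n-1) + ... + a_p x(n-p), so that the leading coefficient is 1) is also assumed.

```python
def levinson_durbin(r, p):
    """Solve for (1, a_1, ..., a_p) minimising the prediction error energy.

    r is [R_0, ..., R_p]. Returns (a, err), where a[0] == 1 and err is the
    residual energy after order-p prediction.
    """
    a = [1.0] + [0.0] * p
    err = r[0]
    for m in range(1, p + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. all zeros); stop early
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient of order m
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= (1.0 - k_m * k_m)
    return a, err
```

The recursion builds the order-m solution from the order-(m-1) solution, so the coefficients become available one order at a time until all p have been computed.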

The autocorrelation coefficient matrix R and LPC coefficient vector α obtained in this way are output to the LPC distance calculation stage 14 of the following linear prediction coefficient distance calculation section (hereinafter simply the LPC distance calculation section) 13, and the average power R0 is output to the first comparison section 15 (S6).

The LPC distance calculation section 13 performs the processing that corresponds to zero-crossing counting. It calculates the LPC distance D from the supplied LPC coefficient vector α, the autocorrelation coefficient matrix R, and a standard LPC coefficient vector β stored in advance in a standard vector memory 16 such as a ROM, and outputs D to the second comparison section 17.

As can be understood from Document II ("音声のディジタル信号処理（下）" (Digital Processing of Speech Signals, Vol. 2), Corona Publishing, April 15, 1983, pp. 269-271), this LPC distance D is given by the following equation.

where

$$\boldsymbol{\alpha} = (1, \alpha_1, \alpha_2, \ldots, \alpha_p), \qquad \boldsymbol{\beta} = (1, \beta_1, \beta_2, \ldots, \beta_p),$$

$$R = (R_{ij}), \qquad R_{ij} = R_{|i-j|}, \qquad i, j = 0, \ldots, p,$$

$$R_r = \frac{1}{N} \sum_{n} x(n)\, x(n+r).$$

Here N is the number of samples in the analysis frame, R_i is the i-th order autocorrelation coefficient of the LPC analysis frame, α_i is the i-th order LPC coefficient, β_i is the i-th order standard LPC coefficient, and t denotes transposition.

As already noted, the standard LPC coefficients β_i are prepared in advance. In this embodiment, since line noise is generally considered to have properties close to those of white noise, line noise is regarded as white noise: the linear prediction coefficients of white noise are measured frame by frame, averaged, and the average is adopted as the standard LPC coefficients β1, ..., βp (that is, the standard LPC vector β), although any other suitable method may of course be used. Because β is only an average (an average vector), the linear prediction coefficients of an individual white-noise frame do not necessarily coincide with β; the LPC distance D of equation (1) is therefore computed as a measure of how closely the linear prediction coefficients of each frame resemble β.

The LPC distance D may be calculated, for example, by first computing the required terms of the numerator and the denominator, then performing the multiplications, and finally the division; other procedures may equally be used, and the procedure is immaterial.
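Equation (1) itself is not reproduced in this text, so the following is only a hedged sketch of one plausible form. It uses the likelihood-ratio style quotient D = (β R βᵗ) / (α R αᵗ), chosen because the text speaks of a numerator and a denominator built from α, β, and R; whether equation (1) has exactly this form (for example, whether a logarithm is taken) is an assumption. The function names are hypothetical.

```python
def quadratic_form(v, r):
    """Compute v R v^t for the Toeplitz matrix R_ij = R_|i-j|."""
    size = len(v)
    return sum(v[i] * r[abs(i - j)] * v[j]
               for i in range(size) for j in range(size))


def lpc_distance(alpha, beta, r):
    """Assumed ratio form: (beta R beta^t) / (alpha R alpha^t).

    alpha: LPC vector of the current frame, (1, a_1, ..., a_p).
    beta:  standard LPC vector, (1, b_1, ..., b_p).
    r:     [R_0, ..., R_p] of the current frame; the frame must not be
           all zeros, otherwise the denominator vanishes.
    """
    numerator = quadratic_form(beta, r)
    denominator = quadratic_form(alpha, r)
    return numerator / denominator
```

Under this assumed form, the distance equals 1 when the frame's own coefficients coincide with β and grows as the spectral envelope of the frame departs from that of the standard vector, because α minimises the quadratic form for its own frame.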

The first comparison section 15 performs the processing that corresponds to amplitude detection. It compares the average power R0 with a preset line noise judgment threshold Pth1 and a preset speech/silence judgment threshold Pth2 and outputs a first comparison result Vp. The average level of line noise is generally said to lie between -40 dBm0 and -50 dBm0. In this embodiment, the threshold Pth1 is set to a suitable value above -40 dBm0 so that the voice detector does not malfunction on line noise at -40 dBm0.

The speech/silence judgment threshold Pth2 is set to any suitable value such that most of the speech signal has power at or above it and signals below it can safely be judged silent. Preferably, Pth2 is set to about -50 dBm0.

Accordingly, the first comparison section 15 is configured to output, as the first comparison result Vp, the output signal that corresponds to the relation of the input average power R0 to the thresholds Pth1 and Pth2 set in this way, as given by the following equation (2).

Meanwhile, the second comparison section 17 compares the LPC distance D with a predetermined white noise judgment threshold Dth and outputs a second comparison result Vd. In this embodiment, it is preferable to compute the LPC distance D for each frame of white noise, examine how large the values become, find in advance the value A such that the probability of the LPC distance D exceeding A is 0.01%, and adopt that value as the white noise judgment threshold Dth. The figure of 0.01% corresponds to the risk rate commonly used in statistics, and the threshold Dth changes with the percentage chosen for this risk rate.
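The choice of Dth from measured white-noise frames can be sketched as an empirical quantile. This is a minimal illustration under assumptions: `threshold_for_risk` and `noise_distances` are hypothetical names, the distances are assumed to have been measured beforehand on white-noise frames, and the threshold is taken directly from the sorted measurements without interpolation.

```python
def threshold_for_risk(noise_distances, risk):
    """Return an observed value A such that the fraction of noise-frame
    distances strictly greater than A is at most `risk`."""
    ordered = sorted(noise_distances)
    n = len(ordered)
    # Number of frames permitted to exceed A; keep at least one value below.
    allowed = min(int(risk * n), n - 1)
    return ordered[n - allowed - 1]
```

A risk rate of 0.01% corresponds to `risk = 0.0001`. Resolving such a small rate empirically needs at least on the order of 10,000 measured frames; with fewer, the function simply returns the largest observed distance.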

In this embodiment, depending on how the LPC distance D compares in magnitude with the threshold Dth, the comparison result Vd, a logical value determined by the following equation (3), is output as the judgment result.

The first comparison result Vp and second comparison result Vd obtained as described above are both sent to the speech/silence judgment section 18, which makes an overall speech/silence judgment from them, generates a speech/silence judgment signal V, and outputs it to the following hangover adding section 19. This judgment is made according to the following equation (4).

In equation (4), a judgment signal V of logical value "1" means speech and "0" means silence.

Restated in terms of the average power R0 and the LPC distance D described above: when the average power R0 is greater than the threshold Pth1, the judgment signal is V = Vp and the frame is judged to be speech. When the average power R0 is at or below the threshold Pth1 and at or above Pth2, the judgment signal is V = Vd, so speech or silence is decided solely by the judgment based on the LPC distance D.

Furthermore, when the average power R0 is smaller than the threshold Pth2, the judgment signal is unconditionally V = 0, that is, silence.
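The three power ranges described above can be sketched as a single decision function. Equations (2) to (4) are not reproduced in this text, so the sketch follows the prose summary only. The function and parameter names are hypothetical, and the direction of the distance test (a distance above Dth means the frame is unlikely to be white noise and is therefore treated as speech) is an assumption inferred from how Dth is chosen.

```python
def judge_frame(r0, d, p_th1, p_th2, d_th):
    """Return 1 (speech) or 0 (silence) for one frame.

    r0: average power R0 of the frame; d: LPC distance D of the frame.
    r0, p_th1 and p_th2 must share one unit (for example, all in dBm0),
    with p_th1 > p_th2.
    """
    if r0 > p_th1:
        return 1  # stronger than any expected line noise: speech
    if r0 < p_th2:
        return 0  # too weak to matter: silence
    # Ambiguous power range: decide from the spectral shape alone.
    # Assumption: a distance above d_th means "not white noise".
    return 1 if d > d_th else 0
```

For example, with `p_th1 = -38`, `p_th2 = -50` and `d_th = 1.5` (illustrative values only), a frame at -45 dBm0 is judged by its distance alone, while frames at -30 dBm0 and -60 dBm0 are judged as speech and silence regardless of distance.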

As in the prior art, the hangover adding section 19 appends a hangover time to the judgment signal output in this way, and the output signal Vout is then delivered from output terminal 11. In the voice detection device of this invention, however, for the power-level range in which line noise and speech coexist, line noise and speech are distinguished by evaluating the difference in spectral envelope through the LPC distance of equation (1). This yields a voice detection device with a high ability to discriminate between line noise and speech signals, and hence high sensitivity. As a result, the hangover time can be shortened.

This invention is not limited to the embodiment described above, and many variations and modifications are clearly possible.

For example, the first and second comparison sections need only be configured so that their thresholds are stored in a suitable memory, read out when an input signal arrives, and compared in the respective section; their specific construction can readily be designed using conventionally known techniques. Likewise, the speech/silence judgment section may have any suitable construction chosen by design, provided it takes the sum of the two input signals Vp and Vd and compares the result with the 0 level.

(Effects of the Invention) As is clear from the above description, the voice detector of this invention performs LPC analysis and then uses the LPC distance to discriminate between line noise and speech signals. Malfunctions due to line noise are therefore reduced, and speech signals can be detected with higher sensitivity than before. This higher sensitivity allows the hangover time to be shortened to about 1/3 to 1/2 of the conventional value.

[Brief explanation of the drawing]

FIG. 1 is a block diagram for explaining the configuration of one embodiment of the voice detector of this invention; FIG. 2 is a block diagram for explaining a conventional voice detector; and FIG. 3 is a flow chart for explaining the operation of the voice detector of this invention.
10: input terminal; 11: output terminal; 12: LPC analysis section; 13: LPC distance calculation section; 14: LPC distance calculation stage; 15: first comparison section; 16: standard vector memory; 17: second comparison section; 18: speech/silence judgment section; 19: hangover adding section.
Patent applicant: Oki Electric Industry Co., Ltd.
FIG. 3: Flow chart of the operation of the LPC analysis section

Claims

[Claims]

(1) A voice detector comprising: a linear prediction analysis section that performs, for each predetermined frame, linear prediction analysis based on autocorrelation coefficients on a received input sample signal and outputs the average power within the frame, the autocorrelation coefficients, and the linear prediction coefficients; a linear prediction coefficient distance calculation section that calculates and outputs a linear prediction coefficient distance from the linear prediction coefficients, the autocorrelation coefficients, and a standard vector prepared in advance; a first comparison section that compares the average power with a line noise judgment threshold and a speech/silence judgment threshold and outputs a first comparison result; a second comparison section that compares the linear prediction coefficient distance with a white noise judgment threshold and outputs a second comparison result; and a judgment section that makes an overall speech/silence judgment based on the first and second comparison results.