JPH10301594A

JPH10301594A - Sound detecting device

Info

Publication number: JPH10301594A
Application number: JP9114055A
Authority: JP
Inventors: Hideaki Kurihara; 栗原秀明
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1997-05-01
Filing date: 1997-05-01
Publication date: 1998-11-13

Abstract

PROBLEM TO BE SOLVED: To detect a music component present in the background of a speech signal in a sound detection device that detects the presence or absence of speech. SOLUTION: A frequency conversion unit 31 converts input data 40 from the time domain to the frequency domain, a normalization unit 32 normalizes the output signal of the frequency conversion unit 31, and a peak detection unit 33 detects the peak frequency components of the output signal of the normalization unit 32. A sound candidate selection unit 34 then detects, from among these peak frequency components, a fundamental component that has harmonic components, and a sound determination unit 35 determines the input data 40 to be sound when this fundamental component persists for a predetermined time or longer.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field of the Invention] The present invention relates to a sound detection device, and more particularly to a sound detection device that detects the presence or absence of speech.

[0001] One important technique for improving the utilization efficiency of a digital line is, as shown in FIG. 6, to detect whether the PCM input data 40 contains speech. This requires a sound detection device 20 that outputs a sound determination signal 50. When sound is present (signal "1"), the speech encoding unit 10 encodes the PCM data 40 on the basis of the sound determination signal 50 and sends it to the line as encoded data 60. When there is silence (signal "0"), no encoding is performed.

[0002]

[Prior Art] FIG. 7 shows a conventional example of the sound detection device 20 described above. It comprises a power calculation unit 21, a linear prediction analysis unit 22, a frequency component extraction unit 23, a pitch period calculation unit 24, and a sound discrimination unit 25.

[0003] The PCM input data 40 is fed in common to the input terminals of the power calculation unit 21, the linear prediction analysis unit 22, the frequency component extraction unit 23, and the pitch period calculation unit 24. The output signals of these four units are input to the sound discrimination unit 25, which outputs the sound determination signal 50, a logic signal that is "1" when sound is detected and "0" when silence is detected.

[0004] The power calculation unit 21 calculates the power P from the samples f(n) of the PCM input data 40 over a fixed period (hereinafter called a frame) according to equation (1) below.

[0005]

(Equation 1) Here, M is the number of samples in the frame.

[0006] The power P is compared with a reference value TH_P, and if P > TH_P, a sound determination signal indicating that speech is present is output. The reference value TH_P is either fixed or adaptively variable.
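As an illustration only, the power test of the power calculation unit 21 might be sketched in Python as follows. Equation 1 is not reproduced in this text, so the mean-square form used here is an assumption, as are the names `frame_power` and `power_flag`.

```python
def frame_power(samples):
    """Mean-square power of one frame (assumed form of Equation 1)."""
    m = len(samples)  # M: number of samples in the frame
    return sum(s * s for s in samples) / m


def power_flag(samples, th_p):
    """Return 1 when P > TH_P (speech judged present), otherwise 0."""
    return 1 if frame_power(samples) > th_p else 0
```

A fixed `th_p` corresponds to the fixed reference value; an adaptive scheme would update `th_p` between frames, which this sketch does not attempt.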

[0007] The linear prediction analysis unit 22 obtains the envelope information of the speech by calculating linear prediction coefficients α(x) and reflection coefficients r(x) using the autocorrelation method or a similar method, and then calculates the prediction residual R from the reflection coefficients r(x) by the following equation.

[0008]

(Equation 2) Here, L is the order of the reflection coefficients.

[0009] The prediction residual R is compared with a reference value TH_R, and if R < TH_R, a sound determination signal is output. The reference value TH_R is either fixed or adaptively variable.
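The residual test can be sketched as below. Equation 2 is not reproduced in this text; the product form used here is the standard normalized prediction error for reflection coefficients of order L and is an assumption about what the equation contains. The function names are hypothetical.

```python
def prediction_residual(reflection):
    """Normalized prediction error: product of (1 - r(x)^2) over x = 1..L.

    Assumed form of Equation 2; `reflection` holds r(1)..r(L).
    """
    residual = 1.0
    for r in reflection:
        residual *= 1.0 - r * r
    return residual


def residual_flag(reflection, th_r):
    """Return 1 when R < TH_R (well-predicted, speech-like frame), else 0."""
    return 1 if prediction_residual(reflection) < th_r else 0
```

A small residual means the frame is well modelled by the linear predictor, which is why the comparison runs in the opposite direction to the power test.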

[0010] For reference, see "Digital Processing of Speech Signals" by L. R. Rabiner and R. W. Schafer.

[0011] The frequency component extraction unit 23 counts the number of zero crossings in the frame as frequency information of the speech, and outputs a sound determination signal when the crossing count F and the reference value TH_F satisfy F < TH_F. The reference value TH_F is likewise either fixed or adaptively variable.
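A minimal sketch of the zero-crossing test follows. Treating a sample equal to zero as non-negative is an assumption, and the function names are hypothetical.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples in one frame."""
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        # A crossing occurs when the sign differs (zero counted as non-negative).
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count


def zero_crossing_flag(samples, th_f):
    """Return 1 when F < TH_F, otherwise 0."""
    return 1 if zero_crossings(samples) < th_f else 0
```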

[0012] The pitch period calculation unit 24 calculates the pitch period as pitch information of the speech, and outputs a sound determination signal when the pitch period lies within a predetermined range.

[0013] The sound discrimination unit 25 determines whether sound is present from a combination of the four sound determination signals above. It outputs "1" when sound is present and "0" when there is silence to the speech encoding unit 10 (see FIG. 6) as the sound determination signal 50.

[0014] Only when sound is present does the speech encoding unit 10 perform speech encoding on the PCM input data 40 and output the result to the line as encoded data 60.

[0015] As a result, line utilization efficiency is improved and, particularly in portable terminals and similar devices, power consumption can be effectively reduced.

[0016]

[Problems to Be Solved by the Invention] Because such a conventional sound detection device uses parameters that focus only on speech, it can effectively detect the presence or absence of speech when only speech is input.

[0017] With the conventional configuration above, however, a low-level music component in the background of the speech is transmitted together with the speech during conversation, when the speech is detected as sound, but its transmission is interrupted during silent periods, when no speech is detected. The music is therefore heard only during conversation, which sounds unnatural to the other party and greatly degrades subjective quality.

[0018] An object of the present invention is therefore to detect a music component in the background of a speech signal in a sound detection device that detects the presence or absence of speech.

[0019]

[Means for Solving the Problems] To solve the above problem, the sound detection device according to the present invention is characterized by comprising: a frequency conversion unit that converts input data from the time domain to the frequency domain; a normalization unit that normalizes the output signal of the frequency conversion unit; a peak detection unit that detects the peak frequency components of the output signal of the normalization unit; a sound candidate selection unit that detects, from among the peak frequency components, a fundamental component that has corresponding harmonic components; and a sound determination unit that determines the input data to be sound when the fundamental component persists for a predetermined time or longer.

[0020] That is, music, and in particular the sound of stringed and percussion instruments, is known to contain a fundamental component and its harmonic components, and to persist for at least a certain length of time.

[0021] In the present invention, therefore, the frequency conversion unit first applies a discrete Fourier transform (or fast Fourier transform) to the input data, which is time-axis data, to convert it into frequency-axis data. The normalization unit normalizes this data to simplify subsequent processing, and the peak detection unit detects the peak frequency components from the normalized data.

[0022] From the detected peak frequency components, the sound candidate selection unit selects pairs of a fundamental component and its harmonic components as frequency candidates corresponding to a music component, that is, as candidates for sound determination. The sound determination unit 35 determines that sound is present when a selected candidate persists for a predetermined time or longer.

[0023] As a result, the music component of an instrument can be extracted from the input data and judged to be sound.

[0024] In the present invention, the peak detection unit may also detect the peak frequency components after weighting high-frequency components more heavily.

[0025] That is, by multiplying the input data of the peak detection unit by a coefficient greater than 1 that grows as the frequency rises, the high-frequency components, whose input level is low, are emphasized, enabling more stable peak detection.

[0026] In the present invention, the peak detection unit may also detect only those peak frequency components that are at or above a threshold.

[0027] That is, by detecting only peak frequency components at or above a certain level, peak detection becomes stable against noise.

[0028] In the present invention, the predetermined time may also be set to a period that is indicative of a music component.

[0029] That is, exploiting the property that the frequency components of music persist for, for example, 100 ms to 200 ms or longer, a music component may be judged present when a frequency component selected as a sound determination candidate persists for 100 ms or longer.

[0030] Furthermore, in the present invention, the output signal of the sound determination unit may be supplied to a speech encoding unit to set whether speech encoding is performed.

[0031] That is, when the sound determination unit determines that a music component is present, the speech encoding unit can encode the input data and output it. As a result, the background music component during gaps in the speech is transmitted to the call line without interruption, improving call quality.

[0032]

[Embodiments of the Invention] FIG. 1 shows a configuration example of the sound detection device according to the present invention. It differs from FIG. 7 in that a frequency conversion unit 31, a normalization unit 32, a peak detection unit 33, a sound candidate selection unit 34, a sound determination unit 35, and an OR circuit 36 are added.

[0033] The frequency conversion unit 31, which receives the PCM input data 40, is connected in cascade, in order, to the normalization unit 32, the peak detection unit 33, the sound candidate selection unit 34, and the sound determination unit 35.

[0034] The sound determination signal 51, output by the sound discrimination unit 25, and the sound determination signal 52, output by the sound determination unit 35, are input to the OR circuit 36, which outputs the sound determination signal 50.

[0035] Here, the sound determination signal 51 is a logic signal indicating whether the input data 40 contains speech. As described later, the sound determination signal 52 is a logic signal indicating whether the input data 40 contains a music component.

[0036] The sound determination signal 50 is the logical OR of the sound determination signal 51 and the sound determination signal 52, and serves as the control signal that sets whether the speech encoding unit 10 (see FIG. 6) performs its encoding operation.
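The role of the OR circuit 36 can be illustrated per frame as follows. This is only a sketch; the function name and the example flag sequences are hypothetical.

```python
def combined_flags(flags_51, flags_52):
    """Per-frame logical OR of the speech flag (51) and the music flag (52)."""
    return [a | b for a, b in zip(flags_51, flags_52)]


# Hypothetical flags for four frames: speech only, silence, music only, both.
speech = [1, 0, 0, 1]
music = [0, 0, 1, 1]
encode = combined_flags(speech, music)  # frames with 1 would be encoded
```

With music detection added, the third frame is encoded even though no speech is detected in it, which is the behaviour the embodiment is after.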

[0037] Representing the input data 40 as f(x), a function of time x, the frequency conversion unit 31 applies to f(x) the discrete Fourier transform given by the following equation to obtain the transformed data F(n).

[0038]

(Equation 3) Here, M is the number of samples in the frame.

[0039] For example, when M is a power of 2, the processing speed can be greatly increased by using the fast Fourier transform.
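The relation between the direct transform and the fast transform can be sketched as below. Equation 3 is not reproduced in this text, so the sign and scaling convention (negative exponent, no normalization) is an assumption; `dft` and `fft` are illustrative names, and `fft` requires the frame length to be a power of 2.

```python
import cmath


def dft(f):
    """Direct discrete Fourier transform, O(M^2) operations."""
    m = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * cmath.pi * n * x / m) for x in range(m))
        for n in range(m)
    ]


def fft(f):
    """Recursive radix-2 FFT, O(M log M); len(f) must be a power of 2."""
    m = len(f)
    if m == 1:
        return [complex(f[0])]
    even = fft(f[0::2])  # transform of even-indexed samples
    odd = fft(f[1::2])   # transform of odd-indexed samples
    out = [0j] * m
    for n in range(m // 2):
        t = cmath.exp(-2j * cmath.pi * n / m) * odd[n]  # twiddle factor
        out[n] = even[n] + t
        out[n + m // 2] = even[n] - t
    return out
```

Both functions return the same values for a power-of-2 frame; only the operation count differs.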

[0040] The Fourier-transformed data F(n) is then processed according to the following equation to obtain the sample data F'(n).

[0041]

(Equation 4)

[0042] The normalization unit 32 detects the maximum value Fmax of the sample data F'(n), as expressed by the following equation.

[0043]

(Equation 5) $F_{\max} = \max_{n} F'(n)$

[0044] The sample data F''(n), normalized so that its maximum value is 1, is then calculated by the following equation.

[0045]

(Equation 6) $F''(n) = F'(n) / F_{\max}$
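The normalization step can be sketched as follows, assuming F'(n) is a list of non-negative values (for example magnitudes). The guard for an all-zero frame is an addition for robustness and is not described in the text; `normalize` is an illustrative name.

```python
def normalize(spectrum):
    """Scale F'(n) so that its largest value becomes 1, giving F''(n)."""
    f_max = max(spectrum)  # Fmax
    if f_max == 0:
        # All-zero frame: nothing to normalize (assumption, not in the text).
        return [0.0 for _ in spectrum]
    return [v / f_max for v in spectrum]
```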

[0046] FIG. 2 shows an example graph of F''(n) as normalized by the normalization unit 32. The vertical axis is the normalized sample value and the horizontal axis is frequency.

[0047] In FIG. 2, the effective frequency range is set to 0 to M (for example, 4 kHz), and the frequency Lm, half of the frequency M, is marked. The sample data F''(n) peaks at particular frequency components. Because the data is normalized by the frequency component with the largest of these peaks, that component has the value 1.

[0048] The peak detection unit 33 detects the peak frequency components of F''(n) by the following processing.

(1) Starting from the lower limit of the effective frequency range, it finds the sample frequency position MIN (not shown) at which a sample first becomes larger than the preceding sample.

[0049] (2) From sample MIN onward, it compares sample values one by one in the direction of increasing frequency; when a sample is smaller than the preceding sample, it stores the preceding sample number as a peak position (peak position 1, as shown in the figure).

(3) This processing continues up to sample M. The resulting peak positions 1 to 9 are the output of the peak detection unit 33.
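Steps (1) to (3) can be read as a search for local maxima. The sketch below is one such reading, not necessarily the exact procedure of the embodiment: a peak is recorded each time a rising run of samples is followed by a fall, and any initial falling run before the first rise (before MIN) is ignored. The name `detect_peaks` is hypothetical.

```python
def detect_peaks(spectrum):
    """Return the indices of local maxima in F''(n), in ascending order."""
    peaks = []
    rising = False  # becomes True once a sample exceeds its predecessor
    for n in range(1, len(spectrum)):
        if spectrum[n] > spectrum[n - 1]:
            rising = True            # step (1) the first time, a new rise later
        elif spectrum[n] < spectrum[n - 1] and rising:
            peaks.append(n - 1)      # step (2): previous sample was a peak
            rising = False
    return peaks                     # step (3): scan ran to the last sample


# Hypothetical normalized spectrum with two peaks, at indices 4 and 7.
example_peaks = detect_peaks([0.5, 0.4, 0.2, 0.6, 1.0, 0.3, 0.1, 0.4, 0.2])
```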

[0050] Peak detection may be made easier by restricting each sample value to values at or above a threshold TH_MIN (see FIG. 2). In addition, as shown in FIG. 3, peak detection may be performed after emphasizing the high-frequency components of the sample data F''(n), to make harmonic components easier to detect.

[0051] FIG. 3 shows an example graph of the high-frequency emphasis coefficient used to emphasize the high-frequency components of the samples. The vertical axis is the gain (weight) and the horizontal axis is frequency. The coefficient has the value 1 at frequency 0 and increases as the frequency rises.
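The two optional pre-processing steps might look like this. The text states only that the gain is 1 at frequency 0 and grows with frequency, so the linear ramp and its `slope` parameter are assumptions; reading the threshold as a floor that flattens everything below TH_MIN is also an assumption. Both function names are hypothetical.

```python
def emphasize_high(spectrum, slope):
    """Multiply each bin by a gain that is 1 at bin 0 and rises linearly.

    The gain reaches 1 + slope at the last bin (assumed shape of FIG. 3).
    """
    last = max(len(spectrum) - 1, 1)
    return [v * (1.0 + slope * n / last) for n, v in enumerate(spectrum)]


def apply_floor(spectrum, th_min):
    """Raise every value below TH_MIN to TH_MIN so small ripples vanish."""
    return [v if v >= th_min else th_min for v in spectrum]
```

After `apply_floor`, bins below the threshold form a flat run, so a local-maximum search finds no peaks among them.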

[0052] FIG. 4 shows the procedure by which the sound candidate selection unit 34 selects fundamental component candidates. It is described step by step below.

[0053] First, the counter i, which indicates the peak position, and the counter j, which indicates the number of fundamental component candidates, are cleared (step S10). Note that the value of counter i corresponding to peak position 1 is "0".

[0054] Let Lm be half of the maximum detectable frequency M (see FIG. 2). Of the outputs of the peak detection unit 33, the peak frequency components at or below the frequency Lm are taken as fundamental component candidates (S11).

[0055] The reason is that the detectable frequency in the telephone band is about 4 kHz, so the peak position of a fundamental component candidate that is accompanied by harmonic components can only exist up to 2 kHz, half of that frequency.

[0056] The peak frequency component y designated by peak position i is obtained (S12). This y is compared with the frequency Lm. If y < Lm (the "<0" branch of S13), y is written into the array entry basic candidate (j) (S14) and the array entry basic candidate flag (j) is set to "1" (S15). If y ≥ Lm, processing proceeds to step S19.

[0057] Counters i and j are each incremented by 1 (S16, S17). Then i is compared with the number of peaks Cpeak: if i < Cpeak, processing returns to step S12 (the "<" branch of S18); otherwise j is written into the basic candidate count Lst (S19) and processing ends.
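The FIG. 4 procedure can be condensed into a short loop. This sketch assumes the peaks arrive in ascending frequency order, so the first peak at or above Lm ends the scan, as the jump from S13 to S19 implies. The function name and the peak values in the example are hypothetical.

```python
def select_candidates(peak_freqs, lm):
    """Collect peaks below Lm as fundamental component candidates.

    Returns (basic candidates, basic candidate flags, Lst).
    """
    candidates, flags = [], []
    for y in peak_freqs:        # S12: next peak frequency component
        if y >= lm:             # S13: not below Lm, stop scanning
            break
        candidates.append(y)    # S14: basic candidate (j) = y
        flags.append(1)         # S15: basic candidate flag (j) = 1
    return candidates, flags, len(candidates)  # S19: Lst = j


# Hypothetical peaks in Hz with M = 4000 and Lm = 2000.
cands, cand_flags, lst = select_candidates(
    [200, 300, 400, 500, 600, 2400, 2700, 3000, 3600], 2000
)
```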

[0058] As a result, the fundamental component candidates are written into the array basic candidate (j), the corresponding basic candidate flag (j) is set to "1", and the number of fundamental component candidates (the peaks in FIG. 2 that lie below Lm) is set in the basic candidate count register Lst.

[0059] FIG. 5 shows the procedure by which the sound candidate selection unit 34 detects fundamental components that have harmonic components. It is described step by step below.

[0060] First, counters i, j, and k are cleared (step S20 in FIG. 5). The basic candidate count Lst is loaded into register Lsr (S21).

[0061] The fundamental component stored in basic candidate (i), initially the first entry basic candidate (0), is read into variable y (S22). The flag indicating whether basic candidate (0) is still a fundamental component candidate is read from basic candidate flag (0) into variable yy (S23).

[0062] It is then determined whether the flag yy is "0" (S24). If yy = 0, processing proceeds to step S40, where i is incremented by 1; if i < Lsr (S41), processing returns to step S22 and moves on to determining whether the next entry, basic candidate (1), has harmonic components. The same determination is then made for every basic candidate (i), up to the number of fundamental component candidates.

[0063] If, on the other hand, yy ≠ 0 in step S24, registers f₀ and f₁ are both initialized to the fundamental component y plus an offset d, and the counter m, which counts the order of the harmonic being searched for, is initialized to 0 (S25).

[0064] The offset d is the offset value introduced when converting from the time domain to the frequency domain; adding d to y gives the actual frequency.

[0065] The value of register f₀ is then added to register f₁, so that f₁ holds the second harmonic of the fundamental component y (S26). f₁ is compared with the maximum frequency M plus the offset d; if f₁ ≥ (M + d) (S27), f₁ is judged to be beyond the frequency range being processed, and processing proceeds to step S40.

[0066] If f₁ < (M + d) in step S27, the value of f₁ minus the offset d is assigned to register x (S28), and the counter n, which designates the basic candidates after the basic candidate (i) under examination, is initialized to i + 1 (S29).

[0067] The basic candidate designated by counter n, at this point basic candidate (1), the entry following basic candidate (0), is compared with the contents of register x (S30). If they are not equal and n ≤ Lst (S31), n is incremented by 1 (S32) and processing returns to step S30 to compare the next entry, basic candidate (2), with x.

[0068] If the contents of register x equal basic candidate (1), the fundamental component y is assigned to array entry z(0) (S34). Because basic candidate (1) is then a harmonic of basic candidate (0), basic candidate flag (1) is reset to "0" to remove it from the fundamental component candidates (S35), and processing proceeds to step S36.

[0069] If n > Lst in step S31, "0" is written to array entry z(0) to indicate that no harmonic component was found (S33), and processing proceeds to step S36.

[0070] In step S36, the counter m is incremented by 1. It is then compared with Csearch, the preset upper limit on the order of harmonics to be examined (S37); if m < Csearch, processing returns to step S26 to search for the next higher harmonic.

[0071] If m ≥ Csearch in step S37, it is determined whether z(k) is "0" (S38). If z(k) ≠ 0, k is incremented by 1 (S39). If z(k) = 0, z(k) is judged to contain no fundamental component candidate and k is left unchanged. Processing then proceeds to step S40, where i is incremented by 1.

[0072] Then i is compared with Lsr: if i < Lsr, processing returns to step S22; otherwise the procedure ends (S41).

[0073] With the above procedure, multiple fundamental components that have harmonic components can be detected. The music components of several instruments can therefore be detected, giving better music detection performance.

[0074] As a result, the fundamental components that have harmonic components are stored in the array z(k).
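The harmonic search of FIG. 5 can be sketched as below. This is a simplified reading rather than a line-by-line transcription of the flowchart: a candidate is kept as a fundamental if at least one of its harmonics matches a later candidate exactly, and the bookkeeping of z(k) and k is replaced by a list. The function name, the default `c_search`, and the example values are hypothetical.

```python
def find_fundamentals(candidates, flags, m_max, d=0, c_search=5):
    """Return (z, flags): fundamentals with harmonics, and the updated flags."""
    flags = list(flags)                 # work on a copy of the candidate flags
    z = []
    for i, y in enumerate(candidates):
        if flags[i] == 0:               # S24: already found to be a harmonic
            continue
        f0 = y + d                      # S25: actual frequency of candidate
        f1 = f0
        found = False
        for _ in range(c_search):       # S36/S37: limit on harmonic orders
            f1 += f0                    # S26: next harmonic (2*f0, 3*f0, ...)
            if f1 >= m_max + d:         # S27: beyond the processed range
                break
            x = f1 - d                  # S28: back to the stored representation
            for n in range(i + 1, len(candidates)):  # S29 to S32
                if candidates[n] == x:  # S30: a later candidate is a harmonic
                    flags[n] = 0        # S35: remove it from the candidates
                    found = True
                    break
        if found:
            z.append(y)                 # S34, S38, S39
    return z, flags


# Hypothetical candidates in Hz: 400 and 600 are harmonics of 200,
# 600 is also a harmonic of 300, and 500 has no harmonic in the list.
z_list, new_flags = find_fundamentals([200, 300, 400, 500, 600], [1] * 5, 4000)
```

Exact equality between a computed harmonic and a stored candidate follows the comparison in step S30; a practical detector would probably allow a tolerance, which the text does not describe.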

[0075] The sound candidate selection unit 34 then searches the group of basic candidates detected and retained in the previous frame for the same fundamental component as each basic candidate stored in the array z(k). If one is found, the occurrence count of that candidate is incremented. After all candidates have been processed, the maximum occurrence count is found.

[0076] The consecutive occurrence count Cmax (not shown) of the candidates that have harmonic components is then output to the sound determination unit 35.

[0077] Exploiting the fact that harmonics in music and similar sounds last at least 100 msec, the sound determination unit 35 compares the consecutive occurrence count with a reference value Cset, the number of frames equivalent to 100 msec or more. If Cmax ≥ Cset, it judges sound to be present and outputs the sound determination signal 52 as "1"; otherwise it judges silence and outputs "0".
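The persistence test can be sketched as follows. The text does not give a frame length, so the 20 ms used in the example is an assumption, as are the function names and the rule that a fundamental must match exactly from one frame to the next.

```python
import math


def cset_for(frame_ms, hold_ms=100.0):
    """Number of frames that covers at least hold_ms (reference value Cset)."""
    return math.ceil(hold_ms / frame_ms)


def music_flags(fundamentals_per_frame, c_set):
    """Per-frame signal 52: 1 when some fundamental has persisted Cset frames."""
    counts = {}  # fundamental -> number of consecutive frames it has appeared
    out = []
    for found in fundamentals_per_frame:
        # A fundamental missing from this frame drops out and restarts at 1.
        counts = {f: counts.get(f, 0) + 1 for f in set(found)}
        c_max = max(counts.values(), default=0)  # Cmax for this frame
        out.append(1 if c_max >= c_set else 0)
    return out


# Hypothetical run of five frames with Cset = 3.
flags_52 = music_flags([[200], [200, 300], [200], [300], [300]], 3)
```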

[0078] If the speech encoding unit 10 (see FIG. 6) is operated to output the encoded data 60 whenever either this sound determination signal 52 or the sound determination signal 51 (see FIG. 1) is "1", the background music is conveyed to the other party without interruption even when the speech pauses.

[0079] The procedure of FIG. 5 is explained below with reference to FIG. 2. Before the processing of FIG. 5 begins, peak positions 1 to 9 have already been detected, and the first five peak frequency components have been written into the array entries basic candidate (0) to (4).

[0080] The basic candidate flags (0) to (4), corresponding to basic candidates (0) to (4), are set to "1" to indicate fundamental component candidates, and the basic candidate flags (5) to (9) are set to "0" to indicate non-candidates. The register Lst records that there are five fundamental components whose frequency is at or below Lm.

[0081] In step S24, reached via steps S20 to S23 of FIG. 5, the basic candidate flag (0) corresponding to the first frequency component is found not to be "0", and processing proceeds to steps S25 and S26. In step S26 the second harmonic of this frequency component is calculated.

[0082] Then, by repeating the loop LP (not shown) consisting of steps S30, S31, S32, S30, S34 to S37, and S26 to S30, the frequency components that are the second, third, fourth, and sixth harmonics of the first frequency component are detected.

[0083] In step S34 of the loop LP, the first frequency component is written into array entry z(0) four times. In step S35, the basic candidate flags (2) and (4), which correspond to two of the detected harmonic frequency components, are reset to "0", removing those components from the fundamental component candidates.

[0084] The loop LP is exited at step S27 or step S37, the counter i is incremented by 1 in step S40, and, via the route RT (not shown) through steps S41, S22, S23, and S24, the check of whether the second frequency component, basic candidate (1), has harmonic components begins.

[0085] The basic candidate flag (1) corresponding to this frequency component is "1". Therefore, as with the first frequency component, the frequency components that are its second, third, and fourth harmonics are detected, and the basic candidate flag (4) is reset (it was already reset during the check of the first frequency component). The second frequency component is then written to z(1) as a fundamental component candidate.

[0086] After the process exits the loop LP, the counter i is incremented by 1 in step S40, and the check of the third frequency component begins via the route RT. In step S24, because the basic candidate flag (2) corresponding to this frequency component has already been reset to "0", it is judged not to be a basic candidate.

[0087] The process then proceeds directly from step S24 to step S40, where the counter i is incremented by 1, and the check of the fourth frequency component begins via the route RT. Because this frequency component has no harmonic components, it is judged not to be a fundamental component candidate.

[0088] The fifth frequency component, like the third, is judged not to be a basic candidate because its corresponding basic candidate flag (4) has been reset to "0".

[0089] Then, when the number of checks to be performed, Lsr, is exceeded in step S41, the process ends.

[0090] As a result, the first and second frequency components are registered in the arrays z(0) and z(1), respectively, as fundamental component candidates.
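As a rough illustration of the walk-through above, the following Python sketch reproduces its outcome for a hypothetical set of peak frequencies. The function name, the peak values, the 1% matching tolerance, and the maximum harmonic order of 6 are assumptions chosen for illustration and are not taken from the specification. The sketch registers each candidate once rather than once per detected harmonic, and it does not follow the individual steps of FIG. 5.

```python
def select_fundamental_candidates(peaks, num_checked, max_order=6, tol=0.01):
    """Return peak frequencies that have at least one harmonic among the peaks.

    peaks       -- peak frequencies in ascending order (hypothetical input)
    num_checked -- number of leading peaks to check (plays the role of Lsr)
    max_order   -- highest harmonic order searched (assumed value)
    tol         -- relative tolerance when matching a harmonic (assumed value)
    """
    limit = min(num_checked, len(peaks))
    flags = [1] * limit          # basic candidate flags, all set initially
    z = []                       # registered fundamental component candidates
    for i in range(limit):
        if flags[i] == 0:
            continue             # already identified as a harmonic of an earlier peak
        found = False
        for order in range(2, max_order + 1):
            target = peaks[i] * order
            for j in range(i + 1, len(peaks)):
                if abs(peaks[j] - target) <= tol * target:
                    found = True
                    if j < limit:
                        flags[j] = 0   # a harmonic cannot itself be a fundamental
        if found:
            z.append(peaks[i])
    return z


# Hypothetical peaks: the 1st peak has 2nd/3rd/4th/6th harmonics present,
# the 2nd peak has 2nd/3rd/4th harmonics present, the 4th peak has none.
example_peaks = [100, 150, 200, 250, 300, 400, 450, 600]
print(select_fundamental_candidates(example_peaks, num_checked=5))
```

With these assumed values, the first and second peaks are registered as candidates, the third and fifth are skipped because their flags were cleared, and the fourth is rejected for lack of harmonics, which matches the sequence described in paragraphs [0082] to [0090].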

[0091]

[Effects of the Invention] As described above, in the sound detection device according to the present invention, the frequency conversion unit converts the input data from the time domain to the frequency domain, the normalization unit normalizes the output signal of the frequency conversion unit, the peak detection unit detects the peak frequency components of the output signal of the normalization unit, the sound candidate selection unit detects, from among the peak frequency components, a fundamental component having harmonic components, and the sound determination unit determines the input data to be sound when the fundamental component continues for a predetermined time or longer. This configuration makes it possible to detect a music component present in the background of a speech signal.
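The first three stages summarized above (frequency conversion, normalization, and peak detection) can be sketched for a single frame as follows. This is a minimal illustration only: the naive DFT, the normalization by the largest magnitude, the threshold of 0.2, and all function names are assumptions made here and are not specified in this section. The frequency weighting used by the peak detection unit is omitted.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Naive DFT magnitude spectrum for bins 0 .. N/2 - 1 (assumed transform)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def normalize(mags):
    """Scale magnitudes so the largest becomes 1.0 (assumed normalization)."""
    peak = max(mags)
    return [m / peak for m in mags] if peak > 0 else list(mags)


def detect_peaks(norm, sample_rate, frame_len, threshold=0.2):
    """Return frequencies (Hz) of local maxima at or above the threshold."""
    peaks = []
    for k in range(1, len(norm) - 1):
        if norm[k] >= threshold and norm[k] > norm[k - 1] and norm[k] >= norm[k + 1]:
            peaks.append(k * sample_rate / frame_len)
    return peaks


def frame_peaks(frame, sample_rate, threshold=0.2):
    """Frequency conversion -> normalization -> peak detection for one frame."""
    norm = normalize(dft_magnitudes(frame))
    return detect_peaks(norm, sample_rate, len(frame), threshold)


# Hypothetical test tone: 500 Hz plus a weaker 1000 Hz harmonic at 8 kHz sampling.
tone = [
    math.sin(2 * math.pi * 500 * t / 8000) + 0.5 * math.sin(2 * math.pi * 1000 * t / 8000)
    for t in range(64)
]
print(frame_peaks(tone, 8000))
```

The peak frequencies returned for each frame would then be passed to the sound candidate selection unit, which looks for a fundamental component with harmonics among them.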

[0092] Further, because the output signal of the sound determination unit can be supplied to the speech coding unit to set whether speech coding is performed, background music reaches the other party without interruption even when the speech breaks off. This enables calls that sound natural and contributes greatly to improving the quality of the transmitted call.
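The relationship between the sound determination unit and the speech coding unit can be illustrated with the sketch below. It assumes that "continues for a predetermined time" is measured as a count of consecutive frames in which a fundamental candidate stays near the same frequency. The class name, the 2% tolerance, the frame count, and the `encode` callback are hypothetical and are not taken from the specification, and only one fundamental is tracked for simplicity.

```python
class SoundDecider:
    """Outputs 1 once a fundamental candidate has persisted for min_frames frames."""

    def __init__(self, min_frames, tol=0.02):
        self.min_frames = min_frames  # assumed stand-in for the predetermined time
        self.tol = tol                # assumed relative frequency tolerance
        self.last = None              # fundamental frequency tracked so far
        self.run = 0                  # consecutive frames it has persisted

    def update(self, candidates):
        """Feed one frame's fundamental candidates; return the decision flag."""
        match = None
        if self.last is not None:
            for f in candidates:
                if abs(f - self.last) <= self.tol * self.last:
                    match = f
                    break
        if match is not None:
            self.run += 1
            self.last = match
        elif candidates:
            self.last = candidates[0]  # start tracking a new fundamental
            self.run = 1
        else:
            self.last = None
            self.run = 0
        return 1 if self.run >= self.min_frames else 0


def gate_frames(frames, flags, encode):
    """Encode a frame only when its decision flag is 1; otherwise emit None."""
    return [encode(frame) if flag == 1 else None for frame, flag in zip(frames, flags)]


decider = SoundDecider(min_frames=3)
per_frame_candidates = [[440], [441], [440], [], [440]]
flags = [decider.update(c) for c in per_frame_candidates]
print(flags)
print(gate_frames(["a", "b", "c", "d", "e"], flags, encode=str.upper))
```

In this sketch a frame is passed to the encoder only after the tracked fundamental has persisted for three frames, and the flag drops back to 0 as soon as a frame contains no candidate.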

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the sound detection device according to the present invention.

FIG. 2 is a graph for explaining the operation of the peak detection unit in the sound detection device according to the present invention.

FIG. 3 is a graph showing an example of the weighting function used by the peak detection unit in the sound detection device according to the present invention.

FIG. 4 is a flowchart showing the procedure for selecting fundamental component candidates in the sound candidate selection unit of the sound detection device according to the present invention.

FIG. 5 is a flowchart showing the procedure for harmonic detection in the sound candidate selection unit of the sound detection device according to the present invention.

FIG. 6 is a block diagram showing a configuration example of a speech coding device that uses a general sound detection device.

FIG. 7 is a block diagram showing the configuration of a general sound (speech) detection device.

[Explanation of symbols]

10: speech coding unit
20: sound detection device
21: power calculation unit
22: linear prediction analysis unit
23: frequency component extraction unit
24: pitch period calculation unit
25, 35: sound determination unit
31: frequency conversion unit
32: normalization unit
33: peak detection unit
34: sound candidate selection unit
40: PCM input data
50, 51, 52: sound determination signal
60: encoded data

In the drawings, the same reference numerals denote the same or corresponding parts.

Claims

[Claims]

1. A sound detection device comprising: a frequency conversion unit that converts input data from the time domain to the frequency domain; a normalization unit that normalizes an output signal of the frequency conversion unit; a peak detection unit that detects peak frequency components of an output signal of the normalization unit; a sound candidate selection unit that detects, from among the peak frequency components, a fundamental component having harmonic components; and a sound determination unit that determines the input data to be sound when the fundamental component continues for a predetermined time or longer.

2. The sound detection device according to claim 1, wherein the peak detection unit detects the peak frequency components by weighting the high-frequency components more heavily.

3. The sound detection device according to claim 1, wherein the peak detection unit detects only the peak frequency components that are equal to or greater than a threshold value.

4. The sound detection device according to claim 1, wherein the predetermined time is a period indicating a music component.

5. The sound detection device according to claim 1, wherein an output signal of the sound determination unit is supplied to a speech coding unit to set whether speech coding is performed.