JP2006323134A

JP2006323134A - Signal extractor

Info

Publication number: JP2006323134A
Application number: JP2005146342A
Authority: JP
Inventors: Mariko Aoki; 真理子青木; Kenichi Furuya; 賢一古家; Akitoshi Kataoka; 章俊片岡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-05-19
Filing date: 2005-05-19
Publication date: 2006-11-30
Anticipated expiration: 2025-05-19
Also published as: JP4612468B2

Abstract

PROBLEM TO BE SOLVED: To provide a signal extraction device that reduces discrimination errors between a target signal and miscellaneous signals and extracts the target signal at a high S/N ratio. SOLUTION: The signal extraction device comprises: one or more signal input means that receive signals from a target signal source and a miscellaneous signal source; a signal feature quantity calculation means that calculates a signal feature quantity of the signal received by the signal input means; a weight value determination means that determines, based on the calculated signal feature quantity, a weight value α by which the received signal is to be multiplied; and a weight value multiplication means that multiplies the received signal by the determined weight value. As the signal feature quantity, the variance of the cepstrum with respect to the time axis, the variance of the autocorrelation function with respect to the time axis, and the sharpness are used as needed. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a signal extraction device that suppresses miscellaneous signals and extracts a target signal with a high S/N ratio in an environment in which signals are emitted from a nearby target signal source and a distant miscellaneous signal source.

In an environment where the target signal and the miscellaneous signal are emitted without overlapping in time (for example, when the signals are sounds, an environment in which speakers take turns, or one containing non-stationary noise that occasionally sounds abruptly), a noise gate, which performs threshold processing on signal power, has conventionally been proposed as a method for suppressing a distant miscellaneous signal and emphasizing a nearby target signal (see, for example, Non-Patent Document 1).
Internet <http://www.jiten.com/dicmi/docs/k25/20745s.htm> [retrieved May 16, 2005]

However, when power is the only information used, a distant miscellaneous signal with large power can be confused with a low-power target signal emitted from a nearby position. As a result, unnecessary miscellaneous signals are output, or the required target signal is excessively suppressed.

Accordingly, an object of the present invention is to provide a signal extraction device that, compared with the conventional method described above, reduces discrimination errors between the target signal and miscellaneous signals and extracts the target signal with a high S/N ratio.

When the target signal source is closer to the signal input means (for example, a microphone when the signal is sound) than the miscellaneous signal source, a large amount of reflection and reverberation is superimposed on the distant miscellaneous signal, whereas little miscellaneous signal is superimposed on the nearby target signal. By calculating a physical quantity (signal feature quantity) that can detect this difference, the device judges whether a signal is emitted from a distant position or a nearby position and extracts only the nearby target signal.

It is known that the degree of reflection and reverberation in a signal can be observed from the magnitude of the fluctuation of the high-order components of the signal's cepstrum (for example, at a quefrency of 50 ms or more). It is likewise known that it can be observed from the fluctuation of the signal's autocorrelation function.
(Reference 1): Alan V. Oppenheim and Ronald W. Schafer, translated by Gen Date, "Digital Signal Processing", first edition, volume 2, Corona Publishing Co., Ltd., June 20, 1986

It is also known that the degree of reflection and reverberation in a signal can be observed from the sharpness of the signal.
(Reference 2): Bradford W. Gillespie, Henrique S. Malvar and Dinei A. F. Florencio, "Speech Dereverberation Via Maximum-Kurtosis Subband Adaptive Filtering", International Conference on Acoustics, Speech and Signal Processing, 2001

Furthermore, the variance of the high-order cepstrum with respect to the time axis is small for a nearby signal and large for a distant signal. The variance of the autocorrelation function with respect to the time axis is likewise small for a nearby signal and large for a distant signal. The sharpness, in contrast, is large for a nearby signal and small for a distant signal.

FIG. 1 shows an example of the high-order cepstrum variance for sound signals, when a target sound signal from a position about 50 cm from the microphone and a noise signal from a position about 3 m from the microphone are each observed for about 30 seconds. The horizontal axis represents the frame number of the frames used for the short-time Fourier transform in the cepstrum calculation, and the vertical axis represents the high-order cepstrum variance. The solid line is the nearby target sound signal (near sound, about 50 cm from the microphone), and the dotted line is the distant noise signal (far sound, about 3 m from the microphone).
FIG. 2 shows an example of the variance of the autocorrelation function under the same conditions as FIG. 1. The horizontal axis represents the frame number of the frames used to compute the autocorrelation function, and the vertical axis represents the variance of the autocorrelation function.
FIG. 3 shows an example of the sharpness under the same conditions as FIG. 1. The horizontal axis represents the frame number of the frames used to compute the sharpness, and the vertical axis represents the sharpness.

Therefore, in the present invention, the high-order cepstrum variance, the variance of the autocorrelation function, or the sharpness of the received signal received by the signal input means is calculated, and the weight value α used when outputting the signal is calculated according to that value.

When the high-order cepstrum variance of the received signal is used as the signal feature quantity, the device exploits the fact that the cepstrum variance of the signal from the target signal source is smaller than that of the signal from the miscellaneous signal source. A section in which the cepstrum variance is smaller than a predetermined threshold is judged to be a target signal section, and the weight value α by which the received signal is multiplied is set to a predetermined value. A section in which the cepstrum variance is larger than the threshold is judged to be a miscellaneous signal section, and the weight value α is likewise set to a predetermined value. The weight value α for each judged section is set to a value suitable for separating the signal from the target signal source from the signal from the miscellaneous signal source. For example, with 0 ≤ α ≤ 1.0, α may be set to 1.0 or a value close to 1.0 for sections judged to be the target signal, and to 0 or a value close to 0 for sections judged to be a miscellaneous signal.
The same processing is performed when the variance of the autocorrelation function of the received signal is used as the signal feature quantity.

When the sharpness of the received signal is used as the signal feature quantity, the device exploits the property that the sharpness of a signal from a nearby target signal source is large and that of a signal from a distant miscellaneous signal source is small. A section in which the sharpness is at or above a certain threshold is judged to be a target signal section, and the weight value α by which the received signal is multiplied is set to a predetermined value. When the sharpness is at or below the threshold, the section is judged to be a miscellaneous signal component, and the weight value α is set to a predetermined value. The weight value α for each judged section is set to a value suitable for separating the signal from the target signal source from the signal from the miscellaneous signal source. For example, with 0 ≤ α ≤ 1.0, α may be set to 1.0 or a value close to 1.0 for sections judged to be the target signal, and to 0 or a value close to 0 for sections judged to be a miscellaneous signal.

The weight value multiplication means multiplies each band of the received signal x(t) (t: sampling time) by the determined weight value α. That is, the received signal in a section judged to be the target signal is multiplied by the predetermined weight value α (in the example above, 1.0 or a value close to 1.0), and the received signal in a section judged to be a miscellaneous signal is multiplied by the predetermined weight value α (in the example above, 0 or a value close to 0). The received signal weighted in this way is output as the output signal.

The signal feature quantities (the high-order cepstrum variance, the variance of the autocorrelation function, and the sharpness) may each be used alone or in combination. When computational resources allow, combining several of them can be expected to improve discrimination accuracy. These signal feature quantities can also be combined with power.

Furthermore, the plural signal feature quantities described above may be calculated in advance from known target and miscellaneous signals, and a regression equation based on multiple regression analysis obtained from those values. The weight value can then be determined from the objective variable obtained by applying the signal feature quantities of the received signal to the regression equation.

According to the signal extraction device of the present invention, discrimination errors between the target signal and miscellaneous signals are reduced compared with the conventional method, which discriminates using power alone, and the target signal can be extracted with a high S/N ratio.

FIG. 4 shows a functional block diagram of the signal extraction device (A) according to the first and second embodiments. FIG. 5 shows a flowchart of the signal extraction processing in the signal extraction device (A) according to the first and second embodiments. In these embodiments, the signal is described as an acoustic signal such as speech or musical sound. The acoustic signal input unit (1), which is the signal input means, is, for example, a microphone. Let s(t) be the acoustic signal (target sound signal) of the target sound source, which is the target signal source, and n(t) the acoustic signal (noise signal) of the noise source, which is the miscellaneous signal source. To simplify the description, a single noise source is assumed here, but in general there may be plural noise sources.

<First Embodiment>
First, a first embodiment of the signal extraction device of the present invention will be described.

In the acoustic feature quantity calculation unit (2), which is the signal feature quantity calculation means, the acoustic feature quantity τ, which is the signal feature quantity of the received signal x(t) received by the acoustic signal input unit (1), is calculated (S100). The acoustic feature quantity τ is, for example, one of the high-order cepstrum variance τ₁, the variance τ₂ of the autocorrelation function, and the sharpness τ₃ of the received signal.

In the first embodiment, one of these acoustic feature quantities is used alone, although an improvement in accuracy can be expected from using several in combination. That case is described in the second embodiment.

The definitions of the cepstrum, the autocorrelation function, and the sharpness are given below.

《Cepstrum》
The cepstrum of the received signal x(t) is defined by equation (1).

Here, fft(·) is the Fourier transform of the input ·, abs(·) is its absolute value, log(·) is its common logarithm, ifft(·) is its inverse Fourier transform, and real(·) is its real part. When the acoustic feature quantity τ calculated by the acoustic feature quantity calculation unit (2) is the high-order cepstrum variance τ₁ of the received signal, it is the variance of the high-order components of the cepstrum defined by equation (1).
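Equation (1) itself is not reproduced in this text. Going by the definitions above, it presumably nests the listed operations as real(ifft(log(abs(fft(x))))). The following Python sketch computes that quantity and the variance of its high-order part. Several things in it are assumptions rather than details from the patent: a naive DFT stands in for the FFT, all function names are hypothetical, the `floor` guard against log(0) is added for safety, and the variance is taken over the quefrency bins at or above 50 ms within a single frame.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-12):
    """real(ifft(log(abs(fft(x))))) using the common (base-10) logarithm."""
    spectrum = dft(frame)
    # floor avoids log10(0); this guard is an assumption, not from the text
    log_mag = [math.log10(max(abs(c), floor)) for c in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]


def high_order_cepstrum_variance(frame, sample_rate, min_quefrency_s=0.05):
    """Variance of cepstral bins at quefrency >= min_quefrency_s (tau_1)."""
    ceps = real_cepstrum(frame)
    start = int(min_quefrency_s * sample_rate)
    # The cepstrum of a real frame is symmetric, so only the first half is used.
    high = ceps[start:len(ceps) // 2]
    if not high:
        raise ValueError("frame too short for the requested quefrency range")
    mean = sum(high) / len(high)
    return sum((c - mean) ** 2 for c in high) / len(high)
```

With a 1 kHz sample rate and a 256-sample frame, for instance, the high-order part covers cepstral bins 50 to 127. A real implementation would use a proper FFT and longer frames.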

《Autocorrelation function》
The autocorrelation function of the received signal x(t) is defined by equation (2).

Here, N is the length of the signal over which the correlation is calculated, and m denotes a correlation shifted by m samples. When the acoustic feature quantity τ calculated by the acoustic feature quantity calculation unit (2) is the variance τ₂ of the autocorrelation function, it is the variance of the autocorrelation function defined by equation (2).
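Equation (2) is likewise not reproduced in this text. A common form consistent with the symbols N and m is R(m) = (1/N) Σ x(t)·x(t+m), summed over t = 0 … N−1−m. The Python sketch below uses that form. The 1/N normalisation, the range of lags over which the variance is taken, and the function names are assumptions for illustration only.

```python
def autocorrelation(x, max_lag):
    """R(m) = (1/N) * sum_{t=0}^{N-1-m} x(t) * x(t+m) for m = 0..max_lag.

    The 1/N normalisation is an assumption; the patent's equation (2)
    is not available in this text.
    """
    n = len(x)
    return [sum(x[t] * x[t + m] for t in range(n - m)) / n
            for m in range(max_lag + 1)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def autocorrelation_variance(frame, max_lag):
    """Variance of the autocorrelation values of one frame (tau_2)."""
    return variance(autocorrelation(frame, max_lag))
```

Computing this once per frame yields one τ₂ value per frame, which matches the per-frame plot described for FIG. 2.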

《Sharpness》
Let y(t) be the linear prediction residual signal of the received signal x(t). The sharpness of the signal y(t) is defined by equation (3) below.

Here, E denotes the expected value of the signal. When the acoustic feature quantity τ calculated by the acoustic feature quantity calculation unit (2) is the sharpness τ₃, it is the sharpness defined by equation (3).
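Equation (3) is not reproduced in this text either. Since Reference 2 concerns kurtosis, the sketch below assumes that the sharpness is the fourth-moment ratio E[y⁴]/(E[y²])² of the linear prediction residual. Some definitions of kurtosis subtract a constant 3 from this ratio, and whether the patent does so cannot be recovered from this text. The Levinson-Durbin recursion, the prediction order, and all function names are illustrative choices, not details from the patent.

```python
def lpc_coefficients(x, order):
    """Predictor coefficients a_1..a_order via the Levinson-Durbin recursion."""
    n = len(x)
    r = [sum(x[t] * x[t + m] for t in range(n - m)) for m in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input: keep the remaining coefficients at zero
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, order):
    """y(t) = x(t) - sum_j a_j * x(t - j), for t >= order."""
    a = lpc_coefficients(x, order)
    return [x[t] - sum(a[j] * x[t - 1 - j] for j in range(order))
            for t in range(order, len(x))]


def sharpness(y):
    """E[y^4] / (E[y^2])^2 with sample means; undefined for an all-zero input."""
    m2 = sum(v ** 2 for v in y) / len(y)
    m4 = sum(v ** 4 for v in y) / len(y)
    return m4 / (m2 * m2)
```

A residual with a few large peaks gives a large ratio, while a residual smeared out by reverberation gives a smaller one, which is the behaviour the text relies on.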

In the weight value determination unit (3), which is the weight value determination means, the weight value α by which each band of the received signal x(t) is multiplied is determined based on the acoustic feature quantity τ calculated by the acoustic feature quantity calculation unit (2) (S101). For example, when the high-order cepstrum variance is used as the acoustic feature quantity, the unit exploits the fact that the cepstrum variance of the acoustic signal from the target sound source is smaller than that of the acoustic signal from the noise source. A section in which the cepstrum variance is smaller than a predetermined threshold is judged to be a target sound signal section, and the weight value α by which the received signal is multiplied is set to α = 1.0 (or a value close to 1.0). A section in which the cepstrum variance is larger than the threshold is judged to be a noise signal section, and a weight value α close to zero (0 ≤ α < 1) is set. The same processing is performed when the variance of the autocorrelation function of the received signal is used as the acoustic feature quantity.

When the sharpness of the received signal is used as the acoustic feature quantity, the unit exploits the property that the sharpness of an acoustic signal from a nearby target sound source is large and that of an acoustic signal from a distant noise source is small. A section in which the sharpness is at or above a certain threshold is judged to be a target sound signal section, and the weight value α is set to α = 1.0 (or a value close to 1.0). When the sharpness is at or below the threshold, the section is judged to be a noise signal component, and a weight value α close to zero (0 ≤ α < 1) is set.
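The single-feature decision rules of step S101 can be sketched in Python as follows. The function names and the default weights are hypothetical. The text does not pin down what happens when a feature is exactly equal to its threshold, so the handling of equality here is an assumption.

```python
def weight_from_variance(tau, threshold, alpha_target=1.0, alpha_noise=0.0):
    """Variance-type features (tau_1 or tau_2): a small value means a near source."""
    # tau == threshold is treated as noise here; that choice is an assumption
    return alpha_target if tau < threshold else alpha_noise


def weight_from_sharpness(tau3, threshold, alpha_target=1.0, alpha_noise=0.0):
    """Sharpness (tau_3): a large value means a near source."""
    # tau3 == threshold is treated as target here; that choice is an assumption
    return alpha_target if tau3 >= threshold else alpha_noise
```

Passing, say, `alpha_noise=0.1` instead of 0.0 attenuates noise sections rather than muting them, which corresponds to the "value close to 0" option in the text.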

In the weight value multiplication unit (4), which is the weight value multiplication means, the received signal x(t) is multiplied by the weight value α determined by the weight value determination unit (3) (S102). That is, the received signal in a section judged to be the target sound signal is multiplied by the predetermined weight value α (1.0 or a value close to 1.0), the received signal in a section judged to be a noise signal is multiplied by the predetermined weight value α (0 or a value close to 0), and the product α × x(t) is output as the output signal. This output signal is the extracted target sound signal.
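Step S102 amounts to a per-section gain. A minimal Python sketch, assuming the received signal has already been split into frames and that one weight has been decided per frame (the framing and the function name are illustrative assumptions):

```python
def apply_weights(frames, weights):
    """Multiply every sample of each frame by that frame's weight alpha."""
    if len(frames) != len(weights):
        raise ValueError("one weight per frame is required")
    return [[alpha * sample for sample in frame]
            for frame, alpha in zip(frames, weights)]
```

For example, with weights `[1.0, 0.0]` the first frame passes through unchanged and the second frame is zeroed.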

<Second Embodiment>
Next, a second embodiment of the signal extraction device of the present invention will be described.
The signal extraction device according to the second embodiment has the same configuration as the signal extraction device (A) described in the first embodiment. The parts that differ from the first embodiment are described below.

In the second embodiment, plural acoustic feature quantities are used to determine the weight value. That is, in the weight value determination unit (3), which is the weight value determination means, the weight value α by which each band of the received signal x(t) is multiplied is determined based on the values of the plural acoustic feature quantities calculated by the acoustic feature quantity calculation unit (2) (S101p).

An example of the processing when the plural acoustic feature quantities described above are combined is shown in program format <a> below. Here, the high-order cepstrum variance is τ₁ and its threshold is th1, the variance of the autocorrelation function is τ₂ and its threshold is th2, and the sharpness is τ₃ and its threshold is th3. The symbol ∪ in program format <a> represents "or".

Program format <a> tests whether at least one of the following holds: τ₁ is smaller than th1, τ₂ is smaller than th2, or τ₃ is larger than th3 [line 1 of program format <a>]. If at least one holds, the weight value α is set to 1.0 [line 2 of program format <a>]; otherwise, the weight value α is set to 0.0 [line 3 of program format <a>].
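The listing for program format <a> is not reproduced in this text. Based on the line-by-line description above, it can be sketched in Python as follows. The function name and argument order are hypothetical.

```python
def weight_format_a(tau1, tau2, tau3, th1, th2, th3):
    """Sketch of program format <a>: target if any one feature indicates a near source."""
    if tau1 < th1 or tau2 < th2 or tau3 > th3:  # line 1: the "or" (∪) test
        return 1.0                              # line 2: alpha = 1.0
    return 0.0                                  # line 3: alpha = 0.0
```

Replacing `or` with `and` gives the stricter variant in which all three features must agree before a section is passed.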

Of course, in the if statement on line 1 of program format <a>, ∩ (and) may be used in place of ∪ (or), and a combination of the two may also be used.

It is also not essential to use all of τ₁, τ₂, and τ₃ in the test. For example, the test may use the combination of τ₁ < th1 and τ₂ < th2, the combination of τ₁ < th1 and τ₃ > th3, or the combination of τ₂ < th2 and τ₃ > th3. Furthermore, in addition to τ₁, τ₂, and τ₃, the power of the received signal x(t) may be taken as τ₄ with threshold th4, and the test τ₄ > th4 may be used. For example, the test may use the combination of τ₁ < th1 and τ₄ > th4, or the combination of τ₁ < th1, τ₃ > th3, and τ₄ > th4. A more specific example is shown in program format <b>.

Program format <b> tests whether at least one of the following holds: τ₁ is smaller than th1; or τ₂ is smaller than th2 and, in addition, τ₄ is larger than th4 [line 1 of program format <b>]. If at least one holds, the weight value α is set to 1.0 [line 2 of program format <b>]; otherwise, the weight value α is set to 0.0 [line 3 of program format <b>].
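The listing for program format <b> is also absent from this text. The Python sketch below follows the description above and reads the condition as τ₁ < th1, or (τ₂ < th2 and τ₄ > th4). That grouping is an interpretation of the prose, and the function name is hypothetical.

```python
def weight_format_b(tau1, tau2, tau4, th1, th2, th4):
    """Sketch of program format <b>, mixing "or" and "and" with the power tau_4."""
    # Grouping assumed from the prose: tau1 test OR (tau2 test AND power test)
    if tau1 < th1 or (tau2 < th2 and tau4 > th4):
        return 1.0
    return 0.0
```

Under this reading, a low autocorrelation variance alone is not enough; it must be accompanied by sufficient power, unless the cepstrum variance already indicates a near source.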

In the program formats illustrated above, τ₁ < th1 is tested for τ₁, for example, but the test can instead be changed to th1 ≤ τ₁. Program formats <c> and <d> are given to illustrate this.

Program format <c> tests whether τ₁ is smaller than th1 and τ₂ is smaller than th2 [line 1 of program format <c>]. If so, the weight value α is set to 1.0 [line 2 of program format <c>]; otherwise, the weight value α is set to 0.0 [line 3 of program format <c>]. Program format <d>, on the other hand, tests whether at least one of the following holds: τ₁ is at or above th1, or τ₂ is at or above th2 [line 1 of program format <d>]. If at least one holds, the weight value α is set to 0.0 [line 2 of program format <d>]; otherwise, the weight value α is set to 1.0 [line 3 of program format <d>]. In the end, program formats <c> and <d> express the same processing.
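The equivalence of program formats <c> and <d> is an instance of De Morgan's law: not (A and B) is the same as (not A) or (not B). The Python sketch below writes both tests from the description above and compares them over a small grid of values. The function names and the grid helper are hypothetical.

```python
import itertools


def weight_format_c(tau1, tau2, th1, th2):
    """Sketch of program format <c>: both features below their thresholds."""
    if tau1 < th1 and tau2 < th2:
        return 1.0
    return 0.0


def weight_format_d(tau1, tau2, th1, th2):
    """Sketch of program format <d>: the negated test with swapped outcomes."""
    if th1 <= tau1 or th2 <= tau2:
        return 0.0
    return 1.0


def formats_agree(values, th1, th2):
    """True if <c> and <d> give the same weight for every pair drawn from values."""
    return all(
        weight_format_c(a, b, th1, th2) == weight_format_d(a, b, th1, th2)
        for a, b in itertools.product(values, repeat=2)
    )
```

The grid check includes values exactly equal to the thresholds, which is where a mismatch would appear if one of the comparisons were written with the wrong strictness.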

Thus, as illustrated by program formats <c> and <d>, the same processing may be carried out through different tests, and the present invention is not limited to processing based on any particular test. More generally, the determination of the weight value α using plural acoustic feature quantities (signal feature quantities) may be modified as appropriate without departing from the gist of the present invention.

<Third Embodiment>
Next, a third embodiment of the signal extraction device of the present invention will be described.
FIG. 6 shows a functional block diagram of the signal extraction device (B) according to the third embodiment. FIG. 7 shows a flowchart of the signal extraction processing in the signal extraction device (B) according to the third embodiment.
The signal extraction device (B) according to the third embodiment is the signal extraction device (A) described in the first and second embodiments with the addition of a statistical analysis unit (5), described later. The parts that differ from the first and second embodiments are described below.

A feature of the signal extraction device (B) according to the third embodiment is that, by using a technique called multiple regression analysis when deciding the thresholds for the acoustic feature quantities, the thresholds need not be set each time: a value of 0.5 or more can be judged to be the target sound signal, and a value smaller than 0.5 a noise signal. (The threshold is 0.5 in the third embodiment, but it is not limited to 0.5 and may be changed as appropriate.) Multiple regression analysis is a technique that can examine, in multiple dimensions, the correlation between plural acoustic feature quantities and the characteristics of a signal.

In general, multiple regression analysis can discriminate unknown data using data for which the correct answer is known. For example, for a nearby target sound signal and a distant noise signal recorded in advance, the value "1" is assigned to the nearby target sound signal and the value "0" to the distant noise signal as the correct answers. Plural acoustic feature quantities are then calculated for the known target sound signal and noise signal (these are the high-order cepstrum variance, the variance of the autocorrelation function, the sharpness, the power, and so on, described earlier). Applying multiple regression analysis to these acoustic feature quantities yields a regression equation for judging whether an unknown signal is a near sound or a far sound. For explanatory variables p₁ to pₖ (k = 1, 2, ...), the regression equation is expressed by equation (4) below using regression coefficients b₁ to bₖ.

Here, the explanatory variables p₁ to pₖ represent the acoustic feature quantities τ₁ to τₖ, and y is the objective variable obtained from regression equation (4). In addition, a₀ is the y-intercept. (Multiple regression analysis is the technique for finding suitable values of the y-intercept a₀ and the regression coefficients b₁ to bₖ.)
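Equation (4) is not reproduced in this text. From the description it is presumably the linear form y = a₀ + b₁p₁ + … + bₖpₖ. The Python sketch below fits a₀ and b₁ to bₖ by ordinary least squares, solving the normal equations with Gauss-Jordan elimination. The choice of solver, the function names, and the collinearity tolerance are assumptions, since the patent does not say how the regression is computed.

```python
def fit_multiple_regression(features, labels):
    """Least-squares fit of y = a0 + b1*p1 + ... + bk*pk.

    features: list of feature tuples (p1, ..., pk), one per training frame.
    labels:   1 for known near (target) frames, 0 for known far (noise) frames.
    Returns [a0, b1, ..., bk].
    """
    rows = [[1.0] + list(p) for p in features]
    k = len(rows[0])
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, labels))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("features are collinear; the fit is not unique")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coeffs, p):
    """Evaluate y = a0 + sum(b_i * p_i) for one feature tuple p."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], p))
```

In use, `features` would hold the feature values computed from the pre-recorded near and far signals, and `predict` would then be applied to the features of each unknown frame.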

In the statistical analysis unit (5), which is the statistical analysis means, the acoustic feature quantities calculated by the acoustic feature quantity calculation unit (2) are substituted into the explanatory variables of equation (4), and the objective variable y₁ is calculated as the result (S200).

Using the objective variable y₁ calculated in this way, the weight value determination unit (3) determines the weight value α as in program format <e> (S201).

Program format <e> tests whether the value of y₁ is larger than 0.5. If it is, the weight value α is set to 1.0; otherwise, the weight value α is set to 0.0.
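The listing for program format <e> is not reproduced in this text. Steps S200 and S201 can be sketched in Python as follows, assuming the linear regression form described for equation (4). The function names are hypothetical, and any coefficient values passed in are illustrative, not taken from the patent.

```python
def objective_variable(a0, b, tau):
    """S200: y1 = a0 + sum(b_i * tau_i) for the features of one frame."""
    return a0 + sum(bi * ti for bi, ti in zip(b, tau))


def weight_format_e(y1, threshold=0.5):
    """S201, sketch of program format <e>: alpha = 1.0 if y1 > threshold, else 0.0."""
    return 1.0 if y1 > threshold else 0.0
```

Because the training labels are 1 for near sound and 0 for far sound, the midpoint 0.5 serves as a single decision threshold regardless of how the individual features are scaled.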

By defining it in this way, the weight value α can be determined without visually checking the threshold distribution in advance.

The signal extraction device of the present invention can be realized by a computer that includes the above-described signal input means (for example, a microphone), a storage device (for example, RAM, ROM, or a hard disk), an arithmetic processing device (for example, a CPU), input/output devices (for example, a keyboard and a display), and a bus or the like that connects these devices so that data can be exchanged among them (see FIG. 8). In this case, the programs needed to calculate the variance value of the higher-order cepstrum components, the variance value of the autocorrelation function, the sharpness, the statistical analysis, the weight value, the output signal, and so on (an acoustic feature amount calculation program, a statistical analysis program, a weight value determination program, a weight value multiplication program, and a control program that controls the processing of these programs; the statistical analysis program is unnecessary in the first and second embodiments), together with data such as the received signal x(t), are stored in the storage device. The arithmetic processing device reads, interprets, and executes the programs as needed, thereby realizing the functions of the units described above (the acoustic feature amount calculation unit, the statistical analysis unit, the weight value determination unit, the weight value multiplication unit, and a control unit that controls the processing of these units).
The output signal output by the weight value multiplication unit may be stored in the storage device. Each program can also be recorded on a computer-readable recording medium.
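The chain of programs listed above (feature calculation, statistical analysis, weight determination, weight multiplication) can be sketched as one loop. This is only an illustration under stated assumptions: processing in fixed-length frames, a linear regression equation, and the names `frame_power` and `extract_signal` are all hypothetical. Power is used as the sole feature purely to keep the example short.

```python
import numpy as np

def frame_power(frame):
    """Mean squared amplitude of one frame (power is one of the listed features)."""
    return float(np.mean(frame ** 2))

def extract_signal(x, frame_len, feature_fns, a0, b):
    """Weight each frame of x by alpha, chosen from the regression output."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    for start in range(0, len(x), frame_len):
        frame = x[start:start + frame_len]
        p = [fn(frame) for fn in feature_fns]         # feature calculation
        y1 = a0 + float(np.dot(b, p))                 # statistical analysis
        alpha = 1.0 if y1 > 0.5 else 0.0              # weight determination
        out[start:start + frame_len] = alpha * frame  # weight multiplication
    return out

# Hypothetical input: a loud (near) segment followed by a quiet (far) segment.
x = np.concatenate([np.full(100, 1.0), np.full(100, 0.1)])
out = extract_signal(x, frame_len=100, feature_fns=[frame_power], a0=0.0, b=[1.0])
```

With these made-up coefficients the loud frame is passed unchanged and the quiet frame is zeroed. The sketch corresponds to the configuration that uses the statistical analysis program; the first and second embodiments would determine α from the feature values without the regression step.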

The signal extraction device of the present invention is useful, for example, for acoustic signal analysis such as speech recognition and noise suppression when the target signal is speech. It is particularly useful when the target signal source is closer to the signal extraction device than the miscellaneous signal source, in an environment where the target signal from the target signal source and the miscellaneous signal from the miscellaneous signal source do not overlap in time.

FIG. 1: An example of the variance value of the higher-order cepstrum when a target sound signal from a position about 50 cm from the microphone and a noise signal from a position about 3 m from the microphone are observed for about 30 seconds.
FIG. 2: An example of the variance value of the autocorrelation function under the same conditions as in FIG. 1.
FIG. 3: An example of the sharpness under the same conditions as in FIG. 1.
FIG. 4: A functional block diagram of the signal extraction device (A) according to the first and second embodiments.
FIG. 5: A flowchart of the signal extraction process in the signal extraction device (A) according to the first and second embodiments.
FIG. 6: A functional block diagram of the signal extraction device (B) according to the third embodiment.
FIG. 7: A flowchart of the signal extraction process in the signal extraction device (B) according to the third embodiment.
FIG. 8: An example of the hardware configuration of the signal extraction device.

Explanation of symbols

1 Microphone
2 Acoustic feature amount calculation unit
3 Weight value determination unit
4 Weight value multiplication unit
5 Statistical analysis unit

Claims

A signal extraction apparatus that suppresses a miscellaneous signal and extracts a target signal from a received signal received in an environment where a target signal source and a miscellaneous signal source exist, the apparatus comprising:
at least one signal input means for receiving signals from the target signal source and the miscellaneous signal source;
signal feature amount calculating means for calculating a signal feature amount of the received signal received by the signal input means;
weight value determining means for determining, based on the value of the signal feature amount calculated by the signal feature amount calculating means, a weight value α by which the received signal received by the signal input means is multiplied; and
weight value multiplying means for multiplying the received signal received by the signal input means by the weight value determined by the weight value determining means.

The signal extraction apparatus according to claim 1, wherein the signal feature amount calculated by the signal feature amount calculating means is a variance value with respect to the time axis of the cepstrum.

The signal extraction apparatus according to claim 1, wherein the signal feature amount calculated by the signal feature amount calculating means is a variance value with respect to the time axis of the autocorrelation function.

The signal extraction apparatus according to claim 1, wherein the signal feature amount calculated by the signal feature amount calculating means is sharpness.

The signal extraction apparatus according to claim 1, wherein
the signal feature amount calculating means calculates, as signal feature amounts, any two or more of power, the variance value with respect to the time axis of the cepstrum, the variance value with respect to the time axis of the autocorrelation function, and sharpness, and
the weight value determining means determines the weight value α by which the received signal is multiplied, based on a combination of two or more of power, the variance value with respect to the time axis of the cepstrum, the variance value with respect to the time axis of the autocorrelation function, and sharpness.

The signal extraction apparatus according to claim 1, further comprising
statistical analysis means for calculating an objective variable by multiple regression analysis from the value of the signal feature amount calculated by the signal feature amount calculating means, wherein
the weight value determining means determines the weight value α based on the objective variable calculated by the statistical analysis means.