JP2014102318A - Noise elimination device, noise elimination method, and program

Info

Publication number: JP2014102318A
Application number: JP2012253013A
Authority: JP
Inventors: Yasuhiko Teranishi; 康彦寺西
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2012-11-19
Filing date: 2012-11-19
Publication date: 2014-06-05

Abstract

PROBLEM TO BE SOLVED: To provide a noise elimination device, a noise elimination method, and a program capable of effectively reducing wind noise. SOLUTION: A noise elimination device 100 comprises: a subtractor 11 for calculating a difference signal between an audio signal of one channel and an audio signal of another channel among a plurality of audio signals input from a plurality of audio channels; STFT processing units 13, 14, 15 for dividing the audio signal and the difference signal in the time domain into frames and then converting them into an audio signal and a difference signal in the frequency domain; coefficient multiplication/subtraction processing units 16, 17 for generating a subtraction signal in the frequency domain on the basis of the difference signal and the audio signal in the frequency domain; and IFFT processing units 21, 23 for inversely converting the subtraction signal in the frequency domain into a time signal in the time domain.

Description

The present invention relates to a noise removal device, a noise removal method, and a program.

Patent Document 1 discloses a method based on spectral subtraction (SS). In general, the SS method estimates a noise signal from the audio signal in a silent section and removes that noise from the audio signal. Specifically, the frequency spectrum of the noise signal is subtracted from the frequency spectrum of the noisy audio signal. The method of Patent Document 1 prepares a plurality of noise models in advance and subtracts the frequency spectrum of a model selected from among them.

Patent Document 2 discloses another apparatus for reducing wind noise. In this apparatus, the input signals of a plurality of audio channels are converted into frequency signals by an FFT unit. The frequency signals in the wind noise band are then extracted, and an amplitude comparison unit and a phase comparison unit obtain the difference between the audio channels in terms of both amplitude and phase. An attenuation coefficient generation unit converts these differences into amplitude coefficients for attenuating the wind noise component. A frequency selection/attenuation unit selects either the amplitude-based or the phase-based coefficient and multiplies the frequency signal in the wind noise band by it. A band synthesis unit combines the result with the frequency signals outside the wind noise band, and an IFFT unit inversely converts the combined signal into a time signal.

Patent Document 1: JP 2006-47639 A
Patent Document 2: JP 2010-28307 A

However, with the method of Patent Document 1, it is not easy to prepare and select an appropriate model for wind noise, which changes from moment to moment, and as a result the wind noise reduction effect is insufficient. With Patent Document 2, when the wind noise band and the voice band overlap, the voice is attenuated together with the noise.

The present invention has been made in view of the above problems, and an object thereof is to provide a noise removal device, a noise removal method, and a program that can effectively remove wind noise.

A noise removal device according to one aspect of the present invention includes: a signal calculation unit that calculates a difference signal between the audio signal of one channel and the audio signal of another channel among a plurality of audio signals input from a plurality of audio channels; a conversion unit that divides the audio signal and the difference signal in the time domain into frames and then converts them into an audio signal and a difference signal in the frequency domain; a subtraction processing unit that generates a subtraction signal in the frequency domain based on the difference signal in the frequency domain and the audio signal in the frequency domain; and an inverse conversion unit that inversely converts the subtraction signal in the frequency domain into a time signal in the time domain.
A noise removal method according to one aspect of the present invention includes the steps of: calculating a difference signal between the audio signal of one channel and the audio signal of another channel among a plurality of audio signals input from a plurality of audio channels; dividing the audio signal and the difference signal in the time domain into frames and then converting them into an audio signal and a difference signal in the frequency domain; generating a subtraction signal in the frequency domain based on the difference signal in the frequency domain and the audio signal in the frequency domain; and inversely converting the subtraction signal in the frequency domain into a signal in the time domain.
A program according to one aspect of the present invention causes a computer to execute a noise removal method for removing noise, the noise removal method including the steps of: calculating a difference signal between the audio signal of one channel and the audio signal of another channel among a plurality of audio signals input from a plurality of audio channels; dividing the audio signal and the difference signal in the time domain into frames and then converting them into an audio signal and a difference signal in the frequency domain; generating a subtraction signal in the frequency domain based on the difference signal in the frequency domain and the audio signal in the frequency domain; and inversely converting the subtraction signal in the frequency domain into a signal in the time domain.

According to the present invention, it is possible to provide a noise removal device, a noise removal method, and a program that can effectively reduce wind noise.

FIG. 1 is a circuit block diagram showing the configuration of the noise removal device according to Embodiment 1.
FIG. 2 is a diagram showing the coefficient multiplication/subtraction processing unit of the noise removal device according to Embodiment 1.
FIG. 3 is a circuit block diagram showing the configuration of the noise removal device according to Embodiment 2.
FIG. 4 is a graph showing the threshold values used for switching the switching unit.
FIG. 5 is a diagram showing the switching unit of the noise removal device according to Embodiment 2.
FIG. 6 is a circuit block diagram showing the configuration of the noise removal device according to Embodiment 3.
FIG. 7 is a circuit block diagram showing the configuration of the noise removal device according to Embodiment 4.
FIG. 8 is a diagram showing the coefficient multiplication/subtraction processing unit of the noise removal device according to Embodiment 4.
FIG. 9 is a circuit block diagram showing the configuration of the noise removal device according to Embodiment 5.
FIG. 10 is a circuit block diagram showing the configuration of the noise removal device according to Embodiment 6.

Embodiment 1
Embodiments of the present invention will be described below with reference to the drawings. FIG. 1 is a diagram showing the configuration of the noise removal device according to Embodiment 1. The stereo right and left channel (Rch, Lch) audio signals from two microphones arranged in the casing of an electronic device are A/D converted and input from the input terminals.

The noise removal device 100 includes a subtractor 11, a constant multiplier 12, STFT processing units 13 to 15, a coefficient multiplication/subtraction processing unit 16, a coefficient multiplication/subtraction processing unit 17, an IFFT processing unit 21, an IFFT processing unit 23, a waveform synthesis unit 22, and a waveform synthesis unit 24. The noise removal device 100 may be realized by analog and digital circuits, by software running on a CPU (Central Processing Unit) or a DSP (Digital Signal Processor), or by a combination of these.

The audio signal Lch from the left microphone is input to the input terminal 101. The audio signal Rch from the right microphone is input to the input terminal 102. The subtractor 11 calculates the difference signal between the Lch and Rch audio signals.

The difference signal is multiplied by 1/2 in the constant multiplier 12 and then input to the STFT processing unit 14. The Lch audio signal is input to the STFT processing unit 13, and the Rch audio signal is input to the STFT processing unit 15. The STFT processing units 13 to 15 execute STFT (Short Time Fourier Transform) processing. Specifically, they divide the input difference signal and audio signals into frames of a predetermined length while shifting the frame position by a predetermined time. They then apply a predetermined time window to each frame, execute FFT (Fast Fourier Transform) processing on the windowed signal, and output the phase value and amplitude value at each frequency of each frame. The STFT processing units 13 to 15 thus serve as the conversion unit that divides the audio signals and the difference signal in the time domain into frames and then converts them into audio signals and a difference signal in the frequency domain.
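The frame division, windowing, and transform steps described above can be sketched as follows. This is only an illustration under stated assumptions: the frame length, the hop size, and the Hann window are hypothetical choices (the source says only "predetermined"), and a naive DFT stands in for the FFT so that the example needs no external library.

```python
import cmath
import math

def hann(n):
    # Periodic Hann window; an assumed choice for the "predetermined time window".
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft(frame):
    # Naive O(N^2) DFT used here in place of an FFT.
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
            for k in range(n)]

def stft(signal, frame_len=8, hop=4):
    # Returns one (amplitudes, phases) pair per frame for bins 0 .. frame_len/2 - 1.
    win = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        windowed = [signal[start + i] * win[i] for i in range(frame_len)]
        spec = dft(windowed)[: frame_len // 2]
        frames.append(([abs(c) for c in spec], [cmath.phase(c) for c in spec]))
    return frames
```

With a 256-sample frame this would yield the 128 amplitude and phase values per frame that the description uses; the small default sizes here are only for readability.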

For example, with a 256-point FFT, the frequency signal has 128 amplitude values |Y(t, k×f0)| (k = 0 to 127) and 128 phase values φy(k×f0) (k = 0 to 127). These are written |Ys(t, k×f0)| and φys(k×f0) for the difference signal, |YL(t, k×f0)| and φyL(k×f0) for the Lch audio signal, and |YR(t, k×f0)| and φyR(k×f0) for the Rch audio signal. Of course, the maximum value of k is not limited to 127 and can be any natural number.

The low-frequency amplitude value data of the difference signal and of the Lch audio signal are input to the coefficient multiplication/subtraction processing unit 16. For example, the 32 lowest data points of each, |Ys(t, k×f0)| and |YL(t, k×f0)| (k = 0 to 31), are input to the coefficient multiplication/subtraction processing unit 16.
The low-frequency amplitude value data of the difference signal and of the Rch audio signal are input to the coefficient multiplication/subtraction processing unit 17. For example, the 32 lowest data points of each, |Ys(t, k×f0)| and |YR(t, k×f0)| (k = 0 to 31), are input to the coefficient multiplication/subtraction processing unit 17.

The coefficient multiplication/subtraction processing units 16 and 17 multiply each amplitude value |Ys(t, k×f0)| (k = 0 to 31) by a predetermined coefficient Ck. The resulting multiplication value data is Ck × |Ys(t, k×f0)|. Ck is larger at lower frequencies and becomes smaller toward higher frequencies. For example, Ck = C × (1 − (k/32) × (k/32)) can be used, where C is preferably a value around 1. As described later, the low-frequency part of the difference signal is due to wind noise, but toward higher frequencies it increasingly contains genuine stereo components other than wind noise. The multiplier is therefore made larger at lower frequencies. Alternatively, Ck may be a constant that does not depend on k, which has the advantage of simplifying the arithmetic circuit. Furthermore, some or all of the Ck (k = 0 to 31) may be set to 1.
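As a small worked example of the coefficient formula above, the following sketch evaluates Ck = C × (1 − (k/32)²) for the 32 low-frequency bins. The function name and the default C = 1.0 are illustrative assumptions; the source states only that C should be around 1.

```python
def subtraction_coefficients(c=1.0, n_low=32):
    # C_k = C * (1 - (k / n_low)^2): largest at k = 0 and decreasing
    # toward the top of the low-frequency band.
    return [c * (1.0 - (k / n_low) ** 2) for k in range(n_low)]
```

With C = 1 this gives 1.0 at k = 0 and 0.75 at k = 16, and falls to about 0.06 at k = 31, so the subtraction fades out where genuine stereo differences begin to dominate.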

The coefficient multiplication/subtraction processing unit 16 subtracts the product of the coefficient and the amplitude value data of the difference signal from the amplitude value data of the Lch audio signal. That is, for each frequency it calculates subtraction value data by subtracting the multiplication value data from the amplitude value data of the Lch audio signal: |YL(t, k×f0)| − Ck × |Ys(t, k×f0)|, calculated for each k.
Similarly, the coefficient multiplication/subtraction processing unit 17 subtracts the product of the coefficient and the amplitude value data of the difference signal from the amplitude value data of the Rch audio signal. That is, for each frequency it calculates subtraction value data by subtracting the multiplication value data from the amplitude value data of the Rch audio signal: |YR(t, k×f0)| − Ck × |Ys(t, k×f0)|, calculated for each k.

A specific configuration of the coefficient multiplication/subtraction processing unit 16 will be described with reference to FIG. 2, which shows a configuration example of this unit. The coefficient multiplication/subtraction processing unit 16 includes pairs of a constant multiplier 41 and a subtractor 42. The number of pairs corresponds to the number of data points in the extracted low-frequency range, which is 32 (k = 0 to 31) in the above example.

A predetermined coefficient Ck is set in each constant multiplier 41. The constant multiplier 41 obtains the product of the amplitude value data of the difference signal and the coefficient. The subtractor 42 then obtains, for each frequency, the difference between the amplitude value data of the Lch audio signal and the multiplication value data, and uses it as the subtraction value. In this way, the coefficient multiplication/subtraction processing unit 16 generates a subtraction signal composed of the subtraction values.

If the subtraction value |YL(t, k×f0)| − Ck × |Ys(t, k×f0)| is negative, it is replaced with 0. Instead of replacing negative values with 0, values at or below a predetermined positive constant may be replaced with that constant. This makes the artifact known as musical noise less noticeable.
The coefficient multiplication/subtraction processing unit 17 has the same configuration as the coefficient multiplication/subtraction processing unit 16 and performs the same processing. Negative subtraction values calculated by the coefficient multiplication/subtraction processing unit 17 are handled in the same way.
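The per-bin subtraction and the clamping of small or negative results can be sketched as below. The function and parameter names are hypothetical; `floor=0.0` corresponds to replacing negative values with 0, and a small positive `floor` corresponds to the alternative of substituting a predetermined positive constant.

```python
def subtract_amplitudes(audio_amp, diff_amp, coeffs, floor=0.0):
    # For each low-frequency bin k: |Y(k)| - C_k * |Ys(k)|,
    # with any result at or below `floor` replaced by `floor`.
    out = []
    for a, d, c in zip(audio_amp, diff_amp, coeffs):
        value = a - c * d
        out.append(value if value > floor else floor)
    return out
```

The same function would serve both channels: pass the Lch amplitudes for unit 16 and the Rch amplitudes for unit 17, with the same difference-signal amplitudes in each case.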

A value generated from the amplitude value data at the corresponding frequency and at neighboring frequencies may also be used as the difference signal data. For example, instead of |Ys(t, k×f0)|, the maximum of the amplitude values at three adjacent frequencies may be used: max(|Ys(t, (k−1)×f0)|, |Ys(t, k×f0)|, |Ys(t, (k+1)×f0)|). This is because a noise component may appear in the amplitude value of an adjacent frequency. Of course, the average or the middle (median) value of these amplitude values may be used instead of the maximum. In this way, the wind noise component can be reduced more effectively. In every case, the coefficient multiplication/subtraction processing unit 16 performs the subtraction based on the amplitude value data of the difference signal and of the Lch audio signal; that is, it generates the subtraction signal based on the difference signal and the Lch audio signal.
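A minimal sketch of the three-bin maximum follows. The handling of the first and last bins, where only one neighbor exists, is an assumption of this example; the source does not say how the edges are treated.

```python
def neighborhood_max(diff_amp):
    # For each bin k, take the maximum of bins k-1, k, and k+1.
    # At the edges only the neighbors that exist are used (assumed behavior).
    n = len(diff_amp)
    return [max(diff_amp[max(k - 1, 0): min(k + 2, n)]) for k in range(n)]
```

The result would replace |Ys(t, k×f0)| as the quantity multiplied by Ck before subtraction.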

The low-frequency amplitude value data of the difference signal used for the subtraction may also be smoothed, for each k, using the amplitude value data of past frames. For example, for k = 5, the current frame and the past three frames are combined as (|Ys(t−3, 5×f0)| + 2 × |Ys(t−2, 5×f0)| + 3 × |Ys(t−1, 5×f0)| + 4 × |Ys(t, 5×f0)|) / 10. The same applies to the other values of k. This mitigates the effect of abrupt temporal changes in the difference signal (which, as described later, is the wind noise component) and makes musical noise less noticeable. The coefficient multiplication/subtraction processing unit 17 can perform the same processing as the coefficient multiplication/subtraction processing unit 16.
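The weighted smoothing over the current frame and the past three frames can be written as the following sketch. The function name and the `history` layout (oldest frame first, current frame last) are assumptions made for illustration; the weights 1, 2, 3, 4 and the divisor 10 come from the example in the text.

```python
def smooth_over_frames(history, weights=(1, 2, 3, 4)):
    # history: per-frame amplitude lists, oldest first, current frame last.
    # Each bin k becomes sum(w_j * |Ys(t_j, k)|) / sum(w_j).
    total = sum(weights)
    n_bins = len(history[0])
    return [sum(w * frame[k] for w, frame in zip(weights, history)) / total
            for k in range(n_bins)]
```

Because the newest frame carries the largest weight, the smoothed value still follows the wind noise level while damping frame-to-frame jumps.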

The subtraction signal from the coefficient multiplication/subtraction processing unit 16 is input to the IFFT processing unit 21 shown in FIG. 1. The high-frequency amplitude value data of the Lch audio signal, |YL(t, k×f0)| (k = 32 to 127), and the phase values of the Lch audio signal, φyL(k×f0) (k = 0 to 127), are also input to the IFFT processing unit 21. The IFFT processing unit 21 performs IFFT (Inverse FFT) processing using this amplitude and phase information. The IFFT processing unit 21 thus serves as an inverse conversion unit that inversely converts the subtraction signal in the frequency domain into a time signal in the time domain.
Similarly, the subtraction signal from the coefficient multiplication/subtraction processing unit 17 is input to the IFFT processing unit 23. The high-frequency amplitude value data of the Rch audio signal, |YR(t, k×f0)| (k = 32 to 127), and the phase values of the Rch audio signal, φyR(k×f0) (k = 0 to 127), are also input to the IFFT processing unit 23. The IFFT processing unit 23 performs IFFT (Inverse FFT) processing using this amplitude and phase information. The IFFT processing unit 23 thus serves as an inverse conversion unit that inversely converts the subtraction signal in the frequency domain into a time signal in the time domain.
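One way to picture the inverse step is the sketch below, which rebuilds a complex half-spectrum from amplitude and phase values and returns a real time-domain frame. It is not the source's implementation: it uses a naive inverse DFT rather than an IFFT, mirrors the bins with conjugate symmetry, and leaves the Nyquist bin at zero, all of which are assumptions of this example.

```python
import cmath
import math

def rebuild_frame(amps, phases):
    # amps/phases cover bins 0 .. N/2 - 1; the low bins would carry the
    # subtracted amplitudes, the high bins the original amplitudes.
    half = len(amps)
    n = 2 * half
    spec = [0j] * n
    for k in range(half):
        spec[k] = cmath.rect(amps[k], phases[k])
    for k in range(1, half):
        spec[n - k] = spec[k].conjugate()  # conjugate symmetry gives a real output
    # spec[half] (the Nyquist bin) stays zero: an assumption of this sketch.
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * i / n) for k in range(n)).real / n
            for i in range(n)]
```

In the device, `amps` would be the concatenation of the 32 subtracted low-frequency values and the 96 unmodified high-frequency values, and `phases` would be the original phase of the same channel.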

The output of the IFFT processing unit 21 is input to the waveform synthesis unit 22. The waveform synthesis unit 22 performs inverse windowing processing and waveform synthesis processing and outputs the audio signal ynrL(i). This audio signal is the left channel (Lch) audio signal with the wind noise component removed.
The output of the IFFT processing unit 23 is input to the waveform synthesis unit 24. The waveform synthesis unit 24 performs inverse windowing processing and waveform synthesis processing and outputs the audio signal ynrR(i). This audio signal is the right channel (Rch) audio signal with the wind noise component removed.

When two microphones are arranged in the casing of an electronic device for Lch and Rch, the distance between them is short. Consequently, in the recorded audio signal there is almost no difference between the two channels, particularly at low frequencies. Toward higher frequencies, a genuine stereo difference appears that depends on the positions of the sound source and the two microphones. Wind noise, on the other hand, consists mainly of low-frequency components of 1 kHz or less, and it arises without correlation between the Lch and Rch microphones.

Accordingly, the low-frequency part of the FFT of the Lch/Rch difference signal can be attributed to wind noise. The amplitude value data of the difference signal at each corresponding frequency is therefore subtracted from the low-frequency amplitude value data of the Lch or Rch audio signal. This reduces the wind noise component in the audio signal.
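The reasoning above can be checked with a toy time-domain example: a component common to both channels cancels in the halved difference signal, while uncorrelated per-microphone components remain. The sample values are made up for illustration only.

```python
def half_difference(lch, rch):
    # Output of subtractor 11 followed by constant multiplier 12: (L - R) / 2.
    return [(l - r) / 2 for l, r in zip(lch, rch)]

# Hypothetical samples: identical "speech" in both channels plus
# different "wind" disturbances at each microphone.
speech = [1.0, -1.0, 0.5]
wind_l = [0.2, 0.0, -0.4]
wind_r = [0.0, 0.2, 0.0]
lch = [s + w for s, w in zip(speech, wind_l)]
rch = [s + w for s, w in zip(speech, wind_r)]
diff = half_difference(lch, rch)  # depends only on wind_l and wind_r
```

Here `diff` equals (wind_l − wind_r) / 2 sample by sample, with no trace of `speech`, which is why its low-frequency spectrum can serve as the wind noise estimate.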

Although the above description uses the Lch and Rch microphones of an electronic device as an example, the invention can be used in various electric devices. For example, the noise removal device 100 can be mounted on an electronic device having a recording function or on an electric device that reproduces sound through a speaker. The microphones that detect the audio signals need not be a stereo pair with Lch and Rch; any plurality of microphones arranged close to each other will do. The number of microphones is not limited to two; any number of two or more may be used.

The above description uses the amplitude values of the audio signals and the difference signal, |YR(t, k×f0)|, |YL(t, k×f0)|, and |Ys(t, k×f0)|, but the subtraction signal may instead be generated from their squares, the power values |YR(t, k×f0)|², |YL(t, k×f0)|², and |Ys(t, k×f0)|². For example, the coefficient multiplication/subtraction processing unit 16 subtracts the power value of the difference signal multiplied by the coefficient from the power value of the audio signal, and then takes the square root of the result. In this case, the subtraction signal output from the coefficient multiplication/subtraction processing unit 16 is (|YL(t, k×f0)|² − Ck × |Ys(t, k×f0)|²)^0.5.
Alternatively, the result of the power value subtraction may be divided by the square of the audio signal amplitude, and the quotient multiplied by the amplitude value data of the audio signal may be used as the subtraction signal. In this case, the subtraction signal output from the coefficient multiplication/subtraction processing unit 16 is |YL(t, k×f0)| × (|YL(t, k×f0)|² − Ck × |Ys(t, k×f0)|²) / |YL(t, k×f0)|². The coefficient multiplication/subtraction processing unit 17 performs the same processing as the coefficient multiplication/subtraction processing unit 16.
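The two power-domain variants can be sketched as follows. Clamping a negative power difference to 0 is carried over from the amplitude-domain case as an assumption, since the source does not restate it here; the guard against a zero audio amplitude in the second function is likewise an addition of this example.

```python
import math

def power_subtract_sqrt(audio_amp, diff_amp, coeffs):
    # (|Y|^2 - C_k * |Ys|^2) ^ 0.5, with negative differences clamped to 0 (assumed).
    out = []
    for a, d, c in zip(audio_amp, diff_amp, coeffs):
        p = a * a - c * d * d
        out.append(math.sqrt(p) if p > 0.0 else 0.0)
    return out

def power_subtract_gain(audio_amp, diff_amp, coeffs):
    # |Y| * (|Y|^2 - C_k * |Ys|^2) / |Y|^2, i.e. a power-ratio gain applied to |Y|.
    out = []
    for a, d, c in zip(audio_amp, diff_amp, coeffs):
        p = a * a - c * d * d
        out.append(a * p / (a * a) if a > 0.0 and p > 0.0 else 0.0)
    return out
```

For an audio amplitude of 5 and a difference amplitude of 3 with Ck = 1, the first variant gives 4 and the second gives 3.2, so the second attenuates more strongly for the same inputs.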

Using power values in this way provides the same effect as using the amplitude value data of the audio signal and the difference signal. The subtraction signal can thus be generated using either the amplitude spectrum or the power spectrum of the Fourier transform.

Embodiment 2
The noise removal device according to this embodiment will be described with reference to FIG. 3, which shows its configuration. As in Embodiment 1, the noise removal device according to this embodiment performs noise removal processing on audio signals from microphones arranged in the casing of an electronic device. That is, the stereo right and left channel (Rch, Lch) audio signals from two microphones are A/D converted and input from the input terminals 101 and 102.

In addition to the configuration of Embodiment 1, the noise removal device 100 according to this embodiment includes a synthesis ratio calculation unit 18, a delay unit 31, a delay unit 32, a switching unit 33, and a switching unit 34. The basic configuration of the noise removal device 100 is the same as in Embodiment 1, so its description is omitted where appropriate.

The low-frequency amplitude value data of the difference signal, |Ys(t, k×f0)| (k = 0 to 31), is input to the synthesis ratio calculation unit 18. The synthesis ratio calculation unit 18 obtains, for example, a weighted average of the amplitude value data |Ys(t, k×f0)| (k = 0 to 31) in which the lower frequencies are weighted more heavily. It then compares the weighted average with threshold 1 and threshold 2. As shown in FIG. 4, when the weighted average is smaller than threshold 1, the synthesis ratio calculation unit 18 outputs the synthesis ratio data Q = 0. When the weighted average is larger than threshold 2, it outputs Q = 1. When the weighted average lies between threshold 1 and threshold 2, it outputs a value greater than 0 and less than 1 as the synthesis ratio data Q, depending on the weighted average. Here, between threshold 1 and threshold 2, Q increases linearly with the weighted average.
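The mapping from the weighted average to the synthesis ratio Q can be sketched as below. The weights and threshold values are parameters that the source leaves unspecified, so they are passed in by the caller; the function name is hypothetical.

```python
def synthesis_ratio(diff_amp_low, weights, threshold1, threshold2):
    # Weighted average of the low-frequency difference amplitudes
    # (the caller chooses larger weights for lower bins).
    avg = sum(w * a for w, a in zip(weights, diff_amp_low)) / sum(weights)
    if avg <= threshold1:
        return 0.0  # little wind noise: use the unprocessed signal
    if avg >= threshold2:
        return 1.0  # strong wind noise: use the noise-reduced signal
    # Linear ramp between the two thresholds.
    return (avg - threshold1) / (threshold2 - threshold1)
```

Swapping the weighted average for a plain sum or a sum of squares, as the text allows, would change only the first line of the function body.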

Instead of the weighted average, the sum or the sum of squares of the amplitude value data |Ys(t, k×f0)| (k = 0 to 31) may be used. That is, the sum, the sum of squares, or the weighted average of |Ys(t, k×f0)| (k = 0 to 31) can serve as the comparison value that is compared with threshold 1 and threshold 2. This comparison value represents the low-frequency noise component contained in the audio signal.

The synthesis ratio data Q calculated as described above may also be smoothed along the time axis. For example, the results of the past n−1 frames may be retained and averaged together with the current result, and the average over the n frames output. Alternatively, a value obtained by applying a low-pass filter (LPF) along the time axis may be output. As shown in FIG. 3, the output of the synthesis ratio calculation unit 18 is input to the switching units 33 and 34 for Lch and Rch.
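Both smoothing options can be sketched briefly. The class name `RatioSmoother`, the function `one_pole_lpf`, the coefficient `alpha`, and the handling of the first frames (averaging over however many results are available) are assumptions made for illustration; the description only states that an n-frame average or an LPF may be used.

```python
from collections import deque


class RatioSmoother:
    """Average the current Q with the results of up to n-1 past frames."""

    def __init__(self, n):
        self.history = deque(maxlen=n)  # oldest result is dropped automatically

    def update(self, q):
        self.history.append(q)
        # Assumption: before n frames exist, average over what is available.
        return sum(self.history) / len(self.history)


def one_pole_lpf(q_values, alpha):
    """First-order low-pass along the time axis (one assumed LPF form)."""
    y = 0.0
    out = []
    for q in q_values:
        y += alpha * (q - y)
        out.append(y)
    return out
```

A longer averaging window reacts more slowly to gusts but avoids audible switching between the original and the noise-reduced signal.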

The Lch audio signal is input to the Lch switching unit 33 via the delay unit 31. The delay unit 31 compensates for the time required by the STFT processing, the IFFT processing, and the waveform synthesis processing. That is, the delay unit 31 delays the Lch audio signal by a time corresponding to the processing of the STFT processing unit 13, the coefficient multiplication/subtraction processing unit 16, the IFFT processing unit 21, and the waveform synthesis unit 22. For example, when the STFT processing, the IFFT processing, and the waveform synthesis processing together take m samples at the sampling period of the A/D conversion, the delay unit 31 also delays the signal by m samples and outputs it as ydL(i). The switching unit 33 also receives the output signal ynrL(i) of the waveform synthesis unit 22 and the synthesis ratio data Q output by the synthesis ratio calculation unit 18. From ydL(i), ynrL(i), and the synthesis ratio data Q, the switching unit 33 calculates and outputs (1 - Q) × ydL(i) + Q × ynrL(i).

The configuration of the switching unit 33 is shown in FIG. 5. The switching unit 33 includes a variable multiplier 71, a variable multiplier 72, and an adder 73. The synthesis ratio data Q is input to the variable multipliers 71 and 72. The variable multiplier 71 multiplies ynrL(i) by Q, and the variable multiplier 72 multiplies ydL(i) by (1 - Q). The adder 73 then sums the outputs of the variable multiplier 71 and the variable multiplier 72. As a result, (1 - Q) × ydL(i) + Q × ynrL(i) is calculated.
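The delay compensation and the weighted sum performed by the switching unit can be sketched as below. The function names `delay` and `mix` are hypothetical, and zero-padding the start of the delayed signal is an assumption; the description only says that the original signal is delayed by m samples.

```python
def delay(signal, m):
    """Delay a signal by m samples, keeping its length (zeros are shifted in)."""
    padded = [0.0] * m + list(signal)
    return padded[:len(signal)]


def mix(y_delayed, y_noise_reduced, q):
    """Compute (1 - Q) * ydL(i) + Q * ynrL(i) for every sample i."""
    return [(1.0 - q) * d + q * n for d, n in zip(y_delayed, y_noise_reduced)]


# Q = 0 passes the delayed original, Q = 1 passes the noise-reduced signal.
original = delay([1.0, 2.0, 3.0], 1)
blended = mix(original, [3.0, 4.0, 5.0], 0.5)
```

Because the two gains always sum to 1, the overall level stays constant while Q moves between 0 and 1.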

When the weighted average in the low frequency range is smaller than threshold 1, Q = 0. That is, because the noise component is small, the output signal is ydL(i), and the audio signal input to the input terminal 101 is output as it is. Conversely, when the weighted average in the low frequency range is larger than threshold 2, Q = 1. That is, because the noise component is large, the output signal is ynrL(i), and the signal synthesized by the waveform synthesis unit 22 is output to the output terminal 103. In this way, the switching unit 33 switches between the audio signal ydL(i) input to the input terminal 101 and the noise-removed signal ynrL(i). When the weighted average is at least threshold 1 and at most threshold 2, the output signal is (1 - Q) × ydL(i) + Q × ynrL(i), that is, the audio signal input to the input terminal 101 and the noise-removed signal combined at a predetermined ratio.

The Rch audio signal is processed in the same way. That is, the Rch audio signal is input to the switching unit 34 via the delay unit 32. The delay unit 32 performs the same processing as the delay unit 31. The switching unit 34 has the same configuration as the switching unit 33 and performs the same processing. The switching unit 34 therefore outputs (1 - Q) × ydR(i) + Q × ynrR(i) for Rch. The output signals of the switching units 33 and 34 are output from the output terminals 103 and 104, respectively, encoded by an encoder (not shown), and recorded on a recording medium. Alternatively, they are D/A converted and then output to a speaker or the like.

The synthesis ratio data Q may also be obtained from something other than the amplitude value data in the low frequency range of the difference signal. For example, it may be obtained from the power values in the low frequency range of the difference signal. It may also be calculated using the wind pressure sensor described in Japanese Patent Laid-Open No. 5-328480. Alternatively, the difference signal may be passed through a band-pass filter (BPF), and Q calculated from data obtained by peak-detecting the absolute value of the filter output. Here, the passband of the BPF is set to, for example, 100 Hz to 1 kHz, where the main components of wind noise lie. In this way, the noise component contained in the audio signal is measured using the amplitude value data of the difference signal or some other means, and the synthesis ratio calculation unit 18 calculates the synthesis ratio data Q according to the measurement result. This allows the switching unit 33 to switch appropriately.

The synthesis ratio calculation unit 18 calculates the synthesis ratio data Q using the low-frequency amplitude value data obtained by applying the FFT to the difference signal. The switching units 33 and 34 use the synthesis ratio data Q to combine the original audio signal with the audio signal in which wind noise has been reduced. As a result, when there is no wind noise, the original audio signal is output, so the stereo component is preserved. When the wind noise is large, the audio signal with reduced wind noise is output, so an audio signal little affected by wind noise can be output even in windy conditions.

When the synthesis ratio data Q calculated by the synthesis ratio calculation unit 18 is 0 for a given frame, some or all of the arithmetic processing of the coefficient multiplication/subtraction processing unit, the IFFT processing unit, and the waveform synthesis unit may be skipped. For example, depending on the noise measurement result, the IFFT processing unit 21 does not perform the inverse Fourier transform. This simplifies the processing.

For example, when the noise removal apparatus according to the present embodiment is used in an audio recording apparatus, the wind is not blowing all the time, and there may be long periods without wind. When these processing units are implemented as circuits, power consumption can be reduced by not operating those circuits. When they are implemented as software on a CPU, a DSP, or the like, the routines that execute the processing are skipped. As a result, the processing can be omitted when the wind noise is small, and power consumption can be reduced.

Specifically, when the processing in the noise removal apparatus 100 is implemented as software on a CPU, a DSP, or the like, the STFT processing is first performed on the difference signal. When the resulting amplitude value data |Ys(t, k×f0)| (k = 0 to 31) are all smaller than a predetermined threshold, all or part of the STFT processing of the audio signal, the coefficient multiplication/subtraction processing, the IFFT processing, and the waveform synthesis processing is skipped. In this way, comparing the amplitude value data with the threshold makes it possible to omit all or part of these processes and thereby reduce power consumption.

The noise removal apparatus 100 may use power value data instead of amplitude value data to measure the noise. The threshold may be set individually for each k, or the same threshold may be used for all k. The noise removal apparatus 100 may also compare a threshold with a total obtained by weighting and summing the amplitude value data or the power value data. Whether the wind noise is small is then determined from the comparison result, and processing may be omitted according to that determination. When at least part of the processing is skipped, the noise removal apparatus 100 outputs the audio signal unchanged after delaying it by a predetermined time. This predetermined time can be set to the same number of sampling periods as when the STFT processing, the coefficient multiplication/subtraction processing, the IFFT processing, and the waveform synthesis processing of the audio signal are performed.
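In a software implementation, the skip decision might look like the following sketch. The names `wind_is_negligible` and `output_frame`, and the use of a callable `denoise` standing in for the STFT, subtraction, IFFT, and waveform synthesis chain, are assumptions for illustration only.

```python
def wind_is_negligible(amplitudes, thresholds):
    """Return True when every low-band amplitude is below its threshold.

    `thresholds` may be one number shared by all bins or one value per bin k.
    """
    if isinstance(thresholds, (int, float)):
        thresholds = [thresholds] * len(amplitudes)
    return all(a < th for a, th in zip(amplitudes, thresholds))


def output_frame(diff_amplitudes, delayed_frame, denoise, thresholds):
    """Skip the costly noise-removal chain when the wind noise is small."""
    if wind_is_negligible(diff_amplitudes, thresholds):
        # Output the original audio, delayed to match the normal latency.
        return delayed_frame
    return denoise()  # hypothetical callable running the full chain
```

Passing the delayed frame in the skipped case keeps the output latency identical in both branches, so switching between them does not shift the signal in time.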

Embodiment 3.
A noise removal apparatus 100 according to the present embodiment will be described with reference to FIG. 6, which shows the configuration of the noise removal apparatus 100. The noise removal apparatus 100 according to the present embodiment performs noise removal processing using three microphones. Here, in addition to the Lch and Rch audio signals, an Mch audio signal is input to the noise removal apparatus 100. The noise removal apparatus 100 therefore includes an Mch input terminal 105 and an Mch output terminal 106.

The subtractor 11a generates a difference signal between the Lch audio signal and the Mch audio signal, and the subtractor 11b generates a difference signal between the Mch audio signal and the Rch audio signal. Because the processing of the Lch-Mch difference signal and that of the Mch-Rch difference signal are basically the same, the following description focuses on the processing of the Lch-Mch difference signal.

The difference signal from the subtractor 11a is multiplied by a constant in the constant multiplier 12a and input to the STFT processing unit 14a. The difference signal from the subtractor 11b is multiplied by a constant in the constant multiplier 12b and input to the STFT processing unit 14b. The STFT processing unit 14a performs the same processing on the difference signal as the STFT processing unit 14 of the first embodiment. The Lch audio signal is input to the STFT processing unit 13a, which performs the same processing on it as the STFT processing unit 13 of the first embodiment. The Mch audio signal is input to the STFT processing unit 15, which performs the same processing on it as the STFT processing unit 15 of the first embodiment. The Rch audio signal is input to the STFT processing unit 13b, which performs the same processing on it as the STFT processing unit 13 of the first embodiment.

The coefficient multiplication/subtraction processing unit 16 in the present embodiment performs the same processing as the coefficient multiplication/subtraction processing unit 16 of the first embodiment. It therefore calculates a subtraction signal based on the difference signal and the Lch audio signal and outputs it to the IFFT processing unit 21. The coefficient multiplication/subtraction processing unit 25 in the present embodiment performs the same processing as the coefficient multiplication/subtraction processing unit 17 of the first embodiment. It therefore calculates a subtraction signal based on the difference signal and the Mch audio signal and outputs it to the IFFT processing unit 26.

The IFFT processing unit 21 and the waveform synthesis unit 22 of the present embodiment correspond to the IFFT processing unit 21 and the waveform synthesis unit 22 of the first embodiment. The IFFT processing unit 21 generates a time-domain signal based on the subtraction signal and the Lch audio signal. Based on this time-domain signal, the waveform synthesis unit 22 outputs the noise-removed Lch audio signal to the output terminal 103. In this way, an Lch audio signal from which the noise component has been removed is output from the output terminal 103.

The IFFT processing unit 26 and the waveform synthesis unit 27 perform the same processing as the IFFT processing unit 21 and the waveform synthesis unit 22. The IFFT processing unit 26 generates a time-domain signal based on the subtraction signal and the Mch audio signal. Based on this time-domain signal, the waveform synthesis unit 27 outputs the noise-removed Mch audio signal to the output terminal 106. In this way, an Mch audio signal from which the noise component has been removed is output from the output terminal 106.

The Rch audio signal is processed in the same way. That is, the STFT processing unit 14b corresponds to the STFT processing unit 14a, and the STFT processing unit 13b corresponds to the STFT processing unit 13a. Likewise, the coefficient multiplication/subtraction processing unit 17 corresponds to the coefficient multiplication/subtraction processing unit 16, the IFFT processing unit 23 corresponds to the IFFT processing unit 21, and the waveform synthesis unit 24 corresponds to the waveform synthesis unit 22.

The difference signal between Rch and Mch is input to the STFT processing unit 14b, and the Rch audio signal is input to the STFT processing unit 13b. The STFT processing unit 14b performs the same processing on the difference signal as the STFT processing unit 14a. The STFT processing unit 13b performs the same processing on the Rch signal as the STFT processing unit 13a. The coefficient multiplication/subtraction processing unit 17 calculates a subtraction signal based on the difference signal and the Rch audio signal and outputs it to the IFFT processing unit 23. The STFT processing unit 13b also outputs the amplitude and phase components of the high frequency range of the Rch audio signal to the IFFT processing unit 23. The IFFT processing unit 23 generates a time-domain signal based on this amplitude information and phase information. The waveform synthesis unit 24 outputs the noise-removed Rch audio signal to the output terminal 104.

Because Mch is processed based on the Lch and Mch audio signals, the Mch noise removal is performed without using the Rch signal. The three microphones for Lch, Mch, and Rch are placed close to one another. When the microphones are mounted on an electronic device, for example, the Lch and Rch microphones may be placed on the left and right sides of the housing, respectively, and the Mch microphone near the midpoint between them. This arrangement is preferable because it shortens both the distance between the Lch and Mch microphones and the distance between the Rch and Mch microphones.

In this way, the audio signals of three microphones can be processed. Four or more microphones can of course be handled in the same manner. The switching unit 33, the switching unit 34, and the other elements described in the second embodiment can also be added to the configuration of the third embodiment.

Embodiment 4.
A noise removal apparatus according to the present embodiment will be described with reference to FIG. 7, which shows the overall configuration of the noise removal apparatus 100. In the present embodiment, the time-domain audio signal is converted into a frequency-domain signal using the discrete cosine transform instead of the discrete Fourier transform. Accordingly, the STFT processing units 13 to 15 are replaced with window function processing + DCT processing units 63 to 65, and the IFFT processing units 21 and 23 are replaced with inverse DCT processing units 81 and 83. Description of processing that is the same as in the first embodiment is omitted where appropriate.

As in the first embodiment, the subtractor 11 calculates the difference signal between Lch and Rch. This difference signal is multiplied by 1/2 in the constant multiplier 12 and then input to the window function processing + DCT processing unit 64. The Lch audio signal is input to the window function processing + DCT processing unit 63, and the Rch audio signal is input to the window function processing + DCT processing unit 65. The window function processing + DCT processing units 63 to 65 execute window function processing and DCT processing.

In the window function processing, the input signal is divided into frames of a predetermined length while being shifted by a predetermined time at each step, so that consecutive frames overlap by a predetermined number of samples. Each frame is then multiplied by a predetermined time window. The windowed signal is further subjected to the discrete cosine transform (DCT) and thereby converted into a frequency-domain signal. The window function processing + DCT processing units 63 to 65 thus serve as conversion units that divide the time-domain audio signals and difference signal into frames and then convert them into frequency-domain audio signals and a frequency-domain difference signal.
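The framing, windowing, and DCT steps can be sketched in plain Python as follows. The Hann window, the unnormalized DCT-II definition, and the function names are assumptions chosen for illustration; the description does not state which window or which DCT variant is used.

```python
import math


def split_frames(x, frame_len, hop):
    """Cut x into overlapping frames of frame_len samples, shifted by hop."""
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, hop)]


def hann(n):
    """Periodic Hann window (an assumed choice of time window)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def dct2(x):
    """Unnormalized DCT-II: F[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def window_and_dct(frame):
    """Apply the time window to one frame, then transform it."""
    w = hann(len(frame))
    return dct2([wi * xi for wi, xi in zip(w, frame)])
```

This direct form costs O(N²) per frame; a practical implementation would use a fast DCT, but the framing and windowing logic stays the same.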

For the frame at time t, a 256th-order DCT yields 256 frequency-domain values F(t,k) (k = 0 to 255). These are denoted Fs(t,k) for the difference signal, FL(t,k) for the Lch audio signal, and FR(t,k) for the Rch audio signal. Data with small k are data in the low frequency range.

The low-frequency frequency-domain data of the difference signal and of the Lch audio signal are input to the coefficient multiplication/subtraction processing unit 16. These are, for example, the 32 lowest values of each, Fs(t,k) and FL(t,k) (k = 0 to 31). The configuration of the coefficient multiplication/subtraction processing unit 16 is shown in FIG. 8. The coefficient multiplication/subtraction processing unit 16 includes an ABS unit 43, a constant multiplier 41, a subtractor 42, and an SGN unit 44.

The ABS unit 43 of the coefficient multiplication/subtraction processing unit 16 takes the absolute value of each of Fs(t,k) and FL(t,k) to obtain |Fs(t,k)| and |FL(t,k)|. At this point, the coefficient multiplication/subtraction processing unit 16 stores the sign of each value of FL(t,k).

The constant multiplier 41 multiplies each value of |Fs(t,k)| (k = 0 to 31) by a predetermined coefficient Ck. The coefficient Ck can take the same values as in the first embodiment, which provides the same effect as in the first embodiment.

Next, the subtractor 42 subtracts the product of the absolute value and the coefficient from the absolute value of the frequency-domain data of the audio signal. When the result of the subtraction, |FL(t,k)| - Ck × |Fs(t,k)|, is negative, it is replaced with 0. Instead of replacing negative results with 0, results that are less than or equal to a predetermined positive constant may be replaced with that constant. This has the effect of making the artifact known as musical noise less noticeable.

As in the first embodiment, the maximum of the absolute value at the corresponding frequency and the absolute values at the neighboring frequencies may be used. In this case, max(|Fs(t,k-1)|, |Fs(t,k)|, |Fs(t,k+1)|) is used in place of |Fs(t,k)|. Alternatively, the average or the median of these values may be used. Also as in the first embodiment, the low-frequency frequency-domain data of the difference signal used for the subtraction may be smoothed for each k using data from past frames. In this case, for the data at k = 5 for example, the current frame and the past three frames are combined as (|Fs(t-3,5)| + 2 × |Fs(t-2,5)| + 3 × |Fs(t-1,5)| + 4 × |Fs(t,5)|) / 10. This provides the same effect as in the first embodiment.
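The two refinements of the noise estimate can be sketched as below. The function names are hypothetical, and clipping the neighborhood at the edges of the band (k = 0 and the last bin) is an assumption, since the description does not say how the edge bins are treated. The weights 1, 2, 3, 4 follow the example for k = 5, with the largest weight on the current frame.

```python
def neighbour_max(magnitudes, k):
    """max(|Fs(t,k-1)|, |Fs(t,k)|, |Fs(t,k+1)|), clipped at the band edges."""
    lo = max(0, k - 1)
    hi = min(len(magnitudes), k + 2)
    return max(magnitudes[lo:hi])


def smooth_over_frames(history, weights=(1, 2, 3, 4)):
    """Weighted average of one bin over frames t-3, t-2, t-1, t (oldest first)."""
    return sum(w * h for w, h in zip(weights, history)) / sum(weights)


# Example: |Fs| at k = 5 for frames t-3 .. t.
estimate = smooth_over_frames([0.0, 0.0, 0.0, 10.0])
```

Taking the maximum over neighboring bins makes the estimate more conservative, while the temporal weighting damps frame-to-frame fluctuation of the subtracted amount.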

The SGN unit 44 attaches to this subtraction result the sign of each value that was stored when the absolute value was taken. That is, the sign of the frequency-domain data of the audio signal does not change, and its absolute value decreases by an amount that depends on the absolute value of the frequency-domain data of the difference signal. As long as the same values result, it is not strictly necessary to take the absolute value at the ABS unit 43 or to attach the sign at the SGN unit 44; the calculation may instead be carried out with the sign handled logically at the time of the subtraction.
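The subtraction, flooring, and sign restoration described above can be sketched for one frame as follows. The function name `subtract_low_band` and the `floor` parameter are hypothetical; `floor=0.0` corresponds to replacing negative results with 0, and a small positive `floor` corresponds to the alternative that suppresses musical noise.

```python
import math


def subtract_low_band(f_l, f_s, c, floor=0.0):
    """Return sign(FL) * max(|FL(t,k)| - Ck * |Fs(t,k)|, floor) for each bin k."""
    out = []
    for fl, fs, ck in zip(f_l, f_s, c):
        magnitude = abs(fl) - ck * abs(fs)
        if magnitude < floor:
            magnitude = floor  # clamp instead of letting the result go negative
        # Re-attach the sign of the original audio coefficient.
        out.append(math.copysign(magnitude, fl))
    return out


# Three bins with hypothetical coefficients Ck.
reduced = subtract_low_band([-4.0, 1.0, 3.0], [1.0, 2.0, 0.5], [2.0, 1.0, 2.0])
```

Using `math.copysign` folds the ABS and SGN steps into the subtraction itself, which matches the remark that the sign can be handled logically rather than by separate units.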

Instead of taking the absolute value of the frequency-domain data, the ABS unit 43 may square it. In this case, the output of the subtractor 42 is (|FL(t,k)|)² - Ck × (|Fs(t,k)|)². The SGN unit 44 then raises the output data of the subtractor 42 to the power 1/2 (takes its square root) and attaches the sign, so that its output is ((|FL(t,k)|)² - Ck × (|Fs(t,k)|)²)^0.5 with the sign attached. Alternatively, the output data of the subtractor 42 may be divided by the square of the audio signal, and the quotient multiplied by the frequency-domain data of the audio signal. In this case, the subtraction signal output from the coefficient multiplication/subtraction processing unit 16 is FL(t,k) × ((|FL(t,k)|)² - Ck × (|Fs(t,k)|)²) / (|FL(t,k)|)². As described above, the numerator of this quotient never becomes negative. The sign of the frequency-domain data of the audio signal therefore does not change, and there is no need to attach a sign at the SGN unit 44. In this way, the subtraction signal may be calculated by subtracting squared values of the frequency-domain data.
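The two squared-value variants can be compared with a short sketch. The function names are hypothetical, and returning 0 when FL(t,k) is exactly 0 is an assumption added to avoid dividing by zero, a case the description does not address.

```python
import math


def power_subtract_sqrt(fl, fs, ck):
    """Signed square root of max(FL^2 - Ck * Fs^2, 0)."""
    numerator = max(fl * fl - ck * fs * fs, 0.0)
    return math.copysign(math.sqrt(numerator), fl)


def power_subtract_gain(fl, fs, ck):
    """FL * max(FL^2 - Ck * Fs^2, 0) / FL^2; the sign of FL carries through."""
    if fl == 0.0:
        return 0.0  # assumption: nothing to scale when the bin is empty
    numerator = max(fl * fl - ck * fs * fs, 0.0)
    return fl * numerator / (fl * fl)
```

The two forms are not numerically identical: for FL = -5, Fs = 3, and Ck = 1, the square-root form gives -4 and the gain form gives -3.2, so the gain form attenuates more strongly for the same Ck.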

The frequency-domain data obtained by attaching the signs to the subtraction results, together with the frequency-domain data FL(t,k) (K = 32 to 127) of the high frequency range of the audio signal, are input to the inverse DCT processing unit 81. The inverse DCT processing unit 81 performs the inverse DCT on these frequency-domain data, converting the frequency-domain signal back into a time-domain signal. The inverse DCT processing unit 81 thus serves as an inverse conversion unit that converts the frequency-domain subtraction signal back into a time-domain signal. Its output is input to the waveform synthesis unit 22. As in the first embodiment, the waveform synthesis unit 22 performs waveform synthesis processing and outputs an audio signal ynrL(i). This audio signal is the left channel (Lch) audio signal with the wind noise component removed.

The Rch audio signal is processed in the same way as Lch. As in the first embodiment, noise-removed Lch and Rch audio signals are thereby obtained. Instead of the DCT, the modified discrete cosine transform (MDCT), the Hartley transform, or the discrete sine transform may be used. In particular, using the MDCT with a suitable window function and a frame overlap of 1/2 the frame length reduces the number of frequency-domain samples that must be computed because of the frame overlap. It is preferable to convert from the time domain to the frequency domain using one of these orthogonal transforms, but the transform need not be orthogonal. A non-orthogonal transform may be used as long as it can perform the same time-domain to frequency-domain conversion as an orthogonal transform.

Embodiment 5.
The configuration of the noise removal device according to the present embodiment will be described with reference to FIG. 9. FIG. 9 is a diagram illustrating the configuration of the noise removal device. The basic configuration of the noise removal device 100 is the same as that of the first, second, or fourth embodiment, so its description is omitted. Specifically, the present embodiment combines the second embodiment and the fourth embodiment. That is, relative to the second embodiment, the STFT processing units 13 to 15 and the IFFT processing units 21 and 23 are replaced with the window function processing + DCT processing units 63 to 65 and the inverse DCT processing units 81 and 83, so that the DCT and the inverse DCT are used. In other words, the composition ratio calculation unit 18, the delay units 31 and 32, and the switching units 33 and 34 are added to the configuration of the fourth embodiment.

With this configuration, the composition ratio can be calculated according to the noise component contained in the audio signal. Therefore, in addition to the effects of the first and fourth embodiments, the effects described in the second embodiment are obtained. Such control removes the noise component more effectively. Furthermore, when the noise component is low and the audio signal is output as is to the output terminals 103 and 104, part of the processing can be omitted. This simplifies the processing and reduces power consumption.

Embodiment 6.
The configuration of the noise removal device according to the present embodiment will be described with reference to FIG. 10. FIG. 10 is a diagram illustrating the configuration of the noise removal device. The basic configuration of the noise removal device 100 is the same as that of the first, third, or fourth embodiment, so its description is omitted. Specifically, the present embodiment combines the third embodiment and the fourth embodiment. That is, relative to the third embodiment, the STFT processing units 13a, 13b, 14a, 14b, and 15 and the IFFT processing units 21, 23, and 26 are replaced with the window function processing + DCT processing units 63a, 63b, 64a, 64b, and 65 and the inverse DCT processing units 81, 83, and 86, so that the DCT and the inverse DCT are used. In other words, this corresponds to the fourth embodiment with the microphones arranged as three channels: Lch, Rch, and Mch.

With this configuration, the same effects as in the first, third, and fourth embodiments are obtained. That is, noise can be removed effectively for each of three or more microphones. The present embodiment may of course also be combined with the second embodiment: a delay unit and a switching unit may be added to a configuration using three or more microphones. Doing so provides the same effects as the second embodiment.

Other embodiments.
In the first to sixth embodiments, the number of samples in a frame was described as fixed, but it may be varied over time. Changing it according to the state of the input audio signal can improve the sound quality after wind noise removal. Alternatively, the number of samples may be changed according to the wind noise detection state (the distribution of the low-frequency amplitude of the difference signal). The number of samples is increased when the wind noise is relatively small and its frequency distribution stays in the low range, and decreased when the wind noise is relatively large and its frequency distribution extends into the high range. Such control removes wind noise more effectively.
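
One way to picture this frame-length control is the sketch below. The thresholds, the three frame lengths, and the use of the highest noisy bin as a measure of spectral spread are all illustrative assumptions. The text states only the direction of the adjustment.

```python
def choose_frame_length(wind_level, highest_noisy_bin,
                        level_threshold=1.0, bin_threshold=16,
                        long_frame=256, default_frame=128, short_frame=64):
    """Pick a frame length from the detected wind noise.

    Weak wind confined to low bins -> more samples per frame.
    Strong wind reaching high bins -> fewer samples per frame.
    Mixed cases fall back to a default length (an assumption).
    """
    strong = wind_level >= level_threshold
    wide = highest_noisy_bin >= bin_threshold
    if not strong and not wide:
        return long_frame
    if strong and wide:
        return short_frame
    return default_frame
```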

In the above embodiments, the number of low-frequency data points subjected to coefficient multiplication/subtraction processing is fixed (32 in the embodiments), and the rate at which Ck decreases toward higher frequencies is also fixed. However, the frequency distribution of wind noise consists only of relatively low frequency components when the wind is gentle, and extends to higher frequencies as the wind grows stronger. On the other hand, although it depends on the distance between the two microphones, genuine stereo components exist above a certain frequency. Therefore, when the air volume (wind strength) is relatively small, the number of low-frequency data points subjected to the processing is reduced, or the attenuation of Ck is made steeper. When the air volume is relatively large, the number is increased, or the attenuation of Ck is made more gradual. In this way, the range of low-frequency components on which the coefficient multiplication/subtraction processing unit 16 performs subtraction is adjusted according to the air volume, which reduces wind noise effectively.
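
As a rough sketch of this adjustment, the function below derives both the number of processed bins and the slope of Ck from a wind level. The linear taper, the threshold values, and the bin limits are assumptions chosen for illustration. The text offers the bin count and the attenuation slope as alternatives, whereas this sketch ties them together for brevity.

```python
def coefficient_profile(wind_level, weak=0.2, strong=0.8,
                        min_bins=16, max_bins=64, c0=1.0):
    """Return the coefficients C_k for the low band selected by wind level."""
    # Map the wind level onto [0, 1] between the two assumed thresholds.
    t = min(1.0, max(0.0, (wind_level - weak) / (strong - weak)))
    bins = round(min_bins + t * (max_bins - min_bins))
    # Linear taper from c0 at k = 0 toward 0 at k = bins:
    # fewer bins gives a steeper decay, more bins a gentler one.
    return [c0 * (1.0 - k / bins) for k in range(bins)]
```

With these defaults, weak wind yields 16 coefficients that fall off quickly, and strong wind yields 64 coefficients that fall off slowly.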

The air volume is detected from the low-frequency amplitude data |Ys(t, k×f0)| (k = 0 to 20) of the difference signal, using their sum, their sum of squares, or a weighted average that gives greater weight to the lower-frequency data. Alternatively, as in the first embodiment, a wind pressure sensor may be used. Also as in the first embodiment, the difference signal may be passed through a BPF and the air volume calculated from the peak-detected absolute value of its output. In that case the BPF passband is set to, for example, 100 Hz to 1 kHz, where the main components of wind noise lie.
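
The three spectral measures named above can be written compactly. In this sketch, `diff_mag` is assumed to hold |Ys(t, k×f0)| for one frame, indexed by k. The linearly decreasing weights in the weighted average are an assumption, since the text says only that lower frequencies receive greater weight.

```python
def wind_level_sum(diff_mag, low_bins=21):
    """Sum of the difference-signal amplitudes for k = 0 .. low_bins - 1."""
    return sum(diff_mag[:low_bins])

def wind_level_power(diff_mag, low_bins=21):
    """Sum of squared amplitudes over the same low band."""
    return sum(m * m for m in diff_mag[:low_bins])

def wind_level_weighted(diff_mag, low_bins=21):
    """Weighted average that emphasizes the lowest bins.

    The weights fall linearly from low_bins at k = 0 to 1 at the top
    of the band (an illustrative choice).
    """
    band = diff_mag[:low_bins]
    weights = [low_bins - k for k in range(len(band))]
    return sum(w * m for w, m in zip(weights, band)) / sum(weights)
```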

In the coefficient multiplication/subtraction processing described above, each data value |Ys(t, k×f0)| (k = 0 to 31) is multiplied by a predetermined coefficient Ck, and the product is subtracted from the amplitude data of one channel signal (for example, the Lch data |YL(t, k×f0)|). Similar processing may be used instead. For example, |Ys(t, k×f0)| may be converted into a value that is small when |Ys(t, k×f0)| is small and large when it is large, and the amplitude data of the one channel signal may be divided by that value.
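
Both variants are sketched below for one frame of low-band amplitude data. Two details are assumptions rather than statements from this passage: flooring the subtraction result at zero, and using 1 + gain × |Ys| as the monotonic mapping for the division variant.

```python
def subtract_variant(audio_mag, diff_mag, coeffs):
    """Compute |Y_L| - C_k * |Y_s| per bin, floored at zero (assumed)."""
    return [max(0.0, a - c * d)
            for a, d, c in zip(audio_mag, diff_mag, coeffs)]

def divide_variant(audio_mag, diff_mag, gain=1.0):
    """Divide |Y_L| by a value that grows with |Y_s|.

    The divisor 1 + gain * |Y_s| is one hypothetical mapping: it leaves
    the audio amplitude untouched when the difference signal is zero
    and attenuates it more as the difference signal grows.
    """
    return [a / (1.0 + gain * d) for a, d in zip(audio_mag, diff_mag)]
```

The division form never produces negative amplitudes, so it needs no flooring step.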

The microphones need not be Lch and Rch microphones. It suffices to have microphones for a plurality of channels, for example two or more microphones placed close to each other. Noise may also be removed from only one of Rch and Lch. Furthermore, noise may be removed from audio signals from a microphone array in which microphones are arranged in an array.

The noise removal processing described above may be executed by a computer program. The program can be stored on various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples include magnetic recording media (for example, flexible disks, magnetic tapes, and hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path, such as an electric wire or optical fiber, or via a wireless communication path.

The embodiments of the present invention include not only the case where the functions of the above-described embodiments are realized by a computer executing a program that realizes those functions, but also the case where the program realizes those functions in cooperation with an OS (Operating System) or application software running on the computer.

DESCRIPTION OF SYMBOLS
100 Noise removal device
11 Subtractor
13 to 15 STFT processing units
16, 17 Coefficient multiplication/subtraction processing units
18 Composition ratio calculation unit
63 to 65 Window function processing + DCT processing units
21, 23 IFFT processing units
22, 24 Waveform synthesis units
31, 32 Delay units
33, 34 Switching units
41 Constant multiplier
42 Subtractor

Claims

A noise removal device comprising:
a signal calculation unit that calculates a difference signal between an audio signal of one channel and an audio signal of another channel among a plurality of audio signals input from a plurality of audio channels;
a transform unit that divides the audio signal and the difference signal in the time domain into frames and then transforms them into an audio signal and a difference signal in the frequency domain;
a subtraction processing unit that generates a subtraction signal in the frequency domain based on the difference signal in the frequency domain and the audio signal in the frequency domain; and
an inverse transform unit that inversely transforms the subtraction signal in the frequency domain into a time signal in the time domain.

The noise removal device according to claim 1, wherein the subtraction processing unit calculates the subtraction signal by subtracting, from a value corresponding to the magnitude of the data value of the audio signal in the frequency domain, a value obtained by multiplying a value corresponding to the magnitude of the data value of the difference signal in the frequency domain by a coefficient.

The noise removal apparatus according to claim 1, wherein the subtraction processing unit generates the subtraction signal by performing a subtraction process only on a low frequency component of the audio signal and the difference signal.

The noise removal device according to claim 3, wherein a range of the low frequency component in which the subtraction process is performed is changed according to an air volume.

The noise removal device according to claim 1, further comprising a switching unit that switches between, and outputs, the time signal inversely transformed by the inverse transform unit and the audio signal input from the audio channel.

The noise removal device according to claim 5, wherein when the switching unit outputs the audio signal, the inverse transformation process by the inverse transformation unit is omitted.

The noise removal device according to claim 1, wherein, depending on the magnitude of the data value of the difference signal in the low frequency range, at least one of the conversion of the audio signal into a frequency signal by the transform unit, the subtraction processing by the subtraction processing unit, and the inverse transform processing by the inverse transform unit is omitted.

The noise removal device according to any one of claims 1 to 7, wherein the value of the subtraction signal at an arbitrary frequency is calculated based on values of the difference signal at frequencies including those before and after the arbitrary frequency.

The noise removal device according to claim 1, wherein the transform unit performs a Fourier transform on the audio signal and the difference signal in the time domain to transform them into the audio signal and the difference signal in the frequency domain, and
the subtraction signal is calculated using an amplitude spectrum or a power spectrum of the Fourier transform.

The noise removal device according to claim 1, wherein the transform unit calculates the audio signal and the difference signal in the frequency domain with positive or negative signs, and
the subtraction processing unit calculates the subtraction signal by using the absolute values or the squared values of the audio signal and the difference signal and attaching the sign of the audio signal in the frequency domain to the result.

A noise removal method comprising the steps of:
calculating a difference signal between an audio signal of one channel and an audio signal of another channel among a plurality of audio signals input from a plurality of audio channels;
dividing the audio signal and the difference signal in the time domain into frames and then transforming them into an audio signal and a difference signal in the frequency domain;
generating a subtraction signal in the frequency domain based on the difference signal in the frequency domain and the audio signal in the frequency domain; and
inversely transforming the subtraction signal in the frequency domain into a time signal in the time domain.

A program for causing a computer to execute a noise removal method for removing noise,
the noise removal method comprising the steps of:
calculating a difference signal between an audio signal of one channel and an audio signal of another channel among a plurality of audio signals input from a plurality of audio channels;
dividing the audio signal and the difference signal in the time domain into frames and then transforming them into an audio signal and a difference signal in the frequency domain;
generating a subtraction signal in the frequency domain based on the difference signal in the frequency domain and the audio signal in the frequency domain; and
inversely transforming the subtraction signal in the frequency domain into a time signal in the time domain.