JP5187666B2 - Noise suppression device and program

Info

Publication number: JP5187666B2
Application number: JP2009001470A
Authority: JP
Inventors: 益永上村; 洋猿渡; 多伸近藤
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2009-01-07
Filing date: 2009-01-07
Publication date: 2013-04-24
Anticipated expiration: 2029-01-07
Also published as: JP2010160246A

Description

The present invention relates to a technique for suppressing noise in a mixture of a target sound and noise.

Techniques for suppressing noise in a mixture of a target sound and noise have been proposed. For example, Non-Patent Document 1 discloses the spectral subtraction (SS) method, which subtracts a noise spectrum from the spectrum of an acoustic signal. Non-Patent Document 2 discloses the MMSE-STSA (minimum mean-square error short-time spectral amplitude) method, which multiplies the spectrum of an acoustic signal by a spectral gain chosen so that the target sound (speech) is emphasized.

However, methods that suppress noise in the frequency domain, such as those of Non-Patent Documents 1 and 2, have the problem that components left scattered over the time axis and the frequency axis after noise suppression are perceived by the listener as harsh musical noise. Non-Patent Document 3 therefore discloses a technique for removing musical noise after noise suppression.

Non-Patent Document 1: S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction", IEEE Trans. Acoustics, Speech and Signal Processing, vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979
Non-Patent Document 2: Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator", IEEE Trans. Acoustics, Speech and Signal Processing, vol. ASSP-32, no. 6, pp. 1109-1121, Dec. 1984
Non-Patent Document 3: Tomomi Abe, Mitsuharu Matsumoto, Shuji Hashimoto, "Musical noise removal by time-frequency M conversion", Proceedings of the Acoustical Society of Japan, 3-6-9, pp. 727-730, March 2008

However, the degree of musical noise generated after noise suppression differs according to the acoustic characteristics of the acoustic signal. Therefore, even if the technique of Non-Patent Document 3 reduces the musical noise generated in one section of the acoustic signal, it does not necessarily reduce the musical noise sufficiently in another section with different acoustic characteristics. Against this background, one object of the present invention is to reduce musical noise after noise suppression effectively even when the acoustic characteristics of the acoustic signal change.

To solve the above problem, a noise suppression device according to the present invention comprises: a plurality of noise suppression means that execute mutually different noise suppression processes on an acoustic signal; index calculation means that calculates, for each noise suppression process executed by each noise suppression means, a kurtosis index value indicating the degree to which the kurtosis of the frequency distribution (histogram) of the intensity of the acoustic signal has changed between before and after the noise suppression process; and selection means that selects one of the plurality of noise suppression processes according to the kurtosis index values calculated by the index calculation means. For example, the selection means selects, from among the plurality of noise suppression processes, the one for which the change in kurtosis indicated by the kurtosis index value is small.

In the above configuration, the noise suppression process selected by the selection means changes according to the kurtosis index value, which indicates the degree to which the kurtosis of the frequency distribution of the intensity of the acoustic signal has changed between before and after the noise suppression process (that is, the degree to which musical noise has occurred). Musical noise after noise suppression can therefore be reduced effectively even when the acoustic characteristics of the acoustic signal change.

A noise suppression device according to a preferred aspect of the present invention comprises signal classification means that divides the acoustic signal on the time axis into noise sections and target sound sections, and weighting means that weights each kurtosis index value using a weight for each noise suppression process that differs between noise sections and target sound sections; the selection means selects a noise suppression process according to the weighted kurtosis index values. This aspect has the advantage that a noise suppression process suited to each of the noise sections and the target sound sections can be selected stably even when the kurtosis index values are close to one another.

Subtractive noise suppression (for example, the spectral subtraction method), which subtracts a noise spectrum from the spectrum of the acoustic signal, is well suited to suppressing musical noise in sections where speech is dominant. Multiplicative noise suppression (for example, the MMSE-STSA method or the MAP method), which multiplies the spectrum of the acoustic signal by a spectral gain that emphasizes the target sound, is well suited to suppressing musical noise in sections where noise (especially stationary noise) is dominant. Therefore, when the noise suppression processes that are candidates for selection by the selection means include both a subtractive process and a multiplicative process, the occurrence of musical noise can be suppressed both in sections of the acoustic signal where speech is dominant and in sections where noise is dominant.

The noise suppression device according to each of the above aspects may be realized by hardware (an electronic circuit) such as a DSP (digital signal processor) dedicated to noise suppression, or by cooperation between a general-purpose arithmetic processing device such as a CPU (central processing unit) and a program. A program according to the present invention causes a computer to execute: a plurality of noise suppression processes executed individually on an acoustic signal; an index calculation process that calculates, for each noise suppression process, a kurtosis index value indicating the degree to which the kurtosis of the frequency distribution of the intensity of the acoustic signal has changed between before and after that noise suppression process; and a selection process that selects one of the plurality of noise suppression processes according to the kurtosis index values calculated in the index calculation process. This program provides the same operation and effects as the noise suppression device according to each aspect of the present invention. The program may be provided to a user stored on a computer-readable recording medium and installed on a computer, or may be distributed from a server device over a communication network and installed on a computer.

FIG. 1 is a block diagram of a noise suppression device according to a first embodiment of the present invention.
FIG. 2 is a conceptual diagram for explaining the division of an acoustic signal.
FIG. 3 is a block diagram of the first noise suppression unit.
FIG. 4 is a block diagram of the second noise suppression unit.
FIG. 5 is a conceptual diagram showing how the frequency distribution of the intensity of an acoustic signal changes as a result of noise suppression.
FIG. 6 is a block diagram of the index calculation unit.
FIG. 7 is a graph of the kurtosis ratio in a noise section.
FIG. 8 is a graph of the kurtosis ratio in a target sound section.
FIG. 9 is a table of the coefficients applied to the noise suppression processes.
FIG. 10 is a block diagram of a noise suppression device according to a second embodiment of the present invention.

<A: First Embodiment>
FIG. 1 is a block diagram of a noise suppression device according to the first embodiment of the present invention. The noise suppression device 100 is supplied with a time-domain acoustic signal VIN representing the waveform of a mixture of a target sound and noise. The source of the acoustic signal VIN (not shown) is, for example, a sound collecting device that generates the acoustic signal VIN from ambient sound, or a playback device that reads the acoustic signal VIN from a recording medium and outputs it. The noise suppression device 100 generates an acoustic signal VOUT by suppressing the noise in the acoustic signal VIN. The acoustic signal VOUT is supplied to a sound emitting device (not shown) such as a speaker or headphones and reproduced as sound waves.

The noise suppression device 100 is realized by a computer system including an arithmetic processing device 12 and a storage device 14. The storage device 14 stores a program for generating the acoustic signal VOUT from the acoustic signal VIN, together with various data. Any known recording medium, such as a semiconductor recording medium or a magnetic recording medium, may be used as the storage device 14.

By executing the program stored in the storage device 14, the arithmetic processing device 12 functions as a plurality of elements (a frequency analysis unit 20, a suppression processing unit 30, an index calculation unit 42, a selection unit 44, and a waveform synthesis unit 46). A configuration in which an electronic circuit (DSP) dedicated to processing the acoustic signal VIN realizes each element of the arithmetic processing device 12, or a configuration in which the elements of the arithmetic processing device 12 are distributed over a plurality of integrated circuits, may also be adopted.

As shown in FIG. 2, the frequency analysis unit 20 in FIG. 1 calculates a frequency spectrum Xm(n) for each of a plurality of frames FR into which the acoustic signal VIN is divided on the time axis. The symbol n denotes the n-th frequency fn among N frequencies (frequency bins) f1 to fN set discretely on the frequency axis (n = 1 to N), and the symbol m denotes the number of the frame FR. Any known technique (for example, the short-time Fourier transform) may be used to calculate the frequency spectrum Xm(n). The frequency spectrum Xm(n) of the m-th frame FR is the sum of the frequency spectrum Sm(n) of the target sound and the frequency spectrum Nm(n) of the noise (equation (1)).
$$X_m(n) = S_m(n) + N_m(n) \tag{1}$$
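As a rough illustration of the framing and per-frame spectrum calculation described above, the following Python sketch splits a sampled signal into overlapping frames and computes one spectrum per frame. The function name `frame_spectra`, the Hann window, the hop size, and the naive DFT are assumptions made for this illustration; the text only requires that some known technique, such as the short-time Fourier transform, be used. Indices in the code are 0-based, whereas the text counts n from 1.

```python
import cmath
import math

def frame_spectra(signal, frame_len, hop):
    """Return a list of spectra, one per frame: spectra[m][n] plays the role of X_m(n).

    The Hann window and the direct O(N^2) DFT are illustrative assumptions.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        bins = []
        for n in range(frame_len // 2 + 1):  # non-negative frequencies only
            acc = 0j
            for i, x in enumerate(frame):
                acc += x * cmath.exp(-2j * math.pi * n * i / frame_len)
            bins.append(acc)
        spectra.append(bins)
    return spectra
```

For a sinusoid that falls exactly on one bin, the largest magnitude in each frame appears at that bin, which is a quick sanity check of the framing.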

The suppression processing unit 30 in FIG. 1 executes a plurality of mutually different noise suppression processes in parallel on the frequency spectrum Xm(n) (the acoustic signal VIN) generated by the frequency analysis unit 20. As shown in FIG. 1, the suppression processing unit 30 includes a first noise suppression unit 31 and a second noise suppression unit 32. The first noise suppression unit 31 generates a frequency spectrum Ym(n)_1 for each frame FR by executing a first noise suppression process on the frequency spectrum Xm(n) of that frame. The second noise suppression unit 32 generates a frequency spectrum Ym(n)_2 for each frame FR by executing a second noise suppression process on the frequency spectrum Xm(n).

The first noise suppression process subtracts the frequency spectrum of the noise estimated from the acoustic signal VIN (hereinafter the "estimated noise spectrum") ψm(n) from the frequency spectrum Xm(n) of the acoustic signal VIN. The second noise suppression process multiplies the frequency spectrum Xm(n) of the acoustic signal VIN by a spectral gain Gm(n) chosen so that the target sound in the acoustic signal VIN is emphasized. That is, the first noise suppression process is a subtractive noise suppression process (the spectral subtraction method), and the second noise suppression process is a multiplicative noise suppression process.

FIG. 3 is a block diagram of the first noise suppression unit 31. As shown in FIG. 3, the first noise suppression unit 31 includes a noise estimation unit 312 and a subtraction unit 314. The noise estimation unit 312 estimates the estimated noise spectrum (a power spectrum) ψm(n) for each frame FR. Any known technique may be used to generate the estimated noise spectrum ψm(n) (that is, to estimate the noise). For example, the noise estimation unit 312 treats the frequency spectrum Xm(n) of each frame FR in a noise section of the acoustic signal VIN, where no target sound is present, as the noise frequency spectrum Nm(n), and generates the estimated noise spectrum ψm(n) by evaluating the following equation (2). The symbol E in equation (2) denotes averaging (summation) over a plurality of frames FR.
$$\psi_m(n) = E\{ |N_m(n)|^2 \} \tag{2}$$
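Equation (2) can be sketched as a per-bin average of the power over noise-only frames. In the following Python sketch the function name `estimate_noise_power` is hypothetical, and the noise-only frames are assumed to have been identified already; how noise sections are detected is outside the scope of this sketch.

```python
def estimate_noise_power(noise_frames):
    """Average |N_m(n)|^2 over the given frames, per frequency bin (equation (2)).

    noise_frames: list of spectra (lists of complex bins) taken from a
    section assumed to contain noise only.
    """
    if not noise_frames:
        raise ValueError("at least one noise-only frame is required")
    num_bins = len(noise_frames[0])
    psi = [0.0] * num_bins
    for frame in noise_frames:
        for n, value in enumerate(frame):
            psi[n] += abs(value) ** 2
    return [p / len(noise_frames) for p in psi]
```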

The subtraction unit 314 in FIG. 3 calculates the frequency spectrum Ym(n)_1 by subtracting the estimated noise spectrum ψm(n) from the frequency spectrum Xm(n). The frequency spectrum Ym(n)_1 is expressed by equation (3) using the amplitude spectrum Pm(n)^(1/2) and the phase θx(n) of the frequency spectrum Xm(n).
$$Y_m(n)_1 = \sqrt{P_m(n)}\; e^{j\theta_x(n)} \tag{3}$$
The power spectrum Pm(n) in equation (3) is calculated by the following equations (4a) and (4b).
$$P_m(n) = |X_m(n)|^2 - \alpha_m \psi_m(n) \quad \text{if } |X_m(n)|^2 > \alpha_m \psi_m(n) \tag{4a}$$
$$P_m(n) = \beta_m \psi_m(n) \quad \text{otherwise} \tag{4b}$$

That is, when the intensity (power) |Xm(n)|² of the frequency spectrum Xm(n) exceeds a predetermined value (the product of the estimated noise spectrum ψm(n) and a coefficient αm), the power spectrum Pm(n) is set to the difference between the intensity |Xm(n)|² and that predetermined value (αm·ψm(n)), as shown in equation (4a). When the intensity |Xm(n)|² falls below the predetermined value (αm·ψm(n)), the power spectrum Pm(n) is set to the product of the estimated noise spectrum ψm(n) and a coefficient βm (βm·ψm(n)), as shown in equation (4b). The coefficient αm (subtraction coefficient) and the coefficient βm (flooring coefficient) are chosen as appropriate for the required degree of noise suppression.
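The subtraction with flooring described above can be sketched for a single frame as follows. The function name `spectral_subtraction` and the default values of `alpha` and `beta` are assumptions for illustration only; the text leaves both coefficients to be chosen according to the required degree of suppression. The case in which the power exactly equals the threshold is not specified in the text and is handled here by the flooring branch.

```python
import cmath
import math

def spectral_subtraction(x_frame, psi, alpha=2.0, beta=0.01):
    """Apply equations (3), (4a) and (4b) to one frame.

    x_frame: complex bins X_m(n); psi: estimated noise power per bin.
    alpha (subtraction coefficient) and beta (flooring coefficient)
    defaults are illustrative assumptions.
    """
    y = []
    for x, noise in zip(x_frame, psi):
        power = abs(x) ** 2
        if power > alpha * noise:
            p = power - alpha * noise   # equation (4a)
        else:
            p = beta * noise            # equation (4b): flooring
        # Equation (3): new amplitude sqrt(p), phase of the input bin.
        y.append(cmath.rect(math.sqrt(p), cmath.phase(x)))
    return y
```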

FIG. 4 is a block diagram of the second noise suppression unit 32. As shown in FIG. 4, the second noise suppression unit 32 includes a noise estimation unit 322, a gain calculation unit 324, and a multiplication unit 326. The noise estimation unit 322 generates the estimated noise spectrum ψm(n) by the same method as the noise estimation unit 312. A configuration in which the second noise suppression unit 32 reuses the estimated noise spectrum ψm(n) generated by the noise estimation unit 312 (that is, a configuration in which the noise estimation unit 322 is omitted) may also be adopted.

The gain calculation unit 324 in FIG. 4 calculates, for each frame FR, a spectral gain Gm(n) for emphasizing the target sound. The spectral gain Gm(n) is set closer to zero the more the noise dominates at frequency fn (and to a larger value the more the target sound dominates at frequency fn). The MMSE-STSA method of Non-Patent Document 2 is well suited to calculating the spectral gain Gm(n), as illustrated below. Specifically, the gain calculation unit 324 calculates the spectral gain Gm(n) by evaluating the following equation (5). The symbol Γ in equation (5) denotes the gamma function, the symbol I0 denotes the modified Bessel function of order zero, and the symbol I1 denotes the modified Bessel function of order one.
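For reference, the gain of the MMSE-STSA estimator as it is usually written in the literature (Non-Patent Document 2) is shown below in the notation of this description. It uses the gamma function and the modified Bessel functions I0 and I1 named above, together with the posterior SNR γm(n) and the prior SNR ξm(n). The auxiliary variable ν is introduced here only for readability, and the assumption is that equation (5) takes this standard form.

```latex
G_m(n) = \Gamma(1.5)\,\frac{\sqrt{\nu_m(n)}}{\gamma_m(n)}\,
         \exp\!\left(-\frac{\nu_m(n)}{2}\right)
         \left[\bigl(1+\nu_m(n)\bigr)\, I_0\!\left(\frac{\nu_m(n)}{2}\right)
               + \nu_m(n)\, I_1\!\left(\frac{\nu_m(n)}{2}\right)\right],
\qquad
\nu_m(n) = \frac{\xi_m(n)}{1+\xi_m(n)}\,\gamma_m(n)
```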

The posterior SNR (a posteriori SNR) γm(n) and the prior SNR (a priori SNR) ξm(n) in equation (5) are calculated by the following equations (6a) and (6b) from the frequency spectrum Xm(n) generated by the frequency analysis unit 20 and the estimated noise spectrum ψm(n) generated by the noise estimation unit 322. The function value F[x] in equation (6b) is set to x when the variable x is positive and to zero when x is zero or negative. The coefficient η in equation (6b) is a predetermined positive number less than 1.
$$\gamma_m(n) = \frac{|X_m(n)|^2}{\psi_m(n)} \tag{6a}$$
$$\xi_m(n) = \eta\,\frac{|S_{m-1}(n)|^2}{\psi_{m-1}(n)} + (1-\eta)\,F[\gamma_m(n) - 1] \tag{6b}$$
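Equations (6a) and (6b) can be sketched per frame as follows. The function name `snr_estimates` and the default `eta=0.98` are assumptions for illustration; the text only states that η is a positive number below 1. The power of the previous frame's target sound, |Sm-1(n)|², is taken as an input here, since the text does not state how it is obtained.

```python
def snr_estimates(x_power, psi, prev_s_power, prev_psi, eta=0.98):
    """Return (gammas, xis): posterior and prior SNR per bin.

    x_power:      |X_m(n)|^2 for the current frame
    psi:          estimated noise power psi_m(n)
    prev_s_power: |S_{m-1}(n)|^2 for the previous frame (supplied by the caller)
    prev_psi:     psi_{m-1}(n)
    eta:          smoothing coefficient, 0 < eta < 1 (default is an assumption)
    """
    gammas, xis = [], []
    for xp, noise, sp, prev_noise in zip(x_power, psi, prev_s_power, prev_psi):
        gamma = xp / noise                               # equation (6a)
        rectified = max(gamma - 1.0, 0.0)                # F[gamma - 1]
        xi = eta * sp / prev_noise + (1.0 - eta) * rectified  # equation (6b)
        gammas.append(gamma)
        xis.append(xi)
    return gammas, xis
```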

The multiplication unit 326 in FIG. 4 calculates the frequency spectrum Ym(n)_2 by multiplying the frequency spectrum Xm(n) by the spectral gain Gm(n) (Ym(n)_2 = Gm(n)·Xm(n)). Because the spectral gain Gm(n) is set so as to emphasize the target sound, the noise of the acoustic signal VIN is suppressed in the frequency spectrum Ym(n)_2. This concludes the description of the specific configuration of the suppression processing unit 30.

Musical noise may be scattered over the time axis and the frequency axis in the time series of the frequency spectrum Ym(n)_1 and in the time series of the frequency spectrum Ym(n)_2. The index calculation unit 42 in FIG. 1 calculates, for each frame FR, a kurtosis index value σm (σm_1, σm_2) that serves as a quantitative measure of the degree to which musical noise is generated by a noise suppression process. The kurtosis index value σm is described in detail below.

Part (A) of FIG. 5 shows the frequency distribution of the intensity (a probability density function with the intensity as the random variable) in a predetermined section of the acoustic signal VIN before noise suppression. As shown in part (A) of FIG. 5, the intensity of the acoustic signal VIN is distributed nonlinearly, such that the count decreases as the intensity increases from zero.

Part (B) of FIG. 5 shows the frequency distribution of the intensity after noise suppression. As can be seen by comparing parts (A) and (B) of FIG. 5, the count of intensities close to zero in the acoustic signal VIN (before noise suppression) tends to increase as a result of noise suppression. That is, the slope of the frequency distribution in the range where the intensity is near zero becomes steeper after noise suppression. If kurtosis is introduced as a measure of the shape of the frequency distribution (the steepness of its slope), the kurtosis KBm after the noise suppression process is higher than the kurtosis KAm before the noise suppression process (that of the acoustic signal VIN) (KBm > KAm). The kurtosis κ is a higher-order statistic calculated from the n-th order moments by the following equation (7).

An acoustic signal with a large amount of musical noise after noise suppression tends to have a high count of intensities near zero. Therefore, the more the count at zero intensity in the frequency distribution increases between before and after noise suppression, the more musical noise the noise suppression process can be judged to have generated. That is, the larger the change in the kurtosis κ between before and after noise suppression (KAm → KBm), the more musical noise is generated after noise suppression.

In view of this tendency, the index calculation unit 42 in FIG. 1 calculates kurtosis index values σm (σm_1, σm_2) that reflect the change in the kurtosis κ between before and after the processing by the suppression processing unit 30. The kurtosis index value σm_1 is a measure of the degree to which the kurtosis κ has changed between before and after the first noise suppression process of the first noise suppression unit 31, and is used as an index of the degree to which musical noise is generated in the frequency spectrum Ym(n)_1. The kurtosis index value σm_2 is a measure of the degree to which the kurtosis κ has changed between before and after the second noise suppression process of the second noise suppression unit 32, and is used as an index of the degree to which musical noise is generated in the frequency spectrum Ym(n)_2. The kurtosis κ of the frequency distribution of M intensities x1 to xM is calculated, for example, by the following method.

The frequency distribution of the M intensities x1 to xM is approximated by the function Ga(x; k, θ) of the following equation (8).

The coefficient C in equation (8) is defined as follows using the gamma function Γ(k).

Replacing the distribution function P(x) in the defining equation of the second-order moment μ2 with the function Ga(x; k, θ) of equation (8) yields the following equation (9).

As in the derivation of the second-order moment μ2, replacing the distribution function P(x) in the defining equation of the fourth-order moment μ4 with the function Ga(x; k, θ) of equation (8) yields the following equation (10).

Substituting the second-order moment μ2 of equation (9) and the fourth-order moment μ4 of equation (10) into equation (7) yields the following equation (11), which defines the kurtosis κ.
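The chain from equation (7) to equation (11) can be worked through under two assumptions that the surrounding text suggests but does not state outright: that the kurtosis of equation (7) is the ratio of the fourth-order raw moment to the square of the second-order raw moment, and that Ga(x; k, θ) is the gamma density with shape k and scale θ. Under those assumptions the derivation runs as follows; the exact form of the equations in the original publication may differ.

```latex
% Assumed form of (7): kurtosis from raw moments
\kappa = \frac{\mu_4}{\mu_2^{\,2}}, \qquad
\mu_n = \int_0^{\infty} x^{n} P(x)\,dx

% Assumed form of (8): gamma density and its normalizing coefficient
Ga(x; k, \theta) = C\, x^{k-1} e^{-x/\theta}, \qquad
C = \frac{1}{\Gamma(k)\,\theta^{k}}

% (9) and (10): substitute P(x) = Ga(x; k, \theta)
\mu_2 = \theta^{2}\,\frac{\Gamma(k+2)}{\Gamma(k)} = k(k+1)\,\theta^{2}, \qquad
\mu_4 = \theta^{4}\,\frac{\Gamma(k+4)}{\Gamma(k)} = k(k+1)(k+2)(k+3)\,\theta^{4}

% (11): the scale \theta cancels
\kappa = \frac{\Gamma(k)\,\Gamma(k+4)}{\Gamma(k+2)^{2}}
       = \frac{(k+2)(k+3)}{k(k+1)}
```

In this form the kurtosis depends only on the shape parameter k, so a distribution concentrated more sharply near zero (smaller k) has a larger κ.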

The kurtosis κ is calculated by evaluating equation (11) for the M intensities x1 to xM. The method of calculating the kurtosis κ is not, however, limited to this example. For instance, approximating the frequency distribution of the intensities x1 to xM by a predetermined function (for example, equation (8)) is not essential.
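Since fitting a function to the distribution is stated to be optional, one simple alternative is to estimate the kurtosis directly from sample moments. The following Python sketch assumes the kurtosis is the fourth raw moment divided by the square of the second raw moment; that definition and the function name `kurtosis` are assumptions for illustration, not the method of equation (11).

```python
def kurtosis(intensities):
    """Estimate kurtosis as mu4 / mu2**2 from raw sample moments.

    intensities: the M non-negative intensity values x_1 .. x_M.
    The raw-moment definition is an assumption made for this sketch.
    """
    m = len(intensities)
    if m == 0:
        raise ValueError("no intensities given")
    mu2 = sum(x ** 2 for x in intensities) / m
    mu4 = sum(x ** 4 for x in intensities) / m
    if mu2 == 0.0:
        raise ValueError("all intensities are zero; kurtosis is undefined")
    return mu4 / mu2 ** 2
```

A set of intensities in which most values sit at zero and a few are large yields a larger value than a set of equal intensities, which matches the tendency described for signals with musical noise.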

FIG. 6 is a block diagram of the index calculation unit 42. The first kurtosis calculation unit 421 in FIG. 6 calculates, from the time series of the frequency spectrum Xm(n), the kurtosis KAm of the frequency distribution of the intensity of the acoustic signal VIN (the kurtosis before noise suppression). Specifically, the first kurtosis calculation unit 421 calculates the kurtosis KAm by evaluating equation (11) for M intensities x1 to xM extracted from the time series of the frequency spectrum Xm(n). As shown in FIG. 2, the intensities x1 to xM used to calculate the kurtosis KAm are the intensities |Xm(n)|² in the frequency spectra Xm(n) of the τ frames FR ending with the m-th frame FR (M = τ × N).

The second kurtosis calculation unit 422 in FIG. 6 calculates the kurtosis KBm_1 after the first noise suppression process from the time series of the frequency spectrum Ym(n)_1 generated by the first noise suppression unit 31. Specifically, the second kurtosis calculation unit 422 calculates the kurtosis KBm_1 by evaluating equation (11) with the M intensities |Ym(n)_1|² that make up the frequency spectra Ym(n)_1 of the τ frames FR ending with the m-th frame FR as the intensities x1 to xM. The third kurtosis calculation unit 423 in FIG. 6 calculates the kurtosis KBm_2 after the second noise suppression process from the time series of the frequency spectrum Ym(n)_2 by the same method as the second kurtosis calculation unit 422.
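The pooling of intensities over the τ most recent frames can be sketched with a fixed-length queue. The class name `WindowedKurtosis` is hypothetical, and the kurtosis is again estimated from raw sample moments rather than from equation (11), which is an assumption of this sketch. Until τ frames have been seen, the sketch simply pools the frames available so far; the text does not say how the start of the signal is handled.

```python
from collections import deque

def kurtosis(values):
    """mu4 / mu2**2 from raw sample moments (assumed definition)."""
    m = len(values)
    mu2 = sum(v ** 2 for v in values) / m
    mu4 = sum(v ** 4 for v in values) / m
    if mu2 == 0.0:
        raise ValueError("all intensities are zero; kurtosis is undefined")
    return mu4 / mu2 ** 2

class WindowedKurtosis:
    """Kurtosis of the bin powers pooled over the last tau frames."""

    def __init__(self, tau):
        self._frames = deque(maxlen=tau)  # oldest frame is dropped automatically

    def update(self, spectrum):
        """Add the spectrum of frame m and return the kurtosis for that frame."""
        self._frames.append([abs(v) ** 2 for v in spectrum])
        pooled = [p for frame in self._frames for p in frame]  # up to tau * N values
        return kurtosis(pooled)
```

One instance fed with Xm(n) would give KAm, and one instance per suppression process fed with Ym(n)_1 or Ym(n)_2 would give KBm_1 or KBm_2.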

The first index calculation unit 425 in FIG. 6 calculates the kurtosis index value σm_1 from the kurtosis KAm calculated by the first kurtosis calculation unit 421 and the kurtosis KBm_1 calculated by the second kurtosis calculation unit 422. As shown in the following equation (12), the kurtosis index value σm_1 is defined by a function Fa whose variable is the ratio Rm_1 of the kurtosis KBm_1 to the kurtosis KAm (hereinafter the "kurtosis ratio") (Rm_1 = KBm_1 / KAm). The function Fa is a monotonically increasing function that defines the relationship between the kurtosis index value σm_1 and the kurtosis ratio Rm_1.
$$\sigma_{m\_1} = F_a(R_{m\_1}) = F_a\!\left(\frac{K_{Bm\_1}}{K_{Am}}\right) \tag{12}$$
As described above with reference to FIG. 5, the smaller the kurtosis ratio Rm_1 (the change from the kurtosis KAm to the kurtosis KBm_1), the less musical noise the first noise suppression process generates. Therefore, the smaller the kurtosis index value σm_1, the less musical noise there is after the first noise suppression process. That is, the kurtosis index value σm_1 is an index value (measure) of the degree to which musical noise is generated by the first noise suppression process.

The second index calculation unit 426 of FIG. 6 calculates the kurtosis index value σm_2 from the kurtosis KAm calculated by the first kurtosis calculation unit 421 and the kurtosis KBm_2 calculated by the third kurtosis calculation unit 423. As shown in Equation (13) below, the kurtosis index value σm_2 is defined by the function Fa whose variable is the kurtosis ratio Rm_2 of the kurtosis KBm_2 to the kurtosis KAm (Rm_2 = KBm_2 / KAm). Accordingly, a smaller kurtosis index value σm_2 (that is, a smaller change from the kurtosis KAm to the kurtosis KBm_2, as indicated by the kurtosis ratio Rm_2) indicates less musical noise generated by the second noise suppression process. In other words, the kurtosis index value σm_2 is an index value of the degree to which musical noise arises from the second noise suppression process.
σm_2 = Fa(Rm_2)
= Fa(KBm_2 / KAm) ...... (13)

The selection unit 44 of FIG. 1 selects, for each frame FR, either the first noise suppression process or the second noise suppression process (that is, either the first noise suppression unit 31 or the second noise suppression unit 32) according to the kurtosis index values σm_1 and σm_2 calculated by the index calculation unit 42. Specifically, the noise suppression process corresponding to the smaller of the kurtosis index values σm_1 and σm_2 is selected. That is, when the kurtosis index value σm_1 is smaller than the kurtosis index value σm_2, the first noise suppression process (first noise suppression unit 31) is selected, and when the kurtosis index value σm_2 is smaller than the kurtosis index value σm_1, the second noise suppression process (second noise suppression unit 32) is selected. The selection unit 44 sequentially outputs, frame by frame, the frequency spectrum Ym(n) (Ym(n)_1 or Ym(n)_2) generated by the selected noise suppression process to the waveform synthesis unit 46. That is, the selection unit 44 outputs the frequency spectrum Ym(n)_1 in a frame FR for which the first noise suppression process is selected, and outputs the frequency spectrum Ym(n)_2 in a frame FR for which the second noise suppression process is selected.
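The per-frame selection rule can be sketched in Python as follows. The sketch takes the natural logarithm as one possible monotonically increasing function Fa, which is an assumption, and it resolves the case σm_1 = σm_2 (which the description leaves open) in favor of the second process, also an assumption. The function names are hypothetical.

```python
import math


def kurtosis_index(kb, ka, fa=math.log):
    """sigma = Fa(KB / KA); Fa defaults to log (an assumed choice)."""
    return fa(kb / ka)


def select_process(ka, kb1, kb2):
    """Return 1 or 2: the process whose kurtosis index value is smaller.

    A tie selects process 2; the tie rule is an assumption of this sketch.
    """
    sigma1 = kurtosis_index(kb1, ka)
    sigma2 = kurtosis_index(kb2, ka)
    return 1 if sigma1 < sigma2 else 2


# Kurtosis before suppression is 3.0; process 2 changes it less here.
print(select_process(ka=3.0, kb1=6.0, kb2=4.5))
```

Because Fa is monotonically increasing, comparing the index values is equivalent to comparing the kurtosis ratios themselves.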

The waveform synthesis unit 46 synthesizes the time-domain acoustic signal VOUT from the frequency spectra Ym(n) (Ym(n)_1, Ym(n)_2) selected by the selection unit 44 for each frame FR. Specifically, the waveform synthesis unit 46 converts each frequency spectrum Ym(n) into a time-domain signal by an inverse Fourier transform, and calculates the acoustic signal VOUT by overlapping and adding these signals on the time axis across the plurality of frames FR.
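The inverse transform and overlap-add steps can be sketched as follows. This is a minimal illustration, assuming that each frame is given as a full N-point complex spectrum and that consecutive frames are offset by a fixed hop size; the parameter `hop` and the function names are hypothetical, and window compensation is not addressed. A naive inverse DFT is used in place of a fast transform only to keep the sketch dependency-free.

```python
import cmath


def inverse_dft(spectrum):
    """Naive inverse DFT of a full N-point spectrum; returns the real part."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


def overlap_add(frames, hop):
    """Add time-domain frames on a common time axis, `hop` samples apart."""
    if not frames:
        return []
    out = [0.0] * (hop * (len(frames) - 1) + len(frames[0]))
    for i, frame in enumerate(frames):
        for j, value in enumerate(frame):
            out[i * hop + j] += value
    return out


def synthesize(spectra, hop):
    """Inverse-transform each selected spectrum, then overlap-add."""
    return overlap_add([inverse_dft(s) for s in spectra], hop)


# Two frames of length 4 with 50% overlap; each frame is a constant 1.0.
print(synthesize([[4, 0, 0, 0], [4, 0, 0, 0]], hop=2))
```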

In the embodiment described above, whichever of the first noise suppression process and the second noise suppression process has the smaller kurtosis index value σm is selectively used to generate the acoustic signal VOUT. Therefore, compared with a configuration that executes only the first noise suppression process or a configuration that executes only the second noise suppression process, musical noise can be reduced effectively even when the acoustic characteristics of the acoustic signal VIN change, as detailed below.

FIG. 7 is a graph showing the relationship between the SN ratio and the kurtosis ratio Rm (Rm_1, Rm_2) in a noise section of the acoustic signal VIN (a section in which noise dominates the target sound). FIG. 8 is a graph showing the relationship between the SN ratio and the kurtosis ratio Rm in a target sound section of the acoustic signal VIN (a section in which the target sound dominates). The coefficients (αm, βm, η) applied to the noise suppression processes were set according to the SN ratio as shown in FIG. 9.

As shown in FIG. 7, in the noise section, the kurtosis ratio Rm_2 across the second noise suppression process is lower than the kurtosis ratio Rm_1 across the first noise suppression process. That is, the second noise suppression process (MMSE-STSA method) leaves less musical noise after noise suppression than the first noise suppression process (SS method). Therefore, in each frame FR within the noise section, the selection unit 44 selects the second noise suppression process (frequency spectrum Ym(n)_2) performed by the second noise suppression unit 32.

On the other hand, as shown in FIG. 8, in the target sound section, the kurtosis ratio Rm_1 across the first noise suppression process is lower than the kurtosis ratio Rm_2 across the second noise suppression process. That is, the first noise suppression process leaves less musical noise after noise suppression than the second noise suppression process. Therefore, in each frame FR within the target sound section, the selection unit 44 selects the first noise suppression process (frequency spectrum Ym(n)_1) performed by the first noise suppression unit 31.

As described above, the noise suppression process applied to the generation of the acoustic signal VOUT changes between the noise section and the target sound section (that is, it changes according to the acoustic characteristics of the acoustic signal VIN). Consequently, musical noise in the noise section is reduced compared with a configuration that executes the first noise suppression process over the entire acoustic signal VIN, and musical noise in the target sound section is reduced compared with a configuration that executes the second noise suppression process over the entire acoustic signal VIN.

Specific usage situations of the noise suppression device 100 are illustrated below. First, assume that a speaker passes near a sound collection point (for example, a sound collection device) in a space where stationary noise, such as the operating sound of air conditioning equipment, is present. While the speaker is sufficiently far from the sound collection point, only noise is collected, so the acoustic signal VOUT is generated by the second noise suppression process. While the speaker is close to the sound collection point, the first noise suppression process and the second noise suppression process are switched as needed according to whether the speaker is speaking. That is, the first noise suppression process is selected in sections where the speaker's voice is dominant, and the second noise suppression process is selected in sections where noise is dominant (for example, sections where speech pauses). When the speaker moves away from the sound collection point (that is, when noise becomes dominant), the acoustic signal VOUT is again generated by the second noise suppression process.

Next, assume that a state with stationary noise, such as the operating sound of air conditioning equipment, and a state with less stationary noise, such as a mixture of many voices, alternate as needed (for example, at an exhibition attended by many speakers). In the state with stationary noise, the kurtosis ratio Rm_2 is lower than the kurtosis ratio Rm_1 as in FIG. 7, so generation of the acoustic signal VOUT by the second noise suppression process is selected. In the state with less stationary noise (that is, a state closer to FIG. 8 than to FIG. 7), the kurtosis ratio Rm_1 is lower than the kurtosis ratio Rm_2, so generation of the acoustic signal VOUT by the first noise suppression process is selected. That is, the first noise suppression process or the second noise suppression process is selected according to the acoustic characteristics of the noise.

<B: Second Embodiment>
Next, a second embodiment of the present invention will be described. In each of the following embodiments, elements whose operation and function are equivalent to those of the first embodiment are given the same reference signs as above, and their detailed descriptions are omitted as appropriate.

FIG. 10 is a block diagram of a noise suppression device 100A according to the second embodiment. As shown in FIG. 10, the noise suppression device 100A is configured by adding a signal classification unit 52 and a weighting unit 54 to the noise suppression device 100 of the first embodiment. The signal classification unit 52 divides the acoustic signal VIN on the time axis into noise sections and target sound sections. Any known technique may be used to distinguish noise sections from target sound sections.

The weighting unit 54 weights the kurtosis index value σm_1 and the kurtosis index value σm_2 calculated by the index calculation unit 42. That is, the weighting unit 54 multiplies the kurtosis index value σm_1 by a weight value w1 and multiplies the kurtosis index value σm_2 by a weight value w2. The weight values w1 and w2 are set variably according to the result of the classification by the signal classification unit 52. For example, in each frame FR within a target sound section, the weight value w1 is set to a value larger than the weight value w2, and in each frame FR within a noise section, the weight value w2 is set to a value larger than the weight value w1.

When the kurtosis index values σm_1 and σm_2 are close to each other, their order may reverse frequently within a short time. Therefore, in the configuration of the first embodiment (a configuration without the weighting unit 54), the first noise suppression process may be selected in a few frames FR within a noise section, where the second noise suppression process tends to be selected, and the second noise suppression process may be selected in a few frames FR within a target sound section, where the first noise suppression process tends to be selected. In sections where the noise suppression process changes momentarily in this way, the reproduced sound of the acoustic signal VOUT may be perceived as unnatural.

In the second embodiment, the weight value w1 of the kurtosis index value σm_1 and the weight value w2 of the kurtosis index value σm_2 are changed between noise sections and target sound sections. Therefore, even when the kurtosis index values σm_1 and σm_2 are close to each other, the first noise suppression process is selected in each frame FR within a target sound section, and the second noise suppression process is selected in each frame FR within a noise section. That is, momentary changes of the noise suppression process are prevented. Consequently, a reproduced sound that is perceptually more natural than in the first embodiment can be generated.

<C: Modifications>
Various modifications can be made to each of the embodiments illustrated above. Specific modifications are illustrated below. Two or more of the following modifications may be freely selected and combined.

(1) Modification 1
In each of the above embodiments, the spectral subtraction method was illustrated as the first noise suppression process and the MMSE-STSA method as the second noise suppression process, but the types of noise suppression process that are candidates for selection by the selection unit 44 are not limited to these examples.

For example, the noise suppression process disclosed in Hack-Yoon KIM, et al., "Speech Enhancement Based on Short-Time Spectral Amplitude Estimation with Two-Channel Beamformer", IEICE Trans. Fundamentals, Vol. E79-A, No. 12, December 1996 (hereinafter "noise suppression process A"), or the noise suppression process disclosed in Ryo Mukai, "Residual noise removal after sound source separation by non-stationary spectral subtraction", Acoustical Society of Japan 2001 Autumn Meeting, 2-6-14, p. 617-618 (hereinafter "noise suppression process B"), may be adopted as the first noise suppression process or the second noise suppression process.

In noise suppression process A, the target sound and the noise are spatially separated by addition (forming a beam in the target sound direction) and subtraction (forming a null in the target sound direction) of a plurality of acoustic signals VIN generated by a plurality of sound collection devices (a Griffiths-Jim type adaptive beamformer), and the frequency spectrum Ym(n) after noise suppression is generated by subtracting the frequency spectrum of the noise from the frequency spectrum of the target sound. In noise suppression process B, a separated signal in which the target sound is emphasized is generated by independent component analysis of a plurality of acoustic signals VIN, and the frequency spectrum Ym(n) after noise suppression is generated by subtracting the frequency spectrum of the noise (residual noise) estimated from the separated signal from the frequency spectrum of the separated signal. When a plurality of acoustic signals VIN are used for noise suppression, the kurtosis KAm before noise suppression is calculated from any one of the plurality of acoustic signals VIN.

Although subtraction-type noise suppression processes were illustrated above, the content of the multiplication-type noise suppression process may also be changed as appropriate. For example, a multiplication-type noise suppression process that uses the MAP (maximum a posteriori) estimation disclosed in T. Lotter and P. Vary, "Speech enhancement by MAP spectral amplitude estimation using a Super-Gaussian speech model", EURASIP Journal on Applied Signal Processing, vol. 2005, no. 7, p. 1110-1126, July 2005 to estimate the spectral gain Gm(n) may be adopted as the first noise suppression process or the second noise suppression process. Specifically, the spectral gain Gm(n) is calculated by Equation (14). The coefficients φ and τ in Equation (14) are constants (for example, τ = 2.5, φ = 1) that determine the shape of the probability distribution (probability density function) of the noise.

Furthermore, the noise suppression processes that are candidates for selection by the selection unit 44 are not limited to the subtraction type and the multiplication type. For example, a process that generates the frequency spectrum Ym(n) of an acoustic signal in which the target sound is emphasized (that is, an acoustic signal in which noise is suppressed) by independent component analysis of a plurality of acoustic signals VIN, or a process (beamformer) that generates the frequency spectrum Ym(n) after noise suppression by forming a beam in the direction of the target sound (or by forming a null in the direction of the noise), may also be adopted as the first noise suppression process or the second noise suppression process.

The plurality of noise suppression processes that are candidates for selection by the selection unit 44 need not differ in their noise suppression principle. For example, both the first noise suppression process and the second noise suppression process may be subtraction-type noise suppression processes of the same kind (Equation (4a)), with the coefficients applied to noise suppression (for example, the coefficient αm of Equation (4a) or the coefficient βm of Equation (4b)) differing between the first and second noise suppression processes. Likewise, both the first noise suppression process and the second noise suppression process may be multiplication-type noise suppression processes, with a coefficient that affects noise suppression (for example, the coefficient η of Equation (6b)) differing between the first and second noise suppression processes. Because the coefficients (αm, βm, η) that effectively reduce musical noise after noise suppression vary with the acoustic characteristics of the acoustic signal VIN, the intended effect of effectively reducing musical noise regardless of changes in the acoustic characteristics of the acoustic signal VIN is obtained even when a plurality of noise suppression processes of the same kind (that is, sharing a noise suppression principle) are the candidates for selection by the selection unit 44. As understood from the above, it suffices that the plurality of noise suppression processes that are candidates for selection by the selection unit 44 differ in the degree of musical noise generated after execution; whether their noise suppression principles are the same or different does not matter.

(2) Modification 2
In each of the above embodiments, the kurtosis KBm_1 was calculated from the frequency spectrum Ym(n)_1 actually generated by the first noise suppression process. However, as described below, a configuration that estimates the kurtosis KBm_1 from the estimated noise spectrum ψm(n) and the frequency spectrum Xm(n) before noise suppression may also be adopted.

Assume now that the subtraction unit 314 of FIG. 3 subtracts A times the estimated noise spectrum ψm(n) (that is, A·ψm(n)) from the frequency spectrum Xm(n) (that is, the coefficient αm of Equation (4a) is set to a predetermined value A). The function Gb(x; k, θ) that approximates the frequency distribution of the intensity of the acoustic signal VOUT after noise suppression is then expressed by Equation (15), which is Equation (8) with the intensity x replaced by the intensity (x + A).

As with Equation (10), Equation (16) is derived by substituting the function Gb(x; k, θ) of Equation (15) for the distribution function P(x) in the defining equation of the fourth-order moment μ4.

The term (x + A)^(k-1) in Equation (16) is Taylor-expanded as in Equation (17).

Substituting Equation (17) into Equation (16), with the higher-order terms of Equation (17) ignored for convenience, yields Equation (18), which approximates the fourth-order moment μ4.

For the second-order moment as well, Equation (19) is derived by substituting the function Gb(x; k, θ) of Equation (15) for the distribution function P(x) in the defining equation (see Equation (9)) and ignoring the higher-order terms of Equation (17).

Substituting the fourth-order moment μ4 of Equation (18) and the second-order moment μ2 of Equation (19) into Equation (7) yields Equation (20), which defines the kurtosis KBm_1 after noise suppression. The derivation of Equation (20) uses the relationship of Equation (21) below, which is obtained by normalizing the mean k·θ of the gamma distribution to 1. The second kurtosis calculation unit 422 calculates the kurtosis KBm_1 (an estimated value) by evaluating Equation (20). The predetermined value A in Equation (20) is chosen as appropriate so that the first noise suppression process achieves the desired effect (degree of noise suppression).
θ = 1 / k ...... (21)
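The effect of the shift by A on the kurtosis can be checked numerically with the sketch below. It rests on several assumptions that are not confirmed by this part of the description: that Equation (8) is the gamma density x^(k-1)·exp(-x/θ) / (Γ(k)·θ^k), that the moments are taken about zero over x > 0, and that the kurtosis of Equation (7) is μ4 / μ2^2. The sketch integrates the shifted density directly by the midpoint rule instead of using the Taylor approximation of Equations (17) to (20); the function names and the integration settings are hypothetical.

```python
import math


def shifted_gamma_moment(order, k, a, steps=20000):
    """n-th moment about zero of the assumed shifted gamma density on x > 0.

    Density (assumed): (x + a)^(k-1) * exp(-(x + a)/theta) / (Gamma(k) theta^k)
    with theta = 1 / k, i.e. the mean k*theta normalized to 1 (Equation (21)).
    """
    theta = 1.0 / k
    upper = 60.0 * theta * max(k, 1.0)   # truncation point of the integral
    h = upper / steps
    norm = math.gamma(k) * theta ** k
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                # midpoint rule avoids x = 0
        total += x ** order * (x + a) ** (k - 1.0) * math.exp(-(x + a) / theta)
    return total * h / norm


def shifted_gamma_kurtosis(k, a):
    """Assumed kurtosis: mu4 / mu2^2 of the shifted density."""
    mu2 = shifted_gamma_moment(2, k, a)
    mu4 = shifted_gamma_moment(4, k, a)
    return mu4 / (mu2 ** 2)


for a in (0.0, 0.5, 1.0):
    print(a, shifted_gamma_kurtosis(1.0, a))
```

For k = 1 the assumed density is exponential and the moments have a closed form, giving a kurtosis of 6·exp(A), so under these assumptions a larger subtraction amount A yields a larger kurtosis after suppression.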

As described above, calculating the kurtosis index value σm_1 does not require executing the first noise suppression process. The first noise suppression unit 31 therefore actually executes the first noise suppression process only for frames FR for which the selection unit 44 has selected the first noise suppression process. With this configuration, the first noise suppression process is omitted for frames FR for which the selection unit 44 selects the second noise suppression process, which reduces the processing load of the first noise suppression unit 31 (the arithmetic processing device 12).

(3) Modification 3
In each of the above embodiments, the noise suppression process was selected for each frame FR using the kurtosis index values σm (σm_1, σm_2) of that frame FR, but the period at which the kurtosis index value σm is calculated is arbitrary in the present invention. For example, the acoustic signal VIN may be divided into sections each consisting of a plurality of frames FR (hereinafter "unit sections"), with the calculation of the kurtosis index values σm and the selection of the noise suppression process executed once per unit section. That is, the index calculation unit 42 calculates the kurtosis index values σm (σm_1, σm_2) for the first frame FR of each unit section, and the selection unit 44 selects the noise suppression process by comparing these kurtosis index values σm. The noise suppression process selected by the selection unit 44 is maintained until new kurtosis index values σm are calculated in the first frame FR of the next unit section. This configuration reduces the processing load of the index calculation unit 42 and the selection unit 44. A configuration that averages the kurtosis index values σm calculated for each frame FR over a plurality of frames FR (that is, a configuration that smooths the temporal fluctuation of the kurtosis index value σm) is also suitable.
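Both variants of this modification can be sketched briefly. In the sketch below, the selection is made at the first frame of each unit section and held until the next unit section begins, and a trailing moving average is shown as one possible way of smoothing the index values. The section length, the averaging width, the choice of a trailing window, and the function names are all assumptions of the sketch.

```python
def select_per_unit_section(sigma1, sigma2, section_len):
    """Choose process 1 or 2 at the first frame of each unit section and
    keep that choice for the remaining frames of the section."""
    choices = []
    current = None
    for m, (s1, s2) in enumerate(zip(sigma1, sigma2)):
        if m % section_len == 0:          # first frame of a unit section
            current = 1 if s1 < s2 else 2
        choices.append(current)
    return choices


def moving_average(values, width):
    """Trailing average over up to `width` most recent frames."""
    smoothed = []
    for m in range(len(values)):
        window = values[max(0, m - width + 1):m + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


sigma1 = [1, 5, 5, 9, 0, 0]
sigma2 = [2, 2, 2, 3, 3, 3]
print(select_per_unit_section(sigma1, sigma2, section_len=3))
print(moving_average([1, 3, 5], width=2))
```

In the first call, the comparison at frames 1, 2, 4 and 5 is skipped, which is where the reduction in processing load comes from.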

(4) Modification 4
The method of calculating the kurtosis index values σm (σm_1, σm_2) may be changed as appropriate. For example, the degree of musical noise after noise suppression tends to correlate particularly strongly with the logarithm of the kurtosis ratio Rm (Rm_1, Rm_2). A configuration that calculates the kurtosis index value σm from the logarithm of the kurtosis ratio Rm is therefore also suitable. A configuration that calculates the kurtosis index value σm according to the amount of change (difference) in kurtosis across the noise suppression process may also be adopted. That is, the kurtosis index value σm_1 is set according to the difference (KBm_1 - KAm) between the kurtosis KBm_1 after execution of the first noise suppression process and the kurtosis KAm before execution, and the kurtosis index value σm_2 is set according to the difference between the kurtosis KBm_2 after execution of the second noise suppression process and the kurtosis KAm before execution.

In each of the above embodiments, the function Fa was used to calculate the kurtosis index value σm, but a configuration that uses the kurtosis ratio (Rm_1, Rm_2) or the amount of change in kurtosis (KBm_1 - KAm, KBm_2 - KAm) directly as the kurtosis index value σm (σm_1, σm_2) is also suitable.

Furthermore, the relationship between the magnitude of the kurtosis index value σm and the amount of musical noise (the degree of change in the kurtosis κ) may be changed. For example, when the function Fa of Equation (12) and Equation (13) is a monotonically decreasing function, a smaller kurtosis index value σm indicates more musical noise after noise suppression (that is, a larger change in kurtosis). In that case, the selection unit 44 selects the noise suppression process corresponding to the larger of the kurtosis index values σm_1 and σm_2. In short, the processing by the selection unit 44 is comprehensively understood as a process of selecting one of a plurality of noise suppression processes according to the respective kurtosis index values σm (more preferably, a process of selecting, from the plurality of noise suppression processes, the one whose kurtosis index value σm indicates the smaller change in kurtosis).
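The index variants of this modification, and the reversed comparison for a decreasing Fa, can be illustrated as follows. The reciprocal Fa(R) = 1 / R is used only as an example of a monotonically decreasing function and is an assumption of the sketch, as are the function names.

```python
import math


def index_log_ratio(kb, ka):
    """Index from the logarithm of the kurtosis ratio (increasing in R)."""
    return math.log(kb / ka)


def index_difference(kb, ka):
    """Index from the change in kurtosis, KB - KA."""
    return kb - ka


def index_inverse_ratio(kb, ka):
    """Example of a monotonically decreasing Fa (assumed): Fa(R) = 1 / R."""
    return ka / kb


def select(indices, smaller_means_less_change=True):
    """Return the position of the process whose index indicates the
    smaller change in kurtosis."""
    pick = min if smaller_means_less_change else max
    return pick(range(len(indices)), key=lambda i: indices[i])


ka = 3.0
kbs = [6.0, 4.5]  # kurtosis after process 1 and process 2
print(select([index_log_ratio(kb, ka) for kb in kbs]))
print(select([index_difference(kb, ka) for kb in kbs]))
print(select([index_inverse_ratio(kb, ka) for kb in kbs],
             smaller_means_less_change=False))
```

All three calls pick the same process (position 1, the second process), which reflects the point of the passage: the direction of the comparison follows the direction of Fa, while the process with the smaller change in kurtosis is chosen in every case.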

(5) Modification 5
In each of the above embodiments, the kurtosis index value σm is determined according to the degree of difference between the kurtosis KAm and the kurtosis KBm (KBm_1, KBm_2). However, because the kurtosis KAm before noise suppression is common to the kurtosis index values σm_1 and σm_2, a configuration that calculates the kurtosis index value σm from the kurtosis KBm after noise suppression alone (that is, without using the kurtosis KAm) may also be employed. Specifically, the index calculation unit 42 calculates the kurtosis index value σm_1 from the kurtosis KBm_1 and the kurtosis index value σm_2 from the kurtosis KBm_2. This configuration has the advantage of simplifying the configuration and processing of the index calculation unit 42 (specifically, the first kurtosis calculation unit 421 can be omitted).
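The simplification in Modification 5 can be illustrated with the sketch below, which ranks the candidate processes using only the post-suppression kurtosis. Several points are assumptions: the `kurtosis` helper uses the standard moment-based definition (fourth central moment over squared variance), which may differ from the definition used in the embodiments; a lower post-suppression kurtosis is assumed to indicate a smaller change; and the intensity values are made up.

```python
def kurtosis(values):
    """Standard moment-based kurtosis (assumed definition): m4 / m2**2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


def select_by_post_kurtosis(intensities_per_process):
    """Pick the process with the smallest post-suppression kurtosis.

    The pre-suppression kurtosis is never computed, mirroring the
    omission of the first kurtosis calculation unit.
    """
    kb = [kurtosis(x) for x in intensities_per_process]
    best = min(range(len(kb)), key=lambda i: kb[i])
    return best, kb


# Made-up intensity samples after each of two suppression processes.
out_1 = [1.0, 1.0, 1.0, 1.0, 5.0]  # one isolated peak: heavier tail
out_2 = [1.0, 2.0, 3.0, 4.0, 5.0]  # evenly spread values

best, kb = select_by_post_kurtosis([out_1, out_2])
print(best, kb)
```

Because the pre-suppression kurtosis would be the same divisor (or subtrahend) for every candidate, dropping it does not alter the ranking in this sketch.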

(6) Modification 6
A configuration in which the coefficient αm of Expression (4a) in the first noise suppression process of the first embodiment is variably controlled according to the kurtosis index value σm_1 is also suitable. For example, because the musical noise caused by the first noise suppression process increases as the coefficient αm increases, a configuration that decreases the coefficient αm as the kurtosis index value σm_1 increases is suitable. The coefficient βm of Expression (4b) is likewise variably controlled according to the kurtosis index value σm_1.
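One possible way to make a coefficient shrink as the kurtosis index value grows is a clamped linear mapping, sketched below. The mapping itself, the function name `controlled_coefficient`, and the parameters `c_max`, `c_min`, and `slope` are all assumptions chosen for illustration; the text above specifies only that the coefficient decreases as σm_1 increases.

```python
def controlled_coefficient(sigma, c_max=2.0, c_min=1.0, slope=0.5):
    """Map a kurtosis index value to a coefficient that never increases with sigma.

    The result falls linearly from c_max and is clamped to [c_min, c_max].
    All parameter values are hypothetical.
    """
    return max(c_min, min(c_max, c_max - slope * sigma))


# Made-up kurtosis index values for the first noise suppression process.
for sigma_m1 in (0.0, 1.0, 4.0):
    print(sigma_m1, controlled_coefficient(sigma_m1))
```

The clamp keeps the coefficient inside a fixed range, so an unusually large index value cannot drive it to zero or below.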

(7) Modification 7
The number of types of noise suppression processes that are candidates for selection by the selection unit 44 is arbitrary. For example, in a configuration in which the suppression processing unit 30 executes each of three or more types of noise suppression processes on the acoustic signal VIN in parallel, the index calculation unit 42 calculates a kurtosis index value σm for each noise suppression process (three or more values in total), and the selection unit 44 selects one of the three or more types of noise suppression processes. When three or more types of noise suppression processes are candidates for selection, a configuration in which the selection unit 44 selects two or more noise suppression processes may also be employed. The two or more frequency spectra Ym(n) generated by the two or more noise suppression processes selected by the selection unit 44 are, for example, mixed and then output to the waveform synthesis unit 46.
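The case of three or more candidates, with two or more selected and mixed, might look like the sketch below. It assumes that a smaller index value indicates a smaller kurtosis change and that mixing is an equal-weight average per frequency bin; the text above does not specify the mixing rule, and the names `select_k_processes` and `mix_spectra` as well as all numeric values are hypothetical.

```python
def select_k_processes(sigmas, k):
    """Return the positions of the k processes with the smallest index values."""
    return sorted(range(len(sigmas)), key=lambda i: sigmas[i])[:k]


def mix_spectra(spectra):
    """Equal-weight average of complex spectra, bin by bin (assumed mixing rule)."""
    count = len(spectra)
    return [sum(bins) / count for bins in zip(*spectra)]


# Made-up index values and two-bin spectra for three suppression processes.
sigmas = [0.9, 0.2, 0.5]
spectra = [
    [1 + 0j, 2 + 0j],
    [3 + 0j, 4 + 0j],
    [5 + 0j, 6 + 0j],
]

chosen = select_k_processes(sigmas, k=2)
mixed = mix_spectra([spectra[i] for i in chosen])
print(chosen, mixed)
```

A weighted mix (for example, weights derived from the index values) would fit the same structure by replacing the plain average.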

DESCRIPTION OF SYMBOLS 100 ... noise suppression apparatus, 12 ... arithmetic processing device, 14 ... storage device, 20 ... frequency analysis unit, 30 ... suppression processing unit, 31 ... first noise suppression unit, 312 ... noise estimation unit, 314 ... subtraction unit, 32 ... second noise suppression unit, 322 ... noise estimation unit, 324 ... gain calculation unit, 326 ... multiplication unit, 42 ... index calculation unit, 421 ... first kurtosis calculation unit, 422 ... second kurtosis calculation unit, 423 ... third kurtosis calculation unit, 425 ... first index calculation unit, 426 ... second index calculation unit, 44 ... selection unit, 46 ... waveform synthesis unit.

Claims

A noise suppression apparatus comprising:
a plurality of noise suppression means for performing mutually different noise suppression processes on an acoustic signal;
index calculation means for calculating, for each noise suppression process performed by the noise suppression means, a kurtosis index value indicating the degree to which the kurtosis of the frequency distribution of the intensity of the acoustic signal changes before and after that noise suppression process; and
selection means for selecting one of the plurality of noise suppression processes performed by the noise suppression means according to the kurtosis index values calculated by the index calculation means.

The noise suppression apparatus according to claim 1, wherein the selection means selects, from among the plurality of noise suppression processes, a noise suppression process for which the change in kurtosis indicated by the kurtosis index value is small.

The noise suppression apparatus according to claim 1, further comprising:
signal dividing means for dividing the acoustic signal into noise sections and target sound sections on the time axis; and
weighting means for weighting each kurtosis index value, the weighting value for each noise suppression process being changed between the noise sections and the target sound sections,
wherein the selection means selects a noise suppression process according to each weighted kurtosis index value.

The noise suppression apparatus according to any one of claims 1 to 3, wherein the plurality of noise suppression processes include a subtraction-type noise suppression process that subtracts a noise spectrum from the spectrum of the acoustic signal, and a multiplication-type noise suppression process that multiplies the spectrum of the acoustic signal by a spectral gain that emphasizes the target sound.

A program that causes a computer to execute:
a plurality of noise suppression processes individually performed on an acoustic signal;
an index calculation process for calculating, for each noise suppression process, a kurtosis index value indicating the degree to which the kurtosis of the frequency distribution of the intensity of the acoustic signal changes before and after that noise suppression process; and
a selection process for selecting one of the plurality of noise suppression processes according to the kurtosis index values calculated in the index calculation process.