JP2013068919A - Device for setting coefficient for noise suppression and noise suppression device

Info

Publication number: JP2013068919A
Application number: JP2011245206A
Authority: JP
Inventors: Ryoichi Miyazaki; 亮一宮崎; Hiroshi Saruwatari; 洋猿渡; Kazunobu Kondo; 多伸近藤
Original assignee: Nara Institute of Science and Technology NUC; Yamaha Corp
Current assignee: Nara Institute of Science and Technology NUC; Yamaha Corp
Priority date: 2011-09-07
Filing date: 2011-11-09
Publication date: 2013-04-18
Anticipated expiration: 2031-11-09
Also published as: JP5942388B2

Abstract

PROBLEM TO BE SOLVED: To achieve noise suppression that effectively suppresses the generation of musical noise. SOLUTION: A characteristic value calculation unit 42 calculates a shape parameter α0 corresponding to the shape of the intensity distribution of a sound signal x(t). A first coefficient setting unit 44 variably sets a flooring coefficient η in accordance with the shape parameter α0 calculated by the characteristic value calculation unit 42, so as to satisfy a relational expression that defines the relation between the shape parameter α0 of the sound signal x(t) and the flooring coefficient η applied to the flooring processing of the noise suppression processing in the case where the kurtosis K of the intensity distribution of the sound signal x(t) does not change before and after the noise suppression processing. A noise suppression unit 36 executes, on the sound signal x(t), noise suppression processing including flooring processing to which the flooring coefficient η set by the first coefficient setting unit 44 is applied.

Description

The present invention relates to a technique for suppressing a noise component in an acoustic signal.

In noise suppression techniques that suppress noise components in an acoustic signal in the frequency domain, harsh musical noise caused by the noise suppression process is a problem. Patent Document 1 discloses a technique in which a noise index value corresponding to the kurtosis of the intensity distribution of an acoustic signal is calculated as an index of the degree of occurrence of musical noise, and a suppression coefficient or a flooring coefficient applied to the noise suppression process is variably set according to the noise index value. Non-Patent Document 1 discloses that the kurtosis ratio of the intensity distribution of an acoustic signal before and after the noise suppression process, and the noise suppression rate achieved by the noise suppression process, can be formulated.

Patent Document 1: JP 2010-020012 A

Non-Patent Document 1: T. Inoue, et al., "Theoretical analysis of iterative weak spectral subtraction via higher-order statistics", Proc. MLSP2010, p.220-225, 2010

However, even with the techniques of Patent Document 1 and Non-Patent Document 1, it has not always been easy in practice to select an optimum coefficient that can effectively suppress the occurrence of musical noise. In view of the above circumstances, an object of the present invention is to realize noise suppression that can effectively suppress the generation of musical noise.

The noise suppression coefficient setting device of the present invention includes: characteristic value calculation means for calculating a noise characteristic value corresponding to the shape of the intensity distribution of an input acoustic signal (for example, the acoustic signal x(t)); and first coefficient setting means for variably setting a flooring coefficient in accordance with the noise characteristic value calculated by the characteristic value calculation means, so as to satisfy a relational expression (for example, Equation (24)) that defines the relationship between the noise characteristic value of an acoustic signal (for example, the shape parameter α0) and the flooring coefficient (for example, the flooring coefficient η) applied to the flooring process of the noise suppression process in the case where the kurtosis of the intensity distribution of the acoustic signal does not change before and after the noise suppression process.

In the above configuration, the flooring coefficient is variably set according to the noise characteristic value so as to satisfy the condition (first condition) that the kurtosis of the intensity distribution does not change before and after the noise suppression process. It is therefore possible to suppress the noise component while effectively suppressing (ideally, completely preventing) the generation of musical noise.

In a preferred aspect of the present invention, the first coefficient setting means sets the flooring coefficient so that the noise suppression rate of the noise suppression process is a positive number. In this aspect, the flooring coefficient is set so as to satisfy the condition (second condition) that the noise suppression rate is a positive number. For example, from among a plurality of flooring coefficients that satisfy the relational expression between the noise characteristic value and the flooring coefficient, a flooring coefficient that yields a positive noise suppression rate is selected. A meaningful flooring coefficient that actually achieves noise suppression can therefore be applied to the noise suppression process.

A noise suppression coefficient setting device according to a preferred aspect of the present invention includes second coefficient setting means for variably setting a suppression coefficient for controlling the noise suppression strength (for example, the suppression coefficient β) in accordance with the noise characteristic value calculated by the characteristic value calculation means. For example, a suppression coefficient that maximizes the noise suppression rate of the noise suppression process, or a suppression coefficient for which the noise suppression rate exceeds a target value, is set. In this aspect, the suppression coefficient that is applied to the noise suppression process (subtraction process) to control the noise suppression strength is variably set according to the noise characteristic value of the input acoustic signal, so more appropriate noise suppression is achieved than with a configuration in which the suppression coefficient is fixed to a predetermined value. A specific example of this aspect is described later as, for example, the second embodiment.

A noise suppression coefficient setting device according to a preferred aspect of the present invention includes iteration count setting means for variably setting the number of iterations of the noise suppression process (for example, the number of iterations Q) in accordance with the noise characteristic value calculated by the characteristic value calculation means and the flooring coefficient set by the first coefficient setting means, so that the accumulated value of the noise suppression rates of the individual noise suppression processes exceeds a target value. In this aspect, because the number of iterations is variably set so that the accumulated noise suppression rate exceeds the target value, excess or deficiency of noise suppression can be reduced compared with a configuration in which the number of iterations is fixed to a predetermined value. A specific example of this aspect is described later as, for example, the third embodiment. The noise suppression process is repeated cumulatively on the input acoustic signal. Cumulative repetition means that the noise suppression process is repeated on one section of the acoustic signal (that is, the acoustic signal resulting from each noise suppression process becomes the target of the next noise suppression process).

The present invention is also realized as a noise suppression device that includes the noise suppression coefficient setting device according to any of the above aspects. A noise suppression device according to a first aspect of the present invention includes the noise suppression coefficient setting device according to any of the above aspects and noise suppression means for executing, on the input acoustic signal, a noise suppression process to which the coefficient set by the noise suppression coefficient setting device is applied. The noise suppression means is, for example, an element that executes on the input acoustic signal a noise suppression process including a flooring process to which the flooring coefficient set by the first coefficient setting means is applied, an element that executes on the input acoustic signal a noise suppression process to which the flooring coefficient set by the first coefficient setting means and the suppression coefficient set by the second coefficient setting means are applied, or an element that cumulatively repeats the noise suppression process on the input acoustic signal for the number of iterations set by the iteration count setting means. This noise suppression device achieves the operations and effects described above for the noise suppression coefficient setting device of the present invention.

A noise suppression device according to a second aspect of the present invention includes Q stages of unit processing means that sequentially process acoustic signals of D channels (D is a natural number of 2 or more) generated by sound collection devices arranged apart from each other, and output processing means that emphasizes a target sound component arriving from a specific sound source direction by delay-and-sum processing of the D-channel acoustic signals processed by the final one of the Q stages of unit processing means. Each of the Q stages of unit processing means includes: noise estimation means that generates an estimated noise component by independent component analysis of the D-channel acoustic signals supplied to that unit processing means; the noise suppression coefficient setting device of any of the above aspects, which variably sets, for each channel, a flooring coefficient corresponding to the noise characteristic value of the estimated noise component; and noise suppression means that executes, on the acoustic signal of each channel, a noise suppression process to which the flooring coefficient set for that channel by the noise suppression coefficient setting device is applied, and outputs the result. The second aspect achieves the same effects as the noise suppression device of the first aspect. It also has the advantage that highly accurate noise suppression can be achieved even when the characteristics of the noise component vary over time. The noise suppression device of the second aspect is described later as, for example, the fourth embodiment.

The noise suppression coefficient setting device according to each of the above aspects can be realized by hardware (an electronic circuit) such as a DSP (Digital Signal Processor) dedicated to noise suppression, or by cooperation between a general-purpose arithmetic processing unit such as a CPU (Central Processing Unit) and a program (software). The program of the present invention causes a computer to execute: a characteristic value calculation process of calculating a noise characteristic value corresponding to the shape of the intensity distribution of an input acoustic signal; and a first coefficient setting process of variably setting a flooring coefficient in accordance with the noise characteristic value calculated in the characteristic value calculation process, so as to satisfy a relational expression that defines the relationship between the noise characteristic value of an acoustic signal and the flooring coefficient applied to the flooring process of the noise suppression process in the case where the kurtosis of the intensity distribution of the acoustic signal does not change before and after the noise suppression process. This program achieves the same operations and effects as the noise suppression coefficient setting device of the present invention. The program of the present invention may be provided stored on a computer-readable recording medium and installed on a computer, or may be distributed via a communication network and installed on a computer.

A block diagram of the noise suppression device according to the first embodiment.
A graph showing the relationship between the noise suppression rate and the kurtosis ratio.
A graph showing the effect of the embodiment.
A graph showing the effect of the embodiment.
A block diagram of the noise suppression device according to the second embodiment.
A schematic diagram of a coefficient table.
A block diagram of the noise suppression device according to the third embodiment.
A block diagram of the noise suppression device according to the fourth embodiment.
A block diagram of a unit processing unit.
A graph showing the effect of the fourth embodiment.
A graph showing the effect of the fourth embodiment.
A graph showing the effect of the fourth embodiment.

<First Embodiment>
FIG. 1 is a block diagram of a noise suppression device 100A according to the first embodiment of the present invention. A signal supply device 12 and a sound emission device 14 are connected to the noise suppression device 100A. The signal supply device 12 supplies the acoustic signal x(t) to the noise suppression device 100A. The acoustic signal x(t) is a time-domain signal (t: time) representing the waveform of a mixture of a target sound component (for example, speech or musical sound) and a noise component (environmental sound such as the operating sound of air-conditioning equipment or crowd noise). The signal supply device 12 may be a sound collection device that picks up ambient sound and generates the acoustic signal x(t), a playback device that reads the acoustic signal x(t) from a portable or built-in recording medium and supplies it to the noise suppression device 100A, or a communication device that receives the acoustic signal x(t) from a communication network and supplies it to the noise suppression device 100A.

The noise suppression device 100A is an acoustic processing device that generates an acoustic signal y(t) from the acoustic signal x(t) supplied by the signal supply device 12. The acoustic signal y(t) is a time-domain signal representing the waveform of the sound obtained by suppressing the noise component of the acoustic signal x(t) (that is, sound in which the target sound component is emphasized). The sound emission device 14 (for example, a speaker or headphones) reproduces sound waves corresponding to the acoustic signal y(t) generated by the noise suppression device 100A. The D/A converter that converts the acoustic signal y(t) from digital to analog is omitted from the figure for convenience.

As shown in FIG. 1, the noise suppression device 100A is realized by a computer system including an arithmetic processing device 22 and a storage device 24. The storage device 24 stores a program PGM executed by the arithmetic processing device 22 and various data used by the arithmetic processing device 22. Any known recording medium, such as a semiconductor recording medium or a magnetic recording medium, or a combination of several types of recording media, may be used as the storage device 24. A configuration in which the acoustic signal x(t) is stored in the storage device 24 (so that the signal supply device 12 is omitted) is also suitable.

By executing the program PGM stored in the storage device 24, the arithmetic processing device 22 realizes a plurality of functions for generating the acoustic signal y(t) from the acoustic signal x(t) (a frequency analysis unit 32, a noise estimation unit 34, a noise suppression unit 36, a waveform synthesis unit 38, a characteristic value calculation unit 42, and a first coefficient setting unit 44). A configuration in which the functions of the arithmetic processing device 22 are distributed over a plurality of integrated circuits, or in which a dedicated electronic circuit (DSP) realizes each function, may also be adopted.

The frequency analysis unit 32 sequentially generates the spectrum (complex spectrum) X(f,τ) of the acoustic signal x(t) for each frame on the time axis. Any known frequency analysis, such as the short-time Fourier transform, may be used to generate the spectrum X(f,τ). The symbol τ is a variable designating the frame, and the symbol f is a variable designating the frequency. A filter bank composed of a plurality of band-pass filters with different passbands may also be used as the frequency analysis unit 32.

The noise estimation unit 34 generates the spectrum (complex spectrum) N(f,τ) of the noise component n(t) that is estimated to be contained in the acoustic signal x(t) (hereinafter "estimated noise component"). Any known technique may be used to generate the spectrum N(f,τ) of the estimated noise component n(t). For example, the noise estimation unit 34 divides the acoustic signal x(t) on the time axis into target sound sections, in which the target sound component is present, and noise sections, in which it is absent, and takes the spectrum X(f,τ) of each frame in a noise section as the spectrum N(f,τ) of the estimated noise component n(t). Any known voice activity detection (VAD) technique may be used to separate target sound sections from noise sections.
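As a rough illustration of the noise-section approach described above, the following sketch averages the power spectra of the frames flagged as noise. The function name `estimate_noise_power`, the list-of-lists data layout, and the precomputed `is_noise` flags are assumptions made for illustration; the voice activity detection itself is taken as given and is not part of the sketch.

```python
def estimate_noise_power(power_frames, is_noise):
    """Average |X(f, tau)|^2 over the frames flagged as noise, per frequency bin.

    power_frames: list of frames, each a list of power values (one per bin).
    is_noise: list of booleans produced by some VAD (assumed to be given).
    """
    noise_frames = [frame for frame, flag in zip(power_frames, is_noise) if flag]
    if not noise_frames:
        raise ValueError("no noise frames available")
    n_bins = len(noise_frames[0])
    # Time average of the noise power for each frequency bin.
    return [sum(frame[k] for frame in noise_frames) / len(noise_frames)
            for k in range(n_bins)]
```

The returned per-bin average corresponds to the time-averaged noise power that the subtraction process uses as its noise estimate.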

The noise suppression unit 36 sequentially generates, for each frame, the spectrum (complex spectrum) Y(f,τ) of the acoustic signal y(t) by iterative noise suppression applied to the spectrum X(f,τ) of the acoustic signal x(t) in each frame of the target sound sections and the noise sections. Iterative noise suppression is signal processing in which the noise suppression process for the spectrum X(f,τ) of the acoustic signal x(t) is repeated cumulatively over a predetermined number of iterations Q (Q is a natural number). The spectrum Y(f,τ) generated by the noise suppression unit 36 is expressed by Equation (1).

In Equation (1), the symbol j denotes the imaginary unit and the symbol θx(f,τ) denotes the phase spectrum of the acoustic signal x(t). The symbol |YQ(f,τ)| in Equation (1) is the amplitude spectrum obtained after Q iterations of the noise suppression process have been applied to the spectrum X(f,τ) of the acoustic signal x(t). In each iteration of the noise suppression process, either the subtraction process expressed by Equation (2A) or the flooring process expressed by Equation (2B) is executed for each frequency f.
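The equation images are not reproduced in this text. Judging only from the verbal descriptions in the surrounding paragraphs, Equations (1), (2A) and (2B) plausibly take the following form; the exact typeset form in the original publication is an assumption. Here (2A) would apply when the quantity under the square root is non-negative, and (2B) when it is negative.

```latex
\begin{align}
Y(f,\tau) &= |Y_Q(f,\tau)|\, e^{\,j\theta_x(f,\tau)} \tag{1} \\
|Y_i(f,\tau)| &= \sqrt{|Y_{i-1}(f,\tau)|^2 - \beta\, E\!\left[|N_{i-1}(f,\tau)|^2\right]} \tag{2A} \\
|Y_i(f,\tau)| &= \eta\, |Y_{i-1}(f,\tau)| \tag{2B}
\end{align}
```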

As can be understood from Equation (2A), the subtraction process calculates the amplitude spectrum |Yi(f,τ)| by subtracting the product of the suppression coefficient β and the time average of the power spectrum |Ni-1(f,τ)|² of the estimated noise component n(t) (that is, the estimated noise) from the power spectrum |Yi-1(f,τ)|² obtained after the immediately preceding ((i-1)-th) noise suppression process. The suppression coefficient (subtraction coefficient) β therefore functions as a variable for controlling the strength with which the noise component is suppressed. In the first embodiment, the suppression coefficient β is fixed to a predetermined value (for example, 2 or 4).

The flooring process of Equation (2B), on the other hand, is executed when the value after the subtraction in Equation (2A) is negative. It calculates the amplitude spectrum |Yi(f,τ)| by multiplying the amplitude spectrum |Yi-1(f,τ)| obtained after the immediately preceding ((i-1)-th) noise suppression process by the flooring coefficient η. In other words, the flooring coefficient η functions as a variable that defines the lower limit of the amplitude spectrum |Yi(f,τ)|. As described for Equation (1), the spectrum Y(f,τ), obtained by attaching the phase spectrum θx(f,τ) of the acoustic signal x(t) to the amplitude spectrum |YQ(f,τ)| after completion of the Q-th noise suppression process, is supplied to the waveform synthesis unit 38 for each frame.
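The subtraction and flooring steps described above can be sketched for a single frequency bin as follows. This is a minimal illustration under stated assumptions, not the implementation of the noise suppression unit 36: the function names `suppress_step` and `suppress_bin` are hypothetical, and the per-iteration noise estimates (the time-averaged noise power used in each iteration) are assumed to be supplied by the caller.

```python
import cmath
import math


def suppress_step(prev_amp, noise_power_mean, beta, eta):
    """One noise suppression step for one bin: subtraction or flooring."""
    diff = prev_amp ** 2 - beta * noise_power_mean
    if diff < 0.0:
        # Flooring process: scale the previous amplitude by eta.
        return eta * prev_amp
    # Subtraction process: amplitude of the power after subtraction.
    return math.sqrt(diff)


def suppress_bin(x, noise_power_means, beta, eta):
    """Apply Q = len(noise_power_means) cumulative iterations to one complex
    bin X(f, tau), then reattach the phase of the input spectrum."""
    amp = abs(x)
    for noise_power_mean in noise_power_means:
        amp = suppress_step(amp, noise_power_mean, beta, eta)
    return cmath.rect(amp, cmath.phase(x))
```

For example, `suppress_bin(x, [n1, n2], 2.0, 0.1)` would run two cumulative iterations with β = 2 and η = 0.1, where `n1` and `n2` are the noise power averages for the first and second iteration.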

The waveform synthesis unit 38 in FIG. 1 generates the time-domain acoustic signal y(t) from the spectrum Y(f,τ) that the noise suppression unit 36 generates for each frame. Specifically, the waveform synthesis unit 38 converts the spectrum Y(f,τ) of each frame into a time-domain signal by the inverse Fourier transform and joins successive frames to one another to generate the acoustic signal y(t). The acoustic signal y(t) generated by the waveform synthesis unit 38 is supplied to the sound emission device 14 and reproduced as sound waves.

The characteristic value calculation unit 42 in FIG. 1 calculates, from the acoustic signal x(t), a shape parameter α0 corresponding to the characteristics of the noise component (estimated noise component n(t)) in the acoustic signal x(t). The shape parameter α0 is a variable that changes according to the shape of the frequency distribution (hereinafter "intensity distribution") of the power |X(f,τ)|² of the acoustic signal x(t) (that is, the power |N(f,τ)|² of the estimated noise component n(t)) over a plurality of frames in a noise section. The characteristic value calculation unit 42 of the first embodiment calculates the shape parameter α0 of a probability distribution that approximates the intensity distribution of the acoustic signal x(t). A typical example of the probability distribution is the gamma distribution. The probability density function P(x) of the gamma distribution that approximates the intensity distribution of the acoustic signal x(t), with the power x (x = |X(f,τ)|²) of the acoustic signal x(t) as the random variable, is expressed by Equation (3).
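For reference, the gamma probability density with shape parameter α and scale parameter θ has the standard form shown below, together with the standard definition of the gamma function. It is assumed, but not confirmed by this text, that Equation (3) and Equation (6) of the original are typeset in this way.

```latex
\begin{align}
P(x) &= \frac{x^{\alpha-1}\, e^{-x/\theta}}{\Gamma(\alpha)\,\theta^{\alpha}}, \qquad x \ge 0 \\
\Gamma(\alpha) &= \int_{0}^{\infty} t^{\alpha-1} e^{-t}\, dt
\end{align}
```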

The shape parameter α in Equation (3) is defined by Equations (4A) and (4B), and the scale parameter θ in Equation (3) is defined by Equation (5). The symbol Γ(α) in Equation (3) denotes the gamma function defined by Equation (6). The symbol E[ ] denotes the mean (expected value). The characteristic value calculation unit 42 in FIG. 1 applies the power |X(f,τ)|² of the acoustic signal x(t) in a noise section to the random variable x of Equation (4B), and takes the shape parameter α calculated by Equation (4A) as the shape parameter α0 of the acoustic signal x(t) before the first noise suppression process is executed.
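Because Equations (4A), (4B) and (5) are not reproduced in this text, the following sketch substitutes a widely used closed-form approximation of the maximum-likelihood estimate of the gamma shape parameter, based on s = ln E[x] - E[ln x], and takes the scale as θ = E[x]/α. Whether this matches the original equations exactly is an assumption, and the function name `estimate_gamma_parameters` is hypothetical.

```python
import math


def estimate_gamma_parameters(powers):
    """Estimate the gamma shape alpha and scale theta from power samples.

    Uses the closed-form approximation
        alpha ~ (3 - s + sqrt((s - 3)^2 + 24 s)) / (12 s),
        s = ln(mean(x)) - mean(ln(x)),
    which is an assumed stand-in for Equations (4A) and (4B).
    """
    xs = [x for x in powers if x > 0.0]  # the logarithm needs positive values
    if not xs:
        raise ValueError("no positive power samples")
    mean_x = sum(xs) / len(xs)
    s = math.log(mean_x) - sum(math.log(x) for x in xs) / len(xs)
    if s <= 0.0:
        raise ValueError("samples have no spread; shape parameter undefined")
    alpha = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    theta = mean_x / alpha  # scale chosen so that alpha * theta equals the mean
    return alpha, theta
```

Applied to the power values |X(f,τ)|² collected from the noise sections, the first returned value would play the role of the shape parameter α0.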

The first coefficient setting unit 44 in FIG. 1 variably sets the flooring coefficient η, which the noise suppression unit 36 applies to the flooring process of Equation (2B), in accordance with the shape parameter α0 calculated by the characteristic value calculation unit 42. Specifically, the flooring coefficient η is calculated so that the musical noise in the acoustic signal y(t) caused by the noise suppression process is minimized and the noise component is actually suppressed by the noise suppression process.

To identify the conditions that the flooring coefficient η should satisfy, the relationship between the degree of musical noise generation and the degree of noise suppression is examined, focusing for convenience on a single noise suppression process. First, considering that the musical noise caused by the noise suppression process is non-Gaussian noise, the kurtosis, which serves as an index of the Gaussianity of the intensity distribution (probability density function), is used as a quantitative index of the amount of musical noise caused by the noise suppression process. Specifically, in view of the tendency for musical noise to become more noticeable as the change in kurtosis across the noise suppression process increases, the ratio κ of the kurtosis KB after the noise suppression process to the kurtosis KA before it (hereinafter "kurtosis ratio") is used as an index of the amount of musical noise (κ = KB/KA). The larger the kurtosis ratio κ, the more musical noise there is; when the kurtosis ratio κ is 1 (the kurtosis does not change across the noise suppression process), it can be judged that no musical noise has been generated. The correlation between kurtosis (kurtosis ratio κ) and musical noise is also described in detail in Patent Document 1.

The kurtosis K of a probability density function P(x) is expressed by the following Equation (7).

The symbol μm (μ2, μ4) in Equation (7) denotes the m-th order moment of the probability density function P(x) about the origin, and is defined by the following Equation (8).
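As an illustrative sketch only (it is not part of the disclosed embodiments), the kurtosis of Equation (7) can be estimated from observed intensities by replacing the moments of Equation (8) with sample means. The sketch assumes that Equation (7) is the usual ratio of the fourth-order moment to the square of the second-order moment, both taken about the origin; the function names raw_moment and kurtosis_about_origin are hypothetical.

```python
def raw_moment(samples, m):
    """Sample estimate of the m-th order moment about the origin."""
    return sum(x ** m for x in samples) / len(samples)


def kurtosis_about_origin(samples):
    """Kurtosis taken as mu_4 / mu_2**2 (assumed form of Equation (7))."""
    mu2 = raw_moment(samples, 2)
    mu4 = raw_moment(samples, 4)
    return mu4 / (mu2 ** 2)
```

A constant sequence gives a kurtosis of 1, and the value grows as the samples become more unevenly spread.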

As can be understood from Non-Patent Document 1, the m-th order moment μm after the noise suppression process is expressed by the following Equation (9). The function M(α, β, η, m) in Equation (9) is defined by Equation (10).

The symbol Γ(b, a) in Equation (10) is the incomplete gamma function of the first kind defined by the following Equation (11), and the symbol γ(b, a) in Equation (10) is the incomplete gamma function of the second kind defined by the following Equation (12).

From Equations (7) and (9), the following Equation (13), which gives the kurtosis K after the noise suppression process, is derived. The kurtosis K before the noise suppression process is expressed by the following Equation (14), obtained by setting the suppression coefficient β and the flooring coefficient η to 0 in Equation (13).

Accordingly, the kurtosis ratio κ across the noise suppression process is expressed by the following Equation (15).

Meanwhile, a noise suppression rate (Noise Reduction Rate) R is introduced as an index of the degree to which the noise suppression process suppresses the noise component. As understood from Non-Patent Document 1, the noise suppression rate R is expressed by the following Equation (16), which includes the shape parameter α of the intensity distribution of the acoustic signal x(t) and the function M(α, β, η, m) of Equation (10). The noise suppression rate R of Equation (16) is the difference (in decibels) between the SN (signal-to-noise) ratio after the noise suppression process and the SN ratio before it, and serves as an index of the amount of noise suppressed in a single noise suppression process.

FIG. 2 is a graph showing the relationship between the kurtosis ratio κ of Equation (15) (vertical axis) and the noise suppression rate R of Equation (16) (horizontal axis). FIG. 2 assumes that the shape parameter α of the noise component is 1.0 (white Gaussian noise) and plots the relationship between the kurtosis ratio κ and the noise suppression rate R as the suppression coefficient β is varied, both for the flooring coefficient η set to 0.0 (dashed line) and for η set to 1.0 (solid line). The arrows in FIG. 2 indicate the direction in which the suppression coefficient β increases.

When the flooring coefficient η is set to 0.0, both the kurtosis ratio κ and the noise suppression rate R increase monotonically as the suppression coefficient β increases. Musical noise therefore increases as the noise suppression effect increases. When the flooring coefficient η is set to 1.0, on the other hand, the relationship between the kurtosis ratio κ and the noise suppression rate R traces a hysteresis loop as the suppression coefficient β changes, and there exists a specific point p at which κ is 1 other than the point at which R is 0 (the case in which the noise component is not suppressed). That the kurtosis ratio κ is kept at 1 (that is, the kurtosis K does not change across the noise suppression process) means that, in theory, the noise suppression process generates no musical noise. FIG. 2 thus confirms that there exist combinations of the suppression coefficient β and the flooring coefficient η that can effectively suppress the noise component without generating musical noise. Assuming that the suppression coefficient β is fixed at a predetermined value, setting the flooring coefficient η to an appropriate value makes it possible to suppress the noise component without generating musical noise.
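The trend described for FIG. 2 can be explored numerically with a rough Monte Carlo sketch. Everything here is an assumption made for illustration: the noise power is drawn from a gamma distribution with shape parameter α, the subtraction and flooring of Equations (2A) and (2B) are taken to be power-domain spectral subtraction with a floor of η² times the input power, and the noise suppression rate is approximated, for a noise-only input, by the decibel ratio of total power before and after processing. The function names are hypothetical, and the sketch does not evaluate Equations (15) and (16) themselves.

```python
import math
import random


def suppress(power, beta, eta, noise_power):
    """Assumed power-domain spectral subtraction with flooring."""
    if power > beta * noise_power:
        return power - beta * noise_power   # subtraction branch
    return (eta ** 2) * power               # flooring branch


def kurtosis(samples):
    """Kurtosis from sample moments about the origin."""
    mu2 = sum(x ** 2 for x in samples) / len(samples)
    mu4 = sum(x ** 4 for x in samples) / len(samples)
    return mu4 / mu2 ** 2


def kurtosis_ratio_and_rate(alpha, beta, eta, n=20000, seed=0):
    """Return (kurtosis ratio, suppression rate in dB) for simulated noise."""
    rng = random.Random(seed)
    before = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    noise_power = sum(before) / n            # estimated mean noise power
    after = [suppress(x, beta, eta, noise_power) for x in before]
    kappa = kurtosis(after) / kurtosis(before)
    rate_db = 10.0 * math.log10(sum(before) / sum(after))
    return kappa, rate_db
```

With beta = 0 nothing is subtracted, so the function returns a kurtosis ratio of 1 and a rate of 0 dB; sweeping beta for a fixed eta traces a curve of the kind plotted in FIG. 2, within sampling error.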

Based on the above findings, consider satisfying both a first condition for preventing the generation of musical noise and a second condition for suppressing the noise component. The first condition is that the kurtosis K does not change across the noise suppression process, and the second condition is that the noise suppression rate R of the noise suppression process is positive. Each condition is described in detail below.

[First condition]
When the shape parameter α of the intensity distribution having the kurtosis K(αi, β, η) after the i-th (i = 0, 1, 2, ...) noise suppression process is taken as the shape parameter αi+1 of the intensity distribution subjected to the (i+1)-th noise suppression process, the following Equation (17) holds between the kurtosis K(αi, β, η) and the shape parameter αi+1. The derivation of Equation (17) is described in detail in Non-Patent Document 1.

The kurtosis K(α0, 0, 0) of the intensity distribution before the first (i = 0) noise suppression process (that is, the intensity distribution of the acoustic signal x(t)) is expressed by the following Equation (18).
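Equations (17) and (18) are not reproduced here. If the intensity distribution is modeled as a gamma distribution with shape parameter α (an assumption, though one consistent with α = 1.0 corresponding to white Gaussian noise above), the kurtosis about the origin has a simple closed form that can be inverted for α. The following sketch shows that relationship under this assumption; the function names are hypothetical, and the formulas are not claimed to be the exact form of Equations (17) and (18).

```python
import math


def gamma_kurtosis(alpha):
    """mu_4 / mu_2**2 of a gamma density with shape alpha (moments about the origin)."""
    return (alpha + 2.0) * (alpha + 3.0) / (alpha * (alpha + 1.0))


def shape_from_kurtosis(k):
    """Inverse of gamma_kurtosis: positive root of (k - 1) a^2 + (k - 5) a - 6 = 0.

    Valid for k > 1, which always holds for a gamma density.
    """
    return (5.0 - k + math.sqrt(k * k + 14.0 * k + 1.0)) / (2.0 * (k - 1.0))
```

For example, gamma_kurtosis(1.0) gives 6.0, and shape_from_kurtosis(6.0) recovers 1.0.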

When the first condition holds, the kurtosis K(αi, β, η) at any point is equal to the kurtosis K(α0, 0, 0) before the noise suppression process, as expressed by the following Equation (19).

When the function M(α, β, η, m) of Equation (10) is simplified for convenience, as shown in the following Equation (20), using a function S(α, β, m) and a function F(α, β, m) that do not include the flooring coefficient η, the kurtosis K(α0, β, η) immediately after the first noise suppression process is expressed by Equation (21).

From Equations (19) and (21), the following Equation (22) expressing the first condition is derived.

Replacing the fourth power of the flooring coefficient η with a variable H transforms Equation (22) into the following Equation (23).

Rearranging Equation (23) with respect to the variable H yields the following Equation (24), which defines the flooring coefficient η (η = H^(1/4)) that satisfies the first condition.

In the first embodiment, the suppression coefficient β is fixed at a predetermined value. Equation (24) can therefore be used as a relational expression that defines the relationship between the shape parameter α and the flooring coefficient η for the case where the kurtosis of the intensity distribution of the acoustic signal x(t) does not change across the noise suppression process (a single noise suppression process, or Q repetitions of the noise suppression process).

[Second condition]
The second condition is that the noise suppression rate R of Equation (16), which represents the amount of noise suppressed in a single noise suppression process, is positive. Using the functions S(α, β, m) and F(α, β, m) described above, Equation (25) expressing the second condition is derived.

Taking into account that the flooring coefficient η is positive, the following Equation (26), which defines the range of the flooring coefficient η that satisfies the second condition, is derived.

The first coefficient setting unit 44 in FIG. 1 calculates the flooring coefficient η corresponding to the shape parameter α0 calculated by the characteristic value calculation unit 42 so as to satisfy Equations (24) and (26). Specifically, the first coefficient setting unit 44 calculates the variable H by evaluating Equation (24) for the shape parameter α0, and calculates the fourth root of H as the flooring coefficient η (η = H^(1/4)). Equation (24) yields a plurality of flooring coefficients η. The first coefficient setting unit 44 selects, from among them, the one flooring coefficient η that satisfies Equation (26). The flooring coefficient η set by the first coefficient setting unit 44 is applied to the noise suppression process (the flooring process of Equation (2B)) by the noise suppression unit 36.
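The selection procedure described above can be sketched numerically. The sketch below is an assumption-laden illustration rather than Equations (24) and (26) themselves: it assumes that each moment after suppression splits into a subtraction part and a flooring part as S + η^(2m)·F (in the spirit of Equation (20)), estimates S and F from noise power samples instead of using closed-form functions, and treats R > 0 as "mean output power below mean input power". All function names are hypothetical.

```python
import math
import random


def partial_moments(samples, threshold, m):
    """Sample estimates of the subtraction part S and the flooring part F."""
    n = len(samples)
    s = sum((x - threshold) ** m for x in samples if x > threshold) / n
    f = sum(x ** m for x in samples if x <= threshold) / n
    return s, f


def flooring_coefficient(samples, beta):
    """Return an eta that keeps the kurtosis unchanged with R > 0, or None."""
    n = len(samples)
    mean_power = sum(samples) / n
    threshold = beta * mean_power
    mu2 = sum(x ** 2 for x in samples) / n
    mu4 = sum(x ** 4 for x in samples) / n
    k0 = mu4 / mu2 ** 2                      # kurtosis before suppression
    s1, f1 = partial_moments(samples, threshold, 1)
    s2, f2 = partial_moments(samples, threshold, 2)
    s4, f4 = partial_moments(samples, threshold, 4)
    # First condition with H = eta**4: (s4 + H^2 f4) / (s2 + H f2)^2 = k0
    a = f4 - k0 * f2 ** 2
    b = -2.0 * k0 * s2 * f2
    c = s4 - k0 * s2 ** 2
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return None
    roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    for h in roots:
        if h <= 0.0:
            continue                          # no real fourth root
        eta = h ** 0.25
        # Second condition: output power below input power (R > 0)
        if s1 + eta ** 2 * f1 < mean_power:
            return eta
    return None
```

Applying the assumed suppression with the returned eta to the same samples leaves their kurtosis unchanged up to rounding, which is the property the first condition asks for.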

As described above, in the first embodiment the flooring coefficient η is variably set according to the shape parameter α0 of the estimated noise component n(t) so as to satisfy the first condition (Equation (24)), namely that the kurtosis K does not change across the noise suppression process. As detailed below, it is therefore possible to suppress the noise component of the acoustic signal x(t) while effectively suppressing (ideally, completely preventing) the generation of musical noise.

FIG. 3 is a graph showing the kurtosis ratio κ across the noise suppression process, and FIG. 4 is a graph showing the cepstral distortion d of the acoustic signal y(t) after the noise suppression process. The cepstral distortion d is an index of the error between the cepstrum of the target sound component and the cepstrum of the acoustic signal y(t) after the noise suppression process. A smaller cepstral distortion d therefore indicates higher noise suppression performance (that is, the spectral envelope of the target sound component is reproduced more faithfully).

In FIGS. 3 and 4, the first embodiment is shown alongside a case in which the noise suppression process is executed only once with the flooring coefficient η fixed at a predetermined value (Comparative Example 1 / One-Shot SS), a case in which a noise suppression process using a Wiener filter is executed (Comparative Example 2 / Wiener filtering), and a case in which a noise suppression process using MMSE-STSA (Minimum Mean-Square Error Short-Time Spectral Amplitude) is executed (Comparative Example 3 / MMSE-STSA). FIGS. 3 and 4 assume two cases: one in which white Gaussian noise (α0 = 0.97) is the noise component of the acoustic signal x(t) (White Gaussian Noise), and one in which conversational speech (α0 = 0.21) is the noise component of the acoustic signal x(t).

As understood from FIG. 4, when the noise component is conversational speech, the first embodiment achieves noise suppression performance exceeding that of Comparative Examples 1 to 3. When the noise component is white Gaussian noise, it achieves noise suppression performance that exceeds that of Comparative Examples 1 and 2 and is comparable to that of Comparative Example 3. As understood from FIG. 3, whether the noise component is white Gaussian noise or conversational speech, the first embodiment brings the kurtosis ratio κ across noise suppression closer to 1 than Comparative Examples 1 to 3 do. In other words, the first embodiment reduces musical noise effectively in comparison with any of Comparative Examples 1 to 3. As described above, the first embodiment makes it possible to achieve a high degree of noise suppression while effectively reducing musical noise.

<Second Embodiment>
A second embodiment of the present invention is described below. In each of the embodiments illustrated below, elements whose operation and function are equivalent to those of the first embodiment are denoted by the reference signs used in the description of the first embodiment, and their detailed description is omitted as appropriate.

FIG. 5 is a block diagram of a noise suppression device 100B according to the second embodiment. As shown in FIG. 5, the noise suppression device 100B of the second embodiment is configured by adding a second coefficient setting unit 46 to the noise suppression device 100A of the first embodiment. The second coefficient setting unit 46 variably sets the suppression coefficient β applied to the subtraction process (Equation (2A)) of each noise suppression process according to the shape parameter α0 calculated by the characteristic value calculation unit 42 and the flooring coefficient η set by the first coefficient setting unit 44. The coefficient table TBL of FIG. 6, stored in advance in the storage device 24, is used to set the suppression coefficient β.

The coefficient table TBL of FIG. 6 is a data table that associates a value of the suppression coefficient β (β11, β12, ...) with each combination of a value of the shape parameter α0 (α0_1, α0_2, ...) and a value of the flooring coefficient η (η_1, η_2, ...). The value of the suppression coefficient β corresponding to a given combination of α0 and η is selected in advance so that high noise suppression performance is achieved under that shape parameter α0 and flooring coefficient η.

Because the noise suppression rate R of Equation (16) is the logarithm of the power ratio across the noise suppression process, it accumulates (adds up) with each noise suppression process. The noise suppression rate R(αi+1, β, η) at the point when the (i+1)-th noise suppression process has been executed is therefore expressed by the following Equation (27).

In the coefficient table TBL, the suppression coefficient β corresponding to each combination of the shape parameter α0 and the flooring coefficient η is set according to the noise suppression rate R(αQ, β, η) of Equation (27) obtained when the noise suppression process is executed Q times under that shape parameter α0 and flooring coefficient η. Specifically, the suppression coefficient β is set to the value that maximizes the noise suppression rate R(αQ, β, η), or to a value at which R(αQ, β, η) exceeds a predetermined target value Rtar. The target value Rtar is variably set, for example, according to a user instruction given via an input device (not shown).

The second coefficient setting unit 46 retrieves from the coefficient table TBL the suppression coefficient β corresponding to the combination of the shape parameter α0 calculated by the characteristic value calculation unit 42 and the flooring coefficient η set by the first coefficient setting unit 44, and supplies it to the noise suppression unit 36. The noise suppression unit 36 generates the spectrum Y(f,τ) by repeating, a predetermined number of times Q, the noise suppression process to which the flooring coefficient η supplied from the first coefficient setting unit 44 and the suppression coefficient β supplied from the second coefficient setting unit 46 are applied.
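A table lookup of this kind could be sketched as follows. The table entries are placeholders rather than the values of FIG. 6, and snapping the inputs to the nearest grid values is an assumption; the description only states that the corresponding β is retrieved. The names nearest, lookup_beta, and TBL are hypothetical.

```python
def nearest(grid, value):
    """Grid point closest to value."""
    return min(grid, key=lambda g: abs(g - value))


def lookup_beta(table, alpha0, eta):
    """Look up beta in a table keyed by (alpha0 grid value, eta grid value)."""
    alphas = sorted({a for a, _ in table})
    etas = sorted({e for _, e in table})
    return table[(nearest(alphas, alpha0), nearest(etas, eta))]


# Placeholder entries for illustration only; not values from FIG. 6.
TBL = {
    (0.2, 0.5): 1.0, (0.2, 1.0): 1.5,
    (1.0, 0.5): 2.0, (1.0, 1.0): 2.5,
}
```

With these placeholder entries, lookup_beta(TBL, 0.97, 0.9) snaps to the grid point (1.0, 1.0) and returns 2.5.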

The second embodiment achieves the same effects as the first embodiment. In addition, because the suppression coefficient β is variably set according to the shape parameter α0 calculated by the characteristic value calculation unit 42 and the flooring coefficient η set by the first coefficient setting unit 44, the noise suppression performance (noise suppression rate R) can be improved compared with a configuration in which the suppression coefficient β is fixed at a predetermined value.

In the above description the suppression coefficient β is obtained from the coefficient table TBL, but the method of setting the suppression coefficient β may be changed as appropriate. For example, the second coefficient setting unit 46 may apply each of a plurality of candidate values b of the suppression coefficient β to Equation (27), together with the shape parameter α0 and the flooring coefficient η, to calculate the noise suppression rate R(αQ, β, η) for each candidate value b, and select as the suppression coefficient β the candidate value b that maximizes R(αQ, β, η). A configuration that combines the coefficient table TBL with the evaluation of Equation (27) may also be adopted. For example, the second coefficient setting unit 46 obtains from the coefficient table TBL the suppression coefficient β corresponding to the combination of the shape parameter α0 and the flooring coefficient η, and calculates the noise suppression rate R(αQ, β, η) by Equation (27) for each of a plurality of candidate values b within a predetermined range that includes that suppression coefficient β. The candidate value b that maximizes R(αQ, β, η) is then selected as the suppression coefficient β.

<Third Embodiment>
FIG. 7 is a block diagram of a noise suppression device 100C according to the third embodiment. As shown in FIG. 7, the noise suppression device 100C of the third embodiment is configured by adding an iteration count setting unit 48 to the noise suppression device 100A of the first embodiment. The iteration count setting unit 48 variably sets the number of iterations Q of the noise suppression process performed by the noise suppression unit 36 according to the shape parameter α0 calculated by the characteristic value calculation unit 42 and the flooring coefficient η set by the first coefficient setting unit 44.

Specifically, for each of a plurality of candidate values q of the number of iterations Q, the iteration count setting unit 48 uses Equation (27) to calculate the noise suppression rate R(αq, β, η) obtained when the noise suppression process is repeated q times under the shape parameter α0 calculated by the characteristic value calculation unit 42, the flooring coefficient η set by the first coefficient setting unit 44, and a predetermined suppression coefficient β. It then fixes, as the number of iterations Q, the smallest candidate value q for which R(αq, β, η) exceeds the target value Rtar. The target value Rtar is variably set, for example, according to a user instruction given via an input device (not shown). The noise suppression unit 36 repeats the noise suppression process for the number of iterations Q set by the iteration count setting unit 48.
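The search described above amounts to accumulating the per-process noise suppression rate and stopping at the first candidate that clears the target. A minimal sketch follows, assuming a hypothetical callable stage_gain_db(i) that returns the contribution in decibels of the (i+1)-th process (in the device this would come from Equation (27)); the names and the constant gain in the usage note are illustrative only.

```python
def choose_iterations(stage_gain_db, target_db, candidates):
    """Smallest q in candidates whose accumulated rate exceeds target_db, else None."""
    for q in sorted(candidates):
        # Accumulate the decibel contributions of processes 1..q
        total = sum(stage_gain_db(i) for i in range(q))
        if total > target_db:
            return q
    return None
```

With a constant 1.5 dB per process and a 4.0 dB target, choose_iterations(lambda i: 1.5, 4.0, range(1, 11)) returns 3.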

The third embodiment achieves the same effects as the first embodiment. In addition, because the number of iterations Q of the noise suppression process is variably set according to the shape parameter α0 calculated by the characteristic value calculation unit 42 and the flooring coefficient η set by the first coefficient setting unit 44, both insufficient and excessive noise suppression can be prevented. Specifically, insufficient noise suppression is prevented so that the target value Rtar is reliably achieved, and excessive noise suppression is prevented so that the amount of computation performed by the arithmetic processing device 22 is reduced.

In the above description the number of iterations Q is set by evaluating Equation (27), but it may also be set using a table prepared in advance. For example, a table that specifies, for each combination of a value of the shape parameter α0 and a value of the flooring coefficient η, a number of iterations Q capable of achieving the target value Rtar is stored in advance in the storage device 24. The iteration count setting unit 48 then retrieves from the table the number of iterations Q corresponding to the shape parameter α0 calculated by the characteristic value calculation unit 42 and the flooring coefficient η set by the first coefficient setting unit 44, and supplies it to the noise suppression unit 36.

<Fourth Embodiment>
FIG. 8 is a block diagram of a noise suppression device 100D according to the fourth embodiment. As shown in FIG. 8, the noise suppression device 100D of the fourth embodiment includes a sound collection unit 50, a frequency analysis unit 52, a signal processing unit 54, an output processing unit 56, and a waveform synthesis unit 58. The sound collection unit 50 is a microphone array composed of D sound collection devices 51 (D is a natural number of 2 or more) arranged at a distance from one another, and supplies the D-channel acoustic signals x1(t) to xD(t) generated by the sound collection devices 51 to the frequency analysis unit 52. Each acoustic signal xd(t) (d = 1 to D) is a time-domain signal representing the waveform of a mixture of a target sound component and a noise component. The target sound component is an acoustic component arriving from a specific sound source direction. The noise suppression device 100D generates an acoustic signal y(t) in which the noise component is suppressed from the acoustic signals x1(t) to xD(t).

The frequency analysis unit 52 sequentially generates, for each frame, the spectrum Xd^[0](f,τ) (X1^[0](f,τ) to XD^[0](f,τ)) of each of the D-channel acoustic signals x1(t) to xD(t). As shown in FIG. 8, the frequency analysis unit 52 outputs a D-dimensional vector V^[0](f,τ) (hereinafter the "observation vector") whose elements are the spectra X1^[0](f,τ) to XD^[0](f,τ) (V^[0](f,τ) = [X1^[0](f,τ), X2^[0](f,τ), ..., XD^[0](f,τ)]^T). The symbol T denotes matrix transposition.

The signal processing unit 54 generates D-channel spectra Y1(f,τ) to YD(f,τ) from the D-channel spectra X1^[0](f,τ) to XD^[0](f,τ) (the observation vector V^[0](f,τ)) generated by the frequency analysis unit 52. The spectrum Yd(f,τ) is the spectrum Xd^[0](f,τ) with its noise component suppressed. As shown in FIG. 8, the signal processing unit 54 includes Q stages of unit processing units U[1] to U[Q], connected in cascade, that sequentially process the D-channel acoustic signals x1(t) to xD(t) (X1^[0](f,τ) to XD^[0](f,τ)).

FIG. 9 is a block diagram of the q-th stage (q = 1 to Q) unit processing unit U[q] of the signal processing unit 54. Each unit processing unit U[q] generates an observation vector V^[q](f,τ) by applying the noise suppression process to the observation vector V^[q-1](f,τ) supplied from the preceding stage (the (q-1)-th stage). The observation vector V^[0](f,τ) generated by the frequency analysis unit 52 is supplied to the first-stage unit processing unit U[1]. The observation vector V^[Q](f,τ) generated by the Q-th (final) stage unit processing unit U[Q] corresponds to the D-channel spectra Y1(f,τ) to YD(f,τ) (V^[Q](f,τ) = [Y1(f,τ), Y2(f,τ), ..., YD(f,τ)]^T).

As shown in FIG. 9, each unit processing unit U[q] includes a noise estimation unit 62, a noise suppression coefficient setting unit 64, and a noise suppression unit 66. The noise estimation unit 62 generates an estimated noise vector Z^[q](f,τ) of estimated noise components by applying independent component analysis to the observation vector V^[q-1](f,τ). The noise estimation unit 62 of the fourth embodiment includes an independent component analysis unit 622 and an inverse projection unit 624.

The independent component analysis unit 622 generates a separation vector G^[q](f,τ) by applying sound source separation based on frequency-domain independent component analysis (FD-ICA) to the observation vector V^[q-1](f,τ). The separation vector G^[q](f,τ) consists of D elements g1^[q](f,τ) to gD^[q](f,τ) and is expressed by the following Equation (28).

The symbol W^[q](f) in Equation (28) denotes a separation matrix, which is calculated by cumulatively iterating a known update rule.

A separation vector G_(noise)^[q](f,τ), obtained by replacing with zero the element (gs^[q](f,τ)) that corresponds to the target sound component in the separation vector G^[q](f,τ) generated by the independent component analysis unit 622, is supplied to the inverse projection unit 624. That is, the separation vector G_(noise)^[q](f,τ) is expressed by the following Equation (29). The element gs^[q](f,τ) is the target sound component estimated by the independent component analysis.

The inverse projection unit 624 generates the estimated noise vector Z^[q](f,τ) by applying projection back, which resolves the scaling ambiguity of independent component analysis, to the separation vector G_(noise)^[q](f,τ). Specifically, the estimated noise vector Z^[q](f,τ) is generated by the calculation of equation (30).

The estimated noise vector Z^[q](f,τ) is a vector whose elements are the D estimated noise components z1^[q](f,τ) to zD^[q](f,τ) corresponding to the respective channels (Z^[q](f,τ) = [z1^[q](f,τ), z2^[q](f,τ), ..., zD^[q](f,τ)]^T).
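The noise estimation steps described above (separation, zeroing of the target element, projection back) can be sketched in Python for a single time-frequency bin. Equations (28) to (30) are not reproduced in this text, so the sketch assumes the common FD-ICA formulation G = W V and Z = W^-1 G_noise with a square, invertible separation matrix. The names `solve_linear`, `estimate_noise`, and `target_index` are hypothetical and do not come from the specification.

```python
def solve_linear(a, b):
    """Solve a @ x = b for a square complex matrix by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(map(complex, row)) + [complex(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("separation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_noise(w, v, target_index):
    """Return the estimated noise vector Z for one (f, tau) bin.

    w: D x D separation matrix W(f) (assumed invertible)
    v: observation vector V(f, tau) with D elements
    target_index: index s of the element treated as the target sound
    """
    # Separation (assumed form of equation (28)): G = W V
    g = [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]
    # Replace the target-sound element g_s with zero
    g_noise = list(g)
    g_noise[target_index] = 0j
    # Projection back (assumed form of equation (30)): solve W Z = G_noise
    return solve_linear(w, g_noise)
```

With a perfect separation matrix, the result is the noise contribution observed at each microphone, which is what the per-channel subtraction stage needs.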

The noise suppression coefficient setting unit 64 in FIG. 9 sets the flooring coefficients η[1] to η[D] of the respective channels and includes a characteristic value calculation unit 642 and a first coefficient setting unit 644. The characteristic value calculation unit 642 calculates the shape parameters α0[1] to α0[D] of the respective channels from the estimated noise vector Z^[q](f,τ). The shape parameter α0[d] is a variable that depends on the intensity distribution of the estimated noise component zd^[q](f,τ) of the estimated noise vector Z^[q](f,τ). The characteristic value calculation unit 642 calculates the shape parameter α0[d] from the estimated noise component zd^[q](f,τ) in the same way that the characteristic value calculation unit 42 of the first embodiment calculates the shape parameter α0 from the estimated noise component n(t).

The first coefficient setting unit 644 calculates the per-channel flooring coefficients η[1] to η[D] from the shape parameters α0[1] to α0[D] calculated by the characteristic value calculation unit 642. Specifically, like the first coefficient setting unit 44 of the first embodiment, the first coefficient setting unit 644 calculates the flooring coefficient η[d] corresponding to the shape parameter α0[d] so as to satisfy equation (24), which expresses the first condition, and equation (26), which expresses the second condition.

The noise suppression unit 66 in FIG. 9 generates the observation vector V^[q](f,τ) by applying, to the observation vector V^[q-1](f,τ), noise suppression processing that uses the flooring coefficients η[1] to η[D] calculated by the first coefficient setting unit 644. The noise suppression processing is executed separately for each channel. That is, the noise suppression unit 66 generates the spectrum Xd^[q](f,τ) by applying noise suppression processing with the flooring coefficient η[d] to the spectrum Xd^[q-1](f,τ) of the acoustic signal xd(t). Specifically, for each frequency f, the noise suppression unit 66 selectively executes either the subtraction processing of equation (31A) or the flooring processing of equation (31B). Consequently, in the signal processing unit 54 of the fourth embodiment, the noise suppression processing is cumulatively repeated Q times on the spectrum Xd^[0](f,τ) generated by the frequency analysis unit 52.
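The per-frequency choice between subtraction and flooring can be illustrated with a short Python sketch. Equations (31A) and (31B) are not reproduced in this text, so the sketch assumes a conventional power-domain form: the noise power scaled by the suppression coefficient β is subtracted when the result stays positive, and otherwise the input bin is scaled by the flooring coefficient η. The input phase is kept in both branches. The function names and the per-bin noise power input are hypothetical.

```python
import cmath
import math


def suppress_bin(x, noise_power, beta, eta):
    """One suppression pass on a single complex spectral bin."""
    power = abs(x) ** 2
    if power > beta * noise_power:
        # Subtraction branch (assumed form of equation (31A)).
        magnitude = math.sqrt(power - beta * noise_power)
        return cmath.rect(magnitude, cmath.phase(x))
    # Flooring branch (assumed form of equation (31B)).
    return eta * x


def suppress_channels(spectra, noise_powers, beta, etas):
    """Apply one pass to each channel d using its own flooring coefficient."""
    return [
        [suppress_bin(x, n, beta, eta) for x, n in zip(spectrum, powers)]
        for spectrum, powers, eta in zip(spectra, noise_powers, etas)
    ]
```

Calling `suppress_channels` once per unit processing unit, with freshly estimated noise powers and flooring coefficients each time, would correspond to the Q cumulative repetitions described above.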

The output processing unit 56 in FIG. 8 generates, for each frame, the spectrum Y(f,τ) of the acoustic signal y(t) from the D-channel spectra Y1(f,τ) to YD(f,τ) generated by the signal processing unit 54 (the Q-th stage unit processing unit U[Q]). Specifically, the output processing unit 56 is a delay-and-sum beamformer that forms a sound-collection beam (a region of high sensitivity) in the direction of the target sound component, and includes a delay unit 562 and an addition unit 564.

The delay unit 562 delays each of the spectra Y1(f,τ) to YD(f,τ) by a delay amount corresponding to the sound source direction φ of the target sound component. The sound source direction φ is identified from the separation matrix W^[q](f) estimated by the independent component analysis unit 622 of one unit processing unit U[q] selected from the Q unit processing units U[1] to U[Q] of the signal processing unit 54 (for example, the final-stage unit processing unit U[Q]).

The addition unit 564 generates the spectrum Y(f,τ) of the acoustic signal y(t) by adding the D-channel spectra Y1(f,τ) to YD(f,τ) delayed by the delay unit 562. As understood from the above description, the spectrum Y(f,τ) generated by the output processing unit 56 is a signal in which the target sound component in the sound source direction φ is emphasized.
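As a rough illustration of the delay and addition steps, the following Python sketch applies a delay-and-sum beamformer to one frequency bin. The specification does not state the array geometry or how the delay is derived from φ, so the sketch assumes a uniform linear array under a far-field model. The names `steering_delays` and `delay_and_sum` and the parameters `spacing_m` and `sound_speed` are hypothetical.

```python
import cmath
import math


def steering_delays(num_channels, spacing_m, phi_rad, sound_speed=343.0):
    """Per-channel delays (seconds) that align a plane wave from direction phi.

    Assumes a uniform linear array in which channel d receives the wave
    d * spacing * sin(phi) / c later than channel 0.
    """
    step = spacing_m * math.sin(phi_rad) / sound_speed
    return [(num_channels - 1 - d) * step for d in range(num_channels)]


def delay_and_sum(channel_bins, delays, freq_hz):
    """Delay each channel's bin in the frequency domain, then add them."""
    # A delay of t seconds multiplies the bin at frequency f by exp(-j*2*pi*f*t).
    return sum(
        y * cmath.exp(-2j * math.pi * freq_hz * t)
        for y, t in zip(channel_bins, delays)
    )
```

When the steering direction matches the true arrival direction, the D delayed bins add in phase; for other directions they partially cancel, which is the emphasis effect described above.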

Like the waveform synthesis unit 38 of the first embodiment, the waveform synthesis unit 58 in FIG. 8 generates the time-domain acoustic signal y(t) from the spectrum Y(f,τ) that the output processing unit 56 generates for each frame. The acoustic signal y(t) is supplied to, for example, a sound emitting device (not shown) and reproduced as sound waves.

The fourth embodiment described above achieves the same effects as the first embodiment. In addition, because the estimated noise component zd^[q](f,τ) is estimated by independent component analysis for each noise suppression process (that is, for each unit processing unit U[q]), the fourth embodiment has the advantage that accurate noise suppression can be achieved even when the characteristics of the noise component change partway through the acoustic signal xd(t). Furthermore, because the D-channel spectra Y1(f,τ) to YD(f,τ) are added in the output processing unit 56, any musical noise that does arise in an individual spectrum Yd(f,τ) can be made difficult to perceive in the acoustic signal y(t).

FIGS. 10 to 12 are graphs showing evaluation results of noise suppression. They compare the results for the fourth embodiment with those for noise suppression using the known BSSA (Blind Spatial Subtraction Array) (hereinafter "Comparative Example 4"). FIG. 10 shows the kurtosis ratio κ before and after noise suppression, and FIG. 11 shows the cepstrum distortion d of the acoustic signal y(t) after noise suppression. In FIGS. 10 and 11, the target sound component is a predetermined speech signal convolved with an impulse response, and the kurtosis ratio κ and the cepstrum distortion d are shown for two cases: one in which a noise component recorded inside a railway station (station noise) is added to the target sound component, and one in which a noise component recorded in a crowd (crowd noise) is added. FIG. 12 shows the results of a subjective evaluation of the sound quality of the acoustic signal y(t) after noise suppression. Specifically, FIG. 12 shows the proportion of subjects who rated the result of the fourth embodiment as having higher sound quality and the proportion who rated the result of Comparative Example 4 as having higher sound quality.

As shown in FIG. 10, in the fourth embodiment the kurtosis ratio κ across noise suppression is reduced to a value closer to 1 than in Comparative Example 4, which indicates that musical noise is effectively suppressed. On the other hand, FIG. 11 shows that the cepstrum distortion d of the acoustic signal y(t) after processing tends to be larger in the fourth embodiment than in Comparative Example 4. Nevertheless, in the subjective evaluation of FIG. 12, about 90% of the subjects rated the acoustic signal y(t) processed by the fourth embodiment as having higher sound quality than that of Comparative Example 4. That is, the fourth embodiment achieves noise suppression while keeping the sound quality perceived by the listener at a high level.
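The kurtosis ratio κ used in FIG. 10 can be illustrated with a small Python sketch. The specification defines kurtosis in an earlier section that is not reproduced here, so the sketch assumes the fourth-order moment divided by the squared second-order moment, with moments taken about the origin of the power-spectrum samples. The function names are hypothetical.

```python
def kurtosis(samples):
    """Kurtosis as mu4 / mu2**2, using moments about the origin (an assumption)."""
    n = len(samples)
    if n == 0:
        raise ValueError("no samples")
    mu2 = sum(x ** 2 for x in samples) / n
    mu4 = sum(x ** 4 for x in samples) / n
    if mu2 == 0:
        raise ValueError("all samples are zero")
    return mu4 / mu2 ** 2


def kurtosis_ratio(power_before, power_after):
    """Ratio of kurtosis after noise suppression to kurtosis before it."""
    return kurtosis(power_after) / kurtosis(power_before)
```

Under this definition, a ratio of 1 means the shape of the intensity distribution is unchanged by the processing, and a ratio well above 1 indicates isolated spectral peaks of the kind perceived as musical noise.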

＜Modifications＞
Each of the above embodiments can be modified in various ways. Specific modifications are exemplified below. Two or more modifications arbitrarily selected from the following examples may be combined as appropriate.

(1) In the subtraction processing of each of the above embodiments, the power spectrum |Ni-1(f,τ)|² of the estimated noise component n(t) is subtracted from the power spectrum |Yi-1(f,τ)|² obtained after the (i-1)-th noise suppression process. However, as in equation (2A'), the exponent applied to the amplitude spectrum |Yi-1(f,τ)| and the amplitude spectrum |Ni-1(f,τ)| can be generalized to an arbitrary positive number K. Generalization of the exponent K in the subtraction processing is also described in detail in, for example, Inoue et al., "Mathematical analysis of the amount of musical noise generated in the generalized spectral subtraction method," Acoustical Society of Japan, 3-5-4, p.759-762, March 2010 (hereinafter "Non-Patent Document 2").

According to Non-Patent Document 2, setting the exponent K to a smaller value yields a more natural-sounding acoustic signal y(t). Ideally, therefore, the exponent K is set to the smallest value within the range that the arithmetic processing unit 22 can handle (for example, within the limit at which significant values are obtained without underflow in the floating-point numbers that the arithmetic processing unit 22 can compute). Specifically, the exponent K can be set to a value of 1 or less (more preferably, a value below 0.5).
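Equation (2A') is not reproduced in this text. As a hedged sketch only, a generalized subtraction consistent with the description in modification (1) might take the following form, where the placement of the suppression coefficient β is an assumption carried over from conventional generalized spectral subtraction:

```latex
|Y_i(f,\tau)|^{K} = |Y_{i-1}(f,\tau)|^{K} - \beta\,|N_{i-1}(f,\tau)|^{K}, \qquad K > 0
```

Under this assumed form, K = 2 recovers the power-domain subtraction of the embodiments, and K = 1 corresponds to subtraction in the amplitude domain.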

(2) The specific content of the noise suppression processing is not limited to equation (2A) (or equation (2A')) and equation (2B) described above. For example, the Wiener filter expressed by equations (32A) and (32B) may be applied in a single noise suppression process, with one of the two equations selected for each frequency f.

The m-th order moment μm after the noise suppression processing of equations (32A) and (32B) is expressed by equation (33), as disclosed in, for example, T. Inoue, et al., "Theoretical analysis of musical noise in Wiener filtering family via higher-order statistics", Proc. ICASSP2011, p.5076-5079, 2011. The function MWF(α,β,η,m) in equation (33) is defined by equation (34).

Therefore, as with equations (24) and (26) described above, it is possible to calculate a flooring coefficient η that satisfies both the first condition and the second condition (that is, one that theoretically generates no musical noise).

(3) In the fourth embodiment, the noise estimation unit 62 of each unit processing unit U[q] generates the estimated noise components z1^[q](f,τ) to zD^[q](f,τ) for the respective channels, but the noise estimation unit 62 may instead use independent component analysis to generate an estimated noise component common to the D channels. In that case, the noise suppression unit 66 executes noise suppression processing that suppresses the common estimated noise component from each of the D-channel spectra X1^[q-1](f,τ) to XD^[q-1](f,τ) (the observation vector V^[q-1](f,τ)).

(4) In each of the above embodiments, the first coefficient setting unit 44 calculates the flooring coefficient η by the calculation of equation (24), but the first coefficient setting unit 44 may instead set the flooring coefficient η by referring to a coefficient table that associates a flooring coefficient η with each value of the shape parameter α0 and the suppression coefficient β. In the coefficient table, each combination of the shape parameter α0 and the suppression coefficient β is associated with a flooring coefficient η such that the relationship of equation (24) (the first condition) holds. As understood from the above description, the first coefficient setting unit 44 is broadly an element that variably sets the flooring coefficient η according to the shape parameter α0 so as to satisfy the relational expression defining the first condition; whether the flooring coefficient η is actually computed or obtained from a coefficient table does not matter.
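The coefficient-table variant in modification (4) can be sketched in Python as a precomputed grid with nearest-neighbour lookup. Equation (24) is not reproduced in this section, so the solver is passed in as a callable: `solve_eta` is a hypothetical stand-in for whatever routine evaluates equation (24). The grid spacing and the nearest-neighbour rule are likewise assumptions.

```python
def build_table(alphas, betas, solve_eta):
    """Precompute eta for every (alpha0, beta) grid point.

    solve_eta(alpha0, beta) is a hypothetical callable standing in for
    the solution of equation (24); it is not defined by this sketch.
    """
    return {(a, b): solve_eta(a, b) for a in alphas for b in betas}


def lookup_eta(table, alpha0, beta):
    """Return the eta stored at the grid point nearest to (alpha0, beta)."""
    key = min(table, key=lambda k: (k[0] - alpha0) ** 2 + (k[1] - beta) ** 2)
    return table[key]
```

A table of this kind trades memory for run-time cost: equation (24) is evaluated once per grid point offline, and each frame then needs only a lookup.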

(5) In each of the above embodiments, the noise suppression unit 36 executes iterative noise suppression, but the present invention also applies to a configuration in which the noise suppression unit 36 executes the noise suppression processing only once for each frame of the acoustic signal x(t). That is, repetition of the noise suppression processing is not essential to the present invention. Even when the noise suppression processing is executed only once per frame, setting the flooring coefficient η so as to satisfy the first condition and the second condition achieves the intended effect: compared with a configuration in which the flooring coefficient η is fixed at a predetermined value, the noise component of the acoustic signal x(t) can be suppressed while the generation of musical noise is suppressed.

(6) A configuration including both the second coefficient setting unit 46 of the second embodiment and the iteration count setting unit 48 of the third embodiment may also be adopted. For each of a plurality of candidate values q of the iteration count Q, the iteration count setting unit 48 uses equation (27) to calculate the noise suppression rate R(αq,β,η) obtained when noise suppression processing using the flooring coefficient η set by the first coefficient setting unit 44 and the suppression coefficient β set by the second coefficient setting unit 46 is repeated q times, and determines as the iteration count Q the smallest candidate value q for which the noise suppression rate R(αq,β,η) exceeds the target value Rtar.
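The selection rule in modification (6) amounts to a search for the smallest candidate that clears the target. In the Python sketch below, `suppression_rate` is a hypothetical callable standing in for equation (27), which is not reproduced in this section; returning `None` when no candidate qualifies is also an assumption, since the specification does not describe that case.

```python
def choose_iterations(candidates, suppression_rate, target):
    """Return the smallest candidate q whose suppression rate exceeds target.

    suppression_rate(q) is a hypothetical stand-in for the noise
    suppression rate R(alpha_q, beta, eta) after q repetitions.
    """
    for q in sorted(candidates):
        if suppression_rate(q) > target:
            return q
    return None  # no candidate reaches the target (behaviour assumed)
```

For example, with a placeholder rate that grows by 3 per iteration and a target of 10, `choose_iterations(range(1, 11), lambda q: 3.0 * q, 10.0)` returns 4.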

(7) In the second embodiment, the suppression coefficient β is set according to both the shape parameter α0 and the flooring coefficient η, but the method of controlling the suppression coefficient β is not limited to this example. For example, the suppression coefficient β may be set according to the shape parameter α0 alone (as a value that does not depend on the flooring coefficient η).

(8) In each of the above embodiments, the shape parameter α0 of the probability density function P(x) approximating the intensity distribution of the acoustic signal x(t) is used as an index of the characteristics of the estimated noise component n(t) (a noise characteristic value), but the noise characteristic value is not limited to the shape parameter α0. For example, a statistic calculated directly from the intensity distribution of the acoustic signal x(t) (for example, a higher-order statistic such as kurtosis) or a statistic based on the frequency distribution of the amplitude |X(f,τ)| of the acoustic signal x(t) (for example, the shape parameter of a probability density function approximating the frequency distribution of the amplitude |X(f,τ)|) may be used as the noise characteristic value. That is, the noise characteristic value broadly covers any value that changes according to the characteristics of the acoustic signal x(t) (in particular, the characteristics of the estimated noise component n(t)), typically a value that reflects the shape of the intensity distribution.

(9) Each of the above embodiments describes the noise suppression device 100 (100A, 100B, 100C) including the noise suppression unit 36, but the present invention can also be realized as a device that variably sets the variables applied to noise suppression (for example, the flooring coefficient η, the suppression coefficient β, and the iteration count Q) (hereinafter "noise suppression coefficient setting device"). The noise suppression device 100 is realized by adding the noise suppression unit 36 and the noise estimation unit 34 to the noise suppression coefficient setting device. For example, the noise suppression coefficient setting device corresponding to the first embodiment includes the characteristic value calculation unit 42 and the first coefficient setting unit 44 of the above embodiments. The noise suppression coefficient setting unit 64 of the fourth embodiment corresponds to the noise suppression coefficient setting device according to each aspect of the present invention. The second coefficient setting unit 46 of the second embodiment and the iteration count setting unit 48 of the third embodiment may be added to the noise suppression coefficient setting device as appropriate.

100A, 100B, 100C, 100D: noise suppression device; 12: signal supply device; 14: sound emitting device; 22: arithmetic processing unit; 24: storage device; 32, 52: frequency analysis unit; 34, 62: noise estimation unit; 36, 66: noise suppression unit; 38, 58: waveform synthesis unit; 42: characteristic value calculation unit; 44: first coefficient setting unit; 46: second coefficient setting unit; 48: iteration count setting unit; 50: sound collection unit; 51: sound collection device; 54: signal processing unit; U[q] (U[1] to U[Q]): unit processing unit; 56: output processing unit.

Claims

A noise suppression coefficient setting device comprising:
characteristic value calculation means for calculating a noise characteristic value according to the shape of the intensity distribution of an input acoustic signal; and
first coefficient setting means for variably setting a flooring coefficient according to the noise characteristic value calculated by the characteristic value calculation means, so as to satisfy a relational expression that defines the relationship between the noise characteristic value of the acoustic signal and the flooring coefficient applied to flooring processing of noise suppression processing in the case where the kurtosis of the intensity distribution of the acoustic signal does not change before and after the noise suppression processing.

The noise suppression coefficient setting device according to claim 1, wherein the first coefficient setting means sets the flooring coefficient so that a noise suppression rate by the noise suppression processing becomes a positive number.

The noise suppression coefficient setting device according to claim 1 or 2, further comprising second coefficient setting means for variably setting a suppression coefficient that controls the strength of noise suppression according to the noise characteristic value calculated by the characteristic value calculation means.

The noise suppression coefficient setting device according to any one of claims 1 to 3, further comprising iteration count setting means for variably setting the number of iterations of the noise suppression processing according to the noise characteristic value calculated by the characteristic value calculation means and the flooring coefficient set by the first coefficient setting means, so that the cumulative value of the noise suppression rates of the respective noise suppression processes exceeds a target value.

A noise suppression device comprising:
the noise suppression coefficient setting device according to any one of claims 1 to 4; and
noise suppression means for executing, on an input acoustic signal, noise suppression processing to which a coefficient set by the noise suppression coefficient setting device is applied.

A noise suppression device comprising:
Q stages of unit processing means for sequentially processing acoustic signals of D channels (D is a natural number of 2 or more) generated by sound collection devices arranged apart from each other; and
output processing means for emphasizing a target sound component in a specific sound source direction by delay-and-sum processing of the D-channel acoustic signals processed by the final-stage unit processing means among the Q stages,
wherein each of the Q stages of unit processing means includes:
noise estimation means for generating estimated noise components by independent component analysis of the D-channel acoustic signals supplied to that unit processing means;
the noise suppression coefficient setting device according to any one of claims 1 to 4, which variably sets, for each channel, a flooring coefficient according to the noise characteristic value of the estimated noise component; and
noise suppression means for executing, on the acoustic signal of each channel, noise suppression processing to which the flooring coefficient set for that channel by the noise suppression coefficient setting device is applied, and outputting the result.