JP2010220087A

JP2010220087A - Sound processing apparatus and program

Info

Publication number: JP2010220087A
Application number: JP2009066830A
Authority: JP
Inventors: Takafumi Tanaka; 啓文田中; Kazunobu Kondo; 多伸近藤; Hiroshi Okumura; 啓奥村
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2009-03-18
Filing date: 2009-03-18
Publication date: 2010-09-30

Abstract

PROBLEM TO BE SOLVED: To suppress feedback sound while suppressing the generation of musical noise caused by spectral subtraction. SOLUTION: A feedback sound estimation unit 24 generates an estimated feedback sound component n2(t) by estimating a feedback sound component n1(t) that reaches a sound collection apparatus 14 from a sound emission apparatus 12. A feedback sound suppression unit 26 suppresses the estimated feedback sound component n2(t) in a sound signal w1(t) generated by the sound collection apparatus 14. A kurtosis calculation unit 52A calculates the kurtosis KA(m) of the frequency distribution of the intensity of an estimated residual sound component e2(t), which estimates the part of the feedback sound component n1(t) that remains after processing by the feedback sound suppression unit 26. A coefficient setter 54 sets a subtraction coefficient α(m) in accordance with the kurtosis KA(m) calculated by the kurtosis calculation unit 52A. A spectrum subtraction unit 42 adjusts a spectrum E2(m, f) of the estimated residual sound component e2(t) in accordance with the subtraction coefficient α(m) and subtracts it from a spectrum W2(m, f) of a sound signal w2(t) obtained after processing by the feedback sound suppression unit 26. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a technique for suppressing the influence of sound that reaches a sound collecting device from a sound emitting device (hereinafter referred to as "feedback sound") in an acoustic system that includes the sound emitting device and the sound collecting device.

Techniques for suppressing the influence (echo or howling) of feedback sound that reaches a sound collecting device from a sound emitting device have been proposed. For example, Patent Document 1 discloses a technique in which a component simulating the feedback sound (hereinafter referred to as an "estimated feedback sound component") is subtracted from the input speech in the time domain, and the spectrum of the estimated feedback sound component is then subtracted from the spectrum obtained after that subtraction (spectral subtraction).

JP 2004-56453 A

However, with a technique that suppresses the estimated feedback sound component in the frequency domain as in Patent Document 1, components that remain scattered along the time axis and the frequency axis after spectral subtraction can be perceived by the listener as artificial, harsh musical noise. In view of these circumstances, an object of the present invention is to suppress feedback sound while suppressing the generation of musical noise.

To solve the above problem, a sound processing apparatus according to a first aspect of the present invention comprises: feedback sound estimation means for generating an estimated feedback sound component that estimates feedback sound reaching a sound collecting device from a sound emitting device; feedback sound suppression means for suppressing the estimated feedback sound component in an acoustic signal generated by the sound collecting device (for example, acoustic signal w1(t) or acoustic signal w2(t)); kurtosis calculation means (for example, kurtosis calculation unit 52A) for calculating the kurtosis (for example, kurtosis KA(m)) of the frequency distribution of the intensity of an estimated residual sound component, which estimates the part of the feedback sound that remains after processing by the feedback sound suppression means; coefficient setting means for setting a subtraction coefficient in accordance with the kurtosis calculated by the kurtosis calculation means; and spectrum subtraction means for adjusting the spectrum of the estimated residual sound component in accordance with the subtraction coefficient and subtracting it from the spectrum of the acoustic signal processed by the feedback sound suppression means (for example, acoustic signal w2(t)). Specific examples of this aspect are described later as the first, second, and fifth embodiments.

In the sound processing apparatus according to the first aspect, the spectrum of the estimated residual sound component is subtracted from the spectrum of the acoustic signal processed by the feedback sound suppression means. Compared with a configuration in which only the suppression by the feedback sound suppression means is performed, the feedback sound component can therefore be suppressed more effectively in the acoustic signal generated by the sound collecting device. In addition, the subtraction coefficient applied to the adjustment of the spectrum of the estimated residual sound component is set in accordance with the kurtosis of the frequency distribution of the intensity of the estimated residual sound component. Compared with a configuration in which the subtraction coefficient does not depend on the kurtosis (for example, one in which it is fixed to a predetermined value), musical noise caused by spectral subtraction can therefore be suppressed. Note that "the kurtosis of the frequency distribution of the intensity of the estimated residual sound component" is a concept that covers both the kurtosis of the frequency distribution of the signal values (intensities) arranged in time series in a time-domain signal representing the waveform of the estimated residual sound component and the kurtosis of the frequency distribution of the intensities in the spectrum of the estimated residual sound component.
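By way of illustration only, the kurtosis of a set of intensities can be sketched as follows. This passage does not state a formula, so the sketch assumes the conventional moment-ratio definition (fourth central moment divided by the squared second central moment); the function name `kurtosis` is hypothetical. The same function applies whether the intensities are time-domain sample values or the spectral intensities of one frame.

```python
def kurtosis(values):
    """Kurtosis of a sequence of intensities.

    Assumption: the conventional definition m4 / m2**2, where m2 and m4 are
    the second and fourth central moments. A Gaussian distribution gives a
    value close to 3; a distribution with a few isolated large values gives
    a larger value.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0.0:
        return 0.0  # constant input: kurtosis is undefined, 0.0 is returned by convention here
    return m4 / (m2 * m2)
```

A frame in which one intensity stands far above the rest yields a high kurtosis, which matches the intuition that isolated peaks are what a listener hears as musical noise.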

In a preferred aspect of the present invention, the spectrum subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the result from the spectrum of the acoustic signal processed by the feedback sound suppression means. In this aspect, setting the subtraction coefficient to a larger value as the kurtosis increases effectively suppresses musical noise caused by spectral subtraction.
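The relationship described above can be sketched as follows, under stated assumptions. The linear, clamped mapping from kurtosis to coefficient, its default breakpoints, and the flooring of negative results at zero are all hypothetical choices made for illustration; the passage only requires that the coefficient grow with the kurtosis. The function names are likewise hypothetical.

```python
def alpha_from_kurtosis(k, k_low=3.0, k_high=10.0, alpha_min=1.0, alpha_max=4.0):
    """Map kurtosis to a subtraction coefficient that never decreases with k.

    Assumption: a clamped linear mapping; the breakpoints are illustrative.
    """
    if k <= k_low:
        return alpha_min
    if k >= k_high:
        return alpha_max
    ratio = (k - k_low) / (k_high - k_low)
    return alpha_min + ratio * (alpha_max - alpha_min)


def spectral_subtract(w2_mag, e2_mag, alpha):
    """Subtract alpha * |E2(m, f)| from |W2(m, f)| for every frequency bin.

    Assumption: negative results are floored at zero.
    """
    return [max(w - alpha * e, 0.0) for w, e in zip(w2_mag, e2_mag)]
```

For example, `spectral_subtract([1.0, 0.5], [0.25, 0.5], alpha_from_kurtosis(6.5))` applies a coefficient midway between the two limits.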

In a preferred aspect of the present invention, the kurtosis calculation means calculates both the kurtosis of the estimated residual sound component and the kurtosis of the frequency distribution of the intensity of the acoustic signal generated by the sound collecting device. When the kurtosis of the estimated residual sound component and the kurtosis of the acoustic signal are close to each other, the coefficient setting means sets the subtraction coefficient in accordance with at least one of the two; when they are not close, it stops updating the subtraction coefficient. In this aspect, updating stops when the two kurtoses are not close (that is, when the acoustic signal contains the target sound component), so the subtraction coefficient is not affected by the target sound component. Musical noise can therefore be suppressed appropriately. A specific example of this aspect is described later as the fifth embodiment.
A configuration is also preferable in which, when the kurtosis of the estimated residual sound component and the kurtosis of the acoustic signal are not close, the coefficient setting means stops updating the subtraction coefficient and additionally initializes it to a predetermined value (for example, a value that reduces the degree of spectral subtraction). This configuration prevents the spectral subtraction applied to a section of the acoustic signal that contains the target sound component from becoming excessive under the influence of sections that do not contain it.
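The update rule described in this aspect might be sketched as below. The closeness test (absolute difference within a tolerance), the tolerance value, and the mapping `coefficient_for` are assumptions introduced for illustration; the passage does not specify how closeness is judged.

```python
def coefficient_for(kurt, gain=0.5, alpha_min=1.0, alpha_max=4.0):
    """Hypothetical mapping: coefficient proportional to kurtosis, then clamped."""
    return min(max(gain * kurt, alpha_min), alpha_max)


def update_alpha(alpha_prev, k_residual, k_signal, tolerance=0.5, alpha_reset=None):
    """Return the subtraction coefficient for the current frame.

    If the two kurtoses are close, the coefficient is refreshed from the
    kurtosis of the estimated residual sound component. Otherwise the update
    is skipped: the previous value is held, or, when alpha_reset is given,
    the coefficient is initialized to that predetermined value.
    """
    if abs(k_residual - k_signal) <= tolerance:
        return coefficient_for(k_residual)
    return alpha_prev if alpha_reset is None else alpha_reset
```

Passing a small `alpha_reset` corresponds to the variant that initializes the coefficient to a value reducing the degree of spectral subtraction while the target sound is present.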

A sound processing apparatus according to a second aspect of the present invention comprises: feedback sound estimation means for generating an estimated feedback sound component that estimates feedback sound reaching a sound collecting device from a sound emitting device; feedback sound suppression means for suppressing the estimated feedback sound component in an acoustic signal generated by the sound collecting device (for example, acoustic signal w1(t) or acoustic signal w2(t)); kurtosis calculation means for calculating the kurtosis (for example, kurtosis KB(m) or kurtosis KC(m)) of the frequency distribution of the intensity of an acoustic signal generated by the sound collecting device (for example, acoustic signals w1(t) to w3(t)); coefficient setting means for setting a subtraction coefficient in accordance with the kurtosis calculated by the kurtosis calculation means; and spectrum subtraction means for adjusting, in accordance with the subtraction coefficient, the spectrum of an estimated residual sound component, which estimates the part of the feedback sound that remains after processing by the feedback sound suppression means, and subtracting it from the spectrum of the acoustic signal processed by the feedback sound suppression means (for example, acoustic signal w2(t)). Specific examples of this aspect are described later as the third to seventh embodiments.

In the sound processing apparatus according to the second aspect, the spectrum of the estimated residual sound component is subtracted from the spectrum of the acoustic signal processed by the feedback sound suppression means. Compared with a configuration in which only the suppression by the feedback sound suppression means is performed, the feedback sound component can therefore be suppressed more effectively in the acoustic signal generated by the sound collecting device. In addition, the subtraction coefficient applied to the adjustment of the spectrum of the estimated residual sound component is set in accordance with the kurtosis of the frequency distribution of the intensity of the acoustic signal. Compared with a configuration in which the subtraction coefficient does not depend on the kurtosis (for example, one in which it is fixed to a predetermined value), musical noise caused by spectral subtraction can therefore be suppressed. Note that "the kurtosis of the frequency distribution of the intensity of the acoustic signal" is a concept that covers both the kurtosis of the frequency distribution of the signal values (intensities) arranged in time series in the acoustic signal and the kurtosis of the frequency distribution of the intensities in the spectrum of the acoustic signal.

In specific examples of the second aspect (for example, the third, fourth, fifth, and seventh embodiments), the kurtosis calculation means calculates the kurtosis of the acoustic signal before processing by the spectrum subtraction means, and the spectrum subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the result from the spectrum of the acoustic signal processed by the feedback sound suppression means. In this configuration, setting the subtraction coefficient to a larger value as the kurtosis increases, for example, effectively suppresses musical noise caused by spectral subtraction. Note that "the acoustic signal before processing by the spectrum subtraction means" is not limited to the acoustic signal immediately before that processing; it covers the acoustic signal at any stage from its generation by the sound collecting device up to the processing by the spectrum subtraction means (for example, the acoustic signal before or after processing by the feedback sound suppression means).

In other specific examples of the second aspect (for example, the sixth and seventh embodiments), the kurtosis calculation means calculates the kurtosis of the acoustic signal after processing by the spectrum subtraction means, and the spectrum subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the result from the spectrum of the acoustic signal processed by the feedback sound suppression means. In this configuration, setting the subtraction coefficient to a smaller value as the kurtosis increases, for example, effectively suppresses musical noise caused by spectral subtraction. Note that "the acoustic signal after processing by the spectrum subtraction means" is not limited to the acoustic signal immediately after that processing; it also covers, for example, an acoustic signal on which other processing has been performed after the processing by the spectrum subtraction means.

A sound processing apparatus according to a preferred aspect of the present invention (first and second aspects) comprises target sound determination means for determining whether a target sound component is present in the acoustic signal generated by the sound collecting device. The kurtosis calculation means calculates the kurtosis when the target sound determination means determines that no target sound component is present, and stops calculating the kurtosis when the target sound determination means determines that a target sound component is present. In this aspect, the influence of the target sound component on the subtraction coefficient is reduced (ideally eliminated), so musical noise caused by spectral subtraction can be suppressed effectively. Specific examples of this aspect are described later as, for example, the second and fourth embodiments.
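The gating described in this aspect can be sketched as a small stateful helper. The class name `GatedKurtosis`, its constructor arguments, and the choice to hold the last calculated value while calculation is stopped are assumptions for illustration; the passage states only that calculation stops while a target sound component is present.

```python
class GatedKurtosis:
    """Recalculate kurtosis only for frames without a target sound component."""

    def __init__(self, kurtosis_fn, initial=3.0):
        # kurtosis_fn: any callable that maps a frame of intensities to a kurtosis value.
        # initial: value reported before the first calculation (hypothetical default).
        self._kurtosis_fn = kurtosis_fn
        self.value = initial

    def update(self, frame, target_present):
        """Return the current kurtosis, recalculating it only when no target sound is present."""
        if not target_present:
            self.value = self._kurtosis_fn(frame)
        return self.value
```

Because the value is frozen during voiced sections, a coefficient derived from it is not disturbed by the statistics of the target sound.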

The sound processing apparatus according to each of the above aspects can be implemented by hardware (an electronic circuit) such as a DSP (Digital Signal Processor) dedicated to processing acoustic signals, or by cooperation between a general-purpose processing unit such as a CPU (Central Processing Unit) and a program. A program according to the first aspect of the present invention causes a computer to execute: a feedback sound estimation process of generating an estimated feedback sound component that estimates feedback sound reaching a sound collecting device from a sound emitting device; a feedback sound suppression process of suppressing the estimated feedback sound component in an acoustic signal generated by the sound collecting device; a kurtosis calculation process of calculating the kurtosis of the frequency distribution of the intensity of an estimated residual sound component, which estimates the part of the feedback sound that remains after the feedback sound suppression process; a coefficient setting process of setting a subtraction coefficient in accordance with the kurtosis calculated in the kurtosis calculation process; and a spectrum subtraction process of adjusting the spectrum of the estimated residual sound component in accordance with the subtraction coefficient and subtracting it from the spectrum of the acoustic signal obtained after the feedback sound suppression process.
This program provides the same operation and effects as the sound processing apparatus according to the first aspect of the present invention.

A program according to the second aspect of the present invention causes a computer to execute: a feedback sound estimation process of generating an estimated feedback sound component that estimates feedback sound reaching a sound collecting device from a sound emitting device; a feedback sound suppression process of suppressing the estimated feedback sound component in an acoustic signal generated by the sound collecting device; a kurtosis calculation process of calculating the kurtosis of the frequency distribution of the intensity of an acoustic signal generated by the sound collecting device; a coefficient setting process of setting a subtraction coefficient in accordance with the kurtosis calculated in the kurtosis calculation process; and a spectrum subtraction process of adjusting, in accordance with the subtraction coefficient, the spectrum of an estimated residual sound component, which estimates the part of the feedback sound that remains after the feedback sound suppression process, and subtracting it from the spectrum of the acoustic signal obtained after the feedback sound suppression process. This program provides the same operation and effects as the sound processing apparatus according to the second aspect of the present invention.

The program according to each of the above aspects may be provided to the user stored on a computer-readable recording medium and installed on a computer, or may be provided from a server device by distribution over a communication network and installed on a computer.

FIG. 1 is a block diagram of a sound processing apparatus according to a first embodiment of the present invention. The remaining drawings are, in order: a conceptual diagram for explaining musical noise caused by spectral subtraction, followed by block diagrams of the sound processing apparatus according to the second, third, fourth, fifth, sixth, and seventh embodiments of the present invention.

<A: First Embodiment>
FIG. 1 is a block diagram of a sound processing apparatus 100A according to the first embodiment of the present invention. The sound processing apparatus 100A is suitable for a telephone (typically a hands-free telephone) that exchanges acoustic signals with other communication devices via a communication network.

A sound emitting device (for example, a speaker) 12 and a sound collecting device 14 are connected to the sound processing apparatus 100A. A time-domain (time t) acoustic signal (far-end signal) v(t) received from the communication network is supplied to the sound emitting device 12 via the sound processing apparatus 100A. The sound emitting device 12 emits sound waves corresponding to the acoustic signal v(t). The sound collecting device 14 generates a time-domain acoustic signal w1(t) corresponding to the surrounding sound and outputs it to the sound processing apparatus 100A. For convenience, the D/A converter that converts the acoustic signal v(t) into an analog signal and supplies it to the sound emitting device 12 and the A/D converter that converts the acoustic signal w1(t) into a digital signal are not shown.

Target sound and feedback sound reach the sound collecting device 14. The target sound is sound produced by the user of the sound processing apparatus 100A (that is, the sound that is the actual purpose of the communication), and the feedback sound is sound (typically echo) that reaches the sound collecting device 14 from the sound emitting device 12 directly or indirectly (that is, after reflection). The acoustic signal w1(t) therefore corresponds to the sum of a target sound component s(t) corresponding to the target sound and a feedback sound component n1(t) corresponding to the feedback sound, as expressed by formula (1) below. The feedback sound component n1(t) is the component obtained by applying to the acoustic signal v(t) a transfer function h1 that depends on the path of the sound wave from the sound emitting device 12 to the sound collecting device 14 (n1(t) = h1·v(t)).
w1(t) = s(t) + n1(t) ……(1)
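As a rough illustration of formula (1), the microphone signal can be simulated as follows. Modelling the transfer function h1 as a short finite impulse response applied by convolution is an assumption made for this sketch, and the function names are hypothetical.

```python
def apply_path(h1, v):
    """Feedback sound component n1(t): the far-end signal v(t) passed through the path h1.

    Assumption: h1 is a finite impulse response, applied by direct convolution.
    """
    n1 = []
    for t in range(len(v)):
        acc = 0.0
        for k, coeff in enumerate(h1):
            if t - k >= 0:
                acc += coeff * v[t - k]
        n1.append(acc)
    return n1


def microphone_signal(s, v, h1):
    """Formula (1): w1(t) = s(t) + n1(t)."""
    n1 = apply_path(h1, v)
    return [a + b for a, b in zip(s, n1)]
```

With `h1 = [0.5, 0.25]`, a single impulse in v(t) appears in w1(t) as a direct arrival followed by a weaker, delayed reflection added to the target sound.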

The sound processing apparatus 100A is an echo suppression apparatus (echo canceller) that generates an acoustic signal (near-end signal) z(t) by suppressing the feedback sound component n1(t) in the acoustic signal w1(t) generated by the sound collecting device 14. The acoustic signal z(t) generated by the sound processing apparatus 100A is transmitted to another communication device via the communication network. Each element of the sound processing apparatus 100A shown in FIG. 1 is implemented by a general-purpose computer (CPU) that executes a program or by a dedicated electronic circuit (DSP).

The target sound determination unit 22 in FIG. 1 is an element (VAD: voice activity detection) that determines whether the target sound component s(t) is present in the acoustic signal w1(t) and thereby distinguishes voiced sections from silent sections. A voiced section is a section in which the acoustic signal w1(t) contains the target sound component s(t) (regardless of whether the feedback sound component n1(t) is present), and a silent section is a section in which the acoustic signal w1(t) does not contain the target sound component s(t) (or its intensity is sufficiently low). Any known technique may be used to distinguish voiced sections from silent sections.

The feedback sound estimation unit 24 generates an estimated feedback sound component n2(t) that estimates the feedback sound component n1(t). For example, an AEC (acoustic echo canceller) using an adaptive filter is suitable as the feedback sound estimation unit 24. The feedback sound suppression unit 26 generates an acoustic signal w2(t) by suppressing, in the acoustic signal w1(t), the estimated feedback sound component n2(t) generated by the feedback sound estimation unit 24. For example, as shown in FIG. 1, a subtractor that subtracts the estimated feedback sound component n2(t) from the acoustic signal w1(t) is used as the feedback sound suppression unit 26 (w2(t) = w1(t) − n2(t)).

The feedback sound estimation unit 24 generates a transfer function h2 by estimating the transfer function h1 so that the intensity of the acoustic signal w2(t) is minimized, and generates the estimated feedback sound component n2(t) by multiplying the acoustic signal v(t) by the transfer function h2 (n2(t) = h2·v(t)). The feedback sound estimation unit 24 successively calculates and updates the estimated feedback sound component n2(t) (transfer function h2) within silent sections determined by the target sound determination unit 22 (sections in which the target sound component s(t) is absent), and stops updating it within voiced sections determined by the target sound determination unit 22. This has the advantage that the estimated feedback sound component n2(t) can be estimated with high accuracy while the influence of the target sound component s(t) is suppressed.
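One possible realization of units 24 and 26 is sketched below. The description specifies only "an adaptive filter", so the normalized LMS (NLMS) update used here is a stand-in chosen for illustration, and the function name, tap count, and step size are hypothetical. The filter coefficients play the role of h2, and their update is frozen in voiced sections as described above.

```python
def nlms_echo_canceller(v, w1, voiced, taps=4, mu=0.5, eps=1e-8):
    """Return (w2, h2) for far-end signal v, microphone signal w1, and per-sample VAD flags.

    Assumption: an NLMS adaptive FIR filter stands in for the unspecified
    adaptive filter. The coefficients h2 are adapted only where voiced[t]
    is False.
    """
    h2 = [0.0] * taps
    buf = [0.0] * taps  # most recent far-end samples, newest first
    w2 = []
    for t in range(len(v)):
        buf = [v[t]] + buf[:-1]
        n2 = sum(h * x for h, x in zip(h2, buf))  # estimated feedback sound n2(t)
        e = w1[t] - n2                            # w2(t) = w1(t) - n2(t)
        w2.append(e)
        if not voiced[t]:
            norm = sum(x * x for x in buf) + eps
            h2 = [h + mu * e * x / norm for h, x in zip(h2, buf)]
    return w2, h2
```

When the target sound is absent, w2(t) equals the residual e1(t), so driving the intensity of w2(t) toward zero moves h2 toward h1.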

However, because exact estimation of the transfer function h1 is difficult in practice, the actual transfer function h1 and the transfer function h2 estimated by the feedback sound estimation unit 24 do not necessarily match. Consequently, as expressed by formula (2) below, a residual sound component e1(t) (a part of the feedback sound component n1(t)) corresponding to the difference between the feedback sound component n1(t) and the estimated feedback sound component n2(t) remains in the acoustic signal w2(t) processed by the feedback sound suppression unit 26 (e1(t) = n1(t) − n2(t)). The sound processing apparatus 100A therefore suppresses the residual sound component e1(t) remaining in the acoustic signal w2(t) by subtracting the spectrum E2(m,f) of an estimated residual sound component e2(t), which estimates the residual sound component e1(t), from the spectrum W2(m,f) of the acoustic signal w2(t) (spectral subtraction).
w2(t) = w1(t) − n2(t)
= s(t) + n1(t) − n2(t)
= s(t) + e1(t) ……(2)

The frequency analysis unit 32 in FIG. 1 generates a spectrum (frequency spectrum) N2(m,f) for each of a plurality of frames obtained by dividing, on the time axis, the time series of the estimated feedback sound component n2(t) generated by the feedback sound estimation unit 24. The symbol m denotes a frame (frame number), and the symbol f denotes a frequency or frequency band (frequency bin) on the frequency axis. In the same way as the frequency analysis unit 32, the frequency analysis unit 34 generates a spectrum (frequency spectrum) W2(m,f) for each of a plurality of frames obtained by dividing the acoustic signal w2(t) on the time axis. Any known frequency analysis, such as the fast Fourier transform or the wavelet transform, may be used to generate the spectrum N2(m,f) and the spectrum W2(m,f).
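The frame-wise analysis can be sketched as follows. A direct (non-fast) discrete Fourier transform is used here purely as a dependency-free stand-in for the fast Fourier transform mentioned above; the frame length, hop size, absence of a window function, and function names are assumptions for illustration.

```python
import cmath


def split_frames(signal, frame_len, hop):
    """Divide a time series into frames of frame_len samples, hop samples apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def dft(frame):
    """Direct DFT of one frame; the list index is the frequency bin f."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * cmath.pi * f * t / n) for t, x in enumerate(frame))
            for f in range(n)]


def spectrogram(signal, frame_len, hop):
    """Spectra indexed as result[m][f], mirroring the notation N2(m,f) and W2(m,f)."""
    return [dft(frame) for frame in split_frames(signal, frame_len, hop)]
```

Applying `spectrogram` to n2(t) and to w2(t) with the same frame length and hop keeps the frame index m and the bin index f aligned between the two spectra, which the later bin-by-bin subtraction relies on.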

The residual sound estimation unit 36 generates the spectrum E2(m, f) of the estimated residual sound component e2(t), which is an estimate of the residual sound component e1(t) of the feedback sound component n1(t) that remains after processing by the feedback sound suppression unit 26 (that is, it estimates the residual sound component e1(t)). Considering the tendency for the frequency characteristic of the feedback sound component n1(t) (estimated feedback sound component n2(t)) to be reflected in the residual sound component e1(t), the residual sound estimation unit 36 of this embodiment generates the spectrum E2(m, f) from the spectrum N2(m, f) of the estimated feedback sound component n2(t) generated by the frequency analysis unit 32. Specifically, the residual sound estimation unit 36 calculates E2(m, f) by multiplying N2(m, f) by a coefficient δ (0 ≤ δ ≤ 1). The coefficient δ may be set to a value common to all frequencies or to a different value for each frequency.
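The framing, frequency analysis, and residual estimation steps can be sketched as follows. This is a minimal illustration, not the implementation described in the specification: the function names, the Hann window, the frame length, and the hop size are all assumptions, and the FFT is just one of the admissible frequency analyses.

```python
import numpy as np

def frame_spectra(x, frame_len=256, hop=128):
    """Split a 1-D signal into windowed frames and return one spectrum per frame.

    Rows index frames m, columns index frequency bins f.
    frame_len, hop and the Hann window are illustrative choices.
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def estimate_residual_spectrum(N2, delta):
    """E2(m, f) = delta * N2(m, f), with 0 <= delta <= 1.

    delta may be a scalar (common to all bins) or a per-bin array.
    """
    delta = np.asarray(delta, dtype=float)
    if np.any(delta < 0) or np.any(delta > 1):
        raise ValueError("delta must lie in [0, 1]")
    return delta * N2
```

Passing a one-dimensional array with one entry per frequency bin as `delta` gives the frequency-dependent variant mentioned in the text.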

The spectrum subtraction unit 42 calculates a spectrum W3(m, f) by subtracting the spectrum E2(m, f) of the estimated residual sound component e2(t) from the spectrum W2(m, f) of the acoustic signal w2(t). The spectrum W3(m, f) is the frequency spectrum of the acoustic signal z(t) after processing by the acoustic processing apparatus 100A, and is expressed by the following formula (3) using an amplitude spectrum A(m, f) and the phase spectrum θw2(m, f) of the spectrum W2(m, f).
W3(m, f) = A(m, f)·e^{jθw2(m, f)} ……(3)

The spectrum subtraction unit 42 calculates the power spectrum |A(m, f)|² of the spectrum W3(m, f) by executing the following expressions (4a) and (4b). It then calculates the spectrum W3(m, f) by formula (3) from the amplitude spectrum A(m, f) obtained from the power spectrum |A(m, f)|² and the phase spectrum θw2(m, f) of the spectrum W2(m, f).
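Expressions (4a) and (4b) are not reproduced in this text. Going only by the operation of the spectrum subtraction unit 42 as described in this section, they can plausibly be written as follows; this is a reconstruction from the prose, and the exact notation and the handling of the boundary case |W2(m, f)|² = TH are assumptions.

```latex
|A(m,f)|^{2} =
\begin{cases}
|W_2(m,f)|^{2} - \alpha(m)\,|E_2(m,f)|^{2}, & |W_2(m,f)|^{2} > T_H \quad \text{(4a)}\\[4pt]
\beta(m)\,|E_2(m,f)|^{2}, & \text{otherwise} \quad \text{(4b)}
\end{cases}
\qquad \text{with, for example, } T_H = \alpha(m)\,|E_2(m,f)|^{2}.
```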

As shown in expression (4a), when the intensity (power) |W2(m, f)|² of the spectrum W2(m, f) exceeds a predetermined value TH, the spectrum subtraction unit 42 calculates the power spectrum |A(m, f)|² of the spectrum W3(m, f) by adjusting the intensity |E2(m, f)|² of the spectrum E2(m, f) of the estimated residual sound component e2(t) according to the subtraction coefficient α(m) and then subtracting it from the intensity |W2(m, f)|² of the spectrum W2(m, f) of the acoustic signal w2(t) (spectrum subtraction). Specifically, the product of the subtraction coefficient α(m) and the intensity |E2(m, f)|² is subtracted from the intensity |W2(m, f)|². The predetermined value TH is set, for example, to the product of the subtraction coefficient α(m) and the intensity |E2(m, f)|² (TH = α(m)·|E2(m, f)|²). On the other hand, when the intensity |W2(m, f)|² is below the predetermined value TH, the spectrum subtraction unit 42 calculates the power spectrum |A(m, f)|² by adjusting the intensity |E2(m, f)|² according to the flooring coefficient β(m), as shown in expression (4b) (specifically, by multiplying the intensity |E2(m, f)|² by the flooring coefficient β(m)).
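The subtraction with flooring, followed by recombination with the phase of W2(m, f) as in formula (3), might look like the sketch below. The function name is hypothetical, the threshold uses the example value TH = α(m)·|E2(m, f)|² given in the text, and treating the equal case as flooring is an assumption.

```python
import numpy as np

def spectral_subtract(W2, E2, alpha, beta):
    """Return W3(m, f) from complex spectra W2 and E2 (frames x bins).

    alpha, beta: scalars, or arrays of shape (n_frames, 1) for per-frame values.
    """
    P_w = np.abs(W2) ** 2                      # |W2(m, f)|^2
    P_e = np.abs(E2) ** 2                      # |E2(m, f)|^2
    threshold = alpha * P_e                    # TH = alpha(m) * |E2|^2 (example in the text)
    power = np.where(P_w > threshold,
                     P_w - alpha * P_e,        # subtraction branch, cf. (4a)
                     beta * P_e)               # flooring branch, cf. (4b)
    amplitude = np.sqrt(power)                 # A(m, f)
    phase = np.angle(W2)                       # theta_w2(m, f)
    return amplitude * np.exp(1j * phase)      # formula (3)
```

Because the subtraction branch is taken only where |W2|² exceeds α·|E2|², the power passed to the square root is non-negative as long as β is non-negative.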

The inverse transform unit 44 in FIG. 1 converts the spectrum W3(m, f) of each frame generated by the spectrum subtraction unit 42 into a time-domain signal, and generates the acoustic signal z(t) by connecting the converted signals of successive frames to one another on the time axis. Since the spectrum E2(m, f) of the estimated residual sound component e2(t) is subtracted from the spectrum W2(m, f) of the acoustic signal w2(t) as in expression (4a), it is possible to generate an acoustic signal z(t) in which the residual sound component e1(t) of the feedback sound component n1(t) is effectively reduced. The acoustic signal z(t) generated by the inverse transform unit 44 is transmitted to the communication network.

Incidentally, the spectrum W3(m, f) generated by subtracting the spectrum E2(m, f) as described above may contain high-intensity components (isolated points) caused by the spectrum subtraction and scattered along the time axis and the frequency axis, which the listener may perceive as artificial and harsh musical noise. The greater the degree of spectrum subtraction (the degree to which the spectrum changes across the subtraction), the more pronounced the musical noise in the spectrum W3(m, f). Specifically, musical noise becomes pronounced when the subtraction coefficient α(m) is set to a large value (when the degree of spectrum subtraction is increased) or when the flooring coefficient β(m) is set to a small value.

Part (A) of FIG. 2 is a graph of a frequency distribution F1 (a probability density function with intensity as the random variable) generated by counting, over a predetermined number (plurality) of frames, how often each intensity occurs at each frequency of the spectrum W2(m, f) (acoustic signal w2(t)) before spectrum subtraction. As shown in part (A) of FIG. 2, the count (probability) of each intensity before spectrum subtraction is distributed nonlinearly, decreasing as the intensity increases from zero.

Part (B) of FIG. 2 is a graph of a frequency distribution (probability density function) F2 generated by counting, over a predetermined number of frames, how often each intensity occurs at each frequency of the spectrum W3(m, f) (acoustic signal z(t)) after spectrum subtraction. Since spectrum subtraction increases the count (probability) of intensities close to zero, the part of the distribution F2 in the region where the intensity is near zero is steeper than the distribution F1 before spectrum subtraction.

When kurtosis is introduced as a measure of the shape of a frequency distribution (the steepness of its slope), the kurtosis KC of the frequency distribution F2 after spectrum subtraction is higher than the kurtosis KB of the frequency distribution F1 before spectrum subtraction. The kurtosis κ is a higher-order statistic calculated from the n-th order moments μn by the following formula (5).
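Formula (5) is not reproduced in this text. A common definition that is consistent with the later use of the second-order moment μ2 and the fourth-order moment μ4 is shown below. The use of moments about zero and the absence of any offset (such as subtracting 3) are assumptions that cannot be confirmed from this text.

```latex
\kappa = \frac{\mu_4}{\mu_2^{\,2}},
\qquad
\mu_n = \int_{0}^{\infty} x^{n}\,P(x)\,dx
```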

Focusing on the property of kurtosis κ as an index of non-Gaussianity, it can be understood that spectrum subtraction increases the non-Gaussianity of the frequency distribution. Since musical noise is highly non-Gaussian noise, musical noise tends to become more apparent as the change in kurtosis κ across spectrum subtraction increases (for example, the ratio KC/KB of the kurtosis KC to the kurtosis KB, or the difference KC - KB between the kurtosis KC and the kurtosis KB). That is, the lower the kurtosis KB of the frequency distribution F1 of the spectrum W2(m, f) (acoustic signal w2(t)) before spectrum subtraction, the more likely musical noise caused by spectrum subtraction is to occur in the spectrum W3(m, f) (acoustic signal z(t)).
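The rise in kurtosis across spectrum subtraction can be checked numerically with synthetic data. The sketch below is only an illustration under several assumptions: kurtosis is taken as the ratio of the fourth moment to the squared second moment (both about zero), the pre-subtraction power values are drawn from an exponential distribution as a stand-in for |W2(m, f)|², and the subtracted power, α, and β are arbitrary illustrative values.

```python
import numpy as np

def moment_kurtosis(x):
    """kappa = mu4 / mu2**2 using sample moments about zero (assumed definition)."""
    x = np.asarray(x, dtype=float)
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2

rng = np.random.default_rng(0)
power_before = rng.exponential(scale=1.0, size=200_000)  # synthetic stand-in for |W2|^2
noise_power = 1.0          # assumed constant |E2|^2
alpha, beta = 1.0, 0.01    # illustrative coefficients

# Subtraction with flooring, applied directly to power values.
power_after = np.where(power_before > alpha * noise_power,
                       power_before - alpha * noise_power,
                       beta * noise_power)

k_before = moment_kurtosis(power_before)   # plays the role of KB
k_after = moment_kurtosis(power_after)     # plays the role of KC
print(k_before, k_after, k_after / k_before)
```

With these settings most values collapse onto the floor while a tail of large values survives, so the post-subtraction kurtosis should come out higher than the pre-subtraction value, in line with the tendency described above.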

Now, focusing on the essential purpose of subtracting the spectrum E2(m, f) of the estimated residual sound component e2(t) from the spectrum E1(m, f) of the residual sound component e1(t) contained in the spectrum W2(m, f) of the acoustic signal w2(t) (that is, ignoring the target sound component s(t) for convenience), it can be seen that the lower the kurtosis of the frequency distribution of the intensity of the residual sound component e1(t), the more likely musical noise caused by spectrum subtraction is to occur. Since the estimated residual sound component e2(t) is an estimate of the residual sound component e1(t), the lower the kurtosis KA in the frequency distribution of the intensity of e2(t) (spectrum E2(m, f)), the more likely musical noise caused by subtracting the spectrum E2(m, f) is to occur. In other words, the kurtosis KA can be used as a quantitative index of the degree of musical noise generation. In view of this tendency, this embodiment variably controls the subtraction coefficient α(m) of expression (4a) and the flooring coefficient β(m) of expression (4b) (that is, the degree of spectrum subtraction) according to the kurtosis KA(m) in the frequency distribution of the intensity of the estimated residual sound component e2(t) (spectrum E2(m, f)).

The kurtosis calculation unit 52A in FIG. 1 calculates, for each frame, the kurtosis KA(m) in the frequency distribution of the intensity of the spectrum E2(m, f) (estimated residual sound component e2(t)) generated by the residual sound estimation unit 36. A specific example of a method for calculating the kurtosis κ of the frequency distribution of M intensities x1 to xM (M is a natural number of 2 or more) is described in detail below.

The frequency distribution of the M intensities x1 to xM is approximated by the function Ga(x; k, θ) of the following formula (6).

The coefficient C in formula (6) is defined as follows using the gamma function Γ(k).

The following formula (7) is derived by replacing the distribution function (probability density function) P(x) in the defining equation of the second-order moment μ2 with the function Ga(x; k, θ) of formula (6).

As with the derivation of the second-order moment μ2, the following formula (8) is derived by replacing the distribution function (probability density function) P(x) in the defining equation of the fourth-order moment μ4 with the function Ga(x; k, θ) of formula (6).

Substituting the second-order moment μ2 of formula (7) and the fourth-order moment μ4 of formula (8) into formula (5) yields the following formula (9), which defines the kurtosis κ.
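Formulas (6) to (9) are not reproduced in this text. The derivation can be sketched as follows under three assumptions: Ga(x; k, θ) is the gamma density with shape k and scale θ (the text states only that C is defined with Γ(k)), the moments are taken about zero, and the kurtosis is the ratio μ4/μ2². The original formulas may differ in form.

```latex
% Assumed form of (6): gamma density with shape k and scale \theta
G_a(x;k,\theta) = C\,x^{k-1}e^{-x/\theta},
\qquad C = \frac{1}{\Gamma(k)\,\theta^{k}}

% Assumed form of (7): second-order moment
\mu_2 = \int_0^\infty x^{2}\,G_a(x;k,\theta)\,dx
      = \theta^{2}\,\frac{\Gamma(k+2)}{\Gamma(k)} = k(k+1)\,\theta^{2}

% Assumed form of (8): fourth-order moment
\mu_4 = \int_0^\infty x^{4}\,G_a(x;k,\theta)\,dx
      = \theta^{4}\,\frac{\Gamma(k+4)}{\Gamma(k)} = k(k+1)(k+2)(k+3)\,\theta^{4}

% Assumed form of (9): kurtosis
\kappa = \frac{\mu_4}{\mu_2^{\,2}}
       = \frac{\Gamma(k)\,\Gamma(k+4)}{\Gamma(k+2)^{2}}
       = \frac{(k+2)(k+3)}{k(k+1)}
```

Under these assumptions the scale θ cancels, so the kurtosis depends only on the shape parameter k and grows as k decreases.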

The kurtosis calculation unit 52A in FIG. 1 calculates, as the kurtosis KA(m), the kurtosis κ of formula (9) obtained when the M intensities |E2(m, f)|² of the spectra E2(m, f) over a predetermined number of frames including the m-th frame (for example, a plurality of frames ending with the m-th frame) are taken as the intensities x1 to xM.
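A per-frame kurtosis computation along these lines might be sketched as follows. The sketch assumes the intensities follow a gamma distribution with shape k, for which the ratio of the fourth moment to the squared second moment (about zero) equals (k+2)(k+3)/(k(k+1)). The text does not say how k is obtained, so the method-of-moments estimate used here, the function names, and the window of 8 frames are all assumptions.

```python
import numpy as np

def gamma_kurtosis(intensities):
    """Kurtosis of M intensities x1..xM under an assumed gamma model.

    The shape k is estimated by the method of moments (mean^2 / variance),
    which is an illustrative choice, not taken from the specification.
    """
    x = np.asarray(intensities, dtype=float).ravel()
    mean = x.mean()
    var = x.var()
    if mean <= 0 or var <= 0:
        raise ValueError("need positive, non-constant intensities")
    k = mean ** 2 / var
    return (k + 2) * (k + 3) / (k * (k + 1))

def frame_kurtosis(E2, m, n_frames=8):
    """KA(m): kurtosis of |E2|^2 over up to n_frames frames ending with frame m."""
    start = max(0, m - n_frames + 1)
    power = np.abs(E2[start : m + 1]) ** 2
    return gamma_kurtosis(power)
```

The same function can be applied to |W2(m, f)|² to obtain a kurtosis for the acoustic signal w2(t).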

The coefficient setting unit 54 in FIG. 1 variably sets the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KA(m) calculated by the kurtosis calculation unit 52A. Each time the kurtosis KA(m) is calculated for a frame, the subtraction coefficient α(m) and the flooring coefficient β(m) are updated in turn. As described with reference to FIG. 2, the smaller the kurtosis KA(m) (that is, the smaller the kurtosis in the frequency distribution of the intensity of the residual sound component e1(t)), the more likely musical noise is to occur after spectrum subtraction. In view of this tendency, the coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) so that the degree of spectrum subtraction is reduced (that is, the musical noise caused by spectrum subtraction is reduced) as the kurtosis KA(m) becomes smaller.

Specifically, since a smaller subtraction coefficient α(m) suppresses the musical noise of the acoustic signal z(t) more strongly, the coefficient setting unit 54 sets the subtraction coefficient α(m) to a smaller value (that is, a value that limits the amount subtracted from the spectrum W2(m, f)) as the kurtosis KA(m) becomes smaller. The specific relationship between the subtraction coefficient α(m) and the kurtosis KA(m) is arbitrary; for example, a configuration in which the subtraction coefficient α(m) is calculated by multiplying the kurtosis KA(m) by a predetermined positive number is suitable.

Likewise, since a larger flooring coefficient β(m) suppresses the musical noise of the acoustic signal z(t) more strongly, the coefficient setting unit 54 sets the flooring coefficient β(m) to a larger value (that is, a value that limits the difference between the spectrum W2(m, f) and the spectrum W3(m, f)) as the kurtosis KA(m) becomes smaller. For example, a configuration in which the value obtained by subtracting the kurtosis KA(m) from a predetermined value is used as the flooring coefficient β(m) is suitable. A configuration in which the coefficient setting unit 54 looks up the subtraction coefficient α(m) and the flooring coefficient β(m) corresponding to the kurtosis KA(m) in a table that stores both coefficients for each value (or range) of the kurtosis KA(m) is also suitable.
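The two example mappings (α proportional to the kurtosis, β obtained by subtracting the kurtosis from a predetermined value) can be sketched as below. All numeric defaults are hypothetical, as are the clipping limits and the `beta_gain` scale factor, which is added here only to keep β in a small range; setting `beta_gain=1.0` recovers the literal "predetermined value minus kurtosis" form.

```python
def set_coefficients(kurtosis, alpha_gain=0.25, alpha_max=4.0,
                     beta_offset=0.5, beta_gain=0.02,
                     beta_min=0.0, beta_max=0.5):
    """Map a kurtosis value to (alpha, beta).

    Smaller kurtosis -> smaller alpha and larger beta, i.e. weaker subtraction.
    Every default value and the clipping are illustrative assumptions.
    """
    alpha = min(alpha_gain * kurtosis, alpha_max)          # alpha = c * K, capped
    beta = beta_offset - beta_gain * kurtosis              # beta = const - (scaled) K
    beta = max(beta_min, min(beta, beta_max))              # keep beta in range
    return alpha, beta
```

For example, `set_coefficients(2.0)` yields a smaller α and a larger β than `set_coefficients(10.0)`. A lookup table indexed by kurtosis range could replace the two formulas without changing the interface.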

In the above embodiment, the subtraction coefficient α(m) applied to adjust the spectrum E2(m, f) that is subtracted from the spectrum W2(m, f) of the acoustic signal w2(t) is set variably according to the kurtosis KA(m) in the frequency distribution of the intensity of the estimated residual sound component e2(t) (that is, according to the degree of musical noise that may be caused by spectrum subtraction). Therefore, compared with a configuration in which the subtraction coefficient α(m) does not depend on the kurtosis KA(m) (for example, a configuration in which α(m) is fixed to a predetermined value), there is an advantage that the feedback sound component n1(t) (in particular the residual sound component e1(t)) can be suppressed while musical noise in the acoustic signal z(t) is effectively suppressed. Furthermore, since the flooring coefficient β(m) is also set variably according to the kurtosis KA(m), the reduction of musical noise caused by spectrum subtraction is especially pronounced compared with a configuration in which neither α(m) nor β(m) depends on the kurtosis KA(m) (for example, a configuration in which α(m) and β(m) are fixed to predetermined values) or a configuration in which only α(m) is set according to the kurtosis KA(m).

<B: Second Embodiment>
Next, a second embodiment of the present invention will be described. In each of the following embodiments, elements whose operation and function are equivalent to those of the first embodiment are given the same reference numerals as above, and their detailed description is omitted as appropriate.

The feedback sound estimation unit 24 of the first embodiment estimates the estimated feedback sound component n2(t) with high accuracy by stopping its operation (updating of the estimated feedback sound component n2(t)) in the silent sections specified by the target sound determination unit 22. However, an error may occur in the estimated feedback sound component n2(t) applied to a voiced section because of various factors (for example, the influence of an estimation error in the immediately preceding silent section). Since the kurtosis calculation unit 52A of the first embodiment calculates the kurtosis KA(m) from the spectrum E2(m, f) of the estimated residual sound component e2(t), which depends on the estimated feedback sound component n2(t), an error in n2(t) produces an error in the kurtosis KA(m). Consequently, the subtraction coefficient α(m) and the flooring coefficient β(m) may not be set to appropriate values. The second embodiment is a form for solving this problem.

FIG. 3 is a block diagram of the acoustic processing apparatus 100B according to the second embodiment. As indicated by the broken-line arrow in FIG. 3, the result of the determination by the target sound determination unit 22 is notified to the kurtosis calculation unit 52A as well as to the feedback sound estimation unit 24. Within a silent section specified by the target sound determination unit 22 (a section in which the target sound component s(t) does not exist), the kurtosis calculation unit 52A calculates and updates the kurtosis KA(m) for each frame as in the first embodiment, but within a voiced section it stops calculating the kurtosis KA(m). Accordingly, within a voiced section, the kurtosis KA(m) calculated at the end of the immediately preceding silent section continues to be supplied to the coefficient setting unit 54. The operation by which the coefficient setting unit 54 calculates the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KA(m) is the same as in the first embodiment.
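The hold behaviour (update in silent sections, reuse of the last value in voiced sections) can be sketched with a small stateful helper. The class name and the use of a callable for the kurtosis computation are assumptions made for illustration; the callable is invoked only in silent frames, so no kurtosis computation is performed in voiced frames.

```python
class HeldKurtosis:
    """Holds the most recent kurtosis computed in a silent section."""

    def __init__(self, initial=None):
        self.value = initial   # value handed to the coefficient setter

    def update(self, is_silent, compute):
        """Recompute via compute() in silent frames; otherwise keep the held value."""
        if is_silent:
            self.value = compute()
        return self.value
```

A typical loop would call `held.update(is_silent, lambda: frame_kurtosis(E2, m))` once per frame, where `frame_kurtosis` stands for whatever kurtosis routine is used (a hypothetical name here), and pass the returned value on to the coefficient setting step.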

In the above embodiment, updating of the kurtosis KA(m) stops within voiced sections, so an error in the estimated feedback sound component n2(t) within a voiced section does not affect the subtraction coefficient α(m) or the flooring coefficient β(m). Therefore, there is an advantage that musical noise caused by spectrum subtraction can be suppressed more effectively than in the first embodiment. In addition, since calculation of the kurtosis KA(m) stops within voiced sections, there is also an advantage that the processing load of the kurtosis calculation unit 52A is reduced compared with the first embodiment, in which the kurtosis KA(m) is calculated in both voiced and silent sections.

<C: Third Embodiment>
As described above with reference to FIG. 2, the lower the kurtosis KB(m) of the frequency distribution F1 of the intensity of the spectrum W2(m, f) (acoustic signal w2(t)) before spectrum subtraction, the more likely musical noise caused by spectrum subtraction is to occur in the spectrum W3(m, f) (acoustic signal z(t)). In view of this tendency, the third embodiment controls the degree of spectrum subtraction (the subtraction coefficient α(m) and the flooring coefficient β(m)) according to the kurtosis KB(m) of the frequency distribution F1 of the intensity of the spectrum W2(m, f).

FIG. 4 is a block diagram of an acoustic processing apparatus 100C according to the third embodiment. As shown in FIG. 4, the acoustic processing apparatus 100C has a configuration in which the kurtosis calculation unit 52A of the acoustic processing apparatus 100A of the first embodiment is replaced with a kurtosis calculation unit 52B. The kurtosis calculation unit 52B calculates, for each frame, the kurtosis KB(m) in the frequency distribution F1 of the intensity of the spectrum W2(m, f) (acoustic signal w2(t)) generated by the frequency analysis unit 34. The kurtosis KB(m) is calculated by the same method as the kurtosis KA(m) in the first embodiment.

As in the first embodiment, the coefficient setting unit 54 variably sets the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KB(m) calculated by the kurtosis calculation unit 52B. Considering that musical noise is more likely to occur as the kurtosis KB(m) becomes lower, the coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) so that the degree of spectrum subtraction is reduced as the kurtosis KB(m) becomes smaller. Specifically, the smaller the kurtosis KB(m), the smaller the value to which the coefficient setting unit 54 sets the subtraction coefficient α(m) and the larger the value to which it sets the flooring coefficient β(m).

In the third embodiment, the subtraction coefficient α(m) and the flooring coefficient β(m) are controlled according to the kurtosis KB(m) of the spectrum W2(m, f) of the acoustic signal w2(t). Therefore, as in the first embodiment, there is an advantage that the feedback sound component n1(t) (in particular the residual sound component e1(t)) can be suppressed while musical noise in the acoustic signal z(t) is effectively suppressed. In addition, since the kurtosis KB(m) of the acoustic signal w2(t) is used, there is also an advantage that an appropriate subtraction coefficient α(m) and flooring coefficient β(m) can be calculated without being affected by the accuracy with which the estimated residual sound component e2(t) is estimated.

<D: Fourth Embodiment>
The acoustic signal w2(t) may or may not include the target sound component s(t). The feedback sound component n1(t) is a reverberant sound that reaches the sound collecting device 14 after reflection and diffusion, so its kurtosis (non-Gaussianity) is lower than that of the target sound component s(t), most of which reaches the sound collecting device 14 directly from the sound source. That is, the kurtosis KB(m) when the acoustic signal w2(t) does not include the target sound component s(t) is lower than the kurtosis KB(m) when the acoustic signal w2(t) includes the target sound component s(t).

Suppose, therefore, that the relationship between the kurtosis KB(m) and the subtraction coefficient α(m) is determined so that musical noise is effectively suppressed at the kurtosis KB(m) observed when the acoustic signal w2(t) does not include the target sound component s(t). In the configuration of the third embodiment, the kurtosis KB(m) of the acoustic signal w2(t) is reflected in the subtraction coefficient α(m) regardless of whether the target sound component s(t) is present. Consequently, when the acoustic signal w2(t) includes the target sound component s(t) (that is, when the kurtosis KB(m) is higher than when s(t) is not included), the subtraction coefficient α(m) is set to a large value, and the spectrum E2(m, f) of the estimated residual sound component e2(t) may be subtracted excessively from the spectrum W2(m, f) of the acoustic signal w2(t). In the fourth embodiment, calculation of the kurtosis KB(m) is therefore stopped when the acoustic signal w2(t) includes the target sound component s(t).

FIG. 5 is a block diagram of an acoustic processing apparatus 100D according to the fourth embodiment. As indicated by the broken-line arrow in FIG. 5, the result of the determination by the target sound determination unit 22 is notified to the kurtosis calculation unit 52B as well as to the feedback sound estimation unit 24. Like the kurtosis calculation unit 52A of the second embodiment, the kurtosis calculation unit 52B executes or stops calculation of the kurtosis KB(m) according to the result of the determination by the target sound determination unit 22. That is, within a silent section specified by the target sound determination unit 22, the kurtosis calculation unit 52B calculates and updates the kurtosis KB(m) for each frame as in the third embodiment, but within a voiced section it stops calculating the kurtosis KB(m). The operation by which the coefficient setting unit 54 calculates the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KB(m) is the same as in the third embodiment.

In the above embodiment, calculation of the kurtosis KB(m) stops within voiced sections, in which the acoustic signal w2(t) includes the target sound component s(t), so the subtraction coefficient α(m) and the flooring coefficient β(m) are not affected by the target sound component s(t). Therefore, musical noise caused by spectrum subtraction can be suppressed appropriately. In addition, since calculation of the kurtosis KB(m) stops within voiced sections, there is also an advantage that the processing load of the kurtosis calculation unit 52B is reduced compared with the third embodiment, in which the kurtosis KB(m) is calculated in both voiced and silent sections.

<E: Fifth Embodiment>
FIG. 6 is a block diagram of an acoustic processing apparatus 100E according to the fifth embodiment. As shown in FIG. 6, the acoustic processing apparatus 100E has a configuration in which the kurtosis calculation unit 52B of the third embodiment (FIG. 4) is added to the acoustic processing apparatus 100A of the first embodiment (FIG. 1) and a determination unit 56 is added to the coefficient setting unit 54. The kurtosis calculation unit 52A calculates the kurtosis KA(m) of the estimated residual sound component e2(t) (spectrum E2(m, f)) for each frame, and the kurtosis calculation unit 52B calculates the kurtosis KB(m) of the acoustic signal w2(t) (spectrum W2(m, f)) for each frame.

The estimated residual sound component e2(t) (spectrum E2(m,f)), for which the kurtosis calculation unit 52A calculates the kurtosis KA(m), is an estimate of the residual sound component e1(t) contained in the acoustic signal w2(t), for which the kurtosis calculation unit 52B calculates the kurtosis KB(m). Therefore, when the acoustic signal w2(t) includes only the residual sound component e1(t) (that is, when it does not include the target sound component s(t)), the kurtosis KA(m) and the kurtosis KB(m) are close to each other. On the other hand, as described above for the fourth embodiment, the kurtosis KB(m) differs depending on whether the acoustic signal w2(t) includes the target sound component s(t), so when the acoustic signal w2(t) includes the target sound component s(t), the kurtosis KA(m) and the kurtosis KB(m) differ.

In view of this tendency, the determination unit 56 distinguishes voiced sections from silent sections according to whether the kurtosis KA(m) calculated by the kurtosis calculation unit 52A and the kurtosis KB(m) calculated by the kurtosis calculation unit 52B are close to each other. That is, the determination unit 56 determines the presence or absence of the target sound component s(t) by a method different from that of the target sound determination unit 22. Specifically, when the kurtosis KA(m) and the kurtosis KB(m) are close, the determination unit 56 determines that the acoustic signal w2(t) does not include the target sound component s(t) (a silent section); when they are not close, it determines that the acoustic signal w2(t) includes the target sound component s(t) (a voiced section). For example, a silent section is determined when the absolute difference between the kurtosis KA(m) and the kurtosis KB(m) is below a predetermined threshold, and a voiced section is determined when the difference exceeds the threshold.

The coefficient setting unit 54 executes or stops the update of the subtraction coefficient α(m) and the flooring coefficient β(m) according to the determination result of the determination unit 56. Specifically, for each frame in a silent section identified by the determination unit 56, the coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KA(m) calculated by the kurtosis calculation unit 52A, as in the first embodiment. For each frame in a voiced section identified by the determination unit 56, on the other hand, it stops calculating the subtraction coefficient α(m) and the flooring coefficient β(m). When the calculation is stopped, the coefficient setting unit 54 initializes the subtraction coefficient α(m) and the flooring coefficient β(m) to predetermined values (for example, values that reduce the degree of spectral subtraction) that are independent of their values before the calculation was stopped (before the start of the voiced section). Specifically, the subtraction coefficient α(m) is set to a value close to zero, and the flooring coefficient β(m) is set to a value close to 1. This prevents the spectral subtraction for frames in the voiced section from becoming excessive under the influence of the acoustic signal w2(t) (feedback sound component n1(t)) in the silent section. In some cases, however, applying the coefficients updated in the silent section to frames in the voiced section does not make the spectral subtraction excessive. A configuration may therefore also be adopted in which the subtraction coefficient α(m) and the flooring coefficient β(m) updated in the silent section continue to be supplied from the coefficient setting unit 54 to the spectrum subtraction unit 42 during the immediately following voiced section.
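The behavior of the determination unit 56 and the coefficient setting unit 54 in the fifth embodiment can be sketched as below. The function name `set_coefficients`, the callable `mapping` (standing in for the first embodiment's rule from KA(m) to the coefficients, which is not reproduced here), and the reset values 0.05 and 0.95 are assumptions chosen only to illustrate "close to zero" and "close to 1".

```python
from typing import Callable, Tuple

Coeffs = Tuple[float, float]  # (alpha, beta)

# Assumed reset values: alpha near zero and beta near one weaken the subtraction.
RESET_COEFFS: Coeffs = (0.05, 0.95)


def set_coefficients(ka: float, kb: float, threshold: float,
                     mapping: Callable[[float], Coeffs],
                     previous: Coeffs = RESET_COEFFS,
                     hold_in_voiced: bool = False) -> Coeffs:
    """Return (alpha, beta) for one frame from the kurtosis values KA(m), KB(m)."""
    if abs(ka - kb) < threshold:
        # Silent section: KA and KB are close, so update from KA.
        return mapping(ka)
    # Voiced section: updating stops. Either reset to the predetermined values
    # or, in the alternative configuration, keep the last silent-section values.
    return previous if hold_in_voiced else RESET_COEFFS
```

Setting `hold_in_voiced=True` corresponds to the alternative in which the coefficients updated in the silent section remain in use during the following voiced section.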

In the above embodiment, the update of the subtraction coefficient α(m) and the flooring coefficient β(m) applied to the spectral subtraction stops within voiced sections, which prevents the characteristics of the target sound component s(t) from affecting the degree of spectral subtraction (for example, prevents the spectral subtraction from becoming excessive because of the target sound component s(t)). In addition, because the calculation of the subtraction coefficient α(m) and the flooring coefficient β(m) stops within voiced sections, the processing load of the coefficient setting unit 54 is lower than in a configuration that calculates these coefficients in both voiced and silent sections. Although the configuration in which the coefficient setting unit 54 calculates the subtraction coefficient α(m) and the flooring coefficient β(m) from the kurtosis KA(m) has been illustrated above, a configuration that calculates them from the kurtosis KB(m) as in the third embodiment, or according to both the kurtosis KA(m) and the kurtosis KB(m) (for example, according to the average of the two), may also be adopted.

<F: Sixth Embodiment>
FIG. 7 is a block diagram of a sound processing apparatus 100F according to the sixth embodiment. As shown in FIG. 7, the sound processing apparatus 100F has a configuration in which the kurtosis calculation unit 52B of the sound processing apparatus 100C of the third embodiment (FIG. 4) (or the kurtosis calculation unit 52A of the sound processing apparatus 100A of the first embodiment) is replaced with a kurtosis calculation unit 52C. The kurtosis calculation unit 52C calculates, for each frame from the spectrum W3(m,f), the kurtosis KC(m) (part (B) of FIG. 2) in the frequency distribution F2 of the intensity of the acoustic signal z(t) (spectrum W3(m,f)) after spectral subtraction. The kurtosis KC(m) is calculated by the same method as the kurtosis KB(m) in the third embodiment.

The coefficient setting unit 54 sequentially sets the subtraction coefficient α(m) and the flooring coefficient β(m) for each frame according to the result calculated by the kurtosis calculation unit 52C. Because the kurtosis KC(m) is calculated from the result of spectral subtraction that has already been executed (spectrum W3(m,f)), the subtraction coefficient α(m) and the flooring coefficient β(m) of the m-th frame are set according to the kurtosis KC(m-1) calculated from the spectrum W3(m-1,f) of the (m-1)-th frame. As described with reference to FIG. 2, musical noise is more likely to occur as the kurtosis KC(m) increases. The coefficient setting unit 54 therefore sets the subtraction coefficient α(m) to a smaller value and the flooring coefficient β(m) to a larger value as the kurtosis KC(m-1) increases. The sixth embodiment achieves the same effects as the first and third embodiments.
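The one-frame delay between the kurtosis KC(m-1) and the coefficients α(m), β(m) can be sketched as a simple loop. Everything specific here is an assumption for illustration: the linear mapping in `coefficients_from_kc` and its constants, the amplitude-domain subtraction with a floor of β·W2 in `spectral_subtract`, the initial coefficients, and the caller-supplied `kurtosis_fn`. The actual subtraction formula and mapping are defined elsewhere in this description.

```python
from typing import Callable, List, Sequence, Tuple


def coefficients_from_kc(kc: float, k_lo: float = 3.0, k_hi: float = 12.0) -> Tuple[float, float]:
    """Hypothetical mapping: alpha falls and beta rises as KC grows."""
    t = min(max((kc - k_lo) / (k_hi - k_lo), 0.0), 1.0)
    alpha = 2.0 - 1.5 * t
    beta = 0.01 + 0.29 * t
    return alpha, beta


def spectral_subtract(w2: Sequence[float], e2: Sequence[float],
                      alpha: float, beta: float) -> List[float]:
    """Assumed amplitude-domain subtraction with a floor of beta * W2."""
    return [max(w - alpha * e, beta * w) for w, e in zip(w2, e2)]


def process(frames_w2: Sequence[Sequence[float]], frames_e2: Sequence[Sequence[float]],
            kurtosis_fn: Callable[[Sequence[float]], float],
            alpha0: float = 1.0, beta0: float = 0.1) -> List[List[float]]:
    alpha, beta = alpha0, beta0  # assumed starting values, used for the first frame
    out = []
    for w2, e2 in zip(frames_w2, frames_e2):
        w3 = spectral_subtract(w2, e2, alpha, beta)  # coefficients come from KC(m-1)
        out.append(w3)
        alpha, beta = coefficients_from_kc(kurtosis_fn(w3))  # prepared for frame m+1
    return out
```

The loop makes the feedback structure explicit: each frame's output determines the strength of the subtraction applied to the next frame.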

As in the fourth embodiment, a configuration that stops calculating and updating the kurtosis KC(m) within voiced sections may also be adopted. Furthermore, although the subtraction coefficient α(m) and the flooring coefficient β(m) are calculated above according to the kurtosis KC(m-1), a configuration that calculates the subtraction coefficient α(m) and the flooring coefficient β(m) of the m-th frame from the kurtosis KC(m) of that same frame may also be adopted. For example, spectral subtraction applying the subtraction coefficient α(m) and the flooring coefficient β(m) set according to the kurtosis KC(m) after spectral subtraction of the m-th frame is executed on the spectrum W2(m,f) of the m-th frame (that is, a spectral subtraction for calculating the kurtosis KC(m) and the spectral subtraction for the original purpose are both executed for a single frame).

<G: Seventh Embodiment>
FIG. 8 is a block diagram of a sound processing apparatus 100G according to the seventh embodiment of the present invention. As shown in FIG. 8, the sound processing apparatus 100G includes the kurtosis calculation unit 52B of the third embodiment, which calculates the kurtosis KB(m) before spectral subtraction, and the kurtosis calculation unit 52C of the sixth embodiment, which calculates the kurtosis KC(m) after spectral subtraction.

The coefficient setting unit 54 variably sets the subtraction coefficient α(m) and the flooring coefficient β(m) of the m-th frame according to both the kurtosis KB(m-1) and the kurtosis KC(m-1). Specifically, the subtraction coefficient α(m) and the flooring coefficient β(m) are calculated so that the degree of spectral subtraction decreases as the difference between the kurtosis KB(m-1) and the kurtosis KC(m-1) increases. For example, the coefficient setting unit 54 sets the subtraction coefficient α(m) to a smaller value and the flooring coefficient β(m) to a larger value as the ratio KC(m-1)/KB(m-1) of the kurtosis KC(m-1) to the kurtosis KB(m-1), or the difference (KC(m-1) - KB(m-1)) between the kurtosis KC(m-1) and the kurtosis KB(m-1), increases.
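A minimal sketch of the ratio-based setting in the seventh embodiment follows. The function name `coefficients_from_kb_kc`, the linear interpolation, and all numeric constants (`r_lo`, `r_hi`, and the ranges of α and β) are hypothetical; the description only requires that α(m) decrease and β(m) increase as the ratio KC(m-1)/KB(m-1) grows.

```python
from typing import Tuple


def coefficients_from_kb_kc(kb_prev: float, kc_prev: float,
                            r_lo: float = 1.0, r_hi: float = 4.0) -> Tuple[float, float]:
    """Hypothetical mapping from the ratio KC(m-1)/KB(m-1) to (alpha(m), beta(m))."""
    # Guard against a non-positive KB(m-1) by treating it as the largest ratio.
    ratio = kc_prev / kb_prev if kb_prev > 0.0 else r_hi
    t = min(max((ratio - r_lo) / (r_hi - r_lo), 0.0), 1.0)
    alpha = 2.0 - 1.5 * t   # larger ratio -> weaker subtraction
    beta = 0.01 + 0.29 * t  # larger ratio -> higher floor
    return alpha, beta
```

The same structure applies if the difference KC(m-1) - KB(m-1) is used in place of the ratio.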

In the above embodiment, both the kurtosis KB(m) before spectral subtraction and the kurtosis KC(m) after spectral subtraction are reflected in the subtraction coefficient α(m) and the flooring coefficient β(m). Compared with a configuration that uses only one of the kurtosis KB(m) and the kurtosis KC(m) (the third or sixth embodiment), musical noise caused by spectral subtraction can therefore be suppressed more accurately.

<H: Modifications>
Each of the embodiments illustrated above can be modified in various ways. Specific modifications are illustrated below. Two or more aspects arbitrarily selected from the following examples may be combined as appropriate.

(1) Modification 1
The signal from which the kurtosis calculation unit 52B calculates the kurtosis KB(m) may be changed as appropriate. For example, the kurtosis KB(m) may be calculated from the intensities (x1 to xM) of each spectrum of the acoustic signal w1(t) generated by the sound collecting device 14. A configuration that calculates the kurtosis KB(m) from a time-domain signal is also suitable. For example, the kurtosis κ in the frequency distribution of the intensity (each signal value) of the acoustic signal w1(t) or the acoustic signal w2(t) (that is, the value of Equation (9) applied to M intensities x1 to xM arranged in time series) may be calculated as the kurtosis KB(m) before spectral subtraction.
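As a rough illustration of calculating kurtosis from a time-domain signal, the sketch below applies the standard moment-based sample kurtosis (fourth central moment divided by the squared variance) to blocks of M samples. This is an assumption standing in for Equation (9), which is not reproduced here and may differ; the names `sample_kurtosis` and `framewise_kurtosis` and the use of non-overlapping blocks are likewise hypothetical.

```python
from typing import List, Sequence


def sample_kurtosis(x: Sequence[float]) -> float:
    """Moment-based kurtosis m4 / m2^2; returns 0.0 when the variance is zero."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2) if m2 > 0.0 else 0.0


def framewise_kurtosis(signal: Sequence[float], frame_len: int) -> List[float]:
    """Kurtosis of each non-overlapping block of frame_len samples."""
    return [sample_kurtosis(signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]
```

A caller who wants the kurtosis of signal magnitudes rather than raw sample values can pass the absolute values of the samples instead.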

Similarly, the signal from which the kurtosis calculation unit 52C calculates the kurtosis KC(m) in the sixth and seventh embodiments may be changed as appropriate. For example, a configuration may be adopted that calculates the kurtosis κ in the frequency distribution of the signal values of the time-domain acoustic signal z(t) as the kurtosis KC(m) after spectral subtraction, or that calculates the kurtosis KC(m) from an acoustic signal (or spectrum) generated by applying predetermined processing to the acoustic signal z(t).

(2) Modification 2
The position of the conversion from the time domain to the frequency domain may be changed arbitrarily. That is, apart from the requirement that the processing by the spectrum subtraction unit 42 be executed in the frequency domain, it is immaterial to the present invention whether the processing of each element of the sound processing apparatus 100 (100A to 100F) is executed in the frequency domain or in the time domain. For example, in a configuration that converts the acoustic signal w1(t) to the frequency domain immediately after it is generated by the sound collecting device 14 (before processing by the feedback sound suppression unit 26), the processing by the feedback sound estimation unit 24 and the feedback sound suppression unit 26 is executed in the frequency domain. Likewise, in a configuration in which the frequency analysis unit 32 is arranged downstream of the residual sound estimation unit 36, the residual sound estimation unit 36 generates the estimated residual sound component e2(t) in the time domain.

(3) Modification 3
The residual sound estimation unit 36 in each of the above embodiments is not an essential element. For example, in each of the above embodiments the spectrum E2(m,f) of the estimated residual sound component e2(t) is generated by multiplying the spectrum N2(m,f) of the estimated feedback sound component n2(t) by the coefficient δ. If the subtraction coefficient α(m) and the flooring coefficient β(m) are set with the coefficient δ taken into account, however, the spectrum N2(m,f) of the estimated feedback sound component n2(t) can be supplied to the spectrum subtraction unit 42 as the spectrum E2(m,f) of the estimated residual sound component e2(t).

(4) Modification 4
The method of calculating the kurtosis κ (KA(m), KB(m), KC(m)) is not limited to the examples in the embodiments. For example, approximating the frequency distribution of the intensities x1 to xM with a predetermined function (for example, Equation (6)) is not essential. The period at which the kurtosis κ (KA(m), KB(m), KC(m)) is calculated and updated is also arbitrary. For example, a configuration that calculates the kurtosis κ in units of a plurality of frames, or at a period unrelated to the frames, may also be adopted.

(5) Modification 5
The sound processing apparatus 100 (100A to 100F) in each of the above embodiments is also suitable as an apparatus that suppresses howling caused by feedback sound (a howling suppression apparatus). For example, a configuration may be adopted in which the acoustic signal z(t) generated by the inverse transform unit 44 is amplified by an amplifier and then supplied to the sound emitting device 12 (typically, a public-address apparatus that adjusts the volume of the sound around the sound collecting device 14 and emits it from the sound emitting device 12). In this configuration, the feedback sound arriving at the sound collecting device 14 from the sound emitting device 12 is suppressed by the feedback sound suppression unit 26 and the spectrum subtraction unit 42. Howling caused by the loop formed by the sound emitting device 12, the sound collecting device 14, and the sound processing apparatus 100 is therefore effectively prevented while musical noise in the sound emitted from the sound emitting device 12 is suppressed.

100A, 100B, 100C, 100D, 100E, 100F, 100G: sound processing apparatus; 12: sound emitting device; 14: sound collecting device; 22: target sound determination unit; 24: feedback sound estimation unit; 26: feedback sound suppression unit; 32, 34: frequency analysis unit; 36: residual sound estimation unit; 42: spectrum subtraction unit; 44: inverse transform unit; 52A, 52B, 52C: kurtosis calculation unit; 54: coefficient setting unit; 56: determination unit.

Claims

Feedback sound estimation means for generating an estimated feedback sound component that estimates the feedback sound coming from the sound emitting device to the sound collecting device;
Feedback sound suppression means for suppressing the estimated feedback sound component from the acoustic signal generated by the sound collection device;
Kurtosis calculating means for calculating the kurtosis in the frequency distribution of the intensity of the estimated residual sound component obtained by estimating the component remaining after the processing by the feedback sound suppression means of the feedback sound;
Coefficient setting means for setting a subtraction coefficient according to the kurtosis calculated by the kurtosis calculating means;
An acoustic processing apparatus comprising: a spectrum subtracting unit that adjusts a spectrum of the estimated residual sound component according to the subtraction coefficient and subtracts it from a spectrum of the acoustic signal processed by the feedback sound suppressing unit.

The spectrum subtracting unit multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts it from the spectrum of the acoustic signal processed by the feedback sound suppression unit,
The sound processing apparatus according to claim 1, wherein the coefficient setting unit sets the subtraction coefficient to a larger numerical value as the kurtosis increases.

The kurtosis calculation means calculates the kurtosis of the estimated residual sound component and the kurtosis in the frequency distribution of the intensity of the acoustic signal generated by the sound collection device,
The acoustic processing apparatus according to claim 1, wherein the coefficient setting means sets the subtraction coefficient according to at least one of the kurtosis of the estimated residual sound component and the kurtosis of the acoustic signal when the kurtosis of the estimated residual sound component approximates the kurtosis of the acoustic signal, and stops updating the subtraction coefficient when the kurtosis of the estimated residual sound component and the kurtosis of the acoustic signal do not approximate each other.

Feedback sound estimation means for generating an estimated feedback sound component that estimates the feedback sound coming from the sound emitting device to the sound collecting device;
Feedback sound suppression means for suppressing the estimated feedback sound component from the acoustic signal generated by the sound collection device;
Kurtosis calculating means for calculating kurtosis in the frequency distribution of the intensity of the acoustic signal generated by the sound collecting device;
Coefficient setting means for setting a subtraction coefficient according to the kurtosis calculated by the kurtosis calculating means;
An acoustic processing apparatus comprising: spectrum subtracting means for adjusting, according to the subtraction coefficient, the spectrum of an estimated residual sound component obtained by estimating the component of the feedback sound that remains after the processing by the feedback sound suppression means, and subtracting the adjusted spectrum from the spectrum of the acoustic signal after the processing by the feedback sound suppression means.

The kurtosis calculation means calculates the kurtosis of the acoustic signal before processing by the spectrum subtraction means,
The spectrum subtracting means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts it from the spectrum of the acoustic signal processed by the feedback sound suppression means,
The sound processing apparatus according to claim 4, wherein the coefficient setting unit sets the subtraction coefficient to a larger numerical value as the kurtosis increases.

The kurtosis calculation means calculates the kurtosis of the acoustic signal after processing by the spectrum subtraction means,
The spectrum subtracting unit multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts it from the spectrum of the acoustic signal processed by the feedback sound suppression unit,
The sound processing apparatus according to claim 4, wherein the coefficient setting unit sets the subtraction coefficient to a smaller numerical value as the kurtosis is larger.

A target sound determining means for determining the presence or absence of a target sound component for the acoustic signal generated by the sound collecting device;
The sound processing apparatus according to any one of claims 1 to 6, wherein the kurtosis calculating means calculates the kurtosis when the target sound determining means determines that the target sound component does not exist, and stops calculating the kurtosis when the target sound determining means determines that the target sound component exists.

A feedback sound estimation process for generating an estimated feedback sound component estimating a feedback sound arriving at the sound collecting device from the sound emitting device;
Feedback sound suppression processing for suppressing the estimated feedback sound component from the acoustic signal generated by the sound collection device;
A kurtosis calculation process for calculating a kurtosis in the frequency distribution of the intensity of the estimated residual sound component obtained by estimating the component remaining after execution of the feedback sound suppression process in the feedback sound;
A coefficient setting process for setting a subtraction coefficient according to the kurtosis calculated in the kurtosis calculation process;
A program that causes a computer to execute spectrum subtraction processing that adjusts a spectrum of the estimated residual sound component according to the subtraction coefficient and subtracts it from a spectrum of an acoustic signal after execution of the feedback sound suppression processing.

A feedback sound estimation process for generating an estimated feedback sound component estimating a feedback sound arriving at the sound collecting device from the sound emitting device;
Feedback sound suppression processing for suppressing the estimated feedback sound component from the acoustic signal generated by the sound collection device;
Kurtosis calculation processing for calculating the kurtosis in the frequency distribution of the intensity of the acoustic signal after generation by the sound collection device;
A coefficient setting process for setting a subtraction coefficient according to the kurtosis calculated in the kurtosis calculation process;
A program that causes a computer to execute spectrum subtraction processing that adjusts, according to the subtraction coefficient, the spectrum of an estimated residual sound component obtained by estimating the component of the feedback sound that remains after execution of the feedback sound suppression processing, and subtracts the adjusted spectrum from the spectrum of the acoustic signal after execution of the feedback sound suppression processing.