JP2010220087A - Sound processing apparatus and program - Google Patents

Publication number
JP2010220087A
JP2010220087A JP2009066830A JP2009066830A JP2010220087A JP 2010220087 A JP2010220087 A JP 2010220087A JP 2009066830 A JP2009066830 A JP 2009066830A JP 2009066830 A JP2009066830 A JP 2009066830A JP 2010220087 A JP2010220087 A JP 2010220087A
Authority
JP
Japan
Prior art keywords
sound
kurtosis
spectrum
feedback
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2009066830A
Other languages
Japanese (ja)
Inventor
Takafumi Tanaka
啓文 田中
Kazunobu Kondo
多伸 近藤
Hiroshi Okumura
啓 奥村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yamaha Corp
Original Assignee
Yamaha Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corp
Priority to JP2009066830A
Publication of JP2010220087A

Abstract

PROBLEM TO BE SOLVED: To suppress feedback sound while also suppressing the musical noise caused by spectral subtraction.

SOLUTION: A feedback sound estimation unit 24 generates an estimated feedback sound component n2(t), which is an estimate of the feedback sound component n1(t) that reaches a sound collection device 14 from a sound emitting device 12. A feedback sound suppression unit 26 suppresses the estimated feedback sound component n2(t) in the acoustic signal w1(t) generated by the sound collection device 14. A kurtosis calculation unit 52A calculates the kurtosis KA(m) of the frequency distribution of the intensity of an estimated residual sound component e2(t), which is an estimate of the part of the feedback sound component n1(t) that remains after processing by the feedback sound suppression unit 26. A coefficient setting unit 54 sets a subtraction coefficient α(m) in accordance with the kurtosis KA(m) calculated by the kurtosis calculation unit 52A. A spectral subtraction unit 42 adjusts the spectrum E2(m,f) of the estimated residual sound component e2(t) in accordance with the subtraction coefficient α(m) and subtracts it from the spectrum W2(m,f) of the acoustic signal w2(t) output by the feedback sound suppression unit 26.

COPYRIGHT: (C) 2010, JPO & INPIT

Description

The present invention relates to a technique for suppressing the influence of sound that arrives at a sound collection device from a sound emitting device (hereinafter "feedback sound") in an acoustic system that includes the sound emitting device and the sound collection device.

Techniques for suppressing the influence (echo and howling) of feedback sound arriving at a sound collection device from a sound emitting device have been proposed. For example, Patent Document 1 discloses a technique in which a component simulating the feedback sound (hereinafter "estimated feedback sound component") is subtracted from the input speech in the time domain, and the spectrum of the estimated feedback sound component is then subtracted from the spectrum of the result (spectral subtraction).

Patent Document 1: JP 2004-56453 A

However, with a technique that suppresses the estimated feedback sound component in the frequency domain, as in Patent Document 1, components left scattered along the time axis and the frequency axis after spectral subtraction can be perceived by the listener as artificial, harsh musical noise. In view of these circumstances, an object of the present invention is to suppress feedback sound while suppressing the generation of musical noise.

To solve the above problem, a sound processing apparatus according to a first aspect of the present invention comprises: feedback sound estimation means for generating an estimated feedback sound component, which is an estimate of the feedback sound arriving at a sound collection device from a sound emitting device; feedback sound suppression means for suppressing the estimated feedback sound component in the acoustic signal after its generation by the sound collection device (for example, acoustic signal w1(t) or acoustic signal w2(t)); kurtosis calculation means (for example, kurtosis calculation unit 52A) for calculating the kurtosis (for example, kurtosis KA(m)) of the frequency distribution of the intensity of an estimated residual sound component, which is an estimate of the component of the feedback sound that remains after processing by the feedback sound suppression means; coefficient setting means for setting a subtraction coefficient in accordance with the kurtosis calculated by the kurtosis calculation means; and spectral subtraction means for adjusting the spectrum of the estimated residual sound component in accordance with the subtraction coefficient and subtracting it from the spectrum of the acoustic signal processed by the feedback sound suppression means (for example, acoustic signal w2(t)). Specific examples of this aspect are described later as the first, second, and fifth embodiments.

In the sound processing apparatus according to the first aspect, the spectrum of the estimated residual sound component is subtracted from the spectrum of the acoustic signal processed by the feedback sound suppression means. Compared with a configuration that performs only the suppression by the feedback sound suppression means, this has the advantage that the feedback sound component can be effectively suppressed in the acoustic signal generated by the sound collection device. In addition, because a subtraction coefficient set in accordance with the kurtosis of the frequency distribution of the intensity of the estimated residual sound component is applied when adjusting the spectrum of the estimated residual sound component, musical noise caused by spectral subtraction can be suppressed compared with a configuration in which the subtraction coefficient does not depend on the kurtosis (for example, one in which the subtraction coefficient is fixed at a predetermined value). Note that "the kurtosis of the frequency distribution of the intensity of the estimated residual sound component" is a concept covering both the kurtosis of the frequency distribution of the signal values (intensities) arranged in time series in the time-domain signal representing the waveform of the estimated residual sound component, and the kurtosis of the frequency distribution of the intensities in the spectrum of the estimated residual sound component. Here "frequency distribution" means a distribution of occurrence counts (a histogram), not a distribution over spectral frequency.
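As a rough illustration of the quantity discussed above, the following Python sketch computes the kurtosis of a set of intensity values. The estimator used here (fourth central moment divided by the squared variance, without bias correction) and the function name `kurtosis` are assumptions made for this example; this passage does not specify how the kurtosis is computed.

```python
def kurtosis(values):
    """Kurtosis of a sample: fourth central moment / (variance squared).

    This is an assumed estimator for illustration only. `values` may be
    time-domain signal values or spectral intensities of one frame.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 * m2)
```

A distribution dominated by a few large values among many small ones yields a larger kurtosis than one whose values are evenly spread.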

In a preferred aspect of the present invention, the spectral subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the product from the spectrum of the acoustic signal processed by the feedback sound suppression means. In this aspect, setting the subtraction coefficient to a larger value as the kurtosis increases effectively suppresses musical noise caused by spectral subtraction.
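The subtraction described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the spectra are treated as per-bin magnitude lists, negative results are clamped to a floor, and the linear mapping in `alpha_from_kurtosis` (together with its parameters `k_low`, `k_high`, `alpha_min`, `alpha_max`) is hypothetical. The text only states that the coefficient grows with the kurtosis, not which function is used.

```python
def alpha_from_kurtosis(k, k_low, k_high, alpha_min, alpha_max):
    """Hypothetical monotone mapping: larger kurtosis gives larger alpha."""
    if k_high <= k_low:
        raise ValueError("k_high must exceed k_low")
    ratio = (k - k_low) / (k_high - k_low)
    ratio = min(max(ratio, 0.0), 1.0)  # clamp to [0, 1]
    return alpha_min + ratio * (alpha_max - alpha_min)


def spectral_subtract(w2_mag, e2_mag, alpha, floor=0.0):
    """Subtract alpha * E2(m, f) from W2(m, f) for one frame m.

    Clamping at `floor` is an assumption added so that magnitudes
    never become negative.
    """
    if len(w2_mag) != len(e2_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(w - alpha * e, floor) for w, e in zip(w2_mag, e2_mag)]
```

For example, `spectral_subtract([1.0, 0.5, 2.0], [0.5, 0.5, 0.5], 2.0)` removes twice the estimated residual from each bin and clamps the bins that would otherwise go negative.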

In a preferred aspect of the present invention, the kurtosis calculation means calculates both the kurtosis of the estimated residual sound component and the kurtosis of the frequency distribution of the intensity of the acoustic signal after its generation by the sound collection device. When the two kurtosis values are close to each other, the coefficient setting means sets the subtraction coefficient in accordance with at least one of them; when they are not close, it stops updating the subtraction coefficient. In this aspect, updating of the subtraction coefficient stops when the kurtosis of the estimated residual sound component and that of the acoustic signal are not close (that is, when the acoustic signal contains a target sound component), so the subtraction coefficient is not affected by the target sound component. This has the advantage that musical noise can be suppressed appropriately. A specific example of this aspect is described later as the fifth embodiment. A configuration is also suitable in which, when the two kurtosis values are not close, the coefficient setting means stops updating the subtraction coefficient and additionally initializes it to a predetermined value (for example, a value that reduces the degree of spectral subtraction). With this configuration, spectral subtraction applied to a section of the acoustic signal that contains the target sound component is kept from becoming excessive under the influence of sections that do not contain it.
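The update rule in this aspect can be sketched as a small Python function. Everything named here is an assumption for illustration: closeness is tested as an absolute difference against a hypothetical `tolerance`, the coefficient is derived from the residual kurtosis alone through a caller-supplied `mapping`, and the optional `reset_value` models the variant that initializes the coefficient to a predetermined value.

```python
def update_subtraction_coefficient(alpha_prev, k_residual, k_signal,
                                   tolerance, mapping, reset_value=None):
    """Return the subtraction coefficient for the current frame.

    alpha_prev : coefficient used in the previous frame
    k_residual : kurtosis of the estimated residual sound component
    k_signal   : kurtosis of the acoustic signal
    mapping    : hypothetical function from kurtosis to coefficient
    """
    if abs(k_residual - k_signal) <= tolerance:
        # Kurtosis values are close: no target sound assumed, so update.
        return mapping(k_residual)
    if reset_value is not None:
        # Variant: stop updating and fall back to a predetermined value.
        return reset_value
    # Kurtosis values differ: hold the previous coefficient.
    return alpha_prev
```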

A sound processing apparatus according to a second aspect of the present invention comprises: feedback sound estimation means for generating an estimated feedback sound component, which is an estimate of the feedback sound arriving at a sound collection device from a sound emitting device; feedback sound suppression means for suppressing the estimated feedback sound component in the acoustic signal after its generation by the sound collection device (for example, acoustic signal w1(t) or acoustic signal w2(t)); kurtosis calculation means for calculating the kurtosis (for example, kurtosis KB(m) or kurtosis KC(m)) of the frequency distribution of the intensity of the acoustic signal after its generation by the sound collection device (for example, acoustic signals w1(t) to w3(t)); coefficient setting means for setting a subtraction coefficient in accordance with the kurtosis calculated by the kurtosis calculation means; and spectral subtraction means for adjusting, in accordance with the subtraction coefficient, the spectrum of an estimated residual sound component, which is an estimate of the component of the feedback sound that remains after processing by the feedback sound suppression means, and subtracting it from the spectrum of the acoustic signal processed by the feedback sound suppression means (for example, acoustic signal w2(t)). Specific examples of this aspect are described later as the third to seventh embodiments.

In the sound processing apparatus according to the second aspect, the spectrum of the estimated residual sound component is subtracted from the spectrum of the acoustic signal processed by the feedback sound suppression means. Compared with a configuration that performs only the suppression by the feedback sound suppression means, this has the advantage that the feedback sound component can be effectively suppressed in the acoustic signal generated by the sound collection device. In addition, because a subtraction coefficient set in accordance with the kurtosis of the frequency distribution of the intensity of the acoustic signal is applied when adjusting the spectrum of the estimated residual sound component, musical noise caused by spectral subtraction can be suppressed compared with a configuration in which the subtraction coefficient does not depend on the kurtosis (for example, one in which the subtraction coefficient is fixed at a predetermined value). Note that "the kurtosis of the frequency distribution of the intensity of the acoustic signal" is a concept covering both the kurtosis of the frequency distribution of the signal values (intensities) arranged in time series in the acoustic signal, and the kurtosis of the frequency distribution of the intensities in the spectrum of the acoustic signal.

In specific examples of the second aspect (for example, the third, fourth, fifth, and seventh embodiments), the kurtosis calculation means calculates the kurtosis of the acoustic signal before processing by the spectral subtraction means, and the spectral subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the product from the spectrum of the acoustic signal processed by the feedback sound suppression means. In this configuration, setting the subtraction coefficient to a larger value as the kurtosis increases, for example, effectively suppresses musical noise caused by spectral subtraction. Note that "the acoustic signal before processing by the spectral subtraction means" is not limited to the acoustic signal immediately before that processing; it covers the acoustic signal at any stage from its generation by the sound collection device up to the processing by the spectral subtraction means (for example, the acoustic signal before or after processing by the feedback sound suppression means).

In other specific examples of the second aspect (for example, the sixth and seventh embodiments), the kurtosis calculation means calculates the kurtosis of the acoustic signal after processing by the spectral subtraction means, and the spectral subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the product from the spectrum of the acoustic signal processed by the feedback sound suppression means. In this configuration, setting the subtraction coefficient to a smaller value as the kurtosis increases, for example, effectively suppresses musical noise caused by spectral subtraction. Note that "the acoustic signal after processing by the spectral subtraction means" is not limited to the acoustic signal immediately after that processing; it also covers, for example, an acoustic signal on which other processing has been performed after the processing by the spectral subtraction means.

A sound processing apparatus according to a preferred aspect of the present invention (first and second aspects) comprises target sound determination means for determining whether a target sound component is present in the acoustic signal after its generation by the sound collection device. The kurtosis calculation means calculates the kurtosis when the target sound determination means determines that no target sound component is present, and stops calculating the kurtosis when the target sound determination means determines that a target sound component is present. In this aspect, the influence of the target sound component on the subtraction coefficient is reduced (ideally eliminated), which has the advantage that musical noise caused by spectral subtraction can be suppressed effectively. Specific examples of this aspect are described later as, for example, the second and fourth embodiments.

The sound processing apparatus according to each of the above aspects can be realized by hardware (an electronic circuit) dedicated to acoustic signal processing, such as a DSP (Digital Signal Processor), or by cooperation between a general-purpose processing unit such as a CPU (Central Processing Unit) and a program. A program according to the first aspect of the present invention causes a computer to execute: feedback sound estimation processing for generating an estimated feedback sound component, which is an estimate of the feedback sound arriving at a sound collection device from a sound emitting device; feedback sound suppression processing for suppressing the estimated feedback sound component in the acoustic signal after its generation by the sound collection device; kurtosis calculation processing for calculating the kurtosis of the frequency distribution of the intensity of an estimated residual sound component, which is an estimate of the component of the feedback sound that remains after the feedback sound suppression processing; coefficient setting processing for setting a subtraction coefficient in accordance with the kurtosis calculated in the kurtosis calculation processing; and spectral subtraction processing for adjusting the spectrum of the estimated residual sound component in accordance with the subtraction coefficient and subtracting it from the spectrum of the acoustic signal after the feedback sound suppression processing. This program provides the same operation and effects as the sound processing apparatus according to the first aspect of the present invention.

A program according to the second aspect of the present invention causes a computer to execute: feedback sound estimation processing for generating an estimated feedback sound component, which is an estimate of the feedback sound arriving at a sound collection device from a sound emitting device; feedback sound suppression processing for suppressing the estimated feedback sound component in the acoustic signal after its generation by the sound collection device; kurtosis calculation processing for calculating the kurtosis of the frequency distribution of the intensity of the acoustic signal after its generation by the sound collection device; coefficient setting processing for setting a subtraction coefficient in accordance with the kurtosis calculated in the kurtosis calculation processing; and spectral subtraction processing for adjusting, in accordance with the subtraction coefficient, the spectrum of an estimated residual sound component, which is an estimate of the component of the feedback sound that remains after the feedback sound suppression processing, and subtracting it from the spectrum of the acoustic signal after the feedback sound suppression processing. This program provides the same operation and effects as the sound processing apparatus according to the second aspect of the present invention.

The program according to each of the above aspects is provided to a user stored on a computer-readable recording medium and installed on a computer, or is provided from a server device by distribution over a communication network and installed on a computer.

A block diagram of a sound processing apparatus according to a first embodiment of the present invention.
A conceptual diagram for explaining musical noise caused by spectral subtraction.
A block diagram of a sound processing apparatus according to a second embodiment of the present invention.
A block diagram of a sound processing apparatus according to a third embodiment of the present invention.
A block diagram of a sound processing apparatus according to a fourth embodiment of the present invention.
A block diagram of a sound processing apparatus according to a fifth embodiment of the present invention.
A block diagram of a sound processing apparatus according to a sixth embodiment of the present invention.
A block diagram of a sound processing apparatus according to a seventh embodiment of the present invention.

<A: First Embodiment>
FIG. 1 is a block diagram of a sound processing apparatus 100A according to the first embodiment of the present invention. The sound processing apparatus 100A is suitably used in a telephone (typically a hands-free telephone) that exchanges acoustic signals with other communication devices via a communication network.

A sound emitting device (for example, a loudspeaker) 12 and a sound collection device 14 are connected to the sound processing apparatus 100A. A time-domain (time t) acoustic signal (far-end signal) v(t) received from the communication network is supplied to the sound emitting device 12 via the sound processing apparatus 100A. The sound emitting device 12 emits sound waves corresponding to the acoustic signal v(t). The sound collection device 14 generates a time-domain acoustic signal w1(t) corresponding to the ambient sound and outputs it to the sound processing apparatus 100A. For convenience, the D/A converter that converts the acoustic signal v(t) into an analog signal for supply to the sound emitting device 12 and the A/D converter that converts the acoustic signal w1(t) into a digital signal are not shown.

Target sound and feedback sound arrive at the sound collection device 14. The target sound is sound produced by the user of the sound processing apparatus 100A (that is, the sound that is the actual object of the communication), and the feedback sound is sound that reaches the sound collection device 14 from the sound emitting device 12 directly or indirectly (that is, after reflection), typically an echo. Accordingly, as expressed by equation (1) below, the acoustic signal w1(t) corresponds to the sum of a target sound component s(t) corresponding to the target sound and a feedback sound component n1(t) corresponding to the feedback sound. The feedback sound component n1(t) is the acoustic signal v(t) with a transfer function h1, determined by the sound path from the sound emitting device 12 to the sound collection device 14, applied to it (n1(t)=h1·v(t)).
w1(t)=s(t)+n1(t) ……(1)
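The signal model of equation (1) can be simulated with a short Python sketch. It assumes, for illustration only, that applying the transfer function h1 amounts to a causal convolution of v(t) with a finite impulse response; the function names `convolve_causal` and `microphone_signal` are hypothetical.

```python
def convolve_causal(h, v):
    """n1[t] = sum over k of h[k] * v[t - k], truncated to len(v)."""
    out = []
    for t in range(len(v)):
        acc = 0.0
        for k, hk in enumerate(h):
            if t - k >= 0:
                acc += hk * v[t - k]
        out.append(acc)
    return out


def microphone_signal(s, v, h1):
    """Equation (1): w1(t) = s(t) + n1(t), with n1 = h1 applied to v.

    s and v are assumed to have the same length.
    """
    n1 = convolve_causal(h1, v)
    return [st + nt for st, nt in zip(s, n1)]
```

With a silent near end (`s` all zeros) and a single far-end impulse, `w1` simply reproduces the assumed impulse response of the echo path.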

The sound processing apparatus 100A is an echo suppression apparatus (echo canceller) that generates an acoustic signal (near-end signal) z(t) by suppressing the feedback sound component n1(t) in the acoustic signal w1(t) generated by the sound collection device 14. The acoustic signal z(t) generated by the sound processing apparatus 100A is transmitted to another communication device via the communication network. Each element of the sound processing apparatus 100A shown in FIG. 1 is realized by a general-purpose computer (CPU) that executes a program or by a dedicated electronic circuit (DSP).

The target sound determination unit 22 in FIG. 1 is an element (VAD: voice activity detection) that determines whether the target sound component s(t) is present in the acoustic signal w1(t) and thereby distinguishes voiced sections from silent sections. A voiced section is a section in which the acoustic signal w1(t) contains the target sound component s(t) (regardless of whether the feedback sound component n1(t) is present), and a silent section is a section in which the acoustic signal w1(t) does not contain the target sound component s(t) (or its intensity is sufficiently low). Any known technique may be used to distinguish voiced sections from silent sections.

The feedback sound estimation unit 24 generates an estimated feedback sound component n2(t), which is an estimate of the feedback sound component n1(t). For example, an AEC (acoustic echo canceller) using an adaptive filter is suitably used as the feedback sound estimation unit 24. The feedback sound suppression unit 26 generates an acoustic signal w2(t) by suppressing the estimated feedback sound component n2(t) generated by the feedback sound estimation unit 24 in the acoustic signal w1(t). For example, as shown in FIG. 1, a subtractor that subtracts the estimated feedback sound component n2(t) from the acoustic signal w1(t) is used as the feedback sound suppression unit 26 (w2(t)=w1(t)−n2(t)).

The feedback sound estimation unit 24 generates a transfer function h2 by estimating the transfer function h1 so that the intensity of the acoustic signal w2(t) is minimized, and generates the estimated feedback sound component n2(t) by multiplying the acoustic signal v(t) by the transfer function h2 (n2(t)=h2·v(t)). Within silent sections determined by the target sound determination unit 22 (sections in which no target sound component s(t) is present), the feedback sound estimation unit 24 successively calculates and updates the estimated feedback sound component n2(t) (the transfer function h2); within voiced sections determined by the target sound determination unit 22, it stops updating the estimated feedback sound component n2(t). This has the advantage that the estimated feedback sound component n2(t) can be estimated with high accuracy while the influence of the target sound component s(t) is suppressed.
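One common way to realize such an adaptive estimate is a normalized LMS (NLMS) filter, sketched below in Python. NLMS is an assumption chosen for this example; the text only says that an adaptive filter is used and that updates stop in voiced sections. The function name and the parameters `taps`, `mu`, and `eps` are hypothetical.

```python
def nlms_echo_canceller(v, w1, voiced, taps=4, mu=0.5, eps=1e-8):
    """Estimate h2 and return (w2, h2), where w2(t) = w1(t) - n2(t).

    v      : far-end signal samples
    w1     : microphone signal samples
    voiced : per-sample flags; True freezes the adaptation of h2
    """
    h2 = [0.0] * taps    # estimated impulse response of the echo path
    buf = [0.0] * taps   # most recent far-end samples, newest first
    w2 = []
    for t in range(len(v)):
        buf = [v[t]] + buf[:-1]
        n2 = sum(hk * xk for hk, xk in zip(h2, buf))  # estimated echo
        err = w1[t] - n2                              # w2(t)
        w2.append(err)
        if not voiced[t]:
            # Update only in silent sections, so the target sound
            # does not disturb the estimate.
            norm = sum(x * x for x in buf) + eps
            h2 = [hk + mu * err * xk / norm for hk, xk in zip(h2, buf)]
    return w2, h2
```

The step size `mu` trades convergence speed against steady-state error, and `eps` only guards against division by zero when the far-end signal is silent.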

However, because exact estimation of the transfer function h1 is difficult in practice, the actual transfer function h1 and the transfer function h2 estimated by the feedback sound estimation unit 24 do not necessarily match. Consequently, as expressed by equation (2) below, a residual sound component e1(t) (a part of the feedback sound component n1(t)) corresponding to the difference between the feedback sound component n1(t) and the estimated feedback sound component n2(t) remains in the acoustic signal w2(t) processed by the feedback sound suppression unit 26 (e1(t)=n1(t)−n2(t)). The sound processing apparatus 100A therefore suppresses the residual sound component e1(t) remaining in the acoustic signal w2(t) by subtracting the spectrum E2(m,f) of an estimated residual sound component e2(t), which is an estimate of the residual sound component e1(t), from the spectrum W2(m,f) of the acoustic signal w2(t) (spectral subtraction).
w2(t)=w1(t)−n2(t)
     =s(t)+n1(t)−n2(t)
     =s(t)+e1(t) ……(2)

The frequency analysis unit 32 in FIG. 1 generates a spectrum (frequency spectrum) N2(m,f) for each of a plurality of frames obtained by dividing, on the time axis, the time series of the estimated feedback sound component n2(t) generated by the feedback sound estimation unit 24. The symbol m denotes a frame (frame number), and the symbol f denotes a frequency or frequency band (frequency bin) on the frequency axis. Likewise, the frequency analysis unit 34 generates a spectrum (frequency spectrum) W2(m,f) for each of a plurality of frames obtained by dividing the acoustic signal w2(t) on the time axis. Any known frequency analysis technique, such as the fast Fourier transform or the wavelet transform, may be used to generate the spectra N2(m,f) and W2(m,f).

The residual sound estimation unit 36 generates the spectrum E2(m,f) of the estimated residual sound component e2(t), which is an estimate of the residual sound component e1(t) that remains of the feedback sound component n1(t) after processing by the feedback sound suppression unit 26 (that is, it estimates the residual sound component e1(t)). Because the frequency characteristic of the feedback sound component n1(t) (the estimated feedback sound component n2(t)) tends to be reflected in the residual sound component e1(t), the residual sound estimation unit 36 of this embodiment generates the spectrum E2(m,f) from the spectrum N2(m,f) of the estimated feedback sound component n2(t) generated by the frequency analysis unit 32. Specifically, the residual sound estimation unit 36 calculates the spectrum E2(m,f) of the estimated residual sound component e2(t) by multiplying the spectrum N2(m,f) of the estimated feedback sound component n2(t) by a coefficient δ (0 ≦ δ ≦ 1). The coefficient δ may be set to a value common to all frequencies or to a different value for each frequency.
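
As a minimal sketch of the scaling described above, the following function multiplies one frame of N2(m,f) by δ to obtain E2(m,f). The function name and the choice to accept either a single δ or one δ per frequency bin are illustrative assumptions; the source only states that δ lies between 0 and 1 and may be common to all frequencies or differ per frequency.

```python
def estimate_residual_spectrum(n2_spectrum, delta):
    """Scale N2(m, f) by delta to obtain E2(m, f) for one frame.

    n2_spectrum : sequence of (possibly complex) bin values N2(m, f)
    delta       : a single number shared by all bins, or one value per bin
    """
    if isinstance(delta, (int, float)):
        delta = [delta] * len(n2_spectrum)
    if len(delta) != len(n2_spectrum):
        raise ValueError("delta must be a scalar or have one value per bin")
    if any(not 0.0 <= d <= 1.0 for d in delta):
        raise ValueError("each delta must satisfy 0 <= delta <= 1")
    return [d * n for d, n in zip(delta, n2_spectrum)]
```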

The spectrum subtraction unit 42 calculates a spectrum W3(m,f) by subtracting the spectrum E2(m,f) of the estimated residual sound component e2(t) from the spectrum W2(m,f) of the acoustic signal w2(t). The spectrum W3(m,f) is the frequency spectrum of the acoustic signal z(t) after processing by the sound processing apparatus 100A, and is expressed by equation (3) below using an amplitude spectrum A(m,f) and the phase spectrum θw2(m,f) of the spectrum W2(m,f).
W3(m,f) = A(m,f)·e^{jθw2(m,f)} ……(3)

The spectrum subtraction unit 42 calculates the power spectrum |A(m,f)|² of the spectrum W3(m,f) by executing equations (4a) and (4b) below, and then calculates the spectrum W3(m,f) by equation (3) from the amplitude spectrum A(m,f) obtained from the power spectrum |A(m,f)|² and the phase spectrum θw2(m,f) of the spectrum W2(m,f).
|A(m,f)|² = |W2(m,f)|² - α(m)·|E2(m,f)|²   (|W2(m,f)|² > TH) ……(4a)
|A(m,f)|² = β(m)·|E2(m,f)|²   (|W2(m,f)|² < TH) ……(4b)

As shown in equation (4a), when the intensity (power) |W2(m,f)|² of the spectrum W2(m,f) exceeds a predetermined value TH, the spectrum subtraction unit 42 calculates the power spectrum |A(m,f)|² of the spectrum W3(m,f) by adjusting the intensity |E2(m,f)|² of the spectrum E2(m,f) of the estimated residual sound component e2(t) according to a subtraction coefficient α(m) and then subtracting it from the intensity |W2(m,f)|² of the spectrum W2(m,f) of the acoustic signal w2(t) (spectral subtraction). Specifically, the product of the subtraction coefficient α(m) and the intensity |E2(m,f)|² is subtracted from the intensity |W2(m,f)|². The predetermined value TH is set, for example, to the product of the subtraction coefficient α(m) and the intensity |E2(m,f)|² (TH = α(m)·|E2(m,f)|²). On the other hand, when the intensity |W2(m,f)|² falls below the predetermined value TH, the spectrum subtraction unit 42 calculates the power spectrum |A(m,f)|² by adjusting the intensity |E2(m,f)|² according to a flooring coefficient β(m), as shown in equation (4b) (specifically, by multiplying the intensity |E2(m,f)|² by the flooring coefficient β(m)).
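
The per-bin processing of equations (3), (4a) and (4b) can be sketched as below, under the example setting TH = α(m)·|E2(m,f)|² mentioned in the text. The function name is illustrative, β is assumed to be non-negative, and the case where |W2(m,f)|² exactly equals TH (which the text leaves open) is assigned to the flooring branch as an assumption.

```python
import cmath
import math


def spectral_subtract(w2_frame, e2_frame, alpha, beta):
    """Apply equations (4a)/(4b) and (3) to one frame of complex spectra."""
    w3_frame = []
    for w2, e2 in zip(w2_frame, e2_frame):
        p_w2 = abs(w2) ** 2                 # |W2(m, f)|^2
        p_e2 = abs(e2) ** 2                 # |E2(m, f)|^2
        threshold = alpha * p_e2            # TH = alpha(m) * |E2(m, f)|^2
        if p_w2 > threshold:
            power = p_w2 - alpha * p_e2     # equation (4a)
        else:
            power = beta * p_e2             # equation (4b), also used when equal (assumption)
        amplitude = math.sqrt(power)        # A(m, f)
        phase = cmath.phase(w2)             # theta_w2(m, f), phase of W2 is reused
        w3_frame.append(cmath.rect(amplitude, phase))  # equation (3)
    return w3_frame
```

With this choice of TH the result of (4a) is always positive, so the square root is well defined; a different TH would need its own guard against negative power.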

The inverse transform unit 44 in FIG. 1 converts the spectrum W3(m,f) of each frame generated by the spectrum subtraction unit 42 into a time-domain signal, and generates the acoustic signal z(t) by joining the converted signals of successive frames to one another on the time axis. Because the spectrum E2(m,f) of the estimated residual sound component e2(t) is subtracted from the spectrum W2(m,f) of the acoustic signal w2(t) as in equation (4a), an acoustic signal z(t) in which the residual sound component e1(t) of the feedback sound component n1(t) is effectively reduced can be generated. The acoustic signal z(t) generated by the inverse transform unit 44 is transmitted to the communication network.

In the spectrum W3(m,f) generated by subtracting the spectrum E2(m,f) as described above, high-intensity components (isolated points) caused by the spectral subtraction are scattered along the time axis and the frequency axis, and may be perceived by the listener as artificial and harsh musical noise. The greater the degree of spectral subtraction (the degree to which the spectrum changes across the subtraction), the more pronounced the musical noise in the spectrum W3(m,f). Specifically, musical noise becomes pronounced when the subtraction coefficient α(m) is set to a large value (when the degree of spectral subtraction is increased) or when the flooring coefficient β(m) is set to a small value.

Part (A) of FIG. 2 is a graph of a frequency distribution F1 (a probability density function with intensity as the random variable) generated by counting, over a predetermined number (a plurality) of frames, how often each intensity occurs at each frequency of the spectrum W2(m,f) (the acoustic signal w2(t)) before spectral subtraction. As shown in part (A) of FIG. 2, the frequency of occurrence (probability) of each intensity before spectral subtraction is distributed nonlinearly so that it decreases as the intensity increases from zero.

Part (B) of FIG. 2 is a graph of a frequency distribution (probability density function) F2 generated by counting, over a predetermined number of frames, how often each intensity occurs at each frequency of the spectrum W3(m,f) (the acoustic signal z(t)) after spectral subtraction. Because spectral subtraction increases the frequency of occurrence (probability) of intensities close to zero, the part of the distribution F2 in the region where the intensity is near zero is steeper than in the distribution F1 before spectral subtraction.

If kurtosis is introduced as a measure of the shape of a frequency distribution (the steepness of its slope), the kurtosis KC of the frequency distribution F2 after spectral subtraction is higher than the kurtosis KB of the frequency distribution F1 before spectral subtraction. The kurtosis κ is a higher-order statistic calculated from the n-th order moments μn by equation (5) below.
[Equation (5): provided only as an image in the published document]
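
Because equation (5) is available only as an image, the sketch below assumes the common definition κ = μ4 / μ2², with the moments taken about zero; both the definition and the choice of moments about zero are assumptions rather than something the text confirms. It illustrates how a distribution with more mass near zero yields a higher kurtosis, as described for F2 relative to F1.

```python
def kurtosis(values):
    """Kurtosis as mu4 / mu2**2, using moments about zero (assumed definition)."""
    count = len(values)
    if count == 0:
        raise ValueError("at least one value is required")
    mu2 = sum(x ** 2 for x in values) / count   # second-order moment
    mu4 = sum(x ** 4 for x in values) / count   # fourth-order moment
    if mu2 == 0:
        raise ValueError("kurtosis is undefined when all values are zero")
    return mu4 / (mu2 ** 2)


# Intensities spread evenly versus intensities concentrated at zero.
flat = [1.0, 1.0, 1.0, 1.0]
peaky = [0.0, 0.0, 0.0, 2.0]
print(kurtosis(flat), kurtosis(peaky))  # prints 1.0 4.0
```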

Considering the property of the kurtosis κ as an index of non-Gaussianity, it can be understood that spectral subtraction increases the non-Gaussianity of the frequency distribution. Because musical noise is highly non-Gaussian noise, musical noise tends to become more apparent as the change in kurtosis κ across spectral subtraction increases (the relative ratio KC/KB of the kurtosis KC to the kurtosis KB, or the difference (KC - KB) between the kurtosis KC and the kurtosis KB). That is, the lower the kurtosis KB of the frequency distribution F1 of the spectrum W2(m,f) (the acoustic signal w2(t)) before spectral subtraction, the more likely musical noise caused by spectral subtraction is to occur in the spectrum W3(m,f) (the acoustic signal z(t)).

Consider now the intrinsic purpose of the processing, namely subtracting the spectrum E2(m,f) of the estimated residual sound component e2(t) from the spectrum E1(m,f) of the residual sound component e1(t) contained in the spectrum W2(m,f) of the acoustic signal w2(t) (that is, ignore the target sound component s(t) for convenience). From this viewpoint, the lower the kurtosis of the frequency distribution of the intensity of the residual sound component e1(t), the more likely musical noise caused by spectral subtraction is to occur. Because the estimated residual sound component e2(t) is an estimate of the residual sound component e1(t), the lower the kurtosis KA in the frequency distribution of the intensity of the estimated residual sound component e2(t) (the spectrum E2(m,f)), the more likely musical noise caused by subtracting the spectrum E2(m,f) is to occur. In other words, the kurtosis KA can be used as a quantitative index of the degree to which musical noise occurs. In view of this tendency, in this embodiment the subtraction coefficient α(m) of equation (4a) and the flooring coefficient β(m) of equation (4b) (that is, the degree of spectral subtraction) are variably controlled according to the kurtosis KA(m) in the frequency distribution of the intensity of the estimated residual sound component e2(t) (the spectrum E2(m,f)).

The kurtosis calculation unit 52A in FIG. 1 calculates, for each frame, the kurtosis KA(m) in the frequency distribution of the intensity of the spectrum E2(m,f) (the estimated residual sound component e2(t)) generated by the residual sound estimation unit 36. A specific example of a method for calculating the kurtosis κ of the frequency distribution of M intensities x1 to xM (M is a natural number of 2 or more) is described in detail below.

The frequency distribution of the M intensities x1 to xM is approximated by the function Ga(x;k,θ) of equation (6) below.
[Equation (6): provided only as an image in the published document]

The coefficient C in equation (6) is defined as follows using the gamma function Γ(k).
[Definition of the coefficient C: provided only as an image in the published document]
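
Since equation (6) and the definition of C survive only as images, the following is a hedged reading rather than a reproduction: the notation Ga(x;k,θ) together with a coefficient defined through Γ(k) suggests the gamma density in its standard shape-scale form, with shape parameter k and scale parameter θ. Under that assumption the two expressions would read:

```latex
\mathrm{Ga}(x; k, \theta) = C \, x^{k-1} e^{-x/\theta},
\qquad
C = \frac{1}{\Gamma(k)\,\theta^{k}},
\qquad x \ge 0,\; k > 0,\; \theta > 0
```

The published equations may arrange or parameterize these terms differently.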

Equation (7) below is derived by replacing the distribution function (probability density function) P(x) in the defining equation of the second-order moment μ2 with the function Ga(x;k,θ) of equation (6).
[Equation (7): provided only as an image in the published document]

As in the derivation of the second-order moment μ2, equation (8) below is derived by replacing the distribution function (probability density function) P(x) in the defining equation of the fourth-order moment μ4 with the function Ga(x;k,θ) of equation (6).
[Equation (8): provided only as an image in the published document]

Substituting the second-order moment μ2 of equation (7) and the fourth-order moment μ4 of equation (8) into equation (5) yields equation (9) below, which defines the kurtosis κ.
[Equation (9): provided only as an image in the published document]
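
Equations (5) and (7) to (9) are not legible in this text, so the derivation below is a sketch under three stated assumptions: Ga(x;k,θ) is the standard gamma density C·x^(k-1)·e^(-x/θ) with C = 1/(Γ(k)θ^k), the moments μn are taken about zero, and the kurtosis is κ = μ4/μ2². The published equations may differ, for example if central moments are used.

```latex
\begin{aligned}
\mu_2 &= \int_0^\infty x^2 \,\mathrm{Ga}(x;k,\theta)\,dx
       = \frac{\Gamma(k+2)}{\Gamma(k)}\,\theta^{2}
       = k(k+1)\,\theta^{2} \\
\mu_4 &= \int_0^\infty x^4 \,\mathrm{Ga}(x;k,\theta)\,dx
       = \frac{\Gamma(k+4)}{\Gamma(k)}\,\theta^{4}
       = k(k+1)(k+2)(k+3)\,\theta^{4} \\
\kappa &= \frac{\mu_4}{\mu_2^{2}}
       = \frac{(k+2)(k+3)}{k(k+1)}
\end{aligned}
```

Under these assumptions the kurtosis depends only on the shape parameter k, not on the scale θ, and it decreases as k increases.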

The kurtosis calculation unit 52A in FIG. 1 calculates, as the kurtosis KA(m), the kurtosis κ of equation (9) obtained when the M intensities |E2(m,f)|² of the spectra E2(m,f) over a predetermined number of frames including the m-th frame (for example, a plurality of frames ending with the m-th frame) are taken as the intensities x1 to xM.
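
A per-frame calculation over a sliding window of frames might look like the sketch below. Several points are assumptions not confirmed by the text: the gamma shape parameter k is fitted by the method of moments (k = mean² / variance), the kurtosis is then taken as (k+2)(k+3) / (k(k+1)), which is what a gamma density with moments about zero would give, and the class and parameter names are illustrative.

```python
from collections import deque


class FrameKurtosis:
    """Kurtosis of |E2(m, f)|^2 over the most recent num_frames frames."""

    def __init__(self, num_frames):
        # Frames older than num_frames are discarded automatically.
        self.frames = deque(maxlen=num_frames)

    def update(self, e2_frame):
        """Add the spectrum of frame m and return KA(m)."""
        self.frames.append([abs(e) ** 2 for e in e2_frame])
        xs = [x for frame in self.frames for x in frame]   # intensities x1..xM
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        if var == 0.0:
            # Limit of the expression below as k grows without bound.
            return 1.0
        k = mean * mean / var                              # method-of-moments shape (assumption)
        return (k + 2.0) * (k + 3.0) / (k * (k + 1.0))
```

Each call to update corresponds to one frame m, so the returned value can be handed directly to the coefficient setting step for that frame.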

The coefficient setting unit 54 in FIG. 1 variably sets the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KA(m) calculated by the kurtosis calculation unit 52A. The subtraction coefficient α(m) and the flooring coefficient β(m) are updated each time the kurtosis KA(m) is calculated for a frame. As described with reference to FIG. 2, the smaller the kurtosis KA(m) (that is, the smaller the kurtosis in the frequency distribution of the intensity of the residual sound component e1(t)), the more likely musical noise is to occur after spectral subtraction. In view of this tendency, the coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) so that the degree of spectral subtraction is reduced (that is, so that musical noise caused by spectral subtraction is reduced) as the kurtosis KA(m) becomes smaller.

Specifically, because musical noise in the acoustic signal z(t) is suppressed more as the subtraction coefficient α(m) becomes smaller, the coefficient setting unit 54 sets the subtraction coefficient α(m) to a smaller value (that is, a value that limits the amount subtracted from the spectrum W2(m,f)) as the kurtosis KA(m) becomes smaller. The specific relationship between the subtraction coefficient α(m) and the kurtosis KA(m) is arbitrary; for example, a configuration in which the subtraction coefficient α(m) is calculated by multiplying the kurtosis KA(m) by a predetermined positive number is suitable.

Likewise, because musical noise in the acoustic signal z(t) is suppressed more as the flooring coefficient β(m) becomes larger, the coefficient setting unit 54 sets the flooring coefficient β(m) to a larger value (that is, a value that limits the difference between the spectrum W2(m,f) and the spectrum W3(m,f)) as the kurtosis KA(m) becomes smaller. For example, a configuration in which the value obtained by subtracting the kurtosis KA(m) from a predetermined value is used as the flooring coefficient β(m) is suitable. A configuration in which the coefficient setting unit 54 looks up the subtraction coefficient α(m) and the flooring coefficient β(m) corresponding to the kurtosis KA(m) in a table that stores the two coefficients for each value (or range) of the kurtosis KA(m) is also suitable.
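
The two example mappings and the table lookup described above can be sketched as follows. The constants alpha_gain and beta_offset, the lower bound beta_min that keeps β from becoming negative, and the table layout (range edges plus one coefficient pair per range) are all hypothetical choices made for illustration; the source specifies none of them.

```python
import bisect


def set_coefficients(kurt, alpha_gain, beta_offset, beta_min=0.0):
    """Map a kurtosis value to (alpha, beta).

    alpha = alpha_gain * kurt      : smaller kurtosis gives smaller alpha
    beta  = beta_offset - kurt     : smaller kurtosis gives larger beta
    beta is floored at beta_min (an added safeguard, not from the source).
    """
    if alpha_gain <= 0:
        raise ValueError("alpha_gain must be a positive number")
    alpha = alpha_gain * kurt
    beta = max(beta_offset - kurt, beta_min)
    return alpha, beta


def lookup_coefficients(kurt, edges, table):
    """Table variant: edges are ascending range boundaries of the kurtosis,
    and table holds len(edges) + 1 pairs (alpha, beta), one per range."""
    if len(table) != len(edges) + 1:
        raise ValueError("table needs one entry per kurtosis range")
    return table[bisect.bisect_right(edges, kurt)]
```

For instance, set_coefficients(2.0, 0.5, 3.0) gives α = 1.0 and β = 1.0, while a higher kurtosis of 4.0 with the same constants gives α = 2.0 and β = 0.0.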

In the above embodiment, the subtraction coefficient α(m), which is applied to adjusting the spectrum E2(m,f) subtracted from the spectrum W2(m,f) of the acoustic signal w2(t), is variably set according to the kurtosis KA(m) in the frequency distribution of the intensity of the estimated residual sound component e2(t) (that is, according to the degree of musical noise that may arise from spectral subtraction). Therefore, compared with a configuration in which the subtraction coefficient α(m) does not depend on the kurtosis KA(m) (for example, a configuration in which α(m) is fixed to a predetermined value), the feedback sound component n1(t) (in particular the residual sound component e1(t)) can be suppressed while musical noise in the acoustic signal z(t) is effectively suppressed. Furthermore, because the flooring coefficient β(m) is also variably set according to the kurtosis KA(m), the effect of reducing musical noise caused by spectral subtraction is especially pronounced compared with a configuration in which neither α(m) nor β(m) depends on the kurtosis KA(m) (for example, a configuration in which α(m) and β(m) are fixed to predetermined values) or a configuration in which only the subtraction coefficient α(m) is set according to the kurtosis KA(m).

<B: Second Embodiment>
Next, a second embodiment of the present invention will be described. In each of the following embodiments, elements whose operation and function are equivalent to those of the first embodiment are given the same reference numerals as above, and their detailed description is omitted as appropriate.

The feedback sound estimation unit 24 of the first embodiment estimates the estimated feedback sound component n2(t) with high accuracy by stopping its operation (the updating of n2(t)) in the sounded sections identified by the target sound determination unit 22. However, an error may arise in the estimated feedback sound component n2(t) applied to a sounded section because of various factors (for example, the influence of an estimation error in the immediately preceding silent section). Because the kurtosis calculation unit 52A of the first embodiment calculates the kurtosis KA(m) from the spectrum E2(m,f) of the estimated residual sound component e2(t), which depends on the estimated feedback sound component n2(t), an error originating in n2(t) arises in the kurtosis KA(m). Consequently, the subtraction coefficient α(m) and the flooring coefficient β(m) cannot always be set to appropriate values. The second embodiment addresses this problem.

FIG. 3 is a block diagram of a sound processing apparatus 100B according to the second embodiment. As indicated by the broken-line arrow in FIG. 3, the result of the determination by the target sound determination unit 22 is supplied to the kurtosis calculation unit 52A as well as to the feedback sound estimation unit 24. Within the silent sections identified by the target sound determination unit 22 (sections in which the target sound component s(t) is absent), the kurtosis calculation unit 52A calculates and updates the kurtosis KA(m) for each frame as in the first embodiment, but within the sounded sections it stops calculating the kurtosis KA(m). Accordingly, within a sounded section, the kurtosis KA(m) calculated at the end of the immediately preceding silent section continues to be supplied to the coefficient setting unit 54. The operation by which the coefficient setting unit 54 calculates the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KA(m) is the same as in the first embodiment.
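
The hold behaviour described above can be sketched as a small wrapper around any kurtosis function. The class name, the compute callback, and the initial value returned before the first silent section has been seen are illustrative assumptions; the source does not say what is supplied before any silent section has occurred.

```python
class HeldKurtosis:
    """Update the kurtosis only in silent sections; hold it otherwise."""

    def __init__(self, compute, initial=None):
        self.compute = compute      # function mapping one frame to a kurtosis value
        self.value = initial        # value reported before any silent frame (assumption)

    def update(self, frame, is_silent):
        if is_silent:
            # Target sound absent: calculate and update the kurtosis for this frame.
            self.value = self.compute(frame)
        # Target sound present: skip the calculation and keep the last value.
        return self.value
```

Because the calculation is skipped entirely in sounded sections, the wrapper also reflects the reduction in processing load that the embodiment mentions.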

In the above embodiment, updating of the kurtosis KA(m) stops within the sounded sections, so an error in the estimated feedback sound component n2(t) within a sounded section does not affect the subtraction coefficient α(m) or the flooring coefficient β(m). Musical noise caused by spectral subtraction can therefore be suppressed more effectively than in the first embodiment. In addition, because the calculation of the kurtosis KA(m) stops within the sounded sections, the processing load of the kurtosis calculation unit 52A is reduced compared with the first embodiment, in which the kurtosis KA(m) is calculated in both the sounded and the silent sections.

<C: Third Embodiment>
As described above with reference to FIG. 2, the lower the kurtosis KB(m) of the frequency distribution F1 of the intensity of the spectrum W2(m,f) (the acoustic signal w2(t)) before spectral subtraction, the more likely musical noise caused by spectral subtraction is to occur in the spectrum W3(m,f) (the acoustic signal z(t)). In view of this tendency, in the third embodiment the degree of spectral subtraction (the subtraction coefficient α(m) and the flooring coefficient β(m)) is controlled according to the kurtosis KB(m) of the frequency distribution F1 of the intensity of the spectrum W2(m,f).

FIG. 4 is a block diagram of a sound processing apparatus 100C according to the third embodiment. As shown in FIG. 4, the sound processing apparatus 100C has a configuration in which the kurtosis calculation unit 52A of the sound processing apparatus 100A of the first embodiment is replaced with a kurtosis calculation unit 52B. The kurtosis calculation unit 52B calculates, for each frame, the kurtosis KB(m) in the frequency distribution F1 of the intensity of the spectrum W2(m,f) (the acoustic signal w2(t)) generated by the frequency analysis unit 34. The kurtosis KB(m) is calculated by the same method as the kurtosis KA(m) in the first embodiment.

As in the first embodiment, the coefficient setting unit 54 variably sets the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KB(m) calculated by the kurtosis calculation unit 52B. Because musical noise is more likely to occur as the kurtosis KB(m) becomes lower, the coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) so that the degree of spectral subtraction is reduced as the kurtosis KB(m) becomes smaller. Specifically, the coefficient setting unit 54 sets the subtraction coefficient α(m) to a smaller value and the flooring coefficient β(m) to a larger value as the kurtosis KB(m) becomes smaller.

In the third embodiment, the subtraction coefficient α(m) and the flooring coefficient β(m) are controlled according to the kurtosis KB(m) of the spectrum W2(m,f) of the acoustic signal w2(t). As in the first embodiment, the feedback sound component n1(t) (in particular the residual sound component e1(t)) can therefore be suppressed while musical noise in the acoustic signal z(t) is effectively suppressed. In addition, because the kurtosis KB(m) of the acoustic signal w2(t) is used, appropriate values of the subtraction coefficient α(m) and the flooring coefficient β(m) can be calculated without being affected by the accuracy with which the estimated residual sound component e2(t) is estimated.

<D: Fourth Embodiment>
The acoustic signal w2(t) may or may not contain the target sound component s(t). Because the feedback sound component n1(t) is reverberant sound that reaches the sound collection device 14 after reflection and diffusion, its kurtosis (non-Gaussianity) is lower than that of the target sound component s(t), most of which reaches the sound collection device 14 directly from the sound source. That is, the kurtosis KB(m) when the acoustic signal w2(t) does not contain the target sound component s(t) is lower than the kurtosis KB(m) when the acoustic signal w2(t) contains the target sound component s(t).

Suppose, then, that the relationship between the kurtosis KB(m) and the subtraction coefficient α(m) is determined so that musical noise is effectively suppressed for the kurtosis KB(m) obtained when the acoustic signal w2(t) does not contain the target sound component s(t). In the configuration of the third embodiment, in which the kurtosis KB(m) of the acoustic signal w2(t) is reflected in the subtraction coefficient α(m) regardless of whether the target sound component s(t) is present, the subtraction coefficient α(m) is then set to a large value when the acoustic signal w2(t) contains the target sound component s(t) (that is, when the kurtosis KB(m) is higher than when s(t) is absent), and the spectrum E2(m,f) of the estimated residual sound component e2(t) may be subtracted excessively from the spectrum W2(m,f) of the acoustic signal w2(t). In the fourth embodiment, therefore, the calculation of the kurtosis KB(m) is stopped when the acoustic signal w2(t) contains the target sound component s(t).

FIG. 5 is a block diagram of a sound processing apparatus 100D according to the fourth embodiment. As indicated by the broken-line arrow in FIG. 5, the result of the determination by the target sound determination unit 22 is supplied to the kurtosis calculation unit 52B as well as to the feedback sound estimation unit 24. Like the kurtosis calculation unit 52A of the second embodiment, the kurtosis calculation unit 52B executes or stops the calculation of the kurtosis KB(m) according to the result of the determination by the target sound determination unit 22. That is, within the silent sections identified by the target sound determination unit 22, the kurtosis calculation unit 52B calculates and updates the kurtosis KB(m) for each frame as in the third embodiment, but within the sounded sections it stops calculating the kurtosis KB(m). The operation by which the coefficient setting unit 54 calculates the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KB(m) is the same as in the third embodiment.

In the above embodiment, the calculation of the kurtosis KB(m) stops within voiced sections, in which the acoustic signal w2(t) contains the target sound component s(t), so the subtraction coefficient α(m) and the flooring coefficient β(m) are not affected by the target sound component s(t). Musical noise caused by spectral subtraction can therefore be suppressed appropriately. In addition, because the calculation of the kurtosis KB(m) stops within voiced sections, the processing load of the kurtosis calculation unit 52B is lower than in the third embodiment, which calculates the kurtosis KB(m) in both voiced and silent sections.
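The gating described for the fourth embodiment can be sketched as follows. This is a minimal illustration, not the apparatus itself: the moment-based `kurtosis` function is a standard textbook definition used here as an assumption (the patent's own kurtosis formula is not reproduced in this section), and the class name `GatedKurtosis` and the flag `target_present` are hypothetical names standing in for the kurtosis calculation unit 52B and the output of the target sound determination unit 22.

```python
from typing import Optional, Sequence


def kurtosis(values: Sequence[float]) -> float:
    """Moment-based kurtosis: fourth central moment divided by squared variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return 0.0  # constant frame; returning 0.0 is a convention of this sketch
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / (var * var)


class GatedKurtosis:
    """Keeps the latest KB(m) and refreshes it only in silent frames."""

    def __init__(self) -> None:
        self.value: Optional[float] = None  # no estimate until the first silent frame

    def update(self, intensities: Sequence[float], target_present: bool) -> Optional[float]:
        if not target_present:
            # Silent section: calculate and update KB(m) for this frame.
            self.value = kurtosis(intensities)
        # Voiced section: the calculation is skipped and the last value is kept.
        return self.value


tracker = GatedKurtosis()
kb_silent = tracker.update([1.0, 1.0, 1.0, 5.0], target_present=False)
kb_voiced = tracker.update([0.0, 0.0, 0.0, 100.0], target_present=True)
```

In this sketch the second call leaves the stored value untouched, so the coefficients derived from it would not react to the voiced frame.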

<E: Fifth Embodiment>
FIG. 6 is a block diagram of a sound processing apparatus 100E according to the fifth embodiment. As shown in FIG. 6, the sound processing apparatus 100E is obtained by adding the kurtosis calculation unit 52B of the third embodiment (FIG. 4) to the sound processing apparatus 100A of the first embodiment (FIG. 1) and adding a determination unit 56 to the coefficient setting unit 54. The kurtosis calculation unit 52A calculates the kurtosis KA(m) of the estimated residual sound component e2(t) (spectrum E2(m,f)) for each frame, and the kurtosis calculation unit 52B calculates the kurtosis KB(m) of the acoustic signal w2(t) (spectrum W2(m,f)) for each frame.

The estimated residual sound component e2(t) (spectrum E2(m,f)), for which the kurtosis calculation unit 52A calculates the kurtosis KA(m), is an estimate of the residual sound component e1(t) contained in the acoustic signal w2(t), for which the kurtosis calculation unit 52B calculates the kurtosis KB(m). Therefore, when the acoustic signal w2(t) contains only the residual sound component e1(t) (that is, when it does not contain the target sound component s(t)), the kurtosis KA(m) and the kurtosis KB(m) are close to each other. On the other hand, as described for the fourth embodiment, the kurtosis KB(m) differs depending on whether the acoustic signal w2(t) contains the target sound component s(t), so when it does, the kurtosis KA(m) and the kurtosis KB(m) differ.

In view of this tendency, the determination unit 56 distinguishes voiced sections from silent sections according to whether the kurtosis KA(m) calculated by the kurtosis calculation unit 52A and the kurtosis KB(m) calculated by the kurtosis calculation unit 52B are close to each other. That is, the determination unit 56 determines the presence or absence of the target sound component s(t) by a method different from that of the target sound determination unit 22. Specifically, when the kurtosis KA(m) and the kurtosis KB(m) are close, the determination unit 56 determines that the acoustic signal w2(t) does not contain the target sound component s(t) (a silent section); when they are not close, it determines that the acoustic signal w2(t) contains the target sound component s(t) (a voiced section). For example, a silent section is determined when the absolute difference between the kurtosis KA(m) and the kurtosis KB(m) is below a predetermined threshold, and a voiced section is determined when the difference exceeds the threshold.
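The threshold test used by the determination unit 56 can be sketched in a few lines. The function name `is_voiced`, the threshold value, and the sample kurtosis values are hypothetical; the text only states the behavior for differences below or above the threshold, so the handling of an exactly equal difference is a choice made for this sketch.

```python
def is_voiced(ka: float, kb: float, threshold: float) -> bool:
    """Return True when KA(m) and KB(m) differ by more than `threshold`.

    The source only covers differences below or above the threshold;
    treating an exactly equal difference as silent is a choice of this sketch.
    """
    return abs(ka - kb) > threshold


# Hypothetical (KA, KB) pairs and threshold, for illustration only.
frames = [(3.1, 3.3), (3.0, 7.5), (2.9, 2.9)]
labels = ["voiced" if is_voiced(ka, kb, threshold=1.0) else "silent" for ka, kb in frames]
```

Only the second pair differs by more than the threshold, so only that frame is labeled voiced.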

The coefficient setting unit 54 executes or stops the update of the subtraction coefficient α(m) and the flooring coefficient β(m) according to the result of the determination by the determination unit 56. Specifically, for each frame in a silent section identified by the determination unit 56, the coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) according to the kurtosis KA(m) calculated by the kurtosis calculation unit 52A, as in the first embodiment. For each frame in a voiced section identified by the determination unit 56, it stops calculating the subtraction coefficient α(m) and the flooring coefficient β(m). When the calculation is stopped, the coefficient setting unit 54 initializes the subtraction coefficient α(m) and the flooring coefficient β(m) to predetermined values (for example, values that reduce the degree of spectral subtraction) that are independent of the values in effect before the stop (before the start of the voiced section). Specifically, the subtraction coefficient α(m) is set to a value close to zero and the flooring coefficient β(m) is set to a value close to 1. This keeps the spectral subtraction applied to each frame in the voiced section from becoming excessive under the influence of the acoustic signal w2(t) (feedback sound component n1(t)) in the silent section. However, applying the coefficients updated in the silent section to frames in the voiced section does not always make the spectral subtraction excessive. A configuration may therefore also be adopted in which the coefficient setting unit 54 continues to supply the subtraction coefficient α(m) and the flooring coefficient β(m) as updated in the silent section to the spectrum subtraction unit 42 during the immediately following voiced section.
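The update, reset, and hold behaviors of the coefficient setting unit 54 described above can be sketched as a small state holder. Everything named here is an assumption for illustration: the class `CoefficientSetter`, the reset values 0.05 and 0.95 (chosen only to be "close to zero" and "close to 1"), and `example_mapping`, which stands in for the unspecified rule that derives the coefficients from the kurtosis KA(m).

```python
from typing import Callable, Tuple

Coefficients = Tuple[float, float]  # (alpha, beta)


class CoefficientSetter:
    """Updates (alpha, beta) in silent frames; resets or holds them in voiced frames."""

    def __init__(self, mapping: Callable[[float], Coefficients],
                 alpha_reset: float = 0.05, beta_reset: float = 0.95,
                 hold_in_voiced: bool = False) -> None:
        self.mapping = mapping
        self.reset_values = (alpha_reset, beta_reset)
        self.hold_in_voiced = hold_in_voiced
        self.current = self.reset_values

    def step(self, ka: float, voiced: bool) -> Coefficients:
        if not voiced:
            # Silent section: set the coefficients from KA(m).
            self.current = self.mapping(ka)
        elif not self.hold_in_voiced:
            # Voiced section: fall back to values that weaken the subtraction.
            self.current = self.reset_values
        # Voiced section with hold_in_voiced=True: keep the last silent-section values.
        return self.current


def example_mapping(ka: float) -> Coefficients:
    """Hypothetical mapping: alpha grows with KA(m), capped at 2.0; beta is fixed."""
    return (min(2.0, 0.5 * ka), 0.1)


setter = CoefficientSetter(example_mapping)
silent_coeffs = setter.step(3.0, voiced=False)
voiced_coeffs = setter.step(10.0, voiced=True)
```

Passing `hold_in_voiced=True` corresponds to the alternative configuration in which the coefficients updated in the silent section remain in use during the following voiced section.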

In the above embodiment, the update of the subtraction coefficient α(m) and the flooring coefficient β(m) applied to the spectral subtraction stops within voiced sections, which prevents the characteristics of the target sound component s(t) from affecting the degree of spectral subtraction (for example, excessive subtraction caused by the target sound component s(t)). In addition, because the calculation of the subtraction coefficient α(m) and the flooring coefficient β(m) stops within voiced sections, the processing load of the coefficient setting unit 54 is lower than in a configuration that calculates them in both voiced and silent sections. Although the above description exemplifies a configuration in which the coefficient setting unit 54 calculates the subtraction coefficient α(m) and the flooring coefficient β(m) from the kurtosis KA(m), a configuration that calculates them from the kurtosis KB(m) as in the third embodiment, or from both the kurtosis KA(m) and the kurtosis KB(m) (for example, from their average), may also be adopted.

<F: Sixth Embodiment>
FIG. 7 is a block diagram of a sound processing apparatus 100F according to the sixth embodiment. As shown in FIG. 7, the sound processing apparatus 100F is obtained by replacing the kurtosis calculation unit 52B in the sound processing apparatus 100C of the third embodiment (FIG. 4) (or the kurtosis calculation unit 52A in the sound processing apparatus 100A of the first embodiment) with a kurtosis calculation unit 52C. For each frame, the kurtosis calculation unit 52C calculates, from the spectrum W3(m,f), the kurtosis KC(m) of the frequency distribution F2 (part (B) of FIG. 2) of the intensity of the acoustic signal z(t) after spectral subtraction (spectrum W3(m,f)). The kurtosis KC(m) is calculated by the same method as the kurtosis KB(m) in the third embodiment.

The coefficient setting unit 54 sets the subtraction coefficient α(m) and the flooring coefficient β(m) sequentially for each frame according to the result calculated by the kurtosis calculation unit 52C. Because the kurtosis KC(m) is calculated from the result of a spectral subtraction that has already been executed (spectrum W3(m,f)), the subtraction coefficient α(m) and the flooring coefficient β(m) of the m-th frame are set according to the kurtosis KC(m-1) calculated from the spectrum W3(m-1,f) of the (m-1)-th frame. As described with reference to FIG. 2, the larger the kurtosis KC(m), the more likely musical noise is to occur. The coefficient setting unit 54 therefore sets the subtraction coefficient α(m) to a smaller value and the flooring coefficient β(m) to a larger value as the kurtosis KC(m-1) increases. The sixth embodiment achieves the same effects as the first and third embodiments.
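The one-frame delay in the sixth embodiment (coefficients for frame m derived from the kurtosis of frame m-1 after subtraction) can be sketched as a loop. Several parts are assumptions made for this sketch and are not taken from the patent: the moment-based `kurtosis`, the linear mapping in `coefficients_from_post_kurtosis`, the initial coefficients, and the flooring rule `max(W2 - alpha * E2, beta * W2)`, which is a common form of spectral subtraction chosen because it matches the remark that β close to 1 weakens the subtraction.

```python
from typing import List, Sequence, Tuple


def kurtosis(values: Sequence[float]) -> float:
    """Moment-based kurtosis (an assumed definition for this sketch)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return 0.0
    return (sum((v - mean) ** 4 for v in values) / n) / (var * var)


def coefficients_from_post_kurtosis(kc_prev: float) -> Tuple[float, float]:
    """Hypothetical monotone mapping: alpha falls and beta rises as KC(m-1) grows."""
    alpha = max(0.0, 2.0 - 0.1 * kc_prev)
    beta = min(1.0, 0.01 * kc_prev)
    return alpha, beta


def subtract_frame(w2: Sequence[float], e2: Sequence[float],
                   alpha: float, beta: float) -> List[float]:
    """Subtract alpha * E2 from W2 per bin, never going below the floor beta * W2."""
    return [max(w - alpha * e, beta * w) for w, e in zip(w2, e2)]


def process(frames_w2: Sequence[Sequence[float]], frames_e2: Sequence[Sequence[float]],
            alpha0: float = 1.0, beta0: float = 0.05) -> List[List[float]]:
    alpha, beta = alpha0, beta0  # assumed starting values for the first frame
    out = []
    for w2, e2 in zip(frames_w2, frames_e2):
        w3 = subtract_frame(w2, e2, alpha, beta)
        out.append(w3)
        # KC of this frame sets the coefficients used for the next frame.
        alpha, beta = coefficients_from_post_kurtosis(kurtosis(w3))
    return out


result = process([[1.0, 2.0, 3.0, 4.0]] * 2, [[0.5, 0.5, 0.5, 0.5]] * 2)
```

The spectra here are lists of non-negative magnitudes; a real implementation would also carry phase and operate on the frame spectra W2(m,f) and E2(m,f).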

As in the fourth embodiment, a configuration that stops the calculation and update of the kurtosis KC(m) within voiced sections may also be adopted. Also, although the subtraction coefficient α(m) and the flooring coefficient β(m) are calculated above according to the kurtosis KC(m-1), a configuration that calculates the subtraction coefficient α(m) and the flooring coefficient β(m) of the m-th frame from the kurtosis KC(m) of that same frame may also be adopted. For example, a spectral subtraction applying the subtraction coefficient α(m) and the flooring coefficient β(m) set according to the post-subtraction kurtosis KC(m) of the m-th frame is executed on the spectrum W2(m,f) of the m-th frame (that is, a spectral subtraction for calculating the kurtosis KC(m) and the spectral subtraction for the actual purpose are both executed for a single frame).

<G: Seventh Embodiment>
FIG. 8 is a block diagram of a sound processing apparatus 100G according to the seventh embodiment of the present invention. As shown in FIG. 8, the sound processing apparatus 100G includes the kurtosis calculation unit 52B of the third embodiment, which calculates the kurtosis KB(m) before spectral subtraction, and the kurtosis calculation unit 52C of the sixth embodiment, which calculates the kurtosis KC(m) after spectral subtraction.

The coefficient setting unit 54 variably sets the subtraction coefficient α(m) and the flooring coefficient β(m) of the m-th frame according to both the kurtosis KB(m-1) and the kurtosis KC(m-1). Specifically, the subtraction coefficient α(m) and the flooring coefficient β(m) are calculated so that the degree of spectral subtraction decreases as the difference between the kurtosis KB(m-1) and the kurtosis KC(m-1) increases. For example, the coefficient setting unit 54 sets the subtraction coefficient α(m) to a smaller value and the flooring coefficient β(m) to a larger value as the ratio KC(m-1)/KB(m-1) of the kurtosis KC(m-1) to the kurtosis KB(m-1), or the difference KC(m-1) - KB(m-1), increases.
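A ratio-based rule of the kind described for the seventh embodiment might look like the following. The function name, the constants `alpha_max` and `beta_min`, and the specific formula are hypothetical; the text only requires that α(m) decrease and β(m) increase as KC(m-1)/KB(m-1) grows.

```python
from typing import Tuple


def coefficients_from_ratio(kb_prev: float, kc_prev: float,
                            alpha_max: float = 2.0,
                            beta_min: float = 0.01) -> Tuple[float, float]:
    """Weaken the subtraction as KC(m-1)/KB(m-1) rises above 1 (assumed rule)."""
    if kb_prev <= 0.0:
        raise ValueError("KB(m-1) must be positive to form the ratio")
    ratio = kc_prev / kb_prev
    excess = max(0.0, ratio - 1.0)  # how far subtraction raised the kurtosis
    alpha = alpha_max / (1.0 + excess)
    beta = min(1.0, beta_min * (1.0 + excess))
    return alpha, beta


unchanged = coefficients_from_ratio(kb_prev=4.0, kc_prev=4.0)
doubled = coefficients_from_ratio(kb_prev=4.0, kc_prev=8.0)
```

When the kurtosis is unchanged by the subtraction the sketch keeps the strongest setting; when it doubles, α is halved and β is doubled.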

In the above embodiment, both the kurtosis KB(m) before spectral subtraction and the kurtosis KC(m) after spectral subtraction are reflected in the subtraction coefficient α(m) and the flooring coefficient β(m). Musical noise caused by spectral subtraction can therefore be suppressed more accurately than in configurations that use only one of the kurtosis KB(m) and the kurtosis KC(m) (the third and sixth embodiments).

<H: Modifications>
Each of the embodiments illustrated above may be modified in various ways. Specific modifications are exemplified below. Two or more modes arbitrarily selected from the following examples may be combined as appropriate.

(1) Modification 1
The signal from which the kurtosis calculation unit 52B calculates the kurtosis KB(m) may be changed as appropriate. For example, the kurtosis KB(m) may be calculated from the intensities (x1 to xM) of the spectrum of the acoustic signal w1(t) generated by the sound collection device 14. A configuration that calculates the kurtosis KB(m) from a time-domain signal is also suitable. For example, the kurtosis κ of the frequency distribution of the intensity (signal values) of the acoustic signal w1(t) or the acoustic signal w2(t) (that is, the value of Equation (9) applied to M intensities x1 to xM arranged in time series) may be calculated as the kurtosis KB(m) before spectral subtraction.

Similarly, the signal from which the kurtosis calculation unit 52C calculates the kurtosis KC(m) in the sixth and seventh embodiments may be changed as appropriate. For example, a configuration that calculates the kurtosis κ of the frequency distribution of the signal values of the time-domain acoustic signal z(t) as the kurtosis KC(m) after spectral subtraction, or a configuration that calculates the kurtosis KC(m) from an acoustic signal (or spectrum) generated by predetermined processing of the acoustic signal z(t), may be adopted.
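The time-domain variant in Modification 1 can be sketched as a kurtosis taken over M consecutive sample intensities. Equation (9) is not reproduced in this section, so the moment-based `kurtosis` below is an assumed stand-in, and taking the absolute sample value as the intensity, the frame length, and the function name `framewise_kurtosis` are likewise choices made for this sketch.

```python
from typing import List, Sequence


def kurtosis(values: Sequence[float]) -> float:
    """Moment-based kurtosis (assumed; Equation (9) itself is not shown here)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return 0.0  # constant frame; a convention of this sketch
    return (sum((v - mean) ** 4 for v in values) / n) / (var * var)


def framewise_kurtosis(signal: Sequence[float], frame_len: int) -> List[float]:
    """Kurtosis of sample intensities |x1|..|xM| for each full frame of M samples."""
    result = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        intensities = [abs(s) for s in signal[start:start + frame_len]]
        result.append(kurtosis(intensities))
    return result  # a trailing partial frame is ignored


per_frame = framewise_kurtosis([-1.0, 1.0, -1.0, 5.0, 2.0, 2.0, 2.0, 2.0, 9.0], frame_len=4)
```

The same function could be applied to w1(t), w2(t), or z(t), depending on which kurtosis (KB(m) or KC(m)) is being replaced by its time-domain counterpart.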

(2) Modification 2
The point at which the signal is converted from the time domain to the frequency domain may be changed arbitrarily. That is, except that the processing by the spectrum subtraction unit 42 is executed in the frequency domain, the present invention does not restrict whether the processing of each element of the sound processing apparatus 100 (100A to 100F) is executed in the frequency domain or in the time domain. For example, in a configuration that converts the acoustic signal w1(t) to the frequency domain immediately after it is generated by the sound collection device 14 (before processing by the feedback sound suppression unit 26), the processing by the feedback sound estimation unit 24 and the feedback sound suppression unit 26 is executed in the frequency domain. Also, for example, in a configuration in which the frequency analysis unit 32 is placed after the residual sound estimation unit 36, the residual sound estimation unit 36 generates the estimated residual sound component e2(t) in the time domain.

(3) Modification 3
The residual sound estimation unit 36 in each of the above embodiments is not an essential element. For example, in each of the above embodiments, the spectrum E2(m,f) of the estimated residual sound component e2(t) is generated by multiplying the spectrum N2(m,f) of the estimated feedback sound component n2(t) by a coefficient δ. If the subtraction coefficient α(m) and the flooring coefficient β(m) are set with the coefficient δ taken into account, however, the spectrum N2(m,f) of the estimated feedback sound component n2(t) can instead be supplied to the spectrum subtraction unit 42 as the spectrum E2(m,f) of the estimated residual sound component e2(t).
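The substitution in Modification 3 can be written out for the subtraction term. The sketch below assumes that the subtracted quantity is the product of the subtraction coefficient and the spectrum of the estimated residual sound component, as recited in the dependent claims, and leaves the flooring step aside; the symbol α'(m) is introduced here only for illustration.

```latex
\begin{align}
E_2(m,f) &= \delta \, N_2(m,f) \\
W_2(m,f) - \alpha(m)\,E_2(m,f) &= W_2(m,f) - \alpha'(m)\,N_2(m,f),
\qquad \alpha'(m) = \alpha(m)\,\delta
\end{align}
```

Under that assumption, supplying N2(m,f) directly gives the same subtraction as long as the coefficient setting unit uses α'(m) in place of α(m).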

(4) Modification 4
The method of calculating the kurtosis κ (KA(m), KB(m), KC(m)) is not limited to the examples in each embodiment. For example, approximating the frequency distribution of the intensities x1 to xM with a predetermined function (for example, Equation (6)) is not essential. The period at which the kurtosis κ (KA(m), KB(m), KC(m)) is calculated and updated is also arbitrary. For example, a configuration that calculates the kurtosis κ in units of a plurality of frames, or a configuration that calculates the kurtosis κ at a period unrelated to the frames, may also be adopted.

(5) Modification 5
The sound processing apparatus 100 (100A to 100F) in each of the above embodiments is also suitable as an apparatus that suppresses howling caused by feedback sound (a howling suppression apparatus). For example, a configuration may be adopted in which the acoustic signal z(t) generated by the inverse transform unit 44 is amplified by an amplifier and then supplied to the sound emission device 12 (typically a public address apparatus that adjusts the volume of the sound around the sound collection device 14 and emits it from the sound emission device 12). In this configuration, the feedback sound arriving at the sound collection device 14 from the sound emission device 12 is suppressed by the feedback sound suppression unit 26 and the spectrum subtraction unit 42. Howling caused by the loop formed by the sound emission device 12, the sound collection device 14, and the sound processing apparatus 100 is therefore effectively prevented while musical noise in the sound emitted from the sound emission device 12 is suppressed.

100A, 100B, 100C, 100D, 100E, 100F, 100G ... sound processing apparatus; 12 ... sound emission device; 14 ... sound collection device; 22 ... target sound determination unit; 24 ... feedback sound estimation unit; 26 ... feedback sound suppression unit; 32, 34 ... frequency analysis unit; 36 ... residual sound estimation unit; 42 ... spectrum subtraction unit; 44 ... inverse transform unit; 52A, 52B, 52C ... kurtosis calculation unit; 54 ... coefficient setting unit; 56 ... determination unit.

Claims (9)

A sound processing apparatus comprising:
feedback sound estimation means for generating an estimated feedback sound component that estimates feedback sound arriving at a sound collection device from a sound emission device;
feedback sound suppression means for suppressing the estimated feedback sound component in an acoustic signal after generation by the sound collection device;
kurtosis calculation means for calculating a kurtosis of a frequency distribution of the intensity of an estimated residual sound component, the estimated residual sound component estimating the component of the feedback sound that remains after processing by the feedback sound suppression means;
coefficient setting means for setting a subtraction coefficient according to the kurtosis calculated by the kurtosis calculation means; and
spectrum subtraction means for adjusting a spectrum of the estimated residual sound component according to the subtraction coefficient and subtracting the adjusted spectrum from a spectrum of the acoustic signal after processing by the feedback sound suppression means.
The sound processing apparatus according to claim 1, wherein
the spectrum subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the product from the spectrum of the acoustic signal after processing by the feedback sound suppression means, and
the coefficient setting means sets the subtraction coefficient to a larger value as the kurtosis increases.
The sound processing apparatus according to claim 1 or claim 2, wherein
the kurtosis calculation means calculates the kurtosis of the estimated residual sound component and a kurtosis of a frequency distribution of the intensity of the acoustic signal after generation by the sound collection device, and
the coefficient setting means sets the subtraction coefficient according to at least one of the kurtosis of the estimated residual sound component and the kurtosis of the acoustic signal when the two are close to each other, and stops updating the subtraction coefficient when the two are not close to each other.
A sound processing apparatus comprising:
feedback sound estimation means for generating an estimated feedback sound component that estimates feedback sound arriving at a sound collection device from a sound emission device;
feedback sound suppression means for suppressing the estimated feedback sound component in an acoustic signal after generation by the sound collection device;
kurtosis calculation means for calculating a kurtosis of a frequency distribution of the intensity of the acoustic signal after generation by the sound collection device;
coefficient setting means for setting a subtraction coefficient according to the kurtosis calculated by the kurtosis calculation means; and
spectrum subtraction means for adjusting, according to the subtraction coefficient, a spectrum of an estimated residual sound component that estimates the component of the feedback sound remaining after processing by the feedback sound suppression means, and subtracting the adjusted spectrum from a spectrum of the acoustic signal after processing by the feedback sound suppression means.
The sound processing apparatus according to claim 4, wherein
the kurtosis calculation means calculates the kurtosis of the acoustic signal before processing by the spectrum subtraction means,
the spectrum subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the product from the spectrum of the acoustic signal after processing by the feedback sound suppression means, and
the coefficient setting means sets the subtraction coefficient to a larger value as the kurtosis increases.
The sound processing apparatus according to claim 4, wherein
the kurtosis calculation means calculates the kurtosis of the acoustic signal after processing by the spectrum subtraction means,
the spectrum subtraction means multiplies the spectrum of the estimated residual sound component by the subtraction coefficient and subtracts the product from the spectrum of the acoustic signal after processing by the feedback sound suppression means, and
the coefficient setting means sets the subtraction coefficient to a smaller value as the kurtosis increases.
The sound processing apparatus according to any one of claims 1 to 6, further comprising target sound determination means for determining whether a target sound component is present in the acoustic signal after generation by the sound collection device, wherein
the kurtosis calculation means calculates the kurtosis when the target sound determination means determines that no target sound component is present, and stops calculating the kurtosis when the target sound determination means determines that a target sound component is present.
A program that causes a computer to execute:
feedback sound estimation processing for generating an estimated feedback sound component that estimates feedback sound arriving at a sound collection device from a sound emission device;
feedback sound suppression processing for suppressing the estimated feedback sound component in an acoustic signal after generation by the sound collection device;
kurtosis calculation processing for calculating a kurtosis of a frequency distribution of the intensity of an estimated residual sound component, the estimated residual sound component estimating the component of the feedback sound that remains after execution of the feedback sound suppression processing;
coefficient setting processing for setting a subtraction coefficient according to the kurtosis calculated in the kurtosis calculation processing; and
spectrum subtraction processing for adjusting a spectrum of the estimated residual sound component according to the subtraction coefficient and subtracting the adjusted spectrum from a spectrum of the acoustic signal after execution of the feedback sound suppression processing.
A program that causes a computer to execute:
feedback sound estimation processing for generating an estimated feedback sound component that estimates feedback sound arriving at a sound collection device from a sound emission device;
feedback sound suppression processing for suppressing the estimated feedback sound component in an acoustic signal after generation by the sound collection device;
kurtosis calculation processing for calculating a kurtosis of a frequency distribution of the intensity of the acoustic signal after generation by the sound collection device;
coefficient setting processing for setting a subtraction coefficient according to the kurtosis calculated in the kurtosis calculation processing; and
spectrum subtraction processing for adjusting, according to the subtraction coefficient, a spectrum of an estimated residual sound component that estimates the component of the feedback sound remaining after execution of the feedback sound suppression processing, and subtracting the adjusted spectrum from a spectrum of the acoustic signal after execution of the feedback sound suppression processing.
JP2009066830A 2009-03-18 2009-03-18 Sound processing apparatus and program Pending JP2010220087A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009066830A JP2010220087A (en) 2009-03-18 2009-03-18 Sound processing apparatus and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2009066830A JP2010220087A (en) 2009-03-18 2009-03-18 Sound processing apparatus and program

Publications (1)

Publication Number Publication Date
JP2010220087A true JP2010220087A (en) 2010-09-30

Family

ID=42978407

Family Applications (1)

Application Number | Status | Publication Number | Priority Date | Filing Date | Title
JP2009066830A | Pending | JP2010220087A (en) | 2009-03-18 | 2009-03-18 | Sound processing apparatus and program

Country Status (1)

Country | Link
JP (1) | JP2010220087A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2011077636A1 (en) * | 2009-12-25 | 2011-06-30 | 三菱電機株式会社 | Noise removal device and noise removal program
CN102667928A (en) | 2009-12-25 | 2012-09-12 | 三菱电机株式会社 | Noise removal device and noise removal program
JP5383828B2 (en) | 2009-12-25 | 2014-01-08 | 三菱電機株式会社 | Noise removal apparatus and noise removal program
US9087518B2 (en) | 2009-12-25 | 2015-07-21 | Mitsubishi Electric Corporation | Noise removal device and noise removal program
JP2011180219A (en) * | 2010-02-26 | 2011-09-15 | Nara Institute Of Science & Technology | Factor setting device and noise reduction apparatus
JP2012113190A (en) * | 2010-11-26 | 2012-06-14 | Nara Institute Of Science & Technology | Acoustic processing device
WO2012157788A1 (en) * | 2011-05-19 | 2012-11-22 | 日本電気株式会社 | Audio processing device, audio processing method, and recording medium on which audio processing program is recorded
JP6094479B2 (en) | 2011-05-19 | 2017-03-15 | 日本電気株式会社 | Audio processing apparatus, audio processing method, and recording medium recording audio processing program
JP2013068919A (en) * | 2011-09-07 | 2013-04-18 | Nara Institute Of Science & Technology | Device for setting coefficient for noise suppression and noise suppression device
JP2015026956A (en) * | 2013-07-25 | 2015-02-05 | 沖電気工業株式会社 | Voice signal processing device and program
