JP2016090799A

JP2016090799A - Noise suppression device, and method and program for the same

Info

Publication number: JP2016090799A
Application number: JP2014224894A
Authority: JP
Inventors: Tatsuya Kako; Kazunori Kobayashi
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-11-05
Filing date: 2014-11-05
Publication date: 2016-05-23
Anticipated expiration: 2034-11-05
Also published as: JP6151236B2

Abstract

PROBLEM TO BE SOLVED: To provide a noise suppression device that suppresses noise while limiting degradation and speech distortion, without requiring an additional device such as a VAD (voice activity detection) unit, and to provide a method and a program for the same.
SOLUTION: In a noise suppression device 100, an adaptive filter unit 110 successively updates an adaptive filter, using an error signal and a second collected sound signal, so as to minimize the error signal. When the absolute value of the error ratio, which is the ratio of the error signal to the second collected sound signal, is equal to or smaller than a predetermined threshold, the adaptive filter is updated by a first update amount based on a monotonically increasing function of the error ratio. When the absolute value of the error ratio is greater than the predetermined threshold, the adaptive filter is updated by a second update amount based on a monotonically increasing function of the error ratio whose increase is smaller than that of the first update amount.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a noise suppression technique that uses one collected sound signal to suppress the noise component contained in another. In particular, it relates to a technique for suppressing the noise component contained in one of a plurality of collected sound signals obtained from a plurality of microphones mounted on a mobile terminal.

When speech is picked up by a microphone, noise from the surrounding environment is inevitably picked up along with it. Techniques that remove or suppress the noise component by some means, when a sound containing both a target sound component and a noise component is collected by a microphone, have therefore long been studied.

For example, when noise suppression is performed with the microphone of a mobile terminal such as a smartphone, the spectral subtraction method has commonly been used because of its low computational cost (see Non-Patent Document 1). The spectral subtraction method estimates, from a signal collected by a close-talking microphone or the like, the noise power in a noise section (a time section that does not contain the speech to be collected, i.e., the target sound; also called a non-speech section). It then suppresses noise by using the estimated noise power to subtract, on the frequency spectrum, the noise component superimposed on the collected sound signal in a speech section (a time section that contains the target sound).
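The spectral subtraction step described above can be sketched roughly as follows. This is a minimal illustration rather than the method of Non-Patent Document 1: it assumes the per-bin power spectra have already been computed, and the function names and the `floor` parameter (a spectral floor that keeps the result non-negative) are hypothetical.

```python
def estimate_noise_power(noise_frames):
    """Average the per-bin power over frames assumed to contain only noise."""
    n_frames = len(noise_frames)
    n_bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / n_frames
            for k in range(n_bins)]


def spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract the estimated noise power from one speech-section frame.

    floor is a hypothetical parameter: each bin is kept at or above
    floor * (noisy power) so that the output never becomes negative.
    """
    return [max(p_noisy - p_noise, floor * p_noisy)
            for p_noisy, p_noise in zip(noisy_power, noise_power)]


# Two noise-only frames and one speech frame, three frequency bins each.
noise_frames = [[1.0, 2.0, 1.0], [3.0, 2.0, 1.0]]
noise_est = estimate_noise_power(noise_frames)
speech_frame = [10.0, 2.5, 0.5]
result = spectral_subtraction(speech_frame, noise_est)
```

In the third bin the noise estimate exceeds the observed power, so the floor takes over; bins clamped this way are one of the places where residual artifacts can arise.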

Besides the spectral subtraction method for a single (monaural) microphone, there is also a noise suppression process in which a smartphone is equipped with two microphones and microphone array processing is applied to the signals they collect: the component of the signal collected by a sub microphone placed on the back of the device is removed from the signal of a main microphone placed so as to sit near the mouth during a call (see Non-Patent Documents 2 and 3). This approach rests on the assumptions that the two microphones have roughly the same characteristics, that the sub microphone picks up only noise, and that the main microphone picks up both the target sound and noise.

The source of the noise superimposed on the target sound lies farther away than the source of the target sound, so the noise is affected more strongly by the transfer characteristics between the noise source and the microphones. Moreover, these characteristics are generally unknown and must be estimated. The transfer path is therefore treated as an unknown system and identified with an adaptive filter; the target sound is extracted by subtracting, from the collected sound signal of the main microphone, the filter output obtained by applying the adaptive filter to the collected sound signal of the sub microphone.

Here it is desirable that the spacing between the two microphones be neither too large nor too small. If the spacing is too large, the two microphones pick up noise components with different characteristics, and simple spectral subtraction ends up subtracting the wrong noise component. If the spacing is too small, the noise components at the two microphones become more strongly correlated, but the sub microphone also picks up, along with the noise, the target sound component that should not be removed, and the premise that the sub microphone picks up only noise breaks down. In other words, spectral subtraction with two microphones is applied under an ideal that demands conflicting acoustic properties: both microphones must pick up the noise with its correlation preserved, while the target sound must be picked up only by the main microphone. In practice, however, because the two microphones have matched characteristics, it is difficult to keep the sub microphone from picking up target sound that wraps around to it.

Non-Patent Document 1: S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction", IEEE Transactions on Acoustics, Speech and Signal Processing, 1979, Volume 27, Issue 2, pp. 113-120.
Non-Patent Document 2: Jian Zhang et al., "A FAST TWO-MICROPHONE NOISE REDUCTION ALGORITHM BASED ON POWER LEVEL RATIO FOR MOBILE PHONE", 8th International Symposium on Chinese Spoken Language Processing (ISCSLP), Kowloon, 2012, pp. 206-209.
Non-Patent Document 3: Isao Nakanishi, "Knowledge Forest", Group 1 (Signals and Systems), Part 9 (Digital Signal Processing), Chapter 3: Adaptive Signal Processing, [online], IEICE, [retrieved October 23, 2014], Internet <http://www.ieice-hbkb.org/files/01/01gun_09hen_03m.pdf>

Merely applying a spectral subtraction method that uses the two microphones of a smartphone, removing the noise component from the collected sound signal of the main microphone by means of the collected sound signal of the sub microphone on the back, also removes from the main microphone signal the target sound that the sub microphone has picked up. The problem is that this causes degradation of the target sound, known as musical noise and the like, and speech distortion.

Therefore, to limit degradation and speech distortion, system identification with the adaptive filter is performed only in noise sections, that is, non-speech sections. Using the adaptive filter estimated in the noise sections makes it possible to remove the noise while preserving the target sound.

In the present invention, this processing, which advances the learning of the adaptive filter only in non-speech sections, is realized by modifying the update equation of the adaptive filter. An object of the present invention is to provide a noise suppression device, and a method and a program therefor, that suppress noise while limiting degradation and speech distortion, without newly requiring a device such as a VAD (voice activity detection) unit.

To solve the above problem, according to one aspect of the present invention, a noise suppression device suppresses a noise component contained in a first collected sound signal by using a second collected sound signal. The noise suppression device includes an adaptive filter unit that filters the second collected sound signal with an adaptive filter to obtain a filtered signal, and a subtraction unit that obtains the difference between the first collected sound signal and the filtered signal as an error signal. The adaptive filter unit successively updates the adaptive filter, using the error signal and the second collected sound signal, so that the error signal is minimized. When the absolute value of the error ratio, which is the ratio of the error signal to the second collected sound signal, is equal to or smaller than a predetermined threshold, the adaptive filter is updated by a first update amount based on a monotonically increasing function of the error ratio. When the absolute value of the error ratio is greater than the predetermined threshold, the adaptive filter is updated by a second update amount based on a monotonically increasing function whose increase with the error ratio is smaller than that of the first update amount.

To solve the above problem, according to another aspect of the present invention, a noise suppression method suppresses a noise component contained in a first collected sound signal by using a second collected sound signal. The noise suppression method includes an adaptive filter step of filtering the second collected sound signal with an adaptive filter to obtain a filtered signal, and a subtraction step of obtaining the difference between the first collected sound signal and the filtered signal as an error signal. In the adaptive filter step, the adaptive filter is successively updated, using the error signal and the second collected sound signal, so that the error signal is minimized. When the absolute value of the error ratio, which is the ratio of the error signal to the second collected sound signal, is equal to or smaller than a predetermined threshold, the adaptive filter is updated by a first update amount based on a monotonically increasing function of the error ratio. When the absolute value of the error ratio is greater than the predetermined threshold, the adaptive filter is updated by a second update amount based on a monotonically increasing function whose increase with the error ratio is smaller than that of the first update amount.

The present invention has the effect of suppressing noise while limiting degradation and speech distortion, without newly requiring a device such as a VAD (voice activity detection) unit.

FIG. 1 is a functional block diagram of the noise suppression device according to the first embodiment.
FIG. 2 is a diagram showing an example of the processing flow of the noise suppression device according to the first embodiment.
FIG. 3A is a front view of the noise suppression device according to the first embodiment, and FIG. 3B is a rear view of the same.
FIG. 4 is a diagram showing an example of the limiting function f(β).
FIG. 5 is a diagram showing an example of the limiting function f(β).

Embodiments of the present invention are described below. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and duplicate description is omitted. Processing performed on individual elements of a vector or matrix is applied to all elements of that vector or matrix unless otherwise noted.

<Points of the First Embodiment>
The noise suppression device is equipped with a main microphone and a sub microphone. For example, the noise suppression device is a mobile terminal (a small notebook computer, smartphone, tablet terminal, or the like); by its nature the housing is small, and it is difficult to separate the two microphones by more than a certain distance. In this embodiment it is therefore assumed that noise is picked up at the same (or a similar) sound pressure by the main microphone and the sub microphone. In addition, the main microphone and the sub microphone are arranged on the mobile terminal so that the sound pressure of the target sound contained in the signal collected by the main microphone is larger than that of the target sound contained in the signal collected by the sub microphone. For example, the main microphone is placed so that it sits near the mouth during a call, and the sub microphone is placed at the position farthest from the mouth during a call, where the target sound is least likely to enter. For example, the main microphone is placed on the lower front or the bottom of the mobile terminal, and the sub microphone is placed on the upper back or the top. Here, the face the caller holds to the ear during a call is the front, the part located on the mouth side during a call is the lower part, and the face on the mouth side is the bottom.
On this premise, the learning of the adaptive filter is adjusted by using a limiting function that constrains the update amount of the adaptive filter differently in speech sections and non-speech sections. The point of the present invention is to stabilize the adaptive filter in this way and to suppress noise while limiting degradation of the target sound and speech distortion.

In this embodiment, noise suppression and speech enhancement are performed with an adaptive filter, using two microphones attached to a mobile terminal.

With the arrangement described above, the collected sound signals of the two microphones are related as follows. The target sound is picked up by the main microphone at a higher sound pressure than by the sub microphone. Noise is picked up at a similar sound pressure by both the main microphone and the sub microphone. This property is used to suppress the noise with an adaptive filter and to emphasize the target sound.
<Noise Suppression Device 100 according to the First Embodiment>
FIG. 1 is a functional block diagram of the noise suppression device 100 according to the first embodiment, and FIG. 2 shows its processing flow. The functional block diagram of FIG. 1 is intended to make the processing circuits explicit; in practice the circuit configuration is built into the noise suppression device 100.

The noise suppression device 100 includes a main microphone 101, a sub microphone 102, an adaptive filter unit 110, a subtraction unit 120, a filter design unit 130, and a spectrum filter unit 140.

<Main microphone 101 and sub microphone 102>
The main microphone 101 picks up the target sound and noise and outputs a first collected sound signal d(n) (S101). The sub microphone 102 picks up the target sound and noise and outputs a second collected sound signal x(n) (S102). Here n is an index representing time. FIG. 3 shows the positions of the main microphone 101 and the sub microphone 102 mounted on the noise suppression device 100: FIG. 3A is a front view of the noise suppression device 100, and FIG. 3B is a rear view. For example, the main microphone 101 and the sub microphone 102 are both omnidirectional microphones, and the frequency characteristics of their sensitivities are assumed to be matched. The present invention does not, however, limit the characteristics of the main microphone 101 and the sub microphone 102 to these.

The main microphone 101 is placed on the noise suppression device 100 so that it is close to the user's mouth when the noise suppression device 100 is used as a telephone handset or a voice input device. The sub microphone 102 is placed on the same housing as the main microphone 101 (that is, on the noise suppression device 100), at a position that is away from the main microphone 101 yet records ambient noise highly correlated with the ambient noise picked up by the main microphone 101. The input hole of the sub microphone 102 is positioned so that the user's hand does not cover it. This does not rule out the user's voice being picked up on the sub microphone 102 side, whether it travels through the air, through vibration of the user's bones and muscles or of the housing, or by reflection in the surrounding acoustic environment.

<Adaptive filter unit 110>
The adaptive filter unit 110 receives the second collected sound signal x(n) and the error signal e(n), filters the second collected sound signal vector $\mathbf{x}(n)$ with the adaptive filter $\mathbf{h}(n)$ (S110), and obtains and outputs the filtered signal $\mathbf{h}^{H}(n)\mathbf{x}(n)$. Here $\mathbf{h}(n)=[h_0(n),h_1(n),\ldots,h_{M-1}(n)]^{T}$ and $\mathbf{x}(n)=[x(n),x(n-1),\ldots,x(n-M+1)]^{T}$, where $^{T}$ denotes transposition and $^{H}$ denotes complex conjugate transposition. Because the adaptive filter $\mathbf{h}(n)$ performs a convolution, it has a length equal to the tap size M and operates on the second collected sound signal vector $\mathbf{x}(n)$.
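As a rough illustration of the filtering step, the sketch below builds the M-sample input vector and computes the filter output and the resulting error for one time index. It assumes real-valued signals (so the conjugate transpose reduces to a plain inner product) and zero padding before the start of the signal; the helper names are hypothetical.

```python
def input_vector(x, n, M):
    """Return [x(n), x(n-1), ..., x(n-M+1)], using 0.0 before the signal starts."""
    return [x[n - k] if n - k >= 0 else 0.0 for k in range(M)]


def filter_output(h, x_vec):
    """Inner product of the filter taps with the input vector (real-valued case)."""
    return sum(h_k * x_k for h_k, x_k in zip(h, x_vec))


x = [1.0, 2.0, 3.0, 4.0]   # second collected sound signal (sub microphone)
d = [0.5, 1.5, 2.5, 3.5]   # first collected sound signal (main microphone)
h = [0.5, 0.25]            # example filter taps, M = 2
n = 2

x_vec = input_vector(x, n, M=2)
y = filter_output(h, x_vec)   # filtered signal at time n
e = d[n] - y                  # error signal at time n
```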

The adaptive filter unit 110 also successively updates the adaptive filter $\mathbf{h}(n)$, using the error signal e(n) and the second collected sound signal x(n), so that the error signal e(n) is minimized. When the absolute value of β, the ratio of the error signal e(n) to the second collected sound signal x(n) (hereinafter also called the "error ratio"), is equal to or smaller than a predetermined threshold (1 in this embodiment), the adaptive filter $\mathbf{h}(n)$ is updated by a first update amount based on a monotonically increasing function of the error ratio β. When the absolute value of the error ratio β is greater than the predetermined threshold, the adaptive filter $\mathbf{h}(n)$ is updated by a second update amount based on a monotonically increasing function whose increase with β is smaller than that of the first update amount.

The design of the adaptive filter is described below. The main microphone 101 picks up sound in which the target sound and noise are mixed. This noise is removed using the second collected sound signal x(n) picked up by the sub microphone 102 and the adaptive filter $\mathbf{h}(n)$. In this embodiment, the normalized LMS (NLMS: normalized least mean square) method is used to update the adaptive filter (see Non-Patent Document 3).

The adaptive filter $\mathbf{h}(n)$ is designed so as to minimize the error signal e(n), which is the difference between the first collected sound signal d(n) and the filtered signal $\mathbf{h}^{H}(n)\mathbf{x}(n)$.
$$e(n) = d(n) - \mathbf{h}^{H}(n)\,\mathbf{x}(n) \qquad (1)$$
The adaptive filter $\mathbf{h}(n)$ is updated successively. Ordinary NLMS uses the following update equation.
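The update equation itself did not survive in this text. As a point of reference only, the textbook NLMS update, written with bold symbols for the filter and input vectors, takes the form below. Whether the original equation carries the conjugate or a regularization term in the denominator is an assumption here.

```latex
\mathbf{h}(n+1) = \mathbf{h}(n) + \mu \, \frac{e^{*}(n)\,\mathbf{x}(n)}{\|\mathbf{x}(n)\|^{2}}
```

For real-valued signals, $e^{*}(n) = e(n)$.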

Here, ||x(n)|| is the norm of the second collected sound signal vector, and the step size μ is the parameter that determines the update amount of the update equation. While the system is running, μ takes a constant value regardless of the error signal e(n), and is a real number in the range 0 < μ < 2. This update equation is decomposed as follows.
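A decomposition consistent with the description of β that follows can be written as below, assuming real-valued signals and the textbook NLMS update. The original equations and their numbering are not reproduced.

```latex
\beta = \frac{e(n)}{\|\mathbf{x}(n)\|}, \qquad
\mathbf{h}(n+1) = \mathbf{h}(n) + \mu \, \beta \, \frac{\mathbf{x}(n)}{\|\mathbf{x}(n)\|}
```

In this form each update is the unit vector in the direction of $\mathbf{x}(n)$ scaled by μβ, so constraining β directly constrains the size of each update.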

Here β represents the ratio of the error signal e(n) to the norm of the second collected sound signal vector $\mathbf{x}(n)$.

Once the learning of the adaptive filter $\mathbf{h}(n)$ has converged to some extent, the positional relationship of the two microphones attached to the mobile terminal means that, in non-speech sections, the main microphone 101 and the sub microphone 102 pick up the noise component at similar sound pressures. As a result of filtering with the adaptive filter $\mathbf{h}(n)$, the error signal e(n), which is the difference between the first collected sound signal d(n) and the filtered signal $\mathbf{h}^{H}(n)\mathbf{x}(n)$, becomes small, the ratio of the error signal e(n) to the norm of the second collected sound signal vector $\mathbf{x}(n)$ also becomes small, and −1 < β < 1.

In contrast, because the main microphone 101 is placed near the speaker's mouth, speech in a speech section is picked up by the main microphone 101 at a high sound pressure. The error signal e(n) then contains a large share of the target sound component uttered by the speaker, the absolute value of e(n) exceeds the norm ||x(n)|| of the second collected sound signal vector, and β < −1 or 1 < β. In this embodiment, a limiting function f(β) that is nonlinear in β is used so that the update amount of the filter in speech sections becomes small.
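The behavior of β in the two kinds of section can be illustrated with toy numbers. The sketch below assumes real-valued signals and computes β as the error divided by the Euclidean norm of the input vector; the function name and the guard for an all-zero input vector are additions made for this illustration.

```python
import math


def error_ratio(e_n, x_vec):
    """beta = e(n) / ||x(n)||; returns 0.0 for an all-zero input vector (our guard)."""
    norm = math.sqrt(sum(v * v for v in x_vec))
    return e_n / norm if norm > 0.0 else 0.0


x_vec = [0.3, -0.4]                      # input vector with norm 0.5

beta_noise = error_ratio(0.1, x_vec)     # small residual error: noise-only section
beta_speech = error_ratio(2.0, x_vec)    # large error: target sound present
```

With these values β stays inside the threshold of 1 for the small residual error and far exceeds it when the error is dominated by the target sound.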

The limiting function f(β) is a nonlinear function that takes small values for |β| ≥ 1 (β ≤ −1, β ≥ 1). For example, it is expressed by the following equation.

For example, with L = 5 this gives the function shown in FIG. 4. Alternatively, a sigmoid function expressed by the following equation may be used as the limiting function f(β).

For example, with L = 5 this gives the function shown in FIG. 5. In FIGS. 4 and 5, when the absolute value of the error ratio β is 1 or less, a first update amount based on a monotonically increasing function of β is used; when the absolute value of β is greater than 1, a second update amount based on a monotonically increasing function whose increase with β is smaller than that of the first update amount is used. Consequently, the update amount when |β| is at or below the threshold of 1 (the first update amount) is larger than the update amount when |β| exceeds the threshold (the second update amount). The adaptive filter $\mathbf{h}(n)$ is then updated by equation (5) on the basis of the first or second update amount.
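The two-regime update can be sketched as below. The limiting function used here is purely hypothetical, since the original equations for f(β) are not available in this text: it is the identity for |β| ≤ 1 and continues with slope 1/L beyond the threshold, which is one function satisfying the "monotonically increasing, but with a smaller increase" description. The update form h ← h + μ f(β) x/||x||, real-valued signals, and the small `eps` guard are likewise assumptions.

```python
import math
import random


def limit(beta, L=5.0, threshold=1.0):
    """Hypothetical f(beta): identity inside the threshold, slope 1/L outside it."""
    if abs(beta) <= threshold:
        return beta
    excess = (abs(beta) - threshold) / L
    return math.copysign(threshold + excess, beta)


def limited_nlms(d, x, M=4, mu=0.5, L=5.0, eps=1e-12):
    """NLMS-style adaptation in which beta is passed through limit() before use."""
    h = [0.0] * M
    errors = []
    for n in range(len(d)):
        x_vec = [x[n - k] if n - k >= 0 else 0.0 for k in range(M)]
        y = sum(h_k * x_k for h_k, x_k in zip(h, x_vec))
        e = d[n] - y
        norm = math.sqrt(sum(v * v for v in x_vec))
        if norm > eps:                      # skip the update for a silent input
            beta = e / norm
            step = mu * limit(beta, L) / norm
            h = [h_k + step * x_k for h_k, x_k in zip(h, x_vec)]
        errors.append(e)
    return h, errors


# Noise-only toy case: the main microphone hears the sub microphone's noise scaled by 0.8.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(500)]
d = [0.8 * v for v in x]
h, errors = limited_nlms(d, x)
```

In this noise-only case |β| never exceeds 1, so the loop behaves like ordinary NLMS and the first tap approaches 0.8. With a large error such as β = 6, `limit` returns 2.0, so the update is a third of what unmodified NLMS would apply but is not zero.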

The constraint on the function is that, beyond the boundaries β = 1 and β = −1, its value is reduced in magnitude relative to β. In noise sections, the main microphone 101 and the sub microphone 102 both observe a similar sound pressure, so if the adaptive filter does not suppress the main microphone signal at all, β takes the value 1, and if suppression succeeds, |β| < 1. In speech sections, the error signal e(n) is observed to be larger than the second collected sound signal x(n), so β > 1. If learning were to reduce the error signal e(n) in speech sections, the target speech itself would be suppressed. To avoid this, the limiting function is designed to take small values for β > 1, which reduces the update amount of the filter.

In other words, in a speech section there is a large difference between the sound pressure observed in the first collected sound signal d(n) picked up by the main microphone 101 and that observed in the second collected sound signal x(n) picked up by the sub microphone 102. Because the target sound component remains in the error signal e(n), the ratio of e(n) to the norm of the second collected sound signal vector $\mathbf{x}(n)$ also becomes large, so |β| > 1. Conversely, a section where |β| > 1 is likely to be a speech section, and restricting the update where |β| > 1 keeps the step size of the adaptive filter small in speech sections. That is, where |β| > 1, the value of f(β) can be made smaller than β. Moreover, because this technique does not set the step size to 0 for |β| > 1 (speech sections), the filter update does not stop when |β| > 1 arises because a noise source is located near the sub microphone 102, or because a noise source moves substantially and the transfer functions from the noise source to the sub microphone 102 and the main microphone 101 change. Using the limiting function f(β) thus restrains the adaptive filter from adapting in the direction that cancels the speech signal, which is the target sound present in speech sections. In addition, relaxing filter learning in speech sections, whose properties differ from those of noise, also improves the stability of the filter.
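
The behaviour described for f(β) can be illustrated with a minimal sketch. The exact form of the limiting function is not reproduced in this passage, so the piecewise-linear shape below, the function name `limit`, and the parameters `a` (threshold, 1 in this embodiment) and `L` (slope divisor, assumed larger than 1) are illustrative assumptions only.

```python
import math


def limit(beta, a=1.0, L=4.0):
    """Hypothetical limiting function f(beta), not the patent's exact formula.

    For |beta| <= a the value is passed through unchanged.
    For |beta| >  a the slope drops to 1/L, so f keeps increasing
    (the update never stops) but grows more slowly than beta.
    """
    if abs(beta) <= a:
        return beta
    # Continue from the value at the threshold with a gentler slope; keep the sign of beta.
    return math.copysign(a + (abs(beta) - a) / L, beta)


# Noise-like error ratio: unchanged. Speech-like error ratio: reduced but non-zero.
noise_like = limit(0.5)
speech_like = limit(5.0)
```

With these assumed parameters, a speech-like ratio of 5 is reduced to 2, so the filter still adapts in a speech section but with a smaller step.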

<Subtraction unit 120>
The subtraction unit 120 receives the first collected sound signal d(n) and the filtered signal $\mathbf{h}^{H}(n)\mathbf{x}(n)$, obtains their difference $d(n) - \mathbf{h}^{H}(n)\mathbf{x}(n)$ as the error signal e(n) (S120), and outputs it.
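
The interplay of the adaptive filter unit 110 and the subtraction unit 120 can be sketched as a normalized-LMS style loop. This is a sketch under stated assumptions, not the patent's update formula: the update rule h ← h + μ f(β) x / ||x||, the piecewise-linear `limit`, and the names `suppress`, `M`, `mu`, `eps` are all chosen for illustration, and the signals are real-valued so the Hermitian transpose reduces to a dot product.

```python
import math
import random


def limit(beta, a=1.0, L=4.0):
    # Hypothetical limiting function: identity up to the threshold a, slope 1/L beyond it.
    if abs(beta) <= a:
        return beta
    return math.copysign(a + (abs(beta) - a) / L, beta)


def suppress(d, x, M=4, mu=0.5, eps=1e-12):
    """Return the error signal e(n) and the final filter coefficients."""
    h = [0.0] * M
    buf = [0.0] * M  # holds [x(n), x(n-1), ..., x(n-M+1)]
    errors = []
    for dn, xn in zip(d, x):
        buf = [xn] + buf[:-1]
        filtered = sum(hk * xk for hk, xk in zip(h, buf))  # adaptive filter output
        e = dn - filtered                                  # subtraction unit: e(n)
        norm = math.sqrt(sum(xk * xk for xk in buf)) + eps
        beta = e / norm                                    # error ratio
        step = mu * limit(beta) / norm                     # limited update amount
        h = [hk + step * xk for hk, xk in zip(h, buf)]
        errors.append(e)
    return errors, h


# Noise-only example: the main microphone sees a scaled copy of the sub microphone signal.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [0.5 * xn for xn in x]
errors, h = suppress(d, x)
```

In this noise-only example |β| stays below 1, so the update is the ordinary normalized-LMS step and the first coefficient approaches 0.5; the limited branch only comes into play when the error is large relative to the sub microphone signal.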

<Filter design unit 130>
The filter design unit 130 receives the second collected sound signal x(n) and the error signal e(n), designs a filter G that suppresses the noise component left uncancelled by the subtraction unit 120 (S130), and outputs it.

Various filter design methods exist. For example, the filter may be designed using the noise removal technique based on PSD (power spectral density) estimation described in Reference 1.
(Reference 1) Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Noriyoshi Kamado, "Implementation of a microphone array to improve speech recognition under noisy conditions", Proceedings of the Acoustical Society of Japan, 2014, pp. 717-718

For example, the filter design unit 130 converts the second collected sound signal x(n) and the error signal e(n) into frequency-domain signals: the frequency-domain second collected sound signal X(ω,τ) and the frequency-domain error signal E(ω,τ). Based on the ratio of the error signal to the second collected sound signal, the frequency-domain error signal E(ω,τ) for which |E(ω,τ)|/|X(ω,τ)| > 1 is taken as the target sound spectrum Es(ω,τ), and the frequency-domain error signal E(ω,τ) for which |E(ω,τ)|/|X(ω,τ)| < 1 is taken as the noise spectrum En(ω,τ). The post filter G(ω) is then designed based on the Wiener method by the following equation.

Here, Xs(ω) = E[|Es(ω,τ)|²] and Xn(ω) = E[|En(ω,τ)|²], where ω is an index representing frequency, τ is an index representing the frame, and E[·] denotes the average over frames τ. The spectra may be calculated, for example, by converting the time-domain signals into frequency-domain signals with a fast Fourier transform (FFT).
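
The classification and averaging described above can be sketched as follows. The gain formula used here, G(ω) = Xs(ω) / (Xs(ω) + Xn(ω)), is the textbook Wiener gain and is an assumption, since the patent's own equation is not reproduced in this text. Averaging over all frames (with frames of the other class contributing zero) is also an assumption, as is the function name `design_post_filter`.

```python
def design_post_filter(E, X):
    """Design a per-bin gain from spectra E[tau][omega] and X[tau][omega].

    Bins with |E| > |X| are treated as target sound, bins with |E| < |X|
    as noise, following the ratio test in the text.
    """
    num_frames = len(E)
    num_bins = len(E[0])
    gains = []
    for w in range(num_bins):
        speech_power = 0.0
        noise_power = 0.0
        for t in range(num_frames):
            e_mag = abs(E[t][w])
            x_mag = abs(X[t][w])
            if e_mag > x_mag:        # |E|/|X| > 1: target sound spectrum Es
                speech_power += e_mag ** 2
            elif e_mag < x_mag:      # |E|/|X| < 1: noise spectrum En
                noise_power += e_mag ** 2
        xs = speech_power / num_frames   # Xs(omega)
        xn = noise_power / num_frames    # Xn(omega)
        denom = xs + xn
        # Assumed Wiener gain; a bin with no energy at all gets gain 0.
        gains.append(xs / denom if denom > 0.0 else 0.0)
    return gains


# Two frames, three bins: speech-dominated, noise-dominated, and mixed.
E = [[2 + 0j, 0.1 + 0j, 2 + 0j], [0 + 2j, 0.1 + 0j, 0.5 + 0j]]
X = [[1 + 0j, 1 + 0j, 1 + 0j], [1 + 0j, 1 + 0j, 1 + 0j]]
G = design_post_filter(E, X)
```

A bin that is always speech-dominated receives a gain of 1, a bin that is always noise-dominated receives a gain of 0, and a mixed bin falls in between.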

<Spectral filter unit 140>
The spectral filter unit 140 receives the error signal e(n) and the filter G, and filters the error signal e(n) using the filter G (S140). To suppress the residual noise component contained in the error signal e(n), the error signal is multiplied by the post filter G(ω):
Y(ω,τ) = G(ω)E(ω,τ) (9)
Finally, the output signal y(n) is obtained by applying an inverse fast Fourier transform (IFFT) to Y(ω,τ).
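
Equation (9) followed by the inverse transform can be sketched for a single frame. To stay dependency-free, the sketch uses a naive O(N²) DFT in place of an FFT; framing, windowing, and overlap-add are omitted, and the names `dft`, `idft`, and `apply_post_filter` are illustrative assumptions.

```python
import cmath


def dft(frame):
    # Naive discrete Fourier transform (stand-in for an FFT).
    n = len(frame)
    return [
        sum(frame[k] * cmath.exp(-2j * cmath.pi * w * k / n) for k in range(n))
        for w in range(n)
    ]


def idft(spectrum):
    # Naive inverse discrete Fourier transform (stand-in for an IFFT).
    n = len(spectrum)
    return [
        sum(spectrum[w] * cmath.exp(2j * cmath.pi * w * k / n) for w in range(n)) / n
        for k in range(n)
    ]


def apply_post_filter(error_frame, gain):
    """Compute Y = G * E per bin, then return the real part of the time-domain frame."""
    E = dft(error_frame)
    Y = [g * e for g, e in zip(gain, E)]
    return [y.real for y in idft(Y)]


frame = [1.0, -0.5, 0.25, 0.0]
unchanged = apply_post_filter(frame, [1.0, 1.0, 1.0, 1.0])  # all-pass gain
dc_only = apply_post_filter(frame, [1.0, 0.0, 0.0, 0.0])    # keep only the DC bin
```

For a real-valued output the gain should be symmetric about the Nyquist bin, which holds for the two gain vectors used here.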

<Effect>
With this configuration, noise can be suppressed while limiting degradation and speech distortion, without requiring an additional device such as a VAD (voice activity detection) unit. In this embodiment, when noise suppression is performed using two microphones mounted on a smartphone, the limiting function changes the update rate of the adaptive filter between speech sections and noise sections. This suppresses filter learning in the wrong direction during speech sections and yields a filter that cancels only the noise picked up by the two microphones at comparable sound pressure. In addition, relaxing filter learning in speech sections prevents suppression of the speech and stabilizes the filter.

<Modification>
In this embodiment, the filter update amount is limited for β < -1 and β > 1, but it may instead be limited for β < -a and β > a, where a > 0. For example, equation (6) may be replaced with the following equation.

In this embodiment, the adaptive filter unit 110 and the subtraction unit 120 operate in the time domain, but the processing may instead be performed in the frequency domain. For example, a frequency-domain conversion unit (not shown) is provided to convert the first collected sound signal d(n) and the second collected sound signal x(n) into frequency-domain signals, namely the frequency-domain first collected sound signal D(ω,τ) and the frequency-domain second collected sound signal X(ω,τ), respectively.

The adaptive filter unit 110 receives the frequency-domain second collected sound signal X(ω,τ) and the frequency-domain error signal E(ω,τ), filters X(ω,τ) using the adaptive filter H(ω,τ) (S110), and obtains and outputs the filtered signal H(ω,τ)X(ω,τ).

The adaptive filter unit 110 also updates the filter according to the following equation.

When the frequency-domain error signal E(ω,τ) contains a large target sound component, its absolute value exceeds the norm ||X(ω,τ)|| of the frequency-domain second collected sound signal X(ω,τ), so that β < -1 or 1 < β. In this modification as well, as in the first embodiment, the adaptive filter unit 110 uses a limiting function f(β) that is nonlinear in the error ratio β, which reduces the filter update amount in speech sections. Although this description uses the same nonlinear limiting function f(β) for all frequency bands, a separate limiting function f(β,ω) may be used for each frequency ω.
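
A per-bin version of the limited update can be sketched as follows. The update equation for this modification is not reproduced in this text, so the rule used here, H ← H + μ g(β) X* / |X| with the limit applied to the magnitude of the complex error ratio β = E / |X|, is an assumption, as are the names `limit_magnitude`, `update_bin`, `mu`, `a`, and `L`.

```python
def limit_magnitude(r, a=1.0, L=4.0):
    # Hypothetical limit on the magnitude r = |beta|: identity up to a, slope 1/L beyond.
    return r if r <= a else a + (r - a) / L


def update_bin(H, D, X, mu=0.5, a=1.0, L=4.0, eps=1e-12):
    """One update of a single frequency bin; returns the new H and the error E."""
    E = D - H * X                 # frequency-domain subtraction
    x_mag = abs(X) + eps
    beta = E / x_mag              # complex error ratio
    r = abs(beta)
    # Shrink the magnitude of beta while keeping its phase.
    scale = limit_magnitude(r, a, L) / r if r > 0.0 else 1.0
    H_new = H + mu * scale * beta * X.conjugate() / x_mag
    return H_new, E


# Noise-only bin: D is a scaled copy of X, so H should approach 0.5.
H = 0j
X = 1 + 1j
for _ in range(40):
    H, _ = update_bin(H, 0.5 * X, X)

# Speech-dominated bin: large error relative to X, so the step is limited.
H_speech, E_speech = update_bin(0.5 + 0j, 10.5 + 0j, 1 + 0j)
```

Passing different `a` and `L` values per bin would give a frequency-dependent limiting function f(β,ω) as mentioned in the text.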

The subtraction unit 120 receives the frequency-domain first collected sound signal D(ω,τ) and the filtered signal H(ω,τ)X(ω,τ), obtains their difference D(ω,τ) - H(ω,τ)X(ω,τ) as the frequency-domain error signal E(ω,τ) (S120), and outputs it. If the subsequent stages (the filter design unit 130 and the spectral filter unit 140) operate in the frequency domain, the frequency-domain second collected sound signal X(ω,τ) and the frequency-domain error signal E(ω,τ) can be used as they are. If the subsequent stages use time-domain signals, the frequency-domain signals are converted into time-domain signals before being output to them.

The essential point of this embodiment is the use of a limiting function that restricts the update amount of the adaptive filter differently in speech sections and non-speech sections. Therefore, the noise suppression device 100 does not necessarily have to include the main microphone 101, the sub microphone 102, the filter design unit 130, or the spectral filter unit 140.

<Other modifications>
The present invention is not limited to the above embodiment and modifications. For example, the various processes described above may be executed not only in time series as described, but also in parallel or individually, depending on the processing capability of the device executing them or as needed. Other changes may be made as appropriate without departing from the spirit of the present invention.

<Program and recording medium>
The various processing functions of each device described in the above embodiment and modifications may be implemented by a computer. In that case, the processing content of the functions that each device should have is described by a program. Executing this program on a computer realizes the various processing functions of each device on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers over a network.

A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage unit. When executing the processing, the computer reads the program stored in its own storage unit and executes processing according to the program it has read. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it. Alternatively, each time a program is transferred from the server computer to the computer, the computer may execute processing according to the received program in turn. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to the computer and the processing functions are realized only through execution instructions and retrieval of results. The program here includes information that is provided for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the computer's processing).

Although each device is configured here by executing a predetermined program on a computer, at least part of the processing content may instead be implemented in hardware.

Claims

A noise suppression device that suppresses a noise component included in a first collected sound signal using a second collected sound signal, the device comprising:
an adaptive filter unit that filters the second collected sound signal using an adaptive filter to obtain a filtered signal; and
a subtraction unit that obtains a difference between the first collected sound signal and the filtered signal as an error signal,
wherein the adaptive filter unit sequentially updates the adaptive filter using the error signal and the second collected sound signal so as to minimize the error signal, such that when the absolute value of an error ratio, which is the ratio of the error signal to the second collected sound signal, is less than or equal to a predetermined threshold, the adaptive filter is updated with a first update amount based on a monotonically increasing function of the error ratio, and when the absolute value of the error ratio is greater than the predetermined threshold, the adaptive filter is updated with a second update amount based on a monotonically increasing function of the error ratio whose increment is smaller than that of the first update amount.

The noise suppression device according to claim 1,
wherein the first collected sound signal is a signal picked up by a first microphone arranged to pick up a target sound, and
the second collected sound signal is a signal picked up by a second microphone arranged to pick up ambient noise correlated with the ambient noise included in the first collected sound signal.

The noise suppression device according to claim 1,
wherein the first collected sound signal is a signal picked up by a first microphone arranged near the mouth of a speaker and contains a target sound, which is the speaker's utterance, and ambient noise, and
the second collected sound signal is a signal picked up by a second microphone arranged so that the sound pressure of the target sound included in the second collected sound signal is lower than the sound pressure of the target sound included in the first collected sound signal, and the sound pressure of the noise included in the second collected sound signal is approximately the same as the sound pressure of the noise included in the first collected sound signal.

The noise suppression device according to any one of claims 1 to 3,
wherein, with the predetermined threshold denoted by a > 0, the error ratio by β, the first update amount or the second update amount by f(β), the index representing time by n, the filter length of the adaptive filter by M, the filter coefficients at time n by $\mathbf{h}(n) = [h(n), h(n-1), \ldots, h(n-M+1)]$, the adaptation constant by μ, the second collected sound signal at time n by x(n), $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-M+1)]$, the norm of $\mathbf{x}(n)$ by $\|\mathbf{x}(n)\|$, the error signal at time n by e(n), and a real number larger than 1 by L, the update formula of the adaptive filter is

And

Or

Is,
Noise suppression device.

A noise suppression method for suppressing a noise component included in a first collected sound signal using a second collected sound signal, the method comprising:
an adaptive filter step in which an adaptive filter unit filters the second collected sound signal using an adaptive filter to obtain a filtered signal; and
a subtraction step in which a subtraction unit obtains a difference between the first collected sound signal and the filtered signal as an error signal,
wherein, in the adaptive filter step, the adaptive filter is sequentially updated using the error signal and the second collected sound signal so as to minimize the error signal, such that when the absolute value of an error ratio, which is the ratio of the error signal to the second collected sound signal, is less than or equal to a predetermined threshold, the adaptive filter is updated with a first update amount based on a monotonically increasing function of the error ratio, and when the absolute value of the error ratio is greater than the predetermined threshold, the adaptive filter is updated with a second update amount based on a monotonically increasing function of the error ratio whose increment is smaller than that of the first update amount.

The noise suppression method according to claim 5,
wherein the first collected sound signal is a signal picked up by a first microphone arranged to pick up a target sound, and
the second collected sound signal is a signal picked up by a second microphone arranged to pick up ambient noise correlated with the ambient noise included in the first collected sound signal.

A program for causing a computer to function as the noise suppression device according to any one of claims 1 to 4.