JP6772890B2 - Signal processing device, program, and method

Info

Publication number: JP6772890B2
Application number: JP2017032567A
Authority: JP
Inventors: 大藤枝
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2017-02-23
Filing date: 2017-02-23
Publication date: 2020-10-21
Anticipated expiration: 2037-02-23
Also published as: JP2018136509A

Description

The present invention relates to a signal processing device, a program, and a method, and can be applied, for example, to communication terminals, audio equipment, speech recognition devices, and the like that need to enhance and pick up sound sources located within a specific range of directions in an environment containing multiple sound sources.

Techniques for extracting a target sound source in an environment with multiple sound sources include sound source separation using multiple microphones, and beamformers and null formers that use a microphone array in which the microphones are arranged on a line, a plane, a sphere, or the like. In particular, when the sound sources other than the target are non-stationary, or when there are several of them, extracting the target source with a single-microphone noise suppressor is difficult, and two or more microphones are essential.

A beamformer using the microphone array described above is a technique that enhances and picks up only the sound from a specific direction. A beamformer forms directivity by exploiting the time differences between the signals arriving at the individual microphones.

There are two types of beamformer: additive and subtractive. Compared with an additive beamformer, a subtractive beamformer has the advantage of forming sharp directivity with fewer microphones.

FIG. 13 is a block diagram showing the configuration of a subtractive beamformer with two microphones. The subtractive beamformer of FIG. 13 consists of a first microphone M1, a second microphone M2, a first delay means 3, a second delay means 4, and a subtraction means 5. The first input signal picked up by the first microphone M1 is given to the first delay means 3, and the second input signal picked up by the second microphone M2 is given to the second delay means 4. When the interfering sound arrives from the first microphone M1 side, the first delay means 3 delays the first input signal so that the interfering sound contained in the first input signal and in the second input signal is aligned in phase. When the interfering sound arrives from the second microphone M2 side, the second delay means 4 instead delays the second input signal to align the phase of the interfering sound. The first delayed signal obtained from the first delay means 3 and the second delayed signal obtained from the second delay means 4 are given to the subtraction means 5. The subtraction means 5 obtains the enhanced speech by subtracting the second delayed signal from the first delayed signal. In this way, the subtractive beamformer enhances the target sound by aligning the phase of the interfering sound contained in the first and second input signals, subtracting one from the other, and thereby suppressing the interfering sound. A subtractive beamformer requires the direction of arrival of the interfering sound to be given in advance.
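
The delay-and-subtract operation described above can be illustrated for a single frequency bin. The following is a minimal sketch rather than the circuit of FIG. 13: it assumes a far-field plane-wave model, a microphone spacing of 3 cm, a speed of sound of 332 m/s, and a convention in which positive angles point toward the second microphone M2. The function names `mic_spectra` and `subtractive_beamformer` are hypothetical.

```python
import cmath
import math

SPACING_M = 0.03      # assumed microphone spacing in metres
SOUND_SPEED = 332.0   # assumed speed of sound in m/s


def mic_spectra(source, freq_hz, theta_deg):
    """Far-field model: a source on the M2 side (theta > 0) reaches M1 later."""
    tau = SPACING_M * math.sin(math.radians(theta_deg)) / SOUND_SPEED
    omega = 2.0 * math.pi * freq_hz
    x1 = source * cmath.exp(-1j * omega * tau)  # first microphone M1
    x2 = source                                 # second microphone M2
    return x1, x2


def subtractive_beamformer(x1, x2, freq_hz, null_deg):
    """Delay one channel so a source at null_deg is phase-aligned, then subtract."""
    tau = SPACING_M * math.sin(math.radians(null_deg)) / SOUND_SPEED
    omega = 2.0 * math.pi * freq_hz
    if tau >= 0.0:
        # Interferer on the M2 side: delay the second input (delay means 4).
        return x1 - x2 * cmath.exp(-1j * omega * tau)
    # Interferer on the M1 side: delay the first input (delay means 3).
    return x1 * cmath.exp(1j * omega * tau) - x2


# Interferer at +60 degrees, observed at 1 kHz.
x1, x2 = mic_spectra(1.0 + 0.0j, 1000.0, 60.0)
print(abs(subtractive_beamformer(x1, x2, 1000.0, 60.0)))  # null steered at the interferer
print(abs(subtractive_beamformer(x1, x2, 1000.0, 50.0)))  # null off by 10 degrees
```

Comparing the two printed magnitudes shows how a null that no longer matches the interferer direction leaves a residual, which is the sensitivity to source movement discussed next.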

A subtractive beamformer, however, has the problem that its interference suppression performance degrades sharply if the interfering source moves even slightly.

FIG. 14 is an explanatory diagram showing an example in which a conventional signal processing device Z is used to enhance the voice of a driver U1 in an automobile (vehicle) A.

For example, in a car navigation system that can be operated by voice using speech recognition, as shown in FIG. 14, only the driver's voice needs to be extracted inside the car.

Therefore, when both the driver's seat and the passenger seat are occupied, the voice of the passenger U2 in the passenger seat (the interfering sound) must be suppressed. However, when the passenger U2 moves his or her face (the interfering source) back and forth or left and right, a subtractive beamformer cannot suppress the interfering sound.

The minimum variance beamformer (MVB), one of the representative adaptive beamformers, can suppress interfering sounds efficiently when the direction of arrival of the target sound is given in advance. The MVB suppresses interfering sounds by minimizing the variance of the enhanced speech under the constraint that the gain in the direction of arrival of the target sound is 1.
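
The constrained minimization behind the MVB has a well-known closed-form solution, w = R^-1 a / (a^H R^-1 a), where R is the covariance of the microphone signals and a is the steering vector of the target direction. The sketch below evaluates it for two microphones at one frequency. The steering-vector model, the spacing and sound speed values, the diagonal loading term, and all function names are illustrative assumptions and are not taken from this document.

```python
import cmath
import math


def steering(freq_hz, theta_deg, spacing_m=0.03, sound_speed=332.0):
    """Assumed far-field steering vector [M1, M2]; theta > 0 is the M2 side."""
    tau = spacing_m * math.sin(math.radians(theta_deg)) / sound_speed
    return [cmath.exp(-2j * math.pi * freq_hz * tau), 1.0 + 0.0j]


def interference_covariance(a_interf, loading=1e-3):
    """Covariance of one unit-power interferer plus a small diagonal loading."""
    return [[a_interf[m] * a_interf[n].conjugate() + (loading if m == n else 0.0)
             for n in range(2)] for m in range(2)]


def mvb_weights(cov, a):
    """w = R^-1 a / (a^H R^-1 a) for a 2x2 covariance matrix R."""
    (r11, r12), (r21, r22) = cov
    det = r11 * r22 - r12 * r21
    # R^-1 a using the closed-form inverse of a 2x2 matrix.
    u = [(r22 * a[0] - r12 * a[1]) / det, (-r21 * a[0] + r11 * a[1]) / det]
    denom = a[0].conjugate() * u[0] + a[1].conjugate() * u[1]
    return [u[0] / denom, u[1] / denom]


def response(w, a):
    """Beamformer output w^H a for a unit-amplitude source with steering vector a."""
    return w[0].conjugate() * a[0] + w[1].conjugate() * a[1]


a_target = steering(2000.0, 0.0)    # target in front
a_interf = steering(2000.0, 60.0)   # interferer on the M2 side
w = mvb_weights(interference_covariance(a_interf), a_target)
print(abs(response(w, a_target)), abs(response(w, a_interf)))
```

The first printed value reflects the unit-gain constraint on the target direction, and the second shows how strongly the single modelled interferer is attenuated.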

Strong directivity toward the direction of arrival of the target sound source can also be formed using spectral subtraction. Non-Patent Document 1 assumes that the target sound source is always in front. First, a subtractive beamformer suppresses the target sound arriving from the front to obtain a target-suppressed signal. Second, the amplitude spectrum of the target-suppressed signal is subtracted from the amplitude spectrum of the first input signal (spectral subtraction) to obtain the amplitude spectrum of enhanced speech in which the target sound is emphasized. Third, the enhanced speech is obtained from the amplitude spectrum of the enhanced speech and the phase spectrum of the first input signal.
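
The three steps attributed to Non-Patent Document 1 can be sketched per frequency bin as follows. This is an illustration under assumptions: the target-suppressed signal is taken to be the plain difference of the two spectra (which cancels a source directly in front), the subtracted amplitude is floored at zero, and the function names are hypothetical.

```python
import cmath


def spectral_subtraction(x1, n):
    """Subtract the amplitude of n from that of x1 and keep the phase of x1."""
    magnitude = max(abs(x1) - abs(n), 0.0)   # flooring at zero is an assumption
    return cmath.rect(magnitude, cmath.phase(x1))


def enhance_front(x1, x2):
    """Steps 1 to 3 for one bin, with the target assumed to be directly in front."""
    target_suppressed = x1 - x2              # step 1: a frontal source cancels
    return spectral_subtraction(x1, target_suppressed)   # steps 2 and 3


# A frontal source gives identical spectra at both microphones,
# so nothing is subtracted and x1 passes through.
print(enhance_front(0.3 + 0.4j, 0.3 + 0.4j))
```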

Non-Patent Document 1: Takashi Yato, Makoto Morito, Kei Yamada, Tetsuji Ogawa, "Sound Source Separation Technology by Square Microphone Array", Information Processing, Vol. 51, No. 11, 2010

However, the conventional techniques have the following problems.

FIG. 15 is an explanatory diagram illustrating how the target sound and the interfering sound propagate in the automobile A.

The MVB can suppress only as many interfering sounds as the number of microphones minus one. Therefore, when the target sound is enhanced with two microphones as in FIG. 14, the interfering sound propagates as shown in FIG. 15(b), so the MVB can suppress the direct sound of the interfering sound but not its reflections, and the target sound cannot be enhanced sufficiently.

The technique described in Non-Patent Document 1 suppresses all sound arriving from directions other than the front, even sound that originates from the target. Therefore, when the target sound is enhanced with two microphones as in FIG. 14, the target sound propagates as shown in FIG. 15(a), so the technique of Non-Patent Document 1 also suppresses the reflections of the target sound, and the quality of the target sound deteriorates.

Therefore, what is needed is a signal processing device, a program, and a method that can enhance the target sound at lower computational cost and with less distortion.

A signal processing device according to a first aspect of the present invention comprises: (1) first frequency analysis means for frequency-analyzing a first input signal input from a first sound pickup device to obtain a first input spectrum; (2) second frequency analysis means for frequency-analyzing a second input signal input from a second sound pickup device to obtain a second input spectrum; (3) feature calculation means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature that, relative to a front direction perpendicular to the straight line connecting the position of the first sound pickup device and the position of the second sound pickup device, takes large values for the front direction and for directions on the first sound pickup device side and small values for directions on the second sound pickup device side; (4) filter determination means for obtaining an enhancement filter by mapping the first feature calculated by the feature calculation means with a predetermined weakly monotonically increasing function; and (5) multiplication means for obtaining an enhanced spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determination means.

A signal processing program according to a second aspect of the present invention causes a computer to function as: (1) first frequency analysis means for frequency-analyzing a first input signal input from a first sound pickup device to obtain a first input spectrum; (2) second frequency analysis means for frequency-analyzing a second input signal input from a second sound pickup device to obtain a second input spectrum; (3) feature calculation means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature that, relative to a front direction perpendicular to the straight line connecting the position of the first sound pickup device and the position of the second sound pickup device, takes large values for the front direction and for directions on the first sound pickup device side and small values for directions on the second sound pickup device side; (4) filter determination means for obtaining an enhancement filter by mapping the first feature calculated by the feature calculation means with a predetermined weakly monotonically increasing function; (5) multiplication means for obtaining an enhanced spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determination means; and (6) waveform restoration means for receiving the enhanced spectrum obtained by the multiplication means and restoring a signal waveform to obtain enhanced speech.

A signal processing method according to a third aspect of the present invention is a method in which: (1) first frequency analysis means, second frequency analysis means, feature calculation means, filter determination means, and multiplication means are provided; (2) the first frequency analysis means frequency-analyzes a first input signal input from a first sound pickup device to obtain a first input spectrum; (3) the second frequency analysis means frequency-analyzes a second input signal input from a second sound pickup device to obtain a second input spectrum; (4) the feature calculation means calculates, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature that, relative to a front direction perpendicular to the straight line connecting the position of the first sound pickup device and the position of the second sound pickup device, takes large values for the front direction and for directions on the first sound pickup device side and small values for directions on the second sound pickup device side; (5) the filter determination means obtains an enhancement filter by mapping the first feature calculated by the feature calculation means with a predetermined weakly monotonically increasing function; and (6) the multiplication means obtains an enhanced spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determination means.

According to the present invention, it is possible to provide a signal processing device, a program, and a method that enhance a target sound at lower computational cost and with less distortion.

FIG. 1 is a block diagram showing the functional configuration of the signal processing device according to the first embodiment.
FIG. 2 is an explanatory diagram showing an example of the environment in which the signal processing device according to the first embodiment is used.
FIG. 3 shows an example of the feature F_center processed by the signal processing device according to the first embodiment.
FIG. 4 shows an example of the feature F_side processed by the signal processing device according to the first embodiment.
FIG. 5 is a graph showing the relationship between the direction of arrival θ of sound and the DOA feature F processed by the signal processing device according to the first embodiment.
FIG. 6 is a graph showing an example of the weakly monotonically increasing function used by the signal processing device according to the first embodiment.
FIG. 7 is a graph showing an example of the enhancement filter used in the signal processing device according to the first embodiment.
FIG. 8 is a graph showing an example of the enhancement filter G obtained by the filter determination means according to the first embodiment.
FIG. 9 is a graph showing an example of the enhancement filter G obtained by the filter determination means according to the second embodiment.
FIG. 10 is a graph comparing the enhancement filter G obtained by the filter determination means according to the second embodiment with the enhancement filter G obtained by the filter determination means according to the third embodiment.
FIG. 11 is a graph showing the relationship between the direction of arrival θ of sound and the DOA feature F' processed by the signal processing device according to the fourth embodiment.
FIG. 12 is an explanatory diagram showing an example of the enhancement filter G obtained by the filter determination means 404 according to the fourth embodiment.
FIG. 13 is a block diagram showing the configuration of a conventional subtractive beamformer with two microphones.
FIG. 14 is an explanatory diagram showing an example in which a conventional signal processing device is used to enhance the voice of a driver in an automobile.
FIG. 15 is an explanatory diagram illustrating the target sound and the interfering sound in an automobile.

(A) First Embodiment
Hereinafter, a first embodiment of the signal processing device, program, and method according to the present invention will be described in detail with reference to the drawings.

(A-1) Configuration of First Embodiment
FIG. 2 is an explanatory diagram showing the environment in which the signal processing device 100 according to the first embodiment is used. In FIG. 2, the reference numerals in parentheses are those used in the second to fourth embodiments described later.

In the example of FIG. 2, the signal processing device 100 according to the first embodiment enhances the voice of the driver U1 in the automobile A. In the automobile A, the driver U1 is seated in the driver's seat and the passenger U2 is seated in the passenger seat. A first microphone M1 and a second microphone M2, which constitute a microphone array, are arranged in front of the driver U1 (in front of the driver's seat). As seen from the driver U1, the first microphone M1 is on the left (the side away from the passenger U2) and the second microphone M2 is on the right (the side of the passenger U2).

FIG. 1 is a block diagram showing the functional configuration of the signal processing device 100 according to the first embodiment.

The signal processing device 100 of the first embodiment has first frequency analysis means 101, second frequency analysis means 102, feature calculation means 103, filter determination means 104, multiplication means 105, and waveform restoration means 106.

The signal processing device 100 may be implemented partly or entirely in software. For example, it may be configured by installing a program (including the signal processing program according to the embodiment) on a computer having a memory and a processor.

The first frequency analysis means 101 frequency-analyzes the first input signal x1 to obtain the first input spectrum X1.

The second frequency analysis means 102 frequency-analyzes the second input signal x2 to obtain the second input spectrum X2.

The feature calculation means 103 obtains a predetermined feature (hereinafter, the "DOA feature F") based on the first input spectrum X1 and the second input spectrum X2. The DOA feature F changes according to the direction of arrival of the target sound and is described in detail later.

The feature calculation means 103 can obtain the DOA feature F using equation (1) or a formula obtained by algebraically transforming equation (1).

In equation (1), at a given time and a given frequency, X_1 denotes the first input spectrum, X_2 the second input spectrum, and X_2^* the complex conjugate of the second input spectrum.

The filter determination means 104 maps the DOA feature F with a predetermined weakly monotonically increasing function to obtain the enhancement filter G.

The multiplication means 105 multiplies the first input spectrum X1 by the enhancement filter G to obtain the enhanced spectrum Y.

The waveform restoration means 106 restores a signal waveform from the enhanced spectrum Y to obtain the enhanced speech y.
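
The data flow through means 101 to 106 can be sketched for a single frame as follows. This is a structural illustration only: the feature and the mapping are passed in as callables because equation (1) and the mapping function are not fixed here, a naive DFT stands in for whatever frequency analysis the device uses, and windowing and overlap between frames are omitted. All function names are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive DFT standing in for the frequency analysis means 101 and 102."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT standing in for the waveform restoration means 106."""
    n = len(spectrum)
    # Only the real part is kept, which assumes the filtered spectrum is
    # still conjugate-symmetric.
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def enhance_frame(x1, x2, feature_fn, map_fn):
    """One frame: spectra -> DOA feature F -> filter G -> Y = G * X1 -> y."""
    X1 = dft(x1)                                     # means 101
    X2 = dft(x2)                                     # means 102
    F = [feature_fn(a, b) for a, b in zip(X1, X2)]   # means 103
    G = [map_fn(f) for f in F]                       # means 104
    Y = [g * a for g, a in zip(G, X1)]               # means 105
    return idft(Y)                                   # means 106


# With a constant feature and an identity mapping the frame passes through.
frame1 = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5, 0.0]
frame2 = [0.0, 0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5]
print(enhance_frame(frame1, frame2, lambda a, b: 1.0, lambda f: f))
```

Only the first spectrum is multiplied by the gain, so the phase of the first input signal is carried over to the output unchanged.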

Next, the design concept behind the DOA feature obtained by the feature calculation means 103 and the enhancement filter obtained by the filter determination means 104 is described.

The enhancement filter must suppress the direct and reflected sound of the interfering sound arriving from the second microphone M2 side (the interfering sound side) while leaving unsuppressed the direct and reflected sound of the target sound arriving from the first microphone M1 side (the target sound side, which also includes the front direction). The DOA feature should therefore take a large value when sound arrives from the first microphone M1 side and a small value when it arrives from the second microphone M2 side. However, because the first microphone M1 side includes the front direction, such a characteristic is not symmetric with respect to the direction of arrival, and no known feature has this property.

Consider, therefore, a feature that takes a large value for the front direction and a feature that takes a large value for the second microphone M2 side. At a given time and a given frequency, let X_1 be the first input spectrum, X_2 the second input spectrum, and X_2^* the complex conjugate of the second input spectrum, and consider, for example, the feature F_center given by equation (1-1) and the feature F_side given by equation (1-2).

Here, let the front direction (the direction perpendicular to the straight line connecting the positions of the two microphones) be 0 degrees and the direction toward the second microphone M2 (the direction of the second microphone M2 as seen from the first microphone M1) be +90 degrees. Letting S be the spectrum of the sound source, ω the angular frequency, d the spacing between the two microphones, θ (theta) the direction of arrival of the sound, and c the speed of sound, X_1 and X_2 can be written as equations (2) and (3), respectively. Substituting equations (2) and (3) into equations (1-1) and (1-2) yields equations (3-1) and (3-2), respectively. FIGS. 3 and 4 show how the features F_center and F_side given by equations (3-1) and (3-2) depend on the direction of arrival θ.
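
A common way to write such a two-microphone signal model is the far-field plane-wave form sketched below. It uses the symbols S, ω, d, θ, and c defined above, but the exact form (in particular the symmetric split of the delay about the array centre and the default values of d and c) is an assumption and may differ from equations (2) and (3). The function name `two_mic_spectra` is hypothetical.

```python
import cmath
import math


def two_mic_spectra(S, freq_hz, theta_deg, d=0.03, c=332.0):
    """Return (X_1, X_2) for a plane wave from theta_deg (+90 degrees = M2 side)."""
    omega = 2.0 * math.pi * freq_hz
    phase = omega * d * math.sin(math.radians(theta_deg)) / c
    # The delay is split symmetrically about the array centre (an assumption).
    X1 = S * cmath.exp(-0.5j * phase)   # M1 is farther from a source on the M2 side
    X2 = S * cmath.exp(+0.5j * phase)
    return X1, X2


# Under this model the cross-spectrum X_1 X_2^* has magnitude |S|^2 and
# phase -omega * d * sin(theta) / c, independent of the phase of S.
X1, X2 = two_mic_spectra(2.0 - 1.0j, 1000.0, 90.0)
cross = X1 * X2.conjugate()
print(abs(cross), cmath.phase(cross))
```

Because the source spectrum cancels out of the phase of the cross-spectrum, quantities built from X_1 X_2^* depend on the direction of arrival rather than on the content of the sound.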

FIG. 3 shows an example of the feature F_center.

In FIG. 3, the horizontal axis is the direction of arrival θ of the sound source and the vertical axis is the feature F_center. The graph shows the relationship between θ and F_center when the source frequency is 1 kHz, 2 kHz, and 4 kHz.

FIG. 4 shows an example of the feature F_side.

In FIG. 4, the horizontal axis is the direction of arrival θ of the sound source and the vertical axis is the feature F_side. The graph shows the relationship between θ and F_side when the source frequency is 1 kHz, 2 kHz, and 4 kHz.

FIGS. 3 and 4 show that F_center takes a large value for an arrival direction of 0 degrees, while F_side, relative to 0 degrees, takes large values on the second microphone M2 side and small values on the first microphone M1 side.

As described above, the DOA feature F is obtained using F_center, which takes a large value in the direction of the target sound (the 0-degree direction, that is, the front direction), and F_side, which takes a large value on the second microphone M2 side (that is, the microphone on the side of the passenger U2, who is the source of the interfering sound).

Next, the relationship between the direction of arrival of sound and the DOA feature is described.

The DOA feature is defined by equation (3-3).

Substituting equations (2) and (3) into equation (3-3) yields equation (4), which can be rearranged into equation (5). FIG. 5 shows the relationship between the direction of arrival θ and the DOA feature F given by equation (5).

FIG. 5 is a graph showing the relationship between the direction of arrival θ of sound and the DOA feature F.

In FIG. 5, the horizontal axis is the direction of arrival θ of the sound source and the vertical axis is the DOA feature F. The graph shows the relationship between θ and F when the source frequency is 1 kHz, 2 kHz, and 4 kHz.

FIG. 5 shows that the DOA feature F is always F = 1 for the front direction and always F < 1 for the second microphone M2 side (the interfering sound side). On the first microphone M1 side, on the other hand, F > 1 at low frequencies and, at high frequencies, for θ close to 0 degrees, whereas F < 1 at high frequencies for θ close to 90 degrees.

As described above, the DOA feature F takes a large value when sound arrives from the first microphone M1 side and a small value when it arrives from the second microphone M2 side. In other words, the DOA feature F has a peak in a direction between the front direction and the first microphone M1 side (the side opposite to the interfering sound from the passenger U2), and its value decreases as the direction tilts from that peak toward the second microphone M2 side.
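
The qualitative behaviour described for FIG. 5 can be reproduced with a simple stand-in feature. The sketch below is not equation (1), (3-3), or (5), none of which is reproduced in this text. It is a hypothetical feature, the sum of the real and imaginary parts of the normalised cross-spectrum X_1 X_2^*, chosen only because under an assumed far-field model (d = 3 cm, c = 332 m/s, positive θ toward M2) it equals 1 in front, is below 1 on the M2 side, and exceeds 1 on the M1 side at low frequencies, without evaluating an arctangent.

```python
import cmath
import math


def plane_wave(freq_hz, theta_deg, d=0.03, c=332.0):
    """Assumed far-field spectra (X_1, X_2) of a unit source; theta > 0 is the M2 side."""
    phase = 2.0 * math.pi * freq_hz * d * math.sin(math.radians(theta_deg)) / c
    return cmath.exp(-1j * phase), 1.0 + 0.0j


def standin_doa_feature(X1, X2):
    """Hypothetical feature: (Re + Im) of the normalised cross-spectrum X_1 X_2^*."""
    cross = X1 * X2.conjugate()
    norm = abs(X1) * abs(X2)
    return (cross.real + cross.imag) / norm if norm > 0.0 else 0.0


for freq in (1000.0, 2000.0, 4000.0):
    row = [standin_doa_feature(*plane_wave(freq, theta))
           for theta in (-90.0, -30.0, 0.0, 30.0, 90.0)]
    print(freq, [round(value, 3) for value in row])
```

Each printed row lists the stand-in feature from the M1 side (negative angles) through the front to the M2 side, so the asymmetry about the front direction can be read off directly.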

The enhancement filter is obtained by mapping the DOA feature with a predetermined weakly monotonically increasing function.

FIG. 6 is a graph showing an example of the weakly monotonically increasing function f_map(F).

In FIG. 6, the horizontal axis is the value of the DOA feature F and the vertical axis is the value of the enhancement filter G.

As can be seen from FIG. 15, the enhancement filter should suppress sound arriving from the second microphone M2 side while leaving sound arriving from the front direction and the first microphone M1 side unsuppressed. For this purpose, the weakly monotonically increasing function f_map(F) is defined, for example, as in equation (6). FIG. 6 shows an example of f_map(F) for a microphone spacing of 3 cm, a speed of sound of 332 m/s, and F_0 = 0.9.
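
One possible shape for such a mapping is sketched below. Equation (6) is not reproduced in this text, so the piecewise-linear form and the gain floor are assumptions chosen only to satisfy the stated requirements: the gain is 1 once F reaches the threshold F_0 = 0.9, falls below 1 otherwise, and never decreases as F grows. The name `f_map` follows the text, while the `floor` parameter is hypothetical.

```python
def f_map(F, F0=0.9, floor=0.1):
    """Hypothetical weakly monotonically increasing map from F to the gain G."""
    if F >= F0:
        return 1.0                 # front and M1-side directions pass unchanged
    return max(floor, F / F0)      # gain falls with F down to an assumed floor


# Gains over a grid of feature values from -1.5 to 1.5.
grid = [i / 10.0 for i in range(-15, 16)]
gains = [f_map(F) for F in grid]
print(list(zip(grid, gains)))
```

A floor above zero is a common design choice in gain-based enhancement because it limits how abruptly suppressed components vanish, but whether equation (6) uses one is not known from this text.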

強調フィルタをＧ＝ｆｍａｐ（Ｆ）として得ると、音の到来方向θと強調フィルタＧとの関係は図７のようになる。 When the emphasis filter is obtained as G = fmap (F), the relationship between the sound arrival direction θ and the emphasis filter G is as shown in FIG.

図７は、強調フィルタの例について示したグラフである。 FIG. 7 is a graph showing an example of an emphasis filter.

図７では、横軸を音の到来方向θの値とし縦軸を強調フィルタＧの値としている。図７では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの値の関係を示したグラフとなっている。 In FIG. 7, the horizontal axis is the value of the sound arrival direction θ, and the vertical axis is the value of the emphasis filter G. FIG. 7 is a graph showing the relationship between the arrival direction θ and the value of the emphasis filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz.

つまり、ＤＯＡ特徴量Ｆが１より少し小さい値より大きい場合には強調フィルタＧを１とし、そうでない場合には強調フィルタＧは１より小さくすることで、強調フィルタに所望の特性、すなわち妨害音の直接音と反射音を抑圧するが目的音の直接音と反射音は抑圧しない特性を与えられる。

That is, by setting the emphasis filter G to 1 when the DOA feature amount F exceeds a value slightly smaller than 1, and making G smaller than 1 otherwise, the emphasis filter obtains the desired characteristic: it suppresses the direct and reflected sound of the interfering sound but does not suppress the direct and reflected sound of the target sound.

なお、本発明と同様の強調フィルタは、例えば第１の入力スペクトルと第２の入力スペクトルとから周波数ごとに到来方向θを算出することで得ることもできるが、逆正接関数（ａｔａｎ、ａｒｃｔａｎ、ｔａｎ^−１などと書かれる）を計算する演算コストがかかる。そのため、演算コストの観点で本発明の方が優位である。 An emphasis filter similar to that of the present invention can also be obtained by, for example, calculating the arrival direction θ for each frequency from the first and second input spectra, but this incurs the cost of evaluating an inverse tangent function (written atan, arctan, tan⁻¹, and so on). The present invention is therefore superior in terms of computational cost.
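
For comparison, the direction-based alternative mentioned above can be sketched as follows. This is only an illustration under assumptions that are not in the source: a far-field plane-wave model, a sign convention in which positive angles delay the second microphone, and the hypothetical function name `arrival_angle_deg`. The default spacing and speed of sound are the 3 cm and 332 m/s quoted for FIG. 6.

```python
import cmath
import math

def arrival_angle_deg(X1, X2, freq_hz, mic_distance_m=0.03, sound_speed=332.0):
    """Estimate the arrival direction (degrees) of one band from two complex spectra."""
    # cmath.phase evaluates atan2(imag, real): this is the inverse-tangent step
    # whose cost the DOA feature amount is meant to avoid.
    phase_diff = cmath.phase(X1 * X2.conjugate())
    # Assumed plane-wave model: phase_diff = 2*pi*f*d*sin(theta)/c.
    sin_theta = sound_speed * phase_diff / (2.0 * math.pi * freq_hz * mic_distance_m)
    # Clamp so that noise cannot push the argument outside the asin domain.
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.degrees(math.asin(sin_theta))
```

Each band needs one atan2 and one asin here, whereas the feature-based filter only compares F against a threshold or passes it through a fixed map.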

（Ａ−２）第１の実施形態の動作 (A-2) Operation of First Embodiment
次に、上述した構成を有する第１の実施形態の信号処理装置１００の動作（実施形態の信号処理方法）を、図１を参照しながら説明する。 Next, the operation of the signal processing device 100 of the first embodiment having the above-described configuration (the signal processing method of the embodiment) will be described with reference to FIG. 1.

信号処理装置１００は、目的音源を含む第１の入力信号ｘ_１と第２の入力信号ｘ_２（時間領域の入力信号）について、目的音強調を行って、強調音声ｙ（時間領域の出力信号）を生成するものである。 The signal processing device 100 emphasizes the target sound for the first input signal x ₁ and the second input signal x ₂ (input signal in the time domain) including the target sound source, and emphasizes the sound y (output signal in the time domain). ) Is generated.

第１の周波数解析手段１０１及び第２の周波数解析手段１０２は、フーリエ変換に代表される任意の周波数解析手法、またはフィルタバンクに代表される任意の帯域分割手段によって、第１の入力信号ｘ_１と第２の入力信号ｘ_２をそれぞれＫ個の帯域に分割し、第１の入力スペクトルＸ_１と第２の入力スペクトルＸ_２とを得る。以下、第１の入力スペクトルと第２の入力スペクトルは、帯域の番号（例えばｋ番目）を明示する必要がある場合はＸ_１（ｋ）、Ｘ_２（ｋ）と書き、帯域の番号を明示する必要がない場合は単にＸ_１、Ｘ_２と表記する。第１の周波数解析手段１０１は、得られた第１の入力スペクトルＸ_１を特徴量算出手段１０３と乗算手段１０５に与え、第２の周波数解析手段１０２は、得られた第２の入力スペクトルＸ_２を特徴量算出手段１０３に与える。なお、乗算手段１０５に与えられる入力スペクトルは第１の入力スペクトルＸ_１としたが、これに限定されるものではなく、第２の入力スペクトルＸ_２を乗算手段１０５に与えても良く、いずれも同様の効果を奏する。 The first frequency analysis means 101 and the second frequency analysis means 102 divide the first input signal x₁ and the second input signal x₂, respectively, into K bands by any frequency analysis method, typified by the Fourier transform, or any band division means, typified by a filter bank, to obtain the first input spectrum X₁ and the second input spectrum X₂. Hereinafter, the first and second input spectra are written X₁(k) and X₂(k) when the band number (for example, the k-th) needs to be specified, and simply X₁ and X₂ otherwise. The first frequency analysis means 101 gives the obtained first input spectrum X₁ to the feature amount calculation means 103 and the multiplication means 105, and the second frequency analysis means 102 gives the obtained second input spectrum X₂ to the feature amount calculation means 103. Although the input spectrum given to the multiplication means 105 is the first input spectrum X₁ here, this is not a limitation; the second input spectrum X₂ may be given to the multiplication means 105 instead, with the same effect.

特徴量算出手段１０３は、第１の入力スペクトルＸ_１と第２の入力スペクトルＸ_２とに基づいて（７）式によってＤＯＡ特徴量Ｆを算出し、フィルタ決定手段１０４に与える。（７）式をそのまま使って計算しても良いが、冗長な演算を含むため、式変形しても良い。Ｘ_１とＸ_２は複素数なので、これを（８）式のように書き直して（７）式に代入して整理すると、（９）式を得る。（７）式の代わりに（９）式を用いることで、乗算回数を減らすことができる。

The feature amount calculation means 103 calculates the DOA feature amount F by Eq. (7) based on the first input spectrum X₁ and the second input spectrum X₂, and gives it to the filter determining means 104. Eq. (7) may be used as it is, but because it includes redundant operations it may also be rearranged. Since X₁ and X₂ are complex numbers, rewriting them as in Eq. (8), substituting into Eq. (7), and simplifying yields Eq. (9). Using Eq. (9) instead of Eq. (7) reduces the number of multiplications.

フィルタ決定手段１０４は、ＤＯＡ特徴量Ｆに基づいて所定の広義単調増加関数によって強調フィルタＧを算出し、乗算手段１０５に与える。第１の実施形態では、すべての周波数に対して同じ広義単調増加関数を用いる。所定の広義単調増加関数ｆｍａｐ（Ｆ）として、例えば（１０）式で定義される１つの閾値Ｆ_０を持つ関数や、（１１）式で定義される２つの閾値Ｆ_１、Ｆ_２を持つ関数、（１２）式で定義されるスケールＦ_３、オフセットＦ_０のシグモイド関数を用いることができる。

The filter determining means 104 calculates the emphasis filter G by a predetermined broad-sense monotonous increase function based on the DOA feature amount F, and gives it to the multiplication means 105. In the first embodiment, the same broadly monotonous increasing function is used for all frequencies. As a predetermined broadly defined monotonous increasing function fmap (F), for example, a function having one threshold value F ₀ defined by the equation (10) and a function having two threshold values F ₁ and F ₂ defined by the equation (11). , The sigmoid function of scale F ₃ and offset F ₀ defined by Eq. (12) can be used.

図８は、第１の実施形態に係るフィルタ決定手段１０４で得られる強調フィルタＧの例について示したグラフである。 FIG. 8 is a graph showing an example of the emphasis filter G obtained by the filter determining means 104 according to the first embodiment.

図８（ａ）、図８（ｂ）、図８（ｃ）は、それぞれ（１０）式、（１１）式、（１２）式によって得られる強調フィルタＧの例を示している。図８（ａ）、図８（ｂ）、図８（ｃ）では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの関係を示したグラフとなっている。図８（ａ）、図８（ｂ）、図８（ｃ）では、横軸を音源の到来方向θとし縦軸を強調フィルタＧの値（到来方向θに応じた値）としている。 8 (a), 8 (b), and 8 (c) show examples of the emphasis filter G obtained by the equations (10), (11), and (12), respectively. 8 (a), 8 (b), and 8 (c) are graphs showing the relationship between the arrival direction θ and the emphasis filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. There is. In FIGS. 8 (a), 8 (b), and 8 (c), the horizontal axis is the sound source arrival direction θ, and the vertical axis is the value of the emphasis filter G (value corresponding to the arrival direction θ).

ここでは、Ｆ_０＝０．８、Ｆ_１＝０．７、Ｆ_２＝０．９、Ｆ_３＝１２とした。妨害音の抑圧性能に関して、（１０）式と（１１）式との差はあまりない。一方、強調音声の歪みに関して、（１０）式で得られる強調フィルタＧは値を０か１しか持たないためにミュージカルノイズを発生しやすいが、（１１）式は遷移帯域があることで抑圧／非抑圧の切り替わりが緩やかになるためにミュージカルノイズが発生しにくい。（１２）式は（１１）式をさらに滑らかにした特性となっており、更なるミュージカルノイズ低減効果や歪みを減らす効果が期待できる。多少の演算コストの増加が許容されるのであれば、（１２）式を用いるのが好適である。 Here, F ₀ = 0.8, F ₁ = 0.7, F ₂ = 0.9, and F ₃ = 12. There is not much difference between the equations (10) and (11) in terms of the suppression performance of the disturbing sound. On the other hand, regarding the distortion of the emphasized voice, the emphasis filter G obtained by the equation (10) tends to generate musical noise because it has only 0 or 1, but the equation (11) suppresses / suppresses it by having a transition band. Musical noise is less likely to occur because the switching of non-suppression becomes gentle. Equation (12) has characteristics that are smoother than those of equation (11), and further effects of reducing musical noise and distortion can be expected. If a slight increase in calculation cost is allowed, it is preferable to use Eq. (12).
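
The three maps compared above can be sketched as follows. Equations (10) to (12) are not reproduced in this text, so the exact forms below are assumptions chosen to match the description: a two-valued step at F₀, a linear transition band between F₁ and F₂, and a logistic curve with scale F₃ and offset F₀. The function names are hypothetical, and the defaults are the parameter values quoted above.

```python
import math

def fmap_step(F, F0=0.8):
    """Two-valued map in the spirit of Eq. (10): the gain is either 0 or 1."""
    return 1.0 if F >= F0 else 0.0

def fmap_ramp(F, F1=0.7, F2=0.9):
    """Map with a transition band in the spirit of Eq. (11); a linear ramp is assumed."""
    return min(1.0, max(0.0, (F - F1) / (F2 - F1)))

def fmap_sigmoid(F, F0=0.8, F3=12.0):
    """Smooth map in the spirit of Eq. (12); a logistic form is assumed."""
    return 1.0 / (1.0 + math.exp(-F3 * (F - F0)))
```

All three are non-decreasing in F and give a gain of about 0.5 or more at F = 0.8; they differ only in how abruptly the gain changes around that point, which is the property the discussion of musical noise turns on.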

乗算手段１０５は、入力スペクトルＸ_１に周波数ごとに強調フィルタＧ（強調ゲイン）を乗じ、得られた強調スペクトルＹを波形復元手段１０６に与える。 The multiplication means 105 multiplies the input spectrum X₁ by the emphasis filter G (emphasis gain) for each frequency and gives the resulting emphasized spectrum Y to the waveform restoration means 106.

波形復元手段１０６は、第１の周波数解析手段１０１と第２の周波数解析手段１０２で用いた周波数解析手法または帯域分割手法に対応する波形復元手法を用いて、乗算手段１０５から与えられた強調スペクトルＹに基づいて信号波形を再構成し、得られた強調音声ｙ（強調信号）を出力する。 The waveform restoration means 106 uses the emphasis spectrum given by the multiplication means 105 by using the waveform restoration method corresponding to the frequency analysis method or the band division method used in the first frequency analysis means 101 and the second frequency analysis means 102. The signal waveform is reconstructed based on Y, and the obtained emphasized voice y (emphasized signal) is output.
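
The processing chain of means 101 to 106 can be sketched as a short-time Fourier transform with overlap-add reconstruction. This is a minimal sketch under stated assumptions, not the patented implementation: the frame length, the periodic Hann window with 50 % overlap, and the function name `enhance` are illustrative choices, and `feature_fn` is a hypothetical callable standing in for Eq. (7) or Eq. (9), which are not reproduced in this text.

```python
import numpy as np

def enhance(x1, x2, feature_fn, fmap, frame_len=512):
    """Two-microphone enhancement: analysis, gain, multiplication, overlap-add."""
    hop = frame_len // 2
    # Periodic Hann window: at 50 % overlap its shifted copies sum to exactly 1,
    # so a gain of 1 reproduces the input away from the signal edges.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(frame_len) / frame_len)
    y = np.zeros(len(x1))
    for start in range(0, len(x1) - frame_len + 1, hop):
        seg = slice(start, start + frame_len)
        X1 = np.fft.rfft(window * x1[seg])      # first frequency analysis (101)
        X2 = np.fft.rfft(window * x2[seg])      # second frequency analysis (102)
        F = feature_fn(X1, X2)                  # feature calculation (103), supplied by caller
        G = fmap(F)                             # filter determination (104)
        Y = G * X1                              # multiplication (105)
        y[seg] += np.fft.irfft(Y, n=frame_len)  # waveform restoration (106)
    return y
```

Passing `x2`'s spectrum to the multiplication step instead of `X1` would correspond to the variation noted above for the multiplication means 105.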

（Ａ−３）第１の実施形態の効果 (A-3) Effect of First Embodiment
第１の実施形態によれば、以下のような効果を奏することができる。 According to the first embodiment, the following effects can be obtained.

第１の実施形態の信号処理装置１００では、第２のマイクＭ２側の方向から到来する音を抑圧し、正面方向と第１のマイクＭ１側の方向から到来する音は抑圧しないので、自動車内において運転手Ｕ１の声（目的音）を強調する場合などにおいて、少ない歪みで目的音を強調することができる。 The signal processing device 100 of the first embodiment suppresses sound arriving from the second microphone M2 side and does not suppress sound arriving from the front direction or the first microphone M1 side. Consequently, when, for example, the voice of the driver U1 (the target sound) is to be emphasized inside an automobile, the target sound can be emphasized with little distortion.

言い換えると、信号処理装置１００では、少ない演算コストで、妨害音の直接音と反射音を抑圧するが、目的音の直接音と反射音は抑圧しない強調フィルタＧを設計できるので、少ない歪みで目的音を強調できるという効果を奏する。 In other words, the signal processing device 100 can design an emphasis filter G that suppresses the direct sound and the reflected sound of the disturbing sound at a low calculation cost, but does not suppress the direct sound and the reflected sound of the target sound. It has the effect of emphasizing the sound.

（Ｂ）第２の実施形態 (B) Second Embodiment
以下、本発明による信号処理装置、プログラム及び方法の第２の実施形態を、図面を参照しながら詳述する。 Hereinafter, a second embodiment of the signal processing device, program, and method according to the present invention will be described in detail with reference to the drawings.

（Ｂ−１）第２の実施形態の構成 (B-1) Configuration of Second Embodiment
第２の実施形態の信号処理装置２００も、第１の実施形態と同様に図２に示すような環境で利用されるものとして説明する。 The signal processing device 200 of the second embodiment is also described as being used in the environment shown in FIG. 2, as in the first embodiment.

また、第２の実施形態の信号処理装置２００の内部構成についても、上述の図１を用いて示すことができる。 Further, the internal configuration of the signal processing device 200 of the second embodiment can also be shown with reference to FIG. 1 described above.

以下では、第２の実施形態の信号処理装置２００について、第１の実施形態との差異を説明する。 Hereinafter, the difference between the signal processing device 200 of the second embodiment and that of the first embodiment will be described.

第１の実施形態では、フィルタ決定手段１０４において、すべての周波数に同じ広義単調増加関数ｆｍａｐ（Ｆ）を適用して強調フィルタＧを得ていたため、図８に示した通り、強調フィルタＧの特性が周波数ごとに異なっていた。特に低い周波数では抑圧されない到来方向の範囲が広くなる現象が起こる。そこで、第２の実施形態では、どの周波数でも同じような特性となるように、周波数ごとに異なる広義単調増加関数を適用する。 In the first embodiment, in the filter determining means 104, the same broadly defined monotonic increase function fmap (F) is applied to all frequencies to obtain the emphasis filter G. Therefore, as shown in FIG. 8, the characteristics of the emphasis filter G are obtained. Was different for each frequency. Especially at low frequencies, a phenomenon occurs in which the range of the arrival direction that is not suppressed becomes wide. Therefore, in the second embodiment, a broad monotonous increase function different for each frequency is applied so that the characteristics are the same at any frequency.

第２の実施形態の信号処理装置２００の構成は、図１に示すように、フィルタ決定手段１０４がフィルタ決定手段２０４に替わること以外は、第１の実施形態の信号処理装置１００の構成と同じである。 The configuration of the signal processing device 200 of the second embodiment is the same as the configuration of the signal processing device 100 of the first embodiment except that the filter determining means 104 is replaced with the filter determining means 204 as shown in FIG. Is.

（Ｂ−２）第２の実施形態の動作 (B-2) Operation of Second Embodiment
次に、以上のような構成を有する第２の実施形態の信号処理装置２００の動作（実施形態の信号処理方法）を説明する。 Next, the operation of the signal processing device 200 of the second embodiment having the above configuration (the signal processing method of the embodiment) will be described.

第２の実施形態の信号処理装置２００の動作は、フィルタ決定手段２０４の動作がフィルタ決定手段１０４とは異なる点以外は、第1の実施形態の信号処理装置１００の動作と同じである。 The operation of the signal processing device 200 of the second embodiment is the same as the operation of the signal processing device 100 of the first embodiment except that the operation of the filter determining means 204 is different from that of the filter determining means 104.

第１の実施形態では、図７に示すように、正面方向から第２のマイクＭ２の側に到来方向を傾けた際に、周波数によって強調フィルタＧのゲインが所定以下（例えば、０．５以下）となる到来角度（以下、「カットオフ到来角度」と呼ぶ）にばらつきがある。言い換えると、第１の実施形態では、周波数によって抑圧しない到来方向の範囲にばらつきがある。これに対して、第２の実施形態のフィルタ決定手段２０４は、周波数ごとの広義単調増加関数を設定することで、このばらつきを吸収し、複数の周波数でカットオフ到来角度（抑圧しない到来方向の範囲）が近づくようにしている。フィルタ決定手段２０４において、周波数ごとのカットオフ到来角度のばらつき（抑圧しない到来方向の範囲のばらつき）を抑制するような、周波数ごとの広義単調増加関数を求める方式については限定されないものであるが、例えば、以下のような処理を適用することができる。 In the first embodiment, as shown in FIG. 7, when the arrival direction is tilted from the front direction toward the second microphone M2 side, the arrival angle at which the gain of the emphasis filter G falls to or below a predetermined value (for example, 0.5 or less), hereinafter called the "cutoff arrival angle", varies with frequency. In other words, in the first embodiment the range of arrival directions that is not suppressed varies with frequency. In contrast, the filter determining means 204 of the second embodiment absorbs this variation by setting a broad-sense monotonically increasing function for each frequency, so that the cutoff arrival angles (the unsuppressed ranges of arrival directions) at multiple frequencies approach one another. The method by which the filter determining means 204 obtains per-frequency functions that reduce this variation is not limited; for example, the following processing can be applied.

フィルタ決定手段２０４は、ＤＯＡ特徴量Ｆに基づいて所定の広義単調増加関数によって強調フィルタＧを算出し、乗算手段１０５に与える。第２の実施形態では、周波数ごとに異なる広義単調増加関数を用いる。ここでは、ｋ番目の周波数のＤＯＡ特徴量をＦ（ｋ）、ｋ番目の周波数の強調ゲインをＧ（ｋ）と書く。ｋ番目の周波数をｆ_ｋとして、到来方向と周波数の番号をＤＯＡ特徴量に変換する関数を（１３）式で定義する。そして、所定のｋ番目の周波数の広義単調増加関数ｆｍａｐ_ｋ（Ｆ（ｋ））として、例えば（１４）式で定義される１つの到来方向閾値θ_０を持つ関数や、（１５）式で定義される２つの到来方向閾値θ_１、θ_２を持つ関数、（１６）式で定義されるスケールＦ_ａ、オフセット到来方向θ_０のシグモイド関数を用いることができる。

The filter determining means 204 calculates the emphasis filter G from the DOA feature amount F by a predetermined broad-sense monotonically increasing function and gives it to the multiplication means 105. In the second embodiment, a different broad-sense monotonically increasing function is used for each frequency. Here, the DOA feature amount of the k-th frequency is written F(k), and the emphasis gain of the k-th frequency is written G(k). Letting f_k be the k-th frequency, a function that converts an arrival direction and a frequency number into a DOA feature amount is defined by Eq. (13). Then, as the broad-sense monotonically increasing function fmap_k(F(k)) for the k-th frequency, it is possible to use, for example, a function with one arrival direction threshold θ₀ defined by Eq. (14), a function with two arrival direction thresholds θ₁ and θ₂ defined by Eq. (15), or a sigmoid function with scale F_a and offset arrival direction θ₀ defined by Eq. (16).

図９は、第２の実施形態に係るフィルタ決定手段２０４で得られる強調フィルタＧの例について示したグラフである。 FIG. 9 is a graph showing an example of the emphasis filter G obtained by the filter determining means 204 according to the second embodiment.

図９（ａ）、図９（ｂ）、図９（ｃ）は、それぞれ（１４）式、（１５）式、（１６）式によって得られる強調フィルタＧの例を示している。図９（ａ）、図９（ｂ）、図９（ｃ）では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの関係を示したグラフとなっている。図９（ａ）、図９（ｂ）、図９（ｃ）では、横軸を音源の到来方向θとし縦軸を強調フィルタＧの値（到来方向θに応じた値）としている。 9 (a), 9 (b), and 9 (c) show examples of the emphasis filter G obtained by the equations (14), (15), and (16), respectively. 9 (a), 9 (b), and 9 (c) are graphs showing the relationship between the arrival direction θ and the emphasis filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. There is. In FIGS. 9A, 9B, and 9C, the horizontal axis is the sound source arrival direction θ, and the vertical axis is the value of the emphasis filter G (value corresponding to the arrival direction θ).

ここでは、θ_０＝１５、θ_１＝２０、θ_２＝１０、Ｆ_ａ＝１２とした。第１の実施形態における強調フィルタＧ（図８）では、抑圧しない到来方向の範囲が周波数ごとに変化していたが、第２の実施形態における強調フィルタＧ（図９）では、高い周波数の第１のマイクＭ１側を除いて、抑圧しない到来方向の範囲は周波数が変わっても変化しない。なお、（１６）式については周波数によって特性が変化しているが、強調フィルタＧのゲインが０．５となるカットオフ到来角度は周波数に依らず一定である。つまり、（１３）〜（１６）式を用いて強調ゲインを算出すれば、第２のマイクＭ２側、すなわち妨害音側（助手席側）を何度まで抑圧するかを、すべての周波数共通かつ直接的に設定できる。 Here, θ₀ = 15, θ₁ = 20, θ₂ = 10, and F_a = 12. With the emphasis filter G of the first embodiment (FIG. 8), the unsuppressed range of arrival directions changed from frequency to frequency, whereas with the emphasis filter G of the second embodiment (FIG. 9) it does not change with frequency, except on the first microphone M1 side at high frequencies. For Eq. (16) the characteristic does change with frequency, but the cutoff arrival angle at which the gain of the emphasis filter G becomes 0.5 is constant regardless of frequency. That is, when the emphasis gain is calculated using Eqs. (13) to (16), the angle up to which the second microphone M2 side, that is, the interfering sound side (passenger seat side), is suppressed can be set directly and in common for all frequencies.
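
The idea of setting the cut-off as an angle can be sketched as follows. Equations (13) and (14) are not reproduced in this text, so this is an assumed reading: the per-band threshold on F(k) is taken to be the feature value that the conversion function yields at the threshold direction θ₀, and the gain is a two-valued step. The name `fmap_per_band` and the callable `phi(theta_deg, k)`, which stands in for the conversion function of Eq. (13), are hypothetical.

```python
def fmap_per_band(F_bands, phi, theta0_deg=15.0):
    """Per-band step gain whose threshold is specified as an arrival direction."""
    gains = []
    for k, F in enumerate(F_bands):
        # Feature value expected at the cut-off direction in band k (assumed role of Eq. (13)).
        threshold = phi(theta0_deg, k)
        # Assumed step form for Eq. (14): pass the band if its feature reaches the threshold.
        gains.append(1.0 if F >= threshold else 0.0)
    return gains
```

Because the threshold is recomputed for every band from the same angle, the cut-off direction stays fixed while the threshold on F(k) moves with frequency.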

（Ｂ−３）第２の実施形態の効果 (B-3) Effect of Second Embodiment
第２の実施形態によれば、第１の実施形態の効果に加えて、以下のような効果を奏することができる。 According to the second embodiment, the following effects can be obtained in addition to the effects of the first embodiment.

第２の実施形態の信号処理装置２００では、強調ゲインが抑圧しない到来方向の範囲をすべての周波数で同じように与えることができ、かつその範囲を到来方向の角度そのもので設定できるので、より適切な調整が可能となり、より少ない歪みで目的音を強調できるという効果を奏する。 In the signal processing device 200 of the second embodiment, the range of the arrival direction in which the emphasis gain is not suppressed can be given in the same manner at all frequencies, and the range can be set by the angle of the arrival direction itself, which is more appropriate. The effect is that the target sound can be emphasized with less distortion.

（Ｃ）第３の実施形態 (C) Third Embodiment
以下、本発明による信号処理装置、プログラム及び方法の第３の実施形態を、図面を参照しながら詳述する。 Hereinafter, a third embodiment of the signal processing device, program, and method according to the present invention will be described in detail with reference to the drawings.

（Ｃ−１）第３の実施形態の構成 (C-1) Configuration of Third Embodiment
第３の実施形態の信号処理装置３００も、第１、第２の実施形態と同様に図２に示すような環境で利用されるものとして説明する。 The signal processing device 300 of the third embodiment is also described as being used in the environment shown in FIG. 2, as in the first and second embodiments.

また、第３の実施形態の信号処理装置３００の内部構成についても、上述の図１を用いて示すことができる。 Further, the internal configuration of the signal processing device 300 of the third embodiment can also be shown with reference to FIG. 1 described above.

以下では、第３の実施形態の信号処理装置３００について、第２の実施形態との差異を説明する。 Hereinafter, the differences between the signal processing device 300 of the third embodiment and the second embodiment will be described.

第２の実施形態では、妨害音側の到来方向を何度まで抑圧するかを、すべての周波数共通で設定した。しかし、低い周波数は信号の波長に対してマイク間隔を十分に広く取ることが困難なため（１００Ｈｚの波長は約３．３ｍだが、自動車内でのマイク間隔は数ｃｍとするのが一般的）、低い周波数において数値計算によって得られる到来方向に関する情報（本発明ではＤＯＡ特徴量）は一般に曖昧になる（到来方向推定の意味で推定誤差が大きくなる）。そこで、第３の実施形態では、所定よりも低い周波数（例えば、２５０Ｈｚ以下の周波数帯）では強調ゲインが抑圧しない到来方向の範囲を広げる（第２のマイクＭ２の側に広げる；妨害音を発する助手Ｕ２の側に広げる）ように設計する。 In the second embodiment, the angle up to which arrival directions on the interfering sound side are suppressed was set in common for all frequencies. At low frequencies, however, it is difficult to make the microphone spacing sufficiently wide relative to the signal wavelength (the wavelength at 100 Hz is about 3.3 m, whereas the microphone spacing in an automobile is generally a few centimeters), so the arrival direction information obtained by numerical calculation at low frequencies (the DOA feature amount in the present invention) generally becomes ambiguous (the estimation error, in the sense of arrival direction estimation, becomes large). Therefore, in the third embodiment, at frequencies lower than a predetermined frequency (for example, the band of 250 Hz and below), the range of arrival directions that the emphasis gain does not suppress is widened (toward the second microphone M2 side, that is, toward the passenger U2 who emits the interfering sound).
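
As a worked check of the figures quoted above, the following uses the 332 m/s speed of sound and 3 cm spacing given for FIG. 6; taking those two values as also applying here is an assumption. The last line is the largest inter-microphone phase difference available under a far-field plane-wave model.

```latex
\lambda = \frac{c}{f} = \frac{332\ \mathrm{m/s}}{100\ \mathrm{Hz}} \approx 3.3\ \mathrm{m},
\qquad
\frac{d}{\lambda} = \frac{0.03\ \mathrm{m}}{3.32\ \mathrm{m}} \approx 0.009,
\qquad
\Delta\varphi_{\max} = \frac{2\pi d}{\lambda} \approx 0.057\ \mathrm{rad}
```

At 100 Hz the whole range of arrival directions therefore maps onto a phase difference of only a few degrees, which is why small spectral errors translate into large direction errors.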

第３の実施形態の信号処理装置３００の構成は、フィルタ決定手段１０４がフィルタ決定手段３０４に替わること以外は、第１の実施形態の信号処理装置１００の構成と同じである。 The configuration of the signal processing device 300 of the third embodiment is the same as the configuration of the signal processing device 100 of the first embodiment except that the filter determining means 104 is replaced with the filter determining means 304.

（Ｃ−２）第３の実施形態の動作 (C-2) Operation of Third Embodiment
次に、以上のような構成を有する第３の実施形態の信号処理装置３００の動作（実施形態の信号処理方法）を説明する。 Next, the operation of the signal processing device 300 of the third embodiment having the above configuration (the signal processing method of the embodiment) will be described.

第３の実施形態の信号処理装置３００の動作は、フィルタ決定手段３０４の動作がフィルタ決定手段１０４とは異なる点以外は、第1の実施形態の信号処理装置１００の動作と同じである。 The operation of the signal processing device 300 of the third embodiment is the same as the operation of the signal processing device 100 of the first embodiment except that the operation of the filter determining means 304 is different from that of the filter determining means 104.

フィルタ決定手段３０４は、ＤＯＡ特徴量Ｆに基づいて所定の広義単調増加関数によって強調フィルタＧを算出し、乗算手段１０５に与える。第３の実施形態では、周波数ごとに異なる広義単調増加関数を用いる。ここでは、ｋ番目の周波数のＤＯＡ特徴量をＦ（ｋ）、ｋ番目の周波数の強調ゲインをＧ（ｋ）と書く。ｋ番目の周波数をｆ_ｋとして、到来方向と周波数の番号をＤＯＡ特徴量に変換する関数を（１７）式で定義する。そして、所定のｋ番目の周波数の広義単調増加関数ｆｍａｐ_ｋ（Ｆ（ｋ））として、例えば（１８）式で定義される１つの到来方向閾値θ_０を持つ関数を用いることができる。

The filter determining means 304 calculates the emphasis filter G from the DOA feature amount F by a predetermined broad-sense monotonically increasing function and gives it to the multiplication means 105. In the third embodiment, a different broad-sense monotonically increasing function is used for each frequency. Here, the DOA feature amount of the k-th frequency is written F(k), and the emphasis gain of the k-th frequency is written G(k). Letting f_k be the k-th frequency, a function that converts an arrival direction and a frequency number into a DOA feature amount is defined by Eq. (17). Then, as the broad-sense monotonically increasing function fmap_k(F(k)) for the k-th frequency, a function with one arrival direction threshold θ₀ defined by Eq. (18) can be used, for example.

図１０は、第２の実施形態に係るフィルタ決定手段２０４で得られる強調フィルタＧと、第３の実施形態に係るフィルタ決定手段３０４で得られる強調フィルタＧとの比較について示したグラフである。 FIG. 10 is a graph showing a comparison between the emphasis filter G obtained by the filter determination means 204 according to the second embodiment and the emphasis filter G obtained by the filter determination means 304 according to the third embodiment.

図１０（ａ）は、第２の実施形態におけるフィルタ決定手段２０４で上述の（１３）式及び（１４）式を用いて得られる強調フィルタＧについて示している。また、図１０（ｂ）は、第３の実施形態に係るフィルタ決定手段３０４で（１７）式及び（１８）式により得られる強調フィルタＧの例を示している。図１０（ａ）、図１０（ｂ）では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの関係を示したグラフとなっている。図１０（ａ）、図１０（ｂ）では、横軸を音源の到来方向θとし縦軸を強調フィルタＧの値（到来方向θに応じた値）としている。 FIG. 10A shows the emphasis filter G obtained by using the above equations (13) and (14) in the filter determining means 204 in the second embodiment. Further, FIG. 10B shows an example of the emphasis filter G obtained by the equations (17) and (18) in the filter determining means 304 according to the third embodiment. 10 (a) and 10 (b) are graphs showing the relationship between the arrival direction θ and the emphasis filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. In FIGS. 10A and 10B, the horizontal axis is the sound source arrival direction θ, and the vertical axis is the value of the emphasis filter G (value corresponding to the arrival direction θ).

ここでは、θ_０＝５、Ｆ_０＝０．９７とした。図１０より、到来方向と周波数の番号をＤＯＡ特徴量に変換する関数Φ（ファイ）の上限値をＦ_０としたことで、低い周波数の抑圧しない到来方向の範囲が広くなったことが確認できる。 Here, θ₀ = 5 and F₀ = 0.97. FIG. 10 confirms that setting the upper limit of the function Φ (phi), which converts an arrival direction and a frequency number into a DOA feature amount, to F₀ widens the unsuppressed range of arrival directions at low frequencies.
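
The upper limit described above can be sketched as a cap on the converted threshold. Equation (17) is not reproduced in this text, so the callable `phi(theta_deg, k)` standing in for the uncapped conversion function and the name `phi_capped` are hypothetical; only the capping at F₀ is taken from the description.

```python
def phi_capped(phi, theta_deg, F0, k):
    """Conversion function with its value limited to at most F0."""
    # In bands where the feature barely depends on direction, phi stays close to 1;
    # the cap lowers the resulting threshold there so that fewer directions are suppressed.
    return min(phi(theta_deg, k), F0)
```

In bands where the uncapped value is already below F₀ the cap has no effect, so the behaviour of the second embodiment is kept at higher frequencies.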

（Ｃ−３）第３の実施形態の効果 (C-3) Effect of Third Embodiment
第３の実施形態によれば、第１、第２の実施形態の効果に加えて、以下のような効果を奏することができる。 According to the third embodiment, the following effect can be obtained in addition to the effects of the first and second embodiments.

第３の実施形態の信号処理装置３００では、数値計算によって得られる到来方向に関する情報が曖昧となる低い周波数において、抑圧しない到来方向の範囲を広めに確保できるので、低い周波数の目的音の歪みが軽減され、より少ない歪みで目的音を強調できるという効果を奏する。 In the signal processing device 300 of the third embodiment, at a low frequency where the information about the arrival direction obtained by the numerical calculation is ambiguous, a wide range of the arrival direction that is not suppressed can be secured, so that the distortion of the target sound at the low frequency is distorted. It is reduced and has the effect of emphasizing the target sound with less distortion.

（Ｄ）第４の実施形態 (D) Fourth Embodiment
以下、本発明による信号処理装置、プログラム及び方法の第４の実施形態を、図面を参照しながら詳述する。 Hereinafter, a fourth embodiment of the signal processing device, program, and method according to the present invention will be described in detail with reference to the drawings.

（Ｄ−１）第４の実施形態の構成 (D-1) Configuration of Fourth Embodiment
第４の実施形態の信号処理装置４００も、第１〜第３の実施形態と同様に図２に示すような環境で利用されるものとして説明する。 The signal processing device 400 of the fourth embodiment is also described as being used in the environment shown in FIG. 2, as in the first to third embodiments.

また、第４の実施形態の信号処理装置４００の内部構成についても、上述の図１を用いて示すことができる。 Further, the internal configuration of the signal processing device 400 of the fourth embodiment can also be shown with reference to FIG. 1 described above.

以下では、第４の実施形態の信号処理装置４００について、第１〜第３の実施形態との差異を説明する。 Hereinafter, the difference between the signal processing device 400 of the fourth embodiment and the first to third embodiments will be described.

第１〜第３の実施形態では、自動車Ａ内において運転手Ｕ１の正面に２つのマイクＭ１、Ｍ２をセットする場合を想定して、助手席側（助手Ｕ２側）だけを抑圧する強調フィルタＧを設計した。これに対して、第４の実施形態では、本発明におけるＤＯＡ特徴量を用いて正面方向のみを強調する（抑圧しない）強調フィルタを適用するものとする。 In the first to third embodiments, the emphasis filter G that suppresses only the passenger side (passenger U2 side) is assumed in the case where the two microphones M1 and M2 are set in front of the driver U1 in the automobile A. Designed. On the other hand, in the fourth embodiment, the emphasis filter that emphasizes (does not suppress) only the front direction by using the DOA feature amount in the present invention is applied.

図１に示すように、第４の実施形態の信号処理装置４００の構成は、特徴量算出手段１０３とフィルタ決定手段１０４がそれぞれ特徴量算出手段４０３とフィルタ決定手段４０４に替わること以外は、第１の実施形態の信号処理装置１００の構成と同じである。 As shown in FIG. 1, the configuration of the signal processing device 400 of the fourth embodiment is the first except that the feature amount calculating means 103 and the filter determining means 104 are replaced with the feature amount calculating means 403 and the filter determining means 404, respectively. It is the same as the configuration of the signal processing device 100 of the first embodiment.

（Ｄ−２）第４の実施形態の動作 (D-2) Operation of Fourth Embodiment
次に、以上のような構成を有する第４の実施形態の信号処理装置４００の動作（実施形態の信号処理方法）を説明する。 Next, the operation of the signal processing device 400 of the fourth embodiment having the above configuration (the signal processing method of the embodiment) will be described.

第４の実施形態の信号処理装置４００の動作は、特徴量算出手段４０３とフィルタ決定手段４０４の動作が特徴量算出手段１０３とフィルタ決定手段１０４とは異なる点以外は、第１の実施形態の信号処理装置１００の動作と同じである。 The operation of the signal processing device 400 of the fourth embodiment is the same as that of the signal processing device 100 of the first embodiment, except that the feature amount calculating means 403 and the filter determining means 404 operate differently from the feature amount calculating means 103 and the filter determining means 104.

特徴量算出手段４０３は、第１の入力スペクトルＸ_１と第２の入力スペクトルＸ_２とに基づいて（１９）式によって２つのＤＯＡ特徴量ＦとＦ’を算出し、フィルタ決定手段４０４に与える。２つのＤＯＡ特徴量を音の到来方向θに関して整理すると、（２０）式となる。 The feature amount calculation means 403 calculates two DOA feature amounts F and F'by the equation (19) based on the first input spectrum X ₁ and the second input spectrum X _2, and gives them to the filter determination means 404. .. When the two DOA features are arranged with respect to the sound arrival direction θ, it becomes Eq. (20).

図１１は、音の到来方向θと（２０）式のＤＯＡ特徴量Ｆ’との関係について示したグラフである。 FIG. 11 is a graph showing the relationship between the sound arrival direction θ and the DOA feature amount F'in equation (20).

図１１では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θとＤＯＡ特徴量Ｆ’の関係を示したグラフとなっている。図１１では、横軸を音源の到来方向θとし縦軸をＤＯＡ特徴量Ｆ’としている。 FIG. 11 is a graph showing the relationship between the arrival direction θ and the DOA feature amount F'when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. In FIG. 11, the horizontal axis is the sound source arrival direction θ, and the vertical axis is the DOA feature amount F'.

図１１を見ると、ＤＯＡ特徴量Ｆ（図５）とはちょうど左右が反転していることが確認できる。

Looking at FIG. 11, it can be confirmed that the left and right sides are exactly reversed from the DOA feature amount F (FIG. 5).

フィルタ決定手段４０４は、２つのＤＯＡ特徴量ＦとＦ’に基づいて所定の広義単調増加関数によって強調フィルタＧを算出し、乗算手段１０５に与える。所定の広義単調増加関数には、第１の実施形態に係るｆｍａｐ（Ｆ）、第２の実施形態に係るΦ（φ，ｋ）とｆｍａｐ_ｋ（Ｆ（ｋ））、第３の実施形態に係るΦ（φ，Ｆ_０，ｋ）とｆｍａｐ_ｋ（Ｆ（ｋ））のいずれを用いても良いが、ここでは一例として、第２の実施形態の所定の広義単調増加関数を用いて説明する。第４の実施形態において、強調フィルタＧは（２１）式を用いて算出される。 The filter determining means 404 calculates the emphasis filter G by a predetermined broad-sense monotonous increasing function based on the two DOA features F and F', and gives it to the multiplication means 105. The predetermined broadly defined monotonous increasing function includes fmap (F) according to the first embodiment, Φ (φ, k) and fmap _k (F (k)) according to the second embodiment, and the third embodiment. Either Φ (φ, F ₀ , k) or fmap _k (F (k)) may be used, but here, as an example, a predetermined broad-sense monotonic increase function of the second embodiment will be used. .. In the fourth embodiment, the emphasis filter G is calculated using the equation (21).

図１２は、第４の実施形態に係るフィルタ決定手段４０４で得られる強調フィルタＧの例について示した説明図である。 FIG. 12 is an explanatory diagram showing an example of the emphasis filter G obtained by the filter determining means 404 according to the fourth embodiment.

図１２（ａ）、図１２（ｂ）、図１２（ｃ）は、それぞれｆｍａｐ_ｋ（Ｆ（ｋ））として（１４）式、（１５）式、（１６）式を用いた場合に得られる強調フィルタＧの例を示している。図１２（ａ）、図１２（ｂ）、図１２（ｃ）では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの関係を示したグラフとなっている。図１２（ａ）、図１２（ｂ）、図１２（ｃ）では、横軸を音源の到来方向θとし縦軸を強調フィルタＧの値（到来方向θに応じた値）としている。 12 (a), 12 (b), and 12 (c) are obtained when the formulas (14), (15), and (16) are used as the fmap _k (F (k)), respectively. An example of the emphasis filter G is shown. 12 (a), 12 (b), and 12 (c) are graphs showing the relationship between the arrival direction θ and the emphasis filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. There is. In FIGS. 12 (a), 12 (b), and 12 (c), the horizontal axis is the sound source arrival direction θ, and the vertical axis is the value of the emphasis filter G (value corresponding to the arrival direction θ).

ここでは、θ_０＝２０、θ_１＝１５、θ_２＝２５、Ｆ_ａ＝１２とした。図１２より、正面方向のみを強調する（抑圧しない）強調フィルタが得られていることが分かる。 Here, θ₀ = 20, θ₁ = 15, θ₂ = 25, and F_a = 12. FIG. 12 shows that an emphasis filter that emphasizes (does not suppress) only the front direction is obtained.
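
One way to picture how two mirrored features can yield a front-only filter is sketched below. Equation (21) is not reproduced in this text, so the combination rule is an assumption: the gain is taken as the smaller of the two mapped values (a product would behave similarly). The name `front_only_gain` is hypothetical, and `fmap` is any of the broad-sense monotonically increasing maps discussed earlier.

```python
def front_only_gain(F, F_mirror, fmap):
    """Gain that stays high only when both F and its mirrored counterpart F' are high."""
    # fmap(F) drops on the second-microphone side, and fmap(F_mirror) drops on the
    # first-microphone side; the minimum keeps only the directions that pass both.
    return min(fmap(F), fmap(F_mirror))
```

With a step map, a source in front (both features near 1) passes, while a source on either side fails one of the two tests and is suppressed.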

（Ｄ−３）第４の実施形態の効果 (D-3) Effect of Fourth Embodiment
第４の実施形態によれば、第１〜第３の実施形態と比較して以下のような効果を奏することができる。 According to the fourth embodiment, the following effects can be obtained as compared with the first to third embodiments.

第４の実施形態の信号処理装置４００では、強調フィルタＧ（強調ゲイン）が抑圧しない到来方向の範囲を正面方向に限定した目的音を強調できるという特有の効果を奏することができる。 In the signal processing device 400 of the fourth embodiment, it is possible to achieve a unique effect that the target sound can be emphasized by limiting the range of the arrival direction that the emphasis filter G (emphasis gain) does not suppress to the front direction.

(E) Other Embodiments
The present invention is not limited to the embodiments described above, and modified embodiments such as the following are also possible.

(E-1) In the above embodiments, the signal processing device is described as restoring the waveform of the emphasis spectrum Y and outputting the emphasized voice y, but it may instead output the emphasis spectrum Y without restoring the waveform. Alternatively, both the emphasis spectrum Y and the emphasized voice y may be output. In that case, the waveform restoration means 106 may be omitted.
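The output options in (E-1) can be sketched as follows. This is only an assumed illustration for a single analysis frame: the function name `emit_output` and its flags are hypothetical, and a real device would restore the waveform frame by frame (for example with overlap-add), which the sketch omits.

```python
import numpy as np


def emit_output(Y, frame_len, want_spectrum=True, want_waveform=True):
    """Return the emphasis spectrum, the restored waveform, or both.

    Y is assumed to be the one-sided spectrum of one real-valued frame.
    """
    out = {}
    if want_spectrum:
        out["Y"] = Y
    if want_waveform:
        # Stand-in for waveform restoration: inverse real FFT of the frame.
        out["y"] = np.fft.irfft(Y, n=frame_len)
    return out


# Toy frame; an all-pass gain is assumed, so Y equals the input spectrum.
frame = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0])
Y = np.fft.rfft(frame)

both = emit_output(Y, frame.size)
spectrum_only = emit_output(Y, frame.size, want_waveform=False)
```

When only the spectrum is requested, the inverse transform is never run. This matches the case in which the waveform restoration step is left out.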

100: signal processing device; 101: first frequency analysis means; 102: second frequency analysis means; 103: feature amount calculation means; 104: filter determining means; 105: multiplication means; 106: waveform restoration means; M1: first microphone (first sound collecting device); M2: second microphone (second sound collecting device).

Claims

A signal processing device comprising:
a first frequency analysis means for obtaining a first input spectrum by frequency analysis of a first input signal input from a first sound collecting device;
a second frequency analysis means for obtaining a second input spectrum by frequency analysis of a second input signal input from a second sound collecting device;
a feature amount calculation means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature amount that takes large values for the front direction, which is perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, and for directions on the first sound collecting device side of the front direction, and takes small values for directions on the second sound collecting device side;
a filter determining means for obtaining an emphasis filter by mapping the first feature amount calculated by the feature amount calculation means with a predetermined broad-sense monotonically increasing function; and
a multiplication means for obtaining an emphasis spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the emphasis filter obtained by the filter determining means.

The signal processing device according to claim 1, wherein the first feature amount has a peak in a direction on the first sound collecting device side of the front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, and its value decreases as the direction tilts from the peak direction toward the second sound collecting device.

The signal processing device according to claim 1 or 2, wherein the feature amount calculation means calculates the first feature amount using a second feature amount that takes a large value for the front direction and a third feature amount that takes a large value for directions closer to the second sound collecting device than the front direction.

The signal processing device according to any one of claims 1 to 3, wherein the filter determining means maps the first feature amount using a broad-sense monotonically increasing function that differs for each frequency, and obtains the emphasis filter for each frequency.

The signal processing device according to claim 4, wherein the filter determining means sets the broad-sense monotonically increasing function for each frequency so that the range of arrival directions not suppressed by the emphasis filter matches across frequencies.

The signal processing device according to claim 4, wherein the filter determining means sets the broad-sense monotonically increasing function so that, in a low frequency band below a predetermined frequency, the emphasis filter has a wider range of unsuppressed arrival directions than in a high frequency band above the predetermined frequency.

The signal processing device according to claim 6, wherein the filter determining means sets the broad-sense monotonically increasing function so that, in the low frequency band, the range of unsuppressed arrival directions of the emphasis filter is widened toward the second sound collecting device relative to the high frequency band.

The signal processing device according to any one of claims 1 to 3, wherein the filter determining means uses the first feature amount to set a broad-sense monotonically increasing function that yields an emphasis filter emphasizing only the front direction.

A signal processing program causing a computer to function as:
a first frequency analysis means for obtaining a first input spectrum by frequency analysis of a first input signal input from a first sound collecting device;
a second frequency analysis means for obtaining a second input spectrum by frequency analysis of a second input signal input from a second sound collecting device;
a feature amount calculation means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature amount that takes large values for the front direction, which is perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, and for directions on the first sound collecting device side of the front direction, and takes small values for directions on the second sound collecting device side;
a filter determining means for obtaining an emphasis filter by mapping the first feature amount calculated by the feature amount calculation means with a predetermined broad-sense monotonically increasing function;
a multiplication means for obtaining an emphasis spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the emphasis filter obtained by the filter determining means; and
a waveform restoration means for receiving the emphasis spectrum obtained by the multiplication means and restoring a signal waveform to obtain an emphasized voice.

A signal processing method performed by a device having a first frequency analysis means, a second frequency analysis means, a feature amount calculation means, a filter determining means, and a multiplication means, wherein:
the first frequency analysis means obtains a first input spectrum by frequency analysis of a first input signal input from a first sound collecting device;
the second frequency analysis means obtains a second input spectrum by frequency analysis of a second input signal input from a second sound collecting device;
the feature amount calculation means calculates, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature amount that takes large values for the front direction, which is perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, and for directions on the first sound collecting device side of the front direction, and takes small values for directions on the second sound collecting device side;
the filter determining means obtains an emphasis filter by mapping the first feature amount calculated by the feature amount calculation means with a predetermined broad-sense monotonically increasing function; and
the multiplication means obtains an emphasis spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the emphasis filter obtained by the filter determining means.