JP2018136509A

JP2018136509A - Signal processing apparatus, program, and method

Info

Publication number: JP2018136509A
Application number: JP2017032567A
Authority: JP
Inventors: Masaru Fujieda (藤枝 大)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2017-02-23
Filing date: 2017-02-23
Publication date: 2018-08-30
Anticipated expiration: 2037-02-23
Also published as: JP6772890B2

Abstract

PROBLEM TO BE SOLVED: To enhance a target sound at lower computational cost and with less distortion. SOLUTION: This invention relates to a signal processing apparatus. The signal processing apparatus of this invention includes: means for performing frequency analysis on a first input signal input from a first sound collecting device to obtain a first input spectrum; means for performing frequency analysis on a second input signal input from a second sound collecting device to obtain a second input spectrum; means for computing, based on the first and second input spectra, a first feature quantity that takes large values in the front direction and in directions on the first sound collecting device side and small values in directions on the second sound collecting device side; means for mapping the first feature quantity with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter; and multiplication means for multiplying the first input spectrum by the obtained enhancement filter to obtain an enhanced spectrum. SELECTED DRAWING: Figure 1

Description

The present invention relates to a signal processing apparatus, program, and method. It can be applied, for example, to communication terminals, audio equipment, and speech recognition devices that need to enhance and pick up sound sources located within a specific range of directions in an environment containing multiple sound sources.

Techniques for extracting a target sound source in an environment containing multiple sound sources include sound source separation using multiple microphones, and beamformers and null formers using a microphone array in which the microphones are arranged on a line, a plane, a sphere, or the like. In particular, when the sound sources other than the target are non-stationary, or when there are several of them, it is difficult to extract the target source with a noise suppressor that uses a single microphone, and two or more microphones are essential.

The microphone-array beamformer mentioned above is a technique for enhancing and picking up only the sound from a specific direction. A beamformer forms directivity by exploiting the differences in the time at which a signal reaches each microphone.

There are two types of beamformer: additive and subtractive. Compared with an additive beamformer, a subtractive beamformer has the advantage of forming sharp directivity with fewer microphones.

FIG. 13 is a block diagram showing the configuration of a subtractive beamformer with two microphones. The subtractive beamformer of FIG. 13 consists of a first microphone M1, a second microphone M2, first delay means 3, second delay means 4, and subtraction means 5. The first input signal picked up by the first microphone M1 is supplied to the first delay means 3, and the second input signal picked up by the second microphone M2 is supplied to the second delay means 4. When the interfering sound arrives from the first microphone M1 side, the first delay means 3 delays the first input signal so that the phases of the interfering sound contained in the first and second input signals are aligned. Conversely, when the interfering sound arrives from the second microphone M2 side, the second delay means 4 delays the second input signal to align the phases of the interfering sound. The first delayed signal obtained from the first delay means 3 and the second delayed signal obtained from the second delay means 4 are supplied to the subtraction means 5. The subtraction means 5 obtains enhanced speech by subtracting the second delayed signal from the first delayed signal.
As described above, the subtractive beamformer enhances the target sound by aligning the phases of the interfering sound contained in the first and second input signals and subtracting one from the other, which suppresses the interfering sound. The subtractive beamformer requires the direction of arrival of the interfering sound to be given in advance.
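As an illustration only, the delay-and-subtract operation described for FIG. 13 can be sketched in Python as follows. The function name, the restriction to whole-sample delays, and the zero padding at the start of a delayed signal are assumptions made for this sketch and are not taken from the specification.

```python
def subtractive_beamformer(x1, x2, delay1=0, delay2=0):
    """Delay each channel by a whole number of samples, then subtract.

    x1, x2         : sample lists from microphones M1 and M2 (same length)
    delay1, delay2 : hypothetical integer delays (in samples) applied by
                     delay means 3 and 4 so that the interfering sound is
                     time-aligned in both channels before subtraction
    """
    def delayed(x, n):
        # Shift the signal right by n samples, padding the start with zeros.
        n = min(max(n, 0), len(x))
        return [0.0] * n + list(x[:len(x) - n])

    d1 = delayed(x1, delay1)
    d2 = delayed(x2, delay2)
    # Subtraction means 5: first delayed signal minus second delayed signal.
    return [a - b for a, b in zip(d1, d2)]
```

For instance, if an interfering sound reaches M2 one sample before it reaches M1, calling the function with `delay2=1` aligns the two channels and the subtraction cancels that sound. If the source moves so that the true delay changes, the cancellation is no longer exact.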

However, the subtractive beamformer has a problem: if the interfering sound source moves even slightly, the suppression of the interfering sound degrades greatly.

FIG. 14 is an explanatory diagram showing an example in which a conventional signal processing apparatus Z is used to enhance the voice of the driver U1 inside an automobile (vehicle) A.

For example, in a car navigation system that can be operated by voice through speech recognition, as shown in FIG. 14, only the driver's voice needs to be extracted inside the automobile.

Therefore, when both the driver's seat and the passenger seat are occupied, the voice of the assistant U2 in the passenger seat (the interfering sound) must be suppressed. However, if the assistant U2 moves his or her face (the interfering sound source) back and forth or from side to side, the subtractive beamformer cannot suppress the interfering sound.

The minimum variance beamformer (MVB), a representative adaptive beamformer, can suppress interfering sound efficiently when the direction of arrival of the target sound is given in advance. The MVB suppresses interfering sound by minimizing the variance of the enhanced speech under the constraint that the gain in the direction of arrival of the target sound is 1.
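For reference, the constrained minimization performed by a minimum variance beamformer is commonly written as below. The symbols w (filter weights), R (spatial covariance matrix of the microphone signals), and a (steering vector toward the target direction) are notation assumed here; they do not appear in the specification.

```latex
\min_{\mathbf{w}} \ \mathbf{w}^{H} \mathbf{R}\, \mathbf{w}
\quad \text{subject to} \quad \mathbf{w}^{H} \mathbf{a} = 1,
\qquad
\mathbf{w}_{\mathrm{MVB}} = \frac{\mathbf{R}^{-1}\mathbf{a}}{\mathbf{a}^{H}\mathbf{R}^{-1}\mathbf{a}}
```

The constraint fixes the gain toward the target at 1, and minimizing the output variance then reduces whatever else is present in the signal.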

Strong directivity toward the direction of arrival of the target sound source can also be formed by using spectral subtraction. Non-Patent Document 1 assumes that the target sound source is always in front. First, a subtractive beamformer suppresses the target sound arriving from the front to obtain a target-sound-suppressed signal. Second, the amplitude spectrum of the target-sound-suppressed signal is subtracted from the amplitude spectrum of the first input signal (spectral subtraction) to obtain the amplitude spectrum of enhanced speech in which the target sound is emphasized. Third, the enhanced speech is obtained from the amplitude spectrum of the enhanced speech and the phase spectrum of the first input signal.
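The second and third steps of this spectral subtraction can be sketched per frequency bin as follows. This is a minimal illustration, not the implementation of Non-Patent Document 1: the function name is hypothetical, and clamping the subtracted amplitude at a floor of zero is an assumption (a common practice) that the text does not state.

```python
import cmath


def spectral_subtract(X1, N, floor=0.0):
    """Amplitude-domain spectral subtraction for one frame.

    X1    : complex spectrum of the first input signal
    N     : complex spectrum of the target-sound-suppressed signal
    floor : assumed lower bound on the subtracted amplitude
    """
    Y = []
    for x, n in zip(X1, N):
        # Step 2: subtract amplitude spectra, never going below the floor.
        mag = max(abs(x) - abs(n), floor)
        # Step 3: combine the new amplitude with the phase of the first input.
        Y.append(cmath.rect(mag, cmath.phase(x)))
    return Y
```

Because only amplitudes are subtracted, every component not arriving from the front is reduced, which is why reflections of the target sound are attenuated as well.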

Non-Patent Document 1: Takashi Yazu (矢頭隆), Makoto Morito (森戸誠), Kei Yamada (山田圭), Tetsuji Ogawa (小川哲司), "Sound Source Separation Technology Using Square Microphone Array", Information Processing (情報処理), Vol. 51, No. 11, 2010

However, the conventional techniques have the following problems.

FIG. 15 is an explanatory diagram illustrating the target sound and the interfering sound inside the automobile A.

The MVB can suppress only as many interfering sounds as the number of microphones minus one. Therefore, when the target sound is enhanced with two microphones as in FIG. 14, the interfering sound propagates as shown in FIG. 15(b), so the MVB can suppress the direct sound of the interfering sound but not its reflections, and the target sound cannot be enhanced sufficiently.

The technique described in Non-Patent Document 1 suppresses all sound arriving from directions other than the front, even sound that originates from the target. Therefore, when the target sound is enhanced with two microphones as in FIG. 14, the target sound propagates as shown in FIG. 15(a), so the technique of Non-Patent Document 1 also suppresses the reflections of the target sound, and the quality of the target sound is degraded.

Therefore, it is possible to provide a signal processing apparatus, program, and method that enhance a target sound at lower computational cost and with less distortion.

The signal processing apparatus according to the first aspect of the present invention comprises: (1) first frequency analysis means for performing frequency analysis on a first input signal input from a first sound collecting device to obtain a first input spectrum; (2) second frequency analysis means for performing frequency analysis on a second input signal input from a second sound collecting device to obtain a second input spectrum; (3) feature quantity calculation means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature quantity that, relative to the front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, takes large values in the front direction and in directions on the first sound collecting device side and small values in directions on the second sound collecting device side; (4) filter determination means for mapping the first feature quantity calculated by the feature quantity calculation means with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter; and (5) multiplication means for multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determination means to obtain an enhanced spectrum.

The signal processing program according to the second aspect of the present invention causes a computer to function as: (1) first frequency analysis means for performing frequency analysis on a first input signal input from a first sound collecting device to obtain a first input spectrum; (2) second frequency analysis means for performing frequency analysis on a second input signal input from a second sound collecting device to obtain a second input spectrum; (3) feature quantity calculation means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature quantity that, relative to the front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, takes large values in the front direction and in directions on the first sound collecting device side and small values in directions on the second sound collecting device side; (4) filter determination means for mapping the first feature quantity calculated by the feature quantity calculation means with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter; (5) multiplication means for multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determination means to obtain an enhanced spectrum; and (6) waveform restoration means for restoring a signal waveform from the enhanced spectrum obtained by the multiplication means to obtain enhanced speech.

The signal processing method according to the third aspect of the present invention is a signal processing method in which: (1) first frequency analysis means, second frequency analysis means, feature quantity calculation means, filter determination means, and multiplication means are provided; (2) the first frequency analysis means performs frequency analysis on a first input signal input from a first sound collecting device to obtain a first input spectrum; (3) the second frequency analysis means performs frequency analysis on a second input signal input from a second sound collecting device to obtain a second input spectrum; (4) the feature quantity calculation means calculates, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature quantity that, relative to the front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, takes large values in the front direction and in directions on the first sound collecting device side and small values in directions on the second sound collecting device side; (5) the filter determination means maps the first feature quantity calculated by the feature quantity calculation means with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter; and (6) the multiplication means multiplies the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determination means to obtain an enhanced spectrum.

According to the present invention, it is possible to provide a signal processing apparatus, program, and method that enhance a target sound at lower computational cost and with less distortion.

FIG. 1 is a block diagram showing the functional configuration of the signal processing apparatus according to the first embodiment.
FIG. 2 is an explanatory diagram showing an example of the environment in which the signal processing apparatus according to the first embodiment is used.
FIG. 3 shows an example of the feature quantity F_center processed by the signal processing apparatus according to the first embodiment.
FIG. 4 shows an example of the feature quantity F_side processed by the signal processing apparatus according to the first embodiment.
FIG. 5 is a graph showing the relationship between the direction of arrival θ of sound and the DOA feature quantity F processed by the signal processing apparatus according to the first embodiment.
FIG. 6 is a graph showing an example of the broad-sense monotonically increasing function processed by the signal processing apparatus according to the first embodiment.
FIG. 7 is a graph showing an example of the enhancement filter used in the signal processing apparatus according to the first embodiment.
FIG. 8 is a graph showing an example of the enhancement filter G obtained by the filter determination means according to the first embodiment.
FIG. 9 is a graph showing an example of the enhancement filter G obtained by the filter determination means according to the second embodiment.
FIG. 10 is a graph comparing the enhancement filter G obtained by the filter determination means according to the second embodiment with the enhancement filter G obtained by the filter determination means according to the third embodiment.
FIG. 11 is a graph showing the relationship between the direction of arrival θ of sound and the DOA feature quantity F' processed by the signal processing apparatus according to the fourth embodiment.
FIG. 12 is an explanatory diagram showing an example of the enhancement filter G obtained by the filter determination means 404 according to the fourth embodiment.
FIG. 13 is a block diagram showing the configuration of a conventional subtractive beamformer with two microphones.
FIG. 14 is an explanatory diagram showing an example in which a conventional signal processing apparatus is used to enhance the voice of a driver inside an automobile.
FIG. 15 is an explanatory diagram illustrating the target sound and the interfering sound inside an automobile.

(A) First Embodiment
A first embodiment of the signal processing apparatus, program, and method according to the present invention is described in detail below with reference to the drawings.

(A-1) Configuration of the First Embodiment
FIG. 2 is an explanatory diagram showing the environment in which the signal processing apparatus 100 according to the first embodiment is used. In FIG. 2, the reference numerals in parentheses are those used in the second to fourth embodiments described later.

FIG. 2 shows an example in which the signal processing apparatus 100 according to the first embodiment enhances the voice of the driver U1 inside the automobile A. Inside the automobile A, the driver U1 sits in the driver's seat and the assistant U2 sits in the passenger seat. A first microphone M1 and a second microphone M2, which form a microphone array, are placed in front of the driver U1 (in front of the driver's seat). Seen from the driver U1, the first microphone M1 is on the left (the side away from the assistant U2) and the second microphone M2 is on the right (the side of the assistant U2).

FIG. 1 is a block diagram showing the functional configuration of the signal processing apparatus 100 according to the first embodiment.

The signal processing apparatus 100 of the first embodiment has first frequency analysis means 101, second frequency analysis means 102, feature quantity calculation means 103, filter determination means 104, multiplication means 105, and waveform restoration means 106.

The signal processing apparatus 100 may be implemented partly or entirely in software. For example, it may be built by installing a program (including the signal processing program according to the embodiment) on a computer having a memory and a processor.

The first frequency analysis means 101 performs frequency analysis on the first input signal x1 to obtain the first input spectrum X1.

The second frequency analysis means 102 performs frequency analysis on the second input signal x2 to obtain the second input spectrum X2.

The feature quantity calculation means 103 obtains a predetermined feature quantity (hereinafter, the "DOA feature quantity F") from the first input spectrum X1 and the second input spectrum X2. The DOA feature quantity F varies with the direction of arrival of the target sound; details are given later.

The feature quantity calculation means 103 can obtain the DOA feature quantity F by equation (1) or by an expression obtained by transforming equation (1).

In equation (1), at a given time and a given frequency, X_1 denotes the first input spectrum, X_2 the second input spectrum, and X_2^* the complex conjugate of the second input spectrum.

The filter determination means 104 maps the DOA feature quantity F with a predetermined broad-sense monotonically increasing function to obtain the enhancement filter G.

The multiplication means 105 multiplies the first input spectrum X1 by the enhancement filter G to obtain the enhanced spectrum Y.

The waveform restoration means 106 restores a signal waveform from the enhanced spectrum Y to obtain the enhanced speech y.
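The chain formed by means 101 to 106 can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions: the feature calculation and the mapping function are passed in as hypothetical callables `feature` and `fmap`, a naive DFT stands in for the unspecified frequency analysis, and windowing and overlap between frames are omitted.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (stand-in for frequency analysis)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT returning the real part (stand-in for means 106)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def enhance_frame(x1, x2, feature, fmap):
    """Process one frame: spectra -> feature F -> filter G -> Y = G * X1 -> y."""
    n = len(x1)
    X1 = dft(x1)                      # first frequency analysis means 101
    X2 = dft(x2)                      # second frequency analysis means 102
    Y = [0j] * n
    for k in range(n // 2 + 1):       # non-negative-frequency bins only
        F = feature(X1[k], X2[k])     # feature quantity calculation means 103
        G = fmap(F)                   # filter determination means 104
        Y[k] = G * X1[k]              # multiplication means 105
        if 0 < k < n - k:
            # Mirror to the negative-frequency bin so the output stays real.
            Y[n - k] = Y[k].conjugate()
    return idft(Y)                    # waveform restoration means 106
```

With a mapping that always returns 1, the frame of the first microphone passes through unchanged, which is a convenient sanity check before a real feature and mapping are plugged in.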

Next, the design concept behind the DOA feature quantity obtained by the feature quantity calculation means 103 and the enhancement filter obtained by the filter determination means 104 is described.

The enhancement filter must suppress the direct and reflected sound of the interfering sound arriving from the second microphone M2 side (the interfering-sound side), while not suppressing the direct and reflected sound of the target sound arriving from the first microphone M1 side (the target-sound side, which includes the front direction). The DOA feature quantity should therefore take a large value when sound arrives from the first microphone M1 side and a small value when it arrives from the second microphone M2 side. However, because the first microphone M1 side includes the front direction, such a characteristic is not symmetric with respect to the direction of arrival, and no known feature quantity has it.

Consider, then, a feature quantity that takes large values for the front direction and a feature quantity that takes large values for the second microphone M2 side. At a given time and a given frequency, let X_1 be the first input spectrum, X_2 the second input spectrum, and X_2^* the complex conjugate of the second input spectrum, and consider, for example, the feature quantity F_center given by equation (1-1) and the feature quantity F_side given by equation (1-2).

Here, let the front direction (the direction perpendicular to the straight line connecting the positions of the two microphones) be 0 degrees, and let the direction toward the second microphone M2 (the direction of the second microphone M2 as seen from the first microphone M1) be +90 degrees. Let S be the spectrum of the sound source, ω the angular frequency, d the spacing between the two microphones, θ (theta) the direction of arrival of the sound, and c the speed of sound. Then X_1 and X_2 can be written as equations (2) and (3), respectively. Substituting equations (2) and (3) into equations (1-1) and (1-2) yields equations (3-1) and (3-2), respectively. FIGS. 3 and 4 show how the feature quantities F_center and F_side given by equations (3-1) and (3-2) vary with the direction of arrival θ.

FIG. 3 shows an example of the feature quantity F_center.

In FIG. 3, the horizontal axis is the direction of arrival θ of the sound source and the vertical axis is the feature quantity F_center. The graph shows the relationship between θ and F_center for source frequencies of 1 kHz, 2 kHz, and 4 kHz.

FIG. 4 shows an example of the feature quantity F_side.

In FIG. 4, the horizontal axis is the direction of arrival θ of the sound source and the vertical axis is the feature quantity F_side. The graph shows the relationship between θ and F_side for source frequencies of 1 kHz, 2 kHz, and 4 kHz.

FIGS. 3 and 4 show that F_center takes large values for a direction of arrival of 0 degrees, and that F_side, relative to 0 degrees, takes large values on the second microphone M2 side and small values on the first microphone M1 side.

As described above, the DOA feature quantity F is obtained using F_center, which takes large values in the direction of the target sound (0 degrees; the front direction), and F_side, which takes large values on the second microphone M2 side (the microphone on the side of the assistant U2, who is the source of the interfering sound).

Next, the relationship between the direction of arrival of sound and the DOA feature quantity is described.

The DOA feature quantity is defined by equation (3-3).

Substituting equations (2) and (3) into equation (3-3) yields equation (4), and rearranging it yields equation (5). FIG. 5 shows the relationship between the direction of arrival θ and the DOA feature quantity F given by equation (5).

FIG. 5 is a graph showing the relationship between the direction of arrival θ of sound and the DOA feature quantity F.

In FIG. 5, the horizontal axis is the direction of arrival θ of the sound source and the vertical axis is the DOA feature quantity F. The graph shows the relationship between θ and F for source frequencies of 1 kHz, 2 kHz, and 4 kHz.

FIG. 5 shows that the DOA feature quantity F is always F = 1 in the front direction and always F < 1 on the second microphone M2 side (the interfering-sound side). On the first microphone M1 side, F > 1 at low frequencies and, at high frequencies, for θ close to 0 degrees, whereas F < 1 at high frequencies for directions approaching 90 degrees from the front.

As described above, the DOA feature quantity F takes large values when sound arrives from the first microphone M1 side and small values when it arrives from the second microphone M2 side. In other words, F has a peak in a direction between the front and the first microphone M1 side (the side opposite to the interfering sound from the assistant U2), and its value decreases as the direction tilts from that peak toward the second microphone M2 side.
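The behavior described for FIG. 5 can be checked numerically with a stand-in feature. Equations (1) to (5) are not reproduced in this text, so everything in the sketch below is an assumption made for illustration: the plane-wave model (a wavefront from θ > 0 reaches M2 first, so X_1 lags and X_2 leads), the sign convention, and the feature itself, which is the sum of the real and imaginary parts of X_1 X_2^* divided by its magnitude. It is one function with the stated properties, not necessarily the feature defined in the specification.

```python
import cmath
import math


def plane_wave_spectra(freq_hz, theta_deg, d=0.03, c=332.0, S=1 + 0j):
    """Assumed plane-wave model for the two microphone spectra.

    theta_deg > 0 points toward M2. d = 3 cm and c = 332 m/s are the values
    quoted for FIG. 6; the half-delay split between channels is an assumption.
    """
    tau = d * math.sin(math.radians(theta_deg)) / c   # inter-microphone delay
    w = 2 * math.pi * freq_hz
    X1 = S * cmath.exp(-1j * w * tau / 2)              # M1 receives later
    X2 = S * cmath.exp(+1j * w * tau / 2)              # M2 receives earlier
    return X1, X2


def doa_feature(X1, X2):
    """Hypothetical stand-in for the DOA feature quantity F.

    Equals cos(phi) - sin(phi) with phi = w * tau under the model above.
    Undefined when either spectrum is zero.
    """
    cross = X1 * X2.conjugate()
    return (cross.real + cross.imag) / abs(cross)
```

Under these assumptions the stand-in gives F = 1 at 0 degrees, F < 1 for every direction on the M2 side at 1, 2, and 4 kHz, F > 1 at 1 kHz on the M1 side, and F < 1 at 4 kHz close to the M1 end-fire direction, which is the pattern described above.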

The enhancement filter is obtained by mapping the DOA feature amount with a predetermined broad-sense monotone increasing (that is, non-decreasing) function.

図６は、広義単調増加関数ｆｍａｐ（Ｆ）の例について示したグラフである。 FIG. 6 is a graph showing an example of the broad-sense monotone increasing function fmap (F).

図６では、横軸をＤＯＡ特徴量Ｆの値とし縦軸を強調フィルタＧの値としている。 In FIG. 6, the horizontal axis represents the DOA feature value F and the vertical axis represents the enhancement filter G value.

As can be seen from FIG. 15, the enhancement filter should suppress sound arriving from the second microphone M2 side while leaving sound arriving from the front direction and from the first microphone M1 side unsuppressed. To this end, the broad-sense monotone increasing function fmap(F) is defined, for example, as in Equation (6). FIG. 6 shows an example of fmap(F) for a microphone spacing of 3 cm, a sound speed of 332 m/s, and F_0 = 0.9.

強調フィルタをＧ＝ｆｍａｐ（Ｆ）として得ると、音の到来方向θと強調フィルタＧとの関係は図７のようになる。 When the enhancement filter is obtained as G = fmap (F), the relationship between the sound arrival direction θ and the enhancement filter G is as shown in FIG.

図７は、強調フィルタの例について示したグラフである。 FIG. 7 is a graph showing an example of the enhancement filter.

図７では、横軸を音の到来方向θの値とし縦軸を強調フィルタＧの値としている。図７では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの値の関係を示したグラフとなっている。 In FIG. 7, the horizontal axis is the value of the sound arrival direction θ, and the vertical axis is the value of the enhancement filter G. FIG. 7 is a graph showing the relationship between the direction of arrival θ and the value of the enhancement filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz.

That is, the enhancement filter G is set to 1 when the DOA feature amount F exceeds a value slightly smaller than 1, and is made smaller than 1 otherwise. This gives the enhancement filter the desired characteristic: it suppresses the direct sound and reflected sound of the interference sound but does not suppress the direct sound and reflected sound of the target sound.

An enhancement filter similar to that of the present invention can also be obtained by, for example, calculating the arrival direction θ for each frequency from the first input spectrum and the second input spectrum, but this incurs the computational cost of evaluating the arctangent function (written atan, arctan, tan^-1, and so on). The present invention is therefore superior in terms of computational cost.

(A-2) Operation of the First Embodiment
Next, the operation of the signal processing apparatus 100 of the first embodiment having the above-described configuration (the signal processing method of the embodiment) will be described with reference to FIG. 1.

The signal processing apparatus 100 performs target sound enhancement on a first input signal x_1 and a second input signal x_2 (time-domain input signals) that contain the target sound source, and generates enhanced speech y (a time-domain output signal).

The first frequency analysis unit 101 and the second frequency analysis unit 102 divide the first input signal x_1 and the second input signal x_2, respectively, into K bands using an arbitrary frequency analysis method, typified by the Fourier transform, or an arbitrary band division method, typified by a filter bank, and thereby obtain a first input spectrum X_1 and a second input spectrum X_2. Hereinafter, the first and second input spectra are written X_1(k) and X_2(k) when the band number (for example, the k-th band) needs to be stated explicitly, and simply X_1 and X_2 otherwise. The first frequency analysis unit 101 supplies the obtained first input spectrum X_1 to the feature amount calculation unit 103 and the multiplication unit 105, and the second frequency analysis unit 102 supplies the obtained second input spectrum X_2 to the feature amount calculation unit 103. Although the input spectrum supplied to the multiplication unit 105 is the first input spectrum X_1 here, this is not limiting; the second input spectrum X_2 may be supplied to the multiplication unit 105 instead, with the same effect.

The feature amount calculation unit 103 calculates the DOA feature amount F by Equation (7) from the first input spectrum X_1 and the second input spectrum X_2, and supplies it to the filter determination unit 104. Equation (7) may be evaluated as is, but it contains redundant operations and may therefore be rearranged. Since X_1 and X_2 are complex numbers, rewriting them as in Equation (8), substituting into Equation (7), and simplifying yields Equation (9). Using Equation (9) instead of Equation (7) reduces the number of multiplications.
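The kind of saving described here can be illustrated on a generic example. Equations (7) to (9) are not reproduced in this text, so the sketch below does not implement them; it assumes, purely for illustration, that the quantity of interest is the sum of the real and imaginary parts of the cross-spectrum X_1 · conj(X_2), and shows how writing the spectra as real and imaginary parts lets that sum be computed with two real multiplications instead of the four needed by a full complex product. The function names are hypothetical.

```python
def re_plus_im_complex(x1, x2):
    """Reference: form the complex product, then add its real and imaginary parts."""
    cross = x1 * x2.conjugate()
    return cross.real + cross.imag


def re_plus_im_real(a1, b1, a2, b2):
    """Same value from real arithmetic, with x1 = a1 + j*b1 and x2 = a2 + j*b2.

    Re = a1*a2 + b1*b2 and Im = b1*a2 - a1*b2, so
    Re + Im = a1*(a2 - b2) + b1*(a2 + b2), which needs two multiplications.
    """
    return a1 * (a2 - b2) + b1 * (a2 + b2)


if __name__ == "__main__":
    x1, x2 = complex(0.3, -1.2), complex(-0.7, 0.4)
    print(re_plus_im_complex(x1, x2))
    print(re_plus_im_real(x1.real, x1.imag, x2.real, x2.imag))
```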

The filter determination unit 104 calculates the enhancement filter G from the DOA feature amount F using a predetermined broad-sense monotone increasing function, and supplies it to the multiplication unit 105. In the first embodiment, the same broad-sense monotone increasing function is used for all frequencies. As the predetermined broad-sense monotone increasing function fmap(F), one can use, for example, the function with one threshold F_0 defined by Equation (10), the function with two thresholds F_1 and F_2 defined by Equation (11), or the sigmoid function with scale F_3 and offset F_0 defined by Equation (12).

図８は、第１の実施形態に係るフィルタ決定手段１０４で得られる強調フィルタＧの例について示したグラフである。 FIG. 8 is a graph showing an example of the enhancement filter G obtained by the filter determination unit 104 according to the first embodiment.

FIGS. 8(a), 8(b), and 8(c) show examples of the enhancement filter G obtained by Equations (10), (11), and (12), respectively. In each panel, the horizontal axis is the sound arrival direction θ and the vertical axis is the value of the enhancement filter G for that direction, plotted for sound source frequencies of 1 kHz, 2 kHz, and 4 kHz.

ここでは、Ｆ_０＝０．８、Ｆ_１＝０．７、Ｆ_２＝０．９、Ｆ_３＝１２とした。妨害音の抑圧性能に関して、（１０）式と（１１）式との差はあまりない。一方、強調音声の歪みに関して、（１０）式で得られる強調フィルタＧは値を０か１しか持たないためにミュージカルノイズを発生しやすいが、（１１）式は遷移帯域があることで抑圧／非抑圧の切り替わりが緩やかになるためにミュージカルノイズが発生しにくい。（１２）式は（１１）式をさらに滑らかにした特性となっており、更なるミュージカルノイズ低減効果や歪みを減らす効果が期待できる。多少の演算コストの増加が許容されるのであれば、（１２）式を用いるのが好適である。 Here, F ₀ = 0.8, F ₁ = 0.7, F ₂ = 0.9, and F ₃ = 12. There is not much difference between the expression (10) and the expression (11) regarding the interference noise suppression performance. On the other hand, with respect to the distortion of the emphasized speech, the enhancement filter G obtained by the equation (10) has only a value of 0 or 1, and thus tends to generate musical noise. Since the switching of non-suppression is gentle, musical noise is unlikely to occur. The expression (12) is a characteristic obtained by further smoothing the expression (11), and further musical noise reduction effect and distortion reduction effect can be expected. If a slight increase in calculation cost is allowed, it is preferable to use the expression (12).
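The three mapping shapes compared above can be sketched as follows. Equations (10) to (12) are not reproduced in this text, so the function bodies are assumptions that follow only the verbal description: a step at one threshold, a transition band between two thresholds (taken here to be linear), and a logistic sigmoid with a scale and an offset. The names `fmap_step`, `fmap_ramp`, and `fmap_sigmoid` are hypothetical; the default parameters are the values quoted in the text.

```python
import math


def fmap_step(f, f0=0.8):
    """One threshold: the gain is either 0 or 1 (assumed form of Equation (10))."""
    return 1.0 if f >= f0 else 0.0


def fmap_ramp(f, f1=0.7, f2=0.9):
    """Two thresholds with a transition band (assumed linear, cf. Equation (11))."""
    if f <= f1:
        return 0.0
    if f >= f2:
        return 1.0
    return (f - f1) / (f2 - f1)


def fmap_sigmoid(f, f0=0.8, f3=12.0):
    """Logistic sigmoid with scale f3 and offset f0 (assumed form of Equation (12))."""
    return 1.0 / (1.0 + math.exp(-f3 * (f - f0)))


if __name__ == "__main__":
    for f in (0.6, 0.7, 0.8, 0.9, 1.0):
        print(f, fmap_step(f), round(fmap_ramp(f), 3), round(fmap_sigmoid(f), 3))
```

All three are non-decreasing in F; they differ only in how abruptly the gain moves between suppression and pass-through, which is the property the text links to musical noise.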

乗算手段１０５は、入力スペクトルＸ_１に周波数ごとに強調フィルタＧ（強調ゲイン）を乗じ、得られた強調スペクトルＹを波形復元手段１０６に与える。 The multiplication unit 105 multiplies the input spectrum X ₁ by an enhancement filter G (enhancement gain) for each frequency, and gives the obtained enhancement spectrum Y to the waveform restoration unit 106.

波形復元手段１０６は、第１の周波数解析手段１０１と第２の周波数解析手段１０２で用いた周波数解析手法または帯域分割手法に対応する波形復元手法を用いて、乗算手段１０５から与えられた強調スペクトルＹに基づいて信号波形を再構成し、得られた強調音声ｙ（強調信号）を出力する。 The waveform restoration unit 106 uses the waveform restoration method corresponding to the frequency analysis method or the band division method used in the first frequency analysis unit 101 and the second frequency analysis unit 102, and the enhanced spectrum given from the multiplication unit 105. The signal waveform is reconstructed based on Y, and the obtained enhanced speech y (enhanced signal) is output.
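The chain of frequency analysis, feature calculation, filter determination, multiplication, and waveform restoration can be sketched for a single frame as follows. This is a minimal illustration under several assumptions: a naive discrete Fourier transform stands in for the frequency analysis, the feature is a hypothetical stand-in (the sum of the real and imaginary parts of the normalized cross-spectrum) rather than Equation (7), and the mapping is a one-threshold step. Windowing and overlap-add are omitted, and all function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive DFT, standing in for the frequency analysis units."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Naive inverse DFT returning the real part (waveform restoration)."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def standin_feature(x1, x2, eps=1e-12):
    """Hypothetical stand-in feature; equals 1 when both spectra are in phase."""
    denom = abs(x1) * abs(x2)
    if denom < eps:
        return 1.0  # leave (near-)empty bins untouched
    cross = x1 * x2.conjugate()
    return (cross.real + cross.imag) / denom


def enhance_frame(frame1, frame2, f0=0.8):
    """Apply a per-bin gain to the first spectrum and restore the waveform."""
    n = len(frame1)
    spec1, spec2 = dft(frame1), dft(frame2)
    out = list(spec1)
    for k in range(n // 2 + 1):
        gain = 1.0 if standin_feature(spec1[k], spec2[k]) >= f0 else 0.0
        out[k] = gain * spec1[k]
        if 0 < k < n - k:
            # Reuse the gain for the mirrored bin so the output stays real.
            out[n - k] = gain * spec1[n - k]
    return idft_real(out)


if __name__ == "__main__":
    n = 64
    tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    late = [math.sin(2 * math.pi * 4 * (t - 2) / n) for t in range(n)]
    print(max(abs(v) for v in enhance_frame(tone, tone)))  # in-phase input
    print(max(abs(v) for v in enhance_frame(late, tone)))  # first channel lags
```

The gain is computed only for the non-negative frequency bins and reused for the mirrored bins, because the phase of the cross-spectrum changes sign at negative frequencies.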

(A-3) Effects of the First Embodiment
According to the first embodiment, the following effects can be achieved.

第１の実施形態の信号処理装置１００では、第２のマイクＭ２側の方向から到来する音を抑圧し、正面方向と第１のマイクＭ１側の方向から到来する音は抑圧しないので、自動車内において運転手Ｕ１の声（目的音）を強調する場合などにおいて、少ない歪みで目的音を強調することができる。 In the signal processing apparatus 100 according to the first embodiment, sound arriving from the direction of the second microphone M2 is suppressed, and sound arriving from the front direction and the direction of the first microphone M1 is not suppressed. When emphasizing the voice (target sound) of the driver U1, the target sound can be emphasized with less distortion.

言い換えると、信号処理装置１００では、少ない演算コストで、妨害音の直接音と反射音を抑圧するが、目的音の直接音と反射音は抑圧しない強調フィルタＧを設計できるので、少ない歪みで目的音を強調できるという効果を奏する。 In other words, the signal processing apparatus 100 can design the enhancement filter G that suppresses the direct sound and the reflected sound of the interference sound with low calculation cost but does not suppress the direct sound and the reflected sound of the target sound. The effect is that the sound can be emphasized.

(B) Second Embodiment
Hereinafter, a second embodiment of the signal processing apparatus, program, and method according to the present invention will be described in detail with reference to the drawings.

(B-1) Configuration of the Second Embodiment
The signal processing apparatus 200 of the second embodiment is also described as being used in the environment shown in FIG. 2, as in the first embodiment.

また、第２の実施形態の信号処理装置２００の内部構成についても、上述の図１を用いて示すことができる。 Also, the internal configuration of the signal processing device 200 of the second embodiment can be shown using FIG. 1 described above.

The following describes how the signal processing apparatus 200 of the second embodiment differs from the first embodiment.

第１の実施形態では、フィルタ決定手段１０４において、すべての周波数に同じ広義単調増加関数ｆｍａｐ（Ｆ）を適用して強調フィルタＧを得ていたため、図８に示した通り、強調フィルタＧの特性が周波数ごとに異なっていた。特に低い周波数では抑圧されない到来方向の範囲が広くなる現象が起こる。そこで、第２の実施形態では、どの周波数でも同じような特性となるように、周波数ごとに異なる広義単調増加関数を適用する。 In the first embodiment, the filter determination unit 104 applies the same broad-sense monotone increasing function fmap (F) to all frequencies to obtain the enhancement filter G. Therefore, as shown in FIG. Was different for each frequency. In particular, a phenomenon occurs in which the range of arrival directions that are not suppressed is widened at low frequencies. Therefore, in the second embodiment, a broad monotone increasing function that differs for each frequency is applied so that the same characteristics are obtained at any frequency.

As shown in FIG. 1, the configuration of the signal processing apparatus 200 of the second embodiment is the same as that of the signal processing apparatus 100 of the first embodiment, except that the filter determination unit 104 is replaced with a filter determination unit 204.

(B-2) Operation of the Second Embodiment
Next, the operation of the signal processing apparatus 200 of the second embodiment having the above configuration (the signal processing method of the embodiment) will be described.

第２の実施形態の信号処理装置２００の動作は、フィルタ決定手段２０４の動作がフィルタ決定手段１０４とは異なる点以外は、第1の実施形態の信号処理装置１００の動作と同じである。 The operation of the signal processing device 200 of the second embodiment is the same as the operation of the signal processing device 100 of the first embodiment, except that the operation of the filter determination unit 204 is different from the filter determination unit 104.

In the first embodiment, as shown in FIG. 7, when the arrival direction is tilted from the front direction toward the second microphone M2, the arrival angle at which the gain of the enhancement filter G falls to or below a predetermined value (for example, 0.5) varies with frequency; this angle is hereinafter called the "cutoff arrival angle". In other words, in the first embodiment the range of arrival directions that are not suppressed varies with frequency. The filter determination unit 204 of the second embodiment absorbs this variation by setting a broad-sense monotone increasing function for each frequency, so that the cutoff arrival angles (the ranges of arrival directions that are not suppressed) at the different frequencies come close to one another. The method by which the filter determination unit 204 obtains per-frequency broad-sense monotone increasing functions that suppress this variation is not limited; for example, the following processing can be applied.

The filter determination unit 204 calculates the enhancement filter G from the DOA feature amount F using a predetermined broad-sense monotone increasing function, and supplies it to the multiplication unit 105. In the second embodiment, a different broad-sense monotone increasing function is used for each frequency. Here, the DOA feature amount at the k-th frequency is written F(k), and the enhancement gain at the k-th frequency is written G(k). With the k-th frequency denoted f_k, a function that converts an arrival direction and a frequency number into a DOA feature amount is defined by Equation (13). As the broad-sense monotone increasing function fmap_k(F(k)) for the k-th frequency, one can use, for example, the function with one arrival direction threshold θ_0 defined by Equation (14), the function with two arrival direction thresholds θ_1 and θ_2 defined by Equation (15), or the sigmoid function with scale F_a and offset arrival direction θ_0 defined by Equation (16).

図９は、第２の実施形態に係るフィルタ決定手段２０４で得られる強調フィルタＧの例について示したグラフである。 FIG. 9 is a graph showing an example of the enhancement filter G obtained by the filter determination unit 204 according to the second embodiment.

FIGS. 9(a), 9(b), and 9(c) show examples of the enhancement filter G obtained by Equations (14), (15), and (16), respectively. In each panel, the horizontal axis is the sound arrival direction θ and the vertical axis is the value of the enhancement filter G for that direction, plotted for sound source frequencies of 1 kHz, 2 kHz, and 4 kHz.

Here, θ_0 = 15, θ_1 = 20, θ_2 = 10, and F_a = 12. In the enhancement filter G of the first embodiment (FIG. 8), the range of arrival directions that are not suppressed changed with frequency. In the enhancement filter G of the second embodiment (FIG. 9), that range does not change with frequency, except on the first microphone M1 side at high frequencies. For Equation (16) the characteristic does change with frequency, but the cutoff arrival angle at which the gain of the enhancement filter G equals 0.5 is constant regardless of frequency. Thus, if the enhancement gain is calculated using Equations (13) to (16), the angle up to which the second microphone M2 side, that is, the interference sound side (passenger seat side), is suppressed can be set directly and in common for all frequencies.
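The idea of deriving a per-frequency feature threshold from a single cutoff angle can be sketched as follows. Equations (13) and (14) are not reproduced in this text, so the code is an assumption-laden illustration: `standin_phi` is a hypothetical direction-to-feature function (cos φ + sin φ of the plane-wave phase difference), and `fmap_k` is a one-threshold step placed at the feature value of the cutoff angle. The angle convention (negative θ toward M2, so that a cutoff of 15 degrees toward M2 is written as -15) is also an assumption.

```python
import math

MIC_SPACING_M = 0.03     # 3 cm, as quoted earlier in the text
SOUND_SPEED_M_S = 332.0  # as quoted earlier in the text


def standin_phi(theta_deg, freq_hz):
    """Hypothetical stand-in for the direction-to-feature function of Equation (13)."""
    phase = (2.0 * math.pi * freq_hz * MIC_SPACING_M
             * math.sin(math.radians(theta_deg)) / SOUND_SPEED_M_S)
    return math.cos(phase) + math.sin(phase)


def fmap_k(feature, freq_hz, theta0_deg=-15.0):
    """Step mapping whose threshold is the feature value at the cutoff angle."""
    return 1.0 if feature >= standin_phi(theta0_deg, freq_hz) else 0.0


if __name__ == "__main__":
    for freq in (1000, 2000, 4000):
        gains = [fmap_k(standin_phi(th, freq), freq) for th in (-30, -20, -10, 0, 30)]
        print(freq, gains)
```

Because the threshold is recomputed for each frequency from the same angle, the transition from 0 to 1 on the M2 side falls at the same direction for every frequency in this sketch.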

(B-3) Effects of the Second Embodiment
According to the second embodiment, the following effects can be achieved in addition to the effects of the first embodiment.

第２の実施形態の信号処理装置２００では、強調ゲインが抑圧しない到来方向の範囲をすべての周波数で同じように与えることができ、かつその範囲を到来方向の角度そのもので設定できるので、より適切な調整が可能となり、より少ない歪みで目的音を強調できるという効果を奏する。 In the signal processing device 200 according to the second embodiment, the range of the arrival direction in which the enhancement gain is not suppressed can be given in the same way at all frequencies, and the range can be set by the angle of the arrival direction itself. Adjustment is possible, and the target sound can be emphasized with less distortion.

(C) Third Embodiment
Hereinafter, a third embodiment of the signal processing apparatus, program, and method according to the present invention will be described in detail with reference to the drawings.

(C-1) Configuration of the Third Embodiment
The signal processing apparatus 300 of the third embodiment is also described as being used in the environment shown in FIG. 2, as in the first and second embodiments.

また、第３の実施形態の信号処理装置３００の内部構成についても、上述の図１を用いて示すことができる。 Further, the internal configuration of the signal processing apparatus 300 of the third embodiment can also be shown using FIG. 1 described above.

The following describes how the signal processing apparatus 300 of the third embodiment differs from the second embodiment.

In the second embodiment, the angle up to which arrival directions on the interference sound side are suppressed was set in common for all frequencies. At low frequencies, however, it is difficult to make the microphone spacing sufficiently large relative to the signal wavelength (the wavelength at 100 Hz is about 3.3 m, whereas the microphone spacing in an automobile is typically a few centimeters), so the arrival direction information obtained by numerical calculation at low frequencies (the DOA feature amount in the present invention) is generally ambiguous (the estimation error is large in the sense of arrival direction estimation). In the third embodiment, therefore, the range of arrival directions that the enhancement gain does not suppress is widened at frequencies below a predetermined frequency (for example, the band at or below 250 Hz); it is widened toward the second microphone M2, that is, toward the assistant U2 who emits the interference sound.

第３の実施形態の信号処理装置３００の構成は、フィルタ決定手段１０４がフィルタ決定手段３０４に替わること以外は、第１の実施形態の信号処理装置１００の構成と同じである。 The configuration of the signal processing device 300 according to the third embodiment is the same as the configuration of the signal processing device 100 according to the first embodiment except that the filter determination unit 104 is replaced with the filter determination unit 304.

(C-2) Operation of the Third Embodiment
Next, the operation of the signal processing apparatus 300 of the third embodiment having the above configuration (the signal processing method of the embodiment) will be described.

第３の実施形態の信号処理装置３００の動作は、フィルタ決定手段３０４の動作がフィルタ決定手段１０４とは異なる点以外は、第1の実施形態の信号処理装置１００の動作と同じである。 The operation of the signal processing device 300 of the third embodiment is the same as the operation of the signal processing device 100 of the first embodiment, except that the operation of the filter determination unit 304 is different from the filter determination unit 104.

The filter determination unit 304 calculates the enhancement filter G from the DOA feature amount F using a predetermined broad-sense monotone increasing function, and supplies it to the multiplication unit 105. As in the second embodiment, a different broad-sense monotone increasing function is used for each frequency. Here, the DOA feature amount at the k-th frequency is written F(k), and the enhancement gain at the k-th frequency is written G(k). With the k-th frequency denoted f_k, a function that converts an arrival direction and a frequency number into a DOA feature amount is defined by Equation (17). As the broad-sense monotone increasing function fmap_k(F(k)) for the k-th frequency, one can use, for example, the function with one arrival direction threshold θ_0 defined by Equation (18).

図１０は、第２の実施形態に係るフィルタ決定手段２０４で得られる強調フィルタＧと、第３の実施形態に係るフィルタ決定手段３０４で得られる強調フィルタＧとの比較について示したグラフである。 FIG. 10 is a graph showing a comparison between the enhancement filter G obtained by the filter determination unit 204 according to the second embodiment and the enhancement filter G obtained by the filter determination unit 304 according to the third embodiment.

図１０（ａ）は、第２の実施形態におけるフィルタ決定手段２０４で上述の（１３）式及び（１４）式を用いて得られる強調フィルタＧについて示している。また、図１０（ｂ）は、第３の実施形態に係るフィルタ決定手段３０４で（１７）式及び（１８）式により得られる強調フィルタＧの例を示している。図１０（ａ）、図１０（ｂ）では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θと強調フィルタＧの関係を示したグラフとなっている。図１０（ａ）、図１０（ｂ）では、横軸を音源の到来方向θとし縦軸を強調フィルタＧの値（到来方向θに応じた値）としている。 FIG. 10A shows an enhancement filter G obtained by using the above equations (13) and (14) by the filter determination means 204 in the second embodiment. FIG. 10B shows an example of the enhancement filter G obtained by the equations (17) and (18) by the filter determination unit 304 according to the third embodiment. 10A and 10B are graphs showing the relationship between the arrival direction θ and the enhancement filter G when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. 10A and 10B, the horizontal axis is the arrival direction θ of the sound source, and the vertical axis is the value of the enhancement filter G (value corresponding to the arrival direction θ).

ここでは、θ_０＝５、Ｆ_０＝０．９７とした。図１０より、到来方向と周波数の番号をＤＯＡ特徴量に変換する関数Φ（ファイ）の上限値をＦ_０としたことで、低い周波数の抑圧しない到来方向の範囲が広くなったことが確認できる。 Here, θ ₀ = 5 and F ₀ = 0.97. From FIG. 10, it can be confirmed that by setting the upper limit value of the function Φ (Phi) for converting the arrival direction and the frequency number to the DOA feature amount as F ₀ , the range of the arrival direction in which low frequencies are not suppressed is widened. .
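The effect of capping the direction-to-feature function at F_0 can be sketched as follows. Equations (17) and (18) are not reproduced in this text; the code assumes that the cap is a simple minimum and reuses a hypothetical stand-in feature (cos φ + sin φ of the plane-wave phase difference) with negative θ pointing toward M2. The names `standin_phi`, `capped_threshold`, and `gain` are hypothetical; θ_0 = 5 and F_0 = 0.97 are the values quoted in the text.

```python
import math

MIC_SPACING_M = 0.03     # 3 cm, as quoted earlier in the text
SOUND_SPEED_M_S = 332.0  # as quoted earlier in the text


def standin_phi(theta_deg, freq_hz):
    """Hypothetical direction-to-feature function (not Equation (17))."""
    phase = (2.0 * math.pi * freq_hz * MIC_SPACING_M
             * math.sin(math.radians(theta_deg)) / SOUND_SPEED_M_S)
    return math.cos(phase) + math.sin(phase)


def capped_threshold(freq_hz, theta0_deg=-5.0, f0=0.97):
    """Feature threshold at the cutoff angle, limited from above by f0 (assumed min)."""
    return min(standin_phi(theta0_deg, freq_hz), f0)


def gain(theta_deg, freq_hz, capped=True):
    """One-threshold gain for a source at theta_deg, with or without the cap."""
    threshold = capped_threshold(freq_hz) if capped else standin_phi(-5.0, freq_hz)
    return 1.0 if standin_phi(theta_deg, freq_hz) >= threshold else 0.0


if __name__ == "__main__":
    for freq in (100, 250, 1000, 4000):
        print(freq, round(capped_threshold(freq), 4),
              gain(-20, freq, capped=False), gain(-20, freq, capped=True))
```

In this sketch the cap only takes effect where the uncapped threshold would exceed F_0, which happens at low frequencies; there, directions further toward M2 pass unsuppressed.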

(C-3) Effects of the Third Embodiment
According to the third embodiment, the following effect can be achieved in addition to the effects of the first and second embodiments.

In the signal processing apparatus 300 of the third embodiment, a wider range of unsuppressed arrival directions can be secured at low frequencies, where the arrival direction information obtained by numerical calculation is ambiguous. Distortion of the low-frequency components of the target sound is therefore reduced, and the target sound can be enhanced with less distortion.

(D) Fourth Embodiment
Hereinafter, a fourth embodiment of the signal processing apparatus, program, and method according to the present invention will be described in detail with reference to the drawings.

(D-1) Configuration of the Fourth Embodiment
The signal processing apparatus 400 of the fourth embodiment is also described as being used in the environment shown in FIG. 2, as in the first to third embodiments.

また、第４の実施形態の信号処理装置４００の内部構成についても、上述の図１を用いて示すことができる。 Further, the internal configuration of the signal processing apparatus 400 of the fourth embodiment can also be shown using FIG. 1 described above.

The following describes how the signal processing apparatus 400 of the fourth embodiment differs from the first to third embodiments.

The first to third embodiments assumed that the two microphones M1 and M2 are set in front of the driver U1 in the automobile A, and designed an enhancement filter G that suppresses only the passenger seat side (the assistant U2 side). In contrast, the fourth embodiment uses the DOA feature amounts of the present invention to apply an enhancement filter that enhances (does not suppress) only the front direction.

図１に示すように、第４の実施形態の信号処理装置４００の構成は、特徴量算出手段１０３とフィルタ決定手段１０４がそれぞれ特徴量算出手段４０３とフィルタ決定手段４０４に替わること以外は、第１の実施形態の信号処理装置１００の構成と同じである。 As shown in FIG. 1, the configuration of the signal processing apparatus 400 according to the fourth embodiment is the same as that of the fourth embodiment except that the feature amount calculation unit 103 and the filter determination unit 104 are replaced with the feature amount calculation unit 403 and the filter determination unit 404, respectively. The configuration is the same as that of the signal processing apparatus 100 of the first embodiment.

(D-2) Operation of the Fourth Embodiment
Next, the operation of the signal processing apparatus 400 of the fourth embodiment having the above configuration (the signal processing method of the embodiment) will be described.

The operation of the signal processing apparatus 400 of the fourth embodiment is the same as that of the signal processing apparatus 100 of the first embodiment, except that the operations of the feature amount calculation unit 403 and the filter determination unit 404 differ from those of the feature amount calculation unit 103 and the filter determination unit 104.

特徴量算出手段４０３は、第１の入力スペクトルＸ_１と第２の入力スペクトルＸ_２とに基づいて（１９）式によって２つのＤＯＡ特徴量ＦとＦ’を算出し、フィルタ決定手段４０４に与える。２つのＤＯＡ特徴量を音の到来方向θに関して整理すると、（２０）式となる。 The feature quantity calculation means 403 calculates two DOA feature quantities F and F ′ by the equation (19) based on the first input spectrum X ₁ and the second input spectrum X _2, and gives them to the filter determination means 404. . When the two DOA feature quantities are arranged with respect to the sound arrival direction θ, the equation (20) is obtained.

図１１は、音の到来方向θと（２０）式のＤＯＡ特徴量Ｆ’との関係について示したグラフである。 FIG. 11 is a graph showing the relationship between the sound arrival direction θ and the DOA feature value F ′ in the equation (20).

図１１では、音源の周波数を１ｋＨｚ、２ｋＨｚ、４ｋＨｚと変化させた場合の到来方向θとＤＯＡ特徴量Ｆ’の関係を示したグラフとなっている。図１１では、横軸を音源の到来方向θとし縦軸をＤＯＡ特徴量Ｆ’としている。 FIG. 11 is a graph showing the relationship between the arrival direction θ and the DOA feature amount F ′ when the frequency of the sound source is changed to 1 kHz, 2 kHz, and 4 kHz. In FIG. 11, the horizontal axis represents the sound source arrival direction θ, and the vertical axis represents the DOA feature amount F ′.

FIG. 11 confirms that the curves are exactly the left-right mirror image of those of the DOA feature amount F (FIG. 5).

フィルタ決定手段４０４は、２つのＤＯＡ特徴量ＦとＦ’に基づいて所定の広義単調増加関数によって強調フィルタＧを算出し、乗算手段１０５に与える。所定の広義単調増加関数には、第１の実施形態に係るｆｍａｐ（Ｆ）、第２の実施形態に係るΦ（φ，ｋ）とｆｍａｐ_ｋ（Ｆ（ｋ））、第３の実施形態に係るΦ（φ，Ｆ_０，ｋ）とｆｍａｐ_ｋ（Ｆ（ｋ））のいずれを用いても良いが、ここでは一例として、第２の実施形態の所定の広義単調増加関数を用いて説明する。第４の実施形態において、強調フィルタＧは（２１）式を用いて算出される。 The filter determination unit 404 calculates the enhancement filter G based on the two DOA feature amounts F and F ′ using a predetermined broad-sense monotone increasing function, and supplies the enhancement filter G to the multiplication unit 105. The predetermined broad monotone increasing function includes fmap (F) according to the first embodiment, Φ (φ, k) and fmap _k (F (k)) according to the second embodiment, and the third embodiment. Any of Φ (φ, F ₀ , k) and fmap _k (F (k)) may be used, but here, as an example, a description will be given using the predetermined broad-sense monotonically increasing function of the second embodiment. . In the fourth embodiment, the enhancement filter G is calculated using equation (21).

図１２は、第４の実施形態に係るフィルタ決定手段４０４で得られる強調フィルタＧの例について示した説明図である。 FIG. 12 is an explanatory diagram illustrating an example of the enhancement filter G obtained by the filter determination unit 404 according to the fourth embodiment.

FIGS. 12(a), 12(b), and 12(c) show examples of the enhancement filter G obtained when equations (14), (15), and (16), respectively, are used as fmap_k(F(k)). Each graph shows the relationship between the arrival direction θ and the enhancement filter G when the frequency of the sound source is 1 kHz, 2 kHz, and 4 kHz. In each graph, the horizontal axis is the arrival direction θ of the sound source, and the vertical axis is the value of the enhancement filter G for that arrival direction.

Here, θ_0 = 20, θ_1 = 15, θ_2 = 25, and F_a = 12. FIG. 12 shows that an enhancement filter that enhances (does not suppress) only the front direction is obtained.

(D-3) Effects of the Fourth Embodiment
According to the fourth embodiment, the following effect can be obtained compared with the first to third embodiments.

The signal processing device 400 of the fourth embodiment has the distinctive effect of being able to enhance the target sound while limiting the range of arrival directions that the enhancement filter G (enhancement gain) does not suppress to the front direction.

(E) Other Embodiments
The present invention is not limited to the above embodiments, and modified embodiments such as the following are also possible.

(E-1) In the above embodiments, the signal processing device is described as restoring the waveform of the enhanced spectrum Y and outputting the enhanced speech y. However, it may instead output the enhanced spectrum Y without restoring the waveform, or it may output both the enhanced spectrum Y and the enhanced speech y. When only the enhanced spectrum Y is output, the waveform restoration means 106 may be omitted.
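The output variations described in (E-1) can be sketched as follows for a single frame. This is a minimal illustration under stated assumptions: the function name `enhance_frame`, the `output` argument, and the naive inverse DFT are hypothetical, and a practical waveform restoration stage would normally also handle windowing and frame overlap, which are omitted here.

```python
import cmath


def inverse_dft(spectrum):
    """Naive inverse DFT returning the real part of each sample.

    Assumes the spectrum is conjugate-symmetric, so the imaginary
    parts of the result are negligible.
    """
    n = len(spectrum)
    return [
        sum(
            spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)
        ).real / n
        for t in range(n)
    ]


def enhance_frame(X1, G, output="both"):
    """Multiply the first input spectrum X1 by the filter G.

    output: "spectrum" returns only Y (no waveform restoration),
            "waveform" returns only y,
            "both" returns Y and y.
    """
    if len(X1) != len(G):
        raise ValueError("X1 and G must have the same length")
    if output not in ("spectrum", "waveform", "both"):
        raise ValueError("unknown output mode")
    Y = [g * x for g, x in zip(G, X1)]  # enhanced spectrum
    result = {}
    if output in ("spectrum", "both"):
        result["Y"] = Y
    if output in ("waveform", "both"):
        result["y"] = inverse_dft(Y)  # restored frame
    return result
```

With `output="spectrum"` the inverse transform is never called, which corresponds to the configuration in which the waveform restoration stage is omitted.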

DESCRIPTION OF SYMBOLS: 100 ... signal processing apparatus, 101 ... first frequency analysis means, 102 ... second frequency analysis means, 103 ... feature amount calculation means, 104 ... filter determination means, 105 ... multiplication means, 106 ... waveform restoration means, M1 ... first microphone (first sound collecting device), M2 ... second microphone (second sound collecting device).

Claims

A signal processing apparatus comprising:
first frequency analysis means for obtaining a first input spectrum by performing frequency analysis on a first input signal input from a first sound collecting device;
second frequency analysis means for obtaining a second input spectrum by performing frequency analysis on a second input signal input from a second sound collecting device;
feature amount calculating means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature amount that, taking as reference a front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, takes a large value in the front direction and in directions on the first sound collecting device side, and takes a small value in directions on the second sound collecting device side;
filter determining means for mapping the first feature amount calculated by the feature amount calculating means with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter; and
multiplication means for obtaining an enhanced spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determining means.

The signal processing apparatus according to claim 1, wherein the first feature amount has a peak in a direction on the first sound collecting device side of the front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, and its value decreases as the direction inclines from the peak direction toward the second sound collecting device.

The signal processing apparatus according to claim 1, wherein the feature amount calculating means calculates the first feature amount using a second feature amount that takes a large value in the front direction and a third feature amount that takes a larger value in the direction of the second sound collecting device than in the front direction.

The signal processing apparatus according to claim 1, wherein the filter determining means maps the first feature amount using a broad-sense monotonically increasing function that differs for each frequency, thereby obtaining the enhancement filter for each frequency.

The signal processing apparatus according to claim 4, wherein the filter determining means sets the broad-sense monotonically increasing function of each frequency so that the range of arrival directions that are not suppressed is the same in the enhancement filter of every frequency.

The signal processing apparatus according to claim 4, wherein the filter determining means sets the broad-sense monotonically increasing functions so that, in the enhancement filter, the range of arrival directions that are not suppressed is wider in a low frequency band at or below a predetermined frequency than in a high frequency band above the predetermined frequency.

The signal processing apparatus according to claim 6, wherein the filter determining means sets the broad-sense monotonically increasing functions so that, in the enhancement filter, the range of arrival directions that are not suppressed extends further toward the second sound collecting device side in the low frequency band than in the high frequency band.

The signal processing apparatus according to claim 1, wherein the filter determining means uses the first feature amount to set a broad-sense monotonically increasing function that yields an enhancement filter enhancing only the front direction.

A signal processing program that causes a computer to function as:
first frequency analysis means for obtaining a first input spectrum by performing frequency analysis on a first input signal input from a first sound collecting device;
second frequency analysis means for obtaining a second input spectrum by performing frequency analysis on a second input signal input from a second sound collecting device;
feature amount calculating means for calculating, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature amount that, taking as reference a front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, takes a large value in the front direction and in directions on the first sound collecting device side, and takes a small value in directions on the second sound collecting device side;
filter determining means for mapping the first feature amount calculated by the feature amount calculating means with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter;
multiplication means for obtaining an enhanced spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determining means; and
waveform restoration means for receiving the enhanced spectrum obtained by the multiplication means and restoring a signal waveform to obtain enhanced speech.

A signal processing method using first frequency analysis means, second frequency analysis means, feature amount calculating means, filter determining means, and multiplication means, wherein:
the first frequency analysis means obtains a first input spectrum by performing frequency analysis on a first input signal input from a first sound collecting device;
the second frequency analysis means obtains a second input spectrum by performing frequency analysis on a second input signal input from a second sound collecting device;
the feature amount calculating means calculates, based on the first input spectrum obtained by the first frequency analysis means and the second input spectrum obtained by the second frequency analysis means, a first feature amount that, taking as reference a front direction perpendicular to the straight line connecting the position of the first sound collecting device and the position of the second sound collecting device, takes a large value in the front direction and in directions on the first sound collecting device side, and takes a small value in directions on the second sound collecting device side;
the filter determining means maps the first feature amount calculated by the feature amount calculating means with a predetermined broad-sense monotonically increasing function to obtain an enhancement filter; and
the multiplication means obtains an enhanced spectrum by multiplying the first input spectrum obtained by the first frequency analysis means by the enhancement filter obtained by the filter determining means.