JP5139111B2 - Method and apparatus for extracting sound from moving sound source - Google Patents

Publication number
JP5139111B2
Authority
JP
Japan
Prior art keywords
sound source
observation
signal vector
time
target sound
Prior art date
Legal status
Expired - Fee Related
Application number
JP2008034445A
Other languages
Japanese (ja)
Other versions
JP2008219884A (en)
Inventor
Hiroshi Nakajima (中島弘史)
Kazuhiro Nakadai (中臺一博)
Current Assignee
Honda Motor Co Ltd
Original Assignee
Honda Motor Co Ltd
Priority date
Filing date
Publication date
Application filed by Honda Motor Co Ltd filed Critical Honda Motor Co Ltd
Publication of JP2008219884A publication Critical patent/JP2008219884A/en
Application granted granted Critical
Publication of JP5139111B2 publication Critical patent/JP5139111B2/en

Abstract

PROBLEM TO BE SOLVED: To provide a method for correctly extracting sound from a moving sound source without the output discontinuities, or the frequency and amplitude fluctuations caused by the Doppler effect.

SOLUTION: The method obtains the position of the sound source and, as a function of that position, obtains the time-varying convolution matrices H(p(k), n) from the source to each of a plurality of observation points, which convert the source signal vector s into the observation signal vectors x(n) at the observation points. Using these matrices, it then obtains a beamforming coefficient matrix G(n) that converts the observation signal vectors at the observation points into the source signal vector of the target source, and obtains the source signal vector y of the target source from the observation signal vectors and the beamforming coefficient matrix.

COPYRIGHT: (C) 2008, JPO&INPIT

Description

The present invention relates to a method and an apparatus for extracting sound from a moving sound source.

Beamforming is performed in order to accurately extract the source signal of a sound source of interest (hereinafter, the target sound source). Beamforming is a method that processes the signals observed by a plurality of microphones with appropriate filters so as to emphasize and extract the signal of the target sound source. A widely known concrete beamforming method is delay-and-sum array processing, which aligns the phases of sound arriving from a given direction so as to emphasize it. More accurate beamforming methods have also been proposed (for example, Patent Document 1).

Conventionally, on the other hand, beamforming for a moving sound source has simply applied, as-is, beamforming methods designed for stationary sources. Specifically, for example, the region through which the sound source moves is divided, a set of beamforming filter coefficients is computed for each sub-region, an output value for the sound from the moving source is computed with each sub-region's coefficients, and the largest of the output values is taken as the sound from the moving source. Such conventional methods, however, suffer from output discontinuities caused by switching coefficients between sub-regions, and from frequency changes and amplitude fluctuations caused by the Doppler effect.
Japanese Patent Laid-Open No. 2006-270903

Therefore, there is a need for a method and an apparatus that accurately extract sound from a moving sound source without the output discontinuities, or the frequency changes and amplitude fluctuations caused by the Doppler effect.

In the method for extracting sound from a moving sound source according to the present invention, the position of the sound source is obtained, and, as a function of that position, the time-varying convolution matrices from the source to each of a plurality of observation points are obtained; these matrices convert the source signal vector of the source into the observation signal vector at each observation point. Using the time-varying convolution matrices from the source to the respective observation points, a beamforming coefficient matrix that converts the observation signal vectors at the observation points into the source signal vector of the target source is then obtained, and the source signal vector of the target source is obtained from the observation signal vectors at the observation points and the beamforming coefficient matrix.

The apparatus for extracting sound from a moving sound source according to the present invention includes a sound data acquisition unit that acquires observation signal vectors of sound at a plurality of observation points, and a position detection unit that detects the position of the sound source. The apparatus further includes a time-varying convolution matrix storage unit that stores the time-varying convolution matrices from the source to each observation point, which, as a function of the position of the source, convert the source signal vector of the source into the observation signal vector at each observation point, and an arithmetic processing unit that obtains the source signal vector of the target source from the observation signal vectors at the plurality of observation points and the time-varying convolution matrices from the source to the respective observation points.

According to the present invention, beamforming coefficients are not switched according to the position of the moving source; instead, the optimum beamforming coefficients are used at every instant. Sound from a moving target source can therefore be extracted accurately, without output discontinuities or the frequency changes and amplitude fluctuations caused by the Doppler effect.

Let p(t) be the position of the moving sound source and s(t) its signal (volume velocity). The observed signal (sound pressure) x(t) at position q is then the solution of the following equation [P. M. Morse and K. U. Ingard, "Theoretical Acoustics", Princeton, USA, pp. 717-732, 1968]. In the following equations, p and q denote position vectors.

    (∇² − (1/c²) ∂²/∂t²) x(q, t) = −s(t) δ(q − p(t))    (1)

Compared with the wave equation for a stationary source, equation (1) differs in that the delta function on the right-hand side is also a function of time; the sound pressure at the observation point therefore varies with the velocity of the source, among other factors. Equation (1) is time-varying but linear. Consequently, if s(t) is decomposed into the integral of the impulse inputs at times tₛ,

    sₜₛ(t) = s(tₛ) δ(t − tₛ),    (2)

that is,

    s(t) = ∫ sₜₛ(t) dtₛ,

then the observed signal x(t) can be computed as the integral

    x(t) = ∫ xₜₛ(t) dtₛ    (3)

of the responses xₜₛ(t) to the impulse inputs sₜₛ(t). The response xₜₛ(t) is the solution of equation (1) when

    s(t) = sₜₛ(t).    (4)

Substituting equation (4) into equation (1), writing xₜₛ(q, t) for the corresponding solution, and rearranging gives

    (∇² − (1/c²) ∂²/∂t²) xₜₛ(q, t) = −s(tₛ) δ(t − tₛ) δ(q − p(tₛ)).    (5)

Equation (5) coincides with the equation that gives the impulse response for a stationary source located at p(tₛ). Therefore,

    xₜₛ(t) = s(tₛ) h(t − tₛ, p(tₛ)),    (6)

where h(t, p) is the impulse response from a stationary source at position p to the observation position q. Equation (3) can thus be rewritten as

    x(t) = ∫ s(tₛ) h(t − tₛ, p(tₛ)) dtₛ.    (7)

This equation shows that, even for a moving source, the output can be obtained from the source signal and the stationary impulse responses, provided that the stationary impulse response from every position the source can take is known. In this specification, equation (7) is defined as the time-varying convolution operation. It has been shown experimentally that this operation can be computed approximately in a discrete-time system as well, in the same form as equation (7) [Tomonao Okuyama, Hiroshi Matsuhisa, Hideo Utsuno, "Sound pressure calculation during movement in a virtual space using transfer functions", Noise and Vibration Study Group Report N-2006-46, 2006].

    x(k) = Σₖₛ h(k − kₛ, p(kₛ)) s(kₛ)    (8)

Here, k and kₛ denote discrete times. The sampling frequency must be set greater than twice the upper limit frequency of the source signal, taking into account the Doppler shift caused by the movement. Equation (8) can be expressed with a vector and a matrix as follows.

    x = Hᵀ s,    (9)

where the (kₛ, k) entry of H is

    H[kₛ, k] = h(k − kₛ, p(kₛ)).    (10)

Here, s is the source signal vector, x is the observation signal vector, and H is the time-varying convolution matrix [M. Matsumoto, M. Tohyama and H. Yanagawa, "A method of interpolating binaural impulse responses for moving sound images," Acoust. Sci. & Tech. 24, 5, pp. 284-292, 2003]. Each row of H and each entry of s correspond to a time of the source signal; each column of H and each entry of x correspond to a time of the observation signal. If the movement pattern of the source is defined as a position-vector function p(k) of discrete time k, H is determined by the movement pattern p(k) and the observation point q. The origin of discrete time is taken to be 1, the length of the source signal is Lₛ, and the length of the impulse response is Lₕ.
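The matrix form above can be made concrete with a minimal sketch (ours, not the patent's implementation). It builds the time-varying convolution matrix row by row from a table of stationary impulse responses, one per (quantized) source position; the table h_table and its values are illustrative assumptions.

```python
import numpy as np

def build_H(p, h_table, L_s, L_h):
    """Time-varying convolution matrix with the convention of the text:
    rows <-> source-signal times k_s, columns <-> observation times k,
    H[k_s, k] = h(k - k_s, p[k_s]), so that x = H.T @ s (matrix form above).

    h_table : dict mapping a (quantized) position to its stationary
              impulse response of length L_h (assumed known in advance).
    """
    H = np.zeros((L_s, L_s + L_h - 1))
    for k_s in range(L_s):
        # Place the stationary IR for the source's position at time k_s,
        # shifted to start at observation time k_s.
        H[k_s, k_s:k_s + L_h] = h_table[p[k_s]]
    return H

# Assumed stationary responses for three positions along the path.
h_table = {0: np.array([1.0, 0.3]),
           1: np.array([0.8, 0.2]),
           2: np.array([0.6, 0.1])}
p = [0, 1, 2]                       # source position at each source time
s = np.array([1.0, -1.0, 0.5])
H = build_H(p, h_table, L_s=3, L_h=2)
x = H.T @ s                         # observed signal
```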

FIG. 1 shows the configuration of an apparatus for extracting sound from a moving sound source according to one embodiment of the present invention.

The apparatus for extracting sound from a moving sound source includes a sound data acquisition unit 101, a position detection unit 103 that detects the position of the moving source, an arithmetic processing unit 105 that extracts the moving source by operations such as beamforming, and a time-varying convolution matrix storage unit 107 that stores the time-varying convolution matrices.

The position detection unit 103 detects the position of the moving sound source relative to the sound data acquisition unit 101, using, for example, a laser rangefinder or a rangefinder based on the phase difference of radio waves. The time-varying convolution matrix storage unit 107 may store the time-varying convolution matrices for arbitrary source positions and retrieve the matrix corresponding to the detected position of the moving source. Alternatively, when the path of the moving source is fixed, the storage unit 107 may store the time-varying convolution matrices for positions along that path and retrieve the matrix corresponding to the detected position of the moving source.

In this specification, a moving sound source includes any source that moves relative to the sound data acquisition unit 101; in other words, the position of the moving source is its position relative to the observation points. Therefore, when the source is fixed at a given position and the sound data acquisition unit 101 is mounted on, for example, a robot and moves with it, the source fixed at that position can be regarded as a moving sound source.

FIG. 2 illustrates the functions of the apparatus for extracting sound from a moving sound source. The sound data acquisition unit 101 includes N microphones. x(n) denotes the observation signal vector obtained by observing the sound s of the moving sound source 201 with the n-th microphone. G(n) is the beamforming coefficient matrix for the observation signal vector of the n-th microphone, and y is the extracted moving-source signal vector. The arithmetic processing unit 105 obtains the beamforming coefficients G(n), and then obtains the moving-source signal vector y from the observation signal vectors x(n) and the beamforming coefficients G(n). How G(n) is obtained is described later.

FIG. 3 is a flowchart of a method of extracting a moving sound source according to one embodiment of the present invention. The following steps are performed at every sample of the observation signal vectors x(n), or at every predetermined number of samples.

In step S010 of FIG. 3, the arithmetic processing unit 105 obtains the current position of the moving sound source from the information supplied by the position detection unit 103, and retrieves from the time-varying convolution matrix storage unit 107 the time-varying convolution matrix H(p(k), n) from the moving source at that position to the n-th microphone. The matrix has Lₛ + Lₕ − 1 rows and Lₛ columns.

In step S020 of FIG. 3, the arithmetic processing unit 105 obtains, from the observation signal vectors x(n) of the microphones and the time-varying convolution matrices H(p(k), n) from the moving source to the microphones, the beamforming coefficient matrices G(1), G(2), ..., G(N) for the observation signal vectors, together with the moving-source signal vector y. The moving-source signal vector y is

    y = Gᵀ x.

Here, each entry of y and each column of Gᵀ correspond to a time of the source signal; each row of Gᵀ and each entry of x correspond to a time of the observation signals of the N microphones.

On the other hand, the observation signals x can be expressed as

    x = Hᵀ(p(k)) s.

Here, each entry of x and each column of Hᵀ(p(k)) correspond to a time of the observation signals of the N microphones; each row of Hᵀ(p(k)) and each entry of s correspond to a time of the source signal.

Since y = s must hold, G is the solution of

    Gᵀ Hᵀ(p(k)) = I,    (12)

and solving equation (12) by means of the pseudo-inverse yields the coefficients of a minimum-norm weighted delay-and-sum beamformer adapted to the moving sound source.
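A minimal noise-free sketch (our own, with randomly generated matrices standing in for the stored H(p(k), n), whose shapes follow step S010) of obtaining the beamforming coefficients from equation (12) via the pseudo-inverse and recovering the source with y = Gᵀx:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: N microphones, time-varying convolution matrices H_n
# of shape (L_s + L_h - 1, L_s), here random placeholders.
N, L_s, L_h = 3, 16, 4
H = [rng.normal(size=(L_s + L_h - 1, L_s)) for _ in range(N)]

s = rng.normal(size=L_s)                    # true (moving) source signal
x = np.concatenate([Hn @ s for Hn in H])    # stacked observations

# Beamforming coefficients from eq. (12): G^T is taken as the
# pseudo-inverse of the stacked convolution matrix, which gives the
# minimum-norm solution of G^T H^T = I.
H_stack = np.vstack(H)              # shape (N*(L_s+L_h-1), L_s)
G_T = np.linalg.pinv(H_stack)       # shape (L_s, N*(L_s+L_h-1))

y = G_T @ x                         # y = G^T x
# In this noise-free, full-column-rank case y recovers s exactly.
```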

Furthermore, if the movement pattern p_U(k) of a non-target sound source and the time-varying convolution matrix H(p_U(k)) from the non-target source to each observation point (the N microphones) are known, then obtaining the solution with the constraints

    Gᵀ Hᵀ(p_U(k)) = 0    (13)

added in sequence reduces the gain for sound from the non-target source, so that the sound of the target source can be extracted more accurately [Hiroshi Nakajima, "Realization of minimum average-sidelobe beamforming using indefinite terms", Journal of the Acoustical Society of Japan, Vol. 62, No. 10, pp. 726-737, 2006].

FIG. 4 illustrates a numerical experiment that confirms the operation of the sound extraction method of this embodiment. Two sources were used: a moving source S1 and a stationary source S2. As shown in FIG. 5, S1 was a 125 Hz sine wave and S2 a 400 Hz sine wave. The sampling frequency was 1 kHz, the speed of S1 was 20 m/s, and the signal length was 0.5 s. The aim was to extract the signal of S1 by beamforming with a three-element microphone array (M1, M2, M3).
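The raw test waveforms of this experiment (125 Hz and 400 Hz sinusoids, 1 kHz sampling, 0.5 s) can be generated as follows; note this sketch produces only the source signals themselves, not the Doppler-shifted propagation of S1.

```python
import numpy as np

fs = 1000                            # sampling frequency: 1 kHz
t = np.arange(0, 0.5, 1 / fs)        # 0.5 s signal length
s1 = np.sin(2 * np.pi * 125 * t)     # moving source S1: 125 Hz sine
s2 = np.sin(2 * np.pi * 400 * t)     # stationary source S2: 400 Hz sine
```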

In FIGS. 5 to 7, the horizontal axis shows time in seconds and the vertical axis shows sound pressure in pascals.

FIG. 6 shows the signal observed at microphone M1. The observation contains the (higher-frequency) signal of S2 and the signal of S1, whose amplitude and frequency vary.

In the method of this embodiment, the beamforming coefficient matrix was obtained as the solution of the simultaneous equations (12) and (13).

In a method used for comparison with this embodiment (hereinafter, the comparison method), a beamforming coefficient matrix was computed for each source direction at 10-degree intervals, an output value was computed with each coefficient matrix, and the largest output value was taken as the signal of S1.

FIG. 7(a) shows the S1 signal extracted by the comparison method. As shown in FIG. 7(a), although the high-frequency component of S2 is removed, amplitude changes due to the movement and discontinuities due to the coefficient switching remain in the S1 signal. In FIG. 7(a), the amplitude is low near 0.3 s because the direction of the focus is close to that of a null (blind spot).

FIG. 7(b) shows the S1 signal extracted by the method of this embodiment. As shown in FIG. 7(b), the S1 signal exhibits neither amplitude changes due to the movement nor discontinuities due to switching, and is reproduced accurately.

Features of the embodiments of the present invention are described below.

According to one embodiment of the present invention, the position of the target sound source is obtained, and, as a function of that position, the time-varying convolution matrices from the target source to each of a plurality of observation points are obtained; these matrices convert the source signal vector of the target source into the observation signal vector at each observation point. The beamforming coefficient matrix is then obtained as the pseudo-inverse of the time-varying convolution matrices from the target source to the respective observation points.

According to this embodiment, the coefficients of a minimum-norm weighted delay-and-sum beamformer adapted to the moving target source are obtained, so the sound of the moving target source can be extracted accurately.

According to another embodiment of the present invention, the position of a non-target sound source is further obtained, and, as a function of that position, the time-varying convolution matrices from the non-target source to the respective observation points are further obtained; these matrices convert the source signal vector of the non-target source into the observation signal vector at each observation point. The beamforming coefficient matrix is adjusted so that its product with the time-varying convolution matrices from the non-target source to the respective observation points is zero.

According to this embodiment, the gain for sound from the non-target source is reduced, so the sound of the target source can be extracted more accurately.

FIG. 1 shows the configuration of an apparatus for extracting sound from a moving sound source according to one embodiment of the present invention.
FIG. 2 illustrates the functions of the apparatus for extracting sound from a moving sound source.
FIG. 3 is a flowchart of a method of extracting a moving sound source according to one embodiment of the present invention.
FIG. 4 illustrates a numerical experiment for confirming the operation of the sound extraction method of this embodiment.
FIG. 5 shows the moving-source signal.
FIG. 6 shows the signal observed at microphone M1.
FIG. 7 shows the S1 signals extracted by the comparison method and by the method of this embodiment.

Explanation of symbols

101 ... sound data acquisition unit, 103 ... position detection unit, 105 ... arithmetic processing unit, 107 ... time-varying convolution matrix storage unit

Claims (6)

音源の位置を求め、
前記音源の位置の関数として、前記音源の音源信号ベクトルを複数の観測点のそれぞれの観測点における観測信号ベクトルに変換する、前記音源から前記それぞれの観測点までの時変畳み込み行列を求め、
前記音源から前記それぞれの観測点までの時変畳み込み行列を使用して、前記それぞれの観測点における観測信号ベクトルを目的音源の音源信号ベクトルに変換するビームフォーミング係数行列を求め、
前記それぞれの観測点における観測信号ベクトルおよび前記ビームフォーミング係数行列から前記目的音源の音源信号ベクトルを求める、移動音源からの音の抽出方法。
Find the location of the sound source,
As a function of the position of the sound source, the sound source signal vector of the sound source is converted into an observation signal vector at each observation point of a plurality of observation points, a time-varying convolution matrix from the sound source to the respective observation points is obtained,
Using a time-varying convolution matrix from the sound source to the respective observation points, a beam forming coefficient matrix for converting an observation signal vector at the respective observation points into a sound source signal vector of the target sound source is obtained.
A method for extracting sound from a moving sound source, wherein a sound source signal vector of the target sound source is obtained from observation signal vectors at the respective observation points and the beamforming coefficient matrix.
前記目的音源の位置を求め、
前記目的音源の位置の関数として、前記目的音源の音源信号ベクトルを複数の観測点のそれぞれの観測点における観測信号ベクトルに変換する、前記目的音源から前記それぞれの観測点までの時変畳み込み行列を求め、
前記ビームフォーミング係数行列を、前記目的音源から前記それぞれの観測点までの時変畳み込み行列の擬似逆行列として求める、請求項1に記載の移動音源からの音の抽出方法。
Determining the position of the target sound source;
A time-varying convolution matrix from the target sound source to each observation point, which converts the sound source signal vector of the target sound source into an observation signal vector at each observation point of a plurality of observation points as a function of the position of the target sound source. Seeking
The method for extracting sound from a moving sound source according to claim 1, wherein the beam forming coefficient matrix is obtained as a pseudo inverse matrix of a time-varying convolution matrix from the target sound source to each observation point.
非目的音源の位置をさらに求め、
前記非目的音源の位置の関数として、前記非目的音源の音源信号ベクトルを前記それぞれの観測点における観測信号ベクトルに変換する、前記非目的音源から前記それぞれの観測点までの時変畳み込み行列をさらに求め、
前記ビームフォーミング係数行列を、前記非目的音源から前記それぞれの観測点までの時変畳み込み行列と前記ビームフォーミング係数行列との積が0となるように調整する、請求項に記載の移動音源からの音の抽出方法。
Further find the location of the non-target sound source,
A time-variant convolution matrix from the non-target sound source to the respective observation points, which converts the sound source signal vector of the non-target sound source into an observation signal vector at the respective observation points as a function of the position of the non-target sound source; Seeking
3. The mobile sound source according to claim 2 , wherein the beam forming coefficient matrix is adjusted so that a product of a time-varying convolution matrix from the non-target sound source to each observation point and the beam forming coefficient matrix becomes zero. Sound extraction method.
複数の観測点における音の観測信号ベクトルを取得する音データ取得部と、
音源の位置を検出する位置検出部と、
前記音源の位置の関数として、前記音源の音源信号ベクトルを複数の観測点のそれぞれの観測点における観測信号ベクトルに変換する、前記音源から前記それぞれの観測点までの時変畳み込み行列を格納する時変畳み込み行列格納部と、
前記複数の観測点における音の観測信号ベクトルおよび前記音源から前記それぞれの観測点までの時変畳み込み行列から目的音源の音源信号ベクトルを求める演算処理部と、を備える移動音源からの音の抽出装置。
A sound data acquisition unit for acquiring sound observation signal vectors at a plurality of observation points;
A position detector for detecting the position of the sound source;
When storing a time-varying convolution matrix from the sound source to each observation point, converting the sound source signal vector of the sound source into an observation signal vector at each observation point of a plurality of observation points as a function of the position of the sound source A convolution matrix storage unit;
An apparatus for extracting sound from a moving sound source, comprising: an arithmetic processing unit that obtains a sound source signal vector of a target sound source from observation signal vectors of sound at the plurality of observation points and a time-varying convolution matrix from the sound source to the respective observation points .
前記位置検出部が、前記目的音源の位置を検出し、
前記時変畳み込み行列格納部が、前記目的音源の位置の関数として、前記目的音源の音源信号ベクトルを前記それぞれの観測点における観測信号ベクトルに変換する、前記目的音源から前記それぞれの観測点までの時変畳み込み行列を格納し、
前記演算処理部が、前記それぞれの観測点における観測信号ベクトルを前記目的音源の音源信号ベクトルに変換するビームフォーミング係数行列を、前記目的音源から前記それぞれの観測点までの時変畳み込み行列の擬似逆行列として求め、前記それぞれの観測点における観測信号ベクトルおよび前記ビームフォーミング係数行列から前記目的音源の音源信号ベクトルを求める、請求項4に記載の移動音源からの音の抽出装置。
The position detection unit detects the position of the target sound source;
The time-varying convolution matrix storage unit converts the sound source signal vector of the target sound source into an observation signal vector at the respective observation points as a function of the position of the target sound source, from the target sound source to the respective observation points. Store the time-varying convolution matrix
The arithmetic processing unit generates a beamforming coefficient matrix for converting an observation signal vector at each observation point into a sound source signal vector of the target sound source, and a pseudo inverse of a time-varying convolution matrix from the target sound source to each observation point. 5. The apparatus for extracting sound from a moving sound source according to claim 4, wherein a sound source signal vector of the target sound source is obtained from an observation signal vector at each observation point and the beam forming coefficient matrix.
The position detection unit further detects the position of a non-target sound source;
the time-varying convolution matrix storage unit further stores, as a function of the position of the non-target sound source, a time-varying convolution matrix from the non-target sound source to each observation point, which converts the sound source signal vector of the non-target sound source into the observation signal vector at the respective observation points; and
the arithmetic processing unit adjusts the beamforming coefficient matrix so that the product of the time-varying convolution matrix from the non-target sound source to the respective observation points and the beamforming coefficient matrix becomes zero: the apparatus for extracting sound from a moving sound source according to claim 5.
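The null constraint (the product of the non-target convolution matrix and the beamforming matrix being zero) can be realized by inverting the target and non-target transfer matrices jointly. The construction below is one standard way to satisfy such a constraint, offered as a sketch under assumed shapes, not as the patent's actual adjustment procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
L, M = 32, 6
H_t = rng.standard_normal((M * L, L))   # target-to-mics convolution blocks
H_i = rng.standard_normal((M * L, L))   # non-target (interferer) blocks

# Stack both transfer matrices and invert jointly; the rows of the
# pseudo-inverse corresponding to the target yield a beamformer G with
# G @ H_t = I (distortionless target) and G @ H_i = 0 (null on interferer).
A = np.hstack([H_t, H_i])
G = np.linalg.pinv(A)[:L]

assert np.allclose(G @ H_t, np.eye(L))
assert np.allclose(G @ H_i, 0.0)
```

This requires the stacked matrix A to have full column rank, i.e. enough observation points (here M·L = 192 rows against 2L = 64 columns) to separate the two sources.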
JP2008034445A 2007-03-02 2008-02-15 Method and apparatus for extracting sound from moving sound source Expired - Fee Related JP5139111B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US90443307P 2007-03-02 2007-03-02
US60/904,433 2007-03-02

Publications (2)

Publication Number Publication Date
JP2008219884A JP2008219884A (en) 2008-09-18
JP5139111B2 true JP5139111B2 (en) 2013-02-06

Family

ID=39855168

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2008034445A Expired - Fee Related JP5139111B2 (en) 2007-03-02 2008-02-15 Method and apparatus for extracting sound from moving sound source

Country Status (1)

Country Link
JP (1) JP5139111B2 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101254989B1 (en) 2011-10-14 2013-04-16 한양대학교 산학협력단 Dual-channel digital hearing-aids and beamforming method for dual-channel digital hearing-aids
KR102362121B1 (en) 2015-07-10 2022-02-11 삼성전자주식회사 Electronic device and input and output method thereof
EP3131311B1 (en) * 2015-08-14 2019-06-19 Nokia Technologies Oy Monitoring
WO2020121545A1 (en) * 2018-12-14 2020-06-18 日本電信電話株式会社 Signal processing device, signal processing method, and program
CN110530510B (en) * 2019-09-24 2021-01-05 西北工业大学 Method for measuring sound source radiation sound power by utilizing linear sound array beam forming

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3514714B2 (en) * 2000-08-21 2004-03-31 日本電信電話株式会社 Sound collection method and device
JP2006270903A (en) * 2005-03-22 2006-10-05 Nittobo Acoustic Engineering Co Ltd Nonlinear beam forming by microphone array of arbitrary arrangement
JP2006332736A (en) * 2005-05-23 2006-12-07 Yamaha Corp Microphone array apparatus
JP4760160B2 (en) * 2005-06-29 2011-08-31 ヤマハ株式会社 Sound collector
JP4760249B2 (en) * 2005-09-13 2011-08-31 ヤマハ株式会社 Speaker array device

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11310592B2 (en) 2015-04-30 2022-04-19 Shure Acquisition Holdings, Inc. Array microphone system and method of assembling the same
US11678109B2 (en) 2015-04-30 2023-06-13 Shure Acquisition Holdings, Inc. Offset cartridge microphones
US11832053B2 (en) 2015-04-30 2023-11-28 Shure Acquisition Holdings, Inc. Array microphone system and method of assembling the same
US11477327B2 (en) 2017-01-13 2022-10-18 Shure Acquisition Holdings, Inc. Post-mixing acoustic echo cancellation systems and methods
US11800281B2 (en) 2018-06-01 2023-10-24 Shure Acquisition Holdings, Inc. Pattern-forming microphone array
US11523212B2 (en) 2018-06-01 2022-12-06 Shure Acquisition Holdings, Inc. Pattern-forming microphone array
US11770650B2 (en) 2018-06-15 2023-09-26 Shure Acquisition Holdings, Inc. Endfire linear array microphone
US11297423B2 (en) 2018-06-15 2022-04-05 Shure Acquisition Holdings, Inc. Endfire linear array microphone
US11310596B2 (en) 2018-09-20 2022-04-19 Shure Acquisition Holdings, Inc. Adjustable lobe shape for array microphones
US11438691B2 (en) 2019-03-21 2022-09-06 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality
US11778368B2 (en) 2019-03-21 2023-10-03 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality
US11303981B2 (en) 2019-03-21 2022-04-12 Shure Acquisition Holdings, Inc. Housings and associated design features for ceiling array microphones
US11558693B2 (en) 2019-03-21 2023-01-17 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality
US11445294B2 (en) 2019-05-23 2022-09-13 Shure Acquisition Holdings, Inc. Steerable speaker array, system, and method for the same
US11800280B2 (en) 2019-05-23 2023-10-24 Shure Acquisition Holdings, Inc. Steerable speaker array, system and method for the same
US11688418B2 (en) 2019-05-31 2023-06-27 Shure Acquisition Holdings, Inc. Low latency automixer integrated with voice and noise activity detection
US11302347B2 (en) 2019-05-31 2022-04-12 Shure Acquisition Holdings, Inc. Low latency automixer integrated with voice and noise activity detection
US11750972B2 (en) 2019-08-23 2023-09-05 Shure Acquisition Holdings, Inc. One-dimensional array microphone with improved directivity
US11297426B2 (en) 2019-08-23 2022-04-05 Shure Acquisition Holdings, Inc. One-dimensional array microphone with improved directivity
US11552611B2 (en) 2020-02-07 2023-01-10 Shure Acquisition Holdings, Inc. System and method for automatic adjustment of reference gain
US11706562B2 (en) 2020-05-29 2023-07-18 Shure Acquisition Holdings, Inc. Transducer steering and configuration systems and methods using a local positioning system
US11785380B2 (en) 2021-01-28 2023-10-10 Shure Acquisition Holdings, Inc. Hybrid audio beamforming system

Also Published As

Publication number Publication date
JP2008219884A (en) 2008-09-18

Similar Documents

Publication Publication Date Title
JP5139111B2 (en) Method and apparatus for extracting sound from moving sound source
US8385562B2 (en) Sound source signal filtering method based on calculated distances between microphone and sound source
Ward et al. Particle filter beamforming for acoustic source localization in a reverberant environment
EP3484184A1 (en) Acoustic field formation device, method, and program
US20130082875A1 (en) Processing Signals
JP4812302B2 (en) Sound source direction estimation system, sound source direction estimation method, and sound source direction estimation program
WO2016179211A1 (en) Coprime microphone array system
EP1856948A1 (en) Position-independent microphone system
Nakamura et al. A real-time super-resolution robot audition system that improves the robustness of simultaneous speech recognition
KR102191736B1 (en) Method and apparatus for speech enhancement with artificial neural network
Padois Acoustic source localization based on the generalized cross-correlation and the generalized mean with few microphones
JP6763332B2 (en) Sound collectors, programs and methods
Padois et al. On the use of modified phase transform weighting functions for acoustic imaging with the generalized cross correlation
KR101086304B1 (en) Signal processing apparatus and method for removing reflected wave generated by robot platform
CN103688187A (en) Sound source localization using phase spectrum
Padois et al. On the use of geometric and harmonic means with the generalized cross-correlation in the time domain to improve noise source maps
JPH1141687A (en) Signal processing unit and signal processing method
Hosseini et al. Time difference of arrival estimation of sound source using cross correlation and modified maximum likelihood weighting function
Salvati et al. Acoustic source localization using a geometrically sampled grid SRP-PHAT algorithm with max-pooling operation
JP2008089312A (en) Signal arrival direction estimation apparatus and method, signal separation apparatus and method, and computer program
Jung et al. Distance estimation of a sound source using the multiple intensity vectors
JP6433630B2 (en) Noise removing device, echo canceling device, abnormal sound detecting device, and noise removing method
JP3862685B2 (en) Sound source direction estimating device, signal time delay estimating device, and computer program
EP3757598A1 (en) In device interference mitigation using sensor fusion
JP2004279845A (en) Signal separating method and its device

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20101126

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20120620

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20120626

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120809

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20121023

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20121115

R150 Certificate of patent or registration of utility model

Ref document number: 5139111

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20151122

Year of fee payment: 3

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

LAPS Cancellation because of no payment of annual fees