JP5294603B2 - Acoustic signal estimation device, acoustic signal synthesis device, acoustic signal estimation synthesis device, acoustic signal estimation method, acoustic signal synthesis method, acoustic signal estimation synthesis method, program using these methods, and recording medium - Google Patents


Info

Publication number
JP5294603B2
Authority
JP
Japan
Prior art keywords
band
signal
sound source
sound
band signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2007259797A
Other languages
Japanese (ja)
Other versions
JP2009089315A (en)
Inventor
Takehiro Moriya (守谷 健弘)
Noboru Harada (原田 登)
Yu Kamamoto (鎌本 優)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2007259797A
Publication of JP2009089315A
Application granted
Publication of JP5294603B2
Legal status: Active

Description

The present invention relates to an acoustic signal estimation device, an acoustic signal synthesis device, and an acoustic signal estimation/synthesis device that estimate the position or direction, intensity, and phase of sound sources from multi-channel acoustic signals and synthesize the acoustic signal at an arbitrary position, and also to the corresponding acoustic signal estimation method, acoustic signal synthesis method, and acoustic signal estimation/synthesis method, programs using these methods, and recording media.

Techniques that pick up three-dimensional acoustic signals with a plurality of microphones in order to separate sound sources or suppress noise are well known. The position of a sound source can be obtained with sensors, and individual sounds can be separated and collected with a microphone array. Known means for this include the SAFIA method (Non-patent document 1) and the CSCC method (Non-patent document 2).
[Non-patent document 1] Mariko Aoki, Yoshikazu Yamaguchi, Ken'ichi Furuya, Akitoshi Kataoka, "Separation and extraction of closely located sound sources under high noise using the sound source separation method SAFIA," IEICE Transactions A, Vol. J88-A, No. 4, pp. 468-479, 2005.
[Non-patent document 2] Kyosuke Matsumoto, Nobutaka Ono, Shigeki Sagayama, "A study of noise suppression by the complex spectrum circle centroid (CSCC) method with a phase-difference constraint," Proceedings of the Acoustical Society of Japan, 3-1-11, pp. 499-500, 2006.

In general, a plurality of microphones are installed at positions distant from the sound sources and pick up sound continuously. The positions and number of the sound sources, however, are not known and may vary over time. In such a case, methods that separate sound sources under fixed parameter assumptions cannot produce the sound that would be picked up at an arbitrary position.
The object of the present invention is to solve these problems by providing a method that estimates the sound sources from the sound picked up by a plurality of microphones and synthesizes the sound at an arbitrary position.

The acoustic signal estimation device of the present invention comprises a band dividing unit and a sound source estimation unit. The band dividing unit divides the multi-channel acoustic signals picked up by a plurality of microphones into predetermined frequency bands for each channel to generate band signals. The sound source estimation unit estimates the position or direction, intensity, and phase of the sound sources for each frequency band, and obtains the residual band signal of each channel by removing the estimated source signals from that channel's band signal. That is, for each frequency band in which one or more sound sources could be estimated, the residual band signal of each channel is obtained by subtracting the signals from the sources from the band signal; for each frequency band in which no sound source could be estimated, the band signal of each channel is used as its residual band signal.

The acoustic signal synthesis device of the present invention comprises a band signal component estimation unit, a band signal component addition unit, and a band integration unit, and takes as input the position or direction of each sound source, its intensity and phase for each frequency band, the residual band signal of each channel, and the position at which sound is to be synthesized. The band signal component estimation unit estimates, from the position or direction of each sound source and its per-band intensity and phase, the band signal arriving from each sound source at the designated position. The band signal component addition unit obtains the band signal at the designated position by weighted addition of the estimated band signals from the sound sources and the residual band signals of the channels. The band integration unit converts the band signal at the designated position into a time-domain signal.

The acoustic signal estimation/synthesis device of the present invention comprises the acoustic signal estimation device described above, a recording unit, and the acoustic signal synthesis device described above. The recording unit records the position or direction of each sound source output by the acoustic signal estimation device, its intensity and phase for each frequency band, and the residual band signal of each channel. The acoustic signal synthesis device takes as input the recorded position or direction of each estimated sound source, its per-band intensity and phase, the residual band signal of each channel, and the position at which the picked-up sound is to be synthesized.
Note that the acoustic signal estimation device or the acoustic signal synthesis device may itself contain the recording unit described above.

According to the acoustic signal estimation device of the present invention, the position or direction of one or more sound sources and their intensity and phase for each frequency band are estimated from the multi-channel acoustic signals picked up by a plurality of microphones, and the residual band signal of each channel is obtained. The sound can therefore be divided into components whose sources could be estimated and components, such as noise, whose sources could not.
According to the acoustic signal synthesis device of the present invention, for components whose source could be estimated, the sound picked up at the designated position can be computed from the position or direction of the source; for components whose source could not be estimated, it can be computed from the residual band signal of each channel (the part of the band signal whose source could not be identified). Since these are added together with weights, the sound at the designated position can be synthesized.

According to the acoustic signal estimation/synthesis device of the present invention, which combines the effects of the estimation device and the synthesis device described above, the sound at a designated position can be synthesized from the multi-channel acoustic signals picked up by a plurality of microphones.
Because of this effect it becomes possible, for example, to synthesize acoustic signals matching a free-viewpoint video system that synthesizes images and video from arbitrary viewpoints using cameras at a plurality of locations.

The principle and embodiments of the present invention are described below with reference to the drawings.
Principle
FIG. 1 shows an example with four microphones and a sound source far enough away that the propagated sound can be approximated as a plane wave. In general, the plane-wave approximation holds when the source is at least about ten times farther away than the spacing between the two most distant microphones. In FIG. 1 the four microphones 501-504 are arranged on a straight line. Assume the sound from source A arrives from the direction perpendicular to the microphone array: its wavefronts then reach all microphones together, so the input signal from source A is identical at every microphone. Assume the sound from source B arrives from a direction that is not perpendicular to the array: its arrival time then differs from microphone to microphone, and the band signal components of each frequency band differ in phase. FIG. 2 shows examples of the spectra of the sound propagated from source A at the locations 501-504 where the microphones are installed. FIG. 3 shows examples of the spectra of the sound propagated from source B at locations 501-504: FIG. 3(A), (B), (C), and (D) are the spectra of the sound from source B at locations 501, 502, 503, and 504, respectively. FIGS. 4 to 6 show the spectra of the sound from sources A and B together at locations 501, 502, and 503: FIG. 4 at location 501, FIG. 5 at location 502, and FIG. 6 at location 503.

In the present invention, the directions of the sound sources and their spectra are estimated in this way from the spectra of the sound picked up by the microphones. When spherical waves are assumed, as in FIG. 7, the position of each source is estimated instead of its direction. The spectrum of the sound from each estimated source at each microphone is then computed, and the signal that remains as a residual (the residual signal) is treated as noise whose source cannot be identified. From the estimated source positions and spectra, the spectrum of the sound from each source at the position where the acoustic waveform is desired (the designated position) is obtained. The spectrum of the residual signal at the designated position is obtained by weighted addition of the residual signals of the microphones near the designated position, with weights that take account of the distance between the designated position and each microphone. Adding these together synthesizes the acoustic waveform at the designated position.

Any existing method may be used to estimate the direction and spectrum of a sound source; one example uses the phase differences of the sound picked up by the microphones. With two microphones, for instance, if a clear peak appears at some lag of the cross-correlation function, one can conclude that there is a single sound source. With two or more microphones one can, for example, solve simultaneous equations under the assumption of a single source, or evaluate the phase differences in the frequency domain, and judge whether the result is consistent with a single source. In general, with two or more microphones, the direction of a source can be estimated from the phase differences of the picked-up sound in the individual frequency bands.
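As an illustration of the cross-correlation idea mentioned above, the following sketch recovers a single source's inter-microphone delay from the lag of the correlation peak. The two-microphone signal here is invented for the example; nothing in it comes from the patent.

```python
import numpy as np

# Invented two-microphone setup: one source reaches mic 2 with a
# delay of `true_delay` samples relative to mic 1.
rng = np.random.default_rng(0)
true_delay = 23
src = rng.standard_normal(4096)

mic1 = src
mic2 = np.concatenate([np.zeros(true_delay), src[:-true_delay]])

# Cross-correlate the two channels. A single sharp peak at one lag
# suggests a single dominant source, as the text describes; the lag
# of the peak is the estimated inter-microphone delay.
corr = np.correlate(mic2, mic1, mode="full")
lags = np.arange(-len(mic1) + 1, len(mic1))
est_delay = lags[np.argmax(corr)]
print(est_delay)
```

With the delay known (and the microphone spacing and speed of sound), the arrival direction of the plane wave follows from simple geometry.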

The SAFIA method assumes that each individual band contains the component of only one dominant source, and obtains the source positions and the sound from each source. A source spectrum has strong parts and weak parts, and it is comparatively rare for the dominant components of a given band to come from several sources at once. For example, as shown in FIGS. 4 to 6, the spectrum of the sound from source A and the spectrum of the sound from source B occupy mostly different frequencies (for example, bands of interest a and c in FIGS. 4 to 6). After band division, therefore, a given band is dominated by the sound of either source A or source B, with almost no contribution from the other. The SAFIA method exploits this property.
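The band-dominance idea behind SAFIA can be shown with a toy sketch. This illustrates only the per-band selection principle, not the published SAFIA algorithm, and the magnitudes are invented:

```python
import numpy as np

# Toy illustration: each frequency band is attributed entirely to
# whichever of two channels has the larger magnitude there, relying
# on the observation that a given band is usually dominated by a
# single source.
X1 = np.array([5.0, 0.2, 3.0, 0.1])   # band magnitudes, channel 1
X2 = np.array([0.3, 4.0, 0.2, 2.5])   # band magnitudes, channel 2

dominant = np.where(X1 >= X2, 1, 2)    # winning channel per band
print(dominant)
```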

In the CSCC method, when the input spectrum from the other sources is constant, or has been converted so as to be constant, the direction of a single source and its signal component are estimated separately from the arrangement, on the complex plane, of the spectra that the source produces at the plurality of microphones. When there is almost no component from source A, as in the example of band of interest a, or when the signals can be converted (for example, by delaying each signal) so that the component from source A is common to all microphones, the direction of source B can be estimated accurately from the source-B components at locations 501-504. Note that the estimation accuracy of a source position depends on how much other sound is present. In band of interest c there is almost no component from source B, so the direction of source A can be estimated accurately from the source-A components at locations 501-504; since the spectrum is the same at every location, source A is seen to lie in the direction perpendicular to the microphone array. In band of interest b both the source-A and the source-B components are strong, so simple separation is difficult.
In that case, the components from source A and source B are estimated using the source positions estimated in bands where the direction estimates are highly reliable (for example, bands of interest a and c). In this example the component from source A does not depend on the microphone location and can therefore be treated as a constant.

There are also techniques that separate a plurality of sound sources from the signals of at least as many microphones as there are sources (Japanese Patent Application Laid-Open No. 2006-243664). Moreover, if the band is divided, the frequency components produced by each source are unevenly distributed across bands, so separation becomes possible even with fewer microphones (Japanese Patent Application Laid-Open No. 2007-198977).

The present invention likewise estimates the direction (or position) and spectrum of each source by separating the signals picked up by a plurality of microphones on the assumption that several sources are present. It therefore has the use of the above separation methods, or similar ones, in common with the prior art, and any of them may be chosen as appropriate. The object of the present invention, however, is to synthesize the sound at an arbitrary position, not to separate the sound of each source: what matters is that the result sounds like the sound at the designated position, not that the sources are separated exactly. Accordingly, the present invention estimates the source positions or directions and the source band signals (intensity and phase for each frequency band) as far as any of the above methods allows, and treats what remains as a residual signal whose source position cannot be identified. A residual signal is obtained for each microphone. The direction of each source and its source band signal (complex spectrum) for each frequency band, and the residual signal of each microphone (channel) for each frequency band (the residual band signal), are then recorded.
To synthesize the sound at a designated position, the band signal arriving from each source at that position is estimated from the source positions or directions and the source band signals (intensity and phase for each frequency band). The band signal at the designated position is then obtained by weighted addition of these estimated band signals and the residual band signals of the channels. Finally, the band signal at the designated position is converted into a time-domain signal.

[First Embodiment]
FIG. 8 shows an example of the functional configuration of the acoustic signal estimation/synthesis device of the present invention, and FIG. 9 shows an example of its processing flow. The acoustic signal estimation/synthesis device 100 of the present invention comprises a band dividing unit 110, a sound source estimation unit 120, a recording unit 130, a band signal component estimation unit 140, a band signal component addition unit 150, and a band integration unit 160.
The band dividing unit 110 divides the K-channel acoustic signals x_1(t), x_2(t), …, x_K(t) picked up by K microphones (K is an integer of 2 or more) into predetermined frequency bands ω for each channel, generating the band signals X_1(ω), X_2(ω), …, X_K(ω) (S110). The acoustic signal x_1(t) is one sample value (a scalar) in a frame of T samples, with t taking the values 0, …, T−1. From such an acoustic signal x_1(t), the band signal X_1(ω) of each predetermined frequency band is obtained. The band signal X_1(ω) is, for example, a complex spectrum; it may also be a band-divided complex signal, but the description below treats it as a complex spectrum. Each time-domain frame of T points is complex-Fourier-transformed, and the T/2 complex Fourier coefficients are taken as the band signals X_1(ω):

X_1(ω) = Σ_{t=0}^{T−1} x_1(t) e^{−j2πωt/T}

where ω = 0, …, T/2, j is the imaginary unit, and π is the circular constant.
The band signal X_1(ω) gives the amplitude and phase, in frequency band ω, of the signal at the position of the first microphone (the first channel). When the sampling frequency is f [Hz], X_1(ω) can be regarded as a band signal with center frequency ωf/T [Hz]. The input to the band dividing unit 110 may instead be an analog acoustic signal that is sampled inside the band dividing unit 110 to give the acoustic signal x_1(t); in either case the output is the same.
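The band division S110 amounts to one complex Fourier transform per channel per frame. A minimal sketch, assuming NumPy and a (K, T) frame layout of my own choosing, not the patent's implementation:

```python
import numpy as np

def band_divide(x):
    """Split K-channel time signals into complex band signals.

    x : array of shape (K, T), one frame of T samples per channel.
    Returns X of shape (K, T//2 + 1): complex Fourier coefficients
    X_k(w) for w = 0, ..., T/2, i.e. amplitude and phase per band.
    """
    # rfft computes sum_t x(t) * exp(-j*2*pi*w*t/T) for w = 0..T/2,
    # matching the equation in the text for real-valued input frames.
    return np.fft.rfft(x, axis=1)

# Tiny example: 2 channels, an 8-sample frame.
x = np.vstack([np.cos(2 * np.pi * np.arange(8) / 8),
               np.sin(2 * np.pi * np.arange(8) / 8)])
X = band_divide(x)
print(X.shape)  # (2, 5)
```

Each row of X holds one channel's band signals; the magnitude and angle of X[k, w] are the intensity and phase referred to in the text.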

The sound source estimation unit 120 uses an existing method to estimate, for each frequency band ω, the positions or directions D_{ω,1}, D_{ω,2}, …, D_{ω,Mω} of the sources and the source band signals S_{ω,1}, S_{ω,2}, …, S_{ω,Mω} (Mω is the number of sources in frequency band ω, an integer of 0 or more). The source band signal S_{ω,1} is the intensity and phase information (for example, a complex spectrum) needed to compute the signal that the sound propagated from the first source in frequency band ω produces near a microphone. For example, if D_{ω,1} indicates the position of the source and the sound is treated as a spherical wave, S_{ω,1} may be a complex spectrum giving the intensity and phase at the source position. If D_{ω,1} indicates the direction of the source and the sound is approximated as a plane wave, S_{ω,1} may be a complex spectrum giving the intensity and phase at some reference position (which need not be the source position).
In the course of this estimation, the signal U_{k,ω,m} from each source at the position of each microphone is also obtained (k is the microphone number, an integer from 1 to K). U_{k,ω,m} is the signal from the m-th source of frequency band ω at the position of the k-th microphone (m is the source number assigned within frequency band ω, an integer from 1 to Mω). For example, in the plane-wave approximation, the inner product of the vector from the position of S_{ω,1} to the position of microphone k with the unit vector of the propagation direction (that is, the displacement along the propagation direction) gives the phase difference between the position of S_{ω,1} and the position of microphone k, and the signal obtained by shifting the phase of S_{ω,1} by that phase difference may be taken as the signal U_{k,ω,1} at the position of microphone k.
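The plane-wave phase shift just described can be sketched as follows. The function name, the reference-position convention, and the example geometry are assumptions made for illustration:

```python
import numpy as np

def propagate_plane_wave(S, src_dir, ref_pos, mic_pos, freq_hz, c=343.0):
    """Phase-shift a source band signal S (complex) from the reference
    position to a microphone position under a plane-wave model.

    S       : complex band signal (intensity and phase) at ref_pos
    src_dir : unit vector of the propagation direction
    ref_pos : position where the phase of S is defined
    mic_pos : microphone position
    freq_hz : center frequency of the band
    c       : speed of sound [m/s]
    """
    # Displacement along the propagation direction (the inner product
    # mentioned in the text), converted to travel time and then phase.
    d = np.dot(np.asarray(mic_pos) - np.asarray(ref_pos), src_dir)
    phase = -2 * np.pi * freq_hz * d / c
    return S * np.exp(1j * phase)   # U_{k,w,m}: same magnitude, shifted phase

# A mic half a wavelength further along the path sees S inverted:
# at c = 343 m/s and f = 343 Hz the wavelength is 1 m.
U = propagate_plane_wave(1.0 + 0j, np.array([1.0, 0.0, 0.0]),
                         np.array([0.0, 0.0, 0.0]),
                         np.array([0.5, 0.0, 0.0]), freq_hz=343.0)
print(U)  # approximately -1
```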

For each frequency band ω in which one or more sources could be estimated, the residual band signals N_1(ω), N_2(ω), …, N_K(ω) are obtained by removing the estimated source signals from the band signal of each channel:

N_k(ω) = X_k(ω) − Σ_{m=1}^{Mω} U_{k,ω,m}

For each frequency band ω in which no source could be estimated, N_k(ω) = X_k(ω) is set for all k (microphones) and ω (frequency bands), so that the band signal X_k(ω) of each channel becomes its residual band signal (S120). In other words, for each channel, the residual band signals N_1(ω), N_2(ω), …, N_K(ω) are obtained by subtracting from the band signal the signals from the sources that could be estimated at the microphone position. Because the present invention treats signals whose source position could not be estimated as residual band signals in this way, there is no need to force such signals to be assigned to one of the sources.
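A minimal sketch of the residual computation S120, under the assumption (mine, for illustration) that the per-source microphone signals U_{k,ω,m} are stacked in one array, with zeros in bands where no source was estimated so that the band signal itself survives as the residual:

```python
import numpy as np

def residual_band_signals(X, U):
    """Residual band signals N_k(w) = X_k(w) - sum_m U_{k,w,m}.

    X : (K, W) complex band signals per channel
    U : (K, W, M) estimated per-source signals at each mic; bands with
        no estimated source hold zeros, so N_k(w) = X_k(w) there.
    """
    return X - U.sum(axis=2)

# 2 mics, 3 bands, 1 estimated source (illustrative numbers).
X = np.array([[1.0 + 1j, 2.0 + 0j, 0.5 + 0j],
              [1.0 - 1j, 2.0 + 0j, 0.5 + 0j]])
U = np.zeros((2, 3, 1), dtype=complex)
U[:, 1, 0] = 1.5            # a source explains part of band 1 only
N = residual_band_signals(X, U)
print(N)
```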

Whether the position or the direction of a source is estimated depends on whether the sound is assumed to be a spherical wave or a plane wave; this assumption is fixed in advance. The method used to estimate the position or direction, intensity, and phase of the sources may likewise be chosen as appropriate from the methods described above. As noted above, what matters in the present invention is not that the source positions (or directions) and spectra are estimated exactly, but that the finally synthesized sound resembles the sound at the designated position. The position or direction of each source estimated in step S120, its intensity and phase for each frequency band, and the residual band signal of each channel are recorded in the recording unit 130. The recorded information may be encoded.

帯域信号成分推定部140は、位置Pが指定されると、周波数帯域ωごとの各音源の位置または方向Dω,1,Dω,2,…,Dω,Mωと音源帯域信号Sω,1,Sω,2,…,Sω,Mωから、指定された位置Pでのすべての音源からの音を合成した帯域信号Z(ω)を推定する(S140)。例えば、周波数帯域ωごとに、位置Pでの各音源からの信号UP,ω,mを求める(mは周波数帯ωごとに付された音源の番号であり、0〜Mωの整数である)。信号UP,ω,mの求め方は、音源推定部120の各マイクの位置での音源からの信号Uk,ω,mの求め方と同じでよい。位置Pでの各音源からの信号UP,ω,mを、次のように周波数帯域ωごとに、加算すれば、帯域信号Z(ω)を求めることができる。

Figure 0005294603
The band signal component addition unit 150 obtains the band signal Y(ω) at the designated position P by weighted addition of the band signal Z(ω), which synthesizes the sounds from all the estimated sound sources, and the residual band signals N_1(ω), N_2(ω), …, N_K(ω) of the channels (S150). For example, as in the following equation, the band signal Z(ω) is multiplied by a weight of 1, and each channel's residual band signal is multiplied by a weight α_k set according to the distance between that channel's microphone and the designated position P (for example, inversely proportional to it), with the weights over all channels summing to 1:

Y(ω) = Z(ω) + Σ_{k=1}^{K} α_k N_k(ω),

where d_k is the distance between the k-th microphone and the position P.
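The weighted addition of step S150 can be sketched as follows, using the inverse-distance weighting given as an example above (the normalization and the small epsilon guard are implementation choices, not mandated by the specification):

```python
import numpy as np

def add_band_components(Z, residuals, mic_positions, position, eps=1e-9):
    """Y(w) = 1 * Z(w) + sum_k alpha_k * N_k(w).

    The weight alpha_k of each channel's residual N_k(w) is inversely
    proportional to the distance d_k between microphone k and the
    designated position, normalized so that sum_k alpha_k = 1.
    """
    position = np.asarray(position, dtype=float)
    d = np.array([np.linalg.norm(np.asarray(p, dtype=float) - position)
                  for p in mic_positions])
    inv = 1.0 / (d + eps)      # eps guards against a zero distance
    alpha = inv / inv.sum()    # weights over all channels sum to 1
    return Z + np.dot(alpha, np.asarray(residuals))
```

With two microphones equidistant from P, each residual simply receives weight 1/2.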

The band integration unit 160 converts the band signal Y(ω) at the designated position P into a time-domain signal y(t) (S160). For example, the signal y(t) is one sample value in a frame of T samples, with t taking the values 0, …, T−1.
With this configuration, the acoustic signal estimation/synthesis apparatus 100 of the present invention can separate the input into sounds whose sources could be estimated and sounds, such as noise, whose sources could not. For sounds whose sources could be estimated, the sound at the designated position P can be calculated from the position or direction of each source. For sounds whose sources could not be estimated, the sound at the designated position P can be calculated from the residual band signal of each channel (the part of the band signal for which no source could be identified). Since these are combined by weighted addition, the sound at the designated position P can be synthesized. Owing to this effect, it also becomes possible, for example, to synthesize acoustic signals for a free-viewpoint video system that synthesizes images and video from an arbitrary viewpoint using cameras at multiple locations.
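The band integration step (S160) can be sketched as follows, under the assumption that the frequency bands are the bins of a T-point DFT so that the inverse real FFT reassembles the frame; the patent leaves the band division method open, so this is one possible realization:

```python
import numpy as np

def integrate_bands(Y, T):
    """Convert the band signals Y(w) at the designated position back into
    a time-domain frame y(t), t = 0, ..., T-1.

    Here each "band" is one DFT bin, so Y has T//2 + 1 complex entries
    and the inverse real FFT yields the T-sample frame.
    """
    return np.fft.irfft(np.asarray(Y), n=T)
```

Under this assumption, band division followed by band integration is an exact round trip for a real T-sample frame.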

[Modification]
In the first embodiment, the acoustic signal estimation/synthesis apparatus 100 was described. However, the processing up to estimating the position or direction of each sound source, the sound source band signal for each frequency band, and the residual band signal of each channel may form a single apparatus (an acoustic signal estimation apparatus). Likewise, the processing from the position or direction of each sound source, the sound source band signal for each frequency band, and the residual band signal of each channel up to synthesizing the sound at the designated position P may form a single apparatus (an acoustic signal synthesis apparatus).

The acoustic signal estimation apparatus 200 comprises, for example, the band division unit 110 and the sound source estimation unit 120. The recording unit 130 may be provided inside the acoustic signal estimation apparatus 200 or outside it. The acoustic signal synthesis apparatus 300 comprises, for example, the band signal component estimation unit 140, the band signal component addition unit 150, and the band integration unit 160.
Even when the processing is divided among several apparatuses in this way so that they together form the acoustic signal estimation/synthesis apparatus, the same effects as in the first embodiment are obtained.
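The interface between the two apparatuses is exactly the information recorded in the recording unit 130. A minimal, hypothetical container for it (the class and field names are illustrative, not from the specification) might look like:

```python
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class BandEstimate:
    """Estimates for one frequency band w, as recorded by unit 130."""
    directions: List[Any]          # D_{w,m}: position or direction of source m
    source_signals: List[complex]  # S_{w,m}: intensity and phase of source m
    residuals: List[complex]       # N_k(w): residual band signal of channel k

# The estimation apparatus 200 produces one BandEstimate per band;
# this mapping is all the synthesis apparatus 300 needs as input.
Recording = Dict[float, BandEstimate]
```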

FIG. 10 shows an example of the functional configuration of a computer. The acoustic signal estimation/synthesis method, the acoustic signal estimation method, and the acoustic signal synthesis method of the present invention can be executed by a computer by loading, into the recording unit 2020 of the computer 2000, a program that causes the computer 2000 to operate as each component of the present invention, and by operating the control unit 2010, the input unit 2030, the output unit 2040, and so on. The program may be supplied to the computer by recording it on a computer-readable recording medium and reading it from the medium, or by loading a program stored on a server or the like into the computer through a telecommunication line or the like.

A diagram showing four microphones and plane-wave sound propagated from a distant sound source.
A diagram showing an example of the spectrum of the sound propagated from sound source A at locations 501 to 504.
A diagram showing an example of the spectrum of the sound propagated from sound source B at locations 501 to 504.
A diagram showing the spectra of the sounds from sound sources A and B at location 501.
A diagram showing the spectra of the sounds from sound sources A and B at location 502.
A diagram showing the spectra of the sounds from sound sources A and B at location 503.
A diagram showing four microphones and spherical-wave sound propagated from a sound source.
A diagram showing an example of the functional configuration of the acoustic signal estimation/synthesis apparatus.
A diagram showing an example of the processing flow of the acoustic signal estimation/synthesis apparatus.
A diagram showing an example of the functional configuration of a computer.

Claims (8)

1. An acoustic signal estimation apparatus comprising:
a band division unit that divides the K-channel acoustic signals picked up by K microphones (K is an integer of 2 or more) into predetermined frequency bands ω for each channel k (k = 1, 2, …, K) to generate band signals X_k(ω); and
a sound source estimation unit that, for each frequency band ω, estimates the signal U_k,ω,m from each sound source m (m = 1, 2, …, Mω, where Mω is the number of sound sources in frequency band ω) at the position of each microphone k, and obtains from the band signal X_k(ω) and the signals U_k,ω,m the residual band signal

N_k(ω) = X_k(ω) − Σ_{m=1}^{Mω} U_k,ω,m.
2. An acoustic signal synthesis apparatus whose inputs are: the position or direction D_ω,m of each sound source m (m = 1, 2, …, Mω, where Mω is the number of sound sources in frequency band ω) for each frequency band ω; the intensity and phase S_ω,m for each frequency band ω associated with the position or direction of that sound source; the residual band signals N_k(ω), signals not associated with the position or direction of any sound source and associated with each channel k (k = 1, 2, …, K, where K is the number of channels) and frequency band ω; and the position at which sound is to be synthesized, the apparatus comprising:
a band signal component estimation unit that, for each frequency band ω, estimates a band signal Z(ω) synthesizing the signals from the sound sources at the designated position, from the position or direction D_ω,m of each sound source and the intensity and phase S_ω,m for each frequency band;
a band signal component addition unit that, with α_k denoting a weight according to the distance between each microphone k and the designated position, obtains for each frequency band ω, from the band signal Z(ω) and the residual band signals N_k(ω), the band signal at the designated position

Y(ω) = Z(ω) + Σ_{k=1}^{K} α_k N_k(ω); and

a band integration unit that converts the band signal Y(ω) at the designated position into a time-domain signal.
3. An acoustic signal estimation/synthesis apparatus comprising:
a band division unit that divides the K-channel acoustic signals picked up by K microphones (K is an integer of 2 or more) into predetermined frequency bands ω for each channel k (k = 1, 2, …, K) to generate band signals X_k(ω);
a sound source estimation unit that, for each frequency band ω, estimates the position or direction D_ω,m and the intensity and phase S_ω,m of each sound source m (m = 1, 2, …, Mω, where Mω is the number of sound sources in frequency band ω) and the signal U_k,ω,m from each sound source m at the position of each microphone k, and obtains from the band signal X_k(ω) and the signals U_k,ω,m the residual band signal

N_k(ω) = X_k(ω) − Σ_{m=1}^{Mω} U_k,ω,m;

a recording unit that records the position or direction D_ω,m of each sound source, the intensity and phase S_ω,m for each frequency band ω, and the residual band signal N_k(ω) of each channel;
a band signal component estimation unit that, for each frequency band ω, estimates a band signal Z(ω) synthesizing the signals from the sound sources at the designated position, from the position or direction D_ω,m of each sound source and the intensity and phase S_ω,m for each frequency band;
a band signal component addition unit that, with α_k denoting a weight according to the distance between each microphone k and the designated position, obtains for each frequency band ω, from the band signal Z(ω) and the residual band signals N_k(ω), the band signal at the designated position

Y(ω) = Z(ω) + Σ_{k=1}^{K} α_k N_k(ω); and

a band integration unit that converts the band signal Y(ω) at the designated position into a time-domain signal.
4. An acoustic signal estimation method comprising:
a band division step in which a band division unit divides the K-channel acoustic signals picked up by K microphones (K is an integer of 2 or more) into predetermined frequency bands ω for each channel k (k = 1, 2, …, K) to generate band signals X_k(ω); and
a sound source estimation step in which a sound source estimation unit, for each frequency band ω, estimates the signal U_k,ω,m from each sound source m (m = 1, 2, …, Mω, where Mω is the number of sound sources in frequency band ω) at the position of each microphone k, and obtains from the band signal X_k(ω) and the signals U_k,ω,m the residual band signal

N_k(ω) = X_k(ω) − Σ_{m=1}^{Mω} U_k,ω,m.
5. An acoustic signal synthesis method whose inputs are: the position or direction D_ω,m of each sound source m (m = 1, 2, …, Mω, where Mω is the number of sound sources in frequency band ω) for each frequency band ω; the intensity and phase S_ω,m for each frequency band ω associated with the position or direction of that sound source; the residual band signals N_k(ω), signals not associated with the position or direction of any sound source and associated with each channel k (k = 1, 2, …, K, where K is the number of channels) and frequency band ω; and the position at which sound is to be synthesized, the method comprising:
a band signal component estimation step in which a band signal component estimation unit, for each frequency band ω, estimates a band signal Z(ω) synthesizing the signals from the sound sources at the designated position, from the position or direction D_ω,m of each sound source and the intensity and phase S_ω,m for each frequency band;
a band signal component addition step in which a band signal component addition unit, with α_k denoting a weight according to the distance between each microphone k and the designated position, obtains for each frequency band ω, from the band signal Z(ω) and the residual band signals N_k(ω), the band signal at the designated position

Y(ω) = Z(ω) + Σ_{k=1}^{K} α_k N_k(ω); and

a band integration step in which a band integration unit converts the band signal Y(ω) at the designated position into a time-domain signal.
6. An acoustic signal estimation/synthesis method comprising:
a band division step in which a band division unit divides the K-channel acoustic signals picked up by K microphones (K is an integer of 2 or more) into predetermined frequency bands ω for each channel k (k = 1, 2, …, K) to generate band signals X_k(ω);
a sound source estimation step in which a sound source estimation unit, for each frequency band ω, estimates the position or direction D_ω,m and the intensity and phase S_ω,m of each sound source m (m = 1, 2, …, Mω, where Mω is the number of sound sources in frequency band ω) and the signal U_k,ω,m from each sound source m at the position of each microphone k, and obtains from the band signal X_k(ω) and the signals U_k,ω,m the residual band signal

N_k(ω) = X_k(ω) − Σ_{m=1}^{Mω} U_k,ω,m;

a band signal component estimation step in which a band signal component estimation unit, for each frequency band ω, estimates a band signal Z(ω) synthesizing the signals from the sound sources at the designated position, from the position or direction D_ω,m of each sound source and the intensity and phase S_ω,m for each frequency band;
a band signal component addition step in which a band signal component addition unit, with α_k denoting a weight according to the distance between each microphone k and the designated position, obtains for each frequency band ω, from the band signal Z(ω) and the residual band signals N_k(ω), the band signal at the designated position

Y(ω) = Z(ω) + Σ_{k=1}^{K} α_k N_k(ω); and

a band integration step in which a band integration unit converts the band signal Y(ω) at the designated position into a time-domain signal.
7. A program that causes a computer to execute the method according to any one of claims 4 to 6.
8. A computer-readable recording medium on which the program according to claim 7 is recorded.
JP2007259797A 2007-10-03 2007-10-03 Acoustic signal estimation device, acoustic signal synthesis device, acoustic signal estimation synthesis device, acoustic signal estimation method, acoustic signal synthesis method, acoustic signal estimation synthesis method, program using these methods, and recording medium Active JP5294603B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007259797A JP5294603B2 (en) 2007-10-03 2007-10-03 Acoustic signal estimation device, acoustic signal synthesis device, acoustic signal estimation synthesis device, acoustic signal estimation method, acoustic signal synthesis method, acoustic signal estimation synthesis method, program using these methods, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2007259797A JP5294603B2 (en) 2007-10-03 2007-10-03 Acoustic signal estimation device, acoustic signal synthesis device, acoustic signal estimation synthesis device, acoustic signal estimation method, acoustic signal synthesis method, acoustic signal estimation synthesis method, program using these methods, and recording medium

Publications (2)

Publication Number Publication Date
JP2009089315A JP2009089315A (en) 2009-04-23
JP5294603B2 true JP5294603B2 (en) 2013-09-18

Family

ID=40662053

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007259797A Active JP5294603B2 (en) 2007-10-03 2007-10-03 Acoustic signal estimation device, acoustic signal synthesis device, acoustic signal estimation synthesis device, acoustic signal estimation method, acoustic signal synthesis method, acoustic signal estimation synthesis method, program using these methods, and recording medium

Country Status (1)

Country Link
JP (1) JP5294603B2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2448289A1 (en) * 2010-10-28 2012-05-02 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for deriving a directional information and computer program product
PL2647222T3 (en) 2010-12-03 2015-04-30 Fraunhofer Ges Forschung Sound acquisition via the extraction of geometrical information from direction of arrival estimates
US10497381B2 (en) 2012-05-04 2019-12-03 Xmos Inc. Methods and systems for improved measurement, entity and parameter estimation, and path propagation effect measurement and mitigation in source signal separation
US8694306B1 (en) * 2012-05-04 2014-04-08 Kaonyx Labs LLC Systems and methods for source signal separation
US9728182B2 (en) 2013-03-15 2017-08-08 Setem Technologies, Inc. Method and system for generating advanced feature discrimination vectors for use in speech recognition
CN112599144B (en) * 2020-12-03 2023-06-06 Oppo(重庆)智能科技有限公司 Audio data processing method, audio data processing device, medium and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4616736B2 (en) * 2005-09-09 2011-01-19 日本電信電話株式会社 Sound collection and playback device

Also Published As

Publication number Publication date
JP2009089315A (en) 2009-04-23

Similar Documents

Publication Publication Date Title
EP3320692B1 (en) Spatial audio processing apparatus
TWI530201B (en) Sound acquisition via the extraction of geometrical information from direction of arrival estimates
US9622003B2 (en) Speaker localization
JP5294603B2 (en) Acoustic signal estimation device, acoustic signal synthesis device, acoustic signal estimation synthesis device, acoustic signal estimation method, acoustic signal synthesis method, acoustic signal estimation synthesis method, program using these methods, and recording medium
EP2360685B1 (en) Noise suppression
KR101456866B1 (en) Method and apparatus for extracting the target sound signal from the mixed sound
JP6019969B2 (en) Sound processor
KR20070036777A (en) Audio signal dereverberation
JP2005538633A (en) Calibration of the first and second microphones
JP2008236077A (en) Target sound extracting apparatus, target sound extracting program
EP1899954A1 (en) System and method for extracting acoustic signals from signals emitted by a plurality of sources
JP6591477B2 (en) Signal processing system, signal processing method, and signal processing program
CN111863015A (en) Audio processing method and device, electronic equipment and readable storage medium
JP2007006253A (en) Signal processor, microphone system, and method and program for detecting speaker direction
KR20090037845A (en) Method and apparatus for extracting the target sound signal from the mixed sound
KR20080000478A (en) Method and apparatus for removing noise from signals inputted to a plurality of microphones in a portable terminal
KR20170124279A (en) Method and Apparatus for DEMON Processing in order that Removal of External Target Noise When measuring underwater radiated noise
JP2009020472A (en) Sound processing apparatus and program
JP4568193B2 (en) Sound collecting apparatus and method, program and recording medium
JP2006227328A (en) Sound processor
JP4886616B2 (en) Sound collection device, sound collection method, sound collection program using the method, and recording medium
JP4928376B2 (en) Sound collection device, sound collection method, sound collection program using the method, and recording medium
JP5143802B2 (en) Noise removal device, perspective determination device, method of each device, and device program
JP2006178333A (en) Proximity sound separation and collection method, proximity sound separation and collecting device, proximity sound separation and collection program, and recording medium
JP2005062096A (en) Detection method of speaker position, system, program and record medium

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20100818

RD03 Notification of appointment of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7423

Effective date: 20110812

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20120424

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20120501

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120625

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20130319

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20130417

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20130604

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20130611

R150 Certificate of patent or registration of utility model

Ref document number: 5294603

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350