JP2019047478A

JP2019047478A - Acoustic signal processing apparatus, acoustic signal processing method, and acoustic signal processing program

Info

Publication number: JP2019047478A
Application number: JP2018027645A
Authority: JP
Inventors: Kimitaka Tsutsumi; Kenichi Noguchi; Hideaki Takada; Yoichi Haneda
Original assignee: Nippon Telegraph and Telephone Corp; University of Electro Communications NUC
Current assignee: Nippon Telegraph and Telephone Corp; University of Electro Communications NUC
Priority date: 2017-09-04
Filing date: 2018-02-20
Publication date: 2019-03-22
Anticipated expiration: 2038-02-20
Also published as: JP6865440B2

Abstract

PROBLEM TO BE SOLVED: To reproduce sound having a directional pattern by using a virtual sound source. SOLUTION: An acoustic signal processing apparatus 1 comprises: a focal point position determination part 12 that acquires a plurality of initial focal point coordinates, the coordinates of a virtual sound source, and the direction of a directional pattern, and that, for each of the initial focal point coordinates, determines focal point coordinates that take the directional pattern into account by multiplying the initial focal point coordinates, on the basis of the coordinates of the virtual sound source, by a rotation matrix specified from the direction of the directional pattern; a filter coefficient arithmetic part 13 that, for each speaker of a linear speaker array, calculates an impulse response vector to be convolved with an input acoustic signal from each of the focal point coordinates determined by the focal point position determination part 12; and a convolution operation part 14 that, for each speaker of the linear speaker array, convolves the input acoustic signal with the impulse response vector corresponding to the speaker and outputs an output acoustic signal to the speaker. SELECTED DRAWING: Figure 1

Description

The present invention relates to an acoustic signal processing apparatus, an acoustic signal processing method, and an acoustic signal processing program that convert an input acoustic signal into output acoustic signals for the individual speakers of a linear speaker array in order to realize a virtual sound source.

At public viewings and concerts, voice and music are played back from multiple speakers installed in the venue. In recent years, efforts have been made to achieve more immersive sound reproduction than before by creating virtual sound sources in the screening space. In particular, a speaker array made up of many speakers arranged in a straight line is used to generate a virtual sound source that appears to project in front of the speakers, close to the audience seats, which gives a strong sense of presence.

In addition, musical instruments and the human voice generally radiate different amounts of power in different directions. Reproducing this direction-dependent difference in acoustic power (directivity) when generating a virtual sound source in the screening space is therefore expected to yield even more immersive acoustic content.

One sound reproduction technique for creating a virtual sound source in the screening space is wave field synthesis (Patent Document 1). In the method based on Patent Document 1, the sound at the recording location is picked up by microphones installed at multiple points, the directions of arrival of the sound in the vertical and horizontal directions are analyzed, and the sound field of the recording venue is physically reproduced using multiple speakers installed in the screening space.

There is a technique that assumes an acoustic sink as the virtual sound source and supplies the speaker array with driving signals derived from the Rayleigh integral of the first kind, which makes it possible to create a virtual sound image in front of the speakers (Non-Patent Document 1). There is also a technique that uses a linear speaker array to give a virtual sound source generated in the screening space a primitive directivity such as that of a dipole (Non-Patent Document 2).

Multipole sound sources are one way to control the directivity of sound radiated from speakers (Non-Patent Document 3). A multipole sound source expresses directivity as a combination of primitive directivities such as dipoles and quadrupoles, and each primitive directivity is realized by a combination of closely spaced omnidirectional point sources (monopole sources) of differing polarity. Non-Patent Document 3 discloses that the direction of the directivity is rotated by rotating the positions of these monopole sources.

Patent Document 1: JP 2011-244306 A

Non-Patent Document 1: Sascha Spors, Hagen Wierstorf, Matthias Geier, and Jens Ahrens, "Physical and Perceptual Properties of Focused Sources in Wave Field Synthesis," 127th Audio Engineering Society Convention, paper 7914, October 2009.

Non-Patent Document 2: J. Ahrens and S. Spors, "Implementation of Directional Sources in Wave Field Synthesis," Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 66-69, 2007.

Non-Patent Document 3: Yoichi Haneda, Kenichi Furuya, and Suehiro Shimauchi, "Directivity synthesis using multipole sound source based on spherical harmonics expansion," Journal of the Acoustical Society of Japan, vol. 69, no. 11, pp. 577-588, 2013.

However, none of these documents discloses or suggests a technique for directional sound reproduction using a virtual sound source.

The technique disclosed in Patent Document 1 faithfully reproduces the acoustic signal at the recording location and therefore reproduces a virtual sound source with high fidelity, but it requires a microphone array in addition to a speaker array, which increases the scale of the equipment. Moreover, because the invention is designed to play back recorded sound faithfully, it is difficult to edit the content, for example by adding as special effects sounds that do not exist in everyday life, as is typical in films. In addition, because the acoustic signals emitted by multiple sound sources reach the microphones simultaneously, editing operations such as extracting an individual sound source and adjusting its position or sound quality are extremely difficult.

The technique disclosed in Non-Patent Document 1 does not require a microphone array to generate a virtual sound source; it can create a virtual sound source by generating acoustic signals for multiple channels from a monaural source recorded with an ordinary microphone. However, because this technique assumes an omnidirectional radiation characteristic for the virtual sound source, it cannot reproduce directional sound using the virtual sound source.

The technique disclosed in Non-Patent Document 2, in contrast, can give the sound source directivity, but it cannot generate a virtual sound source that projects in front of the speakers.

Thus, with conventional methods that generate a virtual sound source from a monaural source so as to be suitable for content editing, it has not been possible to give directivity to a virtual sound source that projects in front of the speakers.

Accordingly, an object of the present invention is to provide an acoustic signal processing apparatus, an acoustic signal processing method, and an acoustic signal processing program capable of realizing directional sound using a virtual sound source generated from a monaural source so as to be suitable for content editing.

To solve the above problem, a first feature of the present invention relates to an acoustic signal processing apparatus that converts an input acoustic signal into output acoustic signals for the individual speakers of a linear speaker array in order to realize a virtual sound source. The acoustic signal processing apparatus according to the first feature comprises: a focal position determination unit that acquires a plurality of initial focal coordinates, the coordinates of the virtual sound source, and the direction of directivity, and that, for each of the initial focal coordinates, determines focal coordinates that account for directivity by multiplying the initial focal coordinates, based on the coordinates of the virtual sound source, by a rotation matrix specified from the direction of directivity; a filter coefficient calculation unit that, for each speaker of the linear speaker array, calculates from each of the focal coordinates determined by the focal position determination unit an impulse response vector to be convolved with the input acoustic signal; and a convolution operation unit that, for each speaker of the linear speaker array, convolves the input acoustic signal with the impulse response vector corresponding to that speaker and outputs the resulting output acoustic signal to the speaker.
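The per-speaker convolution named in this feature can be illustrated with a minimal Python sketch. The function names `convolve` and `render_speaker_signals` and the toy impulse responses are hypothetical, chosen only for illustration; the source does not specify an implementation.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_speaker_signals(input_signal, impulse_responses):
    """Convolve one input signal with each speaker's impulse response vector."""
    return [convolve(input_signal, h) for h in impulse_responses]


# Toy data (assumed): a 3-sample input and impulse responses for two speakers.
outputs = render_speaker_signals([1.0, 2.0, 3.0], [[1.0, 0.0, -1.0], [0.5]])
```

Each entry of `outputs` is the signal that would be sent to one speaker; a practical system would more likely use block-based FFT convolution for long impulse responses.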

The focal position determination unit may determine the focal coordinates that account for directivity by multiplying the coordinates of each initial focal point relative to the virtual sound source by the rotation matrix and then adding the coordinates of the virtual sound source to the result.

The filter coefficient calculation unit may calculate the impulse response vector for each speaker by computing a driving function at the target frequency for each of the focal coordinates that account for directivity, applying an inverse Fourier transform to each computed driving function, and summing the resulting time-domain driving functions.

A second feature of the present invention relates to an acoustic signal processing method for converting an input acoustic signal into output acoustic signals for the individual speakers of a linear speaker array in order to realize a virtual sound source. The acoustic signal processing method according to the second feature comprises the steps of: acquiring a plurality of initial focal coordinates, the coordinates of the virtual sound source, and the direction of directivity; for each of the initial focal coordinates, determining focal coordinates that account for directivity by multiplying the initial focal coordinates, based on the coordinates of the virtual sound source, by a rotation matrix specified from the direction of directivity; for each speaker of the linear speaker array, calculating from each of the focal coordinates determined in the determining step an impulse response vector to be convolved with the input acoustic signal; and, for each speaker of the linear speaker array, convolving the input acoustic signal with the impulse response vector corresponding to that speaker and outputting the resulting output acoustic signal to the speaker.

A third feature of the present invention is an acoustic signal processing apparatus that converts an input acoustic signal into output acoustic signals for the individual speakers of a linear speaker array in order to realize a virtual sound source, the apparatus comprising: a focal position determination unit that acquires a plurality of initial focal coordinates, the coordinates of the virtual sound source, and the direction of directivity, and that, for each of the initial focal coordinates, determines focal coordinates that account for directivity by multiplying the initial focal coordinates, based on the coordinates of the virtual sound source, by a rotation matrix specified from the direction of directivity; a filter operation unit that calculates a wave field synthesis prefilter and convolves the input acoustic signal with the prefilter to output a weighted acoustic signal; a delay adjustment unit that, for each speaker of the linear speaker array, calculates a delay amount determined by the distance between the speaker and each of the focal coordinates, delays the weighted acoustic signal by each calculated delay amount, and outputs a delayed acoustic signal for each of the focal coordinates; and a gain multiplication unit that, for each speaker of the linear speaker array, multiplies the delayed acoustic signal for each of the focal coordinates by a gain determined by the positions of the speaker and the focal coordinates, and outputs the resulting output acoustic signal to the speaker.
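The delay adjustment and gain multiplication in this feature can be sketched as follows. This is an illustration under stated assumptions rather than the patent's formulas: the speed of sound (343 m/s), the sampling rate (48 kHz), the time-reversed delay rule `(r_max - r) / c`, and the `1 / sqrt(r)` gain are all placeholders, since the text states only that the delay depends on the speaker-to-focus distance and the gain on the speaker and focus positions.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
SAMPLE_RATE = 48000.0    # Hz, assumed


def delays_and_gains(speakers, focus):
    """Per-speaker delay (in samples) and gain for one focal point.

    Assumption: speakers farther from the focus fire earlier so that all
    wavefronts arrive at the focus together. The gain law is a placeholder.
    Speakers are assumed not to coincide with the focus (distance > 0).
    """
    dists = [math.hypot(sx - focus[0], sy - focus[1]) for sx, sy in speakers]
    r_max = max(dists)
    delays = [(r_max - r) / SPEED_OF_SOUND * SAMPLE_RATE for r in dists]
    gains = [1.0 / math.sqrt(r) for r in dists]
    return delays, gains


# Toy layout (assumed): three speakers on the X-axis, focus 1 m in front.
speakers = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
delays, gains = delays_and_gains(speakers, focus=(0.0, 1.0))
```

In this toy layout the delay for the center speaker is not a whole number of samples, which is the situation a non-integer delay correction is meant to handle.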

Here, the wave field synthesis prefilter may be the filter given by the following expression.

[Expression not reproduced in the source text.]

The apparatus may further comprise a correction filter operation unit that, for each speaker, calculates a correction filter for compensating for a non-integer delay and applies the correction filter to the weighted acoustic signal to output a corrected acoustic signal. In that case, the delay adjustment unit may delay the corrected acoustic signal by each of the calculated delay amounts and output a delayed acoustic signal for each of the focal coordinates.

Here, the correction filter for compensating for the non-integer delay of each speaker may be a filter obtained using a delay equal to the non-integer (fractional) part of the delay.
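One common way to realize a non-integer delay is a windowed-sinc FIR filter. The sketch below is an assumption about how such a correction filter could be built; the text says only that the filter is obtained from the non-integer part of the delay. The function name `fractional_delay_fir`, the tap count, and the Hann window are illustrative choices.

```python
import math


def fractional_delay_fir(frac, num_taps=31):
    """Windowed-sinc FIR approximating a delay of `frac` samples (0 <= frac < 1).

    The filter also introduces a fixed bulk delay of (num_taps - 1) / 2 samples,
    which would have to be accounted for in the integer part of the delay.
    """
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - center - frac
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        # Hann window defined so that the end taps are not forced to zero.
        window = 0.5 - 0.5 * math.cos(2 * math.pi * (n + 1) / (num_taps + 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC


h_zero = fractional_delay_fir(0.0)    # degenerates to a unit impulse at the center tap
h_third = fractional_delay_fir(0.3)   # delay of 0.3 samples plus the bulk delay
```

Longer filters approximate the ideal fractional delay more closely at high frequencies, at the cost of a larger bulk delay.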

A fourth feature of the present invention relates to an acoustic signal processing method for converting an input acoustic signal into output acoustic signals for the individual speakers of a linear speaker array in order to realize a virtual sound source. The acoustic signal processing method according to the fourth feature comprises the steps of: acquiring a plurality of initial focal coordinates, the coordinates of the virtual sound source, and the direction of directivity; for each of the initial focal coordinates, determining focal coordinates that account for directivity by multiplying the initial focal coordinates, based on the coordinates of the virtual sound source, by a rotation matrix specified from the direction of directivity; calculating a wave field synthesis prefilter and convolving the input acoustic signal with the prefilter to output a weighted acoustic signal; for each speaker of the linear speaker array, calculating a delay amount determined by the distance between the speaker and each of the focal coordinates, delaying the weighted acoustic signal by each calculated delay amount, and outputting a delayed acoustic signal for each of the focal coordinates; and, for each speaker of the linear speaker array, multiplying the delayed acoustic signal for each of the focal coordinates by a gain determined by the positions of the speaker and the focal coordinates, and outputting the resulting output acoustic signal to the speaker.

A fifth feature of the present invention relates to an acoustic signal processing program that causes a computer to function as the acoustic signal processing apparatus according to the first or third feature of the present invention.

According to the present invention, it is possible to provide an acoustic signal processing apparatus, an acoustic signal processing method, and an acoustic signal processing program capable of realizing directional sound using a virtual sound source generated from a monaural source so as to be suitable for content editing.

FIG. 1 is a block diagram of an acoustic signal processing apparatus according to a first embodiment of the present invention.
FIG. 2 is a flowchart illustrating the focal position determination process of the acoustic signal processing apparatus according to the first embodiment.
FIG. 3 is a diagram illustrating the initial focal coordinates in the focal position determination process of the acoustic signal processing apparatus according to the first embodiment.
FIG. 4 is a diagram illustrating an example of the rotation matrix used in the focal position determination process of the acoustic signal processing apparatus according to the first embodiment.
FIG. 5 is a diagram illustrating the focal coordinates that account for directivity in the focal position determination process of the acoustic signal processing apparatus according to the first embodiment.
FIG. 6 is a flowchart illustrating the filter coefficient determination process of the acoustic signal processing apparatus according to the first embodiment.
FIG. 7 is a flowchart illustrating the convolution operation process of the acoustic signal processing apparatus according to the first embodiment.
FIG. 8 is a block diagram of an acoustic signal processing apparatus according to a second embodiment of the present invention.
FIG. 9 is a flowchart illustrating the filter operation process of the acoustic signal processing apparatus according to the second embodiment.
FIG. 10 is a flowchart illustrating the delay adjustment and gain multiplication process of the acoustic signal processing apparatus according to the second embodiment.
FIG. 11 is a block diagram of an acoustic signal processing apparatus according to a third embodiment of the present invention.

Next, embodiments of the present invention will be described with reference to the drawings. In the following description of the drawings, identical or similar parts are denoted by identical or similar reference numerals.

(First Embodiment)
An acoustic signal processing apparatus 1 according to the first embodiment will be described with reference to FIG. 1. The acoustic signal processing apparatus 1 is a general-purpose computer that includes a processing device (not shown), a memory 10, and the like. The functions shown in FIG. 1 are realized when the general-purpose computer executes the acoustic signal processing program.

The acoustic signal processing apparatus 1 according to the first embodiment uses a linear speaker array, in which a plurality of speakers are arranged in a straight line, to realize a virtual sound source that projects in front of the speakers and has directivity.

In the embodiments of the present invention, the virtual sound source is realized as a multipole sound source by generating two or more focused sources of differing polarity at positions close to one another. The focused sources are a combination of omnidirectional point sources (monopole sources) of differing polarity. The embodiments describe the case of two monopole sources, but the number of focused sources may be any even number.

To realize such a virtual sound source, the acoustic signal processing apparatus 1 converts an input acoustic signal I into output acoustic signals O for the individual speakers of the linear speaker array.

The acoustic signal processing apparatus 1 includes a memory 10, a focal position determination unit 12, a filter coefficient calculation unit 13, a convolution operation unit 14, an input/output interface (not shown), and the like. The input/output interface is an interface for inputting the input acoustic signal to the acoustic signal processing apparatus 1 and outputting the output acoustic signal to each speaker. The input/output interface also inputs to the acoustic signal processing apparatus 1 the coordinates of the virtual sound source to be realized and the direction of its directivity.

The memory 10 stores focus data 11. The focus data 11 contains the coordinates of a plurality of focal points for realizing the virtual sound source. The focus data 11 contains the coordinates of at least one pair of focal points and may contain information on multiple pairs. The focal points in the focus data 11 are arranged symmetrically about the X-axis and the Y-axis and are non-directional focal coordinates for which directivity has not yet been taken into account. In the embodiments of the present invention, a focal point stored in the focus data 11 is called an initial focal point, and its coordinates are called initial focal coordinates. The virtual sound source is located at the center of the initial focal coordinates.

The focal position determination unit 12 receives the position of the virtual sound source, the direction of directivity, and the target frequency, and outputs the coordinates of the required number of focal points. The focal position determination unit 12 acquires a plurality of initial focal coordinates, the coordinates of the virtual sound source, and the direction of directivity, and, for each of the initial focal coordinates, determines focal coordinates that account for directivity by multiplying the initial focal coordinates, based on the coordinates of the virtual sound source, by a rotation matrix specified from the direction of directivity. Specifically, the focal position determination unit 12 multiplies the coordinates of each initial focal point relative to the virtual sound source by the rotation matrix and adds the coordinates of the virtual sound source to the result to determine the focal coordinates that account for directivity.

The focal position determination unit 12 acquires one or more pairs of initial focal coordinates from the memory 10 and also acquires, through external input or the like, the coordinates of the virtual sound source and the direction of directivity as the characteristics to be realized by the acoustic signal processing apparatus 1. From the acquired direction of directivity, the focal position determination unit 12 specifies the rotation direction θ to be applied to the initial focal coordinates.

Let a pair of initial focal coordinates be written as follows.

[Expression not reproduced in the source text.]

When a direction θ is specified with respect to the X-axis direction, the rotation matrix G that can be specified from this direction is given by Equation (1), so the coordinates of the monopoles after rotation can be determined by Equation (2).

The focal position determination unit 12 calculates all of the focal coordinates by multiplying each of the one or more pairs of initial focal coordinates read from the memory, which correspond to the desired characteristic, by the rotation matrix specified from the direction of directivity, and then adding the coordinates of the virtual sound source to each result.
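Equations (1) and (2) are referenced but not reproduced in this text. As a hedged reading of the prose, assuming G is an ordinary two-dimensional rotation matrix and writing the virtual source position as s and the i-th initial focal point as p_i (symbols introduced here only for illustration), the operation would take the form:

```latex
G(\theta) =
\begin{pmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{pmatrix},
\qquad
\mathbf{p}_i' = G(\theta)\,(\mathbf{p}_i - \mathbf{s}) + \mathbf{s}
```

The sign convention of θ (counterclockwise-positive here) and the axis from which it is measured are assumptions; the patent's own Equation (1) may define them differently.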

For multipole sound sources made up of more than two monopole sources, such as quadrupoles, the coordinates of the monopole sources corresponding to the rotated directivity are likewise calculated by rotating each source with the rotation matrix to obtain its new coordinates.

The focal position determination process performed by the focal position determination unit 12 according to the embodiment of the present invention will be described with reference to FIG. 2.

First, in step S11, the focal position determination unit 12 acquires the coordinates of the virtual sound source and the direction of directivity. In step S12, it reads from the memory the information on one or more initial focal points corresponding to the desired characteristic.

Next, the focal position determination unit 12 repeats steps S13 and S14 for each initial focal point read in step S12. In step S13, the focal position determination unit 12 multiplies the target focal coordinates being processed by the rotation matrix specified from the direction of directivity acquired in step S11. The target focal coordinates used here are coordinates relative to the virtual sound source. In step S14, the focal position determination unit 12 adds the coordinates obtained by the rotation in step S13 to the coordinates of the virtual sound source to determine the focal coordinates that account for directivity.

When steps S13 and S14 have been completed for every initial focal point read in step S12, the focal position determination unit 12 ends the process.

Steps S13 and S14 need only be performed for every focal point; the focal points may be processed in any order.

Simulation results for the processing of the focal position determination unit 12 will be described with reference to FIGS. 3 to 5. FIG. 3 shows the linear speaker array and the initial focal points. The linear speaker array extends from (-2, 0) to (2, 0), and the pair of initial focal coordinates is (0, 1 - 0.0345) and (0, 1 + 0.0345). The coordinates of the virtual sound source are then (0, 1). As shown in FIG. 3, the resulting sound field is left-right symmetric and has no directivity.

The focal position determination unit 12 multiplies these initial focal coordinates by the rotation matrix specified by Equation (1). As shown in FIG. 4, the coordinates of the initial focal point (0, 1.0345) relative to the virtual sound source at (0.0, 1.0) are (0.0, 0.0345). The focal position determination unit 12 multiplies these relative coordinates by the rotation matrix and adds the virtual sound source coordinates, obtaining the rotated coordinates (0.0172, 1.0299). Processing the other initial focal point (0, 1 - 0.0345) in the same way, the focal position determination unit 12 obtains the rotated coordinates (-0.0172, 0.9701).

FIG. 5 shows the sound field for the rotated coordinates obtained by the calculation of FIG. 4. Each monopole coordinate is rotated clockwise compared with FIG. 3, and directivity is realized.

Once the focal position determination unit 12 has calculated the focal coordinates taking directivity into account for each initial focal point, they are processed by the filter coefficient calculation unit 13.

The filter coefficient calculation unit 13 receives the coordinates of all the focal points output from the focal position determination unit 12, designs a filter in the frequency domain for each speaker, and then applies an inverse Fourier transform to output the impulse response vector to be given to each speaker. For each speaker of the linear speaker array, the filter coefficient calculation unit 13 calculates the impulse response vector to be convolved with the input acoustic signal I from each of the focal coordinates determined by the focal position determination unit 12. Using each of the focal coordinates taking directivity into account, the filter coefficient calculation unit 13 calculates a drive function for the target frequency, and adds up the time-domain drive functions obtained by inverse Fourier transform of the calculated drive functions to calculate the impulse response vector for the speaker.

The filter coefficient calculation unit 13 obtains the target frequency from an external input or the like, and calculates the drive function for this target frequency by Equation (3).

By evaluating Equation (3) over a predetermined frequency range (for example, 100 Hz ≤ f < 2000 Hz), the filter coefficient calculation unit 13 can obtain the drive signal to be given to the i-th speaker of the linear speaker array. By performing this calculation for each speaker of the linear speaker array, the filter coefficient calculation unit 13 obtains the drive signal to be given to each speaker.

The filter coefficient calculation unit 13 converts the drive signal of each speaker given by Equation (3) into the time domain by an inverse Fourier transform in the X-axis direction, obtaining Equation (4), which is known as time-domain wavefront synthesis. Expression (5), which appears in Equation (4), is known as the wavefront synthesis prefilter.

In time-domain wavefront synthesis, as Equation (4) shows, it suffices to apply the wavefront synthesis prefilter defined by Equation (5) to the input acoustic signal I and then apply a gain (power) multiplication and a delay for each channel, so the amount of computation can be reduced dramatically.

The filter coefficient determination processing performed by the filter coefficient calculation unit 13 will be described with reference to FIG. 6.

First, in step S21, the filter coefficient calculation unit 13 acquires each of the focal coordinates determined in the focal position determination processing. Each of these focal coordinates is obtained from the initial focal coordinates by taking the desired directivity into account.

The filter coefficient calculation unit 13 repeats the processing of steps S22 to S26 to calculate an impulse response vector for each speaker. In step S22, the filter coefficient calculation unit 13 initializes the impulse response vector of the target speaker to zero.

After initializing the impulse response vector in step S22, the filter coefficient calculation unit 13 repeats the processing of steps S23 to S25 for each focal point. In step S23, the filter coefficient calculation unit 13 calculates the drive function for the target frequency by Equation (3), using the target focal coordinates. In step S24, the filter coefficient calculation unit 13 applies an inverse Fourier transform to the drive function calculated in step S23 and obtains the time-domain drive function according to Equation (4). In step S25, the filter coefficient calculation unit 13 adds the time-domain drive function obtained in step S24 to the impulse response vector.

When the processing of steps S23 to S25 has been completed for each focal point, the filter coefficient calculation unit 13, in step S26, adopts the impulse response vector at that point as the impulse response vector to be given to the target speaker.

When the processing of steps S22 to S26 has been completed for each speaker, the filter coefficient calculation unit 13 ends the processing.

The processing of steps S22 to S26 only needs to be performed for each speaker, and may be performed in any order. Likewise, the processing of steps S23 to S25 only needs to be performed for each focal point, and may be performed in any order.
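The loop structure of steps S21 to S26 can be sketched as follows. This is only an illustration of the control flow under stated assumptions: Equation (3) is not reproduced in this passage, so `drive_function` is a hypothetical stand-in and not the drive function of the embodiment, and the FFT length, sampling rate, and helper names are likewise assumptions. A naive inverse DFT is used so that the sketch needs only the standard library.

```python
import cmath
import math

SPEED_OF_SOUND = 340.0  # m/s

def drive_function(freq_hz, speaker, focus):
    """Hypothetical stand-in for Equation (3); not the patent's formula."""
    r = math.dist(speaker, focus)
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    gain = abs(focus[1] - speaker[1]) / r ** 1.5
    return cmath.sqrt(1j * k) * gain * cmath.exp(1j * k * r)

def inverse_dft_real(half_spectrum, n):
    """Naive inverse DFT from bins 0..n/2 of a real signal's spectrum (n even)."""
    samples = []
    for t in range(n):
        acc = half_spectrum[0].real
        for b in range(1, n // 2):
            acc += 2.0 * (half_spectrum[b] * cmath.exp(2j * math.pi * b * t / n)).real
        acc += (half_spectrum[n // 2] * cmath.exp(1j * math.pi * t)).real
        samples.append(acc / n)
    return samples

def impulse_responses(speakers, foci, n=64, fs=8000.0, f_lo=100.0, f_hi=2000.0):
    responses = []
    for speaker in speakers:
        h = [0.0] * n                                  # S22: initialize to zero
        for focus in foci:
            spectrum = []
            for b in range(n // 2 + 1):                # S23: drive function per bin
                f = b * fs / n
                in_band = f_lo <= f < f_hi
                spectrum.append(drive_function(f, speaker, focus) if in_band else 0j)
            h_focus = inverse_dft_real(spectrum, n)    # S24: back to the time domain
            h = [a + c for a, c in zip(h, h_focus)]    # S25: accumulate
        responses.append(h)                            # S26: fix this speaker's vector
    return responses

speakers = [(-2.0, 0.0), (0.0, 0.0), (2.0, 0.0)]
foci = [(0.0172, 1.0299), (-0.0172, 0.9701)]           # S21: rotated focal coordinates
h_all = impulse_responses(speakers, foci)
```

Because step S25 is a plain sum, the result for several focal points equals the sum of the results for each focal point taken alone, which is why the order of the inner loop does not matter.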

Once the filter coefficient calculation unit 13 has calculated the impulse response vector for each speaker of the linear speaker array, the convolution operation unit 14 calculates the output acoustic signal O to be given to each speaker by convolving the impulse response vector with the input acoustic signal I.

For each speaker of the linear speaker array, the convolution operation unit 14 convolves the impulse response vector corresponding to the speaker with the input acoustic signal I and outputs the output acoustic signal O to the speaker. For a given speaker, the convolution operation unit 14 obtains the output acoustic signal O for that speaker by convolving the impulse response vector corresponding to that speaker with the input acoustic signal I. The convolution operation unit 14 repeats the same processing for each speaker to obtain the output acoustic signal O for each speaker.

The convolution processing performed by the convolution operation unit 14 will be described with reference to FIG. 7.

The convolution operation unit 14 repeats the processing of steps S31 and S32 for each speaker of the linear speaker array. In step S31, the convolution operation unit 14 acquires the impulse response vector of the target speaker from the filter coefficient calculation unit 13. In step S32, the convolution operation unit 14 convolves the impulse response vector acquired in step S31 with the input acoustic signal I to obtain the output acoustic signal O.

When the processing of steps S31 and S32 has been completed for each speaker, the convolution operation unit 14 ends the processing. The processing of steps S31 and S32 only needs to be performed for each speaker, and may be performed in any order.
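Steps S31 and S32 amount to one convolution per speaker. A minimal sketch is given below, assuming the signal and the impulse response vectors are plain lists of samples; the function names `convolve` and `render` are hypothetical, and a practical implementation would more likely use a block-based or FFT-based convolution.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render(input_signal, impulse_responses):
    """S31/S32 for every speaker: one output signal per impulse response vector."""
    return [convolve(input_signal, h) for h in impulse_responses]

# Example: one mono input signal, two speakers with different impulse responses.
outputs = render([1.0, 2.0, 3.0], [[1.0, 1.0], [0.0, 0.5]])
```

Each speaker is processed independently of the others, which is consistent with the statement that the per-speaker steps may be performed in any order.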

The acoustic signal processing apparatus 1 according to the first embodiment rotates the initial focal coordinates in advance to calculate focal coordinates that realize the desired directivity, and calculates, from these focal coordinates, the impulse response vector corresponding to each speaker. The acoustic signal processing apparatus 1 then obtains the output acoustic signal O for each speaker by convolving the impulse response vector corresponding to that speaker with the input acoustic signal I.

The acoustic signal processing apparatus 1 according to the first embodiment can realize the desired directivity with a virtual sound source using a small amount of computation. In addition, the acoustic signal processing apparatus 1 according to the first embodiment does not require a microphone array to generate the virtual sound source; it can generate acoustic signals for a plurality of channels from a monaural source recorded with an ordinary microphone and thereby create a virtual sound source.

(Second Embodiment)
An acoustic signal processing apparatus 1a according to the second embodiment will be described with reference to FIG. 8. The acoustic signal processing apparatus 1a according to the second embodiment uses time-domain wavefront synthesis to turn the virtual sound source into a multipole source with a small amount of computation. By using a filter operation unit 15, a delay adjustment unit 16, and a gain multiplication unit 17 in place of the convolution operation unit 14 of FIG. 1, the acoustic signal processing apparatus 1a according to the second embodiment achieves a substantial reduction in the amount of computation.

The acoustic signal processing apparatus 1a includes a memory 10, a focal position determination unit 12, a filter operation unit 15, a delay adjustment unit 16, and a gain multiplication unit 17. The memory 10 and the focal position determination unit 12 are the same as in the first embodiment.

The filter operation unit 15 calculates the wavefront synthesis prefilter by Equation (5) above, in the same manner as in the first embodiment, and convolves the wavefront synthesis prefilter with the input acoustic signal I to output a weighted acoustic signal.

The filter operation processing performed by the filter operation unit 15 will be described with reference to FIG. 9.

First, in step S51, the filter operation unit 15 calculates the wavefront synthesis prefilter by Equation (5). In step S52, the filter operation unit 15 convolves the wavefront synthesis prefilter calculated in step S51 with the input acoustic signal I and outputs the weighted acoustic signal.

For each speaker of the linear speaker array, the delay adjustment unit 16 calculates a delay amount determined by the distance between the speaker and each of the focal coordinates, delays the weighted acoustic signal by each of the calculated delay amounts, and outputs a delayed acoustic signal for each of the focal coordinates.

For each of the plurality of focal positions output by the focal position determination unit 12, the delay adjustment unit 16 delays the signal by the time sound needs to travel the distance between the focal position and the speaker position, and outputs the delayed acoustic signal. Letting M be the number of focal points output by the focal position determination unit 12, the delayed acoustic signal is calculated by Equation (6) for all M focal points.

For each speaker of the linear speaker array, the gain multiplication unit 17 multiplies the delayed acoustic signal of each of the focal coordinates by a gain determined by the positions of the speaker and the focal coordinates, and outputs the output acoustic signal O to the speaker.

For a given speaker, the gain multiplication unit 17 calculates the output acoustic signal O by multiplying the delayed acoustic signal obtained by the delay adjustment unit 16 by a gain obtained by dividing the distance between the focal coordinates and the speaker array by the 3/2 power of the distance between the focused source and the speaker position. The "distance between the focal coordinates and the speaker array" is, when the speaker array is arranged along the X axis, the difference between the Y-axis value of the speaker array and the Y-axis value of the focal coordinates. The output acoustic signal O for a given speaker is obtained by Equation (7). The gain multiplication unit 17 calculates the output acoustic signal O by Equation (7) for each speaker.

For a given speaker of the linear speaker array, the delay adjustment unit 16 and the gain multiplication unit 17 perform their processing with the delay and gain set according to the position of that speaker, and generate the output acoustic signal. By performing the same processing while changing the speaker of interest one after another, the delay adjustment unit 16 and the gain multiplication unit 17 obtain the output acoustic signal O for each speaker of the linear speaker array.

The delay adjustment and gain multiplication processing performed by the delay adjustment unit 16 and the gain multiplication unit 17 will be described with reference to FIG. 10.

First, the acoustic signal processing apparatus 1a performs the processing of steps S61 and S62 for each speaker of the linear speaker array.

The delay adjustment unit 16 first performs the processing of step S61 for each focal point. In step S61, the delay adjustment unit 16 outputs a delayed acoustic signal delayed by the time sound takes to travel between the target speaker and the target focal point. Once a delayed acoustic signal has been output for each focal point, in step S62 the gain multiplication unit 17 multiplies the delayed acoustic signal for each focal point calculated in step S61 by the gain for the target speaker and outputs the output acoustic signal O for the target speaker.

When the processing of steps S61 and S62 has been completed for each speaker, the acoustic signal processing apparatus 1a ends the processing.

The processing of step S61 only needs to be performed for each focal point, and may be performed in any order. Likewise, the processing of step S62 only needs to be performed for each speaker, and may be performed in any order. Depending on the processing environment and the like, certain processing may also be performed in parallel.
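Steps S61 and S62 can be sketched as a delay-and-gain stage applied to the weighted acoustic signal. The sketch below is an illustration under stated assumptions, not a reproduction of Equations (6) and (7), which do not appear in this passage: the delay is rounded to whole samples, the gain follows the verbal description above (Y-axis distance divided by the 3/2 power of the speaker-to-focus distance), and the contributions of the focal points are assumed to be summed per speaker. The function name `delay_and_gain` is hypothetical.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s

def delay_and_gain(weighted, speakers, foci, fs):
    """Return one output signal per speaker from the prefiltered input."""
    outputs = []
    for sx, sy in speakers:
        delays, gains = [], []
        for fx, fy in foci:
            r = math.hypot(fx - sx, fy - sy)
            # S61: travel time between speaker and focal point, in whole samples.
            delays.append(round(r / SPEED_OF_SOUND * fs))
            # Gain as described in the text: |y_focus - y_array| / r^(3/2).
            gains.append(abs(fy - sy) / r ** 1.5)
        out = [0.0] * (len(weighted) + max(delays))
        for d, g in zip(delays, gains):
            for n, x in enumerate(weighted):
                out[n + d] += g * x      # S62: scale and (assumed) sum over foci
        outputs.append(out)
    return outputs

# Toy example: one speaker at the origin, one focal point 1 m in front of it,
# and a sampling rate chosen so that 1 m corresponds to exactly one sample.
example = delay_and_gain([1.0, 0.5], [(0.0, 0.0)], [(0.0, 1.0)], fs=340.0)
```

Compared with the per-speaker convolution of the first embodiment, each channel here costs only one shift and one multiplication per focal point, which is the source of the reduction in computation.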

The acoustic signal processing apparatus 1a according to the second embodiment rotates the initial focal coordinates in advance to calculate focal coordinates that realize the desired directivity, and also calculates the wavefront synthesis prefilter. The acoustic signal processing apparatus 1a convolves the wavefront synthesis prefilter with the input acoustic signal I to generate the weighted acoustic signal, and applies an appropriate delay and gain according to the positions of each speaker and each focal point to obtain the output acoustic signal O for each speaker.

The acoustic signal processing apparatus 1a according to the second embodiment can realize the desired directivity with a virtual sound source, as in the first embodiment, with an even smaller amount of computation than the first embodiment. In addition, as in the first embodiment, the acoustic signal processing apparatus 1a according to the second embodiment does not require a microphone array to generate the virtual sound source; it can generate acoustic signals for a plurality of channels from a monaural source recorded with an ordinary microphone and thereby create a virtual sound source.

(Third Embodiment)
An acoustic signal processing apparatus 1b according to the third embodiment will be described with reference to FIG. 11. The acoustic signal processing apparatus 1b according to the third embodiment uses time-domain wavefront synthesis to turn the virtual sound source into a multipole source with a small amount of computation, while improving the accuracy of the reproduced sound field.

The acoustic signal processing apparatus 1b according to the third embodiment differs from the acoustic signal processing apparatus 1a according to the second embodiment shown in FIG. 8 in that a correction filter operation unit 18 is provided between the filter operation unit 15 and the delay adjustment unit 16, and in that the focal position determination unit 12 is connected to the correction filter operation unit 18. The operation of each unit other than the correction filter operation unit 18 is the same as in the second embodiment.

For each speaker (channel), the correction filter operation unit 18 calculates a correction filter for correcting the non-integer delay, applies the correction filter to the weighted acoustic signal, and outputs the corrected acoustic signal. Here, the correction filter for correcting the non-integer delay for each speaker is a filter obtained using the non-integer part of the delay. Possible correction filters include a filter using the sinc function, an FIR (finite impulse response) filter (Lagrange interpolation), and an IIR (infinite impulse response) filter (Thiran filter).

First, the case where the correction filter is a filter using the sinc function will be described. In the third embodiment, the correction filter operation unit 18 calculates the correction filter according to Equation (8).

Next, as shown in Equation (9), the correction filter of Equation (8) is applied to the input signal after application of the wavefront synthesis prefilter.

The input signal after application of the correction filter, calculated by Equation (9), is input to the delay adjustment unit 16.
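A sinc-based fractional delay filter can be sketched as follows. Equation (8) is not reproduced in this passage, so the truncated, unwindowed sinc below is an assumed textbook form and may differ from the filter of the embodiment; the function name and the `half_width` parameter are hypothetical.

```python
import math

def sinc_fractional_delay(frac, half_width=8):
    """Taps of a truncated sinc filter delaying by `frac` samples (0 <= frac < 1).

    The taps are centred on index `half_width`, so the filter also adds a
    bulk delay of `half_width` samples that the caller must account for.
    """
    taps = []
    for n in range(-half_width, half_width + 1):
        x = n - frac
        taps.append(1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x))
    return taps

no_shift = sinc_fractional_delay(0.0)    # reduces to a unit impulse at the centre
half_shift = sinc_fractional_delay(0.5)  # centre two taps are equal
```

In practice the truncated sinc is usually multiplied by a window to reduce ripple; that refinement is omitted here.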

In the third embodiment, the delay adjustment unit 16 delays the corrected acoustic signal (the input signal after application of the correction filter, calculated by Equation (9)) by each of the calculated delay amounts, and outputs a delayed acoustic signal for each of the focal coordinates.

Next, the case where the correction filter is an FIR filter will be described. In the third embodiment, the correction filter may be obtained as the FIR filter defined by Equation (10).

In this case, as shown in Equation (9), the correction filter of Equation (10) is applied to the input signal after application of the wavefront synthesis prefilter.
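The Lagrange interpolation FIR filter mentioned above has a well-known closed form for its taps. The sketch below uses that textbook form; Equation (10) is not reproduced in this passage, so whether it matches term by term is an assumption, and the function name and the default order are hypothetical.

```python
def lagrange_fractional_delay(delay, order=3):
    """Taps h[0..order] of a Lagrange interpolator for a total delay of `delay` samples.

    h[k] is the product over m != k of (delay - m) / (k - m).
    """
    taps = []
    for k in range(order + 1):
        h = 1.0
        for m in range(order + 1):
            if m != k:
                h *= (delay - m) / (k - m)
        taps.append(h)
    return taps

linear = lagrange_fractional_delay(0.5, order=1)   # linear interpolation
integer = lagrange_fractional_delay(1.0, order=3)  # pure one-sample delay
cubic = lagrange_fractional_delay(1.3, order=3)
```

Accuracy is generally best when the requested delay lies near the middle of the tap range, for example between 1 and 2 samples for a third-order filter.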

Finally, the case where the correction filter is an IIR filter will be described. In the third embodiment, the correction filter may be the IIR filter obtained by Equation (11).

In this case, as shown in Equation (12), the correction filter of Equation (11) is applied to the input signal after application of the wavefront synthesis prefilter.
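A Thiran filter is an all-pass IIR filter designed to approximate a fractional delay. The sketch below shows only the first-order case, using the textbook coefficient a = (1 - D) / (1 + D); Equations (11) and (12) are not reproduced in this passage, so the order and exact form used in the embodiment are assumptions, and the function name is hypothetical.

```python
def thiran_first_order(signal, delay):
    """Apply a first-order Thiran all-pass filter approximating `delay` samples.

    Difference equation: y[n] = a*x[n] + x[n-1] - a*y[n-1],
    with a = (1 - delay) / (1 + delay). `delay` should be positive.
    """
    a = (1.0 - delay) / (1.0 + delay)
    out = []
    x_prev = 0.0
    y_prev = 0.0
    for x in signal:
        y = a * x + x_prev - a * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out

one_sample = thiran_first_order([1.0, 2.0, 3.0], 1.0)   # a = 0: exact one-sample delay
fractional = thiran_first_order([1.0, 0.0, 0.0, 0.0], 1.25)
```

Being all-pass, the filter leaves the magnitude response flat and affects only the phase, at the cost of a delay that is accurate mainly at low frequencies.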

Conventionally, in the focused source method implemented in the time domain, as shown in Equation (2), a wavefront synthesis prefilter is applied to the input source, after which gain multiplication and delay processing are performed for each speaker. Because ordinary delay processing operates in units of samples of the digital signal, it cannot represent a delay of a non-integer number of samples. For example, for audio sampled at a sampling frequency of 48 kHz, an error of about 7.1 mm (= speed of sound 340 m/s ÷ 48,000 samples per second) arises compared with a frequency-domain implementation. Since the audio signal output from each speaker carries such an error, the signals output from the speakers do not converge at the preset focal point, and the accuracy of the sound field is degraded.
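The arithmetic behind the 7.1 mm figure, and the split of a delay into an integer part and a fractional part, can be checked with a few lines. The 340 m/s and 48 kHz values are those used in the text; the function name `split_delay` and the 1 m example distance are illustrative assumptions.

```python
SPEED_OF_SOUND = 340.0   # m/s, value used in the text
SAMPLE_RATE = 48000.0    # Hz

# Distance sound travels during one sample period, in metres.
metres_per_sample = SPEED_OF_SOUND / SAMPLE_RATE

def split_delay(distance_m):
    """Split the travel time over `distance_m` into whole and fractional samples."""
    exact = distance_m / metres_per_sample
    whole = int(exact)          # part an ordinary sample delay can represent
    return whole, exact - whole  # remainder left for a correction filter

whole, frac = split_delay(1.0)
```

For a 1 m path the exact delay is a little over 141 samples, so a plain sample delay leaves a fraction of a sample unaccounted for; this remainder is what the correction filter of the third embodiment is meant to absorb.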

Therefore, as shown in the third embodiment, calculating and applying a correction filter for each speaker makes it possible to improve the accuracy of sound field reproduction.

(Other Embodiments)
Although the present invention has been described above by way of the first to third embodiments, the statements and drawings forming part of this disclosure should not be understood as limiting the invention. Various alternative embodiments, examples, and operational techniques will be apparent to those skilled in the art from this disclosure.

The present invention naturally includes various embodiments and the like that are not described here. Accordingly, the technical scope of the present invention is defined only by the matters specifying the invention according to the claims that are reasonable in view of the above description.

Description of symbols
1 acoustic signal processing apparatus
10 memory
11 focus data
12 focal position determination unit
13 filter coefficient calculation unit
14 convolution operation unit
15 filter operation unit
16 delay adjustment unit
17 gain multiplication unit
18 correction filter operation unit
I input acoustic signal
O output acoustic signal

Claims

An acoustic signal processing apparatus for converting an input acoustic signal into an output acoustic signal for each speaker of a linear speaker array for realizing a virtual sound source, comprising:
a focal position determination unit that acquires a plurality of initial focal coordinates, the coordinates of the virtual sound source, and a direction of directivity, and that determines, for each of the plurality of initial focal coordinates, focal coordinates taking the directivity into account by multiplying the initial focal coordinates by a rotation matrix specified from the direction of the directivity, with the coordinates of the virtual sound source as a reference;
a filter coefficient calculation unit that calculates, for each speaker of the linear speaker array, an impulse response vector to be convolved with the input acoustic signal from each of the focal coordinates determined by the focal position determination unit; and
a convolution operation unit that, for each speaker of the linear speaker array, convolves the impulse response vector corresponding to the speaker with the input acoustic signal and outputs an output acoustic signal to the speaker.

The acoustic signal processing apparatus according to claim 1, wherein the focal position determination unit determines the focal coordinates taking the directivity into account by multiplying the relative coordinates of the initial focal coordinates with respect to the coordinates of the virtual sound source by the rotation matrix, and adding the coordinates of the virtual sound source to the coordinates obtained by the multiplication.

The acoustic signal processing apparatus according to claim 1, wherein the filter coefficient calculation unit calculates a drive function for a target frequency using each of the focal coordinates taking the directivity into account, and adds up the time-domain drive functions obtained by inverse Fourier transform of the calculated drive functions to calculate the impulse response vector for the speaker.

An acoustic signal processing method for converting an input acoustic signal into an output acoustic signal for each speaker of a linear speaker array for realizing a virtual sound source, comprising:
a step of acquiring a plurality of initial focal coordinates, the coordinates of the virtual sound source, and a direction of directivity;
a step of determining, for each of the plurality of initial focal coordinates, focal coordinates taking the directivity into account by multiplying the initial focal coordinates by a rotation matrix specified from the direction of the directivity, with the coordinates of the virtual sound source as a reference;
a step of calculating, for each speaker of the linear speaker array, an impulse response vector to be convolved with the input acoustic signal from each of the focal coordinates determined in the determining step; and
a step of, for each speaker of the linear speaker array, convolving the impulse response vector corresponding to the speaker with the input acoustic signal and outputting an output acoustic signal to the speaker.

An acoustic signal processing apparatus for converting an input acoustic signal into an output acoustic signal for each speaker of a linear speaker array for realizing a virtual sound source, comprising:
a focal position determination unit that acquires a plurality of initial focal coordinates, the coordinates of the virtual sound source, and a direction of directivity, and that determines, for each of the plurality of initial focal coordinates, focal coordinates taking the directivity into account by multiplying the initial focal coordinates by a rotation matrix specified from the direction of the directivity, with the coordinates of the virtual sound source as a reference;
a filter operation unit that calculates a wavefront synthesis prefilter and convolves the wavefront synthesis prefilter with the input acoustic signal to output a weighted acoustic signal;
a delay adjustment unit that, for each speaker of the linear speaker array, calculates a delay amount determined by the distance between the speaker and each of the focal coordinates, delays the weighted acoustic signal by each of the calculated delay amounts, and outputs a delayed acoustic signal for each of the focal coordinates; and
a gain multiplication unit that, for each speaker of the linear speaker array, multiplies the delayed acoustic signal of each of the focal coordinates by a gain determined by the positions of the speaker and the focal coordinates, and outputs an output acoustic signal to the speaker.

The acoustic signal processing apparatus according to claim 5, wherein the wavefront synthesis prefilter is
[equation not reproduced in this text]

The acoustic signal processing apparatus further comprising a correction filter operation unit that calculates, for each speaker, a correction filter for correcting a non-integer delay, applies the correction filter to the weighted acoustic signal, and outputs a corrected acoustic signal,
wherein the delay adjustment unit delays the corrected acoustic signal by each of the calculated delay amounts and outputs a delayed acoustic signal for each of the focal coordinates.

The sound reproduction device according to claim 7, wherein the correction filter for correcting the non-integer delay for each speaker is a filter obtained by using the delay for the non-integer delay.

An acoustic signal processing method for converting an input acoustic signal into an output acoustic signal for each speaker of a linear speaker array for realizing a virtual sound source, comprising:
a step of acquiring a plurality of initial focal coordinates, the coordinates of the virtual sound source, and a direction of directivity;
a step of determining, for each of the plurality of initial focal coordinates, focal coordinates taking the directivity into account by multiplying the initial focal coordinates by a rotation matrix specified from the direction of the directivity, with the coordinates of the virtual sound source as a reference;
a step of calculating a wavefront synthesis prefilter and convolving the wavefront synthesis prefilter with the input acoustic signal to output a weighted acoustic signal;
a step of, for each speaker of the linear speaker array, calculating a delay amount determined by the distance between the speaker and each of the focal coordinates, delaying the weighted acoustic signal by each of the calculated delay amounts, and outputting a delayed acoustic signal for each of the focal coordinates; and
a step of, for each speaker of the linear speaker array, multiplying the delayed acoustic signal of each of the focal coordinates by a gain determined by the positions of the speaker and the focal coordinates, and outputting an output acoustic signal to the speaker.

An acoustic signal processing program for causing a computer to function as the acoustic signal processing device according to any one of claims 1 to 3.