JP2011211312A

JP2011211312A - Sound image localization processing apparatus and sound image localization processing method

Info

Publication number: JP2011211312A
Application number: JP2010074669A
Authority: JP
Inventors: Junji Araki; 潤二荒木
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2010-03-29
Filing date: 2010-03-29
Publication date: 2011-10-20

Abstract

PROBLEM TO BE SOLVED: In a sound image localization processing system that achieves virtual sound image localization using one or more pairs of loudspeakers, when a virtual sound image is generated by panning or a head-related transfer function, the localization is, depending on the input signals being reproduced, pulled toward the positions of the reproducing loudspeakers, so that the sense of localization of the virtual sound image is weakened and a sufficient sense of virtual sound image localization cannot be obtained.
SOLUTION: A centroid calculating section 2 calculates the azimuth and magnitude of the centroid of the sound field from multi-channel input signals, a weight coefficient determination section 3 determines a weight coefficient corresponding to the calculated azimuth and magnitude, and a virtual sound image generation processing section 4 performs virtual sound image generation processing on the basis of the determined weight coefficient. The sense of localization of the virtual sound image is thereby enhanced adaptively for the input signals, and as a result a sensation of being surrounded by sound is obtained.

Description

The present invention relates to sound image localization processing using a plurality of speakers, and more particularly to sound image localization processing that uses panning and a head-related transfer function (HRTF) to localize a virtual sound image at a desired position.

Among virtual sound image localization techniques, there is a method that uses panning and head-related transfer functions to localize virtual sound images in front of and behind the listener. In this method, a virtual sound image is generated as follows.

First, to prepare a head-related transfer function, a speaker is placed at the position where the virtual sound image is to be localized, and the head-related transfer function from this speaker to the entrance of the listener's ear canal is measured. This becomes the head-related transfer function filter. The speaker placed at the position where the virtual sound image is to be localized is used only for measuring the head-related transfer function and is not installed during reproduction. During reproduction, only the plurality of speakers that reproduce the input signal are used.

Next, to localize a virtual sound image between two adjacent reproduction speakers among the plurality of speakers, the signals reproduced from those two speakers are panned so that the virtual sound image is localized between them.

Finally, to further improve the localization accuracy of the virtual sound image, the head-related transfer function filter measured with the speaker placed at the position where the virtual sound image is to be localized is convolved with the input signal before reproduction, thereby realizing localization by a virtual sound image. Patent Document 1 describes a method of realizing virtual sound image localization in this way (see Patent Document 1).

Patent Document 1: JP 2002-135899 A

However, even when multiple virtual sound images are generated using panning and head-related transfer function filters as in the method above, the localization of a virtual sound image is still insufficient compared with reproduction from a real speaker placed at the position of the virtual sound image. As a result, the high sense of envelopment that would be obtained with multiple real speakers cannot be achieved.

To solve the conventional problem described above, the sound image localization processing apparatus of the present invention is a sound reproduction device that uses a plurality of speakers installed around a listening position to reproduce an input signal for creating a virtual sound image and localize it at a virtual position. The apparatus comprises: virtual sound image generation processing means for processing the input signal for creating the virtual sound image and generating signals to be output to the two speakers sandwiching the virtual position; centroid calculating means for calculating the centroid of a multi-channel input signal that includes the input signal for creating the virtual sound image; and weighting coefficient determining means for determining a weighting coefficient according to the position and magnitude of the centroid calculated by the centroid calculating means. The virtual sound image generation processing means multiplies the input signal for creating the virtual sound image by the weighting coefficient determined by the weighting coefficient determining means.

Further, the centroid calculating means converts each input signal of the multi-channel signal into a vector whose direction points from the listening position, taken as the coordinate origin, toward the position of the speaker corresponding to that input signal, and whose magnitude is the average level of that input signal over each predetermined time interval, and calculates the centroid by combining these vectors.

Further, the weighting coefficient determining means divides the region in which the centroid can exist into a plurality of regions based on azimuth and magnitude, and determines the weighting coefficient according to which of the plurality of regions the azimuth and the magnitude of the centroid calculated by the centroid calculating means each belong to.

Further, the weighting coefficient determining means calculates an azimuth weighting coefficient according to which azimuth region the azimuth of the centroid calculated by the centroid calculating means belongs to, calculates a magnitude weighting coefficient according to which magnitude region the magnitude of that centroid belongs to, and determines the product of the azimuth weighting coefficient and the magnitude weighting coefficient as the weighting coefficient.

Further, the weighting coefficient determining means calculates the azimuth weighting coefficient so that it is larger for azimuth regions closer to the azimuth of the centroid, and calculates the magnitude weighting coefficient so that it is larger as the magnitude of the centroid is larger.

Further, the sound image localization processing method of the present invention is a sound reproduction method that uses a plurality of speakers installed around a listening position to reproduce an input signal for creating a virtual sound image and localize it at a virtual position. The method comprises: a virtual sound image generation processing step of processing the input signal for creating the virtual sound image and generating signals to be output to the two speakers sandwiching the virtual position; a centroid calculating step of calculating the centroid of a multi-channel input signal that includes the input signal for creating the virtual sound image; and a weighting coefficient determining step of determining a weighting coefficient according to the position and magnitude of the centroid calculated in the centroid calculating step. In the virtual sound image generation processing step, the input signal for creating the virtual sound image is multiplied by the weighting coefficient determined in the weighting coefficient determining step.

Further, the program of the present invention causes a computer to execute each step of the sound image localization processing method.

Further, the recording medium of the present invention stores the program.

According to the present invention, in a sound reproduction device that localizes sound reproduced from two or more speakers installed around a listener at a virtual position, the centroid of the multi-channel input signal is calculated, and the input signal is reproduced with a weighting coefficient, determined according to the centroid position, reflected in the virtual sound image generation processing. This further emphasizes the localization effect of the virtual sound image and improves the sense of envelopment of the sound field.

FIG. 1 is a block diagram of the sound image localization processing apparatus in Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing the configuration of the virtual sound image generation processing unit in Embodiment 1 of the present invention.
FIG. 3 is a diagram showing the centroid position calculated by vector decomposition of the input signals in Embodiment 1 of the present invention.
FIG. 4 is a diagram showing the weighting regions for the azimuth of the centroid G in Embodiment 1 of the present invention.
FIG. 5 is a diagram showing the weighting regions for the magnitude of the centroid G in Embodiment 1 of the present invention.

(Embodiment 1)
An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram for explaining the sound image localization processing of this embodiment, in which virtual sound image generation processing is applied to a multi-channel input signal such as a 5.1-channel signal, and virtual sound images 10 to 15 are localized using the following reproduction speakers: front L channel (FL) speaker 5, center channel (C) speaker 6, front R channel (FR) speaker 7, surround L channel (SL) speaker 8, and surround R channel (SR) speaker 9.

In FIG. 1, the multi-channel input signal is input from input terminal 1. The centroid calculation unit 2 calculates the centroid of the input signal. The weighting coefficient determination unit 3 determines weighting coefficients based on the centroid calculated by the centroid calculation unit 2. The virtual sound image generation processing unit 4 performs virtual sound image generation processing on the multi-channel input signal based on the weighting coefficients determined by the weighting coefficient determination unit 3, and generates the signals to be output to reproduction speakers 5 to 9. With this configuration, the listener 16 hears reproduced sound not only from FL speaker 5, C speaker 6, FR speaker 7, SL speaker 8, and SR speaker 9, but also, virtually, from the positions of virtual sound images 10 to 15.

Here, virtual sound image 10 is localized by processing the front L channel (FL) signal with panning and head-related transfer function filters and outputting the result to FL speaker 5 and SL speaker 8. Virtual sound image 11 is localized by processing the surround L channel (SL) signal with panning and head-related transfer function filters and outputting the result to FL speaker 5 and SL speaker 8. Virtual sound image 12 is localized by processing the SL signal with panning and head-related transfer function filters and outputting the result to SL speaker 8 and SR speaker 9. The same applies to the R side.

The sound image localization processing apparatus configured as above is described below.

First, the virtual sound image generation processing unit 4 is described. FIG. 2 shows an example of its configuration. In FIG. 2, 211 to 216 are coefficient units that multiply the signals for creating the virtual sound images by weighting coefficients K11 to K16, respectively; 221 to 232 are head-related transfer function filters having characteristics EQ21 to EQ32, respectively; 241 to 252 are coefficient units that multiply the output signals of head-related transfer function filters 221 to 232 by panning coefficients K41 to K52, respectively; and 261 to 264 are adders.

For virtual sound image 10, the FL signal is multiplied by weighting coefficient K11 in coefficient unit 211 and is then output to FL speaker 5 via head-related transfer function filter 221 and coefficient unit 241, and to SL speaker 8 via head-related transfer function filter 223 and coefficient unit 243. For virtual sound image 11, the SL signal is multiplied by weighting coefficient K12 in coefficient unit 212 and is then output to FL speaker 5 via head-related transfer function filter 222 and coefficient unit 242, and to SL speaker 8 via head-related transfer function filter 224 and coefficient unit 244. For virtual sound image 12, the SL signal is multiplied by weighting coefficient K13 in coefficient unit 213 and is then output to SL speaker 8 via head-related transfer function filter 225 and coefficient unit 245, and to SR speaker 9 via head-related transfer function filter 227 and coefficient unit 247. The same applies to the R side.
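The signal path for a single virtual sound image can be sketched as follows. This is only an illustrative sketch: the function names, the FIR representation of the head-related transfer function filters, and all numeric values are assumptions and do not come from the patent text.

```python
def fir_filter(signal, taps):
    """Convolve a signal with FIR taps, truncated to the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def render_virtual_image(source, weight, taps_a, taps_b, pan_a, pan_b):
    """Return the feeds for the two speakers sandwiching one virtual image."""
    # Weighting coefficient unit (for virtual sound image 10: unit 211, K11).
    weighted = [weight * s for s in source]
    # HRTF filter, then panning coefficient unit, per speaker
    # (for virtual sound image 10: 221 -> 241 and 223 -> 243).
    feed_a = [pan_a * y for y in fir_filter(weighted, taps_a)]
    feed_b = [pan_b * y for y in fir_filter(weighted, taps_b)]
    return feed_a, feed_b


# Made-up values: an impulse on FL and two-tap stand-ins for the filters.
fl_signal = [1.0, 0.0, 0.0, 0.0]
to_fl, to_sl = render_virtual_image(
    fl_signal, 0.5, [1.0, 0.5], [0.5, 0.25], 0.6, 0.8
)
```

In the configuration of FIG. 2, feeds destined for the same speaker would then be summed by the adders 261 to 264.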

The panning coefficients are set as follows. When two speakers A and B are used to generate a virtual sound image C at a position between them, let a be the angle between speaker A and virtual sound image C, and b the angle between speaker B and virtual sound image C. The ratio of the signal level PA output to speaker A to the signal level PB output to speaker B is then given by (Equation 1).

For example, the coefficients K41 and K43 of coefficient units 241 and 243 for panning virtual sound image 10 may be set so that their ratio satisfies (Equation 1) and the total power is unchanged, that is, so that the mean square is 1. The same applies to the other virtual sound images.
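Because (Equation 1) is not reproduced in this text, the following is only a hedged sketch of how such panning coefficients could be computed. It assumes a simple linear law in which the level ratio PA : PB equals b : a (the speaker closer to the virtual image gets the larger level), and it reads "total power is unchanged" as requiring the squares of the two coefficients to sum to 1. Both choices, and the function name, are assumptions rather than the patent's stated formula.

```python
import math


def panning_coefficients(angle_a_deg, angle_b_deg):
    """Return (k_a, k_b) for speakers A and B.

    ASSUMPTION: level ratio PA:PB = b:a, where a and b are the angles
    between the virtual image and speakers A and B. The patent's
    Equation 1 may define a different ratio.
    """
    if angle_a_deg < 0 or angle_b_deg < 0 or angle_a_deg + angle_b_deg == 0:
        raise ValueError("angles must be non-negative and not both zero")
    raw_a, raw_b = angle_b_deg, angle_a_deg  # smaller angle -> larger level
    norm = math.hypot(raw_a, raw_b)  # ASSUMPTION: k_a^2 + k_b^2 = 1
    return raw_a / norm, raw_b / norm


# A virtual image 30 degrees from speaker A and 60 degrees from speaker B.
k_a, k_b = panning_coefficients(30.0, 60.0)
```

With these inputs speaker A, being closer to the virtual image, receives twice the level of speaker B while the summed power stays constant.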

Next, the characteristics of the head-related transfer function filters are described. To generate virtual sound image C, the head-related transfer function filter for the output to speaker A has a characteristic equal to the head-related transfer characteristic from virtual sound image C to the listener divided by the head-related transfer characteristic from speaker A to the listener. Likewise, the filter for the output to speaker B has a characteristic equal to the head-related transfer characteristic from virtual sound image C to the listener divided by the head-related transfer characteristic from speaker B to the listener.

For example, the characteristic EQ21 of head-related transfer function filter 221 for generating virtual sound image 10 is the head-related transfer characteristic from virtual sound image 10 to the listener 16 divided by the head-related transfer characteristic from FL speaker 5 to the listener 16, and the characteristic EQ23 of head-related transfer function filter 223 is the head-related transfer characteristic from virtual sound image 10 to the listener 16 divided by the head-related transfer characteristic from SL speaker 8 to the listener 16. The same applies to the other virtual sound images.
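The division of transfer characteristics described above can be illustrated per frequency bin as follows. This is a sketch under stated assumptions: each transfer characteristic is represented as a list of complex frequency-response values (one path per response, with no separate treatment of the two ears), and the small regularization term `eps` is an addition for numerical safety that the patent text does not mention.

```python
def hrtf_ratio(h_virtual, h_speaker, eps=1e-6):
    """Per-bin characteristic EQ = H_virtual / H_speaker.

    eps (an assumption, not from the source) avoids division by zero
    where the speaker response is close to zero.
    """
    if len(h_virtual) != len(h_speaker):
        raise ValueError("responses must have the same number of bins")
    eq = []
    for hv, hs in zip(h_virtual, h_speaker):
        # hv * conj(hs) / (|hs|^2 + eps) equals hv / hs when eps is 0.
        eq.append(hv * hs.conjugate() / (abs(hs) ** 2 + eps))
    return eq


# Made-up two-bin responses, for illustration only.
h_image_10 = [1 + 0j, 0.5j]
h_fl_speaker = [2 + 0j, 1j]
eq21 = hrtf_ratio(h_image_10, h_fl_speaker, eps=0.0)
```

A practical implementation would then convert such a characteristic into a time-domain filter; that step is outside what the text describes.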

To separate the virtual sound image from the signal from which it is derived, the head-related transfer function filters 221 to 232 may additionally have a delay characteristic.

When the weighting coefficients K11 to K16 are all set to 1, the head-related transfer function filters 221 to 232 and coefficient units 241 to 252 can localize virtual sound images 10 to 15 to some extent. However, a virtual sound image localized in this way has a weaker sense of localization than the real sound image obtained by reproducing the original signal from a real speaker, so a sufficient sense of envelopment cannot be obtained.

In this embodiment, therefore, the azimuth and magnitude of the centroid G obtained by combining the input signals of all channels are detected in real time. Weighting is applied so that a signal for creating a virtual sound image is boosted more the closer that virtual sound image is to the azimuth of the centroid G, and the degree of boosting is made stronger the larger the magnitude of the centroid G. This strengthens the sense of localization of the virtual sound images.

To perform this weighting, the centroid calculation unit 2 first calculates the centroid G of the input signal. Based on the result, the weighting coefficient determination unit 3 determines the weighting coefficients, which are then set as the coefficients K11 to K16 of coefficient units 211 to 216.

Next, the centroid calculation unit 2 is described. FIG. 3 shows how the input signal of each channel is decomposed into vector components and the centroid G is calculated. For the multi-channel signal input from input terminal 1, the centroid calculation unit 2 decomposes the average level of each channel's input signal over a predetermined time interval into x-axis and y-axis components, in a vector coordinate system whose origin is the position of the listener 16. Let the x-axis and y-axis components of the FL signal, expressed as average levels, be FLx and FLy; those of the C signal, Cx and Cy; those of the FR signal, FRx and FRy; those of the SL signal, SLx and SLy; and those of the SR signal, SRx and SRy. The centroid G can then be calculated as in (Equation 2). The factor (-1) is applied in order to convert the centroid calculated from the input signals into the centroid as seen from the listener. Each x-axis and y-axis component is a scalar value whose direction is expressed by its sign.

Here, Gx and Gy are the x-axis and y-axis components of the centroid G, and |G| corresponds to the distance from the listener to the centroid G in FIG. 3 and represents the average level of the signal obtained by combining the input signals of all channels. In other words, the position obtained by vector-summing Gx and Gy from the position of the listener 16 is the position of the centroid G, and its magnitude is |G|.
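Since (Equation 2) itself does not appear in this text, the following sketch is only a reconstruction consistent with the surrounding description: Gx and Gy are taken as (-1) times the sums of the per-channel x and y components, and |G| as the Euclidean norm of (Gx, Gy). The speaker azimuths, the use of the mean absolute value as the "average level", the coordinate convention (azimuth measured counterclockwise from the +x axis, front at 90 degrees), and all names are assumptions.

```python
import math

# ASSUMED speaker azimuths in degrees, counterclockwise from the +x axis
# (front = 90). The patent text does not give speaker angles.
SPEAKER_AZIMUTH_DEG = {"FL": 120.0, "C": 90.0, "FR": 60.0, "SL": 200.0, "SR": -20.0}


def mean_level(block):
    """Mean absolute value of one block of samples (one possible 'average level')."""
    return sum(abs(s) for s in block) / len(block)


def centroid(levels, azimuths=SPEAKER_AZIMUTH_DEG):
    """Return (gx, gy, magnitude, azimuth_deg) of the centroid G.

    levels maps a channel name to its average level for the current interval.
    """
    sum_x = sum_y = 0.0
    for name, level in levels.items():
        theta = math.radians(azimuths[name])
        sum_x += level * math.cos(theta)  # x component toward the speaker
        sum_y += level * math.sin(theta)  # y component toward the speaker
    # Factor (-1) as described in the text; the exact form of Equation 2
    # is an assumption.
    gx, gy = -sum_x, -sum_y
    return gx, gy, math.hypot(gx, gy), math.degrees(math.atan2(gy, gx))


# Only the center channel is active in this made-up interval.
gx, gy, g_mag, g_az = centroid({"C": mean_level([1.0, -1.0, 1.0, -1.0])})
```

Calling `centroid` once per time interval would give the real-time tracking of the centroid that the embodiment relies on.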

Next, the weighting coefficient determination unit 3 is described. The weighting coefficient is determined according to both the azimuth θ and the magnitude |G| of the centroid G.

FIG. 4 shows the weighting regions for the azimuth of the centroid G. First, the weighting coefficient N for the azimuth θ of the centroid G is described. The azimuth region centered on the azimuth θ of the centroid G (for example, ±30°) is denoted θ1, the adjacent region in the counterclockwise direction (for example, 60° wide) θ2p, the adjacent region in the clockwise direction (for example, 60° wide) θ2m, the regions adjacent to those (for example, 60° wide) θ3p and θ3m, and so on. The azimuth weighting coefficients used to generate the virtual sound images belonging to azimuth regions θ1, (θ2p, θ2m), (θ3p, θ3m), ... are set to N1, N2, N3, ... respectively, where N1 > N2 > N3 > ... > 0 (for example, N1 = 1.0, N2 = 0.8, N3 = 0.6, ...).

That is, in FIG. 3, the azimuth weighting coefficient is N1 for generating virtual sound images 10 and 11, N2 for generating virtual sound image 12, and N3 for generating virtual sound images 13 and 15, while virtual sound image 14 is not weighted.
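The region lookup for the azimuth weighting coefficient N can be sketched as follows, using the example widths (±30° and 60°) and example values (1.0, 0.8, 0.6) given above. How region boundaries are assigned, and the value used for a virtual sound image that falls outside the listed regions ("not weighted" in the text), are not specified, so `outside_weight` is a required, hypothetical parameter and the inclusive upper bounds are an assumption.

```python
import math


def angular_distance_deg(a, b):
    """Smallest absolute difference between two azimuths, in [0, 180]."""
    diff = (a - b) % 360.0
    return min(diff, 360.0 - diff)


def azimuth_weight(image_az_deg, centroid_az_deg, outside_weight,
                   half_width_deg=30.0, step_deg=60.0,
                   weights=(1.0, 0.8, 0.6)):
    """Return N for a virtual image, given the azimuth of the centroid G."""
    dist = angular_distance_deg(image_az_deg, centroid_az_deg)
    if dist <= half_width_deg:
        index = 0  # region theta1
    else:
        # theta2p/theta2m -> 1, theta3p/theta3m -> 2, ...
        # ASSUMPTION: upper boundaries are inclusive.
        index = math.ceil((dist - half_width_deg) / step_deg)
    return weights[index] if index < len(weights) else outside_weight


# Centroid at 0 degrees; images at 10, 60, 120 and 180 degrees.
# outside_weight=0.0 is chosen purely for illustration.
n_values = [azimuth_weight(az, 0.0, outside_weight=0.0)
            for az in (10.0, 60.0, 120.0, 180.0)]
```

Because the plus and minus regions share a coefficient, only the absolute angular distance from the centroid azimuth is needed.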

Next, the weighting coefficient D for the magnitude |G| of the centroid G is described with reference to FIG. 5.

Consider regions bounded by concentric circles centered on the listener's position, with radii d1, d2, d3, ... (0 < d1 < d2 < d3 < ...) increasing with distance from that position. The weighting coefficient D for the magnitude |G| of the centroid G is determined as follows, according to which region the magnitude |G| falls in.

(For example, D1 = 0.1, D2 = 0.5, D3 = 1.0, ...)
In the case of FIG. 5, the weighting coefficient D for the magnitude |G| of the centroid G is D2.
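The mapping from |G| to D is not reproduced in this text, so the following is a hedged sketch of one plausible reading: |G| inside the circle of radius d1 gives D1, between d1 and d2 gives D2, and so on. The radii values, the inclusive upper boundaries, and the handling of |G| beyond the largest radius are all assumptions; only the example values D1 = 0.1, D2 = 0.5, D3 = 1.0 come from the text.

```python
from bisect import bisect_left


def magnitude_weight(magnitude, radii=(0.25, 0.5, 1.0), weights=(0.1, 0.5, 1.0)):
    """Map |G| to D according to the concentric ring it falls in.

    radii are hypothetical values for d1 < d2 < d3; weights are D1, D2, D3.
    """
    if len(radii) != len(weights):
        raise ValueError("radii and weights must have the same length")
    # Index of the first radius that is >= magnitude (inclusive upper bounds).
    index = bisect_left(radii, magnitude)
    # ASSUMPTION: beyond the outermost circle, reuse the largest weight.
    return weights[min(index, len(weights) - 1)]


# |G| = 0.3 lies between the assumed d1 = 0.25 and d2 = 0.5.
d_value = magnitude_weight(0.3)
```

This keeps D non-decreasing in |G|, in line with the stated aim of boosting more strongly when the centroid magnitude is larger.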

The product N·D of the weighting coefficient N for the azimuth θ of the centroid G and the weighting coefficient D for the magnitude |G| of the centroid G, both determined as described above, is taken as the weighting coefficient K. This weighting coefficient K is set as each of the coefficients K11 to K16 of coefficient units 211 to 216 in the virtual sound image generation processing unit 4.

In this way, the centroid G of the input signals of all channels is calculated in real time, the weighting coefficient K is calculated from the centroid G, and the signals for creating the virtual sound images are multiplied by that weighting coefficient. The weighting coefficient can thus be changed adaptively as the centroid changes with the input signal. As a result, a high sense of envelopment is obtained in which the sense of localization of the virtual sound images is emphasized according to the centroid position.

In the virtual sound image generation processing unit shown in FIG. 2, the weighting coefficients are set as the coefficients K11 to K16 of coefficient units 211 to 216. If the coefficients K41 to K52 of coefficient units 241 to 252 are instead multiplied by the weighting coefficients, coefficient units 211 to 216 can be omitted.

The virtual sound image generation processing unit shown in FIG. 2 uses both panning and head-related transfer function filters to generate the virtual sound images, but only one of the two may be used.

In the description above, when the weighting coefficient determination unit determines the azimuth weighting coefficient, the azimuth regions θ1, θ2p, θ2m, θ3p, θ3m, ... are set relative to the azimuth θ of the centroid G. Alternatively, a plurality of fixed azimuth regions may be set in advance by dividing the full circle at a predetermined angular interval (for example, every 30°). In that case, the azimuth region in which the azimuth θ of the centroid G falls is taken as θ1, the regions on either side of it as θ2p and θ2m, the regions adjacent to those as θ3p and θ3m, and so on.

The present invention is useful for equipment that can reproduce music signals and includes a device for driving one or more pairs of speakers, such as surround systems, TVs, AV amplifiers, compact stereo systems, mobile phones, and portable audio devices.

Description of Symbols
1 Input terminal
2 Centroid calculation unit
3 Weighting coefficient determination unit
4 Virtual sound image generation processing unit
5 FL speaker
6 C speaker
7 FR speaker
8 SL speaker
9 SR speaker
10 to 15 Virtual sound images
16 Listener
211 to 216 Coefficient units
221 to 232 Head-related transfer function filters
241 to 252 Coefficient units
261 to 264 Adders

Claims

A sound image localization processing apparatus, being a sound reproduction device that reproduces an input signal for creating a virtual sound image using a plurality of speakers installed around a listening position and localizes it at a virtual position, the apparatus comprising:
virtual sound image generation processing means for processing the input signal for creating the virtual sound image and generating signals to be output to two speakers sandwiching the virtual position;
centroid calculating means for calculating the centroid of a multi-channel input signal including the input signal for creating the virtual sound image; and
weighting coefficient determining means for determining a weighting coefficient according to the position and magnitude of the centroid calculated by the centroid calculating means,
wherein the virtual sound image generation processing means multiplies the input signal for creating the virtual sound image by the weighting coefficient determined by the weighting coefficient determining means.

The sound image localization processing apparatus according to claim 1, wherein the centroid calculating means converts each input signal of the multi-channel signal into a vector whose direction points from the listening position, taken as the coordinate origin, toward the position of the speaker corresponding to that input signal and whose magnitude is the average level of that input signal over each predetermined time interval, and calculates the centroid by combining these vectors.

The sound image localization processing apparatus according to claim 1, wherein the weighting coefficient determining means divides the region in which the centroid can exist into a plurality of regions based on azimuth and magnitude, and determines the weighting coefficient according to which of the plurality of regions the azimuth and the magnitude of the centroid calculated by the centroid calculating means each belong to.

The sound image localization processing apparatus according to claim 3, wherein the weighting coefficient determining means calculates an azimuth weighting coefficient according to which azimuth region the azimuth of the centroid calculated by the centroid calculating means belongs to, calculates a magnitude weighting coefficient according to which magnitude region the magnitude of that centroid belongs to, and determines the product of the azimuth weighting coefficient and the magnitude weighting coefficient as the weighting coefficient.

The sound image localization processing apparatus according to claim 4, wherein the weighting coefficient determining means calculates the azimuth weighting coefficient so that it is larger for azimuth regions closer to the azimuth of the centroid, and calculates the magnitude weighting coefficient so that it is larger as the magnitude of the centroid is larger.

A sound image localization processing method, being a sound reproduction method for reproducing an input signal for creating a virtual sound image using a plurality of speakers installed around a listening position and localizing it at a virtual position, the method comprising:
a virtual sound image generation processing step of processing the input signal for creating the virtual sound image and generating signals to be output to two speakers sandwiching the virtual position;
a centroid calculating step of calculating the centroid of a multi-channel input signal including the input signal for creating the virtual sound image; and
a weighting coefficient determining step of determining a weighting coefficient according to the position and magnitude of the centroid calculated in the centroid calculating step,
wherein, in the virtual sound image generation processing step, the input signal for creating the virtual sound image is multiplied by the weighting coefficient determined in the weighting coefficient determining step.

A program for causing a computer to execute each step of the sound image localization processing method according to claim 6.

A recording medium storing the program according to claim 7.