JPH0937400A

JPH0937400A - Sound image localization controller

Info

Publication number: JPH0937400A
Application number: JP7206558A
Authority: JP
Inventors: Jiro Nakaso; 二郎中曽
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 1995-07-20
Filing date: 1995-07-20
Publication date: 1997-02-07

Abstract

PROBLEM TO BE SOLVED: To localize a sound image at a prescribed position, without the sound clinging around the listener's ears, by convolving the sound source with coefficients calculated from a virtual impulse response in which a reflected sound is added ahead of the direct sound by a specific time. SOLUTION: A switch SW selects between the digital audio signal and the digitized analog audio signal; a serial-parallel converter 5 converts the selected signal into parallel signals, which are supplied to the left- and right-channel convolvers 6, 7 and 8, 9. Each convolver convolves the sound source data on the time axis with coefficients that correspond to the prescribed sound image localization position and are selected from a ROM 10 by a control sub-CPU 11. Adders 12 and 13 sum the processed signals into left- and right-channel signals, which are output to the speakers. The coefficients stored in the ROM 10 are calculated from a virtual impulse response in which a reflected sound at -10 to -13 dB is added 9.1 to 11.3 ms ahead of the direct sound. Because the sound image is localized at the prescribed position in this way, the reproduced sound is heard as natural.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a sound image localization control device that makes a listener perceive a sound image as localized at a desired arbitrary position different from the positions of the actual transducers (speakers, headphones, etc.).

[0002]

[Prior Art] Conventionally, when a stereo or monaural signal is reproduced through ordinary headphones or through multiple speakers, the sound tends, with headphones, to be confined inside the head or to cling around the ears, and, with speakers, the sound image tends to be heard closer than the intended position.

[0003] To avoid these drawbacks, various techniques (binaural techniques) have been proposed that convolve the impulse responses from a sound source to both ears, measured in free space, with the source signal to be reproduced, so that the sound is perceived as coming from outside the head or from a desired position.

[0004] For example, FIG. 6 shows one example of such prior art: the principle of a sound image localization control device previously proposed by the present applicant. In this device, coefficients cfLx and cfRx that realize, by FIR filter processing, the transfer characteristics required for localization position x are created in advance and, for example, stored in a ROM.

[0005] For example, to localize a sound source X at a desired position, the transfer characteristics cfLx and cfRx, prepared in the ROM on the basis of prior measurements, are transferred to FIR digital filters, the signal from the sound source X is convolved with them, and the processed signals are reproduced from a pair of speakers SP1 and SP2, so that the sound image is localized at any desired position.

[0006] The principle is explained further. Let h1L and h1R be the head-related transfer characteristics (the frequency responses of the impulse responses) from speaker SP1 to the left and right ears of listener M, and let h2L and h2R be those from speaker SP2 to the left and right ears. Also, let pLx and pRx be the transfer characteristics to the left and right ears of listener M when an actual speaker is placed at the intended localization position x.

[0007] The transfer characteristics pLx and pRx are obtained, for example, by placing a speaker and a human head or dummy head with microphones at both ear positions in an anechoic space, and applying appropriate waveform processing to the measurements. FIG. 7 shows the measurement system for obtaining pLx and pRx. A pair of microphones ML and MR is installed at both ears of a dummy head (or human head) DM to pick up the measurement sound from a speaker SP, and a recorder DAT records the source sound (reference data) refL, refR and the measured sound (measurement data) L, R in synchronization. Predetermined waveform processing is applied to the recorded data to obtain the transfer characteristics pLx and pRx.

[0008] Using these transfer characteristics pLx and pRx, the characteristics cfLx and cfRx are derived by the following reasoning. Let eL and eR be the signals obtained at the left and right ears of listener M. Then

$$e_L = h_{1L}\cdot \mathrm{cf}_{Lx}\cdot X + h_{2L}\cdot \mathrm{cf}_{Rx}\cdot X \tag{1a}$$

$$e_R = h_{1R}\cdot \mathrm{cf}_{Lx}\cdot X + h_{2R}\cdot \mathrm{cf}_{Rx}\cdot X \tag{1b}$$

[0009] On the other hand, let dL and dR be the signals obtained at the left and right ears of listener M when the source X is reproduced from the target localization position. Then

$$d_L = p_{Lx}\cdot X \tag{2a}$$

$$d_R = p_{Rx}\cdot X \tag{2b}$$

[0010] If the signals obtained at the left and right ears of listener M from speakers SP1 and SP2 match the signals obtained when the source is reproduced from the target position, listener M perceives the sound image as if a speaker were present at the target position.

[0011] That is, eliminating X from the conditions eL = dL, eR = dR and equations (1a), (1b), (2a), (2b) gives

$$h_{1L}\cdot \mathrm{cf}_{Lx} + h_{2L}\cdot \mathrm{cf}_{Rx} = p_{Lx} \tag{3a}$$

$$h_{1R}\cdot \mathrm{cf}_{Lx} + h_{2R}\cdot \mathrm{cf}_{Rx} = p_{Rx} \tag{3b}$$

[0012] Solving equations (3a) and (3b) for cfLx and cfRx gives

$$\mathrm{cf}_{Lx} = (h_{2R}\cdot p_{Lx} - h_{2L}\cdot p_{Rx}) / H \tag{4a}$$

$$\mathrm{cf}_{Rx} = (-h_{1R}\cdot p_{Lx} + h_{1L}\cdot p_{Rx}) / H \tag{4b}$$

where

$$H = h_{1L}\cdot h_{2R} - h_{2L}\cdot h_{1R} \tag{4c}$$
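The elimination from (3a) and (3b) can be checked numerically. The following is a minimal sketch, not the applicant's implementation: it treats every characteristic as a single complex value at one frequency bin and solves the 2x2 system by Cramer's rule. The function name `cancel_filters` and all numeric values are made up for illustration. The cfRx numerator is written as -h1R·pLx + h1L·pRx, which is what elimination from (3a) and (3b) yields and what equation (7b) uses.

```python
def cancel_filters(h1L, h1R, h2L, h2R, pLx, pRx):
    """Solve (3a)/(3b) for cfLx and cfRx at one frequency bin."""
    H = h1L * h2R - h2L * h1R  # determinant of the speaker-to-ear matrix, (4c)
    if H == 0:
        raise ZeroDivisionError("speaker-to-ear matrix is singular at this bin")
    cfLx = (h2R * pLx - h2L * pRx) / H   # (4a)
    cfRx = (-h1R * pLx + h1L * pRx) / H  # (4b), sign as in (7b)
    return cfLx, cfRx


# Made-up complex responses for a single bin (illustrative values only).
h1L, h1R = 1.0 + 0.2j, 0.4 - 0.1j
h2L, h2R = 0.4 + 0.1j, 1.0 - 0.2j
pLx, pRx = 0.7 + 0.3j, 0.2 - 0.5j

cfLx, cfRx = cancel_filters(h1L, h1R, h2L, h2R, pLx, pRx)

# Residuals of (3a) and (3b); both should be numerically zero.
residual_left = abs(h1L * cfLx + h2L * cfRx - pLx)
residual_right = abs(h1R * cfLx + h2R * cfRx - pRx)
```

In a real system the same solve would be repeated for every frequency bin and every localization position x.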

[0013] Therefore, processing the signal to be localized with the transfer characteristics cfLx and cfRx calculated from equations (4a) to (4c) localizes the sound image at the target position x. Specifically, as described above, cfLx and cfRx are created in advance for as many localization positions x as the principle configuration of FIG. 6 requires and are stored in a ROM, and the filter coefficients are controlled by a controller.

[0014]

[Problem to Be Solved by the Invention] Although the configuration described above yields a considerable improvement over earlier techniques, the impulse response measured in free space contains, as shown in FIG. 8, few reflections following the direct sound. The effect is therefore smaller than hoped, and the perceived sound image is still sometimes heard considerably closer than the intended position. The sound image is also known to rise in elevation.

[0015] In view of this, measuring the impulse responses in a room with good acoustic characteristics has also been proposed. This method can mitigate the problems above, but impulse responses measured this way are very long, and providing coefficients to match them is not very practical from an implementation standpoint.

[0016] For this reason, in most cases either an impulse response scaled by a limited window is used, or a long reflection structure is created artificially to enhance the effect.

[0017] In the former case, however, the impulse response is cut off partway, which weakens the effect. In the latter case, because the reflection structure is created artificially, the sound quality becomes metallic and unnatural.

[0018] The present invention therefore aims to solve the problems above and to provide a sound image localization control device that localizes the sound image farther away without the sound clinging around the ears, improves sound quality beyond the prior art, and allows for a compact circuit.

[0019]

[Means for Solving the Problem] To solve the above problems, the sound image localization control device of the present invention reproduces, from a pair of spaced-apart transducers, signals obtained by processing the same sound source with a pair of convolvers, and thereby makes the listener perceive a sound image as localized at an arbitrary position different from the pair of transducers. The device comprises: a pair of convolvers that convolve the signal from the same sound source according to set coefficients; storage means that holds groups of cancel-filter coefficients calculated as impulse responses on the basis of head-related transfer functions measured at each sound image localization position; and coefficient supply means that supplies the coefficients corresponding to a designated sound image localization position from the storage means to the pair of convolvers. The cancel-filter coefficient groups are calculated on the basis of a virtual impulse response in which a reflected sound, which in reality arrives later than the direct sound reaching the listener directly, is added ahead of the direct sound.

[0020] In the sound image localization control device configured as above, the lead time of the reflected sound relative to the direct sound, measured between the maximum levels of the two sounds, is set to 9.1 ms to 11.3 ms, and the maximum level of the reflected sound is set to -10 dB to -13 dB relative to the maximum level of the direct sound.

[0021] Furthermore, in the sound image localization control device configured as above, the reflected sound added ahead of the direct sound is made to converge to zero before the direct sound begins.

[0022]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

<Basic concept> The present invention was achieved on the basis of the following analysis of causes. The main reason why the short impulse response noted above is heard much closer than the intended position is thought to be that masking reduces the effect of the reflected sounds.

[0023] Two representative phenomena of human hearing are the "Haas effect" (precedence effect) and the "masking effect". First, the Haas effect is explained with reference to FIGS. 9(a) and 9(b). FIG. 9(a) shows the arrangement of speakers SP1 and SP2 relative to listener M. FIG. 9(b) plots the angular position ψ at which the sound image is localized (vertical axis) against the delay time St of the sound emitted from SP2 relative to SP1 (horizontal axis). Suppose the two speakers SP1 and SP2 emit aperiodic, coherent signals. If both speakers emit the same signal simultaneously and at the same level, the sound image is localized directly in front of listener M. If the emission times of the two speakers differ, the sound image moves toward the leading speaker.

[0024] In FIG. 9(b), the angular position of speaker SP1 is shown as ψ = 40° and that of speaker SP2 as ψ = -40°. As the figure shows, while the sound from SP2 lags the sound from SP1 by up to about 30 ms, the sound image is localized on the side of SP1, which emits first. Beyond about 30 ms, as the delay reaches about 50 ms, the sound image is heard as split in two. The leading sound thus has a strong effect on hearing. This phenomenon is called the Haas effect or the precedence effect.

[0025] Next, the masking effect is explained. Masking includes a forward effect and a backward effect. The forward effect is the phenomenon in which a reflected sound is masked out by a high-level direct sound.

[0026] The backward effect is a kind of temporal (non-simultaneous) masking in which a preceding sound is suppressed by a sound that arrives later; that is, the preceding sound becomes inaudible. It occurs when the level of the later-arriving sound is high.

[0027] FIG. 10 illustrates this phenomenon for listener M and speakers SP1 and SP2 arranged as in FIG. 9(a). The vertical axis shows the level difference between SP1 and SP2 [(level Ls0 of SP1) - (level Lst of SP2)], and the horizontal axis shows the delay time St of the sound emitted from SP2 relative to SP1.

[0028] The figure shows that when the sound from SP2 lags the sound from SP1 by about 10 ms or more and the level difference is about 15 dB or more, the preceding sound, that is, the sound from SP1, is suppressed (presumably, it becomes inaudible). This phenomenon is called the backward effect.

[0029] The present invention addresses the problems described above, for an impulse response of finite length, by exploiting these phenomena. To strengthen the effect of the reflected sound, the reflected sound is placed ahead of the direct sound, and at the same time the level of this leading reflected sound is held low enough that backward masking by the direct sound makes it inaudible.

[0030] <Embodiment> Next, an embodiment of a sound image localization control device based on these phenomena is described. FIG. 1 is a schematic block diagram of the device. Reference numeral 1 denotes an input terminal for a digital audio signal, connected to terminal a of switch SW. Reference numerals 2 and 3 denote the L-channel and R-channel input terminals for analog audio signals; they are connected to an analog-to-digital (A/D) converter 4, which converts the left and right analog audio signals arriving in parallel into serial digital data. Reference numeral 5 denotes a serial-to-parallel converter, a circuit that converts the serial data selected by switch SW into left and right parallel audio data.

[0031] The serial-to-parallel converter 5 is connected to convolvers 6, 7 and convolvers 8, 9, which are provided in pairs. Coefficients corresponding to each sound image localization position are supplied from ROM 10 to convolvers 6 to 9, which perform the convolution. The coefficients stored in ROM 10 take the masking effect into account as described above; they are obtained from the formulas given below using a measurement system like that of FIG. 7. The control sub-CPU 11 selects the required coefficients, which are supplied to convolvers 6 to 9.

[0032] Adders 12 and 13 combine the signals supplied from convolvers 6 to 9 into L-channel and R-channel signals, which are output from output terminals 15 and 16.
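The signal flow through convolvers 6 to 9 and adders 12 and 13 can be sketched in software. The text does not spell out which convolver feeds which adder, so the wiring below is an assumption: each of the two parallel audio streams is convolved with its own (cfLx, cfRx) pair, adder 12 sums the two left-channel results, and adder 13 sums the two right-channel results. The names `fir_convolve` and `localize` and all sample values are hypothetical, and both streams and all coefficient sets are assumed to have matching lengths.

```python
def fir_convolve(signal, coeffs):
    """Direct-form FIR convolution; output length is len(signal) + len(coeffs) - 1."""
    out = [0.0] * (len(signal) + len(coeffs) - 1)
    for n, s in enumerate(signal):
        for k, c in enumerate(coeffs):
            out[n + k] += s * c
    return out


def localize(src_a, src_b, coeffs_a, coeffs_b):
    """coeffs_a and coeffs_b are (cfLx, cfRx) pairs, one pair per input stream."""
    a_left = fir_convolve(src_a, coeffs_a[0])    # assumed role of convolver 6
    a_right = fir_convolve(src_a, coeffs_a[1])   # assumed role of convolver 7
    b_left = fir_convolve(src_b, coeffs_b[0])    # assumed role of convolver 8
    b_right = fir_convolve(src_b, coeffs_b[1])   # assumed role of convolver 9
    out_left = [x + y for x, y in zip(a_left, b_left)]     # adder 12 (assumed)
    out_right = [x + y for x, y in zip(a_right, b_right)]  # adder 13 (assumed)
    return out_left, out_right


# Two short made-up streams and two-tap coefficient sets.
out_l, out_r = localize(
    [1.0, 0.0], [0.0, 1.0],
    ([0.5, 0.25], [0.1, 0.2]),
    ([0.3, 0.0], [0.0, 0.4]),
)
```

A hardware convolver would process samples as they arrive rather than whole buffers; the buffered form is used here only to keep the sketch short.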

[0033] Next, the method of obtaining the coefficients stored in ROM 10, which characterizes the present invention, is described with reference to FIG. 2 and the previously mentioned FIGS. 6 and 7. FIG. 2 shows the steps of the coefficient calculation.

Measurement of the head-related transfer function (hereinafter HRTF) (step 101)

A pair of microphones ML and MR is installed at both ears of a dummy head (or human head) DM to pick up the measurement sound from a speaker, and a recorder DAT records the source sound (reference data) refL, refR and the measured sound (measurement data) L, R in synchronization.

[0034] As the source sound XH, an impulse, white noise, or other noise can be used. From the standpoint of statistical processing in particular, white noise is continuous and has a constant energy distribution across the audio band, so using white noise improves the S/N ratio.

[0035] The speaker is placed at a number of predetermined angular positions, with the front taken as 0 degrees (for example, 12 points at 30-degree intervals), and a recording of a predetermined duration is made continuously at each position. As shown for example in FIG. 3(a), the recording is a waveform in which the direct sound is followed in succession by a first reflected sound, a second reflected sound, a third reflected sound, and so on.

[0036] Extraction of the first reflected sound (step 102)

From the waveform shown in FIG. 3(a), the first reflected sound is cut out by applying a window of a predetermined range on the time axis [FIG. 3(b)].

[0037] Shaping of the first reflected sound (step 103)

Unwanted band components of the extracted first reflected sound, for example a dip occurring at high frequencies, are removed with a BPF (band-pass filter). At the same time, the level is set so that the first reflected sound is -10 dB to -13 dB relative to the level of the direct sound. The level is calculated, for example, from the average of the two largest values of the sampled data.
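The level setting in step 103 can be sketched as follows. This is one possible reading, not the applicant's procedure: the "level" of a waveform is taken to be the mean of its two largest absolute sample values, and the reflected sound is multiplied by a single gain so that it sits at a chosen target within the -10 dB to -13 dB range. The function names, the -12 dB default, and the sample values are hypothetical.

```python
import math


def peak_level(samples):
    """Mean of the two largest absolute sample values (assumed reading of the text)."""
    top_two = sorted((abs(s) for s in samples), reverse=True)[:2]
    return sum(top_two) / len(top_two)


def scale_reflection(reflection, direct, target_db=-12.0):
    """Scale the reflection so its level is target_db relative to the direct sound."""
    gain = peak_level(direct) * 10.0 ** (target_db / 20.0) / peak_level(reflection)
    return [gain * s for s in reflection]


def relative_level_db(reflection, direct):
    """Level of the reflection relative to the direct sound, in dB."""
    return 20.0 * math.log10(peak_level(reflection) / peak_level(direct))


direct = [0.0, 1.0, -0.8, 0.2]          # made-up direct-sound samples
reflection = [0.05, 0.5, -0.3, 0.1]     # made-up first-reflection samples
scaled = scale_reflection(reflection, direct, target_db=-12.0)
level = relative_level_db(scaled, direct)
```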

[0038] Furthermore, the extracted first reflected sound is added ahead of the direct sound in the next step. So that the trailing part of the added first reflected sound does not overlap the leading part of the direct sound, it is windowed, for example with a half-cosine window as shown in FIG. 3(c) [FIG. 3(d)]. This prevents the reflected sound from interfering with the direct sound and blurring the sound image.
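A fade of this kind can be sketched as below. The exact window shape of FIG. 3(c) is not reproduced in the text, so this sketch assumes a raised half-cycle cosine applied to the trailing samples only, which brings the reflected sound to exactly zero at its last sample. The function name `half_cosine_fade_out` and the fade length are hypothetical.

```python
import math


def half_cosine_fade_out(samples, fade_len):
    """Taper the last fade_len samples to zero with half a cosine cycle."""
    out = list(samples)
    n = len(out)
    fade_len = min(fade_len, n)
    for i in range(fade_len):
        # Weight falls smoothly from just below 1 to exactly 0 at the final sample.
        w = 0.5 * (1.0 + math.cos(math.pi * (i + 1) / fade_len))
        out[n - fade_len + i] *= w
    return out


faded = half_cosine_fade_out([1.0] * 6, fade_len=4)
```

The untouched leading samples keep the onset of the reflected sound intact, while the tail reaches zero before the direct sound would begin.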

[0039] Adding the first reflected sound ahead of the direct sound (step 104)

The first reflected sound shaped in step 103, including the level adjustment, is added ahead of the direct sound. The data described later demonstrate that it is effective for the lead time t of the first reflected sound, measured between the maximum level of the first reflected sound and the maximum level of the direct sound, to be in the range 9.1 ms to 11.3 ms, and for the level difference between those maxima to be in the range 10 to 13 dB.
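Step 104 can be sketched as a simple buffer operation. This is an illustration under stated assumptions: the lead time is measured between the sample of largest magnitude in each waveform, the sampling rate `fs` is a free parameter (the text does not give one), and the sketch refuses to build the waveform if the reflected sound would overlap the direct sound. The function name `prepend_reflection` and all values are hypothetical.

```python
def prepend_reflection(direct, reflection, lead_ms, fs):
    """Place the reflection so its peak leads the direct-sound peak by lead_ms."""
    lead = round(lead_ms * fs / 1000.0)  # lead time in samples
    d_peak = max(range(len(direct)), key=lambda i: abs(direct[i]))
    r_peak = max(range(len(reflection)), key=lambda i: abs(reflection[i]))
    shift = lead + r_peak - d_peak  # output index at which direct[0] lands
    if shift < len(reflection):
        raise ValueError("reflection would overlap the direct sound")
    out = [0.0] * (shift + len(direct))
    for i, s in enumerate(reflection):
        out[i] += s
    for i, s in enumerate(direct):
        out[shift + i] += s
    return out


# fs = 1000 Hz is chosen only to keep the numbers small; lead_ms = 10.0 lies
# inside the 9.1 ms to 11.3 ms range given in the text.
virtual = prepend_reflection([0.0, 1.0, 0.5], [0.1, 0.3, 0.0], lead_ms=10.0, fs=1000)
```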

[0040] Calculation of the virtual impulse response (hereinafter IR) (step 105)

The source sound (reference data) refL, refR recorded synchronously in step 101 and the virtual measured sound L, R, in which the reflected sound was added ahead of the direct sound in step 104, are processed on a workstation (not shown).

[0041] Let X(S) be the frequency response of the source sound, Y(S) the frequency response of the virtual measured sound, and IR(S) the frequency response of the HRTF at the measurement position. They satisfy the input-output relation of equation (5):

$$Y(S) = \mathrm{IR}(S)\cdot X(S) \tag{5}$$

The frequency response IR(S) of the HRTF is therefore

$$\mathrm{IR}(S) = Y(S) / X(S) \tag{6}$$

Accordingly, the frequency response X(S) of the reference and the frequency response Y(S) of the virtual measured sound are computed by cutting out the data obtained in step 104 with time-synchronized windows and expanding each into a finite Fourier series at discrete frequencies by FFT. The frequency response IR(S) of the HRTF is then obtained from equation (6) by well-known computation.
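Equation (6) amounts to a spectral division. The sketch below is a toy illustration rather than the workstation processing described in the text: it uses a naive O(N²) DFT from the standard library in place of an FFT, works on a four-sample example, and sets bins where X(S) is nearly zero to zero (the guard threshold `eps` is an assumption). The names `dft`, `idft`, and `estimate_ir` are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def estimate_ir(x, y, eps=1e-12):
    """IR(S) = Y(S) / X(S) per bin, then back to the time axis."""
    X, Y = dft(x), dft(y)
    ir_spectrum = [(Yk / Xk if abs(Xk) > eps else 0j) for Xk, Yk in zip(X, Y)]
    return [v.real for v in idft(ir_spectrum)]


# Made-up reference x and response y; y is x circularly convolved with
# the impulse response [0.2, -0.1, 0.0, 0.0].
x = [1.0, 0.5, 0.0, 0.0]
y = [0.2, 0.0, -0.05, 0.0]
ir = estimate_ir(x, y)
```

With a noise-like reference such as white noise, averaging over several windows before dividing is a common refinement; that is outside what the text specifies.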

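[0042] Shaping of the virtual IR (step 106)

The virtual IR obtained in step 105 is now shaped. First, the first IR obtained in the preceding step is expanded at discrete frequencies across the audio spectrum, for example by FFT, and unwanted bands are removed with a BPF. The band-limited IR(S) is then inverse-transformed, and the resulting IR (impulse response) is windowed on the time axis, for example with a cosine-function window (this becomes the second IR).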
First, for example, by FFT conversion, the first IR obtained in step 102 is expanded at discrete frequencies over the audio spectrum, and unnecessary bands are removed by BPF. Then, the band-limited IR (S) is inversely transformed, and the IR (in-pass response) is multiplied by a window of, for example, a cosine function on the time axis to perform window processing (becomes the second IR).

【００４３】キャンセルフィルタｃｆＬｘ、ｃｆＲｘ
の算出（ステップ１０７）コンボルバ（畳み込み積分回路）であるキャンセルフィ
ルタｃｆＬｘ、ｃｆＲｘは、前述した式４ａ及び式４ｂ
に示したように、ｃｆＬｘ＝（ｈ２Ｒ・ｐＬｘ−ｈ２Ｌ・ｐＲｘ）／Ｈ …（４ａ）ｃｆＲｘ＝（−ｈ１Ｒ・ｐＬｘ−ｈ１Ｌ・ｐＲｘ）／Ｈ …（４ｂ）但し、Ｈ＝ｈ１Ｌ・ｈ２Ｒ−ｈ２Ｌ・ｈ１Ｒ …（４ｃ）である。Cancellation filters cfLx, cfRx
(Step 107) The cancel filters cfLx and cfRx, which are convolvers (convolutional integration circuits), are calculated by the above equations 4a and 4b.
As shown in, cfLx = (h2R.pLx-h2L.pRx) / H (4a) cfRx = (-h1R.pLx-h1L.pRx) / H (4b) However, H = h1L.h2R-h2L -H1R ... (4c).

[0044] Here, the second IRs (impulse responses) for each angle obtained in the preceding steps are substituted for the head-related transfer characteristics h1L, h1R, h2L, h2R of the installed speakers SP1 and SP2 and for the head-related transfer characteristics pLx, pRx that apply when an actual speaker is placed at the intended localization position x.

[0045] The head-related transfer characteristics h1L, h1R correspond to the position of the L-channel speaker shown in FIG. 4; if it is installed, for example, 30 degrees to the left of front (θ = 330 degrees), the IR for θ = 330 degrees is used. The head-related transfer characteristics h2L, h2R correspond to the position of the R-channel speaker in the same figure; if it is installed, for example, 30 degrees to the right of front (θ = 30 degrees), the IR for θ = 30 degrees is used (that is, the IRs closest to the expected actual playback system are chosen).

[0046] For the head-related transfer characteristics pLx, pRx, the IRs at, for example, 30-degree intervals are substituted. These cover not only the 180-degree range extending 90 degrees to the left and right of front, which is the target for sound image localization, but also the wider space beyond it (the entire space). This yields the corresponding groups of cfLx, cfRx for the entire space.

[0047] The cancel filter groups cfLx, cfRx are ultimately obtained as IRs (impulse responses), that is, as responses on the time axis. The calculation of the cancel filters cfLx and cfRx by equations (4a) and (4b) proceeds as follows. First, a kind of inverse filter H⁻¹ for H of equation (4c) is obtained by the least-squares method and is inverse-FFT-transformed into a time function h(t). Then, expressing each term h1L, h1R, h2L, h2R, pRx, pLx of equations (4a) and (4b) as a time function, the following equations hold.

[0048]

$$\mathrm{cf}_{Lx}(t) = (h_{2R}\cdot p_{Lx} - h_{2L}\cdot p_{Rx})\cdot h(t) \tag{7a}$$

$$\mathrm{cf}_{Rx}(t) = (-h_{1R}\cdot p_{Lx} + h_{1L}\cdot p_{Rx})\cdot h(t) \tag{7b}$$
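Equations (7a) and (7b) can be sketched on the time axis. The sketch assumes that each "·" between time functions stands for convolution, which is the time-domain counterpart of the frequency-domain products in (4a) to (4c); the text does not say so explicitly. The inverse-filter response `h_inv` is taken as given, since the least-squares design of h(t) is not detailed here. All names and sample values are hypothetical.

```python
def convolve(a, b):
    """Linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def subtract(a, b):
    """Element-wise a - b, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def cancel_filters_time(h1L, h1R, h2L, h2R, pLx, pRx, h_inv):
    """Time-axis form of (7a)/(7b), reading each product as a convolution."""
    num_left = subtract(convolve(h2R, pLx), convolve(h2L, pRx))
    num_right = subtract(convolve(h1L, pRx), convolve(h1R, pLx))
    return convolve(num_left, h_inv), convolve(num_right, h_inv)


# Degenerate check: with no crosstalk (h1R = h2L = 0) and unit direct paths,
# H = 1, so h(t) is a unit impulse and the cancel filters equal pLx and pRx.
cfL_t, cfR_t = cancel_filters_time(
    [1.0], [0.0], [0.0], [1.0],
    [0.7, 0.2], [0.1, 0.3],
    [1.0],
)
```

The output length grows with every convolution, which is one way to see why the text stresses keeping each impulse response short.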

[0049] The coefficients of the cancel filters cfLx, cfRx are thus obtained from equations (7a) and (7b). As these equations make clear, to shorten the coefficients of the cancel filters cfLx, cfRx it is essential to shorten each of the head-related transfer characteristics h1L, h1R, h2L, h2R, pLx, pRx. For this reason, as described above, windowing, shaping, and other processing are applied in each step to shorten h1L, h1R, h2L, h2R, pLx, pRx.

【００５０】Scaling of the cancellation filter at each localization point x (step 8)
Statistically, the spectrum of the sound source (source sound) that is actually processed by the convolver (cancellation filter) is either distributed like pink noise or falls off gently at high frequencies. In either case the source is not a single tone, so there is a risk that the convolution (integration) overflows and causes distortion. To prevent overflow, the filter with the largest gain among the cancellation filters cfLx and cfRx is found (for example, by the sum of squares of the sample values of each filter), and all coefficients are scaled so that no overflow occurs when that filter is convolved with 0 dB white noise. In practice, it is advisable to attenuate the coefficients so that the maximum absolute value falls in the range of 0.1 to 0.4 relative to an allowable level (amplitude) of 1.
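
The scaling step might be sketched as below. The passage does not say whether the 0.1 to 0.4 bound applies to the peak of the highest-gain filter or to the peak over all coefficients; this sketch assumes the latter, uses the sum of squares to identify the highest-gain filter, and picks 0.25 as an arbitrary target inside the stated range. All function names are hypothetical.

```python
def energy(coeffs):
    """Sum of squared sample values, the gain measure suggested in the text."""
    return sum(c * c for c in coeffs)


def loudest_index(filters):
    """Index of the filter with the largest sum of squares."""
    return max(range(len(filters)), key=lambda i: energy(filters[i]))


def scale_filter_bank(filters, target_peak=0.25):
    """Apply one common gain so the largest |coefficient| equals target_peak.

    A single gain for every filter keeps the relative levels between
    localization points unchanged (assumption: that is what "scale all
    coefficients" means).
    """
    peak = max(abs(c) for f in filters for c in f)
    if peak == 0.0:
        return [list(f) for f in filters], 1.0
    gain = target_peak / peak
    return [[c * gain for c in f] for f in filters], gain


bank = [[0.5, -2.0, 1.0], [0.1, 0.2, -0.3]]
scaled, gain = scale_filter_bank(bank, target_peak=0.25)
```

A full check against overflow would also convolve the highest-gain filter with a full-scale noise sequence; that step is omitted here.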

【００５１】Then, windowing is applied, for example with a cosine window, so that both ends become 0 in accordance with the actual number of convolver coefficients, which shortens the effective length of the coefficients. Through this scaling process, the data group cfLx, cfRx that is finally supplied to the convolvers as coefficients and stored in the ROM 10 is obtained (in this example, 12 sets of convolver coefficients that allow sound image localization at 30-degree intervals).
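
As a rough illustration of this windowing step, the sketch below truncates or zero-pads a coefficient list to the convolver's tap count and multiplies it by a raised-cosine (Hann) window whose end points are zero. The text only says "cosine window", so the exact window shape, the choice to keep the first `n_taps` samples, and the names `cosine_window` and `fit_to_convolver` are assumptions.

```python
import math


def cosine_window(n_taps):
    """Raised-cosine (Hann) window of n_taps points; both end points are 0."""
    if n_taps < 2:
        raise ValueError("n_taps must be at least 2")
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (n_taps - 1))
            for n in range(n_taps)]


def fit_to_convolver(coeffs, n_taps):
    """Cut or zero-pad coeffs to n_taps samples, then apply the window."""
    kept = list(coeffs[:n_taps])
    kept += [0.0] * (n_taps - len(kept))
    return [c * w for c, w in zip(kept, cosine_window(n_taps))]


# Eight coefficients fitted to a hypothetical 5-tap convolver.
fitted = fit_to_convolver([1.0] * 8, 5)
```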

【００５２】For example, when the sound image localization control device shown in FIG. 1 is used in a game machine, a pair of speakers SP1 and SP2 is arranged as shown in FIG. 9(a), for example 30 degrees to the left and to the right of the game operator M at the center, or alternatively headphones are provided.

【００５３】A source signal, such as an airplane sound from a synthesizer, arriving through input terminal 1 or input terminals 2 and 3 is selectively switched by the switch SW according to whether it is an analog or a digital signal, is converted into a parallel signal by the serial-parallel converter 5, and is then supplied to the convolvers 6, 7 and the convolvers 8, 9, which are provided in pairs for the left and right channels. Meanwhile, a control signal is supplied from the control sub-CPU 11 to the ROM 10; based on this signal, the coefficients corresponding to the specified angular position are read out and supplied to the convolvers 6 to 9.

【００５４】In each of the convolvers 6 to 9, the sound source data, for example for an airplane sound, is convolved on the time axis with the coefficients supplied from the ROM 10. The results are added by the adders 12 and 13 to form the left and right signals and are then output from the output terminals 14 and 15. The output signals are subsequently converted into analog signals by a digital-analog converter or the like (not shown) and supplied to the speakers or headphones.
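
The signal flow described above (coefficient lookup by angular position, time-domain convolution, then addition into left and right outputs) can be sketched as follows. The passage does not spell out how the four convolvers 6 to 9 are wired to the adders 12 and 13; this sketch assumes each convolver pair processes one input and the adders sum the results per output channel. The 12-entry table indexed in 30-degree steps follows the text, while rounding to the nearest stored angle and the names `coefficient_index`, `rom`, and `render` are assumptions.

```python
def convolve(a, b):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def add(a, b):
    """Element-wise sum, zero-padding the shorter list (models an adder)."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def coefficient_index(angle_deg):
    """Map an angle to one of 12 stored positions spaced 30 degrees apart."""
    return int((angle_deg % 360) / 30.0 + 0.5) % 12


def render(sources, rom):
    """sources: list of (samples, angle_deg); rom: index -> (cfL, cfR)."""
    left, right = [], []
    for samples, angle in sources:
        cfL, cfR = rom[coefficient_index(angle)]
        left = add(left, convolve(samples, cfL))
        right = add(right, convolve(samples, cfR))
    return left, right


# Toy coefficient table with only two of the 12 positions filled in.
rom = {0: ([1.0], [0.5]), 1: ([0.0, 1.0], [2.0])}
left, right = render([([1.0, 2.0], 0), ([1.0], 30)], rom)
```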

【００５５】In the sound reproduced in this way from the speakers or headphones, the crosstalk to the player's two ears is cancelled, so the sound is naturally heard with its image localized as if the source were at the desired position. In particular, when speakers are used with this device, the source is not localized closer than the intended position, as happens with conventional devices, and the sound image can be localized at the desired position. When headphones are used, the sound does not cling to the ears, and the source is localized as if it were at a distant position. In addition, the sound image does not rise, the sound can be heard as extremely natural, and a virtual reality full of realism can be provided.

【００５６】In step 104 described above, the lead time t of the first reflected sound relative to the direct sound was set in the range of 9.1 msec to 11.3 msec, and the level difference between their maximum levels was set in the range of 10 to 13 dB, on the basis of experimental results. In this experiment, coefficients for several dozen samples were obtained by the steps described above, and various data were collected using a device with the same configuration as in FIG. 1. In this case, the first reflected sound added as the preceding sound was symmetrical between the left and right channels. FIG. 5 summarizes the data from the experiment, with the lead time of the reflected sound (ms) on the vertical axis and the level (dB) on the horizontal axis. As is clear from this figure, a marked effect is observed particularly in the range stated above.
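
To make the two parameters concrete, the sketch below builds a virtual impulse response by placing an attenuated reflection ahead of a direct-sound response. It is only an illustration under several assumptions: the reflection is modeled as a scaled copy of the direct response, the dB figure is treated as an amplitude ratio (20·log10), the lead time is measured between the starts of the two sounds rather than between their maximum levels, and the function name and default values (10 ms, -12 dB, chosen from within the stated ranges) are hypothetical.

```python
def add_preceding_reflection(direct, fs, lead_ms=10.0, level_db=-12.0):
    """Return a response with a scaled copy of `direct` placed lead_ms earlier.

    direct   : direct-sound impulse response (list of samples)
    fs       : sampling rate in Hz
    lead_ms  : how far the reflection precedes the direct sound
    level_db : reflection level relative to the direct sound (negative)
    """
    delay = int(round(lead_ms * fs / 1000.0))   # lead time in samples
    gain = 10.0 ** (level_db / 20.0)            # dB to amplitude ratio
    out = [0.0] * (delay + len(direct))
    for i, x in enumerate(direct):
        out[i] += gain * x          # preceding reflection starts at sample 0
        out[delay + i] += x         # direct sound starts after the lead time
    return out


# At fs = 1000 Hz a 10 ms lead is exactly 10 samples, which keeps the
# example easy to check by hand.
virtual_ir = add_preceding_reflection([1.0, 0.5], fs=1000, lead_ms=10.0,
                                      level_db=-20.0)
```

If the direct response is longer than the lead time, the copy overlaps the direct sound in this sketch; a reflection that decays to zero before the direct sound starts would need a shorter or windowed copy.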

【００５７】Although not illustrated, the inventor's verification has confirmed that making the preceding sound asymmetrical between the left and right channels is more effective than making it symmetrical as in the above experiment. Furthermore, although one reflected sound is added as the preceding sound in this embodiment, it goes without saying that, depending on how the reflected sound arises, its reflection time, its level, and so on, the approach can be adapted as appropriate by adding a plurality of sounds, for example a second reflected sound and a third reflected sound, or by adding a reflected sound other than the first reflected sound.

【００５８】[0058]

[Effects of the Invention] According to the present invention, in the sound reproduced from the speakers or headphones, the crosstalk to the listener's two ears is cancelled, so the sound is naturally heard as if the sound image were localized at the desired position. In particular, when speakers are used, the source is not localized closer than the intended position, and the sound image can be localized at the desired position. When headphones are used, the sound does not cling to the ears, and the source is localized as if it were at a distant position. In addition, the sound image does not rise, the sound can be heard as extremely natural, and a virtual reality full of realism can be provided. Furthermore, according to the device of claim 3 in particular, the added reflected sound and the direct sound are arranged so as not to overlap, so interference between the reflected sound and the direct sound is avoided, which has the effect of, for example, preventing the sound image from becoming unclear.

[Brief description of drawings]

FIG. 1 is a schematic block diagram showing an embodiment of the sound image localization control device of the present invention.

FIG. 2 is a flowchart for obtaining the coefficient group.

FIG. 3 is a waveform diagram showing the process of adding a reflected sound.

FIG. 4 is a diagram showing an example of calculating the cancellation filter.

FIG. 5 is a diagram summarizing experimental data obtained when various reflected sounds were added.

FIG. 6 is a diagram showing a conventional example.

FIG. 7 is a block diagram showing an HRTF (head-related transfer function) measurement system.

FIG. 8 is a waveform diagram of an impulse response measured in free space.

FIG. 9 is a diagram for explaining the change in sound image localization for sounds emitted at different times.

FIG. 10 is a diagram for explaining how the quality of sound image localization depends on the level difference between sounds emitted at different times.

[Explanation of symbols]

1 to 3: input terminals
4: analog-digital converter
5: serial-parallel converter
6 to 9: convolvers
10: ROM
11: control sub-CPU
12, 13: adders
14, 15: output terminals
SP1, SP2: speakers

Claims

[Claims]

1. A sound image localization control device in which signals processed by a pair of convolvers supplied with the same sound source are reproduced from a pair of transducers spaced apart from each other, so that a listener perceives a sound image as localized at an arbitrary position different from the pair of transducers, the device comprising: a pair of convolvers that convolve the signals from the same sound source according to set coefficients; and storage means that holds a cancellation filter coefficient group calculated as impulse responses based on head-related transfer functions measured at each sound image localization position, coefficients corresponding to a designated sound image localization position being supplied from the storage means to the pair of convolvers, wherein the cancellation filter coefficient group is a coefficient group calculated based on a virtual impulse response to which a reflected sound, which arrives later than the direct sound reaching the listener directly, has been added so as to precede the direct sound.

2. The sound image localization control device according to claim 1, wherein the lead time of the reflected sound relative to the direct sound is set to 9.1 msec to 11.3 msec, measured between the maximum levels of the two sounds, and the maximum level of the reflected sound is set to -10 dB to -13 dB relative to the maximum level of the direct sound.

3. The sound image localization control device according to claim 2, wherein the reflected sound added ahead of the direct sound converges to zero before the direct sound begins.