JPH07248255A - Method and apparatus for forming stereophonic image - Google Patents

Method and apparatus for forming stereophonic image

Info

Publication number
JPH07248255A
JPH07248255A (application numbers JP6038514A, JP3851494A)
Authority
JP
Japan
Prior art keywords
sound
sound image
transfer function
related transfer
head
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP6038514A
Other languages
Japanese (ja)
Inventor
Hajime Shimizu
肇 清水
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sharp Corp
Original Assignee
Sharp Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sharp Corp filed Critical Sharp Corp
Priority to JP6038514A priority Critical patent/JPH07248255A/en
Publication of JPH07248255A publication Critical patent/JPH07248255A/en
Pending legal-status Critical Current

Abstract

PURPOSE: To provide a stereophonic sound image generating apparatus and method that reduce the interpolation computation and eliminate the work of transferring head-related transfer functions, while still following movement of the sound image position.

CONSTITUTION: Input sound data undergo head-related transfer function processing in parallel in a plurality of head-related transfer function processors 13a, 13b, ..., 13n. The processed sound data are temporarily stored in a sound data memory 16. When the position of the sound image is detected by a sound image position detector 11, the processed sound data held in the memory 16 are sound-image-interpolated by a sound image interpolator 17 on the basis of the detection result and output as sound image output 100.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a stereophonic sound image generating apparatus and a stereophonic sound image generating method for generating three-dimensional sound images used in virtual reality.

[0002]

[Prior Art] In general, sound from a source reaches both ears both directly and as reflections from walls and other surfaces, after diffraction and reflection at the head and pinnae. A person perceives the direction of sound propagation from the difference in sound pressure between the two ears and from the difference in arrival time, that is, the phase difference. If the sound pressure difference and phase difference from the source to the eardrum are measured in advance at each frequency, the resulting transfer function, the head-related transfer function (HRTF), can be applied to the waveform produced by the source to reproduce the waveform just in front of the eardrum.

[0003] In virtual reality, the sound image changes as the sound source or the listener moves. Since the absolute position of the source is normally given as a function of time, the relative position of the sound image is computed from the position and orientation of the listener's head together with the absolute position of the source.

[0004] A conventional stereophonic sound image generating apparatus will now be described with reference to FIG. 2.

[0005] The conventional apparatus has sound source input means 10, such as a microphone, for inputting sound from a source, and a sound image position detector 11 for detecting the position of the sound image. A sound data storage device 12 for storing the input sound data is connected to the sound source input means 10. Connected in parallel to the storage device 12 are a plurality of head-related transfer function processors 13a, 13b, ..., 13n, which apply HRTF processing to the input sound data. Processor 13a applies HRTF processing to the direct wave, while processors 13b, ..., 13n each apply it to a reflected wave. An HRTF interpolator 14, which derives an HRTF from the detected sound image position, is connected to the sound image position detector 11 and to the processors 13a, 13b, ..., 13n. A sound synthesizer 15, which combines the results of the processors 13a, 13b, ..., 13n and outputs the sound image output 100, is connected to the processors.

[0006] Next, the operation of this conventional apparatus will be described.

[0007] Sound from the source is input by the sound source input means 10 and stored in the sound data storage device 12. The relative position of the sound image is detected by the sound image position detector 11 from the position of the sound image and the position and orientation of the listener's head. From the detected position information, an HRTF is obtained by the HRTF interpolator 14; since HRTFs cannot be prepared for every direction, the HRTF for the required direction is obtained by interpolation. Each processor 13a, 13b, ..., 13n reads sound data from the storage device 12 for the direct wave or one of the reflected waves, taking the arrival time difference into account, and applies the HRTF. The results are combined by the sound synthesizer 15 and output as the sound image output 100.

[0008] In this way, a sound image from the required direction is generated.

[0009]

[Problems to be Solved by the Invention] The delay from detecting the relative position of the sound image to producing the corresponding sound is known to destabilize sound localization, so this delay must be kept small. It is caused by the interval of sound image position detection, the time needed to interpolate the head-related transfer function, and the time needed to transfer the interpolated HRTF to the HRTF processor. Consider how an HRTF processor can be implemented. Since the HRTF is defined in the frequency domain, the source sound is Fourier transformed, a complex multiplication is performed for each of the N frequency components, and an inverse Fourier transform is applied. The cost is theoretically on the order of log(N) per sound sample, but in practice it is difficult to cope with changes in the HRTF.

[0010] Alternatively, the HRTF processor can convolve the source sound with the impulse response obtained by inverse Fourier transforming the HRTF; the cost is then on the order of N per sound sample. With an FIR or IIR filter the convolution can be computed in a fixed time regardless of the length N. However, changing the convolution coefficients, that is, changing the HRTF, takes time on the order of N.

[0011] Even if localization quality is sacrificed by approximating the impulse response with a shorter FIR filter and realizing the HRTF processor as a convolution of that response with the source sound, the number of FIR taps must still be chosen with the cost of interpolating and transferring the HRTF in mind.

[0012] An object of the present invention is to provide a stereophonic sound image generating apparatus and method that reduce the interpolation computation, eliminate the work of transferring head-related transfer functions, and can smoothly follow movement of the sound image position.

[0013]

[Means for Solving the Problems] According to the present invention, the above object is achieved by a stereophonic sound image generating apparatus comprising: sound source input means; a plurality of head-related transfer function processing means connected in parallel to the sound source input means, each applying head-related transfer function processing to the input sound data; sound image position detecting means for detecting the position of the sound image; and sound image interpolating means for interpolating, on the basis of the detection result of the sound image position detecting means, the sound data processed by the plurality of head-related transfer function processing means.

[0014] Preferably, the apparatus further comprises sound data storage means for temporarily storing the sound data processed by the plurality of head-related transfer function processing means.

[0015] According to the present invention, the above object is also achieved by a stereophonic sound image generating method in which head-related transfer function processing is applied in parallel to the input sound data, the position of the sound image is detected, and the HRTF-processed sound data are sound-image-interpolated on the basis of the detection result.

[0016]

[Operation] In the apparatus of the present invention, the input sound data are HRTF-processed in parallel by the plurality of HRTF processing means, the position of the sound image is detected by the sound image position detecting means, and the HRTF-processed sound data are interpolated by the sound image interpolating means on the basis of the detection result. The work of transferring HRTFs to the HRTF processors is thus eliminated, and the interpolation computation is reduced. As a result, the processing corresponding to the HRTF can be realized not only with FIR filters but also with a variety of other schemes such as IIR filters or FFT devices, and the relative movement of the sound image becomes easier to follow.

[0017] The apparatus preferably includes sound data storage means for temporarily storing the sound data processed by the plurality of HRTF processing means. This makes a variety of sound data processing methods possible and makes it still easier to follow the relative movement of the sound image.

[0018] In the stereophonic sound image generating method of the present invention, HRTF processing is applied in parallel to the input sound data, the position of the sound image is detected, and the HRTF-processed sound data are sound-image-interpolated on the basis of the detection result. The work of transferring HRTFs can thus be eliminated and the interpolation computation reduced. The processing corresponding to the HRTF can therefore be realized with IIR filters, FFT devices and other schemes besides FIR filters, and the relative movement of the sound image is easy to follow.

[0019] In the method of the present invention, the sound data processed in parallel by the HRTF processing are preferably stored temporarily. This makes a variety of sound data processing methods possible and makes it still easier to follow the relative movement of the sound image.

[0020]

[Embodiment] An embodiment of the present invention will now be described with reference to FIG. 1.

[0021] The stereophonic sound image generating apparatus of this embodiment has sound source input means 10, such as a microphone, for inputting sound from a source, and a sound image position detector 11 serving as sound image position detecting means for detecting the position of the sound image. Connected in parallel to the sound source input means 10 are HRTF processors 13a, 13b, ..., 13n serving as the plurality of HRTF processing means; processor 13a applies HRTF processing to the direct wave, and processors 13b, ..., 13n each apply it to a reflected wave. A sound data storage 16, serving as sound data storage means for temporarily storing the processed sound data, is connected to each of the processors 13a, 13b, ..., 13n. Connected to these storages 16 is a sound image interpolator 17, serving as sound image interpolating means, which interpolates the sound data stored in the storages 16 on the basis of the detection result of the detector 11 and outputs the sound image output 100.

[0022] A sound image is localized by the interaural time difference, that is, the phase difference, together with the interaural level difference at each frequency. These are obtained from the time difference between the left and right HRTFs and from their level difference at each frequency, so interpolating an HRTF requires interpolating both the time difference and the per-frequency level difference. The time difference τ of an HRTF H(ω) can be obtained from the theoretical formula for a head approximated by a sphere, or from the time at which the impulse response h(t), the inverse Fourier transform of the HRTF, takes its maximum value. In the Fourier transform of the delay-removed response h(t-τ), the factor H(ω)exp(-jωτ) represents the level difference at each frequency. Assume now that the HRTFs are measured at a fixed distance at points (θi, ζj) spaced at regular intervals of horizontal angle and elevation. Let H1(ω) and H2(ω) be the HRTFs for two directions θ1 and θ2 at the same elevation, with time differences τ1 and τ2. For

  θ = (1-λ)θ1 + λθ2  (0 ≤ λ ≤ 1)

the time difference is

  τ = (1-λ)τ1 + λτ2

and the level difference at each frequency can be approximated by

  (1-λ)H1(ω)exp(-jωτ1) + λH2(ω)exp(-jωτ2).

The interpolated HRTF can therefore be approximated by

  {(1-λ)H1(ω)exp(-jωτ1) + λH2(ω)exp(-jωτ2)} exp(jω{(1-λ)τ1 + λτ2})
    = (1-λ)H1(ω)exp(-jωλ(τ1-τ2)) + λH2(ω)exp(-jω(1-λ)(τ1-τ2))

and the corresponding impulse response, by inverse Fourier transform, is

  h(t) = (1-λ)h1(t-λ(τ1-τ2)) + λh2(t-(1-λ)(τ1-τ2)).

Let Σ denote the sum over i from 0 to ∞, and let y1(t) and y2(t) be the convolutions of an arbitrary signal x(t) with h1(t) and h2(t):

  y1(t) = h1(t)*x(t) = Σ h1(i) x(t-i)
  y2(t) = h2(t)*x(t) = Σ h2(i) x(t-i).

The convolution y(t) with h(t) is then

  y(t) = h(t)*x(t)
       = Σ {(1-λ)h1(i-λ(τ1-τ2)) + λh2(i-(1-λ)(τ1-τ2))} x(t-i)
       = (1-λ) Σ h1(i-λ(τ1-τ2)) x(t-i) + λ Σ h2(i-(1-λ)(τ1-τ2)) x(t-i)
       = (1-λ) y1(t-λ(τ1-τ2)) + λ y2(t-(1-λ)(τ1-τ2)),

where y1 and y2 are the sound images from the two measured directions. This equation shows that the sound image for the desired angular direction can be created by interpolating the sound images instead of interpolating the HRTF. Whereas HRTF interpolation takes time depending on the length of the response, sound image interpolation requires only a few operations, so the processing is much lighter.
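The final interpolation formula can be sketched in a few lines of plain Python (integer sample delays assumed for simplicity; the helper names are illustrative, not from the patent). It shows that blending two already-rendered sound images y1 and y2 needs only a delay and a weighted sum, independent of the HRTF length N:

```python
# Sketch of y(t) = (1-lam)*y1(t - lam*d) + lam*y2(t - (1-lam)*d),
# with d = tau1 - tau2; delays are rounded to whole samples and
# negative delays are treated as zero for brevity.
def delayed(sig, d):
    d = int(round(d))
    if d <= 0:
        return list(sig)
    return [0.0] * d + list(sig[:len(sig) - d])

def interpolate_images(y1, y2, lam, tau1, tau2):
    d = tau1 - tau2
    a = delayed(y1, lam * d)
    b = delayed(y2, (1 - lam) * d)
    return [(1 - lam) * sa + lam * sb for sa, sb in zip(a, b)]

y1 = [1.0, 0.0, 0.0, 0.0]     # rendered image from direction theta1
y2 = [0.0, 1.0, 0.0, 0.0]     # rendered image from direction theta2
y = interpolate_images(y1, y2, 0.5, 2, 0)
# Halfway between the directions: y = [0.0, 0.5, 0.5, 0.0]
```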

[0023] Let z1(t) and z2(t) be the convolutions of an arbitrary signal x(t) with the functions representing the level difference at each frequency. Then

  y1(t) = z1(t-τ1)
  y2(t) = z2(t-τ2)
  y(t) = (1-λ)z1(t-(1-λ)τ1-λτ2) + λz2(t-(1-λ)τ1-λτ2)

so the convolution with the per-frequency level difference function can be used in place of the convolution with the HRTF itself.

[0024] Even when the sound images do not lie in the same plane, let zij(t) be the convolution with the function representing the per-frequency level difference for horizontal angle θi and elevation ζj, and let τij be the corresponding time difference. The sound image for

  θ = (1-λ)θ1 + λθ2  (0 ≤ λ ≤ 1)
  ζ = (1-μ)ζ1 + μζ2  (0 ≤ μ ≤ 1)

is then, in the same way as above,

  y(t) = (1-λ)(1-μ)z11(t-τ) + (1-λ)μz12(t-τ) + λ(1-μ)z21(t-τ) + λμz22(t-τ)
  τ = (1-λ)(1-μ)τ11 + (1-λ)μτ12 + λ(1-μ)τ21 + λμτ22.
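The bilinear weighting over the four surrounding measurement directions can be sketched as follows; the helper name and the corner values are ours, for illustration only:

```python
# Bilinear blend of four corner values v_ij (e.g. the time differences
# tau_ij) for a target direction with fractional positions lam
# (horizontal angle) and mu (elevation).
def bilinear(lam, mu, v11, v12, v21, v22):
    return ((1 - lam) * (1 - mu) * v11 + (1 - lam) * mu * v12
            + lam * (1 - mu) * v21 + lam * mu * v22)

# Midway between all four measured points the result is the plain average:
tau = bilinear(0.5, 0.5, 1.0, 2.0, 3.0, 4.0)   # (1+2+3+4)/4 = 2.5
```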

[0025] If the reflected components are obtained by the mirror image method, the loudness Li of each component and its arrival time Ti are uniquely determined by the distance. With reflectance Ai, and i = 0 denoting the direct wave,

  y(t) = Σ h(t)*(Ai Li x(t-Ti))
       = Σ Ai Li {(1-λi)(1-μi)z11(t-τi) + (1-λi)μi z12(t-τi) + λi(1-μi)z21(t-τi) + λi μi z22(t-τi)}
  τi = Ti + (1-λi)(1-μi)τi11 + (1-λi)μi τi12 + λi(1-μi)τi21 + λi μi τi22.
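The mirror-image reflection sum amounts to adding attenuated, delayed copies of the rendered signal. A minimal sketch with hypothetical gains Ai*Li and integer delays Ti (the values below are made up for illustration):

```python
# Each mirror-image source i adds gain_i * x(t - T_i) to the output;
# gains (A_i * L_i) and sample delays T_i are hypothetical examples.
def mix_components(signal, components):
    # components: list of (gain, delay_in_samples); (1.0, 0) is the direct wave
    n = len(signal)
    out = [0.0] * n
    for gain, delay in components:
        for t in range(delay, n):
            out[t] += gain * signal[t - delay]
    return out

y = mix_components([1.0, 0.0, 0.0, 0.0],
                   [(1.0, 0), (0.5, 2)])   # direct wave plus one reflection
# y = [1.0, 0.0, 0.5, 0.0]
```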

[0026] Once acoustic signals from several directions have been obtained with the above equations, the acoustic signal for any desired angular direction can be created by interpolating those signals.

[0027] Next, the operation of this embodiment will be described.

[0028] The HRTF processors 13a, 13b, ..., 13n generate reproduction signals by applying to the original sound the HRTFs representing the diffraction effects of the head and pinnae for each direction of arrival. Assume that the HRTFs are measured at a fixed distance at points (θi, ζj) spaced at regular intervals of horizontal angle and elevation, and that a processor exists for every measured HRTF. The outputs of the processors 13a, 13b, ..., 13n, the acoustic signals from the individual directions, are temporarily stored in the sound data storage 16. From the horizontal angle θ and elevation ζ provided by the sound image position detector 11, which detects the relative position of the sound image from the sound image position and the position and orientation of the head, the sound image interpolator 17 determines the four HRTF measurement directions (θi, ζj) and time differences τij satisfying

  θ = (1-λ)θ1 + λθ2  (0 ≤ λ ≤ 1)
  ζ = (1-μ)ζ1 + μζ2  (0 ≤ μ ≤ 1)

and the loudness L and arrival time T are determined from the distance.
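How λ and μ might be derived from a regular measurement grid can be sketched as follows; the grid spacing and the helper function are assumptions for illustration, since the patent only requires that the measurement points be regularly spaced:

```python
# For a grid measured every `step` degrees, find the lower grid index
# and the fractional weight lam so that
# angle = (1 - lam) * (idx * step) + lam * ((idx + 1) * step).
def grid_weight(angle, step):
    idx = int(angle // step)          # index of the lower grid angle
    lam = (angle - idx * step) / step
    return idx, lam

idx, lam = grid_weight(37.5, 30.0)    # hypothetical 30-degree grid
# idx = 1 (between 30 and 60 degrees), lam = 0.25
```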

[0029] Setting

  τ = T + (1-λ)(1-μ)τ11 + (1-λ)μτ12 + λ(1-μ)τ21 + λμτ22

the sound data corresponding to zij(t-τ) are read from the sound data storage 16, and the direct wave is determined by

  y(t) = L{(1-λ)(1-μ)z11(t-τ) + (1-λ)μz12(t-τ) + λ(1-μ)z21(t-τ) + λμz22(t-τ)}.

[0030] Reflections from walls, floors and the like reduce to mirror image positions of the sound source, so, in the same way as above,

  y(t) = Σ Ai Li {(1-λi)(1-μi)z11(t-τi) + (1-λi)μi z12(t-τi) + λi(1-μi)z21(t-τi) + λi μi z22(t-τi)}
  τi = Ti + (1-λi)(1-μi)τi11 + (1-λi)μi τi12 + λi(1-μi)τi21 + λi μi τi22

and the sound image output 100 is determined.

[0031] As described above, whereas the conventional apparatus had to reload the HRTFs into the HRTF processors every time the relative position of the sound image changed, in this embodiment the HRTFs need to be set in the processors only once, at initialization.

[0032] Since the HRTFs do not change during processing in this embodiment, dedicated FIR filter devices such as the Inmos A100 can be used, which makes it easy to design a stereophonic sound image generating apparatus that processes sufficiently long FIR filters in real time. The apparatus can equally be built around IIR filters, FFT devices performing Fourier transforms, or NTT devices performing number-theoretic transforms. An IIR filter achieves a comparable approximation with fewer taps, but because it contains infinite-duration components, the HRTF would ordinarily have to be changed every time the relative position of the source changes; since the HRTF does not change during processing here, IIR filters can be used in the apparatus of the present invention. FFT and NTT devices shorten the total processing time by computing the convolution in blocks covering a fixed interval, which requires the HRTF coefficients to remain constant within that interval. In this embodiment the HRTF never changes during processing, so the input sound can be read ahead by the time required for the block computation. FFT and NTT devices can therefore also be used in the apparatus of this embodiment.

[0033]

[Effects of the Invention] According to the stereophonic sound image generating apparatus of claim 1, the input sound data are HRTF-processed in parallel, the position of the sound image is detected, and the HRTF-processed sound data are sound-image-interpolated on the basis of the detection result. The HRTFs therefore do not change during processing, and the work of transferring them to the HRTF processors is eliminated. As a result, HRTF processing and the interpolation computation are separated, and the interpolation computation is reduced. The processing corresponding to the HRTF can be realized not only with FIR filters but also with IIR filters, FFT devices and other schemes, so the relative movement of the sound image can be followed smoothly.

[0034] According to the stereophonic sound image generating apparatus of claim 2, a variety of sound data processing methods become possible, making it still easier to follow the relative movement of the sound image.

[0035] According to the stereophonic sound image generating method of claim 3, HRTF processing is applied in parallel to the input sound data, the position of the sound image is detected, and the HRTF-processed sound data are sound-image-interpolated on the basis of the detection result, so the work of transferring HRTFs is eliminated and the interpolation computation is reduced. The processing corresponding to the HRTF can therefore be realized with IIR filters, FFT devices and other schemes besides FIR filters, and the relative movement of the sound image is easy to follow.

【0036】According to the three-dimensional sound image generating method of claim 4, various sound data processing methods become possible, and it becomes even easier to follow the relative movement of the sound image.

[Brief Description of Drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the three-dimensional sound image generating apparatus of the present invention.

FIG. 2 is a block diagram showing the configuration of a conventional three-dimensional sound image generating apparatus.

[Explanation of Symbols]

10 Sound source input means
11 Sound image position detection device
13a–13n Head-related transfer function processing devices
16 Sound data storage device
17 Sound image interpolation device

Continuation of front page (51) Int.Cl.6 Identification code Office reference number FI Technical indication H03H 17/08 8842-5J H04S 1/00 K 8421-5H 7/00 F 8421-5H

Claims (4)

[Claims]
1. A three-dimensional sound image generating apparatus comprising: sound source input means; a plurality of head-related transfer function processing means, connected in parallel to the sound source input means, for performing head-related transfer function processing on input sound data; sound image position detecting means for detecting a position of a sound image; and sound image interpolating means for performing sound image interpolation on the sound data processed by the plurality of head-related transfer function processing means, based on a detection result of the sound image position detecting means.
2. The three-dimensional sound image generating apparatus according to claim 1, further comprising sound data storage means for temporarily storing the sound data processed by the plurality of head-related transfer function processing means.
3. A three-dimensional sound image generating method comprising: performing head-related transfer function processing in parallel on input sound data; detecting a position of a sound image; and performing sound image interpolation on the sound data subjected to the parallel head-related transfer function processing, based on a detection result of the position of the sound image.
4. The three-dimensional sound image generating method according to claim 3, wherein the sound data subjected to the parallel head-related transfer function processing is temporarily stored.
JP6038514A 1994-03-09 1994-03-09 Method and apparatus for forming stereophonic image Pending JPH07248255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP6038514A JPH07248255A (en) 1994-03-09 1994-03-09 Method and apparatus for forming stereophonic image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP6038514A JPH07248255A (en) 1994-03-09 1994-03-09 Method and apparatus for forming stereophonic image

Publications (1)

Publication Number Publication Date
JPH07248255A true JPH07248255A (en) 1995-09-26

Family

ID=12527386

Family Applications (1)

Application Number Title Priority Date Filing Date
JP6038514A Pending JPH07248255A (en) 1994-03-09 1994-03-09 Method and apparatus for forming stereophonic image

Country Status (1)

Country Link
JP (1) JPH07248255A (en)


Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7769184B2 (en) 2005-08-03 2010-08-03 Sony Corporation Apparatus and method for measuring sound field
JP2007043442A (en) * 2005-08-03 2007-02-15 Sony Corp Apparatus and method for measuring sound field
JP2007194900A (en) * 2006-01-19 2007-08-02 Nippon Hoso Kyokai <Nhk> Three-dimensional panning apparatus
JP2009524336A (en) * 2006-01-19 2009-06-25 エルジー エレクトロニクス インコーポレイティド Media signal processing method and apparatus
US8488819B2 (en) 2006-01-19 2013-07-16 Lg Electronics Inc. Method and apparatus for processing a media signal
US8208641B2 (en) 2006-01-19 2012-06-26 Lg Electronics Inc. Method and apparatus for processing a media signal
JP4814343B2 (en) * 2006-01-19 2011-11-16 エルジー エレクトロニクス インコーポレイティド Media signal processing method and apparatus
US8160258B2 (en) 2006-02-07 2012-04-17 Lg Electronics Inc. Apparatus and method for encoding/decoding signal
EP2119306A4 (en) * 2007-03-01 2012-04-25 Jerry Mahabub Audio spatialization and environment simulation
JP2010520671A (en) * 2007-03-01 2010-06-10 ジェリー・マハバブ Speech spatialization and environmental simulation
EP2119306A2 (en) * 2007-03-01 2009-11-18 Jerry Mahabub Audio spatialization and environment simulation
US9197977B2 (en) 2007-03-01 2015-11-24 Genaudio, Inc. Audio spatialization and environment simulation
JP2011139263A (en) * 2009-12-28 2011-07-14 Univ Of Aizu Stereophonic sound generation system, method for controlling the same and control program

Similar Documents

Publication Publication Date Title
JP7139409B2 (en) Generating binaural audio in response to multichannel audio using at least one feedback delay network
JP7183467B2 (en) Generating binaural audio in response to multichannel audio using at least one feedback delay network
US7215782B2 (en) Apparatus and method for producing virtual acoustic sound
EP0593228B1 (en) Sound environment simulator and a method of analyzing a sound space
JP4780119B2 (en) Head-related transfer function measurement method, head-related transfer function convolution method, and head-related transfer function convolution device
EP0561881B1 (en) Compensating filters
US7706544B2 (en) Audio reproduction system and method for reproducing an audio signal
US6023512A (en) Three-dimensional acoustic processor which uses linear predictive coefficients
EP2375788B1 (en) Head-related transfer function convolution method
JP4620468B2 (en) Audio reproduction system and method for reproducing an audio signal
JP6690008B2 (en) Audio signal processing apparatus and method
JPH08182100A (en) Method and device for sound image localization
JPH07248255A (en) Method and apparatus for forming stereophonic image
EP1819198B1 (en) Method for synthesizing impulse response and method for creating reverberation
JP5163685B2 (en) Head-related transfer function measurement method, head-related transfer function convolution method, and head-related transfer function convolution device
JPH09114479A (en) Sound field reproducing device
JP3090416B2 (en) Sound image control device and sound image control method
JP3810110B2 (en) Stereo sound processor using linear prediction coefficient
JP3581811B2 (en) Method and apparatus for processing interaural time delay in 3D digital audio
EP4132012A1 (en) Determining virtual audio source positions
JP2022034267A (en) Binaural reproduction device and program
JP5024418B2 (en) Head-related transfer function convolution method and head-related transfer function convolution device
Hashemgeloogerdi Acoustically inspired adaptive algorithms for modeling and audio enhancement via orthonormal basis functions
JP3472643B2 (en) Interpolator
Chen 3D audio and virtual acoustical environment synthesis