JPH09247800A - Method for extracting left right sound image direction

Info

Publication number: JPH09247800A
Application number: JP8054950A
Authority: JP
Inventors: Michiyo Goto; 道代後藤; Kazuhiro Iida; 一博飯田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-03-12
Filing date: 1996-03-12
Publication date: 1997-09-19
Anticipated expiration: 2016-03-12
Also published as: JP3520430B2

Abstract

PROBLEM TO BE SOLVED: To extract, with high accuracy, the left-right sound image direction of an actual acoustic signal whose energy and other properties vary over time. SOLUTION: In the energy calculation step 12, the energy of the input acoustic signal is calculated, and in the energy determination step 13 the calculated energy is compared with a predetermined threshold. Only for time regions in which the energy exceeds the threshold is the following processing executed to extract the sound image direction: the time-series interaural cross-correlation function is calculated in step 14, the interaural time difference is calculated in step 15, and the left-right sound image direction is extracted in step 16. In the end determination step 17, it is determined whether processing should end; if not, processing continues.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a method of extracting the perceived left-right direction of a sound image.

[0002]

[Description of the Related Art] Conventionally, a method of obtaining the incident azimuth angle of a sound image in the left-right direction relative to a listener is known from the literature (Kazuhiro Iida, "Measurement of time-series interaural cross-correlation (RCC)", Journal of the Acoustical Society of Japan, 49, 111-116 (1993)).

[0003] FIG. 9 is a flowchart showing the conventional sound image direction extraction method. Reference numeral 91 denotes an acoustic signal input step, 92 a time-series interaural cross-correlation function calculation step, 93 an interaural time difference calculation step, 94 a left-right sound image direction extraction step, and 95 an end determination step.

[0004] The processing in each step is described below. In the acoustic signal input step 91, the acoustic signals of the left and right channels are input. In the time-series interaural cross-correlation function calculation step 92, the time-series interaural cross-correlation function is obtained according to (Equation 1).

[0005]

[Equation 1]

[0006] Here, P_l(ξ) is the impulse response at the entrance of the left ear canal, P_r(ξ−τ) is the impulse response at the entrance of the right ear canal, τ is the interaural time difference, and G(t−ξ) is the time window function. The interaural time difference τ ranges over approximately ±680 μs, where −680 μs corresponds to 90° to the left and +680 μs corresponds to 90° to the right.
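Equation 1 is not reproduced in the text above. Read from the symbol definitions alone, a running cross-correlation of the following form would be consistent with them. This is a hedged reconstruction, not the patent's own figure: the integration limits in particular are an assumption, and the symbol Φ_lr(t, τ) is taken from its later use in the description.

```latex
\Phi_{lr}(t,\tau) = \int_{-\infty}^{t} P_l(\xi)\, P_r(\xi-\tau)\, G(t-\xi)\, d\xi
```

With this sign convention, a source on the right reaches the right ear first, so the left-ear signal matches a delayed right-ear signal and the peak falls at positive τ, in line with +680 μs corresponding to 90° to the right.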

[0007] In the interaural time difference calculation step 93, the interaural time difference RTD(t) in each time window is obtained according to (Equation 2).

[0008]

[Equation 2]
$$\mathrm{RTD}(t) = \tau \quad \text{for } \left|\varphi_{lr}(t,\tau)\right|_{\max}$$
where φ_lr(t, τ) is the normalized interaural correlation. That is, RTD(t) is the value of τ at which |φ_lr(t, τ)| is largest.
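As a minimal sketch of Equation 2, the interaural time difference can be picked as the lag at which the magnitude of the normalized correlation is largest. The function name, the array-based interface and the restriction of the search to ±680 μs are illustrative assumptions, not details given in the source.

```python
import numpy as np

def interaural_time_difference(phi, lags_s, max_itd_s=680e-6):
    """Return the lag (in seconds) at which |phi| is largest.

    phi    : correlation values, one per candidate lag (hypothetical input)
    lags_s : candidate lags tau in seconds, same length as phi
    Only lags within +/- max_itd_s are searched (an assumed restriction).
    """
    phi = np.asarray(phi, dtype=float)
    lags_s = np.asarray(lags_s, dtype=float)
    mask = np.abs(lags_s) <= max_itd_s
    best = np.argmax(np.abs(phi[mask]))
    return float(lags_s[mask][best])
```

Because the magnitude is used, a strong negative correlation peak is selected just like a positive one, which follows the absolute-value bars in Equation 2.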

[0009] In the left-right sound image direction extraction step 94, the corresponding time-series sound image direction ψ(t) (rad) is obtained from the interaural time difference RTD(t) according to (Equation 3).

[0010]

[Equation 3]
$$\psi(t) + \sin\psi(t) = \frac{2C \cdot \mathrm{RTD}(t)}{D}$$
where C is the speed of sound and D is the interaural distance.
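Equation 3 has no closed-form solution for ψ(t), but the left-hand side increases monotonically on [−π/2, π/2], so it can be solved numerically. The sketch below uses bisection. The default values C = 340 m/s and D = 0.18 m are assumptions chosen so that ±680 μs maps to roughly ±90°; the source does not state them.

```python
import math

def sound_image_direction(rtd_s, c=340.0, d=0.18):
    """Solve psi + sin(psi) = 2*c*rtd/d for psi in [-pi/2, pi/2] by bisection.

    rtd_s : interaural time difference in seconds
    c, d  : speed of sound (m/s) and interaural distance (m); assumed values
    """
    target = 2.0 * c * rtd_s / d
    limit = math.pi / 2 + 1.0                 # psi + sin(psi) at psi = pi/2
    target = max(-limit, min(limit, target))  # clamp so a root exists in range
    lo, hi = -math.pi / 2, math.pi / 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid + math.sin(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With these assumed constants, an interaural time difference of 0 gives a direction of 0 rad (straight ahead), and 680 μs gives a direction within one degree of 90° to the right.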

[0011] In the end determination step 95, it is determined whether the extraction of the left-right sound image direction may be ended; if not, processing continues.

[0012]

[Problems to Be Solved by the Invention] However, the conventional method described above does not give a concrete procedure for obtaining the left-right sound image direction of an actual acoustic signal (including a speech signal) continuously over time. An acoustic signal does not always have constant energy: it contains low-level and high-level portions, and may also contain silent portions. In such cases, the time-series interaural cross-correlation function Φ_lr(t, τ) obtained by (Equation 1) is not necessarily obtained with high accuracy. Consequently, the sound image direction ψ(t) finally obtained using (Equation 2) and (Equation 3) is not always correct.

[0013] The present invention solves this conventional problem, and its object is to provide a method of extracting the left-right sound image direction of an acoustic signal with higher accuracy.

[0014]

[Means for Solving the Problems] To achieve the above object, the left-right sound image direction extraction method of the present invention is a method of extracting the left-right sound image direction perceived by a listener by calculating the time-series interaural cross-correlation function and the interaural time difference of the acoustic signals input to the listener's two ears, in which: (1) the energy of the left- and right-channel input acoustic signals input to the two ears is calculated beforehand, and the sound image direction is extracted only in time regions where the calculated energy exceeds a predetermined threshold. This prevents the sound image direction of a low-energy acoustic signal from being extracted erroneously.

[0015] (2) A voiced/unvoiced determination is performed beforehand on the left- and right-channel input acoustic signals input to the two ears, and the sound image direction is extracted only in time regions determined to be voiced. This prevents erroneous extraction of the sound image direction of unvoiced speech, which has lower energy and a less periodic waveform than voiced speech.

[0016] (3) A tonal/non-tonal component determination is performed beforehand on the left- and right-channel input acoustic signals input to the two ears, and the sound image direction is extracted only in time regions determined to be tonal. This prevents erroneous extraction of the sound image direction of non-tonal acoustic signals, which contain many noise components and have a less periodic waveform.

[0017] (4) The sound image direction is extracted after low-pass filtering and half-wave rectification are applied to the left- and right-channel input acoustic signals input to the two ears. The sound image direction is thereby extracted by a method that imitates the human inner ear.

[0018] (5) The sound image direction is extracted after Gammatone filtering is applied to the left- and right-channel input acoustic signals input to the two ears. The sound image direction is thereby extracted by a method, different from (4), that also imitates the human inner ear.

[0019] (6) The extracted left-right sound image directions are smoothed, which makes the change of the sound image direction over time smoother.

[0020] (7) Errors in the extracted left-right sound image directions are detected and then corrected.

[0021] (8) Furthermore, the sound image direction in a time region where no sound image direction was extracted is estimated from the sound image directions in the preceding and following time regions, so that a sound image direction is obtained even for portions where none was extracted.

[0022]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below. (Embodiment 1) FIG. 1 shows the left-right sound image direction extraction method according to Embodiment 1 of the present invention. Reference numeral 11 denotes an acoustic signal input step, 12 an energy calculation step, 13 an energy determination step, 14 a time-series interaural cross-correlation function calculation step, 15 an interaural time difference calculation step, 16 a left-right sound image direction extraction step, and 17 an end determination step.

[0023] The sound image direction extraction is performed for each time region containing a fixed number of digital acoustic signal samples (hereinafter called a frame). After processing of one frame is finished, the next contiguous frame or the next overlapping frame is processed, and processing continues as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.
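The frame-by-frame processing described here can be sketched as a generator that walks both channels in steps of a hop size. The names `frame_len` and `hop` are illustrative assumptions: a hop equal to the frame length gives contiguous frames, and a smaller hop gives overlapping frames.

```python
import numpy as np

def split_into_frames(left, right, frame_len, hop):
    """Yield (start_index, left_frame, right_frame) for successive frames.

    hop == frame_len gives contiguous frames; hop < frame_len gives overlap.
    A trailing partial frame is dropped (an assumed policy).
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = min(len(left), len(right))
    for start in range(0, n - frame_len + 1, hop):
        yield start, left[start:start + frame_len], right[start:start + frame_len]
```

For a 10-sample signal with `frame_len=4` and `hop=2`, the frames start at samples 0, 2, 4 and 6, and each frame would yield one sound image direction.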

[0024] In the acoustic signal input step 11, the acoustic signals of the left and right channels are input. The acoustic signals are assumed to have already been converted to digital signals. If they have not yet been converted, the acoustic signal input step is taken to include an A/D conversion step.

[0025] In the energy calculation step 12, the energy of the input acoustic signal is calculated. The energy is obtained, for example, according to (Equation 4).

[0026]

[Equation 4]

Here, x_l(i) is the i-th digital acoustic signal sample of the left channel, and x_r(i) is the i-th digital acoustic signal sample of the right channel.

[0027] In the energy determination step 13, the calculated energy is compared with a predetermined threshold. The threshold can be determined, for example, by calculating beforehand the maximum value of the left- and right-channel acoustic signals to be input over several frames, taking a fraction of that value as the left- and right-channel acoustic signal, and calculating the energy according to (Equation 4); the result is used as the threshold. Alternatively, for example, a frame containing silence or a near-silent acoustic signal is identified beforehand, the energy of that frame is calculated according to (Equation 4), and that value is used as the threshold.
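The energy gate can be sketched as follows. Equation 4 is not reproduced in the text, so the sum of squared samples over both channels used here is an assumed form, and the function names and the default fraction of 0.1 are hypothetical. The threshold follows the first method described above: a constant signal at a fraction of the observed peak amplitude is passed through the same energy formula.

```python
import numpy as np

def frame_energy(x_l, x_r):
    """Sum of squared samples over both channels (assumed form of Equation 4)."""
    x_l = np.asarray(x_l, dtype=float)
    x_r = np.asarray(x_r, dtype=float)
    return float(np.sum(x_l ** 2) + np.sum(x_r ** 2))

def threshold_from_peak(frames, fraction=0.1):
    """Energy of a frame whose samples all equal fraction * peak amplitude.

    frames : list of (left_frame, right_frame) pairs surveyed beforehand
    """
    peak = max(max(np.max(np.abs(l)), np.max(np.abs(r))) for l, r in frames)
    level = np.full(len(frames[0][0]), fraction * peak)
    return frame_energy(level, level)

def is_active(x_l, x_r, threshold):
    """True when the frame energy exceeds the threshold, so the direction is extracted."""
    return frame_energy(x_l, x_r) > threshold
```

The second method in the paragraph above would simply call `frame_energy` on a frame known to be silent or nearly silent and use that value as the threshold.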

[0028] Only when the energy calculated in the energy determination step 13 exceeds the threshold is the subsequent processing executed to extract the sound image direction in that frame. That is, the time-series interaural cross-correlation function is calculated in the time-series interaural cross-correlation function calculation step 14, the interaural time difference is calculated in the interaural time difference calculation step 15, and the left-right sound image direction is extracted in the left-right sound image direction extraction step 16. The methods of calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in the conventional example, except that the normalized time-series interaural cross-correlation function expressed by (Equation 5) may be calculated instead of the time-series interaural cross-correlation function.

[0029]

[Equation 5]

[0030] Here, S_l(j) is the j-th digital acoustic signal sample of the left channel, S_r(j) is the j-th digital acoustic signal sample of the right channel, k is the interaural difference in number of samples, and G(j) is the time window function.
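Equation 5 is not reproduced in the text. A sketch of one normalized form that is consistent with the symbols S_l(j), S_r(j), k and G(j) is given below; the exact normalization, the treatment of samples outside the frame as zero, and the function name are all assumptions rather than the patent's definition.

```python
import numpy as np

def normalized_iacc(s_l, s_r, max_lag, window=None):
    """Normalized interaural cross-correlation over lags -max_lag..max_lag.

    Assumed form:
      phi(k) = sum_j G(j) S_l(j) S_r(j-k) / sqrt(sum_j G(j) S_l(j)^2 * sum_j G(j) S_r(j)^2)
    Requires max_lag < len(s_l). Returns (lags_in_samples, phi).
    """
    s_l = np.asarray(s_l, dtype=float)
    s_r = np.asarray(s_r, dtype=float)
    n = len(s_l)
    g = np.ones(n) if window is None else np.asarray(window, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    phi = np.zeros(len(lags))
    denom = np.sqrt(np.sum(g * s_l ** 2) * np.sum(g * s_r ** 2))
    if denom == 0.0:
        return lags, phi  # silent frame: correlation left at zero
    for idx, k in enumerate(lags):
        # Pair S_l(j) with S_r(j - k); samples outside the frame count as zero.
        if k >= 0:
            prod = g[k:] * s_l[k:] * s_r[:n - k]
        else:
            prod = g[:n + k] * s_l[:n + k] * s_r[-k:]
        phi[idx] = np.sum(prod) / denom
    return lags, phi
```

Dividing a lag in samples by the sampling rate converts k to a time difference in seconds. A positive peak lag means the left channel lags the right channel, which corresponds to a source on the right under the sign convention of the conventional method.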

[0031] In the end determination step 17, it is determined whether processing may be ended; if not, processing continues with the next consecutive frame.

[0032] (Embodiment 2) FIG. 2 shows the left-right sound image direction extraction method according to Embodiment 2 of the present invention. Reference numeral 21 denotes an acoustic signal input step, 22 a voiced/unvoiced determination step, 23 a voiced decision step, 24 a time-series interaural cross-correlation function calculation step, 25 an interaural time difference calculation step, 26 a left-right sound image direction extraction step, and 27 an end determination step.

[0033] As in Embodiment 1, the sound image direction extraction is performed frame by frame. After processing of one frame is finished, the next contiguous frame or the next overlapping frame is processed, and processing continues as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0034] In the acoustic signal input step 21, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0035] In the voiced/unvoiced determination step 22, it is determined whether the input acoustic signal is a voiced sound or an unvoiced sound. This embodiment is intended for cases where it is known beforehand that most of the input acoustic signal is speech (a human voice, not music).

[0036] Voiced sounds are sounds such as /a/, /i/, /u/, /e/, /o/, /b/, /d/, /g/, /z/, /dz/, /m/, /n/, /w/, /j/, and /r/, whose sound source is the vibration of the vocal cords; they consist of periodic waveforms of relatively large amplitude. Unvoiced sounds are sounds such as /p/, /t/, /k/, /f/, /s/, /sh/, /h/, /ts/, and /tsh/, whose sound source is the turbulent airflow generated near an articulatory constriction in the mouth; they consist of irregular waveforms of relatively small amplitude.

[0037] The voiced/unvoiced determination exploits these properties of voiced and unvoiced sounds and is made using, for example, an energy measure, zero-crossing analysis, and the maximum-to-minimum ratio of the short-time average magnitude difference function (AMDF).
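A minimal sketch of such a decision is shown below, using only two of the three cues named above: mean energy and zero-crossing rate. The AMDF maximum-to-minimum ratio is omitted for brevity, and the threshold values are placeholders that would need tuning for real signals; none of these numbers come from the source.

```python
import numpy as np

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(np.asarray(x, dtype=float))
    return float(np.mean(signs[1:] != signs[:-1]))

def is_voiced(x, energy_threshold=0.01, zcr_threshold=0.25):
    """Classify a frame as voiced when it is energetic and crosses zero rarely.

    Both thresholds are hypothetical placeholder values.
    """
    x = np.asarray(x, dtype=float)
    mean_energy = float(np.mean(x ** 2))
    return mean_energy > energy_threshold and zero_crossing_rate(x) < zcr_threshold
```

This reflects the properties stated in the description: voiced sounds are periodic with relatively large amplitude (high energy, few zero crossings), while unvoiced sounds are irregular with relatively small amplitude.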

[0038] In the voiced decision step 23, processing branches according to whether the result of the voiced/unvoiced determination step 22 is voiced or unvoiced. Only when the frame is determined to be voiced is the subsequent processing executed to extract the sound image direction in that frame.

[0039] That is, the time-series interaural cross-correlation function is calculated in the time-series interaural cross-correlation function calculation step 24, the interaural time difference is calculated in the interaural time difference calculation step 25, and the left-right sound image direction is extracted in the left-right sound image direction extraction step 26. The methods of calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0040] In the end determination step 27, it is determined whether processing may be ended; if not, processing continues with the next consecutive frame.

[0041] (Embodiment 3) FIG. 3 shows the left-right sound image direction extraction method according to Embodiment 3 of the present invention. Reference numeral 31 denotes an acoustic signal input step, 32 a tonal/non-tonal component determination step, 33 a tonal component decision step, 34 a time-series interaural cross-correlation function calculation step, 35 an interaural time difference calculation step, 36 a left-right sound image direction extraction step, and 37 an end determination step.

[0042] As in Embodiment 1, the sound image direction extraction is performed frame by frame. After processing of one frame is finished, the next contiguous frame or the next overlapping frame is processed, and processing continues as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0043] In the acoustic signal input step 31, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0044] In the tonal/non-tonal component determination step 32, it is determined whether the input acoustic signal is a tonal component or a non-tonal component. This embodiment is intended for cases where it is known beforehand that most of the input acoustic signal is music.

[0045] A tonal component is a tone-like component; its spectrum shows several sharp spectral peaks. A non-tonal component is a noise-like component without tonal character; its spectrum shows no sharp peaks.

[0046] The tonal/non-tonal determination is made by obtaining the spectrum of the frame, for example by an FFT, determining whether the spectrum has sharp peaks, and deciding from the result whether the frame is a tonal component or a non-tonal component.
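The source does not say how peak sharpness is measured. As one possible sketch, the ratio of the largest spectral power to the mean spectral power can serve as a peakiness score; this criterion, the Hann window and the default ratio of 20 are assumptions made for illustration.

```python
import numpy as np

def is_tonal(frame, peak_ratio=20.0):
    """Return True when the frame's power spectrum has a sharp dominant peak.

    Peakiness is scored as max power / mean power over the non-DC bins
    (a hypothetical criterion; the threshold is a placeholder).
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))   # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    power = power[1:]                           # ignore the DC bin
    mean_power = np.mean(power)
    if mean_power == 0.0:
        return False                            # silent frame
    return bool(np.max(power) / mean_power > peak_ratio)
```

A pure tone concentrates its power in a few bins and scores far above the threshold, while a noise-like frame spreads its power across the spectrum and scores low.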

[0047] In the tonal component decision step 33, processing branches according to whether the result of the tonal/non-tonal component determination step 32 is a tonal component or a non-tonal component. Only when the frame is determined to be a tonal component is the subsequent processing executed to extract the sound image direction in that frame.

[0048] That is, the time-series interaural cross-correlation function is calculated in the time-series interaural cross-correlation function calculation step 34, the interaural time difference is calculated in the interaural time difference calculation step 35, and the left-right sound image direction is extracted in the left-right sound image direction extraction step 36. The methods of calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0049] In the end determination step 37, it is determined whether processing may be ended; if not, processing continues with the next consecutive frame.

[0050] (Embodiment 4) FIG. 4 shows the left-right sound image direction extraction method according to Embodiment 4 of the present invention. Reference numeral 41 denotes an acoustic signal input step, 42 a low-pass filtering step, 43 a half-wave rectification step, 44 a time-series interaural cross-correlation function calculation step, 45 an interaural time difference calculation step, 46 a left-right sound image direction extraction step, and 47 an end determination step.

[0051] As in Embodiment 1, the sound image direction extraction is performed frame by frame. After processing of one frame is finished, the next contiguous frame or the next overlapping frame is processed, and processing continues as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0052] In the acoustic signal input step 41, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0053] In the low-pass filtering step 42, a low-pass filter with a cutoff frequency of, for example, 1.6 kHz is used to extract from the input acoustic signal only the components at or below 1.6 kHz.

[0054] In the half-wave rectification step 43, only the positive part of the acoustic signal is kept and the negative part is set to zero. The processing in the low-pass filtering step 42 and the half-wave rectification step 43 imitates the processing of the human inner ear.
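These two steps can be sketched with NumPy alone. The source specifies only the example cutoff of 1.6 kHz, so the windowed-sinc FIR design, the Hamming window and the tap count below are assumptions; any low-pass design with that cutoff would fit the description.

```python
import numpy as np

def lowpass_fir(x, fs, cutoff_hz=1600.0, num_taps=101):
    """Windowed-sinc FIR low-pass filter (assumed design, zero phase delay).

    fs is the sampling rate in Hz; num_taps should be odd.
    """
    n = np.arange(num_taps) - (num_taps - 1) / 2
    fc = cutoff_hz / fs                               # cutoff in cycles per sample
    taps = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
    taps /= np.sum(taps)                              # unity gain at DC
    return np.convolve(np.asarray(x, dtype=float), taps, mode="same")

def half_wave_rectify(x):
    """Keep the positive part of the signal and set the negative part to zero."""
    return np.maximum(np.asarray(x, dtype=float), 0.0)
```

Applied to each channel in the order of steps 42 and 43, a frame would be processed as `half_wave_rectify(lowpass_fir(frame, fs))` before the cross-correlation is computed.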

[0055] The time-series interaural cross-correlation function is calculated in the time-series interaural cross-correlation function calculation step 44, the interaural time difference is calculated in the interaural time difference calculation step 45, and the left-right sound image direction is extracted in the left-right sound image direction extraction step 46. The methods of calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0056] In the end determination step 47, it is determined whether processing may be ended; if not, processing continues with the next consecutive frame.

[0057] (Embodiment 5) FIG. 5 shows the left-right sound image direction extraction method according to Embodiment 5 of the present invention. Reference numeral 51 denotes an acoustic signal input step, 52 a Gammatone filtering step, 53 a time-series interaural cross-correlation function calculation step, 54 an interaural time difference calculation step, 55 a left-right sound image direction extraction step, and 56 an end determination step.

[0058] As in Embodiment 1, the sound image direction extraction is performed frame by frame. After processing of one frame is finished, the next contiguous frame or the next overlapping frame is processed, and processing continues as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0059] In the acoustic signal input step 51, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0060] In the Gammatone filtering step 52, a band-pass filter with a center frequency of, for example, 1 kHz is used to extract from the input acoustic signal only the components within the critical band centered on 1 kHz. The human auditory system is thought to contain band-pass filters of certain bandwidths arranged in sequence across the whole audible frequency range; these frequency bands are called critical bands. Critical-band center frequencies range from 50 Hz to 13.5 kHz, and the center frequency that gives the best sound image direction extraction result can be chosen, for example the 1 kHz mentioned above, or 700 Hz, 4 kHz, 7 kHz, and so on.

[0061] The Gammatone filter is expressed in the time domain as follows.

[0062]

[Equation 6]
$$\mathrm{gt}(t) = a\,t^{3}\exp(-2\pi b t)\cos(2\pi f_0 t)$$
where a is a normalization coefficient, b = 25.2(4.37 f₀/1000 + 1), and f₀ is the center frequency of the Gammatone filter.
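Equation 6 can be sampled directly to obtain a finite impulse response and applied by convolution, as in the sketch below. The source does not define the normalization coefficient a, so scaling the response to unit peak amplitude is an assumption, as are the 50 ms truncation and the function names. The center frequency f₀ is taken in Hz.

```python
import numpy as np

def gammatone_impulse_response(f0, fs, duration_s=0.05):
    """Sample gt(t) = a * t^3 * exp(-2*pi*b*t) * cos(2*pi*f0*t) at rate fs.

    b = 25.2 * (4.37 * f0 / 1000 + 1), with f0 in Hz.
    'a' is chosen here to give unit peak amplitude (an assumption).
    """
    t = np.arange(int(round(duration_s * fs))) / fs
    b = 25.2 * (4.37 * f0 / 1000.0 + 1.0)
    g = t ** 3 * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * f0 * t)
    return g / np.max(np.abs(g))

def gammatone_filter(x, f0, fs):
    """Causal filtering of x with the sampled Gammatone impulse response."""
    g = gammatone_impulse_response(f0, fs)
    return np.convolve(np.asarray(x, dtype=float), g)[:len(x)]
```

With f₀ = 1 kHz, a 1 kHz tone passes through strongly while a 4 kHz tone is heavily attenuated, which matches the band-pass behaviour described for step 52.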

[0063] The time-series interaural cross-correlation function is calculated in the time-series interaural cross-correlation function calculation step 53, the interaural time difference is calculated in the interaural time difference calculation step 54, and the left-right sound image direction is extracted in the left-right sound image direction extraction step 55. The methods of calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0064] In the end determination step 56, it is determined whether processing may be ended; if not, processing continues with the next consecutive frame.

[0065] (Embodiment 6) FIG. 6 shows the left-right sound image direction extraction method according to Embodiment 6 of the present invention. Reference numeral 61 denotes an acoustic signal input step, 62 a time-series interaural cross-correlation function calculation step, 63 an interaural time difference calculation step, 64 a left-right sound image direction extraction step, 65 a smoothing step, and 66 an end determination step.

[0066] As in Embodiment 1, the sound image direction extraction is performed frame by frame. After processing of one frame is finished, the next contiguous frame or the next overlapping frame is processed, and processing continues as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0067] In the acoustic signal input step 61, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0068] In the time-series interaural cross-correlation function calculation step 62, the time-series interaural cross-correlation function is calculated; in the interaural time difference calculation step 63, the interaural time difference is calculated; and in the left-right sound image direction extraction step 64, the left-right sound image direction is extracted. The methods for calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0069] In the smoothing step 65, the left-right sound image directions of consecutive frames are smoothed. The number of frames is determined in advance; the left-right sound image directions for that number of frames are stored in the smoothing unit and smoothed by, for example, calculating their arithmetic mean. After smoothing, the left-right sound image direction of the temporally earliest frame is discarded. Steps 61 to 64 are then performed on the next frame to obtain the left-right sound image direction of the temporally newest frame, after which the smoothing step 65 is performed again over the stored frames.
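
The smoothing procedure above amounts to a moving arithmetic mean over a fixed number of frames. The following is a minimal sketch under stated assumptions: the class name `DirectionSmoother` and its `push` method are hypothetical, and directions are assumed to be plain numbers (for example, azimuth in degrees). A `deque` with `maxlen` drops the oldest entry when a new one is appended, which has the same effect as discarding the earliest frame and then adding the newest one.

```python
from collections import deque


class DirectionSmoother:
    """Moving arithmetic mean over a predetermined number of frames (illustrative)."""

    def __init__(self, num_frames):
        if num_frames < 1:
            raise ValueError("num_frames must be at least 1")
        # With maxlen set, appending to a full deque discards the oldest entry.
        self.buffer = deque(maxlen=num_frames)

    def push(self, direction):
        """Store the newest frame's direction; return the mean once the buffer is full."""
        self.buffer.append(direction)
        if len(self.buffer) < self.buffer.maxlen:
            return None  # not enough frames stored yet
        return sum(self.buffer) / len(self.buffer)


smoother = DirectionSmoother(num_frames=3)
results = [smoother.push(d) for d in (10.0, 20.0, 30.0, 40.0)]
```

Returning `None` until the buffer fills is a design choice of this sketch; the text does not say how the first few frames are handled.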

[0070] In the end determination step 66, it is determined whether processing may be terminated; if not, processing continues with the next consecutive frame.

[0071] (Embodiment 7) FIG. 7 shows the left-right sound image direction extraction method according to Embodiment 7 of the present invention, in which 71 is an acoustic signal input step, 72 is a time-series interaural cross-correlation function calculation step, 73 is an interaural time difference calculation step, 74 is a left-right sound image direction extraction step, 75 is an error detection step, 76 is an error correction step, and 77 is an end determination step.

[0072] As in Embodiment 1, sound image direction extraction is performed frame by frame. After processing of one frame is complete, the next consecutive frame or the next overlapping frame is processed, and processing continues for as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0073] In the acoustic signal input step 71, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0074] In the time-series interaural cross-correlation function calculation step 72, the time-series interaural cross-correlation function is calculated; in the interaural time difference calculation step 73, the interaural time difference is calculated; and in the left-right sound image direction extraction step 74, the left-right sound image direction is extracted. The methods for calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0075] In the error detection step 75, errors in the left-right sound image directions of consecutive frames are detected. The number of frames is determined in advance; the left-right sound image directions for that number of frames are stored in the error detection unit, and, for example, an error in the left-right sound image direction of the temporally central frame is detected using the left-right sound image directions of the frames before and after it. When the number of frames is even, a frame shifted to just before or just after the center is taken as the frame to be checked. One possible detection method is to compute the mean of the left-right sound image directions of the other frames, excluding the frame being checked, and to judge an error when the left-right sound image direction of the checked frame deviates from that mean by a predetermined value or more.
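
The example detection rule above can be sketched as follows. The function names `center_index` and `is_direction_error`, the `prefer_earlier` flag, and the threshold value are all hypothetical; the text only says that the threshold is predetermined and that, for an even frame count, the checked frame is shifted before or after the center.

```python
def center_index(num_frames, prefer_earlier=True):
    """Index of the frame to check; for even counts, shift before or after the center."""
    if num_frames % 2 == 1:
        return num_frames // 2
    return num_frames // 2 - 1 if prefer_earlier else num_frames // 2


def is_direction_error(directions, threshold, prefer_earlier=True):
    """Flag the central frame if it deviates from the mean of the others by >= threshold."""
    directions = list(directions)
    if len(directions) < 2:
        raise ValueError("need at least two frames")
    c = center_index(len(directions), prefer_earlier)
    # Mean of all stored frames except the one being checked.
    others = directions[:c] + directions[c + 1:]
    mean_others = sum(others) / len(others)
    return abs(directions[c] - mean_others) >= threshold


outlier = is_direction_error([10.0, 12.0, 60.0, 11.0, 9.0], threshold=20.0)
normal = is_direction_error([10.0, 12.0, 11.0, 11.0, 9.0], threshold=20.0)
```

In the first call the central value 60.0 lies 49.5 away from the mean of the other four frames (10.5), so it is flagged; in the second call the deviation is 0.5, so it is not.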

[0076] In the error correction step 76, the sound image direction of a frame judged to be in error in the error detection step 75 is corrected. One possible correction method is to compute the mean of the left-right sound image directions of the two frames before and after the frame judged to be in error, and to replace that frame's sound image direction with the mean.
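
A minimal sketch of the example correction rule follows. It assumes that "the two frames before and after" means the one frame immediately preceding and the one immediately following the erroneous frame; the function name `correct_direction` is hypothetical.

```python
def correct_direction(directions, index):
    """Replace directions[index] with the mean of its immediate neighbors.

    Assumption: the two frames used are the one just before and the one
    just after the frame judged to be in error.
    """
    if not 0 < index < len(directions) - 1:
        raise ValueError("index must have a frame on both sides")
    corrected = list(directions)  # leave the caller's data untouched
    corrected[index] = (directions[index - 1] + directions[index + 1]) / 2
    return corrected


fixed = correct_direction([10.0, 12.0, 60.0, 11.0, 9.0], index=2)
```

Here the outlying value 60.0 is replaced by 11.5, the mean of its neighbors 12.0 and 11.0.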

[0077] After error detection and error correction have been performed, the left-right sound image direction of the temporally earliest frame is discarded. Steps 71 to 74 are then performed on the next frame to obtain the left-right sound image direction of the temporally newest frame, after which error detection and error correction of the sound image direction are performed again in steps 75 and 76.

[0078] In the end determination step 77, it is determined whether processing may be terminated; if not, processing continues with the next consecutive frame.

[0079] (Embodiment 8) FIG. 8 shows the left-right sound image direction extraction method according to Embodiment 8 of the present invention, in which 81 is an acoustic signal input step, 82 is a time-series interaural cross-correlation function calculation step, 83 is an interaural time difference calculation step, 84 is a left-right sound image direction extraction step, 85 is a step of estimating the sound image direction of sections where the left-right sound image direction was not extracted, and 86 is an end determination step.

[0080] As in Embodiment 1, sound image direction extraction is performed frame by frame. After processing of one frame is complete, the next consecutive frame or the next overlapping frame is processed, and processing continues for as long as the acoustic signal continues. Accordingly, one sound image direction is extracted per frame.

[0081] In the acoustic signal input step 81, the same processing as in the acoustic signal input step 11 of Embodiment 1 is performed.

[0082] In the time-series interaural cross-correlation function calculation step 82, the time-series interaural cross-correlation function is calculated; in the interaural time difference calculation step 83, the interaural time difference is calculated; and in the left-right sound image direction extraction step 84, the left-right sound image direction is extracted. The methods for calculating the time-series interaural cross-correlation function, calculating the interaural time difference, and extracting the left-right sound image direction are the same as in Embodiment 1.

[0083] In step 85, which estimates the sound image direction of sections where the left-right sound image direction was not extracted, the unextracted left-right sound image directions in consecutive frames are estimated. The number of frames is determined in advance; the left-right sound image directions for that number of frames are stored in the estimation unit, and when, for example, the left-right sound image direction of the temporally central frame has not been extracted, it is estimated using the left-right sound image directions of the frames before and after it. When the number of frames is even, a frame shifted to just before or just after the center is taken as the frame whose left-right sound image direction is to be estimated.

[0084] The left-right sound image direction is unextracted when, for example, the calculation of the time-series interaural cross-correlation function, the calculation of the interaural time difference, and the extraction of the left-right sound image direction were not performed in Embodiments 1, 2, and 3. One possible estimation method is to fit a least-squares line to the left-right sound image directions of the other frames, excluding the frame to be estimated, and to take the value of that line at the frame to be estimated as its left-right sound image direction. Alternatively, and more simply, the mean of the left-right sound image directions of the two frames before and after the frame to be estimated can be used as its left-right sound image direction.
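
Both estimation options above can be sketched as follows. This is an illustration under assumptions, not the patent's specified procedure: the function names are hypothetical, unextracted frames are represented by `None`, the frame index is used as the horizontal axis of the least-squares line, and the simpler variant uses the one frame immediately before and the one immediately after the frame to be estimated.

```python
def estimate_by_least_squares(directions, index):
    """Fit a line to the extracted directions of the other frames; evaluate it at index."""
    # Use every frame that has a direction, excluding the frame to be estimated.
    points = [(i, d) for i, d in enumerate(directions)
              if d is not None and i != index]
    if len(points) < 2:
        raise ValueError("need at least two extracted frames to fit a line")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # sxx > 0 because frame indices are distinct
    return mean_y + slope * (index - mean_x)


def estimate_by_neighbors(directions, index):
    """Simpler variant: mean of the immediately preceding and following frames."""
    return (directions[index - 1] + directions[index + 1]) / 2


frames = [0.0, 10.0, None, 30.0, 40.0]
line_estimate = estimate_by_least_squares(frames, index=2)
neighbor_estimate = estimate_by_neighbors(frames, index=2)
```

For this perfectly linear example both estimates come out at 20.0. In general they differ, since the line fit uses all stored frames while the neighbor mean uses only two.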

[0085] After the sound image direction of the unextracted section has been estimated, the left-right sound image direction of the temporally earliest frame is discarded. Steps 81 to 84 are then performed on the next frame to obtain the left-right sound image direction of the temporally newest frame, after which step 85 is performed again to estimate the sound image direction of sections where the left-right sound image direction was not extracted.

[0086] In the end determination step 86, it is determined whether processing may be terminated; if not, processing continues with the next consecutive frame.

[0087]

[Effects of the Invention] As described above, according to the present invention, a highly accurate sound image direction can be extracted by performing, before the time-series interaural cross-correlation function is obtained, calculation of the energy of the left- and right-channel acoustic signals, voiced/unvoiced determination, tonal/non-tonal component determination, low-pass filtering and half-wave rectification, and Gammatone filtering. Furthermore, a highly accurate sound image direction can be extracted by performing smoothing after the sound image direction has been extracted, by performing error detection and correction, and by estimating the sound image direction in sections where no sound image direction was extracted.

[Brief description of drawings]

[FIG. 1] A flowchart showing the left-right sound image direction extraction method according to Embodiment 1 of the present invention.

[FIG. 2] A flowchart showing the left-right sound image direction extraction method according to Embodiment 2 of the present invention.

[FIG. 3] A flowchart showing the left-right sound image direction extraction method according to Embodiment 3 of the present invention.

[FIG. 4] A flowchart showing the left-right sound image direction extraction method according to Embodiment 4 of the present invention.

[FIG. 5] A flowchart showing the left-right sound image direction extraction method according to Embodiment 5 of the present invention.

[FIG. 6] A flowchart showing the left-right sound image direction extraction method according to Embodiment 6 of the present invention.

[FIG. 7] A flowchart showing the left-right sound image direction extraction method according to Embodiment 7 of the present invention.

[FIG. 8] A flowchart showing the left-right sound image direction extraction method according to Embodiment 8 of the present invention.

[FIG. 9] A flowchart showing a conventional sound image direction extraction method.

[Explanation of symbols]

11, 21, 31, 41, 51, 61, 71, 81: acoustic signal input step
12: energy calculation step
13: energy determination step
14, 24, 34, 44, 53, 62, 72, 82: time-series interaural cross-correlation function calculation step
15, 25, 35, 45, 54, 63, 73, 83: interaural time difference calculation step
16, 26, 36, 46, 55, 64, 74, 84: left-right sound image direction extraction step
17, 27, 37, 47, 56, 66, 77, 86: end determination step
22: voiced/unvoiced determination step
23: voiced determination step
32: tonal/non-tonal component determination step
33: tonal component determination step
42: low-pass filtering step
43: half-wave rectification step
52: Gammatone filtering step
65: smoothing step
75: error detection step
76: error correction step
85: step of estimating the sound image direction of sections where the left-right sound image direction was not extracted

Claims

[Claims]

1. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that the energy of the left- and right-channel input acoustic signals input to both ears is calculated in advance, and the sound image direction is extracted only in time regions in which the calculated energy is larger than a predetermined threshold.

2. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that voiced/unvoiced determination is performed in advance on the left- and right-channel input acoustic signals input to both ears, and the sound image direction is extracted only in time regions determined to be voiced.

3. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that tonal/non-tonal component determination is performed in advance on the left- and right-channel input acoustic signals input to both ears, and the sound image direction is extracted only in time regions determined to be tonal components.

4. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that low-pass filtering and half-wave rectification are applied in advance to the left- and right-channel input acoustic signals input to both ears, after which the sound image direction is extracted.

5. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that Gammatone filtering is applied in advance to the left- and right-channel input acoustic signals input to both ears, after which the sound image direction is extracted.

6. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that the extracted left-right sound image direction is smoothed.

7. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that errors in the extracted left-right sound image direction are detected and the detected errors are corrected.

8. A left-right sound image direction extraction method for extracting the left-right sound image direction perceived by a listener by calculating a time-series interaural cross-correlation function and an interaural time difference of acoustic signals input to both ears of the listener, characterized in that the sound image direction in a time region in which no sound image direction was extracted is estimated from the sound image directions in the preceding and following time regions.