JP5033156B2

JP5033156B2 - Sound image width estimation apparatus and sound image width estimation program

Info

Publication number: JP5033156B2
Application number: JP2009048814A
Authority: JP
Inventors: 一郎ベーマーヨハン; 訓史大出; 彰男安藤
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2009-03-03
Filing date: 2009-03-03
Publication date: 2012-09-26
Anticipated expiration: 2029-03-03
Also published as: JP2010204325A

Abstract

PROBLEM TO BE SOLVED: To provide a sound image width estimation device capable of estimating the sound image width with high accuracy from physical feature quantities obtained by analyzing a digital acoustic signal consisting of left and right channels. SOLUTION: The sound image width estimation device 100 includes: filter banks 7R and 7L that divide each of the left- and right-channel acoustic signals of the two-channel digital acoustic signal into a plurality of subband signals, each with a frequency bandwidth of 1/6 octave or less; frequency band-based feature quantity calculating means 8_f that calculates, for each subband signal, a frequency band-based feature quantity comprising at least one of the interaural cross-correlation, the standard deviation along the time axis of the interaural time difference, and the standard deviation along the time axis of the interaural level difference; means 9 that calculates a representative value of the physical feature quantity from the frequency band-based feature quantities; and sound image width estimated value calculating means 10 that calculates an estimated value of the sound image width by applying the representative value of the physical feature quantity to an estimation model equation. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a sound image width estimation device and a sound image width estimation program that estimate the sound image width, an auditory attribute, from physical feature quantities obtained by analyzing digital acoustic signals captured in two channels, left and right.

The magnitude of the psychological effect that sound has on humans can be quantified by subjective evaluation. Many attempts have been made to evaluate this magnitude objectively, using physical feature quantities obtained by capturing and analyzing acoustic signals.
Within this work, many studies have examined the relationship between physical feature quantities and the sound image width, which is one of the human auditory attributes. A physical feature quantity that is widely known and accepted in the field of acoustic analysis is the IACC (interaural cross-correlation). The sound image width is generally considered to increase as the IACC decreases, and many studies have analyzed the IACC in various frequency bands (see, for example, Non-Patent Document 1).
A relationship between the sound image width and fluctuations of the ITD (interaural time difference) and the ILD (interaural level difference), which are also physical feature quantities, has been reported as well (see Non-Patent Documents 2 and 3).
Further, Patent Document 1, for example, describes a method for evaluating the apparent source width (ASW) from an acoustic signal based on the IACC, which is the maximum amplitude of the IACF (interaural cross-correlation function), and the width W_IACC of that maximum amplitude (see paragraph 0050).

Patent Document 1: JP 2003-57108 A

Non-Patent Document 1: Masayuki Morimoto and Kazuhiro Iida, "Appropriate frequency bandwidth in measuring interaural cross-correlation as a physical measure of auditory source width", Acoustical Science and Technology, Japan, Acoustical Society of Japan, 2005, Vol. 26, No. 2, pp. 179-184
Non-Patent Document 2: Russell Mason and Francis Rumsey, "A comparison of objective measurements for predicting selected subjective spatial attributes", Audio Engineering Society 112th Convention Paper 5591, Germany, 2002
Non-Patent Document 3: Jens Blauert and Werner Lindemann, "Auditory spaciousness: Some further psychoacoustic analyses", Journal of the Acoustical Society of America, USA, 1986, Vol. 80, No. 2, pp. 533-542

However, in conventional objective evaluation methods based on physical feature quantities obtained by analyzing acoustic signals captured in left and right channels, the correlation between the feature quantities used and the subjective evaluation values is not necessarily high, so the width of the sound image produced by an arbitrary sound source could not be evaluated accurately.

The present invention has been made in view of this problem, and its object is to provide a sound image width estimation device that accurately estimates the sound image width from physical feature quantities obtained by analyzing a digital acoustic signal consisting of left and right channels.

To achieve the above object, the sound image width estimation device according to claim 1 calculates a physical feature quantity from a digital acoustic signal consisting of left and right channels and estimates the sound image width by applying the calculated physical feature quantity to an estimation model equation for the sound image width composed of physical feature quantities and weighting coefficients. The device comprises frequency band dividing means, frequency band-based feature quantity calculating means, physical feature quantity calculating means, and estimated value calculating means.

With this configuration, the frequency band dividing means divides each of the left- and right-channel acoustic signals of the two-channel digital acoustic signal into subband signals in a plurality of frequency bands, each with a frequency bandwidth of 1/6 octave or less. Next, the frequency band-based feature quantity calculating means calculates, for each subband signal, a frequency band-based feature quantity: a per-band feature quantity that expresses the difference between the left and right channels of the subband signal and comprises at least one of the interaural cross-correlation, the standard deviation along the time axis of the interaural time difference, and the standard deviation along the time axis of the interaural level difference. The physical feature quantity calculating means then calculates a physical feature quantity from the frequency band-based feature quantities. Finally, the estimated value calculating means calculates an estimated value of the sound image width by applying the physical feature quantity to the estimation model equation.
The sound image width estimation device thereby evaluates the sound image width objectively using physical feature quantities.
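The three per-band feature quantities named above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the plus or minus 1 ms lag range, the non-overlapping framing, and the use of the cross-correlation peak lag as the ITD are all assumptions made for the example, and the inputs are taken to be one subband of the left and right channels as plain Python lists.

```python
import math
import statistics


def xcorr_peak(left, right, max_lag):
    """Return (peak, lag): the largest normalized cross-correlation magnitude
    and its lag in samples (positive when the right channel is delayed)."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    n = len(left)
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        return 0.0, 0
    best, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair left[i] with right[i + lag] over the overlapping samples.
        acc = sum(left[i] * right[i + lag]
                  for i in range(max(0, -lag), min(n, n - lag)))
        if abs(acc) / norm > best:
            best, best_lag = abs(acc) / norm, lag
    return best, best_lag


def iacc(left, right, fs, max_lag_ms=1.0):
    """Interaural cross-correlation: peak within +/- max_lag_ms (assumed range)."""
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    return xcorr_peak(left, right, max_lag)[0]


def frames(left, right, frame_len):
    """Yield non-overlapping (left, right) frames of frame_len samples."""
    for start in range(0, len(left) - frame_len + 1, frame_len):
        yield left[start:start + frame_len], right[start:start + frame_len]


def itd_std(left, right, fs, frame_len, max_lag_ms=1.0):
    """Standard deviation over time of the per-frame ITD, in seconds."""
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    itds = [xcorr_peak(l, r, max_lag)[1] / fs
            for l, r in frames(left, right, frame_len)]
    return statistics.pstdev(itds) if itds else 0.0


def ild_std(left, right, frame_len):
    """Standard deviation over time of the per-frame ILD, in dB."""
    ilds = []
    for l, r in frames(left, right, frame_len):
        el, er = sum(x * x for x in l), sum(y * y for y in r)
        if el > 0.0 and er > 0.0:  # skip silent frames
            ilds.append(10.0 * math.log10(el / er))
    return statistics.pstdev(ilds) if ilds else 0.0
```

In this sketch each function would be called once per subband, giving one IACC, one ITD deviation, and one ILD deviation per frequency band.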

The sound image width estimation device according to claim 2 is the device according to claim 1, wherein the physical feature quantity calculating means calculates, as the physical feature quantity, any one of the average, the weighted average, the maximum, and the median of the frequency band-based feature quantities.

With this configuration, the physical feature quantity calculating means calculates, as the physical feature quantity, any one of the average, the weighted average, the maximum, and the median of the frequency band-based feature quantities that the frequency band-based feature quantity calculating means calculated for the individual subband signals. The sound image width estimated value calculating means then calculates an estimated value of the sound image width from that physical feature quantity.
The sound image width estimation device thereby calculates the estimated value of the sound image width from physical feature quantities in which the feature quantities calculated per frequency band are aggregated into a single value for each type of feature quantity.
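The aggregation step can be illustrated with a short sketch. The function name `representative_value`, the `method` strings, and the normalization of the weighted average by the sum of the weights are assumptions for this example; the text only states that one of the four statistics is used.

```python
import statistics


def representative_value(band_features, method="mean", weights=None):
    """Collapse one type of per-band feature quantity into a single value."""
    if not band_features:
        raise ValueError("at least one band is required")
    if method == "mean":
        return statistics.fmean(band_features)
    if method == "weighted_mean":
        if weights is None or len(weights) != len(band_features):
            raise ValueError("one weight per band is required")
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero")
        return sum(w * x for w, x in zip(weights, band_features)) / total
    if method == "max":
        return max(band_features)
    if method == "median":
        return statistics.median(band_features)
    raise ValueError(f"unknown method: {method}")
```

Because each feature type yields one value regardless of the number of bands, the estimation model needs only one weighting coefficient per feature type under this scheme.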

The sound image width estimation device according to claim 3 calculates a physical feature quantity from a digital acoustic signal consisting of left and right channels and estimates the sound image width by applying the calculated physical feature quantity to an estimation model equation for the sound image width composed of physical feature quantities and weighting coefficients. The device comprises frequency band dividing means, frequency band-based feature quantity calculating means, and estimated value calculating means.

With this configuration, the frequency band dividing means divides each of the left- and right-channel acoustic signals of the two-channel digital acoustic signal into subband signals in a plurality of frequency bands, each with a frequency bandwidth of 1/6 octave or less. Next, the frequency band-based feature quantity calculating means calculates, for each subband signal, a frequency band-based feature quantity: a per-band feature quantity that expresses the difference between the left and right channels of the subband signal and comprises at least one of the interaural cross-correlation, the standard deviation along the time axis of the interaural time difference, and the standard deviation along the time axis of the interaural level difference. The estimated value calculating means then treats each individual frequency band-based feature quantity as a physical feature quantity and applies it to the estimation model equation to calculate an estimated value of the sound image width.
The sound image width estimation device thereby evaluates the sound image width objectively using the physical feature quantities calculated for each frequency band.

The sound image width estimation device according to claim 4 is the device according to any one of claims 1 to 3, wherein the frequency band dividing means divides the signal into subband signals with a frequency bandwidth of 1/12 octave or less.

With this configuration, the frequency band dividing means divides each of the left- and right-channel acoustic signals of the two-channel digital acoustic signal into subband signals with a frequency bandwidth of 1/12 octave or less. The frequency band-based feature quantity calculating means then calculates a frequency band-based feature quantity for each of these subband signals. Either the physical feature quantity calculating means calculates a physical feature quantity from the frequency band-based feature quantities, or the device treats the frequency band-based feature quantities as individual physical feature quantities. The estimated value calculating means then applies the physical feature quantity, or the frequency band-based feature quantities, to the estimation model equation to calculate an estimated value of the sound image width.
The sound image width estimation device thereby calculates the estimated value of the sound image width from feature quantities calculated for frequency bands finely divided to 1/12 octave or less.
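To give a feel for how fine a 1/12-octave division is, the sketch below lists band edges for a fractional-octave layout. The geometric spacing of the center frequencies, the contiguous non-overlapping bands, and the function name `fractional_octave_bands` are assumptions for illustration; the text only requires that each band be no wider than the stated fraction of an octave.

```python
def fractional_octave_bands(f_low, f_high, fraction=12):
    """Return (lower, center, upper) edges in Hz for contiguous
    1/fraction-octave bands whose centers lie in [f_low, f_high]."""
    if f_low <= 0 or f_high < f_low or fraction <= 0:
        raise ValueError("invalid frequency range or fraction")
    half_band = 2.0 ** (1.0 / (2 * fraction))  # center-to-edge ratio
    bands = []
    k = 0
    while True:
        # Centers are spaced geometrically: f_low * 2^(k / fraction).
        center = f_low * 2.0 ** (k / fraction)
        if center > f_high:
            break
        bands.append((center / half_band, center, center * half_band))
        k += 1
    return bands


# Example: the octave starting at 100 Hz split into twelve 1/12-octave bands.
for lower, center, upper in fractional_octave_bands(100.0, 199.0, 12):
    print(f"{lower:7.2f} Hz  {center:7.2f} Hz  {upper:7.2f} Hz")
```

With this layout an octave contains 12 bands at 1/12 octave and 6 bands at 1/6 octave, so halving the bandwidth doubles the number of per-band feature quantities.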

The sound image width estimation device according to claim 5 is the device according to any one of claims 1 to 4, further comprising weighting coefficient calculating means.

With this configuration, the weighting coefficient calculating means calculates the weighting coefficients of the estimation model equation in advance by regression analysis, with the sound image width as the objective variable. The explanatory variables are either physical feature quantities based on at least one of the interaural cross-correlation, the standard deviation along the time axis of the interaural time difference, and the standard deviation along the time axis of the interaural level difference, or the individual frequency band-based feature quantities of at least one of these three. The estimated value calculating means then calculates an estimated value of the sound image width from the estimation model equation, using the physical feature quantity calculated by the physical feature quantity calculating means, or the frequency band-based feature quantities calculated by the frequency band-based feature quantity calculating means, together with the weighting coefficients calculated in advance.
The sound image width estimation device thereby calculates the estimated value of the sound image width according to weighting coefficients determined by regression analysis.
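The regression step can be sketched as an ordinary least-squares fit. The linear form with an intercept, the function names `fit_weights` and `estimate_width`, and the use of normal equations are assumptions for this example; the text states only that the weighting coefficients come from a regression analysis with the feature quantities as explanatory variables and the sound image width as the objective variable.

```python
def fit_weights(features, widths):
    """Least-squares fit of widths ~ w[0] + sum_i w[i + 1] * features[:, i].

    features: one row of feature quantities per stimulus.
    widths:   the subjectively evaluated sound image width per stimulus.
    """
    rows = [[1.0] + [float(x) for x in row] for row in features]
    k = len(rows[0])
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, widths))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("explanatory variables are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    w = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, k))
        w[i] = (a[i][k] - tail) / a[i][i]
    return w


def estimate_width(weights, feature_row):
    """Apply the fitted (assumed linear) model to one set of feature quantities."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))
```

The fit would be run once on subjective evaluation data, and the resulting coefficients stored so that later estimates only need `estimate_width`.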

The sound image width estimation program according to claim 6 causes a computer to function as frequency band dividing means, frequency band-based feature quantity calculating means, physical feature quantity calculating means, and estimated value calculating means, in order to calculate a physical feature quantity from a digital acoustic signal consisting of left and right channels and to estimate the sound image width by applying the calculated physical feature quantity to an estimation model equation for the sound image width composed of physical feature quantities and weighting coefficients.

With this configuration, the sound image width estimation program uses the frequency band dividing means to divide each of the left- and right-channel acoustic signals of the two-channel digital acoustic signal into subband signals in a plurality of frequency bands, each with a frequency bandwidth of 1/6 octave or less. Next, the program uses the frequency band-based feature quantity calculating means to calculate, for each subband signal, a frequency band-based feature quantity: a per-band feature quantity that expresses the difference between the left and right channels of the subband signal and comprises at least one of the interaural cross-correlation, the standard deviation along the time axis of the interaural time difference, and the standard deviation along the time axis of the interaural level difference. The program then uses the physical feature quantity calculating means to calculate a physical feature quantity from the frequency band-based feature quantities. Finally, the program uses the estimated value calculating means to calculate an estimated value of the sound image width by applying the physical feature quantity to the estimation model equation.
The sound image width estimation program thereby evaluates the sound image width objectively using physical feature quantities.

The sound image width estimation program according to claim 7 causes a computer to function as frequency band dividing means, frequency band-based feature quantity calculating means, and estimated value calculating means, in order to calculate a physical feature quantity from a digital acoustic signal consisting of left and right channels and to estimate the sound image width by applying the calculated physical feature quantity to an estimation model equation for the sound image width composed of physical feature quantities and weighting coefficients.

With this configuration, the sound image width estimation program uses the frequency band dividing means to divide each of the left- and right-channel acoustic signals of the two-channel digital acoustic signal into subband signals in a plurality of frequency bands, each with a frequency bandwidth of 1/6 octave or less. Next, the program uses the frequency band-based feature quantity calculating means to calculate, for each subband signal, a frequency band-based feature quantity: a per-band feature quantity that expresses the difference between the left and right channels of the subband signal and comprises at least one of the interaural cross-correlation, the standard deviation along the time axis of the interaural time difference, and the standard deviation along the time axis of the interaural level difference. The program then uses the estimated value calculating means to treat each individual frequency band-based feature quantity as a physical feature quantity and apply it to the estimation model equation to calculate an estimated value of the sound image width.
The sound image width estimation program thereby evaluates the sound image width objectively using the physical feature quantities calculated for each frequency band.

According to the invention of claim 1 or claim 6, the estimated value of the sound image width is calculated from feature quantities calculated for each frequency band with a frequency bandwidth of 1/6 octave or less, so the sound image width can be estimated with stable accuracy.
According to the invention of claim 2, the estimated value of the sound image width is calculated from physical feature quantities in which the feature quantities calculated per frequency band are aggregated into a single value for each type of feature quantity, so the estimated value can be obtained by a simple calculation without increasing the number of weighting coefficients in the estimation model equation.
According to the invention of claim 3 or claim 7, the estimated value of the sound image width is calculated from feature quantities calculated for each frequency band with a frequency bandwidth of 1/6 octave or less, so the sound image width can be estimated accurately.
According to the invention of claim 4, the estimated value of the sound image width is calculated from feature quantities calculated for each frequency band with a frequency bandwidth of 1/12 octave or less, so the sound image width can be estimated with more stable accuracy.
According to the invention of claim 5, the weighting coefficients of the estimation model equation are determined by regression analysis using subjective evaluation data and the physical feature quantities corresponding to that data, so the sound image width can be estimated accurately.

A block diagram showing the configuration of the sound image width estimation device according to the first embodiment of the present invention.
A block diagram showing the configuration of the computing means in the sound image width estimation device according to the first embodiment of the present invention.
A flowchart showing the processing flow of the sound image width estimation device according to the first embodiment of the present invention.
A flowchart showing the flow of the process for calculating the weighting coefficients of the estimation model equation in the sound image width estimation device according to the first embodiment of the present invention.
Graphs showing the results of a Pearson correlation analysis between the physical feature quantities used in the estimation model equation for the sound image width and the subjective evaluation data of the sound image width; (1), (2), (3), and (4) show the results when the open G, A, D, and E strings of a violin, respectively, were used as the sound source.
A block diagram showing the configuration of the sound image width estimation device according to the second embodiment of the present invention.
A block diagram showing the configuration of the computing means in the sound image width estimation device according to the second embodiment of the present invention.
A flowchart showing the processing flow of the sound image width estimation device according to the second embodiment of the present invention.
A flowchart showing the flow of the process for calculating the weighting coefficients used to calculate the representative value of the physical feature quantity in the sound image width estimation device according to the second embodiment of the present invention.
A block diagram showing the configuration of the sound image width estimation device according to the third embodiment of the present invention.
A flowchart showing the processing flow of the sound image width estimation device according to the third embodiment of the present invention.
A schematic diagram showing a configuration example of the experimental setup for collecting the subjective evaluation data used to determine the weighting coefficients of the estimation model equation for the sound image width.
A graph showing an example of sound image width estimates calculated by the sound image width estimation device according to the present invention.

Embodiments of the present invention are described below with reference to the drawings as appropriate.
[First Embodiment]
First, the configuration of the sound image width estimation device 100 according to the first embodiment of the present invention is described with reference to FIG. 1. As shown in FIG. 1, the sound image width estimation device 100 comprises a dummy head 1, microphones 2L and 2R, low-pass filters 3L and 3R, AD converters 4L and 4R, computing means 5, and display means 14. The computing means 5 in turn comprises memories 6L and 6R, filter banks (frequency band dividing means) 7L and 7R, frequency band-based physical feature quantity calculating means (frequency band-based feature quantity calculating means) 8_f, physical feature quantity representative value calculating means (physical feature quantity calculating means) 9, sound image width estimated value calculating means (estimated value calculating means) 10, weighting coefficient storage means 11, estimated value weighting coefficient calculating means (weighting coefficient calculating means) 12, and subjective evaluation data storage means 13.

The dummy head 1 is a simulated head for binaurally capturing the sound produced by the sound source SS under test. The microphones 2L and 2R are mounted at the entrances of the left and right ears of the dummy head 1, respectively.

マイクロフォン２Ｌ及び２Ｒは、ダミーヘッド１のそれぞれ左耳及び右耳の入り口部における音源ＳＳから発生する音響を採取する収音手段である。マイクロフォン２Ｌ及び２Ｒで採取されたアナログ音響信号は、それぞれローパスフィルタ３Ｌ及び３Ｒに入力される。
なお、第１実施形態においては、マイクロフォン２Ｌ及び２Ｒは、ダミーヘッド１の左右両耳の入り口部に配置したが、マイクロフォン２Ｌ及び２Ｒをダミーヘッド１の鼓膜部に配置して収音するようにしてもよい。
また、ダミーヘッド１の替わりに、人間の頭部を模した球体を用いてマイクロフォン２Ｌ及び２Ｒを配置するようにしてもよいし、マイクロフォンスタンドを用いた２点マイクロフォンの形態でマイクロフォン２Ｌ及び２Ｒを配置するようにしてもよい。 The microphones 2L and 2R are sound collection means for collecting sound generated from the sound source SS at the entrance of the left and right ears of the dummy head 1, respectively. Analog sound signals collected by the microphones 2L and 2R are input to the low-pass filters 3L and 3R, respectively.
In the first embodiment, the microphones 2L and 2R are arranged at the entrances of the left and right ears of the dummy head 1, but they may instead be arranged at the eardrum positions of the dummy head 1 to collect sound.
Further, instead of the dummy head 1, the microphones 2L and 2R may be arranged on a sphere simulating a human head, or they may be arranged as a two-point microphone pair on a microphone stand.

ローパスフィルタ３Ｌ及び３Ｒは、それぞれマイクロフォン２Ｌ及び２Ｒによって採取されたアナログ音響信号を入力し、入力したアナログ音響信号からサンプリング周波数ｆｓの１／２を超える高周波数成分をＡＤ変換器４Ｌ及び４Ｒによってデジタル化（サンプリング）する前に帯域制限して、折り返し歪みの発生を防止するためのアンチエイリアシングフィルタである。ローパスフィルタ３Ｌ及び３Ｒは、帯域制限したアナログ音響信号を、それぞれＡＤ変換器４Ｌ及び４Ｒに出力する。
なお、人の可聴周波数の上限は２０ｋＨｚであるから、サンプリング周波数ｆｓは、２０ｋＨｚの２倍である４０ｋＨｚ以上とする必要がある。例えば、サンプリング周波数ｆｓ＝４８ｋＨｚとすると、ローパスフィルタ３Ｌ及び３Ｒによって、ｆｓ／２＝２４ｋＨｚを超える周波数成分を帯域制限するようにすればよい。 The low-pass filters 3L and 3R are anti-aliasing filters. They receive the analog audio signals collected by the microphones 2L and 2R, respectively, and band-limit them to remove high-frequency components exceeding 1/2 of the sampling frequency fs before the AD converters 4L and 4R digitize (sample) them, thereby preventing aliasing distortion. The low-pass filters 3L and 3R output the band-limited analog audio signals to the AD converters 4L and 4R, respectively.
Since the upper limit of human audible frequency is 20 kHz, the sampling frequency fs needs to be 40 kHz or more, which is twice 20 kHz. For example, when the sampling frequency is fs = 48 kHz, the frequency components exceeding fs / 2 = 24 kHz may be band-limited by the low-pass filters 3L and 3R.

ＡＤ変換器４Ｌ及び４Ｒは、それぞれローパスフィルタ３Ｌ及び３Ｒによって帯域制限されたアナログ音響信号を入力し、入力したアナログ音響信号を例えば、サンプリング周波数ｆｓ＝４８ｋＨｚでサンプリングしてデジタル信号に変換する。ＡＤ変換器４Ｌ及び４Ｒは、それぞれデジタル信号に変換した左チャンネルの音響信号ｓｌ（ｎ）及び右チャンネルの音響信号ｓｒ（ｎ）（但し、ｎはサンプリングしたデータの番号を示す）を、それぞれ演算手段５のメモリ６Ｌ及び６Ｒに出力する。 The AD converters 4L and 4R receive the analog acoustic signals band-limited by the low-pass filters 3L and 3R, respectively, sample them at a sampling frequency of, for example, fs = 48 kHz, and convert them into digital signals. The AD converters 4L and 4R output the digitized left channel acoustic signal sl (n) and right channel acoustic signal sr (n) (where n is the index of the sampled data) to the memories 6L and 6R of the computing means 5, respectively.

演算手段５は、バイノーラル方式で採取され、ＡＤ変換器４Ｌ及び４Ｒによってデジタル化された音響信号ｓｌ（ｎ）及びｓｒ（ｎ）を入力し、入力した音響信号ｓｌ（ｎ）及びｓｒ（ｎ）を数値演算によって分析することにより音像幅推定値（ハットｙ）を算出する分析手段である。演算手段５は、汎用的なコンピュータを用いて実現することができる。
演算手段５は、算出した音像幅推定値（ハットｙ）を表示手段１４に出力する。
なお、演算手段５の詳細については後記する。 The computing means 5 is analysis means that receives the acoustic signals sl (n) and sr (n), which are collected by the binaural method and digitized by the AD converters 4L and 4R, and calculates a sound image width estimated value (hat y) by analyzing them numerically. The computing means 5 can be realized using a general-purpose computer.
The computing means 5 outputs the calculated sound image width estimated value (hat y) to the display means 14.
Details of the computing means 5 will be described later.

表示手段１４は、演算手段５から入力した音像幅推定値（ハットｙ）を、視認可能に表示する液晶ディスプレイなどの表示装置である。
表示手段１４は、演算手段５から所定の時間間隔ごとに出力される音像幅推定値（ハットｙ）の数値を適宜表示する。なお、表示手段１４は、音像幅推定値（ハットｙ）の経時変化が把握しやすいように、グラフ化して表示するようにしてもよい。 The display unit 14 is a display device such as a liquid crystal display that displays the estimated sound image width (hat y) input from the calculation unit 5 so as to be visible.
The display means 14 displays, as appropriate, the numerical value of the estimated sound image width (hat y) output from the computing means 5 at predetermined time intervals. The display means 14 may also present the values as a graph so that the temporal change of the estimated sound image width (hat y) can be easily grasped.

音源ＳＳは、人間に音像幅を誘起させる音響を発生する音響発生手段である。試験対象である音源ＳＳとしては、楽器やスピーカなど任意の音源を用いることができ、音源ＳＳは、１個であっても複数個であってもよい。 The sound source SS is sound generation means for generating sound that induces a sound image width in humans. As the sound source SS to be tested, an arbitrary sound source such as a musical instrument or a speaker can be used, and the number of sound sources SS may be one or plural.

次に、演算手段５の各部の構成について説明する。
メモリ６Ｌ及び６Ｒは、それぞれＡＤ変換器４Ｌ及び４Ｒから入力した左チャンネルの音響信号ｓｌ（ｎ）及び右チャンネルの音響信号ｓｒ（ｎ）を記憶する記憶手段である。メモリ６Ｌ及び６Ｒに記憶した音響信号ｓｌ（ｎ）及びｓｒ（ｎ）は、それぞれ適宜にフィルタバンク７Ｌ及び７Ｒによって読み出される。 Next, the configuration of each part of the computing means 5 will be described.
The memories 6L and 6R are storage means for storing the left channel acoustic signal sl (n) and the right channel acoustic signal sr (n) input from the AD converters 4L and 4R, respectively. The acoustic signals sl (n) and sr (n) stored in the memories 6L and 6R are read by the filter banks 7L and 7R as appropriate.

フィルタバンク（周波数帯域分割手段）７Ｌ及び７Ｒは、それぞれ互いに異なる複数の周波数帯域ｆを通過する特性を有するバンドパスフィルタ群から構成される。ここで、ｆは周波数帯域を識別する番号を示し、ｆ＝１，２，…，Ｆである。また、Ｆは２以上の整数である。
フィルタバンク７Ｌ及び７Ｒは、それぞれメモリ６Ｌ及６Ｒに記憶された左チャンネルの音響信号ｓｌ（ｎ）及び右チャンネルの音響信号ｓｒ（ｎ）を読み出し、読み出した音響信号ｓｌ（ｎ）及びｓｒ（ｎ）の複数の周波数帯域ｆの周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）を、各バンドパスフィルタの出力の組として得るものである。すなわち、フィルタバンク７Ｌ及び７Ｒは、音響信号ｓｌ（ｎ）及びｓｒ（ｎ）を複数の周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）に分割する周波数帯域分割手段である。フィルタバンク７Ｌ及び７Ｒは、音響信号ｓｌ（ｎ）及びｓｒ（ｎ）の各周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）を、それぞれの周波数帯域ｆに対応する周波数帯域別物理特徴量算出手段８_ｆに出力する。 The filter banks (frequency band dividing means) 7L and 7R are each composed of a group of band-pass filters that pass a plurality of mutually different frequency bands f. Here, f is a number identifying a frequency band, f = 1, 2, ..., F, and F is an integer of 2 or more.
The filter banks 7L and 7R read the left channel acoustic signal sl (n) and the right channel acoustic signal sr (n) stored in the memories 6L and 6R, respectively, and obtain the frequency band components sl (n, f) and sr (n, f) of the read signals for the plurality of frequency bands f as the set of outputs of the band-pass filters. That is, the filter banks 7L and 7R are frequency band dividing means for dividing the acoustic signals sl (n) and sr (n) into a plurality of frequency band components sl (n, f) and sr (n, f). The filter banks 7L and 7R output the frequency band components sl (n, f) and sr (n, f) to the frequency-band physical feature quantity calculating means 8 _f corresponding to each frequency band f.

フィルタバンク７Ｌ及び７Ｒは、例えば、１／６オクターブバンドフィルタなどの等比帯域フィルタ群で構成することができる。好ましくは、周波数帯域幅が１／６オクターブ以下、更に好ましくは１／１２オクターブ以下の狭帯域の１／ｎオクターブバンドフィルタ（ここで、ｎは１以上の整数）を用いることができる。
なお、フィルタ群を構成する各フィルタは、ＦＩＲ（finite impulse response；有限長インパルス応答）フィルタによって構成することができる。 The filter banks 7L and 7R can be configured with a group of equal ratio band filters such as a 1/6 octave band filter, for example. Preferably, a narrow band 1 / n octave band filter (where n is an integer of 1 or more) having a frequency bandwidth of 1/6 octave or less, more preferably 1/12 octave or less can be used.
Each filter in the filter group can be implemented as an FIR (finite impulse response) filter.
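As a rough illustration of the band layout such a filter bank needs, the following sketch lists lower, center, and upper frequencies for 1/n-octave bands. The 1000 Hz reference frequency, the 100 Hz to 10 kHz range, and the function name are assumptions made for illustration; the description only requires constant-ratio bands of 1/6 octave or narrower.

```python
import math

def fractional_octave_bands(n=6, f_low=100.0, f_high=10000.0, f_ref=1000.0):
    """Return a list of (lower, center, upper) frequencies in Hz for 1/n-octave bands.

    Assumption: base-2 spacing around f_ref; none of the default values
    come from the patent text.
    """
    k_min = math.ceil(n * math.log2(f_low / f_ref))
    k_max = math.floor(n * math.log2(f_high / f_ref))
    half = 2.0 ** (1.0 / (2 * n))  # ratio from the center to either band edge
    bands = []
    for k in range(k_min, k_max + 1):
        fc = f_ref * 2.0 ** (k / n)
        bands.append((fc / half, fc, fc * half))
    return bands
```

Each tuple would then parameterize one band-pass filter of the bank; the number of tuples plays the role of F.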

周波数帯域別物理特徴量算出手段（周波数帯域別特徴量算出手段）８_ｆ（ｆ＝１，２，…，Ｆ）は、それぞれフィルタバンク７Ｌ及び７Ｒから音響信号の周波数帯域ｆに対応する左右の周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）を入力し、入力した左右の周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）を分析して、周波数帯域ｆごとの３種類の物理特徴量である周波数帯域別物理特徴量（周波数帯域別特徴量）ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）、ｘ_ｌ（ｆ）を算出して物理特徴量代表値算出手段９に出力する。 Each frequency-band physical feature quantity calculating means (frequency-band feature quantity calculating means) 8 _f (f = 1, 2, ..., F) receives from the filter banks 7L and 7R the left and right frequency band components sl (n, f) and sr (n, f) corresponding to frequency band f, analyzes them to calculate the frequency-band physical feature quantities (frequency-band feature quantities) x _a (f), x _t (f), and x _l (f), which are three types of physical feature quantities for each frequency band f, and outputs them to the physical feature quantity representative value calculating means 9.

物理特徴量代表値算出手段（物理特徴量算出手段）９は、Ｆ個の周波数帯域別物理特徴量算出手段８_ｆ（ｆ＝１，２，…，Ｆ）からＦ組の周波数帯域別物理特徴量ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）、ｘ_ｌ（ｆ）を入力し、入力した周波数帯域別物理特徴量ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）、ｘ_ｌ（ｆ）を物理特徴量の種類ごとに、物理特徴量代表値Ｘ_ａ、Ｘ_ｔ、Ｘ_ｌを算出して音像幅推定値算出手段１０又は推定値重み係数算出手段１２に出力する。
なお、音像幅を推定するための推定モデル式における各物理特徴量代表値Ｘ_ａ、Ｘ_ｔ、Ｘ_ｌに対する重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを算出する場合は、物理特徴量代表値算出手段９は、物理特徴量代表値Ｘ_ａ、Ｘ_ｔ、Ｘ_ｌを推定値重み係数算出手段１２に出力する。また、推定モデル式と重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌとを用いて音像幅を推定する場合は、物理特徴量代表値算出手段９は、物理特徴量代表値Ｘ_ａ、Ｘ_ｔ、Ｘ_ｌを音像幅推定値算出手段１０に出力する。 The physical feature quantity representative value calculating means (physical feature quantity calculating means) 9 receives F sets of frequency-band physical feature quantities x _a (f), x _t (f), and x _l (f) from the F frequency-band physical feature quantity calculating means 8 _f (f = 1, 2, ..., F), calculates the physical feature quantity representative values X _a , X _t , and X _l for each type of physical feature quantity, and outputs them to the sound image width estimated value calculating means 10 or the estimated value weight coefficient calculating means 12.
When the weighting factors C _a , C _t , and C _l for the physical feature quantity representative values X _a , X _t , and X _l in the estimation model formula for estimating the sound image width are to be calculated, the physical feature quantity representative value calculating means 9 outputs the representative values X _a , X _t , and X _l to the estimated value weight coefficient calculating means 12. When the sound image width is to be estimated using the estimation model formula and the weighting factors C _a , C _t , and C _l , the physical feature quantity representative value calculating means 9 outputs the representative values X _a , X _t , and X _l to the sound image width estimated value calculating means 10.

ここで、図２を参照（適宜図１参照）して、周波数帯域別物理特徴量算出手段８_ｆと物理特徴量代表値算出手段９の詳細な構成について説明する。
図２に示すように、周波数帯域別物理特徴量算出手段８_ｆは、窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆと、ＣＣＣ（interaural cross-correlation coefficient；両耳間相互相関係数）算出手段２１_ｆと、レベル算出手段２２Ｌ_ｆ及び２２Ｒ_ｆと、ＩＡＣＣ算出手段２３_ｆと、ＩＴＤ算出手段２４_ｆと、ＩＬＤ算出手段２５_ｆと、ＩＬＤ標準偏差算出手段２６_ｆと、ＩＡＣＣ平均算出手段２７_ｆと、ＩＴＤ標準偏差算出手段２８_ｆと、を備えて構成されている。
また、物理特徴量代表値算出手段９は、ＩＬＤ標準偏差代表値算出手段３０と、ＩＡＣＣ平均代表値算出手段３１と、ＩＴＤ標準偏差代表値算出手段３２と、を備えて構成されている。 Here, with reference to FIG. 2 (refer to FIG. 1 as appropriate), the detailed configuration of the frequency-specific physical feature quantity calculating means 8 _f and the physical feature quantity representative value calculating means 9 will be described.
As shown in FIG. 2, the frequency-band physical feature quantity calculating means 8 _f includes windowing means 20L _f and 20R _f , CCC (interaural cross-correlation coefficient) calculating means 21 _f , level calculating means 22L _f and 22R _f , IACC calculating means 23 _f , ITD calculating means 24 _f , ILD calculating means 25 _f , ILD standard deviation calculating means 26 _f , IACC average calculating means 27 _f , and ITD standard deviation calculating means 28 _f .
The physical feature quantity representative value calculating means 9 includes an ILD standard deviation representative value calculating means 30, an IACC average representative value calculating means 31, and an ITD standard deviation representative value calculating means 32.

窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆは、それぞれフィルタバンク７Ｌ及び７Ｒから対応する周波数帯域ｆの周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）を入力し、入力した周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）に時間窓ｗ（ｎ）を掛けて、順次に所定時間長の信号を切り出す手段である。
窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆは、切り出した信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）を、それぞれレベル算出手段２２Ｌ_ｆ及び２２Ｒ_ｆに出力するとともに、左右のチャンネルの信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）を、ＣＣＣ算出手段２１_ｆに出力する。 The windowing means 20L _f and 20R _f receive the frequency band components sl (n, f) and sr (n, f) of the corresponding frequency band f from the filter banks 7L and 7R, respectively, and the input frequency band components sl ( n, f) and sr (n, f) are multiplied by a time window w (n) to sequentially extract a signal having a predetermined time length.
The windowing means 20L _f and 20R _f output the extracted signal sequences yl _k (n, f) and yr _k (n, f) to the level calculating means 22L _f and 22R _f , respectively, and also output the signal sequences yl _k (n, f) and yr _k (n, f) of the left and right channels to the CCC calculating means 21 _f .

ここで、窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆによって周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）から切り出される信号のデータ数をＮ（Ｎは１以上の整数）とすると、時間窓ｗ（ｎ）は、式（１）によって表すことができる。 Here, assuming that the number of data of signals cut out from the frequency band components sl (n, f) and sr (n, f) by the windowing means 20L _f and 20R _f is N (N is an integer of 1 or more), the time window w (n) can be expressed by equation (1).

なお、時間窓ｗ（ｎ）によって切り出す時間長は、例えば、１０（ｍｓ）〜１００（ｍｓ）とすることができる。
ここで、時間長をｔ（ｍｓ）、ＡＤ変換器４Ｌ及び４Ｒにおけるサンプリング周波数をｆｓ（Ｈｚ）とすると、切り出される信号のデータ数Ｎは、Ｎ＝ｔ１０^−３ｆｓとなる。 In addition, the time length cut out by the time window w (n) can be set to, for example, 10 (ms) to 100 (ms).
Here, assuming that the time length is t (ms) and the sampling frequency of the AD converters 4L and 4R is fs (Hz), the number N of samples in each extracted signal is N = t × 10⁻³ × fs.

また、窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆは、時間領域において、時間窓ｗ（ｎ）によって、それぞれ周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）に対して移動幅ｄ（ｄは１以上の整数）ずつシフトしながら窓掛けして信号列を切り出す。左チャンネル及び右チャンネルの周波数帯域成分ｓｌ（ｎ，ｆ）及びｓｒ（ｎ，ｆ）からｋ番目に切り出される信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）は、それぞれ式（２−１）及び式（２−２）のように表すことができる。 Further, the windowing means 20L _f and 20R _f extract signal sequences in the time domain by applying the time window w (n) to the frequency band components sl (n, f) and sr (n, f), respectively, while shifting the window by a movement width d (d is an integer of 1 or more) each time. The k-th signal sequences yl _k (n, f) and yr _k (n, f) extracted from the left and right channel frequency band components sl (n, f) and sr (n, f) can be expressed by equations (2-1) and (2-2), respectively.

ここで、両耳間時間差をτ（ｍｓ）、サンプリング周波数をｆｓ（Ｈｚ）とすると、移動幅ｄは、ｄ≧τ１０^−３ｆｓとすることができる。すなわち、両耳間時間差τ以上に相当するデータ数ずつ時間窓ｗ（ｎ）によって切り出す位置をシフトするようにすることができる。これによって、後段のＣＣＣ算出手段２１_ｆやレベル算出手段２２Ｌ_ｆ及び２２Ｒ_ｆなどの各分析手段によって移動幅ｄに相当する時間幅を時間分解能とした移動分析を行うことができる。 Here, when the interaural time difference is τ (ms) and the sampling frequency is fs (Hz), the movement width d can be set to d ≧ τ × 10⁻³ × fs. That is, the position extracted by the time window w (n) can be shifted by a number of samples corresponding to at least the interaural time difference τ. As a result, the downstream analysis means, such as the CCC calculating means 21 _f and the level calculating means 22L _f and 22R _f , can perform a moving analysis whose time resolution is the time width corresponding to the movement width d.
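The windowing and shifting described above can be sketched as follows. This is only an illustration under stated assumptions: equations (1), (2-1), and (2-2) are not reproduced in this text, so the sketch assumes a rectangular window (w(n) = 1 inside the frame), and the 50 ms window, 1 ms hop, and function name are hypothetical choices.

```python
def frame_signal(s, fs, window_ms=50.0, hop_ms=1.0):
    """Cut one band signal s into overlapping frames.

    N = window_ms * 1e-3 * fs samples per frame,
    d = hop_ms * 1e-3 * fs samples between frame starts.
    Assumption: rectangular window; the defaults are not from the patent.
    """
    N = int(round(window_ms * 1e-3 * fs))
    d = max(1, int(round(hop_ms * 1e-3 * fs)))  # d must be at least 1 sample
    frames = []
    k = 0
    while k * d + N <= len(s):
        frames.append(s[k * d : k * d + N])  # k-th extracted sequence
        k += 1
    return frames, N, d
```

With fs = 48 kHz, a 50 ms window gives N = 2400 samples and a 1 ms hop gives d = 48 samples. The same framing would be applied to the left and right band signals so that frame k lines up in both channels.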

レベル算出手段２２Ｌ_ｆ及び２２Ｒ_ｆは、それぞれ対応する周波数帯域ｆの窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆから信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）を入力し、入力したｋ番目の信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）における音響エネルギーレベル（以下、レベルと呼ぶ）ｓｌＥ_ｋ（ｆ）及びｓｒＥ_ｋ（ｆ）を、それぞれ式（３−１）及び式（３−２）によって算出して、対応する周波数帯域ｆのＩＬＤ算出手段２５_ｆに出力する。 The level calculating means 22L _f and 22R _f receive the signal sequences yl _k (n, f) and yr _k (n, f) from the windowing means 20L _f and 20R _f of the corresponding frequency band f, respectively, calculate the acoustic energy levels (hereinafter referred to as levels) slE _k (f) and srE _k (f) of the k-th signal sequences by equations (3-1) and (3-2), respectively, and output them to the ILD calculating means 25 _f of the corresponding frequency band f.

ＣＣＣ（interaural cross-correlation coefficient；両耳間相互相関係数）算出手段２１_ｆは、それぞれ対応する周波数帯域ｆの窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆから信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）を入力し、入力したｋ番目の信号列ｙｌ_ｋ（ｎ，ｆ）及びｙｒ_ｋ（ｎ，ｆ）における両耳間相互相関係数ＣＣＣ_ｋ（τ，ｆ）を、式（４）によって算出して、対応する周波数帯域ｆのＩＡＣＣ算出手段２３_ｆ及びＩＴＤ算出手段２４_ｆに出力する。 The CCC (interaural cross-correlation coefficient) calculating means 21 _f receives the signal sequences yl _k (n, f) and yr _k (n, f) from the windowing means 20L _f and 20R _f of the corresponding frequency band f, calculates the interaural cross-correlation coefficient CCC _k (τ, f) of the k-th signal sequences by equation (4), and outputs it to the IACC calculating means 23 _f and the ITD calculating means 24 _f of the corresponding frequency band f.

ＩＡＣＣ（absolute maximum value of the interaural cross-correlation coefficient；両耳間相互相関度）算出手段２３_ｆは、対応する周波数帯域ｆのＣＣＣ算出手段２１_ｆから両耳間相互相関係数ＣＣＣ_ｋ（τ，ｆ）を入力し、入力した両耳間相互相関係数ＣＣＣ_ｋ（τ，ｆ）における最大振幅である両耳間相互相関度ＩＡＣＣ_ｋ（ｆ）を、式（５−１）によって算出して、対応する周波数帯域ｆのＩＡＣＣ平均算出手段２７_ｆに出力する。
なお、ＩＡＣＣ算出手段２３_ｆは、両耳間相互相関度ＩＡＣＣ_ｋ（ｆ）を、式（５−１）に替えて、式（５−２）によって算出するようにしてもよい。 The IACC (absolute maximum value of the interaural cross-correlation coefficient; interaural cross-correlation degree) calculating means 23 _f receives the interaural cross-correlation coefficient CCC _k (τ, f) from the CCC calculating means 21 _f of the corresponding frequency band f, calculates by equation (5-1) the interaural cross-correlation degree IACC _k (f), which is the maximum amplitude of the input CCC _k (τ, f), and outputs it to the IACC average calculating means 27 _f of the corresponding frequency band f.
Note that the IACC calculation unit 23 _f may calculate the interaural cross-correlation degree IACC _k (f) by the equation (5-2) instead of the equation (5-1).

ＩＴＤ（interaural time difference；両耳間時間差）算出手段２４_ｆは、対応する周波数帯域ｆのＣＣＣ算出手段２１_ｆから両耳間相互相関係数ＣＣＣ_ｋ（τ，ｆ）を入力し、式（６−１）によって、入力した両耳間相互相関係数ＣＣＣ_ｋ（τ，ｆ）において最大振幅を与える時間差τを算出し、算出した時間差τを両耳間時間差ＩＴＤ_ｋ（ｆ）として、対応する周波数帯域ｆのＩＴＤ標準偏差算出手段２８_ｆに出力する。
なお、ＩＴＤ算出手段２４_ｆは、両耳間時間差ＩＴＤ_ｋ（ｆ）を、式（６−１）に替えて、式（６−２）によって算出するようにしてもよい。 The ITD (interaural time difference) calculating means 24 _f receives the interaural cross-correlation coefficient CCC _k (τ, f) from the CCC calculating means 21 _f of the corresponding frequency band f, calculates by equation (6-1) the time difference τ that gives the maximum amplitude of the input CCC _k (τ, f), and outputs the calculated time difference τ as the interaural time difference ITD _k (f) to the ITD standard deviation calculating means 28 _f of the corresponding frequency band f.
The ITD calculation unit 24 _f may calculate the interaural time difference ITD _k (f) by the equation (6-2) instead of the equation (6-1).
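The relation between the cross-correlation coefficient, the IACC, and the ITD for one frame can be sketched as below. Equations (4) to (6-2) are not reproduced in this text, so the energy-normalized cross-correlation, the lag sign convention (a positive lag means the right channel lags the left), and both function names are assumptions for illustration only.

```python
import math

def cross_correlation(yl, yr, max_lag):
    """Normalized cross-correlation of one frame for lags -max_lag..max_lag (samples).

    Assumption: normalization by the geometric mean of the frame energies.
    """
    norm = math.sqrt(sum(v * v for v in yl) * sum(v * v for v in yr))
    ccc = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for n in range(len(yl)):
            m = n + lag
            if 0 <= m < len(yr):
                acc += yl[n] * yr[m]
        ccc[lag] = acc / norm if norm > 0.0 else 0.0
    return ccc

def iacc_and_itd(ccc, fs):
    """Return (IACC, ITD in ms): the largest |CCC| and the lag where it occurs."""
    best_lag = max(ccc, key=lambda lag: abs(ccc[lag]))
    return abs(ccc[best_lag]), 1000.0 * best_lag / fs
```

Calling `iacc_and_itd(cross_correlation(yl, yr, max_lag), fs)` on every frame k yields the per-frame sequences from which the time-axis mean and standard deviation are later taken.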

ＩＬＤ（interaural level difference；両耳間レベル差）算出手段２５_ｆは、対応する周波数帯域ｆのレベル算出手段２２Ｌ_ｆ及び２２Ｒ_ｆから、レベルｓｌＥ_ｋ（ｆ）及びｓｒＥ_ｋ（ｆ）を入力し、入力したレベルｓｌＥ_ｋ（ｆ）及びｓｒＥ_ｋ（ｆ）から、式（７）によって両耳間レベル差ＩＬＤ_ｋ（ｆ）を算出して、算出した両耳間レベル差ＩＬＤ_ｋ（ｆ）を、対応する周波数帯域ｆのＩＬＤ標準偏差算出手段２６_ｆに出力する。 The ILD (interaural level difference) calculating means 25 _f receives the levels slE _k (f) and srE _k (f) from the level calculating means 22L _f and 22R _f of the corresponding frequency band f, calculates the interaural level difference ILD _k (f) from the input levels by equation (7), and outputs the calculated interaural level difference ILD _k (f) to the ILD standard deviation calculating means 26 _f of the corresponding frequency band f.
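A minimal sketch of the level and ILD steps follows. Equations (3-1), (3-2), and (7) are not reproduced in this text; the sketch assumes that the level is the sum of squared samples in the frame and that the ILD is the level ratio expressed in decibels. The function names and the small `eps` guard against silent frames are hypothetical.

```python
import math

def frame_energy(frame):
    """Acoustic energy of one frame, taken here as the sum of squared samples."""
    return sum(v * v for v in frame)

def ild_db(yl, yr, eps=1e-12):
    """Interaural level difference of one frame in dB (left relative to right).

    Assumption: 10 * log10 of the energy ratio; eps avoids division by zero.
    """
    return 10.0 * math.log10((frame_energy(yl) + eps) / (frame_energy(yr) + eps))
```

Under this assumption, a left channel with twice the amplitude of the right gives an ILD of about 6 dB.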

ＩＡＣＣ平均算出手段２７_ｆは、対応する周波数帯域ｆのＩＡＣＣ算出手段２３_ｆから両耳間相互相関度ＩＡＣＣ_ｋ（ｆ）を入力し、窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆによって切り出されたすべての区間ｋ（ｋ＝１，２，…，Ｔ）における両耳間相互相関度ＩＡＣＣ_ｋ（ｆ）を入力すると、式（８）によって、時間軸方向における両耳間相互相関度ＩＡＣＣ_ｋ（ｆ）の平均を算出し、算出した平均を周波数帯域別物理特徴量の一つであるｘ_ａ（ｆ）としてＩＡＣＣ平均代表値算出手段３１に出力する。 The IACC average calculating means 27 _f receives the interaural cross-correlation degree IACC _k (f) from the IACC calculating means 23 _f of the corresponding frequency band f. Once it has received IACC _k (f) for all sections k (k = 1, 2, ..., T) extracted by the windowing means 20L _f and 20R _f , it calculates the average of IACC _k (f) along the time axis by equation (8) and outputs the calculated average to the IACC average representative value calculating means 31 as x _a (f), which is one of the frequency-band physical feature quantities.

なお、周波数帯域別物理特徴量ｘ_ａ（ｆ）は、移動幅ｄごとに算出された両耳間相互相関度ＩＡＣＣ_ｋ（ｆ）の単純平均としたが、これに限定されるものではなく、重み付き平均を用いるようにしてもよいし、最大値又は中央値などを用いるようにしてもよい。 The physical feature amount x _a (f) for each frequency band is a simple average of the interaural cross-correlation degree IACC _k (f) calculated for each movement width d, but is not limited thereto. A weighted average may be used, or a maximum value or a median value may be used.

ＩＴＤ標準偏差算出手段２８_ｆは、対応する周波数帯域ｆのＩＴＤ算出手段２４_ｆから両耳間時間差ＩＴＤ_ｋ（ｆ）を入力し、窓掛け手段２０Ｌ_ｆ及び２０Ｒ_ｆによって切り出されたすべての区間ｋ（ｋ＝１，２，…，Ｔ）における両耳間時間差ＩＴＤ_ｋ（ｆ）を入力すると、式（９）によって、時間軸方向における両耳間時間差ＩＴＤ_ｋ（ｆ）の標準偏差を算出し、算出した標準偏差を周波数帯域別物理特徴量の一つであるｘ_ｔ（ｆ）としてＩＴＤ標準偏差代表値算出手段３２に出力する。 The ITD standard deviation calculating means 28 _f receives the interaural time difference ITD _k (f) from the ITD calculating means 24 _f of the corresponding frequency band f. Once it has received ITD _k (f) for all sections k (k = 1, 2, ..., T) extracted by the windowing means 20L _f and 20R _f , it calculates the standard deviation of ITD _k (f) along the time axis by equation (9) and outputs the calculated standard deviation to the ITD standard deviation representative value calculating means 32 as x _t (f), which is one of the frequency-band physical feature quantities.

ＩＬＤ標準偏差算出手段２６_ｆは、対応する周波数帯域ｆのＩＬＤ算出手段２５_ｆから両耳間レベル差ＩＬＤ_ｋ（ｆ）を入力し、２０Ｌ_ｆ及び２０Ｒ_ｆによって切り出されたすべての区間ｋ（ｋ＝１，２，…，Ｔ）における両耳間レベル差ＩＬＤ_ｋ（ｆ）を入力すると、式（１０）によって、時間軸方向における両耳間レベル差ＩＬＤ_ｋ（ｆ）の標準偏差を算出し、算出した標準偏差を周波数帯域別物理特徴量の一つであるｘ_ｌ（ｆ）としてＩＬＤ標準偏差代表値算出手段３０に出力する。 The ILD standard deviation calculating means 26 _f receives the interaural level difference ILD _k (f) from the ILD calculating means 25 _f of the corresponding frequency band f. Once it has received ILD _k (f) for all sections k (k = 1, 2, ..., T) extracted by the windowing means 20L _f and 20R _f , it calculates the standard deviation of ILD _k (f) along the time axis by equation (10) and outputs the calculated standard deviation to the ILD standard deviation representative value calculating means 30 as x _l (f), which is one of the frequency-band physical feature quantities.

ＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０は、それぞれＩＡＣＣ平均算出手段２７_ｆ、ＩＴＤ標準偏差算出手段２８_ｆ及びＩＬＤ標準偏差算出手段２６_ｆから周波数帯域ｆごとに算出された周波数帯域別物理特徴量ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）及びｘ_ｌ（ｆ）を入力し、それぞれ入力した周波数帯域別物理特徴量ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）及びｘ_ｌ（ｆ）の代表値である物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを算出する。ＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０は、それぞれ算出した物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを音像幅推定値算出手段１０又は推定値重み係数算出手段１２に出力する。 The IACC average representative value calculating means 31, the ITD standard deviation representative value calculating means 32, and the ILD standard deviation representative value calculating means 30 receive the frequency-band physical feature quantities x _a (f), x _t (f), and x _l (f) calculated for each frequency band f from the IACC average calculating means 27 _f , the ITD standard deviation calculating means 28 _f , and the ILD standard deviation calculating means 26 _f , respectively, and calculate the physical feature quantity representative values X _a , X _t , and X _l , which are representative values of the input x _a (f), x _t (f), and x _l (f). They then output the calculated representative values X _a , X _t , and X _l to the sound image width estimated value calculating means 10 or the estimated value weight coefficient calculating means 12.

前記したように、音像幅を推定するための推定モデル式における各物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌに対する重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを算出する場合は、物理特徴量代表値算出手段９の構成要素であるＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０は、それぞれ物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを推定値重み係数算出手段１２に出力する。また、推定モデル式と重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌとを用いて音像幅を推定する場合は、ＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０は、それぞれ物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを音像幅推定値算出手段１０に出力する。 As described above, when the weighting factors C _a , C _t , and C _l for the physical feature quantity representative values X _a , X _t , and X _l in the estimation model formula for estimating the sound image width are to be calculated, the IACC average representative value calculating means 31, the ITD standard deviation representative value calculating means 32, and the ILD standard deviation representative value calculating means 30, which are constituent elements of the physical feature quantity representative value calculating means 9, output the representative values X _a , X _t , and X _l , respectively, to the estimated value weight coefficient calculating means 12. When the sound image width is to be estimated using the estimation model formula and the weighting factors C _a , C _t , and C _l , they output the representative values X _a , X _t , and X _l , respectively, to the sound image width estimated value calculating means 10.

ここで、第１実施形態におけるＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０は、それぞれ周波数帯域別物理特徴量ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）及びｘ_ｌ（ｆ）の代表値として、式（１１−１）、式（１１−２）及び式（１１−３）によって、周波数帯域ｆごとに算出した周波数帯域別物理特徴量ｘ_ａ（ｆ）、ｘ_ｔ（ｆ）及びｘ_ｌ（ｆ）の平均を算出して物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌとする。 Here, the IACC average representative value calculating means 31, the ITD standard deviation representative value calculating means 32, and the ILD standard deviation representative value calculating means 30 in the first embodiment calculate, as the representative values of the frequency-band physical feature quantities x _a (f), x _t (f), and x _l (f), the averages over the frequency bands f of x _a (f), x _t (f), and x _l (f) by equations (11-1), (11-2), and (11-3), respectively, and use these averages as the physical feature quantity representative values X _a , X _t , and X _l .
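The two aggregation stages, first along the time axis within each band and then across bands, can be sketched as follows. Equations (8) to (11-3) are not reproduced in this text, so the use of the population standard deviation (division by T rather than T - 1) and the function names are assumptions.

```python
from statistics import mean, pstdev

def band_features(iacc_k, itd_k, ild_k):
    """Per-band features from per-frame values (k = 1..T).

    Returns (x_a, x_t, x_l): mean IACC, std of ITD, std of ILD.
    Assumption: population standard deviation.
    """
    return mean(iacc_k), pstdev(itd_k), pstdev(ild_k)

def representative_values(per_band):
    """Average each feature type over the F bands.

    per_band is a list of (x_a, x_t, x_l) tuples, one per band f.
    Returns (X_a, X_t, X_l).
    """
    x_a, x_t, x_l = zip(*per_band)
    return mean(x_a), mean(x_t), mean(x_l)
```

Reducing the F per-band tuples to three scalars is what keeps the estimation model down to three weighting factors.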

このように、推定モデル式で用いる物理特徴量として、周波数帯域ごとに算出した周波数帯域別物理特徴量を物理特徴量の種別ごとに一つの値に集約した代表値を用いることにより、推定モデル式における重み係数の個数を低減することができ、音像幅推定値（ハットｙ）の算出や重み係数を定めるための主観評価データの採取を簡略化することができる。 In this way, by using as the physical feature quantities in the estimation model formula representative values that aggregate the frequency-band physical feature quantities into a single value for each type of physical feature quantity, the number of weighting factors in the estimation model formula can be reduced, and both the calculation of the estimated sound image width (hat y) and the collection of subjective evaluation data for determining the weighting factors can be simplified.

図１に戻って（適宜図２参照）、音像幅推定装置１００の構成について説明を続ける。
音像幅推定値算出手段（推定値算出手段）１０は、ＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０から、それぞれ物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを入力するとともに、重み係数記憶手段１１から、予め推定値重み係数算出手段１２によって算出して記憶しておいた重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを読み出し、式（１２）に示した推定モデル式によって、音像幅の推定値（ハットｙ）を算出して、算出した推定値（ハットｙ）を表示手段１４に出力する。 Returning to FIG. 1 (see FIG. 2 as appropriate), the description of the configuration of the sound image width estimation apparatus 100 will be continued.
The sound image width estimated value calculating means (estimated value calculating means) 10 receives the physical feature quantity representative values X _a , X _t , and X _l from the IACC average representative value calculating means 31, the ITD standard deviation representative value calculating means 32, and the ILD standard deviation representative value calculating means 30, respectively, and reads from the weight coefficient storage means 11 the weight coefficients C _a , C _t , and C _l calculated and stored in advance by the estimated value weight coefficient calculating means 12. It then calculates the estimated value (hat y) of the sound image width by the estimation model formula shown in equation (12) and outputs the calculated estimated value (hat y) to the display means 14.

式（１２）に示したように、第１実施形態における音像幅推定値（ハットｙ）は、３つの物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを要素とする３次元ベクトルの絶対値として算出することができる。 As shown in Expression (12), the estimated sound image width (hat y) in the first embodiment is an absolute value of a three-dimensional vector having three physical feature quantity representative values X _a , X _t, and X _l as elements. Can be calculated as
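For orientation only, one form of such a model is sketched below. Equation (12) is not reproduced in this text; the sketch assumes a weighted vector magnitude, hat y = sqrt(C_a X_a² + C_t X_t² + C_l X_l²), which is consistent with the description of a three-dimensional vector magnitude but is not confirmed by the source. The function name is hypothetical.

```python
import math

def estimate_width(X_a, X_t, X_l, C_a, C_t, C_l):
    """Estimated sound image width under an ASSUMED weighted-magnitude model.

    hat_y = sqrt(C_a*X_a^2 + C_t*X_t^2 + C_l*X_l^2); the exact form of
    equation (12) is not available here.
    """
    radicand = C_a * X_a ** 2 + C_t * X_t ** 2 + C_l * X_l ** 2
    if radicand < 0.0:
        raise ValueError("weights give a negative value under the square root")
    return math.sqrt(radicand)
```

With all three weights equal to 1, this reduces to the ordinary Euclidean length of the vector (X_a, X_t, X_l).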

重み係数記憶手段１１は、推定値重み係数算出手段１２によって算出した式（１２）に示した推定モデル式の重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを記憶する記憶手段である。重み係数記憶手段１１に記憶した重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌは、音像幅の推定を行う際に、音像幅推定値算出手段１０によって読み出され、音像幅推定値（ハットｙ）の算出に用いられる。 The weighting factor storage unit 11 is a storage unit that stores the weighting factors C _a , C _t, and C _l of the estimation model formula shown in Formula (12) calculated by the estimated value weighting factor calculation unit 12. The weight coefficients C _a , C _t, and C _l stored in the weight coefficient storage unit 11 are read out by the sound image width estimated value calculation unit 10 when the sound image width is estimated, and the sound image width estimated value (hat y) is calculated. Used for calculation.

推定値重み係数算出手段１２は、主観評価データ記憶手段１３に予め記憶しておいた主観評価データｙ_ｉを読み出すとともに、ＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０から、それぞれ当該主観評価データｙ_ｉに対応する物理特徴量代表値Ｘ_ａｉ、Ｘ_ｔｉ及びＸ_ｌｉを入力し、入力した主観評価データｙ_ｉと物理特徴量代表値Ｘ_ａｉ、Ｘ_ｔｉ及びＸ_ｌｉとからなる複数組のデータを用いて、式（１２）に示した推定モデル式の重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを回帰分析の手法である最小二乗法によって算出し、算出した重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを重み係数記憶手段１１に記憶する。なお、ｉは、個々の主観評価データを識別する番号である。 The estimated value weight coefficient calculating means 12 reads subjective evaluation data y _i stored in the subjective evaluation data storage means 13 in advance, and also includes an IACC average representative value calculating means 31, an ITD standard deviation representative value calculating means 32, and an ILD standard. The physical feature quantity representative values X _ai , X _ti, and X _li respectively corresponding to the subjective evaluation data y _i are input from the deviation representative value calculation means 30, and the input subjective evaluation data y _i and physical feature quantity representative value X _ai are input. , X _ti and X _li are used to calculate the weighting factors C _a , C _t and C _l of the estimated model equation shown in Equation (12) by the least square method which is a regression analysis method. Then, the calculated weighting factors C _a , C _t and C _l are stored in the weighting factor storage means 11. Note that i is a number for identifying individual subjective evaluation data.

ここで、主観評価データｙ_ｉに対応する物理特徴量代表値Ｘ_ａｉ、Ｘ_ｔｉ及びＸ_ｌｉとは、当該主観評価データｙ_ｉを得たときの被験者と同じ音場条件で、ダミーヘッド１に取り付けられたマイクロフォン２Ｌ及び２Ｒを用いて音響信号を採取し、前記した各分析手段を用いて最終的にＩＡＣＣ平均代表値算出手段３１、ＩＴＤ標準偏差代表値算出手段３２及びＩＬＤ標準偏差代表値算出手段３０から出力される物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌのことである。 Here, the physical feature quantity representative values X _ai , X _ti , and X _li corresponding to the subjective evaluation data y _i are the representative values X _a , X _t , and X _l that are finally output from the IACC average representative value calculating means 31, the ITD standard deviation representative value calculating means 32, and the ILD standard deviation representative value calculating means 30 when acoustic signals are collected with the microphones 2L and 2R attached to the dummy head 1, under the same sound field conditions as those of the subject when the subjective evaluation data y _i was obtained, and are processed by the analysis means described above.

次に、第１実施形態における推定値重み係数算出手段１２による重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌの算出手法について説明する。
第１実施形態では、３つの物理特徴量代表値Ｘ_ａ、Ｘ_ｔ及びＸ_ｌを説明変数とし、音像幅ｙを目的変数とする式（１２）に示した推定モデル式において、回帰分析の手法である最小二乗法によって重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを算出する。すなわち、音像幅の主観評価データｙ_ｉと推定モデル式によって算出される予測値（ハットｙ_ｉ）との組を予め用意しておき、最小二乗法によって重み係数Ｃ_ａ、Ｃ_ｔ及びＣ_ｌを算出する。 Next, a method for calculating the weighting factors C _a , C _t , and C _l by the estimated value weight coefficient calculating means 12 in the first embodiment will be described.
In the first embodiment, in the estimation model formula shown in Formula (12) in which three physical feature quantity representative values X _a , X _t and X _l are explanatory variables and the sound image width y is an objective variable, a regression analysis method is used. The weighting coefficients C _a , C _t and C _l are calculated by the least square method. That is, a set of the subjective evaluation data y _i of the sound image width and the predicted value (hat y _i ) calculated by the estimation model formula is prepared in advance, and the weight coefficients C _a , C _t, and C _l are calculated by the least square method. calculate.

As shown in Equation (13), the weight coefficients C_a, C_t and C_l are calculated so as to minimize the sum of squares J of the differences between the subjective evaluation data y_i and the predicted values ŷ_i calculated by the estimation model formula. In Equation (13), S is the number of subjective evaluation data items.

Here, to simplify the calculation, if the objective variable is taken for convenience to be the square of the sound image width y_i, the weight coefficients C_a, C_t and C_l that minimize the sum of squares J shown in Equation (14) are calculated.

Substituting Equation (12) into the estimated value ŷ_i of Equation (14), the sum of squares J can be expressed as Equation (15).

Here, the condition for the sum of squares J to be a minimum is that the partial derivatives of J with respect to each of the weight coefficients C_a, C_t and C_l, shown in Equation (16), are zero.

This yields the simultaneous equations shown in Equation (17).

Here, in the simultaneous equations of Equation (17), whose unknowns are the weight coefficients C_a, C_t and C_l, the coefficients applied to C_a, C_t and C_l are defined as a_11 to a_33 and b_1 to b_3, as shown in Equation (18).

Using a_11 to a_33 and b_1 to b_3 defined in Equation (18), Equation (17) can be expressed as Equation (19-1). Equation (19-1) can then be transformed into Equation (19-2).

Here, as shown in Equation (17), a_11 to a_33 and b_1 to b_3 can be calculated from the subjective evaluation data y_i and the physical feature quantity representative values X_ai, X_ti and X_li obtained by analyzing the acoustic signal collected at the position of the subject when the subjective evaluation data y_i were obtained.

By the procedure described above, the estimated value weight coefficient calculation means 12 can calculate the weight coefficients C_a, C_t and C_l.
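The least squares procedure described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes that Equation (12) is a linear model ŷ = C_a·X_a + C_t·X_t + C_l·X_l with no constant term (which is what a 3×3 system with coefficients a_11 to a_33 and b_1 to b_3 suggests), it ignores the squared objective variable mentioned for Equation (14), and the function name `fit_weights` is hypothetical. The sums used for a_jk and b_j are the standard normal-equation sums; Equation (18) may scale them differently.

```python
def fit_weights(X, y):
    """Solve the 3x3 normal equations A C = b for C = (C_a, C_t, C_l).

    X: list of S rows (X_ai, X_ti, X_li); y: list of S subjective ratings y_i.
    Assumes the no-intercept linear model y_hat = C_a*X_a + C_t*X_t + C_l*X_l.
    """
    n = 3
    # a_jk = sum_i X_ji * X_ki and b_j = sum_i X_ji * y_i (assumed definitions)
    A = [[sum(row[j] * row[k] for row in X) for k in range(n)] for j in range(n)]
    b = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix [A | b]
    M = [A[j] + [b[j]] for j in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    C = [0.0] * n
    for j in reversed(range(n)):
        C[j] = (M[j][n] - sum(M[j][k] * C[k] for k in range(j + 1, n))) / M[j][j]
    return C
```

At least three linearly independent data sets are needed; otherwise the system is singular and the sketch raises an error.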

The subjective evaluation data storage means 13 is a storage means that stores the subjective evaluation data y_i of the sound image width used for calculating the weight coefficients C_a, C_t and C_l. The subjective evaluation data y_i stored in the subjective evaluation data storage means 13 are read out by the estimated value weight coefficient calculation means 12 and used to calculate the weight coefficients C_a, C_t and C_l.

The configuration of the sound image width estimation apparatus 100 has been described above, but the present invention is not limited to it. For example, the calculation means 5 of the sound image width estimation apparatus 100 can be realized by having a general-purpose computer execute a program that operates the arithmetic unit and storage devices of the computer. This program (the sound image width estimation program) can be distributed over a communication line, or written to a recording medium such as a CD-ROM and distributed.

Next, the operation of the sound image width estimation apparatus 100 will be described with reference to FIG. 3 (and to FIGS. 1 and 2 as appropriate).
As shown in FIG. 3, the sound image width estimation apparatus 100 first uses the estimated value weight coefficient calculation means 12 to calculate the weight coefficients C_a, C_t and C_l of the estimation model formula shown in Equation (12), and stores them in the weight coefficient storage means 11 (step S10). If the weight coefficients C_a, C_t and C_l are already stored in the weight coefficient storage means 11, this weight coefficient calculation step can be omitted. The details of the weight coefficient calculation step are described later.

Next, the sound image width estimation apparatus 100 collects, by the binaural method, the sound generated by the sound source SS under test with the microphones 2L and 2R attached to the dummy head 1. The collected analog acoustic signals are passed through the low-pass filters 3L and 3R, converted into digital acoustic signals sl(n) and sr(n) by the AD converters 4L and 4R, and stored in the memories 6L and 6R (step S11).

The sound image width estimation apparatus 100 uses the filter banks 7L and 7R to read out the acoustic signals sl(n) and sr(n) stored in the memories 6L and 6R in step S11, divides them into frequency band components sl(n, f) and sr(n, f) for a plurality of frequency bands f, and outputs them to the windowing means 20L_f and 20R_f of the frequency band-specific physical feature quantity calculation means 8_f for the corresponding frequency band f (step S12).
Here, the sound image width estimation apparatus 100 uses 1/6-octave band filters as the filter banks 7L and 7R.
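To make the 1/6-octave division concrete, the following sketch lists the band edges such a filter bank could use. It is illustrative only: the text specifies the bandwidth but not the frequency range or the filter design, so the geometric spacing of centre frequencies, the range arguments and the function name `fractional_octave_bands` are all assumptions.

```python
def fractional_octave_bands(f_low, f_high, fraction=6):
    """Return (lower, centre, upper) edges in Hz for 1/fraction-octave bands.

    Centres are spaced by a ratio of 2**(1/fraction); each band spans
    centre * 2**(-1/(2*fraction)) to centre * 2**(+1/(2*fraction)).
    """
    step = 2.0 ** (1.0 / fraction)
    half = 2.0 ** (1.0 / (2 * fraction))
    bands = []
    centre = f_low
    # Small tolerance so that rounding does not drop the last band
    while centre <= f_high * (1 + 1e-9):
        bands.append((centre / half, centre, centre * half))
        centre *= step
    return bands
```

With `fraction=6`, one octave is covered by six adjacent bands whose upper and lower edges coincide; passing `fraction=12` gives the narrower 1/12-octave division.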

Using the windowing means 20L_f and 20R_f, the sound image width estimation apparatus 100 multiplies the frequency band components sl(n, f) and sr(n, f) of the corresponding frequency band f, input from the filter banks 7L and 7R in step S12, by a time window w(n), and sequentially cuts out acoustic signals yl_k(n, f) and yr_k(n, f) of a predetermined time length at positions shifted by a predetermined shift width d.
The sound image width estimation apparatus 100 sequentially outputs the left-channel acoustic signal yl_k(n, f) cut out by the windowing means 20L_f to the level calculation means 22L_f and the CCC calculation means 21_f of the corresponding frequency band f, and sequentially outputs the right-channel acoustic signal yr_k(n, f) cut out by the windowing means 20R_f to the level calculation means 22R_f and the CCC calculation means 21_f of the corresponding frequency band f (step S13).
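The windowing of step S13 can be sketched as below. The window shape is not given in this passage, so the Hann window is an assumption, as are the function names `hann` and `frames`; `length` stands for the predetermined time length in samples and `hop` for the shift width d.

```python
import math

def hann(length):
    """Hann window; the shape is an assumption, w(n) is not specified here."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def frames(signal, length, hop, window=None):
    """Cut windowed segments y_k(n) of `length` samples, each shifted by `hop` (= d)."""
    w = window if window is not None else hann(length)
    out = []
    for start in range(0, len(signal) - length + 1, hop):
        out.append([signal[start + n] * w[n] for n in range(length)])
    return out
```

The same call would be applied to sl(n, f) and sr(n, f) of every band so that the k-th left and right segments cover the same time span.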

Using the level calculation means 22L_f and 22R_f, the sound image width estimation apparatus 100 calculates the levels slE_k(f) and srE_k(f) from the acoustic signals yl_k(n, f) and yr_k(n, f) of the predetermined time length that were sequentially input from the windowing means 20L_f and 20R_f in step S13, and sequentially outputs them to the ILD calculation means 25_f corresponding to each frequency band f (step S14).
In parallel, using the CCC calculation means 21_f, the sound image width estimation apparatus 100 calculates the interaural cross-correlation coefficient CCC_k(f) from the acoustic signals yl_k(n, f) and yr_k(n, f) sequentially input from the windowing means 20L_f and 20R_f in step S13, and sequentially outputs it to the IACC calculation means 23_f and the ITD calculation means 24_f corresponding to each frequency band f (step S14).

Using the IACC calculation means 23_f, the sound image width estimation apparatus 100 calculates the interaural cross-correlation degree IACC_k(f) from the interaural cross-correlation coefficient CCC_k(f) input from the CCC calculation means 21_f in step S14, and sequentially outputs it to the IACC average calculation means 27_f (step S15).
In parallel, using the ITD calculation means 24_f, the sound image width estimation apparatus 100 calculates the interaural time difference ITD_k(f) from the interaural cross-correlation coefficient CCC_k(f) input from the CCC calculation means 21_f in step S14, and sequentially outputs it to the ITD standard deviation calculation means 28_f (step S15).
Also in parallel, using the ILD calculation means 25_f, the sound image width estimation apparatus 100 calculates the interaural level difference ILD_k(f) from the levels slE_k(f) and srE_k(f) input from the level calculation means 22L_f and 22R_f in step S14, and sequentially outputs it to the ILD standard deviation calculation means 26_f (step S15).
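Steps S14 and S15 can be illustrated for a single pair of windowed segments as follows. The defining equations are not part of this passage, so the sketch uses common textbook definitions as assumptions: a normalised cross-correlation over a range of lags, IACC as its peak magnitude, ITD as the lag of that peak, and ILD as the decibel ratio of the two segment energies. All function names, the lag range and the sign convention (a positive lag means the right channel lags the left) are hypothetical.

```python
import math

def cross_correlation(yl, yr, max_lag):
    """Normalised cross-correlation of one frame pair for lags -max_lag..max_lag."""
    norm = math.sqrt(sum(v * v for v in yl) * sum(v * v for v in yr))
    n = len(yl)
    ccc = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += yl[i] * yr[j]
        ccc[lag] = acc / norm if norm > 0 else 0.0
    return ccc

def iacc_and_itd(ccc, sample_rate):
    """IACC = peak magnitude of the correlation; ITD = lag of that peak, in seconds."""
    best_lag = max(ccc, key=lambda lag: abs(ccc[lag]))
    return abs(ccc[best_lag]), best_lag / sample_rate

def ild_db(level_l, level_r):
    """Interaural level difference in dB from the two frame energies."""
    return 10.0 * math.log10(level_l / level_r)
```

Running these functions on every segment k of every band f would yield the sequences IACC_k(f), ITD_k(f) and ILD_k(f) that the later steps summarise.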

Using the IACC average calculation means 27_f, the sound image width estimation apparatus 100 calculates the average in the time axis direction of the interaural cross-correlation degree IACC_k(f) input from the IACC calculation means 23_f in step S15, and sequentially outputs this average to the IACC average representative value calculation means 31 as the frequency band-specific physical feature quantity x_a(f) (step S16).
In parallel, using the ITD standard deviation calculation means 28_f, the sound image width estimation apparatus 100 calculates the standard deviation in the time axis direction of the interaural time difference ITD_k(f) input from the ITD calculation means 24_f in step S15, and sequentially outputs this standard deviation to the ITD standard deviation representative value calculation means 32 as the frequency band-specific physical feature quantity x_t(f) (step S16).
Also in parallel, using the ILD standard deviation calculation means 26_f, the sound image width estimation apparatus 100 calculates the standard deviation in the time axis direction of the interaural level difference ILD_k(f) input from the ILD calculation means 25_f in step S15, and sequentially outputs this standard deviation to the ILD standard deviation representative value calculation means 30 as the frequency band-specific physical feature quantity x_l(f) (step S16).
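The time-axis statistics of step S16 reduce each band's per-segment sequences to three numbers. A minimal sketch, assuming the population standard deviation (the text does not say whether the population or sample form is used) and a hypothetical function name `band_features`:

```python
import statistics

def band_features(iacc_frames, itd_frames, ild_frames):
    """Per-band features from per-segment values IACC_k(f), ITD_k(f), ILD_k(f)."""
    x_a = statistics.fmean(iacc_frames)   # time-axis mean of IACC
    x_t = statistics.pstdev(itd_frames)   # time-axis standard deviation of ITD
    x_l = statistics.pstdev(ild_frames)   # time-axis standard deviation of ILD
    return x_a, x_t, x_l
```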

Using the IACC average representative value calculation means 31, the sound image width estimation apparatus 100 calculates the average of the frequency band-specific physical feature quantities x_a(f) input from the IACC average calculation means 27_f in step S16, and outputs this average to the sound image width estimated value calculation means 10 as the physical feature quantity representative value X_a (step S17).
In parallel, using the ITD standard deviation representative value calculation means 32, the sound image width estimation apparatus 100 calculates the average of the frequency band-specific physical feature quantities x_t(f) input from the ITD standard deviation calculation means 28_f in step S16, and outputs this average to the sound image width estimated value calculation means 10 as the physical feature quantity representative value X_t (step S17).
Also in parallel, using the ILD standard deviation representative value calculation means 30, the sound image width estimation apparatus 100 calculates the average of the frequency band-specific physical feature quantities x_l(f) input from the ILD standard deviation calculation means 26_f in step S16, and outputs this average to the sound image width estimated value calculation means 10 as the physical feature quantity representative value X_l (step S17).

Using the sound image width estimated value calculation means 10, the sound image width estimation apparatus 100 calculates the sound image width estimated value ŷ by Equation (12) from the physical feature quantity representative values X_a, X_t and X_l input from the IACC average representative value calculation means 31, the ITD standard deviation representative value calculation means 32 and the ILD standard deviation representative value calculation means 30 in step S17, and from the weight coefficients C_a, C_t and C_l stored in the weight coefficient storage means 11 by the estimated value weight coefficient calculation means 12 in step S10, and outputs it to the display means 14 (step S18).
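Steps S17 and S18 together amount to averaging each feature over the frequency bands and applying the model. The sketch below assumes, as an illustration only, that Equation (12) is the linear combination ŷ = C_a·X_a + C_t·X_t + C_l·X_l; the function names are hypothetical.

```python
def representative_value(per_band):
    """First-embodiment representative value: plain mean over frequency bands."""
    return sum(per_band) / len(per_band)

def estimate_width(x_a, x_t, x_l, weights):
    """Apply the assumed linear model y_hat = C_a*X_a + C_t*X_t + C_l*X_l.

    x_a, x_t, x_l: per-band features x_a(f), x_t(f), x_l(f);
    weights: (C_a, C_t, C_l) as stored in step S10.
    """
    c_a, c_t, c_l = weights
    X_a = representative_value(x_a)
    X_t = representative_value(x_t)
    X_l = representative_value(x_l)
    return c_a * X_a + c_t * X_t + c_l * X_l
```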

Using the display means 14, the sound image width estimation apparatus 100 visibly displays the sound image width estimated value ŷ input from the sound image width estimated value calculation means 10 in step S18 (step S19).
Through the above processing, the sound image width estimation apparatus 100 can estimate the sound image width.

Next, the operation of the sound image width estimation apparatus 100 in the weight coefficient calculation step of the estimation model formula (step S10) shown in FIG. 3 will be described with reference to FIG. 4 (and to FIGS. 1 and 2 as appropriate).
As shown in FIG. 4, the sound image width estimation apparatus 100 first receives, through an input means (not shown), the subjective evaluation data y_i obtained by a subjective evaluation performed in advance, and stores them in the subjective evaluation data storage means 13 (step S30).

Next, the sound image width estimation apparatus 100 collects, by the binaural method with the microphones 2L and 2R, the acoustic signal corresponding to the subjective evaluation data y_i input in step S30. The collected analog acoustic signals are passed through the low-pass filters 3L and 3R, converted into digital acoustic signals sl(n) and sr(n) by the AD converters 4L and 4R, and stored in the memories 6L and 6R (step S31).

The processing in steps S32 to S37 is the same as the processing in steps S12 to S17 shown in FIG. 3, respectively, so its description is omitted.
The sound image width estimation apparatus 100 repeats the processing of steps S31 to S37 for each of the S subjective evaluation data items y_i input in step S30, and accumulates in the estimated value weight coefficient calculation means 12 S sets of data, each consisting of subjective evaluation data y_i and the physical feature quantity representative values X_ai, X_ti and X_li.

Using the estimated value weight coefficient calculation means 12, the sound image width estimation apparatus 100 calculates the weight coefficients C_a, C_t and C_l by the least squares method (step S38) from the S sets of data consisting of the physical feature quantity representative values X_ai, X_ti and X_li input from the IACC average representative value calculation means 31, the ITD standard deviation representative value calculation means 32 and the ILD standard deviation representative value calculation means 30 in step S37, and the subjective evaluation data y_i that were input from the input means (not shown) and stored in the subjective evaluation data storage means 13 in step S30. It then stores the calculated weight coefficients C_a, C_t and C_l in the weight coefficient storage means 11 (step S39).
With this, the sound image width estimation apparatus 100 ends the weight coefficient calculation processing for the estimation model formula.

Next, with reference to FIG. 5 (and to FIGS. 1 and 2 as appropriate), the results of a Pearson correlation analysis between the physical feature quantities used in the sound image width estimation model formula of the present invention shown in Equation (12) and the subjective evaluation data of the sound image width will be described.

Panels (1) to (4) of FIG. 5 show the results of the Pearson correlation analysis using, as sound sources, sustained tones of the open G, A, D and E strings of a violin, respectively. In panels (1) to (4) of FIG. 5, the horizontal axis indicates the frequency bandwidth into which the filter banks 7L and 7R divide the signal; from left to right in each panel, the cases shown are no frequency band division (one band) and 1/1-octave bands through 1/96-octave bands. The vertical axis indicates the Pearson correlation coefficient. The data plotted with “◆”, “□” and “▲” show the correlation coefficients between the subjective evaluation data of the sound image width and, respectively, the representative value X_a (the average over frequency bands of the interaural cross-correlation degree), the representative value X_t (the average over frequency bands of the standard deviation of the interaural time difference) and the representative value X_l (the average over frequency bands of the standard deviation of the interaural level difference).

All of the results show that the correlation becomes higher as the frequency bandwidth is narrowed. In particular, the correlation is high for bandwidths of 1/6 octave or less, and for narrow bands of 1/12 octave or less the value of the correlation coefficient saturates.
These results show that the sound image width can be predicted with stable accuracy by setting the frequency bandwidth into which the filter banks 7L and 7R divide the signal to preferably 1/6 octave or less, and more preferably 1/12 octave or less.
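For reference, the Pearson correlation coefficient plotted on the vertical axis of FIG. 5 can be computed as in the sketch below. The function name `pearson` is hypothetical and no data from FIG. 5 is reproduced; `xs` would hold one representative value (for example X_a) per stimulus and `ys` the corresponding subjective evaluation data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Repeating this calculation for each bandwidth setting of the filter banks would give one point per setting, which is how a curve like those in FIG. 5 could be assembled.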

[Second Embodiment]
Next, a sound image width estimation apparatus 100A according to a second embodiment of the present invention will be described with reference to FIGS. 6 and 7.
As shown in FIG. 6, the sound image width estimation apparatus 100A of the second embodiment differs from the sound image width estimation apparatus 100 of the first embodiment shown in FIG. 1 in that it includes a calculation means 5A in place of the calculation means 5. Specifically, the sound image width estimation apparatus 100A differs from the sound image width estimation apparatus 100 in that it includes a physical feature quantity representative value calculation means 9A and a subjective evaluation data storage means 13A in place of the physical feature quantity representative value calculation means 9 and the subjective evaluation data storage means 13, respectively, and in that it further includes a representative value weight coefficient storage means 15 and a representative value weight coefficient calculation means 16.

The physical feature quantity representative value calculation means 9 of the first embodiment calculates the averages of the frequency band-specific physical feature quantities x_a(f), x_t(f) and x_l(f) as the physical feature quantity representative values X_a, X_t and X_l, respectively. In contrast, the physical feature quantity representative value calculation means 9A of the second embodiment calculates the weighted averages of the frequency band-specific physical feature quantities x_a(f), x_t(f) and x_l(f) as the physical feature quantity representative values X_a, X_t and X_l, respectively.

As shown in FIG. 7, the physical feature quantity representative value calculation means 9A of the calculation means 5A in the second embodiment differs from the physical feature quantity representative value calculation means 9 of the calculation means 5 in the first embodiment shown in FIG. 2 in that it includes an ILD standard deviation representative value calculation means 30A, an IACC average representative value calculation means 31A and an ITD standard deviation representative value calculation means 32A in place of the ILD standard deviation representative value calculation means 30, the IACC average representative value calculation means 31 and the ITD standard deviation representative value calculation means 32.
Components that are the same as those of the first embodiment shown in FIGS. 1 and 2 are denoted by the same reference numerals, and their description is omitted as appropriate.

The ILD standard deviation representative value calculation means 30A of the second embodiment receives the frequency band-specific physical feature quantity x_l(f) from the ILD standard deviation calculation means 26_f, reads the weight coefficient c_l(f) from the weight coefficient storage means 15, and calculates the weighted average by Equation (20-3) as the physical feature quantity representative value X_l. The ILD standard deviation representative value calculation means 30A outputs the calculated representative value X_l to the sound image width estimated value calculation means 10.
The IACC average representative value calculation means 31A of the second embodiment receives the frequency band-specific physical feature quantity x_a(f) from the IACC average calculation means 27_f, reads the weight coefficient c_a(f) from the weight coefficient storage means 15, and calculates the weighted average by Equation (20-1) as the physical feature quantity representative value X_a. The IACC average representative value calculation means 31A outputs the calculated representative value X_a to the sound image width estimated value calculation means 10.
The ITD standard deviation representative value calculation means 32A of the second embodiment receives the frequency band-specific physical feature quantity x_t(f) from the ITD standard deviation calculation means 28_f, reads the weight coefficient c_t(f) from the weight coefficient storage means 15, and calculates the weighted average by Equation (20-2) as the physical feature quantity representative value X_t. The ITD standard deviation representative value calculation means 32A outputs the calculated representative value X_t to the sound image width estimated value calculation means 10.
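The weighted averaging performed by the means 30A, 31A and 32A can be sketched as below. Equations (20-1) to (20-3) are not reproduced in this passage, so their exact form is an assumption: by default the sketch uses a plain weighted sum Σ_f c(f)·x(f), and the optional `normalise` flag divides by the sum of the weights in case the equations include such a normalisation. The function name is hypothetical.

```python
def weighted_representative(x, c, normalise=False):
    """Combine per-band features x(f) with per-band weights c(f).

    x: per-band physical feature quantities, c: per-band weight coefficients.
    With normalise=True the weighted sum is divided by the sum of the weights.
    """
    total = sum(cf * xf for cf, xf in zip(c, x))
    if normalise:
        total /= sum(c)
    return total
```

The same function would be called three times, once per feature type, with the weights c_a(f), c_t(f) and c_l(f) read from the weight coefficient storage means 15.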

Like the subjective evaluation data storage means 13 of the first embodiment shown in FIG. 2, the subjective evaluation data storage means 13A stores the subjective evaluation data y_i that the estimated value weight coefficient calculation means 12 uses to calculate the weight coefficients C_a, C_t and C_l for calculating the sound image width estimated value ŷ. In addition, the subjective evaluation data storage means 13A stores the subjective evaluation data y_i that the representative value weight coefficient calculation means 16 uses to calculate the weight coefficients c_a(f), c_t(f) and c_l(f) for calculating the physical feature quantity representative values X_a, X_t and X_l. The subjective evaluation data y_i used to calculate the weight coefficients C_a, C_t and C_l and the subjective evaluation data y_i used to calculate the weight coefficients c_a(f), c_t(f) and c_l(f) may be the same shared data or may be different data.
These subjective evaluation data y_i are input through an input means (not shown) and stored in the subjective evaluation data storage means 13A.

The weight coefficient calculation means 16 reads the subjective evaluation data y_i from the subjective evaluation data storage means 13A and receives, from the IACC average calculation means 27_f, the ITD standard deviation calculation means 28_f and the ILD standard deviation calculation means 26_f, the three types of frequency band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f) corresponding to those subjective evaluation data y_i. For each type of physical feature quantity, it uses a plurality of sets of data consisting of the subjective evaluation data y_i and the frequency band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f) to calculate, by the least squares method, a regression analysis technique, the weight coefficients c_a(f), c_t(f) and c_l(f) of the sound image width estimation model formulas shown in Equations (21-1), (21-2) and (21-3). The weight coefficient calculation means 16 then stores the calculated weight coefficients c_a(f), c_t(f) and c_l(f) in the weight coefficient storage means 15. Here, i is a number that identifies an individual subjective evaluation data item.

Here, the frequency band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f) corresponding to the subjective evaluation data y_i are the frequency band-specific physical feature quantities that are finally output from the IACC average calculation means 27_f, the ITD standard deviation calculation means 28_f and the ILD standard deviation calculation means 26_f when an acoustic signal is collected with the microphones 2L and 2R attached to the dummy head 1, under the same sound field conditions as those of the subject when the subjective evaluation data y_i were obtained, and is processed by the analysis means described above.
The method of calculating the weight coefficients c_a(f), c_t(f) and c_l(f) is the same as the method of calculating the weight coefficients C_a, C_t and C_l in the first embodiment described above, so its description is omitted.

In the second embodiment, a weighted average is used as the representative value of the frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f). Alternatively, the representative value may be the weighted maximum value for each type of physical feature quantity, as shown in Expressions (22-1) to (22-3), or the median for each type of physical feature quantity, as shown in Expressions (23-1) to (23-3).

In Expressions (23-1) to (23-3), median(a_1, a_2, ..., a_F) is a function that returns the median of the elements a_1, a_2, ..., a_F in the parentheses.
The weight coefficients c_a(f), c_t(f) and c_l(f) in Expressions (22-1) to (22-3) and Expressions (23-1) to (23-3) can be determined by the same method as that used for the weighted average described above.
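Expressions (22-1) to (23-3) are not reproduced in this text, so the following is only a minimal sketch of what a weighted maximum and a weighted median over the F frequency bands could look like. It assumes that the weight c(f) multiplies the per-band feature x(f) before the maximum or median is taken; the function names `weighted_max` and `weighted_median` are hypothetical.

```python
from statistics import median
from typing import Sequence


def weighted_max(x: Sequence[float], c: Sequence[float]) -> float:
    """Assumed form of Expressions (22-x): largest weighted per-band feature."""
    return max(cf * xf for cf, xf in zip(c, x))


def weighted_median(x: Sequence[float], c: Sequence[float]) -> float:
    """Assumed form of Expressions (23-x): median of the weighted per-band features."""
    return median(cf * xf for cf, xf in zip(c, x))


if __name__ == "__main__":
    x_a = [0.2, 0.8, 0.5]   # illustrative per-band features x_a(f), not measured data
    c_a = [1.0, 0.5, 2.0]   # illustrative weights c_a(f)
    print(weighted_max(x_a, c_a), weighted_median(x_a, c_a))
```

The same two functions would be applied separately to each feature type (IACC average, ITD standard deviation, ILD standard deviation).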

The weight coefficient storage means 15 is storage means that stores the weight coefficients c_a(f), c_t(f) and c_l(f) of the sound image width estimation model formulas shown in Expressions (21-1), (21-2) and (21-3), as calculated by the representative value weight coefficient calculating means 16; that is, the weight coefficients c_a(f), c_t(f) and c_l(f) of the formulas for calculating the physical feature quantity representative values X_a, X_t and X_l shown in Expressions (20-1) to (20-3). The weight coefficients c_a(f), c_t(f) and c_l(f) stored in the weight coefficient storage means 15 are read out by the IACC average representative value calculating means 31A, the ITD standard deviation representative value calculating means 32A and the ILD standard deviation representative value calculating means 30A, and are used to calculate the physical feature quantity representative values X_a, X_t and X_l, respectively.

The estimated value weight coefficient calculating means 12 in the second embodiment calculates the weight coefficients C_a, C_t and C_l of the estimation model formula shown in Expression (12) by the same method as the estimated value weight coefficient calculating means 12 in the first embodiment, and stores the calculated weight coefficients C_a, C_t and C_l in the weight coefficient storage means 11.

At this time, the IACC average representative value calculating means 31A, the ITD standard deviation representative value calculating means 32A and the ILD standard deviation representative value calculating means 30A each read, from the weight coefficient storage means 15, the weight coefficients c_a(f), c_t(f) and c_l(f) calculated in advance by the representative value weight coefficient calculating means 16. From these weight coefficients and the frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f) input from the IACC average calculating means 27_f, the ITD standard deviation calculating means 28_f and the ILD standard deviation calculating means 26_f, they calculate the physical feature quantity representative values X_a, X_t and X_l, respectively, and output them to the estimated value weight coefficient calculating means 12.

Next, the operation of the sound image width estimation apparatus 100A of the second embodiment will be described with reference to FIG. 8 (and FIGS. 6 and 7 as appropriate).
As shown in FIG. 8, the sound image width estimation apparatus 100A first uses the representative value weight coefficient calculating means 16 to calculate the weight coefficients c_a(f), c_t(f) and c_l(f) of the formulas for calculating the physical feature quantity representative values X_a, X_t and X_l shown in Expressions (20-1) to (20-3), and stores them in the weight coefficient storage means 15 (step S50). If the weight coefficients c_a(f), c_t(f) and c_l(f) are already stored in the weight coefficient storage means 15, this weight coefficient calculation step for the physical feature quantity representative values can be omitted. The details of this step are described later.

Subsequently, the sound image width estimation apparatus 100A uses the estimated value weight coefficient calculating means 12 to calculate the weight coefficients C_a, C_t and C_l of the estimation model formula shown in Expression (12), and stores them in the weight coefficient storage means 11 (step S51). If the weight coefficients C_a, C_t and C_l are already stored in the weight coefficient storage means 11, this weight coefficient calculation step for the estimation model formula can be omitted. Step S51 is the same as step S10 in the processing of the sound image width estimation apparatus 100 of the first embodiment shown in FIG. 3, so its detailed description is omitted.

The processing in steps S52 to S57 is the same as steps S11 to S16, respectively, in the processing of the sound image width estimation apparatus 100 of the first embodiment shown in FIG. 3, so its description is omitted.

The sound image width estimation apparatus 100A uses the IACC average representative value calculating means 31A to calculate a weighted average by Expression (20-1) from the frequency-band-specific physical feature quantity x_a(f) input from the IACC average calculating means 27_f in step S57 and the weight coefficient c_a(f) stored in the weight coefficient storage means 15 in step S50, and outputs the calculated weighted average to the sound image width estimated value calculating means 10 as the physical feature quantity representative value X_a (step S58).
In parallel, the sound image width estimation apparatus 100A uses the ITD standard deviation representative value calculating means 32A to calculate a weighted average by Expression (20-2) from the frequency-band-specific physical feature quantity x_t(f) input from the ITD standard deviation calculating means 28_f in step S57 and the weight coefficient c_t(f) stored in the weight coefficient storage means 15 in step S50, and outputs the calculated weighted average to the sound image width estimated value calculating means 10 as the physical feature quantity representative value X_t (step S58).
Also in parallel, the sound image width estimation apparatus 100A uses the ILD standard deviation representative value calculating means 30A to calculate a weighted average by Expression (20-3) from the frequency-band-specific physical feature quantity x_l(f) input from the ILD standard deviation calculating means 26_f in step S57 and the weight coefficient c_l(f) stored in the weight coefficient storage means 15 in step S50, and outputs the calculated weighted average to the sound image width estimated value calculating means 10 as the physical feature quantity representative value X_l (step S58).

The sound image width estimation apparatus 100A uses the sound image width estimated value calculating means 10 to calculate the sound image width estimated value (hat y) by Expression (12) from the physical feature quantity representative values X_a, X_t and X_l input in step S58 from the IACC average representative value calculating means 31A, the ITD standard deviation representative value calculating means 32A and the ILD standard deviation representative value calculating means 30A, and from the weight coefficients C_a, C_t and C_l stored in the weight coefficient storage means 11 by the estimated value weight coefficient calculating means 12 in step S51, and outputs the estimate to the display means 14 (step S59).
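Steps S58 and S59 can be sketched as follows. Expressions (12) and (20-1) to (20-3) are not reproduced in this text, so the sketch assumes that each representative value is an unnormalized weighted sum over the bands and that the estimation model is linear in X_a, X_t and X_l with no constant term. The function names are hypothetical.

```python
from typing import Sequence


def representative_value(x: Sequence[float], c: Sequence[float]) -> float:
    """Assumed form of Expressions (20-x): sum over bands of c(f) * x(f)."""
    return sum(cf * xf for cf, xf in zip(c, x))


def estimate_width(
    x_a: Sequence[float], x_t: Sequence[float], x_l: Sequence[float],
    c_a: Sequence[float], c_t: Sequence[float], c_l: Sequence[float],
    C_a: float, C_t: float, C_l: float,
) -> float:
    """Step S58 (representative values) followed by step S59 (assumed linear Expression (12))."""
    X_a = representative_value(x_a, c_a)  # from IACC averages per band
    X_t = representative_value(x_t, c_t)  # from ITD standard deviations per band
    X_l = representative_value(x_l, c_l)  # from ILD standard deviations per band
    return C_a * X_a + C_t * X_t + C_l * X_l


if __name__ == "__main__":
    # Illustrative numbers only, not values from the patent.
    y_hat = estimate_width(
        [1.0, 2.0], [2.0, 4.0], [1.0, 1.0],
        [0.5, 0.5], [0.25, 0.25], [1.0, 2.0],
        2.0, 1.0, -1.0,
    )
    print(y_hat)
```

If the actual Expression (20) normalizes by the sum of the weights, only `representative_value` would need to change.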

The sound image width estimation apparatus 100A uses the display means 14 to visibly display the sound image width estimated value (hat y) input from the sound image width estimated value calculating means 10 in step S59 (step S60).
Through the above processing, the sound image width estimation apparatus 100A can estimate the sound image width.

Next, the operation of the sound image width estimation apparatus 100A in the weight coefficient calculation step for the physical feature quantity representative values (step S50) shown in FIG. 8 will be described with reference to FIG. 9 (and FIGS. 6 and 7 as appropriate).
As shown in FIG. 9, the sound image width estimation apparatus 100A first receives, through input means (not shown), the subjective evaluation data y_i obtained by a subjective evaluation performed in advance, and stores it in the subjective evaluation data storage means 13A (step S70).

Next, the sound image width estimation apparatus 100A collects, binaurally with the microphones 2L and 2R, the acoustic signal corresponding to the subjective evaluation data y_i input in step S70. The collected analog acoustic signals pass through the low-pass filters 3L and 3R, are converted into digital signals by the AD converters 4L and 4R, and are stored in the memories 6L and 6R as the acoustic signals sl(n) and sr(n) (step S71).

The processing in steps S72 to S76 is the same as steps S53 to S57, respectively, in the processing shown in FIG. 8, so its description is omitted.
The sound image width estimation apparatus 100A repeats steps S71 to S76 for each of the S items of subjective evaluation data y_i input in step S70, and accumulates, in the representative value weight coefficient calculating means 16, S sets of data each consisting of subjective evaluation data y_i and the frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f).

The sound image width estimation apparatus 100A uses the representative value weight coefficient calculating means 16 to calculate the weight coefficients c_a(f), c_t(f) and c_l(f) by the least squares method (step S78), using the S sets of data consisting of the frequency-band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f) input in step S76 from the IACC average representative value calculating means 31A, the ITD standard deviation representative value calculating means 32A and the ILD standard deviation representative value calculating means 30A, and the subjective evaluation data y_i that was input from the input means (not shown) and stored in the subjective evaluation data storage means 13A in step S70. It then stores the calculated weight coefficients c_a(f), c_t(f) and c_l(f) in the weight coefficient storage means 15 (step S79).
With this, the sound image width estimation apparatus 100A ends the weight coefficient calculation processing for the physical feature quantity representative values.
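The least squares step (step S78) can be illustrated for a single feature type. The first embodiment's calculation method and Expressions (21-1) to (21-3) are not reproduced in this text, so this sketch assumes a linear model without a constant term, y_i ≈ Σ_f c(f) x_i(f), and solves the normal equations with plain Gaussian elimination. The function names `solve_linear` and `fit_band_weights` are hypothetical.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: more independent data sets are needed")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def fit_band_weights(features: Sequence[Sequence[float]], ratings: Sequence[float]) -> List[float]:
    """features[i][f] = x_i(f), ratings[i] = y_i; returns c(f) minimizing the squared error."""
    n_bands = len(features[0])
    xtx = [[sum(row[p] * row[q] for row in features) for q in range(n_bands)]
           for p in range(n_bands)]
    xty = [sum(row[p] * y for row, y in zip(features, ratings)) for p in range(n_bands)]
    return solve_linear(xtx, xty)


if __name__ == "__main__":
    # Three illustrative data sets (S = 3) over two bands (F = 2).
    print(fit_band_weights([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [2.0, 3.0, 5.0]))
```

Under this assumption the fit would be run once per feature type, giving c_a(f), c_t(f) and c_l(f) from the same S ratings.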

[Third Embodiment]
Next, a sound image width estimation apparatus 100B according to a third embodiment of the present invention will be described with reference to FIG. 10.
As shown in FIG. 10, the sound image width estimation apparatus 100B of the third embodiment differs from the sound image width estimation apparatus 100 of the first embodiment shown in FIG. 1 in that it includes calculation means 5B in place of the calculation means 5. Specifically, the sound image width estimation apparatus 100B differs from the sound image width estimation apparatus 100 in that it does not include the physical feature quantity representative value calculating means 9, and in that it includes sound image width estimated value calculating means 10B, weight coefficient storage means 11B and subjective evaluation data storage means 13B in place of the sound image width estimated value calculating means 10, the weight coefficient storage means 11 and the subjective evaluation data storage means 13, respectively.

The sound image width estimated value calculating means 10 in the first embodiment calculates the sound image width estimated value (hat y) by an estimation model formula that uses the physical feature quantity representative values X_a, X_t and X_l calculated by the physical feature quantity representative value calculating means 9. In contrast, the sound image width estimated value calculating means 10B in the third embodiment treats the individual frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f) calculated by the frequency-band-specific physical feature quantity calculating means 8_f as the physical feature quantities, and calculates the sound image width estimated value (hat y) by an estimation model formula that uses these frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f).
Components that are the same as in the first embodiment shown in FIG. 1 are given the same reference signs, and their description is omitted as appropriate.

The sound image width estimated value calculating means 10B in the third embodiment receives, from the IACC average calculating means 27_f, the ITD standard deviation calculating means 28_f and the ILD standard deviation calculating means 26_f of the frequency-band-specific physical feature quantity calculating means 8_f (see FIG. 2), the frequency-band-specific physical feature quantity x_a(f), which is the time-axis average of the interaural cross-correlation for each frequency band, the frequency-band-specific physical feature quantity x_t(f), which is the time-axis standard deviation of the interaural time difference for each frequency band, and the frequency-band-specific physical feature quantity x_l(f), which is the time-axis standard deviation of the interaural level difference for each frequency band. It also reads the weight coefficients c_a(f), c_t(f) and c_l(f) from the weight coefficient storage means 11B, and calculates the sound image width estimated value (hat y) by the estimation model formula shown in Expression (24). The sound image width estimated value calculating means 10B outputs the calculated sound image width estimated value (hat y) to the display means 14.

The estimation model formula is not limited to Expression (24); for example, another estimation model formula expressed in terms of the frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f), such as Expression (25), may be used.
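Expressions (24) and (25) are not reproduced in this text. As a reading aid only, one linear form consistent with the description (weights c_a(f), c_t(f) and c_l(f) applied directly to the per-band features, with no intermediate representative values) would be the following. The summation over F bands and the absence of a constant term are assumptions, not the patent's stated formula.

```latex
\hat{y} \;=\; \sum_{f=1}^{F} \Bigl[\, c_a(f)\,x_a(f) \;+\; c_t(f)\,x_t(f) \;+\; c_l(f)\,x_l(f) \,\Bigr]
```

Under this reading, the third embodiment fits all 3F weights in a single regression against the subjective evaluation data, whereas the second embodiment fits the per-band weights and the three coefficients C_a, C_t and C_l in two separate stages.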

The weight coefficient storage means 11B is storage means that stores the weight coefficients c_a(f), c_t(f) and c_l(f) of the estimation model formula shown in Expression (24), as calculated by the estimated value weight coefficient calculating means 12B. The weight coefficients c_a(f), c_t(f) and c_l(f) stored in the weight coefficient storage means 11B are read out by the sound image width estimated value calculating means 10B when the sound image width is estimated, and are used to calculate the sound image width estimated value (hat y).

The estimated value weight coefficient calculating means 12B reads the subjective evaluation data y_i from the subjective evaluation data storage means 13B and receives, from the IACC average calculating means 27_f, the ITD standard deviation calculating means 28_f and the ILD standard deviation calculating means 26_f of the frequency-band-specific physical feature quantity calculating means 8_f (see FIG. 2), the three types of frequency-band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f) corresponding to that subjective evaluation data y_i. Using the plural sets of data consisting of the input subjective evaluation data y_i and the frequency-band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f), it calculates, by the least squares method (a regression analysis technique), the weight coefficients c_a(f), c_t(f) and c_l(f) of the sound image width estimation model formula shown in Expression (24). The estimated value weight coefficient calculating means 12B then stores the calculated weight coefficients c_a(f), c_t(f) and c_l(f) in the weight coefficient storage means 11B. Note that i is a number identifying each item of subjective evaluation data.

Here, the frequency-band-specific physical feature quantities x_ai(f), x_ti(f) and x_li(f) corresponding to the subjective evaluation data y_i are the quantities finally output from the IACC average calculating means 27_f, the ITD standard deviation calculating means 28_f and the ILD standard deviation calculating means 26_f (see FIG. 2) when an acoustic signal is collected with the microphones 2L and 2R attached to the dummy head 1, under the same sound field conditions as those of the subject when the subjective evaluation data y_i was obtained, and is processed by each of the analysis means described above.
The method of calculating the weight coefficients c_a(f), c_t(f) and c_l(f) is the same as the method of calculating the weight coefficients C_a, C_t and C_l in the first embodiment described above, so its description is omitted.

The estimated value weight coefficient calculating means 12B in the third embodiment calculates the weight coefficients c_a(f), c_t(f) and c_l(f) of the estimation model formula shown in Expression (24) by the same method as the estimated value weight coefficient calculating means 12 in the first embodiment, and stores the calculated weight coefficients c_a(f), c_t(f) and c_l(f) in the weight coefficient storage means 11B.

The subjective evaluation data storage means 13B stores the subjective evaluation data y_i that is used when the estimated value weight coefficient calculating means 12B calculates the weight coefficients c_a(f), c_t(f) and c_l(f) for calculating the sound image width estimated value (hat y). The subjective evaluation data y_i is input through input means (not shown) and stored in the subjective evaluation data storage means 13B.

Next, the operation of the sound image width estimation apparatus 100B of the third embodiment will be described with reference to FIG. 11 (and FIG. 10 as appropriate).
As shown in FIG. 11, the sound image width estimation apparatus 100B first uses the estimated value weight coefficient calculating means 12B to calculate the weight coefficients c_a(f), c_t(f) and c_l(f) of the estimation model formula for the sound image width estimated value (hat y) shown in Expression (24), and stores them in the weight coefficient storage means 11B (step S90). If the weight coefficients c_a(f), c_t(f) and c_l(f) are already stored in the weight coefficient storage means 11B, this weight coefficient calculation step for the estimation model formula can be omitted. This step is the same as the weight coefficient calculation step for the estimation model formula in the first embodiment shown in FIG. 3, except that the frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f) are used as the explanatory variables in place of the physical feature quantity representative values X_a, X_t and X_l, and the weight coefficients c_a(f), c_t(f) and c_l(f) are calculated in place of the weight coefficients C_a, C_t and C_l; its detailed description is therefore omitted.

The processing in steps S91 to S96 is the same as steps S11 to S16, respectively, in the processing of the sound image width estimation apparatus 100 of the first embodiment shown in FIG. 3, so its description is omitted.

The sound image width estimation apparatus 100B uses the sound image width estimated value calculating means 10B to calculate the sound image width estimated value (hat y) by Expression (24) from the frequency-band-specific physical feature quantities x_a(f), x_t(f) and x_l(f) input in step S96 from the IACC average calculating means 27_f, the ITD standard deviation calculating means 28_f and the ILD standard deviation calculating means 26_f of the frequency-band-specific physical feature quantity calculating means 8_f (see FIG. 2), and from the weight coefficients c_a(f), c_t(f) and c_l(f) stored in the weight coefficient storage means 11B in step S90, and outputs the calculated sound image width estimated value (hat y) to the display means 14 (step S97).

The sound image width estimation apparatus 100B uses the display means 14 to visibly display the sound image width estimated value (hat y) input from the sound image width estimated value calculating means 10B in step S97 (step S98).
Through the above processing, the sound image width estimation apparatus 100B can estimate the sound image width.

Next, an example of the present invention will be described.
In the sound image width estimation apparatus 100 shown in FIGS. 1 and 2, recorded continuous tones of each open string of a violin were used as the sound source SS. The subjective evaluation was performed using the experimental apparatus 110 shown in FIG. 12: speakers SS_1 to SS_3 were placed as appropriate on a semicircle SC in front of the subject SUB, centered on the subject SUB, and the recorded continuous violin tones were reproduced. The number of speakers SS_1 to SS_3 serving as sound sources and their placement angle θ_SS about the subject SUB were adjusted so that the subject SUB could perceive various sound image widths.
The subjective evaluation value of the sound image width was converted into a horizontal angle θ with the center of the head of the subject SUB as the viewpoint.

Next, under the same sound field conditions as in the subjective evaluation described above, the dummy head 1 was placed at the position the subject SUB had occupied during the evaluation, and acoustic signals were collected binaurally with the microphones 2L and 2R. The estimated value (hat y) of the sound image width was calculated from the collected acoustic signals using the sound image width estimation apparatus 100 shown in FIGS. 1 and 2. The frequency bands were divided using the filter banks 7L and 7R, each composed of 1/24-octave filters with a lower limit frequency of 150 Hz and an upper limit frequency of 12 kHz.
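To give a sense of how many subbands such a filter bank produces, the sketch below lays out 1/24-octave bands between 150 Hz and 12 kHz. The band layout is not specified in this text, so the sketch assumes contiguous, non-overlapping bands spaced geometrically upward from the lower limit, keeping only bands that fit entirely below the upper limit. The function name `fractional_octave_bands` is hypothetical.

```python
import math
from typing import List, Tuple


def fractional_octave_bands(f_low: float, f_high: float, fraction: int) -> List[Tuple[float, float, float]]:
    """Return (lower edge, center, upper edge) in Hz for contiguous 1/fraction-octave bands."""
    ratio = 2.0 ** (1.0 / fraction)  # upper edge / lower edge of one band
    # Number of whole bands between the limits (small epsilon guards rounding).
    n_bands = int(math.floor(fraction * math.log2(f_high / f_low) + 1e-9))
    bands = []
    for k in range(n_bands):
        lower = f_low * ratio ** k
        upper = lower * ratio
        center = math.sqrt(lower * upper)  # geometric mean of the edges
        bands.append((lower, center, upper))
    return bands


if __name__ == "__main__":
    bands = fractional_octave_bands(150.0, 12000.0, 24)
    print(len(bands))  # 151 whole bands under the assumed layout
    print(bands[0], bands[-1])
```

Under this layout there are 151 subbands per channel, so each feature type yields 151 per-band values x(f) for one stimulus.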

In this example, subjective evaluation was performed for 20 types of sound stimuli. The results are shown in FIG. 13, where the horizontal axis is the estimated value (hat y) of the sound image width and the vertical axis is the subjective evaluation value of the sound image width. As FIG. 13 shows, the present invention can estimate the sound image width better than the conventional technique.
In this example, a linear function of the physical feature quantities was used as the estimation model formula of the sound image width; however, the formula is not limited to this, and a quadratic function, a power function, an exponential function or the like of the physical feature quantities may also be used.

DESCRIPTION OF SYMBOLS
1 Dummy head
2L, 2R Microphone
3L, 3R Low-pass filter
4L, 4R AD converter
5, 5A, 5B Calculation means
6L, 6R Memory
7L, 7R Filter bank (frequency band dividing means)
8_f Frequency-band-specific physical feature quantity calculating means (frequency-band-specific feature quantity calculating means)
9, 9A Physical feature quantity representative value calculating means (physical feature quantity calculating means)
10, 10B Sound image width estimated value calculating means (estimated value calculating means)
11, 11B Weight coefficient storage means
12, 12B Estimated value weight coefficient calculating means (weight coefficient calculating means)
13, 13A, 13B Subjective evaluation data storage means
14 Display means
15 Weight coefficient storage means
16 Representative value weight coefficient calculating means
20L_f, 20R_f Windowing means
21_f CCC calculating means
22L_f, 22R_f Level calculating means
23_f IACC calculating means
24_f ITD calculating means
25_f ILD calculating means
26_f ILD standard deviation calculating means
27_f IACC average calculating means
28_f ITD standard deviation calculating means
30, 30A ILD standard deviation representative value calculating means
31, 31A IACC average representative value calculating means
32, 32A ITD standard deviation representative value calculating means
100, 100A, 100B Sound image width estimation apparatus
SS, SS_1 to SS_3 Sound source
SUB Subject

Claims

1. A sound image width estimation device that calculates a physical feature quantity from a digital acoustic signal consisting of two left and right channels, and calculates an estimated value of a sound image width by applying the calculated physical feature quantity to an estimation model formula for the sound image width that includes the physical feature quantity and a weight coefficient, the device comprising:
frequency band dividing means for dividing the digital acoustic signal consisting of two left and right channels, for each of the left and right channel acoustic signals, into subband signals of a plurality of frequency bands each having a frequency bandwidth of 1/6 octave or less;
frequency band feature quantity calculating means for calculating, from the subband signals divided by the frequency band dividing means and for each subband signal, a frequency band feature quantity, which is a per-frequency-band feature quantity representing a difference between the left and right channels of the subband signal, as at least one of the interaural cross-correlation, the standard deviation in the time axis direction of the interaural time difference, or the standard deviation in the time axis direction of the interaural level difference;
physical feature quantity calculating means for calculating the physical feature quantity based on the frequency band feature quantity calculated by the frequency band feature quantity calculating means; and
estimated value calculating means for calculating the estimated value of the sound image width by applying the physical feature quantity calculated by the physical feature quantity calculating means to the estimation model formula.

2. The sound image width estimation device according to claim 1, wherein the physical feature quantity calculating means calculates, as the physical feature quantity, any one of the average, weighted average, maximum value, or median of the frequency band feature quantities calculated for the respective subband signals by the frequency band feature quantity calculating means.

3. A sound image width estimation device that calculates a physical feature quantity from a digital acoustic signal consisting of two left and right channels, and calculates an estimated value of a sound image width by applying the calculated physical feature quantity to an estimation model formula for the sound image width that includes the physical feature quantity and a weight coefficient, the device comprising:
frequency band dividing means for dividing the digital acoustic signal consisting of two left and right channels, for each of the left and right channel acoustic signals, into subband signals of a plurality of frequency bands each having a frequency bandwidth of 1/6 octave or less;
frequency band feature quantity calculating means for calculating, from the subband signals divided by the frequency band dividing means and for each subband signal, a frequency band feature quantity, which is a per-frequency-band feature quantity representing a difference between the left and right channels of the subband signal, as at least one of the interaural cross-correlation, the standard deviation in the time axis direction of the interaural time difference, or the standard deviation in the time axis direction of the interaural level difference; and
estimated value calculating means for calculating the estimated value of the sound image width by applying the individual frequency band feature quantities calculated by the frequency band feature quantity calculating means to the estimation model formula as the physical feature quantities.

4. The sound image width estimation device according to claim 1, wherein the frequency band dividing means divides the digital acoustic signal into subband signals each having a frequency bandwidth of 1/12 octave or less.

5. The sound image width estimation device according to any one of claims 1 to 4, further comprising weight coefficient calculating means for calculating the weight coefficient in the estimation model formula, wherein the weight coefficient calculating means calculates the weight coefficient by regression analysis using the physical feature quantity as an explanatory variable and the sound image width as a target variable.

6. A sound image width estimation program for causing a computer, in order to calculate a physical feature quantity from a digital acoustic signal consisting of two left and right channels and to calculate an estimated value of a sound image width by applying the calculated physical feature quantity to an estimation model formula for the sound image width that includes the physical feature quantity and a weight coefficient, to function as:
frequency band dividing means for dividing the digital acoustic signal consisting of two left and right channels, for each of the left and right channel acoustic signals, into subband signals of a plurality of predetermined frequency bands;
frequency band feature quantity calculating means for calculating, from the subband signals divided by the frequency band dividing means and for each subband signal, a frequency band feature quantity, which is a per-frequency-band feature quantity representing a difference between the left and right channels of the subband signal, as at least one of the interaural cross-correlation, the standard deviation in the time axis direction of the interaural time difference, or the standard deviation in the time axis direction of the interaural level difference;
physical feature quantity calculating means for calculating the physical feature quantity based on the frequency band feature quantity calculated by the frequency band feature quantity calculating means; and
estimated value calculating means for calculating the estimated value of the sound image width by applying the physical feature quantity calculated by the physical feature quantity calculating means to the estimation model formula.

7. A sound image width estimation program for causing a computer, in order to calculate a physical feature quantity from a digital acoustic signal consisting of two left and right channels and to calculate an estimated value of a sound image width by applying the calculated physical feature quantity to an estimation model formula for the sound image width that includes the physical feature quantity and a weight coefficient, to function as:
frequency band dividing means for dividing the digital acoustic signal consisting of two left and right channels, for each of the left and right channel acoustic signals, into subband signals of a plurality of predetermined frequency bands;
frequency band feature quantity calculating means for calculating, from the subband signals divided by the frequency band dividing means and for each subband signal, a frequency band feature quantity, which is a per-frequency-band feature quantity representing a difference between the left and right channels of the subband signal, as at least one of the interaural cross-correlation, the standard deviation in the time axis direction of the interaural time difference, or the standard deviation in the time axis direction of the interaural level difference; and
estimated value calculating means for calculating the estimated value of the sound image width by applying the individual frequency band feature quantities calculated by the frequency band feature quantity calculating means to the estimation model formula as the physical feature quantities.