JP6006627B2

JP6006627B2 - Impulse response length conversion device, impulse response length conversion method, impulse method conversion program

Info

Publication number: JP6006627B2
Application number: JP2012266477A
Authority: JP
Inventors: 靖茂中山; 健太郎松井
Original assignee: Japan Broadcasting Corp; NHK Engineering System Inc
Current assignee: Japan Broadcasting Corp; NHK Engineering System Inc
Priority date: 2012-12-05
Filing date: 2012-12-05
Publication date: 2016-10-12
Anticipated expiration: 2032-12-05
Also published as: JP2014112793A

Description

The present invention relates to a device, a method, and a program for converting impulse response length, and more particularly to a technique for shortening the filter length of an impulse response, which is the time-domain representation of a head-related transfer function, while maintaining localization quality, so that three-dimensional acoustic signals can be computed on devices without high computational capability, such as mobile phones.

Multichannel audio formats, such as 5.1ch surround and the 22.2 multichannel format for Super Hi-Vision, can localize sound sources in various directions in three-dimensional space (see, for example, Non-Patent Document 1). Using head-related transfer functions (HRTFs) measured in advance, the acoustic space of a three-dimensional acoustic signal can be reproduced virtually over headphones. In this case, the impulse response, which is the time-domain representation of the measured HRTF, is convolved with the sound source as a directional filter. In general, the filter length of the binaural impulse responses, which are the time-domain representation of the HRTF, is determined by the measurement method.

Non-Patent Document 1: Akio Ando, "Highly Realistic Acoustic Technology and Its Theory" (in Japanese), IEICE Fundamental Review, Vol. 3, No. 4, pp. 33-46, April 2010.

Three-dimensional audio content is produced using many audio channels, as typified by 22.2ch audio for Super Hi-Vision, so performing convolution on many channels simultaneously requires high computational capability. When a device without high computational capability, such as a mobile terminal, is used, it is desirable to make the filter length of the impulse response as short as possible in order to reduce the amount of computation needed to implement three-dimensional audio.

In HRTF measurement, the transfer system is assumed to be an LTI (linear time-invariant) system, and the impulse response, which is the time-domain representation of the transfer function, is measured by the swept-sine method or a method using M-sequences. The measured impulse response contains, as measurement error, sound reflected from the measuring equipment in the laboratory and similar sources; in some cases the reflected components, which are measured late on the time axis of the impulse response, can be separated. However, it is very difficult to directly truncate the impulse response to a desired filter length while retaining the features of the HRTF, in particular those of its amplitude-frequency characteristic.

There is also a technique that simplifies the HRTF using the cepstrum in order to shorten the filter length of the binaural impulse responses, but it is difficult to select the optimal filter length while also taking into account the interaural time difference, the interaural level difference, and other properties of the binaural impulse responses.

Accordingly, an object of the present invention, made in view of these points, is to provide an impulse response length conversion device, an impulse response length conversion method, and an impulse method conversion program that can shorten the filter length using the cepstrum while retaining the features of the amplitude-frequency characteristic of the impulse response.

To solve the problems described above, an impulse response length conversion device according to the present invention includes: an impulse response storage unit that stores an original impulse response, which is the time-domain representation of a head-related transfer function; a cepstrum order determination unit that calculates a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response, the approximate impulse response being obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response, and that determines, based on the regression model, the cepstrum order for a desired amplitude ratio and a desired filter length; an approximate impulse response generation unit that generates an approximate impulse response of the original impulse response with the determined cepstrum order; and an impulse response reconstruction unit that reconstructs an impulse response of the desired filter length using the approximate impulse response.

An impulse response conversion method according to the present invention is an impulse response length conversion method in an impulse response length conversion device including an impulse response storage unit that stores an original impulse response, which is the time-domain representation of a head-related transfer function. The processing procedure performed by the impulse response length conversion device includes the steps of: calculating a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response, the approximate impulse response being obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response; determining, based on the regression model, the cepstrum order for a desired amplitude ratio and a desired filter length; generating an approximate impulse response of the original impulse response with the determined cepstrum order; and reconstructing an impulse response of the desired filter length using the approximate impulse response.

An impulse response conversion program according to the present invention causes an impulse response length conversion device, which includes an impulse response storage unit that stores an original impulse response that is the time-domain representation of a head-related transfer function, to execute the steps of: calculating a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response, the approximate impulse response being obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response; determining, based on the regression model, the cepstrum order for a desired amplitude ratio and a desired filter length; generating an approximate impulse response of the original impulse response with the determined cepstrum order; and reconstructing an impulse response of the desired filter length using the approximate impulse response.

According to the impulse response length conversion device, the impulse response length conversion method, and the impulse method conversion program of the present invention, the filter length can be shortened using the cepstrum while retaining the features of the amplitude-frequency characteristic of the impulse response.

FIG. 1 is a diagram showing the configuration of an impulse response length conversion device according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the amplitude ratio of each sample in an approximate impulse response.
FIG. 3 is a diagram showing an example of a regression model representing the relationship among cepstrum order, filter length, and amplitude ratio.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a diagram showing the configuration of an impulse response length conversion device according to an embodiment of the present invention. The impulse response length conversion device 1 includes an impulse response storage unit 10, a cepstrum order determination unit 20, an approximate impulse response generation unit 30, an impulse response division unit 40, an all-pass component time difference acquisition unit 50, and an impulse response reconstruction unit 60.

The impulse response storage unit 10 is a database of head-related transfer functions. It stores original impulse responses, which are the time-domain representations of the head-related transfer functions, for both ears of a plurality of persons and for each direction in three-dimensional space. For example, the filter length (impulse response length) of each original impulse response is 512 samples in the time direction, and the original impulse responses of the right ear and the left ear are stored as a pair.

The cepstrum order determination unit 20 determines the order of the cepstrum used to model the original impulse response, so that the filter length can be shortened while retaining the features of the amplitude-frequency characteristic and the localization quality of the original impulse responses stored in the impulse response storage unit 10. Specifically, the cepstrum order determination unit 20 first obtains an approximate impulse response by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response. It then calculates a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of the approximate impulse response. The calculation of the regression model is described in detail below.

The cepstrum order determination unit 20 extracts a pair of binaural original impulse responses from the impulse response storage unit 10 and calculates the real cepstrum of each one-ear original impulse response by Equation (1). Here, imp(n) is the one-ear original impulse response being processed, abs() is the absolute value, FFT() is the discrete Fourier transform, IFFT() is the inverse discrete Fourier transform, Re() extracts the real part of a complex number, and log() is the natural logarithm.
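Equation (1) itself is not reproduced in this text. Based only on the operators listed above, a plausible reading is the standard real cepstrum, Re(IFFT(log(abs(FFT(imp(n)))))). The following Python sketch follows that reading; the function name `real_cepstrum`, the guard value `eps`, and the synthetic test signal are assumptions made for illustration and do not come from the patent.

```python
import numpy as np

def real_cepstrum(imp: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Real cepstrum of a one-ear impulse response.

    Assumed form of Equation (1): Re(IFFT(log(abs(FFT(imp))))).
    """
    spectrum = np.fft.fft(imp)
    # eps avoids log(0) for spectral nulls; it is an assumption of this sketch.
    log_magnitude = np.log(np.abs(spectrum) + eps)
    return np.real(np.fft.ifft(log_magnitude))

# Hypothetical 512-sample impulse response: an exponentially decaying noise burst.
rng = np.random.default_rng(0)
imp = rng.standard_normal(512) * np.exp(-np.arange(512) / 20.0)
cep = real_cepstrum(imp)
```

Because the log-magnitude spectrum of a real signal is real and even, the resulting cepstrum is symmetric: coefficient k equals coefficient N - k. This symmetry is what makes it meaningful to speak of a "right-side" half of the coefficients later in the description.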

The 0th-order coefficient of the real cepstrum calculated by Equation (1) is a bias term that represents the overall magnitude of the original impulse response. The frequency envelope of the original impulse response can be represented using only the low-order coefficients of the real cepstrum. The cepstrum order determination unit 20 therefore sets the 0th-order and high-order coefficients of the real cepstrum to 0, converts the resulting low-order cepstrum back into an impulse response, and obtains the amplitude ratios between the maximum amplitude and the other amplitudes of the reconverted impulse response.

For example, the cepstrum order determination unit 20 sets the 0th-order coefficient and the 51st-order to 511th-order coefficients of the real cepstrum calculated by Equation (1) to 0. At this time, in order to conserve energy, the cepstrum order determination unit 20 doubles the coefficients of the orders that were not set to 0 (for example, the 1st to 50th orders).
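The coefficient selection described above can be sketched as follows. The function name `lifter_low_order` and the placeholder input are hypothetical; only the rule itself (zero the 0th order and everything above the chosen order, double what remains) is taken from the description.

```python
import numpy as np

def lifter_low_order(cep: np.ndarray, order: int) -> np.ndarray:
    """Keep cepstral coefficients 1..order (doubled); zero the 0th and all others."""
    out = np.zeros_like(cep, dtype=float)
    # Doubling the retained coefficients follows the energy-conservation step
    # described in the text.
    out[1:order + 1] = 2.0 * cep[1:order + 1]
    return out

# Placeholder cepstrum values, used only to show which indices survive.
cep = np.arange(512, dtype=float)
low = lifter_low_order(cep, order=50)
```

Zeroing the upper half of the symmetric cepstrum and doubling the lower half is also the usual way to obtain a minimum-phase response from a real cepstrum, which is consistent with the later statement that the maximum amplitude appears near time 0.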

The cepstrum order determination unit 20 converts the low-order cepstrum back into an impulse response by Equation (2). Here, exp() denotes the exponential function.
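Equation (2) is not reproduced in this text. Since it is described as the reverse conversion and introduces exp(), one plausible form is Re(IFFT(exp(FFT(c)))), where c is the low-order cepstrum. The sketch below assumes that form; the function name `cepstrum_to_impulse` is hypothetical.

```python
import numpy as np

def cepstrum_to_impulse(cep_low: np.ndarray) -> np.ndarray:
    """Convert a (liftered) cepstrum back to an impulse response.

    Assumed form of Equation (2): Re(IFFT(exp(FFT(cep_low)))).
    """
    log_spectrum = np.fft.fft(cep_low)
    return np.real(np.fft.ifft(np.exp(log_spectrum)))

# An all-zero cepstrum corresponds to a flat unit spectrum, i.e. a unit impulse.
unit = cepstrum_to_impulse(np.zeros(8))

# A single coefficient a at order 1 yields the series 1, a, a^2/2!, ...
single = np.zeros(64)
single[1] = 0.5
h = cepstrum_to_impulse(single)
```

In the second example the output starts at its largest value and decays, which illustrates why a cepstrum containing only low, positive-order coefficients produces a response concentrated near time 0.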

The impulse response given by Equation (2) is the time-domain representation of a frequency response from which the fine structure has been removed by the low-order cepstrum. Hereinafter, the reconverted impulse response given by Equation (2) is referred to as the approximate impulse response. The cepstrum order determination unit 20 obtains the amplitude ratios between the maximum amplitude and the other amplitudes of the approximate impulse response. Because the maximum amplitude of the approximate impulse response appears near time 0, the cepstrum order determination unit 20 uses the amplitude of the approximate impulse response at time 0 as the reference and calculates the amplitude ratio d [dB] of each sample (n) by Equation (3).
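Equation (3) is not reproduced in this text, so its exact form and sign convention are unknown. The sketch below assumes d(n) = 20 log10(|imp_g(n)| / |imp_g(0)|), which gives 0 dB at time 0 and negative values for weaker samples. It also adds a hypothetical helper, `effective_length`, showing one way a filter length could be derived from a threshold such as 90 dB; that helper is an illustration, not a step stated in the patent.

```python
import numpy as np

def amplitude_ratio_db(imp_g: np.ndarray) -> np.ndarray:
    """Amplitude of each sample relative to the sample at time 0, in dB.

    Assumed form of Equation (3); requires imp_g[0] != 0.
    """
    reference = np.abs(imp_g[0])
    # The tiny constant keeps log10 finite for exactly-zero samples.
    return 20.0 * np.log10((np.abs(imp_g) + 1e-300) / reference)

def effective_length(imp_g: np.ndarray, threshold_db: float) -> int:
    """Number of leading samples needed to keep every sample whose level is
    within threshold_db of the time-0 amplitude (hypothetical helper)."""
    keep = np.nonzero(amplitude_ratio_db(imp_g) >= -threshold_db)[0]
    return int(keep[-1]) + 1

d = amplitude_ratio_db(np.array([1.0, 0.1, 0.01]))
n30 = effective_length(np.array([1.0, 0.1, 0.01, 0.001]), 30.0)
n50 = effective_length(np.array([1.0, 0.1, 0.01, 0.001]), 50.0)
```

Under this convention, a larger threshold keeps weaker samples and therefore yields a longer filter.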

FIG. 2 is a diagram showing an example of the amplitude ratio of each sample in the approximate impulse response. FIG. 2 shows the amplitude ratio given by Equation (3) for one particular cepstrum order (for example, 50). The cepstrum order determination unit 20 performs the calculations of Equations (1) to (3) for all original impulse responses stored in the impulse response storage unit 10 while varying the cepstrum order, and calculates the amplitude ratio of each sample in the approximate impulse responses.

From the calculations of Equations (1) to (3) for all original impulse responses, the cepstrum order determination unit 20 calculates a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length. FIG. 3 is a diagram showing an example of such a regression model. In the case of FIG. 3, the cepstrum order determination unit 20 varies the cepstrum order from 30 to 80 in steps of 10, performs the calculations of Equations (1) to (3) for all original impulse responses, and calculates a linear model representing the relationship among cepstrum order, filter length, and amplitude ratio. The cepstrum order determination unit 20 is not limited to a linear model and can calculate the regression model using any suitable model.
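As an illustration of the linear model, the sketch below fits a straight line of filter length against cepstrum order for a single amplitude ratio using `numpy.polyfit`. The data points are hypothetical placeholders chosen to lie exactly on a line; they are not values read from FIG. 3, and the choice of fitting length as a function of order (rather than the reverse) is an assumption.

```python
import numpy as np

# Cepstrum orders swept from 30 to 80 in steps of 10, as in the description.
orders = np.array([30, 40, 50, 60, 70, 80], dtype=float)

# Hypothetical placeholder values standing in for the filter length observed
# at a 90 dB amplitude ratio (for example, averaged over all stored responses).
lengths_90db = np.array([106, 126, 146, 166, 186, 206], dtype=float)

# Least-squares straight line: length = slope * order + intercept.
slope, intercept = np.polyfit(orders, lengths_90db, deg=1)

def predict_length(order: float) -> float:
    """Filter length predicted by the fitted line for a given cepstrum order."""
    return slope * order + intercept
```

One such line would be fitted per amplitude ratio of interest. Real measurements would scatter around the line rather than fall on it.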

Based on the calculated regression model, the cepstrum order determination unit 20 determines the cepstrum order for a desired amplitude ratio and a desired filter length. The desired amplitude ratio indicates, relative to the maximum amplitude of the approximate impulse response, the amplitude ratio of the samples whose coefficients should be retained (not set to 0). Since the approximate impulse response represents the frequency envelope of the original impulse response, the desired amplitude ratio indicates the extent to which the amplitude-frequency characteristic and localization quality of the original impulse response should be maintained in the shortened impulse response. The desired filter length is a filter length shorter than that of the original impulse response, namely the shortened filter length used when implementing three-dimensional audio on a device without high computational capability, such as a mobile terminal.

For example, if the desired amplitude ratio and the desired filter length are 90 dB and 256 samples, respectively, the cepstrum order determination unit 20 determines the cepstrum order from the 90 dB regression model in FIG. 3. Here, margins must be provided before and after the filter because of the left-right time difference and the timing relationship with the head-related transfer functions of other directions. If 50 samples at each end of the 256-sample filter are used as margins, the required filter length is 256 - 50 × 2 = 156. Using the 90 dB regression line in FIG. 3, the cepstrum order determination unit 20 can determine that the cepstrum order for a filter length of 156 is 55.
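The margin arithmetic and the lookup on the regression line can be sketched as below. The line coefficients (slope 2.0, intercept 46.0) are hypothetical values chosen so that the sketch reproduces the worked numbers in the text (156 samples and order 55); the actual line in FIG. 3 is not available here.

```python
def required_length(desired_length: int, margin: int) -> int:
    """Samples left for the approximate response after reserving a margin
    at both ends of the desired filter."""
    return desired_length - 2 * margin

def order_for_length(length: float, slope: float, intercept: float) -> int:
    """Invert a line of the form length = slope * order + intercept."""
    return round((length - intercept) / slope)

usable = required_length(desired_length=256, margin=50)

# Hypothetical 90 dB regression line; not taken from FIG. 3.
order = order_for_length(usable, slope=2.0, intercept=46.0)
```

With a different regression line the same two steps apply; only the resulting order changes.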

The approximate impulse response generation unit 30 retains the coefficients from the 0th order up to the cepstrum order determined by the cepstrum order determination unit 20 (for example, 55), sets the other coefficients to 0, and generates the approximate impulse response imp_g of the original impulse response by Equation (2). The approximate impulse response generation unit 30 outputs the generated approximate impulse response imp_g to the impulse response reconstruction unit 60.

The impulse response division unit 40 calculates a minimum-phase transfer function using only the right-side coefficients of the real cepstrum given by Equation (1), and calculates an all-pass transfer function by dividing the original impulse response by the minimum-phase transfer function. The impulse response division unit 40 calculates the all-pass transfer functions for the pair of binaural original impulse responses and outputs them to the all-pass component time difference acquisition unit 50. The impulse response division unit 40 also calculates the maximum peak time t_ip of the original impulse response of the reference ear (for example, the left ear) of the pair, and outputs the calculated maximum peak time t_ip to the impulse response reconstruction unit 60.
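The decomposition above can be sketched with the standard cepstral construction of a minimum-phase spectrum. This sketch assumes that "right-side coefficients" means the positive-order half of the real cepstrum, folded in the usual way, and that the division is carried out in the frequency domain; the function name and the test signal are hypothetical.

```python
import numpy as np

def split_min_phase_all_pass(imp: np.ndarray, eps: float = 1e-12):
    """Return (minimum-phase spectrum, all-pass spectrum) of imp.

    Assumes len(imp) is even. eps guards log(0) and is an assumption.
    """
    n = len(imp)
    spectrum = np.fft.fft(imp)
    cep = np.real(np.fft.ifft(np.log(np.abs(spectrum) + eps)))
    # Fold the symmetric real cepstrum onto its right-hand (positive) side.
    folded = np.zeros(n)
    folded[0] = cep[0]
    folded[1:n // 2] = 2.0 * cep[1:n // 2]
    folded[n // 2] = cep[n // 2]
    h_min = np.exp(np.fft.fft(folded))
    h_ap = spectrum / h_min  # all-pass part: unit magnitude, excess phase
    return h_min, h_ap

# Hypothetical response: a minimum-phase pair (1 + 0.5 z^-4) delayed by 5 samples.
imp = np.zeros(512)
imp[5] = 1.0
imp[9] = 0.5
h_min, h_ap = split_min_phase_all_pass(imp)
ap_time = np.real(np.fft.ifft(h_ap))   # time-domain all-pass component
t_ip = int(np.argmax(np.abs(imp)))     # maximum peak time of the original
```

For this test signal the all-pass component is a pure 5-sample delay, so its time-domain form is a single impulse at sample 5. That is the kind of timing information the next unit compares between the two ears.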

The all-pass component time difference acquisition unit 50 calculates the left-right time difference t from the all-pass components of the pair of binaural original impulse responses using the cross-correlation function. For example, when the left ear is the reference ear, t < 0 indicates that the all-pass component of the right-ear original impulse response reaches its peak earlier than that of the left ear, and t > 0 indicates that the all-pass component of the left-ear original impulse response reaches its peak earlier than that of the right ear. The all-pass component time difference acquisition unit 50 outputs the left-right time difference t to the impulse response reconstruction unit 60.
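A minimal sketch of the time-difference estimate, using `numpy.correlate` on time-domain all-pass components and taking the lag of the correlation peak. The sign is arranged to match the convention above (left ear as reference, t > 0 when the left ear peaks first). The function name and the test impulses are hypothetical.

```python
import numpy as np

def interaural_time_difference(ap_left: np.ndarray, ap_right: np.ndarray) -> int:
    """Lag, in samples, at which the right-ear component best matches the left.

    Positive: the left ear peaks earlier. Negative: the right ear peaks earlier.
    """
    xcorr = np.correlate(ap_right, ap_left, mode="full")
    # In "full" mode, output index i corresponds to lag i - (len(ap_left) - 1).
    lags = np.arange(-(len(ap_left) - 1), len(ap_right))
    return int(lags[np.argmax(xcorr)])

# Hypothetical all-pass components: pure delays of 10 (left) and 13 (right) samples.
left = np.zeros(64)
left[10] = 1.0
right = np.zeros(64)
right[13] = 1.0
t = interaural_time_difference(left, right)
```

Here the left ear peaks three samples before the right ear, so t is positive.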

The impulse response reconstruction unit 60 reconstructs an impulse response of the desired filter length (for example, 256 samples) from the approximate impulse response imp_g, the left-right time difference t, and the maximum peak time t_ip of the impulse response. The impulse response reconstruction unit 60 first shifts the approximate impulse response imp_g of the reference ear (for example, the left ear) to the right based on the maximum peak time t_ip. Here, the left margin of the impulse response is set to 50 samples, and the shift amount is obtained taking into account an offset value chosen so that the maximum peak time t_ip of the left-ear impulse response of the original head-related transfer function whose sound source is localized on the median plane becomes 50 samples.

The impulse response reconstruction unit 60 can obtain the offset value and the shift amounts for both ears by the following equations.
Offset value = maximum peak time t_ip of the left-ear impulse response for median-plane localization - 50
Right shift amount (left ear) = maximum peak time t_ip of the left-ear impulse response for each direction - offset value
Right shift amount (right ear) = right shift amount (left ear) + t
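The three shift equations can be written directly in code. In the sketch below, the function names, the sample peak times, and the `place` helper that writes a shifted response into a fixed-length buffer are assumptions for illustration; in particular, the patent text does not specify how samples that would fall outside the buffer are handled.

```python
import numpy as np

def shift_amounts(t_ip_median_left: int, t_ip_left: int, t: int,
                  left_margin: int = 50):
    """Offset and right-shift amounts for both ears, per the equations above."""
    offset = t_ip_median_left - left_margin
    shift_left = t_ip_left - offset
    shift_right = shift_left + t
    return offset, shift_left, shift_right

def place(imp_g: np.ndarray, shift: int, length: int = 256) -> np.ndarray:
    """Write imp_g into a zeroed buffer starting at index shift, truncating
    anything past the end (assumed handling, not stated in the text)."""
    if not 0 <= shift < length:
        raise ValueError("shift must lie inside the output buffer")
    out = np.zeros(length)
    n = min(len(imp_g), length - shift)
    out[shift:shift + n] = imp_g[:n]
    return out

# Hypothetical peak times: 70 samples on the median plane, 64 for this direction,
# with a left-right time difference of 12 samples.
offset, shift_left, shift_right = shift_amounts(70, 64, 12)
left_ir = place(np.array([1.0, 0.5, 0.25]), shift_left)
right_ir = place(np.array([1.0, 0.5, 0.25]), shift_right)
```

For a median-plane direction with t = 0, both shift amounts equal the 50-sample left margin, which matches the stated purpose of the offset value.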

In this way, the impulse response reconstruction unit 60 can shift the approximate impulse response imp_g in the time direction and reconstruct a shortened impulse response for each of the pair of binaural original impulse responses. The impulse response reconstructed by the impulse response reconstruction unit 60 has a filter length (for example, 256 samples) shorter than that of the original impulse response (for example, 512 samples). In addition, because the approximate impulse response imp_g is generated with a cepstrum order that satisfies the desired amplitude ratio, the amplitude-frequency characteristic and localization quality of the original impulse response are also maintained.

As described above, according to the present embodiment, the cepstrum order determination unit 20 calculates a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response, which is obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response, and determines the cepstrum order for a desired amplitude ratio and a desired filter length based on the regression model. The approximate impulse response generation unit 30 generates an approximate impulse response of the original impulse response with the determined cepstrum order, and the impulse response reconstruction unit 60 reconstructs an impulse response of the desired filter length using the approximate impulse response. This makes it possible to shorten the filter length while retaining the features of the amplitude-frequency characteristic of the original impulse response. That is, to reduce the computational load when a three-dimensional acoustic signal is reproduced over headphones using a device without high computational capability, such as a mobile phone, the filter length can be shortened optimally while maintaining the amplitude-frequency characteristic and localization quality of the impulse response used to compute the three-dimensional acoustic signal.

Although the present invention has been described with reference to the drawings and embodiments, it should be noted that those skilled in the art can easily make various variations and modifications based on the present disclosure. Such variations and modifications are therefore included in the scope of the present invention. For example, the functions included in each component, each step, and so on can be rearranged so long as no logical contradiction arises, and a plurality of components or steps can be combined into one or divided.

The present invention can also be realized as a program that causes a processor of the impulse response length conversion device 1 to execute equivalent processing (steps), and it should be understood that such programs are also included in the scope of the present invention. For example, the impulse response length conversion device 1 can store, in a storage unit, a program describing the processing that realizes each function, and a central processing unit (CPU) can read and execute the program.

According to the present invention, the filter length of an impulse response, which is the time-domain representation of a head-related transfer function, can be shortened while maintaining the amplitude-frequency characteristic and localization quality, so that three-dimensional acoustic signals can be computed on devices without high computational capability, such as mobile phones.

DESCRIPTION OF SYMBOLS
1 Impulse response length conversion device
10 Impulse response storage unit
20 Cepstrum order determination unit
30 Approximate impulse response generation unit
40 Impulse response division unit
50 All-pass component time difference acquisition unit
60 Impulse response reconstruction unit

Claims

An impulse response length conversion device comprising:
an impulse response storage unit that stores an original impulse response, which is a time-domain representation of a head-related transfer function;
a cepstrum order determination unit that calculates a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response, and that determines, based on the regression model, a cepstrum order for a desired amplitude ratio and a desired filter length;
an approximate impulse response generation unit that generates an approximate impulse response of the original impulse response with the determined cepstrum order; and
an impulse response reconstruction unit that reconstructs an impulse response of the desired filter length using the approximate impulse response.

An impulse response length conversion method in an impulse response length conversion device including an impulse response storage unit that stores an original impulse response, which is a time-domain representation of a head-related transfer function,
wherein the processing procedure performed by the impulse response length conversion device comprises the steps of:
calculating a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response;
determining, based on the regression model, a cepstrum order for a desired amplitude ratio and a desired filter length;
generating an approximate impulse response of the original impulse response with the determined cepstrum order; and
reconstructing an impulse response of the desired filter length using the approximate impulse response.

An impulse response length conversion program that causes an impulse response length conversion device, including an impulse response storage unit that stores an original impulse response that is a time-domain representation of a head-related transfer function, to execute the steps of:
calculating a regression model representing the relationship among cepstrum order, amplitude ratio, and filter length from the amplitude ratios between the maximum amplitude and the other amplitudes of an approximate impulse response obtained by representing the original impulse response with a cepstrum of an order lower than the filter length of the original impulse response and converting it back into an impulse response;
determining, based on the regression model, a cepstrum order for a desired amplitude ratio and a desired filter length;
generating an approximate impulse response of the original impulse response with the determined cepstrum order; and
reconstructing an impulse response of the desired filter length using the approximate impulse response.