JP4306815B2

JP4306815B2 - Stereophonic sound processor using linear prediction coefficients

Info

Publication number: JP4306815B2
Application number: JP04610596A
Authority: JP
Inventors: 直司松尾
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-03-04
Filing date: 1996-03-04
Publication date: 2009-08-05
Anticipated expiration: 2016-03-04
Also published as: JPH09247799A

Abstract

PROBLEM TO BE SOLVED: To provide a stereophonic sound effect to a listener in a reproduction sound field, especially through headphones or the like. SOLUTION: In this sound processing apparatus, a desired acoustic characteristic to be added to an original signal is formed by a linear synthesis filter whose filter coefficients are the linear prediction coefficients obtained by linear prediction analysis of an impulse response representing the acoustic characteristic, and the desired acoustic characteristic is added to the original signal through the linear synthesis filter. The power spectrum of the impulse response representing the acoustic characteristic is divided into a plurality of critical bandwidths, and the linear prediction analysis is performed on an impulse signal obtained from the power spectrum signals representing the signal sound of each critical bandwidth, to obtain the filter coefficients of the linear synthesis filter.

Description

【０００１】
【発明の属する技術分野】
本発明は音響処理技術に関し、特にヘッドホン等を通した再生音場において聴取者に立体的な音響効果を提供する立体音響処理装置に関するものである。
【０００２】
【従来の技術】
一般に、音像を正確に再現し若しくは定位させるには、音源から聴取者までの原音場の音響特性と、スピーカ又はヘッドホン等の音響出力機器から聴取者までの再生音場の音響特性とを得ることが必要となる。実際の再生音場は、音源からの信号に前者の音響特性を付加し、且つその信号から後者の音響特性を除去することによって、スピーカ又はヘッドホンを用いた場合でも原音場の音像を聴取者に正確に再現することができる。
【０００３】
図１は、音源１０と聴取者１１からなる原音場の一例を示したものである。
図１において、音源１０から聴取者１１の左右の耳（l ，ｒ）に至る各音響空間経路は、それらに対応する伝達特性Ｓｌ，Ｓr で示される。
【０００４】
図２は、図１と同じ再生音場を聴取者１１に与えるための電気的な等価回路構成を示したものである。
図２において、図１の各音響空間経路の伝達特性Ｓｌ，Ｓr は対応する音響特性付加フィルタ（Ｓ→ｌ）１２，（Ｓ→r ）１３を用いて与えられる。なお、図２では聴取者１１の左右のヘッドホン１６，１７の音響特性を打ち消すために、さらに各ヘッドホンの逆特性ｈ^-1１４，１５が付加されている。その結果、聴取者１１は本構成によって図１と同じ位置に同じ音像１０を得ることが可能となる。
【０００５】
図３は、図２で示した音響特性付加フィルタ１２，１３のフィルタ係数を求めるための一構成例を示したものである。
図３では、まず無響室で測定したインパルスレスポンスの自己相関係数処理１８が行われる。前記処理によって得られた自己相関係数にさらに線形予測解析処理１９を行って線形予測係数を求める。そして、ＩＩＲフィルタを使いそのフィルタ係数として前記線形予測係数を用いることで前記音響特性付加フィルタ１２，１３を構成する。この場合、それ以前のＦＩＲフィルタを使った場合と比較して大幅にフィルタタップ数を削減することができる。
【０００６】
図４は、複数の仮想音源間の出力補間によって音像を移動させる一例を示したものである。
図４の（ａ）には、３個の仮想音源（Ａ〜Ｃ）２０−１〜２０−３を聴取者１１の前方に配置した例が示されている。
また、図４の（ｂ）には前記複数の仮想音源２０−１〜２０−３間に音像を定位させるための回路構成例が示されている。
【０００７】
図４の（ｂ）では、各仮想音源２０−１〜２０−３の位置と対応し、前記各音源から聴取者１１の左右の耳に至るそれぞれの音響空間経路の伝達特性に応じて３種類の音響特性付加フィルタ対２１及び２２，２３及び２４，そして２５及び２６が与えられている。各音響特性付加フィルタは、音響空間経路の伝達特性を示すフィルタ係数と入力信号に対するフィルタ演算出力結果を保持するフィルタメモリとを有し、前記演算出力は次段の可変増幅器（ｇＡ〜ｇＣ）２７〜３２に入力される。それらの増幅された出力は聴取者１１の左右の耳に対応する加算器３３〜３６で加算され、図２で示した音響特性付加フィルタ１２，１３の各出力となる。以降は、図２の説明と同様である。
【０００８】
ここで、例えば前記可変増幅器（ｇＡ，ｇＢ）２７〜３０の各ゲインを変える出力補間によって、図４の（ａ）に示すように仮想音源（Ａ）２０−１と仮想音源（Ｂ）２０−２との間に音像を滑らかに移動させることができる。仮想音源（Ｂ）２０−２と仮想音源（Ｃ）２０−３との間も同様に処理できる。
なお、上述した各従来技術の詳細については本願と同一発明者及び出願人による特願平７−２３１７０５号を参照されたい。
【０００９】
【発明が解決しようとする課題】
しかしながら、図３で示したように線形予測解析を行って元のインパルスレスポンスのサンプル数よりも少ない次数の合成フィルタ（ＩＩＲフィルタ）や予測フィルタ（ＦＩＲフィルタ）を用いて周波数特性を近似する場合に、特に元のインパルスレスポンスの周波数特性が急峻な山や谷の部分を有する複雑な場合には、その近似精度が低下するという問題があった。
【００１０】
また、その場合には前記合成フィルタのインパルスレスポンスの時間領域の波形と元のインパルスレスポンスの波形とが大きく異なり、そのため聴取者１１に対する両耳間時間差及びそのレベル差の制御が困難になるという問題があった。
【００１１】
さらにまた、図４の（Ｂ）に示すように仮想音源の出力補間によって音像定位処理を行う場合に、各仮想音源対応に備えられたフィルタ係数とフィルタメモリを用い、例え音像が仮想音源ＡとＢとの間にのみあるときでも仮想音源Ｃに関してそのフィルタ係数とフィルタメモリを用いた定位処理が行われていた。そのため、図２に示す再生音場をＤＳＰ（Digital Signal Processor) 等を用いて実現する場合に、前記音響特性付加フィルタの数が多いと演算処理手順やメモリ、レジスタ等の管理が複雑になるという問題があった。
【００１２】
そこで本発明の目的は、上記各問題点に鑑み、（１）自己相関係数を求めて線形予測解析を行う前に、聴覚上変化が無いように元のインパルスレスポンスの周波数特性を滑らかにし、さらに合成フィルタのインパルスレスポンスの時間領域での波形を補整して元のインパルスレスポンスの周波数特性に近づけること、（２）全体の音響特性を変化させること無くフィルタの数を減らすこと、そして（３）所望の音像定位を行うために必要な仮想音源の定位処理のみ行うこと、を実現する立体音響処理装置を提供することにある。
【００１３】
【課題を解決するための手段】
本発明によれば、原信号に付加する所望の音響特性を、その音響特性を表すインパルスレスポンスの線形予測解析によって得られる線形予測係数をフィルタ係数とする線形合成フィルタによって形成し、前記線形合成フィルタを通して前記原信号に所望の音響特性を付加する立体音響処理装置であって、前記音響特性を表すインパルスレスポンスのパワースペクトラムを複数の臨界帯域に分割し、前記各臨界帯域内の信号音を代表させたパワースペクトラム信号の情報が無い帯域の振幅スペクトルを近似的に計算するように補間した結果から求めたインパルス信号を基に前記線形予測解析を行って前記線形合成フィルタのフィルタ係数を求めることを特徴とする線形予測係数を用いた立体音響処理装置が提供される。
【００１４】
前記各臨界帯域内の信号音を代表するペクトラム信号には、各臨界帯域内のパワースペクトラムの累積加算値、最大値又は平均値が用いられる。
また、前記各臨界帯域内の信号音を代表させたパワースペクトラム信号間の出力補間が行われ、前記出力補間信号から求めたインパルス信号を基に前記線形予測解析を行って前記線形合成フィルタのフィルタ係数が求められる。前記出力補間には、１次の直線補間や高次のテイラー級数を用いた補間が用いられる。
【００１５】
さらに、前記音響特性を示すインパルスレスポンスとして原音場における伝達経路と再生音場の逆特性を持つ伝送経路を直列に結合した場合の音響特性を示すインパルスレスポンスを用い、そして前記結合したインパルスレスポンスを基に線形予測係数を求める前記線形合成フィルタとして原音場における音響特性を付加するフィルタと再音場における音響特性を除去するフィルタを１つに結合したフィルタが用いられる。また、前記線形予測係数を用いた線形合成フィルタのインパルスレスポンスと前記音響特性を示すインパルスレスポンスとの間の誤差を小さくする補整用フィルタが用いられる。
【００１６】
また本発明によれば、複数の仮想音源からのレベル制御によって音像を定位させる立体音響処理装置であって、その間に音像が定位する隣接した２つの前記仮想音源に対して与えられ、前記仮想音源から聴取者までの各音響空間経路の音響特性を示すインパルスレスポンスを付加する音響特性付加フィルタを有し、
前記音響特性付加フィルタは、前記隣接した２つの仮想音源のフィルタ演算パラメータを記憶し、音像が前記２つの仮想音源の内の１つを含む新たな隣接区間へ移動する際には前記１つの仮想音源に対応する音響特性フィルタの演算パラメータを変えることなく、もう一方の音響特性フィルタの演算パラメータを前記新たな隣接区間に存する仮想音源のものに更新する立体音響処理装置が提供される。
【００１７】
上記本発明によれば、音響特性を示すインパルスレスポンスを周波数領域において臨界帯域幅を考慮して変更する。そして、その結果から自己相関係数を求める。前記臨界帯域幅を考慮して変更する場合に人間の聴覚は位相のずれには鈍感なため、フェーズスペクトラムについては考慮しなくてもよい。臨界帯域幅を考慮して、聴覚上変化が無いように元のインパルレスポンスを滑らかにすることにより、少ない次数の線形予測係数を用いて周波数特性を近似する場合の近似精度を高くすることができる。
【００１８】
また、合成フィルタのインパルスレスポンスの時間領域での波形を補整することにより、両耳間時間差とレベル差の制御が容易になる。これによって、全体の音響特性を変化させることなく、フィルタの数を減らすことができ、ＤＳＰ等を用いた実現が容易となり、さらに所望の音像定位を行うために必要な仮想音源の定位処理のみ行うことで必要な処理量とメモリ量を小さくすることができる。
【００１９】
【発明の実施の形態】
図５は、本発明により音響特性を付加する線形予測係数を求めるための原理構成を示したものである。なお、以降の各図面を用いた説明において、従来例と同じものには同一の符号が付されており、それらについては改めて説明しない。
図５の（ａ）は、本発明の最も基本的な処理ブロック構成を示したものである。インパルスレスポンスは先ず本発明による臨界帯域幅を考慮した前処理を行うための臨界帯域幅前処理部１１０に入力される。なお、本例における自己相関係数計算部１８及び線形予測解析部１９は図３の従来例と同じものである。
【００２０】
ところで、「臨界帯域幅」とは、“フレッチャー(Fletcher)”の定義によれば中心周波数が連続的に変化する帯域フィルタで、（１）信号音に一番近い中心周波数を持つ帯域フィルタが信号音の周波数分析を行い、（２）信号音のマスキングに影響を及ぼす雑音成分はこの帯域フィルタ内の周波数成分に限られるような帯域フィルタのバンド幅をいう。
【００２１】
前記帯域フィルタは「聴覚フィルタ」とも呼ばれ、そのフィルタの中心周波数とバンド幅との間には、中心周波数が低い場合には臨界帯域幅は狭く、反対に中心周波数が高い場合には広くなることが種々の測定から確認されている。例えば、中心周波数が５００Ｈｚ以下では臨界帯域幅はほぼ一定の１００Ｈｚとなる。
【００２２】
そして、中心周波数f と臨界帯域の関係を数式で表したのがバーク(Bark)尺度である。バーク尺度は下記の式で与えられる。
Bark=13arctan（0.76f)+3.5arctan((f/7.5)²)
ここで、バーク尺度１．０は上記臨界帯域幅に相当し、従って上記臨界帯域幅の定義とも合まってバーク尺度１．０で分割された帯域信号は聴覚的に識別し得る信号音を表すことになる。
【００２３】
図５に戻って、図５の（ｂ）及び（ｃ）は、図５の（ａ）の臨界帯域幅前処理部１１０の内部ブロック構成例を示したものである。ここでは、図６〜図１０に示す臨界帯域処理の実施例を参照しながら説明する。
図５の（ｂ）及び（ｃ）において、インパルスレスポンス信号はＦＦＴ(First Fourier Transform) 処理部１１１で高速フーリエ変換によって時間領域信号から周波数領域信号に変換される。図６には、無響室で測定され、聴取者に対し左前方４５度の音源から左耳までの音響空間経路のインパルスレスポンスのパワースペクトラムの一例が示されている。
【００２４】
前記周波数領域信号は、次段の臨界帯域処理部１１２，１１４において上述したバーク尺度１．０の複数の帯域に分割され、図５の（ｂ）の場合には各臨界帯域内のパワースペクトラムの累積加算値が、また図５の（ｃ）の場合には各臨界帯域内のパワースペクトラムの最大値又は平均値がその帯域信号を代表する信号音として求められる。図７は、図６のパワースペクトラムを臨界帯域幅で分割し、図５の（ｃ）で示した各帯域におけるパワースペクトラムの最大値を求めた例を示したものである。
【００２５】
また、臨界帯域処理部１１２，１１４では、さらに前記各臨界帯域毎に求めたパワースペクトラムの累積加算値、最大値又は平均値の間を相互に滑らかに結ぶ出力補間処理が行われる。前記補間には、一次の直線補間や高次のテイラー級数による補間等が行われる。図８は、図７のパワースペクトルを出力補間することによって滑らかにしたパワースペクトラムの一例を示している。
【００２６】
最後に、前記滑らかにしたパワースペクトラムを逆ＦＦＴ部１１３で逆フーリエ変換することにより周波数領域の信号を時間領域の信号に復元する。ここで、フェーズスペクトルは、元のインパルスレスポンスのフェーズスペクトルをそのまま使用している。前記復元されたインパルスレスポンス信号のこれ以降の処理については、図３で説明した従来例と同様である。
【００２７】
このように、本発明によれば臨界帯域幅を用いて聴覚上の変化が生じないように信号音の特徴部分を抽出し、それを滑らかに補間処理した後に近似としての元のインパルレスポンスを復元する。これにより、本発明のように特に少ない次数の線形予測係数を用いて周波数特性を近似する場合に、複雑な元のインパルレスポンスから直接周波数特性を近似する従来例と比較してその近似精度を大幅に向上させることができる。
【００２８】
図９は、図５の（ａ）の処理によって得られた線形予測係数（ａｎ，..．，ａ２, ａ１）を用いた合成フィルタ（ＩＩＲ）１２１の一回路構成例を示したものでありる。図１０は、図９の線形予測係数を用いた１０次の合成フィルタを使って近似処理後のインパルスレスポンスから求めたパワースペクトラムの一例を示したものである。これから、パワースペクトラムの山の部分の近似精度が向上しているのが分かる。
【００２９】
図１１は、図９に示す線形予測係数を用いた合成フィルタ１２１の特性を補整する処理構成例を示したものである。
図１１では、音響特性付加フィルタ１２０として前記線形予測係数を用いた合成フィルタ１２１に加えて、補整用フィルタ１２２が直列に接続される。図１２及び図１３には補整用フィルタ１２２の一例がそれぞれ示されており、図１２では周波数領域における谷の特性部分を近似するための予測フィルタ（ＦＩＲ）の例が、また図１３では時間領域における両耳間遅延時間差やレベル差を補整するための遅延・増幅回路の例が示されている。
【００３０】
図１１に示すように、実際の音響特性を表すインパルスレスポンス信号を誤差計算部１３０の一方の入力に与え、前記音響特性付加フィルタ１２０にはインパルス信号を入力する。前記インパルス信号の入力によって音響特性付加フィルタ１２０の出力には時間領域の音響特性付加フィルタ特性信号が出力される。それを前記誤差計算部１３０の他方の入力に与え、前記実際の音響特性を表すインパルスレスポンス信号と比較する。そして、前記比較による誤差分を小さくするよう補整用フィルタ１２２を調整する。
【００３１】
一例として、図１２に示すｎ次のＦＩＲフィルタ１２２を用いて、合成フィルタ１２１のインパルスレスポンスの時間領域における波形の補整を行う場合について説明する。ここで、フィルタ係数ｃ０，ｃ１，．．．，ｃｐは次のようにして求められる。合成フィルタのインパルスレスポンスをｘ、元のインパルスレスポンスをｙとすると次式が成立する。ここで、ｑ≧ｐとする。
【００３２】
【数１】

【００３３】
上式の左辺の要素ｘ（０），．．．，ｘ（ｑ）の行列をＸ、要素ｃ０，．．．，ｃｐのベクトルをＣとし、右辺のベクトルをＹとすると次式により、フィルタ係数ｃ０，ｃ１，．．．，ｃｐが求まる。
Ｘｃ＝Ｙ
Ｘ^TＸｃ＝Ｘ^TＹ
ｃ＝（Ｘ^TＸ）^-1Ｘ^TＹ
また、最急降下法により求める方法もある。
【００３４】
図１４は、前記補整用フィルタ１２２を使って線形予測係数を用いた合成フィルタ１２１の周波数特性を変更した一例を示している。図１４の点線波形は補整前の合成フィルタ１２１の周波数特性の一例を示しており、図１４の実線波形は図１２の予測フィルタ１２２を使ってそれを補整した一例を示している。この補整によって前記周波数特性の谷の部分の特性が明瞭になったのが分かる。
【００３５】
図１５は、上述した本発明の１応用例を示したものである。
図２で説明したように、従来は音響特性付加フィルタ１２，１３とヘッドホン特性の逆特性フィルタ１４，１５とをそれぞれ別々に求め、それらを直列接続する構成としていた。この場合、例えば前段のフィルタ１２（又は１３）で１２８タップ及び後段のフィルタ１４（又は１５）で１２８タップをそれぞれ使用すると仮定した場合に、それらを直列接続して信号の収束を保証するためにはその約２倍の２５５タップが必要であった。
【００３６】
それに対し、図１５では最初から音響特性付加フィルタとヘッドホンの逆特性フィルタとを結合した１つのフィルタ１４１又は１４２を用いる。本発明によれば、図５の（ａ）に示すように音響特性の線形予測解析を行う前に臨界帯域幅を考慮した前処理１１０が行われる。その処理過程で上述したように聴覚上の変化が生じない範囲で信号音の特徴部分の抽出と補間処理が行われる。その結果、より少ない次数の線形予測係数を用いて周波数特性が近似され、従来のように前段と後段を直列接続する場合と比べて大幅なフィルタ回路の簡略化が可能となる。
【００３７】
図１６は、ヘッドホンのパワースペクトラムの逆特性（ｈ^-1）の一例を示したものである。また、図１７は、実際の音響特性とヘッドホンの逆特性の結合フィルタ（Ｓ→ｌ・ｈ^-1）のパワースペクトラムの一例を示したものである。図１８は、図１７のパワースペクトラムを臨界帯域幅で分割して各帯域における最大値で代表させた結果を示したものである。そして、図１９は、図１８のパワースペクトラムの代表値に補間処理を行った場合の例を示している。図１７と図１９のパワースペクトラムを比較すると、後者の方がより少ない次数の線形予測係数を用いてより正確に近似できることが分かる。
【００３８】
図２０は、本発明により複数の仮想音源間の出力補間で音像定位を行う処理の原理構成を示したものである。
図２０の（ａ）では、２個所の仮想音源（Ａ，Ｂ）２０−１及び２０−２から聴取者１１の左右の耳に至るまでの各音響空間経路の伝達特性を付加するために、４個の音響特性演算用メモリ１５１〜１５４が設けられている。そして、前記仮想音源（Ａ）２０−１と仮想音源（Ｂ）２０−２との間に音像を定位させ若しくはスムーズに移動させるために次段の増幅器２７〜３０の各ゲインが調整される。
【００３９】
次に、図２０の（ｂ）に示すように、前記音像を続く次の仮想音源（Ｂ，Ｃ）２０−２及び２０−３の間に定位若しくは移動させる場合に、前記４個の音響特性演算用メモリ１５１〜１５４の内、仮想音源（Ａ）２０−１用に割り当てられていた２個の音響特性演算用メモリ１５１及び１５２が仮想音源（Ｃ）２０−３のために割り当てられる。この場合、仮想音源（Ｂ）２０−２の音響特性演算用メモリ１５３及び１５４は変更されることなくそのまま使用される。そして、図２０の（ａ）と同様に前記仮想音源（Ｂ）２０−２と仮想音源（Ｃ）２０−３との間に音像を定位させ若しくはスムーズに移動させるために次段の増幅器２７〜３０の各ゲインが調整される。
【００４０】
すなわち、上記構成によれば（１）音響特性演算用メモリは２個の仮想音源に対応するだけでよく、また次段の増幅器やその出力加算回路も同様である。（２）音像の移動によって発音区域外となった仮想音源（上記の例ではＡ）の音響特性演算用メモリは、新たに発音区域内に置かれる仮想音源（上記の例ではＣ）用の音響特性演算用メモリとして使用される。そして（３）前記いずれの発音区域にも属する仮想音源（上記の例ではＢ）はそのまま音響特性演算用メモリの使用を継続する。
【００４１】
これより、（１）から音像の移動に必要なメモリ量等のハードウェアが最小限に抑えられ、その結果演算制御も簡易で高速なものとなる。。また、（２）及び（３）から発音区域の切り換わりの際には（３）の仮想音源（Ｂ）のみが発音し、他の仮想音源（Ａ，Ｃ）の増幅器ゲインはゼロである。従って、上記発音区間の切り換わりによるクリック音は発生しない。
【００４２】
図２１及び図２２は、図２０のより具体的な実施例を示したものである。
いずれも新たに音像の位置情報が与えられ、それからフィルタ係数やメモリの選択設定を行うメモリ制御部１５５と、増幅器２７〜３０の各音像位置に対するゲイン計算を行うゲイン制御部１５６とを有している。図２１は図２０の（ａ）に対応し、そして図２２は図２０の（ｂ）にそれぞれ対応している。
【００４３】
【発明の効果】
以上述べたように、本発明の立体音響処理装置によれば、臨界帯域幅を考慮し、それによって聴覚上変化が無いように元のインパルレスポンスを滑らかにすることで、少ない次数の線形予測係数を用いて周波数特性を近似する場合の近似精度を高くすることができる。その際、合成フィルタのインパルスレスポンスの時間領域での波形を補整することにより、両耳間時間差とレベル差等の制御も容易にすることができる。
【００４４】
さらに、本発明により所望の音像定位を行う際に必要な仮想音源の定位処理のみを行うことで、必要な処理量とメモリ量を必要最小限にすると共に仮想音源切り換わり時のクリック音の発生を防止することができる。
このように、本発明によれば全体の音響特性を変化させることなく、フィルタの数を減らし、その結果ＤＳＰ等を用いた立体音像の制御実現を容易に実現することができる。
【図面の簡単な説明】
【図１】従来の音像定位技術の説明図（１）である。
【図２】従来の音像定位技術の説明図（２）である。
【図３】従来の音像定位技術の説明図（３）である。
【図４】従来の音像定位技術の説明図（４）である。
【図５】本発明により音響特性を付加するための線形予測係数を求めるための基本原理図である。
【図６】音響空間経路のインパルスレスポンスのパワースペクトラムの一例を示した図である。
【図７】図６に示すパワースペクトラムを臨界帯域幅で分割してそのパワースペクトラムの最大値で代表させた例を示す図である。
【図８】図７に示すパワースペクトラムの出力補間によって滑らかなパワースペクラムを得る一例を示した図である。
【図９】線形予測係数を用いた合成フィルタの一構成例を示した図である。
【図１０】本発明による線形予測係数を用いた１０次の合成フィルタのパワースペクトラムの一例を示した図である。
【図１１】本発明による線形予測係数を用いた合成フィルタの補整処理の一構成例を示した図である。
【図１２】予測フィルタの一例を示した図である。
【図１３】遅延・増幅回路の一例を示した図である。
【図１４】補整フィルタにより周波数特性の補整を行った一例を示した図である。
【図１５】本発明によって音響特性付加フィルタとヘッドホンの逆特性を結合した例を示した図である。
【図１６】ヘッドホンのパワースペクトラムの逆特性の一例を示した図である。
【図１７】音響特性付加フィルタとヘッドホンの逆特性の結合フィルタによるパワースペクトラムの一例を示した図である。
【図１８】図１７に示すパワースペクトラムを臨界帯域幅で分割してその最大値で代表させた一例を示した図である。
【図１９】図１８のパワースペクトラムを補間した一例を示す図である。
【図２０】本発明による仮想音響空間の音像定位のための基本構成を示した図である。
【図２１】図２０の（ａ）の具体例を示した図である。
【図２２】図２０の（ｂ）の具体例を示した図である。
【符号の説明】
１８…自己相関係数計算部
１９…線形予測解析部
１１０…臨界帯域幅前処理部
１１１…高速フーリエ変換処理部
１１２…臨界帯域内累積加算部
１１３…逆高速フーリエ変換処理部
１１４…臨界帯域内最大／平均処理部
１２２…補整用フィルタ部
１４１，１４２…結合フィルタ
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to sound processing technology, and more particularly to a three-dimensional sound processing apparatus that provides a three-dimensional sound effect to a listener in a reproduction sound field through headphones or the like.
[0002]
[Prior art]
In general, accurately reproducing or localizing a sound image requires both the acoustic characteristics of the original sound field, from the sound source to the listener, and the acoustic characteristics of the reproduction sound field, from a sound output device such as speakers or headphones to the listener. In the actual reproduction sound field, the former characteristics are added to the signal from the sound source and the latter are removed from it, so that the sound image of the original sound field is accurately reproduced for the listener even through speakers or headphones.
[0003]
FIG. 1 shows an example of an original sound field composed of a sound source 10 and a listener 11.
In FIG. 1, the acoustic spatial paths from the sound source 10 to the left and right ears (l, r) of the listener 11 are denoted by the corresponding transfer characteristics Sl and Sr.
[0004]
FIG. 2 shows an electrical equivalent circuit configuration for giving the listener 11 the same sound field as in FIG. 1.
In FIG. 2, the transfer characteristics Sl and Sr of the acoustic spatial paths of FIG. 1 are given by the corresponding acoustic characteristic addition filters (S → l) 12 and (S → r) 13. In addition, the inverse characteristics h⁻¹ 14 and 15 of the headphones are applied in order to cancel the acoustic characteristics of the left and right headphones 16 and 17 of the listener 11. As a result, this configuration gives the listener 11 the same sound image 10 at the same position as in FIG. 1.
[0005]
FIG. 3 shows a configuration example for obtaining the filter coefficients of the acoustic characteristic addition filters 12 and 13 shown in FIG. 2.
In FIG. 3, autocorrelation coefficient processing 18 is first applied to an impulse response measured in an anechoic chamber. Linear prediction analysis processing 19 is then applied to the resulting autocorrelation coefficients to obtain linear prediction coefficients. The acoustic characteristic addition filters 12 and 13 are built as IIR filters that use these linear prediction coefficients as their filter coefficients. Compared with the earlier approach of using FIR filters, this greatly reduces the number of filter taps.
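The two steps of FIG. 3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the Levinson-Durbin recursion as the linear prediction analysis method (the text does not name one), and the function names and the sign convention A(z) = 1 + a[1] z^-1 + ... are choices made for this example.

```python
def autocorrelation(h, max_lag):
    """Return r[0..max_lag] with r[k] = sum_n h[n] * h[n + k]."""
    n = len(h)
    return [sum(h[i] * h[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations (assumed method).

    Returns (a, err) where a[0] == 1.0, the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    residual prediction error energy. Assumes r[0] > 0.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a, err


# Hypothetical test signal: the impulse response of a one-pole filter.
h = [0.5 ** n for n in range(60)]
a, err = levinson_durbin(autocorrelation(h, 2), 2)
```

For this test signal a second-order analysis recovers the single pole (a[1] close to -0.5, a[2] close to 0), which illustrates why a handful of coefficients can stand in for a much longer FIR response.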
[0006]
FIG. 4 shows an example of moving a sound image by output interpolation between a plurality of virtual sound sources.
FIG. 4A shows an example in which three virtual sound sources (A to C) 20-1 to 20-3 are arranged in front of the listener 11.
FIG. 4B shows a circuit configuration example for localizing a sound image between the plurality of virtual sound sources 20-1 to 20-3.
[0007]
In FIG. 4B, three pairs of acoustic characteristic addition filters, 21 and 22, 23 and 24, and 25 and 26, are provided. They correspond to the positions of the virtual sound sources 20-1 to 20-3 and to the transfer characteristics of the acoustic spatial paths from each sound source to the left and right ears of the listener 11. Each acoustic characteristic addition filter has filter coefficients representing the transfer characteristic of its acoustic spatial path and a filter memory that holds the filter calculation output for the input signal; the calculation output is fed to the variable amplifiers (gA to gC) 27 to 32 of the next stage. The amplified outputs are summed by the adders 33 to 36 corresponding to the left and right ears of the listener 11 and become the outputs of the acoustic characteristic addition filters 12 and 13 shown in FIG. 2. The rest is the same as described for FIG. 2.
[0008]
Here, for example, output interpolation that varies the gains of the variable amplifiers (gA, gB) 27 to 30 can move the sound image smoothly between the virtual sound source (A) 20-1 and the virtual sound source (B) 20-2, as shown in FIG. 4A. The same processing can be performed between the virtual sound source (B) 20-2 and the virtual sound source (C) 20-3.
For details of the above-described conventional techniques, refer to Japanese Patent Application No. 7-231705 by the same inventor and applicant as the present application.
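The output interpolation between two virtual sound sources can be sketched as a gain crossfade. The linear gain law and the names `pan_gains` and `mix_ear` are assumptions made for illustration; the text only says that the amplifier gains are varied.

```python
def pan_gains(position):
    """Gains (g_a, g_b) for a sound image between sources A and B.

    position = 0.0 places the image at A, 1.0 at B. A linear crossfade
    is assumed here; the source text does not specify the gain law.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie in [0, 1]")
    return 1.0 - position, position


def mix_ear(filtered_a, filtered_b, position):
    """Weighted sum of the two filter outputs for one ear (the adder stage)."""
    g_a, g_b = pan_gains(position)
    return [g_a * sa + g_b * sb for sa, sb in zip(filtered_a, filtered_b)]


# Hypothetical filter outputs for one ear, image halfway between A and B.
left = mix_ear([1.0, 2.0], [3.0, 4.0], 0.5)
```

The same function would be called once per ear, with the outputs of the left-ear and right-ear filters of each virtual sound source.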
[0009]
[Problems to be solved by the invention]
However, when linear prediction analysis is performed as shown in FIG. 3 and the frequency characteristics are approximated using a synthesis filter (IIR filter) or prediction filter (FIR filter) whose order is lower than the number of samples of the original impulse response, the approximation accuracy falls, particularly when the frequency characteristics of the original impulse response are complicated and contain steep peaks and valleys.
[0010]
In that case, moreover, the time-domain waveform of the impulse response of the synthesis filter differs greatly from the waveform of the original impulse response, which makes it difficult to control the interaural time difference and level difference for the listener 11.
[0011]
Furthermore, when sound image localization is performed by output interpolation of virtual sound sources as shown in FIG. 4B, the filter coefficients and filter memory provided for each virtual sound source are all used, so that localization processing using the filter coefficients and filter memory of virtual sound source C is performed even when the sound image lies only between virtual sound sources A and B. Therefore, when the reproduction sound field shown in FIG. 2 is realized with a DSP (Digital Signal Processor) or the like, a large number of acoustic characteristic addition filters complicates the management of arithmetic processing procedures, memory, registers, and so on.
[0012]
In view of the above problems, an object of the present invention is to provide a stereophonic sound processing apparatus that (1) smooths the frequency characteristics of the original impulse response, without any audible change, before the autocorrelation coefficients are obtained and linear prediction analysis is performed, and further corrects the time-domain waveform of the impulse response of the synthesis filter so that it approaches the frequency characteristics of the original impulse response; (2) reduces the number of filters without changing the overall acoustic characteristics; and (3) performs only the virtual sound source localization processing needed for the desired sound image localization.
[0013]
[Means for Solving the Problems]
According to the present invention, there is provided a stereophonic sound processing apparatus using linear prediction coefficients, in which a desired acoustic characteristic to be added to an original signal is formed by a linear synthesis filter whose filter coefficients are linear prediction coefficients obtained by linear prediction analysis of an impulse response representing the acoustic characteristic, and the desired acoustic characteristic is added to the original signal through the linear synthesis filter. The power spectrum of the impulse response representing the acoustic characteristic is divided into a plurality of critical bands; the power spectrum signals representing the signal sound in each critical band are interpolated so as to approximately calculate the amplitude spectrum of bands for which no information exists; and the linear prediction analysis is performed on an impulse signal obtained from the interpolation result to obtain the filter coefficients of the linear synthesis filter.
[0014]
A cumulative addition value, maximum value, or average value of the power spectrum in each critical band is used for the spectrum signal representing the signal sound in each critical band.
Further, output interpolation is performed between the power spectrum signals representing the signal sound in each critical band, and the linear prediction analysis is performed on the impulse signal obtained from the interpolated signal to obtain the filter coefficients of the linear synthesis filter. First-order linear interpolation or interpolation using a higher-order Taylor series is used for the output interpolation.
[0015]
Further, as the impulse response indicating the acoustic characteristic, an impulse response is used that indicates the acoustic characteristic obtained when the transmission path of the original sound field and a transmission path having the inverse characteristic of the reproduction sound field are connected in series. As the linear synthesis filter whose linear prediction coefficients are obtained from this combined impulse response, a single filter is used that combines the filter for adding the acoustic characteristic of the original sound field and the filter for removing the acoustic characteristic of the reproduction sound field. In addition, a correction filter is used that reduces the error between the impulse response of the linear synthesis filter using the linear prediction coefficients and the impulse response indicating the acoustic characteristic.
[0016]
Further, according to the present invention, there is provided a stereophonic sound processing apparatus that localizes a sound image by level control of a plurality of virtual sound sources. The apparatus has acoustic characteristic addition filters that are provided for the two adjacent virtual sound sources between which the sound image is localized and that add impulse responses indicating the acoustic characteristics of the acoustic spatial paths from those virtual sound sources to the listener.
The acoustic characteristic addition filters store the filter calculation parameters of the two adjacent virtual sound sources. When the sound image moves to a new adjacent section that includes one of the two virtual sound sources, the calculation parameters of the filter corresponding to that virtual sound source are left unchanged, and the calculation parameters of the other filter are updated to those of the virtual sound source in the new adjacent section.
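The parameter update described above can be sketched as a two-slot store in which only the slot of the departing virtual sound source is overwritten. The class name `TwoSlotFilterBank`, its methods, and the representation of "calculation parameters" as plain Python objects are all hypothetical, introduced only for this illustration.

```python
class TwoSlotFilterBank:
    """Holds filter parameters for exactly two adjacent virtual sources."""

    def __init__(self, params_by_source, first, second):
        # params_by_source: hypothetical library mapping source name -> parameters
        self._library = params_by_source
        self.slots = {first: params_by_source[first],
                      second: params_by_source[second]}

    def move_to(self, source_x, source_y):
        """Switch to a new adjacent section sharing one source with the old one.

        Returns (leaving, entering), or None if the section is unchanged.
        """
        wanted = {source_x, source_y}
        loaded = set(self.slots)
        if wanted == loaded:
            return None
        shared = wanted & loaded
        if len(wanted) != 2 or len(shared) != 1:
            raise ValueError("new section must share exactly one source")
        (leaving,) = loaded - shared
        (entering,) = wanted - shared
        del self.slots[leaving]               # the shared slot is not touched
        self.slots[entering] = self._library[entering]
        return leaving, entering


bank = TwoSlotFilterBank({"A": [1.0], "B": [2.0], "C": [3.0]}, "A", "B")
moved = bank.move_to("B", "C")   # A's slot is reused for C; B is unchanged
```

Because the shared source keeps its parameters and filter state, only two sets of parameters ever need to be resident, regardless of how many virtual sound sources exist.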
[0017]
According to the present invention, the impulse response indicating the acoustic characteristic is modified in the frequency domain in consideration of the critical bandwidth, and the autocorrelation coefficients are obtained from the result. Because human hearing is insensitive to phase shifts, the phase spectrum need not be considered when making this modification. Smoothing the original impulse response in consideration of the critical bandwidth, so that no audible change results, raises the approximation accuracy when the frequency characteristics are approximated with low-order linear prediction coefficients.
[0018]
Further, by correcting the waveform in the time domain of the impulse response of the synthesis filter, it becomes easy to control the interaural time difference and the level difference. As a result, the number of filters can be reduced without changing the overall acoustic characteristics, facilitating implementation using a DSP or the like, and performing only the localization processing of the virtual sound source necessary for performing desired sound image localization. As a result, the required processing amount and memory amount can be reduced.
[0019]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 5 shows the principle configuration for obtaining the linear prediction coefficients used to add an acoustic characteristic according to the present invention. In the following description of the drawings, elements that are the same as in the conventional examples carry the same reference numerals and are not described again.
FIG. 5(a) shows the most basic processing block configuration of the present invention. The impulse response is first input to the critical bandwidth preprocessing unit 110, which performs the preprocessing of the present invention that takes the critical bandwidth into account. The autocorrelation coefficient calculation unit 18 and the linear prediction analysis unit 19 in this example are the same as those in the conventional example of FIG. 3.
[0020]
Incidentally, according to Fletcher's definition, the “critical bandwidth” is the bandwidth of a band-pass filter whose center frequency varies continuously, such that (1) the band-pass filter whose center frequency is closest to the signal sound performs the frequency analysis of that sound, and (2) the noise components that affect masking of the signal sound are limited to the frequency components within that band-pass filter.
[0021]
The band-pass filter is also called an “auditory filter”. Various measurements have confirmed that the critical bandwidth is narrow when the center frequency of the filter is low and wide when it is high. For example, for center frequencies of 500 Hz or below, the critical bandwidth is nearly constant at about 100 Hz.
[0022]
The Bark scale expresses the relationship between the center frequency f and the critical band by a mathematical expression. The Bark scale is given by:
Bark = 13 arctan(0.76 f) + 3.5 arctan((f / 7.5)²)
Here, a step of 1.0 on the Bark scale corresponds to one critical bandwidth. Together with the definition of the critical bandwidth above, this means that each band signal obtained by dividing the spectrum in steps of 1.0 Bark represents a signal sound that can be distinguished by ear.
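The Bark formula can be evaluated directly, as in the sketch below. The text does not state the unit of f; this example assumes f is in kHz, which is the convention under which the constants 0.76 and 7.5 are normally used. The function names are illustrative.

```python
import math


def hz_to_bark(f_hz):
    """Bark value for a frequency in Hz (formula constants assume kHz)."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)


def critical_band_index(f_hz):
    """Index of the 1.0-Bark-wide band that contains f_hz."""
    return int(hz_to_bark(f_hz))


low = hz_to_bark(100.0)     # close to 1 Bark
mid = hz_to_bark(1000.0)    # roughly 8.5 Bark
```

Under this assumption 100 Hz maps to about 1 Bark, which agrees with the statement that the critical bandwidth is about 100 Hz at low center frequencies.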
[0023]
Returning to FIG. 5, FIGS. 5(b) and 5(c) show internal block configuration examples of the critical bandwidth preprocessing unit 110 of FIG. 5(a). They are described here with reference to the critical band processing examples shown in FIGS. 6 to 10.
In FIGS. 5(b) and 5(c), the impulse response signal is converted from a time-domain signal to a frequency-domain signal by a fast Fourier transform in the FFT (Fast Fourier Transform) processing unit 111. FIG. 6 shows an example of the power spectrum of an impulse response, measured in an anechoic chamber, of the acoustic spatial path from a sound source 45 degrees to the front left of the listener to the left ear.
[0024]
The frequency-domain signal is divided into a plurality of bands of 1.0 on the Bark scale, as described above, in the critical band processing units 112 and 114 of the next stage. In the case of FIG. 5(b), the cumulative sum of the power spectrum within each critical band is obtained as the signal sound representing that band; in the case of FIG. 5(c), the maximum or average value of the power spectrum within each critical band is obtained instead. FIG. 7 shows an example in which the power spectrum of FIG. 6 is divided by critical bandwidth and the maximum value of the power spectrum in each band is obtained as in FIG. 5(c).
[0025]
The critical band processing units 112 and 114 further perform output interpolation that smoothly connects the cumulative sums, maximum values, or average values of the power spectrum obtained for the individual critical bands. First-order linear interpolation, interpolation by a higher-order Taylor series, or the like is used for this purpose. FIG. 8 shows an example of a power spectrum smoothed by output interpolation of the power spectrum of FIG. 7.
[0026]
Finally, the smoothed power spectrum is subjected to inverse Fourier transform by the inverse FFT unit 113 to restore the frequency domain signal to the time domain signal. Here, the phase spectrum uses the original impulse response phase spectrum as it is. The subsequent processing of the restored impulse response signal is the same as in the conventional example described with reference to FIG.
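The preprocessing chain described above (transform, per-band maximum, linear interpolation, inverse transform with the original phase) can be sketched end to end. This is a simplified illustration under several assumptions: a naive DFT replaces the FFT for brevity, the Bark formula is taken with f in kHz, each band's representative value is placed at the band's center bin, and the function names and the `fs_hz` parameter are hypothetical.

```python
import cmath
import math


def hz_to_bark(f_hz):
    f_khz = f_hz / 1000.0  # assumption: formula constants expect kHz
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)


def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for a short illustration."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def smooth_impulse_response(h, fs_hz):
    n = len(h)
    half = n // 2
    spec = dft(h)
    power = [abs(spec[k]) ** 2 for k in range(half + 1)]

    # Group bins 0..N/2 by integer Bark index; keep each band's maximum.
    bands = {}
    for k in range(half + 1):
        bands.setdefault(int(hz_to_bark(k * fs_hz / n)), []).append(k)
    points = [(sum(ks) / len(ks), max(power[k] for k in ks))
              for _, ks in sorted(bands.items())]

    # First-order (linear) interpolation between band representatives.
    smooth = []
    for k in range(half + 1):
        if k <= points[0][0]:
            smooth.append(points[0][1])
        elif k >= points[-1][0]:
            smooth.append(points[-1][1])
        else:
            for (k0, p0), (k1, p1) in zip(points, points[1:]):
                if k0 <= k <= k1:
                    smooth.append(p0 + (p1 - p0) * (k - k0) / (k1 - k0))
                    break

    # Smoothed magnitude combined with the original phase spectrum.
    new_spec = [0j] * n
    for k in range(half + 1):
        new_spec[k] = cmath.rect(math.sqrt(smooth[k]), cmath.phase(spec[k]))
    for k in range(half + 1, n):
        new_spec[k] = new_spec[n - k].conjugate()   # keep the signal real
    return [v.real for v in dft(new_spec, inverse=True)]


# A unit impulse has a flat spectrum, so smoothing should leave it unchanged.
restored = smooth_impulse_response([1.0] + [0.0] * 31, 44100.0)
```

The restored sequence would then be passed to the autocorrelation and linear prediction stages in place of the measured impulse response.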
[0027]
As described above, the present invention uses the critical bandwidth to extract the characteristic parts of the signal sound without causing any audible change, interpolates them smoothly, and then restores an approximation of the original impulse response. As a result, when the frequency characteristics are approximated with linear prediction coefficients of particularly low order, as in the present invention, the approximation accuracy is greatly improved compared with the conventional example, in which the frequency characteristics are approximated directly from the complicated original impulse response.
[0028]
FIG. 9 shows a circuit configuration example of the synthesis filter (IIR) 121 using the linear prediction coefficients (an, ..., a2, a1) obtained by the processing of FIG. 5(a). FIG. 10 shows an example of the power spectrum obtained from the impulse response after approximation by a 10th-order synthesis filter using the linear prediction coefficients of FIG. 9. It shows that the approximation accuracy at the peaks of the power spectrum is improved.
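A synthesis filter of this kind is an all-pole recursion over past outputs. The sketch below assumes the convention y[n] = gain * x[n] - a[1] y[n-1] - ... - a[p] y[n-p] with a[0] = 1; the sign convention and the `gain` parameter are assumptions, since FIG. 9 is not reproduced in the text.

```python
def synthesis_filter(x, a, gain=1.0):
    """All-pole IIR filter 1/A(z) with A(z) = 1 + a[1] z^-1 + ... (assumed sign)."""
    y = []
    for n, xn in enumerate(x):
        acc = gain * xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]   # feedback from earlier outputs
        y.append(acc)
    return y


# Impulse response of a hypothetical first-order filter with a = [1, -0.5].
response = synthesis_filter([1.0, 0.0, 0.0, 0.0], [1.0, -0.5])
# response == [1.0, 0.5, 0.25, 0.125]
```

Feeding a unit impulse through the filter in this way yields the approximated impulse response whose power spectrum FIG. 10 illustrates.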
[0029]
FIG. 11 shows a processing configuration example for correcting the characteristics of the synthesis filter 121 using the linear prediction coefficients shown in FIG. 9.
In FIG. 11, the acoustic characteristic addition filter 120 consists of the synthesis filter 121 using the linear prediction coefficients and a correction filter 122 connected in series with it. FIGS. 12 and 13 each show an example of the correction filter 122: FIG. 12 shows a prediction filter (FIR) for approximating the valleys of the characteristic in the frequency domain, and FIG. 13 shows a delay/amplifier circuit for correcting the interaural delay time difference and level difference in the time domain.
[0030]
As shown in FIG. 11, the impulse response signal representing the actual acoustic characteristic is applied to one input of the error calculation unit 130, and an impulse signal is input to the acoustic characteristic addition filter 120. In response to the impulse, the acoustic characteristic addition filter 120 outputs its own characteristic as a time-domain signal. This signal is applied to the other input of the error calculation unit 130 and compared with the impulse response signal representing the actual acoustic characteristic. The correction filter 122 is then adjusted so as to reduce the error found by the comparison.
[0031]
As an example, consider correcting the time-domain waveform of the impulse response of the synthesis filter 121 using the n-th order FIR filter 122 shown in FIG. 12. The filter coefficients c0, c1, ..., cp are obtained as follows. Let x be the impulse response of the synthesis filter and y the original impulse response; then the following equation holds, where q ≧ p.
[0032]
[Expression 1]

[0033]
Let X be the matrix of the elements x(0), ..., x(q) on the left-hand side of the above equation, c the vector of the elements c0, ..., cp, and Y the vector on the right-hand side. The filter coefficients c0, c1, ..., cp are then obtained from the following equations.
X c = Y
X^T X c = X^T Y
c = (X^T X)⁻¹ X^T Y
Alternatively, the coefficients can be obtained by the steepest descent method.
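The normal-equation solution can be sketched as follows. Expression 1 itself is not reproduced in the text, so this example assumes it is the convolution system sum_k c_k x(n - k) = y(n) for n = 0, ..., q, which makes X a (q+1) by (p+1) matrix of shifted copies of x. The function names and the plain Gaussian elimination solver are illustrative choices.

```python
def convolution_matrix(x, p, q):
    """Rows n = 0..q, columns k = 0..p, entry x[n - k] (zero outside x)."""
    return [[x[n - k] if 0 <= n - k < len(x) else 0.0 for k in range(p + 1)]
            for n in range(q + 1)]


def solve(a, b):
    """Gaussian elimination with partial pivoting (assumes a is non-singular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (m[i][n] - tail) / m[i][i]
    return sol


def correction_coefficients(x, y, p):
    """Least-squares c = (X^T X)^-1 X^T Y for the assumed convolution system."""
    q = len(y) - 1
    X = convolution_matrix(x, p, q)
    xtx = [[sum(X[n][i] * X[n][j] for n in range(q + 1)) for j in range(p + 1)]
           for i in range(p + 1)]
    xty = [sum(X[n][i] * y[n] for n in range(q + 1)) for i in range(p + 1)]
    return solve(xtx, xty)


# Hypothetical data: y is x filtered by c = [1.0, -0.3], truncated to 6 samples.
x = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
y = [1.0, 0.2, 0.1, 0.05, 0.025, 0.0125]
c = correction_coefficients(x, y, 1)
```

Forming X^T X explicitly is the simplest route and mirrors the equations above; for longer filters an orthogonal factorization or an iterative method such as steepest descent would typically be better conditioned.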
[0034]
FIG. 14 shows an example in which the frequency characteristic of the synthesis filter 121 using the linear prediction coefficients is modified using the correction filter 122. The dotted waveform in FIG. 14 shows an example of the frequency characteristic of the synthesis filter 121 before correction, and the solid waveform shows an example of the result of correcting it with the prediction filter 122 of FIG. 12. The correction makes the valleys of the frequency characteristic more distinct.
[0035]
FIG. 15 shows one application example of the present invention described above.
As described with reference to FIG. 2, conventionally the acoustic characteristic addition filters 12 and 13 and the headphone inverse characteristic filters 14 and 15 are obtained separately and connected in series. In this case, if, for example, the front-stage filter 12 (or 13) uses 128 taps and the rear-stage filter 14 (or 15) also uses 128 taps, about 255 taps are needed to guarantee signal convergence when the two are connected in series.
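The figure of about 255 taps follows from the length of a convolution: cascading two FIR filters of lengths N and M is equivalent to a single FIR filter of length N + M − 1. A minimal Python check, using placeholder coefficient values that are not taken from the patent:

```python
def convolve(a, b):
    """Direct-form linear convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# Placeholder coefficients; only the lengths matter for this illustration.
front_stage = [1.0] * 128   # stands in for filter 12 (or 13)
rear_stage = [1.0] * 128    # stands in for filter 14 (or 15)

cascade = convolve(front_stage, rear_stage)
print(len(cascade))  # 255 = 128 + 128 - 1
```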
[0036]
In contrast, FIG. 15 uses a single filter 141 or 142 in which the acoustic characteristic addition filter and the headphone inverse characteristic filter are combined from the outset. According to the present invention, as shown in FIG. 5A, the pre-processing 110 that takes the critical bandwidth into account is performed before the linear prediction analysis of the acoustic characteristics. As described above, this process extracts the characteristic portions of the signal sound and interpolates between them within a range that causes no audible change. As a result, the frequency characteristic can be approximated using linear prediction coefficients of a lower order, and the filter circuit is greatly simplified compared with the conventional series connection of a front stage and a rear stage.
[0037]
FIG. 16 shows an example of the inverse characteristic (h⁻¹) of the power spectrum of the headphones. FIG. 17 shows an example of the power spectrum of a combined filter (S→l · h⁻¹) that combines an actual acoustic characteristic with the inverse characteristic of the headphones. FIG. 18 shows the result of dividing the power spectrum of FIG. 17 into critical bandwidths and representing each band by its maximum value. FIG. 19 shows an example in which interpolation is applied to the representative values of the power spectrum of FIG. 18. Comparing the power spectra of FIG. 17 and FIG. 19 shows that the latter can be approximated more accurately using linear prediction coefficients of a lower order.
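The two smoothing steps (one maximum per critical band, then interpolation between the representative values) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the band edges are given as hypothetical FFT bin indices rather than actual critical bandwidths, each representative value is placed at the center of its band, and linear interpolation is assumed. The function names are also hypothetical.

```python
def band_maxima(power, band_edges):
    """Return (center_bin, max_power) for each band [lo, hi) of bin indices."""
    reps = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        center = (lo + hi - 1) / 2.0          # assumed placement of the representative
        reps.append((center, max(power[lo:hi])))
    return reps


def interpolate_linear(reps, n_bins):
    """Linearly interpolate between representative values; hold the end values flat."""
    smooth = []
    for k in range(n_bins):
        if k <= reps[0][0]:
            smooth.append(reps[0][1])
        elif k >= reps[-1][0]:
            smooth.append(reps[-1][1])
        else:
            for (c0, v0), (c1, v1) in zip(reps[:-1], reps[1:]):
                if c0 <= k <= c1:
                    t = (k - c0) / (c1 - c0)
                    smooth.append(v0 + t * (v1 - v0))
                    break
    return smooth


# Toy power spectrum with three bands (hypothetical values and band edges).
power = [1, 4, 2, 8, 3, 5, 9, 1]
band_edges = [0, 2, 4, 8]
representatives = band_maxima(power, band_edges)
smoothed = interpolate_linear(representatives, len(power))
```

In an actual analysis the smoothed power spectrum would then be converted to autocorrelation coefficients and passed to the linear prediction analysis.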
[0038]
FIG. 20 shows a principle configuration of processing for performing sound image localization by output interpolation between a plurality of virtual sound sources according to the present invention.
In FIG. 20(a), four acoustic characteristic calculation memories 151 to 154 are provided in order to add the transfer characteristics of the acoustic space paths from the two virtual sound sources (A, B) 20-1 and 20-2 to the left and right ears of the listener 11. The gains of the amplifiers 27 to 30 in the next stage are adjusted to localize the sound image, or move it smoothly, between the virtual sound source (A) 20-1 and the virtual sound source (B) 20-2.
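How the amplifier gains place the sound image between two virtual sound sources can be illustrated with a short sketch. The gain law is not specified in this passage, so a simple linear crossfade is assumed here purely for illustration. The `position` parameter and both function names are hypothetical.

```python
def crossfade_gains(position):
    """Gains for sources A and B; position 0.0 = at A, 1.0 = at B.

    A linear law is an assumption; other laws (for example constant power)
    could be substituted without changing the structure.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    return 1.0 - position, position


def mix_ear_signal(from_a, from_b, position):
    """Weighted sum of the signals reaching one ear from sources A and B."""
    gain_a, gain_b = crossfade_gains(position)
    return [gain_a * a + gain_b * b for a, b in zip(from_a, from_b)]
```

The same mixing is applied once for the left ear and once for the right ear, which is why four memories and four amplifiers appear in FIG. 20.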
[0039]
Next, as shown in FIG. 20(b), when the sound image is localized or moved between the next pair of virtual sound sources (B, C) 20-2 and 20-3, the two acoustic characteristic calculation memories 151 and 152 that were allocated to the virtual sound source (A) 20-1, out of the four acoustic characteristic calculation memories 151 to 154, are reallocated to the virtual sound source (C) 20-3. The acoustic characteristic calculation memories 153 and 154 of the virtual sound source (B) 20-2 are used as they are, without change. Then, as in FIG. 20(a), the gains of the amplifiers 27 to 30 in the next stage are adjusted to localize the sound image, or move it smoothly, between the virtual sound source (B) 20-2 and the virtual sound source (C) 20-3.
[0040]
That is, according to the above configuration: (1) acoustic characteristic calculation memories are needed for only two virtual sound sources, and the same is true of the next-stage amplifiers and their output addition circuits; (2) the acoustic characteristic calculation memory of a virtual sound source that has fallen outside the sound generation section as the sound image moves (A in the above example) is reused as the acoustic characteristic calculation memory for the virtual sound source newly placed in the sound generation section (C in the above example); and (3) a virtual sound source that belongs to both the old and the new sound generation sections (B in the above example) continues to use its acoustic characteristic calculation memory as it is.
[0041]
As a result, from (1), the hardware required to move the sound image, such as the amount of memory, is minimized, and the calculation control is accordingly simple and fast. Further, from (2) and (3), when the sound generation section is switched, only the virtual sound source (B) of (3) is generating sound, and the amplifier gains of the other virtual sound sources (A, C) are zero. Therefore, no click sound is generated by switching the sound generation section.
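The memory reuse in (2) and (3), together with the zero-gain condition that avoids clicks, can be sketched as a small bookkeeping routine. The dictionaries, the string labels for the memory pairs, and the function name are hypothetical illustrations and not part of the described apparatus.

```python
def switch_section(slots, gains, leaving, entering):
    """Hand the memory pair of the `leaving` source to the `entering` source.

    The swap is allowed only while the leaving source is silent (gain zero),
    which is the condition under which no click is produced.
    """
    if gains.get(leaving, 0.0) != 0.0:
        raise ValueError("leaving source is still audible; switching could click")
    new_slots = {pair: (entering if source == leaving else source)
                 for pair, source in slots.items()}
    new_gains = {source: g for source, g in gains.items() if source != leaving}
    new_gains[entering] = 0.0   # the new source starts silent and is faded in later
    return new_slots, new_gains


# Sound image has reached source B: A is silent, so its memories can go to C.
slots = {"151/152": "A", "153/154": "B"}
gains = {"A": 0.0, "B": 1.0}
slots_after, gains_after = switch_section(slots, gains, leaving="A", entering="C")
```

Source B keeps its memory pair untouched throughout, so its output is continuous across the switch.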
[0042]
FIGS. 21 and 22 show more specific embodiments of FIG. 20.
Each includes a memory control unit 155 that, given new sound image position information, sets the filter coefficients and selects the memories, and a gain control unit 156 that calculates the gains of the amplifiers 27 to 30 for each sound image position. FIG. 21 corresponds to (a) of FIG. 20, and FIG. 22 corresponds to (b) of FIG. 20.
[0043]
[Effect of the invention]
As described above, according to the stereophonic sound processing apparatus of the present invention, taking the critical bandwidth into account and thereby smoothing the original impulse response within a range that causes no audible change increases the accuracy with which the frequency characteristic can be approximated using low-order linear prediction coefficients. In addition, correcting the time-domain waveform of the impulse response of the synthesis filter makes it easy to control the interaural time difference and level difference.
[0044]
Furthermore, according to the present invention, by performing localization processing only for the virtual sound sources needed for the desired sound image localization, the required processing amount and memory amount are minimized, and the generation of a click sound when virtual sound sources are switched can be prevented.
As described above, according to the present invention, the number of filters can be reduced without changing the overall acoustic characteristics, and as a result, it is possible to easily realize control of a three-dimensional sound image using a DSP or the like.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram (1) of a conventional sound image localization technique.
FIG. 2 is an explanatory diagram (2) of a conventional sound image localization technique.
FIG. 3 is an explanatory diagram (3) of a conventional sound image localization technique.
FIG. 4 is an explanatory diagram (4) of a conventional sound image localization technique.
FIG. 5 is a basic principle diagram for obtaining a linear prediction coefficient for adding an acoustic characteristic according to the present invention.
FIG. 6 is a diagram illustrating an example of a power spectrum of an impulse response of an acoustic space path.
FIG. 7 is a diagram showing an example in which the power spectrum shown in FIG. 6 is divided by a critical bandwidth and represented by the maximum value of the power spectrum.
FIG. 8 is a diagram showing an example in which a smooth power spectrum is obtained by output interpolation of the power spectrum shown in FIG. 7.
FIG. 9 is a diagram illustrating a configuration example of a synthesis filter using a linear prediction coefficient.
FIG. 10 is a diagram illustrating an example of a power spectrum of a 10th-order synthesis filter using a linear prediction coefficient according to the present invention.
FIG. 11 is a diagram showing a configuration example of a synthesis filter correction process using a linear prediction coefficient according to the present invention.
FIG. 12 is a diagram illustrating an example of a prediction filter.
FIG. 13 is a diagram illustrating an example of a delay / amplifier circuit.
FIG. 14 is a diagram illustrating an example in which frequency characteristics are corrected by a correction filter.
FIG. 15 is a diagram showing an example in which an acoustic characteristic addition filter and the inverse characteristic of headphones are combined according to the present invention.
FIG. 16 is a diagram illustrating an example of the inverse characteristic of the power spectrum of headphones.
FIG. 17 is a diagram showing an example of the power spectrum of a combined filter that combines an acoustic characteristic addition filter with the inverse characteristic of headphones.
FIG. 18 is a diagram showing an example in which the power spectrum shown in FIG. 17 is divided by a critical bandwidth and represented by the maximum value.
FIG. 19 is a diagram illustrating an example in which the power spectrum of FIG. 18 is interpolated.
FIG. 20 is a diagram illustrating a basic configuration for sound image localization in a virtual acoustic space according to the present invention.
FIG. 21 is a diagram showing a specific example of FIG. 20(a).
FIG. 22 is a diagram showing a specific example of FIG. 20(b).
[Explanation of symbols]
18 ... autocorrelation coefficient calculation unit
19 ... linear prediction analysis unit
110 ... critical bandwidth preprocessing unit
111 ... fast Fourier transform processing unit
112 ... critical band cumulative addition unit
113 ... inverse fast Fourier transform processing unit
114 ... in-critical-band maximum / average processing unit
122 ... correction filter unit
141, 142 ... combined filter

Claims

A stereophonic sound processing apparatus in which a desired acoustic characteristic to be added to an original signal is formed by a linear synthesis filter whose filter coefficients are linear prediction coefficients obtained by linear prediction analysis of an impulse response representing the acoustic characteristic, and in which the desired acoustic characteristic is added to the original signal through the linear synthesis filter,
wherein the power spectrum of the impulse response representing the acoustic characteristic is divided into a plurality of critical bands, the signal sound in each critical band is represented by the maximum value of the power spectrum in that critical band, interpolation is performed between the representative maximum values, and the filter coefficients of the linear synthesis filter are obtained by performing the linear prediction analysis on an impulse response determined from the result of the interpolation, so that the power spectrum is approximated using the linear prediction coefficients.

The stereophonic sound processing apparatus using linear prediction coefficients according to claim 1, wherein linear interpolation is performed for the interpolation.

The stereophonic sound processing apparatus using linear prediction coefficients according to claim 1, wherein the interpolation is performed using a higher-order Taylor series.

The stereophonic sound processing apparatus using linear prediction coefficients according to claim 1, wherein, as the impulse response indicating the acoustic characteristic, an impulse response indicating the acoustic characteristic obtained when the transmission path of the original sound field and a transmission path having the inverse characteristic of the reproduction sound field are connected in series is used, the linear prediction coefficients are obtained based on this combined impulse response, and the linear synthesis filter is a single filter that combines a filter that adds the acoustic characteristic of the original sound field with a filter that removes the acoustic characteristic of the reproduction sound field.

The stereophonic sound processing apparatus using linear prediction coefficients according to claim 1, further comprising a correction filter that reduces an error between the impulse response of the linear synthesis filter using the linear prediction coefficients and the impulse response indicating the acoustic characteristic.