JP6374882B2

JP6374882B2 - Method and apparatus for determining the direction of uncorrelated sound sources in higher-order ambisonic representations of sound fields

Info

Publication number: JP6374882B2
Application number: JP2015556516A
Authority: JP
Inventors: Alexander Krüger; Sven Kordon
Original assignee: Dolby International AB
Priority date: 2013-02-08
Filing date: 2014-02-07
Publication date: 2018-08-15
Anticipated expiration: 2034-02-07
Also published as: US20150373471A1; KR102220187B1; CN104995926A; WO2014122287A1; JP2016509812A; KR20150115779A; US9622008B2; EP2765791A1; CN104995926B; TW201448616A; EP2954700B1; EP2954700A1; TWI647961B

Description

The invention relates to a method and an apparatus for determining the directions of uncorrelated sound sources in a Higher Order Ambisonics representation of a sound field.

Higher Order Ambisonics (HOA) offers one way of representing three-dimensional sound, alongside other techniques such as Wave Field Synthesis (WFS) and channel-based approaches such as 22.2. In contrast to channel-based methods, however, the HOA representation has the advantage of being independent of a specific loudspeaker set-up. This flexibility comes at the expense of a decoding process that is required to play back the HOA representation on a particular loudspeaker set-up. Compared with the WFS approach, where the number of required loudspeakers is usually very large, HOA can also be rendered to set-ups consisting of only a few loudspeakers. A further advantage of HOA is that the same representation can be used without any modification for binaural rendering to headphones.

HOA is based on a representation of the spatial density of complex harmonic plane wave amplitudes by a truncated Spherical Harmonics (SH) expansion. Each expansion coefficient is a function of angular frequency, which can be equivalently represented by a time-domain function. Hence, without loss of generality, the complete HOA sound field representation can be assumed to consist of O time-domain functions, where O denotes the number of expansion coefficients. In the following, these time-domain functions are referred to as HOA coefficient sequences or HOA channels.

HOA can provide a high spatial resolution, which improves as the maximum expansion order N increases. This makes it possible to analyse the sound field with respect to dominant sound sources.
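The relation between the expansion order N and the number O of HOA channels can be made concrete. The following is a minimal sketch, assuming a full three-dimensional spherical harmonics expansion in which each order n contributes 2n + 1 coefficients; the function name is hypothetical and not taken from the patent.

```python
def num_hoa_channels(order: int) -> int:
    """Count the expansion coefficients up to and including `order`.

    Assumes a full 3D spherical harmonics expansion: order n contributes
    2n + 1 coefficients, which sums to (order + 1) ** 2.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return sum(2 * n + 1 for n in range(order + 1))


# Channel count O for the first few orders N.
for n in range(5):
    print(n, num_hoa_channels(n))
```

Under this assumption, the channel count grows quadratically with the order, which is the price of the higher spatial resolution.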

One application is to identify, from a given HOA representation, the independent dominant sound sources that constitute the sound field, and to track their trajectories over time. Such operations are required, for example, for the compression of HOA representations by decomposing the sound field into dominant directional signals and a remaining ambient component, as described in European patent application No. 12305537.8. A further application of such a direction tracking method is a coarse preliminary source separation. The estimated direction trajectories could also be used in the post-production of HOA sound field recordings as a means of amplifying or attenuating the signals of particular sound sources.

In the above European patent application, it is proposed to perform the following three operations in sequence:
• The number of dominant sound sources currently present in a time frame is determined and the corresponding directions are searched. The number of dominant sound sources is determined from the eigenvalues of the HOA channel cross-correlation matrix. To search for the dominant sound source directions, the directional power distribution corresponding to the frame of HOA coefficients is evaluated for a fixed, large number of predefined test directions. The first direction estimate is obtained by looking for the maximum of the directional power distribution. The remaining directions are then found by repeating the following two operations: the test directions in the spatial neighbourhood of the direction just found are removed from the remaining set of test directions, and the resulting set is searched for the maximum of the directional power distribution.
• The estimated directions are assigned to the sound sources regarded as active in the last time frame.
• Following the assignment, the direction estimates are suitably smoothed in order to obtain temporally smooth direction trajectories.

With such processing, however, the temporal smoothing of the direction estimates is in principle accomplished by computing an exponentially weighted moving average. This technique has the disadvantage that it cannot accurately capture abrupt direction changes or the onset of new dominant sounds.
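The lag of an exponentially weighted moving average after an abrupt change can be illustrated numerically. The sketch below is an illustration only: the smoothing factor, the frame count and the single azimuth value per frame are assumptions, and averaging angles directly ignores wrap-around, which a real tracker would have to handle.

```python
def ema(values, alpha):
    """Exponentially weighted moving average with smoothing factor alpha in (0, 1]."""
    smoothed = []
    state = values[0]
    for value in values:
        state = alpha * value + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


# Hypothetical azimuth track in radians: the source jumps at frame 5.
azimuth = [0.0] * 5 + [1.0] * 5
track = ema(azimuth, alpha=0.2)
print([round(a, 3) for a in track])
```

With these assumed values, the smoothed estimate has covered only about two thirds of the jump five frames after it occurred, which is the behaviour criticised above.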

To solve this problem, European patent application No. 12306485.9 proposed introducing a simple statistical source movement prediction model, which is used for a statistically motivated smoothing implemented by the Bayesian learning rule. However, both that application and the earlier European patent application No. 12305537.8 compute the likelihood function for the sound source directions from the directional power distribution only. This distribution represents the power of a large number of general plane waves arriving from the directions specified by nearly uniformly distributed sampling points on the unit sphere. It provides no information about the cross-correlation between general plane waves from different directions. In practice, the order N of the HOA representation is usually finite, which results in a spatially band-limited sound field. In particular, this means that the contribution of a directional sound source to the directional power distribution is smeared around the true direction of incidence into neighbouring directions. This smearing effect is described mathematically by a 'dispersion function'; see the section "Spatial decomposition of Higher Order Ambisonics" below. Its extent increases as the order of the HOA representation decreases.

The direction tracking methods of European patent applications No. 12306485.9 and No. 12305537.8 take this effect into account to some extent by restricting the direction search to regions outside the neighbourhoods of the directions already found. However, the specification of the neighbourhood assumes that all sound sources are encoded with the full order N of the HOA representation. This assumption is violated for HOA representations of order N that contain general plane waves encoded with an order lower than N. Such general plane waves of lower order may result from artistic creation, in order to make sound sources appear wider. They also occur, however, when HOA sound field representations are recorded with spherical microphones.

The direction tracking methods of European patent applications No. 12306485.9 and No. 12305537.8 may identify more than one sound source when the sound field consists of a single general plane wave of an order lower than N. This is an undesirable property.

The problem to be solved by the invention is to improve the determination of dominant sound sources in an HOA sound field so that their temporal trajectories can be tracked. This problem is solved by the methods disclosed in claims 1, 2 and 6. An apparatus that uses the method of claim 6 is disclosed in claim 11.

The invention improves the processing of European patent application No. 12306485.9. The inventive processing searches for independent dominant sound sources and tracks their directions over time. The expression 'independent dominant sound sources' means that the signals of the individual sound sources are uncorrelated. The state-of-the-art methods of European patent applications No. 12305537.8 and No. 12306485.9 search for all potential candidates for dominant sound source directions by examining only the directional power distribution of the original HOA representation. In contrast, for the search of each direction candidate, the inventive processing described below removes from the original HOA representation all components that are correlated with the signals of previously found sound sources. This avoids the problem of erroneously detecting many sound sources instead of the single correct one when its contribution to the sound field is highly directionally dispersed. As mentioned above, this effect can occur for HOA representations of order N that contain general plane waves encoded with an order lower than N.

As in European patent application No. 12306485.9, the candidates found for the dominant sound source directions are then assigned to the previously found dominant sound sources and finally smoothed according to a statistical source movement model. Hence, as in that application, the inventive processing provides temporally smoothed direction estimates and is able to capture abrupt direction changes and the onset of new dominant sounds.

The inventive processing determines estimates of the dominant sound source directions for successive frames of the HOA representation in two successive stages.

First, from time frame k of the HOA representation, candidates (or estimates) for the dominant sound source directions are searched sequentially, and the components of the HOA representation that are assumed to be created by the individual sound sources are determined. In each iteration of this search, the next direction candidate is computed from a residual HOA representation, i.e. the original HOA representation from which all components correlated with the signals of previously found sound sources have been removed. The current direction candidate is selected from a number of predefined test directions such that the power of the associated general plane wave of the residual HOA representation, impinging on the listener position from the selected direction, is maximal compared with that of all other test directions.

Next, the direction candidates selected for the current time frame are assigned to the dominant sound sources found in the previous time frame k−1 of HOA coefficients. Then, the final direction estimates, which are smoothed with respect to the resulting temporal trajectories, are computed by a Bayesian inference process. This process exploits, on the one hand, a statistical a priori sound source movement model and, on the other hand, the directional power distribution of the dominant sound source components of the original HOA representation. The a priori sound source movement model statistically predicts the current movement of each sound source from its direction in the previous time frame k−1 and from its movement between the penultimate time frame k−2 and the previous time frame k−1.

The assignment of the direction estimates to the dominant sound sources found in the previous time frame (k−1) of HOA coefficients is accomplished by jointly minimising the angles between pairs of direction estimates and directions of previously found sound sources, and maximising the absolute value of the correlation coefficient between pairs of directional signals associated with the direction estimates and with the dominant sound sources found in the previous time frame.
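One way to picture such a joint criterion is a single cost per candidate pairing that grows with the angle and shrinks with the absolute correlation. The sketch below is only an illustration under stated assumptions: the additive cost, the exhaustive search over pairings and all function names are hypothetical and are not the cost function or optimisation procedure of the patent. It also assumes that there are at least as many previous sources as new estimates, and it does not handle newly appearing sources.

```python
import itertools
import math


def angle_between(a, b):
    """Great-circle angle between two (inclination, azimuth) directions in radians."""
    (theta_a, phi_a), (theta_b, phi_b) = a, b
    cos_angle = (math.cos(theta_a) * math.cos(theta_b)
                 + math.sin(theta_a) * math.sin(theta_b) * math.cos(phi_a - phi_b))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def abs_correlation(x, y):
    """Absolute normalised correlation of two signals (no mean removal)."""
    num = sum(p * q for p, q in zip(x, y))
    den = math.sqrt(sum(p * p for p in x) * sum(q * q for q in y))
    return abs(num) / den if den > 0.0 else 0.0


def assign_sources(new_dirs, new_sigs, old_dirs, old_sigs):
    """Return, for each new estimate d, the index of the previous source it maps to."""
    def cost(d, j):
        # Hypothetical cost: small angle and high |correlation| are both rewarded.
        angle_term = angle_between(new_dirs[d], old_dirs[j]) / math.pi
        corr_term = 1.0 - abs_correlation(new_sigs[d], old_sigs[j])
        return angle_term + corr_term

    # Exhaustive search over one-to-one pairings; fine for a handful of sources.
    pairings = itertools.permutations(range(len(old_dirs)), len(new_dirs))
    best = min(pairings, key=lambda p: sum(cost(d, j) for d, j in enumerate(p)))
    return list(best)
```

For more than a few sources, the exhaustive search would normally be replaced by a dedicated assignment solver; it is used here only to keep the sketch short.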

In principle, the inventive method is suited for determining the directions of uncorrelated sound sources in a Higher Order Ambisonics representation, denoted HOA, of a sound field, said method comprising the steps of:
in a current time frame of HOA coefficients, sequentially searching for preliminary direction estimates of dominant sound sources, computing the HOA sound field components created by the corresponding dominant sound sources, and computing the corresponding directional signals;
assigning the computed dominant sound sources to the corresponding sound sources active in the previous time frame of HOA coefficients, by comparing the preliminary direction estimates of the current time frame with the smoothed directions of the sound sources active in the previous time frame, and by correlating the directional signals of the current time frame with the directional signals of the sound sources active in the previous time frame, thereby obtaining an assignment function;
computing smoothed dominant source directions using the assignment function, the set of smoothed directions in the previous time frame, the set of indices of the active dominant sound sources in the previous time frame, the set of respective source movement angles between the penultimate time frame and the previous time frame, and the HOA sound field components created by the corresponding dominant sound sources;
determining the indices and directions of the active dominant sound sources of the current time frame using the smoothed dominant source directions, the frame-delayed version of the directions of the active dominant sound sources of the previous time frame, and the frame-delayed version of the indices of the active dominant sound sources of the previous time frame,
wherein the directional signals of the sound sources active in the previous time frame are computed by mode matching from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and from the HOA coefficients of the previous time frame,
and wherein the set of source movement angles between the penultimate time frame and the previous time frame is computed from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and from a further frame-delayed version thereof.

In principle, the inventive apparatus is suited for determining the directions of uncorrelated sound sources in a Higher Order Ambisonics representation, denoted HOA, of a sound field, said apparatus comprising:
means adapted to sequentially search, in a current time frame of HOA coefficients, for preliminary direction estimates of dominant sound sources, to compute the HOA sound field components created by the corresponding dominant sound sources, and to compute the corresponding directional signals;
means adapted to assign the computed dominant sound sources to the corresponding sound sources active in the previous time frame of HOA coefficients, by comparing the preliminary direction estimates of the current time frame with the smoothed directions of the sound sources active in the previous time frame, and by correlating the directional signals of the current time frame with the directional signals of the sound sources active in the previous time frame, thereby obtaining an assignment function;
means adapted to compute smoothed dominant source directions using the assignment function, the set of smoothed directions in the previous time frame, the set of indices of the active dominant sound sources in the previous time frame, the set of respective source movement angles between the penultimate time frame and the previous time frame, and the HOA sound field components created by the corresponding dominant sound sources;
means adapted to determine the indices and directions of the active dominant sound sources of the current time frame using the smoothed dominant source directions, the frame-delayed version of the directions of the active dominant sound sources of the previous time frame, and the frame-delayed version of the indices of the active dominant sound sources of the previous time frame,
wherein the directional signals of the sound sources active in the previous time frame are computed by mode matching from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and from the HOA coefficients of the previous time frame,
and wherein the set of source movement angles between the penultimate time frame and the previous time frame is computed from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and from a further frame-delayed version thereof.

Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.

FIG. 1 shows a block diagram of the inventive processing for estimating the directions of dominant and uncorrelated directional signals of a Higher Order Ambisonics signal.
FIG. 2 shows details of the preliminary direction estimation.
FIG. 3 shows the computation of the dominant directional signals and of the HOA representation of the sound field created by the dominant sound sources.
FIG. 4 shows the model-based computation of smoothed dominant sound source directions.
FIG. 5 shows the spherical coordinate system.
FIG. 6 shows the normalised dispersion function ν_N(Θ) for different Ambisonics orders N and for angles Θ ∈ [0, π].

Exemplary embodiments of the invention are described with reference to the accompanying drawings.

The principle of the inventive direction tracking processing is depicted in FIG. 1 and explained below. The direction tracking is assumed to be based on the successive processing of input frames C(k) of HOA coefficient sequences of length L, where k denotes the frame index. The frames are defined with respect to the HOA coefficient sequences specified in equation (45) in the section "Basics of Higher Order Ambisonics" by the following equation (1):

Here, T_S denotes the sampling period and B ≤ L indicates the frame shift. It is reasonable, but not required, to assume that successive frames overlap, i.e. B < L.
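The framing with frame length L and frame shift B can be sketched as follows. Since equation (1) is not reproduced here, the exact sample offset is an assumption: the sketch lets frame k start k·B samples after frame 0 and uses 0-based sample indices. The function and parameter names are hypothetical.

```python
def hoa_frame(channels, k, frame_length, frame_shift):
    """Cut frame k out of the HOA coefficient sequences.

    `channels` holds one list of samples per HOA channel. Frame k covers
    `frame_length` samples and starts k * frame_shift samples after frame 0
    (an assumed offset), so frames overlap when frame_shift < frame_length.
    """
    if not 0 < frame_shift <= frame_length:
        raise ValueError("frame shift B must satisfy 0 < B <= L")
    start = k * frame_shift
    return [row[start:start + frame_length] for row in channels]


# One channel with ten samples, L = 4 and B = 2: consecutive frames share two samples.
channels = [list(range(10))]
print(hoa_frame(channels, 0, frame_length=4, frame_shift=2))
print(hoa_frame(channels, 1, frame_length=4, frame_shift=2))
```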

In the first step or stage 11, the k-th frame C(k) of the HOA representation undergoes a preliminary analysis for dominant sound sources. A detailed description of this processing is given in the section "Preliminary direction search" below. In particular, the number [Symbol 1] of detected dominant directional signals is determined together with the corresponding [Symbol 2]. In addition, the HOA sound field components [Symbol 4] that are (assumed to be) created by the corresponding individual dominant sound sources are computed, together with the corresponding instantaneous directional signals [Symbol 3] (i.e. general plane wave functions).

The individual preliminary direction estimates and the related quantities are computed sequentially, i.e. first for d = 1, then for d = 2, and so on. In the first step, the directional power distribution of the original HOA representation C(k) is computed as proposed in European patent application No. 12305537.8 and subsequently analysed for the presence of a dominant sound source. If a dominant sound source is detected, the respective preliminary direction estimate [Symbol 5] is computed. In addition, the corresponding directional signal x_INST^(1)(k) is estimated, together with the component C_{DOM,CORR}^(1)(k) of the current frame C(k) that is assumed to be created by this sound source. C_{DOM,CORR}^(1)(k) is assumed to represent the component of C(k) that is correlated with the directional signal x_INST^(1)(k). Finally, the HOA component C_{DOM,CORR}^(1)(k) is subtracted from C(k) to obtain the residual HOA representation C_REM^(2)(k). The d-th (d ≥ 2) preliminary direction is estimated in exactly the same way as the first one, except that the residual HOA representation C_REM^(d)(k) is used instead of C(k). This explicitly ensures that the sound field components created by the d-th sound source found are excluded from the further direction search.
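The sequential search on a shrinking residual can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the patent's exact computation: each test direction is represented by a hypothetical vector of channel weights (a real system would derive these from spherical harmonics), the correlated component is removed by projecting every channel onto the found directional signal, and the stopping threshold `min_power` is an assumed parameter.

```python
def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def plane_wave_signal(frame, weights):
    """Weighted sum of the HOA channels: the signal seen from one test direction."""
    return [dot(weights, column) for column in zip(*frame)]


def search_directions(frame, test_weights, max_sources, min_power):
    """Return the indices of the test directions found, in order of detection."""
    residual = [list(row) for row in frame]      # the residual starts as the frame
    found = []
    for _ in range(max_sources):
        signals = [plane_wave_signal(residual, w) for w in test_weights]
        powers = [dot(x, x) for x in signals]
        q = max(range(len(powers)), key=powers.__getitem__)
        if powers[q] < min_power:                # no dominant source left
            break
        x = signals[q]
        found.append(q)
        # Remove from every channel the part that is correlated with x.
        next_residual = []
        for row in residual:
            gain = dot(row, x) / powers[q]
            next_residual.append([c - gain * s for c, s in zip(row, x)])
        residual = next_residual
    return found


# A single source spread over two channels is detected once, not twice.
source = [1.0, -1.0, 1.0, -1.0]
frame = [source, [0.5 * s for s in source]]
weights = [[1.0, 0.0], [0.0, 1.0]]
print(search_directions(frame, weights, max_sources=2, min_power=1e-9))  # prints [0]
```

In this toy case, a search based on power alone would still see energy in the second test direction, whereas the residual is empty after the first source has been removed.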

In the direction assignment step or stage 13, the dominant sound sources found in step/stage 11 for the k-th frame are assigned to the corresponding sound sources (assumed to be) active in the (k−1)-th frame. On the one hand, the assignment is accomplished by comparing the preliminary direction estimates [Symbol 6] for the current frame k with the smoothed directions of the sound sources (assumed to be) active in the (k−1)-th frame. These smoothed directions are contained in the set G_{Ω,DOM,ACT}(k−1), and their indices are contained in the set J_{DOM,ACT}(k−1). On the other hand, the assignment exploits the correlation between the instantaneous directional signals [Symbol 7] of the dominant sound sources detected in frame k and the directional signals X_ACT(k−1) of the sound sources (assumed to be) active in the (k−1)-th frame. The result of the assignment is formulated by the assignment function [Symbol 8], where D denotes the maximum number of sound sources expected to be tracked. This means that the d-th newly found sound source is assigned to the previously active sound source with index f_{A,k}(d).

In the step or stage 14 of model-based computation of smoothed dominant sound source directions, the smoothed dominant source directions [Symbol 9] are computed based on the statistical sound source movement model proposed in European patent application No. 12306485.9. The computation uses the set J_{DOM,ACT}(k−1) of indices of the dominant sound sources active in frame k−1, the set G_{Ω,DOM,ACT}(k−1) of the corresponding dominant source direction estimates in frame k−1, the set [Symbol 10] of the respective source movement angles between frames k−2 and k−1, the HOA sound field components [Symbol 11] assumed to be created by the dominant sound sources found, and the assignment function f_{A,k}. A detailed description of this model-based smoothing procedure is given in the section "Model-based computation of smoothed dominant sound source directions" below.

In the last step or stage 15, the indices and directions of the currently active dominant sound sources, which are assumed to be contained in the sets J_{DOM,ACT}(k) and G_{Ω,DOM,ACT}(k), respectively, are determined using the smoothed dominant source directions [Symbol 12] from step/stage 14 together with the sets G_{Ω,DOM,ACT}(k−1) and J_{DOM,ACT}(k−1), which contain the smoothed directions and the respective indices of the sound sources assumed to be active in the (k−1)-th frame. The purpose of this operation is to avoid spuriously deactivating sound sources that have not been detected for a small number of successive frames.

Step or stage 12 computes the directional signals of the sound sources assumed to be active in the (k−1)-th frame, using the HOA representation C(k−1) of frame k−1 and the set G_{Ω,DOM,ACT}(k−1) of smoothed directions of the sound sources assumed to be active in the (k−1)-th frame. The computation is based on the principle of mode matching described in M. A. Poletti, "Three-Dimensional Surround Sound Systems Based on Spherical Harmonics", J. Audio Eng. Soc., vol. 53(11), pp. 1004-1025, 2005.

In the source movement angle estimation step or stage 16, the set [Symbol 13] of movement angles of the dominant sound sources active in frame k−1 is computed from the two sets G_{Ω,DOM,ACT}(k−1) and G_{Ω,DOM,ACT}(k−2) of smoothed direction estimates of the sound sources assumed to be active in the (k−1)-th and (k−2)-th frames, respectively. The movement is understood to take place between frames k−2 and k−1. The movement angle of an active dominant sound source is the arc between its smoothed direction estimate in frame k−2 and its smoothed direction estimate in frame k−1.

Remark: if no direction estimate for frame k−2 is available for a dominant sound source assumed to be active in frame k−1, the respective movement angle can be set to the maximum value of π. In general, when the processing starts with a first frame k, values for frame k−1 are not yet available, and the corresponding sets or values input to the steps or stages of FIG. 1 are empty or set to zero, respectively.

この動作は、この音源の次の方向についての事前確率を、全ての可能な方向にわたってほぼ一様にならしめる。以下の「目下アクティブなドミナント音源のインデックス及び方向の決定」の項を参照されたい。 This action makes the prior probabilities for the next direction of the sound source almost uniform across all possible directions. See “Determining the Index and Direction of the Currently Active Dominant Sound Source” below.
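
As an illustration of the movement angle described above, the following minimal sketch computes the arc between two direction estimates given as (inclination, azimuth) pairs and falls back to π when no earlier estimate exists. The function names `to_unit_vector` and `movement_angle`, and the use of `None` for a missing estimate, are assumptions made for this example and are not taken from the patent text.

```python
import math

def to_unit_vector(theta, phi):
    """Convert a direction (inclination theta, azimuth phi) to a Cartesian unit vector."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def movement_angle(prev_dir, curr_dir):
    """Great-circle angle in radians between two (theta, phi) directions.

    prev_dir is None when no estimate for frame k-2 is available; the
    angle is then set to its maximum value pi, as the note above allows.
    """
    if prev_dir is None:
        return math.pi
    a = to_unit_vector(*prev_dir)
    b = to_unit_vector(*curr_dir)
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot)))
```

A source that moved from straight ahead to the left within the horizontal plane would, under these conventions, yield an angle of about π/2.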

フレーム遅延１７１乃至１７４は、夫々の信号を１フレームずつ遅延させている。 Frame delays 171 to 174 delay each signal one frame at a time.

以下で、上記のステップ及び段階は、より詳細に説明される。 In the following, the above steps and stages will be described in more detail.

［予備的な方向探索］
予備的な方向探索のステップ／段階１１で、（フレームｋにおいて）存在するドミナント音源の現在数
［外１４］

及び夫々の方向
［外１５］

は推定される。加えて、個々の音源によって生成されると考えられるＨＯＡ音場成分
［外１６］

は、対応する指向性信号
［外１７］

（すなわち、一般平面波関数）とともに、計算される。全ての先に列挙された量は、最初に方向インデックスｄ＝１について、次いでｄ＝２について、そして以降同様に、
［外１８］

になるまで、計算される。

[Preliminary direction search]
In the preliminary direction search step/stage 11, the current number [outside 14] of dominant sound sources present (in frame k) and the respective directions [outside 15] are estimated. In addition, the HOA sound field components [outside 16] assumed to be created by the individual sound sources are computed together with the corresponding directional signals [outside 17] (i.e. general plane wave functions). All of these quantities are computed first for direction index d = 1, then for d = 2, and so on, until [outside 18] is reached.

単一の方向ｄインデックスについての計算プロシージャは、図２に表されている。（ｄ−１）番目の方向の推定後に生成される残りのＨＯＡ表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）（ｋ番目の時間フレームについてのｄ番目の方向の推定に関連する。）は、この段階へ入力される。それによって、ループの開始時にＣ_ＲＥＭ ^（１）（ｋ）は原のＨＯＡフレームＣ（ｋ）に対応すると理解される。第１のステップ又は段階２１で、残りのＨＯＡ表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）の指向性電力分布ｐ^（ｄ）（ｋ）は、単位球面上でほぼ一様に分布する所定の数Ｑ個の離散的な試験方向Ω_ｑ，ｑ＝１，．．．，Ｑについて計算される。具体的には、夫々の試験方向Ω_ｑは、次の式（２）に従って、傾斜角θ_ｑ∈［０，π］及びアジマス角φ_ｑ∈［０，２π］を含むベクトルとして定義される： The computation procedure for a single direction index d is shown in FIG. 2. The residual HOA representation C_REM^(d)(k) remaining after the estimation of the (d−1)-th direction (and relevant to the estimation of the d-th direction for the k-th time frame) is input to this stage. At the start of the loop, C_REM^(1)(k) is understood to correspond to the original HOA frame C(k). In a first step or stage 21, the directional power distribution p^(d)(k) of the residual HOA representation C_REM^(d)(k) is computed for a predefined number Q of discrete test directions Ω_q, q = 1, ..., Q, which are distributed approximately uniformly on the unit sphere. Specifically, each test direction Ω_q is defined as a vector containing an inclination angle θ_q ∈ [0, π] and an azimuth angle φ_q ∈ [0, 2π] according to equation (2): Ω_q := (θ_q, φ_q)^T.

このとき、（・）^Ｔは、転置を表す。指向性電力分布は、次のベクトル式（３）によって表される：

Here, (·)^T denotes transposition. The directional power distribution is represented by the vector of equation (3): p^(d)(k) := (p_1^(d)(k), ..., p_Q^(d)(k))^T.

その成分ｐ_ｑ ^（ｄ）（ｋ）は、ｋ番目の時間フレームについての方向Ω_ｑに関連した表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）に残っている全てのドミナント音源の結合電力を表す。Ｃ_ＲＥＭ ^（ｄ）（ｋ）からの指向性電力分布ｐ^（ｄ）（ｋ）の実際の計算は、欧州特許出願第１２３０５５３７．８号で提案されているように実行されてよい。

Its component p_q^(d)(k) represents the joint power of all dominant sound sources remaining in the representation C_REM^(d)(k) that relate to the direction Ω_q for the k-th time frame. The actual computation of the directional power distribution p^(d)(k) from C_REM^(d)(k) may be performed as proposed in European Patent Application No. 12305537.8.

ステップ又は段階２２で、指向性電力分布ｐ^（ｄ）（ｋ）は、ドミナント音源の存在について解析される。ドミナント源を検出する１つの方法は、以下の「ドミナント音源の存在についての解析」の項で記載される。ドミナント音源の不在が検出される場合は、方向探索は停止され、見つけられたドミナント方向の総数は
［外１９］

に設定される。そうではなく、ドミナント音源が検出される場合は、座標原点に対するその方向
［外２０］

の一応の推定がステップ又は段階２３で計算される。詳細については、以下の「ドミナント音源方向の探索」の項を参照されたい。 In step or stage 22, the directional power distribution p^(d)(k) is analysed for the presence of a dominant sound source. One way of detecting a dominant source is described in the section “Analysis of the presence of a dominant sound source” below. If the absence of a dominant sound source is detected, the direction search is stopped and the total number of dominant directions found is set to [outside 19]. Otherwise, if a dominant sound source is detected, a preliminary estimate [outside 20] of its direction with respect to the coordinate origin is computed in step or stage 23. For details, see the section “Search for the dominant sound source direction” below.

引き続いて、ｄ番目のドミナント音源によって生成されると考えられる音場成分の夫々の指向性信号ｘ_ＩＮＳＴ ^（ｄ）（ｋ）及びＨＯＡ表現Ｃ_{ＤＯＭ，ＣＯＲＲ} ^（ｄ）（ｋ）は、以下の「ドミナント音源によって生成される音場のドミナント指向性信号及びＨＯＡ表現の計算」の項においてより詳細に記載されるように、ステップ又は段階２４で計算される。 Subsequently, the respective directional signal x_INST^(d)(k) and the HOA representation C_{DOM,CORR}^(d)(k) of the sound field component assumed to be created by the d-th dominant sound source are computed in step or stage 24, as described in more detail in the section “Computation of the dominant directional signal and the HOA representation of the sound field created by the dominant sound source” below.

最後に、ステップ又は段階２５で、ＨＯＡ成分Ｃ_{ＤＯＭ，ＣＯＲＲ} ^（ｄ）（ｋ）は、次（すなわち、（ｄ＋１）番目）の指向性音源の探索のために使用される残余ＨＯＡ表現Ｃ_ＲＥＭ ^{（ｄ＋１）}（ｋ）を得るために、Ｃ_ＲＥＭ ^（ｄ）（ｋ）から減じられる。それによって、明らかに当然ながら、見つけられたｄ番目の音源によって生成される音場成分は、更なる方向探索については除外される。 Finally, in step or stage 25, the HOA component C_{DOM,CORR}^(d)(k) is subtracted from C_REM^(d)(k) to obtain the residual HOA representation C_REM^(d+1)(k), which is used for the search of the next (i.e. (d+1)-th) directional sound source. As a result, the sound field component created by the d-th sound source found is excluded from the further direction search.
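
The control flow of steps 21 to 25 can be summarised in a short sketch. It is only an outline under stated assumptions: the three callables `power_distribution`, `is_dominant` and `extract_component` are hypothetical stand-ins for the computations the patent describes in the following sections, the HOA frame is modelled as a list of rows, and the `max_sources` safety limit is not part of the source text.

```python
def search_dominant_directions(c_frame, power_distribution, is_dominant,
                               extract_component, max_sources=8):
    """Iteratively find dominant directions and subtract their components."""
    residual = c_frame            # C_REM^(1)(k) equals the original frame C(k)
    found = []                    # (test direction index, signal, HOA component)
    powers = []                   # p^(1)(k), ..., p^(d)(k) seen so far
    while len(found) < max_sources:
        p = power_distribution(residual)                 # step 21
        powers.append(p)
        if not is_dominant(powers):                      # step 22
            break
        q = max(range(len(p)), key=p.__getitem__)        # step 23: strongest test direction
        x_inst, c_dom = extract_component(residual, q)   # step 24
        # Step 25: subtract the component so it is excluded from later searches.
        residual = [[r - c for r, c in zip(r_row, c_row)]
                    for r_row, c_row in zip(residual, c_dom)]
        found.append((q, x_inst, c_dom))
    return found, residual
```

The loop stops as soon as the presence test fails, so the number of entries in `found` plays the role of the total number of dominant directions for the frame.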

●ドミナント音源の存在の解析
Ｃ_ＲＥＭ ^（ｄ）（ｋ）によって表される音場内でドミナント音源の存在を検出するために、残りのＨＯＡ表現Ｃ_ＲＥＭ ^（１）（ｋ），．．．，Ｃ_ＲＥＭ ^（ｄ）（ｋ）の指向性電力分布ｐ^（１）（ｋ），．．．，ｐ^（ｄ）（ｋ）が考慮される。一方で、次の式（４）で表される分散比をモニタすることが妥当であることが実験的に分かっている：
● Analysis of the presence of a dominant sound source
To detect the presence of a dominant sound source in the sound field represented by C_REM^(d)(k), the directional power distributions p^(1)(k), ..., p^(d)(k) of the residual HOA representations C_REM^(1)(k), ..., C_REM^(d)(k) are considered. On the one hand, it has been found experimentally that it is reasonable to monitor the variance ratio given by equation (4):

この分散比は、最初のＨＯＡ表現Ｃ（ｋ）によって表される音場と比べられる、残りのＨＯＡ表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）によって表される音場の重要性の指標と見なされ得る。小さい比δ_ｐ ^（ｄ）（ｋ）は、ＨＯＡ表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）によって表される音源のいずれもがドミナントであると見なされるべきでないことを示す。他方で、次の式（５）によって表される、正規化された指向性電力分布ｐ_ＮＯＲＭ ^（ｄ）（ｋ）及びｐ_ＮＯＲＭ ^{（ｄ−１）}（ｋ）の分散の比を見ることも妥当である：

This variance ratio can be regarded as a measure of the importance of the sound field represented by the residual HOA representation C_REM^(d)(k) compared with the sound field represented by the initial HOA representation C(k). A small ratio δ_p^(d)(k) indicates that none of the sound sources represented by C_REM^(d)(k) should be considered dominant. On the other hand, it is also reasonable to look at the ratio of the variances of the normalised directional power distributions p_NORM^(d)(k) and p_NORM^(d−1)(k), given by equation (5):

次の式（６）によって表される正規化された電力分布の要素ｐ_{ｑ、ＮＯＲＭ} ^（ｄ）（ｋ），ｑ＝１，．．．，Ｑは、次の式（７）によって、ｐ^（ｄ）（ｋ）の要素に応じて定義される：

The elements p_{q,NORM}^(d)(k), q = 1, ..., Q, of the normalised power distribution of equation (6) are defined from the elements of p^(d)(k) by equation (7):

分散ｖａｒ（ｐ_ＮＯＲＭ ^（ｄ）（ｋ））は、指向性電力分布ｐ^（ｄ）（ｋ）の一様性の指標として見なされ得る。特に、分散は、全ての入力方向にわたって電力がより一様に分布するほどますます小さくなる。空間に広がったノイズの極端な場合において、分散ｖａｒ（ｐ_ＮＯＲＭ ^（ｄ）（ｋ））は、ゼロの値に近づくべきである。そのような検討に基づき、分散比δ_{ｐ，ＮＯＲＭ} ^（ｄ）（ｋ）は、ＨＯＡ表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）の指向性電力がＣ_ＲＥＭ ^{（ｄ−１）}（ｋ）の指向性電力よりも一様に分布しているかどうかを示す。

The variance var(p_NORM^(d)(k)) can be regarded as a measure of the uniformity of the directional power distribution p^(d)(k). In particular, the variance becomes smaller the more uniformly the power is distributed over all directions of incidence. In the limiting case of spatially diffuse noise, the variance var(p_NORM^(d)(k)) should approach zero. Based on these considerations, the variance ratio δ_{p,NORM}^(d)(k) indicates whether the directional power of the HOA representation C_REM^(d)(k) is distributed more uniformly than that of C_REM^(d−1)(k).

上記の検討を要約するよう、Ｃ（ｋ）によって表される音場には少なくとも単一のドミナント音源が常に存在していると考えられ得る。すなわち、
［外２１］

である。更なるドミナント音源は、分散比δ_ｐ ^（ｄ）（ｋ）の値がある所定の閾値ε_ｐ＜１を上回ったままであり、且つ、分散比の値は１よりも小さい場合に、（ｄ≧２について）検出される。すなわち、ドミナント音源は、次の関係式（８）が成立する場合に、（ｄ≧２について）検出される： To summarise the above considerations, at least a single dominant sound source can be assumed to be always present in the sound field represented by C(k), i.e. [outside 21]. Further dominant sound sources are detected (for d ≥ 2) if the value of the variance ratio δ_p^(d)(k) is still above a predefined threshold ε_p < 1 and the value of the variance ratio δ_{p,NORM}^(d)(k) is smaller than one. That is, a dominant sound source is detected (for d ≥ 2) if relation (8) holds:

ε_ｐの値は、何が‘ドミナント’を意味するのかの解釈に対して設定されるべきである。発明者は、妥当な選択がε_ｐ＝１０^−３によって与えられることに気付いた。

The value of ε_p should be set according to the intended interpretation of ‘dominant’. The inventors found ε_p = 10^−3 to be a reasonable choice.
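
The detection test for d ≥ 2 can be sketched as follows. Because equations (4) to (8) are not reproduced in this text, the sketch rests on assumptions: δ_p is taken as var(p^(d)) / var(p^(1)), δ_{p,NORM} as the ratio of the variances of the sum-normalised distributions for d and d−1, and the second condition of relation (8) is read as δ_{p,NORM} < 1. The helper names are illustrative only.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def normalise(p):
    """Scale a power distribution so that its elements sum to one (assumed form of eq. (7))."""
    total = sum(p)
    return [v / total for v in p]

def further_source_detected(p_first, p_prev, p_curr, eps_p=1e-3):
    """Assumed reading of relation (8) for d >= 2.

    p_first: p^(1)(k), p_prev: p^(d-1)(k), p_curr: p^(d)(k).
    Inputs with zero variance or zero total power are not handled here.
    """
    delta_p = variance(p_curr) / variance(p_first)
    delta_p_norm = variance(normalise(p_curr)) / variance(normalise(p_prev))
    return delta_p > eps_p and delta_p_norm < 1.0
```

With ε_p = 10^−3, a residual whose power variance has dropped to a tiny fraction of that of the original frame is rejected regardless of its shape.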

●ドミナント音源方向の探索
ｄ番目の音源が検出された後、その方向
［外２２］

の一応の推定は、指向性電力分布ｐ^（ｄ）（ｋ）を用いることによって探索される。探索は、指向性電力が最大であるところの試験方向Ω_ｑを採ることによって、達成される。すなわち：

● Search for the dominant sound source direction
After the d-th sound source has been detected, a preliminary estimate [outside 22] of its direction is searched for using the directional power distribution p^(d)(k). The search is accomplished by taking the test direction Ω_q for which the directional power is maximal, i.e.:

●ドミナント音源によって生成される音場のドミナント指向性信号及びＨＯＡ表現の計算
その後に、ドミナント源方向の一応の推定
［外２３］

を決定した後、夫々の指向性信号ｘ_ＩＮＴ ^（ｄ）（ｋ）は、同じ音源によって生成されると考えられる音場成分のＨＯＡ表現Ｃ_{ＤＯＭ，ＣＯＲＲ} ^（ｄ）（ｋ）とともに、図３に従って計算される。ステップ又は段階３１で、単位球面上にほぼ一様に分布すると考えられるＯ個のサンプリング位置Ω_{ＩＮＩＴ，ｏ}，ｏ＝１，．．．，Ｏから成る固定の予め定義された球面グリッドＧ_{Ω，ＩＮＩＴ}は回転されて、回転されたサンプリング位置Ω_{ＲＯＴ，ｏ} ^（ｄ）（ｋ），ｏ＝１，．．．，Ｏから成るグリッドＧ_{Ω，ＲＯＴ} ^（ｄ）（ｋ）を与える。回転は、第１の回転されたサンプリング位置Ω_{ＲＯＴ，１} ^（ｄ）（ｋ）が一応の方向推定
［外２４］

に対応するように実行される。

● Computation of the dominant directional signal and the HOA representation of the sound field created by the dominant sound source
After the preliminary estimate [outside 23] of the dominant source direction has been determined, the respective directional signal x_INST^(d)(k) is computed according to FIG. 3, together with the HOA representation C_{DOM,CORR}^(d)(k) of the sound field component assumed to be created by the same sound source. In step or stage 31, a fixed predefined spherical grid G_{Ω,INIT} consisting of O sampling positions Ω_{INIT,o}, o = 1, ..., O, assumed to be distributed approximately uniformly on the unit sphere, is rotated to give the grid G_{Ω,ROT}^(d)(k) consisting of the rotated sampling positions Ω_{ROT,o}^(d)(k), o = 1, ..., O. The rotation is performed such that the first rotated sampling position Ω_{ROT,1}^(d)(k) corresponds to the preliminary direction estimate [outside 24].

ステップ又は段階３２で、ＨＯＡ表現Ｃ_ＲＥＭ ^（ｄ）（ｋ）は、いわゆる空間領域に変形される。このとき、それは、回転されたグリッド方向Ω_{ＲＯＴ，ｏ} ^（ｄ）（ｋ），ｏ＝１，．．．，Ｏから観測者位置（すなわち、座標原点）に作用すると考えられるＯ個の平面波関数（グリッド指向性信号とも呼ばれる。）ｘ_{ｏ，ＩＮＳＴ} ^（ｄ）（ｋ），ｏ＝１，．．．，Ｏによって等価に表される。平面波関数ｘ_{ｏ，ＩＮＳＴ} ^（ｄ）（ｋ），ｏ＝１，．．．，Ｏを計算するよう、回転されたグリッド方向に対するモード行列
［外２５］

は、次のように、式（１１）を用いて式（１０）の通りに計算される： In step or stage 32, the HOA representation C_REM^(d)(k) is transformed into the so-called spatial domain, where it is equivalently represented by O plane wave functions (also called grid directional signals) x_{o,INST}^(d)(k), o = 1, ..., O, assumed to impinge on the observer position (i.e. the coordinate origin) from the rotated grid directions Ω_{ROT,o}^(d)(k), o = 1, ..., O. To compute these plane wave functions, the mode matrix [outside 25] with respect to the rotated grid directions is computed according to equation (10), using equation (11):

次の式（１２）の通りに、夫々のグリッド指向性信号ｘ_{ｏ，ＩＮＳＴ} ^（ｄ）（ｋ）を、ｋ番目の時間フレームの個々のサンプルから成る行ベクトルであるとする：

Let each grid directional signal x _{o, INST} ^(d) (k) be a row vector consisting of individual samples of the k th time frame, as in equation (12):

このとき、Ｌは、解析されるＨＯＡ表現の長さ（サンプルにおける）を表し、全てのグリッド指向性信号の計算は、次の式（１３）の通りに、球面調和関数変換（説明のために、以下の「球面調和関数変換」を参照されたい。）によって達成される：

Here, L denotes the length (in samples) of the analysed HOA representation frame. All grid directional signals are then computed by a spherical harmonic transform (for an explanation, see the section “Spherical harmonic transform” below) as in equation (13):

ドミナント音源方向の一応の推定
［外２６］

は、回転されたサンプリング位置Ω_{ＲＯＴ，１} ^（ｄ）（ｋ）に対応するので、一般平面波関数ｘ_{１，ＩＮＳＴ} ^（ｄ）（ｋ）は、所望のドミナント方向信号ｘ_ＩＮＳＴ ^（ｄ）（ｋ）と見なされ得る。すなわち：

Since the preliminary estimate [outside 26] of the dominant sound source direction corresponds to the rotated sampling position Ω_{ROT,1}^(d)(k), the general plane wave function x_{1,INST}^(d)(k) can be regarded as the desired dominant directional signal x_INST^(d)(k), i.e.:

ｄ番目の音源によって生成されるＣ_ＲＥＭ ^（ｄ）（ｋ）のその成分を決定するよう、ステップ又は段階３３で、この成分は、ｘ_ＩＮＳＴ ^（ｄ）（ｋ）から予測され得る平面波関数によって等価に表現されると仮定される。よって、グリッド指向性信号ｘ_{ｏ，ＩＮＳＴ} ^（ｄ）（ｋ），ｏ＝２，．．．，Ｏは、ｘ_ＩＮＳＴ ^（ｄ）（ｋ）から予測されるよう試みられる。予測された信号は、
［外２７］

によって表される。

To determine the component of C_REM^(d)(k) created by the d-th sound source, it is assumed in step or stage 33 that this component is equivalently represented by the plane wave functions that can be predicted from x_INST^(d)(k). Hence, an attempt is made to predict the grid directional signals x_{o,INST}^(d)(k), o = 2, ..., O, from x_INST^(d)(k). The predicted signals are denoted by [outside 27].

そのような予測を達成する１つの方法は、予測される信号
［外２８］

を、フィルタが予測誤差を最小限するように決定される線形フィルタリングによってｘ_ＩＮＳＴ ^（ｄ）（ｋ）から生成されると考えることである。フィルタが（解析フレームの存続期間と比較して）ごく短い存続期間の有限インパルス応答（ＦＩＲ）フィルタであると考えられる場合は、予測誤差の最小化は、最先端の最小二乗技術を用いることによって達成され得る。 One way to accomplish such a prediction is to assume that the predicted signals [outside 28] are created from x_INST^(d)(k) by linear filtering, with the filters determined such that the prediction error is minimised. If the filters are assumed to be finite impulse response (FIR) filters of very short duration (compared with the duration of the analysis frame), the prediction error can be minimised using standard least-squares techniques.
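
A minimal sketch of such a least-squares FIR prediction for a single grid directional signal is given below. It solves the normal equations with plain Gaussian elimination; the function names `fit_fir` and `predict`, the zero-padding before the frame start, and the choice of solver are assumptions for illustration, not the method prescribed by the patent.

```python
def fit_fir(x, y, taps):
    """Return FIR coefficients h minimising sum_n (y[n] - sum_j h[j] * x[n - j])**2."""
    n = len(x)
    # Delayed copies of x (zero before the frame start) serve as regressors.
    cols = [[x[i - j] if i - j >= 0 else 0.0 for i in range(n)] for j in range(taps)]
    # Build the augmented normal equations [R | r] with R = X^T X and r = X^T y.
    aug = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(taps)]
           + [sum(a * b for a, b in zip(cols[i], y))] for i in range(taps)]
    # Forward elimination with partial pivoting (assumes R is non-singular).
    for col in range(taps):
        pivot = max(range(col, taps), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, taps):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, taps + 1):
                aug[row][c] -= factor * aug[col][c]
    # Back substitution.
    h = [0.0] * taps
    for row in reversed(range(taps)):
        partial = sum(aug[row][c] * h[c] for c in range(row + 1, taps))
        h[row] = (aug[row][taps] - partial) / aug[row][row]
    return h

def predict(x, h):
    """Filter x with the FIR coefficients h (zero initial state)."""
    return [sum(h[j] * x[i - j] for j in range(len(h)) if i - j >= 0)
            for i in range(len(x))]
```

In the setting above, `x` would hold the samples of x_INST^(d)(k) and `y` those of one grid directional signal x_{o,INST}^(d)(k); `predict(x, fit_fir(x, y, taps))` then gives the corresponding predicted signal.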

最後に、ドミナント音源信号ｘ_ＩＮＳＴ ^（ｄ）（ｋ）及び全ての予測された相関成分のＨＯＡ表現は、次の式（１５）の通りに、逆球面調和関数変換（説明のために、以下の「球面調和関数」の項を参照されたい。）によって、ステップ又は段階３４で求められる： Finally, the HOA representation of the dominant sound source signal x_INST^(d)(k) and of all predicted correlated components is obtained in step or stage 34 by the inverse spherical harmonic transform (for an explanation, see the section “Spherical harmonic transform” below), as in equation (15):

［以前にアクティブであったドミナント音源の指向性信号の計算］
（ｋ−１）番目のフレームにおいてアクティブであると考えられる音源の指向性信号
［外２９］

は、式（２０）に従って行列Ｘ_ＡＣＴ（ｋ−１）内に含まれる。この行列は、次の式（１６）によってモードマッチング（上記のPolettiの文献を参照されたい。）を用いて計算される：

[Computation of the directional signals of previously active dominant sound sources]
The directional signals [outside 29] of the sound sources assumed to be active in the (k−1)-th frame are contained in the matrix X_ACT(k−1) according to equation (20). This matrix is computed by mode matching (see the Poletti reference above) as in equation (16):

このとき、Ｃ（ｋ−１）は、原のＨＯＡ音場表現の（ｋ−１）番目のフレームを表し、
［外３０］

は、（ｋ−１）番目のフレームにおいてアクティブであると考えられる音源の方向
［外３１］

に対するモード行列を表す。モード行列
［外３２］

は、次のように、式（１８）を用いて式（１７）によって計算される：

Here, C(k−1) denotes the (k−1)-th frame of the original HOA sound field representation, and [outside 30] denotes the mode matrix with respect to the directions [outside 31] of the sound sources assumed to be active in the (k−1)-th frame. The mode matrix [outside 32] is computed according to equation (17), using equation (18):

［方向割り当て］
上述されたように、一方で、図１のステップ／段階１３での割り当ては、一応の方向推定
［外３３］

と、（ｋ−１）番目のフレームにおいてアクティブであると考えられる音源の平滑化された方向とを比較することによって、達成される。この平滑化された方向は、次の式（１９）によって表される組に含まれる：

[Direction assignment]
As mentioned above, the assignment in step/stage 13 of FIG. 1 is accomplished, on the one hand, by comparing the preliminary direction estimates [outside 33] with the smoothed directions of the sound sources assumed to be active in the (k−1)-th frame. These smoothed directions are contained in the set given by equation (19):

このとき、ｉ_{ＡＣＴ，ｋ−１}（ｄ′）は、（ｋ−１）番目のフレームにおいてアクティブであると考えられるｄ′番目の音源のインデックスを表す。特に、
［外３４］

の組の間の角度
［外３５］

が小さければ小さいほど、ｄ番目の新たに見つけられたドミナント音源方向は、インデックスｉ_{ＡＣＴ，ｋ−１}（ｄ′）を持った以前にアクティブであった音源に対応する可能性がますます高くなると考えられる。

Here, i_{ACT,k−1}(d′) denotes the index of the d′-th sound source assumed to be active in the (k−1)-th frame. In particular, the smaller the angle [outside 35] between the pair [outside 34], the more likely the d-th newly found dominant sound source direction is assumed to correspond to the previously active sound source with index i_{ACT,k−1}(d′).

他方で、割り当てのために、フレームｋでの検出されたドミナント音源の瞬時指向性信号
［外３６］

と、（ｋ−１）番目のフレームにおいてアクティブであると考えられる音源の指向性信号Ｘ_ＡＣＴ（ｋ−１）との間の相関が利用される。ここで、フレームＸ_ＡＣＴ（ｋ−１）は、次の式（２０）の通りに、（ｋ−１）番目のフレームにおいてアクティブであると考えられる音源の個々の指向性信号
［外３７］

から成ると考えられる： On the other hand, the assignment exploits the correlation between the instantaneous directional signals [outside 36] of the dominant sound sources detected at frame k and the directional signals X_ACT(k−1) of the sound sources assumed to be active in the (k−1)-th frame. Here, the frame X_ACT(k−1) is assumed to consist of the individual directional signals [outside 37] of the sound sources assumed to be active in the (k−1)-th frame, as in equation (20):

この定義を用いると、２つの信号
［外３８］

の間の相関係数
［外３９］

の絶対値が高ければ高いほど、ｄ番目の新たに見つけられたドミナント音源方向は、インデックスｉ_{ＡＣＴ，ｋ−１}（ｄ′）を持った以前にアクティブであった音源に対応する可能性がますます高くなると仮定される。そのような仮定は、相関係数が２つの信号の間の線形依存性のための指標を与えると事実によって正当化される。

With this definition, it is assumed that the higher the absolute value of the correlation coefficient [outside 39] between the two signals [outside 38], the more likely the d-th newly found dominant sound source direction corresponds to the previously active sound source with index i_{ACT,k−1}(d′). This assumption is justified by the fact that the correlation coefficient provides a measure of the linear dependence between two signals.

これらの検討に基づき、割り当てを特定する割り当て関数
［外４０］

は、次の費用関数（２１）を最小化するように計算される： Based on these considerations, the assignment function [outside 40], which specifies the assignment, is computed such that it minimises the cost function (21):

（ｋ−１）番目のフレーム内のいずれのアクティブな音源にも属さない方向インデックス
［外４１］

について、角度
［外４２］

は、Θ_ＭＩＮの最小角度に事実上設定されると暗に考えられる。このとき、例えば、Θ_ＭＩＮ＝２π／Ｎ。更に、方向インデックス
［外４３］

についての相関係数
［外４４］

は、事実上ゼロに設定される。最初の動作は、ｄ番目の新たに見つけられた方向
［外４５］

と以前にアクティブであったドミナント音源の方向との間の角度がΘ_ＭＩＮよりも大きい場合に、この新たに見つけられた方向が新しい音源に属する傾向を有するとの効果を有する。

For direction indices [outside 41] that do not belong to any sound source active in the (k−1)-th frame, the angle [outside 42] is implicitly assumed to be virtually set to a minimum angle Θ_MIN, where, for example, Θ_MIN = 2π/N. Further, the correlation coefficient [outside 44] for direction indices [outside 43] is virtually set to zero. The first operation has the effect that the d-th newly found direction [outside 45] tends to be attributed to a new sound source if the angles between it and the directions of the previously active dominant sound sources are greater than Θ_MIN.

割り当ての問題は、H. W. Kuhn，“The Hungarian method for the assignment problem”，Naval research logistics quarterly，vol.2(1-2)，pp.83-97，１９５５年において記載されている周知のハンガリアン法を用いることによって解かれ得る。 The assignment problem can be solved using the well-known Hungarian method described in H. W. Kuhn, “The Hungarian method for the assignment problem”, Naval Research Logistics Quarterly, vol. 2(1-2), pp. 83-97, 1955.
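
To make the structure of the assignment problem concrete, the sketch below finds the minimum-cost assignment by exhaustive search. This is deliberately not the Hungarian method: it is a brute-force stand-in that returns the same optimum for small problems, while the Hungarian method reaches it in polynomial time. The cost matrix entries are placeholders, since cost function (21) is not reproduced in this text, and the function name `best_assignment` is an assumption.

```python
from itertools import permutations

def best_assignment(cost):
    """Assign each newly found direction d to a distinct source index.

    cost[d][j] is a placeholder cost for assigning direction d to source
    index j; there must be at least as many columns as rows.
    Returns (assignment, total_cost) with assignment[d] = chosen index j.
    """
    n_new, n_src = len(cost), len(cost[0])
    best, best_cost = None, float("inf")
    # Enumerate every injective mapping of directions to source indices.
    for perm in permutations(range(n_src), n_new):
        total = sum(cost[d][perm[d]] for d in range(n_new))
        if total < best_cost:
            best, best_cost = perm, total
    return list(best), best_cost
```

Extra columns can model indices of sources that were not active before, so that a newly found direction may be attributed to a new source.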

［平滑化されたドミナント音源方向のモデルベースの計算］
この項は、統計的な音源移動モデルに従って図１のステップ／段階１４における平滑化されたドミナント音源方向の計算に対処する。この計算のための個々のステップは図４に表されており、以下で詳細に説明される。 [Model-based calculation of smoothed dominant sound source direction]
This section addresses the computation of the smoothed dominant sound source directions in step/stage 14 of FIG. 1 according to a statistical sound source movement model. The individual steps of this computation are shown in FIG. 4 and are explained in detail below.

●ドミナント音源方向についての方向の事前確率関数の計算
新たに見つけられたドミナント音源方向についての方向の事前確率関数
［外４６］

は：
・フレーム（ｋ−１）でのアクティブなドミナント音源のインデックスｉ_{ＡＣＴ，ｋ−１}（ｄ′），ｄ′＝１，．．．，Ｄ_ＡＣＴ（ｋ−１）の組Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ−１）と、
・フレーム（ｋ−１）での対応するドミナント音源方向推定
［外４７］

の組Ｇ_{Ω，ＤＯＭ，ＡＣＴ}（ｋ−１）と、
・フレーム（ｋ−２）及び（ｋ−１）の間の夫々の源移動角度
［外４８］

の組
［外４９］

と、
・割り当て関数ｆ_Ａ，ｋと
を用いて、ステップ又は段階４２で計算される。計算は、欧州特許出願第１２３０６４８５．９号において紹介されている単純な音源移動予測モデルに基づく。特に、ｄ番目の新たに見つけられたドミナント音源についての方向の事前確率関数
［外５０］

は、３次元空間における単位球面上のフォンミーゼス−フィッシャー分布の離散バージョンであると考えられる。

● Computation of the prior probability function of the direction for dominant sound source directions
The prior probability function [outside 46] of the direction for the newly found dominant sound source directions is computed in step or stage 42 using:
- the set J_{DOM,ACT}(k−1) of indices i_{ACT,k−1}(d′), d′ = 1, ..., D_ACT(k−1), of the dominant sound sources active at frame (k−1),
- the set G_{Ω,DOM,ACT}(k−1) of the corresponding dominant sound source direction estimates [outside 47] at frame (k−1),
- the set [outside 49] of the respective source movement angles [outside 48] between frames (k−2) and (k−1), and
- the assignment function f_{A,k}.
The computation is based on the simple source movement prediction model introduced in European Patent Application No. 12306485.9. In particular, the prior probability function [outside 50] of the direction for the d-th newly found dominant sound source is assumed to be a discrete version of the von Mises-Fisher distribution on the unit sphere in three-dimensional space.

以下で、方向の事前確率関数
［外５１］

は、次の式（２２）として、個々の試験方向Ω_ｑ，ｑ＝１，．．．，Ｑについての確率
［外５２］

から成るベクトルによって与えられると考えられる： In the following, the prior probability function [outside 51] of the direction is assumed to be given by the vector of equation (22), consisting of the probabilities [outside 52] for the individual test directions Ω_q, q = 1, ..., Q:

個々の試験方向Ω_ｑについての事前確率を計算するよう、２つの場合が区別される：
ａ）ｄ番目の新たに見つけられたドミナント音源に割り当てられる源インデックスｆ_Ａ，ｋ（ｄ）が組Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ−１）に含まれる場合は、事前確率は、次の式（２３）に従って計算される：

Two cases are distinguished when computing the prior probabilities for the individual test directions Ω_q:
a) If the source index f_{A,k}(d) assigned to the d-th newly found dominant sound source is contained in the set J_{DOM,ACT}(k−1), the prior probability is computed according to equation (23):

このとき、Θ_ｑ，ｄ（ｋ）は、推定される方向
［外５３］

と試験方向Ω_ｑとの間の角度を表す。すなわち：

Here, Θ_{q,d}(k) denotes the angle between the estimated direction [outside 53] and the test direction Ω_q, i.e.:

更に、ｋ_ｄ（ｋ）は、次の式（２５）に従って源移動角度推定
［外５４］

を用いて計算される濃度パラメータを表す：

Further, k_d(k) denotes the concentration parameter, which is computed from the source movement angle estimate [outside 54] according to equation (25):

このとき。Ｃ_Ｄは、次の関係（２６）に設定されてよい：

Here, C_D may be set according to relation (26):

ｋ_ＭＡＸ及びＣ_Ｒのための妥当な値は、次の関係（２７）であることが分かっている（欧州特許出願第１２３０６４８５．９号を参照）：

Reasonable values for k_MAX and C_R have been found to be those of relation (27) (see European Patent Application No. 12306485.9):

この計算の背後にある原理は、以前に音源が移動していなければいないほど、事前確率関数の濃度を増大させることである。音源が以前にたくさん動いている場合は、その一連の方向に関する不確かさは高く、よって、濃度パラメータは小さい値に達するべきである。

The principle behind this computation is to increase the concentration of the prior probability function the less the sound source has moved before. If the sound source has moved considerably before, the uncertainty about its subsequent direction is high, and hence the concentration parameter should take a small value.

ｂ）ｄ番目に新たに見つけられたドミナント音源に割り当てられた源インデックスｆ_Ａ，ｋ（ｄ）が組Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ−１）に含まれない場合は、夫々の音源は、以前にアクティブでなかったと考えられる。結果として、この源の方向に関する演繹的知識は実際には利用可能でない。よって、事前確率関数
［外５５］

は、単位球面において一様であると考えられる。このとき、個々の確率は、全ての試験方向Ω_ｑに関して等しい。すなわち： b) If the source index f_{A,k}(d) assigned to the d-th newly found dominant sound source is not contained in the set J_{DOM,ACT}(k−1), the respective sound source is assumed not to have been active before. Consequently, no a-priori knowledge about the direction of this source is actually available. Hence, the prior probability function [outside 55] is assumed to be uniform on the unit sphere, so that the individual probabilities are equal for all Q test directions Ω_q, i.e. each equals 1/Q.
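
The two cases a) and b) can be illustrated with a small sketch of a discrete prior over the test directions. It assumes that the discrete von Mises-Fisher weights are proportional to exp(κ·cos Θ_{q,d}) and are normalised to sum to one over the Q test directions; the exact normalisation of equation (23) and the formula for the concentration parameter in equation (25) are not reproduced in this text, so the concentration is simply passed in as the argument `kappa`. The function name is hypothetical.

```python
import math

def direction_prior(cos_angles, kappa=None):
    """Discrete prior over Q test directions.

    cos_angles[q] is the cosine of the angle between the previous direction
    estimate of the source and test direction q.
    kappa=None models case b): the source was not active before, so the
    prior is uniform with probability 1/Q for each test direction.
    """
    q_count = len(cos_angles)
    if kappa is None:
        return [1.0 / q_count] * q_count
    # Case a): weights proportional to exp(kappa * cos(angle)).
    # Subtracting 1 in the exponent only rescales all weights and avoids overflow.
    weights = [math.exp(kappa * (c - 1.0)) for c in cos_angles]
    total = sum(weights)
    return [w / total for w in weights]
```

A large `kappa` concentrates the prior around the previous direction, whereas `kappa = 0` gives the same result as the uniform case.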

●ドミナント音源方向についての方向の尤度関数の計算
方向の尤度関数
［外５６］

は、割り当て関数ｆ_Ａ，ｋに加えて、個々の新たに検出されたドミナント音源によって生成されると考えられるＨＯＡ音場成分
［外５７］

を用いて、ステップ又は段階４１で計算される。方向の尤度関数
［外５８］

は、次の式（２９）のように、個々の試験方向Ω_ｑ，ｑ＝１，．．．，Ｑについての尤度
［外５９］

から成るベクトルあると考えられる：

● Computation of the likelihood function of the direction for dominant sound source directions
The likelihood function [outside 56] of the direction is computed in step or stage 41 using, in addition to the assignment function f_{A,k}, the HOA sound field components [outside 57] assumed to be created by the individual newly detected dominant sound sources. The likelihood function [outside 58] of the direction is assumed to be a vector consisting of the likelihoods [outside 59] for the individual test directions Ω_q, q = 1, ..., Q, as in equation (29):

個々の尤度
［外６０］

は、欧州特許出願第１２３０５５３７．８号で記載されるように、試験方向Ω_ｑから作用する一般平面波の電力の近似であるよう計算される。特に：

The individual likelihoods [outside 60] are computed as approximations of the power of a general plane wave impinging from the test direction Ω_q, as described in European Patent Application No. 12305537.8. In particular:

このとき、次の式（３１）で表されるものは、試験方向に対するモードベクトルを表し（なお、Ｓ_ｎ ^ｍ（・）は、以下の「実数値の球面調和関数の定義」の項において記載される実数値の球面調和関数を表す。）、このとき、次の式（３２）で表されるものは、ＨＯＡ表現Ｃ_{ＤＯＭ，ＣＯＲＲ} ^（ｄ）（ｋ）に対するＨＯＡ係数間相関行列を示す：

Here, equation (31) gives the mode vector with respect to the test direction Ω_q (where S_n^m(·) denotes the real-valued spherical harmonics described in the section “Definition of real-valued spherical harmonics” below), and equation (32) gives the inter-coefficient correlation matrix of the HOA representation C_{DOM,CORR}^(d)(k):

●ドミナント音源方向についての方向の事後確率関数の計算
方向の事後確率関数
［外６１］

は、方向の事前確率関数
［外６２］

及び方向の尤度関数
［外６３］

を用いて、ステップ又は段階４３で計算される。ここで、もう一度、方向の事後確率関数
［外６４］

は、次の式（３３）のように、個々の試験方向Ω_ｑ，ｑ＝１，．．．，Ｑについての事後確率
［外６５］

から成るベクトルあると考えられる：

● Computation of the posterior probability function of the direction for dominant sound source directions
The posterior probability function [outside 61] of the direction is computed in step or stage 43 using the prior probability function [outside 62] and the likelihood function [outside 63] of the direction. Here again, the posterior probability function [outside 64] of the direction is assumed to be a vector consisting of the posterior probabilities [outside 65] for the individual test directions Ω_q, q = 1, ..., Q, as in equation (33):

個々の事後確率
［外６６］

は、次の式（３４）ベのように、ベイズの規則に従って計算される（欧州特許出願第１２３０６４８５．９号を参照）：

The individual posterior probabilities [outside 66] are computed according to Bayes' rule, as in equation (34) (see European Patent Application No. 12306485.9):

固定の方向インデックスｄを考えると、式（３４）の分母は夫々の試験方向Ω_ｑについて一定である。続く方向探索のために、事後確率関数の最大値のみが重要である場合に、そのような大域的なスケーリングは不適切である。よって、式（３４）の分母の計算は、計算出力を節約するよう完全に断念され得ることが知られる。

For a fixed direction index d, the denominator of equation (34) is constant over all test directions Ω_q. Since only the maximum of the posterior probability function matters for the subsequent direction search, such a global scaling is irrelevant. Hence, the computation of the denominator of equation (34) can be omitted entirely to save computational effort.
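
The remark about the denominator can be checked with a few lines of code. The sketch below forms the posterior by Bayes' rule over the Q test directions and shows that the index of the maximum is the same whether or not the normalising sum is applied. The function names are assumptions, and the prior and likelihood are plain lists of non-negative numbers.

```python
def unnormalised_posterior(prior, likelihood):
    """Element-wise product prior * likelihood for each test direction."""
    return [p * l for p, l in zip(prior, likelihood)]

def posterior(prior, likelihood):
    """Posterior by Bayes' rule; the denominator is the sum over all test directions."""
    products = unnormalised_posterior(prior, likelihood)
    evidence = sum(products)
    return [u / evidence for u in products]

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)
```

Because the denominator is a positive constant for a fixed d, `argmax(posterior(...))` and `argmax(unnormalised_posterior(...))` always agree, so the smoothed direction can be selected from the unnormalised products.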

●平滑化されたドミナント音源方向の計算
平滑化されたドミナント音源方向
［外６７］

は、事後確率関数
［外６８］

を用いて、ステップ又は段階４４で計算される。特に、フレームｋについて見つけられたｄ番目の音源の平滑化された方向
［外６９］

は、次の事後確率関数において最大値を探すことによって求められる：

● Computation of the smoothed dominant sound source directions
The smoothed dominant sound source directions [outside 67] are computed in step or stage 44 using the posterior probability functions [outside 68]. In particular, the smoothed direction [outside 69] of the d-th sound source found for frame k is obtained by searching for the maximum of the posterior probability function:

［目下アクティブなドミナント音源のインデックス及び方向の決定］
フレームｋでの全てのＤ_ＡＣＴ（ｋ）個のアクティブなドミナント音源のインデックスｉ_{ａｃｔ，ｋ}（ｄ′），ｄ′＝１，．．．，Ｄ_ＡＣＴ（ｋ）の組Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ）、及びフレームｋでの対応するドミナント源方向の推定
［外７０］

の組Ｇ_{Ω，ＤＯＭ，ＡＣＴ}（ｋ）は、フレーム（ｋ−１）での全てのアクティブなドミナント音源方向の平滑化された推定
［外７１］

の組Ｇ_{Ω，ＤＯＭ，ＡＣＴ}（ｋ−１）と、対応するインデックスｉ_{ａｃｔ，ｋ−１}（ｄ），ｄ′＝１，．．．，Ｄ_ＡＣＴ（ｋ−１）と、フレームｋについて求められた平滑化されたドミナント音源方向の推定
［外７２］

とを用いて、図１のステップ又は段階１５で計算される。この演算は、少数の連続したフレームについて検出されていない音源を見かけ上非アクティブにしない目的を持ち、このようなことは、例えば、個々のインパルスの間に短い中断を伴ってインパルス様の音響を生成するカスタネットのような、源について起こり得る。このように、最後（すなわち、（ｋ−１）番目）のふれーむにおいてアクティブであると考えられた音源を、それらが所定数Ｋ_{ＩＮＡＣＴ}の連続するフレームについて検出されなかった場合にのみ非アクティブにすることが妥当である。

[Determination of the indices and directions of the currently active dominant sound sources]
The set J_{DOM,ACT}(k) of indices i_{ACT,k}(d′), d′ = 1, ..., D_ACT(k), of all D_ACT(k) dominant sound sources active at frame k, and the set G_{Ω,DOM,ACT}(k) of the corresponding dominant source direction estimates [outside 70] at frame k, are computed in step or stage 15 of FIG. 1 using the set G_{Ω,DOM,ACT}(k−1) of smoothed estimates [outside 71] of all active dominant sound source directions at frame (k−1), the corresponding indices i_{ACT,k−1}(d′), d′ = 1, ..., D_ACT(k−1), and the smoothed dominant sound source direction estimates [outside 72] obtained for frame k. The purpose of this operation is to avoid spuriously deactivating sound sources that have not been detected for only a small number of successive frames. This may happen for sources such as castanets, which produce impulse-like sounds with short pauses between the individual impulses. It is therefore reasonable to deactivate sound sources that were assumed to be active in the last (i.e. (k−1)-th) frame only if they have not been detected for a predefined number K_INACT of successive frames.

先の検討に従って、第１のステップで、フレーム（ｋ−１）での全てのＤ_ＡＣＴ（ｋ−１）個のアクティブなドミナント音源のインデックスｉ_{ＡＣＴ，ｋ−１}（ｄ′），ｄ′＝１，．．．，Ｄ_ＡＣＴ（ｋ−１）の組Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ−１）と、次の式（３６）で表される全ての新たに検出された音源のインデックスの組との結合された組Ｊ_{ＪＯＩＮＥＤ}（ｋ）は、計算される： Following the above considerations, in a first step the set J_JOINED(k) is computed as the union of the set J_{DOM,ACT}(k−1) of indices i_{ACT,k−1}(d′), d′ = 1, ..., D_ACT(k−1), of all D_ACT(k−1) dominant sound sources active at frame (k−1) and the set of indices of all newly detected sound sources given by equation (36):

すなわち：

Ie:

この組から、所望の組Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ）は、多数のＫ_{ＩＮＡＣＴ}個の前の連続したフレームについて検出されなかった源のインデックスをＪ_{ＪＯＩＮＥＤ}（ｋ）から除外することによって求められる。フレームｋでのアクティブなドミナント音源の数Ｄ_ＡＣＴ（ｋ）は、Ｊ_{ＤＯＭ，ＡＣＴ}（ｋ）の要素の数に設定される。

From this set, the desired set J_{DOM,ACT}(k) is obtained by removing from J_JOINED(k) the indices of those sources that have not been detected for K_INACT previous successive frames. The number D_ACT(k) of active dominant sound sources at frame k is set to the number of elements of J_{DOM,ACT}(k).

最後に、ｉ_{ａｃｔ，ｋ}（ｄ′）がＪ_{ＤＯＭ，ＡＣＴ}（ｋ）の要素を示すとして、ドミナント源方向推定
［外７３］

は、次の式（３８）によって決定される： Finally, with i_{ACT,k}(d′) denoting the elements of J_{DOM,ACT}(k), the dominant source direction estimates [outside 73] are determined by equation (38):

これは、夫々の音源がフレームｋで新たに検出されない場合に、以前にアクティブであったドミナント音源の方向が一定に保たれることを意味する。

This means that the direction of a previously active dominant sound source is kept fixed if the respective sound source is not newly detected at frame k.
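
The bookkeeping described in this section can be sketched with dictionaries keyed by source index. The sketch makes assumptions that the text leaves open: a per-source counter of successive frames without detection (`missed`), the convention that frame k itself counts towards K_INACT, and the function name `update_active_sources`. Directions are stored as opaque values.

```python
def update_active_sources(prev_dirs, missed, detected, k_inact):
    """Update the active source set for frame k.

    prev_dirs: {index: direction} of sources active at frame k-1
    missed:    {index: number of successive frames without detection}
    detected:  {index: smoothed direction} of sources found at frame k
    Returns (active_dirs, new_missed) for frame k.
    """
    joined = set(prev_dirs) | set(detected)      # J_JOINED(k)
    active_dirs, new_missed = {}, {}
    for idx in sorted(joined):
        if idx in detected:
            # Newly detected at frame k: take the new smoothed direction.
            active_dirs[idx] = detected[idx]
            new_missed[idx] = 0
        else:
            count = missed.get(idx, 0) + 1
            if count >= k_inact:
                continue                         # deactivate the source
            # Not detected, but still active: keep the previous direction.
            active_dirs[idx] = prev_dirs[idx]
            new_missed[idx] = count
    return active_dirs, new_missed
```

The number of keys in `active_dirs` corresponds to D_ACT(k), and sources that pause briefly, such as the castanets mentioned above, keep their last direction until the counter reaches `k_inact`.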

［高次アンビソニクスの基本］
高次アンビソニクス（ＨＯＡ）は、音源がないと考えられる関心のあるコンパクトな領域内での音場の記述に基づく。その場合に、関心のある領域内での時間ｔ及び位置ｘでの音圧ｐ（ｔ，ｘ）の時空間的な挙動は、同次波動方程式によって物理的に十分に決定される。以下で、図５に示される球座標系が考えられる。使用される座標系では、ｘ軸は正面位置を指し示し、ｙ軸は左を指し示し、ｚ軸は上を指し示す。空間ｘ（ｒ，θ，φ）^Ｔでの位置は、半径ｒ＞０（すなわち、座標原点までの距離）、極軸ｚから測定される傾斜角度θ∈［０，π］、及びｘ軸からｘ−ｙ平面において反時計回りで測定されるアジマス角φ∈［０，２π］によって表される。（・）^Ｔは転置を表す。 [Basics of higher-order ambisonics]
Higher Order Ambisonics (HOA) is based on the description of a sound field within a compact area of interest that is assumed to be free of sound sources. In that case, the spatio-temporal behaviour of the sound pressure p(t, x) at time t and position x within the area of interest is physically fully determined by the homogeneous wave equation. In the following, the spherical coordinate system shown in FIG. 5 is assumed. In this coordinate system, the x axis points to the frontal position, the y axis points to the left, and the z axis points to the top. A position x = (r, θ, φ)^T in space is represented by a radius r > 0 (i.e. the distance to the coordinate origin), an inclination angle θ ∈ [0, π] measured from the polar axis z, and an azimuth angle φ ∈ [0, 2π] measured counter-clockwise in the x-y plane from the x axis. (·)^T denotes transposition.
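
The coordinate convention above (x to the front, y to the left, z to the top, inclination measured from the z axis, azimuth measured counter-clockwise from the x axis) corresponds to the following conversion. The function names are chosen for this example only, and the inverse mapping assumes r > 0.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Map (radius, inclination, azimuth) to Cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def cartesian_to_spherical(x, y, z):
    """Map Cartesian (x, y, z) with non-zero length to (radius, inclination, azimuth)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    # atan2 returns values in (-pi, pi]; wrap the azimuth into [0, 2*pi).
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return r, theta, phi
```

Under this convention, θ = π/2 and φ = 0 is straight ahead, θ = π/2 and φ = π/2 is to the left, and θ = 0 is directly above the listener.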

次いで、ωが角周波数を表し且つｉが虚数単位を示すとして、Ｆ_ｔ（・）、すなわち、次の式（３９）によって表される、時間に対する音圧のフーリエ変換は、式（４０）に従って、一連の球面調和関数に展開され得ることが示され得る（E. G. Williams，“Fourier Acoustics”，vol.93 of Applied Mathematical Sciences，Academic Press，１９９９年を参照）： Then, assuming that ω represents an angular frequency and i represents an imaginary unit, F _t (·), that is, the Fourier transform of sound pressure with respect to time, represented by the following equation (39), is given by equation (40): Can be shown to be expanded into a series of spherical harmonics (see EG Williams, “Fourier Acoustics”, vol. 93 of Applied Mathematical Sciences, Academic Press, 1999):

In equation (40), c_s denotes the speed of sound and k the angular wave number, which is related to the angular frequency ω by k = ω/c_s. Further, j_n(·) denotes the spherical Bessel functions of the first kind, and S_n^m(θ, φ) denotes the real-valued spherical harmonics of order n and degree m, which are defined in the section "Definition of real-valued spherical harmonics" below. The expansion coefficients A_n^m(k) depend only on the angular wave number k. It is implicitly assumed that the sound pressure is spatially band-limited. The series is therefore truncated with respect to the order index n at an upper limit N, which is called the order of the HOA representation.
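Written out from the definitions in the two preceding paragraphs, equations (39) and (40) would take the following form. This is a reconstruction from the surrounding text and not a copy of the original formulas: the sign of the exponent in the Fourier kernel is an assumed convention, and the range of the degree index m follows from the index formula n(n + 1) + 1 + m and the total count O = (N + 1)^2 given in this section.

```latex
P(\omega, x) = \mathcal{F}_t\bigl(p(t, x)\bigr)
             = \int_{-\infty}^{\infty} p(t, x)\, e^{-i \omega t}\, \mathrm{d}t
\tag{39}

P(\omega = k c_s, r, \theta, \phi)
  = \sum_{n=0}^{N} \sum_{m=-n}^{n} A_n^m(k)\, j_n(k r)\, S_n^m(\theta, \phi)
\tag{40}
```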

If the sound field is represented by a superposition of an infinite number of harmonic plane waves of different angular frequencies ω arriving from all possible directions specified by the angle tuple (θ, φ), it can be shown (cf. B. Rafaely, "Plane-wave Decomposition of the Sound Field on a Sphere by Spherical Convolution", J. Acoust. Soc. Am., vol. 4(116), pp. 2149-2157, 2004) that the respective plane wave complex amplitude function C(ω, θ, φ) can be expressed by the spherical harmonics expansion (41).

Here the expansion coefficients C_n^m(k) are related to the expansion coefficients A_n^m(k) by equation (42).

Assuming the individual coefficients C_n^m(k = ω/c_s) to be functions of the angular frequency ω, application of the inverse Fourier transform (denoted by F^-1(·)) provides, for each order n and degree m, the time domain functions (43).

These functions can be collected in a single vector c(t) according to equation (44).

The position index of a time domain function c_n^m(t) within the vector c(t) is given by n(n + 1) + 1 + m. The total number of elements in the vector c(t) is given by O = (N + 1)^2.
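The index rule can be checked with a few lines of code. The sketch below is illustrative only; the function names are hypothetical, and the inverse mapping is derived here from the stated formula rather than taken from the source.

```python
import math


def num_coefficients(order):
    """Total number of elements O = (N + 1)^2 for HOA order N."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def position_index(n, m):
    """1-based position of c_n^m(t) in c(t): n(n + 1) + 1 + m."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return n * (n + 1) + 1 + m


def order_and_degree(index):
    """Inverse of position_index for a 1-based index."""
    if index < 1:
        raise ValueError("index is 1-based")
    # Orders 0..n-1 occupy the first n^2 positions, so n = floor(sqrt(index - 1)).
    n = math.isqrt(index - 1)
    m = index - 1 - n * (n + 1)
    return n, m
```

For example, an order N = 4 representation has 25 coefficient signals, and the pair (n, m) = (2, -2) is stored at position 5, directly after the four signals of orders 0 and 1.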

The final Ambisonics format provides the sampled version of c(t), using a sampling frequency f_S, as in equation (45).

Here T_S = 1/f_S denotes the sampling period. The elements of c(lT_S) are referred to as Ambisonics coefficients. The time domain signals c_n^m(t), and hence the Ambisonics coefficients, are real-valued.

● Definition of real-valued spherical harmonics
The real-valued spherical harmonics S_n^m(θ, φ) are given by equations (46) and (47).

The associated Legendre functions P_{n,m}(x) are defined by equation (48) in terms of the Legendre polynomial P_n(x) and, unlike in the above-mentioned text by E. G. Williams, without the Condon-Shortley phase term (−1)^m.
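The effect of omitting the Condon-Shortley phase can be made concrete with a short numerical sketch. It assumes the usual derivative-based definition P_{n,m}(x) = (1 − x^2)^(m/2) d^m/dx^m P_n(x) for 0 ≤ m ≤ n, which matches the description above but is not copied from equation (48); the function name `assoc_legendre` is hypothetical, and the three-term recurrence used is the standard one with the factor (−1)^m left out.

```python
import math


def assoc_legendre(n, m, x):
    """Associated Legendre function P_{n,m}(x) without the Condon-Shortley phase.

    Assumed definition: (1 - x^2)^(m/2) * d^m/dx^m P_n(x), 0 <= m <= n, |x| <= 1.
    """
    if not 0 <= m <= n:
        raise ValueError("require 0 <= m <= n")
    if abs(x) > 1.0:
        raise ValueError("require |x| <= 1")
    root = math.sqrt(max(0.0, 1.0 - x * x))
    # Start value P_{m,m}(x) = (2m - 1)!! * (1 - x^2)^(m/2), with no sign factor.
    p_mm = 1.0
    for k in range(1, m + 1):
        p_mm *= (2 * k - 1) * root
    if n == m:
        return p_mm
    # Second start value P_{m+1,m}(x) = x * (2m + 1) * P_{m,m}(x).
    p_prev2, p_prev1 = p_mm, x * (2 * m + 1) * p_mm
    # Upward recurrence in the order index l = m + 2, ..., n.
    for l in range(m + 2, n + 1):
        p_cur = ((2 * l - 1) * x * p_prev1 - (l + m - 1) * p_prev2) / (l - m)
        p_prev2, p_prev1 = p_prev1, p_cur
    return p_prev1
```

Under this convention P_{1,1}(x) = sqrt(1 − x^2) is non-negative, whereas conventions that include the Condon-Shortley phase yield the negative of this value. Mixing the two conventions flips the sign of every odd-degree spherical harmonic, so the choice matters when exchanging HOA data between implementations.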

● Spatial resolution of higher-order Ambisonics
A general plane wave function x(t) arriving from a direction Ω_0 = (θ_0, φ_0)^T is represented in HOA by equation (49).

The corresponding spatial density of plane wave amplitudes c(t, Ω) is given by equations (50) and (51).

It can be seen from equation (51) that this density is the product of the general plane wave function x(t) and a spatial dispersion function ν_N(Θ), which can be shown to depend only on the angle Θ between Ω and Ω_0 and to have the property given by equation (52).

As expected, in the limit of an infinite order, i.e. N → ∞, the spatial dispersion function turns into a Dirac delta δ(·).

However, in the case of a finite order N, the contribution of the general plane wave from direction Ω_0 is smeared to neighbouring directions, and the extent of this blurring decreases with increasing order. A plot of the normalised function ν_N(Θ) for different values of N is shown in FIG. 6.
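The narrowing of the dispersion function with increasing order can be reproduced numerically. The sketch below assumes orthonormal real spherical harmonics, for which the addition theorem gives ν_N(Θ) = Σ_{n=0..N} (2n + 1)/(4π) · P_n(cos Θ). This is a standard identity used here for illustration, not a quotation of equation (52), and the function names are hypothetical.

```python
import math


def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for l in range(2, n + 1):
        p_prev, p = p, ((2 * l - 1) * x * p - (l - 1) * p_prev) / l
    return p


def dispersion(order, angle):
    """Spatial dispersion nu_N(angle), assuming orthonormal spherical harmonics."""
    x = math.cos(angle)
    return sum((2 * n + 1) / (4.0 * math.pi) * legendre(n, x)
               for n in range(order + 1))


def normalised_dispersion(order, angle):
    """nu_N(angle) divided by its peak value nu_N(0) = (N + 1)^2 / (4 pi)."""
    return dispersion(order, angle) / dispersion(order, 0.0)
```

At an angular offset of 0.5 rad the normalised function is still above 0.9 for N = 1 but falls to roughly 0.4 for N = 4, which is the sharpening of the main lobe described above.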

For any direction Ω, the time domain behaviour of the spatial density of plane wave amplitudes is a multiple of its behaviour at any other direction. In particular, the functions c(t, Ω_1) and c(t, Ω_2) for some fixed directions Ω_1 and Ω_2 are highly correlated with each other with respect to time t.

● Spherical harmonic transform
If the spatial density of plane wave amplitudes is discretised at a number of O spatial directions Ω_o, 1 ≤ o ≤ O, which are nearly uniformly distributed on the unit sphere, O directional signals c(t, Ω_o) are obtained. Consider collecting these signals into a vector as in equation (54).

Using equation (50), it can be verified that this vector can be computed from the continuous Ambisonics representation c(t) defined in equation (44) by a simple matrix multiplication, as in equation (55).

Here (·)^H denotes joint transposition and conjugation, and Ψ denotes the mode matrix defined by equation (56).

Because the directions Ω_o are nearly uniformly distributed on the unit sphere, the mode matrix is invertible in general. Hence, the continuous Ambisonics representation can be computed from the directional signals c(t, Ω_o) by equation (58).

Both equations constitute a transform and an inverse transform between the Ambisonics representation and the 'spatial domain'. These transforms are referred to as the spherical harmonic transform and the inverse spherical harmonic transform, respectively. Because the directions Ω_o are nearly uniformly distributed on the unit sphere, there is an approximation which justifies the use of Ψ^-1 instead of Ψ^H in equation (55).

All the above relations are valid for the discrete-time domain as well.
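The transform pair can be tried out on the smallest non-trivial case. The following sketch makes several assumptions that are not in the source: order N = 1, orthonormal real spherical harmonics in one common sign convention, and O = 4 directions at the vertices of a regular tetrahedron. For that particular grid Ψ^H Ψ = (O/4π) I holds exactly, so the inverse transform reduces to a scaled multiplication with Ψ; for general nearly uniform grids the relation holds only approximately and the scale factor depends on the normalisation. All function names are hypothetical.

```python
import math


def sh_order1(theta, phi):
    """Orthonormal real spherical harmonics up to order 1 (assumed convention)."""
    a = math.sqrt(1.0 / (4.0 * math.pi))
    b = math.sqrt(3.0 / (4.0 * math.pi))
    return [a,
            b * math.sin(theta) * math.sin(phi),
            b * math.cos(theta),
            b * math.sin(theta) * math.cos(phi)]


# Vertices of a regular tetrahedron as (inclination, azimuth) pairs.
_T = math.acos(-1.0 / 3.0)
DIRECTIONS = [(0.0, 0.0),
              (_T, 0.0),
              (_T, 2.0 * math.pi / 3.0),
              (_T, 4.0 * math.pi / 3.0)]


def mode_matrix(directions):
    """Psi[j][o] is the j-th spherical harmonic evaluated at direction o."""
    columns = [sh_order1(theta, phi) for theta, phi in directions]
    return [[col[j] for col in columns] for j in range(4)]


def to_spatial(psi, c):
    """Spherical harmonic transform c_spat = Psi^H c (Psi is real here)."""
    num_dirs = len(psi[0])
    return [sum(psi[j][o] * c[j] for j in range(len(c)))
            for o in range(num_dirs)]


def from_spatial(psi, c_spat):
    """Inverse transform using Psi^{-H} = (4 pi / O) Psi, exact for this grid."""
    num_dirs = len(c_spat)
    scale = 4.0 * math.pi / num_dirs
    return [scale * sum(psi[j][o] * c_spat[o] for o in range(num_dirs))
            for j in range(len(psi))]
```

Applying `to_spatial` followed by `from_spatial` to one sample of the coefficient vector returns the original coefficients, which is the round trip described by equations (55) and (58).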

The inventive processing can be carried out by a single processor or electronic circuit, or by several processors or electronic circuits operating in parallel and/or operating on different parts of the inventive processing.
Several aspects are described below.
[Aspect 1]
A method for determining directions of uncorrelated sound sources in a higher-order Ambisonics (HOA) representation of a sound field, the method comprising the step of:
successively searching, in a current time frame of HOA coefficients, for preliminary direction estimates of dominant sound sources, and computing the HOA sound field components created by the corresponding dominant sound sources,
wherein, in each iteration of the search, each further direction estimate is computed from a residual HOA representation, which represents the original HOA representation from which all components correlated with the signals of the previously found sound sources have been removed,
and wherein the current direction estimate is selected from a number of predefined test directions such that the power of the related general plane wave of the residual HOA representation, impinging on the listener position from the selected direction, is maximum compared to the power for all other test directions.
[Aspect 2]
The method according to aspect 1, wherein the selected direction estimates for the current time frame of HOA coefficients are assigned to dominant sound sources found in the previous time frame of HOA coefficients, and the final direction estimates are smoothed with respect to the resulting time trajectory.
[Aspect 3]
The method according to aspect 2, wherein the smoothing is carried out by performing a Bayesian estimation process, which exploits the directional power distribution of the dominant sound source components of the original HOA representation and a statistical a-priori sound source movement model.
[Aspect 4]
The method according to aspect 3, wherein the statistical a-priori sound source movement model statistically predicts the movement of the individual sound sources from the knowledge of their directions in the previous time frame and of their movement between the penultimate time frame and the previous time frame.
[Aspect 5]
The method according to aspect 3 or 4, wherein the assignment of the direction estimates to dominant sound sources found in the previous time frame of HOA coefficients is achieved by a joint minimisation of the angles between pairs of a direction estimate and the direction of a previously found sound source, and maximisation of the absolute value of the correlation coefficient between pairs of the directional signals related to a direction estimate and to a dominant sound source found in the previous time frame of HOA coefficients.
[Aspect 6]
A method for determining directions of uncorrelated sound sources in a higher-order Ambisonics (HOA) representation of a sound field, the method comprising the steps of:
successively searching, in a current time frame of HOA coefficients, for preliminary direction estimates of dominant sound sources, computing the HOA sound field components created by the corresponding dominant sound sources, and computing the corresponding directional signals;
assigning the computed dominant sound sources to corresponding sound sources active in the previous time frame of HOA coefficients, by comparing the preliminary direction estimates of the current time frame with the smoothed directions of the sound sources active in the previous time frame, and by correlating the directional signals of the current time frame with the directional signals of the sound sources active in the previous time frame, so as to obtain an assignment function;
computing smoothed dominant source directions using the assignment function, the set of smoothed directions in the previous time frame, the set of indices of active dominant sound sources in the previous time frame, the set of respective source movement angles between the penultimate time frame and the previous time frame, and the HOA sound field components created by the corresponding dominant sound sources;
determining the indices and directions of the active dominant sound sources of the current time frame, using the smoothed dominant source directions, the frame-delayed version of the directions of the active dominant sound sources of the previous time frame, and the frame-delayed version of the indices of the active dominant sound sources of the previous time frame,
wherein the directional signals of the sound sources active in the previous time frame are computed by mode matching from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and the HOA coefficients of the previous time frame,
and wherein the set of source movement angles between the penultimate time frame and the previous time frame is computed from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and a further frame-delayed version thereof.
[Aspect 7]
An apparatus for determining directions of uncorrelated sound sources in a higher-order Ambisonics (HOA) representation of a sound field, the apparatus comprising:
means configured to successively search, in a current time frame of HOA coefficients, for preliminary direction estimates of dominant sound sources, to compute the HOA sound field components created by the corresponding dominant sound sources, and to compute the corresponding directional signals;
means configured to assign the computed dominant sound sources to corresponding sound sources active in the previous time frame of HOA coefficients, by comparing the preliminary direction estimates of the current time frame with the smoothed directions of the sound sources active in the previous time frame, and by correlating the directional signals of the current time frame with the directional signals of the sound sources active in the previous time frame, so as to obtain an assignment function;
means configured to compute smoothed dominant source directions using the assignment function, the set of smoothed directions in the previous time frame, the set of indices of active dominant sound sources in the previous time frame, the set of respective source movement angles between the penultimate time frame and the previous time frame, and the HOA sound field components created by the corresponding dominant sound sources;
means configured to determine the indices and directions of the active dominant sound sources of the current time frame, using the smoothed dominant source directions, the frame-delayed version of the directions of the active dominant sound sources of the previous time frame, and the frame-delayed version of the indices of the active dominant sound sources of the previous time frame,
wherein the directional signals of the sound sources active in the previous time frame are computed by mode matching from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and the HOA coefficients of the previous time frame,
and wherein the set of source movement angles between the penultimate time frame and the previous time frame is computed from the frame-delayed version of the directions of the active dominant sound sources of the previous time frame and a further frame-delayed version thereof.
[Aspect 8]
The method according to aspect 6, or the apparatus according to aspect 7, wherein, in determining the number of detected dominant directional signals and the corresponding preliminary direction estimates, the HOA sound field component created by the corresponding dominant sound source is subtracted from the current time frame of HOA coefficients so as to obtain a corresponding residual HOA representation, and wherein this subtraction is carried out repeatedly for further such sound field components, each time on the basis of the remaining residual HOA representation, such that the found sound field components are excluded from further direction searches.
[Aspect 9]
The method according to aspect 8, or the apparatus according to aspect 8, wherein, for a single direction index, the directional power distribution of the remaining residual HOA representation is computed for a predefined number of discrete test directions that are nearly uniformly distributed on the unit sphere, and wherein the directional power distribution is analysed for the presence of a dominant sound source, the direction search being stopped if the absence of a dominant sound source is detected, and a preliminary estimate of its direction with respect to the coordinate origin being computed if a dominant sound source is detected.
[Aspect 10]
The method according to aspect 8 or 9, or the apparatus according to aspect 8 or 9, wherein, after the preliminary estimate for a dominant sound source has been determined, the HOA representation of the sound field component assumed to be created by that sound source, and the respective directional signal, are computed by:
rotating a fixed predefined spherical grid consisting of sampling positions intended to be uniformly distributed on the unit sphere, so as to provide a grid of rotated sampling positions, the rotation being performed such that the first rotated sampling position corresponds to the preliminary direction estimate;
transforming the remaining residual HOA representation to the spatial domain, in which it is equivalently represented by the corresponding plane wave functions assumed to impinge on the coordinate origin from the rotated grid directions, thereby computing a dominant sound source signal and grid directional signals;
performing a prediction of the grid directional signals from the dominant sound source signal;
computing, by an inverse spherical harmonic transform, the HOA representation of the predicted grid directional signals, which represents the contribution of the dominant sound source to the sound field represented by the remaining residual HOA representation.
[Aspect 11]
The method according to any one of aspects 6 and 8 to 10, or the apparatus according to any one of aspects 7 to 10, wherein the smoothed dominant source directions are computed by:
computing an a-priori probability function of direction for the dominant sound source directions, using the assignment function, the set of smoothed directions in the previous time frame, the set of indices of active dominant sound sources in the previous time frame, and the set of source movement angles;
computing a likelihood function of direction for the dominant sound source directions, using the assignment function and the HOA sound field components created by the dominant sound sources;
computing an a-posteriori probability function of direction for the dominant sound source directions, using the likelihood function of direction and the a-priori probability function of direction;
determining the smoothed dominant sound source directions using the a-posteriori probability function of direction for the dominant sound source directions.
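The successive search of aspects 1, 8 and 9 can be sketched in a strongly simplified form. The code below is not the patented procedure: it assumes an order-1 HOA frame with orthonormal real spherical harmonics, estimates the plane-wave signal for a test direction by a plain weighted sum of the HOA channels, and removes a found source by subtracting from every channel its least-squares projection onto that signal (a stand-in for the rotated-grid prediction of aspect 10). The stopping rule, the threshold `rel_threshold` and all function names are hypothetical.

```python
import math


def sh_order1(theta, phi):
    """Orthonormal real spherical harmonics up to order 1 (assumed convention)."""
    a = math.sqrt(1.0 / (4.0 * math.pi))
    b = math.sqrt(3.0 / (4.0 * math.pi))
    return [a,
            b * math.sin(theta) * math.sin(phi),
            b * math.cos(theta),
            b * math.sin(theta) * math.cos(phi)]


def encode(sources):
    """Build a 4-channel order-1 HOA frame from (direction, samples) pairs."""
    length = len(sources[0][1])
    frame = [[0.0] * length for _ in range(4)]
    for direction, samples in sources:
        weights = sh_order1(*direction)
        for j in range(4):
            for t in range(length):
                frame[j][t] += weights[j] * samples[t]
    return frame


def directional_signal(frame, direction):
    """Signal of the general plane wave assumed to arrive from `direction`."""
    weights = sh_order1(*direction)
    return [sum(weights[j] * frame[j][t] for j in range(4))
            for t in range(len(frame[0]))]


def power(signal):
    return sum(v * v for v in signal) / len(signal)


def remove_correlated(frame, signal):
    """Subtract from each channel its projection onto `signal`."""
    energy = sum(v * v for v in signal)
    residual = []
    for channel in frame:
        gain = sum(c * s for c, s in zip(channel, signal)) / energy
        residual.append([c - gain * s for c, s in zip(channel, signal)])
    return residual


def search_directions(frame, test_dirs, max_sources=4, rel_threshold=0.05):
    """Pick the strongest test direction, remove its contribution, repeat."""
    found, residual, reference = [], frame, None
    for _ in range(max_sources):
        powers = [power(directional_signal(residual, d)) for d in test_dirs]
        best = max(range(len(test_dirs)), key=powers.__getitem__)
        if reference is None:
            reference = powers[best]
        # Stop when no dominant source remains in the residual.
        if powers[best] <= rel_threshold * reference:
            break
        found.append(test_dirs[best])
        residual = remove_correlated(
            residual, directional_signal(residual, test_dirs[best]))
    return found
```

For a frame that contains two uncorrelated sources, one in front and a weaker one to the left, and a test grid made of the six axis directions, the search returns the front direction first and the left direction second, then stops because the residual is empty. Working on the residual rather than on the original frame is what keeps the smeared main lobe of the first source from being reported again as a second source.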

Claims

A method for determining the direction of uncorrelated sound sources in a higher-order Ambisonics (HOA) representation of a sound field, the method comprising:
searching for a preliminary direction estimate of a dominant sound source in a current time frame of HOA coefficients; and
determining an HOA sound field component based on the corresponding dominant sound source,
wherein the current direction estimate is determined based on a residual HOA representation, which represents the original HOA representation from which all components correlated with the signals of previously found sound sources have been removed,
wherein the current direction estimate is selected from a plurality of predefined test directions based on the power of the related general plane wave of the residual HOA representation impinging on the listener position from one direction, compared with the respective power for all other test directions, and
wherein the current direction estimate for the current time frame of HOA coefficients is assigned to at least one dominant sound source of the previous time frame of HOA coefficients and is smoothed with respect to the time trajectory.

The method of claim 1, wherein the smoothing is based on a Bayesian estimation process, which exploits a directional power distribution of the dominant sound source components of the original HOA representation and a statistical a-priori sound source movement model.

The method of claim 2, wherein the statistical a-priori sound source movement model statistically predicts the movement of individual sound sources based on their directions in the previous time frame and on their movement between the penultimate time frame and the previous time frame.

The method of claim 2, wherein the direction estimate is assigned to a dominant sound source of the previous time frame of HOA coefficients based on a joint minimisation of the angles between pairs of a direction estimate and the direction of a previously found sound source, and on a maximisation of the absolute value of the correlation coefficient between pairs of the directional signals related to a direction estimate and to a dominant sound source found in the previous time frame of HOA coefficients.

A method for determining the direction of uncorrelated sound sources in a higher-order Ambisonics (HOA) representation of a sound field, the method comprising:
searching for a preliminary direction estimate of a dominant sound source in a current time frame of HOA coefficients;
determining an HOA sound field component based on the corresponding dominant sound source, and determining a corresponding directional signal;
assigning the dominant sound source to a corresponding sound source active in the previous time frame of HOA coefficients, based on a comparison of the preliminary direction estimate of the current time frame with the smoothed direction of the sound source active in the previous time frame, the assignment further comprising obtaining an assignment function based on a correlation between the directional signal of the current time frame and the directional signal of the sound source active in the previous time frame;
determining a smoothed dominant source direction based on the assignment function, the smoothed dominant source direction in the previous time frame, the index of the active dominant sound source in the previous time frame, the respective source movement angle between the penultimate time frame and the previous time frame, and the HOA sound field component based on the corresponding dominant sound source; and
determining the index and direction of the active dominant sound source of the current time frame based on the smoothed dominant source direction, a frame-delayed version of the direction of the active dominant sound source of the previous time frame, and a frame-delayed version of the index of the active dominant sound source in the previous time frame,
wherein the directional signal of the sound source active in the previous time frame is based on mode matching using the frame-delayed version of the direction of the active dominant sound source of the previous time frame and the HOA coefficients of the previous time frame, and
wherein the source movement angle between the penultimate time frame and the previous time frame is determined based on the frame-delayed version of the direction of the active dominant sound source of the previous time frame and a further frame-delayed version thereof.

A device for determining the direction of uncorrelated sound sources in a higher-order Ambisonics (HOA) representation of a sound field, the device comprising:
a processor configured to search for a preliminary direction estimate of a dominant sound source in a current time frame of HOA coefficients, to determine an HOA sound field component based on the corresponding dominant sound source, and further to determine a corresponding directional signal,
wherein the processor is further configured to assign the dominant sound source to a corresponding sound source active in the previous time frame of HOA coefficients, based on a comparison of the preliminary direction estimate of the current time frame with the smoothed direction of the sound source active in the previous time frame, the assignment being further based on a correlation between the directional signal of the current time frame and the directional signal of the sound source active in the previous time frame,
wherein the processor is further configured to determine a smoothed dominant source direction based on the assignment function, the smoothed dominant source direction in the previous time frame, the index of the active dominant sound source in the previous time frame, the respective source movement angle between the penultimate time frame and the previous time frame, and the HOA sound field component based on the corresponding dominant sound source,
wherein the processor is further configured to determine the index and direction of the active dominant sound source of the current time frame based on the smoothed dominant source direction, a frame-delayed version of the direction of the active dominant sound source of the previous time frame, and a frame-delayed version of the index of the active dominant sound source in the previous time frame,
wherein the directional signal of the sound source active in the previous time frame is based on mode matching using the frame-delayed version of the direction of the active dominant sound source of the previous time frame and the HOA coefficients of the previous time frame, and
wherein the source movement angle between the penultimate time frame and the previous time frame is determined based on the frame-delayed version of the direction of the active dominant sound source of the previous time frame and a further frame-delayed version thereof.

The method of claim 5, wherein the determination of the detected dominant directional signal and of the corresponding preliminary direction estimate further includes determining the sound field component based on a subtraction of the corresponding dominant sound source from the current time frame of HOA coefficients so as to obtain a corresponding residual HOA representation, the subtraction being repeated, for each further sound field component, on the respective remaining residual HOA representation such that the sound field component is excluded from further direction searches.

The method of claim 7, wherein a representation is determined for a predetermined number of discrete test directions distributed substantially uniformly on the unit sphere, and wherein the directional power distribution is analysed for the presence of a dominant sound source, the direction search being stopped based on a determination that no dominant sound source is present, and a preliminary estimate of its direction relative to the coordinate origin being determined based on a determination that a dominant sound source has been detected.

The method of claim 8, wherein the HOA representation of the sound field component based on the same sound source, and the respective directional signal, are determined by:
rotating a fixed predefined spherical grid of sampling positions intended to be uniformly distributed on the unit sphere so as to determine a grid of rotated sampling positions, the rotation being such that the first rotated sampling position corresponds to the preliminary direction estimate;
transforming the remaining residual HOA representation to the spatial domain so as to determine a dominant sound source signal and grid directional signals;
performing a prediction of the grid directional signals from the dominant sound source signal; and
determining, based on an inverse spherical harmonic transform, an HOA representation of the predicted grid directional signals, which represents the contribution of the dominant sound source to the sound field represented by the remaining residual HOA representation.

The method of claim 5, wherein the smoothed dominant source direction is determined by:
determining an a-priori probability function of direction for the dominant sound source direction, based on the assignment function, the smoothed dominant source direction in the previous time frame, the index of the active dominant sound source in the previous time frame, and the source movement angle;
determining a likelihood function of direction for the dominant sound source direction, based on the assignment function and the HOA sound field component created by the dominant sound source;
determining an a-posteriori probability function of direction for the dominant sound source direction, based on the likelihood function of direction and the a-priori probability function of direction; and
determining the smoothed dominant sound source direction based on the a-posteriori probability function of direction for the dominant sound source direction.
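The prior, likelihood and posterior steps of the direction smoothing can be illustrated on a discrete grid of test directions. This is a sketch under assumptions that are not taken from the claims: the a-priori function is modelled as exp(κ · cos Θ) around the previous smoothed direction (a von Mises-Fisher-like shape), where the hypothetical parameter `concentration` (κ) would be chosen larger for sources that moved little between the two preceding frames, and the likelihood is simply taken to be proportional to the directional power of the source's sound field component.

```python
import math


def cos_angle(a, b):
    """Cosine of the angle between two directions given as (theta, phi)."""
    (theta_a, phi_a), (theta_b, phi_b) = a, b
    return (math.cos(theta_a) * math.cos(theta_b)
            + math.sin(theta_a) * math.sin(theta_b) * math.cos(phi_a - phi_b))


def smooth_direction(test_dirs, powers, prev_dir, concentration):
    """Return the maximum a-posteriori test direction and the posterior.

    test_dirs:     candidate directions as (theta, phi) pairs
    powers:        directional power per candidate (assumed likelihood shape)
    prev_dir:      smoothed direction of the same source in the previous frame
    concentration: hypothetical prior sharpness; 0 disables the prior
    """
    prior = [math.exp(concentration * cos_angle(d, prev_dir))
             for d in test_dirs]
    unnormalised = [p * l for p, l in zip(prior, powers)]
    total = sum(unnormalised)
    if total <= 0.0:
        raise ValueError("posterior is undefined for zero total power")
    posterior = [u / total for u in unnormalised]
    best = max(range(len(test_dirs)), key=posterior.__getitem__)
    return test_dirs[best], posterior
```

With a flat prior the estimate jumps to whichever direction currently has the highest power; with a concentrated prior around the previous direction, a slightly stronger but distant peak is rejected and the trajectory stays smooth.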