RU2776307C2

RU2776307C2 - Method and device for compression and decompression of a Higher-Order Ambisonics representation

Info

Publication number: RU2776307C2
Application number: RU2018133016A
Authority: RU
Inventors: Alexander Krüger; Sven Kordon
Original assignee: Dolby International AB
Priority date: 2013-04-29
Filing date: 2014-04-24
Publication date: 2022-07-18

Abstract

FIELD: information technology.

SUBSTANCE: the invention relates to means for compressing and decompressing a Higher-Order Ambisonics (HOA) representation. For a current frame, a set of dominant directions and a corresponding data set of indices of detected directional signals are estimated. A second number of directional signals, with corresponding directions contained in said set of dominant direction estimates and with a corresponding delayed data set of indices of said directional signals, is extracted from the HOA coefficient sequences of the current frame. An ambient HOA component is represented by a reduced number of HOA coefficient sequences and a corresponding data set of indices of said reduced number of ambient HOA coefficient sequences, the reduced number being equal to the difference between a first number and the second number. The directional signals and the HOA coefficient sequences of the ambient HOA component are assigned to a frame of channels whose number equals the first number. The assigned channels of the frame are perceptually encoded so as to provide an encoded compressed frame.

EFFECT: increased compression efficiency.

18 claims, 5 drawings

Description

Technical field

The invention relates to a method and to an apparatus for compressing and decompressing a Higher-Order Ambisonics representation by processing directional and ambient signal components differently.

Background

Higher-Order Ambisonics (HOA) is one way of representing three-dimensional sound, alongside other techniques such as wave field synthesis (WFS) and channel-based approaches such as 22.2. In contrast to channel-based methods, an HOA representation has the advantage of being independent of any specific loudspeaker set-up. This flexibility comes at the cost of a decoding process that is required to play back the HOA representation on a particular loudspeaker set-up. Compared with the WFS approach, where the number of required loudspeakers is usually very large, HOA can also be rendered to set-ups consisting of only a few loudspeakers. A further advantage of HOA is that the same representation can be used without modification for binaural rendering to headphones.

HOA is based on representing the spatial density of complex harmonic plane wave amplitudes by a truncated spherical harmonics (SH) expansion. Each expansion coefficient is a function of the angular frequency, which can be equivalently represented by a time-domain function. Hence, without loss of generality, the complete HOA sound field representation can be assumed to consist of O time-domain functions, where O denotes the number of expansion coefficients. These time-domain functions are referred to interchangeably as "HOA coefficient sequences" or "HOA channels".

The spatial resolution of the HOA representation improves with a growing maximum order N of the expansion. Unfortunately, the number O of expansion coefficients grows quadratically with the order N, specifically $O = (N+1)^2$. For example, typical HOA representations using order N=4 require O=25 HOA (expansion) coefficients. Following the above considerations, the total bit rate for transmitting an HOA representation, given a desired single-channel sampling rate $f_S$ and a number $N_b$ of bits per sample, is $O \cdot f_S \cdot N_b$. Consequently, transmitting an HOA representation of order N=4 with a sampling rate of $f_S$ = 48 kHz using $N_b$ = 16 bits per sample results in a bit rate of 19.2 Mbit/s, which is very high for many practical applications such as streaming.
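As an illustration only, the coefficient count and the uncompressed bit rate stated above can be checked with a short Python sketch. The function and parameter names are hypothetical and chosen for this example.

```python
def hoa_coefficient_count(order: int) -> int:
    """Number O of HOA coefficient sequences for expansion order N: (N + 1)^2."""
    return (order + 1) ** 2


def raw_bit_rate(order: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed bit rate in bit/s: O * f_S * N_b."""
    return hoa_coefficient_count(order) * sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    rate = raw_bit_rate(order=4, sample_rate_hz=48_000, bits_per_sample=16)
    # Order 4 gives 25 coefficient sequences and 19.2 Mbit/s.
    print(hoa_coefficient_count(4), rate / 1e6)
```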

Compression of HOA sound field representations is proposed in patent applications EP 12306569.0 and EP 12305537.8. Instead of perceptually coding each HOA coefficient sequence individually, as is done for example in E. Hellerud, I. Burnett, A. Solvang and U.P. Svensson, "Encoding Higher Order Ambisonics with AAC", 124th AES Convention, Amsterdam, 2008, these applications attempt to reduce the number of signals to be perceptually coded, in particular by performing a sound field analysis and decomposing the given HOA representation into a directional component and a residual ambient component. The directional component is generally assumed to be represented by a small number of dominant directional signals, which can be regarded as general plane wave functions. The order of the residual ambient HOA component is reduced because it is assumed that, after extraction of the dominant directional signals, the lower-order HOA coefficients carry the most relevant information.

Summary of the invention

In summary, by such an operation the initial number $(N+1)^2$ of HOA coefficient sequences to be perceptually coded is reduced to a fixed number D of dominant directional signals and a number $(N_{\mathrm{RED}}+1)^2$ of HOA coefficient sequences representing the residual ambient HOA component with a truncated order $N_{\mathrm{RED}} < N$, so that the number of signals to be coded is fixed, i.e. $D + (N_{\mathrm{RED}}+1)^2$. In particular, this number is independent of the actually detected number $D_{\mathrm{ACT}}(k) \le D$ of active dominant directional sound sources in time frame k. This means that in time frames k in which the actually detected number $D_{\mathrm{ACT}}(k)$ of active dominant directional sound sources is smaller than the maximum allowed number D of directional signals, some or even all of the dominant directional signals to be perceptually coded are zero. Ultimately, this means that these channels are not used at all for capturing relevant sound field information. In this context, a further possible weak point of the processing according to EP 12306569.0 and EP 12305537.8 is the criterion for determining the number of active dominant directional signals in each time frame, because no attempt is made to determine the optimal number of active dominant directional signals with respect to the subsequent perceptual coding of the sound field. For example, in EP 12305537.8 the number of dominant sound sources is estimated using a simple power criterion, namely by determining the dimension of the subspace of the inter-coefficient correlation matrix that belongs to the largest eigenvalues. EP 12306569.0 proposes an incremental detection of dominant directional sound sources, where a directional sound source is considered dominant if the power of the plane wave function from the respective direction is high enough relative to the first directional signal. Using power-based criteria as in EP 12306569.0 and EP 12305537.8 may lead to a directional-ambient decomposition that is suboptimal with respect to the perceptual coding of the sound field.
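The drawback described above can be made concrete with a small Python sketch. It computes the fixed channel count of the earlier approach and, per frame, how many directional channels would carry only zeros. The function names and the example values are assumptions made for illustration; they do not come from the cited applications.

```python
from __future__ import annotations


def fixed_channel_count(max_directional: int, reduced_order: int) -> int:
    """Channels coded per frame in the earlier scheme: D + (N_RED + 1)^2."""
    return max_directional + (reduced_order + 1) ** 2


def idle_channels(max_directional: int, active_per_frame: list[int]) -> list[int]:
    """Directional channels that are all-zero in each frame: D - D_ACT(k)."""
    for active in active_per_frame:
        if not 0 <= active <= max_directional:
            raise ValueError("active count must lie in [0, D]")
    return [max_directional - active for active in active_per_frame]


if __name__ == "__main__":
    # Hypothetical values: D = 4 directional channels, ambient order N_RED = 1.
    print(fixed_channel_count(4, 1))
    print(idle_channels(4, [4, 2, 0]))
```

With these assumed values, a frame without any detected dominant source leaves all four directional channels unused, which is the inefficiency the invention addresses.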

The problem to be solved by the invention is to improve HOA compression by determining, for the current HOA audio signal content, how to assign directional signals and coefficients of the ambient HOA component to a predetermined reduced number of channels. This problem is solved by the methods disclosed in claims 1 and 3. Apparatuses that use these methods are disclosed in claims 2 and 4.

The invention improves the compression processing proposed in EP 12306569.0 in two respects. First, the bandwidth provided by the given number of channels to be coded is better exploited. In time frames in which no dominant sound source signals are detected, the channels originally reserved for the dominant directional signals are used to capture additional information about the ambient component, in the form of additional HOA coefficient sequences of the residual ambient HOA component. Second, given the goal of using a given number of channels to perceptually code a given HOA sound field representation, the criterion for determining the number of directional signals to be extracted from the HOA representation is adapted to that purpose. The number of directional signals is determined such that the decoded and reconstructed HOA representation exhibits the lowest perceptible error. The criterion compares the modeling errors that arise either from extracting a directional signal and using one HOA coefficient sequence fewer to describe the residual ambient HOA component, or from not extracting a directional signal and instead using an additional HOA coefficient sequence to describe the residual ambient HOA component. For both cases, the criterion additionally takes into account the spatial power distribution of the quantization noise introduced by the perceptual coding of the directional signals and of the HOA coefficient sequences of the residual ambient HOA component.

To implement the above-described processing, a total number I of signals (channels) is specified before the HOA compression starts; the original number O of HOA coefficient sequences is reduced to this number. The ambient HOA component is assumed to be represented by a minimum number $O_{\mathrm{RED}}$ of HOA coefficient sequences. In some cases, this minimum number can be zero. The remaining $D = I - O_{\mathrm{RED}}$ channels are assumed to contain either directional signals or additional coefficient sequences of the ambient HOA component, depending on which of the two the directional signal extraction processing determines to be perceptually more meaningful. The assignment of either directional signals or ambient HOA component coefficient sequences to the remaining D channels may change on a frame-by-frame basis. To reconstruct the sound field at the receiver side, information about the assignment is transmitted as additional side information.
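The channel budget described in the preceding paragraph can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `ChannelBudget` and `split_channels` are hypothetical, and clamping the number of directional signals to the number of flexible channels is an assumption of this example, not a rule taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelBudget:
    directional: int  # channels carrying directional signals in this frame
    ambient: int      # channels carrying ambient HOA coefficient sequences


def split_channels(total_channels: int, min_ambient: int,
                   detected_directional: int) -> ChannelBudget:
    """Split a fixed number of channels between directional and ambient use."""
    flexible = total_channels - min_ambient  # remaining channels (D in the text)
    if min_ambient < 0 or flexible < 0 or detected_directional < 0:
        raise ValueError("inconsistent channel counts")
    # Assumption: never use more directional channels than are flexible.
    directional = min(detected_directional, flexible)
    # Flexible channels not used for directional signals go to the ambient part.
    ambient = min_ambient + (flexible - directional)
    return ChannelBudget(directional=directional, ambient=ambient)


if __name__ == "__main__":
    # Hypothetical example: 8 channels in total, at least 4 ambient sequences.
    for detected in (0, 1, 4):
        print(detected, split_channels(8, 4, detected))
```

Whatever the split in a given frame, the two counts always add up to the fixed total, so the perceptual coder sees the same number of channels in every frame.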

In principle, the inventive compression method is suited for compressing, using a fixed number of perceptual encodings, a Higher-Order Ambisonics representation of a sound field, denoted HOA, with input time frames of HOA coefficient sequences, said method including the following steps, which are carried out on a frame-by-frame basis:

- for a current frame, estimating a set of dominant directions and a corresponding data set of indices of detected directional signals;

- decomposing the HOA coefficient sequences of said current frame into a non-fixed number of directional signals, with respective directions contained in said set of dominant direction estimates and with a respective data set of indices of said directional signals, wherein said non-fixed number is smaller than said fixed number, and into a residual ambient HOA component that is represented by a reduced number of HOA coefficient sequences and a corresponding data set of indices of said reduced number of residual ambient HOA coefficient sequences, said reduced number corresponding to the difference between said fixed number and said non-fixed number;

- assigning said directional signals and the HOA coefficient sequences of said residual ambient HOA component to channels whose number corresponds to said fixed number, wherein said data set of indices of said directional signals and said data set of indices of said reduced number of residual ambient HOA coefficient sequences are used for said assignment;

- perceptually encoding said channels of the related frame so as to provide an encoded compressed frame.
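The assignment step listed above can be sketched with NumPy. This is only an illustrative sketch: the function name `assign_channels`, the layout (directional signals first, ambient sequences after them) and the dictionary used for the side information are assumptions of this example and are not specified by the text.

```python
import numpy as np


def assign_channels(directional, directional_idx, ambient, ambient_idx,
                    total_channels):
    """Stack directional signals and ambient HOA sequences into one channel frame.

    directional: array of shape (n_dir, L), one row per directional signal
    ambient:     array of shape (n_amb, L), one row per ambient HOA sequence
    Returns the (total_channels, L) frame and the index sets as side information.
    """
    if directional.shape[0] != len(directional_idx):
        raise ValueError("one index is required per directional signal")
    if ambient.shape[0] != len(ambient_idx):
        raise ValueError("one index is required per ambient sequence")
    if directional.shape[0] + ambient.shape[0] != total_channels:
        raise ValueError("signals must fill exactly the fixed number of channels")
    # Assumed layout: directional signals occupy the first rows.
    channels = np.vstack([directional, ambient])
    side_info = {"directional_idx": list(directional_idx),
                 "ambient_idx": list(ambient_idx)}
    return channels, side_info


if __name__ == "__main__":
    frame_len = 4
    directional = np.ones((1, frame_len))   # one detected directional signal
    ambient = np.zeros((3, frame_len))      # three ambient HOA sequences
    channels, side_info = assign_channels(directional, [2], ambient, [0, 1, 3], 4)
    print(channels.shape, side_info)
```

The resulting frame always has the fixed number of rows, so it can be passed to a fixed bank of perceptual encoders, while the index sets travel as side information.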

In principle, the inventive compression apparatus is suited for compressing, using a fixed number of perceptual encodings, a Higher-Order Ambisonics representation of a sound field, denoted HOA, with input time frames of HOA coefficient sequences, said apparatus carrying out frame-by-frame processing and including:

- means adapted for estimating, for a current frame, a set of dominant directions and a corresponding data set of indices of detected directional signals;

- means adapted for decomposing the HOA coefficient sequences of said current frame into a non-fixed number of directional signals, with respective directions contained in said set of dominant direction estimates and with a respective data set of indices of said directional signals, wherein said non-fixed number is smaller than said fixed number, and into a residual ambient HOA component that is represented by a reduced number of HOA coefficient sequences and a corresponding data set of indices of said reduced number of residual ambient HOA coefficient sequences, said reduced number corresponding to the difference between said fixed number and said non-fixed number;

- means adapted for assigning said directional signals and the HOA coefficient sequences of said residual ambient HOA component to channels whose number corresponds to said fixed number, wherein said data set of indices of said directional signals and said data set of indices of said reduced number of residual ambient HOA coefficient sequences are used for said assignment;

- means adapted for perceptually encoding said channels of the related frame so as to provide an encoded compressed frame.

In principle, the inventive decompression method is suited for decompressing a Higher-Order Ambisonics representation compressed according to the above compression method, said decompression including the following steps:

- perceptually decoding a current encoded compressed frame so as to provide a perceptually decoded frame of channels;

- redistributing said perceptually decoded frame of channels, using said data set of indices of detected directional signals and said data set of indices of the chosen ambient HOA coefficient sequences, so as to recreate the corresponding frame of directional signals and the corresponding frame of the residual ambient HOA component;

- recomposing a current decompressed frame of the HOA representation from said frame of directional signals and from said frame of the residual ambient HOA component, using said data set of indices of detected directional signals and said set of dominant direction estimates,

- wherein directional signals with respect to uniformly distributed directions are predicted from said directional signals, and thereafter said current decompressed frame is recomposed from said frame of directional signals, said predicted signals and said residual ambient HOA component.
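The redistribution step of the decompression can be sketched with NumPy as the inverse of a simple channel assignment. The sketch assumes that directional signals occupy the first rows of the decoded frame and that ambient indices are zero-based positions within a frame of O HOA coefficient sequences; both conventions and the name `redistribute` are assumptions of this example. The prediction and recomposition steps are not covered.

```python
import numpy as np


def redistribute(channels, directional_idx, ambient_idx, num_coeffs):
    """Split a decoded channel frame into directional and ambient frames.

    channels:   array of shape (I, L) delivered by the perceptual decoders
    num_coeffs: number O of HOA coefficient sequences of the full representation
    Returns (directional, ambient) with shapes (n_dir, L) and (O, L).
    """
    n_dir = len(directional_idx)
    if channels.shape[0] != n_dir + len(ambient_idx):
        raise ValueError("side information does not match the channel count")
    # Assumed layout: directional signals occupy the first rows.
    directional = channels[:n_dir]
    # Put each transmitted ambient sequence back at its coefficient index;
    # coefficient sequences that were not transmitted stay zero.
    ambient = np.zeros((num_coeffs, channels.shape[1]), dtype=channels.dtype)
    ambient[np.asarray(ambient_idx, dtype=int)] = channels[n_dir:]
    return directional, ambient


if __name__ == "__main__":
    decoded = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
    directional, ambient = redistribute(decoded, [2], [0, 3], num_coeffs=4)
    print(directional)
    print(ambient)
```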

In principle, the inventive decompression apparatus is suited for decompressing a Higher-Order Ambisonics representation compressed according to the above compression method, said apparatus including:

- means adapted for perceptually decoding a current encoded compressed frame so as to provide a perceptually decoded frame of channels;

- means adapted for redistributing said perceptually decoded frame of channels, using said data set of indices of detected directional signals and said data set of indices of the chosen ambient HOA coefficient sequences, so as to recreate the corresponding frame of directional signals and the corresponding frame of the residual ambient HOA component;

- means adapted for recomposing a current decompressed frame of the HOA representation from said frame of directional signals, said frame of the residual ambient HOA component, said data set of indices of detected directional signals and said set of dominant direction estimates, wherein directional signals with respect to uniformly distributed directions are predicted from said directional signals, and thereafter said current decompressed frame is recomposed from said frame of directional signals, said predicted signals and said residual ambient HOA component.

Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.

Brief description of the drawings

Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show:

Fig. 1 a block diagram of the HOA compression;

Fig. 2 the estimation of the directions of dominant sound sources;

Fig. 3 a block diagram of the HOA decompression;

Fig. 4 a spherical coordinate system;

Фиг. 5 является нормализованной дисперсионной функцией

для различных порядков N амбиофонии и для углов

.Fig. 5 is the normalized dispersion function

for various orders N of ambiophony and for angles

.

Detailed description of embodiments

A. Improved HOA compression

The compression processing according to the invention, which is based on EP 12306569.0, is illustrated in Fig. 1, in which the signal processing blocks that have been modified or newly introduced compared to EP 12306569.0 are drawn with a bold box, and in which

(direction estimates as such) and

in this application correspond to

(matrix of direction estimates) and

in EP 12306569.0, respectively. HOA compression uses frame-wise processing with non-overlapping input frames C(k) of HOA coefficient sequences of length L, where k denotes the frame index. The frames are defined with respect to the HOA coefficient sequences specified in equation (45) as follows:

, (1)

where

denotes the sampling period. The first step or stage 11/12 in Fig. 1 is optional and consists of concatenating the non-overlapping k-th and (k-1)-th frames of HOA coefficient sequences into a long frame

as follows:

, (2)

wherein this long frame overlaps by 50% with the adjacent long frame, and this long frame is subsequently used for estimating the directions of dominant sound sources. Similar to the notation for

, the tilde symbol is used in the following description to indicate that the respective quantity refers to long overlapping frames. If step/stage 11/12 is not present, the tilde symbol has no specific meaning. In principle, step or stage 13 of estimating the dominant sound sources is carried out as proposed in EP 13305156.5, but with an important modification. The modification concerns determining the number of directions to be detected, i.e. how many directional signals are to be extracted from the HOA representation. Directional signals are extracted only if doing so is perceptually more relevant than using additional HOA coefficient sequences to better approximate the ambient HOA component. A detailed description of this technique is given in section A.2.
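The framing described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the helper names `split_into_frames` and `long_frame` are hypothetical, and plain integers stand in for the HOA coefficient vectors at each sample instant.

```python
def split_into_frames(samples, frame_len):
    """Split a sample sequence into non-overlapping frames of length frame_len."""
    n_frames = len(samples) // frame_len
    return [samples[k * frame_len:(k + 1) * frame_len] for k in range(n_frames)]


def long_frame(frames, k):
    """Concatenate frames k-1 and k into one long frame of twice the length."""
    if k < 1:
        raise ValueError("a long frame needs a previous frame")
    return frames[k - 1] + frames[k]


# Stand-in data: 12 sample instants, frame length L = 4 (assumed values).
samples = list(range(12))
frames = split_into_frames(samples, 4)
long_1 = long_frame(frames, 1)
long_2 = long_frame(frames, 2)
```

Consecutive long frames share one short frame, which is the 50% overlap mentioned in the text: the second half of `long_1` equals the first half of `long_2`.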

The estimation provides a data set

of indices of the directional signals that have been detected, as well as a set

of the corresponding direction estimates. D denotes the maximum number of directional signals, which must be set before HOA compression starts.

In step or stage 14, the current (long) frame

of HOA coefficient sequences is decomposed (as proposed in EP 13305156.5) into a number

of directional signals belonging to the directions contained in the set

, and a residual ambient HOA component

. A delay of two frames is introduced as a result of overlap-add processing in order to obtain smooth signals. It is assumed that

contains a total of D channels, of which, however, only those corresponding to active directional signals are non-zero. The indices specifying these channels are assumed to be output in the data set

. Additionally, the decomposition in step/stage 14 provides some parameters

, which are used at the decompression side for predicting portions of the original HOA representation from the directional signals (see EP 13305156.5 for more details). In step or stage 15, the number of coefficients of the ambient HOA component

is intelligently reduced so that it contains only

non-zero HOA coefficient sequences, where

denotes the number of elements of the data set

, i.e. the number of active directional signals in frame k-2. Since the ambient HOA component is assumed to be always represented by a minimum number

of HOA coefficient sequences, this problem can in effect be reduced to selecting the remaining

HOA coefficient sequences out of the possible

. In order to obtain a smooth reduced ambient HOA representation, this selection is made such that, compared to the selection made at the previous frame k-3, as few changes as possible occur.

In particular, the following three cases are to be distinguished:

a)

: In this case, the same HOA coefficient sequences as in frame k-3 are assumed to be selected.

b)

: In this case, more HOA coefficient sequences than in the previous frame k-3 can be used to represent the ambient HOA component in the current frame. The HOA coefficient sequences that were selected in frame k-3 are assumed to be selected in the current frame as well. The additional HOA coefficient sequences can be selected according to different criteria, for example by selecting those HOA coefficient sequences in

with the highest average power, or by selecting the HOA coefficient sequences with respect to their perceptual significance.

c)

: In this case, fewer HOA coefficient sequences than in the previous frame k-3 can be used to represent the ambient HOA component in the current frame. The question to be answered here is which of the previously selected HOA coefficient sequences have to be deactivated. A reasonable solution is to deactivate those sequences that were assigned to the channels

at the signal assignment step or stage 16 at frame k-3. To avoid discontinuities at frame borders when additional HOA coefficient sequences are activated or deactivated, it is advantageous to smoothly fade the respective signals in or out.
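The three cases can be illustrated with a small Python sketch. It is a hedged illustration only: the function `select_ambient` and its parameters are hypothetical, the case conditions (lost in this text) are assumed to compare the number of sequences available now with the number used in the previous frame, and case b) uses the highest-average-power criterion named above. For case c) the sketch substitutes a simple rule (drop the weakest previously selected sequences), whereas the source deactivates the sequences assigned to particular channels whose specification is missing here.

```python
def select_ambient(prev_selected, n_target, mean_power, mandatory):
    """Choose n_target ambient coefficient-sequence indices with few changes.

    prev_selected: indices selected in the previous frame
    mean_power:    dict mapping index -> average power in the current frame
    mandatory:     indices that are always kept (the minimum set)
    """
    selected = set(mandatory)
    if len(selected) > n_target:
        raise ValueError("n_target is smaller than the mandatory set")

    def by_power(i):
        # Strongest first; ties are broken by the lower index.
        return (-mean_power[i], i)

    # Cases a) and c): previously selected sequences are kept while room remains.
    # ASSUMPTION for case c): the weakest ones are the ones dropped.
    for i in sorted(set(prev_selected) - selected, key=by_power):
        if len(selected) < n_target:
            selected.add(i)
    # Case b): free slots are filled with the highest-power new sequences.
    for i in sorted(set(mean_power) - selected, key=by_power):
        if len(selected) < n_target:
            selected.add(i)
    return sorted(selected)


# Assumed example values: six candidate sequences, two of them mandatory.
power = {1: 5.0, 2: 1.0, 3: 4.0, 4: 0.5, 5: 3.0, 6: 2.0}
same = select_ambient([1, 2, 3], 3, power, [1, 2])
more = select_ambient([1, 2, 3], 4, power, [1, 2])
fewer = select_ambient([1, 2, 3, 5], 3, power, [1, 2])
```

Keeping earlier selections ahead of new candidates is what limits the number of changes between frames; the fade-in and fade-out of activated or deactivated sequences is not modelled here.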

The final ambient HOA representation with the reduced number

of non-zero coefficient sequences is denoted by

. The indices of the selected ambient HOA coefficient sequences are output in the data set

.

In step/stage 16, the active directional signals contained in

and the HOA coefficient sequences contained in

are assigned to the frame

of I channels for individual perceptual coding. To describe the signal assignment in more detail, the frames

and

are assumed to consist of the individual signals

and

as follows:

(3)

The active directional signals are assigned such that they keep their channel indices, in order to obtain continuous signals for the subsequent perceptual coding. This can be expressed as:

(4)

The HOA coefficient sequences of the ambient component are assigned such that the minimum number

of coefficient sequences is always contained in the last

signals

, i.e.:

(5)

For the additional

HOA coefficient sequences of the ambient component, it must be distinguished whether or not they were also selected in the previous frame:

a) If they were also selected for transmission in the previous frame, i.e. if the respective indices are also contained in the data set

, the assignment of these coefficient sequences to the signals in

is the same as for the previous frame. This operation ensures smooth signals

, which is favourable for the subsequent perceptual coding in step or stage 17.

b) Otherwise, if some coefficient sequences are newly selected, i.e. if their indices are contained in the data set

but not in the data set

, they are first arranged in ascending order of their indices and are assigned in this order to the channels

that are not yet occupied by directional signals.
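The assignment rules of step/stage 16 can be sketched as follows. This is an illustration under stated assumptions rather than the patented procedure: the function `assign_channels`, its parameter names and the tuple labels are hypothetical, channels are numbered from 1, the directional signals are assumed to own channels 1..D, and the minimum ambient set is assumed to consist of the sequences with indices 1..`n_amb_min`.

```python
def assign_channels(active_dir, extra_amb, prev_assign, n_dir_max, n_amb_min):
    """Return a dict channel -> ("dir", d) or ("amb", index) for one frame.

    active_dir:  indices d (1..n_dir_max) of the active directional signals
    extra_amb:   indices of the additional ambient coefficient sequences
    prev_assign: dict channel -> ambient index used in the previous frame
    """
    channels = {}
    # Active directional signals keep their own channel index.
    for d in active_dir:
        channels[d] = ("dir", d)
    # The minimum ambient set always occupies the last n_amb_min channels.
    for o in range(1, n_amb_min + 1):
        channels[n_dir_max + o] = ("amb", o)
    # a) Sequences already transmitted in the previous frame keep their channel.
    pending = set(extra_amb)
    for ch, idx in prev_assign.items():
        if idx in pending and ch not in channels:
            channels[ch] = ("amb", idx)
            pending.discard(idx)
    # b) Newly selected sequences fill the free channels in ascending index order.
    free = [ch for ch in range(1, n_dir_max + 1) if ch not in channels]
    if len(pending) > len(free):
        raise ValueError("more additional sequences than free channels")
    for ch, idx in zip(free, sorted(pending)):
        channels[ch] = ("amb", idx)
    return channels


# Assumed example: D = 4, minimum ambient set of 4, hence I = 8 channels.
result = assign_channels(active_dir=[1, 3], extra_amb=[5, 7],
                         prev_assign={2: 7}, n_dir_max=4, n_amb_min=4)
```

In the example, sequence 7 stays in channel 2 because it was there in the previous frame, and the newly selected sequence 5 takes the only remaining free channel, 4.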

This specific assignment offers the advantage that, during the HOA decompression process, the signal redistribution and composition can be performed without knowing which ambient HOA coefficient sequence is contained in which channel

. Instead, the assignment can be reconstructed during HOA decompression from the mere knowledge of the data sets

and

. Advantageously, this assignment operation also provides the assignment vector

, whose elements

,

denote the indices of each of the additional

HOA coefficient sequences of the ambient component. In other words, the elements of the assignment vector

indicate which of the additional

HOA coefficient sequences of the ambient HOA component are assigned to the

channels with inactive directional signals. This vector can additionally be transmitted, but less frequently than at the frame rate, in order to allow an initialization of the redistribution procedure performed for HOA decompression (see section B). Perceptual coding step/stage 17 encodes the I channels of the frame

and outputs the encoded frame

.

For frames for which the vector

is not transmitted from step/stage 16, the data parameter sets

and

are used at the decompression side, instead of the vector

, to perform the redistribution.

A.1. Estimation of the directions of dominant sound sources

The estimation step/stage 13 for the directions of dominant sound sources of Fig. 1 is shown in more detail in Fig. 2. It is essentially performed as in EP 13305156.5, but with one decisive difference, namely the way the number of dominant sound sources is determined, which corresponds to the number of directional signals to be extracted from the given HOA representation. This number is significant because it controls whether the given HOA representation is better represented by using more directional signals or, instead, by using more HOA coefficient sequences to better model the ambient HOA component.

The estimation of the directions of dominant sound sources starts in step or stage 21 with a preliminary search for the directions of dominant sound sources, using the long frame

of input HOA coefficient sequences. Together with the preliminary direction estimates

, the corresponding directional signals

and the HOA sound field components

, which are assumed to be created by the individual sound sources, are computed as described in EP 13305156.5. In step or stage 22, these quantities are used together with the frame

of input HOA coefficient sequences to determine the number

of directional signals to be extracted. Consequently, the direction estimates

, the corresponding directional signals

and the HOA sound field components

are discarded. Instead, only the direction estimates

,

are then assigned to previously found sound sources.

In step or stage 23, the resulting direction trajectories are smoothed according to a sound source movement model, and it is determined which of the sound sources are assumed to be active (see EP 13305156.5). The last operation provides the set

of indices of the active directional sound sources and the set

of the corresponding direction estimates.

A.2. Determination of the number of extracted directional signals

To determine the number of directional signals in step/stage 22, a situation is assumed in which a given total number I of channels is available for capturing the perceptually most relevant sound field information. The number of directional signals to be extracted is therefore determined by asking whether, for the overall HOA compression/decompression quality, the current HOA representation is better represented by using more directional signals or by using more HOA coefficient sequences for a better modelling of the ambient HOA component. To derive in step/stage 22 a criterion for the number of directional sound sources to be extracted that is related to human perception, it is taken into account that HOA compression is achieved in particular by the following two operations:

- reduction of the HOA coefficient sequences for representing the ambient HOA component (which means a reduction of the number of related channels);

- perceptual encoding of the directional signals and of the HOA coefficient sequences for representing the ambient HOA component.

Depending on the number M,

, of extracted directional signals, the first operation results in the approximation:

(6)

, (7)

where

(8)

denotes the HOA representation of the directional component, consisting of the HOA sound field components

,

, which are assumed to be created by the M individually considered sound sources, and

denotes the HOA representation of the ambient component with only

non-zero HOA coefficient sequences. The approximation from the second operation can be expressed as:

(9)

, (10)

where

and

denote the composed directional and ambient HOA components after perceptual decoding, respectively.

Formulation of the criterion

The number

of directional signals to be extracted is chosen such that the total approximation error:

, (11)

with

, is as insignificant as possible with respect to human perception. To ensure this, the directional power distribution of the total error for individual Bark-scale critical bands is considered at a predefined number Q of test directions

, which are nearly uniformly distributed on the unit sphere. More specifically, the directional power distribution for the b-th critical band, b = 1, ..., B, is represented by the vector:

(12)

whose components

denote the power of the total error

related to the direction

, the b-th Bark-scale critical band and the k-th frame. The directional power distribution

of the total error

is compared with the directional perceptual masking power distribution:

(13)

due to the original HOA representation

. Next, for each test direction

and critical band b, the level of perception

of the total error is computed. It is here essentially defined as the ratio of the directional power of the total error

and the directional masking power according to:

(14)

The subtraction of 1 and the subsequent maximum operation ensure that the level of perception is zero as long as the error power is below the masking threshold.
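The surrounding description constrains the shape of equation (14): a ratio of error power to masking power, minus one, clipped at zero. A hedged reconstruction is given below. The symbol names are assumptions introduced for illustration, since the original symbols are not recoverable from this text: $\mathcal{L}_q^{(M)}(k,b)$ for the level of perception, $P_q^{(M)}(k,b)$ for the directional power of the total error and $P_{\mathrm{MASK},q}(k,b)$ for the directional masking power, each for test direction $q$, critical band $b$ and frame $k$.

```latex
\mathcal{L}_q^{(M)}(k,b) \;=\; \max\!\left(0,\; \frac{P_q^{(M)}(k,b)}{P_{\mathrm{MASK},q}(k,b)} - 1\right)
```

With this form, an error power exactly at the masking threshold gives a ratio of 1 and therefore a level of 0, and any error power below the threshold is clipped to 0 as well.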

Finally, the number

of directional signals to be extracted can be chosen so as to minimize the average, over all test directions, of the maximum of the error perception level over all critical bands, i.e.:

(15)

It should be noted that, alternatively, the maximum in equation (15) can be replaced by an averaging operation.
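The selection criterion can be sketched in Python as follows. This is a minimal illustration assuming the error powers and masking powers per test direction and critical band are already available; the function names and the nested-list data layout are hypothetical, and the level of perception is taken to be the power ratio minus one, clipped at zero, as the text describes.

```python
def perception_level(err_power, mask_power):
    """Zero while the error power stays at or below the masking power."""
    return max(0.0, err_power / mask_power - 1.0)


def choose_num_directional(err_power, mask_power):
    """Pick the candidate M with the smallest average-over-directions of the
    maximum-over-bands perception level.

    err_power:  dict M -> list over directions q of lists over bands b
    mask_power: list over directions q of lists over bands b
    """
    best_m, best_cost = None, float("inf")
    for m, per_direction in sorted(err_power.items()):
        worst_per_direction = [
            max(perception_level(e, p) for e, p in zip(bands, mask_power[q]))
            for q, bands in enumerate(per_direction)
        ]
        cost = sum(worst_per_direction) / len(worst_per_direction)
        if cost < best_cost:  # on ties the smaller M is kept
            best_m, best_cost = m, cost
    return best_m


# Assumed toy values: Q = 2 test directions, B = 2 critical bands, M in {0, 1, 2}.
mask = [[1.0, 2.0], [1.0, 1.0]]
errors = {
    0: [[3.0, 1.0], [2.0, 0.5]],
    1: [[0.5, 3.0], [1.5, 0.2]],
    2: [[0.8, 1.0], [4.0, 0.1]],
}
chosen = choose_num_directional(errors, mask)
```

Replacing the inner `max` with a mean over the bands gives the alternative criterion mentioned in the text.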

Computation of the directional perceptual masking power distribution

To compute the directional perceptual masking power distribution

due to the original HOA representation

, the latter is transformed to the spatial domain, so that it is represented by general plane waves

impinging from the test directions

, q = 1, ..., Q. When arranging the general plane wave signals

in the matrix

as:

(16)

the transformation to the spatial domain is expressed by the operation:

, (17)

where

denotes the mode matrix with respect to the test directions

, defined by:

, (18)

with

. (19)

The elements

of the directional perceptual masking power distribution

, due to the original HOA representation

, correspond to the masking powers of the general plane wave functions

for the individual critical bands b.
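A hedged sketch of the spatial-domain transformation follows. Equation (17) is lost in this text, so the sketch assumes the common form in which the matrix of general plane wave signals is the transposed mode matrix multiplied by the HOA frame; the function name `to_spatial_domain` and the nested-list layout are hypothetical, and the mode matrix values in the example are arbitrary numbers rather than spherical harmonics.

```python
def to_spatial_domain(mode_matrix, hoa_frame):
    """Compute V = Xi^T C with plain nested lists.

    mode_matrix: O rows (one per HOA coefficient) by Q columns (test directions)
    hoa_frame:   O rows (one per HOA coefficient) by L columns (samples)
    Returns Q rows (one general plane wave signal per direction) by L columns.
    """
    n_coeff = len(mode_matrix)
    if len(hoa_frame) != n_coeff:
        raise ValueError("mode matrix and HOA frame disagree on coefficient count")
    n_dirs = len(mode_matrix[0])
    n_samples = len(hoa_frame[0])
    return [
        [sum(mode_matrix[o][q] * hoa_frame[o][t] for o in range(n_coeff))
         for t in range(n_samples)]
        for q in range(n_dirs)
    ]


# Arbitrary stand-in values: O = 2 coefficients, Q = 2 directions, L = 3 samples.
xi = [[1.0, 1.0],
      [1.0, -1.0]]
c_frame = [[1.0, 2.0, 3.0],
           [0.5, 0.0, -1.0]]
v = to_spatial_domain(xi, c_frame)
```

The masking powers per critical band would then be derived from each row of the result by a psychoacoustic model, which is outside the scope of this sketch.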

Computation of the directional power distribution

In the following, two alternatives for computing the directional power distribution

are presented:

a. One possibility is to actually compute the approximation

of the desired HOA representation

by performing the two operations mentioned at the beginning of section A.2. The total approximation error

is then computed according to equation (11). Next, the total approximation error

is transformed to the spatial domain, so that it is represented by general plane waves

impinging from the test directions

. Arranging the general plane wave signals in the matrix

as:

(20),

. (21)

The elements

of the directional power distribution

of the total approximation error

are obtained by computing the powers of the general plane wave functions

within the individual critical bands b.

b. An alternative solution is to compute only the approximation that excludes perceptual coding, instead of the fully coded approximation. This method offers the advantage that the complicated perceptual coding of the individual signals does not need to be carried out directly. Instead, it is sufficient to know the powers of the perceptual quantization error within the individual Bark scale critical bands. For this purpose, the total approximation error defined in equation (11) can be written as the sum of the following three approximation errors:

(22)

(23)

, (24)

which can be assumed to be independent of each other. Owing to this independence, the directional power distribution of the total error can be expressed as the sum of the directional power distributions of the three individual errors.

The following describes how to compute the directional power distributions of the three errors for the individual Bark scale critical bands:

a. To compute the directional power distribution of the first error, the error is first transformed into the spatial domain as follows:

, (25)

where the approximation error is thus represented by general plane waves impinging from the test directions, which are arranged in a matrix according to:

(26)

Consequently, the elements of the directional power distribution of this error are obtained by computing the powers of the general plane wave functions within the individual critical bands.

b. To compute the directional power distribution of the second error, it must be taken into account that this error is introduced into the directional HOA component by the perceptual coding of the directional signals. Further, the directional HOA component is assumed to be given by equation (8). For simplicity, it is then assumed that the HOA component is equivalently represented in the spatial domain by O general plane wave functions, which are created from the directional signal by mere scaling, i.e.:

, (27)

where the coefficients in equation (27) are scaling factors. The corresponding plane wave directions are assumed to be uniformly distributed on the unit sphere and cyclically shifted such that one of them coincides with the direction estimate. The scaling factor associated with that direction is therefore equal to 1.

With the mode matrix defined with respect to the cyclically shifted directions, and with all scaling factors arranged in a vector according to:

(28)

the HOA component can be written as:

(29)

Consequently, the error (see equation (23)) between the true directional HOA component:

(30)

and the directional HOA component composed of the perceptually decoded directional signals, given by:

(31)

(32)

can be expressed in terms of the perceptual coding errors:

(33)

in the individual directional signals as follows:

(34)

The representation of the error in the spatial domain with respect to the test directions is given by:

(35)

If the elements of the vector are denoted individually, and provided that the individual perceptual coding errors are independent of each other, it follows from equation (35) that the elements of the directional power distribution of the perceptual coding error can be computed as follows:

(36)

The power term in equation (36) is assumed to represent the power of the perceptual quantization error in the b-th critical band of the directional signal. This power can be assumed to correspond to the perceptual masking power of that directional signal.

c. To compute the directional power distribution of the error that results from the perceptual coding of the HOA coefficient sequences of the ambient HOA component, each HOA coefficient sequence is assumed to be coded independently. Hence, the errors introduced into the individual HOA coefficient sequences within each Bark scale critical band can be assumed to be uncorrelated. This means that the inter-coefficient correlation matrix of the error with respect to each Bark scale critical band is diagonal, i.e.:

(37)

The elements of this matrix are assumed to represent the power of the perceptual quantization error in the b-th critical band of the o-th coded HOA coefficient sequence. They can be assumed to correspond to the perceptual masking power of the o-th HOA coefficient sequence. The directional power distribution of the perceptual coding error is therefore computed as follows:

(38)
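The step from a diagonal correlation matrix to a per-direction error power can be illustrated with a minimal sketch. It assumes that the coefficient-domain error is mapped to the test directions by some real linear transform; the matrix `transform` and both function and variable names are hypothetical and do not reproduce the matrices of the equations above. For uncorrelated coefficient errors, the power in each direction is the sum of the squared transform entries weighted by the per-coefficient error powers, i.e. the diagonal of `T diag(sigma2) T^T`.

```python
def directional_error_power(transform, coeff_error_power):
    """Per-direction error power for uncorrelated coefficient errors.

    transform:          list of Q rows with O real entries each; row q maps the
                        O coefficient errors to test direction q (hypothetical).
    coeff_error_power:  list of O error powers for one critical band, i.e. the
                        diagonal of the (assumed diagonal) correlation matrix.
    """
    return [
        sum(t * t * p for t, p in zip(row, coeff_error_power))
        for row in transform
    ]


# Illustrative numbers only: 3 test directions, 2 coefficient sequences.
transform = [[1.0, 0.0],
             [0.5, 0.5],
             [0.0, 2.0]]
sigma2 = [0.4, 0.1]
power = directional_error_power(transform, sigma2)
```

Because only the diagonal of the transformed correlation matrix is needed, the full Q x Q matrix is never formed.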

B. Improved HOA decompression

The corresponding HOA decompression processing is illustrated in Fig. 3 and includes the following steps or stages.

In step or stage 31, a perceptual decoding of the I coded signals is carried out in order to obtain I decoded signals. In signal redistribution step or stage 32, the perceptually decoded signals are redistributed in order to recreate the frame of directional signals and the frame of the ambient HOA component. The information on how to redistribute the signals is obtained by reproducing the assignment operation performed for the HOA compression, using the index data sets. Since this is a recursive procedure (see section A), the additionally transmitted assignment vector can be used to allow the redistribution procedure to be initialized, for example if the transmission breaks down.

In composition step or stage 33, the current frame of the desired total HOA representation is recomposed (according to the processing described in connection with Fig. 2b and Fig. 4 of EP 12306569.0), using the frame of directional signals, the set of active directional signal indices together with the set of corresponding directions, the parameters for predicting portions of the HOA representation from the directional signals, and the frame of HOA coefficient sequences of the reduced ambient HOA component. These quantities have counterparts in EP 12306569.0, where the indices of the active directional signals are marked in matrix elements. In other words, directional signals with respect to uniformly distributed directions are predicted from the directional signals using the received parameters for such prediction, and thereafter the current decompressed frame is recomposed from the frame of directional signals, the predicted portions and the reduced ambient HOA component.

C. Basics of Higher Order Ambisonics

Higher Order Ambisonics (HOA) is based on the description of a sound field within a compact area of interest, which is assumed to be free of sound sources. In that case, the spatiotemporal behaviour of the sound pressure p(t,x) at time t and position x within the area of interest is physically fully determined by the homogeneous wave equation. In the following, a spherical coordinate system as shown in Fig. 4 is assumed. In the coordinate system used, the x axis points to the frontal position, the y axis points to the left, and the z axis points to the top. A position in space $x = (r, \theta, \phi)^T$ is represented by a radius $r$ (i.e. the distance to the coordinate origin), an inclination angle $\theta$ measured from the polar axis z, and an azimuth angle $\phi$ measured counter-clockwise in the x-y plane from the x axis. Further, $(\cdot)^T$ denotes the transposition.

It can be shown (see E.G. Williams, "Fourier Acoustics", volume 93 of Applied Mathematical Sciences, Academic Press, 1999) that the Fourier transform of the sound pressure with respect to time, denoted by $\mathcal{F}_t(\cdot)$, i.e.:

$$P(\omega, x) = \mathcal{F}_t\big(p(t, x)\big) = \int_{-\infty}^{\infty} p(t, x)\, e^{-i\omega t}\, dt \tag{39}$$

where $\omega$ denotes the angular frequency and $i$ indicates the imaginary unit, can be expanded into a series of spherical harmonics according to:

$$P(\omega = k c_s, r, \theta, \phi) = \sum_{n=0}^{N} \sum_{m=-n}^{n} A_n^m(k)\, j_n(kr)\, S_n^m(\theta, \phi) \tag{40}$$

In equation (40), $c_s$ denotes the speed of sound and $k$ denotes the angular wave number, which is related to the angular frequency $\omega$ by $k = \omega / c_s$. Further, $j_n(\cdot)$ denote the spherical Bessel functions of the first kind, and $S_n^m(\theta, \phi)$ denote the real-valued spherical harmonics of order n and degree m, which are defined in section C.1 below. The expansion coefficients $A_n^m(k)$ depend only on the angular wave number $k$. In the above, it has been implicitly assumed that the sound pressure is spatially band-limited. Thus, the series of spherical harmonics is truncated with respect to the order index n at an upper limit N, which is called the order of the HOA representation.

If the sound field is represented by a superposition of an infinite number of harmonic plane waves of different angular frequencies $\omega$, arriving from all possible directions specified by the angle tuple $(\theta, \phi)$, it can be shown (see B. Rafaely, "Plane-wave Decomposition of the Sound Field on a Sphere by Spherical Convolution", Journal of the Acoustical Society of America, vol. 116, no. 4, pages 2149-2157, 2004) that the corresponding plane wave complex amplitude function $C(\omega, \theta, \phi)$ can be expressed by the following spherical harmonics expansion:

$$C(\omega = k c_s, \theta, \phi) = \sum_{n=0}^{N} \sum_{m=-n}^{n} C_n^m(k)\, S_n^m(\theta, \phi) \tag{41}$$

where the expansion coefficients $C_n^m(k)$ are related to the expansion coefficients of equation (40) by

(42)

Assuming that the individual coefficients $C_n^m(k = \omega / c_s)$ are functions of the angular frequency $\omega$, applying the inverse Fourier transform (denoted by $\mathcal{F}_t^{-1}(\cdot)$) provides the time-domain functions:

$$c_n^m(t) = \mathcal{F}_t^{-1}\big(C_n^m(\omega / c_s)\big) = \frac{1}{2\pi} \int_{-\infty}^{\infty} C_n^m\!\left(\frac{\omega}{c_s}\right) e^{i\omega t}\, d\omega \tag{43}$$

for each order n and degree m, which can be collected in a single vector $c(t)$ by:

$$c(t) = \big[\, c_0^0(t)\;\; c_1^{-1}(t)\;\; c_1^0(t)\;\; c_1^1(t)\;\; c_2^{-2}(t)\;\; \ldots\;\; c_N^{N-1}(t)\;\; c_N^N(t) \,\big]^T \tag{44}$$

The position index of a time-domain function $c_n^m(t)$ within the vector $c(t)$ is given by $n(n+1)+1+m$. The total number of elements in the vector $c(t)$ is $O = (N+1)^2$. The final Ambisonics format provides the sampled version of $c(t)$, using a sampling frequency $f_S$, as follows:

$$\{c(l T_S)\}_{l \in \mathbb{N}} = \{c(T_S),\, c(2T_S),\, c(3T_S),\, c(4T_S),\, \ldots\} \tag{45}$$

where $T_S = 1/f_S$ denotes the sampling period. The elements of $c(l T_S)$ are referred to here as Ambisonics coefficients. The time-domain signals $c_n^m(t)$, and hence the Ambisonics coefficients, are real-valued.
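The ordering of the coefficient sequences inside the vector can be made concrete with a small sketch. It assumes the common HOA convention that the function of order n and degree m sits at the 1-based position n(n+1)+1+m and that an order-N representation holds (N+1)^2 sequences; the function names are illustrative only.

```python
import math


def hoa_coefficient_count(order):
    """Number O of HOA coefficient sequences for HOA order N: O = (N + 1)^2."""
    return (order + 1) ** 2


def position_index(n, m):
    """1-based position of the (n, m) function in the vector: n(n + 1) + 1 + m."""
    if n < 0 or abs(m) > n:
        raise ValueError("need n >= 0 and -n <= m <= n")
    return n * (n + 1) + 1 + m


def order_and_degree(index):
    """Inverse mapping: recover (n, m) from a 1-based position index."""
    if index < 1:
        raise ValueError("position index starts at 1")
    n = math.isqrt(index - 1)      # positions n^2 + 1 .. (n + 1)^2 belong to order n
    m = index - 1 - n * (n + 1)
    return n, m


# Enumerate all (n, m) pairs of a fourth-order representation in vector order.
ORDER = 4
layout = [order_and_degree(i) for i in range(1, hoa_coefficient_count(ORDER) + 1)]
```

With this layout a fourth-order representation carries 25 coefficient sequences, which is the number of channels a compression scheme would need without any reduction.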

C.1. Definition of real-valued spherical harmonics

The real-valued spherical harmonics $S_n^m(\theta, \phi)$ are given by:

$$S_n^m(\theta, \phi) = \sqrt{\frac{(2n+1)}{4\pi}\, \frac{(n-|m|)!}{(n+|m|)!}}\; P_{n,|m|}(\cos\theta)\; \mathrm{trg}_m(\phi) \tag{46}$$

where

$$\mathrm{trg}_m(\phi) = \begin{cases} \sqrt{2}\cos(m\phi) & m > 0 \\ 1 & m = 0 \\ -\sqrt{2}\sin(m\phi) & m < 0 \end{cases} \tag{47}$$

The associated Legendre functions $P_{n,m}(x)$ are defined as:

$$P_{n,m}(x) = (1 - x^2)^{m/2}\, \frac{d^m}{dx^m} P_n(x), \quad m \ge 0 \tag{48}$$

with the Legendre polynomial $P_n(x)$ and, unlike in the above-mentioned work by Williams, without the Condon-Shortley phase term $(-1)^m$.
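A minimal sketch of how such real-valued spherical harmonics can be evaluated is given below. It assumes an orthonormal normalization, a trigonometric factor of sqrt(2)cos(m phi) for m > 0, 1 for m = 0 and -sqrt(2)sin(m phi) for m < 0, and associated Legendre functions without the Condon-Shortley phase, as stated in the text. The function names are illustrative and the normalization is an assumption of this sketch.

```python
import math


def assoc_legendre(n, m, x):
    """Associated Legendre function P_{n,m}(x), m >= 0, no Condon-Shortley phase."""
    # Start value: P_{m,m}(x) = (2m - 1)!! * (1 - x^2)^(m/2)
    root = math.sqrt(max(0.0, 1.0 - x * x))
    p_mm = 1.0
    for k in range(1, m + 1):
        p_mm *= (2 * k - 1) * root
    if n == m:
        return p_mm
    p_next = x * (2 * m + 1) * p_mm          # P_{m+1,m}(x)
    # Upward recurrence in the order:
    # (l - m) P_{l,m} = (2l - 1) x P_{l-1,m} - (l + m - 1) P_{l-2,m}
    for l in range(m + 2, n + 1):
        p_mm, p_next = p_next, ((2 * l - 1) * x * p_next - (l + m - 1) * p_mm) / (l - m)
    return p_next


def real_sh(n, m, theta, phi):
    """Real-valued spherical harmonic of order n and degree m (assumed convention)."""
    am = abs(m)
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - am) / math.factorial(n + am))
    if m > 0:
        trg = math.sqrt(2.0) * math.cos(m * phi)
    elif m == 0:
        trg = 1.0
    else:
        trg = -math.sqrt(2.0) * math.sin(m * phi)
    return norm * assoc_legendre(n, am, math.cos(theta)) * trg


# Example evaluation at inclination 0.3 rad and azimuth 1.2 rad.
theta, phi = 0.3, 1.2
first_order = [real_sh(1, m, theta, phi) for m in (-1, 0, 1)]
```

Under these assumptions the three first-order functions are proportional to the y, z and x components of the unit direction vector, which gives a quick sanity check of the sign convention.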

C.2. Spatial resolution of Higher Order Ambisonics

A general plane wave function $x(t)$ arriving from a direction $\Omega_0 = (\theta_0, \phi_0)^T$ is represented in HOA by:

$$c_n^m(t) = x(t)\, S_n^m(\Omega_0), \quad 0 \le n \le N,\; |m| \le n \tag{49}$$

The corresponding spatial density of plane wave amplitudes $c(t, \Omega)$ is given by:

$$c(t, \Omega) = \sum_{n=0}^{N} \sum_{m=-n}^{n} c_n^m(t)\, S_n^m(\Omega) \tag{50}$$

$$= x(t) \underbrace{\left[ \sum_{n=0}^{N} \sum_{m=-n}^{n} S_n^m(\Omega_0)\, S_n^m(\Omega) \right]}_{v_N(\Theta)} \tag{51}$$

It can be seen from equation (51) that it is the product of the general plane wave function $x(t)$ and a spatial dispersion function $v_N(\Theta)$, which can be shown to depend only on the angle $\Theta$ between $\Omega$ and $\Omega_0$, having the property:

$$\cos\Theta = \cos\theta \cos\theta_0 + \cos(\phi - \phi_0) \sin\theta \sin\theta_0 \tag{52}$$

As expected, in the limit of an infinite order, i.e. $N \to \infty$, the spatial dispersion function turns into a Dirac delta:

(53)

However, in the case of a finite order N, the contribution of the general plane wave from direction $\Omega_0$ is smeared over neighbouring directions, where the extent of the blurring decreases with increasing order. A plot of the normalized dispersion function for different values of N is shown in Fig. 5.
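The narrowing of the dispersion function with increasing order can be checked numerically. The sketch below assumes orthonormal real-valued spherical harmonics, for which the addition theorem reduces the double sum over n and m to a sum of (2n+1)/(4 pi) times the Legendre polynomial of order n evaluated at the cosine of the angle between the two directions. The function names are illustrative.

```python
import math


def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for l in range(2, n + 1):
        # l P_l = (2l - 1) x P_{l-1} - (l - 1) P_{l-2}
        p_prev, p = p, ((2 * l - 1) * x * p - (l - 1) * p_prev) / l
    return p


def angle_between(theta, phi, theta0, phi0):
    """Angle between two directions given as (inclination, azimuth) pairs."""
    c = (math.cos(theta) * math.cos(theta0)
         + math.cos(phi - phi0) * math.sin(theta) * math.sin(theta0))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding


def dispersion(order, big_theta):
    """Spatial dispersion function of an order-N representation (assumed form)."""
    x = math.cos(big_theta)
    return sum((2 * n + 1) / (4 * math.pi) * legendre(n, x) for n in range(order + 1))


def normalized_dispersion(order, big_theta):
    """Dispersion function scaled so that its value at zero angle is 1."""
    return dispersion(order, big_theta) / dispersion(order, 0.0)


# Compare how much of the peak remains 0.5 rad away from the source direction.
low_order = normalized_dispersion(1, 0.5)
high_order = normalized_dispersion(4, 0.5)
```

Here `high_order` is noticeably smaller than `low_order`, which reflects the reduced smearing at higher orders described above.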

It should be noted that, for any direction, the time-domain behaviour of the spatial density of plane wave amplitudes is a multiple of its behaviour in any other direction. In particular, the functions $c(t, \Omega_1)$ and $c(t, \Omega_2)$ for some fixed directions $\Omega_1$ and $\Omega_2$ are highly correlated with each other with respect to time t.

C.3. Spherical harmonic transform

If the spatial density of plane wave amplitudes is discretized at a number $O$ of spatial directions $\Omega_o$, $1 \le o \le O$, which are nearly uniformly distributed on the unit sphere, $O$ directional signals $c(t, \Omega_o)$ are obtained. Collecting these signals into a vector as:

$$w(t) := \big[\, c(t, \Omega_1)\;\; \ldots\;\; c(t, \Omega_O) \,\big]^T \tag{54}$$

it can be verified, by using equation (50), that this vector can be computed from the continuous Ambisonics representation $c(t)$ defined in equation (44) by a simple matrix multiplication as:

$$w(t) = \Psi^H c(t) \tag{55}$$

where $(\cdot)^H$ indicates the joint transposition and conjugation, and $\Psi$ denotes the mode matrix defined by:

$$\Psi := [\, S_1\;\; \ldots\;\; S_O \,] \tag{56}$$

with:

$$S_o := \big[\, S_0^0(\Omega_o)\;\; S_1^{-1}(\Omega_o)\;\; S_1^0(\Omega_o)\;\; S_1^1(\Omega_o)\;\; \ldots\;\; S_N^{N-1}(\Omega_o)\;\; S_N^N(\Omega_o) \,\big]^T \tag{57}$$

Because the directions $\Omega_o$ are nearly uniformly distributed on the unit sphere, the mode matrix is in general invertible. Hence, the continuous Ambisonics representation can be computed from the directional signals $c(t, \Omega_o)$ by:

$$c(t) = \Psi^{-H} w(t) \tag{58}$$

Equations (55) and (58) constitute a transform and an inverse transform between the Ambisonics representation and the spatial domain. These transforms are referred to here as the "spherical harmonic transform" and the "inverse spherical harmonic transform".
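The pair of transforms can be tried out on a small first-order example. The sketch below is an illustration under stated assumptions: it uses order N = 1 with O = 4 directions placed at the vertices of a regular tetrahedron (chosen here as one uniformly distributed layout), closed-form first-order real spherical harmonics in an orthonormal convention, and a small Gaussian elimination in place of an explicit matrix inverse. All names are illustrative.

```python
import math


def mode_vector(theta, phi):
    """First-order mode vector [S_0^0, S_1^-1, S_1^0, S_1^1] (assumed convention)."""
    a = math.sqrt(1.0 / (4.0 * math.pi))
    b = math.sqrt(3.0 / (4.0 * math.pi))
    return [a,
            b * math.sin(theta) * math.sin(phi),
            b * math.cos(theta),
            b * math.sin(theta) * math.cos(phi)]


def to_angles(x, y, z):
    """Cartesian vector -> (inclination from z axis, azimuth from x axis)."""
    r = math.sqrt(x * x + y * y + z * z)
    return math.acos(z / r), math.atan2(y, x)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


# Tetrahedron vertices serve as the O = 4 sampling directions.
vertices = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
directions = [to_angles(*v) for v in vertices]
# Row o holds the mode vector of direction o; for a real mode matrix this is its
# transpose, which is what the forward transform multiplies with.
psi_h = [mode_vector(th, ph) for th, ph in directions]


def sh_transform(c):
    """Spatial-domain signals from HOA coefficients (forward transform)."""
    return [sum(s * ci for s, ci in zip(row, c)) for row in psi_h]


def inverse_sh_transform(w):
    """HOA coefficients from spatial-domain signals (inverse transform)."""
    return solve(psi_h, w)


# One sample (value 1) of a plane wave arriving from the first direction.
c = mode_vector(*directions[0])
w = sh_transform(c)
c_back = inverse_sh_transform(w)
```

In this particular layout the plane wave appears only in the spatial signal of its own direction and vanishes in the other three, and the inverse transform returns the original coefficient vector.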

It should be noted that, since the directions $\Omega_o$ are nearly uniformly distributed on the unit sphere, the approximation:

$$\Psi^H \approx \Psi^{-1} \tag{59}$$

is available, which justifies the use of $\Psi^{-1}$ instead of $\Psi^H$ in equation (55).

Advantageously, all of the relations mentioned are also valid for the discrete-time domain.

The inventive processing can be carried out by a single processor or electronic circuit, or by several processors or electronic circuits operating in parallel and/or operating on different parts of the inventive processing.

Claims

1. A method for compressing, using a first number of perceptual encodings, a Higher Order Ambisonics representation of a sound field, denoted HOA, with input time frames of HOA coefficient sequences, said method including the following steps, which are carried out on a frame-by-frame basis:

- estimating, for a current frame, a set of dominant directions and a corresponding data set of indices of detected directional signals;

- separating, from the HOA coefficient sequences of said current frame, a second number of directional signals with respective directions contained in said set of dominant direction estimates and with a respective delayed data set of indices of said directional signals,

and an ambient HOA component that is represented by a reduced number of HOA coefficient sequences and a corresponding data set of indices of said reduced number of ambient HOA coefficient sequences, said reduced number corresponding to the difference between said first number and said second number;

- assigning said directional signals and the HOA coefficient sequences of said ambient HOA component to a frame of channels whose number corresponds to said first number, wherein said delayed data set of indices of said directional signals and said data set of indices of said reduced number of ambient HOA coefficient sequences are used for said assignment;

- perceptually encoding said channels of the assigned frame so as to provide an encoded compressed frame.

2. The method of claim 1, wherein said second number of directional signals is determined according to a perceptually related criterion such that:

- the correspondingly decompressed HOA representation provides the lowest perceptible error that can be achieved with the fixed given number of channels for the compression, wherein said criterion takes into account the following errors:

-- the modelling errors arising from using different numbers of said directional signals and different numbers of HOA coefficient sequences for the ambient HOA component;

-- the quantization noise introduced by the perceptual coding of said directional signals;

-- the quantization noise introduced by coding the individual HOA coefficient sequences of said ambient HOA component;

-- the total error resulting from the above three errors, which is considered for a number of test directions and a number of critical bands with respect to its perceptibility;

- said second number of directional signals is chosen so as to minimize the average perceptible error or the maximum perceptible error, in order to achieve said lowest perceptible error.

3. The method of claim 1, wherein the choice of the reduced number of HOA coefficient sequences for representing the ambient HOA component is carried out according to a criterion that differentiates between the following three cases:

- in case the number of HOA coefficient sequences for said current frame is the same as for the previous frame, the same HOA coefficient sequences as in said previous frame are chosen;

- in case the number of HOA coefficient sequences for said current frame is smaller than that for said previous frame, those HOA coefficient sequences from said previous frame are deactivated which were, in said previous frame, assigned to a channel that is occupied by a directional signal in said current frame;

- in case the number of HOA coefficient sequences for said current frame is greater than that for said previous frame, those HOA coefficient sequences that were selected in said previous frame are also selected in said current frame, and the additional HOA coefficient sequences can be selected according to their perceptual significance or according to the highest average power.

4. The method according to claim 1, wherein said assignment is carried out as follows:

- active directional signals are assigned to the given channels such that they keep their channel indices, in order to obtain continuous signals for said perceptual coding;

- the HOA coefficient sequences of said ambient HOA component are assigned such that a minimum number of such coefficient sequences is always contained in a corresponding number of last channels;

- for assigning additional HOA coefficient sequences of said ambient HOA component, it is determined whether they were also selected in the previous frame:

-- if true, the assignment of these HOA coefficient sequences to the channels to be perceptually encoded is the same as for said previous frame;

-- if not true, and if the HOA coefficient sequences are newly selected, the HOA coefficient sequences are first arranged with respect to their indices in ascending order and are assigned in this order to channels to be perceptually encoded that are not yet occupied by directional signals.

5. The method according to claim 1, in which

is the number of HOA coefficient sequences representing said ambient HOA component, wherein the parameters describing said assignment are arranged in a bitmap whose length corresponds to the additional number of HOA coefficient sequences used in addition to the number

of HOA coefficient sequences for representing said ambient HOA component, wherein each o-th bit in said bitmap indicates whether the

-th additional HOA coefficient sequence is used for representing said ambient HOA component.

6. The method of claim 1, wherein the parameters describing said assignment are arranged in an assignment vector having a length corresponding to the number of inactive directional signals, the elements of this vector indicating which of the additional HOA coefficient sequences of the ambient HOA component are assigned to channels with inactive directional signals.

7. The method of claim 1, wherein said separating of the HOA coefficient sequences of said current frame additionally provides parameters that can be used on the decompression side for predicting portions of the original HOA representation from said directional signals.

8. The method of claim 4, wherein said assignment provides an assignment vector, the elements of which represent information as to which of the additional HOA coefficient sequences of said ambient HOA component are assigned to channels with inactive directional signals.

9. An apparatus for compressing, using a first number of perceptual encodings, a representation of a sound field based on higher-order ambiophony, denoted HOA, with input time frames of HOA coefficient sequences, wherein said apparatus performs processing on a frame-by-frame basis and includes:

an estimator for estimating, for the current frame, a set of dominant directions and a corresponding set of index data of the detected directional signals;

a separating unit for separating, from the HOA coefficient sequences of said current frame, a second number of directional signals with respective directions contained in said set of dominant direction estimates and with a corresponding delayed index data set of said directional signals;

an assigner for assigning said directional signals and the HOA coefficient sequences of said surrounding HOA component to a frame of channels, the number of which corresponds to said first number, thereby deriving index parameters of the selected surrounding HOA coefficient sequences, which describe said assignment and can be used for the corresponding reallocation on the decompression side, said assignment using said delayed index data set of said directional signals and said index data set of said reduced number of surrounding HOA coefficient sequences;

a coding unit that perceptually encodes said channels of the assigned frame so as to provide an encoded compressed frame.

10. The apparatus of claim 9, wherein said second number of directional signals is determined according to a perceptually related criterion such that:

- the total error resulting from the above three errors is considered, for a number of test directions and a number of critical frequency bands, with respect to its perceptibility;

11. The apparatus of claim 9, wherein the selection of a reduced number of HOA coefficient sequences to represent the surrounding HOA component is performed according to a criterion that differs between the following three cases:

12. The apparatus according to claim 9, in which said assignment is performed as follows:

) recent channels;

13. The apparatus according to claim 9, in which

14. The apparatus of claim 9, wherein the parameters describing said assignment are placed in an assignment vector having a length corresponding to the number of inactive directional signals, the elements of this vector indicating which of the additional sequences of HOA coefficients of the surrounding HOA component are assigned to channels with inactive directional signals.

15. The apparatus of claim 9, wherein said separating from the HOA coefficient sequences of said current frame further provides parameters that can be used on the decompression side to predict portions of the original HOA representation from said directional signals.

16. The apparatus of claim 12, wherein said assignment provides an assignment vector, the elements of which indicate which of the additional HOA coefficient sequences of said surrounding HOA component are assigned to channels with inactive directional signals.

17. A method for decompressing a compressed representation based on higher-order ambiophony, said decompression comprising the steps of:

perceptually decoding the current encoded compressed frame to provide a perceptually decoded channel frame;

reassigning said perceptually decoded channel frame, based on an assignment vector indicating at least an index of a possibly contained coefficient sequence of the surrounding HOA component and on an index data set of the directional signals, so as to recreate a corresponding reconstructed frame of the surrounding HOA component;

recomposing the current frame of the decompressed HOA representation from the reconstructed frame of the surrounding HOA component and the reconstructed frame of directional signals, based on the index data set of the detected directional signals and the set of dominant direction estimates.
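The reassignment step on the decompression side can be illustrated with the sketch below. It assumes the perceptual decoding has already produced one list of samples per channel. The function name `reassign_channels`, the dictionary inputs, and the convention that ambient coefficient sequences not carried in the frame are filled with zeros are assumptions of this example. The final recomposition of the HOA frame from directional signals and directions is not shown.

```python
def reassign_channels(decoded, directional_channels, ambient_assignment, num_coeffs):
    """Sketch: split a decoded channel frame into directional and ambient parts.

    decoded              -- list of channels, each a list of samples
    directional_channels -- {channel: directional signal index}
    ambient_assignment   -- {channel: 1-based ambient HOA coefficient index}
    num_coeffs           -- total number of HOA coefficient sequences
    Returns (directional, ambient): a dict of directional signals keyed by
    index and a num_coeffs x frame_length list of ambient coefficient sequences.
    """
    frame_len = len(decoded[0])

    # Channels flagged as directional are routed to the directional signal frame.
    directional = {idx: list(decoded[ch]) for ch, idx in directional_channels.items()}

    # Start from an all-zero ambient frame (assumption: sequences that were
    # not transmitted are zero), then fill in the sequences that are present.
    ambient = [[0.0] * frame_len for _ in range(num_coeffs)]
    for ch, coeff in ambient_assignment.items():
        ambient[coeff - 1] = list(decoded[ch])
    return directional, ambient
```

The two dictionaries play the role of the index data set of the directional signals and of the assignment vector, respectively: together they tell the decoder what each transmitted channel carries.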

18. A device for decompressing a compressed representation based on higher-order ambiophony, said device including:

a decoding unit for perceptually decoding the current encoded compressed frame so as to provide a perceptually decoded channel frame;

a reallocator for reallocating said perceptually decoded channel frame, based on an assignment vector indicating at least an index of a possibly contained coefficient sequence of the surrounding HOA component and on an index data set of the directional signals, so as to recreate a corresponding reconstructed frame of the surrounding HOA component;

a recomposing unit for recomposing the current frame of the decompressed HOA representation from the reconstructed frame of the surrounding HOA component and the reconstructed frame of directional signals, based on the index data set of the detected directional signals and the set of dominant direction estimates.