RU2773510C2

RU2773510C2 - Downmixer, audio encoder, method and computer program applying a phase value to a magnitude value

Info

Publication number: RU2773510C2
Application number: RU2020136237A
Authority: RU
Inventors: Alexander KARAPETYAN; Felix WOLF; Jan PLOGSTIES
Original assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Priority date: 2018-04-06
Filing date: 2019-04-05
Publication date: 2022-06-06

Abstract

FIELD: providing a downmix signal.

SUBSTANCE: the group of inventions relates to a downmixer, an audio encoder, and methods for providing a downmix signal based on a plurality of input signals. The downmixer is configured to determine a magnitude value of a spectral-domain value of the downmix signal based on loudness information of the input signals. The downmixer is also configured to determine a phase value of the spectral-domain value of the downmix signal, and to apply the phase value in order to obtain a complex-valued number representation of the spectral-domain value of the downmix signal based on its magnitude value. The audio encoder uses such a downmixer. A downmixing method and a computer-readable storage medium are also described.

EFFECT: providing a downmixing method that improves sound quality while reducing computational complexity.

44 cl, 11 dwg

Description

Technical field

Embodiments according to the invention relate to a downmixer for providing a downmix signal based on a plurality of input signals.

Further embodiments according to the invention relate to an audio encoder for providing an encoded audio representation based on a plurality of input audio signals.

Further embodiments according to the invention relate to a method for providing a downmix signal based on a plurality of input signals.

Further embodiments according to the invention relate to a computer program.

Background

In the field of audio signal processing, it is sometimes desirable to combine multiple audio signals into a single audio signal. This can, for example, reduce the complexity of audio encoding. Information about the characteristics of the original audio signals and/or about the characteristics of the downmix process may, for example, be included in the encoded audio representation alongside the downmix signal itself (preferably in encoded form).

Downmixing is the process of converting, for example, a program with a multi-channel configuration into a program with fewer channels. On this topic, see, for example, the definition of "downmixing" given in Wikipedia.

A special case is binaural downmixing, in which several binaurally rendered signals (per ear) are downmixed into one channel. Traditionally, the N channels of a multi-channel signal are combined by simple summation to form an M-channel signal (where, typically, N>M).

Some problems in downmixing are described below.

It has been found that downmixing several audio signals can result in unwanted interference. It has also been found that the interference can be divided into three categories:

1. Two signals S₁ and S₂ (where the signals may, for example, be represented by vectors S describing their magnitude (length) and phase (angle)) have similar phase angles at a given point in time (see, for example, Fig. 4a). In this case there is constructive interference (for example, a magnitude summation with +6 dB instead of an energy summation with +3 dB).

2. If the two vectors point in different directions at a given time (see, for example, Fig. 4b), there is partially destructive interference.

3. If the two vectors have similar magnitudes and an angular difference of approximately 180°, there is strong destructive interference or even complete cancellation (see, for example, Fig. 4c). In this case, the resulting vector has an erroneous phase angle.

To conclude, three types of interference that may occur during the downmix procedure have been explained. These three types of interference are illustrated in Fig. 4.
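The three cases can be checked numerically. The following is a minimal sketch, assuming two phasors of unit magnitude; the function name `level_change_db` and the chosen angles are illustrative and do not come from the description. It reports the level of the plain sum S₁ + S₂ relative to a single signal.

```python
import cmath
import math


def level_change_db(phase_difference_deg: float) -> float:
    """Level (in dB) of S1 + S2 relative to S1 alone, for unit-magnitude phasors."""
    s1 = cmath.rect(1.0, 0.0)
    s2 = cmath.rect(1.0, math.radians(phase_difference_deg))
    magnitude = abs(s1 + s2)
    if magnitude < 1e-12:
        # Complete cancellation: the sum carries no signal at all.
        return float("-inf")
    return 20.0 * math.log10(magnitude)


# Energy summation of two equal signals, for comparison (about +3 dB).
ENERGY_SUM_DB = 10.0 * math.log10(2.0)

for angle in (0.0, 120.0, 180.0):
    print(f"{angle:5.1f} deg -> {level_change_db(angle):+.2f} dB")
```

With equal phases the plain sum rises by about 6 dB, which is roughly 3 dB more than the energy summation; at 120° the sum is no louder than a single input; at 180° it vanishes.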

This problem occurs in wideband signals as well as in individual frequency bands. In terms of sound quality, the first two types of interference lead to undesired changes in sound color, flanging effects, a partially reverberant impression, and so on. The third type of interference, on the other hand, leads to the cancellation of signal components, or may (perceptually) amplify the above artifacts.

It has been found that one approach to correcting undesired sound changes is to modify the spectrum of the downmixed signal. Through energy-preserving corrections in individual frequency bands, the passive downmix is equalized in the spectral domain, and the desired spectrum is (almost) achieved. It has also been found that, preferably, the energy values should be smoothed over time when using this method. However, it has been found that, because of the smoothing, the resulting correction values become sluggish and may additionally amplify constructive interference or attenuated destructive interference.

Such a principle can be summarized as an energy-corrected downmix.

US 7039204 B2 describes equalization for audio mixing. During mixing of an N-channel input signal to generate an M-channel output signal, the mixed channel signals are equalized (e.g., amplified) to keep the overall energy/loudness level of the output signal substantially equal to the overall energy/loudness level of the input signal. In one embodiment, the N input channel signals are converted to the frequency domain on a frame-by-frame basis, and the overall spectral loudness of the N-channel input signal is estimated. After the spectra of the N input channel signals are mixed (e.g., using weighted summation), the overall spectral loudness of the resulting M mixed channel signals is also estimated. A frequency-dependent gain factor based on the two loudness estimates is applied to the spectral components of the M mixed channel signals to generate M equalized mixed channel signals. The M-channel output signal is generated by converting the M equalized mixed channel signals to the time domain.
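The principle of such an energy-corrected downmix can be illustrated for a single spectral bin. This is a simplified sketch under stated assumptions, not the method of US 7039204 B2: loudness is approximated by the squared magnitude, there is one output channel, no weighting, and no smoothing over time. The function name `energy_corrected_downmix` is hypothetical.

```python
import math


def energy_corrected_downmix(bins: list[complex]) -> complex:
    """Passive sum of one spectral bin, rescaled to the summed input energy.

    Assumption: "loudness" is approximated by energy, i.e. |X|^2.
    """
    passive = sum(bins)                              # plain complex summation
    target_energy = sum(abs(x) ** 2 for x in bins)   # energy of the inputs
    passive_energy = abs(passive) ** 2               # energy after mixing
    if passive_energy < 1e-24:
        # Complete cancellation: no gain can restore the signal.
        return 0j
    gain = math.sqrt(target_energy / passive_energy)
    return gain * passive
```

For nearly opposite inputs such as `[1+0j, -0.99+0j]` the passive sum is tiny and the required gain exceeds 100, which illustrates why a correction applied after complex summation can become numerically delicate.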

However, in view of the conventional concepts, there is a need for a downmixing concept that provides an improved tradeoff between audio quality and computational complexity.

Summary of the invention

An embodiment according to the invention creates a downmixer for providing a downmix signal based on a plurality of input signals (which may, for example, be complex-valued and which may, for example, be input audio signals). The downmixer is configured to determine (e.g., calculate or estimate) a magnitude value of a spectral-domain value of the downmix signal (e.g., for a given spectral bin) based on loudness information of the input signals (e.g., based on loudness values associated with the given spectral bin of the input signals). The downmixer is configured to determine a phase value (which may, for example, be a scalar value) of the spectral-domain value of the downmix signal (e.g., for the given spectral bin). For example, the downmixer may be configured to determine the phase value independently of the determination of the magnitude value. The downmixer is configured to apply the phase value in order to obtain a complex-valued number representation of the spectral-domain value of the downmix signal (e.g., for the given spectral bin) based on the magnitude value of the spectral-domain value of the downmix signal.

This embodiment according to the invention is based on the idea that a good tradeoff between computational complexity and audio quality can be achieved by calculating a magnitude value of a spectral-domain value of the downmix signal, which is a scalar value, and by applying, in a subsequent step, a phase value, which is typically a scalar value calculated separately from the magnitude value. Accordingly, most of the processing steps can operate on scalar values, and the complex-valued number representation of the spectral-domain values of the downmix signal is formed only at a late (final) stage of the calculation.

In addition, it has been found that a scalar magnitude value can be determined with good accuracy based on the loudness information of the input signals. Using the loudness information of the input signals to obtain the magnitude value prevents the magnitude value from being strongly affected by destructive interference. This is because the loudness information of the input signals is typically not affected by destructive interference, so that converting the loudness information into a magnitude value typically leads to numerically stable solutions.

In other words, by determining the magnitude value of the spectral-domain value mainly on the basis of the loudness information of the input signals (with a possible, optional correction after the loudness information has been converted into a magnitude value, in order to account for cancellation effects), the numerical instabilities and artifacts that could be caused by summing complex-valued numbers and subsequently scaling the result can be avoided.

Moreover, by considering the loudness information of the input signals when determining the magnitude value, the 6 dB signal amplification that can occur in the case of constructive interference, and that would typically be perceived as an artifact, can be avoided. Rather, by considering the loudness information of the input signals, the downmix signal can be made to match the perceived loudness better than in cases where the complex values representing the input signals are simply summed.

In addition, it has been found that a separate phase calculation, kept apart from the determination of the magnitude value, provides a high degree of flexibility. The phase calculation can be performed with good accuracy, and corrections can be applied in order to determine the phase values in the case of destructive interference. Since the phase value is typically a scalar value that is applied only once the magnitude value has been determined, the computational effort for determining and correcting the phase value is particularly small.

To conclude, it has been found that a good tradeoff between computational efficiency and listening impression can be achieved by processing the magnitude value and the phase value separately, and by combining these values to obtain a complex-valued number representation of the spectral-domain value of the downmix signal only at the end of the processing chain (i.e., at the end of the downmix).

In a preferred embodiment, the downmixer is configured to determine the phase value of the spectral-domain value of the downmix signal independently of the determination of the magnitude value of the spectral-domain value of the downmix signal. Such separate processing and determination of the magnitude value and the phase value has been shown to be computationally efficient. In addition, destructive interference has no uncontrolled influence in the processing path for determining the magnitude value.

In a preferred embodiment, the downmixer is configured to determine loudness values of spectral-domain values of the input signals. The downmixer is configured to derive a sum loudness value, associated with the spectral-domain value of the downmix signal, based on the loudness values of the spectral-domain values of the input signals. The downmixer is configured to derive the magnitude value (e.g., an amplitude value) of the spectral-domain value of the downmix signal from the sum loudness value. Accordingly, the magnitude value represents the perceived loudness well. Moreover, by considering the sum loudness and converting this sum loudness value into a magnitude value, the magnitude value (e.g., amplitude value) of the spectral-domain value of the downmix signal does not exhibit an excessive loudness when the input signals show constructive interference. In that case there is only a summation of loudness, not a quadratic increase in loudness, which contributes to a plausible listening impression. On the other hand, there is also no destructive interference, so there are no "deep dips" in the magnitude value even if destructive interference exists between the input signals. Accordingly, the derived magnitude value is well suited for further processing. If required, the magnitude value can easily be attenuated, or even increased, without numerical problems. In particular, deriving the magnitude value from loudness values has the advantage that the magnitude value always lies within a reasonable range of values, since extremely small values are avoided (by considering the sum loudness value) and excessively large values are avoided as well (by avoiding a direct summation of amplitudes). Such processing therefore has a substantial advantage.
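The derivation of a magnitude value from a sum loudness value might look as follows for one spectral bin. This is a minimal sketch, assuming that the loudness value of a bin is its squared magnitude and that the sum loudness is converted back by a square root; the embodiment leaves the loudness measure open, and the function name `magnitude_from_loudness` is hypothetical.

```python
import math


def magnitude_from_loudness(bins: list[complex]) -> float:
    """Magnitude value of the downmix bin, derived from summed loudness.

    Assumption: loudness value of a bin = |X|^2 (energy).
    """
    loudness_values = [abs(x) ** 2 for x in bins]   # one scalar per input
    sum_loudness = sum(loudness_values)             # scalar summation only
    return math.sqrt(sum_loudness)                  # back to an amplitude


in_phase = [1 + 0j, 1 + 0j]
opposite = [1 + 0j, -1 + 0j]
print(magnitude_from_loudness(in_phase), magnitude_from_loudness(opposite))
```

Both calls yield the same value (about 1.414), because the phases of the inputs never enter the calculation: there is neither the 6 dB rise of a direct amplitude summation nor a dip caused by cancellation.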

In a preferred embodiment, the downmixer is configured to determine a sum or a weighted sum of the spectral-domain values of the input signals, and to determine the phase value based on the sum or the weighted sum of the spectral-domain values of the input signals. With such a calculation of the phase value, a correct and reliable phase value can be obtained under many circumstances (even though some errors may occur in the case of strong destructive interference).

In a preferred embodiment, the downmixer is configured to use the magnitude value of the spectral-domain value of the downmix signal as the absolute value of a polar representation of the spectral-domain value of the downmix signal, and to use the phase value as the phase value of the polar representation of the spectral-domain value of the downmix signal. In addition, the downmixer is configured to obtain a Cartesian complex-valued representation of the spectral-domain value of the downmix signal based on the polar representation. Accordingly, the Cartesian complex-valued representation of the spectral-domain value is obtained at a comparatively late processing stage, while the preceding processing stages determine the absolute value and the phase value separately. Such a procedure has been found to be advantageous because processing full complex values can lead to undesired artifacts, depending on the phase relationship between the input signals. In contrast, combining the absolute value and the phase value only at a late processing stage (or even as the final stage of determining the downmix signal) avoids such artifacts. In addition, processing the absolute value and the phase value separately is computationally simpler than processing complex values across several processing stages.
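The final step, in which the phase value is applied to the magnitude value, is an ordinary polar-to-Cartesian conversion. A minimal sketch, assuming that both scalars have already been determined by earlier stages; the function name `to_cartesian` is hypothetical.

```python
import math


def to_cartesian(magnitude: float, phase: float) -> complex:
    """Combine a scalar magnitude and a scalar phase (radians) into one complex bin."""
    real = magnitude * math.cos(phase)
    imag = magnitude * math.sin(phase)
    return complex(real, imag)


# Example: magnitude sqrt(2) at 45 degrees gives approximately 1 + 1j.
downmix_bin = to_cartesian(math.sqrt(2.0), math.pi / 4)
print(downmix_bin)
```

Everything before this call operates on real scalars only; the complex value is formed once per bin at the very end.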

In a preferred embodiment, the downmixer is configured to determine (e.g., calculate) cancellation degree information (e.g., Q) and to consider the cancellation degree information when determining the magnitude value of the spectral-domain value of the downmix signal. For example, the cancellation degree information describes (or quantifies) a degree of constructive or destructive interference between spectral-domain values (e.g., associated with the same spectral bin) of the input signals. In addition, the downmixer is configured to selectively reduce (e.g., attenuate) the magnitude value of the spectral-domain value of the downmix signal compared with (or relative to) a magnitude value, or compared with (or relative to) a "reference magnitude", representing the sum of the loudness values of the spectral-domain values of the input signals, if the cancellation degree information indicates destructive interference (where, for example, the reduction of the magnitude value may vary continuously as a function of the cancellation degree information). It has been found that reducing the magnitude value of the spectral-domain value can be advisable when strong destructive interference is detected, since the phase value is typically unreliable in that case. In other words, the presence of strong destructive interference typically causes the phase value to be unreliable or to change rapidly over a large range of angles. In such cases, reducing the magnitude value of the spectral-domain value of the downmix signal helps to reduce artifacts. Moreover, it has been found that reducing the magnitude value of the spectral-domain value of the downmix signal in a well-controlled manner is better than simply summing the complex-valued representations of the spectral-domain values of the input signals.

In other words, the concept allows a very good compromise between computational efficiency and reduction of the impact of (strong) destructive interference.

In a preferred embodiment, the downmixer is configured to determine sums (e.g., sumIm+, sumIm-, sumRe+, sumRe-) of components of the spectral-domain values of the input signals having (e.g., four) different orientations (e.g., components oriented along the positive imaginary axis, components oriented along the negative imaginary axis, components oriented along the positive real axis, and components oriented along the negative real axis; alternatively, components oriented in a first direction, which may be determined by the sum vector of the spectral-domain values of the input signals, in a second direction that is orthogonal to the first direction, in a third direction that is opposite to the first direction, and in a fourth direction that is opposite to the second direction). Further, the downmixer is configured to determine the suppression degree information based on the sums (e.g., sumIm+, sumIm-, sumRe+, sumRe-) of the components of the spectral-domain values of the input signals having different orientations.
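The four directional sums described above can be illustrated with a minimal sketch for a single spectral bin. The function name `directional_sums` and the representation of the input signals as a list of Python complex numbers are assumptions made for illustration; only the axis-aligned variant (positive/negative real and imaginary axes) is shown.

```python
def directional_sums(values):
    """Return (sumRe+, sumRe-, sumIm+, sumIm-) for one spectral bin.

    `values` holds one complex spectral-domain value per input signal.
    Negative sums are returned with their sign, i.e. sumRe- <= 0.
    """
    sum_re_pos = sum(v.real for v in values if v.real > 0)
    sum_re_neg = sum(v.real for v in values if v.real < 0)
    sum_im_pos = sum(v.imag for v in values if v.imag > 0)
    sum_im_neg = sum(v.imag for v in values if v.imag < 0)
    return sum_re_pos, sum_re_neg, sum_im_pos, sum_im_neg


# Three hypothetical input signals in the same spectral bin.
bin_values = [complex(1.0, 0.5), complex(-0.25, 0.5), complex(0.5, -1.0)]
print(directional_sums(bin_values))
```

Each input value contributes its real part to exactly one of the two real-axis sums and its imaginary part to exactly one of the two imaginary-axis sums, so the four sums can be accumulated in a single pass over the input signals.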

It has been found that evaluating the sums of the components of the spectral-domain values of the input signals having different orientations makes it possible to determine the expected degree of suppression efficiently. For example, if all components have the same orientation (e.g., a positive imaginary part and a positive real part), no strong suppression is to be expected. On the other hand, if the sums of the components in opposite directions are similar or even identical, it can be concluded that there is a high degree of suppression. In other words, by comparing the sums of the components in different orientations or directions, an efficient and reliable conclusion regarding the degree of suppression can be reached. Accordingly, the magnitude value of the spectral-domain value of the downmix signal can be adapted when excessive suppression is expected (or, equivalently, when the phase information is expected to be unreliable).

In a preferred embodiment, the downmixer is configured to select, as dominant sum values (e.g., sumIm+ and sumRe+), two of the determined sums (e.g., sumIm+ and sumRe+) that are associated with orthogonal orientations or directions (e.g., along the positive imaginary axis and along the positive real axis) and that are greater than or equal to the sums associated with the opposite orientations or directions (e.g., sumIm- and sumRe-). For example, the downmixer is configured to determine, for two orientations, which of the determined sums have the largest magnitude, and to select those sums as the "dominant sum values". In addition, the downmixer is configured to determine a scaling value (e.g., Q or Q_mapped), which results in a selective reduction of the magnitude value (e.g., [formula]) of the spectral-domain value of the downmix signal, based on an unsigned ratio (i.e., a ratio in which the sign is not considered, or a ratio of absolute values, or the absolute value of a ratio) between a first non-dominant sum value (e.g., sumRe-), which is associated with a direction or orientation opposite to that of the first dominant sum value (e.g., sumRe+), and the first dominant sum value (e.g., sumRe+), and also based on an unsigned ratio between a second non-dominant sum value (e.g., sumIm-), which is associated with an orientation (or direction) opposite to that of the second dominant sum value (e.g., sumIm+), and the second dominant sum value (e.g., sumIm+), such that an increase in the unsigned ratios (e.g., |sumRe-|/sumRe+ and |sumIm-|/sumIm+) between a non-dominant sum value and its associated dominant sum value leads to a reduction of the magnitude value (e.g., [formula]) of the spectral-domain value of the downmix signal (e.g., by decreasing the scaling value Q). This embodiment is based on the idea that the ratio between sum values associated with opposite directions provides reliable information about the degree of destructive interference. For example, if the first non-dominant sum value is significantly smaller than the first dominant sum value, it can be concluded that there is no suppression, or only slight suppression, between the first direction (associated with the first dominant sum) and the third direction (associated with the first non-dominant sum). Similarly, if the unsigned ratio between the first non-dominant sum value and its associated first dominant sum value becomes large (e.g., close to one), it can be concluded that there is comparatively strong suppression between the first direction (with which the first dominant sum value is associated) and the third direction (with which the first non-dominant sum value is associated). In conclusion, the non-dominant sum values and the dominant sum values can be used efficiently to recognize suppression between the input signals, and can therefore be used efficiently to control the reduction of the magnitude value of the spectral-domain value of the downmix signal.
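The selection of dominant sums and the use of unsigned ratios can be sketched as follows. The text only states that larger ratios must lead to a smaller scaling value and refers to an equation that is not reproduced in this section, so the combination rule used here (one minus the mean of the two ratios) is purely an illustrative assumption, as is the function name `suppression_degree`.

```python
def suppression_degree(sum_re_pos, sum_re_neg, sum_im_pos, sum_im_neg):
    """Return an illustrative Q in [0, 1]: 1 = no suppression, 0 = full suppression."""

    def unsigned_ratio(a, b):
        # The sum with the larger magnitude is the dominant one on this axis.
        dominant = max(abs(a), abs(b))
        non_dominant = min(abs(a), abs(b))
        # With no energy on this axis there is nothing to cancel.
        return non_dominant / dominant if dominant > 0 else 0.0

    ratio_re = unsigned_ratio(sum_re_pos, sum_re_neg)
    ratio_im = unsigned_ratio(sum_im_pos, sum_im_neg)
    # ASSUMPTION: simple mean of the two ratios; the patent's own equation may differ.
    return 1.0 - 0.5 * (ratio_re + ratio_im)


print(suppression_degree(2.0, 0.0, 1.0, 0.0))    # all components point the same way
print(suppression_degree(1.0, -1.0, 2.0, -2.0))  # opposite sums are equal
```

Whatever combination rule is used, the sketch preserves the stated monotonic behavior: as either unsigned ratio approaches one, Q decreases and the downmix magnitude is reduced.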

In a preferred embodiment, the downmixer is configured to calculate the suppression degree information Q according to the equation mentioned herein. In this case, sumRe+ is the sum of the positive real parts of the complex-valued spectral-domain values of the input audio signals (e.g., in the spectral bin under consideration), where all complex-valued spectral-domain values having a positive real part are considered; sumRe- is the sum of the negative real parts of the complex-valued spectral-domain values of the input audio signals (e.g., in the spectral bin under consideration), where all complex-valued spectral-domain values having a negative real part are considered; sumIm+ may be the sum of the positive imaginary parts of the complex-valued spectral-domain values of the input audio signals (e.g., in the spectral bin under consideration), where all complex-valued spectral-domain values having a positive imaginary part are considered; and sumIm- is the sum of the negative imaginary parts of the complex-valued spectral-domain values of the input audio signals (e.g., in the spectral bin under consideration), where all complex-valued spectral-domain values having a negative imaginary part are considered. Accordingly, the suppression degree information Q can be calculated efficiently in accordance with the above considerations.

In a preferred embodiment, the downmixer is configured to determine the magnitude value (e.g., [formula]) of the spectral-domain value of the downmix signal such that the magnitude value (e.g., [formula]) is selectively reduced relative to a reference value (e.g., [formula]), which corresponds to the summed loudness of the spectral-domain values of the input signals, at times at which the suppression degree information (e.g., Q) determined by the downmixer indicates comparatively large destructive interference between the input signals (e.g., in the spectral bin under consideration), and such that the magnitude value is selectively increased relative to the reference value (e.g., [formula]) at times at which the suppression degree information (e.g., Q) indicates comparatively small destructive interference between the input signals. By selectively reducing the magnitude value of the spectral-domain value of the downmix signal at times at which the suppression degree information indicates comparatively large destructive interference, distortions that could be caused by erroneous phase values or by rapidly changing phase values can be avoided. On the other hand, by selectively increasing the magnitude value of the spectral-domain value of the downmix signal at times at which the suppression degree information indicates comparatively small destructive interference between the input signals, the energy loss caused by the reduction of the magnitude value can be compensated at least partially. Thus, the overall perceived loudness can be maintained. The selective reduction of the magnitude of the spectral-domain value of the downmix signal at some times (at which there is strong destructive interference) is (at least partially) compensated by a selective increase of the magnitude of the spectral-domain value of the downmix signal at other times, when there is no high risk of distortion. Accordingly, the energy loss can be at least partially compensated, and a good listening impression of the downmix signal can be achieved.

In a preferred embodiment, the downmixer is configured to track the suppression degree information (e.g., Q(t)) over time and to determine, depending on the history of the suppression degree information, by how much the magnitude value (e.g., [formula]) is selectively increased relative to the reference magnitude value (e.g., M_R) at times at which the suppression degree information (e.g., Q) indicates comparatively small destructive interference between the input signals. For example, the selective increase of the magnitude value relative to the reference magnitude value may be determined such that the magnitude value is increased by a comparatively large amount if a comparatively strong reduction of the magnitude value occurred previously (e.g., on average over time), and such that the magnitude value is increased by a comparatively smaller amount if a comparatively smaller reduction of the magnitude value occurred previously (e.g., on average over time). In other words, the degree of the selective increase of the magnitude value relative to the reference value can be determined such that the energy loss due to the selective reduction of the magnitude value at times at which the suppression degree information indicates comparatively large destructive interference between the input signals is at least partially compensated by the selective increase of the magnitude value at times at which the suppression degree information indicates comparatively small destructive interference. Thus, the energy loss caused by reducing the magnitude value at times at which destructive interference occurs can be at least partially compensated, and the history of the suppression degree information provides reliable information about how much compensation is appropriate.

In a preferred embodiment, the downmixer is configured to obtain temporally smoothed suppression degree information based on instantaneous suppression degree information, using an infinite impulse response smoothing operation or a moving-average smoothing operation, in order to track the suppression degree information. It has been found that such operations are well suited to tracking the suppression degree information and lead to reliable results.
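The two smoothing options named above can be sketched as follows. The first-order recursion used for the infinite impulse response variant, the default coefficient, the initial state and the window length are assumptions chosen for illustration; the function names are hypothetical.

```python
from collections import deque


def iir_smooth(q_values, p=0.9, initial=1.0):
    """First-order IIR smoothing: state = p * state + (1 - p) * q.

    ASSUMPTION: a one-pole recursion with 0 < p < 1 and an initial state of 1.0
    (no suppression observed so far).
    """
    smoothed = []
    state = initial
    for q in q_values:
        state = p * state + (1.0 - p) * q
        smoothed.append(state)
    return smoothed


def moving_average(q_values, window=4):
    """Moving average over the most recent `window` instantaneous values."""
    recent = deque(maxlen=window)  # old values drop out automatically
    averaged = []
    for q in q_values:
        recent.append(q)
        averaged.append(sum(recent) / len(recent))
    return averaged


q_history = [1.0, 0.0, 1.0, 0.0]
print(iir_smooth(q_history, p=0.5))
print(moving_average(q_history, window=2))
```

The recursive variant needs only one stored value per spectral bin, whereas the moving average needs a buffer of the window length; both yield a slowly varying estimate of how much suppression occurred recently.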

In a preferred embodiment, the downmixer is configured to map an instantaneous suppression degree value (e.g., Q(t)) to a mapped suppression degree value (e.g., Q_mapped) (which, for example, may determine by how much the magnitude value [formula] is selectively increased relative to the reference value M_R at times at which the suppression degree information Q indicates comparatively small destructive interference between the input signals) depending on the temporally smoothed suppression degree information, such that a value of the temporally smoothed suppression degree information that indicates a (past/previous) reduction of the magnitude value leads to an increase of the (current) mapped suppression degree value compared to the instantaneous (current) suppression degree value (at least for an instantaneous suppression degree value indicating comparatively small destructive interference between the input signals). Accordingly, it is possible to derive a mapped suppression degree value that is well adapted to the previous evolution of the suppression degree information.

In a preferred embodiment, the downmixer is configured to obtain an updated smoothed suppression degree value Q_smooth(t) based on the previous smoothed suppression degree value Q_smooth(t-1) and on the instantaneous (current) suppression degree value Q(t) according to the equation described herein, where p may be a constant with 0<p<1. The downmixer may also be configured to obtain the mapped suppression degree value Q_mapped(t) according to the equation described herein, where T is a constant with 0<T<1. Preferably, the relationship 0.3<=T<=0.8 holds. In addition, it can be assumed that Q(t) lies in the range between 0 and 1, taking the value 0 for comparatively large destructive interference between the input signals and the value 1 for comparatively small destructive interference between the input signals. It has been shown that calculating the mapped suppression degree value in this way leads to good results while keeping the computational complexity reasonably low.
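The mapping equation itself is not reproduced in this section, so the following is only a hypothetical sketch of a mapping with the qualitative behavior described: for instantaneous values indicating little destructive interference, a low smoothed value (past reductions) raises the mapped value above the instantaneous one. Treating T as a threshold above which the boost is applied, and the linear form of the boost, are assumptions and should not be read as the patent's formula.

```python
def map_suppression_degree(q, q_smooth, threshold=0.5):
    """Hypothetical mapping from Q(t) and Q_smooth(t) to Q_mapped(t).

    ASSUMPTION: `threshold` plays the role of the constant T (0 < T < 1);
    values of q at or below it are passed through unchanged.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must satisfy 0 < T < 1")
    if q <= threshold:
        # Strong destructive interference: keep the reduction, no boost.
        return q
    # Deficit accumulated in the past (0 when no reduction occurred).
    deficit = 1.0 - q_smooth
    # Fade the boost in linearly between the threshold and q = 1.
    weight = (q - threshold) / (1.0 - threshold)
    return q + weight * deficit


print(map_suppression_degree(1.0, 0.5))   # past reductions: mapped value exceeds 1
print(map_suppression_degree(0.75, 1.0))  # no past reductions: value unchanged
print(map_suppression_degree(0.25, 0.5))  # strong interference now: value unchanged
```

A mapped value above one corresponds to a magnitude that is increased relative to the reference value, which is how the earlier energy loss would be compensated in this sketch.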

In a preferred embodiment, the downmixer is configured to scale a magnitude value (e.g., a "reference value", which may be equal to M_R) that corresponds to the summed loudness of the spectral-domain values of the input signals, using the suppression degree value (e.g., Q_mapped), in order to obtain the magnitude value of the spectral-domain value of the downmix signal. Accordingly, the spectral-domain value of the downmix signal can be reduced (e.g., relative to the reference value) at times at which there is a high risk of interference, and can be increased (e.g., relative to the reference value) at times at which there is a low risk of interference. Accordingly, excessive artifacts can be avoided at times at which there is a high probability of destructive interference, and the energy loss can be compensated at times at which there is a low probability of destructive interference. On the other hand, the magnitude value of the spectral-domain value of the downmix signal can remain within a reasonable range, so that excessive exaggeration of the loudness in the case of constructive interference is also avoided. In addition, the concepts described herein avoid numerical problems, since strong "upscaling" of values that are close to zero (e.g., due to destructive interference) is avoided.
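The scaling step, followed by applying a phase value to the resulting magnitude, can be sketched for one spectral bin as follows. The function name and argument names are hypothetical; the reference magnitude, the mapped suppression degree value and the phase are assumed to have been computed beforehand.

```python
import cmath


def downmix_bin(reference_magnitude, q_mapped, phase):
    """Return the complex downmix value for one spectral bin.

    The magnitude is the reference magnitude (e.g., M_R) scaled by Q_mapped;
    the phase is applied afterwards to obtain a complex-valued representation.
    """
    magnitude = q_mapped * reference_magnitude
    return cmath.rect(magnitude, phase)


# Reduced magnitude (Q_mapped < 1) and increased magnitude (Q_mapped > 1).
print(abs(downmix_bin(2.0, 0.5, 0.3)))
print(abs(downmix_bin(2.0, 1.5, 0.3)))
```

Because the magnitude is derived from loudness information rather than from the complex sum of the inputs, no division by a near-zero sum is needed, which is consistent with the remark above about avoiding numerical problems.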

In a preferred embodiment, the downmixer is configured to determine a weighted sum of the spectral domain values of the input signals and to determine the phase value on the basis of this weighted sum. For example, the downmixer is configured to weight the spectral domain values of the input signals in such a way as to avoid destructive interference that exceeds a predetermined interference level. In other words, when determining the phase value, a weighting may be introduced in order to avoid excessive destructive interference. For example, such a weighting can increase the reliability of the phase values (e.g., by applying a comparatively larger weight to spectral domain values that have had a comparatively large magnitude in the past). Thus, the quality of the phase determination can be improved.

In a preferred embodiment, the downmixer is configured to determine a weighted sum of the spectral domain values of the input signals and to determine the phase value on the basis of this weighted sum. The downmixer is configured to weight the spectral domain values of the input signals depending on a time-averaged intensity (e.g., of amplitudes, energies or loudness) of the respective spectral bin in the different input signals. Therefore, a meaningful weighting can be achieved, and the reliability of the phase values can be improved.
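A minimal Python sketch of such a phase determination for a single spectral bin follows. It assumes that the time-averaged intensity is an exponentially smoothed bin energy and that this smoothed energy is used directly as the weight; the smoothing constant `alpha` and the function names are hypothetical.

```python
import cmath


def smooth_intensity(previous, current_value, alpha=0.9):
    """One step of exponential time averaging of the bin energy |X|^2."""
    return alpha * previous + (1.0 - alpha) * abs(current_value) ** 2


def weighted_phase(bin_values, weights):
    """Phase of the weighted complex sum of one bin across all inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, bin_values))
    return cmath.phase(weighted_sum)


# Input a has been strong in the past, input b weak.
history_a = [2 + 0j, 2 + 0j, 1 + 0j]
history_b = [0.1 + 0j, 0.1 + 0j, -1 + 0j]
weight_a = weight_b = 0.0
for xa, xb in zip(history_a, history_b):
    weight_a = smooth_intensity(weight_a, xa)
    weight_b = smooth_intensity(weight_b, xb)

# In the current frame the two bins cancel exactly (1 and -1), so the
# phase of an unweighted sum would be undefined. The weighting lets the
# historically stronger input dominate the phase.
phase = weighted_phase([history_a[-1], history_b[-1]], [weight_a, weight_b])
```

In this example the unweighted sum is zero, while the weighted sum keeps the phase of input a.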

An embodiment according to the invention creates an audio encoder for providing an encoded audio representation on the basis of a plurality of input audio signals. The audio encoder comprises a downmixer as described above. The downmixer is configured to provide a downmix signal on the basis of (preferably complex-valued) spectral domain representations of the plurality of input audio signals. The audio encoder is also configured to encode the downmix signal in order to obtain the encoded audio representation. It has been found that using such a downmixer in an audio encoder is particularly advantageous, since the downmixer can increase the reliability of the magnitude values and of the phase values. Accordingly, the downmix signal is well suited both for the reconstruction of audio signals at the side of an audio decoder and for direct playback. In particular, since artifacts are comparatively small when the downmix concept disclosed herein is used, the audio encoder can work with a comparatively "clean" downmix signal, which simplifies the encoding and at the same time increases the quality of the decoded audio signals.

Another embodiment according to the invention creates a method for providing a downmix signal on the basis of a plurality of (e.g., complex-valued) input signals (which may, for example, be input audio signals). The method comprises determining (e.g., calculating or estimating) a magnitude value (e.g., M_R or [symbol not reproduced]) of a spectral domain value of the downmix signal (e.g., for a given spectral bin) on the basis of loudness information of the input signals (e.g., on the basis of loudness values associated with the given spectral bin of the input signals). The method comprises determining a (preferably scalar) phase value (e.g., P_P or [symbol not reproduced]) of the spectral domain value of the downmix signal (e.g., for the given spectral bin), for example independently of the determination of the magnitude value. The method also comprises applying the phase value (e.g., P_P or [symbol not reproduced]) in order to obtain a complex number representation of the spectral domain value of the downmix signal (e.g., for the given spectral bin) on the basis of the magnitude value of the spectral domain value. This method is based on the same considerations as the downmixer described above. It should also be noted that the method may be supplemented by any of the features, functionalities and details described herein, including those described with respect to the corresponding downmixer, either individually or in combination.

Another embodiment according to the invention creates a computer program for performing the method when the computer program runs on a computer.

Brief description of the drawings

Embodiments according to the invention are described below with reference to the accompanying drawings, in which:

Fig. 1 shows a schematic block diagram of a downmixer according to an embodiment of the invention;

Fig. 2 shows a portion of a schematic block diagram of a downmixer according to another embodiment of the invention;

Fig. 3 shows a schematic block diagram of a phase value determination according to an embodiment of the invention;

Fig. 4 shows a schematic representation of three types of interference during a downmix procedure;

Fig. 5 shows a signal flow diagram of a loudness-preserving downmix according to an embodiment of the invention;

Fig. 6 shows a signal flow diagram of a loudness-preserving downmix with adaptive reference magnitudes;

Fig. 7 shows a schematic representation of the derivation of the cancellation degree of three input signals in the complex plane;

Fig. 8 shows a signal flow diagram of a loudness-preserving downmix with adaptive phase;

Fig. 9 shows a flowchart of a method for providing a downmix signal according to an embodiment of the invention;

Fig. 10 shows a schematic block diagram of an audio encoder according to an embodiment of the invention; and

Fig. 11 shows a graphical representation of examples of mapping curves that can be achieved using the different mapping concepts for loudness preservation described herein.

Detailed description of embodiments

1. Downmixer according to Fig. 1

Fig. 1 shows a schematic block diagram of a downmixer 100 according to an embodiment of the invention.

The downmixer is configured to receive a plurality of input signals 110a, 110b and to provide, on the basis thereof, a downmix signal 112. For example, the first input signal, which may be an input audio signal, may be represented by a sequence of spectral domain values (which are associated with different frequencies or spectral bins), which may, for example, be in a complex number representation. The second input signal may likewise comprise a sequence of spectral domain values (which are associated with different frequencies or spectral bins), which may be in a complex number representation.

The downmix signal 112 may be represented by a spectral domain value of the downmix signal (or, more generally, by a plurality of spectral domain values associated with different frequencies), which may be in a complex number representation.

In the following, the processing of only one spectral bin is considered. However, the spectral domain values of different spectral bins may, for example, be processed independently and identically.

The downmixer 100 comprises a magnitude value determination 120 (which can also be regarded as a magnitude value determiner). The magnitude value determination 120 is configured to determine a magnitude value 122 of the spectral domain value 112 of the downmix signal (e.g., for a given spectral bin) on the basis of loudness information of the input signals 110a, 110b (e.g., on the basis of loudness values associated with the given spectral bin of the input signals). For example, the magnitude value determination comprises a first loudness information determination 124 (or determiner), which determines the loudness of the spectral domain value of the first input signal 110a. The magnitude value determination 120 also comprises a second loudness information determination 126 (or determiner), which determines loudness information of the spectral domain value of the second input signal 110b. Furthermore, the magnitude value determination 120 typically determines the magnitude value 122 such that the magnitude value 122 (which may serve as a basis for determining the magnitude value of the spectral domain value of the downmix signal, or which may even be used directly as that magnitude value) is based on the summed loudness of the corresponding spectral domain value of the first input signal 110a and the corresponding spectral domain value of the second input signal 110b. However, the magnitude value determination 120 may include additional corrections, such that the magnitude value is adjusted, in a well-defined manner, to correspond to a loudness that is smaller or larger than the summed loudness, depending on the circumstances. It should be noted, however, that the magnitude value is typically a single scalar value that is associated with a particular spectral domain value (e.g., with a particular spectral bin).
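The loudness-based magnitude determination can be sketched for one spectral bin as follows. The text does not specify the loudness measure at this point, so this minimal Python sketch assumes that the loudness of a bin is modelled as its energy |X|^2 and that the summed loudness is mapped back to a magnitude by a square root; the function name is hypothetical.

```python
import math


def reference_magnitude(bin_values):
    """Magnitude for one bin from the summed loudness of all inputs.

    Assumption: loudness is modelled as the bin energy |X|^2.
    """
    summed_loudness = sum(abs(x) ** 2 for x in bin_values)
    return math.sqrt(summed_loudness)


# Two inputs in phase opposition: a plain complex sum cancels to zero,
# while the loudness-based magnitude ignores the phases and stays finite.
inputs = [1 + 0j, -1 + 0j]
plain_sum_magnitude = abs(sum(inputs))
loudness_magnitude = reference_magnitude(inputs)
```

The result is a single scalar per bin that does not depend on the input phases, which is why constructive and destructive interference do not affect it.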

The downmixer 100 also comprises a phase value determination 130 (or phase value determiner). Accordingly, the downmixer is configured to determine a (scalar) phase value 132 of the spectral domain value 112 of the downmix signal (e.g., for a given spectral bin). For example, the phase value determination 130 receives the first input signal 110a and the second input signal 110b, or a spectral domain value (associated with a particular spectral bin) of the first input signal 110a and a spectral domain value (associated with that spectral bin) of the second input signal 110b. For example, the phase value determination 130 (or determiner) determines the phase value 132 independently of the determination of the magnitude value 122.

Furthermore, the downmixer also comprises a phase value application 140 (which can also be regarded as a phase value applicator). Accordingly, the downmixer is configured to apply the phase value 132 in order to obtain a complex number representation of the spectral domain value 112 of the downmix signal (e.g., for a given spectral bin) on the basis of the magnitude value 122 of the spectral domain value of the downmix signal.

Generally speaking, it should be noted that the downmixer 100 can, for example, determine the magnitude value 122 and the phase value 132 independently and thereafter, as a final processing step, apply the phase value 132 in order to obtain a complex number representation of the spectral domain value of the downmix signal. For example, the phase value 132 can be used to derive an in-phase component and a quadrature component of the spectral domain value of the downmix signal from the magnitude value, such that a Cartesian representation (real part and imaginary part) of the complex-valued spectral domain value of the downmix signal is obtained. By deriving the magnitude value from the loudness information of the input signals (e.g., from the loudness values of the given spectral bin of the input signals), a good degree of numerical stability can be obtained, while both excessive loudness (which would be caused, for example, by a simple summation of the spectral domain values in the case of constructive interference) and significant drops in loudness (which would be caused by destructive interference if a simple complex-valued summation of the spectral domain values were performed) can be avoided. In addition, numerical instabilities that result from solutions applying a strong post-correction to complex-summed values can be avoided.
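The final step, applying the phase value to the magnitude value, amounts to a polar-to-Cartesian conversion. A minimal Python sketch, with a hypothetical function name, is:

```python
import math


def apply_phase(magnitude, phase):
    """Combine a scalar magnitude and a scalar phase (in radians)
    into a complex spectral domain value in Cartesian form."""
    in_phase = magnitude * math.cos(phase)    # real part
    quadrature = magnitude * math.sin(phase)  # imaginary part
    return complex(in_phase, quadrature)


# A magnitude of 2 with a phase of 90 degrees lies on the imaginary axis.
downmix_bin = apply_phase(2.0, math.pi / 2)
```

The standard library function `cmath.rect(magnitude, phase)` performs the same conversion; the components are written out here to mirror the in-phase and quadrature components named in the text.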

In conclusion, the downmixer described with reference to Fig. 1 offers significant advantages, which result in part from the separate handling of the magnitude values 122 and the phase values 132, and also from taking loudness information into account when determining the magnitude value 122.

Furthermore, it should be noted that the downmixer 100 according to Fig. 1 may be supplemented by any of the features, functionalities and details described herein, either individually or in combination. Likewise, the features, functionalities and details described with respect to the downmixer 100 may be introduced into other embodiments, either individually or in combination.

2. Downmixer according to Fig. 2

Fig. 2 shows a portion of a schematic block diagram of a downmixer according to an embodiment of the invention.

In particular, Fig. 2 illustrates the derivation of a magnitude value 222 (which may correspond to the magnitude value 122 described with reference to Fig. 1) on the basis of a first input signal 210a (which may correspond to the first input signal 110a described with reference to Fig. 1) and of a second input signal 210b (which may correspond to the second input signal 110b described with reference to Fig. 1).

It should also be noted that the processing unit or functional block 200 shown in Fig. 2 may, for example, take the place of the magnitude value determination 120 (magnitude value determiner) shown in Fig. 1.

The functional block 200 comprises a reference magnitude value determination 220 (or reference magnitude value determiner), the functionality of which may generally be similar to that of the magnitude value determination 120 (magnitude value determiner). For example, the reference magnitude value determiner 220 may be configured to provide a reference magnitude value 221 on the basis of the first input signal 210a and the second input signal 210b. For example, the reference magnitude value determination 220 may derive the reference magnitude value 221 of the spectral domain value of the downmix signal (which may be regarded as an unmodified reference value) from the loudness information of the input signals 210a, 210b. For example, the reference magnitude value 221 may be a scalar value that is associated with a given spectral bin of the downmix signal, and it may be based on a loudness value associated with the given spectral bin of the first input signal 210a and on a loudness value associated with the given spectral bin of the second input signal 210b. Accordingly, the reference magnitude value of the spectral domain value may, for example, correspond to a loudness that is greater than the smallest loudness value (e.g., of the given spectral bin of the input signals) and that is typically even greater than the largest loudness value of the given spectral bin of the input signals 210a, 210b. In other words, the reference magnitude value 221 is typically not very small unless the given spectral bin has a very low signal intensity in both input signals 210a, 210b. On the other hand, the reference magnitude value 221 typically does not take an excessively large value either, since it is based on the loudness information of all input signals. Preferably, the reference magnitude value 221 is not affected by constructive and destructive interference between the input signals, which would arise if the phases of the input signals were taken into account when determining the reference magnitude value. Rather, the reference magnitude value may, for example, reflect a summation of the loudness in the given spectral bin of the input signals under consideration.

Accordingly, the magnitude reference value 221 is a good basis for possible corrections, since it can be assumed to lie within a numerically reasonable range and can therefore be scaled down or scaled up without causing numerical instabilities.

The functional block 200 also comprises a suppression degree computation 230, which is configured to receive the input signals 210a, 210b (or at least the spectral-domain values of the given spectral bin under consideration). The suppression degree computation 230 provides suppression degree information 232, which generally describes how strong the suppression (destructive interference) would be if the spectral-domain values of the given spectral bin of the input signals were summed as complex numbers (i.e., taking into account their phases and possible suppression effects). Various mechanisms may be used to compute the suppression degree information 232 (which may be regarded as current or instantaneous suppression degree information and which may be associated with the given spectral bin under consideration). In a preferred approach, however, the suppression degree information 232, which is also denoted by Q, takes a value close to 0 if there is a high degree of suppression and a value close to 1 if there is a low degree of suppression (for example, in the given spectral bin under consideration).
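One possible mechanism with the stated behavior (a value close to 0 for strong suppression, close to 1 for weak suppression) can be sketched as follows. The ratio used here, the magnitude of the complex sum divided by the sum of the magnitudes, is an assumption chosen for illustration and is not necessarily the equation used for Q in this description; the function name and the handling of silent bins are likewise hypothetical.

```python
def suppression_degree(bin_values, eps=1e-12):
    """Hypothetical Q in [0, 1]: |sum of values| / sum of |values|."""
    total_magnitude = sum(abs(x) for x in bin_values)
    if total_magnitude < eps:
        # Silent bin: nothing can cancel (a convention chosen for this sketch).
        return 1.0
    return abs(sum(bin_values)) / total_magnitude

q_aligned = suppression_degree([1 + 0j, 0.5 + 0j])    # same phase
q_opposed = suppression_degree([1 + 0j, -1 + 0j])     # opposite phase
q_quadrature = suppression_degree([1 + 0j, 0 + 1j])   # 90 degrees apart
```

By the triangle inequality this ratio cannot exceed 1, so it is bounded to the range between 0 and 1 that the description mentions for the suppression degree information 232.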

The suppression degree information 232 may, for example, be used to scale the magnitude reference value 221 in order to derive the (scaled) magnitude value 222 of the spectral-domain value. However, even though it would be possible to use the suppression degree information 232 directly to scale the magnitude reference value 221, additional processing, which is described below, is preferred.

In a preferred embodiment, the functional block 200 also comprises a mapping 240 (or mapping module), which receives the (instantaneous/current) suppression degree information (which describes the degree of suppression in the spectral bin under consideration, associated with the time block that is currently being processed) and provides a mapped suppression degree value 242 (or mapped suppression degree information) on the basis thereof. For example, the mapped suppression degree value is provided to a scaling (or scaler 260), which scales the magnitude reference value 221 on the basis of the mapped suppression degree value 242 to thereby derive the magnitude value 222 of the spectral-domain value of the downmix signal.

The functional block 200 preferably comprises a temporal smoothing 250/history tracking, which provides suppression degree history information 252, or temporally smoothed suppression degree information, to the mapping 240/magnitude value adjustment determination. In other words, the mapping 240/magnitude value adjustment determination preferably receives both the instantaneous (current) suppression degree information 232 and the suppression degree history information 252 (which may, for example, be temporally smoothed suppression degree information). Accordingly, the mapping 240/magnitude value adjustment determination can provide the mapped suppression degree value 242 on the basis of the instantaneous (current) suppression degree information 232, wherein the instantaneous (current) suppression degree information 232 may be selectively increased in dependence on the suppression degree history information 252 to thereby derive the mapped suppression degree value 242.

For example, the suppression degree information 232 may be a value in the range between 0 and 1, such that directly scaling the magnitude reference value 221 with the suppression degree information 232 would typically result in a reduction of energy. However, it has been found that the magnitude reference value 221 should be scaled down by the scaler 260 if there is a high degree of suppression between the input signals 210a, 210b (e.g., in the spectral bin under consideration). On the other hand, it has also been found that it is not problematic to moderately "scale up" the magnitude reference value 221 at times at which there is a low degree of suppression. In other words, it has been found that the mapped suppression degree value 242 should be substantially smaller than 1 (e.g., smaller than 0.5, or even smaller than 0.3, or even smaller than 0.1) if there is currently a high degree of suppression. On the other hand, it has been found that it is not problematic if the mapped suppression degree value 242 is somewhat larger than 1 (e.g., between 1 and 1.2, or between 1 and 1.5, or even between 1 and 2) at times at which there is a low degree of suppression. Accordingly, the mapping 240/magnitude value adjustment determination selectively increases the mapped suppression degree value 242 relative to the instantaneous (current) suppression degree information 232 in dependence on the suppression degree history information 252. For example, if the instantaneous suppression degree information 232 has taken a comparatively small value over a certain period of time, the mapping 240/magnitude value adjustment determination may increase the mapped suppression degree value 242 relative to the instantaneous suppression degree information 232 (at least when there is a low degree of suppression), such that it exceeds 1 (at least at a time at which there is a low degree of suppression), to thereby at least partially compensate for the energy loss caused by the comparatively small suppression degree information 232 (which normally also results in a comparatively small mapped suppression degree value 242 that is substantially smaller than 1). On the other hand, if the instantaneous (current) suppression degree information 232 has a value close to 1, the increase of the mapped suppression degree value 242 relative to the instantaneous (current) suppression degree information 232 is typically small, since there is no need to compensate for a large energy loss in such a situation. To conclude, the degree (or amount) by which the mapped suppression degree value 242 is increased compared with the instantaneous (current) suppression degree information 232 depends on the suppression degree history information 252: the increase is comparatively large if (comparatively) large energy losses occurred in the past, and comparatively small if only (comparatively) small energy losses occurred in the past.
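The interplay of temporal smoothing 250, mapping 240 and scaler 260 described above may be sketched as follows. The smoothing recursion, the mapping formula and the parameters `alpha`, `boost` and `ceiling` are hypothetical choices that merely reproduce the qualitative behavior stated in the text (a value well below 1 during strong suppression, a moderate value above 1 once suppression ends after a period of strong suppression); they are not the equations for Q_smooth and Q_mapped given elsewhere in this description.

```python
def smooth(q_history_prev, q_now, alpha=0.9):
    """Assumed recursive temporal smoothing of the suppression degree."""
    return alpha * q_history_prev + (1.0 - alpha) * q_now

def map_suppression_degree(q_now, q_history, boost=0.5, ceiling=2.0):
    """Hypothetical mapping: raise q_now more when past suppression was stronger."""
    mapped = q_now * (1.0 + boost * (1.0 - q_history))
    return min(mapped, ceiling)

def scale_reference(reference, q_now, q_history):
    """Scaler: magnitude value = mapped suppression degree * reference value."""
    return map_suppression_degree(q_now, q_history) * reference

# Ten frames of strong suppression followed by one frame without suppression.
q_frames = [0.1] * 10 + [1.0]
q_history = 1.0  # assume no suppression before the first frame
gains = []
for q_now in q_frames:
    gains.append(map_suppression_degree(q_now, q_history))
    q_history = smooth(q_history, q_now)  # history only covers past frames
```

With these assumed parameters, the gains stay below 0.5 during the ten suppressed frames and rise to roughly 1.29 in the final frame, which partially compensates for the energy lost earlier. When the history shows no suppression, the gain for an unsuppressed frame is exactly 1.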

Typically, comparatively small suppression degree information (close to 0, indicating a high degree of suppression) also results in a comparatively small mapped suppression degree value 242 (which is substantially smaller than 1). On the other hand, if the instantaneous suppression degree information has a value close to 1 (indicating a low degree of suppression), the mapped suppression degree value 242 may be smaller than 1 or may also be larger than 1, for example if the instantaneous suppression degree information took a value substantially smaller than 1 during a certain preceding period of time. Accordingly, the magnitude value 222 of the spectral-domain value, which is obtained by the scaler 260, is typically smaller than the magnitude reference value 221 if there is a high degree of suppression, and is typically even larger than the magnitude reference value 221 if there is a low degree of suppression and there was a high degree of suppression during a certain preceding period of time.

As mentioned above, the functional block 200 may, for example, replace the magnitude value determination 120 (or magnitude value determination module) of FIG. 1 in some embodiments of the invention.

Moreover, it should be noted that the functional block 200 can be supplemented by any of the features, functionalities and details described herein, also with respect to other embodiments. Such features, functionalities and details can be added to the functional block 200 individually or in combination. In particular, the equations described herein for computing the instantaneous (current) suppression degree information Q, the suppression degree history information Q_smooth, the mapped suppression degree information Q_mapped, the magnitude reference value M_R and the (scaled) magnitude value may optionally be used when implementing the functionality of the functional block 200. However, it should be noted that it is sufficient if one or more of said equations are used, and that it is not necessary to use all of these equations in combination.

3. Phase value determination according to FIG. 3

FIG. 3 shows a schematic representation of a phase value determination according to an embodiment of the present invention. The phase value determination according to FIG. 3 is designated in its entirety with 300. It should be noted that the phase value determination 300 may optionally replace the phase value determination 130 in the downmixer 100 according to FIG. 1. It should also be noted that the phase value determination 300 may optionally be used in combination with the functional block 200 (which may replace the block 120 in the downmixer 100 according to FIG. 1). However, the phase value determination 300 may also be used in combination with the magnitude value determination 120.

At reference numeral 310, a time-frequency domain representation of a first input signal (e.g., an input audio signal) is shown. An abscissa 312 describes the time and an ordinate 313 describes the frequency. Accordingly, time-frequency bins are shown. For example, three time-frequency bins 314a, 314b, 314c are highlighted, which are associated with a frequency f₄ (or a frequency range, or a frequency bin) and with times t₁, t₂, t₃ (or time portions, or frames).

Similarly, at reference numeral 320, a graphical representation of a time-frequency domain representation of a second input signal is shown. An abscissa 322 describes the time and an ordinate 323 describes the frequency. Spectral bins 324a, 324b, 324c (e.g., at frequency f₄ and at times t₁, t₂, t₃) are highlighted, wherein, for example, a complex-valued spectral-domain value is associated with each of the spectral bins 324a, 324b, 324c.

Similarly, the schematic representation at reference numeral 330 shows a time-frequency domain representation of a third input signal. An abscissa 332 describes the time and an ordinate 333 describes the frequency. Three spectral bins 334a, 334b, 334c at frequency f₄ and at times t₁, t₂, t₃ are highlighted.

In the following, the processing that may be performed by the phase value determination (e.g., by the phase value determination 130 or phase value determination module) is described. For example, a first averaging 360 (or first averaging module) may form an average (e.g., of an intensity, an energy or a loudness) over the spectral-domain values of a plurality of spectral bins that are associated with the same frequency and with subsequent times. The averaging may be a sliding-window averaging or a recursive averaging (based on a finite impulse response). Moreover, it should be noted that the averaging may, for example, average the complex spectral-domain values or may average the magnitudes or loudness values of the spectral-domain values. Accordingly, the averaging module 360 provides a weight value 362.

Similarly, a second averaging 370 (or second averaging module) determines a temporal average (e.g., of an intensity, an energy or a loudness) of the spectral-domain values associated with the spectral bins 324a-324c of the second input signal, to thereby obtain a weight value 372 for the second input signal.

Moreover, a third averaging 380 (or third averaging module) determines a temporal average (e.g., of an intensity, an energy or a loudness) of the spectral-domain values associated with the spectral bins 334a-334c of the third input signal, to thereby obtain a weight value 382 for the third input signal.

In other words, the first averaging 360, the second averaging 370 and the third averaging 380 may perform similar or identical functionalities but operate on the spectral-domain values of different input signals.
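The two averaging variants mentioned above may be sketched as follows for the values of one frequency bin at times t₁, t₂, t₃. Averaging the magnitudes (rather than energies, loudness values or the complex values themselves), the window length and the smoothing factor `alpha` are assumptions made for this example, and the function names are hypothetical. Each function expects a non-empty history.

```python
def sliding_window_weight(bin_history, window=3):
    """Mean magnitude over the most recent `window` values of one bin."""
    recent = bin_history[-window:]
    return sum(abs(x) for x in recent) / len(recent)

def recursive_weight(bin_history, alpha=0.8):
    """Recursively smoothed magnitude: w <- alpha * w + (1 - alpha) * |x|."""
    weight = 0.0
    for x in bin_history:
        weight = alpha * weight + (1.0 - alpha) * abs(x)
    return weight

# Values of the same frequency bin at three successive times, per input signal.
first = [2 + 0j, 0 + 2j, -2 + 0j]          # strong, with changing phase
second = [0.5 + 0j, 0.5 + 0j, 0.5 + 0j]    # weak, with constant phase
third = [0j, 0j, 0j]                       # silent
weights = [sliding_window_weight(h) for h in (first, second, third)]
```

Because magnitudes are averaged, the changing phase of the first signal does not reduce its weight. The same function is applied to every input signal, which mirrors the statement that the three averagings perform identical functionality on different inputs.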

The phase value determination 300 also comprises a scaling or weighting 364 of a current spectral-domain value of the first input signal (or derived from the first input signal) using the weight value 362, to thereby obtain a scaled spectral-domain value 366 of the first input signal. Similarly, the phase value determination comprises a second scaling or weighting 374, in which a current spectral-domain value of the second input signal (e.g., associated with the currently processed spectral bin) is scaled using the weight value 372 derived from the second input signal. Accordingly, a weighted spectral-domain value 376 of the second input signal is obtained. Similarly, the phase value determination 300 comprises a third scaling or weighting 384, which scales a current spectral-domain value of the third input signal using the weight value 382 of the third input signal, to thereby obtain a scaled spectral-domain value 386 of the third input signal.

The phase value determination 300 also comprises a combination 390 of the scaled spectral-domain value 366 of the first input signal, the scaled spectral-domain value 376 of the second input signal and the scaled spectral-domain value 386 of the third input signal. For example, a summing combination is performed, wherein it should be noted that the scaled complex values (e.g., in a Cartesian representation comprising a real component and an imaginary component) are combined. Accordingly, the combination 390 yields a weighted sum 392, which is typically a complex value and is typically in a Cartesian representation (with a real component and an imaginary component). The phase value determination 300 also comprises a phase computation 396, in which the phase value of the weighted sum 392 is computed and provided as the phase value 398. The phase value 398 may, for example, correspond to the phase value 132 described with reference to FIG. 1 and may be used by the phase value application 140.
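The weighting, the summing combination and the phase computation may be sketched as follows for one spectral bin. The function name, the example weights and the example magnitude of 1.5 are hypothetical; only the structure (scale each current complex value by its weight, sum in Cartesian form, take the phase of the sum, apply that phase to a magnitude) follows the description.

```python
import cmath

def phase_from_weighted_sum(current_values, weights):
    """Phase of sum_i w_i * x_i for one spectral bin."""
    weighted_sum = sum(w * x for w, x in zip(weights, current_values))
    return cmath.phase(weighted_sum)

# Two inputs of equal current magnitude, 90 degrees apart in phase.
current = [cmath.rect(1.0, 0.0), cmath.rect(1.0, cmath.pi / 2)]

# Equal weights place the phase halfway between the two inputs.
equal_phase = phase_from_weighted_sum(current, [1.0, 1.0])

# A much larger weight for the first input pulls the phase toward that input.
weighted_phase = phase_from_weighted_sum(current, [10.0, 1.0])

# Applying the phase to a magnitude yields a complex downmix value for the bin.
downmix_value = cmath.rect(1.5, weighted_phase)
```

Here `equal_phase` is about π/4, while `weighted_phase` is about 0.1 rad, i.e., close to the phase of the more heavily weighted input.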

The phase value determination 300 is based on the idea that the current spectral-domain value of an input signal that has been comparatively strong in the past (e.g., compared with the other input signals, and in spectral bins associated with earlier times but with the same frequency as the current spectral-domain value) should be weighted more strongly in the phase computation 396 than the spectral-domain values of one or more input signals that have been comparatively weaker in the past (e.g., in spectral bins having the same frequency as the current spectral-domain value but associated with earlier times). It has been found that such a concept reduces the probability that the phase value 398 contains a large error or changes rapidly, and that, as a result, (audible) artifacts in the downmix signal can be reduced or avoided by using such a phase value determination. In other words, the phase computation 396, which is performed to obtain the phase value 398, is not based on an equally weighted combination of the current spectral-domain values of the different input signals; instead, the current spectral-domain values of the different input signals are weighted in accordance with a past temporal average of the intensity, energy or loudness (e.g., in past spectral bins of the same frequency). The reliability of the phase computation is thereby increased.

Nevertheless, it should be noted that any of the features, functionalities, and details described herein, for example with respect to the determination of phase values, may also be used in conjunction with the phase value determination 300, either individually or in combination. Furthermore, it should be noted that the phase value determination 300 may optionally be introduced into any of the other embodiments described herein.

4. Embodiment according to FIG. 5

An embodiment of a downmixer is described below with reference to FIG. 5.

FIG. 5 shows a schematic block diagram of a downmixer 500 according to an embodiment of the invention. The downmixer is configured to receive a plurality of input signals 500a-500n, which are also denoted by s₁ to s_N.

In addition, the downmixer 500 provides, as an output signal, a downmix signal 592, which is also denoted by s_LoudnessDMX. The downmixer 500 optionally comprises a filter bank 501, which is, for example, an analysis filter bank (or, more generally, which serves to perform an analysis). For example, the filter bank 501 may analyze the different input signals 500a-500n separately. For example, the filter bank may provide a complex-valued representation for each of the input signals 500a-500n. For example, the filter bank 501 provides a first complex-valued representation 501a based on the first input signal 500a and an N-th complex-valued representation 501n based on the N-th input signal 500n. For example, the first complex-valued representation 501a may comprise a plurality of spectral values, e.g., one for each spectral bin. The individual spectral values may be complex-valued and may, for example, be represented in Cartesian form (with separate number representations for the real part and the imaginary part).

In the following, the processing is described for a single spectral bin only. It should be noted, however, that different spectral bins (associated with different frequencies) may, for example, be processed separately but using the same concept.

For example, the spectral domain representation of the considered spectral bin of the first input signal is denoted by Re₁ (number representation of the real part of the spectral domain value of the first input signal) and Im₁ (number representation of the imaginary part of the spectral domain value of the first input signal). Similarly, the spectral domain representation of the N-th input signal is denoted by Re_N (number representation of the real part of the spectral domain value of the N-th input signal) and Im_N (number representation of the imaginary part of the spectral domain value of the N-th input signal).

The downmixer also comprises a loudness estimation 503, in which the loudness is estimated separately for the different input signals. For example, the loudness value 503a of the first input signal 500a is calculated or estimated based on the number representation of the real part and the number representation of the imaginary part of the spectral domain value of the first input signal (for the considered spectral bin). Similarly, the loudness of the N-th input signal is calculated or estimated based on the number representation Re_N, Im_N of the spectral domain value of the N-th input signal (for the considered spectral bin), thereby obtaining the loudness value 503b. The individual loudness estimation blocks or units are designated 503.

Furthermore, the individual loudness values 503a, 503b, which separately represent the loudness of the individual input signals 500a-500n, are combined (e.g., summed) in the combiner 503c to obtain a sum loudness value 503d. Accordingly, the sum loudness value 503d describes the summed loudness of the input signals 500a-500n. The downmixer 500 also comprises a loudness-to-magnitude conversion 504, which receives the sum loudness value 503d and converts it into a magnitude value 505, which can be regarded as a reference magnitude M_R. The reference magnitude value 505 may be a scalar value that represents the summed loudness described by the sum loudness value 503d (but which may lie in the amplitude domain).

The downmixer 500 may optionally comprise a scaler 506, which may, however, be inactive in the embodiment of FIG. 5. Accordingly, the modified ("scaled") magnitude value 506a may be identical to the reference magnitude value 505.

The downmixer 500 also comprises a phase calculation 508. The phase calculation 508 may receive a number representation of a complex-valued sum value that combines the spectral domain values 501a-501n. For example, the number representations Re₁ to Re_N of the real parts of the spectral domain values 501a-501n may be summed (e.g., in an adder or combiner 507a) to obtain a number representation 507b (also denoted by Re_DMX) of the real part of the sum value. Similarly, the number representations Im₁ to Im_N of the imaginary parts of the spectral domain values 501a-501n are summed (e.g., by an adder or combiner 507c) to obtain a number representation 507d (also denoted by Im_DMX) of the imaginary part of the sum value.

The phase calculation 508 calculates the phase value 508a based on the number representation 507b of the real part of the sum value and on the number representation 507d of the imaginary part of the sum value. For example, the phase calculation may comprise an arctangent operation that takes into account the quadrant in which the real part and the imaginary part of the sum value are located. Thus, the phase value 508a may, for example, lie in a range between 0 and 360°, between 0 and 2π, between -180° and +180°, or between -π and +π.
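As a small illustration of why the quadrant matters, the sketch below compares a four-quadrant arctangent with a plain arctangent of the ratio Im/Re. The function name `bin_phase` and the sample values are illustrative and not taken from the description.

```python
import math

def bin_phase(re_dmx, im_dmx):
    """Four-quadrant phase of the sum value (Re_DMX, Im_DMX), in the range -pi..+pi."""
    return math.atan2(im_dmx, re_dmx)

# A sum value in the third quadrant (both parts negative).
re_dmx, im_dmx = -1.0, -1.0
phase = bin_phase(re_dmx, im_dmx)          # -3*pi/4, i.e. -135 degrees
naive = math.atan(im_dmx / re_dmx)         # +pi/4: the quadrant information is lost
phase_deg = math.degrees(phase) % 360.0    # same angle mapped to the 0..360 degree range
```

The plain arctangent cannot distinguish the third quadrant from the first, whereas `math.atan2` returns the correct angle; the final line shows the mapping between the -180°..+180° and 0..360° conventions mentioned above.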

The downmixer 500 also comprises an optional phase correction 510, which is typically inactive in the embodiment according to FIG. 5.

The downmixer 500 also comprises a phase value application/number representation reconstruction 511. The phase value application receives the magnitude value 506a (which may be identical to the reference magnitude value 505 in the present embodiment) and also receives the corrected phase value 510a, which may be identical to the phase value 508a in the present embodiment.

The phase value application 511 determines a number representation of the real part (Re_active) of the spectral domain value of the downmix signal and also a number representation of the imaginary part of the spectral domain value of the downmix signal. Accordingly, the phase value application 511 provides a number representation 511a of the real part and a number representation 511b of the imaginary part of the spectral domain value of the downmix signal.

Both number representations 511a, 511b, of the real part and of the imaginary part, are provided to an optional filter bank 502, which may be a synthesis filter bank. The filter bank 502 may be configured to provide a time domain representation 592 of the downmix signal based on the number representations of the (complex-valued) spectral domain values of the downmix signal, e.g., for a plurality of spectral bins (e.g., associated with different frequencies).

Accordingly, a downmix signal can be obtained in which the magnitude value and the phase value are processed independently (e.g., as scalar values), and in which the complex-valued number representation of the spectral domain values is formed only as a final processing step (e.g., before the resynthesis of the time domain representation).

The concept described with reference to FIG. 5 is summarized below. It should be noted that the concepts described below can be used independently of the details given above. Nevertheless, any of the details described below may also be used in combination with any of the embodiments described herein.

It should be noted that the concept can be regarded as a "loudness-preserving downmix". The new approach described herein does not simply downmix the input signals and then attempt to correct unwanted side effects afterwards. Instead, it calculates the desired (loudness-preserving) magnitude information and the phase information independently of each other, based on two different concepts.

For example, the desired (reference) magnitude is calculated directly. It is free of unwanted interference and is therefore free of any unwanted downmix (DMX) artifacts when combined with appropriate phase information. The phase information is calculated separately and is derived from a passive downmix (DMX).

FIG. 5 shows an exemplary embodiment of the invention for one frequency band (between the analysis filter bank 501 and the synthesis filter bank 502). Of course, different buffer sizes are possible. In addition, it should be noted that the cancellation degree calculation (artifact avoidance) and the mapping (loudness preservation), which are shown in FIG. 5, are not essential components of the embodiment according to FIG. 5 but should be regarded as optional extensions. Similarly, the calculation of phase correction values should be regarded as an optional addition.

Some additional explanations follow regarding the calculation of the magnitude or reference magnitude (505 or 506a) and regarding the calculation of the phase.

(Reference) magnitude

The input signals are downmixed in a loudness-preserving manner to form the magnitude M_R 505, which is shown by red/solid lines, or by lines labeled "magnitude calculation", in FIG. 5, as follows:

1. The loudness of each input signal is calculated (loudness estimation 503); the loudness may be a loudness based on the human auditory system, energy values, magnitude values, etc.;

2. The loudness values are summed;

3. The summed loudness is converted into a magnitude (loudness-to-magnitude conversion 504); for example, the square root is used for energy values;

4. Optional: weighting M_R (the reference magnitude M_R 505) yields a modified (or scaled) magnitude M_R^Mod 506a (e.g., using the scaling 506); further details are given below in the description of the loudness downmix with adaptive reference magnitude; this step may be performed to avoid potential artifacts that could be caused by erroneous phase information.
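Steps 1 to 3 above can be sketched for one spectral bin as follows, assuming that energy (the squared magnitude) is used as the loudness measure, which is only one of the options listed in step 1. The function name `reference_magnitude` is illustrative, and the optional scaling of step 4 is omitted.

```python
import math

def reference_magnitude(re_parts, im_parts):
    """Reference magnitude M_R for one spectral bin, with energy as the loudness measure."""
    # Step 1: loudness of each input signal (assumption: energy Re^2 + Im^2).
    loudness = [re * re + im * im for re, im in zip(re_parts, im_parts)]
    # Step 2: sum the loudness values.
    summed_loudness = sum(loudness)
    # Step 3: convert the summed loudness to a magnitude (square root for energies).
    return math.sqrt(summed_loudness)

# Two inputs with bin values 3 + 0j and 0 + 4j: energies 9 and 16, so M_R = 5.
m_r = reference_magnitude([3.0, 0.0], [0.0, 4.0])
```

Because only energies enter the calculation, the result does not depend on the relative phases of the inputs, which is why this magnitude is unaffected by cancellation between them.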

Phase

The phase P_P 508a (also referred to as the "passive DMX phase P_P") is extracted from the passive downmix (e.g., obtained by the combiners or adders 507a, 507c and denoted by 507b, 507d); the phase extraction is shown by blue/solid lines, or by lines labeled "phase calculation", and proceeds as follows:

1. The input signals are downmixed passively (simple summation), e.g., in the combiners or adders 507a, 507c; optionally, another type of downmix (DMX) may be used in the combiners or adders 507a, 507c; in that case, however, both the loudness summation and the additional procedures described below in the sections on the "loudness downmix with adaptive reference magnitude" and the "loudness downmix with adaptive phase" should (or must) be handled in accordance with that other type of downmix;

2. Re_DMX and Im_DMX (507b, 507d) are used to calculate the phase information (e.g., using the phase calculation 508), for example by means of a four-quadrant inverse tangent function.

3. Optional: the phase P_P 508a (also referred to as the "passive DMX phase P_P") may be modified to form a corrected or modified phase value P_P^Mod 510a (e.g., using the phase correction 510). Details are described below, for example in the section on the loudness downmix with adaptive phase. This step may be performed to produce a phase response without phase jumps.

The reference magnitude M_R (505) (or the modified magnitude value M_R^Mod 506a) and the phase P_P (508a) (or the modified phase P_P^Mod 510a) are combined in the phase value application 511, i.e., in a conversion from polar form to Cartesian form (or number representation).
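Putting the two paths together, a minimal sketch of the processing for one spectral bin could look as follows. It assumes energy as the loudness measure, a plain passive sum for the phase path, and inactive scaling and phase correction (as in the embodiment of FIG. 5); the function name `loudness_downmix_bin` is illustrative.

```python
import math

def loudness_downmix_bin(bins):
    """Downmix value for one spectral bin from a list of complex input bin values."""
    # Magnitude path: per-input energy, summed, then square root (M_R).
    m_r = math.sqrt(sum(abs(x) ** 2 for x in bins))
    # Phase path: passive downmix (plain sum), then four-quadrant arctangent (P_P).
    passive = sum(bins)
    p_p = math.atan2(passive.imag, passive.real)
    # Phase value application: polar (M_R, P_P) to Cartesian (real part, imaginary part).
    return complex(m_r * math.cos(p_p), m_r * math.sin(p_p))

# Two identical in-phase inputs: the passive sum has magnitude 2, whereas the
# loudness-preserving magnitude is sqrt(2).
out = loudness_downmix_bin([1 + 0j, 1 + 0j])
```

The complex value is formed only in the last line of the function; up to that point the magnitude and the phase are handled as independent scalars.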

5. Embodiment according to FIG. 6

FIG. 6 shows a schematic block diagram of a downmixer using a loudness downmix with adaptive reference magnitude. It should be noted that the downmixer 600 according to FIG. 6 is similar to the downmixer 500 according to FIG. 5, so that identical signals, blocks, features, and functionalities are not described again. Furthermore, identical features and signals are designated by identical reference numerals, so that reference is made to the above description.

However, in addition to the elements of the downmixer 500, the downmixer 600 comprises a cancellation degree calculation 612, which can be regarded as artifact avoidance, and a mapping 613, which can be regarded as loudness preservation. For example, the cancellation degree calculation 612 receives the spectral domain values 501a-501n (or, more precisely, their Cartesian number representations). The cancellation degree calculation 612 provides a gain value 612a, which is also denoted by Q, to the mapping 613.

The mapping 613 receives the gain value 612a (Q) and, based on it, provides a mapped gain value 613a, which is also denoted by Q_mapped, to the scaler 506. The scaler 506 scales the reference magnitude value 505 using the mapped gain value 613a to obtain the scaled magnitude value 506a, which is input to the phase value application 511. For example, the cancellation degree calculation 612 may determine the gain value 612a such that it takes a comparatively small value (e.g., a value close to zero) when there is a high degree of cancellation, and a comparatively larger value (e.g., a value close to one) when there is a comparatively small degree of cancellation between the input signals (e.g., when considering a combination of the input signals by complex-valued summation). Thus, the gain 612a is chosen to be small if a high degree of cancellation is detected (or expected), which corresponds to a high degree of unreliability of the phase value or to a risk of phase jumps. On the other hand, the gain value 612a is chosen to be comparatively large if there is a small degree of cancellation, which implies that the phase value is comparatively reliable and that there are no inappropriate phase jumps.
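The qualitative behavior of the gain value Q can be illustrated with a simple stand-in measure. The formula below (magnitude of the complex sum divided by the sum of the individual magnitudes) is purely an assumption chosen because it is close to zero under strong cancellation and close to one when the inputs add constructively; it is not the calculation defined in this document, and the mapping 613 is not modeled.

```python
def cancellation_gain(bins, eps=1e-12):
    """Hypothetical gain Q in 0..1 for one spectral bin (illustrative stand-in only)."""
    denom = sum(abs(x) for x in bins)
    if denom < eps:
        return 1.0  # assumed convention for silent input: no attenuation
    return abs(sum(bins)) / denom

q_in_phase = cancellation_gain([1 + 0j, 1 + 0j])      # inputs add constructively
q_cancel = cancellation_gain([1 + 0j, -0.99 + 0j])    # inputs almost cancel
```

Under this stand-in, the in-phase case yields a gain of one, so the reference magnitude would pass through unchanged, while the near-cancellation case yields a gain close to zero, so the unreliable phase would be applied to a strongly reduced magnitude.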

The mapping 613 helps to at least partially compensate (at least on a time average) for the energy loss caused by the reduction of the (scaled) magnitude value 506a when there is a comparatively high degree of cancellation. For example, the mapping 613 may obtain the mapped gain 613a such that the mapped gain is sometimes greater than one (e.g., when there is a comparatively small degree of cancellation and an energy loss was previously caused by comparatively small gain values Q), and such that the mapped gain value 613a is substantially less than one at other times (e.g., when there is a comparatively large degree of cancellation).

Ниже описываются подробности относительно вычисления 612 степени подавления и относительно преобразования 613. Тем не менее, также следует обратиться к вышеприведенным пояснениям, при этом вышеуказанные функциональности необязательно могут вводиться в понижающий микшер 600.Details regarding the calculation 612 of the amount of suppression and regarding the transformation 613 are described below.

Далее предоставляются некоторые дополнительные пояснения. В частности, следует отметить, что понижающий микшер 600 расширяется по сравнению с понижающим микшером 500, чтобы лучше справляться со случаем, в котором существует высокая степень подавления.Some additional explanations are provided below. In particular, it should be noted that the downmixer 600 is expanded compared to the downmixer 500 in order to better cope with the case in which there is a high degree of suppression.

Тем не менее, в общем, можно сказать, что понижающий микшер 600 согласно фиг. 6 и также понижающий микшер 800 согласно фиг. 8 предоставляют необязательные решения для частных случаев.However, in general, it can be said that the down mixer 600 of FIG. 6 and also the downmixer 800 of FIG. 8 provide optional solutions for particular cases.

As already mentioned above (e.g., in the discussion of the case in which both vectors have similar magnitudes and an angular difference of approximately 180 degrees; see FIG. 4c), summing the input signals can lead to very strong cancellations and produce strong phase jumps. In this case, the combination of the reference magnitude M_R 505 with the erroneous phase information P_P 508a causes audible artifacts.
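The phase jump mentioned here can be illustrated numerically. The following is a minimal sketch, assuming two unit-magnitude spectral values; the function name `sum_phase_deg` and the chosen angles are illustrative and do not appear in the description. It shows that a 2 degree change in the angle between the two inputs flips the phase of their sum by almost 180 degrees while the sum itself is nearly cancelled.

```python
import cmath
import math

def sum_phase_deg(angle_deg):
    """Return (phase in degrees, magnitude) of 1 + exp(j*angle).

    Both inputs have magnitude 1; angle_deg is the angle between them.
    """
    z1 = complex(1.0, 0.0)
    z2 = cmath.rect(1.0, math.radians(angle_deg))
    s = z1 + z2
    return math.degrees(cmath.phase(s)), abs(s)

# Angular difference slightly below and slightly above 180 degrees.
phase_a, mag_a = sum_phase_deg(179.0)
phase_b, mag_b = sum_phase_deg(181.0)

print(f"179 deg: phase of sum = {phase_a:.2f} deg, magnitude = {mag_a:.4f}")
print(f"181 deg: phase of sum = {phase_b:.2f} deg, magnitude = {mag_b:.4f}")
```

Applying such a phase to a reference magnitude that does not shrink with the cancellation would carry the jump into the output at full level, which is the artifact described above.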

To overcome these artificially generated artifacts, two solutions are presented herein (e.g., with reference to FIGS. 6 and 8). The first solution attenuates the artifacts below the threshold of audibility by lowering the reference magnitude. This is described in the section entitled "Loudness downmix with adaptive reference magnitude". As a second solution, which may be used as an alternative or in addition to the first, the unreliable phase response may be corrected. This is described in the section on the loudness downmix with adaptive phase.

Loudness downmix with adaptive reference magnitude

One way to overcome the artificially generated artifacts is to attenuate the reference magnitude (e.g., the reference magnitude 505) at certain points in time until the artifacts become inaudible. For this purpose, the "left wing" of the downmixer 500 of FIG. 5 is activated (shown, for example, by red/dashed lines or by the line type labeled "optional magnitude modification").

In this regard, reference is made to FIG. 6, which shows a block schematic diagram of a downmixer using a loudness downmix with adaptive reference magnitude.

In the cancellation degree calculation 612, the input signals are tapped and the degree of cancellation is calculated (or estimated). If there is no destructive interference, the gain value 612a, also designated Q, is equal to 1. In the case of complete cancellation, Q is equal to 0. This measure is used to detect potentially erroneous phase information.

In a second step, referred to as mapping 613, the degree of cancellation is mapped such that it becomes a loudness-preserving gain Q_mapped (e.g., the mapped gain 613a). Both steps (or functional blocks, or functionalities) 612, 613 are described below.

Artifact prevention / cancellation degree calculation 612

FIG. 7 shows a schematic representation of the derivation of the degree of cancellation of three input signals in the complex plane. The abscissa 710 describes the real part (or real component), and the ordinate 712 describes the imaginary part (or imaginary component). A first complex value, for example representing a spectral bin of a first input signal, is represented by a first vector 720a; a second complex value, which may for example represent a spectral bin of a second input signal, is represented by a second vector 720b; and a third complex value, which may for example represent a spectral bin of a third input signal, is represented by a third vector 720c. In other words, FIG. 7 illustrates one possible concept by way of example for three input signals, represented by the three vectors 720a, 720b, 720c in the complex plane.

The degree of cancellation is calculated separately on the imaginary axis and on the real axis, and the results are combined with an energy correction:

- The sum of the positive imaginary parts of the three vectors is calculated: sumIm+

- The sum of the negative imaginary parts of the three vectors is calculated: sumIm-

- The sum of the positive real parts of the three vectors is calculated: sumRe+

- The sum of the negative real parts of the three vectors is calculated: sumRe-

- The four sums are combined in the following equation

It should be noted, however, that a tilted coordinate system (e.g., oriented along the phase angle of the passive downmix, DMX) may also be used to calculate the degree of cancellation. Furthermore, the procedure described above may optionally calculate the degree of cancellation using an alternative formula. In some embodiments, however, it is important to calculate the degree of strong cancellations accurately so that the reference magnitude is reduced sufficiently. The four sums (i.e., the sum of the positive imaginary parts, the sum of the negative imaginary parts, the sum of the positive real parts and the sum of the negative real parts) may be combined in (or using) the following equation, for example to derive the gain value 612a:

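[Equation not reproduced in this text: the gain value Q as a function of sumIm+, sumIm-, sumRe+ and sumRe-, defined by four case distinctions.]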

Four case distinctions are made so that Q takes values between 0 and 1.
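The four sums described above are straightforward to compute. The rule that combines them is not legible in this text, so the following is only a minimal sketch under a stated assumption: `cancellation_gain` uses an illustrative ratio (the energy of the net sums on both axes divided by the energy of the sums of absolute parts), chosen solely because it yields Q = 1 without destructive interference and Q = 0 for complete cancellation. It is not claimed to be the equation of the description, and the function names are hypothetical.

```python
import math

def split_sums(bins):
    """Return (sumRe+, sumRe-, sumIm+, sumIm-) for a list of complex bins."""
    sum_re_pos = sum(z.real for z in bins if z.real > 0)
    sum_re_neg = sum(z.real for z in bins if z.real < 0)
    sum_im_pos = sum(z.imag for z in bins if z.imag > 0)
    sum_im_neg = sum(z.imag for z in bins if z.imag < 0)
    return sum_re_pos, sum_re_neg, sum_im_pos, sum_im_neg

def cancellation_gain(bins):
    """Illustrative gain Q in [0, 1]; ASSUMED formula, not the patent's equation."""
    re_pos, re_neg, im_pos, im_neg = split_sums(bins)
    # What survives the summation on each axis.
    net_re = abs(re_pos + re_neg)
    net_im = abs(im_pos + im_neg)
    # What would be present on each axis if nothing cancelled.
    tot_re = re_pos - re_neg
    tot_im = im_pos - im_neg
    denom = tot_re ** 2 + tot_im ** 2
    if denom == 0.0:
        return 1.0  # all-zero input: nothing can cancel (assumed convention)
    return math.sqrt((net_re ** 2 + net_im ** 2) / denom)

# Three example vectors, in the spirit of FIG. 7 (values are made up).
example_bins = [complex(1.0, 0.0), complex(-0.5, 0.0), complex(0.0, 1.0)]
print(split_sums(example_bins))
print(cancellation_gain(example_bins))
```

Because the net sum on each axis can never exceed the sum of absolute parts, this particular ratio stays between 0 and 1 without explicit case distinctions; a formula built on comparing the positive and negative sums would need them.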

Loudness-preserving mapping 613 - alternative 1

In the following, the mapping procedure (which may be performed by the mapping block 613) is derived by way of example for the case of energy preservation. However, it should be noted that different mapping equations are possible.

If the gain value Q were applied directly to the reference magnitude, it would reduce the energy of the reference magnitude (e.g., when Q lies between 0 and 1). This could reduce the perceived loudness of the downmix signal.

According to an aspect of the invention, the energy loss is therefore tracked and returned to the signal with a time delay. It is important that this second step 613 does not undo the reduction of the reference magnitude previously performed in step 612. Energy may only be returned if the reduction of the reference magnitude is not too strong. In particular, the following steps are performed:

- Tracking the degree of cancellation over time by smoothing with p = [0-1]:

- Mapping Q beyond the upper limit of its value range, so that values above 1, and therefore amplification, become possible:

However, it should be noted that different equations and/or tracking methods are possible.

The following remarks should be noted:

It has been found that, with a constant value of T = 0.6, a mapping of the value range of Q can be achieved that compensates for the energy loss on average. The value of the exponent T was determined empirically using a signal database of more than 125 audio signals. To this end, the energy of the reference magnitude was summed over all frequency bands (in the audible range) and compared with the summed energy of the modified magnitude processed with Q_mapped, and the difference was minimized over T. Nevertheless, the exponent T may be changed if a different mapping effect is desired.

Furthermore, the smaller Q is, the less it is mapped upward, so that artifacts are not amplified.

Conversely, the larger Q is, the more it is mapped upward, and it may reach values above 1.

In some embodiments, this ensures that the more reliable the phase information is at a given time, the more energy is returned to the signal. In some embodiments, however, it may be useful to limit the amount of returned energy in order to avoid excessive amplification. For example, Q_mapped may be limited to a certain value, e.g., 1.2, 1.5, 1.8 or 2.0.
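The tracking and mapping equations of this alternative are not legible in this text. The following is therefore only a sketch under assumptions: a first-order recursive smoother with coefficient p stands in for the tracking, and the mapping `Q * (2 - Q_smooth) ** T` is an illustrative choice that merely reproduces the stated properties (a small Q is raised little, a large Q is raised more and may exceed 1, and past energy loss increases the boost). The exponent T = 0.6 and the example limits are taken from the description; the function names and the value p = 0.5 are hypothetical.

```python
def smooth_gain(q, q_smooth_prev, p=0.5):
    """First-order recursive smoothing of Q over time (ASSUMED tracking rule).

    p in [0, 1]; p = 1 follows Q immediately, small p reacts slowly.
    """
    return p * q + (1.0 - p) * q_smooth_prev

def map_gain_alt1(q, q_smooth, t_exp=0.6, cap=1.5):
    """Illustrative loudness-preserving mapping (ASSUMED formula).

    The boost grows when q_smooth is low, i.e. when energy was lost earlier,
    and it scales with q, so strongly cancelled frames are barely raised.
    """
    boost = (2.0 - q_smooth) ** t_exp
    return min(q * boost, cap)  # cap limits the returned energy

# Track a short, made-up sequence of gain values frame by frame.
q_smooth = 1.0
for q in [1.0, 0.2, 0.1, 0.9, 1.0]:
    q_smooth = smooth_gain(q, q_smooth)
    q_mapped = map_gain_alt1(q, q_smooth)
    print(f"Q={q:.2f}  Q_smooth={q_smooth:.3f}  Q_mapped={q_mapped:.3f}")
```

In this sketch the frames following a period of strong cancellation receive a gain above 1, which is how the lost energy is returned with a time delay, while the cap keeps the amplification bounded.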

Loudness-preserving mapping 613 - alternative 2

An alternative implementation of the loudness-preserving mapping 613 is described in the following.

In the following, the mapping procedure is derived by way of example for the case of energy preservation. However, different mapping equations are possible.

If Q were applied directly to the reference magnitude, it would reduce the energy of the reference magnitude. This could reduce the perceived loudness of the downmix signal. The energy loss is therefore tracked and returned to the signal with a time delay. It is important that this second step (e.g., in block 613) does not undo the reduction of the reference magnitude performed earlier (e.g., in block 612). Energy may only be returned if the reduction of the reference magnitude is not too strong.

In particular, the following steps are performed:

- Tracking the degree of cancellation over time by smoothing with p = [0-1]

However, different equations and/or tracking methods are possible.

- (Saturated) mapping of Q toward the value 1, and therefore without amplification of the reference magnitude [212]:

Generally speaking, this type of mapping attempts to preserve the original reference magnitude and attenuates it only if stronger destructive interference is detected. Although there is no amplification, the perceived overall loudness does not change. The attenuation of the reference magnitude due to stronger destructive interference is mostly masked by the signal.

The following remarks should preferably be taken into account:

- The constant gain G is the steepness of the slope and may, for example, take values between 1 and 10 (or between 0.5 and 20).

- The slope depends on the degree of cancellation on average:

- The smaller Q_smooth(t) is, the more cautious the mapping is, so as not to amplify potential artifacts.

- The larger Q_smooth(t) is, the stronger the mapping.
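The equation of this second alternative is likewise not legible in this text. As a sketch only, the function below assumes a slope of `1 + G * Q_smooth` and a hard limit at 1; this form is an illustrative choice that matches the remarks above (the slope grows with Q_smooth, G sets its steepness, and the result never exceeds 1). The name `map_gain_alt2` and the default G = 4 are hypothetical.

```python
def map_gain_alt2(q, q_smooth, g=4.0):
    """Illustrative saturating mapping toward 1 (ASSUMED formula).

    q        : current gain value Q in [0, 1]
    q_smooth : smoothed gain Q_smooth(t) in [0, 1]
    g        : constant slope steepness G, e.g. between 1 and 10
    """
    slope = 1.0 + g * q_smooth   # cautious when q_smooth is small
    return min(1.0, q * slope)   # saturation: never amplify above 1

# Same Q, different histories: little vs. much past cancellation.
print(map_gain_alt2(0.2, q_smooth=0.9))  # mapped strongly toward 1
print(map_gain_alt2(0.2, q_smooth=0.1))  # mapped cautiously
```

With this form, frames with moderate cancellation are pushed back to a gain of 1, so the reference magnitude is kept wherever possible and is reduced only when Q is small.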

FIG. 11 shows examples of mapping curves that can be obtained using the different loudness-preserving mapping concepts described herein.

In the mapping according to the first alternative, gains greater than 1 are allowed, so that the missing energy is returned to the signal in a time-delayed manner using Q_mapped.

In the mapping according to the second alternative, amplification is not allowed. Instead, an attempt is made to preserve the reference magnitude as far as possible, i.e., not to scale it down. The reference magnitude is reduced (scaled down) only if strong destructive interference occurs. Moreover, the degree of reduction still depends on Q_smooth, i.e., on the energy lost over time.

6. Downmixer according to FIG. 8

FIG. 8 shows a block schematic diagram of a downmixer according to another embodiment of the present invention.

The downmixer 800 is similar to the downmixer 500, so that identical features, functionalities and signals are not described again here. Instead, the same reference numerals are used as in the description of the downmixer 500, and reference is made to the above explanations regarding the downmixer 500.

However, in addition to the functionalities and/or blocks of the downmixer 500, the downmixer 800 also comprises a phase correction value calculation 814, which receives the complex-valued representations 501a-501n of the input signals (or of their spectral bins). The phase correction value calculation 814 may also receive the phase value 508a. It provides a phase correction value 815 (also designated W) to the combiner 510, such that the combiner 510 derives a modified phase value 510a on the basis of the phase value 508a, taking into account the phase correction value 815.

Accordingly, the phase correction value calculation 814 may, for example, determine when the phase value 508a, which may be obtained by the simple phase calculation 508 described above, deviates strongly from the actual phase value, or when the phase value 508a contains excessive phase jumps, and so on.

For example, the phase correction value calculation 814 may provide the phase correction value 815 such that there is a smooth crossfade between the phase values 508a provided by the phase calculation 508 and the corrected phase values 510a. For example, the phase correction value calculation 814 may provide the phase correction value 815 such that it transitions smoothly from zero to the required phase correction value.
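A smooth transition of the phase correction value from zero to its required value can be sketched as a linear ramp over a number of frames. This is a minimal sketch under assumptions: the description does not state how the combiner 510 combines the phase value with W, so an additive combination with wrapping is assumed here, and the function names, the ramp shape and the frame count are illustrative.

```python
import math

def wrap_phase(phi):
    """Wrap an angle in radians into the interval [-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))

def ramp_correction(w_target, n_frames):
    """Phase correction values fading linearly from 0 to w_target.

    Returns n_frames + 1 values, one per frame, starting at 0.
    """
    return [w_target * k / n_frames for k in range(n_frames + 1)]

def apply_correction(phase, w):
    """ASSUMED combiner: add the correction W to the phase and wrap the result."""
    return wrap_phase(phase + w)

# Fade in a correction of 90 degrees over 4 frames for a fixed passive phase.
passive_phase = math.radians(30.0)
for w in ramp_correction(math.radians(90.0), 4):
    corrected = apply_correction(passive_phase, w)
    print(f"W={math.degrees(w):6.2f} deg  corrected={math.degrees(corrected):7.2f} deg")
```

Spreading the correction over several frames avoids replacing one phase jump with another at the moment the correction becomes active.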

It should be noted, however, that in some embodiments the adders/combiners 507a, 507c, the phase calculation 508, the phase correction value calculation 814 and the combiner 510 may be replaced by an improved phase value calculation, which generally yields phase values of increased reliability.

For example, the determination of phase values as shown in FIG. 3 may be used continuously, or may be used to initialize the phase correction values 815, depending on the requirements.

Loudness downmix with adaptive phase

In the following, a loudness downmix with adaptive phase is described, which may be used according to an aspect of the invention.

To be able to use the reference magnitude M_R continuously, a "reliable" phase response is required. To this end, the right wing in FIG. 5 (and also in FIG. 8) is activated (shown by blue/dashed lines or by the lines labeled "optional phase modification"). In the step or functional block "phase correction value calculation" 814, the phase correction value 815 (also designated W) is calculated on the basis of the tapped input signals (e.g., on the basis of the complex-valued representations 501a-501n). The potentially erroneous phase of the passive downmix, e.g., the "passive downmix phase P_P 508a", is corrected such that noticeable artifacts (caused by phase jumps) are avoided.

The module (or functional block, or functionality) "phase correction value calculation" 814 may consist of several submodules. If there is no destructive interference between the input signals during the passive downmix, the phase correction value is close to zero. As soon as destructive interference/cancellation occurs, a value (e.g., a phase correction value) is calculated that results in a reliable phase response.

The reliable phase response is derived, for example, from an adaptively weighted summation of the input signals. For this, it may be necessary to track the loudness values of the individual signals over time. The adaptive weighting aims at creating a DMX (submix) without disturbing destructive interference. In the submix, destructive interference may be tolerated to a certain degree. This can be useful to avoid artificially generated phase jumps when re-weighting the individual input signals.

To ensure smooth transitions when switching between the passive downmix (DMX) and the submix, the phase correction may also be applied when no destructive interference/cancellation occurs. Optionally, the phase responses may be smoothed across several frequency bands to further attenuate phase jumps.

To conclude, FIG. 8 shows a block schematic diagram of a downmixer that uses a loudness downmix with adaptive phase.

For example, in the embodiment of FIG. 8, the cancellation degree calculation 612 and the mapping 613 may be inactive (or absent), while the phase correction value calculation 814 may be active.

In some embodiments, however, the cancellation degree calculation 612 and the mapping 613 may also be used together with the phase correction value calculation 814 in order to obtain good results.

It should be noted, however, that the embodiment of FIG. 8 may be supplemented by any of the features, functionalities and details disclosed herein, both individually and in combination.

7. Conclusions and general comments

In conclusion, it should be noted that concepts are described which help to reduce artifacts when providing a downmix signal based on a plurality of input signals. In particular, the problems arising from cancellations are addressed. For example, as soon as two or more pointers (or phasors, or vectors) lie outside an angular zone of 90°, cancellations occur on one or even on both axes of the coordinate system. This means that the real components or the imaginary components of the pointers (or phasors, or vectors), or both, cancel each other partially or even completely. This is how the problem of destructive interference/superposition arises. Consequently, the question of whether or not destructive interference or superposition is present is independent of the length of the sum vector, and is also independent of the question of whether or not the sum vector is longer than the longer of the two vectors.
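By way of a non-limiting illustration, the observation above can be checked numerically by summing the positive and negative real and imaginary components separately (the quantities named sumRe+, sumRe-, sumIm+ and sumIm- in the claims). The helper names `directional_sums` and `cancelling_axes` are hypothetical and are introduced for this sketch only.

```python
import cmath
import math

def directional_sums(values):
    """Return (sumRe+, sumRe-, sumIm+, sumIm-) of complex spectral values."""
    sum_re_pos = sum(v.real for v in values if v.real > 0)
    sum_re_neg = sum(v.real for v in values if v.real < 0)
    sum_im_pos = sum(v.imag for v in values if v.imag > 0)
    sum_im_neg = sum(v.imag for v in values if v.imag < 0)
    return sum_re_pos, sum_re_neg, sum_im_pos, sum_im_neg

def cancelling_axes(values):
    """List the axes on which components of opposite sign cancel each other."""
    re_pos, re_neg, im_pos, im_neg = directional_sums(values)
    axes = []
    if re_pos > 0 and re_neg < 0:
        axes.append("real")
    if im_pos > 0 and im_neg < 0:
        axes.append("imag")
    return axes

def unit(degrees):
    """Unit-length pointer at the given angle."""
    return cmath.rect(1.0, math.radians(degrees))

same_zone = cancelling_axes([unit(20), unit(70)])    # no cancellation
one_axis = cancelling_axes([unit(10), unit(170)])    # real parts cancel
both_axes = cancelling_axes([unit(45), unit(225)])   # both axes cancel

# Pointers 100 degrees apart: the sum vector is longer than either pointer,
# yet the real components still cancel partially.
pair = [unit(10), unit(110)]
sum_length = abs(sum(pair))
pair_axes = cancelling_axes(pair)
```

The last case shows that a sum vector longer than each individual pointer does not rule out partial cancellation on one axis, in line with the statement that the presence of destructive interference is independent of the length of the sum vector.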

As an additional comment, it should be noted that interference is considered only as a time average, since the processing is typically performed in the frequency domain and since signal buffers of a certain length are typically analyzed. It should be noted that a situation may arise in which constructive and destructive interference are present in the signal buffer at the same time (when considering the temporal structure of the signal). In the frequency domain, however, it is only visible which type of interference predominates in the buffer. The buffer is classified accordingly. Thus, it should be noted that the question of whether constructive or destructive interference is present can be determined as described herein. In addition, appropriate corrections of the magnitude and/or the phase can be performed, for example, when it is found that the phase value is unreliable in view of the interference.
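By way of a non-limiting illustration, the following sketch shows a buffer in which constructive and destructive interference coexist in time, while a single frequency bin reveals only which of the two predominates. The classification criterion used here (the sign of the real part of the cross term X1·conj(X2)) and the names `dft_bin` and `dominant_interference` are assumptions of this sketch and are not the detection rule of the embodiments.

```python
import cmath
import math

def dft_bin(x, k):
    """Single DFT coefficient of the real sequence x at bin index k."""
    n = len(x)
    return sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))

def dominant_interference(x1, x2, k):
    """Classify bin k of a buffer (assumed criterion: sign of Re(X1*conj(X2)))."""
    cross = dft_bin(x1, k) * dft_bin(x2, k).conjugate()
    return "constructive" if cross.real >= 0 else "destructive"

N, K = 64, 4  # buffer length and analysed bin (4 periods per buffer)
x1 = [math.sin(2 * math.pi * K * i / N) for i in range(N)]

# x2 is in phase with x1 for the first 48 samples and in anti-phase for the
# last 16 samples, so both interference types occur within one buffer.
x2_mostly_in_phase = [s if i < 48 else -s for i, s in enumerate(x1)]
x2_mostly_anti_phase = [s if i < 16 else -s for i, s in enumerate(x1)]

label_a = dominant_interference(x1, x2_mostly_in_phase, K)
label_b = dominant_interference(x1, x2_mostly_anti_phase, K)
```

With these inputs the first buffer is classified as constructive and the second as destructive, although each buffer contains segments of both kinds.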

8. Method according to Fig. 9

Fig. 9 shows a flowchart of a method 900 for providing a downmix signal based on a plurality of input signals, according to an embodiment of the invention.

The method 900 comprises determining 910 a magnitude value of a spectral-domain value of the downmix signal based on loudness information of the input signals, and the method 900 comprises determining 920 a phase value of the spectral-domain value of the downmix signal. The method 900 also comprises applying 930 the phase value in order to obtain a complex-number representation of the spectral-domain value of the downmix signal based on the magnitude value of the spectral-domain value.
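By way of a non-limiting illustration, steps 910, 920 and 930 may be sketched for a single spectral bin as follows. The sketch assumes that the loudness of an input value is its energy |X|² and that the magnitude is the square root of the summed energies; this loudness measure and the function name `downmix_bin` are assumptions of the sketch. The phase is taken from the real part and the imaginary part of the plain sum of the input values.

```python
import cmath
import math

def downmix_bin(inputs):
    """Downmix the complex spectral values of one bin into one complex value."""
    # Step 910: magnitude from loudness information.
    # Assumption: loudness = energy, summed over the inputs, then square root.
    magnitude = math.sqrt(sum(abs(x) ** 2 for x in inputs))

    # Step 920: phase from the real and imaginary part of the sum of inputs.
    total = sum(inputs)
    phase = math.atan2(total.imag, total.real)

    # Step 930: apply the phase to the magnitude (polar to Cartesian).
    return cmath.rect(magnitude, phase)

# Two orthogonal inputs: magnitude sqrt(2), phase 45 degrees.
orthogonal = downmix_bin([1 + 0j, 0 + 1j])

# Partially cancelling inputs: a plain sum would give 0.5, whereas the
# loudness-based magnitude is sqrt(1.25) with the phase of the sum (0).
cancelling = downmix_bin([1 + 0j, -0.5 + 0j])
```

Because steps 910 and 920 do not depend on each other in this sketch, they could equally be evaluated in either order or in parallel.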

The method 900 may optionally be supplemented by any of the features, functionalities and details disclosed herein, either individually or in combination.

In addition, it should be noted that steps 910 and 920 can, of course, also be performed in parallel if desired.

9. Audio encoder according to Fig. 10

Fig. 10 shows a schematic block diagram of an audio encoder 1000 according to an embodiment of the present invention.

The audio encoder 1000 is configured to provide an encoded audio representation 1012 based on a plurality of input audio signals 1010a-1010n.

The audio encoder comprises a downmixer 1020, which may correspond to any of the downmixers described above. The downmixer 1020 is configured to provide a downmix signal 1022 based on (complex-valued) spectral-domain representations of the plurality of input audio signals. In addition, the audio encoder is configured to encode the downmix signal 1022 in order to obtain the encoded audio representation 1012.

The audio encoder may use any known coding technique to encode the downmix signal, such as, for example, AAC coding or LPC-based coding. In addition, the audio encoder may optionally provide additional side information describing the downmix (e.g., the weighting of the input signals in the downmix signal), or any other side information known in the field of audio coding.

10. Implementation alternatives

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block, item or feature of a corresponding apparatus. Some or all of the method steps may be performed by (or using) a hardware apparatus, such as, for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be performed by such an apparatus.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-ray disc, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer-readable.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative to perform one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise a computer program for performing one of the methods described herein, stored on a machine-readable carrier.

In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium is typically tangible and/or non-transitory.

A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.

In some embodiments, a programmable logic device (for example, a field-programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field-programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.

The apparatus described herein may be implemented using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.

The apparatus described herein, or any components of the apparatus described herein, may be implemented at least partially in hardware and/or in software.

The methods described herein may be performed using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.

The methods described herein, or any components of the apparatus described herein, may be performed at least partially by hardware and/or by software.

The above-described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. It is the intent, therefore, that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

11. Additional conclusions

As a further conclusion, undesired effects may occur when downmixing an N-channel input signal to obtain an M-channel output signal (N>M). These effects may manifest themselves as sound coloration, manipulation of the spatial envelopment, reduced speech intelligibility and other artifacts.

To overcome these effects, a loudness-preserving downmix may be computed for the magnitude and, in parallel, a non-adaptive downmix may be computed to extract the phase information. Subsequently, the magnitude and the phase are combined to form the M-channel output signal.
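By way of a non-limiting illustration, the two parallel paths may be sketched for one spectral bin of an N-channel input that is mapped onto M output channels. The downmix weight matrix, the energy-based magnitude path and the function name `downmix_n_to_m` are assumptions made for this sketch; the non-adaptive path is a fixed weighted sum whose result is used only for its phase.

```python
import cmath
import math

def downmix_n_to_m(inputs, weights):
    """Map N complex input values of one bin onto M complex output values.

    weights is an M x N matrix of fixed (non-adaptive) real downmix gains,
    which is a hypothetical parameterisation used for illustration.
    """
    outputs = []
    for row in weights:
        # Magnitude path (assumption: loudness-preserving = energy-preserving).
        energy = sum((w * abs(x)) ** 2 for w, x in zip(row, inputs))
        magnitude = math.sqrt(energy)

        # Phase path: non-adaptive weighted sum, used only for its phase.
        passive = sum(w * x for w, x in zip(row, inputs))
        phase = cmath.phase(passive)

        # Combine magnitude and phase into the output value.
        outputs.append(cmath.rect(magnitude, phase))
    return outputs

# Three input channels (left, right, centre) mapped onto two outputs.
g = math.sqrt(0.5)
weights = [[1.0, 0.0, g],
           [0.0, 1.0, g]]
inputs = [1 + 0j, 0 + 1j, -1 + 0j]
stereo = downmix_n_to_m(inputs, weights)
```

In the first output channel the left and centre inputs are in anti-phase, so the fixed weighted sum alone would have a magnitude of about 0.29; the combined output keeps the phase of that sum but has the magnitude sqrt(1.5) obtained from the magnitude path.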

These considerations may optionally be incorporated into any of the embodiments disclosed herein.

Claims

1. A downmixer (100; 500; 600; 800; 1020) for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to determine a magnitude value (M_R, M^Mod_R; 122; 221, 222; 505, 506a) of a spectral-domain value (112; 511a, 511b) of the downmix signal based on loudness information of the input signals, and

wherein the downmixer is configured to determine a phase value (P_P, P^Mod_P; 132; 398; 508a, 510a) of the spectral-domain value of the downmix signal; and

wherein the downmixer is configured to apply the phase value (P_P, P^Mod_P; 132; 398; 508a, 510a) in order to obtain a complex-number representation (112; 511a, 511b) of the spectral-domain value of the downmix signal based on the magnitude value of the spectral-domain value of the downmix signal;

wherein the downmixer is configured to determine a sum (507b, 507d) or a weighted sum (392) of complex-valued spectral-domain values of the input signals, and

to determine the phase value (P_P, P^Mod_P; 132; 398; 508a, 510a) based on a real part and an imaginary part of the sum, or based on a real part and an imaginary part of the weighted sum, of the spectral-domain values of the input signals.

2. The downmixer according to claim 1, wherein the downmixer is configured to determine the phase value (P_P, P^Mod_P) of the spectral-domain value of the downmix signal independently of the determination of the magnitude value (M_R, M^Mod_R) of the spectral-domain value of the downmix signal.

3. The downmixer according to claim 1,

wherein the downmixer is configured to determine loudness values (503a, 503b) of spectral-domain values (110a, 110b; 210a, 210b; 501a, 501n) of the input signals, and

wherein the downmixer is configured to derive a sum loudness value (503d), associated with the spectral-domain value of the downmix signal, based on the loudness values of the spectral-domain values of the input signals; and

wherein the downmixer is configured to derive the magnitude value (M_R, M^Mod_R; 122; 221, 222; 505, 506a) of the spectral-domain value of the downmix signal from the sum loudness value.

4. The downmixer according to claim 1, wherein the downmixer is configured to use the magnitude value (M_R, M^Mod_R; 122; 221, 222; 505, 506a) of the spectral-domain value of the downmix signal as the magnitude value of a polar representation of the spectral-domain value of the downmix signal, to use the phase value (P_P, P^Mod_P; 132; 398; 508a, 510a) as the phase value of the polar representation of the spectral-domain value of the downmix signal, and to obtain a Cartesian complex-valued representation (511a, 511b) of the spectral-domain value of the downmix signal based on the polar representation.

5. The downmixer according to claim 1,

wherein the downmixer is configured to determine suppression degree information and to take the suppression degree information into account when determining the magnitude value (M^Mod_R; 222; 506a) of the spectral-domain value of the downmix signal,

wherein the suppression degree information describes a degree of constructive or destructive interference between spectral-domain values of the input signals, and

wherein the downmixer is configured to selectively reduce the magnitude value (M^Mod_R; 222; 506a) of the spectral-domain value of the downmix signal, compared to a magnitude value (M_R; 221; 505) representing a sum of loudness values of the spectral-domain values of the input signals, in case the suppression degree information indicates destructive interference.

6. The downmixer according to claim 5,

wherein the downmixer is configured to determine separate sums (sumIm+, sumIm-, sumRe+, sumRe-) of components of the spectral-domain values (110a; 110b; 210a, 210b; 501a, 501n) of the input signals having different orientations, and

wherein the downmixer is configured to determine the suppression degree information based on the separate sums (sumIm+, sumIm-, sumRe+, sumRe-) of the components of the spectral-domain values of the input signals having different orientations.

7. The downmixer according to claim 6,

wherein the downmixer is configured to select, as dominant sum values, two of the determined sums (sumIm+, sumRe+) which are associated with orthogonal orientations and which are greater than or equal to the sums associated with the opposite orientations (sumIm-, sumRe-), and

wherein the downmixer is configured to determine a scaling value, which results in the selective reduction of the magnitude value (M^Mod_R) of the spectral-domain value of the downmix signal, based on:

- an unsigned ratio between a first non-dominant sum value (sumRe-), which is associated with an orientation opposite to that of a first dominant sum value (sumRe+), and the first dominant sum value (sumRe+), and

- an unsigned ratio between a second non-dominant sum value (sumIm-), which is associated with an orientation opposite to that of a second dominant sum value (sumIm+), and the second dominant sum value (sumIm+),

such that an increase of the unsigned ratios (|sumRe-|/sumRe+, |sumIm-|/sumIm+) between a non-dominant sum value and its associated dominant sum value results in a reduction of the magnitude value (M^Mod_R) of the spectral-domain value of the downmix signal.

8. The downmixer according to claim 5, wherein the downmixer is configured to compute the suppression degree information Q according to the following equations:

if … and …: …,

if … and …: …,

if … and …: …,

if … and …: …,

wherein sumRe+ is the sum of the positive real parts of the complex-valued spectral-domain values (110a; 110b; 210a, 210b; 501a, 501n) of the input audio signals;

wherein sumRe- is the sum of the negative real parts of the complex-valued spectral-domain values of the input audio signals;

wherein sumIm+ is the sum of the positive imaginary parts of the complex-valued spectral-domain values of the input audio signals; and

wherein sumIm- is the sum of the negative imaginary parts of the complex-valued spectral-domain values of the input audio signals.

9. The downmixer according to claim 1, wherein the downmixer is configured to determine the magnitude value (M^Mod_R; 222) of the spectral-domain value of the downmix signal

such that the magnitude value (M^Mod_R) is selectively reduced with respect to a reference value (M_R; 221), which corresponds to a sum loudness of the spectral-domain values of the input signals, at times at which suppression degree information determined by the downmixer indicates comparatively large destructive interference between the input signals, and

such that the magnitude value is selectively increased with respect to the reference value (M_R) at times at which the suppression degree information indicates comparatively small destructive interference between the input signals.

10. The downmixer according to claim 9, wherein the downmixer is configured to track the suppression degree information over time and to determine, depending on the history of the suppression degree information, by how much the magnitude value is selectively increased with respect to the reference value (M_R) at times at which the suppression degree information indicates comparatively small destructive interference between the input signals.

11. The downmixer according to claim 9, wherein the downmixer is configured to obtain temporally smoothed suppression degree information based on instantaneous suppression degree information, using an infinite impulse response smoothing operation or using a moving average smoothing operation, in order to track the suppression degree information.

12. The downmixer according to claim 9, wherein the downmixer is configured to map an instantaneous suppression degree value onto a mapped suppression degree value depending on temporally smoothed suppression degree information

such that a value of the temporally smoothed suppression degree information indicating a reduction of the magnitude value leads to an increase of the mapped suppression degree value compared to the instantaneous suppression degree value.

13. The downmixer according to claim 1,

wherein the downmixer is configured to obtain an updated smoothed suppression degree value Q_smooth(t) based on a previous smoothed suppression degree value Q_smooth(t-1) and based on an instantaneous suppression degree value Q(t) according to:

,

wherein p is a constant with 0<p<1;

and wherein the downmixer is configured to obtain a mapped suppression degree value Q_mapped(t) according to:

,

wherein T is a constant with 0<T<1;

wherein Q(t) lies in a range between 0 and 1, takes a value of 0 for comparatively large destructive interference between the input signals, and takes a value of 1 for comparatively small destructive interference between the input signals;

wherein the downmixer is configured to scale a magnitude reference value (505) using the mapped suppression degree value so as to obtain the magnitude value (506a).

14. The downmixer according to claim 1,

,

wherein p is a constant with 0≤p≤1;

,

wherein G is a predetermined value or a constant value between 0.5 and 20 or between 1 and 10;

wherein m_slope(t) is an auxiliary variable;

wherein max{} is the maximum operator;

wherein min{} is the minimum operator.

15. The downmixer according to claim 1, wherein the downmixer is configured to scale a magnitude value (M_R; 221), which corresponds to a sum loudness of the spectral-domain values of the input signals, using a suppression degree value, in order to obtain the magnitude value (M^Mod_R; 222) of the spectral-domain value of the downmix signal.

16. The downmixer according to claim 1,

wherein the downmixer is configured to determine a weighted sum (392) of the spectral-domain values (110a, 110b; 210a, 210b; 501a, 501n) of the input signals and

to determine the phase value (398) based on the weighted sum of the spectral-domain values of the input signals,

wherein the downmixer is configured to weight the spectral-domain values of the input signals, in order to obtain the weighted sum, such that destructive interference exceeding a predetermined interference level is prevented.

17. The downmixer according to claim 1,

wherein the downmixer is configured to determine a weighted sum (392) of the spectral-domain values of the input signals, and

wherein the downmixer is configured to weight the spectral-domain values of the input signals, in order to obtain the weighted sum, depending on a time-averaged intensity (362, 372, 382) of the respective spectral bin in the different input signals.

18. An audio encoder (1000) for providing an encoded audio representation (1012) based on a plurality of input audio signals (1010a, 1010n),

wherein the audio encoder comprises a downmixer according to claim 1,

wherein the downmixer is configured to provide a downmix signal (1022) based on spectral domain representations of the plurality of input audio signals, and

wherein the audio encoder is configured to encode the downmix signal to obtain an encoded audio representation (1012).

19. A method (900) for providing a downmix signal based on a plurality of input signals,

wherein the method comprises determining (910) an absolute magnitude value (M _R , M ^Mod _R ) of a value in the spectral domain of the downmix signal based on the loudness information of the input signals, and

wherein the method comprises determining (920) a phase value (P _P , P ^Mod _P ) of the value in the spectral domain of the downmix signal; and

wherein the method comprises the step of applying (930) the phase value (P _P , P ^Mod _P ) to obtain a complex-valued representation of the value in the spectral domain of the downmix signal based on the absolute magnitude value of the value in the spectral domain of the downmix signal,

wherein the method comprises the step of determining the sum (507b, 507d) or the weighted sum (392) of complex-valued values in the spectral domain of the input signals, and

determining the phase value (P _P , P ^Mod _P ; 132; 398; 508a, 510a) based on the real part and the imaginary part of the sum or based on the real part and the imaginary part of the weighted sum of values in the spectral domain of the input signals.
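The three steps of claim 19 (magnitude from loudness, phase from the complex sum, phase applied to the magnitude) can be illustrated for a single spectral bin. The following is a minimal sketch, not the claimed implementation: the function name `downmix_bin` is hypothetical, and treating the loudness of each input as its squared magnitude is an assumption, since the claim only speaks of "loudness information".

```python
import cmath
import math


def downmix_bin(values):
    """Downmix one spectral bin from the complex values of several inputs."""
    # Assumption: loudness of each input value is its squared magnitude.
    summed_loudness = sum(abs(v) ** 2 for v in values)
    # Absolute magnitude value derived from the summed loudness.
    magnitude = math.sqrt(summed_loudness)
    # Phase value from the real and imaginary parts of the complex sum.
    total = sum(values)
    phase = math.atan2(total.imag, total.real)
    # Apply the phase value to the magnitude to get a complex-valued result.
    return cmath.rect(magnitude, phase)


# Two inputs in exact phase opposition: a plain sum would cancel to zero,
# whereas the loudness-based magnitude is kept.
print(abs(downmix_bin([1 + 0j, -1 + 0j])))
```

In this sketch the magnitude does not depend on the relative phases of the inputs, so cancellation in the complex sum affects only the phase estimate.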

20. A computer-readable storage medium storing a computer program for carrying out the method of claim 19 when the computer program is executed on the computer.

21. Downmixer (100; 500; 600; 800; 1020) to provide a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to scale the absolute magnitude value (M _R ; 221; 505), which represents the sum of the loudness values of the values in the spectral domain of the input signals, depending on the suppression degree information, so as to selectively reduce the absolute magnitude value (M ^Mod _R ; 222; 506a) of the value in the spectral domain of the downmix signal compared with the absolute magnitude value (M _R ; 221; 505) representing the sum of the loudness values of the values in the spectral domain of the input signals in case the suppression degree information indicates destructive interference.

22. Downmixer (100; 500; 600; 800; 1020) to provide a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to selectively reduce the absolute magnitude value (M ^Mod _R ; 222; 506a) of the value in the spectral domain of the downmix signal compared to the absolute magnitude value (M _R ; 221; 505) representing the sum of the loudness values of the values in the spectral domain of the input signals, in case the suppression degree information indicates destructive interference;

wherein the downmixer is configured to determine the sums (sumIm+, sumIm-, sumRe+, sumRe-) of the value components (110a; 110b; 210a, 210b; 501a, 501n) in the spectral domain of the input signals having different orientations, and

wherein the downmixer is configured to determine suppression degree information (Q) based on sums (sumIm+, sumIm-, sumRe+, sumRe-) of spectral domain value components of input signals having different orientations;
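The four orientation sums named in claim 22 can be computed per spectral bin as sketched below. This is only an illustration: the function name `orientation_sums` is hypothetical, and returning the negative-orientation sums as non-negative magnitudes is an assumption made here so that the four sums are directly comparable. The step that turns these sums into the suppression degree information Q is not shown.

```python
def orientation_sums(values):
    """Sum real and imaginary components of complex values by orientation."""
    return {
        # Positive real parts.
        "sumRe+": sum(v.real for v in values if v.real > 0),
        # Negative real parts, stored as a magnitude (assumption).
        "sumRe-": sum(-v.real for v in values if v.real < 0),
        # Positive imaginary parts.
        "sumIm+": sum(v.imag for v in values if v.imag > 0),
        # Negative imaginary parts, stored as a magnitude (assumption).
        "sumIm-": sum(-v.imag for v in values if v.imag < 0),
    }


sums = orientation_sums([1 + 2j, -3 + 1j, 2 - 4j])
print(sums)
```

When a sum is about as large as the sum for the opposite orientation, the inputs largely cancel along that axis.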

23. Downmixer (100; 500; 600; 800; 1020) to provide a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to selectively reduce the absolute magnitude value (M ^Mod _R ; 222; 506a) of the value in the spectral domain of the downmix signal compared to the absolute magnitude value (M _R ; 221; 505) representing the sum of the loudness values of the values in the spectral domain of the input signals in case the suppression degree information indicates destructive interference;

wherein the downmixer is configured to calculate suppression degree information Q according to the following equations:

if

and

:

,

if

and

:

,

if

and

:

,

if

and

:

,

while sumRe- is the sum of the negative real parts of complex-valued values in the spectral domain of the input audio signals;

24. Downmixer (100; 500; 600; 800; 1020) for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to determine the absolute magnitude reference value based on the plurality of input signals; and

wherein the downmixer is configured to scale the absolute magnitude reference value, which is not affected by the constructive and destructive interference of the input signals, to determine the absolute magnitude value (M ^Mod _R ; 222) of the value in the spectral domain of the downmix signal

such that the absolute magnitude value selectively increases relative to the reference value (M _R ) at times at which the suppression degree information (Q) indicates relatively small destructive interference between the input signals.

25. Downmixer (100; 500; 600; 800; 1020) to provide a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

,

while p is a constant with 0<p<1;

,

while T is a constant with 0<T<1;

while Q(t) is in the range between 0 and 1 and takes a value of 0 for relatively large destructive interference between input signals and takes a value of 1 for relatively smaller destructive interference between input signals;

26. Downmixer (100; 500; 600; 800; 1020) for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

,

while p is a constant with 0≤p≤1;

,

while m _slope (t) is an auxiliary variable;

while max{} is the maximum operator;

while min{} is the minimum operator;

27. Downmixer (100; 500; 600; 800; 1020) for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to weight values in the spectral domain of the input signals in such a manner as to prevent destructive interference that exceeds a predetermined interference level to obtain a weighted sum;

wherein the downmixer is configured to derive the absolute magnitude value (M _R , M ^Mod _R ; 122; 221, 222; 505, 506a) of the value in the spectral domain of the downmix signal from the summed loudness value.

28. Downmixer (100; 500; 600; 800; 1020) to provide a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the downmixer is configured to weight values in the spectral domain of the input signals depending on the time-averaged intensity (362, 372, 382) of the corresponding spectral bin in the various input signals using the weight values to obtain a weighted sum;

wherein the downmixer is configured to derive the absolute magnitude value (M _R , M ^Mod _R ; 122; 221, 222; 505, 506a) of the value in the spectral domain of the downmix signal from the summed loudness value;

wherein the downmixer is configured to generate an average of the values in the spectral domain of the plurality of spectral bins of the first of the input signals that are associated with the same frequency and that are associated with subsequent times to obtain the first of the weight values (362) for the first input signal, and

wherein the downmixer is configured to generate an average of the values in the spectral domain of the plurality of spectral bins of the second of the input signals that are associated with the same frequency and that are associated with subsequent times to obtain the second of the weight values (372) for the second input signal.
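The weighting by time-averaged intensity described in claim 28 can be sketched for a single frequency bin as follows. The names `time_averaged_weights` and `weighted_phase` are hypothetical. Taking the intensity as the squared magnitude and averaging with a plain arithmetic mean over a short history are assumptions, since the claim does not fix either choice.

```python
import math


def time_averaged_weights(history_per_input):
    """Return one weight per input for a single frequency bin.

    history_per_input[i] holds the complex values of input i for the same
    frequency at successive times.
    """
    weights = []
    for history in history_per_input:
        if not history:
            weights.append(0.0)  # no data yet: input contributes nothing
            continue
        # Assumption: intensity is the squared magnitude, averaged over time.
        weights.append(sum(abs(v) ** 2 for v in history) / len(history))
    return weights


def weighted_phase(current_values, weights):
    """Phase of the weighted sum of the current spectral-domain values."""
    total = sum(w * v for w, v in zip(weights, current_values))
    return math.atan2(total.imag, total.real)


weights = time_averaged_weights([[2 + 0j, 2 + 0j], [0 + 1j, 0 + 1j]])
# The inputs are in phase opposition, but the stronger input dominates.
print(weights, weighted_phase([1 + 0j, -1 + 0j], weights))
```

Because the historically stronger input receives the larger weight, the weighted sum in this sketch does not cancel even when the instantaneous values are in exact phase opposition.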

29. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises the step of determining the absolute magnitude value (M _R , M ^Mod _R ; 122; 221, 222; 505, 506a) of the value (112; 511a, 511b) in the spectral domain of the downmix signal based on the loudness information of the input signals, and

wherein the method comprises the step of determining the phase value (P _P , P ^Mod _P ; 132; 398; 508a, 510a) of the value in the spectral domain of the downmix signal; and

wherein the method comprises applying the phase value (P _P , P ^Mod _P ; 132; 398; 508a, 510a) to obtain a complex-valued representation (112; 511a, 511b) of the value in the spectral domain of the downmix signal based on the absolute magnitude value of the value in the spectral domain of the downmix signal;

wherein the method comprises determining the suppression degree information and considering the suppression degree information when determining the value (M ^Mod _R ; 222; 506a) of the absolute value of the value in the spectral domain of the downmix signal,

wherein the method comprises the step of scaling the absolute magnitude value (M _R ; 221; 505) representing the sum of the loudness values of the values in the spectral domain of the input signals, depending on the suppression degree information, in order to selectively reduce the absolute magnitude value (M ^Mod _R ; 222; 506a) of the value in the spectral domain of the downmix signal compared to the absolute magnitude value (M _R ; 221; 505) representing the sum of the loudness values of the values in the spectral domain of the input signals in case the suppression degree information indicates destructive interference.

30. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises the step of selectively reducing the absolute magnitude value (M ^Mod _R ; 222; 506a) of the value in the spectral domain of the downmix signal compared to the absolute magnitude value (M _R ; 221; 505) representing the sum of the loudness values of the values in the spectral domain of the input signals in case the suppression degree information indicates destructive interference;

wherein the method comprises the step of determining the sums (sumIm+, sumIm-, sumRe+, sumRe-) of the components of the values (110a; 110b; 210a, 210b; 501a, 501n) in the spectral domain of the input signals having different orientations, and

wherein the method comprises determining suppression degree information based on the sums (sumIm+, sumIm-, sumRe+, sumRe-) of spectral domain value components of input signals having different orientations;

wherein the method comprises selecting, as dominant sum values, two of the determined sums (sumIm+, sumRe+) that are associated with orthogonal orientations and that are greater than or equal to the sums associated with the opposite directions (sumIm-, sumRe-), and

wherein the method comprises determining a scaling value that results in a selective decrease in the value (M ^Mod _R ) of the absolute magnitude of the value in the spectral domain of the downmix signal based on:

31. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises the step of calculating suppression degree information Q according to the following equations:

if

and

:

,

if

and

:

,

if

and

:

,

if

and

:

,

32. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises determining an absolute magnitude reference value based on a plurality of input signals; and

wherein the method comprises the step of scaling the absolute magnitude reference value, which is not affected by the constructive and destructive interference of the input signals, to determine the absolute magnitude value (M ^Mod _R ; 222) of the value in the spectral domain of the downmix signal

33. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises the step of obtaining an updated smoothed value Q _smooth (t) of the suppression degree based on the previous smoothed value Q _smooth (t-1) of the suppression degree and based on the instantaneous value Q(t) of the suppression degree according to:

,

while p is a constant with 0<p<1;

and wherein the method comprises the step of obtaining the transformed value Q _mapped (t) of the degree of suppression according to:

,

while T is a constant with 0<T<1;

wherein the method comprises scaling the absolute magnitude reference value (505) using the transformed suppression degree value to obtain the absolute magnitude value (506a).
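The equations referenced in claim 33 are not reproduced in this text. Purely as an illustration of a recursive update that combines a previous smoothed value Q _smooth (t-1) with an instantaneous value Q(t) using a constant p, the sketch below assumes one common first-order form, Q _smooth (t) = p·Q _smooth (t-1) + (1-p)·Q(t). This form, the function name `smooth_suppression` and the starting value `q_init` are assumptions and may differ from the claimed equation. The mapping that uses the constant T is not sketched.

```python
def smooth_suppression(q_values, p, q_init=1.0):
    """First-order recursive smoothing of suppression degree values.

    Assumed form (not taken from the claim):
        Q_smooth(t) = p * Q_smooth(t-1) + (1 - p) * Q(t)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    smoothed = []
    previous = q_init  # hypothetical starting value
    for q in q_values:
        previous = p * previous + (1.0 - p) * q
        smoothed.append(previous)
    return smoothed


# A sudden drop of Q to 0 is followed only gradually.
print(smooth_suppression([0.0, 0.0, 0.0], p=0.5))
```

With this assumed form, a larger p makes the smoothed value follow the instantaneous value more slowly.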

34. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

,

while p is a constant with 0≤p≤1;

,

while m _slope (t) is an auxiliary variable;

while max{} is the maximum operator;

while min{} is the minimum operator;

35. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises the step of determining the weighted sum (392) of the values (110a, 110b; 210a, 210b; 501a, 501n) in the spectral domain of the input signals and

determining the phase value (398) based on the weighted sum of the values in the spectral domain of the input signals,

wherein the method comprises the step of weighting values in the spectral domain of the input signals in such a way as to prevent destructive interference that exceeds a predetermined interference level to obtain a weighted sum;

wherein the method comprises the step of determining the loudness values (503a, 503b) of the values (110a, 110b; 210a, 210b; 501a, 501n) in the spectral domain of the input signals, and

wherein the method comprises deriving a summed loudness value (503d) associated with a spectral domain value of the downmix signal based on the loudness values of the spectral domain values of the input signals; and

wherein the method comprises the step of deriving the absolute magnitude value (M _R , M ^Mod _R ; 122; 221, 222; 505, 506a) of the value in the spectral domain of the downmix signal from the summed loudness value.

36. A method for providing a downmix signal (592; 1022) based on a plurality of input signals (110a, 110b; 210a, 210b; 500a, 500n, 1010a, 1010n),

wherein the method comprises the step of determining the weighted sum (392) of values in the spectral domain of the input signals and

wherein the method comprises the step of weighting values in the spectral domain of the input signals depending on the time-averaged intensity (362, 372, 382) of the corresponding spectral bin in the various input signals using the weight values to obtain a weighted sum;

wherein the method comprises the step of deriving the absolute magnitude value (M _R , M ^Mod _R ; 122; 221, 222; 505, 506a) of the value in the spectral domain of the downmix signal from the summed loudness value;

wherein the method comprises the step of forming an average over the values in the spectral domain of a plurality of spectral bins of the first of the input signals that are associated with an identical frequency and that are associated with subsequent times to obtain the first of the weight values (362) for the first input signal, and

wherein the method comprises the step of forming an average of values in the spectral domain of a plurality of spectral bins of the second of the input signals that are associated with the same frequency and that are associated with subsequent times to obtain the second of the weight values (372) for the second input signal.

37. A computer-readable storage medium storing a computer program for carrying out the method of claim 29 when the computer program is executed on the computer.

38. A computer-readable storage medium storing a computer program for carrying out the method of claim 30 when the computer program is executed on the computer.

39. A computer-readable storage medium storing a computer program for carrying out the method of claim 31 when the computer program is executed on the computer.

40. A computer-readable storage medium storing a computer program for carrying out the method of claim 32 when the computer program is executed on the computer.

41. A computer-readable storage medium storing a computer program for carrying out the method of claim 33 when the computer program is executed on the computer.

42. A computer-readable storage medium storing a computer program for carrying out the method of claim 34 when the computer program is executed on the computer.

43. A computer-readable storage medium storing a computer program for carrying out the method of claim 35 when the computer program is executed on the computer.

44. A computer-readable storage medium storing a computer program for carrying out the method of claim 36 when the computer program is executed on the computer.