RU2791673C1 - Downmix device and downmix method

Info

Publication number: RU2791673C1
Application number: RU2021128913A
Authority: RU
Inventors: Franz REUTELHUBER; Bernd EDLER; Eleni FOTOPOULOU; Markus MULTRUS; Pallavi MABEN; Sascha DISCH
Original assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Priority date: 2019-03-06
Filing date: 2020-03-04
Publication date: 2023-03-13

Abstract

FIELD: computer technology.

SUBSTANCE: the invention relates to computer technology for processing audio signals. Band-wise weight values are estimated for at least two channels. They are calculated from a target energy value per frequency band such that the energy in a frequency band of the downmixed audio signal is in a predetermined ratio to the energies in the same frequency band of the at least two channels. The spectral-domain representations of the channels are then weighted with the band-wise weight values to obtain weighted spectral-domain representations.

EFFECT: reduced delay when downmixing a multichannel audio signal.

50 cl, 14 dwg

Description

The present invention relates to audio signal processing and, in particular, to downmixing multichannel signals or converting the spectral resolution of audio signals.

Although a stereo-encoded bitstream is usually decoded for playback on a stereo system, not every device that can receive a stereo bitstream can also output a stereo signal. One possible scenario is playback of a stereo signal on a mobile phone that has only a mono loudspeaker. With the multichannel mobile communication scenarios supported by the emerging 3GPP IVAS standard, a stereo-to-mono downmix is needed that adds no extra delay, is as efficient as possible in terms of complexity, and delivers better perceptual quality than a simple passive downmix can achieve.

There are several ways to convert a stereo signal into a mono signal. The most direct is passive downmixing [1] in the time domain, which generates a mid signal by adding the left and right channels and scaling the result.
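The formula for the passive downmix is not reproduced in this text. Under assumed notation (the symbols l, r, m and the gain g are illustrative and not taken from the source), a passive time-domain downmix of this kind can be written as:

```latex
m[n] = g \,\bigl( l[n] + r[n] \bigr), \qquad \text{for example } g = \tfrac{1}{2}
```

Here l[n] and r[n] denote the left and right channel samples and m[n] the resulting mid signal. The value of the gain is an assumption; the source only states that the sum is scaled.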

More sophisticated (i.e. active) time-domain downmixing methods include energy scaling to preserve the overall signal energy [2] [3], phase alignment to avoid cancellation effects [4], and prevention of comb-filter effects by coherence suppression [5].

Another approach performs a frequency-dependent energy correction by computing separate weighting factors for several spectral bands. This is done, for example, in the MPEG-H format converter [6], where the downmix is performed on a hybrid QMF subband representation or with an STFT filter bank, with additional prior phase alignment of the channels. In the context of IVAS, a similar band-wise downmix (including both phase and time alignment) is already used for the DFT-based parametric stereo mode at low bitrates, where weighting and mixing are applied in the DFT domain [7].

The simple solution of a passive stereo-to-mono downmix in the time domain after decoding the stereo signal is not ideal, because a purely passive downmix is known to have drawbacks such as phase cancellation effects or a loss of overall energy, which, depending on the signal, can degrade quality severely.

Other active downmixing methods that operate purely in the time domain mitigate some of the problems of passive downmixing, but remain suboptimal because they lack frequency-dependent weighting.

Given the implicit delay and complexity constraints on mobile communication codecs such as IVAS, a dedicated post-processing stage such as the MPEG-H format converter for applying a band-wise downmix is not an option either, because the required transforms to the frequency domain and back would inevitably increase both complexity and delay.

A stereo codec mode that uses TCX transform coding with block switching, as in [8], can use different block configurations: for example, one block per frame with a block size of 20 ms (TCX20), or two sub-blocks per frame with a block size of 10 ms (TCX10). Each sub-block is either a full 10 ms TCX10 block or is split again into two 5 ms blocks (TCX5). The decision which configuration to use is made for each channel independently, so the two channels can end up with different decisions. This rules out the downmix method of the DFT-based stereo encoder described in [7] (band-wise weighting of the channels followed by a mono downmix of both channels in the DFT domain), because the respective spectral-domain representations have different time-frequency resolutions.

The object of the present invention is to provide an improved concept for audio signal processing.

This object is achieved by a downmixer according to claim 1 or claim 35, a downmixing method according to claim 46 or claim 47, or a computer program according to claim 48.

According to a first aspect of the present invention, the downmixer comprises a weight estimator, a spectral weighter, a converter, and a subsequently connected mixer. The conversion from the spectral domain to the time domain is performed after spectrally weighting the spectral-domain representation of the first channel, the spectral-domain representation of the second channel and, where applicable, the spectral-domain representations of further channels. Each weighted spectral-domain representation is converted into a time representation of the corresponding channel. The mixing is then performed in the time domain to obtain the downmix signal as the output of the downmixer. This procedure allows useful, efficient, and high-quality weighting in the spectral domain while still permitting separate processing of the individual channels in the spectral domain, in contrast to a situation in which the spectral-domain weighting and the downmix are performed in a single operation. In that situation, separate channel processing is no longer possible, because only a single downmix signal exists after spectral weighting and downmixing. According to this aspect, separate processing of the channels in the spectral domain therefore remains possible; it is simply performed after the spectral weighting.
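The processing order of the first aspect (weight each channel in the spectral domain, convert each channel to the time domain, optionally post-process per channel, mix last) can be illustrated with a minimal sketch. Everything named here is hypothetical: the function names, the band layout, and in particular the naive inverse DFT, which only stands in for whatever inverse transform the codec actually uses (for example an inverse MDCT with overlap-add). The weights are taken as given.

```python
import cmath


def idft(spectrum):
    """Naive inverse DFT (real part only); a stand-in for the codec's inverse transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def weight_spectrum(spectrum, band_edges, band_weights):
    """Scale every bin of band b, i.e. bins band_edges[b] .. band_edges[b+1]-1, by band_weights[b]."""
    out = list(spectrum)
    for b, w in enumerate(band_weights):
        for k in range(band_edges[b], band_edges[b + 1]):
            out[k] = w * spectrum[k]
    return out


def downmix(spec_l, spec_r, band_edges, w_l, w_r, post_l=None, post_r=None):
    """Weight per channel, convert per channel, post-process per channel, then mix."""
    # Steps 1 and 2: spectral weighting followed by conversion, separately per channel.
    time_l = idft(weight_spectrum(spec_l, band_edges, w_l))
    time_r = idft(weight_spectrum(spec_r, band_edges, w_r))
    # Step 3: optional per-channel time-domain post-processing (hypothetical callables).
    if post_l is not None:
        time_l = post_l(time_l)
    if post_r is not None:
        time_r = post_r(time_r)
    # Step 4: the mixer is the last stage of the chain.
    return [a + b for a, b in zip(time_l, time_r)]
```

Because mixing happens last, a per-channel step such as `post_r` can still act on one channel only, which would not be possible if the channels had already been summed in the spectral domain.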

When the at least two channels have different time or frequency resolutions, calculating the band-wise weight values for the at least two channels requires converting one or both of the spectral-domain representations so that, for the individual frequency bands, the corresponding representations have the same time or frequency resolution. The band-wise weight values can then be calculated. In this aspect, however, the band-wise weight values are not applied to the converted spectral-domain representation or to the two or more combined spectral representations. Instead, the spectral weighting is applied to the original spectral-domain representation from which the combined spectral-domain representation was derived. This ensures that the weighted spectral-domain representations are based on the original spectral-domain representations. Only the weight values are derived from the one or more combined spectral-domain representations, which differ from the original representations at least in some respect. The weight values are in any case based on energy estimates, preferably using a target energy per band in the channels before downmixing and a target energy per band of the downmix signal.
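As a rough illustration of applying weights to the original representation rather than to the combined one, the sketch below takes band weights that were computed on a high-resolution grid and applies them to the original short subframes. The function name and the mapping of band edges by integer division are assumptions made for this example; it also assumes that every band edge is a multiple of the number of subframes.

```python
def weight_original_subframes(subframes, band_edges_full, weights):
    """Apply band weights defined on the combined (high-resolution) grid
    to each original short subframe.

    subframes       : list of equally long spectra, one per subframe
    band_edges_full : band edges on the combined grid
    weights         : one weight per band
    """
    ratio = len(subframes)  # e.g. 2 subframes -> each has half as many bins
    weighted = []
    for sub in subframes:
        out = list(sub)
        for b, w in enumerate(weights):
            lo = band_edges_full[b] // ratio      # band start on the subframe grid
            hi = band_edges_full[b + 1] // ratio  # band end on the subframe grid
            for k in range(lo, hi):
                out[k] = w * sub[k]
        weighted.append(out)
    return weighted
```

The combined representation is used only to obtain `weights`; the values that are scaled and later converted to the time domain are the untouched original subframe spectra.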

Preferably, the converter that converts the weighted spectral-domain representations into time representations has several components. One component is the actual frequency-time converter; a further component is a channel-wise post-processor operating in the time domain with parameters that were transmitted, for example, as side information with the multichannel signal from which the spectral-domain representations are obtained. Alternatively, the post-processor is applied before the actual frequency-time conversion, in which case the control parameters steer the processing of the individual channels in the spectral domain. It is preferred, however, to apply the frequency-time converter first and then post-process the time-domain representations of the at least two channels using channel-wise control parameters that are derived from side information of the multichannel signal, or are generated in, or input into, the downmixer via user input, or are produced by any other parameter generation. The mixer, which actually generates the downmix signal, follows this time-domain post-processing.

This procedure provides high-quality audio signal processing because the band-wise weight values are applied to the original spectral-domain representations, while the weight values themselves, which are in any case based on some kind of power or image estimate, are derived from one or more (artificially created) combined spectral-domain representations. At the same time, high processing flexibility is achieved: any required time-domain or frequency-domain processing of the individual channels can still be performed, since the actual mixing step is the last step in the processing chain and takes place only after all required per-channel processing has been applied. The procedure is also very efficient, because it requires no downmixing of control parameters, which would be necessary if the actual downmix operation were the first operation in the processing chain.

According to a second aspect of the present invention, an apparatus for converting a spectral resolution comprises a spectral value calculator that combines spectral values belonging to the same frequency bin from each subframe of a plurality of subframes of one or more spectral-domain representations by a first method to obtain a first group of combined spectral values, and combines spectral values belonging to the same frequency bin from each subframe of the spectral-domain representation by a second method to obtain a second group of combined spectral values. The second method differs from the first, and the first and second groups of combined spectral values together represent a combined spectral-domain representation with a different time bin size and a different frequency bin size. This spectral resolution conversion is especially useful when there is a pair of spectral representations obtained from short time-frequency transforms, which have high time resolution but low frequency resolution.

According to the second aspect of the invention, this pair of short spectral-domain representations is converted into a single long spectral-domain representation with high spectral resolution but low time resolution. This conversion from one time-frequency resolution (high time resolution, low frequency resolution) to another (low time resolution, high frequency resolution) takes place without actually computing any intermediate time-domain representation. Instead of the usual procedure, which would convert the two short spectral-domain representations into the time domain and then convert the result back into the frequency domain, the present invention only combines, in the spectral domain and in two different ways, spectral values belonging to the same frequency bin. Thus, rather than performing two frequency-time transforms and one time-frequency transform, which is very inefficient and incurs significant delay, the present invention needs only basic arithmetic combining operations, such as adding two values or subtracting one value from another, to obtain a spectral-domain representation with high frequency resolution from two spectral-domain representations with low frequency resolution. Preferably, the first combining rule is a low-pass filtering, in other words an addition or weighted addition of two spectral values belonging to the same low-resolution frequency bin, while the combination according to the second method is a high-pass filtering, i.e. the calculation of a difference between the two spectral values. The two temporally successive spectral values are thus converted into two spectral values adjacent in frequency: the lower-frequency value results from the low-pass operation, and the next, higher-frequency value results from the high-pass operation.
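The sum-and-difference combination described above can be sketched in a few lines. The function name and the `gain` parameter are hypothetical, and the sketch deliberately ignores any sign conventions or bin-dependent ordering that a concrete transform may require; it only shows how two low-resolution spectra yield one spectrum with twice as many bins without passing through the time domain.

```python
def combine_subframes(sub_a, sub_b, gain=0.5):
    """Interleave a low-pass (sum) and a high-pass (difference) combination
    of two equally long subframe spectra.

    Bin k of both subframes produces combined bins 2k (sum) and 2k+1 (difference).
    """
    if len(sub_a) != len(sub_b):
        raise ValueError("subframes must have the same number of bins")
    combined = []
    for a, b in zip(sub_a, sub_b):
        combined.append(gain * (a + b))  # lower-frequency value (low-pass)
        combined.append(gain * (a - b))  # higher-frequency value (high-pass)
    return combined
```

For example, `combine_subframes([1.0, 3.0], [1.0, -1.0])` returns `[1.0, 0.0, 1.0, 2.0]`: two 2-bin spectra become one 4-bin spectrum using only additions, subtractions, and a scaling.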

The next pair of high-resolution spectral values is then calculated by the same procedure: a first combination, typically representing a low-pass characteristic, yields the lower-frequency spectral value of the pair, and a second combination, representing a high-pass operation, yields the higher-frequency spectral value of the pair.

The combined spectral-domain representation generated according to the second aspect of the present invention can be used for different purposes. In the first aspect of the invention, the combined spectral-domain representation is used to derive the band-wise weight values. This is especially useful when the spectral-domain representation of the first channel has low time resolution and high spectral resolution, while the second channel of the at least two channels has two spectral-domain representations with high time resolution, each with low spectral resolution: these two representations are converted, and the band-wise weight values can be derived from the combined spectral-domain representation produced by the conversion. In a further use, the combined spectral-domain representation can be processed further by any useful post-processing, such as conversion to the time domain and use of the converted spectrum for reproducing, storing, or compressing the audio signal. Another procedure performs spectral processing of the combined spectral-domain representation together with another spectral representation that has the same spectral resolution, for example for the purpose of downmixing in the spectral domain.

According to a third aspect of the present invention, the downmix operation is performed using spectral weighting, where the band-wise weight values are calculated from a target energy value per frequency band such that the energy in a frequency band of the downmix signal is in a predetermined relation to the energies in the same frequency band of the at least two channels, for example equal, or equal within a tolerance of ±30%, to the higher of the two energies. The energy-controlled band-wise weight values are applied to the spectral-domain representations of the at least two channels, and the downmix signal is calculated from the weighted spectral-domain representations either in the time domain, as in the first aspect of the invention, or in the spectral domain, as required.
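A minimal sketch of energy-targeted band weights follows. It is not the weight formula of the invention, which this passage does not give. It assumes one common weight per band, chosen so that the weighted sum of both channels has the energy of the louder channel in that band. The function names, the `eps` guard, and the `max_weight` clamp are hypothetical.

```python
import math


def band_energy(spec, lo, hi):
    """Sum of squared magnitudes of bins lo .. hi-1."""
    return sum(abs(x) ** 2 for x in spec[lo:hi])


def band_weights(spec_l, spec_r, band_edges, eps=1e-12, max_weight=2.0):
    """One weight per band so that the weighted sum L+R reaches the target energy.

    Target energy per band: the higher of the two channel energies.
    """
    weights = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        target = max(band_energy(spec_l, lo, hi), band_energy(spec_r, lo, hi))
        passive = [l + r for l, r in zip(spec_l[lo:hi], spec_r[lo:hi])]
        e_passive = band_energy(passive, 0, len(passive))
        w = math.sqrt(target / max(e_passive, eps))
        # Clamp: out-of-phase bands have almost no passive energy to scale up.
        weights.append(min(w, max_weight))
    return weights
```

With `spec_l = [1, 1, 2, 0]`, `spec_r = [1, 1, 0, 0]` and `band_edges = [0, 2, 4]`, the first band gets weight 0.5 (the passive sum would otherwise carry four times the target energy) and the second band gets weight 1.0. The clamp matters when the channels cancel each other in a band, which is one reason a practical scheme may weight the channels individually instead of scaling their sum.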

When the spectral-domain representations are purely real, as with the MDCT, or purely imaginary, as with the MDST (modified discrete sine transform), the weight estimator is configured to estimate the other spectral-domain representation from the existing one. Thus, when a real-valued spectral-domain representation exists, the imaginary one is estimated, and when an imaginary spectral-domain representation exists, the real-valued one is estimated. These estimated values are used to calculate the energy of the first channel in a frequency band, the energy of the second channel in the frequency band, and the mixed terms between the channels, which depend on a product or a linear combination of spectral values from the at least two channels in the frequency band.

This procedure for calculating band-wise weights for spectral weighting in the context of downmixing can be applied in the first aspect, where a frequency-to-time conversion and some time-domain post-processing take place between the spectral weighting and the downmix. With respect to the second aspect of the invention, the spectral domain representations of one or both channels that are used to calculate the weights according to the target energy characteristic are obtained either from the original spectral domain representations or from one or two combined spectral domain representations generated by the spectral resolution conversion illustrated with respect to the second aspect or with respect to the first aspect.

Downmixing by spectral weighting with band-wise weights derived from a target energy value per frequency band is, on the one hand, very efficient, because the spectral weighting can be performed simply by applying the same weight to every spectral value in a frequency band. This holds in particular when psychoacoustically motivated frequency bands are used, whose width increases from narrow bands at low frequencies to wide bands at high frequencies. For a high-frequency band that has, for example, 100 or more spectral values, only a single weight is calculated, and this single weight is applied to each individual spectral value. The procedure requires only moderate computational resources, since weighting, for example by multiplication, is a low-complexity, low-delay operation, and applying the same weight to every spectral value in a frequency band lends itself well to parallel execution on parallel hardware processors. On the other hand, a high audio quality of the downmix signal is obtained: the signal is free from cancellations and other artifacts that occur when the two channels to be downmixed are in a phase relation that is problematic for downmixing, i.e. when both channels are highly correlated with each other and have a certain phase relation.
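Applying one weight per band to every bin of that band can be sketched as follows, assuming Python with NumPy. The function name `apply_band_weights` and the band layout in the usage note are hypothetical; `band_edges` lists the first bin of each band followed by the total number of bins.

```python
import numpy as np

def apply_band_weights(spectrum, band_edges, band_weights):
    """Multiply every bin of a band by that band's single weight."""
    spectrum = np.asarray(spectrum)
    widths = np.diff(band_edges)  # number of bins in each band
    if band_edges[0] != 0 or widths.sum() != spectrum.shape[-1]:
        raise ValueError("band_edges must cover the spectrum exactly")
    if len(widths) != len(band_weights):
        raise ValueError("need exactly one weight per band")
    per_bin = np.repeat(band_weights, widths)  # expand band weights to bins
    return spectrum * per_bin                  # one multiplication per bin
```

For example, with `band_edges = [0, 1, 3, 7]` and `band_weights = [0.5, 1.0, 2.0]`, the last band spans four bins that all receive the weight 2.0. The per-bin multiplication is independent across bins, which is what makes it easy to parallelize.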

Preferred embodiments of the present invention are described below with reference to the accompanying drawings.

Fig. 1 illustrates a downmix apparatus according to the first aspect;

Fig. 2 illustrates a further embodiment of the downmix apparatus according to the first aspect;

Fig. 3a illustrates a preferred implementation of the weight estimator;

Fig. 3b illustrates a preferred embodiment of the weight estimator that is also preferred for the third aspect;

Fig. 4a illustrates different time-frequency resolutions in different channels;

Fig. 4b illustrates a spectral representation showing a high spectral resolution, a medium spectral resolution and a low spectral resolution;

Fig. 5a illustrates the weight estimation according to a first embodiment, resulting in a low frequency resolution and a low time resolution;

Fig. 5b illustrates a procedure performed by the weight estimator according to a second embodiment, resulting in a high frequency resolution and a low time resolution, which is also applied according to the second aspect;

Fig. 5c illustrates an implementation of the weight estimation according to a third embodiment, resulting in a low frequency resolution and a high time resolution;

Fig. 5d illustrates a further procedure of the weight estimator, resulting in a high frequency resolution and a high time resolution;

Fig. 6 illustrates an embodiment of an apparatus for converting a spectral resolution according to the second aspect;

Fig. 7 illustrates a further implementation of the apparatus for converting a spectral resolution according to the second aspect;

Fig. 8 illustrates an embodiment of a downmix apparatus according to the third aspect; and

Fig. 9 illustrates a further embodiment of the downmix apparatus according to the third aspect.

Fig. 1 illustrates an embodiment of a downmix apparatus for the first aspect of the present invention. The downmix apparatus comprises a weight estimator 100, a spectral weighting module 200 connected to the weight estimator 100, and an input for a first or left channel and a second or right channel. The spectral weighting module 200 is connected to a transform module 300 for converting the weighted spectral domain representations of the at least two channels into time representations of the at least two channels. These time representations are passed to a mixer, which mixes the time representations of the at least two channels to obtain a downmix signal in the time domain. Preferably, the transform module 300 comprises a frequency-time transform module 310 followed by a post-processor 320. The frequency-time transform module 310 performs the actual conversion of the weighted spectral domain representations into the time domain, and the post-processor 320, which is optional, processes the first channel and the second channel, now in the time domain, independently of each other, using control parameters for the left channel and for the right channel, respectively. The transform module 300 is thus configured to generate raw time representations by means of the frequency-time transform module 310 using a spectrum-time conversion algorithm, and to post-process the raw time representations individually by means of the post-processor 320, upstream of the mixer in the signal processing direction, using separate control information for each channel, to obtain the time representations of the at least two channels.
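The order of operations in Fig. 1 (weight, transform each channel, post-process each channel, then mix) can be sketched as follows, assuming Python with NumPy. This is an illustration only: `np.fft.irfft` is a stand-in for the actual frequency-time transform, the function name `downmix_time_domain` is hypothetical, and the mixer is assumed to be a plain sum.

```python
import numpy as np

def downmix_time_domain(spectra, weights_per_bin, post_processors=None):
    """Weight each channel's spectrum, go to the time domain, then mix."""
    time_signals = []
    for ch, (spec, w) in enumerate(zip(spectra, weights_per_bin)):
        weighted = np.asarray(spec) * np.asarray(w)  # spectral weighting (200)
        raw = np.fft.irfft(weighted)                 # stand-in for module 310
        post = post_processors[ch] if post_processors else None
        time_signals.append(post(raw) if post else raw)  # optional step (320)
    return np.sum(time_signals, axis=0)              # mixer: sum of channels
```

Because each entry of `post_processors` sees only its own channel, per-channel control parameters can be used unchanged, which is the point of mixing after the transform rather than before it.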

Preferably, the post-processor 320 is configured to perform, as the post-processing operation, low-frequency post-filtering, TCX-LTP (transform coded excitation long-term prediction) processing, or LPC (linear predictive coding) synthesis. The advantage of a post-processor that operates on the spectrally weighted channels but before the actual downmix is that parameters which are available as separate parameters for the left and right channel, or in general for an individual channel of the two or more channels of the multichannel signal, can still be used without any downmixing of parameters. Such parameter downmixing would otherwise be necessary if the downmix were performed together with the spectral weighting, so that a time-domain downmix signal already existed at the output of the frequency-time transform module 310.

In general, the multichannel signal may comprise two channels, i.e. a left channel and a right channel, or more than two channels, for example three or more. With two channels, the weight estimator 100 is configured to calculate a first plurality of band-wise weights for a plurality of frequency bands of the first channel of the at least two channels, and a second plurality of band-wise weights for a plurality of frequency bands of the second channel of the at least two channels. With more than two channels, the weight estimator 100 is configured to calculate a first plurality of band-wise weights for a plurality of frequency bands of the first channel, a second plurality of band-wise weights for a plurality of frequency bands of the second channel, and a further plurality of band-wise weights for a plurality of frequency bands of the third or further channel of the more than two channels.

In particular, each spectral domain representation of the at least two channels comprises a plurality of frequency bins, with spectral values associated with the frequency bins. The weight estimator 100 is configured to calculate the band-wise weights for frequency bands, each frequency band comprising one, two or more spectral values. Preferably, the number of frequency bins per frequency band increases with the center frequency of the bands, giving a psychoacoustically motivated division of the spectral domain representations into frequency bands of non-uniform bandwidth.
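A band layout whose width grows with frequency can be sketched as follows in Python. The geometric growth rule, the function name `make_band_edges` and its parameters are assumptions chosen for illustration; the text requires only that the number of bins per band increases towards higher bands, and an actual codec would typically use a fixed band table.

```python
def make_band_edges(num_bins, first_width=1, growth=1.5):
    """Return band edges [0, ..., num_bins] with non-decreasing band widths."""
    edges = [0]
    width = float(first_width)
    while edges[-1] < num_bins:
        step = max(1, int(round(width)))
        nxt = edges[-1] + step
        if nxt > num_bins:
            # Too few bins left for a full band: widen the previous band
            # (or create the only band) so the layout ends at num_bins.
            if len(edges) > 1:
                edges[-1] = num_bins
            else:
                edges.append(num_bins)
            break
        edges.append(nxt)
        width *= growth
    return edges
```

With `num_bins=20`, `first_width=1` and `growth=2.0`, the band widths are 1, 2, 4 and 13 bins, where the last band absorbs the bins that would not fill a further band.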

A preferred implementation of the downmix apparatus is illustrated in Fig. 2. The multichannel signal is available as a stereo bitstream and is fed to a stereo decoder 500, which is preferably implemented as an MDCT-based stereo decoder. The weight estimator comprises a left-channel value calculator 110, a right-channel value calculator 112, an imaginary part estimator 120 for the left channel and an imaginary part estimator 122 for the right channel. In the embodiment of Fig. 2, the stereo decoder 500 is an MDCT-based stereo decoder, which means that the decoded spectral representations of the left and right channels have purely real spectral values, i.e. MDCT values. The imaginary part estimators 120, 122 generate purely imaginary spectral values, i.e. MDST (modified discrete sine transform) values. Based on these items of information, i.e. the spectral domain representations and the estimated spectral values, the weights are calculated and passed to the spectral weighting module 200, which performs the band-wise weighting as indicated in Fig. 2. The weighted spectral domain representations are passed to the respective frequency-time transform modules 310, which are implemented as one IMDCT module per channel. An optional post-processor 320 is also illustrated for each channel. The transformed and optionally post-processed data are input to the downmixer DMX 400 to generate the downmix signal in the time domain. In the embodiment of Fig. 2 this is a mono output signal, but it can also be a multichannel signal, provided that the number of channels of the downmix signal is smaller than the number of channels of the multichannel signal before downmixing.

Alternatively, when the multichannel decoder or stereo decoder 500 is implemented as a decoder with imaginary values, such as an MDST decoder, blocks 120, 122 estimate purely real data, such as MDCT values. Thus, in general, the weight estimator 100 is configured to estimate an imaginary spectral representation when the spectral domain representation is purely real, or to estimate a real spectral representation when the original spectral domain representation is purely imaginary. Furthermore, the weight estimator 100 is configured to estimate the weights using the estimated imaginary spectral representation or the estimated real spectral representation, as the case may be. This is particularly useful for calculating band-wise spectral weights based on a target energy value per frequency band, so that the energy in a frequency band of the downmix signal is in a predetermined relation to the energies in the same frequency bands of the at least two channels. Preferably, the predetermined relation is such that the energy in a frequency band of the downmix signal is the sum of the energies of the same frequency band in the at least two channels. Other predetermined relations are also useful. For example, the energy of the corresponding frequency band of the downmix signal may lie between 75% and 125% of the sum of the energies of the two channels. In the most preferred embodiment, however, the predetermined relation is equality, or equality within a tolerance of ±10%.
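The sum-of-energies target can be illustrated with a deliberately simplified sketch in Python with NumPy. It is not the weight formula of this document: a single common gain per band is applied to the passive sum of both channels (that is, identical weights for both channels are assumed), and the names `energy_matching_gain` and `within_tolerance` are hypothetical.

```python
import numpy as np

def energy_matching_gain(left_band, right_band, eps=1e-12):
    """Common gain that scales the passive sum L + R to the energy E_L + E_R.

    Simplifying assumption: the same weight is used for both channels.
    """
    left_band = np.asarray(left_band, dtype=complex)
    right_band = np.asarray(right_band, dtype=complex)
    target = np.sum(np.abs(left_band) ** 2) + np.sum(np.abs(right_band) ** 2)
    passive = np.sum(np.abs(left_band + right_band) ** 2)
    # eps guards against division by zero when the channels cancel exactly
    return float(np.sqrt(target / max(passive, eps)))

def within_tolerance(downmix_energy, target_energy, tolerance=0.10):
    """True if the downmix energy is within +/- tolerance of the target."""
    return abs(downmix_energy - target_energy) <= tolerance * target_energy
```

For two identical in-phase channels the passive sum has twice the target energy, so the gain is about 0.707. For nearly out-of-phase channels the gain becomes very large, which is one reason why a common gain is only a sketch and not a substitute for separately derived channel weights.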

Fig. 3a illustrates a preferred implementation of the weight estimator 100. This implementation is particularly useful for calculating weights when the spectral domain representations of the at least two channels have different time or frequency resolutions. As shown in block or step 130, the weight estimator 100 is configured to check whether the time-frequency resolutions of the spectral domain representations of the first and second channels differ from each other. If the time or frequency resolutions are equal, the weight estimator 100 is configured to calculate the band-wise weighting factors or band-wise weights, denoted w_L for the first or left channel and w_R for the second or right channel.

Alternatively, when the weight estimator 100 determines in block 130 that the time or frequency resolutions of the left and right, or first and second, channels are not equal for some period of time, as illustrated later with respect to Fig. 4a, the weight estimator 100 is configured to calculate 132 one or two combined spectral domain representations. In particular, the first spectral domain representation of the first channel of the at least two channels has a first time resolution or a first frequency resolution, and the second spectral domain representation of the second channel of the at least two channels has a second time resolution or a second frequency resolution, wherein the second time resolution differs from the first time resolution, or the second frequency resolution differs from the first frequency resolution. The weight estimator 100 is configured to convert 132 the first spectral domain representation into a combined spectral domain representation having the second time resolution or the second frequency resolution, and to calculate the band-wise weights using the combined spectral domain representation and the second spectral domain representation. Alternatively, the second spectral domain representation is converted into a combined spectral domain representation having the first time resolution or the first frequency resolution, and the band-wise weights are calculated using the combined spectral domain representation and the first spectral domain representation.

As a further alternative, when the first spectral domain representation of the first channel has a first time resolution or a first frequency resolution, and the second spectral domain representation of the second channel of the at least two channels has a second time resolution or a second frequency resolution, wherein the second time resolution differs from the first time resolution, or the second frequency resolution differs from the first frequency resolution, the weight estimator 100 is configured to convert 132 the first spectral domain representation into a first combined spectral domain representation having a third time resolution or a third frequency resolution, wherein the third time resolution differs from the first time resolution or the second time resolution, and the third frequency resolution differs from the first frequency resolution and/or the second frequency resolution. The second spectral domain representation is likewise converted into a second combined spectral domain representation having the third time resolution or the third frequency resolution, and the band-wise weights are calculated using the first combined spectral domain representation and the second combined spectral domain representation.

Depending on the actual situation, as described later with respect to Figs. 5a-5d, it may also happen that the band-wise weights or factors calculated by block 134 are not used for the actual spectral weighting; instead, derived band-wise weighting factors are calculated, as illustrated at step 136 in Fig. 3a.

In general, assuming that the first channel has a low first time resolution and a high first frequency resolution, and that the second channel has a high second time resolution and a low second frequency resolution, the weight estimator 100 can select one of four different ways of matching the resolutions of the first and second channels in the spectral domain in order to calculate the spectral domain weights for these channels.

Fig. 5a illustrates a first embodiment in which the frequency band weights are calculated based on two combined spectral domain representations, both of which have a low frequency resolution and a low time resolution.

In the second embodiment, illustrated in Fig. 5b, only one combined spectral domain representation is calculated, from the representation with the low frequency resolution, so that the frequency band weights are calculated based on a pair of spectral domain representations that both have a high frequency resolution and a low time resolution.

Fig. 5c illustrates a further, third embodiment in which one combined representation is calculated and the frequency band weights are calculated using two spectral domain representations that both have a low frequency resolution and a high time resolution.

In the fourth embodiment, illustrated in Fig. 5d, the weight estimator is configured to calculate the frequency band weights using two combined representations, both of which are in a format having a high frequency resolution and a high time resolution.

Fig. 4a illustrates a situation in which the first channel and the second channel have two different resolutions (in time and/or frequency). The first part of Fig. 4a shows a frame having a long block in the first channel and two consecutive short blocks in the second channel. The long block can be, for example, a TCX20 block, and the short blocks can be two consecutive TCX10 blocks. Fig. 4a also illustrates a further frame that is divided into two subframes A and B, where subframe A has a short block in the first channel and also a short block in the second channel. In subframe B of this second frame, however, the first channel has a short block while the second channel has two very short blocks, i.e. one very short block per sub-subframe. A very short block can be, for example, a TCX5 block. In general, long blocks are longer than short blocks, short blocks are longer than very short blocks, and, naturally, very short blocks are shorter than long blocks. It is not necessary for one long block to have the same length as two short blocks. Alternatively, there may be three short blocks with a total length equal to the length of one long block, or there may be four short blocks, such as one very short block per sub-subframe. Other divisions are also possible, e.g. two long blocks in the first channel having a total length equal to that of three short blocks in the second channel. The lengths of the long, short and very short blocks do not have to be in an integer ratio to one another. Furthermore, there need not be exactly three different block lengths; there may be more than three, or only two.

Fig. 4b illustrates, in the first line, a spectrum representation with a high spectral resolution. The spectral values are indicated by integers along the frequency line, and Fig. 4b shows three consecutive frequency bands b₁, b₂, b₃, where each band representing higher frequencies is wider than each band representing lower frequencies. At a high spectral resolution, for example in a TCX20 spectrum, the lowest band b₁ has four spectral lines (also called spectral values or spectral bins). In this embodiment the second band b₂ has eight spectral bins and the third band b₃ has twelve. Converting the high spectral resolution into a representation with a medium spectral resolution means that spectral values of the high-resolution representation are combined (or decimated), so that the medium spectral resolution, such as a TCX10 resolution, has two spectral bins for the first band b₁, four for the second band b₂ and six for the third band b₃. Comparing this medium resolution in turn with a low-resolution representation, for example in a TCX5 block, the first band b₁ has only one spectral bin, the second band b₂ has two and the third band b₃ has three. The medium spectral resolution can be converted into the low spectral resolution by combining two or more adjacent spectral lines or by a decimation operation.
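The reduction from a higher to a lower spectral resolution can be sketched as follows. This is only a minimal illustration: the helper name `combine_adjacent_bins` is hypothetical, and summing each group of bins is an assumed choice of combining operation (the text equally allows decimation).

```python
def combine_adjacent_bins(spectrum, factor=2):
    """Lower the spectral resolution by summing groups of `factor` adjacent bins.

    Assumption: summation is used as the combining operation; the text
    also allows decimation instead.
    """
    if len(spectrum) % factor != 0:
        raise ValueError("spectrum length must be a multiple of factor")
    return [sum(spectrum[i:i + factor]) for i in range(0, len(spectrum), factor)]


# Band sizes from the Fig. 4b example at high resolution: 4, 8 and 12 bins.
high_res = [1.0] * (4 + 8 + 12)               # 24 bins, TCX20-like
medium_res = combine_adjacent_bins(high_res)  # 12 bins: bands of 2, 4 and 6
low_res = combine_adjacent_bins(medium_res)   # 6 bins: bands of 1, 2 and 3
```

Because every band in the example has an even number of bins at the high and medium resolutions, pairs of bins never straddle a band boundary.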

On the other hand, a representation with a low spectral resolution can be converted into a representation with a higher resolution by interpolation, by copying, or by copying and filtering, so that, for example, the four high-resolution spectral bins 1, 2, 3, 4 illustrated in Fig. 4b can be calculated from the two spectral bins in the first frequency band b₁ of the medium spectral resolution.
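The opposite conversion, from a lower to a higher spectral resolution, might look like the sketch below. Both helpers are hypothetical; the linear interpolation rule and its edge handling are assumptions, and any amplitude rescaling between resolutions is ignored.

```python
def upsample_by_copying(spectrum, factor=2):
    """Raise the resolution by repeating every bin `factor` times."""
    return [value for value in spectrum for _ in range(factor)]


def upsample_by_interpolation(spectrum):
    """Double the resolution; each inserted bin is the mean of its neighbours.

    The last inserted bin has no right-hand neighbour and repeats the final value.
    """
    result = []
    for k, value in enumerate(spectrum):
        following = spectrum[k + 1] if k + 1 < len(spectrum) else value
        result.append(value)
        result.append(0.5 * (value + following))
    return result


# Two medium-resolution bins of band b1 become four high-resolution bins.
band_b1_medium = [2.0, 4.0]
copied = upsample_by_copying(band_b1_medium)              # [2.0, 2.0, 4.0, 4.0]
interpolated = upsample_by_interpolation(band_b1_medium)  # [2.0, 3.0, 4.0, 4.0]
```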

This new approach aims to provide a delay-free, band-wise active downmix method for stereo-to-mono conversion, in which only the band-wise weighting of the spectral bands of the two channels is performed in the frequency domain, while the actual downmix to a mono signal is performed after the transform to the time domain, by summing and scaling the two spectrally weighted signals.

If the spectral domain representations of the two signals have different time-frequency resolutions (i.e., a shorter block size for one signal), the weight calculation is configured to combine neighbouring spectral bins both over time and over frequency, so that the cross-spectrum calculation can be done on the same time-frequency regions.

In this method, the time-frequency resolution of the two stereo channels does not have to be the same, because the band-wise weighting of the channels can still be performed when the channels differ in this respect, while the critical stereo-to-mono conversion is performed later, when both spectrally weighted channels have already been transformed back to the time domain.

Embodiments provide, at the decoder side, an optimized, delay-free stereo-to-mono downmix.

Preferred aspects relate to a band-wise weighted active downmix with separate weighting (in the frequency domain) and mixing (in the time domain) stages.

Further preferred aspects relate to the temporal/spectral combination of frequency bins for the cross-spectrum correlation in the case of channels with different spectral domain representations; these aspects can be used separately from the downmix aspects or together with them.

Unlike parametric stereo codecs such as [7], where only an already downmixed core signal is transmitted together with a few side parameters representing the stereo image, the decoder of a discrete MDCT-based stereo application has no downmix available, because both channels are always coded directly with the TCX coder. The downmix therefore has to be generated entirely at the decoder side.

Fig. 3b illustrates a preferred implementation of the weight estimator 100 illustrated in Fig. 1. In step 140, the weight estimator estimates the corresponding imaginary or real spectral values for each frequency bin from the first channel and the second channel or, alternatively, from the first channel and the combined spectral domain representation, or from the second channel and the combined spectral domain representation, or from the first combined spectral domain representation and the second combined spectral domain representation. Typically, the weight estimator is configured to calculate the first weight and the second weight using the energy of the first channel in the frequency band, the energy of the second channel in the frequency band, and a mixed term that depends on a product or a linear combination of spectral values of the at least two channels in the frequency band. In Fig. 3b, the energy of the first channel and the energy of the second channel are calculated, by way of example, in block 140. In addition, a mixed term depending on the product is calculated in block 148, and another mixed term depending on the linear combination is calculated in block 146. Furthermore, an "amplitude" per frequency band, which corresponds to the square root of the power of the spectral bins in that band, is calculated in block 144.

Thus, as illustrated in Fig. 3b, the first weight w_L is calculated based on the amplitudes per frequency band for both channels and depending on a mixed term, preferably the mixed term that depends on the linear combination illustrated in block 146. Furthermore, the weight w_L per frequency band is preferably calculated using the weight w_R per frequency band, i.e. the weight for the other channel. The value for the other channel, i.e. w_R per frequency band, is preferably calculated based on the product-dependent mixed term illustrated in block 148 and on the "amplitudes" per frequency band obtained by block 144 from the powers per frequency band in the respective channels, as determined in block 142.

Thus, preferably, the square root of the energy of the spectral values added together within the frequency band, taken from the spectral domain representations of the at least two channels, is used as the "amplitudes", but other "amplitudes" can also be used, for example "amplitudes" derived from powers with an exponent that is smaller than 1 and different from 1/2. The spectral values from the frequency band are linearly combined, i.e. added together, and the square root of the resulting value is taken, or any other power with an exponent smaller than 1 is used; preferably, the powers for both channels in the frequency band are additionally used.

As the mixed term representing the product, the absolute value of the complex dot product between the spectral values in the frequency band of the first channel and the spectral values in the frequency band of the second channel can also be determined, for example in the calculation of block 148. Preferably, the same weight is applied by the spectral weighter 200 to every spectral value in the frequency band of one of the at least two channels, and a different weight is applied to every spectral value in the frequency band of the other of the at least two channels.
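Applying one weight per band to every spectral value of that band can be sketched as below. The band layout `band_edges`, the weights and the helper name `apply_band_weights` are illustrative assumptions, not values from the text.

```python
def apply_band_weights(spectrum, band_edges, weights):
    """Scale every bin of band b by weights[b].

    band_edges[b] and band_edges[b + 1] delimit band b (end exclusive).
    """
    if len(weights) != len(band_edges) - 1:
        raise ValueError("need exactly one weight per band")
    weighted = list(spectrum)
    for b, weight in enumerate(weights):
        for i in range(band_edges[b], band_edges[b + 1]):
            weighted[i] = weight * spectrum[i]
    return weighted


# Illustrative layout: three bands of 4, 8 and 12 bins.
band_edges = [0, 4, 12, 24]
left = [1.0] * 24
weighted_left = apply_band_weights(left, band_edges, [0.5, 0.75, 1.0])
```

The other channel would be processed with the same function and its own list of band weights.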

The following illustrates a preferred implementation of the calculation of the weights per frequency band, which can be used by the weight estimator 100.

Since a passive downmix has the drawbacks outlined above, using an active downmix scheme brings substantial improvements in many respects. Adding a further decoder stage that includes a DFT of both channels after stereo decoding is not feasible, both because of its complexity and because of the delay it would cause. The downmix process is therefore performed as a combination of processing in the MDCT domain and in the time domain.

First, the frequency band weights are calculated and applied to the MDCT representations of both channels. This takes place after the stereo processing (e.g., inverse M/S transform, etc.) and directly before the IMDCT back-transform. The weights are calculated according to the same scheme that is already used in the DFT-based stereo encoder described in [7], with the target energy of the phase-rotated mid channel:

[equation not reproduced]

where […] and […] represent the spectral magnitudes of the left and right channels. Based on this target energy, the channel weights can then be calculated for each spectral band as follows:

[equation not reproduced]

and

[equation not reproduced]

These frequency band weights w_R and w_L are calculated for each spectral band, and each frequency band covers several MDCT bins, starting with a small number of bins for the lowest frequency bands, e.g. 4, and increasing towards higher frequencies to a large number of bins for the highest frequency bands, e.g. 160.
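The weight formulas are not reproduced in this text, so the sketch below uses an assumed closed form chosen only to agree with the verbal description of Fig. 3b: w_R is derived from the band amplitudes and the product-dependent term, and w_L is derived from w_R and the linear-combination term. The exact expressions, the factor 1/sqrt(2), the handling of silent bands and all names are assumptions.

```python
import math


def band_weights(mag_l, mag_r, mag_sum, abs_dot):
    """Assumed form of the band-wise weights; returns (w_l, w_r).

    mag_l, mag_r : band amplitudes of the left and right channel
    mag_sum      : band amplitude of the summed channels (linear combination)
    abs_dot      : absolute value of the complex dot product (product term)
    """
    denom = mag_l + mag_r
    if denom == 0.0:
        return 1.0, 1.0  # silent band: the weights do not matter (arbitrary choice)
    # Assumed target energy, up to the assumed 1/2 normalisation applied below.
    target = mag_l ** 2 + mag_r ** 2 + 2.0 * abs_dot
    w_r = math.sqrt(target) / (math.sqrt(2.0) * denom)
    w_l = w_r + 1.0 - mag_sum / denom
    return w_l, w_r


# Identical channels in a band: both assumed weights coincide.
w_l, w_r = band_weights(1.0, 1.0, 2.0, 1.0)
```

Under this assumed form, a band in which the channels cancel (summed amplitude close to zero) receives a larger w_L than w_R, so the weighted sum does not vanish.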

Because the transmitted MDCT coefficients are real-valued only, the complementary MDST values, which are required for energy-preserving weighting, are obtained for each channel using the estimate [9]

[equation not reproduced]

where […] specifies the spectral bin number.
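The estimate from [9] is not reproduced in this text. One simple estimator of this kind takes the difference of the two neighbouring MDCT bins; the sketch below assumes that form, with zero padding at the spectrum edges. The formula, the edge handling and the helper names are all assumptions.

```python
def estimate_mdst(mdct):
    """Assumed estimate: mdst[i] = mdct[i + 1] - mdct[i - 1], zero-padded at the edges."""
    n = len(mdct)
    mdst = []
    for i in range(n):
        upper = mdct[i + 1] if i + 1 < n else 0.0
        lower = mdct[i - 1] if i - 1 >= 0 else 0.0
        mdst.append(upper - lower)
    return mdst


def to_complex_spectrum(mdct):
    """Pair each MDCT value (real part) with its estimated MDST value (imaginary part)."""
    return [complex(re, im) for re, im in zip(mdct, estimate_mdst(mdct))]


mdst = estimate_mdst([1.0, 2.0, 4.0, 8.0])  # [2.0, 3.0, 6.0, -4.0]
```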

Using this estimate, […] and […] are calculated for each frequency band […] as

[equation not reproduced]

[…] is calculated as

[equation not reproduced]

and […] is calculated as the magnitude or absolute value of the complex dot product

[equation not reproduced]

where […] specifies the bin number within the spectral band […].

Despite the different transform and the merely estimated energies, the resulting weights still lead to a downmix similar to the one described in [7].
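The band quantities that enter the weight calculation (the amplitude of each channel, the amplitude of the summed channels, and the absolute value of the complex dot product) can be sketched for one band as follows. The sketch assumes complex bins with the MDCT value as real part and the estimated MDST value as imaginary part; the function name and the absence of any normalisation are assumptions.

```python
import math


def band_quantities(left, right):
    """Return (amp_left, amp_right, amp_sum, abs_dot) for one band of complex bins."""
    # Square root of the summed bin energies of each channel.
    amp_left = math.sqrt(sum(abs(x) ** 2 for x in left))
    amp_right = math.sqrt(sum(abs(x) ** 2 for x in right))
    # Linear combination: add the channels bin by bin, then take the amplitude.
    amp_sum = math.sqrt(sum(abs(l + r) ** 2 for l, r in zip(left, right)))
    # Product term: absolute value of the complex dot product.
    abs_dot = abs(sum(l * r.conjugate() for l, r in zip(left, right)))
    return amp_left, amp_right, amp_sum, abs_dot


in_phase = band_quantities([3 + 4j], [3 + 4j])
out_of_phase = band_quantities([3 + 4j], [-3 - 4j])
```

In the out-of-phase case the summed amplitude is zero while the dot-product term keeps its magnitude, which is why the two mixed terms carry different information.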

In the second step, the two weighted channels are then downmixed in the time domain by simply summing and scaling the two spectrally weighted channels.
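This second step amounts to a sample-wise sum followed by a gain. In the minimal sketch below, the inputs are assumed to be the two channels after spectral weighting and inverse transform; the default scale of 0.5 and the function name are assumptions, since the text does not state the scaling factor.

```python
def downmix_time_domain(left_weighted, right_weighted, scale=0.5):
    """Sum two spectrally weighted time-domain channels and scale the result."""
    if len(left_weighted) != len(right_weighted):
        raise ValueError("channels must have the same number of samples")
    return [scale * (l + r) for l, r in zip(left_weighted, right_weighted)]


mono = downmix_time_domain([1.0, -1.0], [3.0, 1.0])  # [2.0, 0.0]
```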

Reference is made to Fig. 2.

The reason for this combined approach is twofold. First, by transforming both channels back to the time domain, post-filtering such as TCX-LTP, which also operates in the time domain, can be performed on both channels using parameters (e.g., pitch) extracted from the core coding of the individual channels, which avoids having to find averaged parameters that fit the downmix. Second, and more critically, MDCT stereo is designed to allow different core coder and/or overlap decisions for the two channels. Specifically, this means that one channel can be coded, for example, with one long TCX20 block (20 ms frame, high frequency resolution, low time resolution), while the other channel is coded, for example, with two short TCX10 blocks (two 10 ms subframes, low frequency resolution, high time resolution), where one or both short blocks can be further divided into two TCX5 subframes (two of 5 ms each). This makes a full downmix in the frequency domain practically impossible. The band-wise weighting alone, however, can be done directly in the MDCT domain.

One embodiment, illustrated in Fig. 5a, works as follows. For the special case of different cores in the two channels, the calculation of the cross-spectrum correlation, as part of the weight calculation, has to be adapted slightly. Because of the different frequency and time resolutions of TCX20 and TCX10, a direct calculation of the dot product between the left channel and the right channel is not possible. Instead, the MDCT bins have to be combined so that they cover the same time-frequency regions. For TCX20 this means always combining two neighbouring bins, whereas for TCX10 each bin of the first subframe has to be combined with the same bin of the following subframe, e.g.

[equation not reproduced]

and

[equation not reproduced]

if […] is the TCX20 MDCT spectrum and […] is the TCX10 MDCT spectrum with 2 subframes, where […] specifies the spectral bin number and […] and […] are the TCX10 subframes. The same combination is also performed with the estimated MDST spectra.

The correlation between the spectra and/or the related value (the symbols themselves are not reproduced in this text) is then computed with the resulting combined bins. This leads to a slightly coarser estimate of the correlation, but it has been found to be entirely sufficient.
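The bin combining of FIG. 5a can be sketched as follows. This is a minimal illustration, not the formula of the patent: it assumes that "combining" means plain addition, and the function names and sample values are hypothetical.

```python
def combine_tcx20(spectrum):
    """Add each pair of adjacent bins of a TCX20 spectrum (assumed: plain sum)."""
    return [spectrum[2 * i] + spectrum[2 * i + 1] for i in range(len(spectrum) // 2)]


def combine_tcx10(sub0, sub1):
    """Add the same bin of two consecutive TCX10 subframes (assumed: plain sum)."""
    return [a + b for a, b in zip(sub0, sub1)]


def dot(x, y):
    """Dot product used as the correlation term between two spectra."""
    return sum(a * b for a, b in zip(x, y))


# Hypothetical data: 8 TCX20 bins on the left, 2 x 4 TCX10 bins on the right.
left_tcx20 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
right_k0 = [1.0, 0.5, -1.0, 2.0]
right_k1 = [0.0, 1.5, 1.0, -2.0]

left_combined = combine_tcx20(left_tcx20)
right_combined = combine_tcx10(right_k0, right_k1)

# Both combined spectra now have 4 bins covering the same time-frequency regions.
correlation = dot(left_combined, right_combined)
```

After combining, both spectra have the same number of bins, so the dot product is well defined, at the cost of a coarser correlation estimate.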

Another embodiment, illustrated in FIG. 5b, works as follows. For the special case of different cores in the two channels, the calculation of the correlation between the spectra, which is part of the weight calculation, has to be slightly adapted. Because TCX20 and TCX10 have different frequency and time resolutions, the dot product between the left channel and the right channel cannot be calculated directly. To make it possible, the spectrum of the (sub)frame with the lower spectral resolution is converted into an approximation of a spectrum with twice the spectral resolution.

[The two conversion formulas are not reproduced in this text. Their accompanying definitions name the index of the spectral bin and the two low-resolution subframes.]

These additions and subtractions can be regarded as high-pass and low-pass filtering operations that split one low-resolution bin into two high-resolution bins, where the filtering depends on whether the bin index is odd or even (counting from the lowest bin; the starting index value is not reproduced in this text).

This means that if one channel has TCX20 resolution, the other channel is converted to the same spectral resolution. If one or both subframes of the other channel are further split into two TCX5 "sub-subframes", these are first converted to TCX10 resolution by the same filtering and are then split again to arrive at the final TCX20 representation.
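Since the conversion formulas are not reproduced in this text, the following is only a sketch of an add-and-subtract split that matches the verbal description. It assumes averaging with a factor of 0.5, bin indices starting at 0, and a sign that alternates with the parity of the low-resolution bin index. The function name and sample values are hypothetical.

```python
def split_bins(sub0, sub1):
    """Approximate a spectrum with doubled frequency resolution.

    sub0, sub1: two consecutive low-resolution (sub)frames of equal length.
    Assumed rule (not taken from the patent formulas):
        out[2i]     = 0.5 * (sub1[i] + s * sub0[i])
        out[2i + 1] = 0.5 * (sub1[i] - s * sub0[i])
    with s = +1 for even i and s = -1 for odd i.
    """
    out = []
    for i, (a, b) in enumerate(zip(sub0, sub1)):
        sign = 1.0 if i % 2 == 0 else -1.0
        out.append(0.5 * (b + sign * a))  # addition branch
        out.append(0.5 * (b - sign * a))  # subtraction branch
    return out


# One step: two TCX10-like subframes -> one TCX20-like spectrum.
one_step = split_bins([2.0, 4.0], [6.0, 8.0])

# Two steps: two TCX5-like sub-subframes -> TCX10, then together with the
# neighbouring TCX10 subframe -> TCX20.
tcx10_from_tcx5 = split_bins([1.0], [3.0])
two_step = split_bins(tcx10_from_tcx5, [4.0, 5.0])
```

Each call doubles the number of bins and halves the number of (sub)frames, which is why a TCX5 pair needs two passes to reach TCX20 resolution.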

Even if neither channel has TCX20 resolution, a conversion to a higher resolution may still be necessary for one or both subframes if one channel has TCX10 resolution and the other has TCX5 resolution. As an example, if the left channel has TCX10 resolution in subframe A and two TCX5 "sub-subframes" in subframe B, while the right channel has two TCX5 "sub-subframes" in subframe A and TCX10 resolution in subframe B, both channels are converted so that they have TCX10 resolution in both subframes (subframe B is converted for the left channel, subframe A for the right channel). If, in the same example, the right channel also has TCX10 resolution in subframe A and two TCX5 "sub-subframes" in subframe B, no conversion is done; i.e., subframe A is downmixed at TCX10 resolution and subframe B at TCX5 resolution.
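The resolution selection described in the last two paragraphs can be summarized in a small sketch. The list-of-strings representation of a frame and the function name are hypothetical and chosen only for illustration.

```python
def common_resolutions(left, right):
    """Return the resolution used for the weight calculation.

    left, right: per-channel frame layout, either ["TCX20"] or a list with one
    entry ("TCX10" or "TCX5") per 10 ms subframe (hypothetical representation).
    """
    # If either channel uses one long block, everything is converted to TCX20.
    if left == ["TCX20"] or right == ["TCX20"]:
        return ["TCX20"]
    # Otherwise decide per subframe: equal layouts stay as they are,
    # a TCX10/TCX5 mismatch is resolved by converting to TCX10.
    return [l if l == r else "TCX10" for l, r in zip(left, right)]


mixed = common_resolutions(["TCX10", "TCX5"], ["TCX5", "TCX10"])
same = common_resolutions(["TCX10", "TCX5"], ["TCX10", "TCX5"])
with_long = common_resolutions(["TCX20"], ["TCX10", "TCX5"])
```

The three calls correspond to the two examples in the text plus the TCX20 case of the preceding paragraph.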

The MDST estimates and the final channel weights are then computed using these converted spectra. The weights themselves are applied to the original input spectra. In the case of a conversion, this means that each computed weight is applied to all bins covering the same frequency range in the original lower resolution, for every subframe.
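Applying band-wise weights that were computed at a converted resolution to spectra of the original resolution can be sketched as follows. The band edges, the `factor` parameter, and the sample values are assumptions for illustration; the sketch further assumes that every band edge is a multiple of `factor`.

```python
def apply_band_weights(spectrum, band_edges, weights, factor=1):
    """Scale the bins of `spectrum` with one weight per frequency band.

    band_edges: bin indices at the resolution where the weights were computed.
    factor: ratio between that resolution and the resolution of `spectrum`
            (1 for TCX20 weights on TCX20 data, 2 for TCX20 weights on TCX10).
    """
    out = list(spectrum)
    bands = zip(band_edges[:-1], band_edges[1:])
    for (start, stop), weight in zip(bands, weights):
        # Map the band onto the coarser bin grid of the original spectrum.
        for j in range(start // factor, stop // factor):
            out[j] *= weight
    return out


band_edges = [0, 4, 8]   # two bands, expressed in TCX20 bins (hypothetical)
weights = [0.5, 2.0]     # one weight per band

tcx20_weighted = apply_band_weights([1.0] * 8, band_edges, weights)
# The same weights are reused for every TCX10 subframe of the other channel.
tcx10_weighted = [apply_band_weights(sub, band_edges, weights, factor=2)
                  for sub in ([1.0] * 4, [1.0] * 4)]
```

The converted spectra only serve to derive the weights; the data that is scaled is always the unconverted input.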

By separating the band-wise weighting stage of the active downmix from the actual mixing stage, the new method can output a mono signal with the benefits of an active downmix, but without additional delay or complexity, and independently of the time-frequency resolution chosen for the individual channels.

It also allows additional time-domain post-processing (for example, the TCX-LTP post-filter using pitch information) to be applied to both channels without requiring a dedicated downmix of the parameters.

FIG. 5a illustrates the first alternative, in which two combined spectral domain representations are generated. The first combined spectral domain representation is computed by adding two adjacent bins of the high-resolution spectral domain representation illustrated on the left in FIG. 5a.

Furthermore, the two low-spectral-resolution representations, illustrated as TCX10 in the middle of FIG. 5a, are combined with each other to obtain the second combined spectral domain representation. The weight estimator 100 is configured to calculate the weights w_L and w_R for the left and right channels based on these two combined spectral domain representations.

As for the spectral weighting actually performed by the spectral weighting module 200, the weights for the left channel are applied to the original representation of the left channel, i.e., the TCX20 representation illustrated on the left in FIG. 5a. Furthermore, the band-wise weights for the right channel, which is represented by two time-consecutive TCX10 blocks, are applied to both TCX10 blocks: the same band-wise weight is applied to the corresponding frequency bands of the two time-consecutive TCX10 blocks illustrated in the middle of FIG. 5a.

In the second alternative, illustrated in FIG. 5b, only a single combined spectral domain representation is computed, as shown for several different cases. When, for example, a subframe in the first channel consists of two very short sub-subframes, such as TCX5 frames, and the next subframe consists of one TCX10 frame, and the second channel has, for example, two TCX10 frames, then the combined spectral domain representation is computed for the first subframe, while for the second subframe the first and second channels are already both in the TCX10 representation.

In this example, the spectral weighting module 200 is configured to apply the high-spectral-resolution weights to the corresponding frequency bands in the subframes, each of which represents five milliseconds. In addition, the high-resolution weights are applied to the corresponding original spectral domain representation of the other channel, which has a short TCX10 frame in the first subframe A.

Alternatively, the situation is such that the first channel has the representation illustrated on the left in FIG. 5b and the second channel has the representation illustrated on the right in FIG. 5b. In that case, the representation of the first channel is converted into a single combined spectral domain representation in two steps: from the left side of FIG. 5b to the middle, and from the middle of FIG. 5b to the right side. The resulting frequency resolution is used to calculate the weights, and the corresponding weights are applied to the high-frequency-resolution, low-time-resolution representation of the second channel, which has the resolution illustrated on the right in FIG. 5b. The same values for a frequency band are applied to all individual subframes A and B and to the next subframe, illustrated as D and C in FIG. 5b.

FIG. 5c illustrates another alternative, in which the actual spectral domain weights are computed on the basis of a representation with low frequency resolution and high time resolution. The first channel has, for example, a TCX20 representation, and the second channel has, for example, a sequence of two TCX10 representations. In contrast to the alternative illustrated in FIG. 5b, the combined representation is now the representation with high time resolution and low frequency resolution illustrated in the upper right corner of FIG. 5c. The spectral domain weights are computed from the combined representation on the one hand and from the original spectral domain representation of the second channel, illustrated in the lower left corner of FIG. 5c, on the other hand.

Two sets of band-wise weights are obtained, i.e., one for each subframe. These values are applied to the corresponding subframes of the second channel. However, because the first channel has only a single spectral domain representation for the whole frame, derived spectral domain weights are calculated, as illustrated at step 136 in FIG. 3a. One procedure for calculating a derived spectral domain weight is to perform a weighted addition of the corresponding weights of the same frequency band for the two (or more) subframes, where each weight is weighted in the addition by, for example, a factor of 0.5, which results in an averaging operation. Another alternative is to calculate the arithmetic or geometric mean of the weights for the two subframes, or to use any other procedure that obtains a single weight from the two weights for a frequency band in the frame. A further option is simply to select one of the two values and discard the other, and so on.
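The options named above for deriving a single per-band weight from the two subframe weights can be sketched as follows. The function name, the `method` labels, and the sample weights are hypothetical; the geometric mean assumes non-negative weights.

```python
import math


def derive_weights(w_sub0, w_sub1, method="mean"):
    """Derive one weight per band from the weights of two subframes."""
    if method == "mean":
        # Weighted addition with a factor of 0.5 each, i.e. an average.
        return [0.5 * a + 0.5 * b for a, b in zip(w_sub0, w_sub1)]
    if method == "geometric":
        # Geometric mean; assumes both weights are non-negative.
        return [math.sqrt(a * b) for a, b in zip(w_sub0, w_sub1)]
    if method == "first":
        # Keep the weights of the first subframe and discard the others.
        return list(w_sub0)
    raise ValueError(f"unknown method: {method}")


w_a = [0.5, 1.0]   # band-wise weights of the first subframe (hypothetical)
w_b = [1.5, 1.0]   # band-wise weights of the second subframe (hypothetical)

averaged = derive_weights(w_a, w_b, "mean")
geometric = derive_weights(w_a, w_b, "geometric")
selected = derive_weights(w_a, w_b, "first")
```

The derived set is what would be applied to the single TCX20 spectrum of the first channel, while the second channel keeps its own per-subframe weights.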

Furthermore, to calculate the combined spectral domain representation from the first channel, the procedure described with reference to FIG. 5a can be used, i.e., two adjacent spectral values can be added together in order to reduce the spectral resolution. This is also illustrated in FIG. 4b, where a high spectral resolution with a certain number of spectral values in a frequency band can be reduced to a medium spectral resolution with a lower number of spectral values in the same frequency band. Furthermore, in order to double the spectral values for the two subframes illustrated in the upper right corner of FIG. 5c, one can, for example, use the same (low-spectral-resolution) spectral values for a frequency band in both subframes, or one can perform some weighted decimation using earlier or later values, depending on the circumstances.

FIG. 5d illustrates a further implementation, in which the first channel has a representation with high frequency resolution and low time resolution, for example a TCX20 representation, and the second channel has a representation with low frequency resolution and high time resolution, for example a sequence of two short frames such as two TCX10 frames. The first combined spectral domain representation is a representation with high frequency resolution and high time resolution, and the second combined spectral domain representation is likewise a representation with high frequency resolution and high time resolution. The procedure illustrated in FIG. 5d can, for example, be performed such that, based on the first channel, the first combined spectral domain representation is calculated with the same spectral values, but now for two consecutive time frames, illustrated as TCX10. Alternatively, some interpolation procedure or the like can be performed in order to double the number of frames, i.e., to calculate two consecutive TCX10 frames from one TCX20 frame. The second channel, in turn, already has the correct time resolution, but its frequency resolution has to be doubled. To this end, the procedure from the lower line to the upper line of FIG. 4b can be performed, i.e., the spectral value in a frequency bin of the TCX10 representation can be processed so that a pair of frequency bins has the same spectral value. Some weighting can be applied in order to obtain the correct energy. Alternatively or additionally, some more advanced interpolation can be performed so that adjacent frequency bins in the second combined spectral domain representation do not necessarily have the same spectral value, but different values.

The spectral domain weights are calculated by the weight estimator 100 based on the first combined spectral domain representation and the second combined spectral domain representation, both of which are derived as data with high time resolution and high frequency resolution.

The spectral weighting module 200 is configured to apply the corresponding spectral domain weights to the second channel, where a set of band-wise weights exists for each subframe. To weight the TCX20 data of the first channel, the weight estimator 100 is configured to again calculate derived band-wise weights 136, since only a single set of spectral domain weights is required to weight the high-frequency-resolution, low-time-resolution spectral domain representation (TCX20) of the first channel. The combining procedure used to calculate the derived band-wise weights can, for example, be an averaging.

FIG. 6 illustrates a further aspect of the invention, i.e., an apparatus for converting the spectral resolution of a spectral domain representation of a channel comprising at least two subframes, where each subframe comprises a plurality of spectral values representing a time bin size and a frequency bin size. The spectral value calculation module 160 included in the conversion apparatus according to the second aspect comprises a combining module 170 for the first method and a combining module 180 for the second method. Preferably, the combining module for the first method acts as a low-pass processor, and the combining module for the second method acts as a high-pass processor. By means of the combining module 170 for the first method, the spectral value calculation module combines spectral values belonging to the same frequency bin from each subframe of the spectral domain representation to obtain a first group of combined spectral values. The combining module 180 for the second method combines spectral values belonging to the same frequency bin from each subframe of the spectral domain representation in the second method to obtain a second group of combined spectral values. The second method differs from the first method, and the first group of combined spectral values and the second group of combined spectral values together represent a combined spectral domain representation having a different time bin size and a different frequency bin size. A preferred implementation of this calculation is described and illustrated with respect to FIG. 5b, where, in one illustration, the sequence of A₂, A₁ and B₂, B₁ is converted into a representation with high spectral resolution but now with low time resolution, as illustrated by F₂, E₂ on the one hand and F₁, E₁ on the other hand.

Alternatively, FIG. 5b also illustrates the situation in which the at least two subframes are shown in the middle diagram of FIG. 5b as two time-consecutive subframes of 10 ms each, and the representation with high spectral resolution and low time resolution is shown on the right in FIG. 5b. Preferably, the first method comprises an addition and the second method comprises a subtraction. Furthermore, both procedures preferably also comprise an averaging function. Moreover, the spectral value calculation module 160 of FIG. 6 is configured to apply either the first method or the second method including a weighting that uses a weighting sign, where the spectral value calculation module is configured to set the weighting sign according to the frequency bin number of the same frequency bin. Furthermore, the spectral value calculation module, as illustrated in FIG. 5b, is configured to convert a low-resolution bin into two higher-resolution bins, where the first method is used for an even bin number and the second method is used for an odd bin number.

FIG. 7 illustrates a further implementation of the apparatus for converting a spectral resolution. In addition to the spectral resolution combining module 160, the apparatus for converting a spectral resolution can comprise further elements. Further elements are, for example, a spectral processor 500, and/or a processing data calculation module 190, and/or a further spectral processor 220. In the implementation with the spectral processor 500, the converted spectral domain representation, which has been converted without inverse and forward transform operations and has therefore been generated with low computational resources and low delay, can be further processed on its own or, for example, together with another spectral representation that has the same second spectral resolution. This can be done, for example, for some kind of downmix. The representation with high frequency resolution and low time resolution illustrated on the right in FIG. 5b can thus not only be used to calculate processing data, but can actually be processed further for an additional or alternative use, such as a downmix or any kind of audio rendering in a further processing stage.

On the other hand, in the procedure described above with reference to FIG. 1 and FIG. 5b, the spectral domain representation with the second spectral resolution, i.e. the "combined spectral domain representation", is used only to calculate processing data, such as weight values for the left and right channels or, more generally, for the first and second channels of a multi-channel signal. The spectral domain representation that has been converted to the high spectral resolution serves only this calculation and is not itself processed further. Instead, the original input spectral domain representation with the first spectral resolution is spectrally processed using the processing data, such as the weight values, as illustrated by block 220. For this purpose, a further spectral domain representation with the first resolution is preferably used as well, for example for a downmix operation taking place in the spectral domain.

FIG. 8 illustrates an embodiment of the third aspect of the present invention, operating as a downmixer for downmixing a multi-channel signal having at least two channels. The downmixer comprises a weight estimation module 100 for estimating band-wise weight values for the at least two channels, the weight estimation module being configured to calculate the band-wise weight values based on a target energy value per frequency band, so that the energy in a frequency band of the downmix signal is in a predetermined relation to the energies in the same frequency band of the two channels. Preferably, the weight estimation module 100 is implemented as illustrated in FIG. 3b and as described in the context of FIG. 3b. The downmixer further comprises a spectral weighting module 200 and a subsequently connected mixer 400 for calculating the downmix signal using the weighted spectral domain representations of the at least two channels.
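The target-energy idea can be sketched in a few lines of Python. This is only a simplified illustration and not the estimator of FIG. 3b: it assumes a single gain shared by both channels per band, a target energy of `ratio * (E_L + E_R)`, and a hypothetical `max_gain` clamp for bands in which the two channels nearly cancel. All function and parameter names are illustrative.

```python
import math


def band_energy(spectrum, lo, hi):
    """Energy of the spectral values in bins lo..hi-1."""
    return sum(abs(x) ** 2 for x in spectrum[lo:hi])


def estimate_band_weights(left, right, band_edges, ratio=0.5, max_gain=4.0):
    """Return one (w_L, w_R) pair per band.

    Assumption of this sketch: both channels share one gain per band,
    chosen so that the energy of w * (L + R) equals
    ratio * (E_L + E_R) in that band.
    """
    weights = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        target = ratio * (band_energy(left, lo, hi) + band_energy(right, lo, hi))
        # Energy of the plain (passive) sum of both channels in this band.
        passive = sum(abs(l + r) ** 2 for l, r in zip(left[lo:hi], right[lo:hi]))
        if passive > 0.0:
            gain = min(math.sqrt(target / passive), max_gain)
        else:
            # Fully cancelled band: a shared gain cannot restore any energy.
            gain = max_gain
        weights.append((gain, gain))
    return weights


# Example: two bands of two bins each.
left = [1.0, 1.0, 1.0, 0.0]
right = [1.0, 1.0, 0.0, 1.0]
band_edges = [0, 2, 4]
weights = estimate_band_weights(left, right, band_edges)
```

With the default `ratio=0.5`, the weighted sum of the two channels carries, in every band, the mean of the two channel energies, unless the clamp becomes active.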

FIG. 9 illustrates a further implementation of the downmixer shown in FIG. 8. The spectral weighting module 200 is preferably configured to receive control data for the first and/or the second channel. Furthermore, the spectral weighting module is configured to apply the control data to one of four different pairs of input data. The first pair may be the spectral domain representation of the first channel and the spectral domain representation of the second channel, as illustrated on the left in FIG. 9. The second alternative may be the spectral domain representation of the first channel and a combined spectral domain representation derived, for example, as described with reference to FIGS. 5b and 5c. A further alternative may be the pair consisting of the spectral domain representation of the second channel and one combined spectral domain representation, likewise as described above with reference to FIGS. 5b and 5c. In yet another alternative, the spectral weighting module 200 applies the spectral weights to a first combined spectral domain representation and a second combined spectral domain representation, as illustrated with respect to FIG. 5a or 5d. The control data for the first and/or second channel may be, for example, the weight values w_L on the one hand and w_R on the other hand, but may also be any other control data used to perform any kind of spectral weighting.
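As a minimal sketch of what such a spectral weighting could look like, the following Python snippet applies one weight per band to every bin of that band. It is agnostic to which of the four input pairs is supplied. The function names, the `band_edges` layout and the sample values are assumptions made for illustration only.

```python
def apply_band_weights(spectrum, band_weights, band_edges):
    """Scale every bin of a band by that band's weight.

    band_edges[i]..band_edges[i+1]-1 are the bins of band i (assumed layout).
    """
    if len(band_weights) != len(band_edges) - 1:
        raise ValueError("need exactly one weight per band")
    weighted = list(spectrum)
    for w, lo, hi in zip(band_weights, band_edges[:-1], band_edges[1:]):
        weighted[lo:hi] = [w * x for x in spectrum[lo:hi]]
    return weighted


def weight_pair(first, second, w_first, w_second, band_edges):
    """Weight a pair of spectral representations with separate control data."""
    return (
        apply_band_weights(first, w_first, band_edges),
        apply_band_weights(second, w_second, band_edges),
    )


# Example with hypothetical weights w_L and w_R for two bands of two bins.
w_L = [0.5, 1.0]
w_R = [2.0, 0.0]
weighted_first, weighted_second = weight_pair(
    [1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], w_L, w_R, [0, 2, 4]
)
```

The inputs are left unmodified, so the same original representation can be reused for another processing branch.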

A further element of the downmixer in this embodiment is a summation module 480, which calculates a summed spectral domain representation, i.e. a spectral domain representation of the downmix. A mono signal processor 490 may be used, which is, for example, controlled by any data or implemented as a frequency-to-time converter, as described above with respect to block 310 in FIG. 1 or FIG. 2.
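The relation between summing in the spectral domain and converting afterwards can be illustrated with a small Python sketch. A naive inverse DFT is used here purely as a stand-in for the frequency-to-time conversion; it is an assumption of this example and not the transform of the embodiment. Because such a transform is linear, summing the spectra first and converting once gives the same samples as converting each channel and adding sample by sample, provided no per-channel processing is inserted in between.

```python
import cmath


def add_spectra(a, b):
    """Bin-wise sum of two spectral representations of equal length."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bins")
    return [x + y for x, y in zip(a, b)]


def inverse_dft(spectrum):
    """Naive inverse DFT, used only as a stand-in frequency-to-time converter."""
    n = len(spectrum)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n) for k, x in enumerate(spectrum)) / n
        for t in range(n)
    ]


# Two already weighted spectra (illustrative values).
weighted_a = [1 + 0j, 2 - 1j, 0j, 0.5j]
weighted_b = [0.5 + 0j, -1 + 1j, 1 + 0j, 0j]

# Path 1: sum in the spectral domain, then one conversion.
mono_from_sum = inverse_dft(add_spectra(weighted_a, weighted_b))

# Path 2: convert each channel, then add sample by sample.
mono_from_time = [
    x + y for x, y in zip(inverse_dft(weighted_a), inverse_dft(weighted_b))
]
```

Path 1 needs a single conversion. Path 2 needs one conversion per channel but leaves room for channel-specific time-domain processing before the addition.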

It should be emphasized that the three aspects can be used separately from each other but can also be advantageously combined. In particular, the implementation of the weight estimation module according to FIG. 8 can be applied in the weight estimation module 100 of the first aspect illustrated in FIG. 1. Furthermore, the spectral resolution converter illustrated in FIG. 6 is preferably implemented by the weight estimation module 100 of FIG. 1, in the alternative illustrated in FIG. 5b, which forms a spectral domain representation with high frequency resolution and low time resolution from two subframes with high time resolution and low spectral resolution. In addition, the functionality of the first aspect illustrated in FIG. 1, in particular the calculation of the processing data, can be implemented by the processing data calculation module 190 and the further spectral processor 220 illustrated in FIG. 7. The mixer 400 of the third aspect can be implemented as an alternative to FIG. 9, such that the mixer 400 for calculating the downmix signal applies the functionality of the conversion module 300 illustrated in FIG. 1 before the actual sample-by-sample addition is performed in the time domain. Thus, all embodiments defined in a dependent claim for one of the three aspects can also be applied to either of the other two aspects as defined in the corresponding dependent claim.

It thus becomes clear that, depending on the implementation, the three aspects can be applied separately or combined with each other, either by combining any two of the three aspects or by combining all three.

Further examples of aspects of the present invention are given below.

1. A downmixer for downmixing a multi-channel signal having at least two channels, comprising:

a weight estimation module (100) for estimating band-wise weight values for the at least two channels;

a spectral weighting module (200) for weighting spectral domain representations of the at least two channels using the band-wise weight values;

a conversion module (300) for converting the weighted spectral domain representations of the at least two channels into time representations of the at least two channels; and

a mixer (400) for mixing the time representations of the at least two channels to obtain a downmix signal.

2. The downmixer of example 1, wherein the weight estimation module (100) is configured to calculate a first plurality of band-wise weight values for a plurality of frequency bands of a first channel based on the at least two channels, and to calculate a second plurality of band-wise weight values for a plurality of frequency bands of a second channel based on the at least two channels, or

wherein the multi-channel signal has more than two channels, and wherein the weight estimation module (100) is configured to calculate a first plurality of band-wise weight values for a plurality of frequency bands of a first channel based on the more than two channels, to calculate a second plurality of band-wise weight values for a plurality of frequency bands of a second channel based on the more than two channels, and to calculate a further plurality of band-wise weight values for a plurality of frequency bands of a further channel based on the more than two channels.

3. The downmixer of example 1 or 2,

wherein each of the spectral domain representations of the at least two channels comprises a set of frequency bins, spectral values being associated with the frequency bins,

wherein the weight estimation module (100) is configured to calculate the band-wise weight values for frequency bands, each frequency band comprising one, two or more frequency bins, or

wherein the number of frequency bins per frequency band increases as the center frequency of the frequency bands increases.

4. The downmixer of one of the preceding examples,

wherein the weight estimation module (100) is configured to calculate the band-wise weight values based on a target energy value per frequency band, so that the energy in a frequency band of the downmix signal is in a predetermined relation to the energies in the same frequency band of the at least two channels.

5. The downmixer of one of the preceding examples, further comprising:

a core decoder (500) for decoding an encoded signal, the encoded signal having encoded spectral domain representations of at least two original channels, wherein the core decoder is configured to generate the spectral domain representations from the encoded spectral domain representations.

6. The downmixer of one of the preceding examples,

wherein the spectral domain representations are either purely real or purely imaginary,

wherein the weight estimation module (100) is configured to estimate (120, 122) an imaginary spectral domain representation when the spectral domain representation is purely real, or to estimate a real spectral domain representation when the spectral domain representation is purely imaginary, and

wherein the weight estimation module (100) is configured to estimate the band-wise weight values using the estimated imaginary spectral domain representation or the estimated real spectral domain representation.

7. The downmixer of one of the preceding examples, wherein the weight estimation module (100) is configured to calculate a first weight value for a frequency band of the first channel based on the at least two channels,

wherein the weight estimation module (100) is configured to calculate a second weight value for the frequency band of the second channel based on the at least two channels, and

wherein the weight estimation module (100) is configured to calculate the first weight value and the second weight value using an energy of the first channel in the frequency band, an energy of the second channel in the frequency band, and a mixed component depending on a product or a linear combination of spectral values of the at least two channels in the frequency band.

8. The downmixer of one of the preceding examples,

wherein the weight estimation module (100) is configured to calculate, as the mixed component representing the linear combination, the square root of the energy of the spectral values, added to each other, in the frequency band from the spectral domain representations of the at least two channels, the frequency band comprising a plurality of spectral values, or to calculate, as the mixed component representing the product, the absolute value of the complex dot product between the spectral values in the frequency band of the first channel and the spectral values in the frequency band of the second channel of the at least two channels.

9. The downmixer of one of the preceding examples,

wherein each frequency band of the first and second channels of the at least two channels has a plurality of spectral values, and wherein the spectral weighting module (200) is configured to apply the same weight to each spectral value in the frequency band of one of the at least two channels and to apply another weight to each spectral value in the frequency band of another channel of the at least two channels.

10. The downmixer of one of the preceding examples,

wherein the weighted spectral domain representations are MDCT (modified discrete cosine transform) spectra, and

wherein the conversion module (300) is configured to perform, for each channel of the plurality of channels, an inverse MDCT using a synthesis windowing operation and an overlap-add operation.

11. The downmixer of one of the preceding examples,

wherein the mixer (400) is configured to apply a sample-by-sample addition of the time representations of the at least two channels, or

wherein the mixer (400) is configured to apply a sample-by-sample addition of the time representations of the at least two channels and a scaling operation applied to the result of the sample-by-sample addition or to the inputs of the sample-by-sample addition.

12. The downmixer of one of the preceding examples,

wherein the conversion module (300) is configured to generate (310) raw time representations using a spectrum-to-time conversion algorithm, and

to post-process (320) the raw time representations individually, in the signal processing direction before the mixing by the mixer (400), using separate control information for the channels, to obtain the time representations.

13. The downmixer of example 12,

wherein the conversion module (300) is configured to perform, as the post-processing (320), low-frequency post-filtering, TCX-LTP (transform coded excitation long-term prediction) processing, or LPC (linear prediction coding) synthesis, individually for each time representation.

14. The downmixer of one of the preceding examples,

wherein a first spectral domain representation of a first channel of the at least two channels has a first time or frequency resolution,

wherein a second spectral domain representation of a second channel of the at least two channels has a second time or frequency resolution, the second time or frequency resolution being different from the first time or frequency resolution, and

wherein the weight estimation module (100) is configured to calculate the band-wise weight values such that the frequency resolution of the frequency bands associated with the band-wise weight values is lower than the first frequency resolution and the second frequency resolution, or is equal to the lower of the first and second frequency resolutions.

15. The downmixer of one of the preceding examples,

wherein the first spectral domain representation has a first plurality of spectral values in a frequency band,

wherein the second spectral domain representation has a second plurality of spectral values in the frequency band, the second plurality being greater than the first plurality, and

wherein the weight estimation module (100) is configured to

combine two or more spectral values of the second plurality of spectral values, or select a subset of spectral values from the second plurality of spectral values,

calculate a mixed component depending on products or linear combinations of spectral values of the at least two channels in the frequency band, using the result of combining the two or more spectral values or using the subset of spectral values, and

calculate the band-wise weight values using the mixed component.

16. The downmixer of one of the preceding examples,

wherein the first spectral domain representation comprises a plurality of first spectral values representing a first time bin size and a first frequency bin size,

wherein the second spectral domain representation comprises a plurality of spectral values representing a second time bin size and a second frequency bin size,

wherein the first time bin size is greater than the second time bin size, or wherein the first frequency bin size is smaller than the second frequency bin size,

wherein the weight estimation module (100) is configured to combine a plurality of spectral values from the first spectral domain representation to obtain a first combined spectral domain representation, the combined frequency bin size being equal to the second frequency bin size, or to combine a plurality of spectral values from the second spectral domain representation to obtain a first combined spectral domain representation, the combined time bin size being equal to the first time bin size.

17. Downmixer according to Example 16,

wherein the weight estimator (100) is configured to use the first combined spectral domain representation or the second combined spectral domain representation to calculate the band-wise weight values for the first channel and the second channel of the at least two channels, the calculation comprising calculating a mixed component in the frequency bands and calculating energies in the frequency bands, and

wherein the spectral weighting module (200) is configured to apply the band-wise weight values for the first channel of the at least two channels to the spectral values of the first spectral domain representation in the corresponding frequency bands, and to apply the band-wise weight values for the second channel of the at least two channels to the spectral values of the second spectral domain representation in the corresponding frequency bands.
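The per-band quantities named in Examples 15-17 can be illustrated with a minimal Python sketch. It is not the patented weight formula: the function names `band_statistics` and `apply_band_weights`, the `band_edges` layout, and the choice of the real part of the cross product as the mixed component are all assumptions made for illustration. The weights themselves are taken as given inputs.

```python
import numpy as np

def band_statistics(spec_a, spec_b, band_edges):
    """Return (energy_a, energy_b, mixed) for each frequency band.

    Assumption: the mixed component is the real part of the sum of
    products a * conj(b) over the bins of the band.
    """
    a = np.asarray(spec_a)
    b = np.asarray(spec_b)
    stats = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        energy_a = float(np.sum(np.abs(a[lo:hi]) ** 2))
        energy_b = float(np.sum(np.abs(b[lo:hi]) ** 2))
        mixed = float(np.real(np.sum(a[lo:hi] * np.conj(b[lo:hi]))))
        stats.append((energy_a, energy_b, mixed))
    return stats

def apply_band_weights(spec, weights, band_edges):
    """Scale every bin of a band by that band's weight (one weight per band)."""
    spec = np.asarray(spec)
    out = spec.astype(np.result_type(spec.dtype, np.float64))
    for w, lo, hi in zip(weights, band_edges[:-1], band_edges[1:]):
        out[lo:hi] *= w
    return out
```

With `band_edges = [0, 2, 4]`, bins 0-1 form the first band and bins 2-3 the second; each channel would be passed through `apply_band_weights` with its own weight list.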

18. Downmixer according to one of Examples 1-15,

wherein the first spectral domain representation of the first channel comprises a plurality of first spectral values representing a first time bin size and a first frequency bin size,

wherein the second spectral domain representation of the second channel comprises at least two subframes, each subframe comprising a plurality of spectral values representing a second time bin size and a second frequency bin size,

combining, by a first method, the spectral values belonging to the same frequency bin from each subframe of the second spectral domain representation to obtain a first group of combined spectral values, and

combining, by a second method, the spectral values belonging to the same frequency bin from each subframe of the second spectral domain representation to obtain a second group of combined spectral values, the second method being different from the first method,

wherein the first group of combined spectral values and the second group of combined spectral values represent a combined spectral domain representation having the first time bin size and the first frequency bin size, and

using the spectral values of the combined spectral domain representation and of the first spectral domain representation to calculate the band-wise weight values.

19. Downmixer according to Example 18,

wherein the weight estimator (100) is configured to perform one of an addition and a subtraction in the first method and the other of the addition and the subtraction in the second method.

20. Downmixer according to Example 18 or 19, wherein the weight estimator (100) is configured to perform an averaging function in the first method and in the second method.

21. Downmixer according to one of Examples 18-20, wherein the weight estimator (100) is configured to apply either the first method or the second method with a weighting that uses a weighting sign, and wherein the weight estimator (100) is configured to set the weighting sign according to the frequency bin number of the same frequency bin.

21. Downmixer according to one of Examples 18-21, wherein the weight estimator (100) is configured to apply, as the first method, one of a high-pass filtering and a low-pass filtering and, as the second method, the other of the high-pass filtering and the low-pass filtering.

22. Downmixer according to one of Examples 18-22, wherein the weight estimator (100) is configured to convert a lower-resolution bin into two higher-resolution bins, wherein the first method is used for the even bin number of a first higher-resolution bin of the two higher-resolution bins, and the second method is used for the odd bin number of a second higher-resolution bin of the two higher-resolution bins.
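The combination described in Examples 18-22 can be sketched as follows. This is only an illustration under stated assumptions: the helper name `combine_subframes`, the factor 0.5 used for averaging, the assignment of addition to even bins and subtraction to odd bins, and the alternating sign pattern tied to the bin number are hypothetical choices, not values confirmed by the text above.

```python
import numpy as np

def combine_subframes(sub_k0, sub_k1):
    """Merge two short subframes into one frame with twice as many bins.

    Assumed first method: averaged sum, written to even output bins
    (acts like a low-pass across the two subframes).
    Assumed second method: averaged difference with a sign that depends
    on the bin number, written to odd output bins (high-pass-like).
    """
    k0 = np.asarray(sub_k0, dtype=float)
    k1 = np.asarray(sub_k1, dtype=float)
    if k0.shape != k1.shape:
        raise ValueError("subframes must have the same number of bins")
    bin_index = np.arange(k0.size)
    sign = np.where(bin_index % 2 == 0, 1.0, -1.0)  # assumed sign pattern
    combined = np.empty(2 * k0.size)
    combined[0::2] = 0.5 * (k1 + k0)
    combined[1::2] = 0.5 * sign * (k1 - k0)
    return combined
```

Each lower-resolution bin `i` thus yields the two higher-resolution bins `2i` and `2i + 1`, which matches the even/odd split described in Example 22.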

23. Downmixer according to one of Examples 18-22,

wherein the first spectral domain representation of the first channel comprises a TCX20 frame and the second spectral domain representation of the second channel comprises two TCX10 subframes, and wherein the weight estimator (100) is configured to calculate a combined TCX20 spectral domain representation from the two TCX10 subframes, or

wherein the first spectral domain representation of the first channel comprises a TCX20 frame and the second spectral domain representation of the second channel comprises a TCX10 subframe and two TCX5 subframes, and wherein the weight estimator (100) is configured to calculate a first combined TCX10 spectral domain representation from the two TCX5 subframes and to calculate a second combined TCX20 subframe from the first combined TCX10 spectral domain representation and the TCX10 subframe, or

wherein the first spectral domain representation of the first channel comprises a TCX10 subframe and the second spectral domain representation of the second channel comprises two TCX5 subframes, and wherein the weight estimator (100) is configured to calculate a combined TCX10 spectral domain representation from the two TCX5 subframes,

wherein the expression TCX20 denotes a first portion with a first duration, the expression TCX10 denotes a second portion with a second duration, and the expression TCX5 denotes a third portion with a third duration, wherein the first duration is greater than the second duration or the third duration, the second duration is less than the first duration or greater than the third duration, and the third duration is less than the first duration or less than the second duration.
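The second alternative of Example 23 (one TCX10 subframe plus two TCX5 subframes) amounts to applying a pairwise combination twice. The sketch below shows that nesting only; `combine_pair` uses the same assumed sum/difference rule with an assumed 0.5 factor and sign pattern, and the parameter `tcx10_is_first`, which fixes the temporal order of the subframes, is a hypothetical addition.

```python
import numpy as np

def combine_pair(sub_k0, sub_k1):
    """Assumed rule: averaged sum to even bins, signed averaged difference to odd bins."""
    k0 = np.asarray(sub_k0, dtype=float)
    k1 = np.asarray(sub_k1, dtype=float)
    sign = np.where(np.arange(k0.size) % 2 == 0, 1.0, -1.0)
    combined = np.empty(2 * k0.size)
    combined[0::2] = 0.5 * (k1 + k0)
    combined[1::2] = 0.5 * sign * (k1 - k0)
    return combined

def combine_to_longest(tcx10, tcx5_first, tcx5_second, tcx10_is_first=True):
    """Step 1: two TCX5 subframes -> one TCX10-resolution representation.
    Step 2: that result plus the real TCX10 subframe -> TCX20 resolution."""
    merged_tcx10 = combine_pair(tcx5_first, tcx5_second)
    if tcx10_is_first:
        return combine_pair(tcx10, merged_tcx10)
    return combine_pair(merged_tcx10, tcx10)
```

The result has four times as many bins as one TCX5 subframe, so it can be compared bin by bin with a TCX20 frame of the other channel.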

24. Downmixer according to one of Examples 18-23, wherein the weight estimator (100) is configured to apply the first method based on the following equation:

[equation missing in source]

or

wherein the weight estimator (100) is configured to apply the second method based on the following equation:

[equation missing in source]

where [symbol missing in source] denotes the spectral bin number, k₀ and k₁ denote the subframes of the second spectral domain representation of the second channel, and

where [symbols missing in source] denote the spectral values of the combined spectral domain representation, and [symbols missing in source] denote the spectral values from the second subframe k₁ and the first subframe k₀, respectively.

25. Downmixer according to Example 1,

wherein the first spectral domain representation of the first channel of the at least two channels has a first time resolution or a first frequency resolution, and the second spectral domain representation of the second channel of the at least two channels has a second time resolution or a second frequency resolution, wherein the second time resolution is different from the first time resolution, or the second frequency resolution is different from the first frequency resolution, and

wherein the weight estimator (100) is configured to convert (132) the first spectral domain representation into a combined spectral domain representation having the second time resolution or the second frequency resolution, and to calculate the band-wise weight values using the combined spectral domain representation and the second spectral domain representation, or to convert the second spectral domain representation into a combined spectral domain representation having the first time resolution or the first frequency resolution, and to calculate the band-wise weight values using the combined spectral domain representation and the first spectral domain representation, or

to convert (132) the first spectral domain representation into a first combined spectral domain representation having a third time resolution or a third frequency resolution,

wherein the third time resolution is different from the first time resolution or the second time resolution, and the third frequency resolution is different from the first frequency resolution or the second frequency resolution,

to convert (132) the second spectral domain representation into a second combined spectral domain representation having the third time resolution or the third frequency resolution, and

to calculate (134) the band-wise weight values using the first combined spectral domain representation and the second combined spectral domain representation.

26. Downmixer according to Example 25,

wherein the second channel comprises, for a certain time portion (TCX20), the second spectral domain representation,

wherein the first channel comprises, for the certain time portion (2xTCX10), two or more of the first spectral domain representations,

wherein the weight estimator (100) is configured to convert the two or more first spectral domain representations into a combined spectral domain representation having the same time and frequency resolution as the second spectral domain representation, and to calculate the band-wise weight values using the combined spectral domain representation and the second spectral domain representation, and

wherein the spectral weighting module (200) is configured to weight the second spectral domain representation using the band-wise weight values, and to weight each of the two or more first spectral domain representations using the same band-wise weight values.

27. Downmixer according to Example 26,

wherein the weight estimator (100) is configured to add the spectral values for the same frequency of the two or more first spectral domain representations to obtain a first spectral value of the combined spectral domain representation, and to subtract the spectral values for the same frequency of the two or more first spectral domain representations to obtain a second spectral value of the combined spectral domain representation, which is higher than and adjacent in frequency to the first spectral value of the combined spectral domain representation, and

wherein the spectral weighting module (200) is configured to weight a frequency band having the same frequencies in each of the two or more first spectral domain representations using the same band-wise weight value.
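The weighting step of Examples 26 and 27, where weights estimated at the long-frame resolution are reused for every short subframe, might look like the sketch below. The function name, the `band_edges_combined` layout, and the rule that a band border at the combined resolution maps to the subframe by integer division by the number of subframes are assumptions; borders are assumed to be multiples of that number.

```python
import numpy as np

def weight_subframes_with_shared_weights(subframes, band_weights, band_edges_combined):
    """Apply one set of band-wise weights to every subframe.

    band_edges_combined holds band borders in bins of the combined
    (long-frame) resolution; each border is divided by the number of
    subframes to find the matching border inside a short subframe.
    """
    weighted = [np.array(s, dtype=float) for s in subframes]
    n_sub = len(weighted)
    edges = list(band_edges_combined)
    for w, lo, hi in zip(band_weights, edges[:-1], edges[1:]):
        lo_s, hi_s = lo // n_sub, hi // n_sub
        for sub in weighted:
            sub[lo_s:hi_s] *= w  # same weight for the same frequencies in each subframe
    return weighted
```

For two subframes of four bins each and combined band borders `[0, 4, 8]`, the first weight scales bins 0-1 and the second weight scales bins 2-3 of both subframes.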

28. Downmixer according to Example 25,

wherein the first channel comprises, for a certain time portion (2xTCX10), two or more first spectral domain representations,

converting the second spectral domain representation into two or more combined spectral domain representations having the same time and frequency resolution as the two or more first spectral domain representations,

calculating first band-wise weight values using a first combined spectral domain representation of the two or more combined spectral domain representations and a first one of the two or more first spectral domain representations,

calculating second band-wise weight values using a second combined spectral domain representation of the two or more combined spectral domain representations and a second one of the two or more first spectral domain representations, and

wherein the spectral weighting module (200) is configured for

weighting the second spectral domain representation using derived band-wise weight values derived (136) from the first and second band-wise weight values,

weighting the first one of the two or more first spectral domain representations using the first band-wise weight values, and

weighting the second one of the two or more first spectral domain representations using the second band-wise weight values.

29. Downmixer according to Example 28,

wherein the weight estimator (100) is configured to add the spectral values for pairs of frequencies of the second spectral domain representation to obtain an added spectral value, and to copy the added spectral value to obtain a combined spectral value for each of the two or more combined spectral domain representations, and

wherein the spectral weighting module (200) is configured to combine (136) the weight value for a certain frequency band of the first band-wise weight values with the weight value for the certain frequency band of the second band-wise weight values to obtain the derived weight value for the certain frequency band of the derived band-wise weight values.
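The two operations of Example 29 can be sketched as below. The helper names are hypothetical, and the arithmetic mean used to combine the first and second band-wise weights is an assumption: the text only states that the two weights are combined (136), not how.

```python
import numpy as np

def pairwise_sum_and_copy(spec_long, num_copies=2):
    """Add each pair of adjacent-frequency values of the long frame, then
    copy the result once per short subframe of the other channel."""
    x = np.asarray(spec_long, dtype=float)
    if x.size % 2:
        raise ValueError("expected an even number of bins")
    added = x[0::2] + x[1::2]
    return [added.copy() for _ in range(num_copies)]

def derive_band_weights(first_weights, second_weights):
    """Assumed combination rule: arithmetic mean of the two weights per band."""
    w1 = np.asarray(first_weights, dtype=float)
    w2 = np.asarray(second_weights, dtype=float)
    return 0.5 * (w1 + w2)
```

The copies returned by `pairwise_sum_and_copy` have the bin count of a short subframe, so each can be paired with one of the first representations when estimating the first and second weights.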

30. Downmixer according to Example 25,

converting the second spectral domain representation into two or more combined spectral domain representations having the same time resolution as the two or more first spectral domain representations and the same frequency resolution as the second spectral domain representation,

calculating first band-wise weight values using a first combined spectral domain representation of the two or more combined spectral domain representations and a first one of the two or more first spectral domain representations,

weighting the second spectral domain representation using band-wise weight values derived (136) from the first and second band-wise weight values,

weighting the first one of the two or more first spectral domain representations using the first band-wise weight values, and

weighting a second one of the two or more first spectral domain representations using the second band-wise weight values.

31. Downmixer according to Example 30,

wherein the weight estimator (100) is configured to upsample one or more spectral values to obtain upsampled spectral values for adjacent frequencies of the second spectral domain representation, and to copy the upsampled spectral values to obtain combined spectral values for each of the two or more combined spectral domain representations, and

32. Downmixer according to Example 25,

converting the two or more first spectral domain representations into a first combined spectral domain representation having the same time resolution as the second spectral domain representation,

converting the second spectral domain representation into a second combined spectral domain representation having the same frequency resolution as the two or more first spectral domain representations, and

calculating the band-wise weight values using the first combined spectral domain representation and the second combined spectral domain representation, and

33. Downmixer according to Example 32,

wherein the weight estimator (100) is configured to add the spectral values for pairs of frequencies of the second spectral domain representation to obtain the second combined spectral domain representation, and to add the spectral values of the same frequency from two or more of the first spectral domain representations to obtain the first combined spectral domain representation, and
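A minimal sketch of the conversion in Examples 32 and 33, where both channels are brought to a common third resolution, is given below. The function name `to_common_resolution` and the requirement that the long frame has exactly twice as many bins as a short subframe are assumptions for illustration.

```python
import numpy as np

def to_common_resolution(first_subframes, second_frame):
    """Bring both channels to the short-frame frequency grid and the
    long-frame time grid.

    first_subframes: short subframes of the first channel (equal length).
    second_frame: one long frame of the second channel with twice as many bins.
    """
    subs = [np.asarray(s, dtype=float) for s in first_subframes]
    second = np.asarray(second_frame, dtype=float)
    if second.size != 2 * subs[0].size:
        raise ValueError("long frame must have twice the bins of a subframe")
    # Same-frequency values of all subframes are added: time resolution is reduced.
    first_combined = np.sum(subs, axis=0)
    # Pairs of adjacent frequencies are added: frequency resolution is reduced.
    second_combined = second[0::2] + second[1::2]
    return first_combined, second_combined
```

Both outputs then have the same number of bins per time portion, so per-band energies and a mixed component can be computed from them directly.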

34. The downmixer according to one of the preceding examples,

wherein the weight estimation module (100) is configured to calculate a plurality of first band-wise weight values for a plurality of frequency bands of a first channel of the at least two channels using a first calculation rule depending on at least two of: spectral values of the first spectral domain representation of the first channel, spectral values of the second spectral domain representation of the second channel, spectral values of a single combined spectral domain representation derived from the spectral values of the first spectral domain representation or of the second spectral domain representation, spectral values of the first combined spectral domain representation derived from the spectral values of the first spectral domain representation, and spectral values of the second combined spectral domain representation derived from the spectral values of the second spectral domain representation, and

wherein the weight estimation module (100) is configured to calculate a plurality of second band-wise weight values for a plurality of frequency bands of the first channel of the at least two channels using a second calculation rule depending on at least two of: the plurality of first band-wise weight values, spectral values of the first spectral domain representation of the first channel, spectral values of the second spectral domain representation of the second channel, spectral values of a single combined spectral domain representation derived from the spectral values of the first spectral domain representation or of the second spectral domain representation, spectral values of the first combined spectral domain representation derived from the spectral values of the first spectral domain representation, and spectral values of the second combined spectral domain representation derived from the spectral values of the second spectral domain representation, wherein the second calculation rule differs from the first calculation rule.

35. An apparatus for converting the spectral resolution of a spectral domain representation of a channel comprising at least two subframes, each subframe comprising a plurality of spectral values representing a time bin size and a frequency bin size, the apparatus comprising:

a spectral value calculation module (160) for combining (170) spectral values belonging to the same frequency bin from each subframe of the spectral domain representation by a first method to obtain a first group of combined spectral values, and for combining (180) spectral values belonging to the same frequency bin from each subframe of the spectral domain representation by a second method to obtain a second group of combined spectral values, the second method being different from the first method, wherein the first group of combined spectral values and the second group of combined spectral values represent a combined spectral domain representation having a different time bin size and a different frequency bin size.

36. The apparatus according to Example 35,

wherein the spectral value calculation module (160) is configured to perform one of an addition and a subtraction in the first method and the other of the addition and the subtraction in the second method.

37. The apparatus according to Example 35 or 36, wherein the spectral value calculation module (160) is configured to perform an averaging function in the first method and in the second method.

38. The apparatus according to one of Examples 35-37, wherein the spectral value calculation module (160) is configured to apply either the first method or the second method comprising a weighting using a weighting sign, wherein the spectral value calculation module (160) is configured to set the weighting sign in accordance with the frequency bin number of the same frequency bin.

39. The apparatus according to one of Examples 35-38, wherein the spectral value calculation module (160) is configured to apply, as the first method, one of a high-pass filtering and a low-pass filtering, and, as the second method, the other of the high-pass filtering and the low-pass filtering.

40. The apparatus according to one of Examples 35-39, wherein the spectral value calculation module (160) is configured to convert a bin of a lower resolution into two bins of a higher resolution, wherein the first method is used for an even bin number and the second method is used for an odd bin number.
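As a minimal sketch of the conversion described in Examples 35-40, the following Python code merges two subframes of equal length into one representation with twice as many frequency bins. The function name `combine_subframes`, the factor 0.5, and the choice of an averaged sum for even bins and an averaged difference for odd bins are illustrative assumptions and are not taken from the equations of the examples; the bin-dependent weighting sign of Example 38 is left out.

```python
def combine_subframes(sub0, sub1):
    """Merge two equal-length subframes into one frame with twice the bins.

    Assumption for illustration: even output bins hold an averaged sum
    (first method), odd output bins an averaged difference (second method).
    """
    if len(sub0) != len(sub1):
        raise ValueError("subframes must have the same number of bins")
    combined = []
    for a, b in zip(sub0, sub1):
        combined.append(0.5 * (a + b))  # first method  -> output bin 2*i
        combined.append(0.5 * (a - b))  # second method -> output bin 2*i + 1
    return combined


frame = combine_subframes([1.0, 2.0, 3.0], [1.0, 0.0, -3.0])
print(frame)  # [1.0, 0.0, 1.0, 1.0, 0.0, 3.0]
```

Each input bin contributes to exactly two output bins, so the number of time bins is halved while the number of frequency bins is doubled.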

41. The apparatus according to one of Examples 35-40,

wherein a first spectral domain representation of a first channel comprises a TCX20 frame, the spectral domain representation of the channel comprises two TCX10 subframes, and the spectral value calculation module is configured to calculate a combined TCX20 spectral domain representation based on the two TCX10 subframes, or

wherein a first spectral domain representation of a first channel comprises a TCX20 frame, the spectral domain representation of the channel comprises a TCX10 subframe and two TCX5 subframes, and the spectral value calculation module (160) is configured to calculate a first combined TCX10 spectral domain representation based on the two TCX5 subframes and to calculate a second combined TCX20 subframe based on the first combined TCX10 spectral domain representation and the TCX10 subframe, or

wherein a first spectral domain representation of a first channel comprises a TCX10 subframe, the spectral domain representation of the channel comprises two TCX5 subframes, and the spectral value calculation module (160) is configured to calculate a combined TCX10 spectral domain representation based on the two TCX5 subframes,

42. The apparatus according to one of Examples 35-41, wherein the spectral value calculation module (160) is configured to apply the first method based on the following equation:

[equation not reproduced in the source], or

wherein the spectral value calculation module is configured to apply the second method based on the following equation:

[equation not reproduced in the source],

where [symbol] denotes the spectral bin number, and [symbol] and [symbol] are the subframes of the spectral domain representation of the channel, and

where [symbol] and [symbol] [definition not reproduced in the source]

43. The apparatus according to one of Examples 35-42, further comprising a signal calculation module (500, 190, 220) for using the combined spectral domain representation, having the different time bin size and the different frequency bin size, in calculating an encoded, decoded or processed audio signal.

44. The apparatus according to any one of Examples 35-43, wherein the spectral value calculation module (160) is configured to receive the spectral domain representation with a first spectral resolution and to generate a converted spectral domain representation with a second spectral resolution different from the first spectral resolution,

wherein the apparatus further comprises:

a first spectral processor (500) for processing the converted spectral domain representation to obtain a processed spectral domain representation with the second resolution, or

a processing data calculation module (190) for calculating processing data based on the converted spectral domain representation, and a second spectral processor (220) for processing the spectral domain representation to obtain a processed spectral domain representation with the first resolution.

45. The apparatus according to Example 44, wherein the first spectral processor (500) is configured to use, in the processing, an additional spectral domain representation having the second spectral resolution, or

wherein the second spectral processor (220) is configured to use, in the processing, an additional spectral domain representation having the first spectral resolution.

46. A downmixer for downmixing a multi-channel signal having at least two channels, comprising:

a weight estimation module (100) for estimating band-wise weight values for the at least two channels, wherein the weight estimation module (100) is configured to calculate the band-wise weight values based on a target energy value per frequency band, so that the energy in a frequency band of the downmix signal is in a predetermined ratio to the energies in the same frequency bands of the at least two channels;

a spectral weighting module (200) for weighting spectral domain representations of the at least two channels using the band-wise weight values; and

a mixer (400) for calculating the downmix signal using the weighted spectral domain representations of the at least two channels.
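The three modules of Example 46 can be illustrated with the following Python sketch. The function names, the parameter `ratio`, and in particular the weight rule are assumptions made for illustration: both channels receive the same weight in a band, chosen so that the band energy of the downmix equals `ratio` times the sum of the channel band energies. This is only one possible way to meet a target energy and is not the weight equation of the examples.

```python
import math


def estimate_weights(left, right, band_edges, ratio=0.5):
    """Weight estimation: one (w_left, w_right) pair per band (assumed rule)."""
    weights = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        e_left = sum(abs(x) ** 2 for x in left[lo:hi])
        e_right = sum(abs(x) ** 2 for x in right[lo:hi])
        e_sum = sum(abs(l + r) ** 2 for l, r in zip(left[lo:hi], right[lo:hi]))
        target = ratio * (e_left + e_right)  # assumed target energy per band
        w = math.sqrt(target / e_sum) if e_sum > 0.0 else 0.0
        weights.append((w, w))
    return weights


def apply_weights(spec, weights, band_edges, channel):
    """Spectral weighting: scale every value in a band by that band's weight."""
    out = list(spec)
    bands = zip(band_edges[:-1], band_edges[1:])
    for (lo, hi), pair in zip(bands, weights):
        for i in range(lo, hi):
            out[i] = pair[channel] * spec[i]
    return out


def mix(weighted_left, weighted_right):
    """Mixer: add the weighted spectra bin by bin."""
    return [l + r for l, r in zip(weighted_left, weighted_right)]


left = [1.0, 1.0, 2.0, 0.0]
right = [1.0, -1.0, 0.0, 2.0]
edges = [0, 2, 4]  # two bands: bins 0-1 and bins 2-3
w = estimate_weights(left, right, edges)
dmx = mix(apply_weights(left, w, edges, 0), apply_weights(right, w, edges, 1))
print([round(sum(abs(x) ** 2 for x in dmx[lo:hi]), 6) for lo, hi in [(0, 2), (2, 4)]])
```

With this simplified rule, a band in which the two channels cancel exactly receives a weight of zero, which is one reason a practical weight rule would treat the two channels differently.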

47. The downmixer of Example 46,

wherein the weight estimation module (100) is configured to estimate (140) an imaginary spectral domain representation when the spectral domain representation is purely real, or to estimate (140) a real spectral domain representation when the spectral domain representation is purely imaginary, and

48. The downmixer of one of Examples 46 or 47, wherein the weight estimation module (100) is configured to calculate a first weight value for a frequency band of a first channel of the at least two channels, to calculate a second weight value for the frequency band of a second channel of the at least two channels, and to calculate the first weight value and the second weight value using (142) an energy of the first channel in the frequency band, an energy of the second channel in the frequency band, and a mixed term depending on a product (148) or a linear combination (146) of spectral values from the at least two channels in the frequency band.

49. The downmixer of one of Examples 46-48,

wherein the weight estimation module (100) is configured to calculate, as the mixed term representing the linear combination (146), the square root of the energy of the spectral values, added to each other, in the frequency band from the spectral domain representations of the at least two channels, wherein the frequency band comprises a plurality of spectral values, or to calculate, as the mixed term representing the product (148), the absolute value of a complex dot product between the spectral values in the frequency band of the first channel and the spectral values in the frequency band of the second channel of the at least two channels.
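The two mixed terms named in Example 49 can be sketched in Python as follows. The function names and the sample values are hypothetical; conjugating the second channel in the dot product is an assumed convention, which does not affect the absolute value.

```python
import math


def linear_combination_term(left_band, right_band):
    """Square root of the energy of the bin-wise summed spectral values."""
    return math.sqrt(sum(abs(l + r) ** 2 for l, r in zip(left_band, right_band)))


def dot_product_term(left_band, right_band):
    """Absolute value of the complex dot product between the two bands."""
    return abs(sum(l * r.conjugate() for l, r in zip(left_band, right_band)))


left_band = [1 + 1j, 2 + 0j]
right_band = [1 - 1j, 0 - 1j]
print(linear_combination_term(left_band, right_band))  # 3.0
print(dot_product_term(left_band, right_band))         # 4.0
```

Both terms are computed once per frequency band over all spectral values in that band, which matches the statement that a band comprises a plurality of spectral values.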

50. The downmixer of one of Examples 46-49,

wherein each frequency band of the first and the second channel of the at least two channels has a plurality of spectral values, wherein the spectral weighting module (200) is configured to apply the same weight to each spectral value in the frequency band of one of the at least two channels, and to apply a different weight to each spectral value in the frequency band of another channel of the at least two channels.

51. The downmixer of one of Examples 46-50, wherein the weight estimation module (100) is configured to calculate (150) the band-wise weight values for the first channel of the at least two channels based on the following equation:

[equation not reproduced in the source]

where w_R is the weighting factor for the first channel for the frequency band,

[symbol] is the estimated power for the second channel,

[symbol] is the estimated power for the first channel in the frequency band,

[symbol] is the estimated dot product between the channels in the frequency band,

[symbol] is the estimated amplitude for the second channel in the frequency band, and

[symbol] is the estimated amplitude for the first channel in the frequency band.

52. The downmixer of Example 51, wherein the weight estimation module (100) is configured to calculate (152) the band-wise weight values for the second channel of the at least two channels based on the following equation:

[equation not reproduced in the source]

where w_L is the weighting factor for the second channel for the frequency band, and

[symbol] is the estimated linear combination of the estimated amplitudes for the first channel and the second channel in the frequency band.

53. The downmixer of one of Examples 50-52, wherein the weight estimation module (100) is configured to calculate (144) the estimated amplitude for the second channel in the frequency band and to calculate the estimated amplitude for the first channel in the frequency band based on the following equations:

[equations not reproduced in the source], or

wherein the weight estimation module (100) is configured to calculate (146) the estimated linear combination of the estimated amplitudes for the first channel and the second channel in the frequency band based on the following equation:

[equation not reproduced in the source], or

wherein the weight estimation module (100) is configured to calculate (148) the estimated dot product between the channels in the frequency band based on the following equation:

[equation not reproduced in the source], or

wherein the weight estimation module (100) is configured to calculate (142) the estimated power for the second channel in the frequency band or the estimated power for the first channel in the frequency band based on the following equation:

[equation not reproduced in the source]

where i denotes the bin number in the spectral band [symbol], and

[symbol] represents the estimated imaginary part of bin i of the MDCT,

[symbol] represents the real part of bin i of the MDCT included in the spectral domain representation of the first or the second channel, r denotes the first channel, and l denotes the second channel.
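A band power built from the real MDCT part and an estimated imaginary part, as named in Example 53, could look like the following Python sketch. The estimator used here, the difference of the two neighbouring real-valued bins, is an assumption chosen for illustration; the source text does not reproduce the equation, and the function names are hypothetical.

```python
def estimate_imaginary(real):
    """Estimate an imaginary part per bin from neighbouring real bins.

    Assumption: imag[i] = real[i + 1] - real[i - 1], with missing
    neighbours at the spectrum edges treated as zero.
    """
    n = len(real)
    estimate = []
    for i in range(n):
        prev_bin = real[i - 1] if i > 0 else 0.0
        next_bin = real[i + 1] if i + 1 < n else 0.0
        estimate.append(next_bin - prev_bin)
    return estimate


def band_power(real, lo, hi):
    """Sum of squared real and estimated imaginary parts over bins lo..hi-1."""
    imag = estimate_imaginary(real)
    return sum(real[i] ** 2 + imag[i] ** 2 for i in range(lo, hi))


mdct = [1.0, 2.0, 0.0, -1.0]
print(estimate_imaginary(mdct))  # [2.0, -1.0, -3.0, 0.0]
print(band_power(mdct, 0, 2))    # 10.0
```

The imaginary estimate is computed over the whole spectrum before the band is cut out, so bins at a band edge can still use neighbours from the adjacent band.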

54. The downmixer of one of Examples 46-53,

wherein a first spectral domain representation of a first channel of the at least two channels has a first time resolution or a first frequency resolution, a second spectral domain representation of a second channel of the at least two channels has a second time resolution or a second frequency resolution, the second time resolution differs from the first time resolution, and the second frequency resolution differs from the first frequency resolution (130), and

wherein the weight estimation module (100) is configured to convert (132) the first spectral domain representation into a combined spectral domain representation having the second time resolution or the second frequency resolution, and to calculate (134) the band-wise weight values using the combined spectral domain representation and the second spectral domain representation, or to convert (132) the second spectral domain representation into a combined spectral domain representation having the first time resolution or the first frequency resolution, and to calculate (134) the band-wise weight values using the combined spectral domain representation and the first spectral domain representation, or

to convert (132) the second spectral domain representation into a second combined spectral domain representation having a third time resolution or a third frequency resolution, and

to calculate (134) the band-wise weight values using the first combined spectral domain representation and the second combined spectral domain representation.

55. The downmixer of Example 54, wherein the spectral weighting module (200) is configured to weight, as the spectral domain representations of the at least two channels, one of: the combined spectral domain representation and the second spectral domain representation; the combined spectral domain representation and the first spectral domain representation; and the first combined spectral domain representation and the second combined spectral domain representation, to obtain a first weighted spectral domain representation and a second weighted spectral domain representation.

56. The downmixer of Example 55, wherein the mixer (400) is configured to add the first and the second weighted spectral domain representations to obtain a spectral domain downmix representation and to convert the spectral domain downmix representation into the time domain to obtain the downmix signal, or to convert the first and the second weighted spectral domain representations into the time domain to obtain time representations of the at least two channels and to add the time representations of the at least two channels to obtain the downmix signal.
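The two alternatives of Example 56 give the same signal whenever the spectrum-to-time conversion is linear. The Python sketch below checks this with a plain inverse DFT, which is only a stand-in assumption: an actual codec would use an inverse MDCT with windowing and overlap-add. The function name and the sample spectra are hypothetical.

```python
import cmath


def inverse_dft(spec):
    """Naive inverse DFT, used here only as a generic linear transform."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


weighted_left = [1 + 0j, 0.5j, -1 + 0j, 0j]
weighted_right = [0.5 + 0j, 1 + 0j, 0j, -0.5j]

# Alternative 1: add in the spectral domain, then convert to the time domain.
mixed_then_converted = inverse_dft(
    [l + r for l, r in zip(weighted_left, weighted_right)]
)

# Alternative 2: convert each channel, then add in the time domain.
converted_then_mixed = [
    a + b for a, b in zip(inverse_dft(weighted_left), inverse_dft(weighted_right))
]

max_diff = max(
    abs(a - b) for a, b in zip(mixed_then_converted, converted_then_mixed)
)
print(max_diff < 1e-12)  # True
```

Adding in the spectral domain needs only one conversion into the time domain, whereas the second alternative needs one conversion per channel.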

57. A method for downmixing a multi-channel signal having at least two channels, the method comprising:

estimating weight values across frequency bands for the at least two channels;

weighting spectral domain representations of the at least two channels using the weight values across frequency bands;

converting the weighted spectral domain representations of the at least two channels into temporal representations of the at least two channels; and

mixing the temporal representations of the at least two channels to obtain a downmixed signal.

58. A method for converting the spectral resolution of a spectral domain representation of a channel comprising at least two subframes, each subframe containing a plurality of spectral values representing a temporal bin size and a frequency bin size, the method comprising:

combining spectral values belonging to the same frequency bin from each subframe of the spectral domain representation by a first method to obtain a first group of combined spectral values; and

combining spectral values belonging to the same frequency bin from each subframe of the spectral domain representation by a second method to obtain a second group of combined spectral values, the second method being different from the first method, wherein the first group of combined spectral values and the second group of combined spectral values represent a combined spectral domain representation having a different temporal bin size and a different frequency bin size.

59. A method for downmixing a multi-channel signal having at least two channels, the method comprising:

estimating weight values across frequency bands for said at least two channels, which comprises calculating the weight values across frequency bands based on a target energy value for each frequency band such that the energy in a frequency band of the downmixed signal is in a predetermined ratio to the energies in the same frequency band of said at least two channels;

weighting spectral domain representations of the at least two channels using the weight values across frequency bands to obtain weighted spectral domain representations; and

calculating the downmixed signal using the weighted spectral domain representations of the at least two channels.
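The target-energy rule of Example 59 can be illustrated with a minimal Python sketch. It is not the weight formula of the invention: it assumes, for illustration only, a single weight shared by both channels in each band, a target energy equal to the mean of the two channel energies, and a hypothetical gain cap `max_gain`. The function name `band_weights` and the band layout `band_edges` are also assumptions.

```python
import numpy as np

def band_weights(left, right, band_edges, max_gain=4.0):
    """Return one weight per band (assumed: shared by both channels).

    The weight scales the sum of both channels so that its band energy
    equals an assumed target: the mean of the two channel energies.
    """
    weights = np.ones(len(band_edges) - 1)
    for b, (lo, hi) in enumerate(zip(band_edges[:-1], band_edges[1:])):
        l, r = left[lo:hi], right[lo:hi]
        e_l = np.sum(np.abs(l) ** 2)        # energy of channel 1 in the band
        e_r = np.sum(np.abs(r) ** 2)        # energy of channel 2 in the band
        e_sum = np.sum(np.abs(l + r) ** 2)  # energy of the unweighted sum
        target = 0.5 * (e_l + e_r)          # assumed target energy
        if e_sum > 1e-12:
            # max_gain is a hypothetical safeguard against near-cancellation
            weights[b] = min(np.sqrt(target / e_sum), max_gain)
    return weights

left = np.arange(1.0, 17.0)       # toy spectrum of channel 1
right = np.ones(16)               # toy spectrum of channel 2
edges = [0, 2, 4, 8, 16]          # four bands, wider towards high bins
w = band_weights(left, right, edges)
```

When the cap is not hit, the energy of `w[b] * (left + right)` in band `b` equals the assumed target, which is the "predetermined ratio" property in its simplest form. A passive downmix corresponds to a constant weight, which lets the band energy drop whenever the channels partly cancel.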

60. A computer program for performing the method of Example 57, 58 or 59 when executed on a computer or processor.

It should be mentioned that all alternatives or aspects described above and all aspects defined by the independent claims in the following claims can be used individually, i.e. without any other alternative or aspect than the contemplated alternative, aspect or independent claim. However, in other embodiments, two or more of the alternatives, aspects or independent claims can be combined with each other, and in still other embodiments all aspects or alternatives and all independent claims can be combined with each other.

An audio signal encoded according to the present invention may be stored on a digital storage medium or a non-transitory storage medium, or may be transmitted over a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.

Although some aspects have been described in the context of a device, it is clear that these aspects also represent a description of the corresponding method, where a module or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding module, element or feature of a corresponding device.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a digital versatile disc (DVD), a Blu-ray disc, a compact disc (CD), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM) or a flash memory, having electronically readable control signals stored thereon which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise a computer program for performing one of the methods described herein, stored on a machine-readable carrier or a non-transitory storage medium.

In other words, an embodiment of the inventive method is therefore a computer program having a program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive methods is therefore a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

In some embodiments, a programmable logic device (for example a field-programmable gate array) may be used to perform some or all of the functionality of the methods described herein. In some embodiments, a field-programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.

The embodiments described above are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and details described herein will be apparent to others skilled in the art. It is the intent, therefore, that the invention be limited only by the scope of the following claims and not by the specific details presented by way of description and explanation of the embodiments herein.

References

[1] ITU-R BS.775-2, "Multichannel Stereophonic Sound System With And Without Accompanying Picture," 07/2006.

[2] F. Baumgarte, C. Faller and P. Kroon, "Audio Coder Enhancement using Scalable Binaural Cue Coding with Equalized Mixing," in 116th Convention of the AES, Berlin, 2004.

[3] G. Stoll, J. Groh, M. Link, J. Deigmöller, B. Runow, M. Keil, R. Stoll, M. Stoll and C. Stoll, "Method for Generating a Downward-Compatible Sound Format," US Patent US 2012/0014526, 2012.

[4] M. Kim, E. Oh and H. Shim, "Stereo audio coding improved by phase parameters," in 129th Convention of the AES, San Francisco, 2010.

[5] A. Adami, E. Habets and J. Herre, "Down-mixing using coherence suppression," in IEEE International Conference on Acoustics, Speech and Signal Processing, Florence, 2014.

[6] ISO/IEC 23008-3, "Information technology - High efficiency coding and media delivery in heterogeneous environments - Part 3: 3D audio," 2019.

[7] S. Bayer, C. Borß, J. Büthe, S. Disch, B. Edler, G. Fuchs, F. Ghido and M. Multrus, "Downmixer and Method for Downmixing at Least Two Channels and Multichannel Encoder and Multichannel Decoder," WO2018086946.

[8] 3GPP TS 26.445, "Codec for Enhanced Voice Services (EVS); Detailed algorithmic description."

[9] S. Chen, H. Ruimin and S. Zhang, "Estimating spatial cues for audio coding in MDCT domain," in IEEE International Conference on Multimedia and Expo, New York, 2009.

Claims

1. A downmixer for downmixing a multi-channel audio signal having at least two channels, comprising:

a weight estimation module (100) for estimating weight values across frequency bands for the at least two channels;

a spectral weighting module (200) for weighting spectral domain representations of the at least two channels using weights across frequency bands;

a conversion module (300) for converting the weighted spectral domain representations of the at least two channels into temporal representations of the at least two channels; and

a mixer (400) for mixing temporal representations of at least two channels to obtain a downmixed audio signal.
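The signal flow of claim 1 (weight each channel per band, convert each channel to the time domain separately, then mix) can be sketched in Python as follows. This is only an illustration under stated assumptions: `numpy.fft.irfft` stands in for the conversion module, the weights are taken as given, and the names `downmix_time_domain` and `band_edges` are hypothetical.

```python
import numpy as np

def downmix_time_domain(spec_l, spec_r, w_l, w_r, band_edges, n_samples):
    """Weight per band, convert each channel to time, then add the channels."""
    widths = np.diff(band_edges)
    gain_l = np.repeat(w_l, widths)   # one weight reused for every bin of a band
    gain_r = np.repeat(w_r, widths)
    # Stand-in for the conversion module (300): an inverse real FFT (assumption)
    time_l = np.fft.irfft(gain_l * spec_l, n=n_samples)
    time_r = np.fft.irfft(gain_r * spec_r, n=n_samples)
    # Mixer (400): sample-wise addition of the two temporal representations
    return time_l + time_r

n = 16
t = np.arange(n)
x_l = np.sin(2 * np.pi * 2 * t / n)
x_r = np.cos(2 * np.pi * 3 * t / n)
spec_l, spec_r = np.fft.rfft(x_l), np.fft.rfft(x_r)   # 9 bins each
edges = [0, 1, 3, 5, 9]
w_l = np.array([1.0, 0.5, 0.8, 1.0])
w_r = np.array([1.0, 0.7, 0.6, 0.9])
y = downmix_time_domain(spec_l, spec_r, w_l, w_r, edges, n)
```

Because the stand-in transform is linear, adding the weighted spectra first and converting once gives the same samples. Converting each channel separately leaves room for per-channel time-domain processing before the mix.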

2. The downmixer according to claim 1, wherein the weight estimation module (100) is configured to calculate a first plurality of weight values across frequency bands for a plurality of frequency bands of a first channel of the at least two channels and to calculate a second plurality of weight values across frequency bands for a plurality of frequency bands of a second channel of the at least two channels, or

wherein the multi-channel audio signal has more than two channels, and wherein the weight estimation module (100) is configured to calculate a first plurality of weight values across frequency bands for a plurality of frequency bands of a first channel of the more than two channels, to calculate a second plurality of weight values across frequency bands for a plurality of frequency bands of a second channel of the more than two channels, and to calculate a further plurality of weight values across frequency bands for a plurality of frequency bands of a further channel of the more than two channels.

3. The downmixer according to claim 1 or 2,

wherein each of the spectral domain representations of the at least two channels comprises a set of frequency bins, with spectral values associated with the frequency bins,

wherein the weight estimation module (100) is configured to calculate the weight values across frequency bands for frequency bands each containing one, two or more frequency bins, or

wherein the number of frequency bins per frequency band increases as the center frequency of the frequency bands increases.
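A band layout in which the number of bins per band grows with frequency can be sketched as follows. The doubling rule, the starting width and the function name `make_band_edges` are assumptions chosen for illustration; the claim does not prescribe a particular layout.

```python
def make_band_edges(n_bins, first_width=2, growth=2):
    """Return band borders [e0, e1, ...] whose band widths grow with frequency."""
    edges = [0]
    width = first_width
    while edges[-1] + width <= n_bins:
        edges.append(edges[-1] + width)
        width *= growth           # assumed rule: each band is wider than the last
    if len(edges) == 1:
        edges.append(n_bins)      # too few bins: use a single band
    else:
        edges[-1] = n_bins        # the last band absorbs any remaining bins
    return edges

edges = make_band_edges(16)       # → [0, 2, 6, 16]
```

Each band `b` spans bins `edges[b]` to `edges[b + 1] - 1`, so one weight per band is later reused for all bins in that range.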

4. Downmixer according to any one of the preceding claims,

wherein the weight estimation module (100) is configured to calculate the weight values across frequency bands based on a target energy value for each frequency band such that the energy in a frequency band of the downmixed audio signal is in a predetermined ratio to the energies in the same frequency band of the at least two channels.

5. The downmixer according to any one of the preceding claims, further comprising:

a base decoder (500) for decoding the encoded signal, wherein the encoded signal has encoded spectral domain representations of at least two original channels, the base decoder being configured to generate spectral domain representations from the encoded spectral domain representations.

6. Downmixer according to any one of the preceding claims,

wherein the spectral domain representations are either purely real or purely imaginary,

wherein the weight estimation module (100) is configured to estimate (120, 122) an imaginary spectral domain representation when the spectral domain representation is purely real, or to estimate a real spectral domain representation when the spectral domain representation is purely imaginary, and

wherein the weight estimation module (100) is configured to estimate the weight values across frequency bands using the estimated imaginary spectral domain representation or the estimated real spectral domain representation.
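As an illustration of claim 6, the sketch below estimates an imaginary (MDST-like) spectrum from a purely real MDCT spectrum and uses both for a band energy. The claim does not fix the estimator; the neighbour-difference rule used here is an assumption for illustration, and scaling and sign conventions vary between codecs. The names `estimate_mdst` and `band_energy` are hypothetical.

```python
import numpy as np

def estimate_mdst(mdct):
    """Assumed estimator: difference of the neighbouring MDCT bins."""
    padded = np.concatenate(([0.0], mdct, [0.0]))   # zeros outside the spectrum
    return padded[2:] - padded[:-2]                 # mdct[k + 1] - mdct[k - 1]

def band_energy(mdct, lo, hi):
    """Band energy from the real spectrum plus the estimated imaginary part."""
    mdst = estimate_mdst(mdct)
    return np.sum(mdct[lo:hi] ** 2 + mdst[lo:hi] ** 2)

mdct = np.array([1.0, 2.0, 4.0, 7.0])
mdst_est = estimate_mdst(mdct)      # → [2, 3, 5, -4]
energy = band_energy(mdct, 1, 3)    # → 54.0
```

Using an estimated complex spectrum makes the band energies and cross terms less sensitive to the phase-dependent fluctuations of a purely real transform.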

7. The downmixer according to claim 1, wherein the weight estimation module (100) is configured to calculate a first weight value from the weight values across frequency bands for a first channel frequency band of at least two channels,

wherein the weight estimation module (100) is configured to calculate a second weight value from the weight values across frequency bands for a second channel frequency band of at least two channels, and

wherein the weight estimation module (100) is configured to calculate the first weight value and the second weight value using the energy of the first channel in the frequency band, the energy of the second channel in the frequency band, and a mixed component depending on a product or a linear combination of spectral values from the at least two channels in said frequency band.

8. Downmixer according to claim 7,

wherein the weight estimation module (100) is configured to calculate, as the mixed component depending on the linear combination, the square root of the energy of the spectral values added to each other in the frequency band from the spectral domain representations of the at least two channels, the frequency band containing a plurality of spectral values, or to calculate, as the mixed component depending on said product, the absolute value of the complex scalar product between the spectral values in the frequency band of the first channel and the spectral values in the frequency band of the second channel of the at least two channels.
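The two alternatives for the mixed component in claim 8 translate directly into a few lines of Python. The function names are hypothetical, and the inputs are assumed to be the complex spectral values of one frequency band of each channel.

```python
import numpy as np

def mixed_from_sum(l_band, r_band):
    """Square root of the energy of the bin-wise sum of both channels."""
    return np.sqrt(np.sum(np.abs(l_band + r_band) ** 2))

def mixed_from_product(l_band, r_band):
    """Absolute value of the complex scalar product of both channels."""
    return np.abs(np.vdot(r_band, l_band))   # vdot conjugates its first argument

l_band = np.array([1 + 1j, 2 + 0j])
r_band = np.array([1 + 1j, 0 + 1j])
m_sum = mixed_from_sum(l_band, r_band)        # sqrt(13)
m_prod = mixed_from_product(l_band, r_band)   # 2 * sqrt(2)
```

Both quantities grow when the channels are in phase within the band and shrink when they cancel, which is the information the weights need in addition to the two channel energies.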

9. Downmixer according to claim 1,

wherein each frequency band of the first and second channels of the at least two channels has a plurality of spectral values, and the spectral weighting module (200) is configured to apply the same weight to each spectral value in a frequency band of one of the at least two channels and to apply another weight to each spectral value in the frequency band of another channel of the at least two channels.

10. Downmixer according to any one of the preceding claims,

wherein the weighted spectral domain representations are MDCT (modified discrete cosine transform) spectra, and

wherein the transform module (300) is configured to perform, for each channel of the plurality of channels, an inverse MDCT using a synthesis windowing operation and an overlap-add operation.
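The per-channel inverse MDCT with synthesis windowing and overlap-add, followed by the time-domain mix, can be sketched as below. This is a textbook construction, not the codec's implementation: the sine window, the dense-matrix transform and all function names are assumptions, and the forward transform is included only to generate test spectra.

```python
import numpy as np

def sine_window(n_half):
    n = np.arange(2 * n_half)
    return np.sin(np.pi * (n + 0.5) / (2 * n_half))   # satisfies w^2[n] + w^2[n+N] = 1

def mdct_basis(n_half):
    n = np.arange(2 * n_half)
    k = np.arange(n_half)
    return np.cos(np.pi / n_half * np.outer(k + 0.5, n + 0.5 + n_half / 2))

def mdct_frames(x, n_half):
    """Forward MDCT with 50 % overlap, used here only to create test spectra."""
    w, basis = sine_window(n_half), mdct_basis(n_half)
    starts = range(0, len(x) - 2 * n_half + 1, n_half)
    return np.array([basis @ (w * x[s:s + 2 * n_half]) for s in starts])

def imdct_overlap_add(frames, n_half):
    """Inverse MDCT per frame, synthesis window, then overlap-add."""
    w, basis = sine_window(n_half), mdct_basis(n_half)
    out = np.zeros((len(frames) + 1) * n_half)
    for i, coeffs in enumerate(frames):
        block = (2.0 / n_half) * (basis.T @ coeffs)      # aliased time block
        out[i * n_half:(i + 2) * n_half] += w * block    # window and overlap-add
    return out

def downmix(frames_l, frames_r, n_half):
    """Convert each channel separately, then add sample by sample."""
    return imdct_overlap_add(frames_l, n_half) + imdct_overlap_add(frames_r, n_half)

rng = np.random.default_rng(0)
n_half = 8
x_l = rng.standard_normal(8 * n_half)
x_r = rng.standard_normal(8 * n_half)
y = downmix(mdct_frames(x_l, n_half), mdct_frames(x_r, n_half), n_half)
```

In this sketch the spectra are left unweighted, so every sample covered by two overlapping frames equals the sum of the two inputs. In the downmixer, each spectrum would first be multiplied by its band weights.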

11. Downmixer according to any one of the preceding claims,

wherein the mixer (400) is configured to apply addition on each sample of the time representations of the at least two channels, or

wherein the mixer (400) is configured to apply an addition on each sample of the time representations of the at least two channels and a scaling operation applied to the result of the addition on each sample or applied to input values of the addition on each sample.

12. Downmixer according to any one of the preceding claims,

wherein the transform module (300) is configured to generate (310) raw temporal representations using an algorithm for converting a spectral representation to a temporal representation, and

to separately post-process (320) the raw temporal representations, upstream of the mixing by the mixer (400) in the signal processing direction, using separate control information for the channels, to obtain the temporal representations.

13. Downmixer according to claim 12,

wherein the transform module (300) is configured to perform, as the post-processing (320), low-pass post-filtering, TCX-LTP (transform coded excitation long-term prediction) processing, or LPC (linear prediction coding) synthesis separately for each temporal representation.

14. Downmixer according to claim 1,

in which the first representation in the spectral domain of the first channel of at least two channels has the first time or frequency resolution,

wherein the second spectral domain representation of the second channel of the at least two channels has a second time or frequency resolution, the second time or frequency resolution being different from the first time or frequency resolution, and

wherein the weight estimation module (100) is configured to calculate the weight values by frequency bands such that the frequency resolution of the frequency bands associated with the weights by frequency bands is lower than the first frequency resolution and the second frequency resolution, or equal to the lower resolution of the first and second frequency resolutions.

15. Downmixer according to any one of the preceding claims,

in which the first representation in the spectral domain has the first set of spectral values in the frequency band,

wherein the second spectral domain representation has a second set of spectral values in the frequency band, the second set being larger than the first set, and

wherein the weight estimation module (100) is configured to:

combine two or more spectral values from the second set of spectral values, or select a subset of spectral values from the second set of spectral values,

calculate a mixed component depending on products or linear combinations of spectral values from the at least two channels in the frequency band, using the result of combining the two or more spectral values or using the subset of spectral values, and

calculate the weight values across frequency bands using the mixed component.

16. Downmixer according to any one of the preceding claims,

wherein the first spectral domain representation contains a plurality of first spectral values representing a first temporal bin size and a first frequency bin size,

wherein the second spectral domain representation contains a plurality of spectral values representing a second temporal bin size and a second frequency bin size,

wherein the first temporal bin size is greater than the second temporal bin size, or wherein the first frequency bin size is smaller than the second frequency bin size,

wherein the weight estimator (100) is configured to combine a plurality of spectral values from a first spectral domain representation to obtain a first combined spectral domain representation, wherein the size of the combined frequency bin is equal to the size of the second frequency bin, or to combine the plurality of spectral values from the second spectral domain representation to obtain the first combined spectral domain representation, wherein the combined temporal bin size is equal to the size of the first temporal bin.

17. Downmixer according to claim 16,

wherein the weight estimation module (100) is configured to use the first combined spectral domain representation or the second combined spectral domain representation to calculate the weight values across frequency bands for the first channel and the second channel of the at least two channels, the calculation comprising calculating a mixed component per frequency band and calculating energies per frequency band, and

wherein the spectral weighting module (200) is configured to apply the weight values across frequency bands for the first channel of the at least two channels to the spectral values of the first spectral domain representation in the respective frequency bands, and to apply the weight values across frequency bands for the second channel of the at least two channels to the spectral values of the second spectral domain representation in the respective frequency bands.

18. The downmixer according to any one of claims 1 to 15,

wherein the first representation in the spectral domain of the first channel contains a plurality of first spectral values representing the first time bin size and the first frequency bin size,

wherein the second spectral domain representation of the second channel comprises at least two subframes, each subframe containing a plurality of spectral values representing a second time bin size and a second frequency bin size,

wherein the weight estimation module (100) is configured to:

combine spectral values belonging to the same frequency bin from each subframe of the second spectral domain representation by a first method to obtain a first group of combined spectral values, and

combine spectral values belonging to the same frequency bin from each subframe of the second spectral domain representation by a second method to obtain a second group of combined spectral values, the second method being different from the first method,

wherein the first group of combined spectral values and the second group of combined spectral values represent a combined spectral domain representation having the first time bin size and the first frequency bin size, and

use the spectral values of the combined spectral domain representation and of the first spectral domain representation to calculate the weight values across frequency bands.

19. Downmixer according to claim 18,

wherein the weight estimation module (100) is configured to perform one of addition and subtraction in the first method and another of addition and subtraction in the second method.

20. The downmixer according to claim 18 or 19, wherein the weight estimation module (100) is configured to perform an averaging function in the first method and the second method.

21. The downmixer according to any one of claims 18 to 20, wherein the weight estimation module (100) is configured to apply the first method or the second method with a weighting that uses a weighting sign, and to set the weighting sign according to the frequency bin number of the same frequency bin.

22. The downmixer according to any one of claims 18 to 21, wherein the weight estimation module (100) is configured to apply one of high-pass filtering and low-pass filtering as the first method and the other of high-pass filtering and low-pass filtering as the second method.

23. The downmixer according to any one of claims 18 to 22, wherein the weight estimation module (100) is configured to convert a lower-resolution bin into two higher-resolution bins, the first method being used for the even-numbered first higher-resolution bin of the two higher-resolution bins, and the second method being used for the odd-numbered second higher-resolution bin of the two higher-resolution bins.
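The combination described in claims 19 to 23 can be sketched as follows for two subframes of equal length. This is a hedged illustration, not the equations of the invention: the 1/2 averaging factor, the alternating sign pattern and the function name `combine_subframes` are assumptions that are merely consistent with the wording of those claims.

```python
import numpy as np

def combine_subframes(sub0, sub1):
    """Merge two short-transform subframes into one spectrum with twice the bins."""
    sub0 = np.asarray(sub0, dtype=float)
    sub1 = np.asarray(sub1, dtype=float)
    i = np.arange(len(sub0))
    sign = np.where(i % 2 == 0, 1.0, -1.0)        # assumed sign rule from the bin number
    combined = np.empty(2 * len(sub0))
    combined[0::2] = 0.5 * (sub0 + sign * sub1)   # first method fills the even bins
    combined[1::2] = 0.5 * (sub0 - sign * sub1)   # second method fills the odd bins
    return combined

merged = combine_subframes([4.0, 2.0], [2.0, 6.0])   # → [3, 1, -2, 4]
```

Each lower-resolution bin yields two higher-resolution bins, one from a sum-like (low-pass) combination and one from a difference-like (high-pass) combination of the two subframes. Under the same assumptions, applying the function twice would bring two shortest subframes to the intermediate length first and then, together with an intermediate-length subframe, to the full frame length.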

24. The downmixer according to any one of claims 18 to 23,

wherein the first spectral domain representation of the first channel comprises a TCX20 frame, the second spectral domain representation of the second channel comprises two TCX10 subframes, wherein the weight estimator (100) is configured to compute a combined TCX20 spectral domain representation based on the two TCX10 subframes, or

wherein the first spectral domain representation of the first channel comprises a TCX20 frame, the second spectral domain representation of the second channel comprises a TCX10 subframe and two TCX5 subframes, wherein the weight estimator (100) is configured to calculate a first combined TCX10 spectral domain representation based on the two TCX5 subframes and to calculate a second combined TCX20 spectral domain representation based on the first combined TCX10 spectral domain representation and the TCX10 subframe, or

wherein the first spectral domain representation of the first channel contains a TCX10 subframe, the second spectral domain representation of the second channel contains two TCX5 subframes, wherein the weight estimation module (100) is configured to calculate a combined TCX10 spectral domain representation based on two TCX5 subframes,

wherein the expression TCX20 indicates a first section with a first duration, the expression TCX10 indicates a second section with a second duration, and the expression TCX5 indicates a third section with a third duration, wherein the first duration is greater than the second duration or the third duration, the second duration is less than the first duration or greater than the third duration, and the third duration is less than the first duration or less than the second duration.
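The second alternative of claim 24 is a two-stage conversion: two TCX5 subframes are first merged into a TCX10-resolution spectrum, which is then merged with the TCX10 subframe into a TCX20-resolution spectrum. The sketch below assumes a sum/difference interleaving as the merge step and assumes that the TCX10 subframe precedes the two TCX5 subframes in time. Both are illustrative assumptions, as are the names `merge_pair` and `to_tcx20_resolution`.

```python
def merge_pair(sub0, sub1):
    """Merge two equal-length spectra into one with twice the frequency resolution."""
    if len(sub0) != len(sub1):
        raise ValueError("both spectra must have the same number of bins")
    merged = []
    for a, b in zip(sub0, sub1):
        merged.extend((0.5 * (a + b), 0.5 * (a - b)))
    return merged


def to_tcx20_resolution(tcx10, tcx5_first, tcx5_second):
    """Cascade: TCX5 + TCX5 -> combined TCX10, then TCX10 + combined TCX10 -> TCX20."""
    combined_tcx10 = merge_pair(tcx5_first, tcx5_second)
    return merge_pair(tcx10, combined_tcx10)


# 2 bins per TCX5 subframe, 4 per TCX10 subframe, 8 in the combined TCX20 spectrum.
print(to_tcx20_resolution([1, 1, 1, 1], [1, 1], [1, 1]))
```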

25. The downmixer according to any one of claims 18-24, wherein the weight estimation module (100) is configured to apply the first method based on the following equation:

, or

wherein the weight estimation module (100) is configured to apply the second method based on the following equation:

,

where [symbol missing in the source] denotes the spectral bin number; the definitions of the remaining symbols are likewise missing in the source.

26. The downmixer according to claim 1,

wherein the first spectral domain representation of the first channel of the at least two channels has a first time resolution or a first frequency resolution, and the second spectral domain representation of the second channel of the at least two channels has a second time resolution or a second frequency resolution, the second time resolution being different from the first time resolution, or the second frequency resolution being different from the first frequency resolution, and

wherein the weight estimator (100) is configured to convert (132) the first spectral domain representation to a combined spectral domain representation having the second time resolution or the second frequency resolution, and to calculate the weight values across frequency bands using the combined spectral domain representation and the second spectral domain representation, or to convert the second spectral domain representation to a combined spectral domain representation having the first time resolution or the first frequency resolution, and to calculate the weight values across frequency bands using the combined spectral domain representation and the first spectral domain representation, or

wherein the first spectral domain representation of the first channel of at least two channels has a first time resolution or a first frequency resolution, the second spectral domain representation of a second channel of at least two channels has a second time resolution or a second frequency resolution, wherein the second time resolution is different from the first time resolution, or the second frequency resolution is different from the first frequency resolution, and

wherein the weight estimation module (100) is configured to

transform (132) the first spectral domain representation into a first combined spectral domain representation having a third time resolution or a third frequency resolution,

wherein the third time resolution is different from the first time resolution or the second time resolution, and the third frequency resolution is different from the first frequency resolution or the second frequency resolution,

transform (132) the second spectral domain representation into a second combined spectral domain representation having the third time resolution or the third frequency resolution, and

calculate (134) the weight values across frequency bands using the first combined spectral domain representation and the second combined spectral domain representation.

27. The downmixer according to claim 26,

in which the second channel contains for a certain time section (TCX20) the second representation in the spectral domain,

in which the first channel contains for a certain time section (2xTCX10) two or more of the first representations in the spectral domain,

wherein the weight estimator (100) is configured to convert the two or more first spectral domain representations into a combined spectral domain representation having the same time and frequency resolution as the second spectral domain representation, and calculate the weight values over the frequency bands using the combined spectral domain representation and the second spectral domain representation, and

wherein the spectral weighting module (200) is configured to weight the second spectral domain representation using frequency band weights and weight each first spectral domain representation from the two or more first spectral domain representations using the same frequency band weights.

28. The downmixer according to claim 27,

wherein the weight estimation module (100) is configured to add spectral values for the same frequency of the two or more first spectral domain representations to obtain a first spectral value of the combined spectral domain representation, and to subtract spectral values for the same frequency of the two or more first spectral domain representations to obtain a second spectral value of the combined spectral domain representation that is higher than, and adjacent in frequency to, the first spectral value of the combined spectral domain representation, and

wherein the spectral weighting module (200) is configured to weight a frequency band having the same frequencies in each first spectral domain representation of the two or more first spectral domain representations using the same frequency band weighting.
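The weighting step of claims 27 and 28 can be sketched as follows: one set of band weights is applied to the full-resolution frame and, unchanged, to every subframe of the other channel. The sketch assumes that a subframe has half the frequency resolution of the frame, so a band covering frame bins [a, b) covers subframe bins [a/2, b/2), and that band edges are even. The function names and the band-edge layout are hypothetical.

```python
def apply_band_weights(spectrum, band_edges, weights):
    """Scale every bin of band b (bins band_edges[b] .. band_edges[b+1]-1) by weights[b]."""
    if len(band_edges) != len(weights) + 1:
        raise ValueError("need exactly one more band edge than weights")
    weighted = list(spectrum)
    for band, weight in enumerate(weights):
        for i in range(band_edges[band], band_edges[band + 1]):
            weighted[i] = weight * spectrum[i]
    return weighted


def weight_frame_and_subframes(frame, subframes, frame_band_edges, weights):
    """Apply the same band weights to the frame and to each subframe."""
    # Assumption: subframes have half the frequency resolution of the frame.
    sub_edges = [edge // 2 for edge in frame_band_edges]
    weighted_frame = apply_band_weights(frame, frame_band_edges, weights)
    weighted_subframes = [apply_band_weights(s, sub_edges, weights) for s in subframes]
    return weighted_frame, weighted_subframes


frame_out, subs_out = weight_frame_and_subframes(
    [1.0, 1.0, 1.0, 1.0], [[1.0, 1.0], [2.0, 2.0]], [0, 2, 4], [2.0, 0.5]
)
print(frame_out, subs_out)
```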

29. The downmixer according to claim 26,

in which the first channel contains for a certain time section (2xTCX10) two or more first representations in the spectral domain,

wherein the weight estimation module (100) is configured to

convert the second spectral domain representation into two or more combined spectral domain representations having the same time and frequency resolution as the two or more first spectral domain representations,

calculate first weight values across frequency bands using a first combined spectral domain representation of the two or more combined spectral domain representations and a first first spectral domain representation of the two or more first spectral domain representations, and

calculate second weight values across frequency bands using a second combined spectral domain representation of the two or more combined spectral domain representations and a second first spectral domain representation of the two or more first spectral domain representations, and

wherein the spectral weighting module (200) is configured to

weight the second spectral domain representation using derived weight values across frequency bands that are derived (136) from the first and second weight values across frequency bands,

weight the first first spectral domain representation of the two or more first spectral domain representations using the first weight values across frequency bands, and

weight the second first spectral domain representation of the two or more first spectral domain representations using the second weight values across frequency bands.

30. The downmixer according to claim 29,

wherein the weight estimator (100) is configured to add the spectral values for frequency pairs of the second spectral domain representation to obtain a combined spectral value, and to copy the combined spectral value to obtain a combined spectral value for each of the two or more combined spectral domain representations, and

wherein the spectral weighting module (200) is configured to combine (136) a weight value for a specific frequency band from the first weight values across frequency bands with a weight value for the specific frequency band from the second weight values across frequency bands to obtain a derived weight value for the specific frequency band of the derived weight values across frequency bands.
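The opposite conversion direction of claims 29 and 30 can be sketched in the same spirit: the higher-resolution spectrum is reduced by adding pairs of bins, the result is copied once per subframe, and the two resulting weight sets are merged into one derived set. The claim does not state how the weights are combined (136), so the arithmetic mean used here is an assumption, as are the use of adjacent bins as the "frequency pairs" and the function names.

```python
def to_subframe_resolution(spectrum, num_subframes=2):
    """Add adjacent bin pairs, then copy the result once per subframe."""
    if len(spectrum) % 2 != 0:
        raise ValueError("expected an even number of bins")
    paired = [spectrum[i] + spectrum[i + 1] for i in range(0, len(spectrum), 2)]
    return [list(paired) for _ in range(num_subframes)]


def derive_band_weights(first_weights, second_weights):
    """Combine two per-band weight sets into one (assumed rule: arithmetic mean)."""
    if len(first_weights) != len(second_weights):
        raise ValueError("weight sets must cover the same bands")
    return [0.5 * (a + b) for a, b in zip(first_weights, second_weights)]


print(to_subframe_resolution([1, 2, 3, 4]))
print(derive_band_weights([1.0, 0.5], [0.5, 0.5]))
```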

31. The downmixer according to claim 26,

wherein the weight estimation module (100) is configured to

convert the second spectral domain representation into two or more combined spectral domain representations having the same time resolution as the two or more first spectral domain representations and having the same frequency resolution as the second spectral domain representation,

calculate first weight values across frequency bands using a first combined spectral domain representation of the two or more combined spectral domain representations and a first first spectral domain representation of the two or more first spectral domain representations,

wherein the spectral weighting module (200) is configured to

weight the second spectral domain representation using weight values across frequency bands derived (136) from the first and second weight values across frequency bands,

weight the first first spectral domain representation of the two or more first spectral domain representations using the first weight values across frequency bands, and

32. The downmixer according to claim 31,

wherein the weight estimator (100) is configured to upsample one or more spectral values to obtain upsampled spectral values for adjacent frequencies of the second spectral domain representation, and to copy the upsampled spectral values to obtain combined spectral values for each of the two or more combined spectral domain representations, and

33. The downmixer according to claim 26,

wherein the weight estimation module (100) is configured to

convert the two or more first spectral domain representations to a first combined spectral domain representation having the same time resolution as the second spectral domain representation,

convert the second spectral domain representation to a second combined spectral domain representation having the same frequency resolution as the two or more first spectral domain representations, and

calculate the weight values across frequency bands using the first combined spectral domain representation and the second combined spectral domain representation, and

34. The downmixer according to claim 33,

wherein the weight estimation module (100) is configured to add spectral values for frequency pairs of the second spectral domain representation to obtain the second combined spectral domain representation, and to add spectral values of the same frequency from the two or more first spectral domain representations to obtain the first combined spectral domain representation, and

35. The downmixer according to any one of the preceding claims,

wherein the weight estimation module (100) is configured to calculate a plurality of first weight values across frequency bands for a plurality of frequency bands of the first channel of the at least two channels using a first calculation rule depending on at least two of: the spectral values of the first spectral domain representation of the first channel, the spectral values of the second spectral domain representation of the second channel, the spectral values of one combined spectral domain representation derived from the spectral values of the first spectral domain representation or the second spectral domain representation, the spectral values of the first combined spectral domain representation derived from the spectral values of the first spectral domain representation, and the spectral values of the second combined spectral domain representation derived from the spectral values of the second spectral domain representation, and

wherein the weight estimation module (100) is configured to calculate a plurality of second weight values across frequency bands for a plurality of frequency bands of the first channel of the at least two channels using a second calculation rule depending on at least two of: the plurality of first weight values across frequency bands, the spectral values of the first spectral domain representation of the first channel, the spectral values of the second spectral domain representation of the second channel, the spectral values of one combined spectral domain representation derived from the spectral values of the first spectral domain representation or the second spectral domain representation, the spectral values of the first combined spectral domain representation derived from the spectral values of the first spectral domain representation, and the spectral values of the second combined spectral domain representation derived from the spectral values of the second spectral domain representation, wherein the second calculation rule differs from the first calculation rule.

36. A downmixer for downmixing a multi-channel audio signal having at least two channels, comprising:

a weight estimator (100) for estimating weight values across frequency bands for the at least two channels, wherein the weight estimator (100) is configured to calculate the weight values across frequency bands based on a target energy value per frequency band, such that the energy in a frequency band of the downmixed audio signal is in a predetermined ratio to the energies in the same frequency bands of the at least two channels;

a spectral weighting module (200) for weighting spectral domain representations of the at least two channels using the weight values across frequency bands; and

a mixer (400) for calculating the downmixed audio signal using weighted spectral domain representations of the at least two channels.
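The energy condition of claim 36 can be made concrete with a deliberately simplified sketch. It is not the weight rule of the claims that follow: it uses one common weight for both channels, takes the target energy as a fixed fraction `ratio` of the summed channel energies, and caps the weight with a hypothetical `max_weight` so that nearly anti-phase bands do not produce extreme gains. All names and default values are assumptions.

```python
import math


def band_energy(values):
    """Energy of a band of real spectral values."""
    return sum(v * v for v in values)


def energy_matched_downmix(left_band, right_band, ratio=0.5, max_weight=4.0):
    """Scale the passive sum of one band so its energy approaches the target energy."""
    target = ratio * (band_energy(left_band) + band_energy(right_band))
    passive = [l + r for l, r in zip(left_band, right_band)]
    passive_energy = band_energy(passive)
    if passive_energy == 0.0:
        # Fully cancelling band: a common weight cannot reach the target energy.
        return passive
    weight = min(math.sqrt(target / passive_energy), max_weight)
    return [weight * v for v in passive]


mixed = energy_matched_downmix([1.0, 0.0], [0.0, 1.0])
print(mixed, band_energy(mixed))
```

With `ratio=0.5` the downmix band carries the mean energy of the two input bands whenever the cap is not reached, which is one possible reading of "a predetermined ratio".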

37. The downmixer according to claim 36,

wherein the weight estimator (100) is configured to estimate (140) an imaginary spectral domain representation when the spectral domain representation is purely real, or to estimate (140) a real spectral domain representation when the spectral domain representation is purely imaginary, and

38. The downmixer according to claim 36 or 37, wherein the weight estimation module (100) is configured to calculate a first weight value for a frequency band of the first channel of said at least two channels, to calculate a second weight value for the frequency band of the second channel of said at least two channels, and to calculate the first weight value and the second weight value using (142) the energy of the first channel in the frequency band, the energy of the second channel in the frequency band, and a mixed component depending on a product (148) or a linear combination (146) of spectral values from said at least two channels in the frequency band.

39. The downmixer according to any one of claims 36-38,

wherein the weight estimation module (100) is configured to calculate, as the mixed component representing the linear combination (146), the square root of the energy of the spectral values added to each other in the frequency band from the spectral domain representations of the at least two channels, the frequency band containing a plurality of spectral values, or to calculate, as the mixed component representing the product (148), the absolute value of the complex scalar product between the spectral values in the frequency band of the first channel and the spectral values in the frequency band of the second channel of the at least two channels.
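The two mixed components named in claim 39 can be written out directly. The sketch assumes each band is a list of complex spectral values (for instance a real part plus an estimated imaginary part). The function names are illustrative, and conjugating the second channel is a convention choice that does not affect the absolute value.

```python
import math


def linear_combination_term(first_band, second_band):
    """Square root of the energy of the bin-wise sum of both channels in the band."""
    return math.sqrt(sum(abs(a + b) ** 2 for a, b in zip(first_band, second_band)))


def scalar_product_term(first_band, second_band):
    """Absolute value of the complex scalar product between the two channels in the band."""
    return abs(sum(a * b.conjugate() for a, b in zip(first_band, second_band)))


first = [3 + 0j]
second = [0 + 4j]
print(linear_combination_term(first, second))
print(scalar_product_term(first, second))
```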

40. The downmixer according to any one of claims 36-39,

wherein each frequency band of the first and second channels of said at least two channels has a plurality of spectral values, wherein the spectral weighting module (200) is configured to apply the same weighting factor to each spectral value in the frequency band of one of the at least two channels, and to apply a different weighting factor to each spectral value in the frequency band of another channel of the at least two channels.

41. The downmixer according to any one of claims 36-40, wherein the weight estimator (100) is configured to calculate (150) the weight values across frequency bands for the first channel of the at least two channels based on the following equation:

where w_R is the weighting factor for the first channel for the frequency band,

is the estimated power for the second channel,

is the estimated power for the first channel in the frequency band,

is the estimated dot product between channels in the frequency band,

is the estimated amplitude for the second channel in the frequency band,

is the estimated amplitude for the first channel in the frequency band.

42. The downmixer of claim 41, wherein the weight estimator (100) is configured to calculate (152) the weight values across the frequency bands for the second channel of the at least two channels based on the following equation:

where w_L is the weighting factor for the second channel for the frequency band,

43. The downmixer according to any one of claims 40-42, wherein the weight estimation module (100) is configured to calculate (144) the estimated amplitude for the second channel in the frequency band and the estimated amplitude for the first channel in the frequency band based on the following equations:

, or

wherein the weight estimator (100) is configured to compute (146) an estimated linear combination of the estimated amplitudes for the first channel and the second channel in the frequency band based on the following equation:

, or

wherein the weight estimator (100) is configured to compute (148) an estimated scalar product between channels in the frequency band based on the following equation:

or

wherein the weight estimator (100) is configured to calculate (142) the estimated power for the second channel in the frequency band or the estimated power for the first channel in the frequency band based on the following equation:

where [symbol missing in the source] denotes the bin number in the spectral band [symbol missing in the source], and [symbol missing in the source] represents the estimated imaginary part of bin i of the MDCT transform,

44. The downmixer according to any one of claims 36-43,

wherein the first spectral domain representation of the first channel of at least two channels has a first time resolution or a first frequency resolution, the second spectral domain representation of a second channel of at least two channels has a second time resolution or a second frequency resolution, wherein the second time resolution is different from the first time resolution and the second frequency resolution is different from the first frequency resolution (130), and

wherein the weight estimator (100) is configured to convert (132) the first spectral domain representation to a combined spectral domain representation having the second time resolution or the second frequency resolution, and to calculate (134) the weight values across frequency bands using the combined spectral domain representation and the second spectral domain representation, or to convert (132) the second spectral domain representation to a combined spectral domain representation having the first time resolution or the first frequency resolution, and to calculate (134) the weight values across frequency bands using the combined spectral domain representation and the first spectral domain representation, or

wherein the first spectral domain representation of the first channel of said at least two channels has the first time resolution or the first frequency resolution, the second spectral domain representation of the second channel of the at least two channels has the second time resolution or the second frequency resolution, and the second the time resolution is different from the first time resolution and the second frequency resolution is different from the first frequency resolution (130), and

in which the weight estimation module (100) is configured to

45. The downmixer according to claim 44, wherein the spectral weighting module (200) is configured to weight, as the spectral domain representations of the at least two channels, one of: the combined spectral domain representation and the second spectral domain representation; the combined spectral domain representation and the first spectral domain representation; and the first combined spectral domain representation and the second combined spectral domain representation, to obtain a first weighted spectral domain representation and a second weighted spectral domain representation.

46. The downmixer according to claim 45, wherein the mixer (400) is configured to add the first and second weighted spectral domain representations to obtain a spectral domain downmix representation and to convert the spectral domain downmix representation to the time domain to obtain the downmixed audio signal, or to convert the first and second weighted spectral domain representations to the time domain to obtain time representations of the at least two channels and to add the time representations of the at least two channels to obtain the downmixed audio signal.
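The two mixing orders of claim 46 give the same signal whenever the spectrum-to-time conversion is linear. The sketch below demonstrates this with a naive inverse DFT as a stand-in transform. This is an assumption made for brevity: a codec using the TCX frames named in the claims would use an inverse MDCT with windowing and overlap-add, which is also linear. The function names are illustrative.

```python
import cmath


def inverse_dft(spectrum):
    """Naive inverse DFT, used here only as a generic linear inverse transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def mix_then_convert(weighted_a, weighted_b):
    """Add the weighted spectra first, then convert the sum to the time domain."""
    return inverse_dft([a + b for a, b in zip(weighted_a, weighted_b)])


def convert_then_mix(weighted_a, weighted_b):
    """Convert each weighted spectrum to the time domain, then add the time signals."""
    time_a = inverse_dft(weighted_a)
    time_b = inverse_dft(weighted_b)
    return [a + b for a, b in zip(time_a, time_b)]


spec_a = [1.0, 0.5, -0.25, 0.0]
spec_b = [0.0, 1.0, 0.75, -0.5]
difference = max(
    abs(x - y) for x, y in zip(mix_then_convert(spec_a, spec_b), convert_then_mix(spec_a, spec_b))
)
print(difference < 1e-12)
```

The first order needs only one inverse transform per frame, while the second keeps the channels separate until the time domain.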

47. A method for downmixing a multi-channel audio signal having at least two channels, the method comprising the steps of:

estimating weight values across frequency bands for said at least two channels;

weighting the spectral domain representations of the at least two channels using weights across frequency bands;

converting the weighted spectral domain representations of the at least two channels into time representations of the at least two channels; and

mixing temporal representations of at least two channels to obtain a downmixed audio signal.

48. A method for downmixing a multi-channel audio signal having at least two channels, the method comprising the steps of:

estimating weight values across frequency bands for the at least two channels, which comprises calculating the weight values across frequency bands based on a target energy value for each frequency band, such that the energy in a frequency band of the downmixed audio signal is in a predetermined ratio to the energies in the same frequency bands of the at least two channels;

weighting the spectral domain representations of the at least two channels using the weight values across frequency bands to obtain weighted spectral domain representations; and

calculating the downmixed audio signal using weighted representations in the spectral domain of at least two channels.

49. A physical storage medium storing a computer program for performing the method of claim 47 when the program is executed on a computer or processor.

50. A physical storage medium storing a computer program for performing the method of claim 48 when the program is executed on a computer or processor.