UA119808C2

UA119808C2 - Decoding audio bitstreams with enhanced spectral band replication metadata in at least one fill element

Info

Publication number: UA119808C2
Application number: UAA201709027A
Authority: UA
Inventors: Lars Villemoes; Heiko Purnhagen; Per Ekstrand
Original assignee: Dolby International AB
Priority date: 2015-03-13
Filing date: 2016-03-10
Publication date: 2019-08-12
Also published as: TW202042215A; DK3268956T3; HK1259544A1; ES2770029T3; HK1259408A1; JP6922017B2; ZA201901691B; IL256786A; IL285643B2; JP7210658B2; ZA201801941B; IL279327B; HK1259548A1; HUE053954T2; SG10202005260VA; HK1259131A1; ZA201905506B; TWI732403B; ZA201705978B; IL256786B

Abstract

Embodiments relate to an audio processing unit that includes a buffer, a bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided.

Description

Cross-reference to related applications

This application claims priority to European Patent Application No. 15159067.6, filed March 13, 2015, and U.S. Provisional Patent Application No. 62/133,800, filed March 16, 2015, each of which is incorporated herein by reference in its entirety.

Technical field

The invention pertains to audio signal processing. Some embodiments pertain to encoding and decoding of audio bitstreams (e.g., bitstreams having an MPEG-4 AAC format) that include metadata for controlling enhanced spectral band replication (eSBR). Other embodiments pertain to decoding of such bitstreams by legacy decoders which are not configured to perform eSBR processing and which ignore such metadata, or to decoding of an audio bitstream which does not include such metadata, including by generating eSBR control data in response to the bitstream.

Background

A typical audio bitstream includes both audio data (e.g., encoded audio data) indicative of one or more channels of audio content, and metadata indicative of at least one characteristic of the audio data or audio content. One well-known format for generating an encoded audio bitstream is the MPEG-4 Advanced Audio Coding (AAC) format, described in the standard ISO/IEC 14496-3:2009. In the MPEG-4 standard, AAC denotes "advanced audio coding" and HE-AAC denotes "high-efficiency advanced audio coding."

The MPEG-4 AAC standard defines several audio profiles, which determine which objects and coding tools are present in a compliant encoder or decoder. Three of these audio profiles are (1) the AAC profile, (2) the HE-AAC profile, and (3) the HE-AAC v2 profile. The AAC profile includes the AAC low complexity (or "AAC-LC") object type. The AAC-LC object is the counterpart to the MPEG-2 AAC low complexity profile, with some adjustments, and includes neither the spectral band replication ("SBR") object type nor the parametric stereo ("PS") object type. The HE-AAC profile is a superset of the AAC profile and additionally includes the SBR object type. The HE-AAC v2 profile is a superset of the HE-AAC profile and additionally includes the PS object type.

The SBR object type contains the spectral band replication tool, which is an important coding tool that significantly improves the compression efficiency of perceptual audio codecs. SBR reconstructs the high-frequency components of an audio signal on the receiver side (e.g., in the decoder). Thus, the encoder needs only to encode and transmit the low-frequency components, allowing for a much higher audio quality at low data rates. SBR is based on replication of the sequences of harmonics, previously truncated in order to reduce the data rate, from the available bandwidth-limited signal and control data obtained from the encoder. The ratio between tonal and noise-like components is maintained by adaptive inverse filtering as well as the optional addition of noise and sinusoids. In the MPEG-4 AAC standard, the SBR tool performs spectral patching, in which a number of adjoining Quadrature Mirror Filter (QMF) subbands are copied from a transmitted lowband portion of an audio signal to a highband portion of the audio signal, which is generated in the decoder.
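The copy step of spectral patching can be illustrated with a toy sketch. It is not the SBR algorithm of the MPEG-4 AAC standard: the function name patch_highband, its parameters, and the cyclic copy rule are assumptions made for illustration, and real SBR operates on complex QMF samples and follows the copy with envelope adjustment, inverse filtering, and noise and sinusoid addition.

```python
def patch_highband(lowband, start, num_high):
    """Toy spectral patching: build num_high highband subbands by cyclically
    copying the adjoining lowband subbands that begin at index `start`.

    `lowband` is a list of subbands; each subband is a list of samples.
    All names here are hypothetical and not taken from the standard.
    """
    if not 0 <= start < len(lowband):
        raise ValueError("start must index a transmitted lowband subband")
    source = lowband[start:]  # adjoining subbands available for copying
    highband = [list(source[k % len(source)]) for k in range(num_high)]
    return lowband + highband  # lowband followed by the patched highband


# Four transmitted lowband subbands, each holding two time-slot samples.
low = [[1.0, 1.1], [2.0, 2.1], [3.0, 3.1], [4.0, 4.1]]
full = patch_highband(low, start=1, num_high=4)
```

In this sketch the highband is a plain copy, which is why content with a low crossover frequency can be poorly served: the copied subbands need not line up with the harmonic structure of the original highband.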

Spectral patching may not be ideal for certain audio types, such as musical content with relatively low crossover frequencies. Therefore, techniques for improving spectral band replication are needed.

Brief description of embodiments of the invention

A first class of embodiments relates to audio processing units that include a memory, a bitstream payload deformatter, and a decoding subsystem. The memory is configured to store at least one block of an encoded audio bitstream (e.g., an MPEG-4 AAC bitstream). The bitstream payload deformatter is configured to demultiplex the encoded audio block. The decoding subsystem is configured to decode audio content of the encoded audio block. The encoded audio block includes a fill element with an identifier indicating the start of the fill element, and fill data after the identifier. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the encoded audio block.

A second class of embodiments relates to methods for decoding an encoded audio bitstream. The method includes receiving at least one block of an encoded audio bitstream, demultiplexing at least some portions of the at least one block of the encoded audio bitstream, and decoding at least some portions of the at least one block of the encoded audio bitstream. The at least one block of the encoded audio bitstream includes a fill element with an identifier indicating the start of the fill element, and fill data after the identifier. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the at least one block of the encoded audio bitstream.

Other classes of embodiments relate to encoding and transcoding audio bitstreams containing metadata identifying whether enhanced spectral band replication (eSBR) processing is to be performed.

Brief description of the drawings

Fig. 1 is a block diagram of an embodiment of a system which may be configured to perform an embodiment of the inventive method.

Fig. 2 is a block diagram of an encoder which is an embodiment of the inventive audio processing unit.

Fig. 3 is a block diagram of a system including a decoder which is an embodiment of the inventive audio processing unit, and optionally also a post-processor coupled thereto.

Fig. 4 is a block diagram of a decoder which is an embodiment of the inventive audio processing unit.

Fig. 5 is a block diagram of a decoder which is another embodiment of the inventive audio processing unit.

Fig. 6 is a block diagram of another embodiment of the inventive audio processing unit.

Fig. 7 is a diagram of a block of an MPEG-4 AAC bitstream, including the segments into which it is divided.

Notation and nomenclature

Throughout this disclosure, including in the claims, the expression performing an operation "on" a signal or data (e.g., filtering, scaling, transforming, or applying gain to the signal or data) is used in a broad sense to denote performing the operation directly on the signal or data, or on a processed version of the signal or data (e.g., on a version of the signal that has undergone preliminary filtering or pre-processing before the operation is performed).

Throughout this disclosure, including in the claims, the expression "audio processing unit" is used in a broad sense to denote a system or device configured to process audio data. Examples of audio processing units include, but are not limited to, encoders (e.g., transcoders), decoders, codecs, pre-processing systems, post-processing systems, and bitstream processing systems (sometimes referred to as bitstream processing tools). Virtually all consumer electronics, such as mobile phones, televisions, laptops, and tablet computers, contain an audio processing unit.

Throughout this disclosure, including in the claims, the term "couples" or "coupled" is used in a broad sense to mean either a direct or an indirect connection. Thus, if a first device couples to a second device, that connection may be through a direct connection, or through an indirect connection via other devices and connections. Moreover, components that are integrated into or with other components are also coupled to each other.

Detailed description of embodiments of the invention

The MPEG-4 AAC standard contemplates that an encoded MPEG-4 AAC bitstream includes metadata indicative of each type of SBR processing to be applied (if any is to be applied) by a decoder to decode audio content of the bitstream, and/or which controls such SBR processing, and/or is indicative of at least one characteristic or parameter of at least one SBR tool to be employed to decode audio content of the bitstream. Herein, we use the expression "SBR metadata" to denote metadata of this type which is described or mentioned in the MPEG-4 AAC standard.

The top level of an MPEG-4 AAC bitstream is a sequence of data blocks ("raw_data_block" elements), each of which is a segment of data (herein referred to as a "block") that contains audio data (typically for a time period of 1024 or 960 samples) and related information and/or other data. Herein, we use the term "block" to denote a segment of an MPEG-4 AAC bitstream comprising audio data (and corresponding metadata, and optionally also other related data) which determines or is indicative of one (but not more than one) "raw_data_block" element.

Each block of an MPEG-4 AAC bitstream can include a number of syntactic elements (each of which is also materialized in the bitstream as a segment of data). Seven types of such syntactic elements are defined in the MPEG-4 AAC standard. Each syntactic element is identified by a different value of the data element "id_syn_ele". Examples of syntactic elements include a "single_channel_element()", a "channel_pair_element()", and a "fill_element()". A single channel element is a container including audio data of a single audio channel (a monophonic audio signal). A channel pair element includes audio data of two audio channels (that is, a stereo audio signal).

A fill element is a container of information including an identifier (e.g., the value of the above-noted element "id_syn_ele") followed by data, which is referred to as "fill data". Fill elements have historically been used to adjust the instantaneous bit rate of bitstreams that are to be transmitted over a constant-rate channel. By adding the appropriate amount of fill data to each block, a constant data rate may be achieved.
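The arithmetic behind this use of fill data can be sketched as follows. This is a simplified illustration under stated assumptions: the function name fill_bits_needed and its parameters are hypothetical, the per-block budget is rounded down to whole bits, and effects that a real encoder would account for (such as a bit reservoir or the fill element's own header bits) are ignored.

```python
def fill_bits_needed(sample_rate_hz, bitrate_bps, block_len, payload_bits):
    """Return how many fill bits bring one block up to a constant-rate budget.

    block_len is the number of audio samples covered by one block
    (typically 1024 or 960). All names are hypothetical.
    """
    # Bits the channel carries during the time span of one block.
    budget_bits = bitrate_bps * block_len // sample_rate_hz
    if payload_bits > budget_bits:
        raise ValueError("payload exceeds the constant-rate budget")
    return budget_bits - payload_bits


# A 1024-sample block at 48 kHz on a 64 kbit/s channel has a budget of
# 1365 bits; a block that used 1200 bits is topped up with 165 fill bits.
padding = fill_bits_needed(48000, 64000, 1024, 1200)
```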

In accordance with embodiments of the invention, the fill data may include one or more extension payloads that extend the type of data (e.g., metadata) capable of being transmitted in a bitstream. When a bitstream carries fill data containing a new type of data, a device receiving the bitstream (e.g., a decoder) may optionally use that data to extend the functionality of the device. Thus, as can be appreciated by one skilled in the art, fill elements are a special type of data structure and are different from the data structures typically used to transmit audio data (e.g., audio payloads containing channel data).

In some embodiments of the invention, the identifier used to identify a fill element may consist of a three-bit unsigned integer, transmitted most significant bit first ("uimsbf"), having a value of 0x6. In one block, several instances of the same type of syntactic element (e.g., several fill elements) may occur.
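A minimal sketch of how a reader might test for this identifier is given below. The class BitReader and the function starts_with_fill_element are hypothetical helpers written for illustration; only the three-bit width, the most-significant-bit-first order, and the value 0x6 come from the text above.

```python
ID_FIL = 0x6  # three-bit identifier value of a fill element


class BitReader:
    """Hypothetical helper that reads bits most significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def starts_with_fill_element(data: bytes) -> bool:
    """Check whether the first three bits equal the fill element identifier."""
    return BitReader(data).read(3) == ID_FIL


# 0xC0 is 1100 0000 in binary, so its first three bits are 110 (0x6).
is_fill = starts_with_fill_element(bytes([0xC0]))
# 0x20 is 0010 0000 in binary, so its first three bits are 001.
is_not_fill = starts_with_fill_element(bytes([0x20]))
```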

Another standard for encoding audio bitstreams is the MPEG Unified Speech and Audio Coding (USAC) standard (ISO/IEC 23003-3:2012).

The MPEG USAC standard describes encoding and decoding of audio content using spectral band replication processing (including SBR processing as described in the MPEG-4 AAC standard, and also including other enhanced forms of spectral band replication processing). This processing applies spectral band replication tools (sometimes referred to herein as "enhanced SBR tools" or "eSBR tools") of an expanded and enhanced version of the set of SBR tools described in the MPEG-4 AAC standard. Thus, eSBR (as defined in the USAC standard) is an improvement to SBR (as defined in the MPEG-4 AAC standard).

Herein, we use the expression "enhanced SBR processing" (or "eSBR processing") to denote spectral band replication processing using at least one eSBR tool (e.g., at least one eSBR tool which is described or mentioned in the MPEG USAC standard) which is not described or mentioned in the MPEG-4 AAC standard. Examples of such eSBR tools are harmonic transposition, additional pre-processing for QMF patching (or "pre-flattening"), and inter-subband sample Temporal Envelope Shaping (or "inter-TES").

A bitstream generated in accordance with the MPEG USAC standard (sometimes referred to herein as a "USAC bitstream") includes encoded audio content and typically includes metadata indicative of each type of spectral band replication processing to be applied by a decoder to decode audio content of the USAC bitstream, and/or metadata which controls such spectral band replication processing, and/or is indicative of at least one characteristic or parameter of at least one SBR tool and/or eSBR tool to be employed to decode audio content of the USAC bitstream.

Herein, we use the expression "enhanced SBR metadata" (or "eSBR metadata") to denote metadata indicative of each type of spectral band replication processing to be applied by a decoder to decode audio content of an encoded audio bitstream (e.g., a USAC bitstream), and/or which controls such spectral band replication processing, and/or is indicative of at least one characteristic or parameter of at least one SBR tool and/or eSBR tool to be employed to decode such audio content, but which is not described or mentioned in the MPEG-4 AAC standard.

An example of eSBR metadata is the metadata (indicative of, or for controlling, spectral band replication processing) which is described or mentioned in the MPEG USAC standard but not in the MPEG-4 AAC standard. Thus, eSBR metadata herein denotes metadata which is not SBR metadata, and SBR metadata herein denotes metadata which is not eSBR metadata.

A USAC bitstream may include both SBR metadata and eSBR metadata. More specifically, a USAC bitstream may include eSBR metadata which controls the performance of eSBR processing by a decoder, and SBR metadata which controls the performance of SBR processing by the decoder. In accordance with typical embodiments of the present invention, eSBR metadata (e.g., eSBR-specific configuration data) is included in an MPEG-4 AAC bitstream (e.g., in the sbr_extension() container at the end of an SBR payload).

Performance of eSBR processing by a decoder, during decoding of an encoded bitstream using an eSBR tool set (comprising at least one eSBR tool), regenerates the high-frequency band of the audio signal based on replication of sequences of harmonics which were truncated during encoding. Such eSBR processing typically adjusts the spectral envelope of the generated high-frequency band, applies inverse filtering, and adds noise and sinusoidal components in order to recreate the spectral characteristics of the original audio signal.

In accordance with typical embodiments of the invention, eSBR metadata is included (e.g., a small number of control bits which are eSBR metadata are included) in one or more metadata segments of an encoded audio bitstream (e.g., an MPEG-4 AAC bitstream) which also includes encoded audio data in other segments (audio data segments). Typically, at least one such metadata segment of each block of the bitstream is (or includes) a fill element (including an identifier indicating the start of the fill element), and the eSBR metadata is included in the fill element after the identifier.
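As a minimal sketch of how a parser might locate such a fill element, the following assumes the fill element layout of MPEG-4 AAC as commonly described: a 3-bit element identifier (`ID_FIL`, value 0x6), a 4-bit byte count, and an 8-bit escape count when the count equals 15. The `BitReader` class and `parse_fill_element` function are hypothetical helpers, not part of any library. Where exactly the eSBR bits sit inside the returned payload is not specified in this passage, so the payload is returned as opaque bytes.

```python
ID_FIL = 0x6  # assumed 3-bit identifier marking the start of a fill element


class BitReader:
    """Reads big-endian bit fields from a bytes object (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_fill_element(reader: BitReader):
    """Return the fill data bytes, or None if the element is not a fill element."""
    if reader.read(3) != ID_FIL:
        return None
    count = reader.read(4)
    if count == 15:
        # Escape mechanism: an extra 8-bit field extends the byte count.
        count += reader.read(8) - 1
    # The fill data (which would carry the eSBR metadata) follows the identifier.
    return bytes(reader.read(8) for _ in range(count))
```

A decoder that does not understand the payload can still use the count to skip over it, which is what makes the fill element a convenient carrier for optional metadata.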

Fig. 1 is a block diagram of an exemplary audio processing chain (an audio data processing system), in which one or more of the elements of the system may be configured in accordance with an embodiment of the present invention. The system includes the following elements, coupled together as shown: encoder 1, delivery subsystem 2, decoder 3, and post-processing unit 4. In variations on the system shown, one or more of the elements are omitted, or additional audio data processing units are included.

In some implementations, encoder 1 (which optionally includes a pre-processing unit) is configured to accept PCM (time-domain) samples comprising audio content as input, and to output an encoded audio bitstream (having a format compliant with the MPEG-4 AAC standard) indicative of the audio content. The data of the bitstream that are indicative of the audio content are sometimes referred to herein as "audio data" or "encoded audio data." If the encoder is configured in accordance with a typical embodiment of the present invention, the audio bitstream output from the encoder includes eSBR metadata (and typically also other metadata) as well as audio data.

One or more encoded audio bitstreams output from encoder 1 may be passed to encoded audio delivery subsystem 2. Subsystem 2 is configured to store and/or deliver each encoded bitstream output from encoder 1. An encoded audio bitstream output from encoder 1 may be stored by subsystem 2 (e.g., in the form of a DVD or Blu-ray disc), or transmitted by subsystem 2 (which may implement a transmission link or network), or may be both stored and transmitted by subsystem 2.

Decoder 3 is configured to decode an encoded MPEG-4 AAC audio bitstream (generated by encoder 1) which it receives via subsystem 2. In some embodiments, decoder 3 is configured to extract eSBR metadata from each block of the bitstream, and to decode the bitstream (including by performing eSBR processing using the extracted eSBR metadata) to generate decoded audio data (e.g., streams of decoded PCM audio samples). In some embodiments, decoder 3 is configured to extract SBR metadata from the bitstream (but to ignore eSBR metadata included in the bitstream), and to decode the bitstream (including by performing SBR processing using the extracted SBR metadata) to generate decoded audio data (e.g., streams of decoded PCM audio samples). Typically, decoder 3 includes a buffer which stores (e.g., in a non-volatile manner) segments of the encoded audio bitstream received from subsystem 2.

Post-processing unit 4 of Fig. 1 is configured to accept a stream of decoded audio data from decoder 3 (e.g., decoded PCM audio samples), and to perform post-processing on it. Post-processing unit 4 may also be configured to render the post-processed audio content (or the decoded audio received from decoder 3) for playback by one or more speakers.

Fig. 2 is a block diagram of an encoder (100) which is an embodiment of the inventive audio processing unit. Any of the components or elements of encoder 100 may be implemented as one or more processes and/or one or more circuits (e.g., ASICs, FPGAs, or other integrated circuits), in hardware, software, or a combination of hardware and software. Encoder 100 includes encoder 105, formatting stage 107, metadata generation stage 106, and buffer memory 109, connected as shown. Typically, encoder 100 also includes other processing elements (not shown). Encoder 100 is configured to convert an input audio bitstream to an encoded output MPEG-4 AAC bitstream.

Metadata generator 106 is coupled and configured to generate (and/or pass through to stage 107) metadata (including eSBR metadata and SBR metadata) to be included by stage 107 in the encoded bitstream to be output from encoder 100.

Encoder 105 is coupled and configured to encode (e.g., by performing compression on) the input audio data, and to pass the resulting encoded audio to stage 107 for inclusion in the encoded bitstream to be output from stage 107.

Stage 107 is configured to multiplex the encoded audio from encoder 105 and the metadata (including eSBR metadata and SBR metadata) from generator 106 to generate the encoded bitstream to be output from stage 107, preferably so that the encoded bitstream has a format as specified by one of the embodiments of the present invention.
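One part of this multiplexing, packing metadata bytes into a fill element that can then be placed among the audio data segments, might be sketched as follows. The sketch assumes the MPEG-4 AAC fill element layout (3-bit `ID_FIL` identifier of value 0x6, 4-bit count, 8-bit escape count when the count is 15); `BitWriter` and `write_fill_element` are hypothetical names, and the metadata payload is treated as opaque bytes because its internal syntax is not given in this passage.

```python
ID_FIL = 0x6  # assumed 3-bit identifier marking the start of a fill element


class BitWriter:
    """Accumulates big-endian bit fields (hypothetical helper)."""

    def __init__(self):
        self.bits = []

    def write(self, value: int, n: int) -> None:
        for shift in range(n - 1, -1, -1):
            self.bits.append((value >> shift) & 1)

    def to_bytes(self) -> bytes:
        # Zero-pad to a whole number of bytes before packing.
        padded = self.bits + [0] * (-len(self.bits) % 8)
        out = bytearray()
        for i in range(0, len(padded), 8):
            byte = 0
            for bit in padded[i:i + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
        return bytes(out)


def write_fill_element(writer: BitWriter, payload: bytes) -> None:
    """Write identifier, byte count, then the metadata payload."""
    n = len(payload)
    if n > 269:  # 15 + 255 - 1 is the largest count this layout can express
        raise ValueError("payload too large for a single fill element")
    writer.write(ID_FIL, 3)
    if n < 15:
        writer.write(n, 4)
    else:
        writer.write(15, 4)
        writer.write(n - 15 + 1, 8)  # escape count, so that 15 + esc - 1 == n
    for byte in payload:
        writer.write(byte, 8)
```

Because the identifier and count come first, a formatter built this way produces a segment that any parser of the same layout can skip or read without knowing what the payload means.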

Buffer memory 109 is configured to store (e.g., in a non-volatile manner) at least one block of the encoded audio bitstream output from stage 107, and a sequence of the blocks of the encoded audio bitstream is then passed from buffer memory 109 as output from encoder 100 to a delivery system.

Fig. 3 is a block diagram of a system including a decoder (200) which is an embodiment of the inventive audio processing unit, and optionally also a post-processor (300) coupled thereto. Any of the components or elements of decoder 200 and post-processor 300 may be implemented as one or more processes and/or one or more circuits (e.g., ASICs, FPGAs, or other integrated circuits), in hardware, software, or a combination of hardware and software. Decoder 200 comprises buffer memory 201, bitstream payload deformatter (parser) 205, audio decoding subsystem 202 (sometimes referred to as a "core" decoding stage or "core" decoding subsystem), eSBR processing stage 203, and control bit generation stage 204, connected as shown. Typically, decoder 200 also includes other processing elements (not shown).

Buffer memory (buffer) 201 stores (e.g., in a non-volatile manner) at least one block of an encoded audio bitstream received by decoder 200. In operation of decoder 200, a sequence of the blocks of the bitstream is passed from buffer 201 to deformatter 205.

In variations on the Fig. 3 embodiment (or the Fig. 4 embodiment to be described), an APU which is not a decoder (e.g., APU 500 of Fig. 6) includes a buffer memory (e.g., a buffer memory identical to buffer 201) which stores (e.g., in a non-volatile manner) at least one block of an encoded audio bitstream (e.g., an MPEG-4 AAC audio bitstream) of the same type received by buffer 201 of Fig. 3 or Fig. 4 (i.e., an encoded audio bitstream which includes eSBR metadata).

With reference again to Fig. 3, deformatter 205 is coupled and configured to demultiplex each block of the bitstream to extract SBR metadata (including quantized envelope data) and eSBR metadata (and typically also other metadata) therefrom, to pass at least the eSBR metadata and the SBR metadata to eSBR processing stage 203, and typically also to pass other extracted metadata to decoding subsystem 202 (and optionally also to control bit generator 204). Deformatter 205 is also coupled and configured to extract audio data from each block of the bitstream, and to pass the extracted audio data to decoding subsystem (decoding stage) 202.

The system of Fig. 3 optionally also includes post-processor 300. Post-processor 300 includes buffer memory (buffer) 301 and other processing elements (not shown), including at least one processing element coupled to buffer 301. Buffer 301 stores (e.g., in a non-volatile manner) at least one block (or frame) of the decoded audio data received by post-processor 300 from decoder 200. Processing elements of post-processor 300 are coupled and configured to receive and adaptively process a sequence of the blocks (or frames) of the decoded audio output from buffer 301, using metadata output from decoding subsystem 202 (and/or deformatter 205) and/or control bits output from stage 204 of decoder 200.

Audio decoding subsystem 202 of decoder 200 is configured to decode the audio data extracted by deformatter 205 (such decoding may be referred to as a "core" decoding operation) to generate decoded audio data, and to pass the decoded audio data to eSBR processing stage 203. The decoding is performed in the frequency domain and typically includes inverse quantization followed by spectral processing. Typically, a final stage of processing in subsystem 202 applies a frequency-domain-to-time-domain transform to the decoded frequency-domain audio data, so that the output of the subsystem is time-domain decoded audio data. Stage 203 is configured to apply the SBR tools and eSBR tools indicated by the SBR metadata and the eSBR metadata (extracted by deformatter 205) to the decoded audio data (i.e., to perform SBR processing and eSBR processing on the output of decoding subsystem 202 using the SBR metadata and the eSBR metadata) to generate the fully decoded audio data which is output (e.g., to post-processor 300) from decoder 200. Typically, decoder 200 includes a memory (accessible by subsystem 202 and stage 203) which stores the deformatted audio data and metadata output from deformatter 205, and stage 203 is configured to access the audio data and metadata (including SBR metadata and eSBR metadata) as needed during SBR and eSBR processing. The SBR processing and eSBR processing in stage 203 may be considered to be post-processing on the output of core decoding subsystem 202. Optionally, decoder 200 also includes a final upmixing subsystem (which may apply parametric stereo ("PS") tools defined in the MPEG-4 AAC standard, using PS metadata extracted by deformatter 205 and/or control bits generated in subsystem 204) which is coupled and configured to perform upmixing on the output of stage 203 to generate fully decoded, upmixed audio which is output from decoder 200. Alternatively, post-processor 300 is configured to perform upmixing on the output of decoder 200 (e.g., using PS metadata extracted by deformatter 205 and/or control bits generated in subsystem 204).

In response to metadata extracted by deformatter 205, control bit generator 204 may generate control data, and the control data may be used within decoder 200 (e.g., in a final upmixing subsystem) and/or passed as output of decoder 200 (e.g., to post-processor 300 for use in post-processing). In response to metadata extracted from the input bitstream (and optionally also in response to control data), stage 204 may generate (and pass to post-processor 300) control bits indicating that decoded audio data output from eSBR processing stage 203 should undergo a specific type of post-processing. In some implementations, decoder 200 is configured to pass metadata extracted by deformatter 205 from the input bitstream to post-processor 300, and post-processor 300 is configured to perform post-processing on the decoded audio data output from decoder 200 using the metadata.

Fig. 4 is a block diagram of an audio processing unit ("APU") (210) which is another embodiment of the inventive audio processing unit. APU 210 is a legacy decoder which is not configured to perform eSBR processing. Any of the components or elements of APU 210 may be implemented as one or more processes and/or one or more circuits (e.g., ASICs, FPGAs, or other integrated circuits), in hardware, software, or a combination of hardware and software. APU 210 comprises buffer memory 201, bitstream payload deformatter (parser) 215, audio decoding subsystem 202 (sometimes referred to as a "core" decoding stage or "core" decoding subsystem), and SBR processing stage 213, connected as shown. Typically, APU 210 also includes other processing elements (not shown).

Elements 201 and 202 of APU 210 are identical to the identically numbered elements of decoder 200 (of Fig. 3), and the above description of them will not be repeated. In operation of APU 210, a sequence of blocks of an encoded audio bitstream (an MPEG-4 AAC bitstream) received by APU 210 is passed from buffer 201 to deformatter 215.

Deformatter 215 is coupled and configured to demultiplex each block of the bitstream to extract SBR metadata (including quantized envelope data), and typically also other metadata, therefrom, but to ignore eSBR metadata that may be included in the bitstream in accordance with any embodiment of the present invention. Deformatter 215 is configured to pass at least the SBR metadata to SBR processing stage 213. Deformatter 215 is also coupled and configured to extract audio data from each block of the bitstream, and to pass the extracted audio data to decoding subsystem (decoding stage) 202.

Audio decoding subsystem 202 of decoder 200 is configured to decode the audio data extracted by deformatter 215 (such decoding may be referred to as a "core" decoding operation) to generate decoded audio data, and to pass the decoded audio data to SBR processing stage 213. The decoding is performed in the frequency domain.

Typically, a final stage of processing in subsystem 202 applies a frequency-domain-to-time-domain transform to the decoded frequency-domain audio data, so that the output of the subsystem is time-domain decoded audio data. Stage 213 is configured to apply the SBR tools (but not eSBR tools) indicated by the SBR metadata (extracted by deformatter 215) to the decoded audio data (i.e., to perform SBR processing on the output of decoding subsystem 202 using the SBR metadata) to generate the fully decoded audio data which is output (e.g., to post-processor 300) from APU 210. Typically, APU 210 includes a memory (accessible by subsystem 202 and stage 213) which stores the deformatted audio data and metadata output from deformatter 215, and stage 213 is configured to access the audio data and metadata (including SBR metadata) as needed during SBR processing. The SBR processing in stage 213 may be considered to be post-processing on the output of core decoding subsystem 202. Optionally, APU 210 also includes a final upmixing subsystem (which may apply parametric stereo ("PS") tools defined in the MPEG-4 AAC standard, using PS metadata extracted by deformatter 215) which is coupled and configured to perform upmixing on the output of stage 213 to generate fully decoded, upmixed audio which is output from APU 210. Alternatively, a post-processor is configured to perform upmixing on the output of APU 210 (e.g., using PS metadata extracted by deformatter 215 and/or control bits generated in APU 210).

Various implementations of encoder 100, decoder 200, and APU 210 are configured to perform different embodiments of the inventive method.

In accordance with some embodiments, eSBR metadata is included (e.g., a small number of control bits which are eSBR metadata are included) in an encoded audio bitstream (e.g., an MPEG-4 AAC bitstream), such that legacy decoders (which are not configured to parse the eSBR metadata, or to use any eSBR tool to which the eSBR metadata pertains) can ignore the eSBR metadata but nevertheless decode the bitstream to the extent possible without use of the eSBR metadata or any eSBR tool to which the eSBR metadata pertains, typically without any significant penalty in decoded audio quality. However, eSBR decoders configured to parse the bitstream to identify the eSBR metadata, and to use at least one eSBR tool in response to the eSBR metadata, will enjoy the benefits of using at least one such eSBR tool. Therefore, embodiments of the invention provide a means for efficiently transmitting enhanced spectral band replication (eSBR) control data or metadata in a backward-compatible fashion.

Typically, the eSBR metadata in the bitstream is indicative of (e.g., is indicative of at least one characteristic or parameter of) one or more of the following eSBR tools (which are described in the MPEG USAC standard, and which may or may not have been applied by an encoder during generation of the bitstream):
- harmonic transposition;
- QMF-patching additional pre-processing (pre-flattening); and
- inter-subband sample Temporal Envelope Shaping, or "inter-TES".

For example, the eSBR metadata included in the bitstream may be indicative of values of the parameters (described in the MPEG USAC standard and in the present disclosure): harmonicSBR[ch], sbrPatchingMode[ch], sbrOversamplingFlag[ch], sbrPitchInBins[ch], bs_interTes, bs_temp_shape[ch][env], bs_inter_temp_shape_mode[ch][env], and bs_sbr_preprocessing.

Herein, the notation X[ch], where X is some parameter, denotes that the parameter pertains to a channel ("ch") of audio content of an encoded bitstream to be decoded. For simplicity, we sometimes omit the expression [ch] and assume that the relevant parameter pertains to a channel of audio content.

Herein, the notation X[ch][env], where X is some parameter, denotes that the parameter pertains to an SBR envelope ("env") of a channel ("ch") of audio content of an encoded bitstream to be decoded. For simplicity, we sometimes omit the expressions [env] and [ch] and assume that the relevant parameter pertains to an SBR envelope of a channel of audio content.

As noted, the MPEG USAC standard contemplates that a USAC bitstream includes eSBR metadata which controls the performance of eSBR processing by a decoder. The eSBR metadata includes the following one-bit metadata parameters: harmonicSBR; bs_interTes; and bs_pvc.

The parameter "harmonicSBR" indicates the use of harmonic patching (harmonic transposition) for SBR. Specifically, harmonicSBR=0 indicates non-harmonic, spectral patching as described in Section 4.6.18.6.3 of the MPEG-4 AAC standard; and harmonicSBR=1 indicates harmonic SBR patching (of the type used in eSBR, as described in Section 7.5.3 or 7.5.4 of the MPEG USAC standard). Harmonic SBR patching is not used in accordance with non-eSBR spectral band replication (i.e., SBR that is not eSBR). Throughout this disclosure, spectral patching is referred to as a base form of spectral band replication, whereas harmonic transposition is referred to as an enhanced form of spectral band replication.

The value of the parameter "bs_interTes" indicates the use of the inter-TES tool of eSBR.

The value of the parameter "bs_pvc" indicates the use of the PVC tool of eSBR.

During decoding of an encoded bitstream, performance of harmonic transposition during an eSBR processing stage of the decoding (for each channel, "ch", of audio content indicated by the bitstream) is controlled by the following eSBR metadata parameters: sbrPatchingMode[ch]; sbrOversamplingFlag[ch]; sbrPitchInBinsFlag[ch]; and sbrPitchInBins[ch].

The value "sbrPatchingMode[ch]" indicates the transposer type used in eSBR: sbrPatchingMode[ch]=1 indicates non-harmonic patching as described in Section 4.6.18.6.3 of the MPEG-4 AAC standard; sbrPatchingMode[ch]=0 indicates harmonic SBR patching as described in Section 7.5.3 or 7.5.4 of the MPEG USAC standard.

The value "sbrOversamplingFlag[ch]" indicates the use of signal adaptive frequency domain oversampling in eSBR in combination with the DFT-based harmonic SBR patching described in Section 7.5.3 of the MPEG USAC standard. This flag controls the size of the DFTs that are utilized in the transposer: 1 indicates that signal adaptive frequency domain oversampling is enabled, as described in Section 7.5.3.1 of the MPEG USAC standard; 0 indicates that signal adaptive frequency domain oversampling is disabled, as described in Section 7.5.3.1 of the MPEG USAC standard.

The value "sbrPitchInBins[ch]" controls the addition of cross product terms in the SBR harmonic transposer. The value sbrPitchInBins[ch] is an integer value in the range [0, 127] and represents the distance measured in frequency bins for a 1536-line DFT acting on the sampling frequency of the core coder.
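The per-channel controls described above can be gathered in a small container, sketched here for illustration only. The class name, field names, and the `pitch_distance_hz` helper are hypothetical and do not come from either standard; the conversion to Hz assumes that one bin of a 1536-line DFT spans the core coder sampling frequency divided by 1536.

```python
from dataclasses import dataclass

DFT_LINES = 1536  # DFT size that sbrPitchInBins is measured against (from the text)


@dataclass(frozen=True)
class HarmonicTransposerParams:
    """Hypothetical per-channel container for the eSBR transposition controls."""

    sbr_patching_mode: int      # 1 = non-harmonic spectral patching, 0 = harmonic SBR patching
    sbr_oversampling_flag: int  # 1 = signal adaptive oversampling enabled, 0 = disabled
    sbr_pitch_in_bins: int      # integer in the range [0, 127]

    def __post_init__(self):
        # Reject values outside the ranges stated in the description.
        if self.sbr_patching_mode not in (0, 1):
            raise ValueError("sbrPatchingMode must be 0 or 1")
        if self.sbr_oversampling_flag not in (0, 1):
            raise ValueError("sbrOversamplingFlag must be 0 or 1")
        if not 0 <= self.sbr_pitch_in_bins <= 127:
            raise ValueError("sbrPitchInBins must lie in [0, 127]")

    @property
    def uses_harmonic_patching(self) -> bool:
        return self.sbr_patching_mode == 0

    def pitch_distance_hz(self, core_sample_rate_hz: float) -> float:
        # Assumption: bin spacing equals core_sample_rate_hz / 1536.
        return self.sbr_pitch_in_bins * core_sample_rate_hz / DFT_LINES
```

Under these assumptions, `HarmonicTransposerParams(0, 1, 64).pitch_distance_hz(24000.0)` evaluates to 1000.0.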

In the case that an MPEG-4 AAC bitstream is indicative of an SBR channel pair whose channels are not coupled (rather than a single SBR channel), the bitstream is indicative of two instances of the above syntax (for harmonic or non-harmonic transposition), one for each channel of the sbr_channel_pair_element().

The harmonic transposition of the eSBR tool typically improves the quality of decoded musical signals at relatively low crossover frequencies. Harmonic transposition should be implemented in the decoder by either DFT-based or QMF-based harmonic transposition. Non-harmonic transposition (that is, legacy spectral patching) typically improves speech signals. Hence, a starting point in the decision as to which type of transposition is preferable for encoding specific audio content is to select the transposition method depending on speech/music detection, with harmonic transposition employed on the musical content and spectral patching on the speech content.
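As a minimal sketch of that starting point, an encoder could map a per-frame speech/music decision onto sbrPatchingMode values. The function name and the boolean input are hypothetical; the speech/music detector itself is assumed to exist elsewhere and is not shown.

```python
HARMONIC_PATCHING = 0  # sbrPatchingMode value for harmonic SBR patching
SPECTRAL_PATCHING = 1  # sbrPatchingMode value for non-harmonic spectral patching


def select_patching_modes(frame_is_music):
    """Map per-frame music/speech decisions to sbrPatchingMode values.

    frame_is_music: iterable of booleans from a (hypothetical) speech/music
    detector, True for music and False for speech.
    """
    return [HARMONIC_PATCHING if is_music else SPECTRAL_PATCHING
            for is_music in frame_is_music]
```

This is only the starting point named in the description; a practical encoder may refine the choice with further criteria.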

Performance of pre-flattening during eSBR processing is controlled by the value of a one-bit eSBR metadata parameter known as "bs_sbr_preprocessing", in the sense that pre-flattening is either performed or not performed depending on the value of this single bit. When the SBR QMF-patching algorithm, as described in Section 4.6.18.6.3 of the MPEG-4 AAC standard, is used, the step of pre-flattening may be performed (when indicated by the "bs_sbr_preprocessing" parameter) in an effort to avoid discontinuities in the shape of the spectral envelope of a high frequency signal being input to a subsequent envelope adjuster (the envelope adjuster performs another stage of the eSBR processing). The pre-flattening typically improves the operation of the subsequent envelope adjustment stage, resulting in a highband signal that is perceived to be more stable.

Performance of inter-subband sample Temporal Envelope Shaping (the "inter-TES" tool) during eSBR processing in a decoder is controlled by the following eSBR metadata parameters for each SBR envelope ("env") of each channel ("ch") of audio content of a USAC bitstream which is being decoded: bs_temp_shape[ch][env]; and bs_inter_temp_shape_mode[ch][env].

The inter-TES tool processes the QMF subband samples subsequent to the envelope adjuster. This processing step shapes the temporal envelope of the higher frequency band with a finer temporal granularity than that of the envelope adjuster. By applying a gain factor to each QMF subband sample in an SBR envelope, inter-TES shapes the temporal envelope among the QMF subband samples.

The parameter "bs_temp_shape[ch][env]" is a flag which signals the usage of inter-TES. The parameter "bs_inter_temp_shape_mode[ch][env]" indicates (as defined in the MPEG USAC standard) the values of the parameter γ in inter-TES.
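The gain application described for inter-TES can be illustrated with a short sketch. It assumes one gain factor per time slot, applied to every QMF subband sample in that slot; how the gains are derived (including the role of γ) is defined in the MPEG USAC standard and is not reproduced here. The function name and argument layout are hypothetical.

```python
def apply_inter_tes_gains(qmf, gains, bs_temp_shape):
    """Scale the QMF subband samples of one SBR envelope by per-slot gains.

    qmf:   list of time slots, each a list of complex subband samples
    gains: one gain factor per time slot (assumed to be computed elsewhere)
    bs_temp_shape: flag for this channel and envelope; when it is 0 the
                   samples are returned unchanged (as a copy)
    """
    if not bs_temp_shape:
        return [list(slot) for slot in qmf]
    if len(gains) != len(qmf):
        raise ValueError("expected one gain per time slot")
    # Every subband sample in a time slot receives that slot's gain.
    return [[g * x for x in slot] for g, slot in zip(gains, qmf)]
```

Because the gain varies from one time slot to the next, the shaping acts at a finer temporal granularity than a single adjustment per envelope would.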

The overall bitrate requirement for including in an MPEG-4 AAC bitstream eSBR metadata indicative of the above-mentioned eSBR tools (harmonic transposition, pre-flattening, and inter-TES) is expected to be on the order of a few hundred bits per second, because only the differential control data needed to perform eSBR processing is transmitted in accordance with some embodiments of the invention. Legacy decoders can ignore this information because it is included in a backward-compatible manner (as will be explained later). Therefore, the detrimental effect on bitrate associated with inclusion of eSBR metadata is negligible, for a number of reasons, including the following:
- the bitrate penalty (due to including the eSBR metadata) is a very small fraction of the total bitrate, because only the differential control data needed to perform eSBR processing is transmitted (and not a simulcast of the SBR control data);
- the tuning of SBR-related control information typically does not depend on the details of the transposition; and
- the inter-TES tool (employed during eSBR processing) performs a single-ended post-processing of the transposed signal.
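A back-of-the-envelope calculation shows how such an overhead might be estimated. All numbers in the usage note are illustrative assumptions and not figures from the disclosure; the function names are hypothetical.

```python
def esbr_overhead_bps(control_bits_per_frame, core_sample_rate_hz, samples_per_frame):
    """Bits per second spent on eSBR control data, given bits spent per frame."""
    frames_per_second = core_sample_rate_hz / samples_per_frame
    return control_bits_per_frame * frames_per_second


def overhead_fraction(overhead_bps, total_bitrate_bps):
    """Share of the total bitrate taken by the eSBR control data."""
    return overhead_bps / total_bitrate_bps
```

For instance, assuming 12 control bits per frame, a 24 kHz core coder, and 1024 samples per frame, `esbr_overhead_bps(12, 24000, 1024)` gives 281.25 bits per second, which is consistent with the "few hundred bits per second" order of magnitude stated above.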

Thus, embodiments of the invention provide a means for efficiently transmitting enhanced spectral band replication (eSBR) control data or metadata in a backward-compatible fashion. This efficient transmission of the eSBR control data reduces memory requirements in decoders, encoders, and transcoders employing aspects of the invention, while having no tangible adverse effect on bitrate. Moreover, the complexity and processing requirements associated with performing eSBR in accordance with embodiments of the invention are also reduced, because the SBR data needs to be processed only once and not simulcast, which would be the case if eSBR were treated as a completely separate object type in MPEG-4 AAC instead of being integrated into the MPEG-4 AAC codec in a backward-compatible manner.

Next, with reference to FIG. 7, we describe elements of a block ("raw_data_block") of an MPEG-4 AAC bitstream in which eSBR metadata is included in accordance with some embodiments of the present invention. FIG. 7 is a diagram of a block (a "raw_data_block") of the MPEG-4 AAC bitstream, showing some of the segments thereof.

A block of an MPEG-4 AAC bitstream may include at least one "single_channel_element()" (e.g., the single channel element shown in FIG. 7), and/or at least one "channel_pair_element()" (not specifically shown in FIG. 7, although it may be present), including audio data for an audio program. The block may also include a number of "fill_elements" (e.g., fill element 1 and/or fill element 2 of FIG. 7) including data (e.g., metadata) related to the program. Each "single_channel_element()" includes an identifier (e.g., "ID1" of FIG. 7) indicating the start of a single channel element, and can include audio data indicative of a different channel of a multi-channel audio program. Each "channel_pair_element()" includes an identifier (not shown in FIG. 7) indicating the start of a channel pair element, and can include audio data indicative of two channels of the program.

A fill_element (referred to herein as a fill element) of an MPEG-4 AAC bitstream includes an identifier ("ID2" of FIG. 7) indicating the start of a fill element, and fill data after the identifier. The identifier ID2 may consist of a three-bit unsigned integer transmitted most significant bit first ("uimsbf") having a value of 0x6. The fill data can include an extension_payload() element (sometimes referred to herein as an extension payload) whose syntax is shown in Table 4.57 of the MPEG-4 AAC standard. Several types of extension payloads exist and are identified through the "extension_type" parameter, which is a four-bit unsigned integer transmitted most significant bit first ("uimsbf").
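Reading a "uimsbf" field amounts to consuming bits from the most significant end of each byte. The following is a minimal sketch of such a reader together with a check for the three-bit fill element identifier; the class and function names and the constant name `ID_FIL` are illustrative assumptions, and the remaining syntax of the fill element is not modelled.

```python
class BitReader:
    """Reads unsigned integers, most significant bit first, from a byte string."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0  # current position, counted in bits from the start

    def read_uimsbf(self, n_bits: int) -> int:
        if self._pos + n_bits > 8 * len(self._data):
            raise EOFError("not enough bits left in the buffer")
        value = 0
        for _ in range(n_bits):
            byte = self._data[self._pos // 8]
            bit = (byte >> (7 - self._pos % 8)) & 1  # bit 7 of a byte comes first
            value = (value << 1) | bit
            self._pos += 1
        return value


ID_FIL = 0x6  # three-bit identifier value that marks the start of a fill element


def starts_fill_element(reader: BitReader) -> bool:
    """Consume the three-bit element identifier and report whether it is 0x6."""
    return reader.read_uimsbf(3) == ID_FIL
```

The same `read_uimsbf` call, with a width of four bits, would serve for the "extension_type" parameter.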

The fill data (e.g., an extension payload thereof) can include a header or identifier (e.g., "header1" of FIG. 7) which indicates a segment of fill data which is indicative of an SBR object (i.e., the header initializes an "SBR object" type, referred to as sbr_extension_data() in the MPEG-4 AAC standard). For example, a spectral band replication (SBR) extension payload is identified with the value of '1101' or '1110' for the extension_type field in the header, with the identifier '1101' identifying an extension payload with SBR data and '1110' identifying an extension payload with SBR data and a Cyclic Redundancy Check (CRC) to verify the correctness of the SBR data.
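The two extension_type values named above can be distinguished with a simple lookup, sketched below. The constant names, the function name, and the returned labels are illustrative assumptions; other extension types defined by the standard are deliberately lumped together as "other".

```python
EXT_SBR_DATA = 0b1101      # extension payload with SBR data
EXT_SBR_DATA_CRC = 0b1110  # extension payload with SBR data protected by a CRC


def classify_extension_type(extension_type: int) -> str:
    """Label a four-bit extension_type value with respect to SBR payloads."""
    if not 0 <= extension_type <= 0b1111:
        raise ValueError("extension_type is a four-bit field")
    if extension_type == EXT_SBR_DATA:
        return "sbr"
    if extension_type == EXT_SBR_DATA_CRC:
        return "sbr_crc"
    return "other"
```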

When the header (e.g., the extension_type field) initializes an SBR object type, SBR metadata (sometimes referred to herein as "spectral band replication data", and referred to as sbr_data() in the MPEG-4 AAC standard) follows the header, and at least one spectral band replication extension element (e.g., the "SBR extension element" of fill element 1 of FIG. 7) can follow the SBR metadata. Such a spectral band replication extension element (a segment of the bitstream) is referred to as an "sbr_extension()" container in the MPEG-4 AAC standard. A spectral band replication extension element optionally includes a header (e.g., "SBR extension header" of fill element 1 of FIG. 7).

The MPEG-4 AAC standard contemplates that a spectral band replication extension element can include PS (parametric stereo) data for audio data of a program. The MPEG-4 AAC standard contemplates that when the header of a fill element (e.g., of an extension payload thereof) initializes an SBR object type (as does "header1" of FIG. 7) and a spectral band replication extension element of the fill element includes PS data, the fill element (e.g., the extension payload thereof) includes spectral band replication data and a "bs_extension_id" parameter whose value (i.e., bs_extension_id=2) indicates that PS data is included in a spectral band replication extension element of the fill element.

In accordance with some embodiments of the present invention, eSBR metadata (e.g., a flag indicative of whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block) is included in a spectral band replication extension element of a fill element. For example, such a flag is indicated in fill element 1 of FIG. 7, where the flag occurs after the header (the "SBR extension header" of fill element 1) of the "SBR extension element" of fill element 1. Optionally, such a flag and additional eSBR metadata are included in a spectral band replication extension element after the spectral band replication extension element's header (e.g., in the SBR extension element of fill element 1 of FIG. 7, after the SBR extension header). In accordance with some embodiments of the present invention, a fill element which includes eSBR metadata also includes a "bs_extension_id" parameter whose value (e.g., bs_extension_id=3) indicates that eSBR metadata is included in the fill element and that eSBR processing is to be performed on audio content of the relevant block.
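The role of "bs_extension_id" can be sketched as a small dispatch. The value 2 for PS data and the example value 3 for eSBR metadata are taken from the description; the constant names, function names, and returned labels are hypothetical.

```python
EXTENSION_ID_PS = 2    # value given in the text for parametric stereo data
EXTENSION_ID_ESBR = 3  # example value given in the text for eSBR metadata


def describe_sbr_extension(bs_extension_id: int) -> str:
    """Say what a spectral band replication extension element carries."""
    if bs_extension_id == EXTENSION_ID_PS:
        return "ps_data"
    if bs_extension_id == EXTENSION_ID_ESBR:
        return "esbr_metadata"
    return "unrecognized"


def should_run_esbr(extension_ids) -> bool:
    """True if any extension element of the fill element announces eSBR metadata."""
    return any(ext_id == EXTENSION_ID_ESBR for ext_id in extension_ids)
```

A decoder that does not know the eSBR value would treat it as "unrecognized" and continue with ordinary SBR processing, which is the backward-compatible behavior the embodiments rely on.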

In accordance with some embodiments of the invention, eSBR metadata is included in a fill element (e.g., fill element 2 of FIG. 7) of an MPEG-4 AAC bitstream other than in a spectral band replication extension element (SBR extension element) of the fill element. This is because fill elements containing an extension_payload() with SBR data or SBR data with a CRC do not contain any other extension payload of any other extension type. Therefore, in embodiments where eSBR metadata is stored in its own extension payload, a separate fill element is used to store the eSBR metadata. Such a fill element includes an identifier (e.g., "ID2" of FIG. 7) indicating the start of a fill element, and fill data after the identifier. The fill data can include an extension_payload() element (sometimes referred to herein as an extension payload) whose syntax is shown in Table 4.57 of the MPEG-4 AAC standard. The fill data (e.g., an extension payload thereof) includes a header (e.g., "header2" of fill element 2 of FIG. 7) which is indicative of an eSBR object (i.e., the header initializes an enhanced spectral band replication (eSBR) object type), and the fill data (e.g., an extension payload thereof) includes eSBR metadata after the header. For example, fill element 2 of FIG. 7 includes such a header ("header2") and also includes, after the header, eSBR metadata (i.e., the "flag" in fill element 2, which indicates whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block). Optionally, additional eSBR metadata is also included in the fill data of fill element 2 of FIG. 7, after header2. In the embodiments described in the present paragraph, the header (e.g., header2 of FIG. 7) has an identification value which is not one of the conventional values specified in Table 4.57 of the MPEG-4 AAC standard, and is instead indicative of an eSBR extension payload (so that the header's extension_type field indicates that the fill data includes eSBR metadata).
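The layout described above (an identifier, then fill data whose header carries an extension type, then the eSBR flag) can be sketched as a small parser. This is a minimal sketch under stated assumptions, not the normative syntax: the 3-bit identifier value 0x6 and the 4-bit count and extension_type fields follow the usual MPEG-4 AAC fill element layout, while EXT_ESBR, BitReader, and parse_fill_element are hypothetical names, and the value 0xF is only a placeholder because the text does not state the actual eSBR extension type value.

```python
ID_FIL = 0x6      # id_syn_ele value that starts a fill element
EXT_ESBR = 0xF    # HYPOTHETICAL placeholder for the eSBR extension_type value


class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_fill_element(reader: BitReader):
    """Return the eSBR flag (0 or 1), or None if no eSBR payload is found."""
    if reader.read(3) != ID_FIL:
        return None                      # not a fill element
    count = reader.read(4)               # payload length in bytes
    if count == 15:
        count += reader.read(8) - 1      # escape code for longer payloads
    if count == 0:
        return None                      # empty fill element
    start = reader.pos
    extension_type = reader.read(4)      # the "header" of the fill data
    flag = None
    if extension_type == EXT_ESBR:
        flag = reader.read(1)            # eSBR metadata: the processing flag
    # Jump to the end of the payload so that unknown types are simply skipped.
    reader.pos = start + 8 * count
    return flag
```

For instance, the two bytes 0xC3 0xF0 encode the identifier 0x6, a one-byte payload, the placeholder extension type 0xF, and a flag of 1. A parser that does not recognise the extension type takes the same skip path, which is why carrying the eSBR metadata in its own fill element leaves the rest of the block decodable.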

In a first class of embodiments, the invention is an audio processing unit (e.g., a decoder), comprising: a memory (e.g., buffer 201 of FIG. 3 or 4) configured to store at least one block of an encoded audio bitstream (e.g., at least one block of an MPEG-4 AAC bitstream); a bitstream payload deformatter (e.g., element 205 of FIG. 3 or element 215 of FIG. 4) coupled to the memory and configured to demultiplex at least one portion of said block of the bitstream; and a decoding subsystem (e.g., elements 202 and 203 of FIG. 3, or elements 202 and 213 of FIG. 4), coupled and configured to decode at least one portion of audio content of said block of the bitstream, wherein the block includes: a fill element, including an identifier indicating a start of the fill element (e.g., the "id_syn_ele" identifier having value 0x6, of Table 4.85 of the MPEG-4 AAC standard), and fill data after the identifier, wherein the fill data includes: at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block (e.g., using spectral band replication data and eSBR metadata included in the block).

The flag is eSBR metadata, and an example of the flag is the sbrPatchingMode flag. Another example of the flag is the harmonicSBR flag. Both of these flags indicate whether a base form of spectral band replication or an enhanced form of spectral band replication is to be performed on the audio data of the block. The base form of spectral band replication is spectral patching, and the enhanced form of spectral band replication is harmonic transposition.

In some embodiments, the fill data also includes additional eSBR metadata (i.e., eSBR metadata other than the flag).

The memory may be a buffer memory (e.g., an implementation of buffer 201 of FIG. 4) which stores (e.g., in a non-transitory manner) the at least one block of the encoded audio bitstream.

It is estimated that the complexity of eSBR processing (using the eSBR harmonic transposition, pre-flattening, and inter-TES tools) by an eSBR decoder during decoding of an MPEG-4 AAC bitstream which includes eSBR metadata (indicative of these eSBR tools) would be as follows (for typical decoding with the indicated parameters):

- Harmonic transposition (16 kbps, 14400/28800 Hz)
-- DFT based: 3.68 WMOPS (weighted million operations per second);
-- QMF based: 0.98 WMOPS;
- QMF-patching pre-processing (pre-flattening): 0.1 WMOPS; and
- Inter-subband-sample temporal envelope shaping (inter-TES): at most 0.16 WMOPS.

It is known that DFT based transposition typically performs better than QMF based transposition for transients.

In accordance with some embodiments of the present invention, a fill element (of an encoded audio bitstream) which includes eSBR metadata also includes a parameter (e.g., a "bs_extension_id" parameter) whose value (e.g., bs_extension_id=3) signals that eSBR metadata is included in the fill element and that eSBR processing is to be performed on audio content of the relevant block, and/or a parameter (e.g., the same "bs_extension_id" parameter) whose value (e.g., bs_extension_id=2) signals that an sbr_extension() container of the fill element includes PS data. For example, as indicated in Table 1 below, such a parameter having the value bs_extension_id=2 may signal that an sbr_extension() container of the fill element includes PS data, and such a parameter having the value bs_extension_id=3 may signal that an sbr_extension() container of the fill element includes eSBR metadata:

Table 1

bs_extension_id    Meaning
2                  EXTENSION_ID_PS (PS data)
3                  EXTENSION_ID_ESBR (eSBR metadata)

In accordance with some embodiments of the invention, the syntax of each spectral band replication extension element which includes eSBR metadata and/or PS data is as indicated in Table 2 below (in which "sbr_extension()" denotes a container which is the spectral band replication extension element, "bs_extension_id" is as described in Table 1 above, "ps_data" denotes PS data, and "esbr_data" denotes eSBR metadata):

Table 2

sbr_extension(bs_extension_id, num_bits_left)
{
    switch (bs_extension_id)
    {
    case EXTENSION_ID_PS:
        num_bits_left -= ps_data();        Note 1
        break;
    case EXTENSION_ID_ESBR:
        num_bits_left -= esbr_data();      Note 2
        break;
    default:
        bs_fill_bits;                      Note 3
        num_bits_left = 0;
        break;
    }
}

Note 1: ps_data() returns the number of bits read.

Note 2: esbr_data() returns the number of bits read.

Note 3: the parameter bs_fill_bits comprises N bits, where N = num_bits_left.
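The dispatch performed by sbr_extension() can be sketched as follows. This is a minimal sketch under stated assumptions: the id values 2 and 3 are those given in the description of bs_extension_id, whereas read_ps_data and read_esbr_data are hypothetical stand-ins that only pretend to consume a fixed number of bits. The real ps_data() and esbr_data() parsers are not shown.

```python
EXTENSION_ID_PS = 2
EXTENSION_ID_ESBR = 3


def read_ps_data(bits_available: int) -> int:
    """HYPOTHETICAL stub: pretend ps_data() consumed 10 bits."""
    return min(10, bits_available)


def read_esbr_data(bits_available: int) -> int:
    """HYPOTHETICAL stub: pretend esbr_data() consumed 3 bits."""
    return min(3, bits_available)


def sbr_extension(bs_extension_id: int, num_bits_left: int):
    """Return (payload kind, bits still unread after this extension)."""
    if bs_extension_id == EXTENSION_ID_PS:
        num_bits_left -= read_ps_data(num_bits_left)
        return "ps", num_bits_left
    if bs_extension_id == EXTENSION_ID_ESBR:
        num_bits_left -= read_esbr_data(num_bits_left)
        return "esbr", num_bits_left
    # Unknown id: all remaining bits are fill bits and are skipped.
    return "fill", 0
```

The default branch matters for compatibility: a decoder that does not know a given bs_extension_id treats the remaining bits as fill bits and moves on.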

In an exemplary embodiment, the esbr_data() referred to in Table 2 above is indicative of values of the following metadata parameters:

1. each of the above-described one-bit metadata parameters "harmonicSBR"; "bs_interTES"; and "bs_sbr_preprocessing";

2. for each channel ("ch") of audio content of the encoded bitstream to be decoded, each of the above-described parameters: "sbrPatchingMode[ch]"; "sbrOversamplingFlag[ch]"; "sbrPitchInBinsFlag[ch]"; and "sbrPitchInBins[ch]"; and

3. for each SBR envelope ("env") of each channel ("ch") of audio content of the encoded bitstream to be decoded, each of the above-described parameters: "bs_temp_shape[ch][env]"; and "bs_inter_temp_shape_mode[ch][env]".

For example, in some embodiments, the esbr_data() may have the syntax indicated in Table 3, to indicate these metadata parameters:

[Table 3: syntax of esbr_data(); the table body is illegible in the source.]

In Table 3, the number in the center column indicates the number of bits of the corresponding parameter in the left column.
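The three levels of parameters carried by esbr_data() (stream-wide flags, per-channel parameters, and per-envelope parameters) can be pictured as nested records. This is a sketch of one possible in-memory representation, not the bitstream syntax: the class names EsbrData, EsbrChannel, and EsbrEnvelope and the helper make_esbr_data are hypothetical, while the field names are the parameter names used in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EsbrEnvelope:
    # Parameters signalled per SBR envelope of a channel.
    bs_temp_shape: int = 0
    bs_inter_temp_shape_mode: int = 0


@dataclass
class EsbrChannel:
    # Parameters signalled per channel.
    sbrPatchingMode: int = 0
    sbrOversamplingFlag: int = 0
    sbrPitchInBinsFlag: int = 0
    sbrPitchInBins: int = 0
    envelopes: List[EsbrEnvelope] = field(default_factory=list)


@dataclass
class EsbrData:
    # One-bit parameters that apply to the whole payload.
    harmonicSBR: int = 0
    bs_interTES: int = 0
    bs_sbr_preprocessing: int = 0
    channels: List[EsbrChannel] = field(default_factory=list)


def make_esbr_data(num_channels: int, envelopes_per_channel: int) -> EsbrData:
    """Build an all-zero parameter set with the requested shape."""
    return EsbrData(channels=[
        EsbrChannel(envelopes=[EsbrEnvelope()
                               for _ in range(envelopes_per_channel)])
        for _ in range(num_channels)
    ])
```

Indexing then mirrors the notation of the text: bs_temp_shape[ch][env] corresponds to data.channels[ch].envelopes[env].bs_temp_shape.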

The above syntax enables an efficient implementation of an enhanced form of spectral band replication, such as harmonic transposition, as an extension to a legacy decoder. Specifically, the eSBR data of Table 3 includes only those parameters needed to perform the enhanced form of spectral band replication that are neither already supported in the bitstream nor directly derivable from parameters already supported in the bitstream. All other parameters and processing data needed to perform the enhanced form of spectral band replication are extracted from pre-existing parameters in already-defined locations in the bitstream. This is in contrast to an alternative (and less efficient) implementation that simply transmits all of the processing metadata used for enhanced spectral band replication.

For example, an MPEG-4 HE-AAC or HE-AAC v2 compliant decoder may be extended to include an enhanced form of spectral band replication, such as harmonic transposition. This enhanced form of spectral band replication is in addition to the base form of spectral band replication already supported by the decoder. In the context of an MPEG-4 HE-AAC or HE-AAC v2 compliant decoder, this base form of spectral band replication is the QMF spectral patching SBR tool as defined in Section 4.6.18 of the MPEG-4 AAC standard.

When performing the enhanced form of spectral band replication, an extended HE-AAC decoder may reuse many of the bitstream parameters already included in the SBR extension payload of the bitstream. The specific parameters that may be reused include, for example, the various parameters that determine the master frequency band table. These parameters include bs_start_freq (parameter that determines the start of the master frequency table), bs_stop_freq (parameter that determines the stop of the master frequency table), bs_freq_scale (parameter that determines the number of frequency bands per octave), and bs_alter_scale (parameter that alters the scale of the frequency bands).

The parameters that may be reused also include parameters that determine the noise band table (bs_noise_bands) and the limiter band table parameters (bs_limiter_bands).

In addition to the numerous parameters, other data elements may also be reused by an extended HE-AAC decoder when performing an enhanced form of spectral band replication in accordance with embodiments of the invention. For example, the envelope data and noise floor data may also be extracted from the bs_data_env and bs_noise_env data and used during the enhanced form of spectral band replication.
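The reuse described above can be sketched as a merge of two parameter sets: values that the legacy SBR extension payload already carries, plus the few eSBR-specific values taken from the fill element. The function name build_esbr_config and the dictionary layout are hypothetical, chosen only for illustration.

```python
# Parameters named in the text as reusable from the SBR extension payload.
REUSED_SBR_FIELDS = (
    "bs_start_freq", "bs_stop_freq", "bs_freq_scale", "bs_alter_scale",
    "bs_noise_bands", "bs_limiter_bands",
)


def build_esbr_config(sbr_header: dict, esbr_metadata: dict) -> dict:
    """Combine reused legacy SBR parameters with eSBR-only parameters."""
    missing = [name for name in REUSED_SBR_FIELDS if name not in sbr_header]
    if missing:
        raise KeyError("SBR header lacks: " + ", ".join(missing))
    # Start from the values the legacy payload already provides.
    config = {name: sbr_header[name] for name in REUSED_SBR_FIELDS}
    # Only the eSBR-specific parameters need to be transmitted in addition.
    config.update(esbr_metadata)
    return config
```

The design point is that the second argument stays small: everything in REUSED_SBR_FIELDS is read from where a legacy decoder already expects it, so the extra payload carries only what is new.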

In essence, these embodiments exploit the configuration parameters and envelope data already supported by a legacy HE-AAC or HE-AAC v2 decoder in the SBR extension payload to enable an enhanced form of spectral band replication requiring as little extra transmitted data as possible. Accordingly, extended decoders that support an enhanced form of spectral band replication may be created in a very efficient manner by relying on already defined bitstream elements (for example, those in the SBR extension payload) and adding only those parameters needed to support the enhanced form of spectral band replication (in a fill element extension payload). This data reduction feature, combined with the placement of the newly added parameters in a reserved data field such as an extension container, substantially reduces the barriers to creating a decoder that supports an enhanced form of spectral band replication, by ensuring that the bitstream is backward compatible with a legacy decoder that does not support the enhanced form of spectral band replication.

In some embodiments, the invention is a method including a step of encoding audio data to generate an encoded bitstream (e.g., an MPEG-4 AAC bitstream), including by including eSBR metadata in at least one segment of at least one block of the encoded bitstream and audio data in at least one other segment of the block. In typical embodiments, the method includes a step of multiplexing the audio data with the eSBR metadata in each block of the encoded bitstream. In typical decoding of the encoded bitstream in an eSBR decoder, the decoder extracts the eSBR metadata from the bitstream (including by parsing and demultiplexing the eSBR metadata and the audio data) and uses the eSBR metadata to process the audio data to generate a stream of decoded audio data.

Another aspect of the invention is an eSBR decoder configured to perform eSBR processing (e.g., using at least one of the eSBR tools known as harmonic transposition, pre-flattening, or inter-TES) during decoding of an encoded audio bitstream (e.g., an MPEG-4 AAC bitstream) which does not include eSBR metadata. An example of such a decoder will be described with reference to FIG. 5.

eSBR decoder 400 of FIG. 5 includes buffer memory 201 (which is identical to memory 201 of FIGS. 3 and 4), bitstream payload deformatter 215 (which is identical to deformatter 215 of FIG. 4), audio decoding subsystem 202 (sometimes referred to as a "core" decoding stage or "core" decoding subsystem, and which is identical to core decoding subsystem 202 of FIG. 3), eSBR control data generation subsystem 401, and eSBR processing stage 203 (which is identical to stage 203 of FIG. 3), connected as shown. Typically, decoder 400 also includes other processing elements (not shown).

In operation of decoder 400, a sequence of blocks of an encoded audio bitstream (an MPEG-4 AAC bitstream) received by decoder 400 is asserted from buffer 201 to deformatter 215.

Deformatter 215 is coupled and configured to demultiplex each block of the bitstream to extract SBR metadata (including quantized envelope data) and typically also other metadata therefrom. Deformatter 215 is configured to assert at least the SBR metadata to eSBR processing stage 203. Deformatter 215 is also coupled and configured to extract audio data from each block of the bitstream, and to assert the extracted audio data to decoding subsystem (decoding stage) 202.

Audio decoding subsystem 202 of decoder 400 is configured to decode the audio data extracted by deformatter 215 (such decoding may be referred to as a "core" decoding operation) to generate decoded audio data, and to assert the decoded audio data to eSBR processing stage 203. The decoding is performed in the frequency domain.

Typically, a final stage of processing in subsystem 202 applies a frequency domain-to-time domain transform to the decoded frequency domain audio data, so that the output of the subsystem is time domain, decoded audio data. Stage 203 is configured to apply SBR tools (and eSBR tools) indicated by the SBR metadata (extracted by deformatter 215) and by eSBR metadata generated in subsystem 401, to the decoded audio data (i.e., to perform SBR and eSBR processing on the output of decoding subsystem 202 using the SBR and eSBR metadata) to generate the fully decoded audio data which is output from decoder 400. Typically, decoder 400 includes a memory (accessible by subsystem 202 and stage 203) which stores the deformatted audio data and metadata output from deformatter 215 (and optionally also subsystem 401), and stage 203 is configured to access the audio data and metadata as needed during SBR and eSBR processing. The SBR processing in stage 203 may be considered to be post-processing on the output of core decoding subsystem 202. Optionally, decoder 400 also includes a final upmixing subsystem (which may apply parametric stereo ("PS") tools defined in the MPEG-4 AAC standard, using PS metadata extracted by deformatter 215) which is coupled and configured to perform upmixing on the output of stage 203 to generate fully decoded, upmixed audio which is output from APU 210.
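The data flow through decoder 400 can be sketched as a chain of stages. This is only a structural sketch: every function below is a hypothetical stub named after the element it stands for (deformatter 215, core decoding subsystem 202, control data subsystem 401, eSBR processing stage 203), the dictionary keys are invented for illustration, and no real audio decoding takes place.

```python
def deformat_215(block: dict):
    """Split a block into audio data and SBR metadata (stub)."""
    return block["audio"], block["sbr_metadata"]


def core_decode_202(audio):
    """Stand-in for core decoding; tags each sample as decoded."""
    return [("decoded", sample) for sample in audio]


def generate_control_data_401(block: dict) -> dict:
    """Stand-in for deriving eSBR control data from a stream property."""
    # HYPOTHETICAL key: a single boolean instead of real eSBR parameters.
    return {"harmonic_transposition": bool(block.get("looks_like_music"))}


def esbr_process_203(decoded, sbr_metadata: dict, esbr_control: dict) -> dict:
    """Stand-in for SBR/eSBR post-processing of the core decoder output."""
    return {"samples": decoded, "sbr": sbr_metadata, "esbr": esbr_control}


def decode_block_400(block: dict) -> dict:
    """Run one block through the stages in the order described in the text."""
    audio, sbr_metadata = deformat_215(block)
    decoded = core_decode_202(audio)
    esbr_control = generate_control_data_401(block)
    return esbr_process_203(decoded, sbr_metadata, esbr_control)
```

Note that stage 203 receives two separate metadata inputs: SBR metadata read from the bitstream and eSBR control data produced inside the decoder, which is what lets this decoder apply eSBR tools to a stream that carries no eSBR metadata.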

Control data generation subsystem 401 of FIG. 5 is coupled and configured to detect at least one property of the encoded audio bitstream to be decoded, and to generate eSBR control data (which may be or include eSBR metadata of any of the types included in encoded audio bitstreams in accordance with other embodiments of the invention) in response to at least one result of the detection step.

The eSBR control data is supplied to module 203 to trigger application of individual eSBR tools or combinations of eSBR tools upon detection of a specific property (or combination of properties) of the bitstream, and/or to control the application of such eSBR tools. For example, in order to control performance of eSBR processing using harmonic transposition, some embodiments of control data generation subsystem 401 include: a music detector (e.g., a simplified version of a conventional music detector) for setting the sbrPatchingMode[ch] parameter (and supplying the set parameter to module 203) in response to detecting that the bitstream is or is not indicative of music; a transient detector for setting the sbrOversamplingFlag[ch] parameter (and supplying the set parameter to module 203) in response to detecting the presence or absence of transients in the audio content indicated by the bitstream; and/or a pitch detector for setting the sbrPitchInBinsFlag[ch] and sbrPitchInBins[ch] parameters (and supplying the set parameters to module 203) in response to detecting the pitch of the audio content indicated by the bitstream. Other aspects of the invention are audio bitstream decoding methods performed by any embodiment of the inventive decoder described in this paragraph and the preceding paragraph.
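As an illustration only, the mapping from detector results to eSBR control parameters described above might be sketched as follows. The `DetectedProperties` structure, the `generate_control_data` function, and the specific value assigned for each detector result (for example, selecting harmonic patching when music is detected) are assumptions made for this sketch; the description states only which detector sets which parameter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DetectedProperties:
    """Hypothetical per-channel detector outputs (not defined in the description)."""
    is_music: bool                # result of the music detector
    has_transients: bool          # result of the transient detector
    pitch_in_bins: Optional[int]  # result of the pitch detector, None if no pitch found


@dataclass
class EsbrControlData:
    """Control parameters named in the description, for one channel."""
    sbrPatchingMode: int
    sbrOversamplingFlag: int
    sbrPitchInBinsFlag: int
    sbrPitchInBins: int


def generate_control_data(props: DetectedProperties) -> EsbrControlData:
    # Assumption: 0 selects harmonic patching, 1 selects the base (non-harmonic) form.
    patching_mode = 0 if props.is_music else 1
    # Assumption: oversampling is switched on when transients are present.
    oversampling = 1 if props.has_transients else 0
    # Assumption: the pitch value is only meaningful when the flag is set.
    if props.pitch_in_bins is None:
        pitch_flag, pitch = 0, 0
    else:
        pitch_flag, pitch = 1, props.pitch_in_bins
    return EsbrControlData(patching_mode, oversampling, pitch_flag, pitch)


if __name__ == "__main__":
    control = generate_control_data(
        DetectedProperties(is_music=True, has_transients=False, pitch_in_bins=12)
    )
    print(control)
```

In this sketch the returned structure stands in for the control data that subsystem 401 would supply to module 203; a real decoder would derive the detector results from the bitstream itself.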

Aspects of the invention include an encoding or decoding method of the type which any embodiment of the inventive audio processing unit (APU), system, or device is configured (e.g., programmed) to perform. Other aspects of the invention include a system or device configured (e.g., programmed) to perform any embodiment of the inventive method, and a computer-readable medium (e.g., a disc) which stores code (e.g., in a non-transitory manner) for implementing any embodiment of the inventive method or steps thereof. For example, the inventive system can be or include a programmable general-purpose processor, digital signal processor, or microprocessor, programmed with software or firmware and/or otherwise configured to perform any of a variety of operations on data, including an embodiment of the inventive method or steps thereof. Such a general-purpose processor may be or include a computer system including an input device, a memory, and processing circuitry programmed (and/or otherwise configured) to perform an embodiment of the inventive method (or steps thereof) in response to data supplied to it.

Embodiments of the present invention may be implemented in hardware, firmware, or software, or a combination thereof (e.g., as a programmable logic array). Unless otherwise specified, the algorithms or processes included as part of the invention are not inherently related to any particular computer or other apparatus. In particular, various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct more specialized apparatus (e.g., integrated circuits) to perform the required method steps. Thus, the invention may be implemented in one or more computer programs executing on one or more programmable computer systems (e.g., an implementation of any of the elements of FIG. 1, or encoder 100 of FIG. 2 (or an element thereof), or decoder 200 of FIG. 3 (or an element thereof), or decoder 210 of FIG. 4 (or an element thereof), or decoder 400 of FIG. 5 (or an element thereof)), each comprising at least one processor, at least one data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port. Program code is applied to input data to perform the functions described herein and to generate output information. The output information is applied to one or more output devices in a known manner.

Each such program may be implemented in any desired computer language (including machine, assembly, or high-level procedural, logical, or object-oriented programming languages) to communicate with a computer system. In any case, the language may be a compiled or interpreted language.

For example, when implemented by computer software instruction sequences, various functions and steps of embodiments of the invention may be implemented by multithreaded software instruction sequences running in suitable digital signal processing hardware, in which case the various devices, modules, and functions of the embodiments may correspond to portions of the software instructions.

Each such computer program is preferably stored on or downloaded to a storage medium or device (e.g., solid-state memory or media, or magnetic or optical media) readable by a general-purpose or special-purpose programmable computer, for configuring and operating the computer when the storage medium or device is read by the computer system to perform the procedures described herein. The inventive system may also be implemented as a computer-readable storage medium configured with (i.e., storing) a computer program, where the storage medium so configured causes a computer system to operate in a specific and predefined manner to perform the functions described herein.

A number of embodiments of the invention have been described. Nevertheless, it should be understood that various modifications may be made without departing from the spirit and scope of the invention. Numerous modifications and variations of the present invention are possible in light of the above teachings. It is to be understood that, within the scope of the appended claims, the invention may be practiced otherwise than as specifically described herein.

Any reference numerals contained in the following claims are for illustrative purposes only and should not be used to construe or limit the claims in any manner whatsoever.

Claims


1. An audio processing unit (210), comprising: a buffer (201) configured to store at least one block of an encoded audio bitstream; a bitstream payload deformatter (215) coupled to the buffer and configured to demultiplex at least a portion of the at least one block of the encoded audio bitstream; and a decoding subsystem (202) coupled to the bitstream payload deformatter (215) and configured to decode at least a portion of the at least one block of the encoded audio bitstream, wherein the at least one block of the encoded audio bitstream includes: a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes: at least one flag identifying whether a base form of spectral band replication or an enhanced form of spectral band replication is to be performed on audio content of the at least one block of the encoded audio bitstream, wherein the base form of spectral band replication includes spectral patching, the enhanced form of spectral band replication includes harmonic transposition, one value of the flag indicates that said enhanced form of spectral band replication is to be performed on the audio content, and another value of the flag indicates that said base form of spectral band replication, but not said harmonic transposition, is to be performed on the audio content.

2. The audio processing unit according to claim 1, wherein the fill data further includes enhanced spectral band replication metadata.

3. The audio processing unit according to claim 2, wherein the enhanced spectral band replication metadata is contained in an extension payload of the fill element.

4. The audio processing unit according to any one of claims 2-3, wherein the enhanced spectral band replication metadata includes one or more parameters defining a master frequency band table.

5. The audio processing unit according to any one of claims 2-3, wherein the enhanced spectral band replication metadata includes envelope scale factors or noise floor scale factors.

6. The audio processing unit according to any of the preceding claims, wherein the audio processing unit is an audio decoder, and the identifier is a three-bit unsigned integer transmitted most significant bit first and having a value of 0x6.

7. The audio processing unit according to any of the preceding claims, wherein the fill data includes an extension payload, the extension payload includes spectral band replication extension data, and the extension payload is identified by a four-bit unsigned integer transmitted most significant bit first and having a value of "1101" or "1110", and, optionally, wherein the spectral band replication extension data includes: an optional spectral band replication header, spectral band replication data after the header, and a spectral band replication extension element after the spectral band replication data, and wherein the first flag is included in the spectral band replication extension element.

8. The audio processing unit according to any of the preceding claims, wherein the at least one block of the encoded audio bitstream includes a first fill element and a second fill element, and spectral band replication data is included in the first fill element, and the first flag, but not spectral band replication data, is included in the second fill element.

9. The audio processing unit according to any of the preceding claims, wherein the enhanced form of spectral band replication processing includes harmonic transposition, the base form of spectral band replication processing includes spectral patching, one value of the first flag indicates that said enhanced form of spectral band replication processing is to be performed on the audio content of the at least one block of the encoded audio bitstream, and another value of the first flag indicates that spectral patching, but not said harmonic transposition, is to be performed on the audio content of the at least one block of the encoded audio bitstream.

10. The audio processing unit according to claim 7, wherein the spectral band replication extension element includes enhanced spectral band replication metadata other than the first flag, and wherein the enhanced spectral band replication metadata includes a parameter indicating whether to perform pre-flattening.

11. The audio processing unit according to claim 7, wherein the spectral band replication extension element includes enhanced spectral band replication metadata other than the first flag and the second flag, and wherein the enhanced spectral band replication metadata includes a parameter indicating whether to perform inter-subband sample temporal envelope shaping.

12. The audio processing unit according to any of the preceding claims, further comprising an enhanced spectral band replication processing subsystem (203) configured to perform enhanced spectral band replication processing using the first flag, wherein the enhanced spectral band replication includes harmonic transposition.

13. The audio processing unit according to any of the preceding claims, wherein, if the at least one flag identifies the enhanced form of spectral band replication processing, a second flag identifies whether or not signal-adaptive frequency-domain oversampling is available.

14. A method for decoding an encoded audio bitstream, the method comprising the steps of: receiving at least one block of an encoded audio bitstream; demultiplexing at least a portion of the at least one block of the encoded audio bitstream; and decoding at least a portion of the at least one block of the encoded audio bitstream, wherein the at least one block of the encoded audio bitstream includes: a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes: one flag identifying whether a base form of spectral band replication processing or an enhanced form of spectral band replication processing is to be performed on audio content of the at least one block of the encoded audio bitstream, wherein the base form of spectral band replication includes spectral patching, the enhanced form of spectral band replication includes harmonic transposition, one value of the flag indicates that said enhanced form of spectral band replication is to be performed on the audio content, and another value of the flag indicates that said base form of spectral band replication, but not said harmonic transposition, is to be performed on the audio content.

15. The method according to claim 14, wherein the identifier is a three-bit unsigned integer transmitted most significant bit first and having a value of 0x6.

16. The method according to claim 14 or 15, wherein the fill data further includes enhanced spectral band replication metadata.

17. The method according to any one of claims 14-16, wherein the fill data includes an extension payload, the extension payload includes spectral band replication extension data, and the extension payload is identified by a four-bit unsigned integer transmitted most significant bit first and having a value of "1101" or "1110", and, optionally, wherein the spectral band replication extension data includes: an optional spectral band replication header, spectral band replication data after the header, and a spectral band replication extension element after the spectral band replication data, and wherein the first flag is included in the spectral band replication extension element.

18. The method according to any one of claims 14-17, wherein the enhanced form of spectral band replication processing is harmonic transposition, the base form of spectral band replication processing is spectral patching, one value of the first flag indicates that said enhanced form of spectral band replication processing is to be performed on the audio content of the at least one block of the encoded audio bitstream, and another value of the first flag indicates that spectral patching, but not said harmonic transposition, is to be performed on the audio content of the at least one block of the encoded audio bitstream.

19. The method according to claim 17 or claim 18, wherein the spectral band replication extension element includes enhanced spectral band replication metadata other than the first flag, and wherein the enhanced spectral band replication metadata includes a parameter indicating whether to perform pre-flattening, or wherein the spectral band replication extension element includes enhanced spectral band replication metadata other than the first flag, and wherein the enhanced spectral band replication metadata includes a parameter indicating whether to perform inter-subband sample temporal envelope shaping.

20. The method according to any one of claims 14-19, further comprising the step of performing enhanced spectral band replication processing using the first flag and the second flag, wherein the enhanced spectral band replication includes harmonic transposition.

21. The method according to any of claims 14-20 or the audio processing unit according to any of claims 1-8, wherein the encoded audio bitstream is an MPEG-4 AAC bitstream.
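For illustration only, the fill element identification recited in the claims (a three-bit identifier of 0x6 and a four-bit extension payload type of "1101" or "1110", each transmitted most significant bit first) might be checked as in the sketch below. The `BitReader` class and the `parse_fill_element_header` function are hypothetical helpers. The 4-bit count field with its 8-bit escape follows the fill element syntax of MPEG-4 AAC as generally understood and is not recited in the claims. The sketch does not attempt to locate the flag itself.

```python
from typing import Optional

ID_FIL = 0x6                 # three-bit fill element identifier (claims 6 and 15)
EXT_SBR_DATA = 0b1101        # four-bit extension type "1101" (claims 7 and 17)
EXT_SBR_DATA_CRC = 0b1110    # four-bit extension type "1110" (claims 7 and 17)


class BitReader:
    """Hypothetical helper that reads unsigned integers MSB first from bytes."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_fill_element_header(reader: BitReader) -> Optional[dict]:
    """Return header fields of a fill element, or None if the ID is not 0x6."""
    if reader.read(3) != ID_FIL:
        return None
    # Assumed from MPEG-4 AAC syntax: 4-bit byte count with an 8-bit escape.
    count = reader.read(4)
    if count == 15:
        count += reader.read(8) - 1
    if count == 0:
        return {"payload_bytes": 0, "extension_type": None, "carries_sbr": False}
    ext_type = reader.read(4)
    return {
        "payload_bytes": count,
        "extension_type": ext_type,
        "carries_sbr": ext_type in (EXT_SBR_DATA, EXT_SBR_DATA_CRC),
    }


if __name__ == "__main__":
    # Bits: 110 (ID 0x6) 0001 (count 1) 1101 (extension type), then zero padding.
    print(parse_fill_element_header(BitReader(bytes([0xC3, 0xA0]))))
```

A decoder that finds `carries_sbr` set would go on to read the spectral band replication extension data carried in the payload, where the claims place the flag.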

[FIG. 3 to FIG. 7: drawings; image text not recoverable from the extraction.]