JP7182751B1

JP7182751B1 - System, method, and apparatus for conversion of channel-based audio to object-based audio

Info

Publication number: JP7182751B1
Application number: JP2022532868A
Authority: JP
Inventors: シー．ウォード，マイケル; サンチェス，フレディー; フェルシュ，クリストフ
Original assignee: ドルビーラボラトリーズライセンシングコーポレイション; ドルビー・インターナショナル・アーベー
Priority date: 2019-12-02
Filing date: 2020-12-02
Publication date: 2022-12-02
Anticipated expiration: 2040-12-02
Also published as: WO2021113350A1; BR112022010737A2; US20230024873A1; EP3857919B1; CN114930876A; JP2022553111A; CN114930876B; KR102471715B1; JP7182751B6; KR20220100084A; EP3857919A1

Abstract

Embodiments are disclosed for conversion of channel-based audio (CBA) (e.g., 22.2ch audio) to object-based audio (OBA). The conversion includes converting CBA metadata to object audio metadata (OAMD) and reordering the CBA channels based on channel shuffle information derived in accordance with the channel order constraints of the OAMD. The OBA with reordered channels is rendered, using the OAMD, in a playback device or in a source device such as a set-top box or audio/video recorder. In an embodiment, the CBA metadata includes signaling that indicates the particular OAMD representation to be used in the conversion of the metadata. In an embodiment, pre-computed OAMD is transmitted in a native audio bitstream (e.g., AAC) for rendering in a source device or for transmission (e.g., over HDMI). In an embodiment, pre-computed OAMD is transmitted in a transport layer bitstream (e.g., ISO BMFF, MPEG-4 audio bitstream) to a playback device or a source device.

Description

[Cross-reference to related applications]
This application claims priority to U.S. Provisional Patent Application No. 62/942,322, filed December 2, 2019, and European Patent Application No. 19212906.2, filed December 2, 2019. Both applications are incorporated herein by reference in their entireties.

[Technical field]
The present disclosure relates generally to audio signal processing, including conversion from channel-based audio to object-based audio.

In channel-based audio (CBA) coding, a set of tracks is implicitly assigned to specific loudspeakers by associating the set of tracks with a channel configuration. If the playback speaker configuration differs from the coded channel configuration, downmixing or upmixing specifications are required to redistribute the audio to the available speakers. This paradigm is well known and works when the channel configuration at the decoding end can be predetermined or assumed with reasonable certainty to be 2.0, 5.X, or 7.X. However, with the popularity of new speaker setups, no assumption can be made about the speaker setup used for playback. CBA therefore does not offer a sufficient method for adapting a representation when the source speaker layout does not match the speaker layout at the decoding end. This is a problem when trying to author content that plays back well independently of the speaker configuration.

In object-based audio (OBA) coding, rendering is applied to objects that comprise object audio essence in conjunction with metadata containing individually assigned object properties. The properties (e.g., x, y, z position or channel location) specify more explicitly how the content creator intends the audio content to be rendered (i.e., they place constraints on how the essence is rendered to speakers). Because individual audio elements can be associated with a much richer set of metadata that gives meaning to the elements, the method of adapting to the speaker configuration that reproduces the audio can provide better information about how to render to fewer speakers.

There are several standardized formats for transmission of CBA content, such as Enhanced AC-3 (E-AC-3) defined in ETSI TS 102 366 [1]. To ensure compatibility with existing devices, joint object coding (JOC) can be used to transport OBA in conjunction with a standardized CBA format. JOC delivers immersive audio at low bitrates. It does so by conveying a multi-channel downmix of the immersive content using perceptual audio coding algorithms, together with parametric side information that enables reconstruction of the audio objects from the downmix in the decoder. In some applications, such as television broadcasting, it is desirable to represent CBA content as OBA content so that the content is compatible with the installed base of OBA playback devices. However, the standardized bitstream formats for CBA and OBA are generally not compatible.

Embodiments are disclosed for converting CBA content to OBA content. In a particular embodiment, 22.2-channel content is converted to OBA content for playback on OBA-compatible playback devices.

In an embodiment, a method comprises:
receiving, by one or more processors of an audio processing device, a bitstream including channel-based audio and associated channel-based audio metadata,
wherein the one or more processors are configured to:
parse a signaling parameter from the channel-based audio metadata, the signaling parameter indicating one of a plurality of different object audio metadata (OAMD) representations, each of the OAMD representations mapping one or more audio channels of the channel-based audio to one or more audio objects;
convert the channel-based audio metadata into OAMD associated with the one or more audio objects using the OAMD representation indicated by the signaling parameter;
generate channel shuffle information based on channel order constraints of the OAMD;
reorder one or more audio channels of the channel-based audio based on the channel shuffle information to produce reordered channel-based audio; and
render the reordered channel-based audio into rendered audio using the OAMD; or
encode the reordered channel-based audio and the OAMD into an object-based audio bitstream and transmit the object-based audio bitstream to a playback device or a source device.
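The steps of this method can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the `Oamd` class, the `REPRESENTATIONS` table, the placeholder channel labels, and the helper names are hypothetical, and the metadata key `dmix_pos_adj_idx` is borrowed from the example signaling parameter named later in this description. The sketch only shows how a signaled representation can determine a channel order, and how channel shuffle information derived from that order reorders the audio channels.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Oamd:
    """Hypothetical stand-in for converted object audio metadata."""
    representation: int
    channel_order: List[str]  # order in which the OAMD expects the channels


# Hypothetical table: signaling value -> channel order required by the OAMD.
# The labels are placeholders, not the real 22.2ch or OAMD labels.
REPRESENTATIONS: Dict[int, List[str]] = {
    0: ["L", "R", "C", "LFE"],
    1: ["C", "L", "R", "LFE"],
}


def convert_metadata(cba_metadata: Dict[str, int]) -> Oamd:
    """Parse the signaling parameter and pick the OAMD representation it indicates."""
    signal = cba_metadata["dmix_pos_adj_idx"]
    return Oamd(representation=signal, channel_order=list(REPRESENTATIONS[signal]))


def make_shuffle(source_order: Sequence[str], oamd: Oamd) -> List[int]:
    """Channel shuffle information: for each output slot, the index of the source channel."""
    index_of = {label: i for i, label in enumerate(source_order)}
    return [index_of[label] for label in oamd.channel_order]


def reorder(channels: Sequence[List[float]], shuffle: Sequence[int]) -> List[List[float]]:
    """Reorder the audio channels according to the shuffle information."""
    return [channels[i] for i in shuffle]


source_order = ["L", "R", "C", "LFE"]
audio = [[0.1], [0.2], [0.3], [0.4]]  # one short sample buffer per channel
oamd = convert_metadata({"dmix_pos_adj_idx": 1})
shuffle = make_shuffle(source_order, oamd)
reordered = reorder(audio, shuffle)
```

The reordered channels and the OAMD would then be handed either to a renderer or to an object-based audio encoder, neither of which is modeled here.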

In an embodiment, the channel-based audio and metadata are included in a native audio bitstream, and the method further comprises decoding the native audio bitstream to recover (i.e., determine or extract) the channel-based audio and metadata.

In an embodiment, the channel-based audio and metadata are N.M channel-based audio and metadata, where N is a positive integer greater than 9 and M is a positive integer greater than or equal to 0.

In an embodiment, the method further comprises:
determining a first set of channels of the channel-based audio that can be represented by OAMD bed channels;
assigning OAMD bed channel labels to the first set of channels;
determining a second set of channels of the channel-based audio that cannot be represented by OAMD bed channels; and
assigning static OAMD position coordinates to the second set of channels.
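As a rough sketch of this partition, the Python function below splits a channel list into a first set that receives bed channel labels and a second set that receives static position coordinates. The function name, the two lookup dictionaries, and all labels and coordinates are hypothetical placeholders chosen for illustration; the real mappings are defined by the OAMD representation in use.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]  # (x, y, z) coordinates


def assign_bed_labels_and_positions(
    channels: List[str],
    bed_labels: Dict[str, str],
    static_positions: Dict[str, Position],
) -> Tuple[Dict[str, str], Dict[str, Position]]:
    """Split channels into bed-labeled channels and statically positioned objects."""
    first_set: Dict[str, str] = {}
    second_set: Dict[str, Position] = {}
    for channel in channels:
        if channel in bed_labels:
            # Representable by a bed channel: use the bed channel label.
            first_set[channel] = bed_labels[channel]
        else:
            # Not representable by a bed channel: use static coordinates.
            second_set[channel] = static_positions[channel]
    return first_set, second_set


# Placeholder data, not taken from any standard or figure.
first, second = assign_bed_labels_and_positions(
    channels=["src_a", "src_b", "src_c"],
    bed_labels={"src_a": "bed_1", "src_c": "bed_2"},
    static_positions={"src_b": (0.5, 0.0, 1.0)},
)
```

A channel that appears in neither lookup raises a `KeyError`, which makes an incomplete mapping visible instead of silently dropping the channel.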

In an embodiment, a method comprises:
receiving, by one or more processors of an audio processing device, a bitstream including channel-based audio and metadata,
wherein the one or more processors are configured to:
encode the channel-based audio into a native audio bitstream;
parse a signaling parameter from the metadata, the signaling parameter indicating one of a plurality of different object audio metadata (OAMD) representations;
convert the channel-based metadata into OAMD using the OAMD representation indicated by the signaling parameter;
generate channel shuffle information based on channel order constraints of the OAMD;
generate a bitstream package that includes the native audio bitstream, the channel shuffle information, and the OAMD;
multiplex the package into a transport layer bitstream; and
transmit the transport layer bitstream to a playback device or a source device.
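The packaging and multiplexing steps can be pictured with the sketch below. The byte layout is an assumption made purely for illustration (each field preceded by a 4-byte big-endian length, shuffle indices stored one per byte); it is not the layout of any real native audio or transport format, and `build_package` and `multiplex` are hypothetical names.

```python
import struct
from typing import List


def build_package(native_bitstream: bytes, shuffle: List[int], oamd: bytes) -> bytes:
    """Bundle the native audio bitstream, channel shuffle information, and OAMD."""
    shuffle_bytes = bytes(shuffle)  # assumes every index fits in one byte (0-255)
    fields = (native_bitstream, shuffle_bytes, oamd)
    # Each field is written as <4-byte big-endian length><payload>.
    return b"".join(struct.pack(">I", len(field)) + field for field in fields)


def multiplex(packages: List[bytes]) -> bytes:
    """Concatenate packages into one stream, each with its own length prefix."""
    return b"".join(struct.pack(">I", len(package)) + package for package in packages)


package = build_package(b"\x01\x02", [2, 0, 1], b"\xaa")
stream = multiplex([package, package])
```

Length prefixes are used here only because they make the receiving side trivial to write; an actual transport layer would define its own framing.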

In an embodiment, the channel-based audio and metadata are N.M channel-based audio and metadata, where N is a positive integer greater than 7 and M is a positive integer greater than or equal to 0.

In an embodiment, channels in the channel-based audio that can be represented by OAMD bed channel labels use the OAMD bed channel labels, and channels in the channel-based audio that cannot be represented by OAMD bed channel labels use static object positions, where each static object position is described in OAMD position coordinates.

In an embodiment, the transport bitstream is a Moving Picture Experts Group (MPEG) audio bitstream that includes a signal indicating the presence of OAMD in an extension field of the MPEG audio bitstream.

In an embodiment, the signal indicating the presence of OAMD in the MPEG audio bitstream is included in a reserved metadata field in the MPEG audio bitstream that is used for signaling a surround sound mode.

In an embodiment, a method comprises:
receiving, by one or more processors of an audio processing device, a transport layer bitstream including a package,
wherein the one or more processors are configured to:
demultiplex the transport layer bitstream to recover (i.e., determine or extract) the package;
decode the package to recover (i.e., determine or extract) a native audio bitstream, channel shuffle information, and object audio metadata (OAMD);
decode the native audio bitstream to recover channel-based audio and metadata;
reorder channels of the channel-based audio based on the channel shuffle information; and
render the reordered channel-based audio into rendered audio using the OAMD; or
encode the channel-based audio and the OAMD into an object-based audio bitstream and transmit the object-based audio bitstream to a source device.
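The receiving side can be sketched in the same spirit. The example assumes a made-up package layout in which the native audio bitstream, the channel shuffle information, and the OAMD each follow a 4-byte big-endian length, with shuffle indices stored one per byte. That layout, and the names `read_field`, `parse_package`, and `reorder`, are hypothetical; decoding of the native audio bitstream itself is not modeled.

```python
import struct
from typing import List, Sequence, Tuple


def read_field(buf: bytes, pos: int) -> Tuple[bytes, int]:
    """Read one <4-byte big-endian length><payload> field; return payload and next offset."""
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    return buf[start:start + length], start + length


def parse_package(package: bytes) -> Tuple[bytes, List[int], bytes]:
    """Recover the native audio bitstream, channel shuffle information, and OAMD."""
    native, pos = read_field(package, 0)
    shuffle_bytes, pos = read_field(package, pos)
    oamd, pos = read_field(package, pos)
    return native, list(shuffle_bytes), oamd


def reorder(channels: Sequence[str], shuffle: Sequence[int]) -> List[str]:
    """Place source channel shuffle[i] into output slot i."""
    return [channels[i] for i in shuffle]


package = (
    b"\x00\x00\x00\x02\x01\x02"      # native audio bitstream (2 bytes)
    b"\x00\x00\x00\x03\x02\x00\x01"  # channel shuffle information (3 indices)
    b"\x00\x00\x00\x01\xaa"          # OAMD payload (1 byte)
)
native, shuffle, oamd = parse_package(package)
reordered = reorder(["ch0", "ch1", "ch2"], shuffle)
```

After reordering, the channels and the recovered OAMD would go either to a renderer or to an object-based audio encoder.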

In an embodiment, the method further comprises:
determining a first set of channels of the channel-based audio that can be represented by OAMD bed channels;
assigning OAMD bed channel labels to the first set of channels;
determining a second set of channels of the channel-based audio that cannot be represented by OAMD bed channels; and
assigning static OAMD position coordinates to the second set of channels.

In an embodiment, the signal indicating the presence of OAMD in the MPEG audio bitstream is included in a reserved metadata field of a data structure in the metadata of the MPEG audio bitstream that is used for signaling a surround sound mode.

In an embodiment, an apparatus comprises:
one or more processors; and
a non-transitory computer-readable storage medium storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the methods described herein.

Other embodiments disclosed herein are directed to systems, apparatus, and computer-readable media. Details of the disclosed implementations are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

Particular embodiments disclosed herein provide one or more of the following advantages. An existing installed base of OBA-compatible playback devices can convert CBA content to OBA content using native audio and transport bitstream formats based on existing standards, without replacing hardware components of the playback devices.

In the accompanying drawings referenced below, various embodiments are illustrated in block diagrams, flowcharts, and other diagrams. Each block in the flowcharts or block diagrams may represent a module, a program, or a portion of code that contains one or more executable instructions for performing the specified logical functions. Although these blocks are shown in a particular order for performing the steps of the methods, they need not be performed strictly in the order shown. For example, they may be performed in reverse order or concurrently, depending on the nature of the respective operations. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations thereof, may be implemented by a dedicated software-based or hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

A table showing bed channels and object positions for two different object audio metadata (OAMD) representations, according to an embodiment.

A table showing bed channel assignments and channel order for two different OAMD representations, according to an embodiment.

A table showing dimensional trim metadata, according to an embodiment.

A table showing trim/balance controls, according to an embodiment.

A block diagram of a system for converting a 22.2-channel audio bitstream into audio objects and OAMD without using bitstream encoding, according to an embodiment.

A block diagram of a system for converting a 22.2-channel audio bitstream into audio objects and OAMD using bitstream encoding, according to an embodiment.

A block diagram of a system for converting a 22.2-channel audio bitstream into audio objects and OAMD for rendering in a source device, according to an embodiment.

A block diagram of a system for converting a 22.2ch audio bitstream into audio objects and OAMD for transmission over a high-definition multimedia interface (HDMI) for external rendering, according to an embodiment.

A block diagram of a system for converting a 22.2ch audio bitstream into audio objects and OAMD, in which the channel shuffle information and OAMD are packaged within the native audio bitstream, according to an embodiment.

A block diagram of a system for converting a 22.2ch audio bitstream into audio objects and OAMD for rendering in a source device, in which the channel shuffle information and OAMD are packaged within the native audio bitstream for rendering in the source device, according to an embodiment.

A block diagram of a system for converting a 22.2ch audio bitstream into audio objects and OAMD, in which the channel shuffle information and OAMD are embedded in the transport layer for delivery to a source device and are then packaged within the native audio bitstream for transmission over HDMI, according to an embodiment.

A block diagram of a system for converting a 22.2ch audio bitstream into audio objects and OAMD, in which the channel shuffle information and OAMD are embedded in the transport layer for rendering in a source device, according to an embodiment.

A flow diagram of a CBA to OBA conversion process, according to an embodiment.

A flow diagram of an alternative CBA to OBA conversion process, according to an embodiment.

A block diagram of an example audio system architecture that includes conversion of channel audio to object audio, according to an embodiment.

The same reference symbols used in the various drawings indicate like elements.

＜＜Overview＞＞
Object audio metadata (OAMD) is the coded bitstream representation of the metadata for OBA processing, such as the metadata described in ETSI TS 103 420 v1.2.1 (2018-10). The OAMD bitstream may be carried in an Extensible Metadata Delivery Format (EMDF) container, such as the one specified in ETSI TS 102 366 [1]. OAMD is used to render audio objects. The rendering information may change dynamically (e.g., gain and position). OAMD bitstream elements may include content description metadata, object property metadata, property update metadata, and other metadata.

In an embodiment, the content description metadata includes the version of the OAMD payload syntax, the total number of objects, the object types, and the program composition. The object property metadata includes: object position in room-anchored, screen-anchored, or speaker-anchored coordinates; object size (width, depth, height); priority (imposes an ordering by importance on the objects, where a higher priority indicates a more important object); gain (used to apply a custom gain value to an object); channel lock (used to constrain rendering of an object to a single speaker, providing non-diffuse, timbre-neutral reproduction of the audio); zone constraints (specify zones or sub-volumes of the listening environment in which an object is excluded or included); object divergence (used to convert an object into two objects, with the energy spread along the X axis); and object trim (used to lower the level of off-screen elements indicated in the mix).

In an embodiment, the property update metadata signals timing data that applies to the updates of all transmitted objects. The timing data of a transmitted property update specifies the start time of the update, together with the update context with respect to preceding or subsequent updates and the time period for the interpolation process between successive updates. The OAMD bitstream syntax supports up to eight property updates per object in each codec frame. The number of signaled updates, or the start and stop time of each property update, is the same for all objects. The metadata indicates a ramp duration value in the OAMD that specifies the time period, in audio samples, for interpolating from the signaled object property values of the previous property update to the values of the current update.
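To make the role of the ramp duration concrete, here is a small Python sketch of interpolating one scalar object property (for example a gain) between two successive updates. Linear interpolation is an assumption made for illustration; this description does not state which interpolation curve is used, and the function and parameter names are hypothetical.

```python
def interpolate_property(
    prev_value: float,
    new_value: float,
    ramp_duration: int,
    samples_since_start: int,
) -> float:
    """Value of a property some samples after the current update starts.

    ramp_duration is the interpolation period in audio samples. Linear
    interpolation is assumed here purely for illustration.
    """
    if ramp_duration <= 0 or samples_since_start >= ramp_duration:
        return new_value  # ramp finished, or the update applies immediately
    if samples_since_start <= 0:
        return prev_value  # ramp has not started yet
    fraction = samples_since_start / ramp_duration
    return prev_value + (new_value - prev_value) * fraction


halfway = interpolate_property(0.0, 1.0, ramp_duration=100, samples_since_start=50)
finished = interpolate_property(0.0, 1.0, ramp_duration=100, samples_since_start=100)
```

A position update could be handled by applying the same function to each coordinate separately.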

In an embodiment, the timing data also includes a sample offset value and a block offset value that the decoder uses to calculate a start sample value offset and a frame offset. The sample offset is the time offset, in samples, to the first pulse code modulated (PCM) audio sample to which the data in the OAMD payload applies, as specified, for example, in ETSI TS 102 366 [1], clauses H.2.2.3.1 and H.2.2.3.2. The block offset value indicates a time period, in samples, as an offset from the sample offset that is common to all property updates.
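One possible reading of these two offsets is that the start of each property update is found by adding its block offset to the shared sample offset. The sketch below works through that arithmetic. The additive combination, the `frame_start` argument, and the numeric values are assumptions for illustration only; the normative definition is in the referenced ETSI clauses.

```python
from typing import List


def update_start_sample(frame_start: int, sample_offset: int, block_offset: int) -> int:
    """Absolute PCM sample index at which one property update starts.

    Assumes the block offset is simply added to the shared sample offset.
    """
    return frame_start + sample_offset + block_offset


def update_start_samples(
    frame_start: int, sample_offset: int, block_offsets: List[int]
) -> List[int]:
    """Start samples for all property updates of a frame, sharing one sample offset."""
    return [update_start_sample(frame_start, sample_offset, b) for b in block_offsets]


# Illustrative numbers only.
starts = update_start_samples(frame_start=1536, sample_offset=32, block_offsets=[0, 256, 512])
```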

In an embodiment, the decoder provides an interface for OBA that comprises the audio data of the object audio essence and time-stamped metadata updates for the corresponding object properties. At this interface, the decoder provides the decoded per-object metadata in time-stamped updates. For each update, the decoder provides the data specified in a metadata update structure.

＜＜Example CBA to OBA conversion＞＞
The following disclosure describes techniques for converting CBA content to OBA using OAMD. In an example embodiment, 22.2-channel ("22.2ch") content is converted to OBA using OAMD. In this embodiment, the 22.2ch content has two defined ways in which its channels are positioned and therefore downmixed or rendered. The choice between them may depend on the value of a parameter, such as the dmix_pos_adj_idx parameter embedded in the 22.2ch bitstream. A format converter that converts the 22.2ch positions to an OAMD representation selects one of two OAMD representations based on the value of this parameter. The selected representation is carried in an OBA bitstream (e.g., a Dolby® MAT bitstream) that is input to a playback device (e.g., a Dolby® Atmos® playback device). An example 22.2ch system is Hamasaki 22.2. Hamasaki 22.2 is the surround sound component of Super Hi-Vision, a television standard developed by NHK Science & Technology Research Laboratories, and uses 24 speakers (including two subwoofers) arranged in three layers.

Although the following disclosure is directed to an embodiment in which 22.2ch content is converted to OBA content using OAMD, the disclosed embodiments are applicable to any CBA or OBA bitstream format, including standardized or proprietary bitstream formats, and to any playback device or system. Furthermore, the following disclosure is not limited to 22.2ch to OBA conversion and is also applicable to the conversion of any N.M channel-based audio, where N is a positive integer greater than 7 and M is a positive integer greater than or equal to 0.

As used herein, the term "includes" and its variants are to be read as open-ended terms that mean "includes, but is not limited to." The term "or" is to be read as "and/or" unless the context clearly indicates otherwise. The term "based on" is to be read as "based at least in part on." The terms "one example embodiment" and "an example embodiment" are to be read as "at least one example embodiment." The term "another embodiment" is to be read as "at least one other embodiment." In addition, in the following description and claims, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

<Program Assignment and Object Positions>
In this application, 22.2ch content 305 (e.g., a file or live stream) is received by format converter 301. Content 305 includes audio and associated metadata. The metadata includes the dmix_pos_adj_idx parameter, whose value is used to select one of two OAMD representations. Channels that can be represented by an OAMD bed channel label use that label. Channels that cannot be represented by an OAMD bed channel label use static object positions, where each static object position is described by OAMD [x,y,z] position coordinates as described, for example, in ETSI TS 103 420 v1.2.1 (2018-10). As used herein, a "bed channel" is a group of multiple bed objects, and a "bed object" is a static object whose spatial position is fixed by its assignment to a loudspeaker of the playback system.

FIG. 1A is a table showing bed channels and object positions for two different OAMD representations, according to an embodiment. The top row of the table contains the twenty-four 22.2ch labels, the middle row contains the bed channel labels and object positions of the first OAMD representation, signaled by dmix_pos_adj_idx=0, and the bottom row contains the bed channel labels and object positions of the second OAMD representation, signaled by dmix_pos_adj_idx=1. Note that the dmix_pos_adj_idx signal is an exemplary signal and that any type of signaling can be used, including but not limited to Boolean flags and signals encoded with one or more bits.

Referring to the table in FIG. 1A, some examples of 22.2ch labels include FL (front-left), FR (front-right), FC (front-center), LFE1 (low-frequency effects 1), BL (back-left), BR (back-right), FLc (front-left-center), FRc (front-right-center), BC (back-center), LFE2 (low-frequency effects 2), SIL (left-side), SIR (right-side), TpFL (top-front-left), TpFR (top-front-right), TpFC (top-front-center), TpC (top-center), TpBL (top-back-left), TpBR (top-back-right), TpSIL (top-side-left), TpSIR (top-side-right), TpBC (top-back-center), BtFL (between-front-left), BtFR (between-front-right), and BtFC (between-front-center). Note that each of these labels is mapped to either an OAMD bed channel label or a static object position [x,y,z]. For example, in the first OAMD representation (dmix_pos_adj_idx=0), the 22.2ch label FL maps to static object position [0,0.25,0], the 22.2ch label FR maps to static object position [1,0.25,0], the 22.2ch label FC maps to OAMD bed channel label C, and so on. An OAMD representation maps one or more audio channels to one or more audio objects based on a signaling parameter (e.g., its value). The one or more audio objects may be dynamic or static audio objects. As mentioned above, a static audio object is an audio object that has a fixed spatial position. A dynamic audio object is an audio object whose spatial position can change over time. In the example above, the OAMD representation includes channel labels, bed channel labels, and static object positions, and it maps each channel label to either a bed channel label or a static object position based on the signaling parameter (e.g., its value).
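The selection described above can be sketched as a table lookup keyed by the signaling parameter. This is a minimal illustration rather than the disclosed implementation: the names `REPRESENTATIONS`, `map_channel`, and `is_static_object` are hypothetical, and the tables are deliberately partial, holding only the entries stated in the text (the FL/FR entries for dmix_pos_adj_idx=1 are taken from the description of FIG. 1B).

```python
from typing import Dict, Tuple, Union

# A mapping target is either an OAMD bed channel label (str) or a
# static object position [x, y, z] (tuple of three floats).
Target = Union[str, Tuple[float, float, float]]

# Partial, illustrative tables: only entries stated in the text are listed.
# The full 24-entry tables are given in FIG. 1A and are not reproduced here.
REPRESENTATIONS: Dict[int, Dict[str, Target]] = {
    0: {"FL": (0.0, 0.25, 0.0), "FR": (1.0, 0.25, 0.0), "FC": "C"},
    1: {"FL": "L", "FR": "R"},
}


def map_channel(label: str, dmix_pos_adj_idx: int) -> Target:
    """Look up the OAMD target for a 22.2ch label in the selected representation."""
    return REPRESENTATIONS[dmix_pos_adj_idx][label]


def is_static_object(target: Target) -> bool:
    """True if the target is a static object position rather than a bed label."""
    return isinstance(target, tuple)
```

With these partial tables, `map_channel("FL", 0)` yields a static object position, while `map_channel("FL", 1)` yields the bed channel label `"L"`.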

<Program Assignment and Object Positions>
OAMD assumes that bed objects precede dynamic objects. In addition, bed objects appear in a particular order. For these reasons, the audio of the 22.2ch content is reordered by audio channel shuffler 303 to satisfy the OAMD order constraints. Audio channel shuffler 303 receives channel shuffle information from metadata generator 304 and uses it to reorder the 22.2 channels.

FIG. 1B is a table showing bed channel assignments and channel order for two different OAMD representations, according to an embodiment. The top row of the table shows the assumed channel order (channels 0 to 23) and channel labels of the 22.2ch content (Hamasaki 22.2). The middle row shows the bed assignment labels for the first OAMD representation. The bottom row shows the bed assignment labels for the second OAMD representation. Referring to FIG. 3, the converted audio and OAMD metadata are output by format converter 301 to object audio renderer 302, which generates rendered audio.

Referring to the table in FIG. 1B, the first two channels (0, 1) of the 22.2ch content are FL and FR. In the first OAMD representation (dmix_pos_adj_idx=0), the first two channels (0, 1) are reordered ("shuffled") to OAMD channels 15 and 16, respectively. In the second OAMD representation (dmix_pos_adj_idx=1), the first two channels (0, 1) are reordered to OAMD channels L and R, respectively. In this example, for the first OAMD representation (dmix_pos_adj_idx=0), input index 6 (e.g., of Hamasaki 22.2) is reordered/shuffled to become channel index 0, so that the first output channel, with index 0, is associated with the first OAMD representation. In other words, in this example, if a left channel L is present among the input bed channels, this left channel is forced to be the first channel (with channel index 0) in the first OAMD representation. All of the bed channels, if present, appear in a particular order when represented in OAMD. Once the bed channels are reordered, the dynamic objects are reordered as a consequence of the bed channel reordering. The reordering satisfies the order constraints of the particular OAMD representation. The constraints depend on the OAMD specification used by the OBA playback device/system. For example, for an OBA playback device/system compatible with Dolby Atmos, the OAMD transmitted in systems and codecs carrying Dolby Atmos content is specified by the Dolby Atmos OAMD specification. These specifications/constraints determine the order of the OAMD bed channels, which is, for example, as shown in FIG. 1A and as follows, with the corresponding channel labels in parentheses: Left (L), Right (R), Center (C), Low-Frequency Effects (LFE), Left Surround (Ls), Right Surround (Rs), Left Rear Surround (Lrs), Right Rear Surround (Rrs), Left Front High (Lfh), Right Front High (Rfh), Left Top Middle (Ltm), Right Top Middle (Rtm), Left Rear High (Lrh), Right Rear High (Rrh), and Low-Frequency Effects 2 (LFE2).
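The ordering constraint above (beds first, in a fixed bed order, followed by the remaining objects) can be illustrated with a short sketch. The function names and the six-channel input are hypothetical, and keeping non-bed channels in their original relative order is an assumption made for illustration; the actual shuffle tables are those of FIG. 1B.

```python
from typing import List, Optional, Sequence

# OAMD bed channel order as listed in the text.
BED_ORDER = ["L", "R", "C", "LFE", "Ls", "Rs", "Lrs", "Rrs",
             "Lfh", "Rfh", "Ltm", "Rtm", "Lrh", "Rrh", "LFE2"]


def shuffle_indices(assignments: Sequence[Optional[str]]) -> List[int]:
    """Return input channel indices in output order.

    assignments[i] is the bed label of input channel i, or None if the
    channel is carried as an object. Beds come first in BED_ORDER; objects
    follow in input order (an assumption of this sketch).
    """
    beds = sorted(
        (i for i, a in enumerate(assignments) if a is not None),
        key=lambda i: BED_ORDER.index(assignments[i]),
    )
    objects = [i for i, a in enumerate(assignments) if a is None]
    return beds + objects


def apply_shuffle(channels: Sequence, order: Sequence[int]) -> list:
    """Reorder channel buffers according to the shuffle information."""
    return [channels[i] for i in order]


# Hypothetical 6-channel input: two objects followed by four bed channels.
order = shuffle_indices([None, None, "C", "LFE", "L", "R"])
shuffled = apply_shuffle(["a", "b", "c", "d", "e", "f"], order)
```

Here `order` plays the role of the channel shuffle information: it is computed once from the metadata and then applied to the audio, which mirrors the split between the metadata generator and the audio channel shuffler.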

<Dimensional Trim Metadata>
FIG. 2A is a table showing dimensional trim metadata, according to an embodiment. To ensure that the reordering of 22.2ch content into OBA content closely matches the downmix specified by the 22.2ch specification, dimensional trim metadata is included in the OAMD that accompanies the 22.2ch content delivered to an OBA rendering device. Object trim is used to lower the level of off-screen elements contained in the mix. This is desirable when an immersive mix is played back on a layout with only a few loudspeakers.

In an embodiment, the first metadata field includes the parameter warp_mode. When set to a value of "0", this parameter indicates normal rendering (i.e., no warping) of objects in a 5.1X output configuration. When warp_mode is set to a value of "1", warping is applied to objects in a 5.1X output configuration. Warp describes how the renderer handles content that is panned between the midpoint and the back of the listening environment (e.g., a room). With warping, such content is presented at a constant level in the surround loudspeakers between the back and the midpoint of the listening environment, avoiding any need for phantom imaging until the content is in the front half of the listening environment.

The second metadata field in the dimensional trim metadata table includes per-configuration trim/balance controls for the eight speaker configurations (e.g., 2.0, 5.1.0, 7.1.0, 2.1.2, 5.1.2, 7.1.2, 2.1.4, 5.1.4, 7.1.4) shown in FIG. 2B. There are metadata fields for automatic trim (auto_trim), center trim (center_trim), surround trim (surround_trim), height trim (height_trim), and front/back balance trim (fb_balance_ohfl, fb_balance_surr).

Referring to FIG. 2A, the third metadata field includes the parameter object_trim_bypass. The value of this parameter applies to all beds and dynamic objects in the 22.2ch content. If object_trim_bypass is set to a value of "1", no trimming is applied to the beds and dynamic objects.
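The three metadata fields described above can be pictured as a small record. The sketch below is only an illustration: the class names, field types, default values, and the use of configuration strings as dictionary keys are assumptions, and the actual field encoding is defined by the table of FIG. 2A, not by this code.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ConfigTrim:
    """Per-configuration trim/balance controls (types and defaults assumed)."""
    auto_trim: bool = True
    center_trim: float = 0.0
    surround_trim: float = 0.0
    height_trim: float = 0.0
    fb_balance_ohfl: float = 0.0
    fb_balance_surr: float = 0.0


@dataclass
class DimensionalTrimMetadata:
    warp_mode: int = 0                      # 0: normal rendering, 1: warping
    trims: Dict[str, ConfigTrim] = field(default_factory=dict)
    object_trim_bypass: int = 0             # 1: no trimming applied

    def warping_enabled(self) -> bool:
        return self.warp_mode == 1

    def trimming_enabled(self) -> bool:
        # The text states only that a value of 1 disables trimming; treating
        # every other value as "trimming enabled" is an assumption.
        return self.object_trim_bypass != 1


meta = DimensionalTrimMetadata(
    warp_mode=1,
    trims={"5.1.2": ConfigTrim(surround_trim=-3.0)},  # hypothetical value
    object_trim_bypass=1,
)
```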

<Object Gain>
OAMD allows each object to have an individual object gain. This gain is applied by object audio renderer 302. Object gains make it possible to compensate for differences between the downmix values of the 22.2ch content and the rendering of the OAMD representation of the 22.2ch content. In an embodiment, the object gain is set to -3 dB for objects with a bed channel assignment of LFE1 or LFE2 and to 0 dB for all other objects. Other object gain values can be used depending on the application.
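As a worked illustration of the gain rule in this embodiment, the sketch below assigns -3 dB to objects with an LFE1 or LFE2 bed assignment and 0 dB otherwise, then converts the decibel value to a linear amplitude factor using the standard relation g = 10^(dB/20). The function names are hypothetical, and applying the gain as a plain per-sample multiplication is an assumption about how a renderer might use it.

```python
from typing import List, Optional

LFE_ASSIGNMENTS = {"LFE1", "LFE2"}  # bed assignments named in the text


def object_gain_db(bed_assignment: Optional[str]) -> float:
    """Object gain in dB for the embodiment described above."""
    return -3.0 if bed_assignment in LFE_ASSIGNMENTS else 0.0


def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in dB to a linear scale factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples: List[float], gain_db: float) -> List[float]:
    """Scale a block of samples by the given gain."""
    g = db_to_linear(gain_db)
    return [s * g for s in samples]
```

A gain of -3 dB corresponds to a linear factor of roughly 0.708, so an LFE sample of 1.0 would be scaled to about 0.708 under this sketch.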

<<Example Applications>>
<Listening to 22.2ch Content as OBA>
FIG. 3 is a block diagram of an exemplary system 300 for converting a 22.2 channel audio bitstream into audio and OAMD without bitstream encoding, according to an embodiment. System 300 is used in applications where 22.2ch content is listened to as OBA content on an OBA playback system (e.g., Dolby® Atmos®).

System 300 includes format converter 301 and object audio renderer 302. Format converter 301 further includes audio channel shuffler 303 and OAMD metadata generator 304. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data. 22.2ch content 305 (e.g., a file or live stream) includes 22.2ch audio and metadata, which are input to format converter 301. OAMD metadata generator 304 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content that is applied by audio channel shuffler 303, for example according to the principles described with reference to FIG. 1B. The output of audio channel shuffler 303 is the reordered audio channels. The outputs of format converter 301 are the reordered audio channels and the OAMD, which are input to object audio renderer 302. Object audio renderer 302 processes the audio using the OAMD and adapts it to a particular loudspeaker layout.
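The data flow through the format converter can be summarized in a few lines. This is a structural sketch only: the function names, the dictionary used to stand in for OAMD, and the three-channel shuffle tables are all hypothetical placeholders, not the contents of FIG. 1A or FIG. 1B.

```python
from typing import Dict, List, Sequence, Tuple

Oamd = Dict[str, object]  # stand-in for real OAMD (assumption)


def metadata_generator(
    cba_metadata: Dict[str, int],
    shuffle_tables: Dict[int, List[int]],
) -> Tuple[Oamd, List[int]]:
    """Map CBA metadata to OAMD and derive the channel shuffle information."""
    idx = cba_metadata["dmix_pos_adj_idx"]
    oamd: Oamd = {"representation": idx}  # placeholder OAMD content
    return oamd, shuffle_tables[idx]


def channel_shuffler(audio: Sequence, order: Sequence[int]) -> list:
    """Reorder the input channels according to the shuffle information."""
    return [audio[i] for i in order]


def format_converter(audio, cba_metadata, shuffle_tables):
    """Return (reordered audio, OAMD), the two inputs of the renderer."""
    oamd, order = metadata_generator(cba_metadata, shuffle_tables)
    return channel_shuffler(audio, order), oamd


# Toy tables for a 3-channel input; real tables would have 24 entries.
TOY_TABLES = {0: [2, 0, 1], 1: [0, 1, 2]}
audio_out, oamd_out = format_converter(
    ["ch0", "ch1", "ch2"], {"dmix_pos_adj_idx": 0}, TOY_TABLES
)
```

Keeping the shuffle information as data separate from the shuffler is what allows the later embodiments to precompute it at the encoder and reduce the receiver-side format conversion to the shuffler alone.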

<Transmitting 22.2ch Content as OBA>
FIG. 4 is a block diagram of an exemplary system 400 for converting a 22.2 channel audio bitstream into audio objects and OAMD with bitstream encoding, according to an embodiment. In this application, rather than transmitting the 22.2ch content itself, the 22.2ch content is format-converted and transmitted as OBA using an OBA codec.

System 400 includes format converter 401 and OBA encoder 402. Format converter 401 further includes OAMD metadata generator 404 and audio channel shuffler 403. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data. 22.2ch content 405 (e.g., a file or live stream) includes 22.2ch audio and metadata, which are input to format converter 401. OAMD metadata generator 404 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content that is applied by audio channel shuffler 403, for example according to the principles described with reference to FIG. 1B. The output of audio channel shuffler 403 is the reordered audio channels.

The outputs of format converter 401 are the reordered audio channels and the OAMD, which are input to OBA encoder 402. OBA encoder 402 encodes the audio using the OAMD (e.g., using JOC) to generate OBA bitstream 406. OBA bitstream 406 can be transmitted to a downstream OBA playback device, where it is rendered by an object audio renderer that processes the audio and adapts it to a particular loudspeaker layout.

<Converting Transmitted 22.2ch Content to OBA for Rendering in a Source Device>
FIG. 5 is a block diagram of an exemplary system for converting a 22.2 channel audio bitstream into audio objects and OAMD for rendering in a source device, according to an embodiment. In this application, a source device, such as a set-top box (STB) or audio/video recorder (AVR), receives 22.2ch content in a native audio bitstream, and after format conversion by a format converter, the content is rendered using an object audio renderer. An exemplary native audio bitstream format is the Advanced Audio Coding (AAC) standard bitstream format.

System 500 includes format converter 501, object audio renderer 502, and decoder 506. Format converter 501 further includes OAMD metadata generator 504 and audio channel shuffler 503. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data. Audio bitstream 505 (e.g., AAC/MP4) includes 22.2ch audio and metadata and is input to decoder 506 (e.g., an AAC/MP4 decoder). The outputs of decoder 506 are the 22.2ch audio and metadata, which are input to format converter 501. OAMD metadata generator 504 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content that is applied by audio channel shuffler 503, for example according to the principles described with reference to FIG. 1B. The output of audio channel shuffler 503 is the reordered audio channels. The outputs of format converter 501 are the reordered audio channels and the OAMD, which are input to object audio renderer 502. Object audio renderer 502 processes the audio using the OAMD and adapts it to a particular loudspeaker layout.

<Converting Transmitted 22.2ch Content to OBA for Transmission over HDMI for External Rendering (STB/AVR/SB)>
FIGS. 6A and 6B are block diagrams of an exemplary system for converting a 22.2ch audio bitstream into audio objects and OAMD for transmission over a high-definition multimedia interface (HDMI) for external rendering, according to an embodiment. In this application, the channel shuffle information is generated in the encoder together with the OAMD and is packaged within a native audio bitstream (e.g., AAC) for transmission. In this configuration, the format conversion that takes place is reduced to an audio shuffler. The shuffled audio, together with the OAMD, is sent to an OBA encoder for transmission in a bitstream over HDMI. On the receiver side, the bitstream is decoded and rendered by an object audio renderer.

Referring to FIG. 6A, encoding system 600A includes format converter 601, OBA encoder 602, and decoder 606. Format converter 601 further includes OAMD metadata generator 604 and audio channel shuffler 603. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data. Native audio bitstream 605 (e.g., AAC/MP4) includes 22.2ch audio and metadata and is input to decoder 606 (e.g., an AAC/MP4 decoder). The outputs of decoder 606 are the 22.2ch audio and metadata, which are input to format converter 601. OAMD metadata generator 604 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content that is applied by audio channel shuffler 603, for example according to the principles described with reference to FIG. 1B. The output of audio channel shuffler 603 is the reordered audio channels. The outputs of format converter 601 are the reordered audio channels and the OAMD, which are input to OBA encoder 602. OBA encoder 602 encodes the audio and the OAMD and outputs an OBA bitstream that includes the audio and the OAMD.

Referring to FIG. 6B, decoding system 600B includes OBA decoder 607 and object audio renderer 608. The OBA bitstream is input to OBA decoder 607, which outputs audio and OAMD that are input to object audio renderer 608. Object audio renderer 608 processes the audio using the OAMD and adapts it to a particular loudspeaker layout.

<Transmitting Pre-Computed OAMD for 22.2ch Content via the Native Bitstream for Transmission over HDMI>
FIGS. 7A-7C are block diagrams of an exemplary system for converting a 22.2ch audio bitstream into audio objects and OAMD, in which the channel shuffle information and OAMD are packaged within the native audio bitstream, according to an embodiment. In the previous exemplary applications, the OAMD is generated after the decoder (e.g., an AAC decoder). As an alternative embodiment, however, it is possible to embed the channel shuffle information and the OAMD in the transmission format (either the native audio bitstream or the transport layer). In this application, the channel shuffle information is generated in the encoder together with the OAMD and is packaged within the native audio bitstream (e.g., an AAC bitstream) for transmission. In this configuration, the format conversion that takes place is reduced to an audio shuffler. The shuffled audio, together with the OAMD, is sent to an OBA encoder for transmission over HDMI. On the receiving side, the OBA bitstream is decoded and rendered by an object audio renderer.

Referring to FIG. 7A, encoding system 700A includes encoder 701 (e.g., an AAC encoder) and transport layer multiplexer 706. Encoder 701 further includes core encoder 702, format converter 703, and bitstream packager 705. Format converter 703 further includes OAMD metadata generator 704, which may be, for example, a Dolby Atmos metadata generator. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data.

Native audio bitstream 707 (e.g., AAC/MP4) includes 22.2ch audio and metadata. The audio is input to core encoder 702 of encoder 701, which encodes the audio into the native audio format and outputs the encoded audio to bitstream packager 705. OAMD metadata generator 704 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content, for example according to the principles described with reference to FIG. 1B. The channel shuffle information is input to bitstream packager 705 together with the OAMD. The output of bitstream packager 705 is a native audio bitstream that includes the channel shuffle information and the OAMD. The native audio bitstream is input to transport layer multiplexer 706, which outputs a transport stream that includes the native audio bitstream.

Referring to FIG. 7B, decoding/encoding system 700B includes transport layer demultiplexer 708, decoder 709, audio channel shuffler 710, and OBA encoder 711. Transport layer demultiplexer 708 demultiplexes the audio and OAMD from the transport bitstream and inputs them to decoder 709. Decoder 709 decodes the audio and OAMD from the native audio bitstream. The decoded audio and OAMD are then input to OBA encoder 711, which encodes them into an OBA bitstream.

Referring to FIG. 7C, decoding system 700C includes OBA decoder 712 and object audio renderer 713. The OBA bitstream is input to OBA decoder 712, which outputs audio and OAMD that are input to object audio renderer 713. Object audio renderer 713 processes the audio using the OAMD and adapts it to a particular loudspeaker layout.

<Transmitting Pre-Computed OAMD for Rendering in a Source Device>
FIGS. 8A and 8B are block diagrams of an exemplary system for converting a 22.2ch audio bitstream into audio objects and OAMD for rendering in a source device, in which the channel shuffle information and OAMD are packaged within the native audio bitstream, according to an embodiment. In this application, the channel shuffle information is generated in the encoder together with the OAMD and is packaged within the native audio bitstream (e.g., an AAC bitstream) for transmission via the transport layer. In this configuration, the format conversion that takes place is reduced to an audio shuffler. The shuffled audio, together with the OAMD, is sent to an object audio renderer for rendering.

Referring to FIG. 8A, encoding system 800A includes encoder 801 (e.g., an AAC encoder) and transport layer multiplexer 807. Encoder 801 further includes core encoder 803, format converter 802, and bitstream packager 805. Format converter 802 further includes OAMD metadata generator 804, which may be, for example, a Dolby Atmos metadata generator. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data.

Native audio bitstream 806 (e.g., AAC/MP4) includes 22.2ch audio and metadata. The audio is input to core encoder 803 of encoder 801, which encodes the audio into the native audio format and outputs the encoded audio to bitstream packager 805. OAMD metadata generator 804 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content, for example according to the principles described with reference to FIG. 1B. The channel shuffle information is input to bitstream packager 805 together with the OAMD. The output of bitstream packager 805 is a native audio bitstream that includes the channel shuffle information and the OAMD. The native audio bitstream is input to transport layer multiplexer 807, which outputs a transport stream that includes the native audio bitstream.

Referring to FIG. 8B, decoding system 800B includes transport layer demultiplexer 808, decoder 809, audio channel shuffler 810, and object audio renderer 811. Transport layer demultiplexer 808 demultiplexes the audio and OAMD from the transport bitstream and inputs them to decoder 809. Decoder 809 decodes the audio and OAMD from the native audio bitstream. The decoded audio and OAMD are then input to object audio renderer 811, which uses the OAMD to process the audio and adapt it to a particular loudspeaker layout.

<Transmitting Pre-Computed OAMD via the Transport Layer for Transmission over HDMI>
FIGS. 9A-9C are block diagrams of an exemplary system for converting a 22.2ch audio bitstream into audio objects and OAMD, according to an embodiment, in which the channel shuffle information and OAMD are embedded in the transport layer for delivery to a source device and are then packaged in a native audio bitstream for transmission over HDMI.

The OAMD used to represent 22.2ch content is static for the duration of a program. For this reason, it is desirable to avoid sending the OAMD frequently, so that the data rate of the audio bitstream does not increase. This can be achieved by transmitting the static OAMD and channel shuffle information in the transport layer. Once received, the OAMD and channel shuffle information are used by the OBA encoder for subsequent transmission over HDMI. An exemplary transport layer is the base media file format (BMFF) described in ISO/IEC 14496-12 (MPEG-4 Part 12), which defines a general structure for time-based multimedia files such as video and audio. In embodiments that use MPEG-DASH, the OAMD is included in the manifest.
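The data-rate argument above can be made concrete with a rough calculation. The figures below are assumptions chosen only for illustration (a 200-byte OAMD payload, 1024-sample audio frames at 48 kHz, and a one-hour program); none of them comes from this specification, and `oamd_overhead_bps` is a hypothetical helper.

```python
def oamd_overhead_bps(payload_bytes: int, sends_per_second: float) -> float:
    """Bit rate consumed when a payload of the given size is sent repeatedly."""
    if payload_bytes < 0 or sends_per_second < 0:
        raise ValueError("payload size and send rate must be non-negative")
    return payload_bytes * 8 * sends_per_second


# Assumed, illustrative figures (not taken from the specification).
PAYLOAD_BYTES = 200                  # size of one static OAMD payload
FRAMES_PER_SECOND = 48000 / 1024     # 1024-sample frames at 48 kHz
PROGRAM_SECONDS = 3600               # one-hour program

# OAMD repeated in every audio frame of the audio bitstream.
per_frame_bps = oamd_overhead_bps(PAYLOAD_BYTES, FRAMES_PER_SECOND)

# OAMD sent once per program in the transport layer.
once_per_program_bps = oamd_overhead_bps(PAYLOAD_BYTES, 1 / PROGRAM_SECONDS)
```

Under these assumptions, repeating the payload in every frame costs 75 kbit/s, while sending it once per program costs well under 1 bit/s on average.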

Referring to FIG. 9A, encoding system 900A includes encoder 902 (e.g., an AAC encoder), format converter 905, and transport layer multiplexer 903. Format converter 905 further includes OAMD metadata generator 904. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data.

Native audio bitstream 901 (e.g., AAC/MP4) includes 22.2ch audio and metadata. The audio is input to encoder 902, which encodes the audio into the native audio format and outputs the encoded audio to transport layer multiplexer 903. OAMD metadata generator 904 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content, for example according to the principles described with reference to FIG. 1B. The channel shuffle information is input, together with the OAMD, to transport layer multiplexer 903. The output of transport layer multiplexer 903 is a transport bitstream (e.g., an MPEG-2 transport stream), a package file (e.g., an ISO BMFF file), or a media presentation description (e.g., an MPEG-DASH manifest) that includes the native audio bitstream.

Referring to FIG. 9B, decoding system 900B includes transport layer demultiplexer 906, decoder 907, audio channel shuffler 908, and OBA encoder 909. Transport layer demultiplexer 906 demultiplexes the audio, the channel shuffle information, and the OAMD from the transport bitstream. The demultiplexed audio is input as an audio bitstream to decoder 907 (e.g., an AAC decoder), which decodes it to recover (i.e., determine or extract) the native audio. The decoded audio is then input to audio channel shuffler 908 together with the channel shuffle information output by transport layer demultiplexer 906. Audio with reordered channels is output from audio channel shuffler 908 and input to OBA encoder 909 together with the OAMD. The output of OBA encoder 909 is an OBA bitstream.
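The data flow through decoding system 900B can be sketched in a few lines of Python. This is only a model of the ordering of the stages: the dictionary layout of the transport data, the function `source_device_convert`, and the toy decoder and encoder are all hypothetical stand-ins, not real codec or Dolby APIs.

```python
from typing import Any, Callable, Dict, List

Channels = List[List[float]]  # one list of samples per channel


def source_device_convert(transport: Dict[str, Any],
                          decode: Callable[[bytes], Channels],
                          oba_encode: Callable[[Channels, Dict[str, Any]], bytes]) -> bytes:
    """Hypothetical model of decoding system 900B."""
    # Transport layer demultiplexer 906: separate audio, shuffle info and OAMD.
    audio_bitstream = transport["audio"]
    shuffle = transport["channel_shuffle"]
    oamd = transport["oamd"]

    # Decoder 907: recover the channel-based audio.
    channels = decode(audio_bitstream)

    # Audio channel shuffler 908: output slot i takes input channel shuffle[i].
    reordered = [channels[src] for src in shuffle]

    # OBA encoder 909: pack the reordered audio together with the OAMD.
    return oba_encode(reordered, oamd)


def toy_decode(bitstream: bytes) -> Channels:
    """Toy decoder: treats each byte as a one-sample channel."""
    return [[float(b)] for b in bitstream]


def toy_oba_encode(channels: Channels, oamd: Dict[str, Any]) -> bytes:
    """Toy encoder: writes the first sample of each channel back as a byte."""
    return bytes(int(ch[0]) for ch in channels)


transport = {"audio": bytes([10, 20, 30]),
             "channel_shuffle": [2, 0, 1],
             "oamd": {"objects": 3}}
oba_bitstream = source_device_convert(transport, toy_decode, toy_oba_encode)
```

Passing the decoder and encoder in as callables keeps the sketch independent of any particular codec library.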

Referring to FIG. 9C, decoding system 900C includes OBA decoder 910 and object audio renderer 911. The OBA bitstream is input to OBA decoder 910, which outputs audio and OAMD that are input to object audio renderer 911. Object audio renderer 911 uses the OAMD to process the audio and adapt it to a particular loudspeaker layout.

<Transmitting Pre-Computed OAMD via the Transport Layer for Rendering in a Source Device>
FIGS. 10A and 10B are block diagrams of an exemplary system for converting a 22.2ch audio bitstream into audio objects and OAMD, according to an embodiment, in which the channel shuffle information and OAMD are embedded in the transport layer for rendering in a source device (e.g., an STB or AVR). The OAMD used to represent 22.2ch content is static for the duration of a program. For this reason, it is desirable to avoid sending the OAMD frequently, so that the data rate of the audio bitstream does not increase. This can be achieved by transmitting the static OAMD and channel shuffle information in the transport layer. Once received, the OAMD and channel shuffle information are used by the object audio renderer to render the content. An exemplary transport layer is the base media file format (BMFF) described in ISO/IEC 14496-12 (MPEG-4 Part 12), which defines a general structure for time-based multimedia files such as video and audio. In an embodiment, the OAMD is included in an MPEG-DASH manifest.

Referring to FIG. 10A, encoding system 1000A includes encoder 1001 (e.g., an AAC encoder), format converter 1002, and transport layer multiplexer 1004. Format converter 1002 further includes OAMD metadata generator 1003. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data.

Native audio bitstream 1005 (e.g., AAC/MP4) includes 22.2ch audio and metadata. The audio is input to encoder 1001, which encodes the audio into the native audio format and outputs the encoded audio to transport layer multiplexer 1004. OAMD metadata generator 1003 maps the 22.2ch metadata to OAMD, for example according to the principles described with reference to FIG. 1A, and generates channel shuffle information. The channel shuffle information describes the channel reordering of the 22.2ch content, for example according to the principles described with reference to FIG. 1B. The channel shuffle information is input, together with the OAMD, to transport layer multiplexer 1004. The output of transport layer multiplexer 1004 is a transport stream that includes the native audio bitstream.

Referring to FIG. 10B, decoding system 1000B includes transport layer demultiplexer 1006, decoder 1007, audio channel shuffler 1008, and object audio renderer 1009. Transport layer demultiplexer 1006 demultiplexes the audio and OAMD from the transport bitstream and inputs them to decoder 1007. Decoder 1007 decodes the audio and OAMD from the native audio bitstream. The decoded audio and OAMD are then input to object audio renderer 1009, which uses the OAMD to process the audio and adapt it to a particular loudspeaker layout.

<<Exemplary Processes>>
FIG. 11 is a flow diagram of a CBA-to-OBA conversion process 1100. Process 1100 can be implemented using the audio system architecture shown in FIG. 3. Process 1100 includes: receiving a bitstream including channel-based audio and metadata (1101); parsing, from the bitstream, signaling parameters indicating an OAMD representation (1102); converting the channel-based metadata to OAMD based on the signaled OAMD representation (1103); generating channel shuffle information based on the order constraints of the OAMD (1104); reordering the channels of the channel-based audio based on the channel shuffle information (1105); and rendering the reordered channel-based audio using the OAMD (1106). Steps 1103 and 1104 can be performed, for example, using the OAMD representations and the bed channel assignment/order shown in FIGS. 1A and 1B, respectively, and the audio system architecture shown in FIG. 3. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data.
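Steps 1104 and 1105 can be illustrated with a minimal Python sketch. It assumes that the OAMD order constraint can be expressed as a required sequence of channel labels. The labels, the `required_order` list, and both function names are hypothetical and do not reproduce the bed channel assignment of FIG. 1B.

```python
from typing import List, Sequence


def generate_channel_shuffle(input_labels: Sequence[str],
                             required_order: Sequence[str]) -> List[int]:
    """Return shuffle[i] = index in input_labels of the channel that must
    occupy output position i (an assumed form of channel shuffle information)."""
    if len(set(input_labels)) != len(input_labels):
        raise ValueError("channel labels must be unique")
    if sorted(input_labels) != sorted(required_order):
        raise ValueError("input and required channel sets differ")
    position = {label: idx for idx, label in enumerate(input_labels)}
    return [position[label] for label in required_order]


def reorder_channels(channels: Sequence, shuffle: Sequence[int]) -> list:
    """Apply the shuffle: output slot i takes input channel shuffle[i]."""
    return [channels[src] for src in shuffle]


# Illustrative six-channel example only; a 22.2ch layout has 24 channels.
input_labels = ["FL", "FR", "FC", "LFE1", "BL", "BR"]
required_order = ["FL", "FR", "FC", "BL", "BR", "LFE1"]

shuffle = generate_channel_shuffle(input_labels, required_order)
reordered = reorder_channels(input_labels, shuffle)
```

In this example the shuffle moves the LFE channel to the end, so `reordered` matches `required_order`. Because the shuffle is a plain index permutation, the same list can be applied to per-channel sample buffers instead of labels.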

FIG. 12 is a flow diagram of a CBA-to-OBA conversion process 1200. Process 1200 can be implemented using the audio system architecture shown in FIG. 4. Process 1200 includes: receiving a bitstream including channel-based audio and metadata (1201); parsing, from the bitstream, signaling parameters indicating an OAMD representation (1202); converting the channel-based metadata to OAMD based on the signaled OAMD representation (1203); generating channel shuffle information based on the order constraints of the OAMD (1204); reordering the channels of the channel-based audio based on the channel shuffle information (1205); and encoding the reordered channel-based audio and OAMD into an OBA bitstream for transmission to a playback device, where the audio is rendered by an object audio renderer using the OAMD (1206). Steps 1203 and 1205 can be performed, for example, using the OAMD representations and the bed channel assignment/order shown in FIGS. 1A and 1B, respectively, and the audio system architecture shown in FIG. 4. Some examples of OAMD metadata include, without limitation, content description metadata, property update metadata, and trim data.

FIG. 13 is a flow diagram of a CBA-to-OBA conversion process 1300. Process 1300 can be implemented using the audio system architecture shown in FIG. 5. Process 1300 includes: receiving a native audio bitstream including channel-based audio and metadata in a native audio format (1301); decoding the native audio bitstream to recover the channel-based audio and metadata (1302); parsing, from the bitstream, signaling parameters indicating an OAMD representation (1303); converting the channel-based metadata to OAMD based on the signaled OAMD representation (1304); generating channel shuffle information based on the order constraints of the OAMD (1305); reordering the channels of the channel-based audio based on the channel shuffle information (1306); and rendering the reordered channel-based audio using the OAMD (1307). Steps 1304 and 1305 can be performed, for example, using the OAMD representations and the bed channel assignment/order shown in FIGS. 1A and 1B, respectively, and the audio system architecture shown in FIG. 5.

FIG. 14 is a flow diagram of a CBA-to-OBA conversion process 1400. Process 1400 can be implemented using the audio system architecture shown in FIGS. 6A and 6B. Process 1400 includes: receiving a native audio bitstream including channel-based audio and metadata in a native bitstream format (1401); decoding the native audio bitstream to recover (i.e., determine or extract) the channel-based audio and metadata (1402); parsing, from the bitstream, signaling parameters indicating an OAMD representation (1403); converting the channel-based metadata to OAMD based on the signaled OAMD representation (1404); generating channel shuffle information based on the order constraints of the OAMD (1405); reordering the channels of the channel-based audio based on the channel shuffle information (1406); and encoding the reordered channel-based audio and OAMD into an OBA bitstream for transmission to a playback device, where the audio is rendered by an object audio renderer using the OAMD (1407). Steps 1404 and 1405 can be performed, for example, using the OAMD representations and the bed channel assignment/order shown in FIGS. 1A and 1B, respectively, and the audio system architecture shown in FIGS. 6A and 6B.

FIG. 15 is a flow diagram of a CBA-to-OBA conversion process 1500. Process 1500 can be implemented using the audio system architecture shown in FIGS. 7A-7C. Process 1500 begins by receiving a channel-based audio bitstream including channel-based audio and metadata (1501), and continues by: encoding the channel-based audio into a native audio bitstream (1502); parsing, from the channel-based metadata, signaling parameters indicating an OAMD representation (1503); converting the channel-based audio metadata to OAMD based on the signaled OAMD representation (1504); generating channel shuffle information based on the order constraints of the OAMD (1505); combining the native audio bitstream, the channel shuffle information, and the OAMD into a combined audio bitstream (1506); and including the combined audio bitstream in a transport layer bitstream for transmission to a playback device for rendering or to a source device (e.g., an STB or AVR) for rendering (1507). Details of the above steps have been described with reference to FIGS. 1A, 1B, 7A, 7C, 8A, 8B, 9A-9C, 10A and 10B.

FIG. 16 is a flow diagram of a CBA-to-OBA conversion process 1600. Process 1600 can be implemented using the audio system architectures shown in FIGS. 8A, 8B, 9A-9C, 10A and 10B. Process 1600 begins by receiving a transport layer bitstream including a native audio bitstream and metadata (1601), and continues by: extracting the native audio bitstream and metadata, the channel shuffle information, and the OAMD from the transport bitstream (1602); decoding the native audio bitstream to recover (i.e., determine or extract) the channel-based audio (1603); reordering the channels of the channel-based audio using the channel shuffle information (1604); and then either optionally encoding the reordered channel-based audio and OAMD into an OBA bitstream for transmission to a playback device or source device (1605), or optionally decoding the OBA bitstream to recover the reordered channel-based audio and OAMD (1606) and rendering the reordered channel-based audio using the OAMD (1607) for transmission to a playback device. Details of the above steps have been described with reference to FIGS. 8A, 8B, 9A-9C, 10A and 10B.

<Transmitting Pre-Computed OAMD in an MPEG-4 Audio or MPEG-D Audio Bitstream>
In an embodiment, the OAMD representing 22.2 content is carried in a native audio bitstream, such as an MPEG-4 Audio (ISO/IEC 14496-3) bitstream. Exemplary syntax for three embodiments is provided below.

In the exemplary syntax above, the element element_instance_tag is a number that identifies the data stream element, and the element extension_payload(int) may be contained within a fill_element (ID_FIL). Each of the three syntax embodiments above uses a "tag" or "extension_type" to indicate the meaning of the additional data. In an embodiment, a signal can be inserted into the bitstream to indicate that additional OAMD and channel shuffle information is present in one of the three extension regions of the bitstream, which spares the decoder from having to check those regions of the bitstream. For example, the MPEG4_ancillary_data field includes a dolby_surround_mode field with the following semantics. Similar signaling syntax can be used to indicate to the decoder that OAMD is present in the bitstream.

In an embodiment, the reserved field in the table above is used to indicate that a pre-computed OAMD payload is embedded somewhere in the extension data of the bitstream. The reserved value (dolby_surround_mode = "11") is used to indicate to the decoder that the extension data fields contain the OAMD and channel information required to convert 22.2 to OBA (e.g., Dolby® Atmos®). Alternatively, the reserved field indicates that the content is OBA compatible (e.g., Dolby® Atmos® compatible) and that conversion of the 22.2ch content to OBA is possible. Thus, if the dolby_surround_mode signal is set to the reserved value "11", the decoder knows that the content is OBA compatible and converts the 22.2ch content to OBA for further encoding and/or rendering.
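The signaling check described above can be sketched in Python as follows. The sketch assumes the dolby_surround_mode field has already been parsed into a two-character bit string and that extension payloads are available as dictionaries. The functions `oamd_signaled` and `find_oamd_payload` and the `"OAMD"` value of `extension_type` are hypothetical and are not taken from any bitstream specification.

```python
from typing import Any, Dict, Iterable, Optional

RESERVED_OAMD_SIGNAL = "11"  # reserved dolby_surround_mode value used as the flag


def oamd_signaled(dolby_surround_mode: str) -> bool:
    """Return True when the reserved value signals embedded OAMD."""
    if len(dolby_surround_mode) != 2 or set(dolby_surround_mode) - {"0", "1"}:
        raise ValueError("expected a 2-bit string such as '11'")
    return dolby_surround_mode == RESERVED_OAMD_SIGNAL


def find_oamd_payload(dolby_surround_mode: str,
                      extension_payloads: Iterable[Dict[str, Any]]
                      ) -> Optional[Dict[str, Any]]:
    """Scan the extension regions only when the signal says OAMD is present."""
    if not oamd_signaled(dolby_surround_mode):
        return None  # skip the scan entirely
    for payload in extension_payloads:
        if payload.get("extension_type") == "OAMD":  # hypothetical tag value
            return payload
    return None


payloads = [{"extension_type": "FILL"},
            {"extension_type": "OAMD", "channel_shuffle": [1, 0]}]
found = find_oamd_payload("11", payloads)
skipped = find_oamd_payload("00", payloads)
```

The early return is the point of the signal: without the flag, a decoder would have to inspect each extension region of every stream.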

In an embodiment, the OAMD representing 22.2 content is carried in a native audio bitstream, such as an MPEG-D USAC (ISO/IEC 23003-3) audio bitstream. Exemplary syntax for such an embodiment is provided below.

<<Exemplary Audio System Architecture>>
FIG. 17 is a block diagram of an exemplary audio system architecture that includes channel audio to object audio conversion, according to an embodiment. In this example, the architecture is for an STB or AVR. STB/AVR 1700 includes input 1701, analog-to-digital converter (ADC) 1702, demodulator 1703, synchronizer/decoder 1704, MPEG demultiplexer 1707, MPEG decoder 1706, memory 1709, control processor 1710, audio channel shuffler 1705, OBA encoder 1711, and video encoder 1712. In this example, STB/AVR 1700 implements the applications described with reference to FIGS. 9A-9C, 10A and 10B, where the pre-computed OAMD is carried in an MPEG-4 audio bitstream.

In an embodiment, a low-noise block collects radio waves from a satellite dish and converts them to an analog signal, which is sent over a coaxial cable to input port 1701 of STB/AVR 1700. The analog signal is converted to a digital signal by ADC 1702. The digital signal is demodulated by demodulator 1703 (e.g., a QPSK demodulator), then synchronized and decoded by synchronizer/decoder 1704 (e.g., a synchronizer and Viterbi decoder) to recover the MPEG transport bitstream. The MPEG transport bitstream is demultiplexed by MPEG demultiplexer 1707 and decoded by MPEG decoder 1706 to recover the channel-based audio and video bitstreams and the metadata, which includes the channel shuffle information and OAMD. Audio channel shuffler 1705 reorders the audio channels according to the channel shuffle information, for example according to the principles described with reference to FIG. 1B. OBA encoder 1711 encodes the audio with reordered channels into an OBA audio bitstream (e.g., Dolby® MAT), which is transmitted to a playback device (e.g., a Dolby® Atmos® device) to be rendered by an object audio renderer in the playback device. Video encoder 1712 encodes the video into a video format supported by the playback device.

Note that the architecture described with reference to FIG. 17 is only an exemplary architecture. The conversion from CBA to OBA can be performed by any device that includes one or more processors, memory, appropriate input/output interfaces, and software modules and/or hardware (e.g., ASICs) for performing the format conversion and channel reordering described herein.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features that are specific to particular implementations. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be removed from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination. The logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other implementations are within the scope of the following claims.

Claims

1. A method comprising:
receiving, by one or more processors of an audio processing device, a bitstream containing channel-based audio and associated channel-based audio metadata; and
by the one or more processors:
parsing signaling parameters from the channel-based audio metadata, the signaling parameters indicating one of a plurality of different object audio metadata (OAMD) representations, each of the OAMD representations mapping one or more audio channels of the channel-based audio to one or more audio objects;
converting the channel-based audio metadata into OAMD associated with the one or more audio objects using the OAMD representation indicated by the signaling parameters;
generating channel shuffle information based on a channel order constraint of the OAMD;
reordering one or more audio channels of the channel-based audio based on the channel shuffle information to produce reordered channel-based audio; and
rendering the reordered channel-based audio into rendered audio using the OAMD; or
encoding the reordered channel-based audio and the OAMD into an object-based audio bitstream, and transmitting the object-based audio bitstream to a playback device or source device.

2. The method of claim 1, wherein the bitstream is a native audio bitstream, the method further comprising decoding the native audio bitstream to determine the channel-based audio and metadata.

3. The method of claim 2, wherein the native audio bitstream is an Advanced Audio Coding (AAC) bitstream.

4. The method, wherein the channel-based audio and the associated channel-based audio metadata are, respectively, N.M channel-based audio and channel-based audio metadata associated with the N.M channel-based audio, N is a positive integer greater than 9, and M is a positive integer greater than or equal to 0.

5. The method of claim 4, wherein the channel-based audio is 22.2.

6. A method comprising:
receiving, by one or more processors of an audio processing device, a bitstream containing channel-based audio and associated channel-based audio metadata;
using the one or more processors:
encoding the channel-based audio into a native audio bitstream;
parsing signaling parameters from the channel-based audio metadata, the signaling parameters indicating one of a plurality of different Object Audio Metadata (OAMD) representations, each of the OAMD representations mapping one or more audio channels of the channel-based audio to one or more audio objects;
converting the channel-based audio metadata into OAMD associated with the one or more audio objects using the OAMD representation indicated by the signaling parameters;
generating channel shuffle information based on a channel order constraint of the OAMD;
generating a bitstream package including the native audio bitstream, the channel shuffle information, and the OAMD, wherein the channel shuffle information enables one or more audio channels of the channel-based audio to be reordered at a playback device or source device to generate reordered channel-based audio;
multiplexing the bitstream package into a transport layer bitstream; and
transmitting the transport layer bitstream to the playback device or the source device.
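
For illustration only, and not as part of the claims, the packaging steps of the method above might be sketched as follows. The representation table, the `oamd_representation` and `channel_labels` metadata fields, and the `BitstreamPackage` type are hypothetical names chosen for this sketch, and a six-channel layout stands in for 22.2 to keep it short. The sketch assumes the channel order constraint can be expressed as a required list of channel labels.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical OAMD representations, keyed by the signaling parameter value.
# Each entry lists channel labels in the order the OAMD is assumed to require.
OAMD_REPRESENTATIONS: Dict[int, List[str]] = {
    0: ["L", "R", "C", "LFE", "Ls", "Rs"],
    1: ["C", "L", "R", "Ls", "Rs", "LFE"],
}


@dataclass
class BitstreamPackage:
    native_audio: bytes          # e.g. an encoded native audio payload
    channel_shuffle: List[int]   # source index for each output slot
    oamd: List[dict]             # one (simplified) metadata entry per object


def build_package(native_audio: bytes, cba_metadata: dict) -> BitstreamPackage:
    # Parse the signaling parameter that selects the OAMD representation.
    rep_id = cba_metadata["oamd_representation"]
    required_order = OAMD_REPRESENTATIONS[rep_id]
    source_order = cba_metadata["channel_labels"]

    # Channel shuffle information: shuffle[i] is the index, in the source
    # channel order, of the channel that must occupy output slot i.
    shuffle = [source_order.index(label) for label in required_order]

    # Simplified stand-in for OAMD: one object per channel, in required order.
    oamd = [{"object_id": i, "label": label}
            for i, label in enumerate(required_order)]
    return BitstreamPackage(native_audio, shuffle, oamd)
```

In this sketch the audio itself is not reordered on the sending side; only the permutation is carried in the package, so the receiving device can apply it after decoding.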

7. The method of claim 6, wherein the native audio bitstream is an Advanced Audio Coding (AAC) bitstream.

8. The method of claim 6 or 7, wherein the channel-based audio and the associated channel-based audio metadata are, respectively, N.M channel-based audio and channel-based audio metadata associated with the N.M channel-based audio, N is a positive integer greater than 7, and M is a positive integer greater than or equal to 0.

9. The method of claim 8, wherein the channel-based audio is 22.2.

10. A method comprising:
receiving, by one or more processors of an audio processing device, a transport layer bitstream comprising a bitstream package, the bitstream package including a native audio bitstream of encoded channel-based audio, channel shuffle information, and object audio metadata (OAMD);
using the one or more processors:
demultiplexing the transport layer bitstream to determine the bitstream package;
decoding the bitstream package to determine the channel-based audio, the channel shuffle information, and the OAMD;
reordering audio channels of the channel-based audio based on the channel shuffle information to generate reordered channel-based audio; and
using the OAMD to render the reordered channel-based audio into rendered audio; or
encoding the reordered channel-based audio and the OAMD into an object-based audio bitstream and transmitting the object-based audio bitstream to a source device.
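
For illustration only, and not as part of the claims, the reordering step of the method above might look like the following sketch. It assumes the channel shuffle information is a permutation list in which entry i names the decoded channel that moves to output slot i; the claims do not fix that encoding, and the function name `reorder_channels` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reorder_channels(channels: Sequence[T], shuffle: Sequence[int]) -> List[T]:
    """Return the channels in the order given by the shuffle permutation.

    shuffle[i] is assumed to be the index of the decoded channel that
    should occupy output slot i.
    """
    # Reject anything that is not a permutation of the channel indices, so
    # that no channel is dropped or duplicated before rendering.
    if sorted(shuffle) != list(range(len(channels))):
        raise ValueError("shuffle must be a permutation of channel indices")
    return [channels[i] for i in shuffle]


# Usage: three decoded channels, reordered so that the center channel is first.
decoded = ["L", "R", "C"]
reordered = reorder_channels(decoded, [2, 0, 1])
```

The reordered list can then be handed, together with the OAMD, either to a renderer or to an object-based audio encoder, matching the two alternatives recited above.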

11. The method of claim 10, wherein the native audio bitstream is an Advanced Audio Coding (AAC) bitstream.

12. The method of claim 10 or 11, wherein the channel-based audio is N.M channel-based audio, N is a positive integer greater than 7, and M is a positive integer greater than or equal to 0.

13. The method of claim 12, wherein the channel-based audio is 22.2.

14. An apparatus comprising:
one or more processors; and
a non-transitory computer-readable storage medium storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1 to 13.

15. A non-transitory computer-readable storage medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the method of any of claims 1 to 13.