RU2017134913A

RU2017134913A - EFFECTIVE ENCODING OF AUDIO SCENES CONTAINING AUDIO OBJECTS

Info

Publication number: RU2017134913A
Application number: RU2017134913A
Authority: RU
Inventors: Heiko PURNHAGEN; Kristofer KJOERLING; Toni HIRVONEN; Lars VILLEMOES; Dirk Jeroen BREEBAART
Original assignee: Dolby International AB
Priority date: 2013-05-24
Filing date: 2014-05-23
Publication date: 2019-02-08
Also published as: KR20160003039A; BR112015029113B1; CN109410964A; KR102033304B1; EP3312835B1; JP2016525699A; US20220189493A1; EP3005353A1; CN105229733B; RU2017134913A3; CN109712630B; CN105229733A; US9852735B2; US11705139B2; CN110085240A; WO2014187991A1; ES2643789T3; RU2745832C2; JP2017199034A; HK1214027A1

Claims

1. A method of reconstructing and rendering audio objects based on a data stream, comprising:

receiving a data stream containing:

a backward-compatible downmix containing M downmix signals, which are combinations of N audio objects, where N > 1 and M ≤ N,

time-varying side information containing parameters that allow the N audio objects to be reconstructed from the M downmix signals, and

a plurality of metadata instances associated with the N audio objects, wherein the metadata instances specify respective desired rendering settings for rendering the N audio objects, and, for each metadata instance, transition data containing the start time and duration of the interpolation from the current rendering setting to the desired rendering setting specified by the metadata instance;

reconstructing the N audio objects based on the backward-compatible downmix and the side information; and

rendering the N audio objects to output channels with a predetermined channel configuration by:

performing rendering in accordance with the current rendering setting;

starting, at the start time defined by the transition data for the metadata instance, an interpolation from the current rendering setting to the desired rendering setting specified by the metadata instance; and

completing the interpolation to the desired rendering setting after the duration defined by the transition data for the metadata instance.
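
The rendering steps of claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: it assumes that a rendering setting can be written as a gain matrix (one row per output channel, one column per audio object) and that the interpolation is linear, which the claims only require in claim 5. The names `rendering_setting_at` and `render` are hypothetical.

```python
from typing import List

# Assumption: a rendering setting is a gain matrix,
# rows = output channels, columns = audio objects.
Matrix = List[List[float]]


def rendering_setting_at(t: float, current: Matrix, desired: Matrix,
                         start_time: float, duration: float) -> Matrix:
    """Return the rendering setting in effect at time t.

    start_time and duration play the role of the transition data
    carried with a metadata instance.
    """
    if t <= start_time:
        return current                      # interpolation has not started
    if duration <= 0 or t >= start_time + duration:
        return desired                      # interpolation is complete
    a = (t - start_time) / duration         # progress in (0, 1)
    return [[(1 - a) * c + a * d for c, d in zip(c_row, d_row)]
            for c_row, d_row in zip(current, desired)]


def render(object_samples: List[float], setting: Matrix) -> List[float]:
    """Mix one sample per audio object into the output channels."""
    return [sum(g * x for g, x in zip(row, object_samples)) for row in setting]
```

For example, with two objects and two output channels, a transition that starts at t = 1.0 and lasts 2.0 leaves the current setting untouched at t = 0.5, gives an equal blend of both settings at t = 2.0, and reaches the desired setting from t = 3.0 onward.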

2. The method according to claim 1, characterized in that the metadata instances associated with the N audio objects contain information about the spatial position of the audio objects.

3. The method according to claim 2, characterized in that the metadata instances associated with the N audio objects further comprise one or more of: object size, object volume, object significance, object content type, and zone masks.

4. The method according to any one of the preceding claims, characterized in that the start times associated with the plurality of metadata instances correspond to time events related to the audio content, such as frame boundaries.

5. The method according to any one of the preceding claims, characterized in that the interpolation from the current rendering setting to the desired rendering setting is a linear interpolation.

6. The method according to any one of the preceding claims, characterized in that the data stream contains a plurality of side information instances specifying respective desired reconstruction settings for reconstructing the N audio objects, and, for each side information instance, transition data containing two independently assignable parts that in combination define the point in time at which to start the interpolation from the current reconstruction setting to the desired reconstruction setting specified by the side information instance and the point in time at which to complete the interpolation, and wherein reconstructing the N audio objects includes:

performing reconstruction in accordance with the current reconstruction setting;

starting, at the point in time defined by the transition data for the side information instance, an interpolation from the current reconstruction setting to the desired reconstruction setting specified by the side information instance; and

completing the interpolation at the point in time defined by the transition data for the side information instance.
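
As a rough illustration of claim 6, the sketch below reconstructs N object samples from M downmix samples while the reconstruction parameters move from one setting to the next. Several things here are assumptions rather than claim content: the setting is modeled as an N x M matrix, the two independently assignable parts of the transition data are taken to be a start time and an end time, the blend is linear, and the function names are hypothetical.

```python
from typing import List, Sequence

# Assumption: a reconstruction setting is an N x M matrix,
# rows = audio objects, columns = downmix signals.
Matrix = List[List[float]]


def reconstruction_setting_at(t: float, current: Matrix, desired: Matrix,
                              t_start: float, t_end: float) -> Matrix:
    """Blend two reconstruction settings between t_start and t_end."""
    if t <= t_start:
        return current
    if t >= t_end:
        return desired
    # Here t_start < t < t_end, so the denominator is strictly positive.
    a = (t - t_start) / (t_end - t_start)
    return [[(1 - a) * c + a * d for c, d in zip(c_row, d_row)]
            for c_row, d_row in zip(current, desired)]


def reconstruct(downmix_samples: Sequence[float], setting: Matrix) -> List[float]:
    """Approximate one sample per audio object from M downmix samples."""
    return [sum(c * y for c, y in zip(row, downmix_samples)) for row in setting]
```

Because the start and the end of the transition are given separately, either one can be moved (for instance onto a frame boundary) without touching the other.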

7. A system for reconstructing and rendering audio objects based on a data stream, comprising:

a receiving component configured to receive a data stream comprising:

a reconstruction component configured to reconstruct the N audio objects based on the backward-compatible downmix and the side information,

a rendering component configured to render the N audio objects to output channels with a predetermined channel configuration by:

performing rendering in accordance with the current rendering setting.

8. A data structure for representing metadata associated with N audio objects, comprising:

a plurality of metadata instances specifying respective desired rendering settings for rendering the N audio objects, and

transition data associated with each metadata instance, wherein the transition data contains the start time and duration of the interpolation from the current rendering setting to the desired rendering setting specified by the metadata instance.
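
One possible in-memory layout for such a data structure is sketched below. It is illustrative only: the class and field names are hypothetical, and the desired rendering setting is assumed to be a set of per-object spatial positions (as in claim 2), which claim 8 itself does not require.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Position = Tuple[float, float, float]  # hypothetical (x, y, z) per audio object


@dataclass(frozen=True)
class TransitionData:
    start_time: float  # when the interpolation toward this instance begins
    duration: float    # how long the interpolation lasts

    @property
    def end_time(self) -> float:
        return self.start_time + self.duration


@dataclass(frozen=True)
class MetadataInstance:
    positions: Tuple[Position, ...]  # assumed form of the desired setting
    transition: TransitionData


@dataclass
class ObjectMetadata:
    num_objects: int
    instances: List[MetadataInstance] = field(default_factory=list)

    def add(self, instance: MetadataInstance) -> None:
        """Store an instance, keeping the list ordered by start time."""
        if len(instance.positions) != self.num_objects:
            raise ValueError("one position per audio object is required")
        if instance.transition.duration < 0:
            raise ValueError("duration must be non-negative")
        self.instances.append(instance)
        self.instances.sort(key=lambda i: i.transition.start_time)
```

Keeping the transition data next to each instance means a decoder can tell, from one instance alone, when to begin moving toward its setting and when that move is finished.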

9. A method of encoding audio objects as a data stream, comprising:

receiving N audio objects, where N > 1, and time-varying metadata associated with the N audio objects describing how the N audio objects are to be rendered for playback on the decoder side;

calculating a backward-compatible downmix containing M downmix signals, where M ≤ N, by forming combinations of the N audio objects;

calculating time-varying side information containing parameters that allow the N audio objects to be reconstructed from the M downmix signals;

including the backward-compatible downmix and the side information in the data stream for transmission to the decoder; and

additionally including in the data stream:

for each metadata instance, transition data containing the start time and duration of the interpolation from the current rendering setting to the desired rendering setting specified by the metadata instance.
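
The encoder-side steps of claim 9 can be sketched as follows, under stated assumptions. The downmix is modeled as a weighted sum with an M x N weight matrix, the side information is passed in rather than computed (the claim does not say how it is derived), and the dictionary layout returned by `build_stream` is purely hypothetical and is not a real bitstream format. Requiring M ≥ 1 is also an assumption; the claim only states M ≤ N.

```python
from typing import Dict, List, Sequence

Signal = List[float]


def downmix(objects: Sequence[Signal],
            weights: Sequence[Sequence[float]]) -> List[Signal]:
    """Form M downmix signals as weighted combinations of N audio objects."""
    n, m = len(objects), len(weights)
    if n <= 1 or not 1 <= m <= n:
        raise ValueError("requires N > 1 and 1 <= M <= N")
    if any(len(row) != n for row in weights):
        raise ValueError("each downmix row needs one weight per object")
    length = len(objects[0])
    if any(len(obj) != length for obj in objects):
        raise ValueError("all objects must have the same number of samples")
    return [[sum(w * obj[k] for w, obj in zip(row, objects))
             for k in range(length)]
            for row in weights]


def build_stream(downmix_signals: List[Signal], side_info: object,
                 metadata_instances: Sequence[Dict[str, object]]) -> Dict[str, object]:
    """Bundle downmix, side information, and metadata with transition data."""
    for inst in metadata_instances:
        if "start_time" not in inst or "duration" not in inst:
            raise ValueError("each metadata instance needs transition data")
    return {"downmix": downmix_signals,
            "side_info": side_info,
            "metadata": list(metadata_instances)}
```

A legacy decoder would only need the `downmix` entry, which is what makes the downmix backward compatible; an object-aware decoder would additionally use the side information and the metadata.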

10. The method according to claim 9, characterized in that the metadata associated with the N audio objects contain information about the spatial position of the audio objects.

11. The method according to claim 10, characterized in that the metadata associated with the N audio objects further comprise one or more of: object size, object volume, object significance, object content type, and zone masks.

12. The method according to any one of claims 9-11, characterized in that the interpolation from the current rendering setting to the desired rendering setting is a linear interpolation.

13. The method according to any one of claims 9-12, characterized in that it further includes:

including in the data stream:

a plurality of side information instances specifying respective desired reconstruction settings for reconstructing the audio objects, and

for each side information instance, transition data containing two independently assignable parts that in combination define the point in time at which to start the transition from the current reconstruction setting to the desired reconstruction setting specified by the side information instance, and the point in time at which to complete the transition.

14. An encoder for encoding audio objects as a data stream, comprising:

a receiver configured to receive N audio objects, where N > 1, and time-varying metadata associated with the N audio objects describing how the N audio objects are to be rendered for playback on the decoder side;

a downmix component configured to calculate M downmix signals, where M ≤ N, by forming combinations of the N audio objects;

an analysis component configured to calculate time-varying side information containing parameters that allow the N audio objects to be reconstructed from the M downmix signals;

a multiplexing component configured to include the backward-compatible downmix and the side information in the data stream for transmission to the decoder; and

wherein the multiplexing component is further configured to include in the data stream:

15. A machine-readable medium storing a computer program product containing instructions for performing the method according to any one of claims 1-6 or instructions for performing the method according to any one of claims 9-13.