RU2023121473A

RU2023121473A - METHODS AND DEVICES FOR ENCODING AND/OR DECODING IMMERSIVE AUDIO SIGNALS

Info

Publication number: RU2023121473A
Application number: RU2023121473A
Authority: RU
Inventors: David S. McGrath; Michael Eckert; Heiko Purnhagen; Stefan Bruhn
Original assignee: Dolby Laboratories Licensing Corporation; Dolby International AB
Priority date: 2018-07-02
Filing date: 2019-07-02
Publication date: 2023-09-01

Claims

1. A method for determining a reconstructed multi-channel signal from encoded audio data indicating a plurality of reconstructed channel signals and from encoded metadata indicating merged encoding metadata, the method comprising the steps of:

decoding the encoded audio data to provide the plurality of reconstructed channel signals, and decoding the encoded metadata to provide the merged encoding metadata; and

determining the reconstructed multi-channel signal from the plurality of reconstructed channel signals using the merged encoding metadata.

2. The method according to claim 1, wherein the plurality of reconstructed channel signals is a first-order ambisonics signal, namely in B-format or in A-format.

3. The method according to claim 1, in which the merged encoding metadata contains:

upmix data, namely an upmix matrix for upmixing the plurality of reconstructed channel signals into the reconstructed multi-channel signal; and/or

decorrelation data enabling generation of a reconstructed multi-channel signal having a predetermined covariance.
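As an illustration only, the two kinds of metadata in claim 3 can be sketched as a matrix upmix plus a decorrelated contribution. The function names, the matrix layout (one row per output channel) and the delay-based decorrelator are assumptions made for this sketch; the claims do not specify a decorrelator design.

```python
def mat_apply(matrix, signals):
    """Multiply an N x M matrix by M equal-length signals, giving N signals."""
    n = len(signals[0])
    return [[sum(c * s[t] for c, s in zip(row, signals)) for t in range(n)]
            for row in matrix]


def delay(signal, d):
    """Toy decorrelator (assumption): delay by d samples, zero-padded."""
    return ([0.0] * d + list(signal))[:len(signal)]


def reconstruct(upmix_matrix, decorr_matrix, channels):
    """Output = upmix_matrix * channels + decorr_matrix * decorrelated(channels)."""
    dry = mat_apply(upmix_matrix, channels)
    # Use a different delay per channel so the decorrelated signals differ.
    decorrelated = [delay(ch, i + 1) for i, ch in enumerate(channels)]
    wet = mat_apply(decorr_matrix, decorrelated)
    return [[d + w for d, w in zip(dry_row, wet_row)]
            for dry_row, wet_row in zip(dry, wet)]


# Two reconstructed channel signals upmixed to three output channels.
channels = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
upmix_matrix = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decorr_matrix = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]
output = reconstruct(upmix_matrix, decorr_matrix, channels)
```

In this sketch the decorrelation matrix only feeds the third output channel, which is the channel that is not a plain copy of a transmitted one.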

4. The method of claim 1, wherein the merged encoding metadata comprises different metadata for different subbands of the reconstructed multi-channel signal.
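A minimal sketch of per-subband metadata, assuming that each subband carries its own upmix matrix and that the channel signals are already available as one coefficient per channel and subband. The names and the data layout are hypothetical and chosen only for illustration.

```python
def upmix_per_band(band_matrices, band_coeffs):
    """Apply a separate N x M matrix to the M channel coefficients of each subband.

    band_matrices[b] is the matrix for subband b (assumed layout);
    band_coeffs[b] holds one coefficient per reconstructed channel signal.
    """
    out = []
    for matrix, coeffs in zip(band_matrices, band_coeffs):
        out.append([sum(c * x for c, x in zip(row, coeffs)) for row in matrix])
    return out


# One transmitted channel, two output channels, two subbands.
band_matrices = [
    [[1.0], [0.5]],  # subband 0: second output gets half of the channel
    [[1.0], [0.0]],  # subband 1: second output gets nothing
]
band_coeffs = [[2.0], [4.0]]
upmixed = upmix_per_band(band_matrices, band_coeffs)
```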

5. The method of claim 1, wherein decoding the encoded audio data comprises waveform decoding of each of the plurality of reconstructed channel signals, in particular using a mono decoder for each reconstructed channel signal.

6. The method of claim 1, wherein the encoded metadata is decoded using an entropy decoder.

7. The method according to claim 1, in which:

the reconstructed multi-channel signal contains one or more reconstructed object signals of one or more audio objects; and

the method comprises decoding, in particular using an entropy decoder, object metadata for one or more audio objects from the encoded metadata.

8. The method according to claim 1, in which:

the plurality of reconstructed channel signals forms a sound field representation signal, referred to as an "SR" signal, namely a K-th order ambisonics signal, with K≥1;

the reconstructed multi-channel signal is determined by upmixing the plurality of reconstructed channel signals using the merged encoding metadata; and

the reconstructed multi-channel signal comprises a reconstructed SR signal, namely an L-th order ambisonics signal, with L≥K, and one or more reconstructed object signals of one or more audio objects.

9. The method according to claim 1, in which:

the merged encoding metadata is configured to perform an inverse of an energy compaction operation on the plurality of reconstructed channel signals; and/or

the merged encoding metadata is configured to perform an inverse prediction operation on at least some of the plurality of reconstructed channel signals; and/or

the merged encoding metadata is configured to perform an inverse of a Karhunen-Loeve transform, a principal component analysis transform, and/or a singular value decomposition transform on at least some of the plurality of reconstructed channel signals.
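One way to picture the inverse prediction operation is the following sketch. It assumes, purely for illustration, that an encoder predicted a secondary channel from a primary channel with a single coefficient and transmitted the residual; the function names and the single-coefficient predictor are not taken from the claims.

```python
def predict_encode(primary, secondary, coeff):
    """Hypothetical encoder side: residual = secondary - coeff * primary."""
    return [s - coeff * p for p, s in zip(primary, secondary)]


def predict_decode(primary, residual, coeff):
    """Inverse prediction: secondary = residual + coeff * primary."""
    return [r + coeff * p for p, r in zip(primary, residual)]


primary = [1.0, 2.0, 3.0]
secondary = [0.5, 1.5, 1.0]
coeff = 0.5  # would be carried in the metadata in this sketch

residual = predict_encode(primary, secondary, coeff)
restored = predict_decode(primary, residual, coeff)
```

When the prediction is good, the residual carries little energy, which is the sense in which prediction redistributes energy between channels.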

10. The method according to claim 1, in which:

the method comprises the step of determining that the reconstructed multi-channel signal is to be determined using a second mode;

in the second mode, the merged encoding metadata comprises prediction data and/or transform data configured to redistribute energy between the various reconstructed channel signals;

in the second mode, determining the reconstructed multi-channel signal comprises redistributing energy among the various reconstructed channel signals using the prediction data and/or the transform data; and

in the second mode, the reconstructed multi-channel signal contains a number of channels identical to the number of channels of the plurality of reconstructed channel signals.
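The channel-count constraint of the second mode can be sketched as follows. This assumes that both modes are realised by applying a matrix to the reconstructed channel signals; the function name and the error handling are hypothetical.

```python
def reconstruct_frame(channels, matrix, second_mode):
    """Apply a metadata-derived matrix to the reconstructed channel signals.

    In the second mode the matrix must produce as many output channels as
    there are input channels (assumed check, for illustration only).
    """
    if second_mode and len(matrix) != len(channels):
        raise ValueError("second mode keeps the channel count unchanged")
    n = len(channels[0])
    return [[sum(c * ch[t] for c, ch in zip(row, channels)) for t in range(n)]
            for row in matrix]


channels = [[1.0, 2.0], [3.0, 4.0]]
# Second mode: a 2 x 2 matrix redistributes energy between two channels.
same_count = reconstruct_frame(channels, [[1.0, 1.0], [1.0, -1.0]], second_mode=True)
# Other mode: a 3 x 2 matrix upmixes two channels to three.
upmixed = reconstruct_frame(
    channels, [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], second_mode=False
)
```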

11. The method of claim 10, wherein the transform data indicates the inverse of a Karhunen-Loeve transform, a principal component analysis transform, and/or a singular value decomposition transform that is to be applied to at least some of the plurality of reconstructed channel signals to determine the reconstructed multi-channel signal.
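For an orthonormal transform such as a Karhunen-Loeve or principal component basis, the inverse is the transpose of the forward matrix. The sketch below assumes the forward transform is y = V x with orthonormal rows and uses a rotation as a stand-in for a two-channel basis; how V would be signalled in the metadata is not specified here.

```python
import math


def mat_apply(matrix, signals):
    """Multiply a matrix by a list of equal-length channel signals."""
    n = len(signals[0])
    return [[sum(c * s[t] for c, s in zip(row, signals)) for t in range(n)]
            for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


theta = math.pi / 6  # illustrative rotation angle
V = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]

x = [[1.0, 0.0, -2.0], [0.5, 3.0, 1.0]]   # original channel signals
y = mat_apply(V, x)                        # hypothetical encoder-side transform
x_rec = mat_apply(transpose(V), y)         # decoder-side inverse transform
```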

12. The method according to claim 10, in which:

the reconstructed multi-channel input signal contains a sequence of frames; and

the method comprises determining, for each frame of the sequence of frames, whether or not the second mode should be used.

13. The method according to claim 10, comprising the steps of:

extracting the encoded audio data and the encoded metadata from a bitstream; and

extracting from the bitstream an indicator that indicates whether or not the second mode should be used.
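The extraction steps of claim 13 can be illustrated with an invented frame layout. The layout below (one flag byte, a two-byte big-endian audio length, the audio payload, then the metadata payload) is purely hypothetical and is not the bitstream syntax of any actual codec.

```python
import struct


def parse_frame(frame: bytes):
    """Split a frame into (second_mode, encoded_audio, encoded_metadata).

    Hypothetical layout: 1 flag byte (bit 0 = second-mode indicator),
    2-byte big-endian audio length, audio payload, metadata payload.
    """
    flags, audio_len = struct.unpack_from(">BH", frame, 0)
    second_mode = bool(flags & 0x01)
    audio = frame[3:3 + audio_len]
    metadata = frame[3 + audio_len:]
    return second_mode, audio, metadata


# Build an example frame with the indicator set.
example = bytes([0x01]) + struct.pack(">H", 2) + b"\xaa\xbb" + b"\x01\x02\x03"
second_mode, audio, metadata = parse_frame(example)
```

Because the indicator sits in every frame of this layout, a decoder could choose the mode frame by frame.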

14. The method according to claim 10, comprising the step of rendering the reconstructed multi-channel signal.

15. A decoding unit for determining a reconstructed multi-channel signal from encoded audio data indicating a plurality of reconstructed channel signals and from encoded metadata indicating merged encoding metadata, wherein the decoding unit is configured to:

decode the encoded audio data to provide a plurality of reconstructed channel signals;

decode the encoded metadata to provide the merged encoding metadata; and

determine the reconstructed multi-channel signal from the plurality of reconstructed channel signals using the merged encoding metadata.