EP2934025A1 - Method and device for applying dynamic range compression to a higher order ambisonics signal - Google Patents

Publication number
EP2934025A1
EP2934025A1 EP14305559.8A EP14305559A EP2934025A1 EP 2934025 A1 EP2934025 A1 EP 2934025A1 EP 14305559 A EP14305559 A EP 14305559A EP 2934025 A1 EP2934025 A1 EP 2934025A1
Authority
EP
European Patent Office
Prior art keywords
hoa
hoa signal
gain
signal
simplified mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP14305559.8A
Other languages
German (de)
French (fr)
Inventor
Johannes Boehm
Florian Keiler
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Thomson Licensing SAS
Original Assignee
Thomson Licensing SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing SAS
Priority to EP14305559.8A (publication EP2934025A1)
Priority to CN201811253721.8A (publication CN108962266B)
Priority to KR1020217000212A (publication KR102479741B1)
Priority to CA3153913A (publication CA3153913C)
Priority to TW109126543A (publication TWI718979B)
Priority to CN201811253730.7A (publication CN109036441B)
Priority to US15/127,775 (publication US9936321B2)
Priority to PCT/EP2015/056206 (publication WO2015144674A1)
Priority to TW108105179A (publication TWI695371B)
Priority to JP2016558102A (publication JP6246948B2)
Priority to AU2015238448A (publication AU2015238448B2)
Priority to CN201580015764.0A (publication CN106165451B)
Priority to CN201811253717.1A (publication CN109087654B)
Priority to CA2946916A (publication CA2946916C)
Priority to TW111107641A (publication TWI794032B)
Priority to BR122020020730-2A (publication BR122020020730B1)
Priority to BR112016022008-0A (publication BR112016022008B1)
Priority to KR1020227044220A (publication KR102596944B1)
Priority to KR1020197021732A (publication KR102201027B1)
Priority to EP18173707.3A (publication EP3451706B1)
Priority to KR1020167026390A (publication KR102005298B1)
Priority to TW109101396A (publication TWI711034B)
Priority to BR122020020719-1A (publication BR122020020719B1)
Priority to CN201811253713.3A (publication CN109285553B)
Priority to EP23192252.7A (publication EP4273857A3)
Priority to CA3155815A (publication CA3155815A1)
Priority to EP15711759.9A (publication EP3123746B1)
Priority to TW104109277A (publication TWI662543B)
Priority to TW112102828A (publication TWI833562B)
Priority to RU2016141386A (publication RU2658888C2)
Priority to BR122020014764-4A (publication BR122020014764B1)
Priority to BR122018005665-7A (publication BR122018005665B1)
Priority to TW110102935A (publication TWI760084B)
Priority to RU2018118336A (publication RU2760232C2)
Priority to CN201811253716.7A (publication CN109087653B)
Priority to KR1020237037213A (publication KR20230156153A)
Priority to CN202311083699.8A (publication CN117133298A)
Priority to UAA201610606A (publication UA119765C2)
Priority to CN202311083155.1A (publication CN117153172A)
Publication of EP2934025A1
Priority to JP2017219647A (publication JP6545235B2)
Priority to US15/891,326 (publication US10362424B2)
Priority to HK19101101.3A (publication HK1258770A1)
Priority to HK19101671.3A (publication HK1259306A1)
Priority to JP2019112767A (publication JP6762405B2)
Priority to US16/457,135 (publication US10567899B2)
Priority to AU2019205998A (publication AU2019205998B2)
Priority to US16/660,626 (publication US10638244B2)
Priority to US16/857,093 (publication US10893372B2)
Priority to JP2020150380A (publication JP7101219B2)
Priority to US17/144,325 (publication US11838738B2)
Priority to AU2021204754A (publication AU2021204754B2)
Priority to JP2022107586A (publication JP7333855B2)
Priority to AU2023201911A (publication AU2023201911A1)
Priority to JP2023132200A (publication JP2023144032A)
Priority to US18/505,494 (publication US20240098436A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008 Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11 Application of ambisonics in stereophonic audio systems

Definitions

  • This invention relates to a method and a device for performing Dynamic Range Compression (DRC) to an Ambisonics signal, and in particular to a Higher Order Ambisonics (HOA) signal.
  • DRC Dynamic Range Compression
  • HOA Higher Order Ambisonics
  • Fig.1 a) shows how DRC is applied to an audio signal.
  • the signal level, usually the signal envelope, is detected and a related gain is computed.
  • the time-varying gain is used to change the amplitude of the audio signal.
  • Fig.1 b) shows the principle of using DRC for encoding/decoding, wherein gain factors are transmitted together with the coded audio signal. On the decoder side, the gains are applied to the decoded audio signal in order to reduce the dynamic range.
  • the present invention describes how DRC can be applied to HOA signals.
  • a HOA signal is analysed in order to obtain one or more gain coefficients.
  • at least two gain coefficients are obtained and the analysis of the HOA signal comprises a transformation into the spatial domain (iDSHT).
  • the one or more gain coefficients are transmitted together with the original HOA signal.
  • a special indication can be transmitted to indicate if all gain coefficients are equal. This is the case in a so-called simplified mode, whereas at least two different gain coefficients are used in a non-simplified mode.
  • the one or more gains can (but need not) be applied to the HOA signal. The user has a choice whether or not to apply the one or more gains.
  • An advantage of the simplified mode is that it requires considerably fewer computations, since only one gain factor is used, and since the gain factor can be applied to the coefficient channels of the HOA signal directly in the HOA domain, so that the transform into the spatial domain and subsequent transform back into the HOA domain can be skipped.
  • the gain factor is obtained by analysis of only the zeroth order coefficient channel of the HOA signal.
  • a method for performing DRC on a HOA signal comprises transforming the HOA signal to the spatial domain (by an inverse DSHT), analyzing the transformed HOA signal and obtaining, from results of said analyzing, gain factors that are usable for dynamic range compression.
  • the obtained gain factors are multiplied (in the spatial domain) with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained.
  • the gain compressed transformed HOA signal is transformed back into the HOA domain (by a DSHT), i.e. coefficient domain, wherein a gain compressed HOA signal is obtained.
  • a method for performing DRC in a simplified mode on a HOA signal comprises analyzing the HOA signal and obtaining from results of said analyzing a gain factor that is usable for dynamic range compression.
  • the obtained gain factor is multiplied with coefficient channels of the HOA signal (in the HOA domain), wherein a gain compressed HOA signal is obtained.
  • the indication to indicate simplified mode, i.e. that only one gain factor is used, can be set implicitly, e.g. if only simplified mode can be used due to hardware or other restrictions, or explicitly, e.g. upon user selection of either simplified or non-simplified mode.
  • a method for applying DRC gain factors to a HOA signal comprises receiving a HOA signal, an indication and gain factors (together with the HOA signal or separately), determining that the indication indicates non-simplified mode, transforming the HOA signal into the spatial domain (using an inverse DSHT), wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signal, wherein a dynamic range compressed transformed HOA signal is obtained, and transforming the dynamic range compressed transformed HOA signal back into the HOA domain (i.e. coefficient domain) (using DSHT), wherein a dynamic range compressed HOA signal is obtained.
  • a method for applying a DRC gain factor to a HOA signal comprises receiving a HOA signal, an indication and a gain factor (together with the HOA signal or separately), determining that the indication indicates simplified mode, and upon said determining multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained.
  • An apparatus for performing DRC on a HOA signal is disclosed in claim 12.
  • An apparatus for applying DRC gain factors to a HOA signal is disclosed in claim 13.
  • the invention provides a computer readable medium having executable instructions to cause a computer to perform a method comprising steps described above.
  • Fig.2 depicts the principle of the approach.
  • HOA signals are analyzed, DRC gains g are calculated from the analysis of the HOA signal, and the DRC gains are coded and transmitted along with a coded representation of the HOA content. This may be a multiplexed bitstream or two or more separate bitstreams.
  • the gains g are extracted from such bitstream or bitstreams.
  • the gains g are applied to the HOA signal as described below.
  • the gains are applied to the HOA signal, i.e. in general a dynamic range reduced HOA signal is obtained.
  • the dynamic range adjusted HOA signal is rendered in a HOA renderer.
  • $B \in \mathbb{R}^{(N+1)^2 \times \tau}$ denotes a block of $\tau$ HOA samples
  • $B = [b(1), b(2), \ldots, b(t), \ldots, b(\tau)]$
  • $b(t) = [b_1, b_2, \ldots, b_o, \ldots, b_{(N+1)^2}]^T$
  • N denotes the HOA truncation order.
  • the number of higher order coefficients in b is $(N+1)^2$.
  • the sample index for one block of data is t. $\tau$ may range from usually one sample to 64 samples or more.
  • the zeroth order signal $\vec{b}_0 = [b_1(1), b_1(2), \ldots, b_1(\tau)]$ is the first row of B.
  • $D_L$ is well-conditioned and its inverse $D_L^{-1}$ exists.
  • a single gain-group means a single DRC gain value, here indicated by $g_1$, is applied to all speaker channel $\tau$ samples.
  • the virtual speaker positions sample spatial areas surrounding a virtual listener.
  • the sampling positions, $D_L$ and $D_L^{-1}$ are known at the encoder side when the DRC gains are created. At the decoder side, $D_L$ and $D_L^{-1}$ need to be known for applying the gain values.
  • DRC gains for HOA works as follows.
  • Up to $L_L = (N+1)^2$ DRC gains are created by analyzing these signals.
  • AO signals such as e.g. dialog tracks may be used for side chaining. This is shown in Fig.4b).
  • a single gain may be assigned to all L channels, in the simplest case (simplified mode).
  • In Fig.4, creation of DRC gains for HOA is shown.
  • In Fig.4a, it is depicted how a single gain (for a single gain group) can be derived from the zeroth HOA order component $\vec{b}_0$ (optionally with side chaining from AOs).
  • In Fig.4b, it is depicted how two or more DRC gains are created by transforming the HOA representation into a spatial domain.
  • sounds from the back e.g. background sound
  • sounds from the back might get more attenuation than sounds originating from front and side directions. This would lead to $(N+1)^2$ gain values in g which could be transmitted within two gain groups for this example.
  • Optional, here also is side chaining by Audio Objects wave forms and their directional information. Distracting sounds in the HOA mix sharing the same spatial source areas with the AO foreground sounds can get stronger attenuation gains than spatially distant sounds.
  • the gain values are transmitted to a receiver or decoder side.
  • Gain values can be assigned to channel groups for transmission. In an embodiment, all equal gains are combined in one channel group to minimize transmission data. If a single gain is transmitted, it is related to all $L_L$ channels. The channel group gain values and the number of channel groups are transmitted, and the channel groups are signaled.
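  • The grouping of equal gains into channel groups can be illustrated with a minimal sketch. The function names `group_gains` and `expand_groups` and the tuple layout are hypothetical; the actual bitstream syntax for signaling channel groups is not given in this text.

```python
def group_gains(gains):
    """Combine channels that share the same gain into one channel group.

    Returns a list of (gain, [channel indices]) tuples, one per group.
    The layout is an illustrative assumption, not a bitstream format.
    """
    groups = {}
    for channel, gain in enumerate(gains):
        groups.setdefault(gain, []).append(channel)
    return list(groups.items())


def expand_groups(groups, num_channels):
    """Receiver side: rebuild the per-channel gain vector from the groups."""
    gains = [None] * num_channels
    for gain, channels in groups:
        for channel in channels:
            gains[channel] = gain
    return gains


per_channel = [0.5, 0.5, 0.8, 0.5]           # (N+1)^2 = 4 spatial channels
groups = group_gains(per_channel)            # two groups instead of four gains
restored = expand_groups(groups, len(per_channel))
```

If all gains are equal, a single group results, which corresponds to the single-gain case related to all channels.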
  • the gain values are applied as follows.
  • $B_{DRC} = g_1 B$
  • In Fig.5, applying DRC to HOA signals is shown.
  • In Fig.5 a), a single channel group gain is transmitted, decoded and applied directly onto the HOA coefficients.
  • In Fig.5 b), more than one channel group gain is transmitted and decoded, and a gain vector g of $(N+1)^2$ gain values is obtained.
  • a gain matrix G is created and applied to a block of HOA samples.
  • In Fig.5 c), instead of applying the gain matrix / gain value to the HOA signal directly, it is applied directly onto the renderer's matrix. This is computationally beneficial if the DRC block size $\tau$ is larger than the number of output channels L.
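  • For the single-gain case, the computational argument can be made explicit with a short worked count. The symbol $W$ for the block of L rendered loudspeaker signals is an assumption introduced here; $D \in \mathbb{R}^{L \times (N+1)^2}$ is the HOA rendering matrix.

```latex
\[
W = D\,(g_1 B) = (g_1 D)\,B
\]
\[
\underbrace{(N+1)^2 \cdot \tau}_{\text{multiplications to scale } B}
\quad\text{versus}\quad
\underbrace{L \cdot (N+1)^2}_{\text{multiplications to scale } D}
\]
```

Both orderings give the same loudspeaker signals, and scaling $D$ needs fewer multiplications exactly when $\tau > L$, which matches the condition stated above.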
  • Each $\varphi(\Omega_l)$ is a mode vector containing the spherical harmonics of the direction $\Omega_l$.
  • L quadrature gains related to the spherical layout positions are assembled in a vector. These quadrature gains rate the spherical area of such a position and all sum up to a value of $4\pi$, related to the surface of a sphere with radius one.
  • $\breve{D}_L = \tilde{\tilde{D}}_L / \|\tilde{\tilde{D}}_L\|_k$, where k denotes the matrix norm type.
  • $1_L^T \breve{D}_L$ denotes the sum of row vectors of $\breve{D}_L$.
  • This matrix fulfills requirement 2 and requirement 3.
  • the first row elements of $D_L^{-1}$ all become one.
  • analyzing the sum signal in spatial domain is equal to analyzing the zeroth order HOA component.
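  • The equivalence can be spelled out in one step. The symbol $W_L$ for the block of spatial-domain signals and $w_l$ for its l-th row are assumptions introduced here for illustration, with $W_L = D_L B$.

```latex
\[
B = D_L^{-1} W_L
\quad\Longrightarrow\quad
\vec{b}_0 = \big[\,1 \;\; 1 \;\cdots\; 1\,\big]\, W_L = \sum_{l=1}^{L_L} w_l
\]
```

Because the first row of $D_L^{-1}$ consists of ones, the first row of $B$ (the zeroth order signal) is the sample-wise sum of all spatial channels, so a gain derived from either signal is the same.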
  • DRC analyzers use the signals' energy as well as its amplitude.
  • the sum signal is related to amplitude and energy.
  • $1_S$ is a vector assembled out of S elements with a value of 1.
  • Typical application scenarios to apply DRC gains to HOA signals are shown in Fig.5.
  • DRC gain application can be realized in at least two ways for flexible rendering.
  • Fig.6 shows exemplarily Dynamic Range Compression (DRC) processing at the decoder side.
  • In Fig.6 a), DRC is applied before rendering and mixing.
  • In Fig.6 b), DRC is applied to the loudspeaker signals, i.e. after rendering and mixing.
  • DRC gains are applied to Audio Objects and HOA separately: DRC gains are applied to Audio Objects in an Audio Object DRC block 610, and DRC gains are applied to HOA in a HOA DRC block 615.
  • the realization of the HOA DRC block 615 matches one of those in Fig.5.
  • a single gain is applied to all channels of the mixture signal of the rendered HOA and rendered Audio Object signal.
  • no spatial emphasis and attenuation is possible.
  • the related DRC gain cannot be created by analyzing the sum signal of the rendered mix, because the speaker layout of the consumer site is not known at the time of creation at the broadcast or content creation site.
  • DRC is applied to the HOA signal before rendering and may be combined with rendering.
  • DRC for HOA can be applied in time domain or in QMF-filter bank domain.
  • DRC gains to the HOA signals: $c_{drc} = D_L^{-1}\,\mathrm{diag}(g_{drc})\,D_{DSHT}\,c$
  • c is a vector of one time sample of HOA coefficients, $c \in \mathbb{R}^{(N+1)^2 \times 1}$, and $D_{DSHT} \in \mathbb{R}^{(N+1)^2 \times (N+1)^2}$ and its inverse $D_{DSHT}^{-1}$ are matrices related to a Discrete Spherical Harmonics Transform (DSHT) optimized for DRC purposes.
  • DSHT Discrete Spherical Harmonics Transform
  • $D_L$ is renamed to $D_{DSHT}$.
  • the DRC decoder provides a gain value $g_{ch}(n, m)$ for every time frequency tile n, m for $(N+1)^2$ spatial channels.
  • the gains for time slot n and frequency band m are arranged in $g(n, m) \in \mathbb{R}^{(N+1)^2 \times 1}$.
  • Multiband DRC is applied in QMF Filter bank domain. The processing steps are shown in Fig.7 .
  • the reconstructed HOA signal is transformed into spatial domain by (inverse DSHT):
  • $W_{DSHT} = D_{DSHT}\,C$, where $C \in \mathbb{R}^{(N+1)^2 \times \tau}$ is a block of $\tau$ HOA samples and $W_{DSHT} \in \mathbb{R}^{(N+1)^2 \times \tau}$ is a block of spatial samples matching the input time granularity of the QMF filter bank. Then the QMF analysis filter bank is applied.
  • Let $\tilde{w}_{DSHT}(n, m) \in \mathbb{C}^{(N+1)^2 \times 1}$ denote a vector of spatial channels per time frequency tile (n, m).
  • $\tilde{w}_{DRC}(n, m) = \mathrm{diag}(g(n, m))\,\tilde{w}_{DSHT}(n, m)$.
  • D denotes the HOA rendering matrix.
  • the QMF signals then can be fed to the mixer for further processing.
  • Fig.7 shows DRC for HOA in the QMF domain combined with a rendering step. If only a single gain group for DRC has been used this should be flagged by the DRC decoder because again computational simplifications are possible.
  • the gains in vector g(n, m) all share the same value of $g_{DRC}(n, m)$.
  • the QMF filter bank can be directly applied to the HOA signal and the gain $g_{DRC}(n, m)$ can be multiplied in filter bank domain.
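  • Why the spatial transform can be skipped for a single gain group follows directly from the gain application formula. The vector $\mathbf{1}$ (all ones) and the identity matrix $I$ are notation added here for the derivation.

```latex
\[
g(n,m) = g_{DRC}(n,m)\,\mathbf{1}
\;\Longrightarrow\;
\mathrm{diag}\big(g(n,m)\big) = g_{DRC}(n,m)\,I
\]
\[
D_{DSHT}^{-1}\,\mathrm{diag}\big(g(n,m)\big)\,D_{DSHT}
= g_{DRC}(n,m)\,D_{DSHT}^{-1} D_{DSHT}
= g_{DRC}(n,m)\,I
\]
```

The transform pair cancels, so multiplying each HOA coefficient channel by the scalar $g_{DRC}(n, m)$ in the filter bank domain gives the same result as the full spatial-domain processing.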
  • Fig.8 shows DRC for HOA in the QMF domain (a filter domain of a Quadrature Mirror Filter) combined with a rendering step, with computational simplifications for the simple case of a single DRC gain group.
  • the invention relates to a method for performing DRC on a HOA signal, the method comprising steps of setting or determining a mode, the mode being either a simplified mode or a non-simplified mode, in the non-simplified mode, transforming the HOA signal to the spatial domain, wherein an inverse DSHT is used, in the non-simplified mode, analyzing the transformed HOA signal, and in the simplified mode, analyzing the HOA signal, obtaining, from results of said analyzing, one or more gain factors that are usable for dynamic range compression, wherein only one gain factor is obtained in the simplified mode and wherein two or more different gain factors are obtained in the non-simplified mode, in the simplified mode multiplying the obtained gain factor with the HOA signal, wherein a gain compressed HOA signal is obtained, in the non-simplified mode, multiplying the obtained gain factors with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained, and transforming the gain compressed transformed HOA signal back
  • the method further comprises before said multiplying the obtained factors, transmitting the HOA signals together with the obtained gain factor or gain factors.
  • the HOA signal is divided into frequency subbands, and the steps of analysing the HOA signal (or transformed HOA signal), obtaining one or more gain factors, multiplying the obtained gain factor(s) with the HOA signal (or transformed HOA signal), and transforming the gain compressed transformed HOA signal back into the HOA domain are applied to each frequency subband separately, with individual gains per subband. It is noted that the sequential order of dividing the HOA signal into frequency subbands and transforming the HOA signal to the spatial domain can be swapped, and/or the sequential order of synthesizing the subbands and transforming the gain compressed transformed HOA signals back into the HOA domain can be swapped, independently from each other.
  • the invention relates to a method for applying DRC gain factors to a HOA signal, the method comprising steps of receiving a HOA signal together with an indication and one or more gain factors, the indication indicating either a simplified mode or a non-simplified mode, wherein only one gain factor is received if the indication indicates the simplified mode, selecting either a simplified mode or a non-simplified mode according to said indication, in the simplified mode multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained, and in the non-simplified mode transforming the HOA signal into the spatial domain, wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signals, wherein dynamic range compressed transformed HOA signals are obtained, and transforming the dynamic range compressed transformed HOA signals back into the HOA domain, wherein a dynamic range compressed HOA signal is obtained.
  • the invention relates to a device for performing DRC on a HOA signal, the device comprising a processor or one or more processing elements adapted for setting or determining a mode, the mode being either a simplified mode or a non-simplified mode, in the non-simplified mode transforming the HOA signal to the spatial domain, wherein an inverse DSHT is used, in the non-simplified mode analyzing the transformed HOA signal, while in the simplified mode analyzing the HOA signal, obtaining, from results of said analyzing, one or more gain factors that are usable for dynamic range compression, wherein only one gain factor is obtained in the simplified mode and wherein two or more different gain factors are obtained in the non-simplified mode, in the simplified mode multiplying the obtained gain factor with the HOA signal, wherein a gain compressed HOA signal is obtained, and in the non-simplified mode multiplying the obtained gain factors with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained, and transforming the gain compressed transformed HOA signal back into
  • a device for performing DRC on a HOA signal comprises a processor or one or more processing elements adapted for transforming the HOA signal to the spatial domain, analyzing the transformed HOA signal, obtaining, from results of said analyzing, gain factors that are usable for dynamic range compression, multiplying the obtained factors with the transformed HOA signals, wherein gain compressed transformed HOA signals are obtained, and transforming the gain compressed transformed HOA signals back into the HOA domain, wherein gain compressed HOA signals are obtained.
  • the device further comprises a transmission unit for transmitting, before multiplying the obtained gain factor or gain factors, the HOA signal together with the obtained gain factor or gain factors.
  • the sequential order of dividing the HOA signal into frequency subbands and transforming the HOA signal to the spatial domain can be swapped, and the sequential order of synthesizing the subbands and transforming the gain compressed transformed HOA signals back into the HOA domain can be swapped, independently from each other.
  • the invention relates to a device for applying DRC gain factors to a HOA signal
  • the device comprising a processor or one or more processing elements adapted for receiving a HOA signal together with an indication and one or more gain factors, the indication indicating either a simplified mode or a non-simplified mode, wherein only one gain factor is received if the indication indicates the simplified mode, setting the device to either a simplified mode or a non-simplified mode, according to said indication, in the simplified mode, multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained; and in the non-simplified mode, transforming the HOA signal into the spatial domain, wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signals, wherein dynamic range compressed transformed HOA signals are obtained, and transforming the dynamic range compressed transformed HOA signals back into the HOA domain, wherein a dynamic range compressed HOA signal is obtained.
  • the device further comprises a transmission unit for transmitting, before multiplying the obtained factors, the HOA signals together with the obtained gain factors.
  • the HOA signal is divided into frequency subbands, and the analysing the transformed HOA signal, obtaining gain factors, multiplying the obtained factors with the transformed HOA signals and transforming the gain compressed transformed HOA signals back into the HOA domain are applied to each frequency subband separately, with individual gains per subband.
  • the HOA signal is divided into a plurality of frequency subbands, and obtaining one or more gain factors, multiplying the obtained gain factors with the HOA signals or the transformed HOA signals, and in the non-simplified mode transforming the gain compressed transformed HOA signals back into the HOA domain are applied to each frequency subband separately, with individual gains per subband.
  • the invention relates to a device for applying DRC gain factors to a HOA signal, the device comprising a processor or one or more processing elements adapted for receiving a HOA signal together with gain factors, transforming the HOA signal into the spatial domain (using iDSHT), wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signal, wherein a dynamic range compressed transformed HOA signal is obtained, and transforming the dynamic range compressed transformed HOA signal back into the HOA domain (i.e. coefficient domain) (using DSHT), wherein a dynamic range compressed HOA signal is obtained.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

Dynamic Range Control (DRC) cannot be simply applied to Higher Order Ambisonics (HOA) based signals. A method for performing DRC on a HOA signal comprises transforming the HOA signal to the spatial domain, analyzing the transformed HOA signal, and obtaining, from results of said analyzing, gain factors that are usable for dynamic compression. The gain factors can be transmitted together with the HOA signal. When applying the DRC, the HOA signal is transformed to the spatial domain, the gain factors are extracted and multiplied with the transformed HOA signal in the spatial domain, wherein a gain compensated transformed HOA signal is obtained. The gain compensated transformed HOA signal is transformed back into the HOA domain, wherein a gain compensated HOA signal is obtained.

Description

    Field of the invention
  • This invention relates to a method and a device for performing Dynamic Range Compression (DRC) to an Ambisonics signal, and in particular to a Higher Order Ambisonics (HOA) signal.
  • Background
  • The purpose of Dynamic Range Compression (DRC) is to reduce the dynamic range of an audio signal. A time-varying gain factor is applied to the audio signal. Typically this gain factor is dependent on the amplitude envelope of the signal used for controlling the gain. The mapping is in general non-linear. Large amplitudes are mapped to smaller ones while faint sounds are often amplified. Scenarios are noisy environments, late night listening, small speakers or mobile headphone listening.
  • A common concept for streaming or broadcasting Audio is to generate the DRC gains before transmission and apply these gains after receiving and decoding. The principle of using DRC is shown in Fig.1 a). Fig.1 a) shows how DRC is applied to an audio signal. The signal level, usually the signal envelope, is detected and a related gain is computed. The time-varying gain is used to change the amplitude of the audio signal. Fig.1 b) shows the principle of using DRC for encoding/decoding, wherein gain factors are transmitted together with the coded audio signal. On the decoder side, the gains are applied to the decoded audio signal in order to reduce the dynamic range.
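  • The detect, compute and apply chain of Fig.1 a) can be sketched in a few lines. This is a minimal illustration, not the method of the application: the one-pole envelope follower, the `threshold` and `ratio` parameters, and all function names are assumptions, and the curve only attenuates loud passages (it does not amplify faint ones).

```python
def envelope(samples, attack=0.5, release=0.05):
    """One-pole peak envelope follower (coefficients are illustrative)."""
    env, out = 0.0, []
    for x in samples:
        level = abs(x)
        coeff = attack if level > env else release
        env += coeff * (level - env)
        out.append(env)
    return out


def drc_gain(level, threshold=0.5, ratio=4.0):
    """Static non-linear mapping: levels above the threshold are reduced."""
    if level <= threshold:
        return 1.0
    compressed = threshold * (level / threshold) ** (1.0 / ratio)
    return compressed / level


def apply_drc(samples):
    """Detect the envelope, derive a time-varying gain, scale the signal."""
    gains = [drc_gain(e) for e in envelope(samples)]
    return [g * x for g, x in zip(gains, samples)], gains


signal = [0.1, 0.9, 1.0, 0.8, 0.1, 0.05]
compressed, gains = apply_drc(signal)
```

In the encoder/decoder arrangement of Fig.1 b), the `gains` list would be what is transmitted alongside the coded audio, and the final multiplication would happen at the decoder.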
  • For 3D audio, different gains can be applied to loudspeaker channels, which represent different spatial positions. These positions then need to be known at the sending side, in order to be able to generate a matching set of gains. This is usually only possible for idealized conditions, while in a realistic case the number of speakers and their placement vary in many ways, driven more by practical considerations than by specifications. Higher Order Ambisonics (HOA) allows for flexible rendering. A HOA signal is composed of coefficient channels that do not directly represent sound levels. Therefore, DRC cannot be simply applied to HOA based signals.
  • Summary of the Invention
  • The present invention describes how DRC can be applied to HOA signals. A HOA signal is analysed in order to obtain one or more gain coefficients. In one embodiment, at least two gain coefficients are obtained and the analysis of the HOA signal comprises a transformation into the spatial domain (iDSHT). The one or more gain coefficients are transmitted together with the original HOA signal. A special indication can be transmitted to indicate if all gain coefficients are equal. This is the case in a so-called simplified mode, whereas at least two different gain coefficients are used in a non-simplified mode. At the decoder, the one or more gains can (but need not) be applied to the HOA signal. The user has a choice whether or not to apply the one or more gains. An advantage of the simplified mode is that it requires considerably fewer computations, since only one gain factor is used, and since the gain factor can be applied to the coefficient channels of the HOA signal directly in the HOA domain, so that the transform into the spatial domain and subsequent transform back into the HOA domain can be skipped. In the simplified mode, the gain factor is obtained by analysis of only the zeroth order coefficient channel of the HOA signal.
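  • The simplified mode can be sketched as follows, assuming a block-wise peak analysis of the zeroth order coefficient channel. The peak-based gain rule, the `threshold` and `ratio` parameters and the name `simplified_drc` are illustrative assumptions; only the structure (one gain from channel zero, applied to all coefficient channels in the HOA domain) comes from the text.

```python
def simplified_drc(block, threshold=0.5, ratio=4.0):
    """Simplified-mode DRC on one block of HOA samples.

    block: list of (N+1)^2 coefficient channels, each a list of samples.
    Returns (gain, gain-compressed block).
    """
    zeroth = block[0]                      # zeroth order coefficient channel
    peak = max(abs(x) for x in zeroth)     # illustrative level measure
    if peak <= threshold:
        gain = 1.0
    else:
        gain = threshold * (peak / threshold) ** (1.0 / ratio) / peak
    # One gain for every coefficient channel: no spatial transform is needed.
    return gain, [[gain * x for x in channel] for channel in block]


# First-order example: (1 + 1)^2 = 4 coefficient channels, 2 samples each.
block = [[1.0, 0.8], [0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]
gain, out = simplified_drc(block)
```

Because every channel is scaled by the same factor, the ratios between the coefficient channels are unchanged.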
  • According to one embodiment of the invention, a method for performing DRC on a HOA signal comprises transforming the HOA signal to the spatial domain (by an inverse DSHT), analyzing the transformed HOA signal and obtaining, from results of said analyzing, gain factors that are usable for dynamic range compression. In further steps, the obtained gain factors are multiplied (in the spatial domain) with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained. Finally, the gain compressed transformed HOA signal is transformed back into the HOA domain (by a DSHT), i.e. coefficient domain, wherein a gain compressed HOA signal is obtained.
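  • The three steps (transform to the spatial domain, analyse and multiply there, transform back) can be sketched as below. The transform pair is a stand-in: a scaled 4x4 Hadamard matrix, chosen only because it is its own inverse and keeps the example short. It is not the DSHT matrix of the application, and the peak-based gain rule and all names are likewise assumptions.

```python
def matmul(a, b):
    """Plain-Python matrix product of two row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Stand-in for the transform pair (NOT a real DSHT matrix); self-inverse.
TO_SPATIAL = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5],
              [0.5, 0.5, -0.5, -0.5], [0.5, -0.5, -0.5, 0.5]]
TO_HOA = TO_SPATIAL


def channel_gain(channel, threshold=0.5, ratio=4.0):
    """Illustrative per-channel gain from the block peak level."""
    peak = max(abs(x) for x in channel)
    if peak <= threshold:
        return 1.0
    return threshold * (peak / threshold) ** (1.0 / ratio) / peak


def non_simplified_drc(block):
    """Transform, analyse and scale per spatial channel, transform back."""
    spatial = matmul(TO_SPATIAL, block)
    gains = [channel_gain(ch) for ch in spatial]
    scaled = [[g * x for x in ch] for g, ch in zip(gains, spatial)]
    return gains, matmul(TO_HOA, scaled)


block = [[1.0, 2.0], [0.5, -0.5], [0.0, 1.0], [-1.0, 0.25]]
gains, compressed_hoa = non_simplified_drc(block)
```

Unlike the single-gain case, the different gains per spatial channel change the balance between directions, which is why the round trip through the spatial domain is needed.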
  • Further, according to one embodiment of the invention, a method for performing DRC in a simplified mode on a HOA signal comprises analyzing the HOA signal and obtaining from results of said analyzing a gain factor that is usable for dynamic range compression. In further steps, upon evaluation of the indication, the obtained gain factor is multiplied with coefficient channels of the HOA signal (in the HOA domain), wherein a gain compressed HOA signal is obtained. Also upon evaluation of the indication, it can be determined that a transformation of the HOA signal can be skipped. The indication to indicate simplified mode, i.e. that only one gain factor is used, can be set implicitly, e.g. if only simplified mode can be used due to hardware or other restrictions, or explicitly, e.g. upon user selection of either simplified or non-simplified mode.
  • Further, according to one embodiment of the invention, a method for applying DRC gain factors to a HOA signal comprises receiving a HOA signal, an indication and gain factors (together with the HOA signal or separately), determining that the indication indicates non-simplified mode, transforming the HOA signal into the spatial domain (using an inverse DSHT), wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signal, wherein a dynamic range compressed transformed HOA signal is obtained, and transforming the dynamic range compressed transformed HOA signal back into the HOA domain (i.e. coefficient domain) (using DSHT), wherein a dynamic range compressed HOA signal is obtained.
  • Further, according to one embodiment of the invention, a method for applying a DRC gain factor to a HOA signal comprises receiving a HOA signal, an indication and a gain factor (together with the HOA signal or separately), determining that the indication indicates simplified mode, and upon said determining multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained.
  • An apparatus for performing DRC on a HOA signal is disclosed in claim 12.
    An apparatus for applying DRC gain factors to a HOA signal is disclosed in claim 13.
  • In one embodiment, the invention provides a computer readable medium having executable instructions to cause a computer to perform a method comprising steps described above.
  • Advantageous embodiments of the invention are disclosed in the dependent claims, the following description and the figures.
  • Brief description of the drawings
  • Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in
  • Fig.1
    the general principle of DRC applied to audio;
    Fig.2
    a general approach for applying DRC to HOA based signals according to the invention;
    Fig.3
    Spherical speaker grids for N=1 to N=6;
    Fig.4
    Creation of DRC gains for HOA;
    Fig.5
    Applying DRC to HOA signals;
    Fig.6
    Dynamic Range Compression processing at the decoder side;
    Fig.7
    DRC for HOA in QMF domain combined with rendering step; and
    Fig.8
    DRC for HOA in QMF domain combined with rendering step for the simple case of a single DRC gain group.
    Detailed description of the invention
  • The present invention describes how DRC can be applied to HOA. Fig.2 depicts the principle of the approach. On the encoding or transmitting side, as shown in Fig.2 a), HOA signals are analyzed, DRC gains g are calculated from the analysis of the HOA signal, and the DRC gains are coded and transmitted along with a coded representation of the HOA content. This may be a multiplexed bitstream or two or more separate bitstreams.
  • On the decoding or receiving side, as shown in Fig.2 b), the gains g are extracted from such bitstream or bitstreams. After decoding of the bitstream or bitstreams in a Decoder, the gains g are applied to the HOA signal as described below. By this, the gains are applied to the HOA signal, i.e. in general a dynamic range reduced HOA signal is obtained. Finally, the dynamic range adjusted HOA signal is rendered in a HOA renderer.
  • In the following, used assumptions and definitions are explained.
    Assumptions are that the HOA renderer is energy preserving, i.e. N3D normalized Spherical Harmonics are used, and the energy of a single directional signal coded inside the HOA representation is maintained after rendering. It is described e.g. in EP13306042 (PD030040) how to achieve this energy preserving HOA rendering.
  • Definitions of used terms are as follows. $B \in \mathbb{R}^{(N+1)^2 \times \tau}$ denotes a block of τ HOA samples, $B = [b(1), b(2), \ldots, b(t), \ldots, b(\tau)]$, with vector $b(t) = [b_1, b_2, \ldots, b_o, \ldots, b_{(N+1)^2}]^T = [B_0^0, B_1^{-1}, \ldots, B_n^m, \ldots, B_N^N]^T$, which contains the Ambisonics coefficients in ACN order (vector index $o = n^2 + n + m + 1$, with coefficient order index n and coefficient degree index m). N denotes the HOA truncation order. The number of higher order coefficients in b is $(N+1)^2$. The sample index for one block of data is t. τ may range from usually one sample to 64 samples or more.
    The zeroth order signal ϑ0 = [b_1(1), b_1(2), ..., b_1(τ)] is the first row of B.
    $D \in \mathbb{R}^{L \times (N+1)^2}$ denotes an energy preserving rendering matrix that renders a block of HOA samples to a block of L loudspeaker channels in the spatial domain: $W = DB$, with $W \in \mathbb{R}^{L \times \tau}$. This is the assumed procedure of the HOA renderer in Fig.2 b) (HOA rendering).
    $D_L \in \mathbb{R}^{(N+1)^2 \times (N+1)^2}$ denotes a rendering matrix related to $L_L = (N+1)^2$ channels which are positioned on a sphere in a very regular manner, in a way that all neighboring positions share the same distance. $D_L$ is well-conditioned and its inverse $D_L^{-1}$ exists. Thus both define a pair of transformation matrices (DSHT - Discrete Spherical Harmonics Transform): $W_L = D_L B$, $B = D_L^{-1} W_L$.
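  • As a minimal sketch of the transformation pair defined above (not part of the original disclosure), the following Python/NumPy snippet applies the N=1 matrix D_L printed in Tab.1 b) to a random stand-in block B and transforms the result back. The block length, the random data and all variable names are assumptions made for illustration only.

```python
import numpy as np

# D_L for N=1, copied from Tab.1 b) (values rounded to four decimals)
D_L = np.array([
    [0.2500, -0.0000,  0.4082, -0.1443],
    [0.2500,  0.0000, -0.0000,  0.4330],
    [0.2500,  0.3536, -0.2041, -0.1443],
    [0.2500, -0.3536, -0.2041, -0.1443],
])

tau = 8                                  # block length (assumed)
rng = np.random.default_rng(0)
B = rng.standard_normal((4, tau))        # stand-in HOA block, (N+1)^2 x tau

W_L = D_L @ B                            # HOA domain -> spatial domain
B_rec = np.linalg.solve(D_L, W_L)        # spatial domain -> HOA domain, D_L^{-1} W_L
```

    Solving the linear system instead of forming the inverse explicitly is a numerical design choice of this sketch; a decoder that reuses $D_L^{-1}$ for many blocks would more likely store the inverse once.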
  • g is a vector of $L_L = (N+1)^2$ DRC gain values. Gain values are assumed to be applied to a block of τ samples and are assumed to be smooth from block to block. For transmission, gain values that share the same value can be combined into gain groups. A single gain group means that a single DRC gain value, here indicated by $g_1$, is applied to all τ samples of all speaker channels.
  • For every HOA truncation order N, an ideal $L_L = (N+1)^2$ virtual speaker grid and related rendering matrix $D_L$ are defined. The virtual speaker positions sample spatial areas surrounding a virtual listener. The grids for N=1 to 6 are shown in Fig.3, where areas related to a speaker are shaded cells. One sampling position is always related to a central speaker position (azimuth = 0, inclination = π/2; azimuth is measured from frontal direction related to the listening position). The sampling positions, $D_L$ and $D_L^{-1}$ are known at the encoder side when the DRC gains are created. At the decoder side, $D_L$ and $D_L^{-1}$ need to be known for applying the gain values.
  • Creation of DRC gains for HOA works as follows.
    The HOA signal is converted to the spatial domain by $W_L = D_L B$. Up to $L_L = (N+1)^2$ DRC gains $g_l$ are created by analyzing these signals. If the content is a combination of HOA and Audio Objects (AO), AO signals such as e.g. dialog tracks may be used for side chaining. This is shown in Fig.4b). When creating different DRC gain values related to different spatial areas, care needs to be taken that these gains do not influence the spatial image stability at the decoder side. To avoid this, a single gain may be assigned to all L channels in the simplest case (simplified mode). This can be done by analyzing all spatial signals W, or by analyzing the zeroth order HOA coefficient sample block (ϑ0), in which case the transformation to the spatial domain is not needed (Fig.4a). The latter is identical to analyzing the downmix signal of W. Further details are given below.
  • In Fig.4, creation of DRC gains for HOA is shown. Fig.4a) depicts how a single gain (for a single gain group) can be derived from the zeroth HOA order component ϑ0 (optionally with side chaining from AOs). Fig.4b) depicts how two or more DRC gains are created by transforming the HOA representation into a spatial domain. As an example, sounds from the back (e.g. background sound) might get more attenuation than sounds originating from front and side directions. This would lead to $(N+1)^2$ gain values in g, which could be transmitted within two gain groups for this example. Side chaining by Audio Object waveforms and their directional information is optional here as well. Distracting sounds in the HOA mix sharing the same spatial source areas with the AO foreground sounds can get stronger attenuation gains than spatially distant sounds.
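  • The single-gain case of Fig.4a) can be illustrated with a short Python sketch. The description does not specify a compressor characteristic, so the static curve below (threshold, ratio, block RMS level) is purely a hypothetical placeholder, as are the function name `single_drc_gain` and its parameters; only the idea of deriving one gain from the zeroth order channel is taken from the text.

```python
import numpy as np

def single_drc_gain(b0, threshold_db=-20.0, ratio=4.0):
    """Return one linear DRC gain for a block of the zeroth order channel b0.

    Hypothetical static compressor curve (an assumption, not from the source):
    block levels above threshold_db are reduced according to ratio,
    levels below the threshold are left unchanged.
    """
    level_db = 10.0 * np.log10(np.mean(np.square(b0)) + 1e-12)  # block RMS level in dB
    over_db = max(level_db - threshold_db, 0.0)                 # amount above threshold
    gain_db = -over_db * (1.0 - 1.0 / ratio)                    # attenuation in dB
    return 10.0 ** (gain_db / 20.0)

quiet = np.full(64, 0.01)   # about -40 dB RMS, below the assumed threshold
loud = np.ones(64)          # about 0 dB RMS, 20 dB above the assumed threshold

g_quiet = single_drc_gain(quiet)   # no attenuation expected
g_loud = single_drc_gain(loud)     # 15 dB attenuation expected for ratio 4
```

    Because only the first row of B is analyzed, no transform into the spatial domain is involved in this path.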
  • The gain values are transmitted to a receiver or decoder side.
    A variable number of 1 to $L_L = (N+1)^2$ gain values related to a block of τ samples is transmitted. Gain values can be assigned to channel groups for transmission. In an embodiment, all equal gains are combined in one channel group to minimize transmission data. If a single gain is transmitted, it is related to all $L_L$ channels. The number of channel groups and the gain values are transmitted, and the channel groups are signaled.
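  • A possible way to combine equal gains into channel groups is sketched below in Python. The representation (a list of group values plus a group index per channel) and the function names are assumptions for illustration; the description does not define a bitstream layout.

```python
import numpy as np

def pack_gain_groups(g):
    """Combine equal gains: returns (group_values, group_index_per_channel)."""
    values, index = np.unique(g, return_inverse=True)
    return values, index

def unpack_gain_groups(values, index):
    """Rebuild the per-channel gain vector from the group description."""
    return values[index]

# Example for N=1: two channels keep their level, two are attenuated
g = np.array([1.0, 1.0, 0.5, 0.5])
values, index = pack_gain_groups(g)       # two groups instead of four gains
g_rx = unpack_gain_groups(values, index)  # receiver side reconstruction
simplified = len(values) == 1             # single group would mean simplified mode
```

    Grouping by exact floating-point equality is a simplification of this sketch; a real coder would group quantized gain values.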
  • The gain values are applied as follows.
    The receiver/decoder can determine the number of transmitted gain values, decode related information and assign the gains to $L_L = (N+1)^2$ channels.
    If only one gain value (one channel group) is transmitted, it can be directly applied to the HOA signal ($B_{DRC} = g_1 B$), as shown in Fig.5 a). This has an advantage because the decoding is much simpler and requires considerably less processing. The reason is that no matrix operations are required; instead, the gain values can be applied directly, e.g. multiplied with the HOA coefficients. For further details see below.
    If two or more gains are transmitted, the channel group gains are assigned to L channel gains $g = [g_1, \ldots, g_L]$.
  • For the virtual regular loudspeaker grid, the loudspeaker signals with the DRC gains applied are computed by $\hat{W}_L = \mathrm{diag}(g)\, W_L$.
    The resulting modified HOA representation is then computed by $B_{DRC} = D_L^{-1} \hat{W}_L$.
  • This can be simplified, as shown in Fig.5 b). Instead of transforming the HOA signal into the spatial domain, applying the gains and transforming the result back to the HOA domain, the gain vector is transformed to the HOA domain by $G = D_L^{-1}\, \mathrm{diag}(g)\, D_L$, with $G \in \mathbb{R}^{(N+1)^2 \times (N+1)^2}$.
    The gain matrix is applied directly to the HOA coefficients: $B_{DRC} = GB$.
    This is more efficient in terms of computational operations needed for $(N+1)^2 < \tau$. In that case, the two transforms per block (into the spatial domain and back into the HOA domain) are replaced by a single multiplication with the matrix G.
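  • The equivalence of the spatial-domain path and the gain matrix G can be checked numerically. The following Python sketch uses the N=1 matrix from Tab.1 b); the gain values and the random block B are assumed example data. It also confirms that for identical gains G collapses to $g_1 I$, which is the simplified mode.

```python
import numpy as np

# D_L for N=1, copied from Tab.1 b)
D_L = np.array([
    [0.2500, -0.0000,  0.4082, -0.1443],
    [0.2500,  0.0000, -0.0000,  0.4330],
    [0.2500,  0.3536, -0.2041, -0.1443],
    [0.2500, -0.3536, -0.2041, -0.1443],
])

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 16))          # stand-in HOA block
g = np.array([1.0, 1.0, 0.5, 0.5])        # example gains, two gain groups

# Path 1: transform, apply gains, transform back
W_hat = np.diag(g) @ (D_L @ B)
B_drc_spatial = np.linalg.solve(D_L, W_hat)

# Path 2: gain matrix G = D_L^{-1} diag(g) D_L applied in the HOA domain
G = np.linalg.solve(D_L, np.diag(g) @ D_L)
B_drc_G = G @ B

# Identical gains: G reduces to g1 * I, so B_DRC = g1 * B
g1 = 0.5
G_single = np.linalg.solve(D_L, np.diag(np.full(4, g1)) @ D_L)
```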
  • An even more efficient way of applying the gain matrix is to manipulate the renderer matrix by $\hat{D} = DG$, and to apply the DRC and render in one step: $W = \hat{D}B$. This is shown in Fig.5 c). This is beneficial if L < τ.
  • In Fig.5, applying DRC to HOA signals is shown. In Fig.5 a), a single channel group gain is transmitted, decoded and applied directly to the HOA coefficients. In Fig.5 b), more than one channel group gain is transmitted and decoded, and a gain vector g of $(N+1)^2$ gain values is obtained. A gain matrix G is created and applied to a block of HOA samples. In Fig.5 c), instead of applying the gain matrix / gain value to the HOA signal directly, it is applied directly to the renderer's matrix. This is computationally beneficial if the DRC block size τ is larger than the number of output channels L.
  • In the following, calculation of ideal DSHT (Discrete Spherical Harmonics Transform) matrices for DRC is described.
    The requirements for the ideal rendering and encoding matrices $D_L$ and $D_L^{-1}$ related to an ideal spherical layout are derived below. Even for ideal rendering layouts, requirements 2 and 3 seem to be in contradiction to each other. Either one or the other can be fulfilled without error, but with errors exceeding 3 dB for the other one. This is considered to lead to audible artifacts. A method to overcome this is described in the following.
  • First, an ideal spherical layout with $L = (N+1)^2$ is selected. The L directions of the (virtual) speaker positions are given by $\Omega_l$ and the related mode matrix is denoted as $\Psi_L = [\varphi(\Omega_1), \ldots, \varphi(\Omega_l), \ldots, \varphi(\Omega_L)]$. Each $\varphi(\Omega_l)$ is a mode vector containing the spherical harmonics of the direction $\Omega_l$. L quadrature gains related to the spherical layout positions are assembled in vector $q \in \mathbb{R}^{L \times 1}$. These quadrature gains rate the spherical area of such a position and all sum up to a value of 4π, related to the surface of a sphere with radius one. A first prototype rendering matrix $\tilde{D}_L$ is derived by $\tilde{D}_L = \mathrm{diag}(q)\, \Psi_L / L$.
    Note that the division by L can be omitted due to a later normalization step (see below).
  • Second, a compact singular value decomposition is performed: $\tilde{D}_L = U S V^T$, and a second prototype matrix is derived by $\hat{\tilde{D}}_L = U V^T$.
  • Third, the prototype matrix is normalized: $\check{D}_L = \hat{\tilde{D}}_L / \|\hat{\tilde{D}}_L\|_k$,
    where k denotes the matrix norm type. Two matrix norm types show equally good performance. Either the k = 1 norm or the Frobenius norm should be used. This matrix fulfills requirement 3 (energy preservation).
  • Fourth, in the last step the amplitude error is substituted in order to fulfill requirement 2:
    Row-vector e is calculated by $e = -\dfrac{1_L^T \check{D}_L - [1, 0, 0, \ldots, 0]}{L}$, where [1,0,0,..,0] is a row vector of $(N+1)^2$ all zero elements except for the first element with a value of one. $1_L^T \check{D}_L$ denotes the sum of the row vectors of $\check{D}_L$.
    The rendering matrix $D_L$ is now derived by substituting the amplitude error: $D_L = \check{D}_L + [e^T, e^T, e^T, \ldots]^T$, where vector e is added to every row of $\check{D}_L$.
    This matrix fulfills requirement 2 and requirement 3. The first row elements of $D_L^{-1}$ all become one.
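  • The four steps above can be sketched in Python for N=1, using the positions and quadrature gains of Tab.1 a). Several points are assumptions of this sketch rather than statements of the description: the mode matrix is transposed so that each row of the prototype matrix corresponds to one virtual loudspeaker, the Frobenius norm is chosen for k, and the N3D spherical harmonics are hard-coded for order 1 only (function name `mode_vector_n1` is illustrative). Under these assumptions the result agrees with Tab.1 b) to the printed precision.

```python
import numpy as np

def mode_vector_n1(theta, phi):
    """N3D real spherical harmonics up to order 1 in ACN order
    (inclination theta, azimuth phi)."""
    s3 = np.sqrt(3.0)
    return np.array([1.0,
                     s3 * np.sin(theta) * np.sin(phi),
                     s3 * np.cos(theta),
                     s3 * np.sin(theta) * np.cos(phi)])

# Tab.1 a): inclination, azimuth and quadrature gains for N=1
theta = np.array([0.33983655, 1.57079667, 2.06167886, 2.06167892])
phi = np.array([3.14159265, 0.00000000, 1.95839324, -1.95839316])
q = np.array([3.14159271, 3.14159267, 3.14159262, 3.14159262])
L = 4

# Step 1: mode matrix (columns are mode vectors) and first prototype.
# The transpose is an orientation assumption: rows = loudspeakers.
Psi_L = np.stack([mode_vector_n1(t, p) for t, p in zip(theta, phi)], axis=1)
D_tilde = np.diag(q) @ Psi_L.T / L

# Step 2: singular value decomposition, second prototype U V^T
U, S, Vt = np.linalg.svd(D_tilde)
D_hat = U @ Vt

# Step 3: normalization (Frobenius norm chosen here)
D_check = D_hat / np.linalg.norm(D_hat, 'fro')

# Step 4: amplitude error e, added to every row
unit_row = np.zeros(L)
unit_row[0] = 1.0
e = -(D_check.sum(axis=0) - unit_row) / L
D_L = D_check + e                       # broadcasting adds e to each row

# Tab.1 b) for comparison
TAB1_B = np.array([
    [0.2500, -0.0000,  0.4082, -0.1443],
    [0.2500,  0.0000, -0.0000,  0.4330],
    [0.2500,  0.3536, -0.2041, -0.1443],
    [0.2500, -0.3536, -0.2041, -0.1443],
])
first_row_of_inverse = np.linalg.inv(D_L)[0]
```

    By construction the column sums of D_L equal [1, 0, 0, 0], which is why the first row of its inverse consists of ones.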
  • In the following, detailed requirements for DRC are explained.
    First, applying $L_L$ identical gains with a value of $g_1$ in the spatial domain is equal to applying the gain $g_1$ to the HOA coefficients: $D_L^{-1}\, \mathrm{diag}(g)\, W_L = D_L^{-1} g_1 I D_L B = g_1 D_L^{-1} D_L B = g_1 B$
  • This leads to the requirement: $D_L^{-1} D_L = I$, which means that $L = (N+1)^2$ and $D_L^{-1}$ needs to exist (trivial).
  • Second, analyzing the sum signal in spatial domain is equal to analyzing the zeroth order HOA component. DRC analyzers use the signals' energy as well as its amplitude. Thus the sum signal is related to amplitude and energy.
    The signal model of HOA: $B = \Psi_e X_s$, where $X_s \in \mathbb{R}^{S \times \tau}$ is a matrix of S directional signals; $\Psi_e = [\varphi(\Omega_1), \ldots, \varphi(\Omega_s), \ldots, \varphi(\Omega_S)]$ is an N3D mode matrix related to the directions $\Omega_1, \ldots, \Omega_S$. The mode vector $\varphi(\Omega_s) = [Y_0^0(\Omega_s), Y_1^{-1}(\Omega_s), \ldots, Y_N^N(\Omega_s)]^T$ is assembled out of Spherical Harmonics. In N3D notation the zeroth order component $Y_0^0(\Omega_s) = 1$ is independent of the direction.
    The zeroth order component HOA signal needs to become the sum of the directional signals, $b_0 = [b_1(1), b_1(2), \ldots, b_1(\tau)] = 1_S^T X_s$, to reflect the correct amplitude of the summation signal. $1_S$ is a vector assembled out of S elements with a value of 1.
    The energy of the directional signals is preserved in this mix because $b_0 b_0^T = 1_S^T X_s X_s^T 1_S$. This would simplify to $\sum_{s=1}^{S} \sum_{t=1}^{\tau} X_{s,t}^2 = \|X_s\|_{fro}^2$ if the signals $X_s$ are not correlated.
  • The sum of amplitudes in spatial domain is given by $1_L^T W_L = 1_L^T D_L \Psi_e X_s = 1_L^T M_L X_s$ with HOA panning matrix $M_L = D_L \Psi_e$.
    This becomes $b_0 = 1_S^T X_s$ for $1_L^T M_L = 1_L^T D_L \Psi_e = 1_S^T$.
    The latter requirement can be compared to the sum of amplitudes requirement sometimes used in panning like VBAP. Empirically it can be seen that this can be achieved in good approximation for very symmetric spherical speaker setups with $D_L = \Psi_e^{-1}$, because there we find: $1_L^T D_L \approx [1, 0, 0, \ldots, 0]$ and $1_L^T D_L \Psi_e \approx [Y_0^0(\Omega_1), \ldots, Y_0^0(\Omega_S)] = 1_S^T$.
    The amplitude requirement can then be reached within necessary accuracy.
    This also ensures that the energy requirement for the sum signal can be met:
    The energy sum in spatial domain is given by $1_L^T W_L W_L^T 1_L = 1_L^T M_L X_s X_s^T M_L^T 1_L$, which would become in good approximation $1_S^T X_s X_s^T 1_S$, the existence of an ideal symmetric speaker setup required.
    This leads to the requirement: $1_L^T D_L = [1, 0, 0, \ldots, 0]$, and in addition from the signal model we can conclude that the top row of $D_L^{-1}$ needs to be [1,1,1,1,..] (i.e. a vector of length L with "one" elements) in order that the re-encoded order zero signal maintains amplitude and energy.
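  • The second requirement can be illustrated numerically for N=1 with the matrix of Tab.1 b). In the following Python sketch, the number of sources, their random directions and signals, and the helper `mode_vector_n1` (hard-coded N3D harmonics of order 1) are assumptions for illustration. The sum over the virtual loudspeaker signals matches the zeroth order channel up to the rounding of the table values.

```python
import numpy as np

def mode_vector_n1(theta, phi):
    """N3D real spherical harmonics up to order 1 in ACN order."""
    s3 = np.sqrt(3.0)
    return np.array([1.0,
                     s3 * np.sin(theta) * np.sin(phi),
                     s3 * np.cos(theta),
                     s3 * np.sin(theta) * np.cos(phi)])

# D_L for N=1, copied from Tab.1 b)
D_L = np.array([
    [0.2500, -0.0000,  0.4082, -0.1443],
    [0.2500,  0.0000, -0.0000,  0.4330],
    [0.2500,  0.3536, -0.2041, -0.1443],
    [0.2500, -0.3536, -0.2041, -0.1443],
])

rng = np.random.default_rng(1)
S, tau = 3, 16                                    # sources and block length (assumed)
theta_s = rng.uniform(0.0, np.pi, S)              # random source inclinations
phi_s = rng.uniform(-np.pi, np.pi, S)             # random source azimuths
Psi_e = np.stack([mode_vector_n1(t, p) for t, p in zip(theta_s, phi_s)], axis=1)
X_s = rng.standard_normal((S, tau))               # directional signals

B = Psi_e @ X_s                                   # signal model B = Psi_e X_s
b0 = B[0]                                         # zeroth order channel
direct_sum = X_s.sum(axis=0)                      # 1_S^T X_s
spatial_sum = (D_L @ B).sum(axis=0)               # 1_L^T W_L
```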
  • Third, energy preservation is a prerequisite: The energy of signal $x_s \in \mathbb{R}^{1 \times \tau}$ should be preserved after conversion to HOA and spatial rendering to loudspeakers, independent of the signal's direction $\Omega_s$. This leads to $\|D_L \varphi(\Omega_s)\|_2^2 = 1$.
    This can be achieved by modelling $D_L$ from rotation matrices and a diagonal gain matrix: $D_L = U V^T \mathrm{diag}(a)$ (the dependency on the direction ($\Omega_s$) was removed for clarity): $\|D_L \varphi\|_2^2 = \varphi^T D_L^T D_L \varphi = \varphi^T \mathrm{diag}(a)\, V U^T U V^T \mathrm{diag}(a)\, \varphi = \varphi^T \mathrm{diag}(a)^2 \varphi = \sum_{o=1}^{(N+1)^2} a_o^2 \varphi_o^2 = 1$
    For Spherical harmonics $\varphi_o^2 = (Y_n^m)^2(\Omega_s) = 1$, so all gains $a_o^2$ related to $\|D_L\|_{fro}^2 = \sum_{o=1}^{(N+1)^2} a_o^2 = 1$ would satisfy the equation. If all gains are selected equal, this leads to $a_o^2 = (N+1)^{-2}$.
    (The requirement $V V^T = 1$ can be achieved for $L \geq (N+1)^2$ and only be approximated for $L < (N+1)^2$.)
  • This leads to the requirement: $D_L^T D_L = \mathrm{diag}(a^2)$ with $\sum_{o=1}^{(N+1)^2} a_o^2 = 1$.
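  • The equal-gain case $a_o^2 = (N+1)^{-2}$ can be checked with a small Python sketch for N=1. The random orthogonal matrices standing in for U and V, the random test directions and the helper `mode_vector_n1` are assumptions of this sketch. The check relies on the N3D property that the squared elements of a mode vector sum to $(N+1)^2$ for every direction.

```python
import numpy as np

def mode_vector_n1(theta, phi):
    """N3D real spherical harmonics up to order 1 in ACN order."""
    s3 = np.sqrt(3.0)
    return np.array([1.0,
                     s3 * np.sin(theta) * np.sin(phi),
                     s3 * np.cos(theta),
                     s3 * np.sin(theta) * np.cos(phi)])

N = 1
K = (N + 1) ** 2
rng = np.random.default_rng(2)

# Random orthogonal stand-ins for U and V (via QR decomposition)
U, _ = np.linalg.qr(rng.standard_normal((K, K)))
V, _ = np.linalg.qr(rng.standard_normal((K, K)))
a = np.full(K, 1.0 / (N + 1))            # equal gains, a_o^2 = (N+1)^-2
D_L = U @ V.T @ np.diag(a)

# Rendered energy of a unit-amplitude source for several random directions
energies = []
for _ in range(5):
    theta = rng.uniform(0.0, np.pi)
    phi = rng.uniform(-np.pi, np.pi)
    energies.append(np.sum((D_L @ mode_vector_n1(theta, phi)) ** 2))
energies = np.array(energies)

fro_sq = np.linalg.norm(D_L, 'fro') ** 2  # sum of a_o^2
```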
  • As an example, a case with ideal spherical positions (for HOA orders N=1 to N=3) is described in the following (Tabs.1-3). Ideal spherical positions for further HOA orders (N=4 to N=6) are described further below (Tabs.4-6). All the below-mentioned positions are derived from modified positions published by Jörg Fliege in "Integration nodes for the sphere", http://www.mathematik.uni-dortmund.de/Isx/research/projects/fliege/nodes/-nodes.html, 2010. Online, accessed 2010-10-05. The method to derive these positions and related quadrature/cubature gains was published in Jörg Fliege and Ulrike Maier. "A two-stage approach for computing cubature formulae for the sphere", Technical report, Fachbereich Mathematik, Universität Dortmund, 1999. In these tables, the azimuth is measured counter-clockwise from frontal direction related to the listening position and the inclination is measured from the z-axis with an inclination of 0 being above the listening position.
  • The term numerical quadrature is often abbreviated to quadrature and is quite a synonym for numerical integration, especially as applied to 1-dimensional integrals. Numerical integration over more than one dimension is called cubature herein.
  • N=1 Positions
  • Spherical position $\Omega_l$
    Inclination θ / rad Azimuth φ / rad Quadrature gains
    0.33983655 3.14159265 3.14159271
    1.57079667 0.00000000 3.14159267
    2.06167886 1.95839324 3.14159262
    2.06167892 -1.95839316 3.14159262
    a)
    D_L:
    0.2500 -0.0000 0.4082 -0.1443
    0.2500 0.0000 -0.0000 0.4330
    0.2500 0.3536 -0.2041 -0.1443
    0.2500 -0.3536 -0.2041 -0.1443
    b)
    Tab.1: a) Spherical positions of virtual loudspeakers for HOA order N=1, and b) resulting rendering matrix for spatial transform (DSHT)
    N=2 Positions
  • Spherical position $\Omega_l$
    Inclination θ / rad Azimuth φ / rad Quadrature gains
    1.57079633 0.00000000 1.41002219
    2.35131567 3.14159265 1.36874571
    1.21127801 -1.18149779 1.36874584
    1.21127606 1.18149755 1.36874598
    1.31812905 -2.45289512 1.41002213
    0.00975782 -0.00009218 1.41002214
    1.31812792 2.45289621 1.41002230
    2.41880319 1.19514740 1.41002223
    2.41880555 -1.19514441 1.41002209
    a)
    D_L:
    0.1117 0.0000 0.0067 0.2001 0.0000 -0.0000 -0.0931 -0.0078 0.2235
    0.1099 -0.0000 -0.1237 -0.1249 -0.0000 0.0000 0.0486 0.2399 0.0889
    0.1099 -0.1523 0.0619 0.0625 -0.1278 -0.1266 -0.0850 0.0841 -0.1455
    0.1099 0.1523 0.0619 0.0625 0.1278 0.1266 -0.0850 0.0841 -0.1455
    0.1117 -0.1272 0.0450 -0.1479 0.1938 -0.0427 -0.0898 -0.1001 0.0350
    0.1117 -0.0000 0.2001 0.0086 0.0000 -0.0000 0.2402 -0.0040 0.0310
    0.1117 0.1272 0.0450 -0.1479 -0.1938 0.0427 -0.0898 -0.1001 0.0350
    0.1117 0.1272 -0.1484 0.0436 0.0408 -0.1942 0.0769 -0.0982 -0.0612
    0.1117 -0.1272 -0.1484 0.0436 -0.0408 0.1942 0.0769 -0.0982 -0.0612
    b)
    Tab.2: a) Spherical positions of virtual loudspeakers for HOA order N=2 and b) resulting rendering matrix for spatial transform (DSHT)
  • N=3 Positions
  • Spherical position $\Omega_l$
    Inclination θ / rad Azimuth φ / rad Quadrature gains
    0.49220083 0.00000000 0.75567412
    1.12054210 -0.87303924 0.75567398
    2.52370429 -0.05517088 0.75567401
    2.49233024 -2.15479457 0.87457076
    1.57082248 0.00000000 0.87457075
    2.02713647 1.01643753 0.75567388
    1.61486095 -2.60674413 0.75567396
    2.02713675 -1.01643766 0.75567398
    1.08936018 2.89490077 0.75567412
    1.18114721 0.89523032 0.75567399
    0.65554353 1.89029902 0.75567382
    1.60934762 1.91089719 0.87457082
    2.68498672 2.02012831 0.75567392
    1.46575084 -1.76455426 0.75567402
    0.58248614 -2.22170415 0.87457060
    2.00306837 2.81329239 0.75567389
    a)
    D_L:
    0.061457 -0.000075 0.093499 0.050400 -0.000027 0.000060 0.091035 0.098988 0.026750 0.019405 0.001461 0.003133 0.065741 0.124248 0.086602 0.029345
    0.061457 -0.073257 0.046432 0.061316 -0.094748 -0.071487 -0.029426 0.059688 -0.016892 -0.055360 -0.097812 -0.010980 -0.082425 -0.007027 -0.048502 -0.080998
    0.061457 -0.003584 -0.086661 0.061312 -0.004319 0.006362 0.068273 -0.111895 0.039506 0.008330 0.001142 -0.027428 -0.044323 0.125349 -0.097700 0.021534
    0.065628 -0.057573 -0.090918 -0.038050 0.042921 0.102558 0.066570 0.067780 -0.018289 0.008866 -0.087449 -0.104655 -0.011720 -0.061567 0.025778 0.023749
    0.065628 -0.000000 -0.000003 0.114142 -0.000000 0.000000 -0.073690 -0.000007 0.127634 0.002742 0.000000 0.010620 0.012464 -0.093807 0.009642 0.121106
    0.061457 0.081011 -0.046687 0.050396 0.085735 -0.079893 -0.028706 -0.049469 -0.042390 0.016897 -0.101358 0.003784 0.101201 -0.012537 0.040833 -0.076613
    0.061457 -0.054202 -0.004471 -0.091238 0.104013 0.005102 -0.068089 0.008829 0.056943 -0.149185 0.004553 0.050065 0.007556 0.060425 -0.003395 -0.002394
    0.061457 -0.080936 -0.046816 0.050396 -0.085707 0.079834 -0.028795 -0.049516 -0.042442 -0.030388 0.099898 0.015986 0.082103 -0.014540 0.065488 -0.078162
    0.061457 0.023227 0.049179 -0.091237 -0.044356 0.023858 -0.024641 -0.094498 0.082023 0.072649 -0.042376 -0.007211 -0.082403 0.008618 0.112746 -0.042512
    0.061457 0.076842 0.040224 0.061316 0.099067 0.065125 -0.038969 0.052207 -0.022402 0.028674 0.096668 -0.032684 -0.098253 -0.008594 -0.028068 -0.082210
    0.061457 0.061293 0.084298 -0.020472 -0.026210 0.108838 0.060891 -0.036183 -0.035381 -0.026726 -0.058661 0.111083 0.035312 -0.053574 -0.087737 0.014123
    0.065628 0.107524 -0.004399 -0.038047 -0.080156 -0.009268 -0.073361 0.003280 -0.099081 -0.064714 0.014164 -0.085660 -0.004839 0.038775 0.016889 0.101473
    0.061457 0.042357 -0.095230 -0.020477 -0.018235 -0.084766 0.096995 0.040799 -0.014532 -0.025100 0.058531 0.110659 -0.076710 -0.053780 0.056883 0.013978
    0.061457 -0.103651 0.010933 -0.020474 0.044445 -0.024073 -0.066259 -0.004608 -0.108789 0.127480 0.000140 0.071265 -0.019816 0.026559 -0.016573 0.076201
    0.065628 -0.049951 0.095320 -0.038045 0.037235 -0.093290 0.080481 -0.071053 -0.010264 -0.018490 0.073275 -0.097597 0.032029 -0.080959 -0.030699 0.008722
    0.061457 0.030975 -0.044701 -0.091239 -0.059658 -0.028961 -0.032307 0.085658 0.077606 0.084920 0.037824 -0.010382 0.084083 0.002412 -0.102187 -0.047341
    b)
    Tab.3: a) Spherical positions of virtual loudspeakers for HOA order N=3 and b) resulting rendering matrix for spatial transform (DSHT)
  • Typical application scenarios to apply DRC gains to HOA signals are shown in Fig.5. For mixed content applications like HOA plus Audio Objects, DRC gain application can be realized in at least two ways for flexible rendering. Fig.6 shows exemplarily Dynamic Range Compression (DRC) processing at the decoder side. In Fig.6 a), DRC is applied before rendering and mixing. In Fig.6 b), DRC is applied to the loudspeaker signals, i.e. after rendering and mixing.
  • In Fig.6a), DRC gains are applied to Audio Objects and HOA separately: DRC gains are applied to Audio Objects in an Audio Object DRC block 610, and DRC gains are applied to HOA in a HOA DRC block 615. Here the realization of the HOA DRC block 615 matches one of those in Fig.5. In Fig.6b), a single gain is applied to all channels of the mixture signal of the rendered HOA and rendered Audio Object signal. Here no spatial emphasis and attenuation is possible. The related DRC gain cannot be created by analyzing the sum signal of the rendered mix, because the speaker layout of the consumer site is not known at the time of creation at the broadcast or content creation site. The DRC gain can be derived by analyzing $y_m \in \mathbb{R}^{1 \times \tau}$, where $y_m$ is a mix of the zeroth order HOA signal $b_0$ and the mono downmix of S Audio Objects: $y_m = b_0 + \sum_{s=1}^{S} x_s$.
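  • The analysis signal $y_m$ for the case of Fig.6b) can be formed as in the following Python sketch. The block length, the number of objects and the random stand-in data are assumptions; only the summation itself is taken from the formula above.

```python
import numpy as np

rng = np.random.default_rng(3)
tau, S = 32, 2                             # block length and object count (assumed)
B = rng.standard_normal((4, tau))          # stand-in HOA block for N=1
X_obj = rng.standard_normal((S, tau))      # stand-in Audio Object signals x_s

b0 = B[0]                                  # zeroth order HOA channel
y_m = b0 + X_obj.sum(axis=0)               # y_m = b_0 + sum over all x_s
```

    The resulting one-channel signal would then be fed to the DRC analyzer in place of a rendered loudspeaker mix.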
  • In the following, further details of the disclosed solution are described.
  • DRC for HOA Content
  • DRC is applied to the HOA signal before rendering and may be combined with rendering. DRC for HOA can be applied in time domain or in QMF-filter bank domain.
  • DRC in Time Domain
  • The DRC decoder shall provide $(N+1)^2$ gain values $g_{drc} = [g_1, \ldots, g_{(N+1)^2}]^T$ according to the number of HOA coefficient channels of the HOA signal c. N is the HOA order. Application of DRC gains to the HOA signals: $c_{drc} = D_{DSHT}^{-1}\, \mathrm{diag}(g_{drc})\, D_{DSHT}\, c$, where c is a vector of one time sample of HOA coefficients, $c \in \mathbb{R}^{(N+1)^2 \times 1}$, and $D_{DSHT} \in \mathbb{R}^{(N+1)^2 \times (N+1)^2}$ and its inverse $D_{DSHT}^{-1}$ are matrices related to a Discrete Spherical Harmonics Transform (DSHT) optimized for DRC purposes.
    In one embodiment, to decrease the computational load by $(N+1)^4$ operations per sample, it can be advantageous to include the rendering step and calculate the loudspeaker signals directly by: $w_{drc} = (D\, D_{DSHT}^{-1})\, \mathrm{diag}(g_{drc})\, D_{DSHT}\, c$, where D is the rendering matrix and $(D\, D_{DSHT}^{-1})$ can be pre-computed.
    If all gains $g_1, \ldots, g_{(N+1)^2}$ have the same value of $g_{drc}$, as in the simplified mode, a single gain group has been used to transmit the coder DRC gains. This case can be flagged by the DRC decoder, because in this case the calculation in the spatial filter is not needed, so that the calculation simplifies to: $c_{drc} = g_{drc}\, c$.
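  • A minimal Python sketch of the time-domain processing per sample is given below, using the N=1 matrix of Tab.1 b) as $D_{DSHT}$. The two-channel rendering matrix D is a random placeholder, and the function names and the `simplified` flag are assumptions made for illustration; they do not describe a normative decoder interface.

```python
import numpy as np

# D_DSHT for N=1, copied from Tab.1 b)
D_DSHT = np.array([
    [0.2500, -0.0000,  0.4082, -0.1443],
    [0.2500,  0.0000, -0.0000,  0.4330],
    [0.2500,  0.3536, -0.2041, -0.1443],
    [0.2500, -0.3536, -0.2041, -0.1443],
])

def apply_drc(c, g_drc, simplified):
    """Apply DRC gains to one HOA sample vector c."""
    if simplified:
        return g_drc[0] * c                          # c_drc = g_drc * c
    # c_drc = D_DSHT^{-1} diag(g_drc) D_DSHT c
    return np.linalg.solve(D_DSHT, g_drc * (D_DSHT @ c))

rng = np.random.default_rng(4)
D = rng.standard_normal((2, 4))                      # placeholder rendering matrix (assumption)
D_comb = D @ np.linalg.inv(D_DSHT)                   # D D_DSHT^{-1}, pre-computed once

def render_with_drc(c, g_drc):
    """Apply DRC and render to loudspeakers in one step."""
    return D_comb @ (g_drc * (D_DSHT @ c))

c = rng.standard_normal(4)                           # one HOA sample (stand-in data)
g_multi = np.array([1.0, 1.0, 0.5, 0.5])             # non-simplified mode example
g_single = np.full(4, 0.5)                           # simplified mode example

w_combined = render_with_drc(c, g_multi)
w_two_step = D @ apply_drc(c, g_multi, simplified=False)
c_fast = apply_drc(c, g_single, simplified=True)
c_full = apply_drc(c, g_single, simplified=False)
```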
  • Calculation of DSHT matrices for DRC
  • In the following, $D_L$ is renamed to $D_{DSHT}$. The matrices to determine the spatial filter $D_{DSHT}$ and its inverse $D_{DSHT}^{-1}$ are calculated as follows:
    A set of spherical positions $\mathcal{D}_{DSHT} = [\Omega_1, \ldots, \Omega_l, \ldots, \Omega_{(N+1)^2}]$ with $\Omega_l = [\theta_l, \phi_l]^T$ and related quadrature (cubature) gains $q \in \mathbb{R}^{(N+1)^2 \times 1}$ are selected, indexed by the HOA order N, from Tables 1-4. A mode matrix $\Psi_{DSHT}$ related to these positions is calculated (see above). A first prototype matrix is calculated by $\tilde{D}_1 = \mathrm{diag}(q)\, \Psi_{DSHT} / (N+1)^2$ (the division by $(N+1)^2$ can be skipped due to a subsequent normalization). A compact singular value decomposition is performed, $\tilde{D}_1 = U S V^T$, and a new prototype matrix is calculated by: $\hat{\tilde{D}}_2 = U V^T$.
    This matrix is normalized by: $\check{D}_2 = \hat{\tilde{D}}_2 / \|\hat{\tilde{D}}_2\|_{fro}$.
    A row-vector e is calculated by $e = -\dfrac{1_L^T \check{D}_2 - [1, 0, 0, \ldots, 0]}{(N+1)^2}$, where [1,0,0,..,0] is a row vector of $(N+1)^2$ all zero elements except for the first element with a value of one. $1_L^T \check{D}_2$ denotes the sum of rows of $\check{D}_2$.
    The optimized DSHT matrix $D_{DSHT}$ is now derived by: $D_{DSHT} = \check{D}_2 + [e^T, e^T, e^T, \ldots]^T$. It has been found that, if erroneously −e is used instead of e, the invention provides slightly worse but still usable results.
  • For DRC in QMF-filter bank domain, the following applies.
  • The DRC decoder provides a gain value $g_{ch}(n, m)$ for every time frequency tile n, m for $(N+1)^2$ spatial channels. The gains for time slot n and frequency band m are arranged in $g(n, m) \in \mathbb{R}^{(N+1)^2 \times 1}$.
    Multiband DRC is applied in QMF filter bank domain. The processing steps are shown in Fig.7. The reconstructed HOA signal is transformed into spatial domain by (inverse DSHT): $W_{DSHT} = D_{DSHT}\, C$, where $C \in \mathbb{R}^{(N+1)^2 \times \tau}$ is a block of τ HOA samples and $W_{DSHT} \in \mathbb{R}^{(N+1)^2 \times \tau}$ is a block of spatial samples matching the input time granularity of the QMF filter bank. Then the QMF analysis filter bank is applied. Let $\hat{w}_{DSHT}(n, m) \in \mathbb{C}^{(N+1)^2 \times 1}$ denote a vector of spatial channels per time frequency tile (n, m). Then the DRC gains are applied: $w_{DRC}(n, m) = \mathrm{diag}(g(n, m))\, \hat{w}_{DSHT}(n, m)$.
    To minimize the computational complexity, the DSHT and rendering to loudspeaker channels are combined: $w(n, m) = D\, D_{DSHT}^{-1}\, w_{DRC}(n, m)$, where D denotes the HOA rendering matrix. The QMF signals then can be fed to the mixer for further processing.
    Fig.7 shows DRC for HOA in the QMF domain combined with a rendering step.
    If only a single gain group for DRC has been used, this should be flagged by the DRC decoder, because again computational simplifications are possible. In this case the gains in vector $g(n, m)$ all share the same value of $g_{DRC}(n, m)$. The QMF filter bank can be directly applied to the HOA signal and the gain $g_{DRC}(n, m)$ can be multiplied in filter bank domain.
  • Fig.8 shows DRC for HOA in the QMF domain (a filter domain of a Quadrature Mirror Filter) combined with a rendering step, with computational simplifications for the simple case of a single DRC gain group.
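  • The filter bank domain processing of Fig.7 and its single-gain-group simplification of Fig.8 can be sketched in Python as follows. No actual QMF bank is implemented: random complex tiles stand in for the QMF analysis output, and the transform into the spatial domain is applied to these tiles directly, which assumes that the filter bank is linear and operates per channel. The rendering matrix D, the tile counts and all names are likewise assumptions for illustration.

```python
import numpy as np

# D_DSHT for N=1, copied from Tab.1 b)
D_DSHT = np.array([
    [0.2500, -0.0000,  0.4082, -0.1443],
    [0.2500,  0.0000, -0.0000,  0.4330],
    [0.2500,  0.3536, -0.2041, -0.1443],
    [0.2500, -0.3536, -0.2041, -0.1443],
])

rng = np.random.default_rng(5)
n_slots, n_bands = 4, 8                               # tile grid (assumed)
shape = (4, n_slots, n_bands)

# Stand-in for QMF-domain HOA tiles: channel x time slot x frequency band
C_qmf = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

D = rng.standard_normal((2, 4))                       # placeholder rendering matrix
D_comb = D @ np.linalg.inv(D_DSHT)                    # D D_DSHT^{-1}, pre-computed

# General case (Fig.7): per-channel, per-tile gains in the spatial domain
W_qmf = np.einsum('lc,cnm->lnm', D_DSHT, C_qmf)       # spatial channels per tile
g = rng.uniform(0.25, 1.0, shape)                     # g(n, m) for every spatial channel
w_out = np.einsum('lc,cnm->lnm', D_comb, g * W_qmf)   # DRC, inverse DSHT and rendering

# Single gain group (Fig.8): one gain per tile, applied in the HOA domain
g_single = rng.uniform(0.25, 1.0, (n_slots, n_bands))
w_out_simple = np.einsum('lc,cnm->lnm', D, g_single * C_qmf)

# Reference: the same single gain routed through the general path
w_out_general = np.einsum('lc,cnm->lnm', D_comb, g_single[None] * W_qmf)
```

    In the single-gain-group case both paths give the same loudspeaker tiles, so the spatial transform can be skipped.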
  • As has become apparent in view of the above, in one embodiment the invention relates to a method for performing DRC on a HOA signal, the method comprising steps of setting or determining a mode, the mode being either a simplified mode or a non-simplified mode, in the non-simplified mode, transforming the HOA signal to the spatial domain, wherein an inverse DSHT is used, in the non-simplified mode, analyzing the transformed HOA signal, and in the simplified mode, analyzing the HOA signal, obtaining, from results of said analyzing, one or more gain factors that are usable for dynamic range compression, wherein only one gain factor is obtained in the simplified mode and wherein two or more different gain factors are obtained in the non-simplified mode, in the simplified mode multiplying the obtained gain factor with the HOA signal, wherein a gain compressed HOA signal is obtained, in the non-simplified mode, multiplying the obtained gain factors with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained, and transforming the gain compressed transformed HOA signal back into the HOA domain, wherein a gain compressed HOA signal is obtained.
  • In one embodiment, the method further comprises before said multiplying the obtained factors, transmitting the HOA signals together with the obtained gain factor or gain factors.
  • In one embodiment, the HOA signal is divided into frequency subbands, and the steps of analysing the HOA signal (or transformed HOA signal), obtaining one or more gain factors, multiplying the obtained gain factor(s) with the HOA signal (or transformed HOA signal), and transforming the gain compressed transformed HOA signal back into the HOA domain are applied to each frequency subband separately, with individual gains per subband. It is noted that the sequential order of dividing the HOA signal into frequency subbands and transforming the HOA signal to the spatial domain can be swapped, and/or the sequential order of synthesizing the subbands and transforming the gain compressed transformed HOA signals back into the HOA domain can be swapped, independently from each other.
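The remark that the order of subband division and spatial transform can be swapped follows from both operations being linear. A minimal sketch, assuming a hypothetical FFT-mask band split (`split_bands`) in place of a real analysis filter bank such as a QMF bank, and caller-supplied matrices `to_spatial` and `to_hoa`:

```python
import numpy as np

def split_bands(x, num_bands):
    """Hypothetical stand-in for an analysis filter bank: split each channel
    (row of x) into num_bands signals by masking disjoint FFT bin ranges.
    Summing the returned bands reconstructs x."""
    spectrum = np.fft.rfft(x, axis=-1)
    edges = np.linspace(0, spectrum.shape[-1], num_bands + 1).astype(int)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        masked = np.zeros_like(spectrum)
        masked[..., lo:hi] = spectrum[..., lo:hi]
        bands.append(np.fft.irfft(masked, n=x.shape[-1], axis=-1))
    return np.stack(bands)  # shape (num_bands, channels, samples)

def drc_subbands(hoa, to_spatial, to_hoa, gains, split_first):
    """Apply per-subband, per-channel gains; gains has shape (num_bands, channels)."""
    num_bands = gains.shape[0]
    if split_first:
        # Divide into subbands first, then transform each subband.
        spatial_bands = to_spatial @ split_bands(hoa, num_bands)
    else:
        # Transform first, then divide into subbands.
        spatial_bands = split_bands(to_spatial @ hoa, num_bands)
    compressed = gains[:, :, None] * spatial_bands
    # Synthesize the subbands by summation, then return to the HOA domain.
    return to_hoa @ compressed.sum(axis=0)
```

Either value of `split_first` yields the same output up to rounding, so an implementation can choose whichever order is cheaper.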
  • In one embodiment the invention relates to a method for applying DRC gain factors to a HOA signal, the method comprising steps of: receiving a HOA signal together with an indication and one or more gain factors, the indication indicating either a simplified mode or a non-simplified mode, wherein only one gain factor is received if the indication indicates the simplified mode; selecting either a simplified mode or a non-simplified mode according to said indication; in the simplified mode, multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained; and in the non-simplified mode, transforming the HOA signal into the spatial domain, wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signals, wherein dynamic range compressed transformed HOA signals are obtained, and transforming the dynamic range compressed transformed HOA signals back into the HOA domain, wherein a dynamic range compressed HOA signal is obtained.
  • Further, in one embodiment the invention relates to a device for performing DRC on a HOA signal, the device comprising a processor or one or more processing elements adapted for: setting or determining a mode, the mode being either a simplified mode or a non-simplified mode; in the non-simplified mode, transforming the HOA signal to the spatial domain, wherein an inverse DSHT is used; in the non-simplified mode, analyzing the transformed HOA signal, while in the simplified mode analyzing the HOA signal; obtaining, from results of said analyzing, one or more gain factors that are usable for dynamic range compression, wherein only one gain factor is obtained in the simplified mode and wherein two or more different gain factors are obtained in the non-simplified mode; in the simplified mode, multiplying the obtained gain factor with the HOA signal, wherein a gain compressed HOA signal is obtained; and in the non-simplified mode, multiplying the obtained gain factors with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained, and transforming the gain compressed transformed HOA signal back into the HOA domain, wherein a gain compressed HOA signal is obtained.
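On the receiving side, the mode indication decides whether any transform is needed. A minimal sketch, assuming a hypothetical container `DrcFrame` for the received data and caller-supplied, mutually inverse matrices `to_spatial` and `to_hoa`; none of these names come from the text.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class DrcFrame:
    """Hypothetical container for what the receiver gets."""
    hoa: np.ndarray     # HOA coefficients, shape (num_coeffs, num_samples)
    simplified: bool    # the received mode indication
    gains: np.ndarray   # one gain (simplified) or one gain per spatial channel

def apply_drc(frame, to_spatial, to_hoa):
    """Return the dynamic range compressed HOA signal for one frame."""
    if frame.simplified:
        if frame.gains.size != 1:
            raise ValueError("simplified mode expects exactly one gain factor")
        # Simplified mode: scale the HOA coefficients directly.
        return frame.gains.item() * frame.hoa
    # Non-simplified mode: transform, apply per-channel gains, transform back.
    spatial = to_spatial @ frame.hoa
    return to_hoa @ (frame.gains[:, None] * spatial)
```

The explicit check on the number of gains mirrors the condition that only one gain factor is received when the simplified mode is indicated.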
  • In one embodiment for non-simplified mode only, a device for performing DRC on a HOA signal comprises a processor or one or more processing elements adapted for transforming the HOA signal to the spatial domain, analyzing the transformed HOA signal, obtaining, from results of said analyzing, gain factors that are usable for dynamic range compression, multiplying the obtained factors with the transformed HOA signals, wherein gain compressed transformed HOA signals are obtained, and transforming the gain compressed transformed HOA signals back into the HOA domain, wherein gain compressed HOA signals are obtained. In one embodiment, the device further comprises a transmission unit for transmitting, before multiplying the obtained gain factor or gain factors, the HOA signal together with the obtained gain factor or gain factors.
  • Also here it is noted that the sequential order of dividing the HOA signal into frequency subbands and transforming the HOA signal to the spatial domain can be swapped, and the sequential order of synthesizing the subbands and transforming the gain compressed transformed HOA signals back into the HOA domain can be swapped, independently from each other.
  • Further, in one embodiment the invention relates to a device for applying DRC gain factors to a HOA signal, the device comprising a processor or one or more processing elements adapted for: receiving a HOA signal together with an indication and one or more gain factors, the indication indicating either a simplified mode or a non-simplified mode, wherein only one gain factor is received if the indication indicates the simplified mode; setting the device to either a simplified mode or a non-simplified mode, according to said indication; in the simplified mode, multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained; and in the non-simplified mode, transforming the HOA signal into the spatial domain, wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signals, wherein dynamic range compressed transformed HOA signals are obtained, and transforming the dynamic range compressed transformed HOA signals back into the HOA domain, wherein a dynamic range compressed HOA signal is obtained.
  • In one embodiment, the device further comprises a transmission unit for transmitting, before multiplying the obtained gain factors, the HOA signals together with the obtained gain factors. In one embodiment, the HOA signal is divided into frequency subbands, and the operations of analyzing the transformed HOA signal, obtaining gain factors, multiplying the obtained gain factors with the transformed HOA signals, and transforming the gain compressed transformed HOA signals back into the HOA domain are applied to each frequency subband separately, with individual gains per subband.
  • In one embodiment of the device for applying DRC gain factors to a HOA signal, the HOA signal is divided into a plurality of frequency subbands, and obtaining one or more gain factors, multiplying the obtained gain factors with the HOA signals or the transformed HOA signals, and in the non-simplified mode transforming the gain compressed transformed HOA signals back into the HOA domain are applied to each frequency subband separately, with individual gains per subband.
  • Further, in one embodiment where only the non-simplified mode is used, the invention relates to a device for applying DRC gain factors to a HOA signal, the device comprising a processor or one or more processing elements adapted for receiving a HOA signal together with gain factors, transforming the HOA signal into the spatial domain (using iDSHT), wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signal, wherein a dynamic range compressed transformed HOA signal is obtained, and transforming the dynamic range compressed transformed HOA signal back into the HOA domain (i.e. coefficient domain) (using DSHT), wherein a dynamic range compressed HOA signal is obtained.
  • While there has been shown, described, and pointed out fundamental novel features of the present invention as applied to preferred embodiments thereof, it will be understood that various omissions and substitutions and changes in the apparatus and method described, in the form and details of the devices disclosed, and in their operation, may be made by those skilled in the art without departing from the spirit of the present invention. It is expressly intended that all combinations of those elements that perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Substitutions of elements from one described embodiment to another are also fully intended and contemplated.
  • The following tables list spherical positions of virtual loudspeakers for HOA of order N with N=4, 5 or 6.
  • N=4 Positions
  • Tab.4: Spherical positions of virtual loudspeakers for HOA order N=4
    Inclination [rad]  Azimuth [rad]  Gain
    1.57079633 0.00000000 0.52689274
    2.39401407 0.00000000 0.48518011
    1.14059283 -1.75618245 0.52688432
    1.33721851 0.69215601 0.47027816
    1.72512898 -1.33340585 0.48037442
    1.17406779 -0.79850952 0.51130478
    0.69042674 1.07623171 0.50662254
    1.47478735 1.43953896 0.52158458
    1.67073876 2.25235428 0.52835300
    2.52745842 -1.33179653 0.52388165
    1.81037110 3.05783641 0.49800736
    1.91827560 -2.03351312 0.48516540
    0.27992161 2.55302196 0.50663531
    0.47981675 -1.18580204 0.50824199
    2.37644317 2.52383590 0.45807408
    0.98508365 2.03459671 0.47260252
    2.18924206 1.58232601 0.49801422
    1.49441825 -2.58932194 0.51745117
    2.04428895 0.76615262 0.51744164
    2.43923726 -2.63989327 0.52146074
    1.10308418 2.88498471 0.52158484
    0.78489181 -2.54224201 0.47027748
    2.96802845 1.25258904 0.52145388
    1.91816652 -0.63874484 0.48036020
    0.80829458 -0.00991977 0.50824345
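As a quick plausibility check on Tab.4, the gains can be summed and the positions converted to unit vectors. The sketch below assumes the usual convention that inclination is measured from the z axis; the names `TAB4` and `to_cartesian` are illustrative only. The gains add up to approximately 4π, the surface area of the unit sphere, which is what one would expect of quadrature gains.

```python
import math

# (inclination [rad], azimuth [rad], gain) rows copied from Tab.4 (N=4).
TAB4 = [
    (1.57079633, 0.00000000, 0.52689274),
    (2.39401407, 0.00000000, 0.48518011),
    (1.14059283, -1.75618245, 0.52688432),
    (1.33721851, 0.69215601, 0.47027816),
    (1.72512898, -1.33340585, 0.48037442),
    (1.17406779, -0.79850952, 0.51130478),
    (0.69042674, 1.07623171, 0.50662254),
    (1.47478735, 1.43953896, 0.52158458),
    (1.67073876, 2.25235428, 0.52835300),
    (2.52745842, -1.33179653, 0.52388165),
    (1.81037110, 3.05783641, 0.49800736),
    (1.91827560, -2.03351312, 0.48516540),
    (0.27992161, 2.55302196, 0.50663531),
    (0.47981675, -1.18580204, 0.50824199),
    (2.37644317, 2.52383590, 0.45807408),
    (0.98508365, 2.03459671, 0.47260252),
    (2.18924206, 1.58232601, 0.49801422),
    (1.49441825, -2.58932194, 0.51745117),
    (2.04428895, 0.76615262, 0.51744164),
    (2.43923726, -2.63989327, 0.52146074),
    (1.10308418, 2.88498471, 0.52158484),
    (0.78489181, -2.54224201, 0.47027748),
    (2.96802845, 1.25258904, 0.52145388),
    (1.91816652, -0.63874484, 0.48036020),
    (0.80829458, -0.00991977, 0.50824345),
]

def to_cartesian(inclination, azimuth):
    """Unit vector for a direction, with inclination measured from the z axis."""
    return (
        math.sin(inclination) * math.cos(azimuth),
        math.sin(inclination) * math.sin(azimuth),
        math.cos(inclination),
    )

gain_sum = sum(gain for _, _, gain in TAB4)
directions = [to_cartesian(incl, az) for incl, az, _ in TAB4]

print(f"number of positions: {len(TAB4)}")   # (N+1)^2 = 25 for N=4
print(f"sum of gains: {gain_sum:.6f}, 4*pi: {4 * math.pi:.6f}")
```

The same check can be repeated for Tab.5 and Tab.6, which should hold (N+1)^2 = 36 and 49 positions respectively.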
  • N=5 Positions
  • Tab.5: Spherical positions of virtual loudspeakers for HOA order N=5
    Inclination [rad]  Azimuth [rad]  Gain
    1.57079633 0.00000000 0.34493574
    2.68749293 3.14159265 0.35131373
    1.92461621 -1.22481468 0.35358151
    1.95917092 3.06534485 0.36442231
    2.18883411 0.08893301 0.36437350
    0.35664531 -2.15475973 0.33953855
    1.32915731 -1.05408340 0.35358417
    2.21829206 2.45308518 0.33534647
    1.00903070 2.31872053 0.34739607
    0.99455136 -2.29370294 0.36437101
    1.13601102 -0.46303195 0.33534542
    0.41863640 0.63541391 0.35131934
    1.78596913 -0.56826765 0.34739591
    0.56658255 -0.66284593 0.36441956
    2.25292410 0.89044754 0.36437098
    2.67263757 -1.71236120 0.36442208
    0.86753981 -1.50749854 0.34068122
    1.38158330 1.72190554 0.35358401
    0.98578154 0.23428465 0.35131950
    1.45079827 -1.69748851 0.34739437
    2.09223697 -1.85025366 0.33534659
    2.62854417 1.70110685 0.34494256
    1.44817433 -2.83400771 0.33953463
    2.37827410 -0.72817212 0.34068529
    0.82285875 1.51124182 0.33534531
    0.40679748 2.38217051 0.34493552
    0.84332549 -3.07860398 0.36437337
    1.38947809 2.83246237 0.34068522
    1.61795773 -2.27837285 0.34494274
    2.17389505 -2.58540735 0.35131361
    1.65172710 2.28105193 0.35358166
    1.67862104 0.57097606 0.33953819
    2.02514031 1.70739195 0.34739443
    1.12965858 0.89802542 0.36442004
    2.82979093 0.17840931 0.33953488
    1.67550339 1.18664952 0.34068114
  • N=6 Positions
  • Tab.6: Spherical positions of virtual loudspeakers for HOA order N=6
    Inclination [rad]  Azimuth [rad]  Gain
    1.57079633 0.00000000 0.23821170
    2.42144792 0.00000000 0.23821175
    0.32919895 2.78993083 0.26169552
    1.06225899 1.49243160 0.25534085
    1.01526896 -2.16495206 0.25092628
    1.10570423 -1.59180661 0.25099550
    1.47319543 1.14258135 0.26160776
    2.15414541 1.88359269 0.24442720
    0.20805372 -0.52863458 0.25487678
    0.50141101 -2.11057110 0.25619096
    1.98041218 0.28912378 0.26288225
    0.83752075 -2.81667891 0.25837996
    2.44130228 0.81495962 0.26772416
    1.21539727 -1.00788022 0.25534092
    2.62944184 -1.58354086 0.26437874
    1.86884674 -2.40686906 0.25619091
    0.68705554 -1.20612227 0.25576026
    1.52325470 -1.98940871 0.26169551
    2.39097364 -2.37336381 0.25576025
    0.98667678 0.86446728 0.26014219
    2.27078506 -3.06771779 0.25099551
    2.33605400 2.51674567 0.26455002
    1.29371004 2.03656562 0.25576032
    0.86334494 2.77720222 0.25092620
    1.94118355 -0.37820559 0.26772409
    2.10323413 -1.28283816 0.24442725
    1.87416330 0.80785741 0.23821179
    1.63423157 1.65277986 0.26437876
    2.06477636 1.31341296 0.25595469
    0.82305807 -0.47771423 0.26437883
    2.04154780 -1.85106655 0.25487677
    0.61285067 0.33640173 0.24442716
    1.08029340 0.10986230 0.25595472
    1.60164764 -1.43535015 0.26455000
    2.66513701 1.69643796 0.26014228
    1.35887781 -2.58083733 0.25838000
    1.78658555 2.25563014 0.25487674
    1.83333508 2.80487382 0.26169549
    0.78406009 2.08860099 0.25099560
    2.94031615 -0.07888534 0.26160780
    1.34658213 2.57400947 0.25619094
    1.73906669 -0.87744928 0.26014223
    0.50210739 1.33550547 0.26455007
    2.38040297 -0.75104092 0.25595462
    1.41826790 0.54845193 0.26772418
    1.77904107 -2.93136138 0.25092628
    1.35746628 -0.47759398 0.26160765
    1.31545731 3.12752832 0.25838016
    2.81487011 -3.12843671 0.25534100
  • It will be understood that the present invention has been described purely by way of example, and modifications of detail can be made without departing from the scope of the invention. Each feature disclosed in the description and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination. Features may, where appropriate, be implemented in hardware, software, or a combination of the two.

Claims (15)

  1. A method for performing DRC on a HOA signal, the method comprising steps of
    - setting or determining a mode, the mode being either simplified mode or non-simplified mode;
    - in the non-simplified mode, transforming the HOA signal to the spatial domain, wherein an inverse DSHT is used;
    - in the non-simplified mode, analyzing the transformed HOA signal, and in the simplified mode, analyzing the HOA signal;
    - obtaining, from results of said analyzing, one or more gain factors that are usable for dynamic range compression, wherein only one gain factor is obtained in the simplified mode and wherein two or more different gain factors are obtained in the non-simplified mode;
    - in the simplified mode multiplying the obtained gain factor with the HOA signal, wherein a gain compressed HOA signal is obtained;
    - in the non-simplified mode, multiplying the obtained gain factors with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained, and transforming the gain compressed transformed HOA signal back into the HOA domain, wherein a gain compressed HOA signal is obtained.
  2. Method of claim 1, further comprising a step of transmitting the HOA signals together with the obtained gain factors before said step of multiplying the obtained factors.
  3. Method according to claim 1 or 2, wherein the HOA signal is divided into frequency subbands, and the steps of analysing the transformed HOA signal, obtaining gain factors, multiplying the obtained factors with the transformed HOA signals and transforming the gain compressed transformed HOA signals back into the HOA domain are applied to each frequency subband separately, with individual gains per subband.
  4. Method according to claim 3, wherein the sequential order of dividing the HOA signal into frequency subbands and transforming the HOA signal to the spatial domain can be swapped, and the sequential order of synthesizing the subbands and transforming the gain compressed transformed HOA signals back into the HOA domain can be swapped, independently from each other.
  5. A method for applying DRC gain factors to a HOA signal, the method comprising
    - receiving a HOA signal and one or more gain factors;
    - transforming the HOA signal into the spatial domain, wherein a transformed HOA signal is obtained;
    - multiplying the gain factors with the transformed HOA signal, wherein a dynamic range compressed transformed HOA signal is obtained; and
    - transforming the dynamic range compressed transformed HOA signal back into the HOA domain (i.e. coefficient domain) (using DSHT), wherein a dynamic range compressed HOA signal is obtained.
  6. Method according to claim 5, wherein also an indication is received, the indication indicating either a simplified mode or a non-simplified mode, and wherein only one gain factor is received if the indication indicates the simplified mode, further comprising a step of selecting either a simplified mode or a non-simplified mode according to said indication, wherein the steps of transforming the HOA signal into the spatial domain and transforming the dynamic range compressed transformed HOA signal back into the HOA domain are performed only in the non-simplified mode, and wherein in the simplified mode the gain factors are multiplied with the HOA signal.
  7. Method according to claim 1, 5 or 6, wherein the step of transforming the HOA signal into the spatial domain uses a transform matrix according to at least one of Tab.1 b), Tab.2 b) and Tab.3 b).
  8. Method according to claim 1, 5 or 6, wherein in the step of transforming the HOA signal into the spatial domain an iDSHT is used with a transform matrix obtained from the spherical positions of virtual loudspeakers and quadrature gains q.
  9. Method according to claim 8, wherein the transform matrix is computed from the mode matrix $\Psi_{DSHT}$ and corresponding quadrature gains.
  10. Method according to claim 8 or 9, wherein spherical positions and corresponding quadrature gains according to at least one of Tab.1 a), Tab.2 a), Tab.3 a) and Tab.4-6 are used.
  11. Method according to claim 8 or 9, wherein the transform matrix is computed according to $D_{DSHT} = D_2 + [e^T, e^T, e^T, \ldots]^T$, wherein $D_2 = \hat{\tilde{D}}_2 / \|\hat{\tilde{D}}_2\|_{fro}$ is a normalized version of $\hat{\tilde{D}}_2 = U V^T$, with $U$, $V$ obtained from $\tilde{D}_1 = U S V^T = \mathrm{diag}(q)\,\Psi_{DSHT} / (N+1)^2$, with $\Psi_{DSHT}$ being the mode matrix of the used spherical positions of virtual loudspeakers, and $e^T$ is a transposed version of $e = -\left(1_L^T D_2 - [1, 0, 0, \ldots, 0]\right) / (N+1)^2$.
  12. Device for performing DRC on a HOA signal, the device comprising a processor or one or more processing elements adapted for
    - setting or determining a mode, the mode being either a simplified mode or a non-simplified mode;
    - in the non-simplified mode, transforming the HOA signal to the spatial domain, wherein an inverse DSHT is used;
    - in the non-simplified mode analyzing the transformed HOA signal, while in the simplified mode analyzing the HOA signal;
    - obtaining, from results of said analyzing, one or more gain factors that are usable for dynamic range compression, wherein only one gain factor is obtained in the simplified mode and wherein two or more different gain factors are obtained in the non-simplified mode;
    - in the simplified mode, multiplying the obtained gain factor with the HOA signal, wherein a gain compressed HOA signal is obtained; and
    - in the non-simplified mode, multiplying the obtained gain factors with the transformed HOA signal, wherein a gain compressed transformed HOA signal is obtained, and transforming the gain compressed transformed HOA signal back into the HOA domain, wherein a gain compressed HOA signal is obtained.
  13. Device according to claim 12, further comprising a transmission unit for transmitting, before multiplying the obtained gain factor or gain factors, the HOA signal together with the obtained gain factor or gain factors.
  14. Device according to claim 12 or 13, wherein the HOA signal is divided into a plurality of frequency subbands, and the analysing the HOA signal or the transformed HOA signal, obtaining one or more gain factors, multiplying the obtained gain factors with the HOA signal or the transformed HOA signal and, in the non-simplified mode, the transforming the gain compressed transformed HOA signal back into the HOA domain are applied to each frequency subband separately, with individual gains per subband.
  15. Device for applying DRC gain factors to a HOA signal, the device comprising a processor or one or more processing elements adapted for
    - receiving a HOA signal together with an indication and one or more gain factors, the indication indicating either a simplified mode or a non-simplified mode, wherein only one gain factor is received if the indication indicates the simplified mode;
    - setting the device to either a simplified mode or a non-simplified mode, according to said indication;
    - in the simplified mode, multiplying the gain factor with the HOA signal, wherein a dynamic range compressed HOA signal is obtained; and
    - in the non-simplified mode, transforming the HOA signal into the spatial domain, wherein a transformed HOA signal is obtained, multiplying the gain factors with the transformed HOA signals, wherein dynamic range compressed transformed HOA signals are obtained, and transforming the dynamic range compressed transformed HOA signals back into the HOA domain, wherein a dynamic range compressed HOA signal is obtained.
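The transform-matrix construction recited in claim 11 can be sketched with NumPy. This is an illustration under assumptions, not the applicant's implementation: the function name `dsht_matrix` is hypothetical, the mode matrix `psi` is assumed to be square with one row per virtual loudspeaker (so that L = (N+1)^2), and computing the mode matrix itself from spherical harmonics is left out.

```python
import numpy as np

def dsht_matrix(psi, q):
    """Build D_DSHT from a mode matrix and quadrature gains.

    psi: assumed shape (L, (N+1)^2) with L == (N+1)^2, one row per loudspeaker.
    q:   the L quadrature gains.
    """
    num_speakers, num_coeffs = psi.shape
    # Prototype matrix D~_1 = diag(q) Psi / (N+1)^2 and its SVD.
    d1 = np.diag(q) @ psi / num_coeffs
    u, _, vt = np.linalg.svd(d1)
    # Replace the singular values by ones, then normalize by the Frobenius norm.
    d2_hat = u @ vt
    d2 = d2_hat / np.linalg.norm(d2_hat, "fro")
    # Correction row e = -(1_L^T D_2 - [1, 0, ..., 0]) / (N+1)^2.
    first = np.zeros(num_coeffs)
    first[0] = 1.0
    e = -(np.ones(num_speakers) @ d2 - first) / num_coeffs
    # Broadcasting adds the row vector e to every row of D_2.
    return d2 + e
```

By construction, the column sums of the result equal [1, 0, ..., 0] when L = (N+1)^2, which can serve as a sanity check on an implementation.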
EP14305559.8A 2014-03-24 2014-04-15 Method and device for applying dynamic range compression to a higher order ambisonics signal Withdrawn EP2934025A1 (en)

Priority Applications (55)

Application Number Priority Date Filing Date Title
EP14305559.8A EP2934025A1 (en) 2014-04-15 2014-04-15 Method and device for applying dynamic range compression to a higher order ambisonics signal
CN201811253721.8A CN108962266B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
KR1020217000212A KR102479741B1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
CA3153913A CA3153913C (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
TW109126543A TWI718979B (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
CN201811253730.7A CN109036441B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
US15/127,775 US9936321B2 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
PCT/EP2015/056206 WO2015144674A1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
TW108105179A TWI695371B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression and a non-transitory computer readable storage medium
JP2016558102A JP6246948B2 (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to higher order ambisonics signals
AU2015238448A AU2015238448B2 (en) 2014-03-24 2015-03-24 Method and device for applying Dynamic Range Compression to a Higher Order Ambisonics signal
CN201580015764.0A CN106165451B (en) 2014-03-24 2015-03-24 To the method and apparatus of high-order clear stereo signal application dynamic range compression
CN201811253717.1A CN109087654B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
CA2946916A CA2946916C (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
TW111107641A TWI794032B (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
BR122020020730-2A BR122020020730B1 (en) 2014-03-24 2015-03-24 METHOD AND DEVICE FOR APPLYING DYNAMIC RANGE COMPRESSION TO A HIGHER ORDER AMBISONICS SIGNAL
BR112016022008-0A BR112016022008B1 (en) 2014-03-24 2015-03-24 METHOD FOR DYNAMIC RANGE COMPRESSION, APPARATUS FOR DYNAMIC RANGE COMPRESSION AND NON-TRANSITORY COMPUTER READABLE STORAGE MEDIA
KR1020227044220A KR102596944B1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
KR1020197021732A KR102201027B1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
EP18173707.3A EP3451706B1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
KR1020167026390A KR102005298B1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
TW109101396A TWI711034B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression and a non-transitory computer readable storage medium
BR122020020719-1A BR122020020719B1 (en) 2014-03-24 2015-03-24 METHOD, COMPUTER READABLE STORAGE MEDIA, AND DYNAMIC RANGE COMPRESSION (DRC) APPLIANCE
CN201811253713.3A CN109285553B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
EP23192252.7A EP4273857A3 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
CA3155815A CA3155815A1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
EP15711759.9A EP3123746B1 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
TW104109277A TWI662543B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression and a non-transitory computer readable storage medium
TW112102828A TWI833562B (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
RU2016141386A RU2658888C2 (en) 2014-03-24 2015-03-24 Method and device of the dynamic range compression application to the higher order ambiophony signal
BR122020014764-4A BR122020014764B1 (en) 2014-03-24 2015-03-24 METHOD AND DEVICE FOR APPLYING DYNAMIC RANGE COMPRESSION GAIN FACTORS TO A HIGHER ORDER AMBISONICS SIGNAL AND COMPUTER READABLE STORAGE MEDIA
BR122018005665-7A BR122018005665B1 (en) 2014-03-24 2015-03-24 METHOD AND DEVICE FOR APPLYING DYNAMIC RANGE COMPRESSION TO A HIGHER ORDER AMBISONICS SIGNAL
TW110102935A TWI760084B (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
RU2018118336A RU2760232C2 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to higher-order ambiophony signal
CN201811253716.7A CN109087653B (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
KR1020237037213A KR20230156153A (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
CN202311083699.8A CN117133298A (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
UAA201610606A UA119765C2 (en) 2014-03-24 2015-03-24 Method and device for applying dynamic range compression to a higher order ambisonics signal
CN202311083155.1A CN117153172A (en) 2014-03-24 2015-03-24 Method and apparatus for applying dynamic range compression to high order ambisonics signals
JP2017219647A JP6545235B2 (en) 2014-03-24 2017-11-15 Method and apparatus for applying dynamic range compression to higher order ambisonics signals
US15/891,326 US10362424B2 (en) 2014-03-24 2018-02-07 Method and device for applying dynamic range compression to a higher order ambisonics signal
HK19101101.3A HK1258770A1 (en) 2014-03-24 2019-01-22 Method and device for applying dynamic range compression to a higher order ambisonics signal
HK19101671.3A HK1259306A1 (en) 2014-03-24 2019-01-30 Method and device for applying dynamic range compression to a higher order ambisonics signal
JP2019112767A JP6762405B2 (en) 2014-03-24 2019-06-18 Methods and Devices for Applying Dynamic Range Compression to Higher Ambisonics Signals
US16/457,135 US10567899B2 (en) 2014-03-24 2019-06-28 Method and device for applying dynamic range compression to a higher order ambisonics signal
AU2019205998A AU2019205998B2 (en) 2014-03-24 2019-07-16 Method and device for applying dynamic range compression to a higher order ambisonics signal
US16/660,626 US10638244B2 (en) 2014-03-24 2019-10-22 Method and device for applying dynamic range compression to a higher order ambisonics signal
US16/857,093 US10893372B2 (en) 2014-03-24 2020-04-23 Method and device for applying dynamic range compression to a higher order ambisonics signal
JP2020150380A JP7101219B2 (en) 2014-03-24 2020-09-08 Methods and Devices for Applying Dynamic Range Compression to Higher-Order Ambisonics Signals
US17/144,325 US11838738B2 (en) 2014-03-24 2021-01-08 Method and device for applying Dynamic Range Compression to a Higher Order Ambisonics signal
AU2021204754A AU2021204754B2 (en) 2014-03-24 2021-07-07 Method and device for applying dynamic range compression to a higher order ambisonics signal
JP2022107586A JP7333855B2 (en) 2014-03-24 2022-07-04 Method and Apparatus for Applying Dynamic Range Compression to Higher Order Ambisonics Signals
AU2023201911A AU2023201911A1 (en) 2014-03-24 2023-03-29 Method and device for applying dynamic range compression to a higher order ambisonics signal
JP2023132200A JP2023144032A (en) 2014-03-24 2023-08-15 Method and device for applying dynamic range compression to high order ambisonics signal
US18/505,494 US20240098436A1 (en) 2014-03-24 2023-11-09 Method and device for applying dynamic range compression to a higher order ambisonics signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP14305559.8A EP2934025A1 (en) 2014-04-15 2014-04-15 Method and device for applying dynamic range compression to a higher order ambisonics signal

Publications (1)

Publication Number Publication Date
EP2934025A1 true EP2934025A1 (en) 2015-10-21

Family

ID=50732988

Family Applications (1)

Application Number Title Priority Date Filing Date
EP14305559.8A Withdrawn EP2934025A1 (en) 2014-03-24 2014-04-15 Method and device for applying dynamic range compression to a higher order ambisonics signal

Country Status (1)

Country Link
EP (1) EP2934025A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013181115A1 (en) * 2012-05-31 2013-12-05 Dts, Inc. Audio depth dynamic range enhancement
EP2688066A1 (en) * 2012-07-16 2014-01-22 Thomson Licensing Method and apparatus for encoding multi-channel HOA audio signals for noise reduction, and method and apparatus for decoding multi-channel HOA audio signals for noise reduction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"WD1-HOA Text of MPEG-H 3D Audio", 107. MPEG MEETING;13-1-2014 - 17-1-2014; SAN JOSE; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. N14264, 21 February 2014 (2014-02-21), XP030021001 *
BURNETT IAN ET AL: "Encoding Higher Order Ambisonics with AAC", AES CONVENTION 124; MAY 2008, AES, 60 EAST 42ND STREET, ROOM 2520 NEW YORK 10165-2520, USA, 1 May 2008 (2008-05-01), XP040508582 *
JORG FLIEGE, INTEGRATION NODES FOR THE SPHERE, 2010, Retrieved from the Internet <URL:http://www.mathematik.uni-dortmund.de/lsx/research/projects/fliege/nodes/- nodes.html>

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110168638A (en) * 2017-01-13 2019-08-23 高通股份有限公司 Audio potential difference for virtual reality, augmented reality and mixed reality
CN110168638B (en) * 2017-01-13 2023-05-09 高通股份有限公司 Audio head for virtual reality, augmented reality and mixed reality
WO2022242480A1 (en) * 2021-05-17 2022-11-24 华为技术有限公司 Three-dimensional audio signal encoding method and apparatus, and encoder

Similar Documents

Publication Publication Date Title
US11838738B2 (en) Method and device for applying Dynamic Range Compression to a Higher Order Ambisonics signal
EP2934025A1 (en) Method and device for applying dynamic range compression to a higher order ambisonics signal

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20160422