EP2688065A1 - Method and apparatus for avoiding unmasking of coding noise when mixing perceptually coded multi-channel audio signals - Google Patents
- Publication number
- EP2688065A1 (application EP12305860.4A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- channel audio
- audio signals
- correlated
- perceptually
- channel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/02—Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
Definitions
- This invention relates to a method and an apparatus for avoiding an unmasking of coding noise when mixing perceptually coded and decoded multi-channel audio signals.
- A known approach, which is e.g. pursued in [1] with respect to HOA representations, is illustrated in Fig. 1.
- the term matrixing means adding or mixing the decoded signals x̂_i(l) in a weighted manner.
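Matrixing in this sense can be sketched as a weighted sum over the decoded channels. The following is a minimal illustration only; the function name `matrix_signals`, the mixing weights and the sample values are assumptions made for the example and do not come from the patent text.

```python
def matrix_signals(mixing_matrix, decoded):
    """Mix decoded channels: y_j(l) = sum over i of a[j][i] * x_i(l)."""
    num_samples = len(decoded[0])
    return [
        [sum(a_ji * channel[l] for a_ji, channel in zip(row, decoded))
         for l in range(num_samples)]
        for row in mixing_matrix
    ]

# Two decoded channels with three samples each (toy values).
decoded = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Hypothetical mixing matrix: one row of weights per output signal.
mixing_matrix = [[0.5, 0.5], [1.0, -1.0]]

outputs = matrix_signals(mixing_matrix, decoded)
```

Each row of the mixing matrix produces one output signal, so the number of output signals may differ from the number of decoded channels.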
- the problem of noise unmasking when matrixing perceptually decoded multi-channel signals was addressed in [7] and [8].
- the input signals for perceptual coding are assumed to be a downmix of multi-channel audio signals, e.g. stereo or surround signals.
- the goal for matrixing of the perceptually decoded signals x̂_i(l) is the reconstruction of the original multi-channel audio signal from the downmix signals.
- the two-step technique proposed there, consisting of an appropriate quantization and choice of the masking thresholds, was intended to solve the noise unmasking problem.
- channel independent perceptual encoding 20 and successive decoding 30 introduce quantization noise into the individual channel audio signals.
- the introduced quantization noise remains unnoticed by the human ear because of a masking effect that is caused by the audio signal x_i(l).
- the audio signal part and the quantization noise can get separated and partially routed to different new signals ŷ_j(l). This may result in the appearance of audible artifacts.
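The separation of signal and noise can be illustrated with a toy calculation. This is a sketch under strong assumptions: the two channels carry an identical tone, the "coding noise" is a pair of small synthetic cosines rather than real codec noise, and the difference weights (1, -1) are chosen to make the effect extreme.

```python
import math

def power(sig):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in sig) / len(sig)

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB."""
    return 10.0 * math.log10(power(signal) / power(noise))

L = 64
# Two fully correlated channels: both carry the same tone.
s = [math.sin(2 * math.pi * 4 * l / L) for l in range(L)]
x1, x2 = s[:], s[:]

# Synthetic stand-in for coding noise, different in each channel.
e1 = [0.01 * math.cos(2 * math.pi * 9 * l / L) for l in range(L)]
e2 = [0.01 * math.cos(2 * math.pi * 13 * l / L) for l in range(L)]

# Per channel, the noise sits far below the signal (about 40 dB here).
snr_channel = snr_db(x1, e1)

# Matrixing with weights (1, -1): the signal parts cancel, the noise does not.
diff_signal = [a - b for a, b in zip(x1, x2)]
diff_noise = [a - b for a, b in zip(e1, e2)]

signal_power_after = power(diff_signal)
noise_power_after = power(diff_noise)
```

After this mix the output contains noise only, so nothing is left to mask it, although each individual channel had a comfortable signal-to-noise ratio.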
- the present invention is based on the recognition of the fact that at least the above-mentioned problems can be solved by decorrelating the individual audio channel signals before perceptual encoding and recorrelating them after the perceptual decoding, as shown in Fig. 2.
- Such decorrelation and recorrelation has the advantage that noise unmasking at matrixing is avoided, even for arbitrary mixing matrices which are not yet known at the perceptual encoding stage. That is, the disclosed decorrelating of individual audio channel signals before perceptual encoding and recorrelating them after the perceptual decoding is suitable for achieving any arbitrary matrixing/mixing without unmasking of noise that is masked during the perceptual encoding and perceptual decoding.
- the method and apparatus proposed herein do not suffer from the restriction that the loudspeaker set-up must be known in advance at the perceptual coding stage.
- a method for avoiding the unmasking of coding noise when mixing perceptually coded multi-channel audio signals comprises steps of de-correlating the multi-channel audio signals, wherein de-correlated multi-channel audio signals and correlation information are obtained, perceptually encoding the de-correlated multi-channel audio signals, perceptually decoding the perceptually encoded, de-correlated multi-channel audio signals, wherein perceptually decoded, de-correlated multi-channel audio signals are obtained, re-correlating the perceptually decoded, de-correlated multi-channel audio signals according to the correlation information, wherein re-correlated, perceptually decoded multi-channel audio signals are obtained, and mixing the re-correlated, perceptually decoded multi-channel audio signals, wherein at least two of the re-correlated, perceptually decoded multi-channel audio signals are combined to obtain at least one audio output signal, and wherein a mixing scheme is used that is independent from (e.g. unknown in) said step of en
- a corresponding decoding method comprises a method for avoiding the unmasking of coding noise when mixing perceptually coded multi-channel audio signals within a decoder, comprising steps of receiving through a connection, or retrieving from a storage, perceptually encoded, de-correlated multi-channel audio signals and correlation information, perceptually decoding the perceptually encoded, de-correlated multi-channel audio signals, wherein perceptually decoded, de-correlated multi-channel audio signals are obtained, re-correlating the perceptually decoded, de-correlated multi-channel audio signals according to the correlation information, wherein re-correlated, perceptually decoded multi-channel audio signals are obtained, and mixing the re-correlated, perceptually decoded multi-channel audio signals, wherein at least two of the re-correlated, perceptually decoded multi-channel audio signals are combined to obtain at least one audio output signal.
- the method relates to perceptually coded time-frequency transformed multi-channel audio signals.
- a method for avoiding any unmasking of coding noise when mixing multi-channel audio signals that have been perceptually coded by using a time-frequency transform comprises perceptual encoding, perceptual decoding and mixing, and the perceptual encoding comprising steps of time-frequency transform framing and time-frequency transforming the multi-channel audio signals, wherein time-frequency transformed multi-channel audio signals are obtained, segmenting (i.e.
- the perceptual decoding comprises steps of channel independent decoding without inverse time-frequency transform, wherein a channel independent decoded signal is obtained, re-correlating the channel independent decoded signal with a KLT, wherein re-correlated signals are obtained, performing a transform de-segmentation on the re-correlated signals, wherein time-frequency transformed multi-channel audio signals are obtained, performing inverse time-frequency transform with overlap add on the time-frequency transformed multi-channel audio signals, wherein framed time-domain multi-channel audio signals are obtained, and de-framing the framed time-domain multi-channel audio signals, wherein the re-correlated, perceptually decoded multi-channel audio signals are obtained; and the mixing comprises mixing the re-correlated, perceptually decoded multi-channel audio signals, wherein at least two of the
- a corresponding decoding method comprises a method for avoiding the unmasking of coding noise when mixing, in a perceptual decoder, multi-channel audio signals that have been perceptually coded by using a time-frequency transform, comprising steps of channel independent decoding without inverse time-frequency transform, wherein channel independent decoded signals are obtained, re-correlating the channel independent decoded signals with a KLT, wherein re-correlated signals are obtained, performing a transform de-segmentation of the re-correlated signals, wherein time-frequency transformed multi-channel audio signals are obtained, performing inverse time-frequency transform with overlap add on the time-frequency transformed multi-channel audio signals, wherein framed time-domain multi-channel audio signals are obtained, de-framing the framed time-domain multi-channel audio signals, wherein the re-correlated, perceptually decoded multi-channel audio signals are obtained, and mixing the re-correlated, perceptually decoded multi-channel audio signals, wherein at least two of the
- a computer readable storage medium has executable instructions to cause a computer to perform a method comprising steps as disclosed in one of the claims 1-14.
- the decorrelation is accomplished with the Karhunen-Loève Transform (KLT), which is also known as Principal Component Analysis (PCA) [3].
- KLT Karhunen-Loève Transform
- PCA Principal Component Analysis
- Fig.3 shows, in an embodiment of the present invention, a conceptual scheme for a KLT based perceptual encoder for multi-channel audio signals. It comprises steps of performing a segmentation 110 of the input signals, decorrelation 120 with a KLT, and channel independent perceptual coding 130.
- the decorrelation comprises the following steps: A first step is computing 1210 the inter-channel correlation matrix for each segment. Subsequently, the inter-channel correlation matrix is quantized 1220, rescaled (e.g. inverse quantized) 1230, and the rescaled version of the correlation matrix is employed for the computation 1240 of the KLT transform matrix B_KLT(m). Finally, the KLT is applied 1250 to the signals within each respective segment, before the individual transformed signals Z(m) are independently perceptually coded 130.
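The sequence of steps 1210 to 1250 can be sketched for the two-channel, real-valued case, where the eigenvectors of the 2x2 correlation matrix follow from a single rotation angle. This is an illustrative sketch, not the patent's implementation: the uniform quantizer with step size `STEP`, the test signals and all function names are assumptions.

```python
import math

STEP = 1.0 / 64.0  # assumed quantizer step size

def correlation_matrix(x1, x2):
    """Step 1210: inter-channel correlation matrix of one segment."""
    n = len(x1)
    c11 = sum(a * a for a in x1) / n
    c22 = sum(b * b for b in x2) / n
    c12 = sum(a * b for a, b in zip(x1, x2)) / n
    return [[c11, c12], [c12, c22]]

def quantize(c):
    """Step 1220: element-wise uniform quantization to integer indices."""
    return [[round(v / STEP) for v in row] for row in c]

def rescale(indices):
    """Step 1230: inverse quantization; symmetry is preserved."""
    return [[i * STEP for i in row] for row in indices]

def klt_matrix(c):
    """Step 1240: rows are the eigenvectors of the symmetric 2x2 matrix c."""
    theta = 0.5 * math.atan2(2.0 * c[0][1], c[0][0] - c[1][1])
    cs, sn = math.cos(theta), math.sin(theta)
    return [[cs, sn], [-sn, cs]]

def apply_klt(b, x1, x2):
    """Step 1250: transform every sample pair of the segment."""
    z1 = [b[0][0] * u + b[0][1] * v for u, v in zip(x1, x2)]
    z2 = [b[1][0] * u + b[1][1] * v for u, v in zip(x1, x2)]
    return z1, z2

# Toy segment: two channels sharing a common component s.
L = 64
s = [math.sin(2 * math.pi * 3 * l / L) for l in range(L)]
t = [math.cos(2 * math.pi * 5 * l / L) for l in range(L)]
x1 = [a + 0.3 * b for a, b in zip(s, t)]
x2 = [0.8 * a - 0.3 * b for a, b in zip(s, t)]

indices = quantize(correlation_matrix(x1, x2))  # side information
b_klt = klt_matrix(rescale(indices))
z1, z2 = apply_klt(b_klt, x1, x2)

cross_before = correlation_matrix(x1, x2)[0][1]
cross_after = correlation_matrix(z1, z2)[0][1]
```

Because the transform matrix is derived from the rescaled (quantized) correlation matrix rather than the exact one, the cross-correlation after the transform is small but not exactly zero.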
- the coded quantized correlation matrix is transmitted as side information.
- for this purpose, the quantized correlation matrix is losslessly encoded 1260.
- the losslessly encoded quantized correlation matrix is transmitted instead of the KLT transform matrix, since it is always symmetric, and therefore only a portion thereof (the upper or lower half, including the diagonal) needs to be transmitted.
- the quantized and encoded inter-channel correlation matrix is used for transmission, in order to reduce the amount of side information.
- the first step 110 concerns the segmentation.
- the indices l_{m-1} + 1 and l_m corresponding to the left and right border of the m-th segment are chosen to obtain constant inter-channel correlations within the segment.
- the inter-channel correlation matrix of segment m is quantized (e.g. element-wise) 1220 to obtain the quantized inter-channel correlation matrix.
- the quantization can be accomplished e.g. as proposed in [6].
- the quantized inter-channel correlation matrix of segment m is rescaled 1230 to obtain the approximate correlation matrix. It has to be ensured that the Hermitian symmetry of the correlation matrix is retained after the rescaling.
- the coded representation of all channels within the segment m is denoted by ( m ).
- the quantized inter-channel correlation matrix of segment m is losslessly encoded 1260 to obtain its coded representation.
- the symmetry of the quantized inter-channel correlation matrix can be exploited by coding only the upper (or lower) triangular part and the diagonal.
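Exploiting the symmetry can be sketched as packing the upper triangle, including the diagonal, into a flat list and rebuilding the full matrix from it. The helper names `pack_upper` and `unpack_upper` are hypothetical, and the actual lossless coding of the packed values is not shown.

```python
def pack_upper(matrix):
    """Keep only the diagonal and the entries above it, row by row."""
    n = len(matrix)
    return [matrix[r][c] for r in range(n) for c in range(r, n)]

def unpack_upper(values, n):
    """Rebuild the full Hermitian matrix from the packed upper triangle."""
    full = [[0] * n for _ in range(n)]
    it = iter(values)
    for r in range(n):
        for c in range(r, n):
            v = next(it)
            full[r][c] = v
            if c != r:
                # The lower triangle is the complex conjugate of the upper one;
                # for real-valued entries this is a plain copy.
                full[c][r] = v.conjugate()
    return full

# Quantizer indices of a 3-channel correlation matrix (toy values).
q = [[35, 23, -4],
     [23, 23, 7],
     [-4, 7, 12]]

packed = pack_upper(q)
restored = unpack_upper(packed, 3)
```

For I channels this leaves I(I+1)/2 values instead of I² to be coded per matrix.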
- a conceptual scheme of the KLT based perceptual decoder for multi-channel signals is depicted in Fig.4 , and comprises channel independent perceptual decoding 210, re-correlation with an IKLT 220 and desegmentation 230.
- the inter-channel correlation matrix is decoded 2210 and the decorrelated signals are decoded 210, using a channel independent perceptual decoding.
- the KL transform matrix is computed 2220 in the same way as in the encoder.
- the inverse KL transformation is applied 2230 to the decoded decorrelated signals, and the segments are de-segmented (i.e. combined) 230 to continuous streams of individual channel signals.
- the coded quantized inter-channel correlation matrix is decoded and rescaled (e.g. inverse quantized) to obtain the approximate correlation matrix. Note that since the coding of the correlation matrix is lossless, the decoded matrix corresponds exactly to the one which was coded.
- the segments ( m ) of the coded decorrelated signals are perceptually decoded, where each channel is decoded independently.
- the decoded representation of all channel signals within the segment m is denoted by ⁇ ( m ).
- the KLT matrix B_KLT(m) is computed 2220 from the approximate inter-channel correlation matrix in the same way as it is done in the KLT based perceptual encoder.
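The decoder side can be sketched for the same two-channel, real-valued case. The quantizer indices, step size and sample values below are assumptions for illustration, and the perceptual coding between encoder and decoder is deliberately left out so that the round trip can be checked exactly. For a real orthogonal KLT matrix the inverse transform is simply its transpose.

```python
import math

STEP = 1.0 / 64.0  # assumed quantizer step size, known to both sides

def klt_matrix(indices):
    """Derive the 2x2 KLT matrix from transmitted quantizer indices."""
    c = [[i * STEP for i in row] for row in indices]  # rescaling
    theta = 0.5 * math.atan2(2.0 * c[0][1], c[0][0] - c[1][1])
    cs, sn = math.cos(theta), math.sin(theta)
    return [[cs, sn], [-sn, cs]]

def transform(b, channels):
    """Apply a 2x2 matrix to every sample pair of a two-channel segment."""
    return [[b[r][0] * u + b[r][1] * v for u, v in zip(*channels)]
            for r in range(2)]

def transpose(b):
    return [[b[0][0], b[1][0]], [b[0][1], b[1][1]]]

side_info = [[35, 23], [23, 23]]  # losslessly transmitted indices (toy values)

# Encoder and decoder compute the matrix from identical data.
b_encoder = klt_matrix(side_info)
b_decoder = klt_matrix(side_info)

x = [[0.3, -1.2, 0.7, 2.0], [0.1, -0.9, 1.5, 1.1]]
z = transform(b_encoder, x)                 # decorrelation (KLT)
x_rec = transform(transpose(b_decoder), z)  # re-correlation (inverse KLT)

max_error = max(abs(a - b)
                for ch, ch_rec in zip(x, x_rec)
                for a, b in zip(ch, ch_rec))
```

Since both sides start from the same losslessly coded indices, the two matrices are bit-identical, which is why transmitting the correlation matrix instead of the transform matrix is sufficient.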
- the described KLT based encoding and decoding is particularly suitable for being used together with transform coders.
- Many perceptual coders transform the time domain signal to the frequency domain to better adapt the coding to the human psychoacoustics.
- a prominent example is Advanced Audio Coding (AAC) where the transform is accomplished using a Modified Discrete Cosine Transform (MDCT).
- MDCT Modified Discrete Cosine Transform
- the time domain signal is usually processed frame-wise, where successive frames overlap.
- the proposed encoders and decoders may be slightly modified in order to solve a problem that has been found to occur for the following reason (cf. Fig.5 ).
- TFT time-frequency transform
- the TFT can often be regarded as a linear mapping to a space of lower dimension.
- the MDCT maps a time domain frame of N samples to a frame of N /2 frequency bin values. Since this mapping, with respect to this single frame, is not invertible, the cross-correlations between the time domain and frequency domain representations of the respective frame are different. Consequently, using a KLT matrix computed based on the time domain signals does not assure that the frequency domain signals, which are actually coded, are properly decorrelated.
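The dimension reduction can be made concrete with a direct, unwindowed MDCT written from its textbook definition. This is a sketch for illustration only; a practical coder would add a window and a fast algorithm, and the frame values are arbitrary.

```python
import math

def mdct(frame):
    """Direct MDCT: N time samples are mapped to N/2 frequency bins."""
    n = len(frame)
    m = n // 2
    return [
        sum(frame[t] * math.cos(math.pi / m * (t + 0.5 + m / 2) * (k + 0.5))
            for t in range(n))
        for k in range(m)
    ]

# A non-zero frame whose first half is even-symmetric and whose second
# half is odd-symmetric; such frames are cancelled by the MDCT.
frame = [1.0, 2.0, 2.0, 1.0, 3.0, 4.0, -4.0, -3.0]
bins = mdct(frame)

largest_bin = max(abs(v) for v in bins)
```

The non-zero frame is mapped to bins that are all zero (up to rounding), so the frame cannot be recovered from its own bins alone. Reconstruction relies on the overlap with neighbouring frames.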
- At least one embodiment of the presently disclosed solution is suitable for preventing all the above mentioned problems for TFT coders.
- the TFT is performed before carrying out the KLT segmentation in the encoder.
- the inverse TFT is performed after the inverse KLT (IKLT), and not before the IKLT.
- the resulting short-time spectra S320 are denoted by X_i(k, n), where k denotes the frequency bin index and n denotes the TFT frame index. If, for simplicity, we assume that e.g. an MDCT is used as TFT, there are N/2 frequency bins and k ∈ {0, ..., N/2 - 1}.
- the short-time spectra of the individual channels within the KLT segment are decorrelated 120 using the KLT to obtain the segments V(m) of decorrelated short-time spectra S122 and the respective losslessly encoded quantized inter-channel correlation matrix S121.
- the individual decorrelated short-time spectra within the segment V(m) S122 are individually perceptually coded 130 directly in the frequency domain (i.e., without using a TFT) to obtain the coded representation S130.
- the KLT based encoder outputs the losslessly encoded quantized inter-channel correlation matrix S121 and the coded segment S130 of decorrelated short-time spectra, which can then be transmitted or stored, and subsequently received or retrieved from storage (not shown in Fig.6 ).
- the frames with the encoded short-time spectra are perceptually decoded 210 to the frequency domain without using an inverse TFT (ITFT) to obtain the segments V̂(m) S210 of decoded short-time spectra. These are re-correlated 220 with the inverse KLT (IKLT), using the coded quantized inter-channel correlation matrix S121.
- IKLT inverse KLT
- the above description relates to broadband signals, i.e. full bandwidth.
- the decorrelation and respective correlation may be performed on frequency bands that are related to e.g. the human perception, rather than on the broadband signals.
- a number of losslessly encoded quantized frequency-band related correlation matrices are included into the side information.
- a further essential assumption is that the coding is performed such that a predefined signal-to-noise ratio (SNR) is satisfied for each channel.
- SNR signal-to-noise ratio
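- (equation, garbled in extraction: a noise power term with index n_j, expressed through the mixing vector a_j, its conjugate transpose a_j^H and a correlation matrix with index E)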
- this SNR is obtained from the predefined SNR, SNR x , by the multiplication with a term, which is dependent on the diagonal and non-diagonal component of the signal correlation matrix ⁇ X .
- the correlation matrix is diagonal, i.e. all non-diagonal elements are zero.
- Perceptual coding of audio signals means a coding that is adapted to the human perception of audio. It should be noted that when perceptually coding the audio signals, a quantization is usually performed not on the broad-band audio signal samples, but rather in individual frequency bands related to the human perception. Hence, the ratio between the signal power and the quantization noise may vary between the individual frequency bands. To be precise, in order to completely avoid noise unmasking in these cases, it would be necessary to decorrelate the band-pass signals rather than the broad-band audio channel signals. This, however, increases the amount of side information to be transmitted by a factor equal to the number of frequency bands considered. Assuming a fixed given data rate, this reduces the total rate for the coded audio signals.
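The growth in side information can be sketched with a simple count. The assumption made here is that each symmetric correlation matrix contributes its diagonal and upper triangle, i.e. I(I+1)/2 values for I channels; the function names and the example numbers of channels and bands are illustrative.

```python
def unique_entries(num_channels):
    """Number of values in the diagonal and upper triangle of an I x I matrix."""
    return num_channels * (num_channels + 1) // 2

def side_info_values(num_channels, num_bands=1):
    """Values to transmit per segment when one matrix is sent per band."""
    return num_bands * unique_entries(num_channels)

broadband = side_info_values(4)                  # one matrix per segment
per_band = side_info_values(4, num_bands=20)     # one matrix per band
factor = per_band // broadband
```

With 4 channels and 20 bands the count rises from 10 to 200 values per segment, which is the factor equal to the number of bands described above.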
- an apparatus for avoiding the unmasking of coding noise when mixing perceptually coded multi-channel audio signals comprises a de-correlator 10 for de-correlating the multi-channel audio signals, wherein de-correlated multi-channel audio signals S10 and correlation information S11 are obtained; a perceptual encoder 20 for perceptually encoding the de-correlated multi-channel audio signals; a perceptual decoder 30 for perceptually decoding the perceptually encoded, de-correlated multi-channel audio signals, wherein perceptually decoded, de-correlated multi-channel audio signals S31 are obtained; a re-correlator 40 for re-correlating the perceptually decoded, de-correlated multi-channel audio signals according to the correlation information S11, wherein re-correlated, perceptually decoded multi-channel audio signals S40 are obtained; and a mixer 50 for mixing the re-correlated, perceptually
- an apparatus for avoiding the unmasking of coding noise when mixing perceptually coded time-frequency transformed multi-channel audio signals comprising a perceptual encoding unit and a perceptual decoding unit, and the perceptual encoding unit comprises
- ITFT units 440 for performing inverse time-frequency transform with overlap add on the time-frequency transformed multi-channel audio signals S230, wherein framed time-domain multi-channel audio signals S440 are obtained; and a de-framer 450 for de-framing the framed time-domain multi-channel audio signals S440, wherein the re-correlated, perceptually decoded multi-channel audio signals S40 are obtained; and a mixer 50 for mixing the re-correlated, perceptually decoded multi-channel audio signals S40, wherein at least two of the re-correlated, perceptually decoded multi-channel audio signals are combined to obtain at least one audio output signal ŷ_1(l), and wherein a mixing scheme is used, wherein in said de-correlator 120 the de-correlating of the segments obtained in the transform segmentation during encoding is independent from the mixing scheme that is used in the mixer 50.
- an apparatus for avoiding the unmasking of coding noise when mixing, in a perceptual decoder, multi-channel audio signals that have been perceptually coded by using a time-frequency transform comprises a decoder 210 for channel independent decoding without inverse time-frequency transform, wherein channel independent decoded signals S210 are obtained; a correlation unit 220 for re-correlating the channel independent decoded signals S210 with a KLT, wherein re-correlated signals S220 are obtained; a de-segmenter 230 for performing a transform de-segmentation of the re-correlated signals S220, wherein time-frequency transformed multi-channel audio signals S230 are obtained; ITFT units 440 for performing inverse time-frequency transform with overlap add on the time-frequency transformed multi-channel audio signals S230, wherein framed time-domain multi-channel audio signals S440 are obtained; and a de-framing unit 450 for de-framing the framed time-domain multi-channel audio signals S440,
- an encoder and decoder are adapted for automatically switching between channel-based audio signals (i.e. loudspeaker channel related) and sound field audio signals, such as HOA. Both require different matrices in the matrixing block at the decoder. Since the input signals x_i(l) have one particular format assigned (e.g. loudspeaker channel based format or HOA format), this format information is transmitted or stored as side information S11, received or retrieved on the decoding side and used in the matrixing unit 50 to automatically adapt the matrix for the respective signal format (e.g. switch between loudspeaker channel based and HOA formats).
- other types of transforms may be constructed other than KLT, as would be apparent to those of ordinary skill in the art, all of which are contemplated within the spirit and scope of the invention.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Mathematical Physics (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Stereophonic System (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP12305860.4A EP2688065A1 (de) | 2012-07-16 | 2012-07-16 | Method and apparatus for avoiding unmasking of coding noise when mixing perceptually coded multi-channel audio signals |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP12305860.4A EP2688065A1 (de) | 2012-07-16 | 2012-07-16 | Method and apparatus for avoiding unmasking of coding noise when mixing perceptually coded multi-channel audio signals |
Publications (1)
Publication Number | Publication Date |
---|---|
EP2688065A1 (de) | 2014-01-22 |
Family
ID=46583919
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP12305860.4A (withdrawn), EP2688065A1 (de) | Method and apparatus for avoiding unmasking of coding noise when mixing perceptually coded multi-channel audio signals | 2012-07-16 | 2012-07-16 |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP2688065A1 (de) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017525318A (ja) * | 2014-07-02 | 2017-08-31 | Qualcomm Incorporated | Reducing correlation between higher order ambisonic (HOA) background channels |
US9818413B2 (en) | 2014-03-21 | 2017-11-14 | Dolby Laboratories Licensing Corporation | Method for compressing a higher order ambisonics signal, method for decompressing (HOA) a compressed HOA signal, apparatus for compressing a HOA signal, and apparatus for decompressing a compressed HOA signal |
WO2018001500A1 (en) * | 2016-06-30 | 2018-01-04 | Huawei Technologies Duesseldorf Gmbh | Apparatuses and methods for encoding and decoding a multichannel audio signal |
US9930464B2 (en) | 2014-03-21 | 2018-03-27 | Dolby Laboratories Licensing Corporation | Method for compressing a higher order ambisonics (HOA) signal, method for decompressing a compressed HOA signal, apparatus for compressing a HOA signal, and apparatus for decompressing a compressed HOA signal |
US10127914B2 (en) | 2014-03-21 | 2018-11-13 | Dolby Laboratories Licensing Corporation | Method for compressing a higher order ambisonics (HOA) signal, method for decompressing a compressed HOA signal, apparatus for compressing a HOA signal, and apparatus for decompressing a compressed HOA signal |
CN110827840A (zh) * | 2014-01-30 | 2020-02-21 | Qualcomm Incorporated | Coding independent frames of ambient higher-order ambisonic coefficients |
- 2012-07-16: EP application EP12305860.4A (EP2688065A1, de), status: withdrawn
Non-Patent Citations (11)
Title |
---|
"Three-Dimensional Surround Sound Systems Based on Spherical Harmonics", J. AUDIO ENG. SOC., vol. 53, no. 11, 2005, pages 1004 - 1025 |
BURNETT IAN ET AL: "Encoding Higher Order Ambisonics with AAC", AES CONVENTION 124; MAY 2008, AES, 60 EAST 42ND STREET, ROOM 2520 NEW YORK 10165-2520, USA, 1 May 2008 (2008-05-01), XP040508582 * |
ERIK HELLERUD; IAN BURNETT; AUDUN SOLVANG; U. PETER SVENSSON: "Encoding Higher Order Ambisonics with AAC", 124TH AES CONVENTION, 2008 |
JOLLIFFE, I. T.: "Principal Component Analysis", 2002, SPRINGER |
KATE, W.R.T. TEN; BOERS, P.M.; MAKIVIRTA, A.; KUUSAMA, J.; CHRISTENSEN, K.E.; SORENSEN, E.: "Matrixing of bit rate reduced audio signals", PROC. OF THE ICASSP, no. 2, March 1992 (1992-03-01), pages 205 - 208, XP000356973, DOI: doi:10.1109/ICASSP.1992.226084 |
KATE, WARNER R.: "Compatibility Matrixing of Multichannel Bit-Rate Reduced Audio Signals", 96TH CONVENTION OF THE AUDIO ENG. SOC., February 1994 (1994-02-01) |
MAURI VÄÄNÄNEN: "Robustness Issues in Multi-View Audio Coding", AES CONVENTION 125; OCTOBER 2008, AES, 60 EAST 42ND STREET, ROOM 2520 NEW YORK 10165-2520, USA, 1 October 2008 (2008-10-01), XP040508860 * |
POLETTI M A: "THREE-DIMENSIONAL SURROUND SOUND SYSTEMS", JOURNAL OF THE AUDIO ENGINEERING SOCIETY, AUDIO ENGINEERING SOCIETY, NEW YORK, NY, US, vol. 53, no. 11, 1 November 2005 (2005-11-01), pages 1004 - 1025, XP001243400, ISSN: 1549-4950 * |
SOLEDAD TORRES-GUIJARRO; JON A. BERACOECHEA-ALAVA; LUIS I. ORTIZ-BERENGUER; F. JAVIER CASAJUS-QUIROS: "Inter-channel de-correlation for perceptual audio coding", APPLIED ACOUSTICS, vol. 66, no. 8, 2005, pages 889 - 901 |
YANG DAI ET AL: "An Inter-Channel Redundancy Removal Approach for High-Quality Multichannel Audio Compression", 22 September 2000 (2000-09-22), pages 1 - 14, XP002517098, Retrieved from the Internet <URL:http://www.aes.org/tmpFiles/elib/20090227/9100.pdf> [retrieved on 20000901] * |
YANG, DAI; AI, HONGMEI; KYRIAKAKIS, C.; KUO, C.-C. J.: "High-fidelity multichannel audio coding with Karhunen-Loeve transform", IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, vol. 11, no. 4, 2003, pages 365 - 380, XP011099062, DOI: doi:10.1109/TSA.2003.814375 |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
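CN110827840A (zh) * | 2014-01-30 | 2020-02-21 | Qualcomm Incorporated | Coding independent frames of ambient higher-order ambisonic coefficients |
CN110827840B (zh) * | 2014-01-30 | 2023-09-12 | Qualcomm Incorporated | Coding independent frames of ambient higher-order ambisonic coefficients |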
US10542364B2 (en) | 2014-03-21 | 2020-01-21 | Dolby Laboratories Licensing Corporation | Methods, apparatus and systems for decompressing a higher order ambisonics (HOA) signal |
US10679634B2 (en) | 2014-03-21 | 2020-06-09 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decoding a compressed HOA signal |
US10089992B2 (en) | 2014-03-21 | 2018-10-02 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decompressing a compressed HOA signal |
US10127914B2 (en) | 2014-03-21 | 2018-11-13 | Dolby Laboratories Licensing Corporation | Method for compressing a higher order ambisonics (HOA) signal, method for decompressing a compressed HOA signal, apparatus for compressing a HOA signal, and apparatus for decompressing a compressed HOA signal |
US10192559B2 (en) | 2014-03-21 | 2019-01-29 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decompressing a compressed HOA signal |
CN109410960A (zh) * | 2014-03-21 | 2019-03-01 | Dolby International AB | Method, apparatus and storage medium for decoding a compressed HOA signal |
US10334382B2 (en) | 2014-03-21 | 2019-06-25 | Dolby Laboratories Licensing Corporation | Methods, apparatus and systems for decompressing a higher order ambisonics (HOA) signal |
US10388292B2 (en) | 2014-03-21 | 2019-08-20 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decompressing a compressed HOA signal |
US12069465B2 (en) | 2014-03-21 | 2024-08-20 | Dolby Laboratories Licensing Corporation | Methods, apparatus and systems for decompressing a Higher Order Ambisonics (HOA) signal |
US11830504B2 (en) | 2014-03-21 | 2023-11-28 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decoding a compressed HOA signal |
US10629212B2 (en) | 2014-03-21 | 2020-04-21 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decompressing a compressed HOA signal |
US9930464B2 (en) | 2014-03-21 | 2018-03-27 | Dolby Laboratories Licensing Corporation | Method for compressing a higher order ambisonics (HOA) signal, method for decompressing a compressed HOA signal, apparatus for compressing a HOA signal, and apparatus for decompressing a compressed HOA signal |
US10779104B2 (en) | 2014-03-21 | 2020-09-15 | Dolby Laboratories Licensing Corporation | Methods, apparatus and systems for decompressing a higher order ambisonics (HOA) signal |
US9818413B2 (en) | 2014-03-21 | 2017-11-14 | Dolby Laboratories Licensing Corporation | Method for compressing a higher order ambisonics (HOA) signal, method for decompressing a compressed HOA signal, apparatus for compressing a HOA signal, and apparatus for decompressing a compressed HOA signal |
US11395084B2 (en) | 2014-03-21 | 2022-07-19 | Dolby Laboratories Licensing Corporation | Methods, apparatus and systems for decompressing a higher order ambisonics (HOA) signal |
US11462222B2 (en) | 2014-03-21 | 2022-10-04 | Dolby Laboratories Licensing Corporation | Methods and apparatus for decoding a compressed HOA signal |
US11722830B2 (en) | 2014-03-21 | 2023-08-08 | Dolby Laboratories Licensing Corporation | Methods, apparatus and systems for decompressing a Higher Order Ambisonics (HOA) signal |
CN109410960B (zh) * | 2014-03-21 | 2023-08-29 | Dolby International AB | Method, apparatus and storage medium for decoding a compressed HOA signal |
RU2741763C2 (ru) * | 2014-07-02 | 2021-01-28 | Qualcomm Incorporated | Reducing correlation between higher order ambisonic (HOA) background channels |
JP2017525318A (ja) * | 2014-07-02 | 2017-08-31 | Qualcomm Incorporated | Reducing correlation between higher order ambisonic (HOA) background channels |
WO2018001500A1 (en) * | 2016-06-30 | 2018-01-04 | Huawei Technologies Duesseldorf Gmbh | Apparatuses and methods for encoding and decoding a multichannel audio signal |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10614821B2 (en) | Methods and apparatus for encoding and decoding multi-channel HOA audio signals | |
US9257128B2 (en) | Apparatus and method for coding and decoding multi object audio signal with multi channel | |
US6934676B2 (en) | Method and system for inter-channel signal redundancy removal in perceptual audio coding | |
AU2014295167B2 (en) | Reduction of comb filter artifacts in multi-channel downmix with adaptive phase alignment | |
US9774975B2 (en) | Method and apparatus for decoding a compressed HOA representation, and method and apparatus for encoding a compressed HOA representation | |
EP2688065A1 (de) | Method and apparatus for avoiding unmasking of coding noise when mixing perceptually coded multi-channel audio signals | |
US20140355767A1 (en) | Method and apparatus for performing an adaptive down- and up-mixing of a multi-channel audio signal | |
US10403292B2 (en) | Method and apparatus for encoding/decoding of directions of dominant directional signals within subbands of a HOA signal representation | |
US10194257B2 (en) | Method and apparatus for encoding/decoding of directions of dominant directional signals within subbands of a HOA signal representation | |
US9794714B2 (en) | Method and apparatus for decoding a compressed HOA representation, and method and apparatus for encoding a compressed HOA representation | |
US20120093321A1 (en) | Apparatus and method for encoding and decoding spatial parameter | |
EP3061088B1 (de) | Decorrelator structure for parametric reconstruction of audio signals | |
EP3271918B1 (de) | Audio signal processing apparatus and method | |
US9800986B2 (en) | Method and apparatus for encoding/decoding of directions of dominant directional signals within subbands of a HOA signal representation | |
EP2770505B1 (de) | Audio coding device and method | |
US12051427B2 (en) | Determining corrections to be applied to a multichannel audio signal, associated coding and decoding | |
US20150170656A1 (en) | Audio encoding device, audio coding method, and audio decoding device | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
 | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
 | AX | Request for extension of the European patent | Extension state: BA ME |
 | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
 | 18D | Application deemed to be withdrawn | Effective date: 20140723 |