EP1175030B1 - Method and system for multichannel perceptual audio coding using the cascaded discrete cosine transform or modified discrete cosine transform - Google Patents

Info

Publication number
EP1175030B1
Authority
EP
Grant status
Grant
Patent type
Prior art keywords
signal
channel
inter
redundancy reduction
audio signals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
EP20010305191
Other languages
German (de)
French (fr)
Other versions
EP1175030A2 (en)
EP1175030A3 (en)
Inventor
Ye Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Solutions and Networks Oy
Original Assignee
Nokia Solutions and Networks Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Grant date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H20/00 Arrangements for broadcast or for distribution combined with broadcast
    • H04H20/86 Arrangements characterised by the broadcast information itself
    • H04H20/88 Stereophonic broadcast systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation

Description

    Field of the Invention
  • The present invention relates generally to audio coding and, in particular, to the coding technique used in a multiple channel surround sound system.
  • Background of the Invention
  • As is well known in the art, the International Organization for Standardization (ISO) founded the Moving Picture Experts Group (MPEG) with the intention of developing and standardizing compression algorithms for video and audio signals. One of the most efficient audio coding techniques since 1997 is the MPEG-2 Advanced Audio Coding (AAC) algorithm.
    The driving force behind the development of the AAC algorithm has been the quest for an efficient coding method for surround sound signals, such as 5-channel signals including left (L), right (R), center (C), left-surround (LS) and right-surround (RS) signals. MPEG-2 AAC basically makes use of the signal masking properties of the human ear in order to reduce the amount of data. Generally, an N-channel surround sound system running at a bit rate of M bps/ch does not necessarily have a total bit rate of MxN bps, but rather an overall bit rate significantly less than MxN bps due to cross-channel (inter-channel) redundancy. To exploit the inter-channel redundancy, two methods have been used in the MPEG-2 AAC standard: Mid-Side (MS) Stereo Coding and Intensity Stereo Coding. Both the MS Stereo and Intensity Stereo coding methods operate on channel pairs, as shown in Figure 1, where the signals in the channel pairs are denoted by (100_L, 100_R) and (100_LS, 100_RS). The rationale behind the application of stereo audio coding is based on the fact that the human auditory system, as well as a stereo recording system, uses two audio signal detectors. While a human being has two ears, a stereo recording system has two microphones. With these two audio signal detectors, the human auditory system or the stereo recording system receives and records an audio signal from the same source twice, once through each audio signal detector. The two sets of recorded data of the audio signal from the same source contain time and signal level differences caused mainly by the positions of the detectors in relation to the source.
  • It is believed that the human auditory system itself is able to detect and discard the inter-channel redundancy, thereby avoiding extra processing. At low frequencies, the human auditory system locates sound sources mainly based on the inter-aural time difference (ITD) of the arriving signals. At high frequencies, the difference in signal strength or intensity level at the two ears, or inter-aural level difference (ILD), is the major cue. In order to remove the redundancy in the received signals in a stereo sound system, the psychoacoustic model analyzes the received signals in consecutive time blocks and determines for each block the spectral components of the received audio signal in the frequency domain in order to remove certain spectral components, thereby mimicking the masking properties of the human auditory system. Like any perceptual audio coder, the MPEG audio coder does not attempt to retain the input signal exactly after encoding and decoding; rather, its goal is to reduce the amount of audio data while keeping the output signals similar to what the human auditory system might perceive. Thus, the MS Stereo coding technique applies a matrix to the signals of the (L, R) or (LS, RS) pair in order to compute the sum and difference of the two original signals, dealing mainly with the spectral image in the mid-frequency range. Intensity Stereo coding replaces the left and the right signals by a single representative signal plus directional information. The replacement of signals in the Intensity Stereo coding scheme is psychoacoustically justified in the higher frequency range at around 2 kHz.
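The sum-and-difference matrix used by MS Stereo coding can be illustrated with a minimal sketch. The 1/2 scaling, the function names `ms_encode` and `ms_decode`, and the sample values are assumptions made for illustration; the text above only states that a sum and a difference of the pair are computed.

```python
import numpy as np

def ms_encode(left, right):
    """Return (mid, side) as the half-sum and half-difference of a channel pair."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    return 0.5 * (left + right), 0.5 * (left - right)

def ms_decode(mid, side):
    """Invert ms_encode: left = mid + side, right = mid - side."""
    return mid + side, mid - side

# Hypothetical spectral values for a strongly correlated (L, R) pair.
left = np.array([1.0, 0.5, -0.25, 0.0])
right = np.array([0.9, 0.5, -0.30, 0.1])

mid, side = ms_encode(left, right)
rec_left, rec_right = ms_decode(mid, side)

# For correlated channels the side signal carries far less energy than
# either original channel, which is where the saving comes from.
side_energy = float(np.sum(side ** 2))
left_energy = float(np.sum(left ** 2))
```

The matrix is exactly invertible, so any loss in an MS coder comes from the later quantization step and not from the matrixing itself.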
  • While conventional audio coding techniques can reduce a significant amount of channel redundancy in channel pairs (L/R or LS/RS) based on the dual channel correlation, they may not be efficient in coding audio signals when a large number of channels are used in a surround sound system.
  • It is advantageous and desirable to provide a more efficient encoding system and method in order to further reduce the redundancy in the stereo sound signals. In particular, the method can be advantageously applied to a surround sound system having a large number of sound channels (6 or more, for example). Such a system and method can also be used in audio streaming over Internet Protocol (IP) for personal computer (PC) users, mobile IP and third-generation (3G) systems for mobile laptop users, digital radio, digital television, and digital archives of movie sound tracks and the like.
  • Summary of the Invention
  • The primary objective of the present invention is to improve the efficiency in encoding audio signals in a sound system in order to reduce the amount of audio data for transmission or storage.
  • According to the present invention, a method of coding audio signals in a sound system having a plurality of sound channels for providing M sets of audio signals, wherein M is a positive integer greater than 2, comprises the steps of computing a first value representative of coding efficiency of an intra-channel signal redundancy reduction of said audio signals, providing a first signal having a first magnitude indicative of said first value, computing a second value representative of coding efficiency of an inter-channel signal redundancy reduction in said audio signals, in response to said first signal, providing a second signal having a second magnitude indicative of said second value, in response to the first and second signals, comparing the first value to the second value, selecting the one of said intra-channel signal redundancy reduction and said inter-channel signal redundancy reduction that is more efficient, providing a third signal indicative of the one of the first and second signals corresponding to the selected one of said intra-channel signal redundancy reduction and said inter-channel signal redundancy reduction, and encoding the audio signals according to the selected process.
  • Preferably, the intra-channel signal redundancy reduction is carried out in accordance with a modified discrete cosine transform process, and the inter-channel signal redundancy reduction is carried out in accordance with a cascaded discrete cosine transform process. In this case, the selection of intra-channel signal redundancy reduction or inter-channel signal redundancy reduction of said audio signals, may comprise causing inter-channel signal redundancy reduction means to be turned off or on respectively.
  • Preferably, the inter-channel signal redundancy reduction is carried out in order to reduce redundancy in the audio signals among L channels, wherein L is a positive integer greater than 2 but smaller than M+1.
  • Preferably, the encoding step includes a signal masking process according to a psychoacoustic model simulating a human auditory system.
  • Preferably, the method further includes the step of converting the encoded signals into a bit stream.
  • The present invention also provides an encoding apparatus for a sound system having a plurality of sound channels for providing M sets of audio signals, wherein M is a positive integer greater than 2, the apparatus comprising comparing means for comparing a first value, representative of the coding efficiency of an intra-channel signal redundancy reduction of said audio signals, with a second value, representative of the coding efficiency of an inter-channel signal redundancy reduction of said audio signals, to determine which of said intra-channel signal redundancy reduction and said inter-channel signal redundancy reduction is more efficient, selecting means for selecting one of said intra-channel redundancy reduction and said inter-channel signal redundancy reduction based on said comparison, computation means for computing said first and second values, means responsive to said audio signals, for providing a first signal having a first magnitude indicative of the first value, and means responsive to the first reduced audio signal, for providing a second signal having a second magnitude indicative of the second value, wherein said comparing means is arranged to compare said first and second values in response to said first and second signals.
  • Preferably, the intra-channel signal redundancy is reduced by a modified discrete cosine transform process, and the inter-channel signal redundancy is reduced by a cascaded discrete cosine transform process in L of the M sets of audio signals, wherein L is a positive integer greater than 2 but smaller than M+1.
  • Preferably, the encoder also includes a mechanism for masking the audio signals according to a psychoacoustic model simulating a human auditory system.
  • Preferably, the encoder also includes a quantizer for quantizing the third signal into an encoded signal and a bit-stream formatter for converting the encoded signal into a bit-stream.
  • The present invention will become apparent upon reading the description taken in conjunction with Figures 2a to 3.
  • Brief Description of the Drawings
    • Figure 1 is a diagrammatic representation illustrating a conventional audio coding method for a surround sound system.
    • Figure 2a is a diagrammatic representation illustrating an audio coding method using an M channel cascaded discrete cosine transform in an M channel sound system.
    • Figure 2b is a diagrammatic representation illustrating an audio coding method using an L channel cascaded discrete cosine transform in an M channel sound system, where L<M.
    • Figure 3 is a block diagram illustrating a system for audio coding, according to the present invention.
    Detailed Description
  • The present invention improves the coding efficiency in audio coding for a sound system having M sound channels for sound reproduction, wherein M is greater than 2. In the encoder of the present invention, the individual or intra-channel masking thresholds for each of the sound channels are calculated in a fashion similar to a basic Advanced Audio Coding (AAC) encoder. This method is herein referred to as the intra-channel signal redundancy reduction method. Unlike the conventional coding method, however, it also relies on the inter-channel discrete cosine transform (DCT) of the modified discrete cosine transform (MDCT) coefficients. This method is herein referred to as the cascaded MDCT-DCT coding method for inter-channel signal redundancy reduction. The MDCT-DCT coefficients should be quantized according to the highest threshold, taking into account the inter-channel masking effect, known as the masking level difference (MLD). This is characterized by a decreasing masking threshold when the masking mechanism is spatially separated from the source being masked.
  • As shown in Figures 2a and 2b, one of the audio coding steps of the present coding method is to perform an inter-channel DCT of multiple channel MDCT coefficients in a cascaded manner in order to reduce the inter-channel redundancy in an M channel sound system, wherein M is greater than 2. Figures 2a and 2b diagrammatically illustrate M sound channels, and a group of DCT units 40 is used to perform the inter-channel DCT on audio signals 100_1, 100_2, 100_3, ..., 100_(M-1), and 100_M. When a block of N samples (the transform length) is used to compute a series of MDCT coefficients, the maximum number of DCT units 40 used to perform the inter-channel DCT is equal to the number of MDCT coefficients. The MDCT transform length N is determined by transform gain, computational complexity and the pre-echo problem; and the number of MDCT coefficients is N/2. Typically, the MDCT transform length N is between 256 and 2048 samples. Accordingly, the number of DCT units required to perform the inter-channel DCT is between 128 and 1024. In practice, however, the number of DCT units needed for performing the inter-channel DCT is much smaller.
  • As shown in Figure 2a, the cascaded MDCT-DCT is carried out with M DCT units 40. It is also possible, however, to perform the inter-channel DCT of the MDCT coefficients of L channels, wherein the L channels are a subset of the M channels, with L being greater than 2 and smaller than M+1. For example, in a 5-channel sound system consisting of left (L), right (R), center (C), left-surround (LS) and right-surround (RS) channels, it is possible to perform the cascaded inter-channel DCT of the MDCT coefficients involving only 4 channels, namely, L, R, LS and RS. Likewise, in a 12-channel sound system, it is possible to perform an inter-channel DCT of only 5 or 6 channel MDCT coefficients. As shown in Figure 2b, the cascaded MDCT-DCT is carried out with M-3 DCT units 40 in order to compute the cross correlation among audio signals 100_3, ..., and 100_(M-1).
  • In some surround sound recording and reproduction cases, the correlation in the audio signals among L (>2) channels is strong. Accordingly, the efficiency of audio coding using the cascaded MDCT-DCT method is higher than the efficiency of the intra-channel MDCT method alone. However, if the correlation in the audio signals among the L channels is weak, it is possible that this inter-channel DCT technique may not be as efficient as the intra-channel signal redundancy reduction using the MDCT coding method. Thus, it is advantageous to provide a comparison device to compare the coding efficiency of the two methods for each sampling block or a group of sampling blocks and select the more efficient method.
  • The efficiency of the intra-channel MDCT coding method is represented by Equations 1 and 2 below. In a block of N samples with each block having a series of sound amplitude values of a(k)'s, the MDCT coefficients in the frequency domain are given by:
    $$\alpha_r^m = \frac{1}{N}\sum_{k=0}^{N-1} a(k)_m \cos\left[\frac{\pi}{4N}\,(4k+2+N)(2r+1)\right] \tag{1}$$
    $$\mathrm{bits\_per\_coef}_1 = \frac{1}{MN}\sum_{r=0}^{N/2-1}\sum_{m=1}^{M}\log_2\left(\mathrm{abs}\left(\alpha_r^m\right)\right) \tag{2}$$

    In the above equations, m represents a channel number and M represents the number of sound channels involved.
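Equations 1 and 2 can be sketched numerically as follows. This is a direct, unoptimized evaluation of the sums; the names `mdct_eq1` and `bits_per_coef`, the array layout (one row per channel) and the small `eps` floor that keeps log2 finite for zero coefficients are assumptions, not part of the original description.

```python
import numpy as np

def mdct_eq1(a):
    """Evaluate Equation 1 for samples a of shape (M, N); returns shape (M, N/2)."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    n = a.shape[1]
    k = np.arange(n)
    r = np.arange(n // 2)
    # kernel[k, r] = cos(pi / (4N) * (4k + 2 + N) * (2r + 1))
    kernel = np.cos(np.pi / (4 * n) * np.outer(4 * k + 2 + n, 2 * r + 1))
    return a @ kernel / n

def bits_per_coef(coefs, n, eps=1e-12):
    """Evaluate Equation 2: sum of log2|coef| over all coefficients, divided by M*N.

    eps is an assumed floor that avoids log2(0); it is not in the source.
    """
    m = coefs.shape[0]
    return float(np.sum(np.log2(np.abs(coefs) + eps)) / (m * n))

N, r0 = 16, 3
k = np.arange(N)
rng = np.random.default_rng(0)
samples = np.vstack([
    # Channel 1: one MDCT basis function, so only coefficient r0 is non-zero.
    np.cos(np.pi / (4 * N) * (4 * k + 2 + N) * (2 * r0 + 1)),
    # Channel 2: hypothetical noise-like block.
    rng.standard_normal(N),
])

alpha = mdct_eq1(samples)
bits1 = bits_per_coef(alpha, N)
```

With the 1/N prefactor of Equation 1, a unit-amplitude basis function as input yields a coefficient of 1/2 at index `r0` and zeros elsewhere, which is a convenient sanity check on the kernel.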
  • In particular, if it is desirable to determine the cross correlation among all M channels, then a cascaded inter-channel DCT of the M sets of MDCT coefficients should be performed, as given in Equations 3 and 4 below:
    $$\beta_{r,s} = \frac{1}{MN}\sum_{k=0}^{N-1}\sum_{j=0}^{M-1} a(k,j) \cos\left[\frac{\pi}{4N}\,(4k+2+N)(2r+1)\right] \cos\left[\frac{\pi}{M}\,(2j+1)\,s\right] \tag{3}$$
    $$\mathrm{bits\_per\_coef}_2 = \frac{1}{MN}\sum_{r=0}^{N/2-1}\sum_{s=1}^{M}\log_2\left(\mathrm{abs}\left(\beta_{r,s}\right)\right) \tag{4}$$

    It should be noted that the coefficient a(k) in Equation 1 and the coefficient a(k,j) in Equation 3 may include a modified function of sin(πk/N).
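A minimal sketch of the cascaded MDCT-DCT of Equation 3 follows. The inter-channel kernel is taken from Equation 3 as it appears in this text, with the angle (π/M)(2j+1)s; a textbook DCT-II uses π/(2M) instead, so the hypothetical flag `textbook_dct2` switches to that variant. The index range s = 0, ..., M-1 and all function names are likewise assumptions.

```python
import numpy as np

def mdct_eq1(a):
    """Equation 1 for samples of shape (M, N); returns coefficients of shape (M, N/2)."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    n = a.shape[1]
    k = np.arange(n)
    r = np.arange(n // 2)
    kernel = np.cos(np.pi / (4 * n) * np.outer(4 * k + 2 + n, 2 * r + 1))
    return a @ kernel / n

def cascaded_mdct_dct(a, textbook_dct2=False):
    """Equation 3: an inter-channel cosine transform applied to the MDCT coefficients.

    Returns beta with shape (M, N/2), indexed as beta[s, r].
    """
    alpha = mdct_eq1(a)                      # intra-channel stage
    m = alpha.shape[0]
    j = np.arange(m)
    s = np.arange(m)                         # assumed range s = 0..M-1
    denom = 2 * m if textbook_dct2 else m    # pi/(2M) versus pi/M as printed
    dct_kernel = np.cos(np.pi / denom * np.outer(s, 2 * j + 1))
    return dct_kernel @ alpha / m            # inter-channel stage

# Four identical channels: the extreme case of inter-channel redundancy.
rng = np.random.default_rng(1)
block = rng.standard_normal(16)
samples = np.tile(block, (4, 1))

alpha = mdct_eq1(samples)
beta = cascaded_mdct_dct(samples)
beta_textbook = cascaded_mdct_dct(samples, textbook_dct2=True)
```

For identical channels, both kernel variants concentrate all of the energy in the s = 0 row and leave the remaining rows at zero up to rounding, which is the behaviour the inter-channel stage is meant to exploit.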
  • In order to ensure that the efficiency of the cascaded MDCT-DCT process is higher than that of the intra-channel MDCT process, it is possible to compute the gain according to Equation 5 as follows:
    $$G = \sum_{i=0}^{L-1} \frac{\mathrm{bits\_per\_coef}_1 - \mathrm{bits\_per\_coef}_2}{\mathrm{bits\_per\_coef}_1} \tag{5}$$

    where L is the number of frames of the test signal used to calculate the average gain G. If G is positive, then the efficiency of the cascaded inter-channel DCT process is higher than the efficiency of the intra-channel MDCT process. Accordingly, the cascaded inter-channel DCT should be used for audio coding in order to reduce the amount of encoded data.
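The gain test of Equation 5 can be sketched as below. The per-frame values are hypothetical numbers chosen for illustration, the names `relative_gain` and `choose_mode` are assumptions, and treating G = 0 as "keep the intra-channel method" is a choice the text does not specify. The sign test also presumes that bits_per_coef_1 is positive in every frame.

```python
def relative_gain(bits1_per_frame, bits2_per_frame):
    """Equation 5: sum over frames of (bits1 - bits2) / bits1."""
    return sum((b1 - b2) / b1 for b1, b2 in zip(bits1_per_frame, bits2_per_frame))

def choose_mode(gain):
    """Select the cascaded inter-channel DCT only when the gain is positive."""
    return "inter-channel" if gain > 0 else "intra-channel"

# Hypothetical bits-per-coefficient figures for L = 2 frames.
bits1 = [4.0, 5.0]   # intra-channel MDCT only (Equation 2)
bits2 = [3.0, 4.0]   # cascaded MDCT-DCT (Equation 4)

G = relative_gain(bits1, bits2)
mode = choose_mode(G)
```

Here each frame saves one bit per coefficient, so G = 1/4 + 1/5 and the cascaded method would be selected.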
  • Alternatively, the efficiency in the inter-channel signal redundancy reduction using the cascaded MDCT-DCT process can be evaluated using a cross-channel correlation method. The normalized cross-channel correlation coefficient between any two channels p and q is represented by the following equation:
    $$C_{pq} = \frac{\sum_{r=0}^{N/2-1} \alpha_r^p\, \alpha_r^q}{\sqrt{\left(\sum_{r=0}^{N/2-1} \alpha_r^p\, \alpha_r^p\right)\left(\sum_{r=0}^{N/2-1} \alpha_r^q\, \alpha_r^q\right)}} \tag{6}$$

    The absolute value of C_pq can be used to set a threshold over which the cascaded MDCT-DCT process should be used. In an M channel system, it is possible to calculate M(M-1)/2 normalized cross-channel correlation coefficients. For example, in a three channel system having channels 1, 2 and 3, it is possible to calculate the normalized cross-channel correlation coefficients C_12, C_13, and C_23. The sum of the absolute values of these normalized cross-channel correlation coefficients can be used to compare the efficiency of the intra-channel MDCT method to the inter-channel cascaded MDCT-DCT method.
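The correlation measure of Equation 6 and the three-channel example can be sketched as follows. The coefficient values are hypothetical, the function names are assumptions, and each channel is assumed to have non-zero energy so that the denominator does not vanish.

```python
from itertools import combinations

import numpy as np

def cross_channel_correlation(alpha_p, alpha_q):
    """Equation 6: normalized correlation between the MDCT coefficients of two channels."""
    numerator = np.dot(alpha_p, alpha_q)
    denominator = np.sqrt(np.dot(alpha_p, alpha_p) * np.dot(alpha_q, alpha_q))
    return float(numerator / denominator)

def pairwise_correlations(alpha):
    """Return {(p, q): C_pq} for all M(M-1)/2 channel pairs, numbered from 1."""
    m = alpha.shape[0]
    return {
        (p + 1, q + 1): cross_channel_correlation(alpha[p], alpha[q])
        for p, q in combinations(range(m), 2)
    }

# Hypothetical MDCT coefficients for a three channel system (one row per channel).
alpha = np.array([
    [1.0, 2.0, 3.0],    # channel 1
    [2.0, 4.0, 6.0],    # channel 2: a scaled copy of channel 1
    [3.0, -3.0, 1.0],   # channel 3: orthogonal to channels 1 and 2
])

corr = pairwise_correlations(alpha)
total = sum(abs(c) for c in corr.values())
```

Channels 1 and 2 are fully correlated and channel 3 is uncorrelated with both, so the summed absolute correlation reflects a single redundant pair. A threshold on this sum would then decide whether the cascaded MDCT-DCT is worth enabling.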
  • Accordingly, the present invention provides a system for efficient audio coding to reduce redundancy in an M channel sound system, as shown in Figure 3. As shown, the pulse code modulation (PCM) samples 20 in the M channels are first conveyed to a set of M Shifted Discrete Fourier Transform (SDFT) devices 22_1, 22_2, ..., 22_M so that the real parts of the SDFT coefficients form a group of M MDCT coefficients in a group of M MDCT units 30_1, 30_2, ..., 30_M, respectively. The devices 22_1, 22_2, ..., 22_M and the MDCT units 30_1, 30_2, ..., 30_M together perform an intra-channel decorrelation.
  • For a set of signal sequences {a(k)_m}, the Shifted Discrete Fourier Transform coefficient is defined as follows:
    $$\alpha_r^m = \frac{1}{N}\sum_{k=0}^{N-1} a(k)_m \exp\left[\frac{i\,2\pi\,(k+u)(r+v)}{N}\right] \tag{7}$$

    where u=(N+2)/4 and v=1/2, being the shift in the time domain and the shift in the frequency domain, respectively. Thus, the relationship between the MDCT coefficients (Eq.1) and the SDFT coefficients (Eq.7) is as follows:
    $$\alpha_r^m = \frac{1}{N}\sum_{k=0}^{N-1} a(k)_m \cos\left[\frac{\pi}{4N}\,(4k+2+N)(2r+1)\right] = \frac{1}{N}\sum_{k=0}^{N-1} \tilde{a}(k)_m \exp\left[\frac{i\,2\pi\,(k+u)(r+v)}{N}\right] \tag{8}$$

    where ã(k)_m = a(k)_m - a(N/2-1-k)_m for k=0,..., (N/2)-1; and ã(k)_m = a(k)_m - a(3N/2-1-k)_m for k=(N/2),..., (N-1), with N being an even number. Accordingly, the right-hand side of Eq.8 is SDFT_(u,v)(ã(k)_m/2) or real{SDFT_(u,v)(a(k)_m/2)}.
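The link between the SDFT and the MDCT can be checked numerically. The sketch below takes Equations 1 and 7 as written, both with the 1/N prefactor, and compares the real part of the SDFT of a block with its MDCT coefficients for u = (N+2)/4 and v = 1/2. The function names, the random test block and the restriction to the first N/2 frequency indices are assumptions; the folded sequence ã and any additional scale factor are not examined here.

```python
import numpy as np

def mdct_eq1(a):
    """Equation 1 for a single block of N samples; returns N/2 coefficients."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    k = np.arange(n)
    r = np.arange(n // 2)
    kernel = np.cos(np.pi / (4 * n) * np.outer(4 * k + 2 + n, 2 * r + 1))
    return a @ kernel / n

def sdft(a, u, v):
    """Equation 7 for a single block, evaluated at r = 0..N/2-1 (assumed range)."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    k = np.arange(n)
    r = np.arange(n // 2)
    phase = 2 * np.pi * np.outer(k + u, r + v) / n
    return a @ np.exp(1j * phase) / n

N = 16
rng = np.random.default_rng(2)
block = rng.standard_normal(N)

sdft_coefs = sdft(block, u=(N + 2) / 4, v=0.5)
mdct_coefs = mdct_eq1(block)
max_abs_diff = float(np.max(np.abs(sdft_coefs.real - mdct_coefs)))
```

With these shifts the SDFT phase 2π(k+u)(r+v)/N reduces to (π/4N)(4k+2+N)(2r+1), so the cosine part of the exponential is exactly the MDCT kernel of Equation 1.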
  • As shown in Figure 3, a number of DCT units 40 are used to compute the inter-channel signal redundancy reduction in these M sets of MDCT coefficients. The number of DCT units 40 can be equal to or less than the number of MDCT coefficients in each of the M channels, as discussed earlier in conjunction with Equation 3. A comparison device 50 is used to compute the gain G (Equation 5) or the threshold from the cross-channel correlation coefficients C_pq (Equation 6) to ensure that the coding according to the cascaded inter-channel DCT of the MDCT coefficients is more efficient than the intra-channel decorrelation by the MDCT units 30_1, 30_2, ..., 30_M. If the gain G is negative or the cross-channel correlation is lower than a pre-determined threshold, the comparison device 50 can cause the DCT units 40 to be turned off. A masking mechanism 52, based on a so-called psychoacoustic model, is used to remove the audio data believed not to be used by a human auditory system. As shown in Figure 3, the masking mechanism is also operatively connected to the comparison device 50 so that the masking is carried out according to the intra-channel MDCT manner or the inter-channel MDCT-DCT manner. Finally, the 2-D spectral image is quantized by a group of quantizers 60_1, 60_2, ..., 60_M according to the masking threshold calculated by the psychoacoustic model, and the quantized data is further processed by a bit stream formatter 70 into a bit stream 80 for transmission or storage.
  • The efficiency of the cascaded MDCT-DCT coding process in removing cross-channel redundancy, in general, increases with the number of sound channels involved. For example, if a sound system consists of 6 or more surround sound speakers, then the reduction in cross-channel redundancy using the cascaded MDCT-DCT processing is usually significant. However, if the number of channels to be used in the cascaded MDCT-DCT processing is 2, then the efficiency may not be improved at all. It should be noted that, like any perceptual audio coder, the goal of the cascaded MDCT-DCT processing is to reduce the audio data for transmission or storage. While the processing method is intended to produce signal outputs similar to what a human auditory system might perceive, its goal is not to replicate the input signals.
  • It should be noted that the so-called psychoacoustic model may consist of a certain perceptual model and a certain band mapping model. The surround sound encoding system may consist of components such as an AAC gain control and a certain long-term prediction model. However, these components are well-known in the art and they can be modified, replaced or omitted. Thus, although the invention has been described with respect to a preferred embodiment thereof, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.

Claims (13)

  1. A method of coding audio signals (100_1, ..., 100_M) in a sound system having a plurality of sound channels for providing M sets of audio signals, wherein M is a positive integer greater than 2, said method comprising the steps of:
    comparing a first value, representative of the coding efficiency of an intra-channel signal redundancy reduction of said audio signals, with a second value, representative of the coding efficiency of an inter-channel signal redundancy reduction of said audio signals, to determine which of said intra-channel signal redundancy reduction and said inter-channel signal redundancy reduction is more efficient;
    selecting one of said intra-channel redundancy reduction and said inter-channel signal redundancy reduction based on said comparison; and
    encoding the audio signals (100_1, ..., 100_M) according to the selected redundancy reduction;
    characterised by:
    computing said first value;
    providing a first signal having a first magnitude indicative of said first value;
    computing said second value;
    in response to said first signal, providing a second signal having a second magnitude indicative of said second value; and
    providing a third signal indicative of the one of the first and second signals corresponding to the selected one of said intra-channel redundancy reduction and said inter-channel signal redundancy reduction;
    wherein said step of comparing is performed in response to the first and second signals.
  2. The method of claim 1, wherein the intra-channel signal redundancy reduction is carried out in accordance with a modified discrete cosine transform process, and the inter-channel signal redundancy reduction is carried out in accordance with a cascaded discrete cosine transform process.
  3. The method of claim 1, wherein the inter-channel signal redundancy reduction is carried out in order to reduce redundancy in the audio signals (100_1, ..., 100_M) among L channels, wherein L is a positive integer greater than 2 but smaller than M+1.
  4. The method of claim 1, wherein the encoding step includes a signal masking process according to a psychoacoustic model (52) simulating a human auditory system.
  5. The method of claim 1, further comprising the step of converting the encoded signals into a bit stream.
  6. An encoding apparatus for a sound system having a plurality of sound channels for providing M sets of audio signals (100_1, ..., 100_M), wherein M is a positive integer greater than 2, comprising:
    comparing means (50) for comparing a first value, representative of the coding efficiency of an intra-channel signal redundancy reduction of said audio signals, with a second value, representative of the coding efficiency of an inter-channel signal redundancy reduction of said audio signals, to determine which of said intra-channel signal redundancy reduction and said inter-channel signal redundancy reduction is more efficient and for selecting one of said intra-channel redundancy reduction and said inter-channel signal redundancy reduction based on said comparison;
    characterised by:
    computation means for computing said first and second values;
    first signal providing means (30_1, ..., 30_M), responsive to said audio signals (100_1, ..., 100_M), for providing a first signal having a first magnitude indicative of the first value; and
    second signal providing means (40), responsive to the first reduced audio signal, for providing a second signal having a second magnitude indicative of the second value;
    wherein said comparing means (50) is arranged to compare said first and second values in response to said first and second signals.
  7. The encoding apparatus of claim 6, wherein the first signal providing means (30_1, ..., 30_M) remove the intra-channel signal redundancy by a modified discrete cosine transform process, and the second signal providing means (40) remove the inter-channel signal redundancy by a cascaded discrete cosine transform process.
  8. The encoding apparatus of claim 6, wherein the second signal providing means (40) removes the inter-channel signal redundancy in L of the M sets of audio signals and wherein L is a positive integer greater than 2 but smaller than M+1.
  9. The encoding apparatus of claim 6, further comprising a mechanism for masking the audio signals according to a psychoacoustic model (52) simulating a human auditory system.
  10. The encoding apparatus of claim 6, further comprising a mechanism (60_1, ..., 60_M) for quantizing the third signal into an encoded signal.
  11. The encoding apparatus of claim 10, further comprising a mechanism (70) for converting the encoded signal into a bit stream (80).
  12. The encoding apparatus of claim 6, wherein the comparing means (50) is capable of computing a value indicative of cross-channel correlation coefficients among the M sets of audio signals (100_1, ..., 100_M) and comparing said value to a pre-determined threshold in order to compare the first and second data amounts.
  13. The encoding apparatus of claim 7, wherein the comparing means (50) is arranged to select intra-channel signal redundancy reduction or inter-channel signal redundancy reduction of said audio signals (100_1, ..., 100_M), by causing inter-channel signal redundancy reduction means (40) to be turned off or on respectively.
EP20010305191 2000-07-07 2001-06-14 Method and system for multichannel perceptual audio coding using the cascaded discrete cosine transform or modified discrete cosine transform Expired - Fee Related EP1175030B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US61220700 2000-07-07 2000-07-07
US612207 2000-07-07

Publications (3)

Publication Number Publication Date
EP1175030A2 (en) 2002-01-23
EP1175030A3 (en) 2002-10-23
EP1175030B1 (en) 2008-02-20

Family

ID=24452190

Family Applications (1)

Application Number Title Priority Date Filing Date
EP20010305191 Expired - Fee Related EP1175030B1 (en) 2000-07-07 2001-06-14 Method and system for multichannel perceptual audio coding using the cascaded discrete cosine transform or modified discrete cosine transform

Country Status (2)

Country Link
EP (1) EP1175030B1 (en)
DE (1) DE60132853D1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7860720B2 (en) 2002-09-04 2010-12-28 Microsoft Corporation Multi-channel audio encoding and decoding with different window configurations
US7917369B2 (en) 2001-12-14 2011-03-29 Microsoft Corporation Quality improvement techniques in an audio encoder
US8190425B2 (en) 2006-01-20 2012-05-29 Microsoft Corporation Complex cross-correlation parameters for multi-channel audio
US8249883B2 (en) 2007-10-26 2012-08-21 Microsoft Corporation Channel extension coding for multi-channel source
US8255229B2 (en) 2007-06-29 2012-08-28 Microsoft Corporation Bitstream syntax for multi-process audio decoding
US8255234B2 (en) 2002-09-04 2012-08-28 Microsoft Corporation Quantization and inverse quantization for audio
US8428943B2 (en) 2001-12-14 2013-04-23 Microsoft Corporation Quantization matrices for digital audio
US8645127B2 (en) 2004-01-23 2014-02-04 Microsoft Corporation Efficient coding of digital media spectral data using wide-sense perceptual similarity
US9105271B2 (en) 2006-01-20 2015-08-11 Microsoft Technology Licensing, Llc Complex-transform channel coding with extended-band frequency coding

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7299190B2 (en) 2002-09-04 2007-11-20 Microsoft Corporation Quantization and inverse quantization for audio
US7953604B2 (en) 2006-01-20 2011-05-31 Microsoft Corporation Shape and scale parameters for extended-band frequency coding
US8046214B2 (en) * 2007-06-22 2011-10-25 Microsoft Corporation Low complexity decoder for complex transform coding of multi-channel sound
US8631060B2 (en) 2007-12-13 2014-01-14 Qualcomm Incorporated Fast algorithms for computation of 5-point DCT-II, DCT-IV, and DST-IV, and architectures
US20120215788A1 (en) * 2009-11-18 2012-08-23 Nokia Corporation Data Processing
KR101666465B1 (en) * 2010-07-22 2016-10-17 삼성전자주식회사 Apparatus and method for encoding/decoding multi-channel audio signal

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4331376C1 (en) * 1993-09-15 1994-11-10 Fraunhofer Ges Forschung Method for determining the type of encoding to be selected for the encoding of at least two signals
US5488665A (en) * 1993-11-23 1996-01-30 At&T Corp. Multi-channel perceptual audio compression system with encoding mode switching among matrixed channels
JP3404837B2 (en) * 1993-12-07 2003-05-12 ソニー株式会社 Multilayer encoding device
KR970005131B1 (en) * 1994-01-18 1997-04-12 배순훈 Digital audio encoding apparatus adaptive to the human auditory characteristic
EP0688113A2 (en) * 1994-06-13 1995-12-20 Sony Corporation Method and apparatus for encoding and decoding digital audio signals and apparatus for recording digital audio
US5812971A (en) * 1996-03-22 1998-09-22 Lucent Technologies Inc. Enhanced joint stereo coding method using temporal envelope shaping
DE19628292B4 (en) * 1996-07-12 2007-08-02 At & T Laboratories A method of encoding and decoding stereo audio spectral values

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8428943B2 (en) 2001-12-14 2013-04-23 Microsoft Corporation Quantization matrices for digital audio
US7917369B2 (en) 2001-12-14 2011-03-29 Microsoft Corporation Quality improvement techniques in an audio encoder
US9305558B2 (en) 2001-12-14 2016-04-05 Microsoft Technology Licensing, Llc Multi-channel audio encoding/decoding with parametric compression/decompression and weight factors
US8805696B2 (en) 2001-12-14 2014-08-12 Microsoft Corporation Quality improvement techniques in an audio encoder
US8554569B2 (en) 2001-12-14 2013-10-08 Microsoft Corporation Quality improvement techniques in an audio encoder
US9443525B2 (en) 2001-12-14 2016-09-13 Microsoft Technology Licensing, Llc Quality improvement techniques in an audio encoder
US7860720B2 (en) 2002-09-04 2010-12-28 Microsoft Corporation Multi-channel audio encoding and decoding with different window configurations
US8255230B2 (en) 2002-09-04 2012-08-28 Microsoft Corporation Multi-channel audio encoding and decoding
US8255234B2 (en) 2002-09-04 2012-08-28 Microsoft Corporation Quantization and inverse quantization for audio
US8386269B2 (en) 2002-09-04 2013-02-26 Microsoft Corporation Multi-channel audio encoding and decoding
US8069050B2 (en) 2002-09-04 2011-11-29 Microsoft Corporation Multi-channel audio encoding and decoding
US8099292B2 (en) 2002-09-04 2012-01-17 Microsoft Corporation Multi-channel audio encoding and decoding
US8620674B2 (en) 2002-09-04 2013-12-31 Microsoft Corporation Multi-channel audio encoding and decoding
US8645127B2 (en) 2004-01-23 2014-02-04 Microsoft Corporation Efficient coding of digital media spectral data using wide-sense perceptual similarity
US8190425B2 (en) 2006-01-20 2012-05-29 Microsoft Corporation Complex cross-correlation parameters for multi-channel audio
US9105271B2 (en) 2006-01-20 2015-08-11 Microsoft Technology Licensing, Llc Complex-transform channel coding with extended-band frequency coding
US8645146B2 (en) 2007-06-29 2014-02-04 Microsoft Corporation Bitstream syntax for multi-process audio decoding
US9026452B2 (en) 2007-06-29 2015-05-05 Microsoft Technology Licensing, Llc Bitstream syntax for multi-process audio decoding
US9349376B2 (en) 2007-06-29 2016-05-24 Microsoft Technology Licensing, Llc Bitstream syntax for multi-process audio decoding
US8255229B2 (en) 2007-06-29 2012-08-28 Microsoft Corporation Bitstream syntax for multi-process audio decoding
US8249883B2 (en) 2007-10-26 2012-08-21 Microsoft Corporation Channel extension coding for multi-channel source

Also Published As

Publication number Publication date Type
EP1175030A2 (en) 2002-01-23 application
DE60132853D1 (en) 2008-04-03 grant
EP1175030A3 (en) 2002-10-23 application

Similar Documents

Publication Publication Date Title
US6766293B1 (en) Method for signalling a noise substitution during audio signal coding
US7328160B2 (en) Encoding device and decoding device
US7573912B2 (en) Near-transparent or transparent multi-channel encoder/decoder scheme
US5633981A (en) Method and apparatus for adjusting dynamic range and gain in an encoder/decoder for multidimensional sound fields
US6011824A (en) Signal-reproduction method and apparatus
US5717764A (en) Global masking thresholding for use in perceptual coding
US7885819B2 (en) Bitstream syntax for multi-process audio decoding
US6950794B1 (en) Feedforward prediction of scalefactors based on allowable distortion for noise shaping in psychoacoustic-based compression
US20070172071A1 (en) Complex transforms for multi-channel audio
US20050267763A1 (en) Multichannel audio extension
US20060085200A1 (en) Diffuse sound shaping for BCC schemes and the like
US20040174911A1 (en) Method and apparatus for encoding and/or decoding digital data using bandwidth extension technology
US20080319739A1 (en) Low complexity decoder for complex transform coding of multi-channel sound
US20070174062A1 (en) Complex-transform channel coding with extended-band frequency coding
US6675148B2 (en) Lossless audio coder
US5632005A (en) Encoder/decoder for multidimensional sound fields
US6356870B1 (en) Method and apparatus for decoding multi-channel audio data
US8150042B2 (en) Method, device, encoder apparatus, decoder apparatus and audio system
US7602922B2 (en) Multi-channel encoder
US6931291B1 (en) Method and apparatus for frequency-domain downmixing with block-switch forcing for audio decoding functions
US5890125A (en) Method and apparatus for encoding and decoding multiple audio channels at low bit rates using adaptive selection of encoding method
US20060031075A1 (en) Method and apparatus to recover a high frequency component of audio data
US20050078832A1 (en) Parametric audio coding
US20030236583A1 (en) Hybrid multi-channel/cue coding/decoding of audio signals
US6141645A (en) Method and device for down mixing compressed audio bit stream having multiple audio channels

Legal Events

Date Code Title Description
AX Request for extension of the european patent to

Free format text: AL;LT;LV;MK;RO;SI

AK Designated contracting states:

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

RAP1 Transfer of rights of an ep published application

Owner name: NOKIA CORPORATION

AK Designated contracting states:

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent to

Free format text: AL;LT;LV;MK;RO;SI

17P Request for examination filed

Effective date: 20030327

AKX Payment of designation fees

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

17Q First examination report

Effective date: 20050324

AK Designated contracting states:

Kind code of ref document: B1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

RAP1 Transfer of rights of an ep published application

Owner name: NOKIA SIEMENS NETWORKS OY

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 60132853

Country of ref document: DE

Date of ref document: 20080403

Kind code of ref document: P

REG Reference to a national code

Ref country code: SE

Ref legal event code: TRGR

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080531

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080721

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

EN Fr: translation not filed

26N No opposition filed

Effective date: 20081121

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: DE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080521

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080630

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: FR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20081212

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080616

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080630

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080630

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080614

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080220

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20080521

PGFP Postgrant: annual fees paid to national office

Ref country code: SE

Payment date: 20100614

Year of fee payment: 10

Ref country code: GB

Payment date: 20100618

Year of fee payment: 10

REG Reference to a national code

Ref country code: SE

Ref legal event code: EUG

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20110614

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110614

PG25 Lapsed in a contracting state announced via postgrant inform. from nat. office to epo

Ref country code: SE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110615