WO1999043110A1 - A fast frequency transformation techique for transform audio coders - Google Patents
- Publication number
- WO1999043110A1 (application PCT/SG1998/000014)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- transform coefficient
- samples
- transform
- complex
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H20/00—Arrangements for broadcast or for distribution combined with broadcast
- H04H20/86—Arrangements characterised by the broadcast information itself
- H04H20/88—Stereophonic broadcast systems
- H04H20/89—Stereophonic broadcast systems using three or more audio channels, e.g. triphonic or quadraphonic
Definitions
- This invention is applicable in the field of multi-channel audio coders which use modified discrete cosine transform as a step in the compression of audio signals.
- the amount of information required to represent the audio signals may be reduced.
- the amount of digital information needed to accurately reproduce the original pulse code modulation (PCM) samples may be reduced by applying a digital compression algorithm, resulting in a digitally compressed representation of the original signal.
- the goal of the digital compression algorithm is to produce a digital representation of an audio signal which, when decoded and reproduced, sounds the same as the original signal, while using a minimum of digital information for the compressed or encoded representation.
- the time domain audio signal is first converted to the frequency domain using a bank of filters.
- the frequency domain coefficients are converted to fixed point representation.
- each coefficient is represented as a mantissa and an exponent.
- the bulk of the compressed bitstream transmitted to the decoder comprises these exponents and mantissas.
- each mantissa must be truncated to a fixed or variable number of decimal places.
- the number of bits to be used for coding each mantissa is obtained from a bit allocation algorithm which may be based on the masking property of the human auditory system. Lower numbers of bits result in higher compression ratios because less space is required to transmit the coefficients. However, this may cause high quantization errors, leading to audible distortion.
- a good distribution of available bits to each mantissa forms the core of the advanced audio coders.
- the frequency transformation phase has one of the greatest computation requirements in a transform coder. Therefore, an efficient implementation of this phase can decrease the computation requirement of the system significantly and make real time operation of the encoder more easily attainable.
- the frequency domain transformation of signals is performed by the modified discrete cosine transform (MDCT).
- MDCT: modified discrete cosine transform
- the MDCT requires $O(N^2)$ additions and multiplications.
- FFT: fast Fourier transform
- a method for coding audio data comprising a sequence of digital audio samples, including the steps of: i) multiplying the input samples with a first trigonometric function factor to generate an intermediate sample sequence; ii) computing a fast Fourier transform of the intermediate sample sequence to generate a Fourier transform coefficient sequence; iii) for each transform coefficient in the sequence, multiplying the real and imaginary components of the transform coefficient by respective second trigonometric function factors, adding the multiplied real and imaginary transform coefficient components to generate an addition stream coefficient, and subtracting the multiplied real and imaginary transform coefficient components to generate a subtraction stream coefficient; iv) multiplying the addition and subtraction stream coefficients with respective third trigonometric function factors; and v) subtracting the corresponding multiplied addition and subtraction stream coefficients to generate audio coded frequency domain coefficients.
- the present invention also provides a method for coding audio data, including the steps of: combining first and second sequences of digital audio samples from first and second audio channels into a single complex sample sequence; determining a Fourier transform coefficient sequence as defined above; generating first and second transform coefficient sequences by combining and/or differencing first and second selected transform coefficients from said Fourier transform coefficient sequence; and for each of the first and second transform coefficient sequences, generating audio coded frequency domain coefficients as defined above, so as to generate respective sequences of said audio coded frequency domain coefficients for the first and second audio channels.
- the present invention also provides a method for coding audio data including the steps of: obtaining at least one input sequence of digital audio samples; pre-processing the input sequence samples including applying a pre-multiplication factor to obtain modified input sequence samples; transforming the modified input sequence samples into a transform coefficient sequence utilising a fast Fourier transform; and post-processing the sequence of transform coefficients including applying first post-multiplication factors to the real and imaginary coefficient components, differencing and combining the post-multiplied real and imaginary components, applying second post-multiplication factors to the difference and combination results, and differencing to obtain a sequence of modified discrete cosine transform coefficients representing said input sequence of digital audio samples.
- the present invention also provides a method for coding audio data including the steps of: obtaining first and second input sequences of digital audio samples corresponding to respective first and second audio channels; combining the first and second input sequences of digital audio samples into a single complex input sample sequence; pre-processing the complex input sequence samples including applying a pre-multiplication factor to obtain modified complex input sequence samples; transforming the modified complex input sequence samples into a complex transform coefficient sequence utilising a fast Fourier transform; and post-processing the sequence of complex transform coefficients to obtain first and second sequences of audio coded frequency domain coefficients corresponding to the first and second audio channels including, for each corresponding frequency domain coefficient in the first and second sequences, selecting first and second complex transform coefficients from said sequence of complex transform coefficients, combining the first complex transform coefficient and the complex conjugate of the second complex transform coefficient for said first channel and differencing the first complex transform coefficient and the complex conjugate of the second complex transform coefficient for said second channel, and applying respective post-multiplication factors to the combination and difference to obtain said audio coded frequency domain coefficient.
- $G_k$ is a transform coefficient sequence for the first channel
- $G'_k$ is a transform coefficient sequence for the second channel; $g_{k,r}$ and $g_{k,i}$ are the real and imaginary transform coefficient components of $G_k$; $g'_{k,r}$ and $g'_{k,i}$ are the real and imaginary transform coefficient components of $G'_k$;
- the modified discrete cosine transform equation can be expressed as
- $x[n]$ is the input sequence for a channel and $N$ is the transform length.
- $X_k = \cos\gamma \left( g_{k,r} \cos\frac{\pi(k+1/2)}{N} - g_{k,i} \sin\frac{\pi(k+1/2)}{N} \right) - \sin\gamma \left( g_{k,r} \sin\frac{\pi(k+1/2)}{N} + g_{k,i} \cos\frac{\pi(k+1/2)}{N} \right)$, with $g_{k,r}, g_{k,i} \in \mathbb{R}$ (the set of real numbers)
- where $G_k = g_{k,r} + j g_{k,i} = \sum_{n=0}^{N-1} \left( x[n]\, e^{j\pi n/N} \right) e^{j 2\pi n k/N}$.
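The step from a cosine sum to the expression for $X_k$ in terms of $G_k$ can be made explicit. The following is a sketch only: it assumes the MDCT has an AC-3-style long-block form with phase term $\gamma = \pi(2k+1)/4$ and no scale factor. Both the form of the sum and the value of $\gamma$ are assumptions, since the defining equation is not reproduced in this text.

```latex
\begin{align*}
% Assumed MDCT definition (unscaled), k = 0, ..., N/2 - 1
X_k &= \sum_{n=0}^{N-1} x[n] \cos\!\left( \frac{2\pi(2n+1)(2k+1)}{4N} + \gamma \right),
  \qquad \gamma = \frac{\pi(2k+1)}{4} \\
% Split the first phase term into an FFT kernel, a pre-twiddle and a post-twiddle
\frac{2\pi(2n+1)(2k+1)}{4N} &= \frac{2\pi n k}{N} + \frac{\pi n}{N} + \frac{\pi(k+1/2)}{N} \\
% The n-dependent part is exactly G_k; the rest factors out of the sum
\sum_{n=0}^{N-1} x[n]\, e^{j\left( \frac{2\pi n k}{N} + \frac{\pi n}{N} + \frac{\pi(k+1/2)}{N} \right)}
  &= G_k\, e^{j\pi(k+1/2)/N} \\
% Expand cos(A + gamma) = cos(gamma) cos(A) - sin(gamma) sin(A)
X_k &= \cos\gamma\, \mathrm{Re}\!\left( G_k\, e^{j\pi(k+1/2)/N} \right)
     - \sin\gamma\, \mathrm{Im}\!\left( G_k\, e^{j\pi(k+1/2)/N} \right) \\
\mathrm{Re}\!\left( G_k\, e^{j\pi(k+1/2)/N} \right) &= g_{k,r} \cos\frac{\pi(k+1/2)}{N} - g_{k,i} \sin\frac{\pi(k+1/2)}{N} \\
\mathrm{Im}\!\left( G_k\, e^{j\pi(k+1/2)/N} \right) &= g_{k,r} \sin\frac{\pi(k+1/2)}{N} + g_{k,i} \cos\frac{\pi(k+1/2)}{N}
\end{align*}
```

Under these assumptions the last three lines reproduce the stated expression for $X_k$ term by term.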
- Figure 1 is a diagrammatic representation of a stream of audio data and the substructure arrangement thereof;
- Figure 2 is a functional block diagram of a digital audio encoder
- Figure 3 is a functional block diagram of a system for encoding a single audio channel
- Figure 4 is a functional block diagram of a system for encoding a pair of audio channels.
- the input to an audio coder comprises a stream of digitised samples of the time domain analog signal.
- the stream consists of interleaved samples for each channel.
- the input stream is sectioned into blocks, each block containing N consecutive samples of each channel (see Fig. 1).
- N samples of a channel form a sequence $\{x[0], x[1], x[2], \ldots, x[N-1]\}$.
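The sectioning of an interleaved stream into per-channel blocks can be sketched as follows. This is a minimal illustration, assuming non-overlapping blocks and a flat interleaved input; the function name `split_into_blocks` and its parameters are hypothetical and do not come from the patent.

```python
import numpy as np

def split_into_blocks(interleaved, num_channels, N):
    """Return an array of shape (num_channels, num_blocks, N).

    Assumes samples arrive interleaved as ch0, ch1, ..., ch0, ch1, ...
    and that blocks do not overlap (an assumption made for simplicity).
    """
    samples = np.asarray(interleaved)
    frames = len(samples) // num_channels
    # Row f holds one sample per channel; transpose to get one row per channel.
    per_channel = samples[: frames * num_channels].reshape(frames, num_channels).T
    num_blocks = frames // N
    # Drop any incomplete trailing block, then cut each channel into blocks of N.
    return per_channel[:, : num_blocks * N].reshape(num_channels, num_blocks, N)

blocks = split_into_blocks([0, 100, 1, 101, 2, 102, 3, 103], num_channels=2, N=2)
```

Here `blocks[c, b]` is the sequence x[0], ..., x[N-1] for channel `c` in block `b`.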
- the time domain samples are next converted to the frequency domain using an analysis filter bank (see Fig. 2).
- the frequency domain coefficients, thus generated, form a coefficient set which can be identified as $(X_0, X_1, X_2, \ldots, X_{N/2-1})$. Since the signal is real, only the first $N/2$ frequency components are considered.
- $X_0$ is the lowest frequency (DC) component while $X_{N/2-1}$ is the highest frequency component of the signal.
- Audio compression essentially entails finding how much of the information in the set $(X_0, X_1, X_2, \ldots, X_{N/2-1})$ is necessary to reproduce the original analog signal at the decoder with minimal audible distortion.
- the coefficient set is normally converted into floating point format, where each coefficient is represented by an exponent and mantissa.
- the exponent set is usually transmitted in its original form.
- the mantissa is truncated to a fixed or variable number of decimal places.
- the number of bits for coding a mantissa is usually obtained from a bit allocation algorithm, which for advanced psychoacoustic coders may be based on the masking property of the human auditory system.
- a low number of bits results in a high compression ratio because less space is required to transmit the coefficients. However, this causes very high quantization error, leading to audible distortion.
- a good distribution of available bits to each mantissa forms the core of the most advanced encoders.
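The trade-off between mantissa bits and quantization error can be illustrated with a short sketch. It assumes a plain binary mantissa/exponent split obtained from `math.frexp` and simple truncation; an actual coder's exponent convention and rounding rule may differ, and the helper names are hypothetical.

```python
import math

def quantize_mantissa(value, bits):
    """Split value into mantissa * 2**exponent and truncate the mantissa.

    Assumption: mantissa is kept to `bits` fractional binary digits.
    """
    mantissa, exponent = math.frexp(value)  # 0.5 <= |mantissa| < 1 for nonzero value
    scale = 1 << bits
    truncated = math.trunc(mantissa * scale) / scale
    return truncated, exponent

def reconstruct(mantissa, exponent):
    """Decoder side: rebuild the coefficient from mantissa and exponent."""
    return math.ldexp(mantissa, exponent)

coefficient = 0.3712  # arbitrary example value
for bits in (3, 6, 12):
    m, e = quantize_mantissa(coefficient, bits)
    error = abs(coefficient - reconstruct(m, e))
    print(f"bits={bits:2d} mantissa={m} exponent={e} error={error:.6f}")
```

With truncation, the error for a given coefficient is bounded by `2**(exponent - bits)`, so each extra mantissa bit halves the worst-case error at the cost of one more transmitted bit.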
- the frequency domain transformation of signals is performed by the modified discrete cosine transform (MDCT) (Eq. 1).
- the MDCT requires $O(N^2)$ additions and multiplications.
- $G_k = g_{k,r} + j g_{k,i}$ is computed in $O(N \log_2 N)$ operations by use of FFT algorithms.
- the additional operation outlined in Eq. 16 to extract the final $X_k$ is only of order $O(N)$. Therefore the MDCT can now be computed in $O(N \log_2 N)$ time.
- the operations required to obtain the MDCT are illustrated in Fig. 3.
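The pre-multiply, FFT, and post-multiply chain can be checked numerically against a direct $O(N^2)$ evaluation. The sketch below is not the patent's implementation: it assumes an unscaled AC-3-style long-block MDCT with phase term `gamma = pi*(2k+1)/4`, and it uses the positive-exponent transform that matches the stated definition of $G_k$, obtained here as `N * np.fft.ifft(...)`. The function names are hypothetical.

```python
import numpy as np

def mdct_direct(x):
    """O(N^2) reference under the assumed MDCT definition (no scale factor)."""
    N = len(x)
    n = np.arange(N)
    k = np.arange(N // 2)[:, None]
    angle = 2 * np.pi * (2 * n + 1) * (2 * k + 1) / (4 * N) + np.pi * (2 * k + 1) / 4
    return (x * np.cos(angle)).sum(axis=1)

def mdct_via_fft(x):
    """O(N log N) route: pre-multiply, one FFT-sized transform, O(N) post-processing."""
    N = len(x)
    n = np.arange(N)
    k = np.arange(N // 2)
    # Pre-multiplication by exp(j*pi*n/N).
    z = x * np.exp(1j * np.pi * n / N)
    # G_k = sum_n z[n] exp(+j*2*pi*n*k/N); numpy's ifft has this sign with a 1/N factor.
    G = N * np.fft.ifft(z)[: N // 2]
    g_r, g_i = G.real, G.imag
    phi = np.pi * (k + 0.5) / N
    gamma = np.pi * (2 * k + 1) / 4  # assumed phase term
    # Second trigonometric factors, then the subtraction and addition streams.
    sub_stream = g_r * np.cos(phi) - g_i * np.sin(phi)
    add_stream = g_r * np.sin(phi) + g_i * np.cos(phi)
    # Third trigonometric factors and the final subtraction.
    return np.cos(gamma) * sub_stream - np.sin(gamma) * add_stream

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
print(np.allclose(mdct_direct(x), mdct_via_fft(x)))
```

Only the transform in the middle costs more than $O(N)$, which is where the overall $O(N \log_2 N)$ bound comes from.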
- the multi-channel encoder is required to process m audio channels. Instead of computing an FFT for each channel as described in the previous section, it is possible to further reduce the computational requirement of the coder by combining two channels and using a single FFT only. In effect, instead of m FFTs only m/2 FFTs need to be computed.
- the DFT for any two channels can be computed with only one FFT block by considering the input as a complex number.
- the real part is formed from the sequence for any one channel and the imaginary part is from data of another channel. After the Fourier Transform is computed for the resulting complex variable, the resulting transform for each channel can be easily retrieved.
- the input data to the FFT block is actually a complex number (formed by multiplying the real data by the complex variable $e^{j\pi n/N}$).
- using some processing after the FFT one can still compute the DFT of two channels using a single FFT block.
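The two-channel retrieval can be sketched numerically. Because each channel is real, the pre-multiplied transform satisfies $G_{N-1-k} = G_k^{*}$, so pairing coefficient $k$ with the conjugate of coefficient $N-1-k$ separates the channels. The index pairing is derived here from the positive-exponent definition of $G_k$ and is an assumption about how the selection is done; the function names are hypothetical.

```python
import numpy as np

def single_channel_G(x):
    """G_k for one real channel: pre-multiply by exp(j*pi*n/N), then transform."""
    N = len(x)
    n = np.arange(N)
    return N * np.fft.ifft(x * np.exp(1j * np.pi * n / N))

def two_channel_G(x1, x2):
    """Recover G_k for two real channels from a single complex transform."""
    N = len(x1)
    n = np.arange(N)
    # Channel 1 in the real part, channel 2 in the imaginary part, then pre-multiply.
    z = (x1 + 1j * x2) * np.exp(1j * np.pi * n / N)
    Z = N * np.fft.ifft(z)          # Z_k = G_k + j * G'_k
    Zc = np.conj(Z[::-1])           # conj(Z_{N-1-k}) = G_k - j * G'_k
    G_first = 0.5 * (Z + Zc)        # combination gives the first channel
    G_second = -0.5j * (Z - Zc)     # difference divided by 2j gives the second channel
    return G_first, G_second

rng = np.random.default_rng(0)
a, b = rng.standard_normal(32), rng.standard_normal(32)
Ga, Gb = two_channel_G(a, b)
print(np.allclose(Ga, single_channel_G(a)), np.allclose(Gb, single_channel_G(b)))
```

The separation costs $O(N)$ additions per channel pair, so halving the number of transforms is not offset by the extra post-processing.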
- the frequency transform length N is decided by the encoder based on temporal and spectral resolution requirements.
- the input signal is usually analysed with a high frequency bandpass filter to detect the presence of transients. This information is used to adjust the block length, restricting quantization noise associated with the transient within a small temporal region about the transient, avoiding temporal masking.
- two short transforms of length N/2 each are taken.
- a single long transform of length N is used, thus providing higher spectral resolution.
- a short transform is required for restricting quantization noise associated with the transient within a small temporal region about the transient, avoiding temporal masking.
- a long transform gives slightly better frequency resolution, but the error is not much compared to the case when a long transform is utilised in the presence of a transient. Forcing a long transform onto a channel in the presence of a transient leads to greater distortion in the final produced music. This conjecture was proven true by experimental studies on benchmark music streams.
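The block-length decision can be sketched as below. The detector is a deliberately crude stand-in, not the filter described above: it uses a first difference as a high-pass stage and compares the energy of the two half-blocks. The threshold value and both function names are hypothetical.

```python
import numpy as np

def has_transient(block, ratio_threshold=8.0):
    """Hypothetical detector: flag a block whose high-passed halves differ strongly in energy."""
    # First difference acts as a simple high-pass filter (an illustrative choice).
    high = np.diff(block, prepend=block[0])
    half = len(high) // 2
    e_first = np.sum(high[:half] ** 2) + 1e-12   # small floor avoids division by zero
    e_second = np.sum(high[half:] ** 2) + 1e-12
    return max(e_first, e_second) / min(e_first, e_second) > ratio_threshold

def choose_transform_lengths(block):
    """Two short transforms of length N/2 on a transient, else one long transform of length N."""
    N = len(block)
    return [N // 2, N // 2] if has_transient(block) else [N]

n = np.arange(512)
steady = np.sin(2 * np.pi * 8 * n / 512)                 # stationary tone
burst = np.zeros(512)
burst[400:] = np.where(n[400:] % 2 == 0, 1.0, -1.0)      # sudden onset late in the block
print(choose_transform_lengths(steady), choose_transform_lengths(burst))
```

When two channels share one FFT they must use the same transform length, so a decision like this would have to be made per channel pair rather than per channel.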
- before the time domain signal x[n] is transformed to the frequency domain, a windowing function is usually applied.
- although the invention has been described herein primarily in terms of its mathematical derivation and application, and the procedures required for implementation, it will be readily recognised by those skilled in the art that the procedures described can be implemented by means of any desired computational apparatus.
- the invention may be embodied in computer software operating on general purpose computing equipment, or may be embodied in purpose built circuitry or contained in microcode or the like in an integrated circuit or set of integrated circuits.
- $G_k = \sum_{n=0}^{N-1} \left( x[n]\, e^{j\pi n/N} \right) e^{j 2\pi n k/N}$
- T 2 M2j( ⁇ k * myN G k -e - ** ⁇ G ⁇
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SG1998/000014 WO1999043110A1 (en) | 1998-02-21 | 1998-02-21 | A fast frequency transformation techique for transform audio coders |
EP98909964A EP1057292B1 (en) | 1998-02-21 | 1998-02-21 | A fast frequency transformation techique for transform audio coders |
DE69823557T DE69823557T2 (en) | 1998-02-21 | 1998-02-21 | QUICK FREQUENCY TRANSFORMATION TECHNOLOGY FOR TRANSFORM AUDIO CODES |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SG1998/000014 WO1999043110A1 (en) | 1998-02-21 | 1998-02-21 | A fast frequency transformation techique for transform audio coders |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1999043110A1 true WO1999043110A1 (en) | 1999-08-26 |
Family
ID=20429840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SG1998/000014 WO1999043110A1 (en) | 1998-02-21 | 1998-02-21 | A fast frequency transformation techique for transform audio coders |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1057292B1 (en) |
DE (1) | DE69823557T2 (en) |
WO (1) | WO1999043110A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19959156A1 (en) * | 1999-12-08 | 2001-06-28 | Fraunhofer Ges Forschung | Method and device for processing a stereo audio signal |
EP1403854A2 (en) | 2002-09-04 | 2004-03-31 | Microsoft Corporation | Multi-channel audio encoding and decoding |
GB2423451A (en) * | 2005-02-16 | 2006-08-23 | Ishce Ltd | Inserting a watermark code into a digitally compressed audio or audio-visual signal or file |
US7143030B2 (en) | 2001-12-14 | 2006-11-28 | Microsoft Corporation | Parametric compression/decompression modes for quantization matrices for digital audio |
US7299190B2 (en) | 2002-09-04 | 2007-11-20 | Microsoft Corporation | Quantization and inverse quantization for audio |
US7539612B2 (en) | 2005-07-15 | 2009-05-26 | Microsoft Corporation | Coding and decoding scale factor information |
US7680671B2 (en) | 1998-10-26 | 2010-03-16 | Stmicroelectronics Asia Pacific Pte. Ltd. | Multi-precision technique for digital audio encoder |
US7831434B2 (en) | 2006-01-20 | 2010-11-09 | Microsoft Corporation | Complex-transform channel coding with extended-band frequency coding |
US7953604B2 (en) | 2006-01-20 | 2011-05-31 | Microsoft Corporation | Shape and scale parameters for extended-band frequency coding |
US9026452B2 (en) | 2007-06-29 | 2015-05-05 | Microsoft Technology Licensing, Llc | Bitstream syntax for multi-process audio decoding |
US9443525B2 (en) | 2001-12-14 | 2016-09-13 | Microsoft Technology Licensing, Llc | Quality improvement techniques in an audio encoder |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0506111A2 (en) * | 1991-03-27 | 1992-09-30 | Mitsubishi Denki Kabushiki Kaisha | DCT/IDCT processor and data processing method |
US5181183A (en) * | 1990-01-17 | 1993-01-19 | Nec Corporation | Discrete cosine transform circuit suitable for integrated circuit implementation |
EP0564089A1 (en) * | 1992-03-02 | 1993-10-06 | AT&T Corp. | A method and appartus for the perceptual coding of audio signals |
EP0590790A2 (en) * | 1992-09-28 | 1994-04-06 | Sony Corporation | Modified DCT signal transforming system |
EP0718746A1 (en) * | 1994-12-21 | 1996-06-26 | Laboratoires D'electronique Philips S.A.S. | Booth multiplier for trigonometric functions |
-
1998
- 1998-02-21 WO PCT/SG1998/000014 patent/WO1999043110A1/en active IP Right Grant
- 1998-02-21 EP EP98909964A patent/EP1057292B1/en not_active Expired - Lifetime
- 1998-02-21 DE DE69823557T patent/DE69823557T2/en not_active Expired - Fee Related
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5181183A (en) * | 1990-01-17 | 1993-01-19 | Nec Corporation | Discrete cosine transform circuit suitable for integrated circuit implementation |
EP0506111A2 (en) * | 1991-03-27 | 1992-09-30 | Mitsubishi Denki Kabushiki Kaisha | DCT/IDCT processor and data processing method |
EP0564089A1 (en) * | 1992-03-02 | 1993-10-06 | AT&T Corp. | A method and appartus for the perceptual coding of audio signals |
US5592584A (en) * | 1992-03-02 | 1997-01-07 | Lucent Technologies Inc. | Method and apparatus for two-component signal compression |
EP0590790A2 (en) * | 1992-09-28 | 1994-04-06 | Sony Corporation | Modified DCT signal transforming system |
EP0718746A1 (en) * | 1994-12-21 | 1996-06-26 | Laboratoires D'electronique Philips S.A.S. | Booth multiplier for trigonometric functions |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7680671B2 (en) | 1998-10-26 | 2010-03-16 | Stmicroelectronics Asia Pacific Pte. Ltd. | Multi-precision technique for digital audio encoder |
US7260225B2 (en) | 1999-12-08 | 2007-08-21 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Method and device for processing a stereo audio signal |
DE19959156C2 (en) * | 1999-12-08 | 2002-01-31 | Fraunhofer Ges Forschung | Method and device for processing a stereo audio signal to be encoded |
DE19959156A1 (en) * | 1999-12-08 | 2001-06-28 | Fraunhofer Ges Forschung | Method and device for processing a stereo audio signal |
US7930171B2 (en) | 2001-12-14 | 2011-04-19 | Microsoft Corporation | Multi-channel audio encoding/decoding with parametric compression/decompression and weight factors |
US7143030B2 (en) | 2001-12-14 | 2006-11-28 | Microsoft Corporation | Parametric compression/decompression modes for quantization matrices for digital audio |
US7155383B2 (en) | 2001-12-14 | 2006-12-26 | Microsoft Corporation | Quantization matrices for jointly coded channels of audio |
US7249016B2 (en) | 2001-12-14 | 2007-07-24 | Microsoft Corporation | Quantization matrices using normalized-block pattern of digital audio |
US9443525B2 (en) | 2001-12-14 | 2016-09-13 | Microsoft Technology Licensing, Llc | Quality improvement techniques in an audio encoder |
US9305558B2 (en) | 2001-12-14 | 2016-04-05 | Microsoft Technology Licensing, Llc | Multi-channel audio encoding/decoding with parametric compression/decompression and weight factors |
US7801735B2 (en) | 2002-09-04 | 2010-09-21 | Microsoft Corporation | Compressing and decompressing weight factors using temporal prediction for audio data |
EP1403854A3 (en) * | 2002-09-04 | 2006-05-10 | Microsoft Corporation | Multi-channel audio encoding and decoding |
US7502743B2 (en) | 2002-09-04 | 2009-03-10 | Microsoft Corporation | Multi-channel audio encoding and decoding with multi-channel transform selection |
EP1403854A2 (en) | 2002-09-04 | 2004-03-31 | Microsoft Corporation | Multi-channel audio encoding and decoding |
US7299190B2 (en) | 2002-09-04 | 2007-11-20 | Microsoft Corporation | Quantization and inverse quantization for audio |
GB2423451A (en) * | 2005-02-16 | 2006-08-23 | Ishce Ltd | Inserting a watermark code into a digitally compressed audio or audio-visual signal or file |
US7539612B2 (en) | 2005-07-15 | 2009-05-26 | Microsoft Corporation | Coding and decoding scale factor information |
US7831434B2 (en) | 2006-01-20 | 2010-11-09 | Microsoft Corporation | Complex-transform channel coding with extended-band frequency coding |
US7953604B2 (en) | 2006-01-20 | 2011-05-31 | Microsoft Corporation | Shape and scale parameters for extended-band frequency coding |
US9105271B2 (en) | 2006-01-20 | 2015-08-11 | Microsoft Technology Licensing, Llc | Complex-transform channel coding with extended-band frequency coding |
US9026452B2 (en) | 2007-06-29 | 2015-05-05 | Microsoft Technology Licensing, Llc | Bitstream syntax for multi-process audio decoding |
US9349376B2 (en) | 2007-06-29 | 2016-05-24 | Microsoft Technology Licensing, Llc | Bitstream syntax for multi-process audio decoding |
US9741354B2 (en) | 2007-06-29 | 2017-08-22 | Microsoft Technology Licensing, Llc | Bitstream syntax for multi-process audio decoding |
Also Published As
Publication number | Publication date |
---|---|
EP1057292A1 (en) | 2000-12-06 |
DE69823557D1 (en) | 2004-06-03 |
EP1057292B1 (en) | 2004-04-28 |
DE69823557T2 (en) | 2005-02-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11557304B2 (en) | Methods and apparatus for performing variable block length watermarking of media | |
EP0990368B1 (en) | Method and apparatus for frequency-domain downmixing with block-switch forcing for audio decoding functions | |
KR100253136B1 (en) | Low computational complexity digital filter bank | |
CN102150207B (en) | Compression of audio scale-factors by two-dimensional transformation | |
JP4242450B2 (en) | An efficient implementation of a single sideband filter bank for phasor measurements. | |
JP5414684B2 (en) | Method and apparatus for performing audio watermarking, watermark detection, and watermark extraction | |
JP5101579B2 (en) | Spatial audio parameter display | |
CN101887726A (en) | The method of stereo coding and decoding and equipment thereof | |
EP1057292A1 (en) | A fast frequency transformation techique for transform audio coders | |
US20070036228A1 (en) | Method and apparatus for audio encoding and decoding | |
KR20120095920A (en) | Optimized low-throughput parametric coding/decoding | |
EP0775389B1 (en) | Encoding system and encoding method for encoding a digital signal having at least a first and a second digital signal component | |
Brandenburg | Introduction to perceptual coding | |
CN1862969B (en) | Adaptive block length, constant converting audio frequency decoding method | |
US7203717B1 (en) | Fast modified discrete cosine transform method | |
EP0707761B1 (en) | Arrangement for determining a signal spectrum of a wideband digital signal and for deriving bit allocation information in response thereto | |
EP1076295A1 (en) | Method and encoder for bit-rate saving encoding of audio signals | |
KR100424036B1 (en) | Disassembly / synthesis filtering system with efficient nodal stack single-sided band filter bank using time-domain aliasing | |
Lanciani | Compressed-domain processing of MPEG audio signals | |
JP3292228B2 (en) | Signal encoding device and signal decoding device | |
Fielder et al. | Audio Coding Tools for Digital Television Distribution | |
Wylie | apt-X100: Low-Delay, Low-Bit-Rate Subband ADPCM Digital: Audio Coding | |
Sandler | Linear time-invariant systems in the wavelet domain | |
Orglmeister | Data reduction in high-quality audio signals | |
JPH06169289A (en) | Compressed data reproduction device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A1; Designated state(s): JP SG US |
| AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| WWE | Wipo information: entry into national phase | Ref document number: 1998909964; Country of ref document: EP |
| WWE | Wipo information: entry into national phase | Ref document number: 09622736; Country of ref document: US |
| WWP | Wipo information: published in national office | Ref document number: 1998909964; Country of ref document: EP |
| WWG | Wipo information: grant in national office | Ref document number: 1998909964; Country of ref document: EP |