EP1057292A1 - A fast frequency transformation techique for transform audio coders - Google Patents
- Publication number
- EP1057292A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- sequence
- transform coefficient
- samples
- transform
- complex
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H20/00—Arrangements for broadcast or for distribution combined with broadcast
- H04H20/86—Arrangements characterised by the broadcast information itself
- H04H20/88—Stereophonic broadcast systems
- H04H20/89—Stereophonic broadcast systems using three or more audio channels, e.g. triphonic or quadraphonic
Definitions
- This invention is applicable in the field of multi-channel audio coders which use modified discrete cosine transform as a step in the compression of audio signals.
- the amount of information required to represent the audio signals may be reduced.
- the amount of digital information needed to accurately reproduce the original pulse code modulation (PCM) samples may be reduced by applying a digital compression algorithm, resulting in a digitally compressed representation of the original signal.
- the goal of the digital compression algorithm is to produce a digital representation of an audio signal which, when decoded and reproduced, sounds the same as the original signal, while using a minimum of digital information for the compressed or encoded representation.
- the time domain audio signal is first converted to the frequency domain using a bank of filters.
- the frequency domain coefficients are converted to fixed point representation.
- each coefficient is represented as a mantissa and an exponent.
- the bulk of the compressed bitstream transmitted to the decoder comprises these exponents and mantissas.
- each mantissa must be truncated to a fixed or variable number of decimal places.
- the number of bits to be used for coding each mantissa is obtained from a bit allocation algorithm which may be based on the masking property of the human auditory system. Lower numbers of bits result in higher compression ratios because less space is required to transmit the coefficients. However, this may cause high quantization errors, leading to audible distortion.
- a good distribution of available bits to each mantissa forms the core of the advanced audio coders.
- the frequency transformation phase has one of the greatest computation requirements in a transform coder. Therefore, an efficient implementation of this phase can decrease the computation requirement of the system significantly and make real time operation of the encoder more easily attainable.
- the frequency domain transformation of signals is performed by the modified discrete cosine transform (MDCT).
- MDCT: modified discrete cosine transform
- the MDCT requires $O(N^2)$ additions and multiplications.
- FFT: fast Fourier transform
- a method for coding audio data comprising a sequence of digital audio samples, including the steps of: i) multiplying the input samples with a first trigonometric function factor to generate an intermediate sample sequence; ii) computing a fast Fourier transform of the intermediate sample sequence to generate a Fourier transform coefficient sequence; iii) for each transform coefficient in the sequence, multiplying the real and imaginary components of the transform coefficient by respective second trigonometric function factors, adding the multiplied real and imaginary transform coefficient components to generate an addition stream coefficient, and subtracting the multiplied real and imaginary transform coefficient components to generate a subtraction stream coefficient; iv) multiplying the addition and subtraction stream coefficients with respective third trigonometric function factors; and v) subtracting the corresponding multiplied addition and subtraction stream coefficients to generate audio coded frequency domain coefficients.
- the present invention also provides a method for coding audio data, including the steps of: combining first and second sequences of digital audio samples from first and second audio channels into a single complex sample sequence; determining a Fourier transform coefficient sequence as defined above; generating first and second transform coefficient sequences by combining and/or differencing first and second selected transform coefficients from said Fourier transform coefficient sequence; and for each of the first and second transform coefficient sequences, generating audio coded frequency domain coefficients as defined above, so as to generate respective sequences of said audio coded frequency domain coefficients for the first and second audio channels.
- the present invention also provides a method for coding audio data including the steps of: obtaining at least one input sequence of digital audio samples; pre-processing the input sequence samples including applying a pre-multiplication factor to obtain modified input sequence samples; transforming the modified input sequence samples into a transform coefficient sequence utilising a fast Fourier transform; and post-processing the sequence of transform coefficients including applying first post-multiplication factors to the real and imaginary coefficient components, differencing and combining the post-multiplied real and imaginary components, applying second post-multiplication factors to the difference and combination results, and differencing to obtain a sequence of modified discrete cosine transform coefficients representing said input sequence of digital audio samples.
- the present invention also provides a method for coding audio data including the steps of: obtaining first and second input sequences of digital audio samples corresponding to respective first and second audio channels; combining the first and second input sequences of digital audio samples into a single complex input sample sequence; pre-processing the complex input sequence samples including applying a pre-multiplication factor to obtain modified complex input sequence samples; transforming the modified complex input sequence samples into a complex transform coefficient sequence utilising a fast Fourier transform; and post-processing the sequence of complex transform coefficients to obtain first and second sequences of audio coded frequency domain coefficients corresponding to the first and second audio channels including, for each corresponding frequency domain coefficient in the first and second sequences, selecting first and second complex transform coefficients from said sequence of complex transform coefficients, combining the first complex transform coefficient and the complex conjugate of the second complex transform coefficient for said first channel and differencing the first complex transform coefficient and the complex conjugate of the second complex transform coefficient for said second channel, and applying respective post-multiplication factors to the combination and difference to obtain said audio coded frequency domain coefficient.
- $G_k$ is a transform coefficient sequence for the first channel
- $G'_k$ is a transform coefficient sequence for the second channel; $g_{k,r}$ and $g_{k,i}$ are the real and imaginary transform coefficient components of $G_k$; $g'_{k,r}$ and $g'_{k,i}$ are the real and imaginary transform coefficient components of $G'_k$;
- the modified discrete cosine transform equation can be expressed as
- $x[n]$ is the input sequence for a channel and $N$ is the transform length.
- $X_k = \cos\gamma \, \bigl(g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)\bigr) - \sin\gamma \, \bigl(g_{k,r}\sin(\pi(k+1/2)/N) + g_{k,i}\cos(\pi(k+1/2)/N)\bigr)$, with $g_{k,r}, g_{k,i} \in \mathbb{R}$ (set of real numbers)
- where $G_k = g_{k,r} + j g_{k,i} = \sum_{n=0}^{N-1} \bigl(x[n]\, e^{j\pi n/N}\bigr)\, e^{j 2\pi n k/N}$.
- Figure 1 is a diagrammatic representation of a stream of audio data and the substructure arrangement thereof;
- Figure 2 is a functional block diagram of a digital audio encoder
- Figure 3 is a functional block diagram of a system for encoding a single audio channel
- Figure 4 is a functional block diagram of a system for encoding a pair of audio channels.
- the input to an audio coder comprises a stream of digitised samples of the time domain analog signal.
- the stream consists of interleaved samples for each channel.
- the input stream is sectioned into blocks, each block containing N consecutive samples of each channel (see Fig. 1).
- N samples of a channel form a sequence $\{x[0], x[1], x[2], \ldots, x[N-1]\}$.
- the time domain samples are next converted to the frequency domain using an analysis filter bank (see Fig. 2).
- the frequency domain coefficients, thus generated, form a coefficient set which can be identified as $(X_0, X_1, X_2, \ldots, X_{N/2-1})$. Since the signal is real, only the first $N/2$ frequency components are considered.
- $X_0$ is the lowest frequency (DC) component while $X_{N/2-1}$ is the highest frequency component of the signal.
- DC: lowest frequency
- Audio compression essentially entails finding how much of the information in the set $(X_0, X_1, X_2, \ldots, X_{N/2-1})$ is necessary to reproduce the original analog signal at the decoder with minimal audible distortion.
- the coefficient set is normally converted into floating point format, where each coefficient is represented by an exponent and mantissa.
- the exponent set is usually transmitted in its original form.
- the mantissa is truncated to a fixed or variable number of decimal places.
- the value of number of bits for coding a mantissa is usually obtained from a bit allocation algorithm which for advanced psychoacoustic coders may be based on the masking property of the human auditory system.
- a low number of bits results in high compression ratio because less space is required to transmit the coefficients. However this causes very high quantization error leading to audible distortion.
- a good distribution of available bits to each mantissa forms the core of the most advanced encoders.
- the frequency domain transformation of signals is performed by the modified discrete cosine transform (MDCT) (Eq. 1).
- the MDCT requires $O(N^2)$ additions and multiplications.
- $G_k = g_{k,r} + j g_{k,i}$ is computed in $O(N \log_2 N)$ operations by use of FFT algorithms.
- the additional operation outlined in Eq. 16 to extract the final $X_k$ is only of order $O(N)$. Therefore the MDCT can now be computed in $O(N \log_2 N)$ time.
- the operations required to obtain the MDCT are illustrated in Fig. 3.
- the multi-channel encoder is required to process m audio channels. Instead of computing an FFT for each channel as described in the previous section, it is possible to further reduce the computational requirement of the coder by combining two channels and using a single FFT only. In effect, instead of m FFTs, only m/2 FFTs need to be computed.
- DFT for any two channels can be computed with only one FFT block by considering the input as a complex number.
- the real part is formed from the sequence for any one channel and the imaginary part is from data of another channel. After the Fourier Transform is computed for the resulting complex variable, the resulting transform for each channel can be easily retrieved.
- the input data to the FFT block is actually a complex number (formed by multiplying the real data by the complex variable $e^{j\pi n/N}$).
- $e^{j\pi n/N}$: complex variable
- using some processing after the FFT one can still compute the DFT of two channels using a single FFT block.
- the frequency transform length N is decided by the encoder based on temporal and spectral resolution requirements.
- the input signal is usually analysed with a high frequency bandpass filter to detect the presence of transients. This information is used to adjust the block length, restricting quantization noise associated with the transient within a small temporal region about the transient, avoiding temporal masking.
- two short transforms of length N/2 each are taken.
- a single long transform of length N is used, thus providing higher spectral resolution.
- a short transform is required for restricting quantization noise associated with the transient within a small temporal region about the transient, avoiding temporal masking.
- a long transform gives slightly better frequency resolution, but the error from giving it up is small compared to the case where a long transform is utilised in the presence of a transient. Forcing a long transform onto a channel in the presence of a transient leads to greater distortion in the final produced music. This conjecture was proven true by experimental studies on benchmark music streams.
- before the time domain signal $x[n]$ is transformed to the frequency domain, a windowing function is usually applied.
- although the invention has been described herein primarily in terms of its mathematical derivation and application, and the procedures required for implementation, it will be readily recognised by those skilled in the art that the procedures described can be implemented by means of any desired computational apparatus.
- the invention may be embodied in computer software operating on general purpose computing equipment, or may be embodied in purpose built circuitry or contained in microcode or the like in an integrated circuit or set of integrated circuits.
- $G_k = \sum_{n=0}^{N-1} \bigl(x[n]\, e^{j\pi n/N}\bigr)\, e^{j 2\pi n k/N}$
- T 2 M2j( ⁇ k * myN G k -e - ** ⁇ G ⁇
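The exponent and mantissa representation described above can be illustrated with a minimal Python sketch. It splits a coefficient into a binary exponent and a mantissa truncated to a chosen number of bits, then reconstructs the value. The function names `quantize` and `dequantize`, the use of `math.frexp`, and truncation to binary digits are assumptions made for illustration; the text does not specify the coder's actual number format.

```python
import math


def quantize(value, mantissa_bits):
    """Return (exponent, integer mantissa) with the mantissa truncated to mantissa_bits."""
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    # Truncate toward zero (assumed rounding mode, chosen for illustration).
    return exponent, math.trunc(mantissa * (1 << mantissa_bits))


def dequantize(exponent, q, mantissa_bits):
    """Rebuild the approximate coefficient from its exponent and truncated mantissa."""
    return math.ldexp(q / (1 << mantissa_bits), exponent)


coefficient = 0.3141
for bits in (4, 8, 12):
    exponent, q = quantize(coefficient, bits)
    approx = dequantize(exponent, q, bits)
    # Fewer mantissa bits means fewer bits to transmit but a larger quantization error.
    print(bits, exponent, q, approx, abs(coefficient - approx))
```

With truncation, the error for a given coefficient is bounded by 2^(exponent - bits), which is why the bit allocation per mantissa directly trades compression ratio against distortion.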
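The sectioning of the interleaved input stream into blocks of N samples per channel can be sketched as follows. This assumes the samples are interleaved one sample per channel in turn (channel 0, channel 1, ..., then channel 0 again); the function name `split_blocks` and its parameters are hypothetical.

```python
def split_blocks(stream, channels, block_len):
    """Split an interleaved sample stream into blocks of block_len samples per channel.

    Returns a list of blocks; each block is a list holding one sequence per channel.
    Trailing samples that do not fill a whole block are dropped.
    """
    # De-interleave: channel c owns samples c, c + channels, c + 2*channels, ...
    per_channel = [stream[c::channels] for c in range(channels)]
    n_blocks = min(len(ch) for ch in per_channel) // block_len
    return [
        [ch[b * block_len:(b + 1) * block_len] for ch in per_channel]
        for b in range(n_blocks)
    ]


# Two channels, three samples per block: even indices are channel 0, odd are channel 1.
blocks = split_blocks(list(range(12)), channels=2, block_len=3)
print(blocks)
```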
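The single-channel procedure (pre-multiplication, FFT, then the addition and subtraction streams that yield X_k) can be sketched in Python as below. Several things here are assumptions rather than the patent's own definitions: the phase term gamma is not defined in the text above, so `gamma(k) = pi*(2k+1)/4` is only an illustrative choice; the direct O(N^2) reference form in `mdct_direct` is the cosine sum that the X_k expression expands to for that gamma, not a transcription of Eq. 1; and `fft_pos` is a small textbook radix-2 routine using the positive-exponent kernel that appears in the definition of G_k.

```python
import cmath
import math


def fft_pos(a):
    """Radix-2 FFT with kernel exp(+j*2*pi*n*k/N); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft_pos(a[0::2]), fft_pos(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def gamma(k):
    # ASSUMPTION: illustrative phase term; the text above does not define gamma.
    return math.pi * (2 * k + 1) / 4


def mdct_fast(x):
    """Compute N/2 coefficients X_k in O(N log N) via one complex FFT."""
    n_len = len(x)
    # Step i: multiply the input samples by exp(j*pi*n/N).
    pre = [x[n] * cmath.exp(1j * math.pi * n / n_len) for n in range(n_len)]
    # Step ii: FFT gives G_k = g_r + j*g_i.
    g = fft_pos(pre)
    coeffs = []
    for k in range(n_len // 2):
        theta = math.pi * (k + 0.5) / n_len
        # Step iii: addition and subtraction streams.
        add = g[k].real * math.sin(theta) + g[k].imag * math.cos(theta)
        sub = g[k].real * math.cos(theta) - g[k].imag * math.sin(theta)
        # Steps iv and v: scale by cos/sin of gamma and subtract.
        coeffs.append(math.cos(gamma(k)) * sub - math.sin(gamma(k)) * add)
    return coeffs


def mdct_direct(x):
    """O(N^2) reference: X_k = sum_n x[n] * cos(pi*(2n+1)*(2k+1)/(2N) + gamma_k)."""
    n_len = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * (2 * k + 1) / (2 * n_len) + gamma(k))
            for n in range(n_len))
        for k in range(n_len // 2)
    ]


samples = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
max_diff = max(abs(a - b) for a, b in zip(mdct_fast(samples), mdct_direct(samples)))
print(max_diff)  # expected to be at floating-point rounding level
```

The agreement between the two routines does not depend on the particular gamma chosen, since the post-processing is just the real part of G_k rotated by the two phase terms.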
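To put the two orders of growth quoted above side by side, here is a rough worked comparison for a transform length of N = 512. The value 512 is chosen here purely for illustration, and constant factors are ignored, so these are orders of magnitude rather than exact operation counts.

```latex
N = 512:\qquad
N^2 = 262\,144, \qquad
N \log_2 N = 512 \times 9 = 4\,608, \qquad
\frac{N^2}{N \log_2 N} = \frac{N}{\log_2 N} \approx 56.9
```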
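The two-channel case, where one channel forms the real part and the other the imaginary part of a single complex FFT input, can be sketched as follows. The recovery formulas used here are derived under the assumption that G_k uses the pre-multiplication factor exp(j*pi*n/N) and the positive-exponent kernel shown above, which gives conj(G_k) = G_{N-1-k}; the mirror index N-1-k and the function names are therefore this sketch's own, not a transcription of the patent's equations.

```python
import cmath
import math


def fft_pos(a):
    """Radix-2 FFT with kernel exp(+j*2*pi*n*k/N); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft_pos(a[0::2]), fft_pos(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def premultiplied_transform(x):
    """Single-channel G_k: FFT of x[n] * exp(j*pi*n/N)."""
    n_len = len(x)
    return fft_pos([x[n] * cmath.exp(1j * math.pi * n / n_len) for n in range(n_len)])


def paired_transform(x1, x2):
    """Return (G_k, G'_k) for two real channels using one complex FFT."""
    n_len = len(x1)
    # Channel 1 is the real part, channel 2 the imaginary part.
    z = [complex(x1[n], x2[n]) * cmath.exp(1j * math.pi * n / n_len) for n in range(n_len)]
    zk = fft_pos(z)  # Z_k = G_k + j*G'_k
    g1, g2 = [], []
    for k in range(n_len):
        mirror = zk[n_len - 1 - k].conjugate()  # equals G_k - j*G'_k
        g1.append((zk[k] + mirror) / 2)    # combine for the first channel
        g2.append((zk[k] - mirror) / 2j)   # difference for the second channel
    return g1, g2


left = [math.sin(0.4 * n) for n in range(8)]
right = [math.cos(0.9 * n) - 0.2 * n for n in range(8)]
g_left, g_right = paired_transform(left, right)
err = max(
    max(abs(a - b) for a, b in zip(g_left, premultiplied_transform(left))),
    max(abs(a - b) for a, b in zip(g_right, premultiplied_transform(right))),
)
print(err)  # expected to be at floating-point rounding level
```

Each recovered sequence can then go through the same per-coefficient post-processing as in the single-channel case, so m channels need only m/2 FFTs plus O(N) extra work per channel.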
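The block-length decision described above (one long transform of length N, or two short transforms of length N/2 when a transient is present) can be sketched as below. Everything about the detector is hypothetical: a first-difference filter stands in for the high frequency bandpass filter, the energy-ratio test and the `threshold` value are arbitrary illustrative choices, and `choose_block_lengths` is not a name from the patent.

```python
import math


def choose_block_lengths(block, threshold=4.0):
    """Return [N] for a long transform or [N/2, N/2] for two short transforms."""
    n = len(block)
    # Crude high-pass filter (first difference); an assumed stand-in for the
    # bandpass filter mentioned in the text.
    hp = [block[i] - block[i - 1] for i in range(1, n)]
    half = len(hp) // 2
    # High-frequency energy in each half; the small constant avoids division by zero.
    e_first = sum(v * v for v in hp[:half]) + 1e-12
    e_second = sum(v * v for v in hp[half:]) + 1e-12
    transient = max(e_first, e_second) / min(e_first, e_second) > threshold
    return [n // 2, n // 2] if transient else [n]


steady = [math.sin(2 * math.pi * n / 8) for n in range(64)]
attack = [0.0] * 48 + [1.0, -1.0] * 8  # silence followed by a sudden burst
print(choose_block_lengths(steady))
print(choose_block_lengths(attack))
```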
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SG1998/000014 WO1999043110A1 (en) | 1998-02-21 | 1998-02-21 | A fast frequency transformation techique for transform audio coders |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1057292A1 | 2000-12-06 |
EP1057292B1 | 2004-04-28 |
Family
ID=20429840
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP98909964A Expired - Lifetime EP1057292B1 (en) | 1998-02-21 | 1998-02-21 | A fast frequency transformation techique for transform audio coders |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1057292B1 (en) |
DE (1) | DE69823557T2 (en) |
WO (1) | WO1999043110A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1125235B1 (en) | 1998-10-26 | 2003-04-23 | STMicroelectronics Asia Pacific Pte Ltd. | Multi-precision technique for digital audio encoder |
DE19959156C2 (en) | 1999-12-08 | 2002-01-31 | Fraunhofer Ges Forschung | Method and device for processing a stereo audio signal to be encoded |
US6934677B2 (en) | 2001-12-14 | 2005-08-23 | Microsoft Corporation | Quantization matrices based on critical band pattern information for digital audio wherein quantization bands differ from critical bands |
US7240001B2 (en) | 2001-12-14 | 2007-07-03 | Microsoft Corporation | Quality improvement techniques in an audio encoder |
US7502743B2 (en) | 2002-09-04 | 2009-03-10 | Microsoft Corporation | Multi-channel audio encoding and decoding with multi-channel transform selection |
US7299190B2 (en) | 2002-09-04 | 2007-11-20 | Microsoft Corporation | Quantization and inverse quantization for audio |
GB2423451A (en) * | 2005-02-16 | 2006-08-23 | Ishce Ltd | Inserting a watermark code into a digitally compressed audio or audio-visual signal or file |
US7539612B2 (en) | 2005-07-15 | 2009-05-26 | Microsoft Corporation | Coding and decoding scale factor information |
US7831434B2 (en) | 2006-01-20 | 2010-11-09 | Microsoft Corporation | Complex-transform channel coding with extended-band frequency coding |
US7953604B2 (en) | 2006-01-20 | 2011-05-31 | Microsoft Corporation | Shape and scale parameters for extended-band frequency coding |
US7885819B2 (en) | 2007-06-29 | 2011-02-08 | Microsoft Corporation | Bitstream syntax for multi-process audio decoding |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2646778B2 (en) * | 1990-01-17 | 1997-08-27 | 日本電気株式会社 | Digital signal processor |
JP2866754B2 (en) * | 1991-03-27 | 1999-03-08 | 三菱電機株式会社 | Arithmetic processing unit |
CA2090052C (en) * | 1992-03-02 | 1998-11-24 | Anibal Joao De Sousa Ferreira | Method and apparatus for the perceptual coding of audio signals |
JPH06112909A (en) * | 1992-09-28 | 1994-04-22 | Sony Corp | Signal converter for improved dct |
EP0718746B1 (en) * | 1994-12-21 | 2005-03-23 | Koninklijke Philips Electronics N.V. | Booth multiplier for trigonometric functions |
-
1998
- 1998-02-21 WO PCT/SG1998/000014 patent/WO1999043110A1/en active IP Right Grant
- 1998-02-21 DE DE69823557T patent/DE69823557T2/en not_active Expired - Fee Related
- 1998-02-21 EP EP98909964A patent/EP1057292B1/en not_active Expired - Lifetime
Non-Patent Citations (1)
Title |
---|
See references of WO9943110A1 * |
Also Published As
Publication number | Publication date |
---|---|
EP1057292B1 (en) | 2004-04-28 |
DE69823557T2 (en) | 2005-02-03 |
DE69823557D1 (en) | 2004-06-03 |
WO1999043110A1 (en) | 1999-08-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11557304B2 (en) | Methods and apparatus for performing variable block length watermarking of media | |
KR100253136B1 (en) | Low computational complexity digital filter bank | |
CN102150207B (en) | Compression of audio scale-factors by two-dimensional transformation | |
JP4242450B2 (en) | An efficient implementation of a single sideband filter bank for phasor measurements. | |
JP5414684B2 (en) | Method and apparatus for performing audio watermarking, watermark detection, and watermark extraction | |
EP0990368A1 (en) | Method and apparatus for frequency-domain downmixing with block-switch forcing for audio decoding functions | |
JP2009271554A (en) | Parametric representation of spatial audio | |
CN101887726A (en) | The method of stereo coding and decoding and equipment thereof | |
EP1057292A1 (en) | A fast frequency transformation techique for transform audio coders | |
US20070036228A1 (en) | Method and apparatus for audio encoding and decoding | |
KR20120095920A (en) | Optimized low-throughput parametric coding/decoding | |
EP0775389B1 (en) | Encoding system and encoding method for encoding a digital signal having at least a first and a second digital signal component | |
Brandenburg | Introduction to perceptual coding | |
CN1862969B (en) | Adaptive block length, constant converting audio frequency decoding method | |
US7203717B1 (en) | Fast modified discrete cosine transform method | |
EP0707761B1 (en) | Arrangement for determining a signal spectrum of a wideband digital signal and for deriving bit allocation information in response thereto | |
EP1076295A1 (en) | Method and encoder for bit-rate saving encoding of audio signals | |
KR100424036B1 (en) | Disassembly / synthesis filtering system with efficient nodal stack single-sided band filter bank using time-domain aliasing | |
Lanciani | Compressed-domain processing of MPEG audio signals | |
JP3292228B2 (en) | Signal encoding device and signal decoding device | |
Fielder et al. | Audio Coding Tools for Digital Television Distribution | |
Wylie | apt-X100: Low-Delay, Low-Bit-Rate Subband ADPCM Digital: Audio Coding | |
Sandler | Linear time-invariant systems in the wavelet domain | |
Orglmeister | Data reduction in high-quality audio signals | |
JPH06169289A (en) | Compressed data reproduction device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | 17P | Request for examination filed | Effective date: 20000920 |
| | AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): DE FR GB IT |
| | RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: STMICROELECTRONICS ASIA PACIFIC PTE LTD. |
| | 17Q | First examination report despatched | Effective date: 20030305 |
| | GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
| | GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
| | GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
| | AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): DE FR GB IT |
| | REG | Reference to a national code | Ref country code: GB; Ref legal event code: FG4D |
| | REF | Corresponds to: | Ref document number: 69823557; Country of ref document: DE; Date of ref document: 20040603; Kind code of ref document: P |
| | ET | Fr: translation filed | |
| | PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
| | 26N | No opposition filed | Effective date: 20050131 |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: IT, Payment date: 20080213, Year of fee payment: 11; Ref country code: GB, Payment date: 20080129, Year of fee payment: 11; Ref country code: DE, Payment date: 20080208, Year of fee payment: 11 |
| | PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR, Payment date: 20080228, Year of fee payment: 11 |
| | GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 20090221 |
| | REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST; Effective date: 20091030 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE, Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES, Effective date: 20090901 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB, Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES, Effective date: 20090221; Ref country code: FR, Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES, Effective date: 20090302 |
| | PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT, Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES, Effective date: 20090221 |