WO2000051243A1 - A backward decoding method of digital audio data - Google Patents

A backward decoding method of digital audio data

Info

Publication number
WO2000051243A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
audio data
data
header
digital audio
Prior art date
Application number
PCT/KR1999/000764
Other languages
French (fr)
Inventor
Soo Geun You
Jung Jae Park
Original Assignee
Soo Geun You
Jung Jae Park
Priority date
Filing date
Publication date
Application filed by Soo Geun You, Jung Jae Park filed Critical Soo Geun You
Priority to JP2000601744A priority Critical patent/JP2002538503A/en
Priority to AU16934/00A priority patent/AU1693400A/en
Publication of WO2000051243A1 publication Critical patent/WO2000051243A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00007Time or data compression or expansion
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10Digital recording or reproducing
    • G11B20/10527Audio or video recording; Data buffering arrangements
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B5/00Recording by magnetisation or demagnetisation of a record carrier; Reproducing by magnetic means; Record carriers therefor
    • G11B5/008Recording on, or reproducing or erasing from, magnetic tapes, sheets, e.g. cards, or wires
    • G11B5/00813Recording on, or reproducing or erasing from, magnetic tapes, sheets, e.g. cards, or wires magnetic tapes

Definitions

  • Digital audio signals typically consist of 16-bit samples recorded at a sampling rate more than twice the actual audio bandwidth (e.g., 32 kHz, 44.1 kHz, and 48 kHz).
  • With MP3 audio coding, the original sound data can be encoded at bit rates of 128 to 256 kbps. That is, 1.5 to 3 bits are needed per sample on average instead of 16 bits, so MP3 can shrink the original sound data from a CD-DA by a factor of about 12 without perceptible loss of sound quality.
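The bit-rate arithmetic above can be checked with a short sketch; the parameter values (44.1 kHz sampling, 16-bit samples, two channels, 128 to 256 kbps) come from the passage, while the function names are ours.

```python
# Rough arithmetic behind the compression figures quoted above.
# Assumes CD-DA parameters: 44.1 kHz sampling, 16-bit samples, 2 channels.

def bits_per_sample(bitrate_bps, sample_rate_hz=44100, channels=2):
    """Average coded bits spent per PCM sample at a given MP3 bitrate."""
    return bitrate_bps / (sample_rate_hz * channels)

def compression_factor(bitrate_bps, sample_rate_hz=44100, channels=2, bits=16):
    """How much smaller the coded stream is than the raw PCM stream."""
    return (sample_rate_hz * channels * bits) / bitrate_bps

low = bits_per_sample(128_000)       # roughly 1.45 bits per sample
high = bits_per_sample(256_000)      # roughly 2.90 bits per sample
ratio = compression_factor(128_000)  # roughly 11x at 128 kbps
```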
  • Digital audio data are first decoded and then recorded on either track of a magnetic tape on which a forward track and a backward track are provided. That is, when the tape travels in the forward (backward) direction, the audio signals are recorded on the forward (backward) track. After the audio signals have been recorded on the forward track, the tape begins to travel backward and the audio signals are recorded on the backward track. As a result, recording the digital audio signals on a magnetic tape takes the time of two tape passes.
  • A known alternative is to encode the backward-reproduced signals separately, but that method has the weak points of requiring more storage space for the encoded backward-reproduced signals in addition to the encoded forward-reproduced signals, and of imperfect reproduction of the audio signals: because MP3 encoding exploits the masking phenomenon, a small amplitude that precedes a large amplitude in normal playback order is suppressed while the backward-reproduced audio signal is encoded.
  • the present invention provides a method of backward decoding MPEG audio data into an analog audio signal, comprising the steps of: locating a header of the last frame of the compressed digital audio data; dequantizing a plurality of data blocks constituting the frame based on information contained in the located header; extracting time signals of each frequency subband from the dequantized data blocks while reducing discontinuities between the dequantized data blocks; and synthesizing the extracted time signals of all subbands backward into a real audio signal reversed in time.
  • the backward decoding method according to the present invention enables fast recording of MPEG audio data on both tracks of a magnetic tape.
  • FIGS. 1 and 2 are block diagrams showing an MPEG audio encoder
  • FIG. 3 shows the arrangement of the various bit fields in a frame of MPEG audio data
  • FIG. 4 is a block diagram showing an MPEG audio decoder
  • FIG. 5 is a schematic diagram showing an illustration of the bit reservoir within a fixed length frame structure
  • FIG. 6 is a schematic diagram illustrating the overlap of inverse-modified-discrete-cosine-transformed blocks
  • FIG. 7 is a flow graph showing a synthesis filterbank
  • FIG. 8 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 7;
  • FIG. 9 is a block diagram of the flowchart of FIG. 8;
  • FIG. 10 is a flow graph showing a synthesis filterbank for backward decoding according to the present invention.
  • FIG. 11 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 10; and FIG. 12 is a block diagram of the flowchart of FIG. 11.
  • FIG. 4 shows a block diagram of an MP3 audio decoder to which an embodiment of the present invention is applied, comprising a demultiplexer 100 for dividing an MP3 audio bitstream into several data of different types; a side-information decoder 110 for decoding side-information contained in the bitstream; a Huffman decoder 120 for decoding the Huffman-coded audio data;
  • a dequantizer 130 for obtaining actual frequency energies from the Huffman-decoded data;
  • an inverse MDCT (IMDCT) unit 140 for applying the IMDCT to the energies; and
  • a synthesis filterbank 150 for synthesizing the subband values into PCM samples.
  • the first step in the backward decoding process of an MP3 bitstream is to find where decoding is started in the bitstream.
  • Since frames are independent of each other, the first step is to locate a frame header in the bitstream, which requires knowing the frame length.
  • All MPEG bitstreams are divided into separate chunks of bits called frames. There is a fixed number of frames per second for each MPEG format, which means that for a given bit rate and sampling frequency, each input frame has a fixed length and produces a fixed number of output samples.
  • Locating header information is done by searching for the synchronization bit-pattern marked within the header. However, locating the header can fail because some audio data may contain the same bit pattern as the synchronization bit-pattern.
  • To cope with this, the demultiplexer 100 analyzes the first header in the stream and obtains the length of a frame having no padding bit based on information in the first header. By using this frame length, the header of the last frame is located while traversing the MP3 audio clip from the end.
  • When a padding bit is added to a frame, the frame length is increased by 1 byte. That is, the frame length may change from frame to frame due to the padding bit. Because it is uncertain whether the last frame has a padding bit, searching for the header of the last frame requires examining whether the last frame header lies away from the end of the clip by the frame length or by one more byte.

(2). Obtaining Side-information
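The last-header search just described can be sketched roughly as follows. The frame-length formula and the 11-bit sync pattern are standard MPEG-1 layer-3 facts; the function names and the simplification that every frame shares one base length (plus an optional padding byte) are ours.

```python
# Hypothetical sketch of locating the last frame header, as described above:
# step back from the end by the frame length, allowing for one padding byte,
# and verify the synchronization bit-pattern.

def base_frame_length(bitrate_bps, sample_rate_hz):
    """Length in bytes of an MPEG-1 layer-3 frame without padding."""
    return 144 * bitrate_bps // sample_rate_hz

def is_frame_header(data, pos):
    """True if the 11-bit sync pattern (all ones) starts at byte offset pos."""
    return pos + 1 < len(data) and data[pos] == 0xFF and (data[pos + 1] & 0xE0) == 0xE0

def locate_last_header(data, bitrate_bps, sample_rate_hz):
    """Try the unpadded frame length first, then one extra byte for padding."""
    n = base_frame_length(bitrate_bps, sample_rate_hz)
    for length in (n, n + 1):
        pos = len(data) - length
        if pos >= 0 and is_frame_header(data, pos):
            return pos
    return -1

clip = bytearray(1251)                  # toy "clip" of three 417-byte frames
clip[834], clip[835] = 0xFF, 0xFB       # forge a sync pattern 417 bytes from the end
assert locate_last_header(clip, 128_000, 44_100) == 834
```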
  • the demultiplexer 100 divides the input MP3 audio bitstream into side-information describing how the frame was encoded, scale factors specifying the gain of each frequency band, and Huffman-coded data.
  • the side-information decoder 110 decodes the side-information so that the decoder knows what to do with the data contained in the frame.
  • the number of bits required for MP3 encoding at equal sound quality depends on the acoustic characteristics of the samples to be encoded.
  • Consequently, the coded data do not necessarily fit into a fixed-length frame in the coded bitstream.
  • MP3 uses bit reservoir technique whereby bit rate may be borrowed from previous frames in order to provide more bits to demanding parts of the input signal.
  • the encoder donates bits to a reservoir when it needs less than the average number of bits to code a frame. Later, when the encoder needs more than the average number of bits to code a frame, it borrows bits from the reservoir. The encoder can only borrow bits donated from past frames with limits. It cannot borrow from future frames.
  • the current frame being decoded may include audio data belonging to the frames that will be presented subsequently.
  • the starting byte of the audio data for the current frame is limited to 511 bytes away from that frame.
  • a 9-bit pointer is included in each frame's side- information that points to the location of the starting byte of the audio data for that frame, as shown in FIG. 5.
  • the audio data for the current frame being decoded, i.e., the scale factors and Huffman-coded data, may be included in the data regions of previous frames that are within 511 bytes of that frame.
  • When MP3 audio data are decoded forward and it is determined that data belonging to the current frame contain data for subsequent frames, those data are kept until the subsequent frames are decoded.
  • When MP3 audio data are decoded backward, it is checked at each frame whether decoding the current frame needs data contained in the precedent frames, and if so, those data are obtained by identifying the headers of the precedent frames and the data belonging to those frames.
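The pointer-following step can be illustrated with a toy model; `frame_data_offsets`, `audio_data_start`, and the offsets used below are hypothetical stand-ins for the real side-information field (a 9-bit backward pointer of 0 to 511 bytes).

```python
# Illustrative sketch (names hypothetical) of resolving the bit-reservoir
# pointer: each frame's side-information carries a 9-bit value giving how many
# bytes before the frame's own data region its audio data actually starts.

def audio_data_start(frame_data_offsets, current, main_data_begin):
    """frame_data_offsets: byte offset where each frame's data region begins.
    Returns the byte offset where the current frame's audio data starts,
    possibly inside the data regions of preceding frames."""
    if not 0 <= main_data_begin <= 511:
        raise ValueError("pointer is a 9-bit field, 0..511")
    start = frame_data_offsets[current] - main_data_begin
    if start < 0:
        raise ValueError("pointer reaches before the start of the stream")
    return start

# A frame whose side-information says the pointer is 200 has its audio data
# beginning 200 bytes before its own data region:
offsets = [0, 417, 834, 1251]
assert audio_data_start(offsets, 2, 200) == 634
```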
  • the Huffman decoder 120 then starts to Huffman-decode the audio data (including the data contained in the precedent frames) based on the side-information and the Huffman trees that were constructed and used in the encoding process according to the data contents. This step is the same as in forward decoding.
  • Dequantizing and descaling: when the Huffman decoder 120 has decoded the audio data, the decoded values have to be dequantized by the dequantizer 130 and descaled, using the scale factors, into real spectral energy values. For example, if the Huffman-decoded value is Y, the real spectral energy value is obtained as Y^(4/3) multiplied by the scale factors.
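A minimal sketch of this dequantize-and-descale rule, assuming the Y^(4/3) power law above with the sign of Y preserved; the function name and the sample scale factor are illustrative.

```python
# Dequantize-and-descale step described above: the Huffman-decoded integer Y
# is raised to the 4/3 power (keeping its sign) and multiplied by the scale
# factor of its band.

def dequantize(y, scale):
    """Map a quantized value y back to a spectral energy value."""
    magnitude = abs(y) ** (4.0 / 3.0)
    return (magnitude if y >= 0 else -magnitude) * scale

values = [dequantize(y, scale=0.5) for y in (-8, 0, 8)]
# 8 ** (4/3) == 16, so the endpoints come back as roughly -8.0 and 8.0 here.
```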
  • each channel can be transmitted separately in every frame, but transmission of the sum and the difference between the two channels is often adopted to reduce redundancies therebetween. If the bitstream was encoded in this way, the decoder has to perform stereo-processing to recover the original two channels.
  • MDCT is done to get better frequency resolution than in the other layers.
  • the MDCT is essentially a critically sampled DCT, implying that if no quantization had been done, the original signal would be reconstructed perfectly.
  • quantization is performed for each data block in the encoding process, discontinuities between data blocks occur inevitably.
  • a single data block is the unit block of output samples of the decoder and corresponds to a granule in the inverse MDCT.
  • the inverse MDCT uses 50% overlap, i.e., every inverse-modified- discrete-cosine-transformed granules are overlapped with half of the previous transformed granules to smooth out any discontinuities.
  • IMDCT produces 36 samples output in a manner that the second half 18 samples of the previous granule is added to the first half 18 samples of the current granule, as shown in FIG. 6.
  • In backward decoding, the order in which granules are added must be reversed, i.e., the second half 18 samples of the current granule are added to the first half 18 samples of the precedent granule.
  • Because backward decoding starts from the last frame, the second granule of that frame is added with zeros or is just used without overlapping.
  • the IMDCT process in the forward decoding is expressed by the following equation:
  • x_i(n) = y_i(n) + y_{i-1}(n + 18), for 0 ≤ n < 18,
  • where x_i(n) is a target output sample;
  • y_i(n) is an inverse-modified-discrete-cosine-transformed sample;
  • i is the granule index;
  • N is the total number of frames; and
  • y_0(n + 18) are all zeros for 0 ≤ n < 18.
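The forward and backward overlap-add can be sketched as below, assuming 36-sample granules split into 18+18 halves as described. The backward variant produces the same 18-sample blocks, only in reverse order; the function names are ours.

```python
# Sketch of the 50% overlap-add described above, assuming 36-sample IMDCT
# granules (18 + 18). Forward decoding adds the second half of the previous
# granule to the first half of the current one; backward decoding combines the
# current granule's second half with the playback-earlier granule's first half.

def overlap_add_forward(granules):
    """granules: list of 36-sample IMDCT outputs in playback order.
    Yields one 18-sample output block per granule."""
    prev_tail = [0.0] * 18                      # y_0(n + 18) is all zeros
    for g in granules:
        yield [g[n] + prev_tail[n] for n in range(18)]
        prev_tail = g[18:]

def overlap_add_backward(granules):
    """granules: same list, visited from the end; returns the same blocks as
    the forward pass but in reverse order."""
    blocks = []
    next_head = None        # first half of the playback-subsequent granule
    for g in reversed(granules):
        if next_head is not None:
            blocks.append([next_head[n] + g[n + 18] for n in range(18)])
        next_head = g[:18]
    blocks.append(list(next_head))  # earliest granule overlaps only zeros
    return blocks
```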
  • the final step to get the output audio samples is to synthesize 32 subband samples.
  • the subband synthesis operation is to interpolate 32 subband samples into audio samples in the time domain.
  • a subband synthesis filter needs the delayed inputs of previous frames, but in case of the backward decoding, subband samples are presented to the synthesis filter in the reverse order to the forward decoding. Therefore, redesign of MPEG standard synthesis filterbank is required to perform the backward decoding operation.
  • the MPEG standard synthesis filterbank for the forward decoding is described below in detail and then the synthesis filterbank for the backward decoding according to the present invention is explained in detail.
  • FIG. 7 shows a flow graph of an MPEG standard synthesis filterbank for forward decoding, whereby 32 subband samples are synthesized into audio samples of a time series in a way similar to frequency-division multiplexing.
  • x_r(mT_s1) is the r-th subband sample, and x_r(nT_s2) is the 32-times up-sampled version of x_r(mT_s1), such that thirty-one zeros are inserted into the interval between (m-1)T_s1 and mT_s1 for the x_r(mT_s1) samples.
  • x_r(nT_s2) is processed by a band-pass filter H_r(z) that passes the signal belonging to the frequency band allocated to each filter.
  • the band-pass filter has 512 taps and is constructed by phase-shifting a prototype low-pass filter.
  • the flow graph of FIG. 7 is expressed by equation (1).
  • S_t(nT_s2) is the synthesized output sample at time t. That is, S_t(nT_s2) represents the synthesized output sample of the 32 subband samples x_r(tT_s1) at time t.
  • equation (1) implies the convolution of x_r(kT_s2) and H_r(kT_s2), which has 512 coefficients and is constructed as the product of the prototype low-pass filter h(kT_s2) and N_r(k), which is used for its phase shift.
  • a reduction of the number of computations, i.e., multiplies and adds, is possible in equation (1).
  • equation (1) leads to equation (2); hereinafter, the sampling period in the following equations is omitted for convenience and is T_s2 if not explicitly expressed.
  • For each subband, one sample is presented and multiplied by N_r(k), resulting in 64 samples.
  • the 64 samples are stored in a 1024-sample FIFO (First In First Out) buffer, the samples already stored therein being shifted by 64.
  • 32 PCM output samples are obtained by multiplying the samples in the 1024-sample FIFO buffer by the coefficients of the time window.
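The three steps just listed (matrixing one sample per subband into 64 values, a 1024-value FIFO shifted by 64, then windowing down to 32 PCM samples) can be sketched structurally as follows. The cosine matrix is the usual N_r(k) form, but the window coefficients are placeholders, not the real MPEG table, so the numeric output is not meaningful audio.

```python
import math

# Structural sketch of one forward synthesis step: matrixing, FIFO shift,
# windowing, and folding down to 32 PCM samples.

SUBBANDS = 32

def matrixing(s):
    """V[i] = sum_k cos((16 + i)(2k + 1) * pi / 64) * S[k], i = 0..63."""
    return [sum(math.cos((16 + i) * (2 * k + 1) * math.pi / 64) * s[k]
                for k in range(SUBBANDS)) for i in range(64)]

def synthesize(s, fifo, window):
    """One synthesis step: shift the 1024-entry FIFO by 64, insert the 64 new
    values, window 512 of them, and fold into 32 PCM samples."""
    fifo[64:] = fifo[:-64]                # shift stored samples by 64
    fifo[:64] = matrixing(s)
    u = []                                # gather the 512 values to window
    for j in range(8):
        u.extend(fifo[128 * j: 128 * j + 32])
        u.extend(fifo[128 * j + 96: 128 * j + 128])
    w = [u[i] * window[i] for i in range(512)]
    return [sum(w[32 * i + j] for i in range(16)) for j in range(SUBBANDS)]

fifo = [0.0] * 1024
window = [1.0 / 512] * 512                # placeholder window coefficients
pcm = synthesize([1.0] + [0.0] * 31, fifo, window)
assert len(pcm) == 32
```

Backward decoding, as described next, keeps this exact structure and only reverses the FIFO shift direction and the summation order of the subband samples.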
  • the synthesis filterbank for backward decoding will be described below in detail with reference to the MPEG standard synthesis filterbank for the forward decoding.
  • Because the MPEG standard synthesis filterbank requires past samples for synthesizing PCM audio samples, it cannot use the previous samples when samples are presented in the reverse order to perform backward decoding. As a result, the MPEG standard synthesis filterbank must be modified to perform backward decoding. Its structure is explained below.
  • Equation (1) is changed to equation (4) in accordance with the reverse of the presentation order.
  • equations (7) and (8) are the same as equations (2) and (3) except the index of input samples.
  • the synthesis filter for backward decoding is similar to the synthesis filter for forward decoding, and therefore computation and memory size that are needed to implement the synthesis filter are identical. Accordingly, the backward decoding can be performed with the synthesis filter for forward decoding by reversing the direction in which the samples in the FIFO buffer are shifted as well as the order in which subband samples are summed.
  • Output samples are produced in the reverse order to their playback order in units of 32 samples, but each block of 32 samples is arranged in playback order. Accordingly, the backward decoder outputs the 32-sample blocks of each frame in reverse order. This is repeated frame by frame until the first frame of the MPEG audio data is reached. Note that the synthesis filter for backward decoding can be used in MPEG audio layer-1, layer-2, and layer-3.
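The output ordering just described, blocks reversed while each 32-sample block stays in playback order, can be shown with a toy sequence; the function name is ours.

```python
# Demonstration of the backward decoder's output ordering: 32-sample blocks
# are emitted in reverse playback order, each block internally in playback order.

def backward_block_order(samples, block=32):
    """Rearrange decoder output into the ordering described above."""
    blocks = [samples[i:i + block] for i in range(0, len(samples), block)]
    return [x for b in reversed(blocks) for x in b]

seq = list(range(64))
out = backward_block_order(seq)
# out begins with samples 32..63, followed by samples 0..31
```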
  • MPEG audio data that will be recorded on the backward track are first decoded into PCM samples and stored in a buffer, retrieved from the end in the backward order, and converted into analog audio signals.
  • This method is very simple, but it needs a large-sized buffer for temporarily storing the decoded audio data. Moreover, because the required buffer size depends on the length of MPEG audio clip being decoded, it is difficult to fix the maximum size of the buffer in advance.
  • Frames from the (N-2M)-th frame to the (N-M)-th frame are decoded, and then frames from the (N-2M+1)-th frame to the (N-M)-th frame are recorded in the reverse order.
  • the (N-M)-th frame is included again in the second decoding.
  • the (N-M)-th frame is decoded perfectly because it is decoded together with its precedent frame.
  • the decoding-and-recording operation is repeated until all frames are decoded.
  • the first frame, which is included in the block to be decoded last, is just decoded and recorded because it has no precedent frame, as in the forward decoding.
  • this method has the advantages that a small buffer, large enough to store the samples of M frames, is sufficient, and that the buffer size is fixed in advance.
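The decode-and-record schedule above can be sketched as follows, assuming M-frame record blocks with one extra precedent frame decoded per block (frame indices 1-based); the names and exact ranges are our illustration of the scheme, not the patent's notation.

```python
# Sketch of the block-wise backward decoding schedule: frames are forward-
# decoded M at a time with a one-frame overlap (so every recorded frame has
# its precedent available), and each decoded block is recorded in reverse.

def backward_schedule(n_frames, m):
    """Yield (decode_range, record_range) pairs as (first, last) frame
    indices, last inclusive, recording from frame n_frames down to frame 1."""
    last = n_frames
    while last > m:
        first = last - m + 1
        yield (first - 1, last), (first, last)   # decode one extra precedent frame
        last = first - 1
    yield (1, last), (1, last)                   # first block has no precedent

schedule = list(backward_schedule(10, 4))
# recorded ranges: frames 7-10, then 3-6, then 1-2
```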
  • The backward decoding algorithm needs more memory than the forward decoding algorithm, but its number of computations is the same as that of the forward decoding algorithm.
  • the size of the memory needed is two times that of the memory in forward decoding because a whole frame must be Huffman-decoded at a time; in the forward decoding algorithm, by contrast, the two blocks constituting a frame can be Huffman-decoded sequentially.
  • the memory size thus amounts to 1152×2 words.
  • backward decoding performed by applying the forward decoding algorithm to every predetermined number of frames requires a buffer in which the forward-decoded data are temporarily stored, but it is easy to implement.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

This invention provides a method of backward decoding compressed digital audio data into an analog audio signal reversed in time. The method according to this invention comprises the steps of locating a header of the last frame of the compressed digital audio data; dequantizing a plurality of data blocks constituting the frame based on information contained in the located header; extracting time signals of each frequency subband from the dequantized data blocks while reducing discontinuities between the dequantized data blocks; and synthesizing the extracted time signals of all subbands backward into a real audio signal reversed in time. Therefore, this invention makes it possible to record the decoded analog signal on both tracks of a magnetic tape simultaneously while the tape travels in one direction, with little increase of computation load and memory size, resulting in high-speed recording.

Description

DESCRIPTION
A BACKWARD DECODING METHOD OF DIGITAL AUDIO DATA
1. Technical Field
The present invention relates to a method of decoding compressed digital audio data backward, more particularly, to a method of backward decoding an MPEG (Moving Picture Experts Group) encoded audio data into analog audio signal with little increase of computation load and memory size.
2. Background Art
Digital audio signal is in general more robust to noise than analog signal and thus the quality is not subject to degradation during copy or transmission over network. The digital audio signals are, moreover, transmitted more rapidly and stored in storage media of less capacity due to effective compression methods recently developed.
Many compression methods have been proposed to effectively encode audio signals into digital data. MPEG (Moving Picture Experts Group) audio coding schemes have become the standard in this area. The MPEG audio standards, standardized by the ISO (International Organization for Standardization), comprise MPEG audio layer-1, layer-2, and layer-3, which were devised to encode high-quality stereo audio signals with little or no perceptible loss of quality. They have been widely adopted in the digital music broadcasting area and in addition have been used with the MPEG video standards to encode multimedia data. In addition to MPEG-1, standard specifications for digital environments have been proposed; MPEG-2 includes standards on compression of multimedia data. Standards for object-oriented multimedia communication are included in MPEG-4, which is in progress.
MPEG-1 consists of five coding standards for compressing and storing moving picture and audio signals in digital storage media. The MPEG audio standard includes three audio coding methods: layer-1, layer-2, and layer-3. The MPEG audio layer-3 (hereinafter referred to as "MP3") algorithm takes a much more refined approach than layer-1 and layer-2 to achieve a higher compression ratio and sound quality, which will be described briefly below. MPEG audio layers 1, 2, and 3 compress audio data using perceptual coding techniques which address the perception of sound waves by the human auditory system. To be specific, they take advantage of the human auditory system's inability to hear quantization noise under conditions of auditory masking. "Masking" is a perceptual property of the human ear which occurs whenever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible. Let us suppose that a pianist plays the piano in front of an audience. While the pianist does not touch the keyboard, the audience can hear trailing sounds, but is no longer able to hear them at the instant a key is touched. This is because, in the presence of masking sounds, i.e., the newly generated sounds, trailing sounds that fall inside the frequency bands centered on the masking sound, the so-called critical bands, and whose loudness is lower than a masking threshold are not audible. This phenomenon is called the spectral masking effect. The masking ability of a given signal component depends on its frequency position and its loudness. The masking threshold is low in the sensitive frequency bands of the human ear, i.e., 2 kHz to 5 kHz, but high in other frequency bands.
There is the temporal masking phenomenon in the human auditory system. That is, after hearing a loud sound, it takes a period of time for us to be able to hear a new sound that is not louder than the sound. For instance, it requires 5 milliseconds for us to be able to hear a new sound of 40 dB after hearing a sound of 60 dB during 5 milliseconds. The temporal delay time also depends on frequency band.
Based on a psychoacoustic model of the human ear, the MP3 works by dividing the audio signal into frequency subbands that approximate critical bands, then quantizing each subband according to the audibility of quantization noise within that band, so that the quantization noise is inaudible due to the spectral and temporal masking.
The MP3 encoding process is described below in detail, step by step, with reference to FIGS. 1 and 2.
(1). Subband coding and MDCT (Modified Discrete Cosine Transform)
In the MP3 encoder, PCM format audio signal is, first, windowed and converted into spectral subband components via a filter bank 10, shown in FIG. 1, which consists of 32 equally spaced bandpass filters. The filtered bandpass output signals are critically sub-sampled at the rate of 1/32 of the sampling rate and then encoded.
Polyphase filterbank is, in general, used to cancel the aliasing of adjacent overlapping bands that occurs otherwise because of the low sampling rate at the sub- sampling step. As another method, MDCT (Modified Discrete Cosine Transform) unit 20 and aliasing reduction unit 30 are adopted to cancel the aliasing, thereby preventing deterioration of the quality.
Because the MDCT is essentially a critically sampled DCT (Discrete Cosine Transform), the input PCM audio signal can be reconstructed perfectly in the absence of quantization errors. Discontinuities between transformed blocks occur because quantization is carried out.
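The perfect-reconstruction claim can be verified numerically with a small MDCT/IMDCT pair and 50% overlap-add. The sine analysis/synthesis window below satisfies the Princen-Bradley condition, and the 2/N IMDCT scaling matches that windowing convention; this is a textbook check, not the MP3 implementation (which adds window switching and aliasing reduction).

```python
import math
import random

# Check that MDCT -> IMDCT with 50% overlap-add reconstructs the input
# perfectly when nothing is quantized.

def mdct(x):
    """2N windowed time samples -> N frequency coefficients."""
    n = len(x) // 2
    return [sum(x[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n)) for k in range(n)]

def imdct(c):
    """N coefficients -> 2N time samples (to be windowed and overlap-added)."""
    n = len(c)
    return [(2.0 / n) * sum(c[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                            for k in range(n)) for t in range(2 * n)]

N = 18                                   # half a granule, as in MP3 long blocks
win = [math.sin(math.pi / (2 * N) * (t + 0.5)) for t in range(2 * N)]

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(3 * N)]
block_a = [x[t] * win[t] for t in range(2 * N)]          # samples 0 .. 2N-1
block_b = [x[N + t] * win[t] for t in range(2 * N)]      # samples N .. 3N-1

ya = [v * w for v, w in zip(imdct(mdct(block_a)), win)]  # analysis + synthesis
yb = [v * w for v, w in zip(imdct(mdct(block_b)), win)]

middle = [ya[N + t] + yb[t] for t in range(N)]           # overlap-add region
assert all(abs(middle[t] - x[N + t]) < 1e-9 for t in range(N))
```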
For each subband, the number of quantization bits is allocated by taking into account the masking effect of neighboring subbands. That is, quantization and bit allocation are performed so as to keep the quantization noise in all critical bands below the masking threshold.
(2). Scaling
Samples in each of the 32 subbands are normalized by a scale factor such that the sample of the largest magnitude becomes unity, and the scale factor is encoded for use in the decoder. Through this scaling process, the amplitude of the signal is compressed; therefore, the quantization noise is reduced and becomes inaudible owing to the psychoacoustic phenomena described above.
(3). Huffman Coding
Variable-length Huffman codes are used to obtain a better compression rate for the quantized samples. Huffman coding is a form of entropy coding, whereby redundancy reduction is carried out based on the statistical properties of the digital data. The principle behind Huffman coding is that short codewords are assigned to symbols with higher probability, while longer codewords are assigned to symbols with lower probability. In effect, the average length of the encoded data is made as small as possible.
Let us consider an example for illustration. The quantized samples are 00, 01, 10, and 11, with probabilities 0.6, 0.2, 0.1, and 0.1, respectively. When codewords of constant length, say 2 bits, are used, the average codeword length is 2X0.6 + 2X0.2 + 2X0.1 + 2X0.1 = 2 bits. However, if variable-length codewords are used, i.e., 1 bit is assigned to 00 with the highest probability, 2 bits to 01 with the second highest probability, and 3 bits each to 10 and 11, the average codeword length becomes 1X0.6 + 2X0.2 + 3X0.1 + 3X0.1 = 1.6 bits.
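By way of illustration, the averages in this example can be verified with a few lines of Python (the code table is the illustrative one from the example, not an actual MP3 Huffman table):

```python
# Symbol probabilities of the four quantized samples 00, 01, 10, 11.
probs = [0.6, 0.2, 0.1, 0.1]

# Fixed-length code: every codeword costs 2 bits.
avg_fixed = sum(p * 2 for p in probs)

# Variable-length code: 1, 2, 3, and 3 bits, shortest for the most probable symbol.
code_lengths = [1, 2, 3, 3]
avg_variable = sum(p * n for p, n in zip(probs, code_lengths))
```

These reproduce the 2-bit and 1.6-bit averages of the text (up to floating-point rounding).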
In addition, in order to achieve a high compression rate, MP3 adopts a bit reservoir buffering technique whereby unused bits in frames whose coded data are relatively small are used when the encoder needs more bits than the average number of bits to code a frame. After the above processes, the audio signal is formatted into a bitstream. FIG. 3 shows the arrangement of the various fields in a frame of an MP3 encoded bitstream.
Without data reduction, digital audio signals typically consist of 16-bit samples recorded at a sampling rate greater than twice the actual audio bandwidth (e.g., 32 kHz, 44.1 kHz, or 48 kHz). For two-channel stereo audio at a sampling rate of 44.1 kHz with 16 bits per sample, the bit rate is 16X44100X2 = 1411200 bits per second, or about 1.4 Mbps. Using MP3 audio coding, the original sound data can be encoded at bit rates of 128 to 256 kbps. That is, 1.5 to 3 bits on average are needed per sample instead of 16 bits, and therefore MP3 makes it possible to shrink the original sound data from a CD-DA by a factor of about 12 without perceptible loss of sound quality.
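The bit-rate arithmetic above can be restated as a quick calculation using the figures given in the text:

```python
sample_rate = 44100      # Hz
channels = 2
bits_per_sample = 16

# Raw CD-quality bit rate: 16 x 44100 x 2 = 1,411,200 bit/s.
raw_bps = bits_per_sample * sample_rate * channels

# Average bits per sample at the MP3 bit rates quoted in the text.
bits_at_128k = 128000 / (sample_rate * channels)
bits_at_256k = 256000 / (sample_rate * channels)

# Compression factor at 128 kbps relative to the raw stream.
factor = raw_bps / 128000
```

At 128 kbps the factor works out to about 11, in line with the "factor of about 12" quoted in the text.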
Despite these advantages, digital audio recorders and players are still in their infancy for several reasons, and analog audio recorders and players still form the majority of the market. Accordingly, it would be commercially attractive if digital audio signals could be recorded on analog signal storage media such as magnetic tapes, because users could then enjoy digital audio without buying new digital audio recorders and players.
Digital audio data are first decoded and then recorded on either track of a magnetic tape on which a forward track and a backward track are provided. That is, when the tape travels in the forward (backward) direction, the audio signals are recorded on the forward (backward) track. After recording of the audio signals on the forward track is completed, the tape begins to travel backward and the audio signals are recorded on the backward track. As a result, two full tape passes are needed to record the digital audio signals on a magnetic tape.
For fast recording, it is possible to encode analog audio signals which were reproduced backward, and then to decode and record the encoded signal on the tape during a single tape pass. However, this method has the weak points of requiring additional storage space for the encoded backward-reproduced signals on top of the encoded forward-reproduced signals, and of imperfect reproduction of the audio signals: because MP3 encoding exploits the masking phenomenon, a small amplitude that precedes a large amplitude in normal playback order is suppressed when the backward-reproduced signal is encoded.
3. Disclosure of Invention
It is a primary object of the present invention to provide a method of backward decoding MPEG digital audio data into analog audio data which makes it possible to record the decoded analog signal on analog signal storage media, such as magnetic tapes, at high speed with little increase in computational load and memory size.
To achieve this object, the present invention provides a method of backward decoding MPEG audio data into analog audio data, comprising the steps of: locating a header of a last frame of the compressed digital audio data; dequantizing a plurality of data blocks constituting the frame based on information contained in the located header; extracting time signals of each frequency subband from the dequantized data blocks while reducing discontinuities between the dequantized data blocks; and synthesizing the extracted time signals of all subbands backward into a real audio signal reversed in time.
According to the method of backward decoding MPEG audio data of the present invention, when MPEG audio data are to be recorded on a magnetic tape at high speed, the MPEG audio data can be decoded and recorded on both tracks of the magnetic tape simultaneously while the tape travels in one direction. Therefore, the backward decoding method according to the present invention enables fast recording of MPEG audio data on both tracks of the magnetic tape.
4. Brief Description of Drawings
The accompanying drawings, which are included to provide a further understanding of the invention, illustrate the preferred embodiment of this invention, and together with the description, serve to explain the principles of the present invention.
In the drawings: FIGS. 1 and 2 are block diagrams showing an MPEG audio encoder;
FIG. 3 shows the arrangement of the various bit fields in a frame of MPEG audio data;
FIG. 4 is a block diagram showing an MPEG audio decoder;
FIG. 5 is a schematic diagram showing an illustration of the bit reservoir within a fixed length frame structure;
FIG. 6 is a schematic diagram illustrating the overlap of inverse-modified-discrete-cosine-transformed blocks;
FIG. 7 is a flow graph showing a synthesis filterbank;
FIG. 8 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 7;
FIG. 9 is a block diagram of the flowchart of FIG. 8; FIG. 10 is a flow graph showing a synthesis filterbank for backward decoding according to the present invention;
FIG. 11 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 10; and FIG. 12 is a block diagram of the flowchart of FIG. 11.
5. Modes for Carrying out the Invention
The preferred embodiments of the present invention will be described hereinafter in detail with reference to the accompanying drawings. FIG. 4 shows a block diagram of an MP3 audio decoder to which an embodiment of the present invention is applied, comprising a demultiplexer 100 for dividing an MP3 audio bitstream into several data of different types; a side-information decoder 110 for decoding the side-information contained in the bitstream; a Huffman decoder 120 for Huffman-decoding the divided audio data; a dequantizer 130 for obtaining actual frequency energies from the Huffman-decoded data; an inverse MDCT (IMDCT) unit 140 for applying the IMDCT to the energies; and a synthesis filterbank 150 for synthesizing the subband values into PCM samples.
With reference to the MP3 audio decoder of FIG. 4, the method of backward decoding MP3-encoded audio data is described below step by step.
(1). Identifying the Frame Header
The first step in the backward decoding process of an MP3 bitstream is to find where decoding starts in the bitstream. In MPEG audio, frames are independent of each other, and consequently the first step is to locate a frame header in the bitstream, which requires knowing the frame length. All MPEG bitstreams are divided into separate chunks of bits called frames. There is a fixed number of frames per second for each MPEG format, which means that for a given bit rate and sampling frequency, each input frame has a fixed length and produces a fixed number of output samples.
In order to obtain the actual frame length, it is necessary to locate a frame header in the bitstream and to extract the required information from it, because the frame length depends on the bit rate and sampling frequency. Locating header information is done by searching for a synchronization bit pattern marked within the header. However, locating header information can fail, because some audio data may contain the same bit pattern as the synchronization pattern.
To alleviate this problem, on the assumption that neither the bit rate nor the sampling frequency changes within an MP3 audio clip, the demultiplexer 100 analyzes the first header in the stream and obtains the length of a frame having no padding bit based on information in the first header. Using this frame length, the header of the last frame is located by traversing the MP3 audio clip from the end.
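A minimal sketch of this last-header search is given below. It assumes the standard MPEG-1 Layer III frame-size formula (144 × bit rate / sampling rate bytes, plus one byte when padded) and the 11-bit sync pattern at the start of a header; it checks two candidate offsets because the last frame may or may not carry a padding byte:

```python
def frame_length(bitrate, sample_rate, padding):
    # MPEG-1 Layer III frame size in bytes: floor(144 * bitrate / sample_rate) + padding.
    return 144 * bitrate // sample_rate + padding

def has_sync(buf, off):
    # A header begins with 11 set sync bits: 0xFF followed by a byte whose top 3 bits are set.
    return 0 <= off and off + 1 < len(buf) and buf[off] == 0xFF and (buf[off + 1] & 0xE0) == 0xE0

def locate_last_header(buf, base_len):
    # The last header sits base_len bytes from the end, or one byte further if padded.
    for pad in (0, 1):
        off = len(buf) - (base_len + pad)
        if has_sync(buf, off):
            return off
    return None
```

For example, at 128 kbps and 44.1 kHz the unpadded frame length is 417 bytes, and the search examines offsets 417 and 418 bytes from the end of the clip.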
If a padding bit is added to a frame, the frame length is increased by 1 byte. That is, the frame length may change from frame to frame due to the padding bit. Because it is uncertain whether the last frame has a padding bit, searching for the header of the last frame requires examining whether the last frame header lies one frame length away from the end of the clip, or one more byte away.
(2). Obtaining Side-information
After the frame header is found, the demultiplexer 100 divides the input MP3 audio bitstream into side-information describing how the frame was encoded, scale factors specifying the gain of each frequency band, and Huffman-coded data. The side-information decoder 110 decodes the side-information so that the decoder knows what to do with the data contained in the frame.
The number of bits required for MP3 encoding at equal sound quality depends on the acoustic characteristics of the samples to be encoded. The coded data do not necessarily fit into a fixed-length frame in the coded bitstream. Based on this, MP3 uses the bit reservoir technique, whereby bits may be borrowed from previous frames in order to provide more bits to demanding parts of the input signal. To be specific, the encoder donates bits to a reservoir when it needs fewer than the average number of bits to code a frame. Later, when the encoder needs more than the average number of bits to code a frame, it borrows bits from the reservoir. The encoder can only borrow bits donated from past frames, within limits; it cannot borrow from future frames. On the decoder's side, the current frame being decoded may include audio data belonging to frames that will be presented subsequently. The starting byte of the audio data for the current frame is limited to at most 511 bytes before that frame. A 9-bit pointer included in each frame's side-information points to the location of the starting byte of the audio data for that frame, as shown in FIG. 5.
That is, the audio data for the current frame being decoded, i.e., the scale factors and Huffman-coded data, may be included in the data regions of previous frames that are within 511 bytes of that frame. When MP3 audio data are decoded forward, if it is determined that data belonging to the current frame contain data for subsequent frames, they are kept until the subsequent frames are decoded. On the other hand, in order to decode MP3 audio data backward, when the current frame is decoded it is checked whether decoding the current frame needs data contained in preceding frames, and if so, the data are obtained by identifying the headers of the preceding frames and the data belonging to those frames.
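A simplified model of resolving the 9-bit pointer can be sketched as follows. The flat byte strings are an illustrative assumption: in the real bitstream the main data of preceding frames are interleaved with headers and side-information, which must be skipped when counting back.

```python
def resolve_main_data(prev_main_data, main_data_begin, own_main_data):
    # main_data_begin counts back into the main data of preceding frames
    # (headers and side-information excluded); 0 means the frame's audio
    # data starts right after its own side-information.
    if main_data_begin == 0:
        return own_main_data
    return prev_main_data[-main_data_begin:] + own_main_data
```

In backward decoding, obtaining `prev_main_data` is the step that requires identifying the headers of the preceding frames, as described above.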
(3). Huffman Decoding
Once the audio data are obtained, the Huffman decoder 120 Huffman-decodes the audio data (including the data contained in preceding frames) based on the side-information and the Huffman trees which were constructed and used in the encoding process according to the data contents. This step is the same as in forward decoding. However, since a frame is encoded in two granules (granule 0 and granule 1) and the data of granule 0 must be decoded in order to locate granule 1, both granules must be decoded to output granule 1 in the backward decoding process, whereas in forward decoding the MP3-encoded data can be decoded sequentially from granule 0 to granule 1.
(4). Dequantizing and Descaling
When the Huffman decoder 120 has decoded the audio data, the data have to be dequantized by the dequantizer 130 and descaled using the scale factors into real spectral energy values. For example, if the Huffman-decoded value is Y, the real spectral energy value is obtained by multiplying Y^(4/3) by the scale factor.
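The dequantization rule just stated (sign preserved, magnitude raised to the 4/3 power, then scaled) can be sketched as below; the full standard applies additional per-band exponent terms, which are omitted here:

```python
def dequantize(y, scale_factor):
    # Real spectral value = sign(y) * |y|^(4/3) * scale_factor (a sketch of the
    # rule described in the text, not the complete ISO dequantization formula).
    sign = 1 if y >= 0 else -1
    return sign * abs(y) ** (4 / 3) * scale_factor
```

For instance, a Huffman-decoded value of 8 with a unity scale factor yields 8^(4/3) = 16.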
If the bitstream is a stereo signal, each channel can be transmitted separately in every frame, but transmission of the sum of and the difference between the two channels is often adopted to reduce the redundancy between them. If the bitstream was encoded in this way, the decoder has to perform stereo processing to recover the original two channels.
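This sum/difference stereo processing can be sketched as follows. The 1/√2 scaling shown is the usual mid/side convention and is assumed here (MP3 also supports an intensity-stereo mode, not shown):

```python
import math

def ms_encode(left, right):
    # Transmit the (scaled) sum and difference of the two channels.
    return (left + right) / math.sqrt(2), (left - right) / math.sqrt(2)

def ms_decode(mid, side):
    # Inverse transform performed by the decoder's stereo processing.
    return (mid + side) / math.sqrt(2), (mid - side) / math.sqrt(2)
```

When the two channels are similar, the difference signal is small and codes cheaply, which is the redundancy reduction mentioned above.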
(5). IMDCT (Inverse Modified Discrete Cosine Transform)
So far the signals have all been in the frequency domain; to synthesize the output samples, a transform is applied that is the reverse of the time-to-frequency transform used in the encoder.
In MPEG layer-3, the MDCT is used to obtain better frequency resolution than in the other layers. The MDCT is essentially a critically sampled DCT, implying that if no quantization had been done, the original signal would be reconstructed perfectly. However, because quantization is performed for each data block in the encoding process, discontinuities between data blocks inevitably occur. A single data block is the unit block of output samples of the decoder and corresponds to a granule in the inverse MDCT.
To avoid discontinuities between the granules, which would lead to perceptible noise and clicks, the inverse MDCT uses 50% overlap, i.e., every inverse-modified-discrete-cosine-transformed granule is overlapped with half of the previous transformed granule to smooth out any discontinuities. To be specific, the IMDCT produces 36 output samples in such a manner that the second-half 18 samples of the previous granule are added to the first-half 18 samples of the current granule, as shown in FIG. 6. For backward decoding, the order in which granules are added must be reversed, i.e., the second-half 18 samples of the current granule are added to the first-half 18 samples of the granule decoded just before it (the granule that follows it in playback order). For the last frame, which is decoded first in the backward decoding process, the second granule of that frame is added to zeros, or simply used without overlapping.
The IMDCT process in the forward decoding is expressed by the following equation.
x_i(n) = y_i(n) + y_{i-1}(n+18),  0 ≤ n < 18,  i = 1, 2, ..., 2N,

where x_i(n) is the target output sample, y_i(n) is the inverse-modified-discrete-cosine-transformed sample, i is the granule index, N is the total number of frames, and y_0(n+18) are all zeros for 0 ≤ n < 18.
The above equation must be changed into the following equation for the IMDCT process in backward decoding:

x_i(n) = y_i(n+18) + y_{i+1}(n),  0 ≤ n < 18,  i = 2N, 2N-1, ..., 1,

where y_{2N+1}(n) are all zeros for 0 ≤ n < 18. The overlapping procedure is the same as that of forward decoding, and therefore the computation and memory size needed are identical.
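The two overlap-add rules can be sketched as follows; the 36-sample granule (two 18-sample halves) is as described in the specification, and the functions are an illustrative model rather than a full IMDCT implementation:

```python
def overlap_add_forward(granules):
    # x_i(n) = y_i(n) + y_{i-1}(n+18): first half of the current granule plus
    # the second half of the previous one; y_0 is taken as all zeros.
    prev_tail = [0] * 18
    blocks = []
    for g in granules:                       # playback order
        blocks.append([g[n] + prev_tail[n] for n in range(18)])
        prev_tail = g[18:]
    return blocks

def overlap_add_backward(granules):
    # x_i(n) = y_i(n+18) + y_{i+1}(n): second half of the current granule plus
    # the first half of the granule decoded just before it; y_{last+1} is zeros.
    next_head = [0] * 18
    blocks = []
    for g in reversed(granules):             # presentation order for backward decoding
        blocks.append([g[18 + n] + next_head[n] for n in range(18)])
        next_head = g[:18]
    return blocks
```

Both directions compute the same overlapped sums at the interior granule boundaries, which is why the computation and memory requirements are identical.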
(6). Synthesis of Subband Signals
Once the transformed blocks are overlapped after the IMDCT process, the final step to obtain the output audio samples is to synthesize the 32 subband samples. The subband synthesis operation interpolates the 32 subband samples into audio samples in the time domain.
A subband synthesis filter needs the delayed inputs of previous frames, but in the case of backward decoding, subband samples are presented to the synthesis filter in the reverse of the forward decoding order. Therefore, a redesign of the MPEG standard synthesis filterbank is required to perform the backward decoding operation. The MPEG standard synthesis filterbank for forward decoding is described below in detail, and then the synthesis filterbank for backward decoding according to the present invention is explained in detail.
FIG. 7 shows a flow graph of an MPEG standard synthesis filterbank for forward decoding, whereby 32 subband samples are synthesized into a time series of audio samples in a way similar to frequency-division multiplexing. To be specific, the 32 subband samples x_r(mT_s1), each of which is critically sampled at a sampling period of T_s1, are synthesized into output samples s(nT_s2), which are critically sampled at a sampling period of T_s2 (= T_s1 / 32).
Here, x_r(mT_s1) is the r-th subband sample, and x_r(nT_s2) is x_r(mT_s1) up-sampled by 32, such that thirty-one zeros are inserted into the interval between (m-1)T_s1 and mT_s1. This up-sampling generates 31 images of the baseband centered at harmonics of the original sampling frequency, kf_s1 (k = 1, 2, ..., 31). That is, the sampling frequency is increased from f_s1 (= 1/T_s1) to f_s2 (= 1/T_s2) for the original subband sample x_r(mT_s1). For each subband, x_r(nT_s2) is processed by a band-pass filter H_r(z) to pass the signal belonging to the frequency band allocated to that filter. The band-pass filter has 512 taps and is constructed by phase-shifting a prototype low-pass filter. The flow graph of FIG. 7 is expressed by equation (1).
S_t(nT_s2) = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32t + n - k)T_s2) · H_r(kT_s2)
           = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32t + n - k)T_s2) · h(kT_s2) · N_r(k)     (1)

where N_r(k) = cos((2r + 1)(k + 16)π / 64), r is the subband index ranging from 0 to 31, n is the output sample index ranging from 0 to 31, and S_t(nT_s2) is the synthesized output sample at time t. That is, S_t(nT_s2) represents the synthesized output of the 32 subband samples x_r(tT_s1) at time t.
Equation (1) is the convolution of x_r(kT_s2) with H_r(kT_s2), a filter of 512 coefficients constructed as the product of the prototype low-pass filter h(kT_s2) and the phase-shifting term N_r(k).
Reduction of the number of computations, i.e., multiplications and additions, is possible in equation (1). By utilizing the symmetry property of the cosine terms and the zeros that were filled into x_r(kT_s2) at the time of up-sampling, equation (1) leads to equations (2) and (3). Hereinafter, the sampling period in the equations is omitted for convenience and is T_s2 if not explicitly expressed.

S_t(n) = Σ_{i=0}^{15} h(n + 32i) · (-1)^[i/2] · g_t(n + 64i + 32×(i%2))
       = Σ_{i=0}^{15} d(n + 32i) · g_t(n + 64i + 32×(i%2))     (2)

g_t(k + 64i) = Σ_{r=0}^{31} x_r(32t - 32i) · cos((2r + 1)(k + 16)π / 64)     (3)

where r is the subband index ranging from 0 to 31; n, i, and k are computation indices (n = 0, 1, 2, ..., 31; i = 0, 1, 2, ..., 15; k = 0, 1, 2, ..., 63); t represents the time when the subband sample is presented to the decoder; % is the modulo operator; and [x] represents the largest integer that is not greater than x.
For each subband, one sample is presented and multiplied by N_r(k), resulting in 64 values. These 64 values are stored in a 1024-sample FIFO (First In, First Out) buffer, the values already stored therein being shifted by 64. The 32 PCM output samples are then obtained by multiplying samples in the 1024-sample FIFO buffer by the coefficients of the time window.
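One step of the forward synthesis in equations (2) and (3) can be sketched as follows. The 512-coefficient windowed filter d(·) of the MPEG standard (table D[] in ISO/IEC 11172-3) is assumed to be supplied by the caller; the sketch only shows the matrixing, the 64-value FIFO shift, and the windowed summation:

```python
import math

def synthesis_step(fifo, subband_samples, d):
    # Matrix the 32 subband samples into 64 values g(k) = sum_r x_r * N_r(k),
    # with N_r(k) = cos((2r+1)(k+16)*pi/64), per equation (3).
    g = [sum(subband_samples[r] * math.cos((2 * r + 1) * (k + 16) * math.pi / 64)
             for r in range(32)) for k in range(64)]
    fifo[64:] = fifo[:-64]        # shift the 1024-value FIFO by 64
    fifo[:64] = g                 # newest 64 values in front
    # Windowed summation of equation (2): 16 taps per output sample.
    return [sum(d[n + 32 * i] * fifo[n + 64 * i + 32 * (i % 2)] for i in range(16))
            for n in range(32)]
```

For backward decoding, the same structure is reused with the FIFO shift direction and the summation order reversed, as described below for equations (7) and (8).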
The synthesis filterbank for backward decoding according to the present invention will be described below in detail with reference to the MPEG standard synthesis filterbank for forward decoding. It should be noted that for backward decoding, subband samples are presented to the decoder in the reverse of their playback order. For example, given N samples for each subband, while the forward decoder decodes the samples in increasing order (t = 0, 1, 2, ..., N-1), the samples have to be decoded in decreasing order (t = N-1, N-2, ..., 0) for backward decoding. Because the MPEG standard synthesis filterbank requires past samples for synthesizing PCM audio samples, if samples are presented in the reverse order to perform backward decoding, the MPEG standard synthesis filterbank cannot use the previous samples. As a result, the MPEG standard synthesis filterbank must be modified to perform backward decoding. Its structure is explained below.
FIG. 10 depicts a flow graph showing the synthesis filterbank for backward decoding according to the present invention, which is identical to the forward decoding synthesis filterbank except that H_r(z) is replaced by B_r(z). Note that x_r(mT_s1) is presented to the filterbank in decreasing order, i.e., m = N-1, N-2, ..., 1, 0.
Equation (1) is changed to equation (4) in accordance with the reverse of the presentation order.
S_t(nT_s2) = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r(((32t - 511) + n + k)T_s2) · H_r(kT_s2)
           = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32(t - 15) + (n - 31) + k)T_s2) · h(kT_s2) · N_r(k)     (4)
           = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32(t - 15) + (n - 31) + k)T_s2) · h(kT_s2) · cos((2r + 1)(k + 16)π / 64)

Compared to equation (1), the band-pass filter of each subband is the same, but the sequence of subband samples that are multiplied by the band-pass filter coefficients is reversed, i.e., x_r(((32t + n) - k)T_s2) is replaced by x_r(((32t + n) - 511 + k)T_s2).
The above equation is partially optimized to reduce the number of computations, resulting in the following equations .
S_t(n) = Σ_{i=0}^{15} h(31 - n + 32i) · (-1)^[i/2] · g_t(31 - n + 64i + 32×(i%2))
       = Σ_{i=0}^{15} d(31 - n + 32i) · g_t(31 - n + 64i + 32×(i%2))     (5)

g_t(63 - k + 64i) = Σ_{r=0}^{31} x_r(32(t - 15) + 32i) · cos((2r + 1)(63 - k + 16)π / 64)     (6)

where all indices are the same as those in equations (2) and (3).
Substituting j = 31 - n and m = 63 - k, equations (5) and (6) become the following equations.
S_t(j) = Σ_{i=0}^{15} d(j + 32i) · g_t(j + 64i + 32×(i%2))     (7)

g_t(m + 64i) = Σ_{r=0}^{31} x_r(32(t - 15) + 32i) · cos((2r + 1)(m + 16)π / 64)     (8)
Note that equations (7) and (8) are the same as equations (2) and (3) except the index of input samples.
The flowchart of an algorithm implementing equations (7) and (8) and the corresponding block diagram are shown in FIG. 11 and FIG. 12, respectively.
It should be noted that the synthesis filter for backward decoding is similar to the synthesis filter for forward decoding, and therefore the computation and memory size needed to implement it are identical. Accordingly, backward decoding can be performed with the forward decoding synthesis filter by reversing the direction in which the samples in the FIFO buffer are shifted as well as the order in which the subband samples are summed.
Output samples are produced in the reverse of their playback order in units of 32 samples, but within each unit the 32 samples are arranged in playback order. Accordingly, the backward decoder outputs the 32-sample units in reverse order. This is repeated frame by frame until the first frame of the MPEG audio data is reached. Note that the synthesis filter for backward decoding can be used in MPEG audio layer-1, layer-2, and layer-3.
Meanwhile, when MPEG audio data are to be recorded on a magnetic tape at high speed, it is desirable that the decoded audio data be recorded on the forward track of the magnetic tape and, at the same time, other decoded audio data be recorded on the backward track while the magnetic tape travels in a predetermined direction. To do this, the MPEG audio data that will be recorded on the backward track of the magnetic tape must be decoded backward in real time using the foregoing embodiment according to the present invention. Besides the foregoing embodiment, it is possible to use the conventional forward decoding algorithm for simultaneous recording of MPEG audio data on both tracks of a magnetic tape, as described below.
In one method, the MPEG audio data that will be recorded on the backward track are first decoded into PCM samples and stored in a buffer, retrieved from the end in backward order, and converted into analog audio signals. This method is very simple, but it needs a large buffer for temporarily storing the decoded audio data. Moreover, because the required buffer size depends on the length of the MPEG audio clip being decoded, it is difficult to fix the maximum size of the buffer in advance.
Another method is to apply the forward decoding algorithm to an MPEG audio bitstream in units of a predetermined number of frames and to record each decoded audio segment to the backward track in reverse order. For example, if an MPEG audio clip of N frames is to be decoded in units of M frames, frames from the (N-M)-th frame to the N-th frame are decoded, and then frames from the (N-M+1)-th frame to the N-th frame are recorded in reverse order. It should be noted that the (N-M)-th frame is not recorded, because its preceding frame, i.e., the (N-M-1)-th frame, is needed to obtain its complete decoded samples. Decoding the current frame requires only one preceding frame. For this reason, the decoded data from the N-th frame down to the (N-M+1)-th frame are valid, while the (N-M)-th frame is not.
Then, frames from the (N-2M)-th frame to the (N-M)-th frame are decoded, and frames from the (N-2M+1)-th frame to the (N-M)-th frame are recorded in reverse order. Note that the (N-M)-th frame is included again in the second decoding; this time it is decoded perfectly because it is decoded together with its preceding frame. The decoding-and-recording operation is repeated until all frames are decoded. The first frame, which is included in the block to be decoded last, is simply decoded and recorded because it has no preceding frame, as in forward decoding.
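The chunked decode-and-record scheme just described can be sketched as a schedule generator (frames are numbered 1 to N as in the text; the function is illustrative, not part of the standard):

```python
def backward_record_schedule(n_frames, m):
    # Yields (frames_to_decode, frames_to_record) per pass. Each pass decodes
    # one extra preceding frame so that the earliest recorded frame of the
    # pass is completely decoded; the first frame has no preceding frame.
    hi = n_frames
    while hi > 0:
        lo = max(hi - m, 0)
        decode = list(range(max(lo, 1), hi + 1))   # frames decoded this pass
        record = list(range(hi, lo, -1))           # recorded in reverse order
        yield decode, record
        hi = lo
```

Every frame is recorded exactly once, in reverse playback order, while only one chunk of decoded samples needs to be buffered at a time.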
Compared to the foregoing method, this method has the advantages that a small buffer, just large enough to store the samples of M frames, is sufficient and that the buffer size is fixed in advance. The backward decoding algorithm needs more memory than the forward decoding algorithm, but its number of computations is the same as that of the forward decoding algorithm. The memory needed is twice as large as that of forward decoding, because a whole frame must be Huffman-decoded at a time, unlike in the forward decoding algorithm, where the two blocks constituting a frame can be Huffman-decoded sequentially. Thus, the memory size amounts to 1152X2 words. Backward decoding performed by applying the forward decoding algorithm to every predetermined number of frames requires a buffer in which forward-decoded data are temporarily stored, but is easy to implement.
The foregoing is provided only for the purpose of illustration and explanation of the preferred embodiments of the present invention, so changes, variations and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A method of backward decoding compressed digital audio data consisting of a plurality of frames, comprising the steps of:
A. locating a header of a last frame of the compressed digital audio data;
B. dequantizing a plurality of data blocks constructing the frame based on information contained in the located header;
C. extracting time signals of each frequency subband from the dequantized data blocks, reducing discontinuities between the dequantized data blocks; and
D. synthesizing the extracted time signals of all subbands backward into real audio signal reversed in time.
2. A method according to claim 1, wherein said step A locates the header of the last frame based on information contained in the header of a first frame of the digital audio data.
3. A method according to claim 1, wherein said step A comprises the steps of:
A1. obtaining a frame size from information contained in a header of a first frame of the compressed digital audio data;
A2. estimating a location of the header of the last frame on the basis of the obtained frame size; and
A3. locating the header of the last frame around the estimated location depending on whether a padding bit is present or absent.
4. A method according to claim 1, wherein said step B comprises the steps of:
B1. obtaining side-information from the located frame header;
B2. identifying locations of data blocks belonging to the frame based on the obtained side-information; and
B3. dequantizing all of the identified data blocks belonging to the frame simultaneously.
5. A method according to claim 4, wherein said step B3 comprises the steps of:
Huffman-decoding all of the identified data blocks belonging to the frame; and
dequantizing and descaling the Huffman-decoded data blocks.
6. A method according to claim 1, wherein said step C reduces discontinuities between inverse modified discrete cosine transformed data blocks of a frame by overlapping a second inverse-modified-discrete-cosine-transformed data block and a first inverse-modified-discrete-cosine- transformed data block, the first data block being inputted just after the second data block.
7. A method according to claim 6, wherein the second data block of the frame is overlapped with a block filled with zeros.
8. A method of backward decoding compressed digital audio data consisting of a plurality of frames, comprising the steps of:
A. reading a header of a first frame of the compressed digital audio data;
B. obtaining information on the size of frames from information contained in the read header;
C. searching for a frame ahead of a just-decoded frame based on the obtained size information; and
D. decoding the frame discovered in said step C backward into real audio signal reversed in time.
9. A method according to claim 8, wherein said step D comprises the steps of:
D1. dequantizing a plurality of data blocks constructing the discovered frame;
D2. extracting time signals of each frequency subband from the dequantized data blocks, reducing discontinuities between the dequantized data blocks; and
D3. synthesizing the extracted time signals of all frequency subbands into real audio data whose output sequence is reverse to normal playback.
10. A method of backward decoding compressed digital audio data consisting of a plurality of frames, comprising the steps of:
A. searching for a frame ahead of a just-decoded frame based on pre-obtained frame size information;
B. decoding the frame discovered in said step A backward into real audio signal reversed in time; and
C. repeating said steps A and B while the frame decoded in said step B is not a start one of the compressed digital audio data.
11. A method of backward decoding compressed digital audio data consisting of a plurality of frames, comprising the steps of:
A. picking up a predetermined number of frames from the end to the beginning of the compressed digital audio data, wherein the predetermined number is smaller than the number of total frames;
B. decoding forwardly the picked up frames; and
C. outputting the decoded data in the order reverse to the decoding direction.
PCT/KR1999/000764 1999-02-24 1999-12-11 A backward decoding method of digital audio data WO2000051243A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2000601744A JP2002538503A (en) 1999-02-24 1999-12-11 Reverse decoding method for digital audio data
AU16934/00A AU1693400A (en) 1999-02-24 1999-12-11 A backward decoding method of digital audio data

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1999/6157 1999-02-24
KR1019990006157A KR100300887B1 (en) 1999-02-24 1999-02-24 A method for backward decoding an audio data

Publications (1)

Publication Number Publication Date
WO2000051243A1 true WO2000051243A1 (en) 2000-08-31

Family

ID=19574975

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR1999/000764 WO2000051243A1 (en) 1999-02-24 1999-12-11 A backward decoding method of digital audio data

Country Status (4)

Country Link
JP (1) JP2002538503A (en)
KR (1) KR100300887B1 (en)
AU (1) AU1693400A (en)
WO (1) WO2000051243A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101390551B1 (en) * 2012-09-24 2014-04-30 충북대학교 산학협력단 Method of low delay modified discrete cosine transform

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07334937A (en) * 1994-04-09 1995-12-22 Victor Co Of Japan Ltd Data recording method and disk medium
JP3190204B2 * 1994-04-12 2001-07-23 United Module Corporation MPEG standard audio signal decoder
JPH08293157A (en) * 1995-04-21 1996-11-05 Matsushita Electric Ind Co Ltd Recording and reproducing method for variable frame length high efficiency coded data
JPH09147496A (en) * 1995-11-24 1997-06-06 Nippon Steel Corp Audio decoder
US5835375A (en) * 1996-01-02 1998-11-10 Ati Technologies Inc. Integrated MPEG audio decoder and signal processor
JP3596978B2 * 1996-05-14 2004-12-02 Renesas Technology Corp. Audio playback device
JPH10112135A (en) * 1996-10-08 1998-04-28 Suzuki Motor Corp Disk reproducing device
US5893066A (en) * 1996-10-15 1999-04-06 Samsung Electronics Co. Ltd. Fast requantization apparatus and method for MPEG audio decoding

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0687111A2 (en) * 1994-06-06 1995-12-13 SICAN, GESELLSCHAFT FÜR SILIZIUM-ANWENDUNGEN UND CAD/CAT NIEDERSACHSEN mbH Method for coding and decoding a data stream
JPH10178349A (en) * 1996-12-19 1998-06-30 Matsushita Electric Ind Co Ltd Coding and decoding method for audio signal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 199, no. 811 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10284225B2 (en) 2000-10-03 2019-05-07 Realtime Data, Llc Systems and methods for data compression
US10212417B2 (en) 2001-02-13 2019-02-19 Realtime Adaptive Streaming Llc Asymmetric data decompression systems
WO2002086896A1 (en) * 2001-04-20 2002-10-31 Koninklijke Philips Electronics N.V. Method and apparatus for editing data streams
KR100892860B1 (en) * 2001-04-20 2009-04-15 코닌클리케 필립스 일렉트로닉스 엔.브이. Method and apparatus for editing data streams
WO2003036622A2 (en) * 2001-10-23 2003-05-01 Thomson Licensing S.A. Method and apparatus for decoding a coded digital audio signal which is arranged in frames containing headers
EP1308931A1 (en) * 2001-10-23 2003-05-07 Deutsche Thomson-Brandt Gmbh Decoding of a digital audio signal organised in frames comprising a header
WO2003036622A3 (en) * 2001-10-23 2003-10-16 Thomson Licensing Sa Method and apparatus for decoding a coded digital audio signal which is arranged in frames containing headers
CN1319044C (en) * 2001-10-23 2007-05-30 汤姆森许可贸易公司 Method and apparatus for decoding a coded digital audio signal which is arranged in frames containing headers
US7342944B2 (en) 2001-10-23 2008-03-11 Thomson Licensing Method and apparatus for decoding a coded digital audio signal which is arranged in frames containing headers
KR100944084B1 (en) * 2001-10-23 2010-02-24 톰슨 라이센싱 Method and apparatus for decoding a coded digital audio signal which is arranged in frames containing headers
JP2005531014A (en) * 2002-06-27 2005-10-13 サムスン エレクトロニクス カンパニー リミテッド Audio coding method and apparatus using harmonic components
US7610195B2 (en) 2006-06-01 2009-10-27 Nokia Corporation Decoding of predictively coded data using buffer adaptation

Also Published As

Publication number Publication date
JP2002538503A (en) 2002-11-12
AU1693400A (en) 2000-09-14
KR20000056661A (en) 2000-09-15
KR100300887B1 (en) 2001-09-26

Similar Documents

Publication Publication Date Title
US6446037B1 (en) Scalable coding method for high quality audio
EP1715476B1 (en) Low-bitrate encoding/decoding method and system
JP3970342B2 (en) Perceptual coding of acoustic signals
US7143047B2 (en) Time-scale modification of data-compressed audio information
EP1536410A1 (en) Method and apparatus for encoding/decoding MPEG-4 BSAC audio bitstream having ancillary information
KR100721499B1 (en) Digital signal processing apparatus and digital signal processing method
JPH08190764A (en) Method and device for processing digital signal and recording medium
US7792681B2 (en) Time-scale modification of data-compressed audio information
JP2006126826A (en) Audio signal coding/decoding method and its device
JP2003308098A (en) Method and apparatus for encoding/decoding digital information signal
JP3964860B2 (en) Stereo audio encoding method, stereo audio encoding device, stereo audio decoding method, stereo audio decoding device, and computer-readable recording medium
WO2000051243A1 (en) A backward decoding method of digital audio data
US5918205A (en) Audio decoder employing error concealment technique
US6463405B1 (en) Audiophile encoding of digital audio data using 2-bit polarity/magnitude indicator and 8-bit scale factor for each subband
US6038369A (en) Signal recording method and apparatus, recording medium and signal processing method
JP4470304B2 (en) Compressed data recording apparatus, recording method, compressed data recording / reproducing apparatus, recording / reproducing method, and recording medium
JPH06338861A (en) Method and device for processing digital signal and recording medium
JPH11330974A (en) Encoding method and device, decoding method and device, digital signal recording method and device, recording medium and digital transmitting method and device
JP3352401B2 (en) Audio signal encoding and decoding method and apparatus
JPH07193510A (en) Digital signal processor, digital signal processing method and recording medium
JP3141853B2 (en) Audio signal processing method
JP3200886B2 (en) Audio signal processing method
JP2002268687A (en) Device and method for information amount conversion
JP2004341384A (en) Digital signal recording/reproducing apparatus and its control program
JPH1083198A (en) Digital signal processing method and device therefor

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref country code: AU

Ref document number: 2000 16934

Kind code of ref document: A

Format of ref document f/p: F

AK Designated states

Kind code of ref document: A1

Designated state(s): AU BR CA CN DE ES GB IN JP RU US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 601744

Kind code of ref document: A

Format of ref document f/p: F

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase