D E S C R I P T I O N
A BACKWARD DECODING METHOD OF DIGITAL AUDIO DATA
1. Technical Field
The present invention relates to a method of decoding compressed digital audio data backward, and more particularly to a method of backward decoding MPEG (Moving Picture Experts Group) encoded audio data into an analog audio signal with little increase in computation load and memory size.
2. Background Art
Digital audio signals are in general more robust to noise than analog signals, and thus their quality is not subject to degradation during copying or transmission over a network. Moreover, thanks to effective compression methods developed recently, digital audio signals can be transmitted more rapidly and stored in storage media of smaller capacity.
Many compression methods have been proposed to encode audio signals into digital data effectively, and the MPEG (Moving Picture Experts Group) audio coding schemes have become the standard in this area. The MPEG audio standards, standardized by the ISO (International Organization for Standardization) as MPEG audio layer-1, layer-2, and layer-3, were devised to encode high-quality stereo audio signals with little or no perceptible loss of quality. They have been widely adopted in digital music broadcasting and are also used together with the MPEG video standards to encode multimedia data. In addition to MPEG-1, standard specifications for other digital environments have been proposed: MPEG-2 includes standards for the compression of multimedia data, and MPEG-4, which is in progress, covers object-oriented multimedia communication.
MPEG-1 consists of five coding standards for compressing and storing moving pictures and audio signals on digital storage media. The MPEG audio standard includes three audio coding methods: layer-1, layer-2, and layer-3. The MPEG audio layer-3 (hereinafter referred to as "MP3") algorithm takes a much more refined approach than layer-1 and layer-2 to achieve a higher compression ratio and better sound quality, as described briefly below. MPEG audio layers 1, 2, and 3 compress audio data using perceptual coding techniques that model how the human auditory system perceives sound waves. To be specific, they take advantage of the human auditory system's inability to hear quantization noise under conditions of auditory masking. "Masking" is a perceptual property
of the human ear that occurs whenever the presence of a strong audio signal makes weaker audio signals in its temporal or spectral neighborhood imperceptible. Suppose that a pianist plays the piano in front of an audience. While the pianist is not touching the keyboard, the audience can hear trailing sounds, but at the instant a key is struck the trailing sounds are no longer audible. This is because, in the presence of masking sounds (the newly generated sounds), trailing sounds that fall inside the frequency bands centered on the masking sound (the so-called critical bands) and whose loudness is below a masking threshold are not audible. This phenomenon is called the spectral masking effect. The masking ability of a given signal component depends on its frequency position and its loudness. The masking threshold is low in the frequency bands to which the human ear is most sensitive, i.e., 2KHz to 5KHz, but high in other frequency bands.
There is also a temporal masking phenomenon in the human auditory system. That is, after hearing a loud sound, it takes a period of time before one can hear a new sound that is not louder than it. For instance, it takes 5 milliseconds to be able to hear a new sound of 40 dB after hearing a sound of 60 dB for 5 milliseconds. The temporal delay time also depends on the frequency band.
Based on a psychoacoustic model of the human ear, the MP3 works by dividing the audio signal into frequency subbands that approximate critical bands, then quantizing each subband according to the audibility of quantization noise within that band, so that the quantization noise is inaudible due to the spectral and temporal masking.
The MP3 encoding process is described below in detail, step by step, with reference to FIGS. 1 and 2.
(1). Subband Coding and MDCT (Modified Discrete Cosine Transform)
In the MP3 encoder, the PCM-format audio signal is first windowed and converted into spectral subband components by a filter bank 10, shown in FIG. 1, which consists of 32 equally spaced bandpass filters. The filtered bandpass output signals are critically sub-sampled at 1/32 of the sampling rate and then encoded.
A polyphase filterbank is generally used to cancel the aliasing of adjacent overlapping bands that would otherwise occur because of the low sampling rate at the sub-sampling step. In addition, an MDCT (Modified Discrete Cosine Transform) unit 20 and an aliasing reduction unit 30 are adopted to cancel the aliasing, thereby preventing deterioration of the quality.
Because the MDCT is essentially a critically sampled DCT (Discrete Cosine Transform), the input PCM audio signal could be reconstructed perfectly in the absence of quantization errors. Since quantization is carried out, however, discontinuities between transformed blocks occur.
For each subband, the number of quantization bits is allocated taking into account the masking effect of neighboring subbands. That is, quantization and bit allocation are performed so as to keep the quantization noise in all critical bands below the masking threshold.
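For illustration only, the bit-allocation principle just described can be sketched as a toy greedy loop. This is an assumption for exposition, not the standard's actual allocation algorithm; the noise model, threshold values, and function names are all illustrative.

```python
def allocate_bits(noise_at_bits, thresholds, max_bits=16):
    # Toy illustration (not the MPEG algorithm): for each subband, add
    # quantization bits until the modeled noise falls below that band's
    # masking threshold, or a bit budget is exhausted.
    alloc = []
    for band, thr in enumerate(thresholds):
        bits = 0
        while noise_at_bits(band, bits) > thr and bits < max_bits:
            bits += 1
        alloc.append(bits)
    return alloc

# Assumed noise model: each extra bit halves the quantization noise
# amplitude (roughly 6 dB per bit).
noise = lambda band, bits: 1.0 / (2 ** bits)
print(allocate_bits(noise, [0.5, 0.1, 0.9]))  # [1, 4, 1]
```

Bands with a high masking threshold (loud maskers nearby) receive few bits, which is how perceptual coding saves data.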
(2). Scaling
Samples in each of the 32 subbands are normalized by a scale factor such that the sample of the largest magnitude becomes unity, and the scale factor is encoded for use in the decoder. This scaling compresses the amplitude of the signal, so the quantization noise is reduced and becomes inaudible owing to the psychoacoustic phenomena described above.
(3). Huffman Coding
Variable-length Huffman codes are used to obtain a better compression rate for the quantized samples. Huffman coding is a form of entropy coding, whereby redundancy reduction is carried out based on the statistical properties of the digital data. The principle behind Huffman coding is that short codewords are assigned to symbols with higher probability, while longer codewords are assigned to symbols with lower probability. In effect, the average length of the encoded data is made as small as possible.
Consider an example for illustration. The quantized samples are 00, 01, 10, and 11, with probabilities 0.6, 0.2, 0.1, and 0.1, respectively. If codewords of constant length, say 2 bits, are used, the average codeword length is 2X0.6 + 2X0.2 + 2X0.1 + 2X0.1 = 2 bits. However, if variable-length codewords are used, i.e., 1 bit is assigned to 00 (the highest probability), 2 bits to 01 (the second highest), and 3 bits each to 10 and 11, the average codeword length becomes 1X0.6 + 2X0.2 + 3X0.1 + 3X0.1 = 1.6 bits.
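The average-codeword-length computation above can be reproduced directly; the symbols, probabilities, and code lengths below are taken from the example in the text.

```python
# Average codeword length: fixed-length vs variable-length codes,
# using the probabilities from the example in the text.
probs = {"00": 0.6, "01": 0.2, "10": 0.1, "11": 0.1}

# Fixed-length coding: every symbol uses 2 bits.
fixed_avg = sum(p * 2 for p in probs.values())

# Variable-length (Huffman-style) lengths from the text:
# 1 bit for "00", 2 bits for "01", 3 bits for "10" and "11".
lengths = {"00": 1, "01": 2, "10": 3, "11": 3}
var_avg = sum(probs[s] * lengths[s] for s in probs)

print(round(fixed_avg, 2))  # 2.0
print(round(var_avg, 2))    # 1.6
```

The expected average length drops from 2 bits to 1.6 bits per symbol, a 20% reduction, because the shortest codeword is assigned to the most probable symbol.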
In addition, to achieve a high compression rate, MP3 adopts a bit reservoir buffering technique whereby unused bits from frames whose coded data are relatively small are used when the encoder needs more bits than the average number to code a frame. After the above processes, the audio signal is formatted into a bitstream. FIG. 3 shows the arrangement of the various fields in a frame of an MP3 encoded bitstream.
Without data reduction, digital audio signals typically consist of 16-bit samples recorded at a sampling rate more than twice the actual audio bandwidth (e.g., 32KHz, 44.1KHz, or 48KHz). For two-channel stereo audio signals at a sampling rate of 44.1KHz with 16 bits per sample, the bit rate is 16X44100X2 = 1411200 bits per second, or about 1.4 Mbps. With MP3 audio coding, the original sound data can be encoded at bit rates of 128 to 256 Kbps. That is, only 1.5 to 3 bits on average are needed per sample instead of 16 bits, so MP3 can shrink the original sound data of a CD-DA by a factor of about 12 without loss of sound quality.
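The bit-rate arithmetic above can be checked in a few lines; the sample rate, sample width, and channel count are the CD-DA figures given in the text, and 128 Kbps is one of the MP3 bit rates the text cites.

```python
# PCM bit rate vs MP3 bit rate, reproducing the arithmetic in the text.
sample_rate = 44_100      # Hz (CD-DA)
bits_per_sample = 16
channels = 2

pcm_bps = sample_rate * bits_per_sample * channels
print(pcm_bps)            # 1411200, i.e. about 1.4 Mbps

mp3_bps = 128_000         # one of the MP3 bit rates cited in the text
ratio = pcm_bps / mp3_bps
print(f"{ratio:.1f}")     # compression factor, roughly 11 at 128 Kbps
```

At 128 Kbps the factor is about 11; averaged over the cited 128 to 256 Kbps range and allowing for framing overhead, this is consistent with the "factor of about 12" figure in the text.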
Despite these advantages, digital audio recorders and players are still in their infancy for several reasons, and analog audio recorders and players remain the majority in the market. Accordingly, it would be commercially attractive if digital audio signals could be recorded on analog signal storage media such as magnetic tapes, because users could then enjoy digital audio without buying new digital audio recorders and players.
Digital audio data are first decoded and then recorded on either track of a magnetic tape on which a forward track and a backward track are provided. That is, when the tape travels in the forward (backward) direction, the audio signals are recorded on the forward (backward) track. After the audio signals have been recorded on the forward track, the tape begins to travel backward and the audio signals are recorded on the backward track. As a result, two tape passes are needed to record the digital audio signals on a magnetic tape.
For fast recording, it would be possible to encode analog audio signals that were reproduced backward, and then to decode and record the encoded signals on the tape during a single tape pass. However, this method has the drawbacks that additional storage space is required for the encoded backward-reproduced signals in addition to the encoded forward-reproduced signals, and that reproduction of the audio signals is imperfect because MP3 encoding exploits the masking phenomenon: a small amplitude that precedes a large amplitude in normal reproduction is suppressed when the backward-reproduced audio signal is encoded.
3. Disclosure of Invention
It is a primary object of the present invention to provide a method of backward decoding MPEG digital audio data into an analog audio signal which makes it possible to record the decoded analog signal on analog signal storage media such as magnetic tapes at a high speed with little increase in computation load and memory size.
To achieve this object, the present invention provides a method of backward decoding MPEG audio data into an analog audio signal, comprising the steps of: locating the header of the last frame of the compressed digital audio data; dequantizing a plurality of data blocks constituting the frame based on information contained in the located header; extracting time signals of each frequency subband from the dequantized data blocks while reducing discontinuities between the dequantized data blocks; and synthesizing the extracted time signals of all subbands backward into a real audio signal reversed in time.
According to the backward decoding method of the present invention, when MPEG audio data are to be recorded on a magnetic tape at high speed, the MPEG audio data can be decoded and recorded on both tracks of the magnetic tape simultaneously while the tape travels in one direction. Therefore, the backward decoding method according to the present invention enables fast recording of MPEG audio data on both tracks of the magnetic tape.
4. Brief Description of Drawings
The accompanying drawings, which are included to provide a further understanding of the invention, illustrate the preferred embodiment of this invention, and together with the description, serve to explain the principles of the present invention.
In the drawings:
FIGS. 1 and 2 are block diagrams showing an MPEG audio encoder;
FIG. 3 shows the arrangement of the various bit fields in a frame of MPEG audio data;
FIG. 4 is a block diagram showing an MPEG audio decoder;
FIG. 5 is a schematic diagram showing an illustration of the bit reservoir within a fixed length frame structure;
FIG. 6 is a schematic diagram illustrating the overlap of inverse-modified-discrete-cosine-transformed blocks;
FIG. 7 is a flow graph showing a synthesis filterbank;
FIG. 8 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 7;
FIG. 9 is a block diagram of the flowchart of FIG. 8;
FIG. 10 is a flow graph showing a synthesis filterbank for backward decoding according to the present invention;
FIG. 11 is a flowchart showing an algorithm implementing the synthesis filterbank of FIG. 10; and FIG. 12 is a block diagram of the flowchart of FIG. 11.
5. Modes for Carrying out the Invention
The preferred embodiments of the present invention will be described hereinafter in detail with reference to the accompanying drawings.
FIG. 4 shows a block diagram of an MP3 audio decoder to which an embodiment of the present invention is applied, comprising: a demultiplexer 100 for dividing an MP3 audio bitstream into several kinds of data; a side-information decoder 110 for decoding the side-information contained in the bitstream; a Huffman decoder 120 for Huffman-decoding the divided audio data; a dequantizer 130 for obtaining actual frequency energies from the Huffman-decoded data; an inverse MDCT (IMDCT) unit 140 for applying the IMDCT to the energies; and a synthesis filterbank 150 for synthesizing the subband values into PCM samples.
With reference to the MP3 audio decoder of FIG. 4, the method of backward decoding MP3-encoded audio data is described below step by step.
(1). Identifying the Frame Header
The first step in the backward decoding process of an MP3 bitstream is to find where decoding starts in the bitstream. In MPEG audio, frames are independent of each other, so the first step is to locate a frame header in the bitstream, which requires knowing the frame length. All MPEG bitstreams are divided into separate chunks of bits called frames. There is a fixed number of frames per second for each MPEG format, which means that for a given bit rate and sampling frequency, each input frame has a fixed length and produces a fixed number of output samples.
To obtain the actual frame length, it is necessary to locate a frame header in the bitstream and read the required information from it, because the frame length depends on the bit rate and sampling frequency. The header is located by searching for a synchronization bit pattern marked within it. However, this search can fail, because some audio data may contain the same bit pattern as the synchronization pattern.
To alleviate this problem, on the assumption that neither the bit rate nor the sampling frequency changes within an MP3 audio clip, the demultiplexer 100 analyzes the first header in the stream and obtains the length of a frame having no padding bit from the information in the first header. Using this frame length, the header of the last frame is located while traversing the MP3 audio clip from the end.
If a padding bit is added to a frame, the frame length is increased by 1 byte; that is, the frame length may change from frame to frame because of the padding bit. Because it is uncertain whether the last frame has a padding bit, the search for the header of the last frame must examine whether that header lies away from the end of the clip by the frame length or by one more byte.
(2). Obtaining Side-information
After the frame header is found, the demultiplexer 100 divides the input MP3 audio bitstream into the side-information, which describes how the frame was encoded; the scale factors, which specify the gain of each frequency band; and the Huffman-coded data. The side-information decoder 110 decodes the side-information so that the decoder knows what to do with the data contained in the frame.
The number of bits required for MP3 encoding at equal sound quality depends on the acoustic characteristics of the samples to be encoded, so the coded data do not necessarily fit into a fixed-length frame in the coded bitstream.
For this reason, MP3 uses a bit reservoir technique whereby bits may be borrowed from previous frames in order to provide more bits to demanding parts of the input signal. To be specific, the encoder donates bits to the reservoir when it needs fewer than the average number of bits to code a frame; later, when it needs more than the average number of bits, it borrows bits from the reservoir. The encoder can only borrow bits donated from past frames, and only within limits; it cannot borrow from future frames. On the decoder's side, the frame currently being decoded may contain audio data belonging to frames that will be presented subsequently. The starting byte of the audio data for the current frame is limited to at most 511 bytes before that frame. A 9-bit pointer included in each frame's side-information points to the location of the starting byte of the audio data for that frame, as shown in FIG. 5.
That is, the audio data for the frame currently being decoded, i.e., its scale factors and Huffman-coded data, may be located in the data regions of previous frames within 511 bytes of that frame. When MP3 audio data are decoded forward, if data belonging to the current frame are found to contain data for subsequent frames, they are kept until those subsequent frames are decoded. For backward decoding, on the other hand, when the current frame is decoded it is checked whether decoding it requires data contained in preceding frames; if so, those data are obtained by identifying the headers of the preceding frames and the data belonging to them.
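The 9-bit pointer just described can be illustrated as follows; the function name is a hypothetical helper, but the 0–511 range and the backward-offset interpretation follow the text and FIG. 5.

```python
def main_data_start(frame_offset, pointer):
    # The 9-bit pointer in a frame's side-information gives the number of
    # bytes (0..511) to step back from this frame's header to reach the
    # actual start of the frame's audio data, as shown in FIG. 5.
    assert 0 <= pointer <= 511, "pointer is a 9-bit field"
    return frame_offset - pointer

# A nonzero pointer means the frame's audio data begin inside earlier
# frames; a backward decoder must therefore locate and read those
# preceding frames before it can decode the current one.
print(main_data_start(10_000, 300))  # 9700
print(main_data_start(10_000, 0))    # 10000: data start at the frame itself
```

This is exactly why backward decoding must walk back through preceding frame headers whenever the pointer is nonzero, whereas forward decoding simply keeps already-read bytes buffered.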
(3). Huffman Decoding
Once the audio data have been obtained, the Huffman decoder 120 Huffman-decodes the audio data (including the data contained in the preceding frames) based on the side-information and the Huffman trees that were constructed and used in the encoding process according to the data contents. This step is the same as in forward decoding. However, since a frame is encoded as two granules (granule 0 and granule 1) and the data of granule 0 must be decoded in order to locate granule 1, the MP3-encoded data can be decoded sequentially from granule 0 to granule 1 in forward decoding, whereas in backward decoding both granules must be decoded in order to output granule 1.
(4). Dequantizing and Descaling
When the Huffman decoder 120 has decoded the audio data, the data have to be dequantized by the dequantizer 130 and descaled, using the scale factors, into real spectral energy values. For example, if the Huffman-decoded value is Y, the real spectral energy value is obtained by raising Y to the power 4/3 and multiplying by the scale factors.
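A minimal sketch of this descaling step follows; the sign handling for negative Huffman-decoded values is an assumption added so the power law stays real-valued, and the scale-factor value used is purely illustrative.

```python
def dequantize(y, scale):
    # Sketch of the step described in the text: the Huffman-decoded
    # integer Y is raised to the power 4/3 (sign preserved, an added
    # assumption) and multiplied by the decoded scale-factor gain.
    sign = -1.0 if y < 0 else 1.0
    return sign * (abs(y) ** (4.0 / 3.0)) * scale

print(round(dequantize(8, 0.5), 6))   # 8.0, since 8^(4/3) = 16
print(round(dequantize(-8, 1.0), 6))  # -16.0
```

The 4/3 exponent undoes the 3/4 power-law compression applied before quantization in the encoder, so the recovered value is the spectral energy up to the quantization error.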
If the bitstream is a stereo signal, each channel can be transmitted separately in every frame, but the sum of and the difference between the two channels are often transmitted instead to reduce the redundancy between them. If the bitstream was encoded in this way, the decoder has to perform stereo processing to recover the original two channels.
(5). IMDCT (Inverse Modified Discrete Cosine Transform)
So far the signals have all been in the frequency domain. To synthesize the output samples, a transform is applied that is the reverse of the time-to-frequency transform used in the encoder.
In MPEG layer-3, the MDCT is used to obtain better frequency resolution than in the other layers. The MDCT is essentially a critically sampled DCT, implying that if no quantization had been done, the original signal would be reconstructed perfectly. However, because quantization is performed on each data block in the encoding process, discontinuities between data blocks inevitably occur. A single data block is the unit block of output samples of the decoder and corresponds to a granule in the inverse MDCT.
To avoid discontinuities between granules, which would lead to perceptible noise and clicks, the inverse MDCT uses 50% overlap; i.e., every inverse-modified-discrete-cosine-transformed granule is overlapped with half of the previously transformed granule to smooth out any discontinuities. To be specific, the IMDCT produces 36 output samples, and the second half (18 samples) of the previous granule is added to the first half (18 samples) of the current granule, as shown in FIG. 6. For backward decoding, the order in which granules are added must be reversed; i.e., the second half (18 samples) of the current granule is added to the first half (18 samples) of the preceding granule. For the end frame, which is decoded first in the backward decoding process, the second granule of that frame is added with zeros or simply used without overlapping.
The IMDCT process in forward decoding is expressed by the following equation:

x_i(n) = y_i(n) + y_(i-1)(n+18),  0 <= n < 18,  i = 1, 2, ..., 2N,

where x_i(n) is a target output sample, y_i(n) is an inverse-modified-discrete-cosine-transformed sample, i is the granule index, N is the total number of frames, and y_0(n+18) are all zeros for 0 <= n < 18.
The above equation must be changed into the following equation for the IMDCT process in backward decoding:

x_i(n) = y_i(n+18) + y_(i+1)(n),  0 <= n < 18,  i = 2N, 2N-1, ..., 1,

where y_(2N+1)(n) are all zeros for 0 <= n < 18. The overlapping procedure is the same as in forward decoding, and therefore the computation and memory size needed are identical.
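The forward and backward overlap-add rules can be sketched as follows, on 36-sample granules split into 18-sample halves as in the text. This is an illustrative sketch of the two equations only, not a full IMDCT; the function names are assumptions.

```python
def forward_overlap(granules):
    # x_i(n) = y_i(n) + y_(i-1)(n+18); the first granule is overlapped
    # with zeros. Granules arrive in playback order.
    prev_tail = [0.0] * 18
    out = []
    for y in granules:
        out.append([y[n] + prev_tail[n] for n in range(18)])
        prev_tail = y[18:]          # keep second half for the next granule
    return out

def backward_overlap(granules):
    # x_i(n) = y_i(n+18) + y_(i+1)(n); the granule after the last one is
    # taken as zeros. Granules are consumed in reverse presentation order.
    next_head = [0.0] * 18
    out = []
    for y in reversed(granules):
        out.append([y[18 + n] + next_head[n] for n in range(18)])
        next_head = y[:18]          # keep first half for the next step
    return out
```

Both directions keep exactly one 18-sample half in memory between steps, which is the basis for the text's observation that computation and memory requirements are identical.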
(6). Synthesis of Subband Signals
Once the transformed blocks have been overlapped in the IMDCT process, the final step in obtaining the output audio samples is to synthesize the 32 subband samples. The subband synthesis operation interpolates the 32 subband samples into audio samples in the time domain.
The subband synthesis filter needs the delayed inputs of previous frames, but in backward decoding the subband samples are presented to the synthesis filter in the reverse of the forward-decoding order. Therefore, the MPEG standard synthesis filterbank must be redesigned to perform the backward decoding operation. The MPEG standard synthesis filterbank for forward decoding is described below in detail, and then the synthesis filterbank for backward decoding according to the present invention is explained in detail.
FIG. 7 shows a flow graph of the MPEG standard synthesis filterbank for forward decoding, whereby 32 subband samples are synthesized into a time series of audio samples in a way similar to frequency-division multiplexing. To be specific, the 32 subband samples x_r(mT_s1), each of which is critically sampled with a sampling period of T_s1, are synthesized into output samples s(nT_s2), which are critically sampled with a sampling period of T_s2 (= T_s1/32).
Here, x_r(mT_s1) is the r-th subband sample, and x_r(nT_s2) is x_r(mT_s1) up-sampled by 32, i.e., thirty-one zeros are inserted into the interval between (m-1)T_s1 and mT_s1. This up-sampling generates 31 images of the baseband centered at harmonics of the original sampling frequency, k·f_s1 (k = 1, 2, ..., 31); that is, the sampling frequency is increased from f_s1 (= 1/T_s1) to f_s2 (= 1/T_s2) for the original subband samples x_r(mT_s1). For each subband, x_r(nT_s2) is processed by a band-pass filter H_r(z) which passes the signal belonging to the frequency band allocated to that filter. The band-pass filter has 512 taps and is constructed by phase-shifting a prototype low-pass filter.
The flow graph of FIG. 7 is expressed by equation (1):

S_t(nT_s2) = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32t + n - k)T_s2) · H_r(kT_s2)
           = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32t + n - k)T_s2) · h(kT_s2) · N_r(k)     (1)

where r is the subband index ranging from 0 to 31, n is the output sample index ranging from 0 to 31, and S_t(nT_s2) is the synthesized output sample at time t; that is, S_t(nT_s2) represents the output sample synthesized from the 32 subband samples x_r(tT_s1) at time t.
Equation (1) is the convolution of x_r(kT_s2) with H_r(kT_s2), which has 512 coefficients and is constructed as the product of the prototype low-pass filter h(kT_s2) and the phase-shifting term N_r(k).
The number of computations, i.e., multiplications and additions, in equation (1) can be reduced. By exploiting the symmetry of the cosine terms and the zeros filled into x_r(kT_s2) during up-sampling, equation (1) leads to equations (2) and (3). Hereinafter, the sampling period in the equations is omitted for convenience and is T_s2 unless explicitly expressed.
S_t(n) = Σ_{i=0}^{15} h(n + 32i) · (-1)^[i/2] · g_t(n + 64i + 32·(i%2))
       = Σ_{i=0}^{15} d(n + 32i) · g_t(n + 64i + 32·(i%2))     (2)

g_t(k + 64i) = Σ_{r=0}^{31} x_r(32t - 32i) · cos((2r + 1)(k + 16)π / 64)     (3)

where r is the subband index ranging from 0 to 31; n, i, and k are computation indices (n = 0, 1, ..., 31; i = 0, 1, ..., 15; k = 0, 1, ..., 63); t represents the time when the subband sample is presented to the decoder; % is the modulo operator; and [x] represents the largest integer that is not greater than x.
For each subband, one sample is presented and multiplied by N_r(k), resulting in 64 samples. The 64 samples are stored in a 1024-sample FIFO (First In, First Out) buffer, the samples already stored therein being shifted by 64. The 32 PCM output samples are then obtained by multiplying the samples in the 1024-sample FIFO buffer by the coefficients of the time window.
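The forward procedure just described, matrixing into a shifting 1024-sample FIFO followed by the windowed sum of equation (2), can be sketched as follows. The 512-tap synthesis window d(n) is defined in the MPEG standard and is not reproduced here; an all-zero window is substituted so the sketch stays self-contained (so this sketch exercises the indexing, not real audio output). The class and method names are assumptions.

```python
import math

class SynthesisFilterbank:
    # Sketch of the forward synthesis of equations (2)-(3).
    def __init__(self, d=None):
        # d(n): 512-tap prototype window from the standard; zeros are a
        # stand-in here, so process() returns silence by construction.
        self.d = d if d is not None else [0.0] * 512
        self.v = [0.0] * 1024          # FIFO of matrixed samples

    def process(self, subbands):       # 32 subband samples x_r at time t
        # Shift the FIFO by 64 and matrix the new samples in:
        # v[k] = sum_r x_r * cos((2r+1)(k+16)pi/64), k = 0..63.
        self.v = [0.0] * 64 + self.v[:960]
        for k in range(64):
            self.v[k] = sum(s * math.cos((2 * r + 1) * (k + 16) * math.pi / 64)
                            for r, s in enumerate(subbands))
        # Windowed sum of equation (2):
        # S(n) = sum_i d(n + 32i) * v[n + 64i + 32*(i%2)], i = 0..15.
        return [sum(self.d[n + 32 * i] * self.v[n + 64 * i + 32 * (i % 2)]
                    for i in range(16)) for n in range(32)]
```

Each call consumes one 32-sample subband vector and produces 32 PCM samples; the 1024-entry FIFO holds the 16 most recent matrixed blocks, matching the x_r(32t - 32i) history required by equation (3).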
The synthesis filterbank for backward decoding according to the present invention will now be described in detail with reference to the MPEG standard synthesis filterbank for forward decoding.
It should be noted that for backward decoding, the subband samples are presented to the decoder in the reverse of their playback order. For example, given N samples for each subband, while the forward decoder decodes the samples in increasing order (t = 0, 1, 2, ..., N-1), the samples have to be decoded in decreasing order (t = N-1, N-2, ..., 0) for backward decoding. Because the MPEG standard synthesis filterbank requires past samples for synthesizing PCM audio samples, if the samples are presented in reverse order to perform backward decoding, the MPEG standard synthesis filterbank cannot use the previous samples. As a result, the MPEG standard synthesis filterbank must be modified to perform backward decoding. Its structure is explained below.
FIG. 10 depicts a flow graph of the synthesis filterbank for backward decoding according to the present invention, which is identical to the forward-decoding synthesis filterbank except that H_r(z) is replaced by B_r(z). Note that x_r(mT_s1) is presented to the filterbank in decreasing order, i.e., m = N-1, N-2, ..., 1, 0.
Equation (1) is changed to equation (4) in accordance with the reversed presentation order:

S_t(nT_s2) = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r(((32t - 511) + n + k)T_s2) · H_r(kT_s2)
           = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32(t - 15) + (n - 31) + k)T_s2) · h(kT_s2) · N_r(k)     (4)
           = Σ_{r=0}^{31} Σ_{k=0}^{511} x_r((32(t - 15) + (n - 31) + k)T_s2) · h(kT_s2) · cos((2r + 1)(k + 16)π / 64)
Compared to equation (1), the band-pass filter of each subband is the same, but the sequence of subband samples multiplied by the band-pass filter coefficients is reversed; i.e., x_r(((32t + n) - k)T_s2) is replaced by x_r(((32t + n) - 511 + k)T_s2).
The above equation is partially optimized to reduce the number of computations, resulting in the following equations:
S_t(n) = Σ_{i=0}^{15} h(31 - n + 32i) · (-1)^[i/2] · g_t(31 - n + 64i + 32·(i%2))
       = Σ_{i=0}^{15} d(31 - n + 32i) · g_t(31 - n + 64i + 32·(i%2))     (5)

g_t(63 - k + 64i) = Σ_{r=0}^{31} x_r(32(t - 15) + 32i) · cos((2r + 1)(63 - k + 16)π / 64)     (6)

where all indices are the same as those in equations (2) and (3).
Substituting j = 31 - n and m = 63 - k, equations (5) and (6) become:

S_t(j) = Σ_{i=0}^{15} d(j + 32i) · g_t(j + 64i + 32·(i%2))     (7)

g_t(m + 64i) = Σ_{r=0}^{31} x_r(32(t - 15) + 32i) · cos((2r + 1)(m + 16)π / 64)     (8)
Note that equations (7) and (8) are the same as equations (2) and (3) except for the indices of the input samples. A flowchart of an algorithm implementing equations (7) and (8) and a block diagram thereof are shown in FIG. 11 and FIG. 12, respectively.
It should be noted that the synthesis filter for backward decoding is similar to the synthesis filter for forward decoding, and therefore the computation and memory size needed to implement it are identical. Accordingly, backward decoding can be performed with the forward-decoding synthesis filter by reversing the direction in which the samples in the FIFO buffer are shifted as well as the order in which the subband samples are summed.
The output samples are produced in the reverse of their playback order in units of 32 samples, but each group of 32 samples is internally arranged in playback order. Accordingly, the backward decoder outputs the 32-sample groups of each frame in reverse order, and this is repeated frame by frame back to the first frame of the MPEG audio data. Note that the synthesis filter for backward decoding can be used in MPEG audio layer-1, layer-2, and layer-3.
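The output ordering just described can be illustrated with a small driver. The synthesis step is abstracted away behind a callback (an identity function stands in for it here), so this sketch shows only the block ordering, with hypothetical function names.

```python
def backward_decode_blocks(subband_blocks, synthesize):
    # Sketch of the ordering described in the text: blocks are decoded
    # from last to first; each synthesis call yields a group of output
    # samples already in playback order, and the groups themselves are
    # emitted in the reverse of playback order.
    out_blocks = []
    for block in reversed(subband_blocks):   # last block first
        out_blocks.append(synthesize(block))
    return out_blocks

# Toy "synthesis": identity, to show the ordering only.
blocks = [[1, 2], [3, 4], [5, 6]]
print(backward_decode_blocks(blocks, lambda b: b))  # [[5, 6], [3, 4], [1, 2]]
```

Note that the sample order inside each group is untouched; only the group order is reversed, matching the text's "reverse order in units of 32 samples".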
Meanwhile, when MPEG audio data are to be recorded on a magnetic tape at high speed, it is desirable that the decoded audio data be recorded on the forward track of the magnetic tape while, at the same time, other decoded audio data are recorded on the backward track as the magnetic tape travels in a predetermined direction. To do this, the MPEG audio data to be recorded on the backward track of the magnetic tape must be decoded backward in real time using the foregoing embodiment of the present invention. Besides the foregoing embodiment, it is possible to use the conventional forward decoding algorithm for simultaneous recording of MPEG audio data on both tracks of a magnetic tape, as described below.
In one method, the MPEG audio data to be recorded on the backward track are first decoded into PCM samples and stored in a buffer, then retrieved from the end in backward order and converted into analog audio signals. This method is very simple, but it needs a large buffer for temporarily storing the decoded audio data. Moreover, because the required buffer size depends on the length of the MPEG audio clip being decoded, it is difficult to fix the maximum size of the buffer in advance.
Another method is to apply the forward decoding algorithm to an MPEG audio bitstream in units of a predetermined number of frames and to record each decoded audio segment on the backward track in reverse order. For example, if an MPEG audio clip of N frames is to be decoded in units of M frames, the blocks from the (N-M)-th frame to the N-th frame are decoded, and then the blocks from the (N-M+1)-th frame to the N-th frame are recorded in reverse order. It should be noted that the (N-M)-th frame is not recorded, because the preceding frame, i.e., the (N-M-1)-th frame, is needed to obtain its complete decoded samples. Decoding a frame completely requires only one preceding frame; for this reason, the data blocks from the N-th frame down to the (N-M+1)-th frame are valid, while the (N-M)-th frame is not.
Next, the blocks from the (N-2M)-th frame to the (N-M)-th frame are decoded, and then the blocks from the (N-2M+1)-th frame to the (N-M)-th frame are recorded in reverse order. Note that the (N-M)-th frame is included again in this second decoding pass; this time it is decoded perfectly because it is decoded together with its preceding frame. The decoding-and-recording operation is repeated until all frames are decoded. The first frame, which is included in the block decoded last, is simply decoded and recorded because it has no preceding frame, as in forward decoding.
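The chunked schedule just described can be sketched as a scheduling function over frame indices (0-based here, whereas the text counts from 1); the function name and return structure are illustrative assumptions.

```python
def backward_record_chunks(n_frames, m):
    # Sketch of the chunked method: decode M-frame blocks from the tail
    # of an N-frame clip. In each pass, the first decoded frame is only
    # a warm-up predecessor and is re-decoded (and recorded) in the next
    # pass; frame 0 has no predecessor and is recorded directly.
    chunks = []
    end = n_frames
    while end > 0:
        start = max(end - m, 0)
        decoded = list(range(start, end))       # frames decoded this pass
        valid = decoded if start == 0 else decoded[1:]
        chunks.append(list(reversed(valid)))    # recorded in reverse order
        end = start + (0 if start == 0 else 1)  # overlap passes by one frame
    return chunks

print(backward_record_chunks(6, 3))  # [[5, 4], [3, 2], [1, 0]]
```

Flattening the returned chunks yields every frame exactly once, in strictly decreasing order, which is the recording order needed for the backward track.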
Compared to the foregoing method, this method has the advantages that a small buffer large enough to store the decoded samples of M frames is sufficient and that the buffer size is fixed in advance.
The backward decoding algorithm needs more memory than the forward decoding algorithm, but its number of computations is the same. The memory needed is twice as large as in forward decoding because a whole frame must be Huffman-decoded at a time, unlike the forward decoding algorithm, where the two blocks constituting a frame can be Huffman-decoded sequentially; thus, the memory size amounts to 1152X2 words. Backward decoding performed by applying the forward decoding algorithm to every predetermined number of frames requires a buffer in which the forward-decoded data are temporarily stored, but it is easy to implement.
The foregoing is provided only for the purpose of illustrating and explaining the preferred embodiments of the present invention, and changes, variations, and modifications may be made without departing from the spirit and scope of the invention.