EP1228507B1 - Method of reducing memory requirements in an AC-3 audio encoder - Google Patents
Method of reducing memory requirements in an AC-3 audio encoder
- Publication number
- EP1228507B1 (application EP99954578A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- channel
- psd
- bit
- memory
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
Definitions
- the present invention relates to a method of reducing memory requirements in an encoder particularly, but not exclusively, an AC-3 encoder.
- AC-3 is a transform-based audio coding algorithm designed to provide data-rate reduction for wide-band signals while maintaining the high quality of the original content.
- AC-3 soundtracks can be found on the latest generation of laser discs and as the standard audio track on Digital Versatile Discs (DVD). AC-3 is also the standard audio format for High Definition Television (HDTV) and is being used for digital cable and satellite transmissions.
- Chip cost is dictated by parameters such as chip-area, memory and the more recently popularised low power consumption.
- Decreasing memory requirements for a complex algorithm such as the AC-3 encoder means that the algorithm must be deeply analysed and re-structured such that the quality of encoding is preserved but the quantity of storage area is decreased.
- U.S. 5,687,282 discloses a method for reducing memory requirements for an encoder, by successively overwriting various functions used for encoding.
- the present invention seeks to provide a method, particularly for an AC-3 encoder, which further reduces memory requirements.
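- a method of reducing memory requirements for an encoder which includes a function of bit allocation for quantising frequency coefficients of an input signal, including: calculating a power spectral density (PSD); integrating the PSD over a multiplicity of frequency bands to form a band PSD; calculating an excitation function by applying a prototype masking curve to the band PSD; generating a first-step masking curve (FSMC) of a noise level threshold from the excitation function; and calculating a second-step masking curve (SSMC) by offsetting the FSMC according to a selected signal-to-noise variable (snroffst). The excitation function, once calculated, is written to memory in place of the band PSD and is subsequently overwritten by the FSMC, while the SSMC is stored in temporary memory and recomputed for each data block processed by the encoder.
- the method includes allocation of bits for coding mantissa values of the frequency coefficients, the bit allocation being determined on the basis of the PSD, SSMC and said variable, with bit allocation pointers being generated and stored for a current data block only.
- bit allocation is followed by quantisation of the mantissa values according to the bit allocation pointers and packing of the quantised values into a data frame, the quantised values being stored in memory in place of the original unquantised coefficients.
- the frequency coefficients are initially separated into exponent and mantissa components, and the exponent components are overwritten with the PSD.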
- AC-3 is fundamentally an adaptive transform-based coder using a frequency-linear, critically sampled filterbank based on the Princen Bradley Time Domain Aliasing Cancellation (TDAC) technique.
- AC-3 is a frame based encoder. Each frame contains information equivalent to 256x6 PCM (pulse code modulated) samples per channel. For coding convenience the frame is divided into six audio blocks, each block therefore containing information of 256 samples per channel.
- Transients are detected in the full-bandwidth channels in order to decide when to switch to short length audio blocks for restricting quantization noise associated with the transient within a small temporal region about the transient.
- High-pass filtered versions of the signals are examined for an increase in energy from one sub-block time segment to the next.
- Sub-blocks are examined at different time scales. If a transient is detected in the second half of an audio block in a channel, that channel switches to a short block. In presence of transient the bit 'blksw' for the channel in the encoded bit stream in the particular audio block is set.
- The transient detector operates on 512 samples for every audio block.
- The AC-3 encoder uses an overlap technique: although each block contains only 256 samples, when the block is presented for transient detection (or frequency transformation) the previous block is prefixed to it, which produces a total of 512 samples.
- Each channel's time domain input signal is windowed and filtered with a TDAC-based analysis filter bank to generate frequency domain coefficients. If the blksw bit is set, meaning that a transient was detected for the block, two short transforms of length 256 each are taken, which increases the temporal resolution of the signal. If not set, a single long transform of length 512 is taken, thereby providing a high spectral resolution.
- The output frequency coefficient X_k is defined by an equation (not reproduced in this text) in which x[n] is the windowed input sequence for a channel and N is the transform length. Instead of evaluating X_k directly in that form, it can be computed in a computationally efficient manner using a second formulation (also not reproduced in this text), in which the symbol j represents the imaginary unit, √-1.
- High compression can be achieved in AC-3 by use of a technique known as coupling.
- Coupling takes advantage of the way the human ear determines directionality for very high frequency signals.
- The encoder combines the high frequency coefficients of the individual channels to form a common coupling channel.
- The original channels combined to form the coupling channel are called the coupled channels.
- The most basic encoder can form the coupling channel by simply taking the average of all the individual channel coefficients.
- A more sophisticated encoder could alter the signs of the individual channels before adding them into the sum, to avoid phase cancellation.
- The generated coupling channel is next sectioned into a number of bands. For each such band and each coupled channel, a coupling co-ordinate is transmitted to the decoder. To obtain the high frequency coefficients in any band for a particular coupled channel, the decoder multiplies the coupling channel coefficients in that frequency band by the coupling co-ordinate of that channel for that band. For a dual channel encoder, phase correction information is also sent for each frequency band of the coupling channel. Assume that the frequency domain coefficients are identified as: (the definitions are not reproduced in this text).
- An additional process, rematrixing, is invoked in the special case that the encoder is processing two channels only.
- The sum and difference of the two signals from each channel are calculated on a band by band basis, and if, in a given band, the level disparity between the derived (matrixed) signal pair is greater than the corresponding level disparity of the original signal pair, the matrixed pair is chosen instead.
- More bits are provided in the bit stream to indicate this condition, in response to which the decoder performs a complementary unmatrixing operation to restore the original signals.
- The rematrix bits are omitted if there are more than two coded channels.
- This technique avoids directional unmasking if the decoded signals are subsequently processed by a matrix surround processor, such as a Dolby Pro Logic decoder.
- Rematrixing is performed independently in separate frequency bands. There are four bands, with boundary locations dependent on the coupling information. The boundary locations are given by coefficient bin number, so the corresponding rematrixing band frequency boundaries change with sampling frequency.
- The coefficient values, which may have undergone the rematrixing and coupling processes, are converted to a specific floating point representation, resulting in separate arrays of exponents and mantissas. This floating point arrangement is maintained throughout the remainder of the coding process, until just prior to the decoder's inverse transform. It provides 144 dB of dynamic range and allows AC-3 to be implemented on either fixed or floating point hardware.
- Coded audio information consists essentially of separate representation of the exponent and mantissas arrays. The remaining coding process focuses individually on reducing the exponent and mantissa data rate.
- the exponents are coded using one of the exponent coding strategies.
- Each mantissa is truncated to a fixed number of binary places.
- the number of bits to be used for coding each mantissa is to be obtained from a bit allocation algorithm which is based on the masking property of the human auditory system.
- Exponent values in AC-3 are allowed to range from 0 to -24.
- the exponent acts as a scale factor for each mantissa.
- Exponents for coefficients which have more than 24 leading zeros are fixed at -24 and the corresponding mantissas are allowed to have leading zeros.
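The exponent and mantissa split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the coefficient is assumed to be a signed 32-bit fraction (Q31), the exponent is stored as a positive leading-zero count (the text writes the same range as 0 to -24), and the mantissa is truncated to 16 bits as discussed later in the text. The name `split_coefficient` is hypothetical.

```python
MAX_EXP = 24  # exponents are capped at 24 leading zeros

def split_coefficient(coeff_q31):
    """Split a signed Q31 coefficient into (exponent, 16-bit mantissa).

    The exponent counts leading zeros of the magnitude (after the sign bit),
    capped at MAX_EXP. The mantissa is the coefficient shifted left by the
    exponent and truncated to 16 bits (Q15).
    """
    mag = abs(coeff_q31)
    exp = 0
    # Shift until the top magnitude bit is set or the cap is reached.
    while exp < MAX_EXP and mag < (1 << 30):
        mag <<= 1
        exp += 1
    mantissa = (coeff_q31 << exp) >> 16  # arithmetic shift keeps the sign
    return exp, mantissa
```

A coefficient with more than 24 leading zeros keeps the exponent 24, and its mantissa then carries the remaining leading zeros, as the text describes.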
- AC-3 bit stream contains exponents for independent, coupled and the coupling channels. Exponent information may be shared across blocks within a frame, so blocks 1 through 5 may reuse exponents from previous blocks.
- AC-3 exponent transmission employs differential coding technique, in which the exponents for a channel are differentially coded across frequency.
- the first exponent is always sent as an absolute value.
- the value indicates the number of leading zeros of the first transform coefficient.
- Successive exponents are sent as differential values which must be added to the prior exponent value to form the next actual exponent value.
- the differential encoded exponents are next combined into groups.
- The grouping is done by one of three methods: D15, D25 and D45. These, together with 'reuse', are referred to as exponent strategies.
- The number of exponents in each group depends only on the exponent strategy.
- In D15, each group is formed from three exponents.
- In D45, four exponents are represented by one differential value.
- Three consecutive such representative differential values are grouped together to form one group.
- Each group always comprises 7 bits.
- If the strategy is 'reuse' for a channel in a block, then no exponents are sent for that channel and the decoder reuses the exponents last sent for this channel.
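The differential coding and D15 grouping described above can be sketched as below. The packing rule (each differential in the range -2..+2 is mapped to 0..4 and three of them are combined as 25·M1 + 5·M2 + M3, which always fits in 7 bits) comes from the AC-3 standard rather than from this text. The function names are hypothetical, exponents are held as positive leading-zero counts, and the sketch assumes the number of differentials is a multiple of three.

```python
def d15_groups(exponents):
    """Return (absolute first exponent, list of 7-bit D15 group codes)."""
    absolute = exponents[0]
    diffs = [b - a for a, b in zip(exponents, exponents[1:])]
    if any(abs(d) > 2 for d in diffs):
        raise ValueError("differentials must lie in -2..+2; pre-process first")
    if len(diffs) % 3:
        raise ValueError("sketch expects a multiple of three differentials")
    groups = []
    for i in range(0, len(diffs), 3):
        m1, m2, m3 = (d + 2 for d in diffs[i:i + 3])  # map -2..+2 to 0..4
        groups.append(25 * m1 + 5 * m2 + m3)          # 0..124, fits in 7 bits
    return absolute, groups

def d15_ungroup(absolute, groups):
    """Inverse of d15_groups: rebuild the exponent sequence."""
    exps = [absolute]
    for g in groups:
        for m in (g // 25, (g % 25) // 5, g % 5):
            exps.append(exps[-1] + m - 2)
    return exps
```

The range check is where exponent pre-processing would come in: an encoder has to limit the slope between neighbouring exponents before they can be coded this way.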
- Pre-processing of exponents prior to coding can lead to better audio quality.
- Choice of the suitable strategy for exponent coding forms a crucial aspect of AC-3.
- D15 provides the highest accuracy but is low in compression.
- Transmitting only one exponent set for a channel in the frame (in the first audio block of the frame) and attempting to 'reuse' the same exponents for the next five audio blocks can lead to high exponent compression, but sometimes also to very audible distortion.
- the bit allocation algorithm analyses the spectral envelope of the audio signal being coded, with respect to masking effects, to determine the number of bits to assign to each transform coefficient mantissa.
- the bit allocation is recommended to be performed globally on the ensemble of channels as an entity, from a common bit pool.
- the bit allocation routine contains a parametric model of the human hearing for estimating a noise level threshold, expressed as a function of frequency, which separates audible from inaudible spectral components.
- Various parameters of the hearing model can be adjusted by the encoder depending upon the signal characteristic.
- the number of bits available for packing mantissas, in an AC-3 frame is dependent firstly, of course, on the frame-size and, secondly, on the number of bits consumed by other fields - exponents, coupling parameters etc.
- a significant part of the bit-allocation process is the optimisation of the bit-allocation to mantissa such that under masking consideration, the sum total of all bits consumed by mantissas equals (or is almost close to) available bits. This optimisation is performed by what's known as a Binary-Convergence Algorithm.
- Floating point arithmetic usually uses IEEE 754 single precision (32 bits: 23 stored mantissa bits giving 24 bits of precision with the implicit leading bit, an 8-bit exponent and 1 sign bit), which is adequate for high quality AC-3 encoding.
- Work-stations like Sun SPARCstation 20 can provide much higher precision (e.g. double is 8 bytes).
- floating point units require more chip area and consequently most DSP Processors use fixed point arithmetic.
- the AC-3 Encoder is often intended to be a part of a consumer product e.g. DVD (Digital Versatile Disk) where cost (chip area) is an important factor.
- the AC-3 Encoder has been implemented on 24-bit processors like the Motorola 56000 and has met with much commercial success.
- Although the quality of an AC-3 encoder on a 16-bit processor is universally assumed to be low, no adequate study has yet been published that benchmarks that quality or compares it with the floating point version.
- Using double precision (32-bit) arithmetic to implement the encoder on a 16-bit processor can lead to high quality (even higher than the 24-bit version).
- However, double precision arithmetic is computationally very expensive (e.g. on the D950, single precision multiplication takes 1 cycle while double precision requires 6 cycles). Rather than using single or double precision throughout the whole cycle of processing, different precisions may be used for different stages of computation.
- D950 contains two data-memory spaces called X-Memory and Y-Memory, from which load/store operations can be performed concurrently in a single cycle.
- Although data memory in DSPs is usually flat (unsegmented), for indexing and logistics purposes this implementation views memory as chunks of 512 words. The choice of 512 is natural, since each block contains 256 words of PCM per channel and for stereo this adds up to 512.
- Segments in X-Memory are labelled as X00, X01 etc.
- Y-Memory segments are labelled as Y00, Y01 etc. Consecutive segments are assumed to be adjacent to each other, e.g. if the starting address of X04 is 1500, then the address of X05 will be 2012 (1500 + 512).
- Segments X07-X12 are written with the input PCM data of six blocks.
- AC-3 uses an overlap method for frequency transformation, whereby each block requires data from the previous block to generate coefficients for the current block.
- For the first block of a frame, the PCM input from the last block of the previous frame is used.
- The last block of the previous frame is stored in X13 and, upon start of processing for the current frame, is copied to X06, so that X06-X12 presents six contiguous blocks, each of 512 samples per channel once overlapped with its predecessor, as illustrated in Figure 2.
- Transient detection for each block requires 512 inputs. As explained earlier, each block combined with data from previous one is presented for transient detection and frequency transformation.
- the filtering operation does not alter the input but generates an equal number of high pass filtered information. This is analysed by the transient-detector to generate transient information. Filtering and transient detection requires a working buffer of 1024 words - X14-X15 , as illustrated in Figure 2.
- Frequency Transformation using the Time-Domain-Aliasing-Cancellation Method produces 256 32-bit coefficients. These coefficients are transferred from X14 to appropriate location in the address space X00-X11 , so that a particular coefficient X[blk_no][ch][bin] can be addressed conveniently as X00[blk_no*1024+ch*512+bin*2] .
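The addressing scheme above can be checked with a short sketch. It uses only the constants given in the text (512-word segments, 256 bins per channel, two words per 32-bit coefficient, stereo); the function names are hypothetical.

```python
SEGMENT_WORDS = 512    # one X-memory segment
WORDS_PER_COEFF = 2    # a 32-bit coefficient occupies two 16-bit words
BINS = 256
CHANNELS = 2           # stereo encoder
BLOCKS = 6

def coeff_offset(blk_no, ch, bin_no):
    """Word offset from the start of X00 for coefficient X[blk_no][ch][bin_no]."""
    return blk_no * 1024 + ch * 512 + bin_no * 2

def segment_of(offset):
    """Index n of the segment X<n> that contains the given word offset."""
    return offset // SEGMENT_WORDS
```

Each channel of each block fills exactly one 512-word segment, so six stereo blocks occupy the twelve segments X00-X11 with no gaps and no overlap.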
- Figure 3 shows the arrangement of PCM samples from X06-X12 that allows the generated coefficients to be stored in the required format while safeguarding against over-writing (write-before-read) PCM samples still required to generate the coefficients of the next block (or channel).
- Rematrixing is very straightforward as far as memory requirements and allocations are concerned. Rematrixed data is written in place of the original channel coefficients.
- The coupling channel may be mapped to the same memory reserved for one of the coupled channels. Normally a memory space of 256 bins would be reserved for storing and processing the coefficients of each full-bandwidth channel (e.g. channels 0 and 1, for a stereo encoder). However, instead of creating a new block of memory for the coupling channel, a coupled channel's location may be reused. From bin zero to endmant[ch]-1 , coefficients for the coupled channel (ch) are stored, and from endmant[ch] onwards to max (255, cplendmant) the coupling channel coefficients are stored.
- Each frequency coefficient (32-bit) generates a mantissa and an exponent. Exponents have a maximum value of 24, therefore sixteen bits are more than enough to store their value. For mantissas it is not obvious whether sixteen bits are enough or whether the full thirty-two bits need to be retained. However, a patent application by the author titled "Accuracy Demands on Mantissa Representation in AC-3 Encoder" addresses this issue and shows that sixteen bits are sufficient. Therefore, the six blocks of frequency coefficients in locations X00-X11 are overwritten with exponents ( X00-X05 ) and mantissas ( X06-X11 ), see Figure 2.
- exponents in AC-3 are differentially-coded and subsequently grouped using one of the schemes D15, D25, D45 and Reuse. Scratch pad memory of 2 K is required for coding and grouping process. The resulting grouped exponents require additional memory for storage before they are finally packed into AC-3 frame. The memory allocated must be sufficient even in the worst case. Let us check this.
- the grouped exponents must be easy to index. Even though the grouped exponents may occupy 512 words, they would in general be spread out in memory because of indexing e.g. to index to grp_exp[blk_no][ch][grp] , the address should be X12[ (blk_no*max_grp_size*3) + (max_grp_size*ch) + grp] .
- Bit allocation is one of the most complicated parts of AC-3 encoding, both computationally and in terms of memory. It can be partitioned into the following steps.
- The first step of bit allocation determines the power-spectrum density (PSD) from the exponents (the equation is not reproduced in this text).
- The PSD values are stored in the same location as the exponents. This is possible because the exponents are no longer required once the PSD is generated.
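Overwriting the exponents with PSD values can be sketched as below. The mapping psd = 3072 - (exponent << 7) is the one defined in the AC-3 standard; it is used here as an assumption because the patent's own equation did not survive in this text. Exponents are taken as positive leading-zero counts (0 to 24), and `exponents_to_psd_in_place` is a hypothetical name.

```python
def exponents_to_psd_in_place(buf):
    """Replace each exponent in buf with its PSD value, reusing the same storage."""
    for i, exp in enumerate(buf):
        buf[i] = 3072 - (exp << 7)  # exponent 0 -> 3072, exponent 24 -> 0
    return buf
```

Because each PSD value depends only on the exponent at the same index, the conversion needs no second buffer.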
- Next step of the algorithm integrates fine-grain PSD values within each of a multiplicity of 1/6th octave bands to generate band-psd.
- the integration of PSD values in each band is performed with log-addition.
- the log-addition is implemented by computing the difference between the two operands and using the absolute difference divided by 2 as an address into a length 256 lookup table. In total, there can be 50 such bands per channel.
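The table-driven log-addition can be sketched as follows. The lookup table here is generated from a closed-form expression on the assumption that 128 PSD units correspond to one exponent step (a factor of two in amplitude), so it only approximates the table a real encoder would take from the AC-3 standard. The names `LATAB`, `logadd` and `band_psd`, and the band edges in the usage example, are illustrative.

```python
import math

# Approximate correction table: entry i covers an operand difference of 2*i units.
LATAB = [round(64 * math.log2(1 + 2 ** (-(2 * i) / 64))) for i in range(256)]

def logadd(a, b):
    """Add two log-domain PSD values using half the absolute difference as table address."""
    address = min(abs(a - b) >> 1, 255)
    return max(a, b) + LATAB[address]

def band_psd(psd, band_edges):
    """Integrate fine-grain PSD values over bands [edge[i], edge[i+1])."""
    out = []
    for lo, hi in zip(band_edges, band_edges[1:]):
        acc = psd[lo]
        for value in psd[lo + 1:hi]:
            acc = logadd(acc, value)
        out.append(acc)
    return out
```

Two equal operands give the operand plus 64 units (a doubling of power), while a much smaller operand contributes nothing, which is why a short table suffices.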
- The coupling channel, however, can reuse the same location as one of the coupled channels.
- The band-psd for the coupled channel occupies the lower part ( 0-bndstart[ch] ); the upper portion can be occupied by the coupling channel, provided the starting bin of the coupling channel always falls on a new band. Otherwise the first coupling band will overwrite the last band of the coupled channel.
- Table I above shows the band structure for PSD-integration.
- the excitation function is computed by applying the prototype masking curve selected by the encoder (and transmitted to the decoder) to the integrated PSD spectrum (bndpsd[]). The result of this computation is then offset downward in amplitude by the fgain and sgain parameters, which are also obtained from the bit stream.
- excitation curve values can be written in-place of the band-psd. However, since band-psd values are required during initial portion of masking curve calculations, a temporary back-up of its value can be made.
- FSMC First-Step-Masking Curve
- This step computes the masking (noise level threshold) curve from the excitation function.
- the hearing threshold is given in ATSC Document.
- the fscod and dbknee variables are assigned by the encoder.
- The FSMC is written over the excitation curve, as the excitation values are no longer required by the encoder.
- AC-3 performs global bit allocation, that is, the allocation routine shuttles bits across channels and blocks as necessary, to meet the shifting demands of the signal.
- Mantissa bits for the entire frame are allocated from a common pool. As a result the bit-allocation requires masking and psd information of the entire frame.
- quantization of the mantissa according to the assigned bits can be performed on a block basis. This is because sharing of information about quantized mantissa is restricted to block level.
- the first step is to separate quantization from the bit allocation process.
- Bit allocation requires only three pieces of information: FSMC, PSD and snroffst . While the first two are familiar by now, the third parameter needs to be explained.
- The bit allocation algorithm iterates with various values of this parameter until it converges to a value at which the total quantized-mantissa bits in the frame add up to the available bits.
- the masking curve needs to be re-computed at each iteration from the excitation curve.
- the baps would be computed and totalled to estimate the mantissa size. Storing baps for the entire frame would require 3 K memory.
- the masking curve would have to be stored at a separate location from the excitation curve, otherwise for next iteration the excitation curve values would be corrupted.
- SSMC Second-Step-Masking Curve
- the SSMC is computed and stored in a temporary location and disposed once its purpose is served. Baps are stored for current block only.
- Once the optimal snroffst value is known, the SSMC and baps are re-computed for each block as and when necessary. This effectively increases the number of iterations by one, but since the number of iterations is usually quite large (about 6) the impact is not significant.
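The search over snroffst with per-block recomputation can be sketched as a binary search. Everything model-specific here is a stand-in: the SSMC is assumed to be the FSMC lowered by snroffst, the bap is a toy function of PSD minus SSMC rather than the standard's lookup table, and the cost of a mantissa is taken to be its bap value. What the sketch does show is the memory behaviour described above: a single block-sized bap buffer and a temporary SSMC, both recomputed for every block at every iteration.

```python
def baps_for_block(psd_block, fsmc_block, snroffst, bap_buf):
    """Fill bap_buf for one block; the SSMC exists only inside this call."""
    ssmc = [m - snroffst for m in fsmc_block]      # temporary second-step curve
    for i, (p, m) in enumerate(zip(psd_block, ssmc)):
        bap_buf[i] = min(15, max(0, (p - m) // 128))  # toy bap model
    return bap_buf

def frame_bits(psd, fsmc, snroffst, bap_buf):
    """Total mantissa bits for the frame, reusing one block-sized bap buffer."""
    total = 0
    for psd_block, fsmc_block in zip(psd, fsmc):
        baps_for_block(psd_block, fsmc_block, snroffst, bap_buf)
        total += sum(bap_buf[:len(psd_block)])        # toy cost: bap bits each
    return total

def find_snroffst(psd, fsmc, available_bits, lo=0, hi=1023):
    """Largest snroffst in [lo, hi] whose bit total fits (lo if none fits)."""
    bap_buf = [0] * len(psd[0])
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if frame_bits(psd, fsmc, mid, bap_buf) <= available_bits:
            best, lo = mid, mid + 1
        else:
            hi = mid - 1
    return best
```

The search relies on the bit total never decreasing as snroffst grows, which holds for this toy model. After it returns, one more pass of `baps_for_block` per block yields the pointers needed for quantization, which is the extra iteration the text mentions.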
- The last step of the bit allocation checks whether the constraints (ATSC Doc.) on the AC-3 frame are satisfied, such as the requirement that the combined size of block 0 and block 1 never exceeds 5/8 of the frame. Once the constraint test is passed, the bit allocation pointers for each block are computed and their values are used to quantize the mantissas.
- the mantissas in X06-X11 are quantized up to number of bits dictated by the bit-allocation-pointers.
- The quantized mantissas are stored in place. However, in AC-3, mantissas with certain levels of quantization are grouped together. These mantissas need to be stored separately, grouped, and then packed into the AC-3 frame.
- baps 1,2 and 4 i.e. Lev-3, Lev-5 and Lev-11 mantissas
- Mapping of bap to quantizer:

bap | quantizer levels | quantization type | mantissa bits (qntztab[bap]) (group bits / num in group) |
---|---|---|---|
0 | 0 | none | 0 |
1 | 3 | symmetric | 1.67 (5/3) |
2 | 5 | symmetric | 2.33 (7/3) |
3 | 7 | symmetric | 3 |
4 | 11 | symmetric | 3.5 (7/2) |
5 | 15 | symmetric | 4 |
6 | 32 | asymmetric | 5 |
7 | 64 | asymmetric | 6 |
8 | 128 | asymmetric | 7 |
9 | 256 | asymmetric | 8 |
10 | 512 | asymmetric | 9 |
11 | 1,024 | asymmetric | 10 |
12 | 2,048 | asymmetric | 11 |
13 | 4,096 | asymmetric | 12 |
14 | 16,384 | asymmetric | 14 |
15 | 65,536 | asymmetric | 16 |
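The bap mapping above translates directly into a bit count for a block. The sketch below is illustrative: the names are hypothetical, and it assumes that a partly filled group of bap 1, 2 or 4 mantissas still costs a full group.

```python
# bap -> (bits per group, mantissas per group) for the grouped quantizers
GROUPED = {1: (5, 3), 2: (7, 3), 4: (7, 2)}
# bap -> bits per mantissa for the ungrouped quantizers
PLAIN_BITS = {0: 0, 3: 3, 5: 4, 6: 5, 7: 6, 8: 7, 9: 8,
              10: 9, 11: 10, 12: 11, 13: 12, 14: 14, 15: 16}

def mantissa_bits(baps):
    """Total bits needed to pack the mantissas of one block."""
    total = 0
    counts = {bap: 0 for bap in GROUPED}
    for bap in baps:
        if bap in GROUPED:
            counts[bap] += 1
        else:
            total += PLAIN_BITS[bap]
    for bap, n in counts.items():
        bits, size = GROUPED[bap]
        total += bits * -(-n // size)  # ceiling division: partial group costs a full one
    return total
```

The fractional entries in the table (1.67, 2.33 and 3.5 bits) are the per-mantissa averages of these group sizes.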
- Figure 6 shows the Quantizer which quantizes mantissa of a particular block according to the corresponding baps.
- Lev-3, Lev-5 and Lev-11 mantissas are stored separately for grouping. One could instead store these mantissas in their original locations, but the grouping stage would then need pointers to them; these pointers, being equal in number to the mantissas, would occupy an identical amount of space.
- The compression factors for the three mantissa levels are 3, 3 and 2 (corresponding to group sizes of 3, 3 and 2), therefore a proportional amount of space is reserved in Y06-Y07 for each.
- the last step in the encoding process is the packing of mantissas onto the AC-3 frame. For each mantissa Q bits of the quantized mantissa is stored into the AC-3 frame, the size Q being determined from the bit-allocation pointer value. At this stage, the PSD values for the block under consideration are no longer required and so the Q values may be stored in their place (Location : X06-X11 ), see Figure 2.
- The frame size depends on the compression ratio. For stereo AC-3, bitrates of 192-384 kbps are reasonable, since transparent quality can be achieved in this range. The largest frame size (836 words) results when the bitrate is 384 kbps and the sampling frequency is 44.1 kHz. A frame buffer of 1 K words is therefore reasonable for storing the AC-3 frame ( X14-X15 ).
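The 836-word figure can be reproduced from the numbers given in the text: a frame carries 6 x 256 samples per channel, so its duration is 1536 / fs seconds, and its size in bits is the bitrate multiplied by that duration. The sketch below assumes 16-bit words and rounds up to a whole word; `frame_words` is a hypothetical name.

```python
SAMPLES_PER_FRAME = 6 * 256  # six audio blocks of 256 samples per channel

def frame_words(bitrate_bps, fs_hz, word_bits=16):
    """Frame size in words, rounded up, using integer arithmetic only."""
    numerator = bitrate_bps * SAMPLES_PER_FRAME
    denominator = fs_hz * word_bits
    return -(-numerator // denominator)  # ceiling division
```

At 384 kbps and 44.1 kHz this gives 836 words, which fits in the two 512-word segments X14-X15 with room to spare.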
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Claims (9)
- A method of reducing memory requirements for an encoder which includes a function of bit allocation for quantising frequency coefficients of an input signal, including: calculating a power spectral density (PSD); integrating the PSD over a multiplicity of frequency bands to form a band PSD; calculating an excitation function by applying a prototype masking curve to the band PSD; generating a first-step masking curve (FSMC) of a noise level threshold from the excitation function; calculating a second-step masking curve (SSMC) by incrementing the FSMC according to a selected signal-to-noise variable (snroffst), the excitation function, after being calculated, being written to memory in place of the band PSD and subsequently being overwritten by the FSMC, and the SSMC being stored in temporary memory and recomputed for each data block processed by the encoder, characterised in that the frequency coefficients of two coupled input channels are combined into a coupling channel, the method including mapping the coupling channel data into memory reserved for one of the coupled channels.
- A method as claimed in claim 1, including allocation of bits for coding mantissa values of the frequency coefficients, the bit allocation being determined on the basis of the PSD, the SSMC and the variable, with bit allocation pointers being generated and stored for a current data block only.
- A method as claimed in claim 2, wherein the bit allocation is followed by quantisation of the mantissa values according to the bit allocation pointers and packing of the quantised values into a data frame, the quantised values being stored in memory in place of original unquantised coefficients.
- A method as claimed in claim 1, wherein the frequency coefficients are initially separated into exponent and mantissa components and wherein the exponent components are overwritten with the PSD.
- A method as claimed in claim 1, wherein the encoder includes the functions of rematrixing and coupling, the rematrixing and coupling, where required, being performed in place, with a coupling channel imposed on an upper half of a first coupled channel, thereby removing the need to create a new storage area for the coupling channel.
- A method as claimed in any one of the preceding claims, wherein the input signal is processed in frames of six stereo input blocks which are stored in an X memory at locations X07 to X12, with a last block of a previous frame prefixed at X06, such that six continuous blocks of 512 overlapping samples per input channel are presented to transient detection and frequency transformation modules of the encoder, and wherein the coefficients are represented by 256 32-bit coefficients per channel, which are stored in X00 to X11.
- A method as claimed in claim 6, wherein 32-bit frequency coefficients are converted into a mantissa and an exponent of 16 bits each.
- A method as claimed in claim 7, wherein the exponents are coded according to a selected coding strategy using a double-indexing method.
- A method as claimed in claim 2 or 3, wherein the mantissas are assigned quantisation values on the basis of the bit allocation pointers, the quantisation values being stored in place of the PSD.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SG1999/000111 WO2001033556A1 (fr) | 1999-10-30 | 1999-10-30 | Method of reducing memory requirements in an AC-3 audio encoder |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1228507A1 (fr) | 2002-08-07 |
EP1228507B1 (fr) | 2003-05-28 |
Family
ID=20430245
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99954578A (Expired - Lifetime; EP1228507B1 (fr)) | 1999-10-30 | 1999-10-30 | Method of reducing memory requirements in an AC-3 audio encoder |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1228507B1 (fr) |
DE (1) | DE69908433T2 (fr) |
WO (1) | WO2001033556A1 (fr) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996021975A1 (fr) * | 1995-01-09 | 1996-07-18 | Philips Electronics N.V. | Method and apparatus for determining a masked threshold |
TW316302B (fr) * | 1995-05-02 | 1997-09-21 | Nippon Steel Corp | |
US5819215A (en) * | 1995-10-13 | 1998-10-06 | Dobson; Kurt | Method and apparatus for wavelet based data compression having adaptive bit rate control for compression of digital audio or other sensory data |
1999
- 1999-10-30: EP application EP99954578A, patent EP1228507B1 (fr), status: Expired - Lifetime (not active)
- 1999-10-30: WO application PCT/SG1999/000111, patent WO2001033556A1 (fr), status: IP Right Grant (active)
- 1999-10-30: DE application DE69908433T, patent DE69908433T2 (de), status: Expired - Lifetime (not active)
Also Published As
Publication number | Publication date |
---|---|
WO2001033556A1 (fr) | 2001-05-10 |
DE69908433T2 (de) | 2004-04-08 |
EP1228507A1 (fr) | 2002-08-07 |
DE69908433D1 (de) | 2003-07-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101168473B1 (ko) | | Audio encoding system |
EP0684705B1 (fr) | | Multichannel signal coding using vector quantization |
EP0610975B1 (fr) | | Formatting of an encoded signal for the encoder and decoder of a high-quality audio system |
US5369724A (en) | | Method and apparatus for encoding, decoding and compression of audio-type data using reference coefficients located within a band of coefficients |
EP1072036B1 (fr) | | Fast frame optimisation in an audio encoder |
EP1852851A1 (fr) | | Enhanced audio encoding/decoding device and method |
US5394508A (en) | | Method and apparatus for encoding decoding and compression of audio-type data |
PL182240B1 (pl) | | Multichannel acoustic encoder |
US7680671B2 (en) | | Multi-precision technique for digital audio encoder |
JPH07199993A (ja) | | Perceptual coding of audio signals |
US20040220805A1 (en) | | Method and device for processing time-discrete audio sampled values |
KR20060131798A (ko) | | Audio coding based on block grouping |
CN100489965C (zh) | | Audio coding system |
EP1228576B1 (fr) | | Channel coupling for an AC-3 encoder |
EP2697795B1 (fr) | | Adaptive gain-shape rate sharing |
JP4843142B2 (ja) | | Use of gain-adaptive quantization and non-uniform code lengths for audio coding |
US6775587B1 (en) | | Method of encoding frequency coefficients in an AC-3 encoder |
EP1228507B1 (fr) | | Method of reducing memory requirements in an AC-3 audio encoder |
US6754618B1 (en) | | Fast implementation of MPEG audio coding |
JP3093178B2 (ja) | | Low bit rate transform encoder and decoder for high-quality audio |
Chen et al. | | Fast time-frequency transform algorithms and their applications to real-time software implementation of AC-3 audio codec |
JPH0758707A (ja) | | Quantization bit allocation scheme |
Absar et al. | | AC-3 Encoder Implementation on the D950 DSP-Core |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20020529 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Designated state(s): DE FR GB IT |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: IE; Ref legal event code: FG4D |
REF | Corresponds to: | Ref document number: 69908433; Country of ref document: DE; Date of ref document: 20030703; Kind code of ref document: P |
ET | Fr: translation filed | |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | Effective date: 20040302 |
REG | Reference to a national code | Ref country code: IE; Ref legal event code: MM4A |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: PLFP; Year of fee payment: 18 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: PLFP; Year of fee payment: 19 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: PLFP; Year of fee payment: 20 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR; Payment date: 20180920; Year of fee payment: 20. Ref country code: IT; Payment date: 20180919; Year of fee payment: 20 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 20180925; Year of fee payment: 20 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 20180819; Year of fee payment: 20 |
REG | Reference to a national code | Ref country code: DE; Ref legal event code: R071; Ref document number: 69908433; Country of ref document: DE |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: PE20; Expiry date: 20191029 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION; Effective date: 20191029 |