MXPA06009932A - Device and method for determining a quantiser step size - Google Patents

Device and method for determining a quantiser step size

Info

Publication number
MXPA06009932A
Authority
MX
Mexico
Prior art keywords
size
interference
stage
quantizer
threshold
Prior art date
Application number
MXPA/A/2006/009932A
Other languages
Spanish (es)
Inventor
Grill Bernhard
Schug Michael
Teichmann Bodo
Rettelbach Nikolaus
Original Assignee
Fraunhofergesellschaft Zur Foerderung Der Angewandten Forschung EV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fraunhofergesellschaft Zur Foerderung Der Angewandten Forschung EV

Abstract

In order to determine a quantizer step size for quantizing a signal comprising audio or video information, a first quantizer step size and an interference threshold are provided (502). According to the invention, the interference actually introduced by the first quantizer step size is determined (504) and compared with the interference threshold (506). If the comparison indicates that the interference actually introduced is higher than the threshold, a second, coarser quantizer step size is selected (508), which is then used for the quantization (514) if the interference introduced by the coarser second quantizer step size turns out to be lower than the threshold or lower than the interference introduced by the first quantizer step size (512). In this way, the quantization interference is reduced while the quantization is made coarser, i.e. while the compression gain increases.

Description

APPARATUS AND METHOD FOR DETERMINING A QUANTIZER STEP SIZE

The present invention relates to audio encoders and, in particular, to transform-based audio encoders, i.e. encoders in which a temporal representation is converted into a spectral representation at the beginning of the encoder pipeline.

A prior-art transform-based audio encoder is illustrated in Figure 3. The encoder shown in Figure 3 is described in the international standard ISO/IEC 14496-3:2001(E), subpart 4, page 4, and is known in the art as the AAC encoder. The prior-art encoder is presented below.

An audio signal to be encoded is supplied at an input 1000. The audio signal is first fed to a scaling stage 1002, in which the so-called AAC gain control sets the level of the audio signal. Side information from the scaling is supplied to a bitstream formatter 1004, as represented by the arrow between block 1002 and block 1004. The scaled audio signal is then supplied to an MDCT filter bank 1006. In the AAC encoder, the filter bank implements a modified discrete cosine transform with 50% overlapping windows, the window length being determined by a block 1008.

Generally speaking, block 1008 ensures that transient signals are windowed with relatively short windows and that signals which tend to be stationary are windowed with relatively long windows. Short windows give transient signals a higher time resolution (at the expense of frequency resolution), whereas long windows give stationary signals a higher frequency resolution (at the expense of time resolution). Longer windows tend to be preferred, since they result in a higher coding gain.
At the output of the filter bank 1006, blocks of spectral values are present, the blocks being successive in time. Depending on the implementation of the filter bank, these may be MDCT coefficients, Fourier coefficients or subband signals, each subband signal having a specific limited bandwidth determined by the respective subband channel in the filter bank 1006, and each subband signal having a specific number of subband samples. By way of example, the following presents the case in which the filter bank outputs temporally successive blocks of MDCT spectral coefficients which, generally speaking, represent successive short-term spectra of the audio signal to be encoded at input 1000. A block of MDCT spectral values is then fed into a TNS processing block 1010 (TNS = temporal noise shaping), in which temporal noise shaping is performed. The TNS technique is used to shape the temporal form of the quantization noise within each window of the transform. This is achieved by applying a filtering process to parts of the spectral data of each channel. Coding is performed on a window basis. In particular, the following steps are performed to apply the TNS tool to a window of spectral data, i.e. to a block of spectral values. Initially, a frequency range for the TNS tool is selected. A suitable selection is to cover the frequency range from 1.5 kHz up to the highest possible scale factor band with one filter. This frequency range depends on the sampling rate, as specified in the AAC standard (ISO/IEC 14496-3:2001(E)). Subsequently, an LPC calculation (LPC = linear predictive coding) is performed using the MDCT spectral coefficients present in the selected target frequency range. For increased stability, the coefficients corresponding to frequencies below 2.5 kHz are excluded from this process.
Common LPC procedures known from speech processing can be used for the LPC calculation, for example the well-known Levinson-Durbin algorithm. The calculation is performed for the maximum admissible order of the noise shaping filter. As a result of the LPC calculation, the expected prediction gain (PG) is obtained. In addition, the reflection coefficients, or PARCOR coefficients, are obtained. If the prediction gain does not exceed a specific threshold, the TNS tool is not applied. In this case, a piece of control information is written into the bitstream so that a decoder knows that no TNS processing has been performed. If, however, the prediction gain exceeds the threshold, TNS processing is applied.
In a next step, the reflection coefficients are quantized. The order of the noise shaping filter used is determined by removing, from the tail of the reflection coefficient array, all reflection coefficients having an absolute value smaller than a threshold. The number of remaining reflection coefficients is the order of the noise shaping filter. A suitable threshold is 0.1. The remaining reflection coefficients are typically converted into linear prediction coefficients; this technique is also known as the "step-up" procedure. The calculated LPC coefficients are then used as the encoder noise shaping filter coefficients, i.e. as prediction filter coefficients. This FIR filter is used for filtering in the specified target frequency range. An autoregressive filter is used in decoding, while a so-called moving-average filter is used in encoding. Finally, the side information for the TNS tool is supplied to the bitstream formatter, as represented by the arrow between the TNS processing block 1010 and the bitstream formatter 1004 in Figure 3.
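The order determination and the "step-up" procedure described above can be sketched as follows. This is a minimal illustration, not the encoder's actual code: the function names and the sample coefficients are hypothetical, and the sign convention of the recursion is an assumption, since conventions differ between texts.

```python
def truncate_order(reflection, threshold=0.1):
    """Drop trailing reflection coefficients with magnitude below threshold.

    The number of coefficients that remain is the order of the filter.
    """
    order = len(reflection)
    while order > 0 and abs(reflection[order - 1]) < threshold:
        order -= 1
    return reflection[:order]


def step_up(reflection):
    """Convert reflection coefficients k_1..k_p into prediction filter
    coefficients a_0..a_p (a_0 = 1) with the step-up recursion
    a_m[i] = a_{m-1}[i] + k_m * a_{m-1}[m - i].
    """
    a = [1.0]
    for k in reflection:
        prev = a + [0.0]          # a_{m-1}, padded to length m + 1
        m = len(prev) - 1
        a = [prev[i] + k * prev[m - i] for i in range(m + 1)]
    return a


# Hypothetical quantized reflection coefficients for one window.
parcor = [0.5, -0.2, 0.05, 0.02]
kept = truncate_order(parcor)     # the two trailing values fall below 0.1
lpc = step_up(kept)               # the filter order is len(kept)
```

Only trailing coefficients are removed; a small coefficient in the middle of the array is kept, because dropping it would change the filter rather than merely shorten it.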
Next, several optional tools which are not shown in Figure 3 are passed through, such as a long-term prediction tool, an intensity/coupling tool, a prediction tool and a noise substitution tool, until finally a mid/side encoder 1012 is reached. The mid/side encoder 1012 is active when the audio signal to be encoded is a multi-channel signal, i.e. a stereo signal having a left channel and a right channel. Up to this point, i.e. upstream of block 1012 in Figure 3, the left and right stereo channels have been processed separately from each other, i.e. scaled, transformed by the filter bank, subjected to TNS processing or not, etc. In the mid/side encoder, it is first verified whether mid/side coding makes sense, i.e. whether it provides a coding gain at all. Mid/side coding provides a coding gain if the left and right channels tend to be similar, since in this case the mid channel, i.e. the sum of the left and right channels, is almost equal to the left channel or to the right channel, apart from scaling by a factor of 1/2, while the side channel has only very small values, since it is equal to the difference between the left and right channels. Consequently, when the left and right channels are approximately the same, the difference is approximately zero or includes only very small values which, it is hoped, will be quantized to zero in a subsequent quantizer 1014 and can thus be transmitted very efficiently, since an entropy coder 1016 is connected downstream of the quantizer 1014. The quantizer 1014 is supplied with an allowed interference per scale factor band by the psychoacoustic model 1020. The quantizer operates iteratively, i.e. an outer iteration loop is called initially, which then calls an inner iteration loop.
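The mid/side idea can be made concrete with a small sketch. The names are hypothetical, and the factor of 1/2 on the side channel is an assumption made so that decoding is exact; the text above only states the factor for the mid channel.

```python
def ms_encode(left, right):
    """Form mid (scaled sum) and side (scaled difference) channels."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left and right from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Two nearly identical channels: the side channel is almost all zeros.
left = [1.0, 2.0, -3.0]
right = [1.0, 2.5, -3.0]
mid, side = ms_encode(left, right)
```

For similar channels most side values are zero or close to it, which is why the subsequent quantizer and entropy coder can represent them with very few bits.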
Generally speaking, starting from initial quantizer step size values, a block of values is first quantized at the input of the quantizer 1014. In particular, the inner loop quantizes the MDCT coefficients, consuming a specific number of bits in the process. The outer loop calculates the distortion and modifies the energy of the coefficients using the scale factor, and then calls the inner loop again. This process is repeated until a specific termination condition is met. For each iteration of the outer iteration loop, the signal is reconstructed in order to calculate the interference introduced by the quantization and to compare it with the allowed interference provided by the psychoacoustic model 1020. In addition, the scale factors of those frequency bands which, after this comparison, are still considered to be disturbed are increased by one or more steps from iteration to iteration, namely for each iteration of the outer iteration loop. Once a situation is reached in which the quantization interference introduced by the quantization is below the allowed interference determined by the psychoacoustic model, and if at the same time the bit requirements are met, namely that the maximum bit rate is not exceeded, the iteration, i.e. the analysis-by-synthesis method, is terminated, and the scale factors obtained are coded as illustrated in block 1014 and supplied, in coded form, to the bitstream formatter 1004, as marked by the arrow drawn between block 1014 and block 1004. The quantized values are then supplied to the entropy coder 1016, which typically performs entropy coding for various scale factor bands using various Huffman code tables, in order to translate the quantized values into a binary format.
As is known, entropy coding in the form of Huffman coding relies on code tables that were created on the basis of expected signal statistics, in which frequently occurring values are given shorter code words than values that occur less frequently. The entropy-coded values are then supplied, as the actual main information, to the bitstream formatter 1004, which outputs the encoded audio signal in accordance with a specific bitstream syntax. As already illustrated, a finer quantizer step size is used in this iterative quantization whenever the interference introduced by a quantizer step size is greater than the threshold, in the hope that the quantization noise is reduced because the quantization performed is finer. This concept is disadvantageous in that, due to the finer quantizer step size, the amount of data to be transmitted naturally increases and the compression gain therefore decreases.
It is the object of the present invention to provide a concept for determining a quantizer step size which, on the one hand, introduces low quantization interference and, on the other hand, provides a high compression gain. This object is achieved by an apparatus for determining a quantizer step size as claimed in claim 1, by a method for determining a quantizer step size as claimed in claim 8, or by a computer program as claimed in claim 9. The present invention is based on the finding that a further reduction in the interference power, on the one hand, and at the same time an increase, or at least the preservation, of the coding gain can be achieved if at least several coarser quantizer step sizes are tried even when the interference introduced is greater than a threshold, rather than performing a finer quantization as in the prior art. It has been found that reductions in the interference introduced by the quantization can be achieved even with coarser quantizer step sizes, namely in those cases where the coarser quantizer step size "hits" the value to be quantized better than the finer quantizer step size does. This effect is based on the fact that the quantization error depends not only on the quantizer step size but naturally also on the values to be quantized. If the values to be quantized lie close to the quantization levels of the coarser quantizer step size, a reduction in quantization noise is achieved while the compression gain increases (since the quantization is coarser). The inventive concept is particularly beneficial when very well estimated quantizer step sizes are already available for the first quantizer step size, on the basis of which the threshold comparison is performed.
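The "hitting" effect can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a power-law quantizer with exponent 0.75 and no additive constant, handles signs separately, and measures interference as the sum of squared differences. The two sample values are hypothetical and were chosen to lie exactly on the reconstruction levels of the coarser step size.

```python
import math

ALPHA = 0.75  # exponent of the non-linear quantizer (assumed, as in AAC)


def nint(v):
    """Nearest integer, with halves rounded up."""
    return math.floor(v + 0.5)


def quantize(x, q, alpha=ALPHA):
    """Quantize magnitudes with step size q; the sign is carried separately."""
    return [int(math.copysign(nint((abs(v) / q) ** alpha), v)) for v in x]


def requantize(y, q, alpha=ALPHA):
    """Map quantizer indices back to spectral values."""
    return [math.copysign(abs(n) ** (1 / alpha) * q, n) for n in y]


def noise_energy(x, q):
    """Sum of squared differences between original and requantized values."""
    xr = requantize(quantize(x, q), q)
    return sum((a - b) ** 2 for a, b in zip(x, xr))


# Values that sit on the reconstruction levels of the coarser step q = 3.
x = [3.0, 3.0 * 2 ** (4 / 3)]
fine = noise_energy(x, 2.0)    # finer step, but the values fall between levels
coarse = noise_energy(x, 3.0)  # coarser step that "hits" both values
```

Here the coarser step size yields practically no interference while the finer one does not, which is the situation the invention exploits.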
In a preferred embodiment of the present invention, it is therefore preferred to determine the first quantizer step size by a direct calculation based on the average noise energy rather than on a worst-case scenario. Thus, the iteration loops of the prior art can either be shortened considerably or become completely obsolete.
In the inventive post-processing, the quantizer step size is then tried once more with an even coarser quantizer step size, in order to benefit from the described effect of "better hitting" a value to be quantized. If it then turns out that the interference obtained with the coarser quantizer step size is smaller than the previous interference, or even smaller than the threshold, further iterations can be made to try an even coarser quantizer step size. This coarsening of the quantizer step size is continued until the interference introduced increases again. A termination criterion is then reached, so that quantization is performed with the stored quantizer step size that provided the lowest interference introduced, and the coding process is continued as required. In an alternative embodiment of the present invention, the first quantizer step size can be estimated by an analysis-by-synthesis approach as in the prior art, which is continued until a termination criterion is reached. The inventive post-processing may then be employed to finally verify whether equally good or even better interference results can be achieved with a coarser quantizer step size. If the coarser quantizer step size turns out to be equally good or even better with respect to the interference introduced, it is used for quantization. If, however, the coarser quantization does not provide a positive effect, the quantizer step size originally determined, for example by means of an analysis-by-synthesis method, is used for the final quantization. According to the invention, any quantizer step size can thus be used to perform a first threshold comparison.
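The post-processing described above amounts to a short search loop. The following is a sketch under assumptions rather than the patented implementation: the function name, the coarsening factor of 2^(1/4) per step and the `max_steps` guard are hypothetical, and `noise_energy` stands for any caller-supplied routine that quantizes, requantizes and measures the interference for a given step size.

```python
def coarsen_step_size(x, q_first, thr, noise_energy,
                      factor=2 ** 0.25, max_steps=16):
    """Try successively coarser step sizes and keep the best one.

    Returns (step size, interference) for the step size that introduced
    the lowest interference before the interference started to rise.
    """
    best_q = q_first
    best_noise = prev_noise = noise_energy(x, q_first)
    if best_noise <= thr:
        return best_q, best_noise      # threshold already met, nothing to do
    q = q_first
    for _ in range(max_steps):         # guard against endless coarsening
        q *= factor                    # next coarser step size
        noise = noise_energy(x, q)
        if noise > prev_noise:         # interference rises again: terminate
            break
        if noise < best_noise:         # remember the best step size so far
            best_q, best_noise = q, noise
        prev_noise = noise
    return best_q, best_noise
```

If no coarser step size improves on the first one, the function returns the first step size, which corresponds to falling back to the originally determined value.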
It is irrelevant whether this first quantizer step size has been determined by analysis-by-synthesis schemes or by a direct calculation of the quantizer step sizes. In a preferred embodiment of the present invention, this concept is used to quantize an audio signal present in the frequency domain. However, the concept can also be used to quantize a time-domain signal comprising audio and/or video information. In addition, it will be noted that the threshold used for the comparison may be the psychoacoustically or psycho-optically allowed interference, or any other threshold that one wishes to fall below. For example, this threshold may actually be an allowed interference provided by a psychoacoustic model. However, the threshold may also be the previously determined interference introduced by the original quantizer step size, or any other threshold. It should be noted that the quantized values need not necessarily be Huffman coded; they can alternatively be encoded using another entropy coding, such as arithmetic coding. Alternatively, the quantized values can also be encoded in a binary manner, since this coding likewise has the effect that fewer bits are required to transmit smaller values, or values equal to zero, than to transmit larger values or, generally, non-zero values. To determine the starting values, i.e. the first quantizer step size, the iterative approach may preferably be dispensed with completely, or at least to a large extent, if the quantizer step size is determined from a direct noise energy estimate. Calculating the quantizer step size from an accurate noise estimate is considerably faster than calculating it in an analysis-by-synthesis loop, since the values for the calculation are directly available. It is not necessary to first perform and compare several quantization attempts until a quantizer step size favorable for coding is found.
However, since the characteristic curve of the quantizer used is non-linear, this non-linearity must be taken into account when estimating the noise energy. The simple noise energy estimate for a linear quantizer can no longer be used, since it is not accurate enough. According to the invention, a quantizer with the following quantization characteristic is used:

$$ y_i = \mathrm{round}\left( \left( \frac{x_i}{q} \right)^{\alpha} + s \right) \qquad (1) $$

In the above equation, the $x_i$ are the spectral values to be quantized. The output values are denoted $y_i$, i.e. the $y_i$ are the quantized spectral values, and $q$ is the quantizer step size. "round" is the rounding function, preferably the nint function, where "nint" stands for "nearest integer". The exponent that makes the quantizer non-linear is denoted $\alpha$, where $\alpha$ is different from 1. Typically, $\alpha$ is smaller than 1, so that the quantizer has a compressing characteristic. With Layer 3 and with AAC, the exponent $\alpha$ equals 0.75. The parameter $s$ is an additive constant which may have any value, including zero. According to the invention, the following relation is used to calculate the quantizer step size:

[equation illegible in the source]

For $\alpha$ equal to 3/4, the following equation results:

[equation illegible in the source]

In these equations, the term on the left-hand side is the allowed interference THR in a frequency band, provided by a psychoacoustic module for a scale factor band with the frequency lines from $i = i_1$ to $i = i_2$. The above equation enables an almost exact estimate of the interference introduced by a quantizer step size $q$ for a non-linear quantizer having the above characteristic with an exponent $\alpha$ different from 1, the nint function in the quantizer equation performing the actual quantization step, namely rounding to the nearest integer.
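The relation between THR and the step size is not legible in this text, so the sketch below does not reproduce it. Instead it derives an estimate from the quantizer characteristic under a standard assumption: the rounding error is uniformly distributed with variance 1/12, and it is mapped back through the inverse characteristic by a first-order approximation. This gives an expected noise energy of q^(2α) · |x_i|^(2(1-α)) / (12 α²) per line, which for α = 3/4 becomes q^(3/2) · sqrt(|x_i|) / 6.75. All function names are hypothetical.

```python
def estimated_noise(x, q, alpha=0.75):
    """Expected noise energy of a band for step size q (derived estimate)."""
    form = sum(abs(v) ** (2 * (1 - alpha)) for v in x)
    return q ** (2 * alpha) * form / (12 * alpha ** 2)


def step_size_for_threshold(x, thr, alpha=0.75):
    """Step size whose estimated noise energy equals the threshold thr."""
    form = sum(abs(v) ** (2 * (1 - alpha)) for v in x)
    if form == 0.0:
        raise ValueError("band contains no non-zero spectral value")
    return (12 * alpha ** 2 * thr / form) ** (1 / (2 * alpha))


# Hypothetical band of three spectral lines and an allowed interference.
band = [10.0, -20.0, 30.0]
q = step_size_for_threshold(band, thr=2.0)
```

Because the step size follows in closed form from the threshold and the band contents, no trial quantizations are needed to obtain it, which is the speed advantage the description attributes to the direct calculation.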
It should be noted that, instead of the nint function, any desired rounding function can be used, for example rounding to the next even or next odd integer, or rounding to the next multiple of 10, etc. Generally speaking, the rounding function maps a set of values having a specific number of allowed values onto a smaller set of values having a second, lower specific number of values. In a preferred embodiment of the present invention, the spectral values to be quantized have previously been subjected to TNS processing and, if applicable, for example with stereo signals, to mid/side coding, provided the channels were such that the mid/side encoder was activated. Thus, the scale factor for each scale factor band can be indicated directly and fed into a respective audio encoder, the relation between the quantizer step size and the scale factor being given by the following equation:

$$ q = 2^{\frac{1}{4}\,\mathrm{scf}} $$

The scale factor results from the following equation:

[equation illegible in the source; the surviving fragment names the quantities scf and FFAC]

In a preferred embodiment of the present invention, a post-processing iteration based on an analysis-by-synthesis principle can also be used in order to slightly vary the quantizer step size, which has been calculated directly without iteration, for each scale factor band, so as to reach the actual optimum. Compared to the prior art, however, the already very accurate calculation of the starting values enables a very short iteration, and it has been found that in most cases the downstream iteration can be dispensed with completely.
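The link between scale factor and step size can be sketched in a few lines. It relies only on the relation q = 2^(scf/4), which the description states for block 14 of Figure 1. The function names are hypothetical, and any offset or integer rounding applied by a real encoder is deliberately left out.

```python
import math


def step_size_from_scf(scf):
    """Quantizer step size q for a scale factor scf, using q = 2**(scf/4)."""
    return 2.0 ** (scf / 4.0)


def scf_from_step_size(q):
    """Scale factor for a step size q, the inverse of the relation above."""
    return 4.0 * math.log2(q)


# Raising the scale factor by one step coarsens q by a factor of 2**0.25.
ratio = step_size_from_scf(5) / step_size_from_scf(4)
```

One scale factor step therefore changes the step size by roughly 19 percent, which is the granularity at which step sizes can be varied from band to band.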
The preferred concept of calculating the step size from the average noise energy thus provides a good and realistic estimate since, unlike the prior art, it does not operate with a worst-case scenario but uses the expected value of the quantization error as a basis. With subjectively equivalent quality, it thus enables a more efficient coding of the data with a considerably reduced bit count. In addition, a considerably faster encoder can be achieved, because the iteration can be omitted completely and/or the number of iteration steps can be clearly reduced. This is notable in particular because the iteration loops in the prior-art encoder have been decisive for the overall time requirement of the encoder. Thus, even a reduction by one or a few iteration steps leads to a considerable saving in overall encoder time.

Preferred embodiments of the present invention will be explained below in detail with reference to the accompanying figures, in which:
Figure 1 is a block diagram of an apparatus for determining a quantized audio signal;
Figure 2 is a flowchart representing the post-processing according to a preferred embodiment of the present invention;
Figure 3 is a block diagram of a prior-art encoder in accordance with the AAC standard;
Figure 4 is a representation of the reduction of the quantization interference by a coarser quantizer step size; and
Figure 5 is a block diagram of the apparatus of the invention for determining a quantizer step size for quantizing a signal.
The concept of the invention will be presented below with reference to Figure 5. Figure 5 shows a schematic representation of an apparatus for determining a quantizer step size for quantizing a signal which comprises audio or video information and is provided via a signal input 500. The signal is supplied to a means 502 for providing a first quantizer step size (QSS) and for providing an interference threshold, which will also be referred to below as the interference that can be introduced. It will be noted that the interference threshold can be any threshold. Preferably, however, it is an interference that can be introduced psychoacoustically or psycho-optically, the threshold being selected such that a signal into which the interference has been introduced is still perceived as undisturbed by human listeners or observers.
The threshold (THR) and the first quantizer step size are supplied to a means 504 for determining the first interference actually introduced by the first quantizer step size. The interference actually introduced is preferably determined by quantizing with the first quantizer step size, re-quantizing with the first quantizer step size, and calculating the distance between the original signal and the re-quantized signal. When spectral values are processed, the corresponding spectral values of the original signal and of the re-quantized signal are preferably squared, and the difference of the squares is then determined. Alternative methods of determining the distance can be employed. The means 504 provides a value for the first interference actually introduced by the first quantizer step size. This first interference is supplied, together with the threshold THR, to a means 506 for comparing. The means 506 compares the threshold THR with the first interference actually introduced. If the first interference actually introduced is greater than the threshold, the means 506 activates a means 508 for selecting a second quantizer step size, the means 508 being configured to select a second quantizer step size that is coarser, i.e. larger, than the first quantizer step size. The second quantizer step size selected by the means 508 is supplied to a means 510 for determining the second interference actually introduced. To this end, the means 510 receives the original signal and the second quantizer step size and again performs a quantization using the second quantizer step size, a re-quantization using the second quantizer step size, and a calculation of the distance between the re-quantized signal and the original signal, in order to provide a means 512 for comparing with a measure of the second interference actually introduced. The means 512 compares the second interference actually introduced with the first interference actually introduced or with the threshold THR. If the second interference actually introduced is smaller than the first interference actually introduced, or even smaller than the threshold THR, the second quantizer step size is used to quantize the signal.

It will be noted that the concept illustrated in Figure 5 is only schematic. It is not absolutely necessary to provide separate comparison means for the comparisons in blocks 506 and 512; a single comparison means which is controlled accordingly can be provided instead. The same applies to the means 504 and 510 for determining the interferences actually introduced, which likewise need not be configured as separate means. Furthermore, the means for quantizing need not be configured as a means separate from the means 510. To be precise, the signals quantized with the second quantizer step size are typically already generated in the means 510 when the means 510 performs a quantization and re-quantization to determine the interference actually introduced. The quantized values obtained there can be stored and output as the quantized signal when the means 512 for comparing provides a positive result, so that the means 514 for quantizing "merges", as it were, with the means 510 for determining the second interference actually introduced. In a preferred embodiment of the present invention, the threshold THR is the psychoacoustically determined maximum interference that can be introduced, the signal being an audio signal in this case.
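The flow through the means 502 to 514 of Figure 5 can be sketched as a single function. This is a hedged illustration, not the claimed apparatus: the power-law quantizer with exponent 0.75, the separate sign handling, the sum of squared differences as the distance measure, the coarsening factor and the fallback to the first step size are all assumptions, and every name is hypothetical.

```python
import math

ALPHA = 0.75  # assumed quantizer exponent


def quantize(x, q):
    """Quantize magnitudes with step size q; the sign is carried separately."""
    return [int(math.copysign(math.floor((abs(v) / q) ** ALPHA + 0.5), v))
            for v in x]


def requantize(y, q):
    """Map quantizer indices back to spectral values."""
    return [math.copysign(abs(n) ** (1 / ALPHA) * q, n) for n in y]


def interference(x, q):
    """Means 504 / 510: quantize, requantize, measure the distance."""
    xr = requantize(quantize(x, q), q)
    return sum((a - b) ** 2 for a, b in zip(x, xr))


def choose_step_size(x, q1, thr, coarsen=2 ** 0.25):
    """Return the step size to use for quantizing the band x."""
    n1 = interference(x, q1)           # means 504: first interference
    if n1 <= thr:                      # means 506: threshold comparison
        return q1
    q2 = q1 * coarsen                  # means 508: coarser second step size
    n2 = interference(x, q2)           # means 510: second interference
    if n2 < n1 or n2 < thr:            # means 512: second comparison
        return q2                      # means 514 then quantizes with q2
    return q1                          # assumed fallback: keep the first size
```

As the description notes for the means 510 and 514, a practical implementation would keep the quantized values computed inside `interference` rather than quantizing a second time once the step size has been chosen.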
The threshold THR is provided here by a psychoacoustic model which operates in a conventional manner and provides, for each scale factor band, an estimated maximum quantization interference that can be introduced into this scale factor band. The maximum interference that can be introduced is based on the masking threshold in that it is either identical with the masking threshold or derived from it, in the sense that, for example, coding is performed with a safety margin such that the interference that can be introduced is smaller than the masking threshold, or that rather aggressive coding is performed for the purpose of reducing the bit rate, such that the allowed interference exceeds the masking threshold. A preferred way of implementing the means 502 for providing the first quantizer step size will be presented below with reference to Figure 1. In this respect, the functionalities of the means 50 of Figure 2 and of the means 502 of Figure 5 are the same. Preferably, the means 502 is configured to have the functionalities of the means 10 and the means 12 of Figure 1. In addition, the quantizer 514 in Figure 5 is configured to be identical with the quantizer 14 in Figure 1 in this example.
In addition, the left branch in Figure 1, which illustrates the inventive concept, is extended in that, if the interference introduced exceeds the threshold and coarsening the quantizer step size does not provide any effect, and if the bit rate requirements are not particularly stringent and/or there is still some room in the bit reservoir, an iteration is performed using a smaller, i.e. finer, quantizer step size. Finally, the effect on which the present invention is based will be presented below with reference to Figure 4, namely the effect that, despite a coarser quantizer step size, a reduced quantization noise and, associated therewith, an increase in compression gain can be obtained. Figure 1 shows an apparatus for determining a quantized audio signal which is supplied as a spectral representation in the form of spectral values. It will be noted, in particular, that if, with reference to Figure 3, neither TNS processing nor mid/side coding has been performed, the spectral values are directly the output values of the filter bank. If only TNS processing, but no mid/side coding, is performed, the spectral values fed into the quantizer 1014 are spectral residual values as produced by the TNS prediction filtering. If TNS processing including mid/side coding is employed, the spectral values fed into the apparatus of the invention are spectral values of a mid channel or spectral values of a side channel. To begin with, the present invention includes a means for providing an allowed interference, indicated by 10 in Figure 1.
The psychoacoustic model 1020 shown in Figure 3, which is typically configured to provide an allowed interference or threshold, also referred to as THR, for each scale factor band, i.e. for a group of several spectrally adjacent spectral values, can serve as the means for providing an allowed interference. The allowed interference is based on the psychoacoustic masking threshold and indicates the amount of energy that can be introduced into an original audio signal without the interference energy being perceived by the human ear. In other words, the allowed interference is the artificially introduced signal portion (introduced by quantization) which is masked by the actual audio signal. The means 10 is depicted as calculating the allowed interference THR for a frequency band, preferably a scale factor band, and supplying it to a downstream means 12. The means 12 serves to calculate a piece of quantizer step size information for the frequency band for which the allowed interference THR has been indicated. The means 12 is configured to supply the quantizer step size information $q$ to a downstream means 14 for quantizing. The means 14 for quantizing operates in accordance with the quantization specification shown in block 14, the quantizer step size information being used, in the case shown in Figure 1, to first divide a spectral value $x_i$ by the value of $q$, then raise the result to the exponent $\alpha$, which is not equal to 1, and then add an additive constant $s$, as the case may be. Subsequently, this result is supplied to a rounding function which, in the embodiment shown in Figure 1, selects the next integer. Depending on the definition, the integer can be generated by truncating the digits after the decimal point, i.e. by "always rounding down". Alternatively, the next integer can be generated by rounding down up to 0.499 and rounding up from 0.5.
In another alternative, the next integer can be determined by "always rounding up", depending on the individual embodiment. Instead of the nint function, however, any other rounding function can be used which, generally speaking, maps a value to be rounded from a first, larger set of values onto a second, smaller set of values. The quantized spectral value in the frequency band will then be present at the output of means 14. As can be seen from the equation illustrated in block 14, means 14 is naturally also supplied, in addition to the quantizer step size q, with the spectral value to be quantized in the frequency band under consideration. It will be noted that the means 12 does not necessarily need to calculate the quantizer step size q directly; as alternative quantizer step size information, the scale factor as used in prior-art transform-based audio encoders can also be calculated. The scale factor is linked to the actual quantizer step size by the relationship illustrated to the right of block 12 in Figure 1. If the means for calculating is configured to calculate the scale factor scf as the quantizer step size information, this scale factor is supplied to the means 14 for quantizing, which then uses, in block 14, the value $2^{scf/4}$ instead of the value q for the quantization. A derivation of the form given in block 12 is given below.
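The quantization rule of block 14 and the relationship between scale factor and step size can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the defaults `alpha=0.75` and `s=0.0` are chosen only for the example, the rounding is "nearest integer, halves up", and the sign of the spectral value is restored after quantizing its magnitude.

```python
import math


def step_size_from_scf(scf: float) -> float:
    """Step size q = 2**(scf / 4), the relation shown next to block 12."""
    return 2.0 ** (scf / 4.0)


def quantize(x: float, q: float, alpha: float = 0.75, s: float = 0.0) -> int:
    """Block 14: divide |x| by q, raise to alpha, add s, round to an integer.

    alpha and s are illustrative defaults (assumptions), not values fixed
    by the text for every embodiment.
    """
    compressed = (abs(x) / q) ** alpha + s
    level = math.floor(compressed + 0.5)  # nearest integer, halves rounded up
    return -level if x < 0 else level     # sign handling is an assumption
```

Swapping `math.floor(compressed + 0.5)` for `math.floor(compressed)` or `math.ceil(compressed)` gives the "always rounding down" and "always rounding up" variants mentioned in the text.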
As noted, the power-law quantizer illustrated in block 14 obeys the following relationship:

$$ y_i = \operatorname{round}\left[\left(\frac{|x_i|}{q}\right)^{\alpha} + s\right] $$

The inverse operation is as follows:

$$ x_i' = y_i^{1/\alpha} \cdot q $$

This equation represents the operation required for requantization, in which $y_i$ is a quantized spectral value and $x_i'$ is a requantized spectral value. Again, q is the quantizer step size, which is associated with the scale factor by the relationship shown in Figure 1 to the right of block 12. As expected, the case in which α is equal to 1 is consistent with this equation.
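The expected noise of a single spectral line can be derived from the quantization rule under a standard high-resolution assumption, sketched here for orientation. The assumption, which is not stated in the text, is that rounding in the compressed domain $u_i = (|x_i|/q)^{\alpha}$ produces an error uniformly distributed over one unit step, with variance 1/12, and that this error is small enough to be mapped back through the local slope of the expansion. $\Delta x_i$ denotes the difference between the original and the requantized spectral value.

```latex
\begin{aligned}
u_i &= \left(\frac{|x_i|}{q}\right)^{\alpha}, \qquad |x_i| = q\,u_i^{1/\alpha} \\
\frac{d|x_i|}{du_i} &= \frac{q}{\alpha}\,u_i^{1/\alpha - 1} = \frac{q^{\alpha}}{\alpha}\,|x_i|^{1-\alpha} \\
\left\langle |\Delta x_i|^2 \right\rangle &\approx \left(\frac{d|x_i|}{du_i}\right)^{2}\cdot\frac{1}{12}
  = \frac{q^{2\alpha}}{12\,\alpha^{2}}\,|x_i|^{2(1-\alpha)}
\end{aligned}
```

For $\alpha = 1$ this reduces to the familiar uniform-quantizer value $q^2/12$.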
If the above equation is summed over a vector of the spectral values, the total noise power in a band determined by the index i is given as follows, where $\Delta x_i$ denotes the quantization error of the spectral value $x_i$:

$$ \left\langle \sum_i |\Delta x_i|^2 \right\rangle \approx \frac{q^{2\alpha}}{12\,\alpha^{2}} \sum_i |x_i|^{2(1-\alpha)} $$

In summary, the expected value of the quantization noise of a vector is determined by the quantizer step size q and the so-called form factor, which describes the distribution of the magnitudes of the components of the vector. The form factor, which is the term on the right in the above equation, depends on the actual input values and needs to be calculated only once, even if the above equation is evaluated for several different desired interference levels THR. As has already been pointed out, this equation simplifies as follows for α equal to 3/4:

$$ \left\langle \sum_i |\Delta x_i|^2 \right\rangle \approx \frac{q^{3/2}}{6.75} \sum_i |x_i|^{1/2} $$

The left side of this equation is thus an estimate of the quantization noise energy which, in the borderline case, equals the allowed noise energy (threshold). Thus, the following approach is made:

$$ THR = \frac{q^{3/2}}{6.75} \sum_i |x_i|^{1/2} $$

The sum over the square roots of the frequency lines in the right part of the equation corresponds to a measure of the uniformity of the frequency lines and is preferably already known in the encoder as the form factor:

$$ FFAC = \sum_i |x_i|^{1/2} $$

Thus, the following results:

$$ q^{3/2} = \frac{6.75 \cdot THR}{FFAC} $$

Here q corresponds to the quantizer step size. With AAC, it is specified as:

$$ q = 2^{scf/4} $$

where scf is the scale factor. If this scale factor is to be determined, the equation can be solved as follows, based on the relationship between the step size and the scale factor:

$$ 2^{\frac{3}{8}\,scf} = \frac{6.75 \cdot THR}{FFAC} $$

$$ \Leftrightarrow \quad scf = \frac{8}{3}\,\log_2\!\left(\frac{6.75 \cdot THR}{FFAC}\right) $$

$$ \Leftrightarrow \quad scf = \frac{8}{3\,\log_{10} 2}\,\left[\log_{10}(6.75 \cdot THR) - \log_{10}(FFAC)\right] $$

$$ \Leftrightarrow \quad scf = 8.8585 \cdot \left[\log_{10}(6.75 \cdot THR) - \log_{10}(FFAC)\right] $$

The present invention thus provides a closed-form relationship between the scale factor scf and a scale factor band which has a specific form factor and for which a specific interference threshold THR, typically originating from the psychoacoustic model, is supplied.
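The closed-form estimate lends itself to a direct check in code. The following sketch computes the form factor and the scale factor for α = 3/4 and confirms that the resulting step size reproduces the threshold in the noise estimate. The function names are hypothetical, the scale factor is left as a real number (an encoder would presumably round it to an integer, which the text does not discuss here), and both THR and FFAC are assumed to be strictly positive.

```python
import math

# Constant 8 / (3 * log10(2)), approximately 8.8585 as in the derivation.
SCF_GAIN = 8.0 / (3.0 * math.log10(2.0))


def form_factor(spectrum):
    """FFAC: sum of the square roots of the spectral magnitudes in the band."""
    return sum(math.sqrt(abs(x)) for x in spectrum)


def estimate_scf(thr, ffac):
    """Closed-form scale factor estimate for alpha = 3/4 (no iteration)."""
    return SCF_GAIN * (math.log10(6.75 * thr) - math.log10(ffac))


def step_size_from_scf(scf):
    """Step size q = 2**(scf / 4)."""
    return 2.0 ** (scf / 4.0)


def estimated_noise(q, ffac):
    """Expected noise energy q**1.5 / 6.75 * FFAC for alpha = 3/4."""
    return q ** 1.5 / 6.75 * ffac
```

For example, with the band `[4.0, 9.0, 16.0]` the form factor is 9, and feeding `estimate_scf(1.0, 9.0)` through `step_size_from_scf` and `estimated_noise` returns the threshold 1.0 up to floating-point error.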
As already noted, calculating the step size from the mean noise energy provides a better estimate, since it is based on the expected value of the quantization error rather than on a worst-case scenario. Thus, the concept of the invention is suitable for determining the quantizer step size and/or, equivalently, the scale factor for a scale factor band without any iteration. However, the post-processing described below with reference to Figure 2 can also be performed if the calculation time requirements are not very strict. In a first step in Figure 2, the first quantizer step size is estimated (step 50). The estimation of the first quantizer step size (QSS) is performed using the procedure illustrated in Figure 1. Subsequently, a quantization using the first quantizer step size is performed in a step 52, preferably in accordance with the quantizer illustrated by block 14 in Figure 1. Subsequently, the values obtained with the first quantizer step size are requantized so as to then calculate the interference introduced. Thus, a check is made in a step 54 as to whether the interference introduced exceeds the predefined threshold. It will be noted that the quantizer step size q (or scf) calculated by the relationship represented in block 12 is an approximation. If the relationship given in block 12 of Figure 1 were truly exact, one would find in step 54 that the interference introduced corresponds exactly to the threshold. Because the relationship in block 12 of Figure 1 is an approximation, however, the interference introduced may exceed or fall below the threshold THR. It will also be noted that the deviation from the threshold will not be particularly large, although it is present nonetheless.
If one finds, in step 54, that the interference introduced using the first quantizer step size falls below the threshold, i.e. if the question in step 54 is answered in the negative, the right branch in Figure 2 is taken. If the interference introduced falls below the threshold, this means that the estimate in block 12 in Figure 1 was too pessimistic, so that in a step 56 a coarser quantizer step size is set as the second quantizer step size.
The degree to which the second quantizer step size is coarser than the first quantizer step size can be selected. However, it is preferred to take relatively small increments, since the estimate in step 50 will already be relatively accurate. Using the second, coarser (larger) quantizer step size, a quantization of the spectral values, a subsequent requantization, and a calculation of the second interference corresponding to the second quantizer step size are performed in a step 58. In a step 60, a check is then made as to whether the second interference, which corresponds to the second quantizer step size, still falls below the original threshold. If so, the second quantizer step size is stored (62), and a new iteration is started in order to set an even coarser quantizer step size in step 56. Then step 60 and, as the case may be, step 62 are performed again using the even coarser quantizer step size, so as to start yet another iteration. If one finds, during an iteration in step 60, that the second interference does not fall below the threshold, i.e. exceeds it, a termination criterion has been reached, and the quantization is then performed (64) using the quantizer step size that was stored last. Since the estimated first quantizer step size is already a relatively good value, the number of iterations will be small compared with poorly estimated starting values, which leads to significant savings in calculation time when encoding, since the iterations for calculating the quantizer step size take up the largest share of the encoder's calculation time. An inventive procedure that is used when the interference introduced actually exceeds the threshold is presented below with reference to the left branch in Figure 2.
Even though the interference introduced already exceeds the threshold, an even coarser second quantizer step size is set according to the invention (70), and quantization, requantization and calculation of the second noise interference corresponding to the second quantizer step size are then performed in a step 72. Next, a check is made in a step 74 as to whether the second noise interference now falls below the threshold. If so, the question in step 74 is answered with "yes", and the second quantizer step size is stored (76). If, however, one finds that the second noise interference exceeds the threshold, either a quantization is performed using the stored quantizer step size or, if no second quantizer step size has been stored, an iteration is run in which, as in the prior art, a finer second quantizer step size is selected to "push" the interference introduced below the threshold. There follows a discussion of why an improvement can still be achieved when an even coarser quantizer step size is used, particularly when the interference introduced exceeds the threshold. So far, one has always operated on the assumption that a finer quantizer step size leads to lower introduced quantization energy and that a coarser quantizer step size leads to higher introduced quantization interference. This may be true on average, but the opposite can be true, in particular for rather thinly populated scale factor bands and, in particular, when the quantizer has a non-linear characteristic curve. It has been found, according to the invention, that in a number of cases that is not to be underestimated, a coarser quantizer step size leads to a lower interference introduced.
This can be traced back to the fact that a coarser quantizer step size may hit a spectral value to be quantized better than a finer quantizer step size does, as will be shown using the following example with reference to Figure 4. As an example, Figure 4 shows a quantization characteristic curve (60) which provides four quantization levels 0, 1, 2, 3 when input signals between 0 and 1 are quantized. The quantized values correspond to 0.0, 0.25, 0.5, 0.75. For comparison, a different, coarser quantization characteristic curve (62) is drawn with dashed lines in Figure 4; it has only three quantization levels, which correspond to the absolute values 0.0, 0.33, 0.66. Thus, in the first case, i.e. with the quantizer characteristic curve 60, the quantizer step size is equal to 0.25, while in the second case, i.e. with the quantizer characteristic curve 62, the quantizer step size is equal to 0.33. The second quantizer characteristic curve (62) therefore has a coarser quantizer step size than the first quantizer characteristic curve (60), which represents a fine quantization characteristic curve. If the value x = 0.33, which is to be quantized, is considered, one can see from Figure 4 that the quantization error of the fine quantizer, which has four levels, is equal to the difference between 0.33 and 0.25, i.e. 0.08. In contrast, the error of the quantization using three levels is equal to zero, because a quantizer level "hits" the value to be quantized exactly. It can therefore be seen from Figure 4 that a coarser quantization can lead to a smaller quantization error than a fine quantization.
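The numbers of the Figure 4 example can be reproduced with a small uniform quantizer. The helper below is a hypothetical illustration for non-negative inputs that rounds to the nearest level; it is not taken from the patent.

```python
def uniform_quantize_error(x, step):
    """Return (level, absolute error) for a uniform quantizer, x >= 0."""
    level = int(x / step + 0.5)          # nearest quantization level
    return level, abs(x - level * step)  # error after requantization


# Fine quantizer (step 0.25) versus coarse quantizer (step 0.33) at x = 0.33.
fine_level, fine_error = uniform_quantize_error(0.33, 0.25)
coarse_level, coarse_error = uniform_quantize_error(0.33, 0.33)
```

Here `fine_error` comes out at about 0.08, while `coarse_error` is zero because the coarse level lands exactly on the input value.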
In addition, the coarser quantization is decisive for a lower output bit rate being required, since only three states, i.e. 0, 1, 2, are possible, unlike the case of the finer quantizer, in which four levels 0, 1, 2, 3 must be signaled. In addition, a coarser quantizer step size tends to "quantize away" more values to 0 than a finer quantizer step size, with which fewer values are quantized away to 0. Although, when several spectral values in a scale factor band are considered, "quantizing to 0" leads to an increase in the quantization error, this does not necessarily become problematic, since the coarser quantizer step size may hit other, more important spectral values more accurately, so that this quantization error is cancelled out and even over-compensated by the coarser quantization of the other spectral values, while a lower bit rate results at the same time. In other words, the encoder result obtained is "better" overall, since the inventive concept achieves a smaller number of states to be signaled and, at the same time, an improved "hitting" of the quantization levels. According to the invention, as shown in the left branch of Figure 2, an even coarser quantizer step size is attempted, starting from the estimated value (step 50 in Figure 2), when the interference introduced exceeds the threshold, so as to benefit from the effect illustrated in Figure 4.
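The left branch of Figure 2 can be sketched in the same style. The following is one possible reading under explicit assumptions, not the patented procedure itself: all function names are hypothetical, the number of coarser and finer attempts and the multiplicative `factor` are invented for the example, and a coarser step size is accepted as soon as its interference falls below the threshold. If no coarser candidate succeeds, the sketch falls back to finer step sizes, as in the prior art, and otherwise returns the candidate with the lowest interference seen.

```python
import math


def quantize(x, q, alpha=0.75, s=0.0):
    """Power-law quantizer with nearest-integer rounding."""
    level = math.floor((abs(x) / q) ** alpha + s + 0.5)
    return -level if x < 0 else level


def requantize(y, q, alpha=0.75):
    """Inverse operation: |y| ** (1 / alpha) * q, with the sign restored."""
    value = (abs(y) ** (1.0 / alpha)) * q
    return -value if y < 0 else value


def interference(spectrum, q, alpha=0.75):
    """Squared error between original and requantized spectral values."""
    return sum((x - requantize(quantize(x, q, alpha), q, alpha)) ** 2
               for x in spectrum)


def choose_step_size(spectrum, q_first, thr, alpha=0.75,
                     factor=2 ** 0.25, n_coarser=4, n_finer=16):
    """Try coarser step sizes first, then finer ones (left branch sketch)."""
    best_q, best_err = q_first, interference(spectrum, q_first, alpha)
    if best_err <= thr:
        return q_first                    # not the left-branch case
    for direction, attempts in ((factor, n_coarser), (1.0 / factor, n_finer)):
        q = q_first
        for _ in range(attempts):
            q *= direction                # step 70 (coarser) or finer fallback
            err = interference(spectrum, q, alpha)   # step 72
            if err < thr:                 # step 74
                return q                  # step 76: store and use this size
            if err < best_err:
                best_q, best_err = q, err
    return best_q                         # lowest interference found
```

With the Figure 4 numbers (`spectrum=[0.33]`, `q_first=0.25`, `alpha=1.0`, `factor=1.32`), the first coarser candidate is about 0.33 and is accepted immediately, even though the first step size exceeded the threshold.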
Furthermore, it is believed that this effect is even more significant with non-linear quantizers than in the case, drawn in Figure 4, of two linear quantizer characteristic curves. The presented concept of quantizer step size post-processing and/or scale factor post-processing thus serves to improve the result of the scale factor estimator. Starting from the quantizer step size determined in the scale factor estimator (50 in Figure 2), new quantizer step sizes which are as large as possible and for which the error energy falls below the predefined threshold value are determined in the analysis-by-synthesis stage. To this end, the spectrum is quantized with the calculated quantizer step sizes, and the energy of the error signal, i.e. preferably the sum of squares of the difference between the original and the quantized spectral values, is determined. Alternatively, a corresponding time signal may also be used for error determination, although the use of spectral values is preferred. The quantizer step size and the error signal are stored as the best result obtained so far. If the calculated interference exceeds the threshold value, the following approach is adopted: the scale factor is varied within a predefined range around the originally calculated value, also making use, in particular, of coarser quantizer step sizes (70). For each new scale factor, the spectrum is quantized again, and the energy of the error signal is calculated.
If the energy of the error signal is smaller than the smallest value calculated so far, the current quantizer step size is stored, together with the energy of the associated error signal, as the best result obtained so far. According to the invention, not only relatively small but also relatively large scale factors are taken into account here, in order to benefit from the concept described with reference to Figure 4, particularly when the quantizer is a non-linear quantizer. If, however, the calculated interference falls below the threshold value, i.e. if the estimate in step 50 was too pessimistic, the scale factor is likewise varied within a predefined range around the originally calculated value. For each new scale factor, the spectrum is quantized again, and the energy of the error signal is calculated.
If the energy of the error signal is smaller than the smallest value calculated so far, the current quantizer step size is stored, together with the energy of the associated error signal, as the best result obtained so far. However, only relatively coarse scale factors are taken into account here, in order to reduce the number of bits required for coding the audio spectrum. Depending on the circumstances, the method of the invention can be implemented in hardware or in software. The implementation can be on a digital storage medium, in particular a disk or a CD, with electronically readable control signals which can cooperate with a programmable computer system such that the method is executed. Generally, the invention thus also consists of a computer program product having program code, stored on a machine-readable carrier, for performing the method when the computer program product runs on a computer. In other words, the invention can be realized as a computer program having program code for carrying out the method when the computer program runs on a computer.

Claims (10)

  1. An apparatus for determining a quantizer step size for quantizing a signal comprising audio or video information, the apparatus comprising: a means for providing a first quantizer step size and an interference threshold; a means for determining a first interference introduced by the first quantizer step size; a means for comparing the interference introduced by the first quantizer step size with the interference threshold; a means for selecting a second quantizer step size, which is larger than the first quantizer step size, if the first interference introduced exceeds the interference threshold; a means for determining a second interference introduced by the second quantizer step size; a means for comparing the second interference introduced with the interference threshold or the first interference introduced; and a means for quantizing the signal with the second quantizer step size if the second interference introduced is less than the first interference introduced or is less than the interference threshold.
  2. The apparatus as claimed in claim 1, in which the signal is an audio signal and comprises spectral values of a spectral representation of the audio signal, and in which the means for providing is configured as a psychoacoustic model which calculates an allowed interference for a frequency band on the basis of a psychoacoustic masking threshold.
  3. The apparatus as claimed in claim 1 or 2, in which the means for determining the first interference introduced, or the means for determining the second interference introduced, is configured to quantize using a quantizer step size, to requantize using the quantizer step size, and to calculate a distance between the requantized signal and the signal in order to obtain the interference introduced.
  4. The apparatus as claimed in any of the previous claims, in which the means for providing the first quantizer step size is configured to calculate the quantizer step size according to the following equation: $\left\langle \sum_i |\Delta x_i|^2 \right\rangle \approx \frac{q^{2\alpha}}{12\,\alpha^{2}} \sum_i |x_i|^{2(1-\alpha)}$, and in which the means for quantizing is configured to quantize according to the following equation: $y_i = \operatorname{round}\left[\left(\frac{|x_i|}{q}\right)^{\alpha} + s\right]$, in which $x_i$ is a spectral value to be quantized, q represents the quantizer step size information, s is a number different from or equal to zero, α is an exponent other than 1, round is a rounding function that maps a value from a first, larger range of values to a value within a second, smaller range of values, $\left\langle \sum_i |\Delta x_i|^2 \right\rangle$ (THR) is the allowed interference, and i is a running index for spectral values in the frequency band.
  5. The apparatus as claimed in any of the previous claims, in which the means for selecting is further configured to select a quantizer step size when the interference introduced is less than the allowed interference.
  6. The apparatus as claimed in any of the previous claims, in which the means for providing is configured to supply the first quantizer step size as a result of an analysis-by-synthesis determination.
  7. The apparatus as claimed in any of the previous claims, in which the means for selecting is configured to alter a quantizer step size for one frequency band independently of a quantizer step size for another frequency band.
  8. The apparatus as claimed in any of the previous claims, in which the means for providing is configured to determine the first quantizer step size as a result of a preceding iteration step with a coarsening of the quantizer step size, and in which the interference threshold is an interference introduced in the preceding iteration step for determining the first quantizer step size.
  9. A method for determining a quantizer step size for quantizing a signal comprising audio or video information, the method comprising: providing a first quantizer step size and an interference threshold; determining a first interference introduced by the first quantizer step size; comparing the interference introduced by the first quantizer step size with the interference threshold; selecting a second quantizer step size, which is larger than the first quantizer step size, if the first interference introduced exceeds the interference threshold; determining a second interference introduced by the second quantizer step size; comparing the second interference introduced with the interference threshold or the first interference introduced; and quantizing the signal with the second quantizer step size if the second interference introduced is less than the first interference introduced or is less than the interference threshold.
  10. A computer program having program code for carrying out the method as claimed in claim 9 when the computer program runs on a computer.
MXPA/A/2006/009932A 2004-03-01 2006-08-31 Device and method for determining a quantiser step size MXPA06009932A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
DE102004009955.3 2004-03-01

Publications (1)

Publication Number Publication Date
MXPA06009932A (en) 2007-04-10
