MXPA06009146A - Audio coding - Google Patents

Audio coding

Info

Publication number
MXPA06009146A
Authority
MX
Mexico
Prior art keywords
audio
parameterization
noise power
version
block
Prior art date
Application number
MXPA/A/2006/009146A
Other languages
Spanish (es)
Inventor
Schuller Gerald
Wabnik Stefan
Gayer Marc
Original Assignee
Fraunhofergesellschaft Zur Foerderung Der Angewandten Forschung EV
Priority date
Filing date
Publication date
Application filed by Fraunhofergesellschaft Zur Foerderung Der Angewandten Forschung EV

Abstract

The idea of the invention is to dispense with the previous procedure, namely interpolating the filter coefficients and the amplification value in order to obtain interpolated values for the intermediate audio values starting from the nodes. Coding with fewer audible artifacts can be obtained by not interpolating the amplification value, but instead taking, for each node, i.e. for each parameterization to be transmitted, the power limit derived from the masking threshold, preferably as the area below the square of the magnitude of the masking threshold, and then interpolating between the power limits of adjacent nodes, e.g. by linear interpolation. An amplification value can then be calculated from the interpolated power limit at both the encoder end and the decoder end in such a way that the quantization noise, which is caused by the quantization and is constant over frequency before post-filtering at the decoder end, lies below the power limit or corresponds to it after post-filtering.

Description

AUDIO CODING DESCRIPTION

The present invention relates to audio coding in general, and in particular to audio coding that allows audio signals to be encoded with a short delay time. The best-known audio compression method at present is MPEG-1 Layer III (MP3). With this compression method, the sample or audio values of an audio signal are encoded into a coded signal in a lossy manner. Put differently, the irrelevance and redundancy of the original audio signal are reduced or, ideally, removed when it is compressed. To achieve this, simultaneous and temporal masking are recognized by a psycho-acoustic model, that is, a time-varying masking threshold depending on the audio signal is calculated, indicating the volume from which tones of a certain frequency become perceptible to the human ear. This information, in turn, is used to encode the signal by quantizing the spectral values of the audio signal more or less precisely, or not at all, depending on the masking threshold, and integrating them into the coded signal.
Audio compression methods such as the MP3 format reach a limit in their applicability when audio data is to be transferred over a bit-rate-limited transmission channel in a compressed manner, on the one hand, but with as small a delay time as possible, on the other. In some applications the delay time plays no role, for example when archiving audio information. Audio encoders with a small delay, sometimes referred to as "ultra-low delay encoders", are however necessary when time-critical audio signals are to be transmitted, for example in teleconferencing or in wireless loudspeakers or microphones. For these fields of application, the article by Schuller G. et al., "Perceptual Audio Coding Using Adaptive Pre- and Post-Filters and Lossless Compression", IEEE Transactions on Speech and Audio Processing, Vol. 10, No. 6, September 2002, pages 379-390, suggests an audio coding in which the irrelevance reduction and the redundancy reduction are not performed on the basis of a single transform, but on the basis of two separate transforms. The principle will be discussed subsequently with reference to Figures 12 and 13. The coding starts from an audio signal 902 which has already been sampled and is thus present as a sequence 904 of audio or sample values 906, the time order of the audio values 906 being indicated by the arrow 908. A listening threshold is calculated by means of a psycho-acoustic model for successive blocks of audio values 906, which are labeled "block #" with ascending numbering. Figure 13, for example, shows a diagram in which, plotted over the frequency f, graph a shows the spectrum of a signal block of 128 audio values 906 and graph b shows the masking threshold as calculated by a psycho-acoustic model, both in logarithmic units. 
The masking threshold indicates, as already mentioned, up to which intensity frequencies remain inaudible to the human ear, namely all tones below the masking threshold b. Based on these listening thresholds, calculated for each block, an irrelevance reduction is achieved by controlling a parameterizable filter, the parameterization of which is calculated so that its frequency response corresponds to the inverse of the magnitude of the masking threshold. This parameterization is indicated in Figure 12 by x#(i), where # stands for the block number. After filtering the audio values 906, quantization with a constant step size takes place, for example a rounding operation to the nearest integer. The quantization noise caused by this is white noise. On the decoder side, the filtered signal is "retransformed" by a parameterizable filter, the transfer function of which is set to the magnitude of the masking threshold itself. Not only is the filtered signal decoded again by this, but the quantization noise on the decoder side is also shaped to the form of the masking threshold. In order for the quantization noise to correspond to the masking threshold as precisely as possible, an amplification value a#, applied to the filtered signal before quantization, is calculated on the encoder side for each parameter set or each parameterization. For the purpose of retransformation on the decoder side, the amplification value a# and the parameterization x# are transferred to the decoder as side information 910 apart from the actual main data, i.e. the quantized filtered audio values 912. For redundancy reduction 914, these data, that is, side information 910 and main data 912, are subjected to lossless compression, i.e. entropy coding, which is how the coded signal is obtained. The article mentioned above suggests a size of 128 sample values 906 as the block size. This allows a relatively short delay of 8 ms at a sampling rate of 32 kHz. 
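The scaling and rounding step of this conventional scheme can be illustrated with a minimal sketch. It only covers the multiplication by the amplification value, the rounding to the nearest integer and the inverse scaling on the decoder side; the pre-filter and post-filter are left out. The function names and the sample values are hypothetical and chosen for illustration only.

```python
def encode_sample(filtered_value, gain):
    """Scale a pre-filtered value and round it to the nearest integer."""
    return round(filtered_value * gain)


def decode_sample(quantized_value, gain):
    """Undo the scaling; the rounding error remains as quantization noise."""
    return quantized_value / gain


# Hypothetical pre-filtered values and amplification value.
values = [0.37, -1.92, 2.5, 0.04]
gain = 8.0

errors = [abs(v - decode_sample(encode_sample(v, gain), gain)) for v in values]
max_error = max(errors)
```

With a rounding step of 1 the error per value is at most 0.5 before the inverse scaling, so after division by the amplification value it is at most 0.5 / gain. A larger amplification value therefore means less noise but larger integers to be entropy coded.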
With reference to the detailed implementation, the article also notes that, in order to increase the efficiency of the side information encoding, the side information, i.e. the coefficients x# and a#, will only be transferred if there are sufficient changes compared to the parameter set transferred before, that is, if the changes exceed a certain threshold value. Furthermore, it is described that the implementation is preferably performed such that a current parameter set is not applied directly to all the sample values belonging to the respective block, but that a linear interpolation of the filter coefficients x# is used to avoid audible artifacts. In order to perform the linear interpolation of the filter coefficients, a lattice structure is suggested for the filter to prevent instabilities from occurring. For the case that a coded signal with a controlled bit rate is desired, the article also suggests selectively multiplying or attenuating the filtered signal, already scaled with the time-dependent amplification factor a, by a factor not equal to 1, so that audible interference occurs, but the bit rate can be reduced at places in the audio signal which are complicated to encode.
Although the audio coding scheme described in the aforementioned article already reduces the delay time to a sufficient degree for many applications, a problem in the above scheme is that, due to the requirement of having to transfer the masking threshold or the transfer function of the encoder-side filter, subsequently referred to as the pre-filter, the transfer channel is loaded to a relatively high degree, even though the filter coefficients are only transferred when a predetermined threshold is exceeded. Another disadvantage of the above coding scheme is that, because the masking threshold or its inverse has to be made available on the decoder side by means of the parameter set x# to be transferred, a compromise has to be made between the lowest possible bit rate or a high compression ratio, on the one hand, and the most accurate possible approximation or parameterization of the masking threshold or its inverse, on the other. Thus, it is unavoidable that the quantization noise shaped to the masking threshold by the above audio coding scheme exceeds the masking threshold in some frequency ranges and thus results in interference audible to the listener. Figure 13, for example, shows the parameterized frequency response of the decoder-side parameterizable filter as graph c. As can be seen, there are regions where the transfer function of the decoder-side filter, referred to in the following as the post-filter, exceeds the masking threshold b. The problem is aggravated by the fact that the parameterization is only transferred intermittently, when there is a sufficient change between parameterizations, and is interpolated in between. An interpolation of the filter coefficients x# alone, as suggested in the article, results in audible interference when the amplification value a# is kept constant from one node to the next, i.e. from one new parameterization to the next new parameterization. 
Even if the interpolation suggested in the article is also applied to the side information value a#, i.e. the transferred amplification value, audible artifacts may remain in the audio signal arriving on the decoder side. Another problem with the audio coding scheme according to Figures 12 and 13 is that the filtered signal may, due to the frequency-selective filtering, take an unpredictable form in which, particularly due to a random superposition of many individual harmonic waves, one or several individual audio values of the coded signal add up to very high values, which, because of their rare occurrence, in turn result in a poorer compression ratio in the subsequent redundancy reduction. It is the object of the present invention to provide an audio coding scheme that allows coding producing fewer audible artifacts. This object is achieved by a method according to claims 13 or 15 and a device according to claims 1 or 14. The inventive coding of an audio signal consisting of a sequence of audio values into a coded signal includes determining a first listening threshold for a first block of audio values of the sequence of audio values and a second listening threshold for a second block of audio values of the sequence of audio values; calculating a version of a first parameterization of a parameterizable filter so that its transfer function corresponds approximately to the inverse of the magnitude of the first listening threshold, and a version of a second parameterization of the parameterizable filter so that its transfer function corresponds approximately to the inverse of the magnitude of the second listening threshold; determining a first noise power limit depending on the first masking threshold and a second noise power limit depending on the second masking threshold; and parametrically filtering and scaling or amplifying a predetermined block of audio values of the sequence of audio values to obtain a block of scaled, filtered audio values corresponding to the predetermined block. This last step comprises the following substeps: interpolating between the version of the first parameterization and the version of the second parameterization to obtain a version of an interpolated parameterization for a predetermined audio value in the predetermined block of audio values; interpolating between the first noise power limit and the second noise power limit to obtain an interpolated noise power limit for the predetermined audio value; determining an intermediate scaling value depending on the interpolated noise power limit; and applying the parameterizable filter with the version of the interpolated parameterization and the intermediate scaling value to the predetermined audio value to obtain one of the scaled, filtered audio values. Finally, the scaled, filtered audio values are quantized to obtain a block of quantized, scaled, filtered audio values, and information is integrated into the coded signal from which the block of quantized, scaled, filtered audio values, the version of the first parameterization, the version of the second parameterization, the first noise power limit and the second noise power limit can be derived. The central idea of the present invention is that the above procedure, i.e. the interpolation of the filter coefficients and of the amplification value to obtain interpolated values for the intermediate audio values starting from the nodes, has to be dispensed with. Coding containing fewer audible artifacts can be obtained by not interpolating the amplification value, but rather by taking the power limit derived from the masking threshold, preferably as the area below the square of the magnitude of the masking threshold, for each node, that is, for each parameterization to be transferred, and then interpolating between these power limits of neighboring nodes, for example by linear interpolation. 
On both the encoder side and the decoder side, an amplification value can then be calculated from the interpolated power limit so that the quantization noise caused by the quantization, which is constant over frequency before the post-filtering on the decoder side, is below the power limit or corresponds to it after the post-filtering.
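The difference between the two interpolation strategies can be made concrete with a small sketch. It assumes that the amplification value is the root of the quantization noise power divided by the noise power limit, and that the quantization noise power of a rounding quantizer with step size 1 is 1/12; the latter is the usual textbook assumption and is not stated in this document. The names `lerp`, `gain_from_power_limit` and the node values are hypothetical.

```python
import math

# Assumption: uniform rounding quantizer with step size 1 (noise power 1/12).
QUANT_NOISE_POWER = 1.0 / 12.0


def gain_from_power_limit(limit, quant_noise_power=QUANT_NOISE_POWER):
    """Amplification value as root of (quantization noise power / power limit)."""
    return math.sqrt(quant_noise_power / limit)


def lerp(start, end, fraction):
    """Linear interpolation between two node values."""
    return start + (end - start) * fraction


# Hypothetical noise power limits at two neighboring nodes.
q0, q1 = 1.0, 100.0
fraction = 0.5  # halfway between the nodes

# Interpolate the power limit, then derive the amplification value.
gain_via_power = gain_from_power_limit(lerp(q0, q1, fraction))

# For comparison: interpolate the amplification values themselves.
gain_via_gain = lerp(gain_from_power_limit(q0), gain_from_power_limit(q1), fraction)

# Noise power implied after the decoder undoes each amplification value.
noise_via_power = QUANT_NOISE_POWER / gain_via_power ** 2
noise_via_gain = QUANT_NOISE_POWER / gain_via_gain ** 2
```

In this example the first variant yields a noise power equal to the interpolated limit (50.5), whereas interpolating the amplification values directly yields a noise power that does not follow the linear course of the power limit between the nodes.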
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings, in which:
Figure 1 shows a block circuit diagram of an audio encoder according to an embodiment of the present invention;
Figure 2 shows a flow chart illustrating the mode of operation of the audio encoder of Figure 1 at the data input;
Figure 3 shows a flow chart illustrating the mode of operation of the audio encoder of Figure 1 with respect to the evaluation of the incoming audio signal by a psycho-acoustic model;
Figure 4 shows a flow chart illustrating the mode of operation of the audio encoder of Figure 1 with respect to the application of the parameters obtained by the psycho-acoustic model to the incoming audio signal;
Figure 5a shows a schematic diagram illustrating the incoming audio signal, the sequence of audio values of which it consists, and the operating steps of Figure 4 in relation to the audio values;
Figure 5b shows a schematic diagram illustrating the setup of the coded signal;
Figure 6 shows a flow chart illustrating the mode of operation of the audio encoder of Figure 1 with respect to the final processing up to the coded signal;
Figure 7a shows a diagram in which an embodiment of a quantization step function is shown;
Figure 7b shows a diagram in which another embodiment of a quantization step function is shown;
Figure 8 shows a block circuit diagram of an audio decoder which is capable of decoding an audio signal encoded by the audio encoder of Figure 1, according to an embodiment of the present invention;
Figure 9 shows a flow chart illustrating the mode of operation of the decoder of Figure 8 at the data input;
Figure 10 shows a flow chart illustrating the mode of operation of the decoder of Figure 8 with respect to the buffering of the pre-decoded, quantized and filtered audio data and the processing of audio blocks without corresponding side information;
Figure 11 shows a flow chart illustrating the mode of operation of the decoder of Figure 8 with respect to the actual reverse filtering;
Figure 12 shows a schematic diagram illustrating a conventional audio coding scheme having a short delay time; and
Figure 13 shows a diagram showing an example of a spectrum of an audio signal, its listening threshold and the transfer function of the post-filter in the decoder.
Figure 1 shows an audio encoder according to an embodiment of the present invention. This audio encoder, which is generally indicated by 10, includes a data input 12, where it receives the audio signal to be encoded, which, as will be explained in more detail later with reference to Figure 5a, consists of a sequence of audio values or sample values, and a data output 14, where the coded signal is output, the information content of which will be discussed in greater detail with reference to Figure 5b. The audio encoder 10 of Figure 1 is divided into an irrelevance reduction part 16 and a redundancy reduction part 18. The irrelevance reduction part 16 includes means 20 for determining a listening threshold, means 22 for calculating an amplification value, means 24 for calculating a parameterization, node comparison means 26, a quantizer 28, a parameterizable pre-filter 30, an input FIFO (first in, first out) buffer 32, a buffer or memory 38 and a multiplier or multiplying means 40. The redundancy reduction part 18 includes a compressor 34 and a bit rate controller 36. The irrelevance reduction part 16 and the redundancy reduction part 18 are connected in series, in this order, between the data input 12 and the data output 14. In particular, the data input 12 is connected to a data input of the means 20 for determining a listening threshold and to a data input of the input buffer 32. A data output of the means 20 for determining a listening threshold is connected to an input of the means 24 for calculating a parameterization and to a data input of the means 22 for calculating an amplification value, in order to pass on to them the listening threshold it has determined. The means 22 and 24 calculate an amplification value and a parameterization, respectively, based on the listening threshold and are connected to the node comparison means 26 to pass these results on to it. 
Depending on the result of the comparison, the node comparison means 26, as will be discussed subsequently, passes the results calculated by the means 22 and 24 on to the parameterizable pre-filter 30 as an input parameter or parameterization. The parameterizable pre-filter 30 is connected between a data output of the input buffer 32 and a data input of the buffer 38. The multiplier 40 is connected between a data output of the buffer 38 and the quantizer 28. The quantizer 28 passes the filtered audio values, which may have been multiplied or scaled but are in any case quantized, on to the redundancy reduction part 18, more precisely to a data input of the compressor 34. The node comparison means 26 passes information from which the input parameters passed on to the parameterizable pre-filter 30 can be derived to the redundancy reduction part 18, more precisely to another data input of the compressor 34. The bit rate controller 36 is connected to a control input of the multiplier 40 via a control connection, so that the filtered audio values received from the pre-filter 30 are multiplied by the multiplier 40 by a suitable multiplication factor, as will be discussed in more detail below. The bit rate controller 36 is connected between a data output of the compressor 34 and the data output 14 of the audio encoder 10 in order to determine the multiplication factor for the multiplier 40 in a suitable manner. When each audio value passes the multiplier 40 for the first time, the multiplication factor is initially set to a suitable scaling factor, for example 1. The buffer 38, however, continues to store each filtered audio value in order to give the bit rate controller 36 the possibility of changing the multiplication factor for another pass of a block of audio values, as will be described subsequently. If such a change is not indicated by the bit rate controller 36, the buffer 38 can release the memory occupied by this block. 
After having described the setup of the audio encoder of Figure 1, its mode of operation will be described below with reference to Figures 2 to 7b. As can be seen from Figure 2, the audio signal, when it reaches the data input 12, has already been obtained by audio signal sampling 50 from an analog audio signal. This sampling is performed with a predetermined sampling frequency, which is usually between 32 and 48 kHz. Consequently, an audio signal consisting of a sequence of sample or audio values is present at the data input 12. Although the encoding of the audio signal does not take place in a block-based manner, as will become obvious from the subsequent description, the audio values at the data input 12 are first combined to form audio blocks in step 52. The combination to form audio blocks takes place only for the purpose of determining the listening threshold, as will become obvious from the following description, and takes place in an input stage of the means 20 for determining a listening threshold. In the present embodiment, it is exemplarily assumed that 128 successive audio values are combined to form each audio block and that the combination takes place in such a way that successive audio blocks, on the one hand, do not overlap and, on the other hand, directly adjoin one another. This will be discussed briefly by way of example with reference to Figure 5a. Figure 5a indicates at 54 a sequence of sample values, each sample value being illustrated by a rectangle 56. The sample values are numbered for purposes of illustration, and for reasons of clarity only some sample values of the sequence 54 are shown. 
As indicated above the sequence 54, 128 successive sample values are combined to form a block according to the present embodiment, the 128 sample values that directly follow forming the next block. Only as a precautionary measure, it is pointed out that the combination to form blocks can also be performed differently, for example with overlapping blocks or spaced blocks, and with blocks of another size, although the block size of 128 is preferred since it provides a good trade-off between high audio quality, on the one hand, and the smallest possible delay time, on the other. While the audio blocks combined in the means 20 in step 52 are processed block by block in the means 20 for determining the listening threshold, the incoming audio values are buffered 54 in the input buffer 32 until the parameterizable pre-filter 30 has obtained input parameters from the node comparison means 26 to perform the pre-filtering, as will be described subsequently. As can be seen from Figure 3, the means 20 for determining a listening threshold begins its processing directly after sufficient audio values have been received at the data input 12 to form an audio block or to form the next audio block, which the means 20 monitors by a check in step 60. If there is no complete audio block to be processed, the means 20 will wait. If a complete audio block to be processed is present, the means 20 for determining the listening threshold will calculate a listening threshold in step 62 based on a suitable psycho-acoustic model. To illustrate the listening threshold, reference is again made to Figure 13 and, in particular, to graph b, which has been obtained based on a psycho-acoustic model, exemplarily for a current audio block with a spectrum a. 
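The block formation of step 52 and the waiting behavior of step 60 can be sketched as follows. This is only an illustration of non-overlapping, directly adjoining blocks of 128 values; the function name `complete_blocks` and the test sequence are hypothetical.

```python
BLOCK_SIZE = 128  # block size preferred in the present embodiment


def complete_blocks(samples, block_size=BLOCK_SIZE):
    """Split samples into adjoining, non-overlapping blocks.

    Only complete blocks are returned; trailing samples that do not yet
    fill a block are left for later, mirroring the wait in step 60.
    """
    count = len(samples) // block_size
    return [samples[i * block_size:(i + 1) * block_size] for i in range(count)]


# 300 numbered sample values: two complete blocks, 44 values still waiting.
samples = list(range(300))
blocks = complete_blocks(samples)
pending = len(samples) - len(blocks) * BLOCK_SIZE
```

Block 0 then holds the sample values 0 to 127 and block 1 the sample values 128 to 255, matching the numbering used for Figure 5a.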
The masking threshold determined in step 62 is a frequency-dependent function which may vary for successive audio blocks and may also vary considerably from one audio signal to another, for example from rock music to classical music. The listening threshold indicates, for each frequency, a threshold value below which the human ear cannot perceive interference. In a subsequent step 64, the means 24 and the means 22 calculate, from the calculated listening threshold M(f) (f indicating the frequency), a parameterization x(i) (i = 1, ..., N) and an amplification value a, respectively. The parameterization x(i) which the means 24 calculates in step 64 is provided for the parameterizable pre-filter 30, which is, for example, embodied as an adaptive filter structure as used in LPC coding (LPC = linear predictive coding). For example, let s(n), n = 0, ..., 127, be the 128 audio values of the current audio block and s'(n) be the resulting 128 filtered audio values; the filter is then exemplarily embodied so that the following equation applies: $s'(n) = s(n) - \sum_{k=1}^{K} a_k^t \, s(n-k)$, K being the filter order and $a_k^t$, k = 1, ..., K, being the filter coefficients, where the index t illustrates that the filter coefficients change for successive audio blocks. The means 24 then calculates the parameterization $a_k^t$ so that the transfer function H(f) of the parameterizable pre-filter 30 is approximately equal to the inverse of the magnitude of the masking threshold M(f), i.e. so that the following applies: $H(f, t) \approx 1 / |M(f, t)|$, where the dependence on t in turn illustrates that the masking threshold M(f) changes for different audio blocks. When the pre-filter 30 is implemented as the adaptive filter mentioned above, the filter coefficients $a_k^t$ are obtained as follows: the inverse discrete Fourier transform of $|M(f, t)|^2$ over the frequency for the block at time t results in the autocorrelation function $r^t(i)$. 
Then, the $a_k^t$ are obtained by solving the linear equation system $\sum_{k=0}^{K-1} r^t(|i-k|) \, a_{k+1}^t = r^t(i+1)$, $0 \le i < K$. In order that no instabilities arise between the parameterizations in the interpolation described in greater detail below, a lattice structure is preferably used for the filter 30, the filter coefficients for the lattice structure being re-parameterized to form reflection coefficients. With respect to further details on the design of the pre-filter, the calculation of the coefficients and the re-parameterization, reference is made to the article by Schuller et al. mentioned in the introduction to the description and, in particular, to page 381, section III, which is incorporated herein by reference. While the means 24 thus calculates a parameterization for the parameterizable pre-filter 30 such that its transfer function equals the inverse of the masking threshold, the means 22 calculates a noise power limit based on the listening threshold, namely a limit indicating which noise power the quantizer 28 is allowed to introduce into the audio signal filtered by the pre-filter 30 in order for the quantization noise on the decoder side to be below the listening threshold M(f), or exactly equal to it, after the post- or reverse filtering. The means 22 calculates this noise power limit as the area below the square of the magnitude of the listening threshold M, that is, as $\sum_f |M(f)|^2$. The means 22 calculates the amplification value a from the noise power limit by taking the root of the quantization noise power divided by the noise power limit. The quantization noise is the noise caused by the quantizer 28. The noise caused by the quantizer 28 is, as will be described below, white noise and thus independent of frequency. The quantization noise power is the power of the quantization noise. As has become apparent from the above description, the means 22 also calculates the noise power limit apart from the amplification value a. 
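The calculations of step 64 can be sketched as follows, under several assumptions that are not taken from this document: the listening threshold is given as magnitudes on a full, symmetric DFT grid (M[f] equal to M[n - f]), the linear equation system is solved with the Levinson-Durbin recursion (one common way of solving such Toeplitz systems, which also yields reflection coefficients whose sign convention varies between texts), and the quantization noise power is taken as 1/12 for a rounding quantizer with step size 1. All names and the example threshold are hypothetical.

```python
import cmath
import math

QUANT_NOISE_POWER = 1.0 / 12.0  # assumption: rounding quantizer, step size 1


def autocorrelation_from_threshold(magnitudes):
    """Inverse DFT of |M(f)|^2; magnitudes must satisfy M[f] == M[n - f]."""
    n = len(magnitudes)
    power = [m * m for m in magnitudes]
    return [
        sum(power[f] * cmath.exp(2j * cmath.pi * f * i / n) for f in range(n)).real / n
        for i in range(n)
    ]


def levinson_durbin(r, order):
    """Solve sum_{k=1..K} a_k r(|i-k|) = r(i), i = 1..K, for a_1..a_K."""
    a = [0.0] * (order + 1)  # a[0] is unused so indices match a_1..a_K
    reflection = []
    error = r[0]
    for m in range(1, order + 1):
        kappa = (r[m] - sum(a[k] * r[m - k] for k in range(1, m))) / error
        updated = a[:]
        updated[m] = kappa
        for k in range(1, m):
            updated[k] = a[k] - kappa * a[m - k]
        a = updated
        reflection.append(kappa)
        error *= 1.0 - kappa * kappa
    return a[1:], reflection


def noise_power_limit(magnitudes):
    """Area below the squared magnitude of the listening threshold."""
    return sum(m * m for m in magnitudes)


def amplification_value(limit, quant_noise_power=QUANT_NOISE_POWER):
    """Root of the quantization noise power divided by the noise power limit."""
    return math.sqrt(quant_noise_power / limit)


# Hypothetical threshold magnitudes on an 8-point symmetric grid.
threshold = [2.0, 1.5, 1.0, 0.5, 0.25, 0.5, 1.0, 1.5]
r = autocorrelation_from_threshold(threshold)
coefficients, reflection = levinson_durbin(r, order=2)
limit = noise_power_limit(threshold)
gain = amplification_value(limit)
```

Because the squared threshold is positive at every frequency, the reflection coefficients produced here have a magnitude below 1, which is the property that makes a lattice realization attractive when parameterizations are interpolated.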
Although it is possible for the node comparison means 26 to recalculate the noise power limit from the amplification value a obtained from the means 22, it is also possible for the means 22 to transmit the determined noise power limit to the node comparison means 26 in addition to the amplification value. After the amplification value and the parameterization have been calculated, the node comparison means 26 checks in step 66 whether the parameterization just calculated differs by more than a predetermined threshold from the current parameterization last passed on to the parameterizable pre-filter. If the check in step 66 has the result that the parameterization just calculated differs from the current one by more than the predetermined threshold, the filter coefficients just calculated and the amplification value or noise power limit just calculated are buffered in the node comparison means 26 for an interpolation to be discussed below, and the node comparison means 26 hands over to the pre-filter 30 the filter coefficients just calculated in step 68 and the amplification value just calculated in step 70. If, however, this is not the case and the parameterization just calculated does not differ from the current one by more than the predetermined threshold, the node comparison means 26 hands over to the pre-filter 30 in step 72, instead of the parameterization just calculated, only the current node parameterization, that is, the parameterization that last led to a positive result in step 66, i.e. that differed from the previous node parameterization by more than the predetermined threshold. After steps 70 and 72, the process of Figure 3 returns to the processing of the next audio block, that is, to the query 60. 
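The decision of steps 66 to 72 can be sketched as follows. The document does not specify how the difference between two parameterizations is measured, so the largest absolute coefficient difference is used here purely as an assumption; treating the very first block as a node is likewise an assumption. All names and values are hypothetical.

```python
def differs_enough(candidate, current_node, threshold):
    """Assumed distance measure: largest absolute coefficient difference."""
    return max(abs(c - n) for c, n in zip(candidate, current_node)) > threshold


def select_parameterizations(per_block_params, threshold):
    """Return, per block, the parameterization handed to the pre-filter
    and a flag telling whether it is a new node (steps 68/70) or the
    current node parameterization handed over again (step 72)."""
    current = None
    decisions = []
    for params in per_block_params:
        if current is None or differs_enough(params, current, threshold):
            current = params  # new node: buffered for later interpolation
            decisions.append((current, True))
        else:
            decisions.append((current, False))
    return decisions


# Hypothetical parameterizations calculated for three successive blocks.
per_block = [[0.50, 0.10], [0.52, 0.10], [0.90, 0.30]]
decisions = select_parameterizations(per_block, threshold=0.1)
node_flags = [is_node for _, is_node in decisions]
```

In this example the second block reuses the parameterization of the first block because the difference of 0.02 stays below the threshold, while the third block becomes a new node.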
In the case that the parameterization just calculated does not differ sufficiently from the current node parameterization, and consequently the pre-filter 30 again obtains in step 72 the node parameterization it already obtained for at least the last audio block, the pre-filter 30 applies this node parameterization to all the sample values of this audio block in the FIFO 32, as will be described in more detail below, which is how this current block is taken out of the FIFO 32 and the quantizer 28 receives a resulting audio block of pre-filtered audio values. Figure 4 illustrates in greater detail the mode of operation of the parameterizable pre-filter 30 for the case that it receives the parameterization just calculated and the amplification value just calculated because they differ sufficiently from the current node parameterization. As described with reference to Figure 3, processing according to Figure 4 does not take place for each of the successive audio blocks, but only for audio blocks whose respective parameterization differs sufficiently from the current node parameterization. The other audio blocks are, as just described, pre-filtered by applying the respective current node parameterization and the respective current amplification value to all the sample values of these audio blocks. In step 80, the parameterizable pre-filter 30 checks whether a handover of filter coefficients just calculated, or of older node parameterizations, from the node comparison means 26 has taken place. The pre-filter 30 performs the check 80 until such a handover has taken place. As soon as such a handover has taken place, the parameterizable pre-filter 30 starts to process the current audio block of audio values currently in the buffer 32, that is, the one for which the parameterization has just been calculated. 
In Figure 5a, for example, it is illustrated that all the audio values 56 in front of the audio value with the number 0 have already been processed and have thus already passed the memory 32. The processing of the block of audio values in front of the audio value with the number 0 was triggered because the parameterization calculated for the audio block in front of block 0, i.e. x0(i), differed from the node parameterization passed on to the pre-filter 30 before by more than the predetermined threshold. The parameterization x0(i) is thus a node parameterization as described in the present invention. The processing of the audio values in the audio block in front of the audio value 0 was performed on the basis of the parameter set a0, x0(i). It is assumed in Figure 5a that the parameterization calculated for block 0 with the audio values 0 - 127 differed by less than the predetermined threshold from the parameterization x0(i), which referred to the block in front. This block 0 was thus also taken out of the FIFO 32 by the pre-filter 30, likewise processed with respect to all its sample values 0 to 127 by means of the parameterization x0(i) supplied in step 72, as indicated by the arrow 82 labeled "direct application", and then passed on to the quantizer 28. The parameterization calculated for block 1, which is still located in the FIFO 32, in contrast differed, according to the illustrative example of Figure 5a, by more than the predetermined threshold from the parameterization x0(i) and was thus passed on in step 68 to the pre-filter 30 as a parameterization x1(i), together with the amplification value a1 (step 70) and, if applicable, the pertaining noise power limit. The indices of a and x in Figure 5a are indices for the nodes, as used in the interpolation to be discussed below, which is performed with respect to the sample values 128 - 255 in block 1, is symbolized by the arrow 82 and is realized by the steps following step 80 in Figure 4. 
The processing at step 80 is thus initiated with the occurrence of the audio block with the number 1. At the time the parameter set a1, x1 is passed on, only the audio values 128-255, i.e. the current audio block following the last audio block 0 processed by the pre-filter 30, are in the memory 32.
After detecting the transfer of the node parameters x1(i) in step 80, the pre-filter 30 determines, in step 84, the noise power limit q1 that corresponds to the amplification value a1. This can be done either by the node comparison means 26 passing this value to the pre-filter 30, or by the pre-filter 30 calculating it again as described with reference to step 64. After that, in step 86, a sample value index j is initialized to point to the oldest sample value remaining in the FIFO memory 32, i.e. the first sample value of the current audio block "block 1"; in the present example of Figure 5 this is the sample value 128.
In step 88, the parameterizable pre-filter performs an interpolation between the filter coefficients x0 and x1, where the parameterization x0 acts as the support value at the node with the audio value number 127 of the previous block 0, and the parameterization x1 acts as the support value at the node with the audio value number 255 of the current block 1. These audio value positions 127 and 255 will subsequently be referred to as node 0 and node 1; the node parameterizations belonging to these nodes are indicated in Figure 5a by the arrows 90 and 92.
In step 88, the parameterizable pre-filter 30 performs the interpolation of the filter coefficients x0, x1 between the nodes as a linear interpolation, to obtain the interpolated filter coefficients at the sample position j, i.e. x(tj)(i), i = 1, ..., N. After that, in step 90, the parameterizable pre-filter 30 performs an interpolation between the noise power limits q1 and q0 to obtain the interpolated noise power limit at the sample position j, i.e. q(tj). In step 92, the parameterizable pre-filter 30 then calculates the amplification value for the sample position j on the basis of the interpolated noise power limit and the quantizing noise power, and preferably also the interpolated filter coefficients, for example as a function of the square root of [quantizing noise power / q(tj)]; reference is made here to the explanations of step 64 of Figure 3. In step 94, the parameterizable pre-filter 30 then applies the calculated amplification value and the interpolated filter coefficients to the sample value at the sample position j, to obtain a filtered sample value for this sample position, i.e. s'(tj).
In step 96, the parameterizable pre-filter 30 then checks whether the sample position j has reached the current node, i.e. node 1, which in the case of Figure 5a is the sample position 255. This is the sample value for which the parameterization transferred to the parameterizable pre-filter 30, together with the amplification value, is to be valid directly, i.e. without interpolation. If this is not the case, the parameterizable pre-filter 30 increments the index j by 1, and steps 88 to 96 are repeated. If the check in step 96 is positive, however, the parameterizable pre-filter applies, in step 100, the last amplification value and the last filter coefficients transmitted from the node comparison means 26 directly, without interpolation, to the sample value at the new node. The current block, in the present case block 1, has then been processed, and the process is performed again at step 80 for the next block to be processed. Depending on whether the parameterization of the next audio block 2 differs sufficiently from the parameterization x1(i), this can be the next audio block 2 or a later audio block.
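The loop of steps 86 to 100 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function and parameter names are hypothetical, the quantizing noise power is assumed to be 1/12 (a uniform quantizer with unit step size), and the optional dependence of the amplification value on the interpolated filter coefficients is ignored.

```python
import math

def lerp(v0, v1, w):
    """Linear interpolation between v0 (w = 0) and v1 (w = 1)."""
    return v0 + (v1 - v0) * w

def interpolated_parameters(x0, q0, x1, q1, node0, node1, noise_power=1.0 / 12.0):
    """Yield (j, coefficients, gain) for sample positions node0+1 .. node1.

    x0, x1: filter coefficients at node 0 and node 1 (hypothetical lists)
    q0, q1: noise power limits at node 0 and node 1
    noise_power: assumed quantizing noise power (1/12 for unit step size)
    """
    for j in range(node0 + 1, node1 + 1):          # steps 86, 96 and increment
        if j == node1:
            # step 100: at the node itself the node values apply directly
            coeffs, q = list(x1), q1
        else:
            w = (j - node0) / (node1 - node0)
            coeffs = [lerp(c0, c1, w) for c0, c1 in zip(x0, x1)]  # step 88
            q = lerp(q0, q1, w)                                   # step 90
        gain = math.sqrt(noise_power / q)                         # step 92
        yield j, coeffs, gain

# Example with the node positions of Figure 5a (127 and 255); the
# coefficient and noise power values are made up for illustration.
params = list(interpolated_parameters([0.5, -0.2], 1.0, [0.3, 0.1], 4.0, 127, 255))
```

Each yielded triple would then be used in step 94 to filter and amplify the sample at position j.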
Before the further processing of the filtered sample values s' is described with reference to Figure 6, the purpose and background of the procedure of Figures 3 and 4 will be described below. The purpose of the filtering is to filter the audio signal at the input 12 with an adaptive filter whose transfer function is continuously adjusted, as well as possible, to the inverse of the listening threshold, which also changes over time. The reason for this is that, on the decoder side, the inverse filtering, whose transfer function is correspondingly continuously adjusted to the listening threshold, shapes the white quantizing noise introduced by quantizing the filtered audio signal, i.e. the frequency-constant quantizing noise, by an adaptive filter, i.e. adjusts it to the form of the listening threshold.
The application of the amplification value in steps 94 and 100 in the pre-filter 30 is a multiplication of the audio signal or the filtered audio signal, i.e. the sample values s or the filtered sample values s', by the amplification factor. The purpose is to set the quantizing noise, which is introduced into the filtered audio signal by the quantization described in more detail below and which is adjusted to the form of the listening threshold by the inverse filtering on the decoder side, as high as possible without exceeding the listening threshold. This can be illustrated by Parseval's formula, according to which the square of the magnitude of a function equals the square of the magnitude of its Fourier transform. When, on the decoder side, the multiplication of the audio signal by the amplification value in the pre-filter is reversed by dividing the filtered audio signal by the amplification value, the quantizing noise power is also reduced, namely by the factor a^-2, where a is the amplification value.
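The relationship between the amplification value and the noise power limit can be written out as a short derivation. The symbols u(t_j) for the filtered but not yet amplified sample, e(t_j) for an additive quantization error and \sigma_e^2 for its power are assumptions introduced only for this illustration, and the frequency shaping by the post-filter is left out.

```latex
\begin{align*}
s'(t_j) &= a\,u(t_j) \\
\frac{s'(t_j) + e(t_j)}{a} &= u(t_j) + \frac{e(t_j)}{a} \\
\mathrm{E}\!\left[\left(\tfrac{e}{a}\right)^{2}\right] &= a^{-2}\,\sigma_e^{2} \\
a^{-2}\,\sigma_e^{2} = q \quad&\Longrightarrow\quad a = \sqrt{\sigma_e^{2}/q}
\end{align*}
```

Under these assumptions, choosing a as the square root of the quantizing noise power divided by the noise power limit q places the decoded noise power exactly at q, which agrees with the square-root expression given for step 92.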
Consequently, the quantizing noise power can be set optimally high by applying the amplification value in the pre-filter 30. This is equivalent to increasing the quantizing step size and thus reducing the number of quantizing steps to be encoded, which in turn increases the compression in the subsequent redundancy reduction part. Put differently, the effect of the pre-filter can be regarded as a normalization of the signal to its masking threshold, so that the level of the quantization interference, or quantizing noise, can be kept constant in both time and frequency. Since the audio signal is in the time domain, the quantization can be performed sample by sample with a uniform, constant quantization, as will be described subsequently. In this way, ideally, all irrelevance is removed from the audio signal, and a lossless compression scheme can be used to also remove the redundancy remaining in the pre-filtered and quantized audio signal, as will be described below.
Referring to Figure 5a, it is pointed out again explicitly that the filter coefficients and amplification values used, a0, a1, x0, x1, must of course be available on the decoder side as side information, but that the transfer complexity is reduced by not simply using new filter coefficients and new amplification values for each block. Rather, a threshold value check 66 takes place so that the parameterizations are transferred as side information only when the parameterization has changed sufficiently, and otherwise no side information or parameterizations are transferred. An interpolation from the old to the new parameterization takes place in the audio blocks for which the parameterizations have been transferred. The interpolation of the filter coefficients takes place in the manner described above with reference to step 88.
The interpolation with respect to the amplification takes place by a detour, namely by means of a linear interpolation 90 of the noise power limits q0, q1. Compared to a direct interpolation of the amplification value, the linear interpolation of the noise power limit yields a better listening result, i.e. fewer audible artifacts.
Subsequently, the further processing of the pre-filtered signal will be described with reference to Figure 6; it essentially comprises quantization and redundancy reduction. First, the filtered sample values output by the parameterizable pre-filter 30 are stored in the buffer 38 and, at the same time, passed from the buffer 38 to the multiplier 40. Since this is their first pass, the multiplier 40 initially passes them on unchanged, i.e. with a scaling factor of one, to the quantizer 28. There, the filtered audio values above an upper limit are cut off in step 110 and then quantized in step 112. Both steps, 110 and 112, are executed by the quantizer 28. In particular, the two steps are preferably executed by the quantizer 28 in a single step, by quantizing the filtered audio values s' with a quantizing step function. This function maps the filtered sample values s', which are present for example in a floating point representation, to a plurality of integer quantizing step values or indices, and it has a flat course for filtered sample values above a certain threshold value, so that filtered sample values greater than the threshold value are quantized to one and the same quantizing step. An example of such a quantizing step function is illustrated in Figure 7a. The quantized filtered sample values are referred to by s' in Figure 7a.
The quantizing step function is preferably one whose step size is constant below the threshold value, i.e. the jump to the next quantizing step always takes place after a constant interval along the input values s'. In one embodiment, the step size up to the threshold value is set so that the number of quantizing steps preferably corresponds to a power of 2. The threshold value is smaller than the maximum value of the representable range of the floating point representation of the incoming filtered sample values s', so that this maximum value exceeds the threshold value.
The reason for this threshold value is the observation that the filtered audio signal output by the pre-filter 30 occasionally contains audio values that add up to very large values because of an unfavorable accumulation of harmonic waves. It has also been observed that cutting off these values, as is achieved by the quantizing step function shown in Figure 7a, results in a high data reduction but only a minor impairment of the audio quality. These occasional peaks in the filtered audio signal are formed artificially by the frequency-selective filtering in the parameterizable filter 30, so cutting them off damages the audio quality only slightly.
A somewhat more specific example of the quantizing step function shown in Figure 7a would be one that rounds all filtered sample values s' to the nearest integer up to the threshold value, and quantizes all filtered sample values above it to the highest quantizing step, such as, for example, 256. This case is illustrated in Figure 7a. Another example of a possible quantizing step function is the one shown in Figure 7b. Up to the threshold value, the quantizing step function of Figure 7b corresponds to that of Figure 7a.
Instead of having an abruptly flat course for sample values s' above the threshold value, however, the quantizing step function continues with a smaller slope than in the region below the threshold value. Put differently, the quantizing step size is greater above the threshold value. This achieves an effect similar to that of the quantizing function of Figure 7a, but, on the one hand, with greater complexity because of the different step sizes of the quantizing step function above and below the threshold value and, on the other hand, with improved audio quality, since very high filtered audio values s' are not cut off completely but only quantized with a larger quantizing step size.
As already described above, not only the quantized and filtered audio values s' must be available on the decoder side, but also the input parameters of the pre-filter 30 on which the filtering of these values is based, i.e. the node parameterization including an indication of the relevant amplification value. In step 114, the compressor 34 therefore executes a first compression attempt and compresses the side information, containing the amplification values a0 and a1 at the nodes, such as, for example, 127 and 255, and the filter coefficients x0 and x1 at the nodes, together with the quantized filtered sample values s', into a temporarily encoded signal. The compressor 34 is thus a losslessly operating encoder, such as, for example, a Huffman or arithmetic encoder, with or without prediction and/or adaptation.
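The two quantizing step functions described for Figures 7a and 7b can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, negative values are assumed to be treated symmetrically, the threshold of 256 is taken from the example in the text, and the coarse step size of 8 above the threshold is an arbitrary choice.

```python
def quantize_clip(s, threshold=256):
    """Figure 7a style: round to the nearest integer, clip at the threshold.

    Note: Python's round() rounds exact halves to the even integer.
    """
    index = round(s)
    return max(-threshold, min(threshold, index))

def quantize_coarse_above(s, threshold=256, coarse_step=8):
    """Figure 7b style: unit step size up to the threshold, larger above it.

    coarse_step is a hypothetical value chosen for illustration.
    """
    magnitude = abs(s)
    if magnitude <= threshold:
        index = round(magnitude)
    else:
        index = threshold + round((magnitude - threshold) / coarse_step)
    return index if s >= 0 else -index
```

With these assumptions, a large outlier such as 1000.0 maps to the index 256 in the first function and to a larger but still compact index in the second, so it remains distinguishable from smaller outliers at the cost of more index values to encode.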
The memory 38, through which the filtered audio values s' pass, serves as a buffer for a suitable block size with which the compressor 34 processes the quantized, filtered and, as described above, also scaled audio values s' output by the quantizer 28. This block size may differ from the block size of the audio blocks as used by the means 20.
As already mentioned, the bit rate controller 36 has set the multiplier 40 to a multiplication factor of 1 for the first compression attempt, so that the filtered audio values pass unchanged from the pre-filter 30 to the quantizer 28 and from there, as quantized filtered audio values, to the compressor 34. In step 116, the compressor 34 monitors whether a certain compression block size, i.e. a certain number of quantized sampled audio values, has been encoded into the temporarily encoded signal, or whether further quantized filtered audio values s' are still to be encoded into the current temporarily encoded signal. If the compression block size has not been reached, the compressor 34 continues with the current compression 114. If the compression block size has been reached, however, the bit rate controller 36 checks in step 118 whether the number of bits required for the compression is greater than the number of bits dictated by a desired bit rate. If this is not the case, the bit rate controller 36 checks in step 120 whether the number of bits required is less than the number of bits dictated by the desired bit rate. If so, the bit rate controller 36 fills the encoded signal in step 122 with padding bits until the number of bits dictated by the desired bit rate has been reached. Subsequently, the encoded signal is output in step 124.
As an alternative to step 122, the bit rate controller 36 may pass the compression block of filtered audio values s' still stored in the memory 38, on which the last compression was based, through the multiplier 40 to the quantizer 28 in a form multiplied by a factor greater than 1, so that steps 110-118 are run through again until the number of bits dictated by the desired bit rate has been reached, as indicated by a step 125 drawn in broken lines.
If, however, the check in step 118 shows that the number of bits required is greater than that dictated by the desired bit rate, the bit rate controller 36 changes the multiplication factor for the multiplier 40 to a factor between 0 and 1, exclusive. This is done in step 126. After step 126, the bit rate controller 36 causes the memory 38 to output again the last compression block of filtered audio values s' on which the compression was based. These values are then multiplied by the factor set in step 126 and supplied again to the quantizer 28, whereupon steps 110-118 are performed again and the signal temporarily encoded so far is discarded. It is noted that, when steps 110-116 are executed again, the factor used in step 126 (or step 125) is of course also integrated into the encoded signal in step 114.
The purpose of the procedure after step 126 is to increase the effective step size of the quantizer 28 by the factor. This means that the resulting quantizing noise lies uniformly above the masking threshold, which results in audible interference or audible noise, but at a reduced bit rate. If, after steps 110-116 have been run through again, it is again determined in step 118 that the number of bits required is greater than that dictated by the desired bit rate, the factor is reduced further in step 126, and so on.
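The control loop of steps 110 to 126 can be sketched as follows. This is a simplified illustration, not the encoder of the description: zlib stands in for the lossless compressor 34 (the text names Huffman or arithmetic coding), the budget is counted in bytes rather than bits, the clipping of step 110 and the alternative step 125 are omitted, and all names as well as the halving of the factor are hypothetical choices.

```python
import math
import struct
import zlib

def compress_block(values, factor):
    """Scale, round to integer indices and losslessly compress one block."""
    indices = [round(v * factor) for v in values]            # multiplier 40, step 112
    return zlib.compress(struct.pack(f"<{len(indices)}h", *indices))  # step 114

def rate_control(values, budget_bytes, shrink=0.5, max_tries=32):
    """Return (factor, payload) with len(payload) == budget_bytes."""
    factor = 1.0                                   # first compression attempt
    for _ in range(max_tries):
        payload = compress_block(values, factor)
        if len(payload) <= budget_bytes:           # checks 118 and 120
            padding = bytes(budget_bytes - len(payload))     # step 122: padding
            return factor, payload + padding       # step 124: output
        factor *= shrink                           # step 126: factor in (0, 1)
    raise ValueError("budget too small for this block")

# Made-up compression block of 256 filtered values for illustration.
block = [100 * math.sin(0.37 * n) + 50 * math.sin(1.3 * n * n) for n in range(256)]
factor, payload = rate_control(block, budget_bytes=200)
```

The returned factor would have to be written into the encoded signal as well, so that the decoder can undo the scaling.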
Once the data has finally been output in step 124 as an encoded signal, the next compression block is formed from the subsequent quantized filtered audio values s'. It is also pointed out that a pre-initialized value other than 1 can be used as the multiplication factor. Scaling then takes place in any case at the beginning, i.e. at the top of Figure 6.
Figure 5b again illustrates the resulting encoded signal, which is generally indicated by 130. This encoded signal includes side information and main data in between. As already mentioned, the side information includes information from which the amplification value and the filter coefficients can be derived for special audio blocks, i.e. audio blocks at which a significant change in the filter coefficients has occurred in the sequence of audio blocks. If necessary, the side information also includes further information relating to the amplification value used for the bit rate control. Because of the mutual dependence of the amplification value and the noise power limit q, the side information can optionally include, apart from the amplification value a# for a node #, also the noise power limit q#, or only the latter. The side information is preferably arranged within the encoded signal so that the side information on the filter coefficients and the relevant amplification value or the relevant noise power limit is arranged in front of the main data of the audio block of quantized filtered audio values from which these filter coefficients with the relevant amplification value or the relevant noise power limit have been derived, i.e. the side information a0, x0(i) after block -1 and the side information a1, x1(i) after block 1.
Put differently, the main data, i.e. the quantized filtered audio values s', from (but excluding) an audio block of the kind at which a significant change in the filter coefficients has occurred in the sequence of audio blocks, up to (and including) the next audio block of this kind, in Figure 5 for example the audio values s'(t0) to s'(t255), are always arranged between the side information block 132 for the first of these two audio blocks (block -1) and the further side information block 134 for the second of the two audio blocks (block 1). The audio values s'(t0) to s'(t127) are decodable, or have been obtained, as mentioned before with reference to Figure 5a, by means of the side information 132 alone. The audio values s'(t128) to s'(t255), in contrast, have been obtained by interpolation, using the side information 132 as support values at the node with the sample value number 127 and the side information 134 as support values at the node with the sample value number 255, and are thus decodable only by means of both sets of side information.
In addition, the side information on the amplification value or the noise power limit and on the filter coefficients is not always integrated into each side information block 132 and 134 independently of the other blocks. Rather, this side information is transferred as differences relative to the previous side information block. In Figure 5b, for example, the side information block 132 contains the amplification value a0 and the filter coefficients x0 for the first node. These values can be derived from the side information block 132 itself. From the side information block 134, however, the side information for the node at time t255 can no longer be derived from this block alone.
Rather, the side information block 134 only includes difference information: the difference between the amplification value a1 of the node at time t255 and the amplification value of the preceding node, and the differences between the filter coefficients x1 and the filter coefficients x0. The side information block 134 consequently only contains the information on a1 - a0 and x1(i) - x0(i). At intermittent times, however, such as, for example, every second, the filter coefficients and the amplification value or the noise power limit must be transferred completely and not only as a difference to the previous node, to allow a receiver or decoder to lock onto a running stream of encoded data, as will be discussed below. This way of integrating the side information into the side information blocks 132 and 134 offers the advantage of a possibly higher compression rate. The reason is that, although the side information is, if possible, only transferred when the filter coefficients have changed sufficiently relative to those of the previous node, the complexity of calculating the differences on the encoder side and the sums on the decoder side pays off, since the resulting differences are small despite the check in step 66 and thus allow advantages in entropy coding.
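The difference scheme for the side information blocks can be sketched as follows. The sketch assumes that the node parameters have already been quantized to integers, so that sums and differences are exact, and it counts the refresh interval in nodes rather than in seconds; the function names, the tuple layout and the "abs"/"diff" tags are hypothetical.

```python
def encode_side_info(nodes, refresh_every=4):
    """Turn node parameters (a, x) into absolute and difference blocks.

    nodes: list of (a, x) with integer a and a list of integer coefficients x.
    Every refresh_every-th node is sent as an absolute block (type 132),
    all others as differences to the previous node (type 134).
    """
    blocks = []
    previous = None
    for n, (a, x) in enumerate(nodes):
        if previous is None or n % refresh_every == 0:
            blocks.append(("abs", a, list(x)))
        else:
            pa, px = previous
            blocks.append(("diff", a - pa, [c - pc for c, pc in zip(x, px)]))
        previous = (a, x)
    return blocks

def decode_side_info(blocks):
    """Rebuild the node parameters by adding differences to the previous node."""
    nodes = []
    for kind, a, x in blocks:
        if kind == "diff":
            pa, px = nodes[-1]
            a, x = pa + a, [pc + d for pc, d in zip(px, x)]
        nodes.append((a, list(x)))
    return nodes

# Made-up node parameters for illustration.
nodes = [(40, [12, -3]), (42, [11, -3]), (41, [11, -2]), (41, [13, 0]), (45, [9, 1])]
blocks = encode_side_info(nodes, refresh_every=4)
```

In this example the second block carries only the small values 2 and [-1, 0], which is the property that makes the differences attractive for entropy coding.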
After an embodiment of an audio encoder has been described above, an embodiment of an audio decoder will be described subsequently, which is suitable for decoding the encoded signal generated by the audio encoder 10 of Figure 1 into a playable or processable decoded audio signal. The setup of this decoder is shown in Figure 8. The decoder, indicated generally by 210, includes a decompressor 212, a FIFO memory 214, a multiplier 216 and a parameterizable post-filter 218. The decompressor 212, the FIFO memory 214, the multiplier 216 and the parameterizable post-filter 218 are connected in this order between a data input 220 and a data output 222 of the decoder 210. The encoded signal is received at the data input 220, and the decoded audio signal, which differs from the original audio signal at the data input 12 of the audio encoder 10 only by the quantizing noise generated by the quantizer 28 in the audio encoder 10, is output at the data output 222. The decompressor 212 is connected, via a further data output, to a control input of the multiplier 216 to pass a multiplication factor to it, and, via another data output, to a parameterization input of the parameterizable post-filter 218.
As shown in Figure 9, the decompressor 212 first decompresses the compressed signal at the data input 220 in step 224 to obtain the quantized filtered audio data, i.e. the sample values s', and the relevant side information in the side information blocks 132, 134, which, as is known, indicate the filter coefficients and the amplification values or, instead of the amplification values, the noise power limits at the nodes.
As shown in Figure 10, the decompressor 212 checks the decompressed signal in step 226, in the order of arrival, for side information with filter coefficients contained in it in self-contained form, without a difference reference to a previous side information block. Put differently, the decompressor 212 searches for the first side information block 132. As soon as the decompressor 212 has found one, the quantized filtered audio values s' are buffered in the FIFO memory 214 in step 228. If a complete audio block of quantized filtered audio values s' has been stored during step 228 without a side information block following directly, it is first post-filtered in step 228 in the post-filter, by means of the information on parameterization and amplification value contained in the side information received in step 226, and amplified in the multiplier 216. It is thereby decoded, and the relevant decoded audio block is obtained.
In step 230, the decompressor 212 monitors the decompressed signal for the occurrence of any kind of side information block, i.e. one with absolute filter coefficients or one with differences of filter coefficients relative to a previous side information block. In the example of Figure 5b, the decompressor 212 would, for example, recognize the occurrence of the side information block 134 in step 230 after having recognized the side information block 132 in step 226. The block of quantized filtered audio values s'(t0) to s'(t127) would thus have been decoded in step 228, using the side information 132.
As long as the side information block 134 has not yet occurred in the decompressed signal, the buffering and, where applicable, the decoding of blocks continues in step 228 by means of the side information of step 226, as described above. As soon as the side information block 134 occurs, the decompressor 212 calculates the parameter values at node 1, i.e. a1, x1(i), in step 232 by adding the difference values in the side information block 134 to the parameter values in the side information block 132. Step 232 is, of course, omitted if the current side information block is a self-contained side information block without differences, which, as described above, can occur for example every second. So that the waiting time for the decoder 210 does not become too long, the side information blocks 132, from which the parameter values can be derived absolutely, i.e. without reference to another side information block, are arranged at sufficiently small distances, so that the start-up time or dead time when switching on the audio decoder 210, for example in the case of a radio or broadcast transmission, is not too long. Preferably, a fixed predetermined number of side information blocks 134 with difference values is arranged between the side information blocks 132, so that the decoder knows when a side information block of the type 132 is to be expected again in the encoded signal. Alternatively, the different side information block types are indicated by corresponding flags.
As shown in Figure 11, after a side information block for a new node has been reached, in particular after step 226 or 232, a sample value index j is first initialized to 0 in step 234. This value corresponds to the sample position of the first sample value in the audio block currently remaining in the FIFO memory 214 to which the current side information relates. Step 234 is performed by the parameterizable post-filter 218.
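The locking of a decoder onto a running stream, as described for steps 226 to 232, can be sketched as follows. The sketch assumes integer parameter values and a stream in which each side information block is tagged as absolute or difference (corresponding to the flag alternative mentioned above); the function name and the tuple layout are hypothetical.

```python
def nodes_after_sync(blocks):
    """Reconstruct node parameters (a, x) from a stream joined mid-way.

    blocks: iterable of ("abs" | "diff", a, x). Difference blocks that
    arrive before the first absolute block cannot be decoded and are skipped.
    """
    nodes = []
    current = None
    for kind, a, x in blocks:
        if kind == "abs":                    # step 226: self-contained block
            current = (a, list(x))
        elif current is None:                # not yet synchronized: discard
            continue
        else:                                # step 232: add the differences
            pa, px = current
            current = (pa + a, [p + d for p, d in zip(px, x)])
        nodes.append(current)
    return nodes

# A receiver switches on in the middle of a transmission (made-up values).
stream = [("diff", 1, [0, 1]), ("diff", -2, [1, 0]),
          ("abs", 41, [13, 0]), ("diff", 4, [-4, 1])]
recovered = nodes_after_sync(stream)
```

The number of skipped blocks corresponds to the start-up time discussed above, which is why the absolute blocks should not be spaced too far apart.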
The post-filter 218 then calculates the noise power limit at the new node in step 236. This step corresponds to step 84 of Figure 4 and can be omitted when, for example, the noise power limits at the nodes are transmitted in addition to the amplification values. In the subsequent steps 238 and 240, the post-filter 218 performs interpolations with respect to the filter coefficients and the noise power limit, which correspond to the interpolations 88 and 90 of Figure 4. The subsequent calculation, in step 242, of the amplification value for the sample position j on the basis of the interpolated noise power limit and the interpolated filter coefficients of steps 238 and 240 corresponds to step 92 of Figure 4. In step 244, the post-filter 218 applies the amplification value calculated in step 242 and the interpolated filter coefficients to the sample value at the sample position j. This step differs from step 94 of Figure 4 in that the interpolated filter coefficients are applied to the quantized filtered sample values s' in such a way that the transfer function of the parameterizable post-filter corresponds not to the inverse of the listening threshold, but to the listening threshold itself. Furthermore, the post-filter does not perform a multiplication by the amplification value, but a division by the amplification value, applied to the quantized filtered sample value s' or to the already inversely filtered, quantized filtered sample value at the position j. If the post-filter 218 has not yet reached the current node with the sample position j, which it checks in step 246, it increments the sample position index j in step 248 and starts steps 238-246 again. Only when the node has been reached does it apply the amplification value and the filter coefficients of the new node to the sample value at the node, namely in step 250.
This application in turn includes, as in step 244, a division by the amplification value instead of a multiplication, and a filtering with a transfer function that equals the listening threshold and not its inverse. After step 250, the current audio block has been decoded by an interpolation between two node parameterizations.
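The inverse relationship between pre-filter and post-filter can be illustrated with a small round trip. The filter structure is not specified in this passage, so the sketch assumes a simple time-varying FIR predictor form for the pre-filter and its recursive inverse for the post-filter; the function names, coefficient values and gains are made up, and no quantization is applied between the two filters.

```python
import math

def pre_filter(samples, coeffs, gains):
    """Assumed form: s'[n] = a[n] * (s[n] - sum_i x[n][i] * s[n-1-i])."""
    out = []
    for n, s in enumerate(samples):
        prediction = sum(c * samples[n - 1 - i]
                         for i, c in enumerate(coeffs[n]) if n - 1 - i >= 0)
        out.append(gains[n] * (s - prediction))      # multiply by the gain
    return out

def post_filter(filtered, coeffs, gains):
    """Inverse: divide by a[n], then add the prediction from decoded output."""
    out = []
    for n, v in enumerate(filtered):
        prediction = sum(c * out[n - 1 - i]
                         for i, c in enumerate(coeffs[n]) if n - 1 - i >= 0)
        out.append(v / gains[n] + prediction)        # divide by the gain
    return out

# Per-sample (interpolated) parameters, linearly varying over 64 samples.
N = 64
samples = [math.sin(0.1 * n) for n in range(N)]
coeffs = [[0.5 - 0.2 * n / (N - 1), -0.2 + 0.3 * n / (N - 1)] for n in range(N)]
gains = [1.0 + n / (N - 1) for n in range(N)]
decoded = post_filter(pre_filter(samples, coeffs, gains), coeffs, gains)
max_error = max(abs(d - s) for d, s in zip(decoded, samples))
```

Because both sides use the same per-sample coefficients and gains, the round trip reproduces the input up to floating point error; with a quantizer in between, only the quantization error, divided by the gain and shaped by the post-filter, would remain.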
As already mentioned, the noise introduced by the quantization during encoding in step 110 or 112 is adjusted in both shape and magnitude to the listening threshold by the filtering and the application of an amplification value in steps 244 and 250. It is also noted that, if the quantized filtered audio values were subjected to a further multiplication in step 126 by the bit rate controller before being encoded into the encoded signal, this factor can also be taken into account in steps 244 and 250. Alternatively, the audio values obtained by the process of Figure 11 can, of course, be subjected to a further multiplication to correspondingly amplify again the audio values weakened by a lower bit rate.
With respect to Figures 3, 5, 6 and 9 to 11, it is noted that they show flowcharts illustrating the mode of operation of the encoder of Figure 1 or the decoder of Figure 8, and that each of the steps illustrated in the flowcharts by a block is, as described, implemented in a corresponding means, as already described above. The individual steps can be implemented in hardware, as part of an ASIC circuit, or in software, as subroutines. In particular, the explanations written in the blocks in these figures indicate approximately which process the step corresponding to the respective block refers to, while the arrows between the blocks illustrate the order of the steps when operating the encoder and the decoder, respectively.
Referring to the previous description, it is pointed out again that the coding scheme illustrated above can be varied in many respects.
For example, it is not necessary for a parameterization and an amplification value or a noise power limit, as determined for a certain audio block, to be regarded as directly valid for a certain audio value, as in the previous embodiment for the last audio value of each audio block, i.e. the 128th value in this audio block, so that the interpolation can be omitted for this audio value. Rather, it is possible to relate these node parameter values to a node that lies temporally between the sample times tn, n = 0, ..., 127, of the audio values of this audio block, so that an interpolation would be necessary for each audio value. In particular, the parameterization determined for an audio block or the amplification value determined for this audio block can also be applied directly to another value, such as, for example, the audio value in the middle of the audio block, e.g. the 64th audio value in the case of the above block size of 128 audio values.
Additionally, it is pointed out that the above embodiment referred to an audio coding scheme designed to generate an encoded signal with a controlled bit rate. Bit rate control, however, is not necessary for every application. This is why the corresponding steps 116 to 122 and 126 or 125 can also be omitted.
With regard to the compression scheme mentioned with reference to step 114, reference is made, for reasons of completeness, to the document by Schuller et al. described in the introduction to the description and, in particular, to its section IV, the contents of which, with respect to redundancy reduction by means of lossless coding, are incorporated herein by reference.
In addition, the following is pointed out with reference to the previous embodiment.
Although it has been described above that the threshold value always remains constant during quantization, or even that the quantization step function always remains the same, i.e., that the filtered audio signal is always quantized coarsely or clipped above the threshold value, producing artifacts which can degrade the audio quality to an audible extent, it is also possible to use these measures only if the complexity of the audio signal requires it, that is, if the bit rate required for coding exceeds a desired bit rate. In this case, in addition to the quantization step functions shown in Figures 7a and 7b, one with a constant quantization step size over the entire range of possible values at the output of the prefilter can be provided, for example, and the quantizer would respond to a signal by using either the quantization step function with the always constant quantization step size or one of the quantization step functions according to Figures 7a or 7b, so that the quantizer can be instructed by the signal to perform, with little detriment to the audio quality, the coarser quantization above the threshold value or the clipping above the threshold value. Alternatively, the threshold value can also be reduced gradually. In this case, the reduction of the threshold value can be performed instead of the reduction of the factor in step 126. After a first compression pass without step 110, the provisionally compressed signal would only be subjected to selective threshold value quantization in a modified step 126 if the bit rate is still too high (118). In another pass, the filtered audio values would then be quantized with the quantization step function having a flatter course above the threshold value. Further bit rate reductions can be performed in the modified step 126 by reducing the threshold value and thus by a further modification of the quantization step function.
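The switchable quantization step functions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the quantizer of Figures 7a and 7b: the step size of 1 below the threshold, the threshold of 8, the coarse step of 4 and the names `quantize`, `mode`, `threshold` and `coarse_step` are all hypothetical.

```python
import math


def quantize(x, mode="uniform", threshold=8.0, coarse_step=4.0):
    """Quantize one prefiltered value.

    mode="uniform": step size 1 over the whole value range.
    mode="clip":    step size 1 up to the threshold, clipped above it.
    mode="coarse":  step size 1 up to the threshold, coarse_step above it,
                    which gives the step function a flatter course there.
    """
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mode == "uniform" or mag <= threshold:
        return sign * math.floor(mag + 0.5)  # round to the nearest integer level
    if mode == "clip":
        return sign * threshold
    if mode == "coarse":
        steps = math.floor((mag - threshold) / coarse_step + 0.5)
        return sign * (threshold + coarse_step * steps)
    raise ValueError("unknown mode: " + repr(mode))
```

A bit rate controller could, for instance, start with `mode="uniform"` and switch to `"coarse"` or `"clip"`, or lower `threshold` step by step, only when the resulting bit rate is still too high.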
Additionally, it is noted that the integration of the parameters a and x into the side information block, described above, can also take place in such a way that no differences are calculated, so that the corresponding parameters can be derived from each side information block alone. Furthermore, it is not necessary to perform the quantization such that, as explained with reference to step 110, the quantization step size changes from a certain upper threshold onward, being greater above this threshold than below it. Rather, quantization rules other than those shown in Figures 7a and 7b are also possible. In summary, the above embodiments use interpolation of the coefficients in an audio coding scheme which has a very small delay time. When encoding, side information is transmitted at certain intervals. The coefficients are interpolated between the transmission times. A coefficient indicating the permissible noise power, or area below the masking threshold, or a value from which it can be derived, is used for the interpolation and, preferably, is also transmitted, because it has favorable characteristics in the interpolation. Thus, on the one hand, the side information of the prefilter, the coefficients of which must be transmitted so that the postfilter in the decoder has the inverse transfer function and the audio signal can be reconstructed appropriately in the decoder, can be transmitted at a low bit rate, for example by transmitting the information only at certain intervals; on the other hand, the audio quality can be maintained to a relatively good degree, since the interpolation of the permissible noise power, as the area below the masking threshold, is a good approximation for the times between the nodes. In particular, it is pointed out that, depending on the circumstances, the inventive audio coding scheme can also be implemented in software.
The implementation can be on a digital storage medium, in particular a disk or a CD having electronically readable control signals, which can cooperate with a programmable computer system so that the corresponding method is executed. In general, the invention is thus also a computer program product having a program code, stored on a machine-readable carrier, for executing the inventive method when the computer program product runs on a computer. In other words, the invention can also be realized as a computer program having a program code for executing the method when the computer program runs on a computer. In particular, the above method steps in the blocks of the flowcharts can be implemented individually or in groups of several together in subroutines. Alternatively, an implementation of an inventive device in the form of an integrated circuit is, of course, also possible, in which these blocks are, for example, realized as individual circuit parts of an ASIC.
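As an illustration of such a software subroutine, the interpolation of the noise power limit and the scale value derived from it can be sketched as follows. The sketch assumes that the noise power limit is approximated by a rectangle-rule sum over sampled threshold magnitudes, that the quantizer is uniform with step size 1 (quantization noise power 1/12), and that the interpolation is linear; the names `noise_power_limit`, `scale_value` and `QUANT_NOISE_POWER` are hypothetical.

```python
import math

# Assumption: uniform quantizer with step size 1, noise power = step**2 / 12.
QUANT_NOISE_POWER = 1.0 / 12.0


def noise_power_limit(threshold_magnitudes, bin_width=1.0):
    """Approximate the area below the squared magnitude of the masking
    threshold by a rectangle-rule sum over its sampled magnitudes."""
    return sum(m * m for m in threshold_magnitudes) * bin_width


def scale_value(q0, q1, w, quant_noise_power=QUANT_NOISE_POWER):
    """Interpolate linearly between the noise power limits q0 and q1 of two
    adjacent nodes (w = 0.0 at the first node, 1.0 at the second) and return
    the square root of the quantization noise power divided by the result."""
    q = (1.0 - w) * q0 + w * q1
    if q <= 0.0:
        raise ValueError("noise power limit must be positive")
    return math.sqrt(quant_noise_power / q)
```

Because encoder and decoder would both run the same computation on the transmitted limits, only the limits at the nodes need to be sent; the scale values for the audio values in between follow from the interpolation.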

Claims (16)

  1. A device for encoding an audio signal comprising a sequence of audio values into a coded signal, the device comprising: a means for determining a first listening threshold for a first block of audio values of the sequence of audio values, and a second listening threshold for a second block of audio values of the sequence of audio values; a means for calculating a version of a first parameterization of a parameterizable filter, so that its transfer function corresponds approximately to the inverse of the magnitude of the first listening threshold, and a version of a second parameterization of the parameterizable filter, so that its transfer function corresponds approximately to the inverse of the magnitude of the second listening threshold; a means for determining a first noise power limit, which depends on the first masking threshold, and a second noise power limit, which depends on the second masking threshold; a means for parameterizably filtering and scaling a predetermined block of audio values of the sequence of audio values, to obtain a block of filtered, scaled audio values corresponding to the predetermined block, this means comprising: a means for interpolating between the version of the first parameterization and the version of the second parameterization, to obtain a version of an interpolated parameterization for a predetermined audio value in the predetermined block of audio values; a means for interpolating between the first noise power limit and the second noise power limit, to obtain an interpolated noise power limit for the predetermined audio value; a means for determining an intermediate scale value, which depends on the interpolated noise power limit; and a means for applying the parameterizable filter with the version of the interpolated parameterization and the intermediate scale value to the predetermined audio value, in order to obtain one of the filtered, scaled audio values; a means for quantizing the filtered, scaled audio values according to a quantization rule, to obtain a block of filtered, scaled, quantized audio values; and a means for integrating information into the coded signal, from which the block of filtered, scaled, quantized audio values, the version of the first parameterization, the first noise power limit and the second noise power limit can be derived.
  2. The device according to claim 1, wherein the means for determining the first and second noise power limits is configured to determine the first noise power limit as an area below the square of the magnitude of the first listening threshold, and the second noise power limit as an area below the square of the magnitude of the second listening threshold.
  3. The device according to claim 1 or 2, wherein the means for determining an intermediate scale value is configured to perform the determination additionally depending on a quantization noise power caused by the quantization rule.
  4. The device according to one of claims 1 to 3, further comprising a means for determining a second scale value, which depends on the quantization noise power and the second noise power limit, wherein the means for filtering and scaling further comprises a means for applying the parameterizable filter with the version of the second parameterization and the second scale value to an audio value associated with the predetermined block, to obtain one of the filtered, scaled audio values.
  5. The device according to claim 4, wherein the means for determining the first and second scale values comprises a means for calculating the square root of the quotient of the quantization noise power divided by the first noise power limit, and the square root of the quotient of the quantization noise power divided by the second noise power limit.
  6. The device according to one of the preceding claims, wherein the means for determining the intermediate scale value comprises a means for calculating the square root of the quotient of the quantization noise power divided by the interpolated noise power limit.
  7. The device according to one of the preceding claims, wherein the means for interpolating between the version of the first parameterization and the version of the second parameterization is configured to perform a linear interpolation.
  8. The device according to one of the preceding claims, wherein the means for interpolating between the first noise power limit and the second noise power limit is configured to perform a linear interpolation.
  9. The device according to one of the preceding claims, wherein the means for quantizing is configured to perform the quantization based on a quantization step function which has an approximately constant quantization step size up to a threshold value.
  10. The device according to one of the preceding claims, wherein the means for integrating comprises an entropy encoder.
  11. The device according to one of the preceding claims, wherein the means for integrating is configured so that the information represents the first or second noise power limit, or the first or second scale value.
  12. The device according to one of the preceding claims, further comprising a means for checking the parameterizations which are calculated by the means for calculating and which follow the first parameterization, one after the other, to see whether they differ from the first parameterization by more than a predetermined degree, and for selecting, among these parameterizations, as the second parameterization only the one for which this is the case for the first time.
  13. A method for encoding an audio signal comprising a sequence of audio values into a coded signal, the method comprising the steps of: determining a first listening threshold for a first block of audio values of the sequence of audio values, and a second listening threshold for a second block of audio values of the sequence of audio values; calculating a version of a first parameterization of a parameterizable filter, so that its transfer function corresponds approximately to the inverse of the magnitude of the first listening threshold, and a version of a second parameterization of the parameterizable filter, so that its transfer function corresponds approximately to the inverse of the magnitude of the second listening threshold; determining a first noise power limit, which depends on the first masking threshold, and a second noise power limit, which depends on the second masking threshold; parameterizably filtering and scaling a predetermined block of audio values of the sequence of audio values, to obtain a block of filtered, scaled audio values corresponding to the predetermined block, comprising the following sub-steps: interpolating between the version of the first parameterization and the version of the second parameterization, to obtain a version of an interpolated parameterization for a predetermined audio value in the predetermined block of audio values; interpolating between the first noise power limit and the second noise power limit, to obtain an interpolated noise power limit for the predetermined audio value; determining an intermediate scale value, which depends on the interpolated noise power limit; and applying the parameterizable filter with the version of the interpolated parameterization and the intermediate scale value to the predetermined audio value, to obtain one of the filtered, scaled audio values; quantizing the filtered, scaled audio values to obtain a block of filtered, scaled, quantized audio values; and integrating information into the coded signal, from which the block of filtered, scaled, quantized audio values, the version of the first parameterization, the version of the second parameterization, the first noise power limit and the second noise power limit can be derived.
  14. A device for decoding a coded signal into a decoded audio signal, the coded signal containing information from which a predetermined block of filtered, scaled, quantized audio values, a version of a first parameterization, a version of a second parameterization, a first noise power limit and a second noise power limit can be derived, the device comprising: a means for deriving the predetermined block of filtered, scaled, quantized audio values, the version of the first parameterization, the version of the second parameterization, the first noise power limit and the second noise power limit from the coded signal; a means for parameterizably filtering and scaling the predetermined block of filtered, scaled, quantized audio values to obtain a corresponding block of decoded audio values, this means comprising: a means for interpolating between the version of the first parameterization and the version of the second parameterization, to obtain a version of an interpolated parameterization for a predetermined audio value in the block of filtered, scaled, quantized audio values; a means for interpolating between the first noise power limit and the second noise power limit, to obtain an interpolated noise power limit for the predetermined audio value; a means for determining an intermediate scale value, which depends on the interpolated noise power limit; and a means for applying the parameterizable filter with the version of the interpolated parameterization and the intermediate scale value to the predetermined audio value, in order to obtain one of the decoded audio values.
  15. A method for decoding a coded signal into a decoded audio signal, the coded signal containing information from which a predetermined block of filtered, scaled, quantized audio values, a version of a first parameterization, a version of a second parameterization, a first noise power limit and a second noise power limit can be derived, the method comprising the steps of: deriving the predetermined block of filtered, scaled, quantized audio values, the version of the first parameterization, the version of the second parameterization, the first noise power limit and the second noise power limit from the coded signal; and parameterizably filtering and scaling the predetermined block of filtered, scaled, quantized audio values to obtain a corresponding block of decoded audio values, comprising the following sub-steps: interpolating between the version of the first parameterization and the version of the second parameterization, to obtain a version of an interpolated parameterization for a predetermined audio value in the block of filtered, scaled, quantized audio values; interpolating between the first noise power limit and the second noise power limit, to obtain an interpolated noise power limit for the predetermined audio value; determining an intermediate scale value, which depends on the interpolated noise power limit; and applying the parameterizable filter with the version of the interpolated parameterization and the intermediate scale value to the predetermined audio value, to obtain one of the decoded audio values.
  16. A computer program having a program code for performing the method according to claim 13 or 15 when the computer program runs on a computer.
MXPA/A/2006/009146A 2004-02-13 2006-08-11 Audio coding MXPA06009146A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
DE102004007200.0 2004-02-13

Publications (1)

Publication Number Publication Date
MXPA06009146A (en) 2006-12-13


Similar Documents

Publication Publication Date Title
CA2556099C (en) Audio coding
RU2337413C2 (en) Method and device for data signal quantisation
JP4673882B2 (en) Method and apparatus for determining an estimate
US20080010064A1 (en) Apparatus for coding a wideband audio signal and a method for coding a wideband audio signal
CA2556325C (en) Audio encoding
EP3175457B1 (en) Method for estimating noise in an audio signal, noise estimator, audio encoder, audio decoder, and system for transmitting audio signals
MXPA06009146A (en) Audio coding
JP5238512B2 (en) Audio signal encoding method and decoding method
JP4273062B2 (en) Encoding method, encoding apparatus, decoding method, and decoding apparatus
JP2002182695A (en) High-performance encoding method and apparatus
MXPA06009144A (en) Audio encoding
MXPA06009110A (en) Method and device for quantizing a data signal