DE10217297A1

DE10217297A1 - Device and method for coding a discrete-time audio signal and device and method for decoding coded audio data

Info

Publication number: DE10217297A1
Application number: DE10217297A
Authority: DE
Inventors: Ralf Geiger; Thomas Sporer; Karlheinz Brandenburg; Juergen Herre; Juergen Koller
Original assignee: Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Current assignee: Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority date: 2002-04-18
Filing date: 2002-04-18
Publication date: 2003-11-06
Also published as: CN1258172C; CN1625768A; EP1495464A1; ATE305655T1; KR20050007312A; JP2005527851A; WO2003088212A1; EP1495464B1; AU2002358578A1; HK1077391A1; CA2482427C; KR100892152B1; CA2482427A1; DE50204426D1; JP4081447B2

Abstract

According to the invention, a discrete-time audio signal is processed (52) to provide a quantization block of quantized spectral values. In addition, an integer spectral representation is generated from the discrete-time audio signal using an integer transform algorithm (56). The quantization block, which has been generated using a psychoacoustic model (54), is inversely quantized and rounded (58), and the difference between the integer spectral values and the rounded inversely quantized spectral values is formed to obtain a combination block. After decoding, the quantization block alone yields a lossy, psychoacoustically coded/decoded audio signal, whereas the quantization block together with the combination block yields a lossless or nearly lossless coded/decoded audio signal. Forming the difference signal in the frequency domain allows a simpler encoder/decoder structure.

Description

The present invention relates to audio encoding/decoding and, in particular, to scalable encoding/decoding algorithms having a psychoacoustic first scaling layer and a second scaling layer that comprises additional audio data for lossless decoding.

Modern audio coding methods, such as MPEG Layer 3 (MP3) or MPEG AAC, use transforms such as the so-called modified discrete cosine transform (MDCT) to obtain a block-wise frequency representation of an audio signal. Such an audio encoder typically receives a stream of discrete-time audio samples. The stream of audio samples is windowed to obtain a windowed block of, for example, 1024 or 2048 windowed audio samples. Various window functions are used for windowing, such as a sine window.

The windowed discrete-time audio samples are then converted into a spectral representation by means of a filter bank. In principle, a Fourier transform or, for special reasons, a variant of the Fourier transform, such as an FFT or, as has been explained, an MDCT, can be used for this purpose. The block of audio spectral values at the output of the filter bank can then be processed further as required. In the audio encoders mentioned above, the audio spectral values are then quantized, the quantization steps typically being chosen such that the quantization noise introduced by quantizing lies below the psychoacoustic masking threshold, i.e. is "masked away". Quantization is a lossy coding step. To achieve a further reduction in the amount of data, the quantized spectral values are subsequently entropy-coded, for example by means of Huffman coding. By adding side information, such as scale factors, a bit stream that can be stored or transmitted is formed from the entropy-coded quantized spectral values by means of a bit stream multiplexer.

In the audio decoder, the bit stream is split into coded quantized spectral values and side information by means of a bit stream demultiplexer. The entropy-coded quantized spectral values are first entropy-decoded to obtain the quantized spectral values. The quantized spectral values are then inversely quantized to obtain decoded spectral values, which contain quantization noise that lies, however, below the psychoacoustic masking threshold and will therefore be inaudible. These spectral values are then converted into a temporal representation by means of a synthesis filter bank to obtain discrete-time decoded audio samples. The synthesis filter bank must use a transform algorithm that is the inverse of the transform algorithm used in the encoder. In addition, the windowing must be undone after the frequency-to-time inverse transform.

In order to achieve good frequency selectivity, modern audio encoders typically use block overlap. Such a case is shown in Fig. 4a. First, for example, 2048 discrete-time audio samples are taken and windowed by means 402. The window embodied by means 402 has a window length of 2N samples and provides a block of 2N windowed samples at its output. To achieve a window overlap, a second block of 2N windowed samples is formed by means 404, which is shown separately from means 402 in Fig. 4a only for reasons of clarity. However, the 2048 samples fed into means 404 are not the discrete-time audio samples immediately following the first window; instead, they contain the second half of the samples windowed by means 402 and only 1024 "new" samples in addition. The overlap is represented symbolically in Fig. 4a by means 406, which brings about a degree of overlap of 50%. Both the 2N windowed samples output by means 402 and the 2N windowed samples output by means 404 are then subjected to the MDCT algorithm by means 408 and 410, respectively. According to the known MDCT algorithm, means 408 provides N spectral values for the first window, while means 410 also provides N spectral values, but for the second window, there being a 50% overlap between the first window and the second window.

In the decoder, as shown in Fig. 4b, the N spectral values of the first window are fed to means 412, which performs an inverse modified discrete cosine transform. The same applies to the N spectral values of the second window. These are fed to means 414, which also performs an inverse modified discrete cosine transform. Means 412 and means 414 each provide 2N samples, for the first window and for the second window, respectively.

In means 416, which is labeled TDAC (Time Domain Aliasing Cancellation) in Fig. 4b, the fact that the two windows overlap is taken into account. In particular, a sample y₁ from the second half of the first window, i.e. with an index N + k, is summed with a sample y₂ from the first half of the second window, i.e. with an index k, so that N decoded time samples result at the output, i.e. in the decoder.

It should be noted that the function of means 416, which is also referred to as an add function, takes the windowing performed in the encoder shown schematically in Fig. 4a into account automatically, as it were, so that no explicit "inverse windowing" has to take place in the decoder shown in Fig. 4b.
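The overlap-add of Fig. 4a and Fig. 4b can be illustrated with a minimal sketch. It is not taken from the patent: the direct O(N²) transform formulas, the 2/N scaling, the sine window definition and all function names are assumptions chosen for illustration. The sketch also assumes the common convention in which the same window is applied a second time after the inverse transform, which is the setting in which a squared-window condition yields exact reconstruction.

```python
import math

def sine_window(n_half):
    """Sine window of length 2N (assumed definition, not from the source)."""
    return [math.sin(math.pi * (k + 0.5) / (2 * n_half)) for k in range(2 * n_half)]

def mdct(block):
    """Direct MDCT: 2N time samples -> N spectral values."""
    n_half = len(block) // 2
    return [
        sum(block[k] * math.cos(math.pi / n_half * (k + 0.5 + n_half / 2) * (m + 0.5))
            for k in range(2 * n_half))
        for m in range(n_half)
    ]

def imdct(spectrum):
    """Direct inverse MDCT: N spectral values -> 2N time samples that still contain aliasing."""
    n_half = len(spectrum)
    return [
        2.0 / n_half * sum(
            spectrum[m] * math.cos(math.pi / n_half * (k + 0.5 + n_half / 2) * (m + 0.5))
            for m in range(n_half))
        for k in range(2 * n_half)
    ]

def apply_window(block, window):
    return [w * v for w, v in zip(window, block)]

N = 4
w = sine_window(N)
x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, -5, 8]  # 3N arbitrary samples

# Two blocks of 2N samples with 50% overlap (means 402/404, 408/410, 412/414).
y1 = apply_window(imdct(mdct(apply_window(x[0:2 * N], w))), w)
y2 = apply_window(imdct(mdct(apply_window(x[N:3 * N], w))), w)

# Add function (means 416): index N + k of window 1 plus index k of window 2.
reconstructed = [y1[N + k] + y2[k] for k in range(N)]
```

Under these assumptions, `reconstructed` matches `x[N:2 * N]` up to floating-point error, because the aliasing terms of the two overlapping blocks cancel in the sum.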

If the window function implemented by means 402 or 404 is denoted by w(k), where the index k is the time index, then the condition must be met that the square of the window weight w(k) added to the square of the window weight w(N + k) gives 1, where k runs from 0 to N - 1. If a sine window is used whose window weights follow the first half-wave of the sine function, this condition is always met, since the square of the sine and the square of the cosine add up to 1 for every angle.
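Written out, the window condition reads as follows. The specific sine window formula in the second line is an assumption (a commonly used definition); the text itself only states that the weights follow the first half-wave of the sine function.

```latex
% Window condition from the text, for k = 0, ..., N-1:
w(k)^2 + w(N+k)^2 = 1

% Assumed sine window of length 2N:
w(k) = \sin\!\left(\frac{\pi\,(2k+1)}{4N}\right), \qquad k = 0, \dots, 2N-1

% Shifting the index by N shifts the angle by \pi/2, which turns the sine into a cosine:
w(N+k) = \sin\!\left(\frac{\pi\,(2k+1)}{4N} + \frac{\pi}{2}\right)
       = \cos\!\left(\frac{\pi\,(2k+1)}{4N}\right)

% Hence the condition holds for every k:
w(k)^2 + w(N+k)^2
  = \sin^2\!\left(\frac{\pi\,(2k+1)}{4N}\right) + \cos^2\!\left(\frac{\pi\,(2k+1)}{4N}\right) = 1
```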

A disadvantage of the windowing method described in Fig. 4a with a subsequent MDCT is that, taking a sine window as an example, the windowing is achieved by multiplying a discrete-time sample by a floating-point number, since the sine of an angle between 0 and 180 degrees does not yield an integer except at 90 degrees. Even if integer discrete-time samples are windowed, floating-point numbers therefore result after windowing.
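A small sketch makes this effect visible. The sample values and the sine window definition are hypothetical and chosen only for illustration.

```python
import math

N = 2
samples = [100, -200, 300, 400]  # hypothetical integer audio samples (2N values)

# Assumed sine window of length 2N; none of its weights is an integer.
window = [math.sin(math.pi * (k + 0.5) / (2 * N)) for k in range(2 * N)]

# Windowing multiplies every integer sample by a non-integer weight.
windowed = [w * s for w, s in zip(window, samples)]

# True where the windowed value has a fractional part.
has_fraction = [abs(v - round(v)) > 1e-6 for v in windowed]
```

For these inputs every entry of `has_fraction` is `True`, so a subsequent transform operates on floating-point numbers even though the input was integer.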

Therefore, even if no psychoacoustic encoder is used, i.e. if lossless coding is to be achieved, quantization is necessary at the output of means 408 and 410, respectively, in order to be able to carry out reasonably manageable entropy coding.

If known transforms such as the one described with reference to Fig. 4a are to be used for lossless audio coding, either a very fine quantization must be used so that the resulting error due to the rounding of the floating-point numbers can be neglected, or the error signal must additionally be coded, for example in the time domain.

Concepts of the former kind, i.e. those in which the quantization is set so finely that the resulting error due to the rounding of the floating-point numbers is negligible, are disclosed, for example, in German patent DE 197 42 201 C1. There, an audio signal is converted into its spectral representation and quantized to obtain quantized spectral values. The quantized spectral values are inversely quantized again, converted to the time domain and compared with the original audio signal. If the error, i.e. the error between the original audio signal and the quantized/inversely quantized audio signal, lies above an error threshold, the quantizer is set more finely by way of feedback and the comparison is carried out again. The iteration ends when the error falls below the error threshold. Any residual signal that may still be present is coded with a time-domain encoder and written into a bit stream which, in addition to the time-domain-coded residual signal, also contains coded spectral values quantized according to the quantizer settings in effect when the iteration was terminated. It should be noted that the quantizer used does not have to be controlled by a psychoacoustic model, so that the coded spectral values are typically quantized more accurately than the psychoacoustic model would require.

The technical publication "A Design of Lossy and Lossless Scalable Audio Coding", T. Moriya et al., Proc. ICASSP, 2000, describes a scalable encoder which comprises, as a first lossy data compression module, e.g. an MPEG encoder that takes a block-wise digital waveform as its input signal and generates the compressed bit stream. In a local decoder that is also present, the coding is undone again and a coded/decoded signal is generated. This signal is compared with the original input signal by subtracting the coded/decoded signal from the original input signal. The error signal is then fed into a second module, where a lossless bit conversion is used. This conversion has two steps. The first step is a conversion from a two's complement format into a sign-magnitude format. The second step is the conversion of a vertical magnitude sequence into a horizontal bit sequence in a processing block. The lossless data conversion is carried out to maximize the number of zeros, or the number of consecutive zeros in a sequence, in order to achieve the best possible compression of the temporal error signal, which is present as a sequence of digital numbers. This principle is based on a bit-slice arithmetic coding (BSAC) scheme, which is presented in the technical publication "Multi-Layer Bit Sliced Bit Rate Scalable Audio Coder", 103rd AES Convention, Preprint No. 4520, 1997.
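The two conversion steps can be sketched as follows. This is one plausible reading of the "vertical to horizontal" conversion as bit-plane slicing and is not taken from the cited publications; the function names and the sample error values are hypothetical.

```python
def to_sign_magnitude(values):
    """Step 1: split signed integers into a sign bit (1 = negative) and a magnitude."""
    signs = [1 if v < 0 else 0 for v in values]
    magnitudes = [abs(v) for v in values]
    return signs, magnitudes

def bit_planes(magnitudes, num_bits):
    """Step 2: regroup the magnitudes by bit position, most significant plane first.

    Each returned row holds one bit of every magnitude in the block.
    """
    return [[(m >> b) & 1 for m in magnitudes] for b in range(num_bits - 1, -1, -1)]

error_signal = [1, -2, 0, 3, -1, 0]  # hypothetical small time-domain error values
signs, magnitudes = to_sign_magnitude(error_signal)
planes = bit_planes(magnitudes, num_bits=3)
```

For a small error signal such as this one, the most significant plane consists only of zeros, which is the kind of long zero run the conversion is meant to expose to the entropy coder.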

A disadvantage of the concepts described above is that the data for the lossless enhancement layer, i.e. the additional data required to achieve lossless decoding of the audio signal, must be obtained in the time domain. This means that complete decoding, including a frequency-to-time conversion, is required to obtain the coded/decoded signal in the time domain, so that the error signal can be calculated by forming the sample-by-sample difference between the original audio input signal and the coded/decoded audio signal, which is lossy due to the psychoacoustic coding. This concept is disadvantageous in particular in that the encoder generating the audio data stream requires both a complete time-to-frequency conversion means, such as a filter bank or an MDCT algorithm, for the forward transform and, at the same time, solely for generating the error signal, a complete inverse filter bank or a complete synthesis algorithm. In addition to its inherent encoder functionality, the encoder must therefore also contain the complete decoder functionality. If the encoder is implemented in software, this requires both memory capacity and processor capacity, which leads to a more complex encoder implementation.

The object of the present invention is to provide a less complex concept by which an audio data stream can be generated that can be decoded at least nearly losslessly.

This object is achieved by a device for coding a discrete-time audio signal according to claim 1, by a method for coding a discrete-time audio signal according to claim 13, by a device for decoding coded audio data according to claim 14, by a method for decoding coded audio data according to claim 15, or by a computer program according to claim 16 or 17.

The present invention is based on the finding that the additional audio data enabling lossless decoding of the audio signal can be obtained by providing a block of quantized spectral values as usual and then inversely quantizing it to obtain inversely quantized spectral values, which are lossy due to the quantization by means of a psychoacoustic model. These inversely quantized spectral values are then rounded to obtain a rounding block of rounded inversely quantized spectral values. As the reference for forming the difference, the invention uses an integer transform algorithm which generates, from a block of integer discrete-time samples, an integer block of spectral values containing only integer spectral values. According to the invention, the spectral values in the rounding block and in the integer block are now combined spectral value by spectral value, i.e. in the frequency domain, so that no synthesis algorithm, i.e. no inverse filter bank or inverse MDCT algorithm, is required in the encoder itself. Owing to the integer transform algorithm and the rounded quantization values, the combination block containing the difference spectral values comprises only integer values, which can be entropy-coded in any known manner. It should be noted that any entropy encoder, such as a Huffman encoder or an arithmetic encoder, can be used for entropy coding the combination block.
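The frequency-domain difference formation can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the uniform quantizer stands in for a psychoacoustically controlled quantizer, the step sizes and spectral values are hypothetical, and both transforms (MDCT and integer transform) are assumed to have been computed elsewhere.

```python
def quantize(spectrum, steps):
    """Stand-in uniform quantizer; real step sizes would come from a psychoacoustic model."""
    return [round(v / s) for v, s in zip(spectrum, steps)]

def inverse_quantize(q_block, steps):
    return [q * s for q, s in zip(q_block, steps)]

def encode_block(mdct_spectrum, int_spectrum, steps):
    """Return the quantization block and the integer combination (difference) block."""
    q_block = quantize(mdct_spectrum, steps)                          # lossy layer
    rounding_block = [round(v) for v in inverse_quantize(q_block, steps)]
    # Difference per spectral value; no inverse transform is needed in the encoder.
    combination_block = [i - r for i, r in zip(int_spectrum, rounding_block)]
    return q_block, combination_block

# Hypothetical values: a float MDCT spectrum and the integer spectrum of the same block.
mdct_spectrum = [103.7, -41.2, 12.9, -3.4]
int_spectrum = [104, -41, 13, -3]
steps = [8.0, 4.0, 3.0, 2.0]

q_block, combination_block = encode_block(mdct_spectrum, int_spectrum, steps)
```

With these inputs the quantization block is `[13, -10, 4, -2]` and the combination block is `[0, -1, 1, 1]`. Both contain only integers, so either can be passed to an entropy coder. Whatever rounding function is used here must be reproduced exactly in the decoder.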

Any encoders can likewise be used for coding the quantized spectral values of the quantization block, such as the known tools customary in modern audio encoders.

It should be noted that the coding/decoding concept according to the invention is compatible with modern coding tools such as window switching, TNS or mid/side coding for multi-channel audio signals.

In a preferred embodiment of the present invention, an MDCT is used to provide a quantization block of spectral values quantized using a psychoacoustic model. In addition, it is preferred to use a so-called IntMDCT as the integer transform algorithm.

In an alternative embodiment of the present invention, the usual MDCT can be dispensed with and the IntMDCT can be used as an approximation of the MDCT, in that the integer spectrum obtained by the integer transform algorithm is fed to a psychoacoustic quantizer to obtain quantized IntMDCT spectral values, which are then inversely quantized again and rounded in order to be compared with the original integer spectral values. In this case, only a single transform is required, namely the IntMDCT, which generates integer spectral values from integer discrete-time samples.

Typically, processors work with integers, or every floating-point number can be represented as an integer. If integer arithmetic is used in a processor, the rounding of the inversely quantized spectral values can be dispensed with, since the arithmetic of the processor yields rounded values anyway, namely within the accuracy of the LSB, i.e. the least significant bit. In this case, completely lossless processing is achieved, i.e. processing within the accuracy of the processor system used. Alternatively, however, rounding to a coarser accuracy can be carried out, such that the difference signal in the combination block is rounded to the accuracy determined by a rounding function. Introducing rounding beyond the inherent rounding of a processor system provides flexibility in influencing the "degree" of losslessness of the coding, in order to create a nearly lossless encoder for the sake of data compression.

The decoder according to the invention is characterized in that both the psychoacoustically encoded audio data and the additional audio data are extracted from the audio data, subjected to entropy decoding where applicable, and then processed as follows. First, the quantization block is inversely quantized in the decoder and rounded using the same rounding function that was used in the encoder, and is then added to the entropy-decoded additional audio data. The decoder then holds both a psychoacoustically compressed spectral representation of the audio signal and a lossless representation of the audio signal. The psychoacoustically compressed spectral representation is converted into the time domain to obtain a lossy encoded/decoded audio signal, while the lossless representation is converted into the time domain using an integer transformation algorithm that is inverse to the encoder's integer transformation algorithm, to obtain a losslessly or, as explained above, nearly losslessly encoded/decoded audio signal.

Preferred embodiments of the present invention are explained in detail below with reference to the accompanying drawings, in which:

Fig. 1 shows a block diagram of a preferred means for processing discrete-time audio samples to obtain integer values from which integer spectral values can be determined;

Fig. 2 shows a schematic representation of the decomposition of an MDCT and an inverse MDCT into Givens rotations and two DCT-IV operations;

Fig. 3 shows a diagram illustrating the decomposition of the MDCT with 50 percent overlap into rotations and DCT-IV operations;

Fig. 4a shows a schematic block diagram of a known encoder with MDCT and 50 percent overlap;

Fig. 4b shows a block diagram of a known decoder for decoding the values generated by Fig. 4a;

Fig. 5 shows a basic block diagram of a preferred encoder according to the invention;

Fig. 6 shows a basic block diagram of an alternative preferred encoder according to the invention; and

Fig. 7 shows a basic block diagram of a preferred decoder according to the invention.

In the following, encoder circuits according to the invention (Fig. 5 and Fig. 6) and a preferred decoder circuit according to the invention (Fig. 7) are discussed with reference to Figs. 5 to 7. The encoder according to the invention shown in Fig. 5 comprises an input 50, into which a discrete-time audio signal can be fed, and an output 52, from which encoded audio data can be output. The discrete-time audio signal fed in at input 50 is fed into a means 52 for providing a quantization block, which supplies on its output side a quantization block of the discrete-time audio signal; this block comprises spectral values of the discrete-time audio signal 50 that have been quantized using a psychoacoustic model 54. The encoder according to the invention further comprises a means for generating an integer block using an integer transformation algorithm 56, the integer algorithm being operative to generate integer spectral values from integer discrete-time samples.

The encoder according to the invention further comprises a means 58 for inversely quantizing the quantization block output by means 52 and, if an accuracy other than the processor accuracy is required, a rounding function. If processing is to go down to the accuracy of the processor system, as explained above, the rounding function is already inherent in the inverse quantization of the quantization block, since a processor with integer arithmetic cannot deliver non-integer values anyway. Means 58 thus supplies a so-called rounding block comprising inversely quantized spectral values that are integers, i.e. that have been rounded inherently or explicitly. Both the rounding block and the integer block are fed to a combination means which, by forming a difference, supplies a difference block with difference spectral values; the term "difference block" indicates that the difference spectral values represent the differences between the integer block and the rounding block.
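The formation of the difference block can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: a uniform quantizer with step size `step` stands in for the psychoacoustic quantizer, the function names `quantize`, `inverse_quantize` and `difference_block` are hypothetical, and all spectral values are made up.

```python
def quantize(spectrum, step):
    """Hypothetical uniform quantizer standing in for the psychoacoustic quantizer."""
    return [round(x / step) for x in spectrum]

def inverse_quantize(indices, step):
    """Inverse of the hypothetical quantizer; the results are generally non-integer."""
    return [q * step for q in indices]

def difference_block(integer_block, quant_block, step):
    """Integer spectral values minus the rounded inverse-quantized values."""
    rounding_block = [round(v) for v in inverse_quantize(quant_block, step)]
    return [i - r for i, r in zip(integer_block, rounding_block)]

mdct_spectrum = [103.7, -41.2, 7.9, 0.4]   # floating-point MDCT values (made up)
integer_block = [104, -41, 8, 0]           # integer-transform values (made up)

quant_block = quantize(mdct_spectrum, step=3.3)
diff_block = difference_block(integer_block, quant_block, step=3.3)
```

Because both operands of the subtraction are integers, the difference block is itself an integer block and can be entropy-encoded without further loss.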

Both the quantization block output from means 52 and the difference block output from the difference forming means 58 are fed to a processing means 60, which, for example, performs conventional processing of the quantization block and also, for example, entropy-encodes the difference block. The processing means 60 outputs encoded audio data at output 52, comprising both information about the quantization block and information about the difference block.

In a first preferred embodiment, as shown in Fig. 6, the discrete-time audio signal is converted into its spectral representation by means of an MDCT and then quantized. The means 52 for providing the quantization block thus consists of the MDCT means 52a and a quantizer 52b.

In addition, it is preferred to generate the integer block using an IntMDCT 56 as the integer transformation algorithm.

In Fig. 6, the processing means 60 shown in Fig. 5 is further represented as a bitstream encoder 60a for bitstream-encoding the quantization block output by means 52b, and as an entropy encoder 60b for entropy-encoding the difference block. The bitstream encoder 60a outputs the psychoacoustically encoded audio data, while the entropy encoder 60b outputs an entropy-encoded difference block. The output data of blocks 60a and 60b can be combined in a suitable manner into a bitstream that has the psychoacoustically encoded audio data as its first scaling layer and the additional audio data for lossless decoding as its second scaling layer. The scaled bitstream then corresponds to the encoded audio data shown in Fig. 5 at output 52 of the encoder.

In an alternative preferred embodiment, the MDCT block 52a of Fig. 6 can be dispensed with, as indicated in Fig. 5 by a dashed arrow 62. In this case, the integer spectrum supplied by the integer transformation means 56 is fed both into the difference forming means 58 and into the quantizer 52b of Fig. 6. The spectral values generated by the integer transformation are used here, in effect, as an approximation of a conventional MDCT spectrum. This embodiment has the advantage that only the IntMDCT algorithm needs to be present in the encoder, rather than both the IntMDCT algorithm and the MDCT algorithm.

Referring again to Fig. 6, it should be noted that the solid blocks and lines represent a conventional audio encoder according to one of the MPEG standards, while the dashed blocks and lines represent the extension of such a conventional MPEG encoder. It can thus be seen that no fundamental change to the conventional MPEG encoder is required; rather, the generation according to the invention of the additional audio data for lossless coding by means of an integer transformation can be added without changing the basic encoder/decoder structure.

Fig. 7 shows a basic block diagram of a decoder according to the invention for decoding the encoded audio data output at output 52 of Fig. 5. These data are first split into the psychoacoustically encoded audio data on the one hand and the additional audio data on the other. The psychoacoustically encoded audio data are fed to a conventional bitstream decoder 70, while the additional audio data, if they were entropy-encoded in the encoder, are entropy-decoded by means of an entropy decoder 72. Quantized spectral values are present at the output of the bitstream decoder 70 of Fig. 7 and are fed to an inverse quantizer 74, which can in principle be constructed identically to the inverse quantizer in means 58 of Fig. 6. If an accuracy other than the processor accuracy is desired, a rounding means 76 is also provided in the decoder; it performs the same rounding algorithm or rounding function for mapping a real number to an integer as may be implemented in means 58 of Fig. 6. In a decoder-side combiner 78, the rounded inversely quantized spectral values are combined with the entropy-decoded additional audio data, spectral value by spectral value and preferably additively, so that the decoder holds inversely quantized spectral values at the output of means 74 on the one hand and integer spectral values at the output of combiner 78 on the other.
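The decoder-side combination can be sketched as follows in Python. As before, this is only an illustration under assumptions: a uniform inverse quantizer with step size `step` stands in for the inverse quantizer 74, the function name `reconstruct_integer_block` is hypothetical, and the input values are made up.

```python
def reconstruct_integer_block(quant_block, diff_block, step, rounding=round):
    """Inverse quantize, round with the encoder's rounding function, add the residual."""
    rounded = [rounding(q * step) for q in quant_block]
    return [r + d for r, d in zip(rounded, diff_block)]

# Values as they might arrive from the bitstream decoder and the entropy decoder.
quant_block = [31, -12, 2, 0]   # quantized spectral values (first scaling layer)
diff_block = [2, -1, 1, 0]      # difference spectral values (second scaling layer)

integer_block = reconstruct_integer_block(quant_block, diff_block, step=3.3)
```

The reconstruction is exact only if the decoder applies the same inverse quantization and the same rounding function as the encoder; any deviation there would end up as an error in the integer spectrum.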

The spectral values at the output of means 74 can then be converted into the time domain by a means 80 for performing an inverse modified discrete cosine transformation, to obtain a lossy, psychoacoustically encoded and decoded audio signal. In addition, a means 82 for performing an inverse integer MDCT (IntMDCT) converts the output signal of combiner 78 into its time-domain representation, to produce a losslessly encoded and decoded audio signal or, if a correspondingly coarser rounding has been used, a nearly losslessly encoded and decoded audio signal.

A particularly preferred embodiment of the entropy encoder 60b of Fig. 6 is discussed below. Since a conventional modern MPEG encoder provides several code tables, which are selected depending on average statistics of the quantized spectral values, it is preferred to use the same code tables or codebooks for entropy-encoding the difference block at the output of combiner 58. Since the magnitude of the difference block, i.e. of the residual IntMDCT spectrum, depends on the accuracy of the quantization, a codebook selection for the entropy encoder 60b can be made without additional side information.

In an MPEG-2 AAC encoder, the spectral coefficients, i.e. the quantized spectral values in the quantization block, are grouped into scale factor bands, the spectral values being weighted with a gain factor derived from the scale factor assigned to the respective scale factor band. Since this known encoder concept uses a non-uniform quantizer to quantize the weighted spectral values, the size of the residual values, i.e. of the spectral values at the output of combiner 58, depends not only on the scale factors but also on the quantized values themselves. However, since both the scale factors and the quantized spectral values are contained in the bitstream generated by means 60a of Fig. 6, i.e. in the psychoacoustically encoded audio data, it is preferred to select a codebook in the encoder depending on the size of the difference spectral values and to determine the code table used in the encoder at the decoder on the basis of both the scale factors transmitted in the bitstream and the quantized values. Since no side information has to be transmitted for entropy-encoding the difference spectral values at the output of combiner 58, the entropy coding results purely in data rate compression, without any signaling bits having to be spent in the data stream as side information for the entropy encoder 60b.
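How a codebook choice might be derived without side information can be sketched in Python. This is a simplified illustration, not the procedure specified in the patent or in any MPEG standard: the inverse quantizer uses a 4/3 power law and a scale factor offset of 100 modeled on AAC, both of which should be treated as assumptions here, and the threshold table, the function names and the sample values are hypothetical. The sketch derives the choice on both sides from the same transmitted data, which is a simplification of the encoder-side selection described above.

```python
def reconstruction_level(q, scalefactor):
    """Magnitude reconstructed by an AAC-style non-uniform inverse quantizer (assumed form)."""
    gain = 2.0 ** (0.25 * (scalefactor - 100))   # offset 100 is an assumption
    return (abs(q) ** (4.0 / 3.0)) * gain

def expected_residual_bound(q, scalefactor):
    """Half the distance to the next reconstruction level: rough size of the residual."""
    step = reconstruction_level(abs(q) + 1, scalefactor) - reconstruction_level(q, scalefactor)
    return 0.5 * step

def select_codebook(quant_band, scalefactor, thresholds=(1, 4, 16, 64)):
    """Hypothetical codebook index computed only from data present in the bitstream."""
    bound = max(expected_residual_bound(q, scalefactor) for q in quant_band)
    return sum(bound > t for t in thresholds)

band = [3, -1, 0]                      # quantized values of one scale factor band (made up)
encoder_choice = select_codebook(band, scalefactor=120)
decoder_choice = select_codebook(band, scalefactor=120)   # same inputs, same result
```

Since the scale factor and the quantized values are available to the decoder anyway, it can repeat the same computation and arrive at the same table, so no signaling bits are needed.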

In an audio encoder according to the MPEG-2 AAC standard, window switching is used to avoid pre-echoes in transient regions of the audio signal. This technique is based on the possibility of selecting window shapes individually in each half of the MDCT window, and it allows the block size to be varied in successive blocks. In a similar manner, the integer transformation algorithm in the form of the IntMDCT, which is discussed with reference to Figs. 1 to 3, is designed to likewise use different window shapes in the windowing and in the time-domain aliasing section of the MDCT decomposition. It is therefore preferred to use the same window decisions both for the integer transformation algorithm and for the transformation algorithm that generates the quantization block.

An encoder according to MPEG-2 AAC also contains several further coding tools, of which only TNS (Temporal Noise Shaping) and mid/side (MS) stereo coding are mentioned here. With TNS coding, just as with MS coding, the spectral values are modified before quantization. Consequently, the difference between the IntMDCT values, i.e. the integer block, and the quantized MDCT values increases. According to the invention, the integer transformation algorithm is designed to permit both TNS coding and mid/side coding of integer spectral values as well. The TNS technique is based on an adaptive forward prediction of the MDCT values over frequency. The same prediction filter that a conventional TNS module computes in a signal-adaptive manner is preferably also used to predict the integer spectral values; if this produces non-integer values, a subsequent rounding can be applied to obtain integer values again. This rounding is preferably performed after each prediction step. In the decoder, the original spectrum can be reconstructed by applying the inverse filter and the same rounding function. In a similar manner, MS coding can also be applied to IntMDCT spectral values by using rounded Givens rotations with an angle of π/4, based on the lifting scheme. The original IntMDCT values can thereby be reconstructed in the decoder.
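A rounded Givens rotation based on the lifting scheme can be sketched in Python as follows. The decomposition of a rotation into three lifting steps is a standard technique; the function names `int_rotate` and `int_rotate_inverse` and the sample values are assumptions made for this illustration, which is not claimed to match the patent's implementation detail for detail.

```python
import math

def int_rotate(x, y, angle):
    """Integer approximation of a Givens rotation using three rounded lifting steps."""
    c = (math.cos(angle) - 1.0) / math.sin(angle)
    s = math.sin(angle)
    x += round(c * y)   # first lifting step
    y += round(s * x)   # second lifting step
    x += round(c * y)   # third lifting step
    return x, y

def int_rotate_inverse(x, y, angle):
    """Undo int_rotate exactly by reversing the lifting steps with opposite sign."""
    c = (math.cos(angle) - 1.0) / math.sin(angle)
    s = math.sin(angle)
    x -= round(c * y)
    y -= round(s * x)
    x -= round(c * y)
    return x, y

left, right = 1200, -345                      # made-up integer spectral values
u, v = int_rotate(left, right, math.pi / 4)   # rotated integer pair
restored = int_rotate_inverse(u, v, math.pi / 4)
```

Each lifting step changes only one of the two values by a rounded function of the other, so the inverse can recompute exactly the same rounded term and subtract it. This is why the rotation stays perfectly invertible even though every step rounds.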

It should be noted that the concept according to the invention, in its preferred form with the IntMDCT as the integer transformation algorithm, can be applied to all MDCT-based perceptual audio encoders. Merely by way of example, such encoders include those according to MPEG-4 AAC Scalable, MPEG-4 AAC Low Delay, MPEG-4 BSAC, MPEG-4 Twin VQ, Dolby AC-3, etc.

It should be particularly noted that the concept according to the invention is backward compatible. The perceptual encoder or decoder is not changed, only extended. Additional information for the lossless components can be transmitted in a backward-compatible manner within the perceptually coded bitstream, for example in the "Ancillary Data" field in MPEG-2 AAC. The extension to the existing perceptual decoder, drawn with dashed lines in Fig. 7, can evaluate these additional data and, together with the quantized MDCT spectrum from the perceptual decoder, reconstruct the IntMDCT spectrum losslessly.

In the following, the IntMDCT transformation algorithm described in "Audio Coding Based on Integer Transforms", 111th AES Convention, New York, 2001, is discussed as an example of an integer transformation algorithm. The IntMDCT is particularly advantageous because it has the attractive properties of the MDCT, such as a good spectral representation of the audio signal, critical sampling, and block overlap. The good approximation of the MDCT by an IntMDCT also makes it possible to use only one transformation algorithm in the encoder shown in Fig. 5, as indicated by an arrow 62 in Fig. 5.

For a better understanding of the IntMDCT, the essential properties of this special form of an integer transformation algorithm are discussed below with reference to Figs. 1 to 4.

The encoding or decoding method according to the invention is preferably stored on a digital storage medium, such as a floppy disk, with electronically readable control signals, the control signals being able to cooperate with a programmable computer system such that the encoding and/or decoding method can be carried out. In other words, there is a computer program product with program code stored on a machine-readable carrier for performing the encoding method and/or the decoding method when the program product runs on a computer. The methods according to the invention can thus be realized as a computer program with program code for performing the methods according to the invention when the program runs on a computer.

Fig. 1 shows an overview diagram of the device preferred according to the invention for processing discrete-time samples that represent an audio signal in order to obtain integer values, on which the IntMDCT integer transformation algorithm builds. The discrete-time samples are windowed by the device shown in Fig. 1 and optionally converted into a spectral representation. The discrete-time samples fed into the device at an input 10 are windowed with a window w whose length corresponds to 2N discrete-time samples, in order to obtain integer windowed samples at an output 12. These samples are suitable for conversion into a spectral representation by means of a transformation, in particular by the device 14 for performing an integer DCT. The integer DCT is designed to generate N output values from N input values, in contrast to the MDCT function 408 of Fig. 4a, which, owing to the MDCT equation, generates only N spectral values from 2N windowed samples.

To window the discrete-time samples, two discrete-time samples are first selected in a device 16; together they represent a vector of discrete-time samples. One discrete-time sample selected by the device 16 lies in the first quarter of the window. The other discrete-time sample lies in the second quarter of the window, as explained in more detail with reference to Fig. 3. A rotation matrix of dimension 2 × 2 is then applied to the vector generated by the device 16. This operation is not carried out directly, but by means of several so-called lifting matrices.

A lifting matrix has the property that it has only one element that depends on the window w and is not equal to "1" or "0".

The factorization of wavelet transforms into lifting steps is described in the publication "Factoring Wavelet Transforms Into Lifting Steps", Ingrid Daubechies and Wim Sweldens, Preprint, Bell Laboratories, Lucent Technologies, 1996. In general, a lifting scheme is a simple relationship between perfectly reconstructing filter pairs that share the same low-pass or high-pass filter. Every pair of complementary filters can be factored into lifting steps. This applies in particular to Givens rotations. Consider the case where the polyphase matrix is a Givens rotation. The following then applies:
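The equation referred to as Equation (1) is not reproduced in this text. A form consistent with the description that follows (unit main diagonal, one zero off-diagonal element per matrix, and a middle step that adds x sin α to y) is the standard three-step lifting factorization of a Givens rotation. It is given here as a reconstruction under that assumption, not as the verbatim equation of the publication:

```latex
\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
=
\begin{pmatrix} 1 & \dfrac{\cos\alpha - 1}{\sin\alpha} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ \sin\alpha & 1 \end{pmatrix}
\begin{pmatrix} 1 & \dfrac{\cos\alpha - 1}{\sin\alpha} \\ 0 & 1 \end{pmatrix}
\qquad (1)
```

Multiplying out the right-hand side gives cos α on the main diagonal, because 1 + sin α · (cos α - 1)/sin α = cos α, and -sin α in the upper right corner, because (cos α - 1)(cos α + 1)/sin α = -sin α.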

Each of the three lifting matrices to the right of the equals sign has the value "1" as its main diagonal elements. Furthermore, in each lifting matrix one off-diagonal element is equal to 0, and one off-diagonal element depends on the rotation angle α.

The vector is now multiplied by the third lifting matrix, i.e. the rightmost lifting matrix in the above equation, in order to obtain a first result vector. This is represented in Fig. 1 by a device 18. The first result vector is then rounded with an arbitrary rounding function that maps the set of real numbers onto the set of integers, as represented in Fig. 1 by a device 20. A rounded first result vector is obtained at the output of the device 20. The rounded first result vector is then fed into a device 22 for multiplying it by the middle, i.e. second, lifting matrix in order to obtain a second result vector, which is in turn rounded in a device 24 to obtain a rounded second result vector. The rounded second result vector is then fed into a device 26 for multiplying it by the leftmost, i.e. first, lifting matrix in the above equation in order to obtain a third result vector, which is finally rounded by a device 28 to yield integer windowed samples at the output 12. If a spectral representation of these samples is desired, they must then be processed by the device 14 in order to obtain integer spectral values at a spectral output 30.
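The chain of devices 18 to 28 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `rounded_givens` is hypothetical, Python's built-in `round` stands in for the arbitrary rounding function, and the lifting coefficient (cos α - 1)/sin α is assumed from the standard three-step factorization of a rotation.

```python
import math

def rounded_givens(x, y, alpha, r=round):
    """Integer approximation of a rotation of (x, y) by alpha.

    x and y are integers. alpha must not be a multiple of pi,
    otherwise sin(alpha) is 0 and the lifting factor is undefined.
    """
    s = math.sin(alpha)
    c = (math.cos(alpha) - 1.0) / s   # off-diagonal entry of the outer matrices (assumed form)
    x = x + r(c * y)   # third (rightmost) matrix, then rounding: devices 18, 20
    y = y + r(s * x)   # second (middle) matrix, then rounding: devices 22, 24
    x = x + r(c * y)   # first (leftmost) matrix, then rounding: devices 26, 28
    return x, y
```

Because each step changes only one component and adds an integer to it, the intermediate vectors stay integer throughout, and the result stays close to the exact rotation of the input.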

The device 14 is preferably implemented as an integer DCT.

The discrete cosine transform of type IV (DCT-IV) with a length N is given by the following equation:
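The equation itself is not reproduced in this text. The usual orthonormal definition of the DCT-IV is shown below, written with the symbols used elsewhere in this description (windowed samples with a tilde as input, spectral values y as output). The choice of symbols is an assumption:

```latex
y(m) = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} \tilde{x}(k)\,
       \cos\!\left( \frac{\pi}{4N}\,(2k+1)(2m+1) \right),
\qquad m = 0, \dots, N-1
```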

The coefficients of the DCT-IV form an orthonormal N × N matrix. Every orthogonal N × N matrix can be decomposed into N(N - 1)/2 Givens rotations, as described in P. P. Vaidyanathan, "Multirate Systems And Filter Banks", Prentice Hall, Englewood Cliffs, 1993. It should be noted that other decompositions also exist.
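The orthonormality claim can be checked numerically. The sketch below builds the DCT-IV matrix from the usual definition with the scale factor sqrt(2/N) (an assumption, since the equation is not reproduced here) and tests whether its rows are orthonormal. The helper names `dct4_matrix` and `is_orthonormal` are hypothetical.

```python
import math

def dct4_matrix(n):
    """Return the n x n DCT-IV matrix as a list of rows."""
    scale = math.sqrt(2.0 / n)
    return [[scale * math.cos(math.pi / (4 * n) * (2 * k + 1) * (2 * m + 1))
             for k in range(n)]
            for m in range(n)]

def is_orthonormal(mat, tol=1e-12):
    """True if every pair of rows has dot product 1 (same row) or 0 (different rows)."""
    n = len(mat)
    for i in range(n):
        for j in range(n):
            dot = sum(mat[i][k] * mat[j][k] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True
```

With this scaling the matrix is also symmetric, so it is its own inverse.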

For the classification of the various DCT algorithms, reference is made to H. S. Malvar, "Signal Processing With Lapped Transforms", Artech House, 1992. In general, DCT algorithms differ in the type of their basis functions. The DCT-IV, which is preferred here, has non-symmetric basis functions, i.e. a cosine quarter wave, a cosine 3/4 wave, a cosine 5/4 wave, a cosine 7/4 wave, etc., whereas the discrete cosine transform of, for example, type II (DCT-II) has axis-symmetric and point-symmetric basis functions. The 0th basis function has a DC component, the first basis function is half a cosine wave, the second basis function is a full cosine wave, and so on. Because the DCT-II gives special weight to the DC component, it is used in video coding but not in audio coding, since in audio coding, unlike in video coding, the DC component is not relevant.

The following explains how the rotation angle α of the Givens rotation depends on the window function.

An MDCT with a window length of 2N can be reduced to a discrete cosine transform of type IV with a length N. This is achieved by performing the TDAC operation explicitly in the time domain and then applying the DCT-IV. With a 50% overlap, the left half of the window for a block t overlaps the right half of the preceding block, i.e. block t - 1. The overlapping part of two successive blocks t - 1 and t is preprocessed in the time domain, i.e. before the transformation, as follows, i.e. it is processed between the input 10 and the output 12 of Fig. 1:

The values marked with a tilde are the values at the output 12 of Fig. 1, while the x values without a tilde in the above equation are the values at the input 10 or downstream of the device 16 for selecting. The running index k runs from 0 to N/2 - 1, while w represents the window function.

From the TDAC condition for the window function w, the following relationship holds:
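The relationship itself is not reproduced in this text. For the index pair that appears in Equation (5), the usual power-complementarity form of the TDAC condition reads as follows, stated here as an assumption consistent with that equation:

```latex
w\!\left(\tfrac{N}{2} + k\right)^{2} + w\!\left(\tfrac{N}{2} - 1 - k\right)^{2} = 1,
\qquad k = 0, \dots, \tfrac{N}{2} - 1
```

Under this condition the two window values can be read as the cosine and the sine of a single angle, which is what allows the windowing to be written as a rotation.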

For certain angles α_k, k = 0, ..., N/2 - 1, this preprocessing in the time domain can be written as a Givens rotation, as has been explained.

The angle α of the Givens rotation depends on the window function w as follows:

α = arctan [w(N/2 - 1 - k)/w(N/2 + k)] (5)
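As an illustration of Equation (5), the sketch below computes the rotation angles for a sine window. The sine window is only one common window that satisfies the TDAC condition and is not prescribed by the text, and the function names are hypothetical.

```python
import math

def sine_window(length):
    """Sine window with `length` = 2N taps (an assumed, TDAC-compliant example)."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def rotation_angles(w):
    """Angles alpha_k of Equation (5) for a window w of length 2N, k = 0 .. N/2 - 1."""
    n = len(w) // 2
    return [math.atan(w[n // 2 - 1 - k] / w[n // 2 + k]) for k in range(n // 2)]
```

For such a window, cos α_k equals w(N/2 + k) and sin α_k equals w(N/2 - 1 - k), so the rotation by α_k applies exactly the two window weights to the selected sample pair.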

It should be noted that any window function w can be used as long as it satisfies this TDAC condition.

A cascaded encoder and decoder are described below with reference to Fig. 2. The discrete-time samples x(0) to x(2N - 1), which are "windowed" together by one window, are first selected by the device 16 of Fig. 1 such that the sample x(0) and the sample x(N - 1), i.e. one sample from the first quarter of the window and one sample from the second quarter of the window, are selected to form the vector at the output of the device 16. The crossing arrows schematically represent the lifting multiplications and subsequent roundings of the devices 18, 20 and 22, 24 and 26, 28, respectively, in order to obtain the integer windowed samples at the input of the DCT-IV blocks.

Once the first vector has been processed as described above, a second vector is selected from the samples x(N/2 - 1) and x(N/2), i.e. again one sample from the first quarter of the window and one sample from the second quarter of the window, and is in turn processed by the algorithm described in Fig. 1. All other sample pairs from the first and second quarters of the window are processed analogously. The same processing is carried out for the third and fourth quarters of the first window. There are now 2N windowed integer samples at the output 12, which are fed into a DCT-IV transformation as shown in Fig. 2. In particular, the integer windowed samples of the second and third quarters are fed into one DCT. The windowed integer samples of the first quarter of the window are processed in a preceding DCT-IV together with the windowed integer samples of the fourth quarter of the preceding window. Analogously, the fourth quarter of the windowed integer samples in Fig. 2 is fed into a DCT-IV transformation together with the first quarter of the next window. The middle integer DCT-IV transformation 32 shown in Fig. 2 now supplies N integer spectral values y(0) to y(N - 1). These integer spectral values can now, for example, simply be entropy-coded without any intermediate quantization being required, since the windowing and transformation supply integer output values.
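The pairing of samples described above (x(0) with x(N - 1), x(N/2 - 1) with x(N/2), and so on) amounts to simple index arithmetic. The sketch below lists the pairs for the first half of a window of length 2N. The function name is hypothetical, and applying the same pattern with an offset of N to the third and fourth quarters is an assumption based on the statement that the same processing is carried out there.

```python
def rotation_pairs(n, offset=0):
    """Index pairs (lower-quarter sample, upper-quarter sample) for one half of a 2N window.

    n is N (half the window length) and is assumed to be even. offset=0 covers
    the first and second quarters; offset=n is assumed for the third and fourth.
    """
    return [(offset + n // 2 - 1 - k, offset + n // 2 + k) for k in range(n // 2)]
```

For N = 8, `rotation_pairs(8)` yields the pairs (3, 4), (2, 5), (1, 6) and (0, 7), which matches the two examples named in the text.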

A decoder is shown in the right half of Fig. 2. The decoder, consisting of inverse transformation and "inverse windowing", operates inversely to the encoder. It is known that an inverse DCT-IV can be used for the inverse transformation of a DCT-IV, as shown in Fig. 2. The output values of the decoder DCT-IV 34 are now, as shown in Fig. 2, inversely processed with the corresponding values of the preceding and subsequent transformations, in order to regenerate discrete-time audio samples x(0) to x(2N - 1) from the integer windowed samples at the output of the device 34 and of the preceding and subsequent transformations.

The output-side operation is performed by an inverse Givens rotation, i.e. such that the blocks 26, 28 and 22, 24 and 18, 20, respectively, are passed through in the opposite direction. This is illustrated in more detail using the second lifting matrix of Equation 1. If (in the encoder) the second result vector is formed by multiplying the rounded first result vector by the second lifting matrix (device 22), the following expression results:

(x, y) → (x, y + x sin α) (6)

The values x, y on the right-hand side of Equation 6 are integers. This does not hold, however, for the value x sin α. Here the rounding function r must be introduced, as shown in the following equation:

(x, y) → (x, y + r(x sin α)) (7)

The device 24 carries out this operation.

The inverse mapping (in the decoder) is defined as follows:

(x', y') → (x', y' - r(x' sin α)) (8)

The minus sign in front of the rounding operation makes it clear that the integer approximation of the lifting step can be inverted without introducing an error. Applying this approximation to each of the three lifting steps leads to an integer approximation of the Givens rotation. The rounded rotation (in the encoder) can be inverted (in the decoder) without introducing an error by passing through the inverse rounded lifting steps in reverse order, i.e. by executing the algorithm of Fig. 1 from bottom to top when decoding.
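The exact invertibility can be sketched as follows. The function names are hypothetical and the lifting coefficient (cos α - 1)/sin α is assumed from the standard three-step factorization. The decoder subtracts the same rounded terms in reverse order, so the reconstruction is exact for any rounding function, including one that is not point-symmetric such as `math.floor`.

```python
import math

def forward_rotation(x, y, alpha, r=round):
    """Encoder side: three rounded lifting steps (sin(alpha) must be nonzero)."""
    s = math.sin(alpha)
    c = (math.cos(alpha) - 1.0) / s
    x += r(c * y)
    y += r(s * x)
    x += r(c * y)
    return x, y

def inverse_rotation(x, y, alpha, r=round):
    """Decoder side: undo the three steps in reverse order, as in Equation (8)."""
    s = math.sin(alpha)
    c = (math.cos(alpha) - 1.0) / s
    x -= r(c * y)   # undoes the last encoder step; y is unchanged by that step
    y -= r(s * x)   # undoes the middle step; x is now the encoder's intermediate value
    x -= r(c * y)   # undoes the first step
    return x, y
```

Each inverse step sees exactly the value that the corresponding forward step used as its argument, so the rounded term it subtracts is identical to the one that was added.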

If the rounding function r is point-symmetric, the inverse rounded rotation is identical to the rounded rotation with the angle -α and reads as follows:
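The equation is not reproduced in this text. Assuming Equation (1) has the standard three-step lifting form, substituting -α for α gives the following factorization, shown here as a reconstruction under that assumption:

```latex
\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}
=
\begin{pmatrix} 1 & \dfrac{1 - \cos\alpha}{\sin\alpha} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -\sin\alpha & 1 \end{pmatrix}
\begin{pmatrix} 1 & \dfrac{1 - \cos\alpha}{\sin\alpha} \\ 0 & 1 \end{pmatrix}
```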

In this case, the lifting matrices for the decoder, i.e. for the inverse Givens rotation, result directly from Equation (1) by merely replacing the expression "sin α" with the expression "-sin α".
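The role of point symmetry can be illustrated with a short sketch. The function names are hypothetical and the lifting coefficient is assumed from the standard factorization. The decoder reuses the encoder routine with sin α replaced by -sin α. This reproduces the input exactly with Python's `round`, which satisfies round(-t) = -round(t), but in general not with `math.floor`, which is not point-symmetric.

```python
import math

def lifting_rotation(x, y, cos_a, sin_a, r=round):
    """Three rounded lifting steps built from cos(alpha) and sin(alpha)."""
    c = (cos_a - 1.0) / sin_a
    x += r(c * y)
    y += r(sin_a * x)
    x += r(c * y)
    return x, y

def encode(x, y, alpha, r=round):
    return lifting_rotation(x, y, math.cos(alpha), math.sin(alpha), r)

def decode(x, y, alpha, r=round):
    # Same three steps, with sin(alpha) replaced by -sin(alpha).
    return lifting_rotation(x, y, math.cos(alpha), -math.sin(alpha), r)
```

With a rounding function that is not point-symmetric, the decoder has to subtract the rounded terms explicitly in reverse order instead of reusing the forward routine.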

The decomposition of a conventional MDCT with overlapping windows 40 to 46 is set out once more below with reference to Fig. 3. The windows 40 to 46 each overlap by 50%. For each window, Givens rotations are first carried out within the first and second quarters of a window and within the third and fourth quarters of a window, as shown schematically by the arrows 48. The rotated values, i.e. the windowed integer samples, are then fed into an N-to-N DCT such that the second and third quarters of a window, or the fourth quarter of a window and the first quarter of the subsequent window, are always converted together into a spectral representation by means of a DCT-IV algorithm.

The usual Givens rotations are therefore decomposed into lifting matrices that are executed sequentially, with a rounding step introduced after each lifting matrix multiplication. The floating-point numbers are thus rounded immediately after they arise, so that before each multiplication of a result vector by a lifting matrix the result vector contains only integers.

The output values thus always remain integers, and it is preferred to use integer input values as well. This is not a restriction, since PCM samples, for example as stored on a CD, are integer values whose range varies with the bit width, i.e. depending on whether the discrete-time digital input values are 16-bit or 24-bit values. Nevertheless, as has been explained, the entire process is invertible by performing the inverse rotations in reverse order. There thus exists an integer approximation of the MDCT with perfect reconstruction, i.e. a lossless transformation.

The transformation shown supplies integer output values instead of floating-point values. It provides perfect reconstruction, so that no error is introduced when a forward and then a backward transformation are carried out. According to a preferred embodiment of the present invention, the transformation is a replacement for the modified discrete cosine transform. Other transformation methods can, however, also be carried out in integer form, as long as a decomposition into rotations and a decomposition of the rotations into lifting steps is possible.

The integer MDCT has most of the favorable properties of the MDCT. It has an overlapping structure, which yields better frequency selectivity than non-overlapping block transforms. Owing to the TDAC function, which is already taken into account during windowing before the transformation, critical sampling is maintained, so that the total number of spectral values representing an audio signal equals the total number of input samples.

Compared with a normal MDCT, which supplies floating-point sample values, the described preferred integer transformation shows increased noise relative to the normal MDCT only in the spectral range where the signal level is low, while this noise increase is not noticeable at significant signal levels. In return, the integer processing lends itself to an efficient hardware implementation, since only multiplication steps are used, which can readily be decomposed into shift/add steps that can be implemented simply and quickly in hardware. A software implementation is, of course, also possible.
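The decomposition of a multiplication into shift/add steps can be modelled in software as follows. This is only a sketch: the fixed-point format with 15 fractional bits and the function name are assumptions, and the text does not specify how a hardware implementation would represent the lifting coefficients.

```python
def shift_add_multiply(x, coeff_fixed, frac_bits=15):
    """Return x * coeff_fixed / 2**frac_bits, rounded, using shifts and adds only.

    coeff_fixed is the coefficient scaled by 2**frac_bits and rounded to an
    integer (assumed format); frac_bits must be at least 1.
    """
    magnitude = abs(coeff_fixed)
    acc = 0
    bit = 0
    while magnitude >> bit:
        if (magnitude >> bit) & 1:
            acc += x << bit          # add x * 2**bit for every set coefficient bit
        bit += 1
    if coeff_fixed < 0:
        acc = -acc
    # Add half an LSB, then shift right: rounds to nearest, ties toward +infinity.
    return (acc + (1 << (frac_bits - 1))) >> frac_bits
```

The rounding used here is not point-symmetric, so a decoder built on it would subtract the rounded terms in reverse order rather than rely on the rotation with the angle -α.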

The integer transformation supplies a good spectral representation of the audio signal and yet remains in the domain of integers. When applied to tonal parts of an audio signal, it results in good energy concentration. An efficient lossless coding scheme can thus be built by simply cascading the windowing/transformation shown in Fig. 1 with an entropy encoder. In particular, stacked coding using escape values, as employed in MPEG AAC, is advantageous. It is preferred to scale all values down by a certain power of two until they fit into a desired code table, and then to additionally code the omitted least significant bits. Compared with the alternative of using larger code tables, the described alternative is more favorable in terms of the memory required to store the code tables. A nearly lossless encoder could also be obtained by simply omitting certain of the least significant bits.
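The power-of-two scaling with separately coded low-order bits can be sketched as follows. The function names and the table limit `max_abs` are hypothetical, and the sketch models only the splitting and rejoining of values, not the entropy coder or the MPEG AAC escape mechanism.

```python
def min_shift(values, max_abs):
    """Smallest shift such that every scaled value lies in [-max_abs, max_abs].

    max_abs must be at least 1, since negative values shift toward -1.
    """
    shift = 0
    while any(abs(v >> shift) > max_abs for v in values):
        shift += 1
    return shift

def split_value(v, shift):
    """Split integer v into the scaled-down part and the omitted low-order bits."""
    high = v >> shift               # arithmetic shift: floor division by 2**shift
    low = v - (high << shift)       # always in the range 0 .. 2**shift - 1
    return high, low

def join_value(high, low, shift):
    """Rebuild the original integer from both parts."""
    return (high << shift) + low
```

Coding `high` with the small code table and `low` as raw bits is lossless. Dropping some or all of the `low` bits gives the nearly lossless variant mentioned above.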

Particularly for tonal signals, entropy coding of the integer spectral values yields a high coding gain. For transient parts of the signal the coding gain is low, owing to the flat spectrum of transient signals, i.e. owing to the small number of spectral values that are equal or close to 0. As described in J. Herre, J. D. Johnston: "Enhancing the Performance of Perceptual Audio Coders by Using Temporal Noise Shaping (TNS)", 101st AES Convention, Los Angeles, 1996, Preprint 4384, this flatness can nevertheless be exploited by applying linear prediction in the frequency domain. One alternative is open-loop prediction; another is a closed-loop predictor. The first alternative, the open-loop predictor, is called TNS. Quantization after the prediction adapts the resulting quantization noise to the temporal structure of the audio signal and therefore prevents pre-echoes in psychoacoustic audio coders. For lossless audio coding the second alternative, the closed-loop predictor, is more suitable, since closed-loop prediction permits exact reconstruction of the input signal. When this technique is applied to a generated spectrum, a rounding step must be performed after each step of the prediction filter in order to remain in the integer domain. By using the inverse filter and the same rounding function, the original spectrum can be restored exactly.
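A minimal sketch of such a rounded prediction over frequency is given below. It assumes a simple FIR predictor with fixed, hypothetical coefficients `coeffs` and a round-half-up rounding function; neither the coefficient values nor the function names come from the original text. The point it illustrates is that the inverse filter, using the same rounding function, restores the integer spectrum exactly.

```python
import math


def _rounded_prediction(history, coeffs):
    """Predict the next spectral value from previous ones and round to an integer."""
    acc = sum(c * h for c, h in zip(coeffs, history))
    return math.floor(acc + 0.5)  # assumed rounding function (round half up)


def _history(spectrum, k, order):
    """Previous `order` spectral values before index k (zeros before the start)."""
    return [spectrum[k - i] if k - i >= 0 else 0 for i in range(1, order + 1)]


def prediction_residual(spectrum, coeffs):
    """Forward filter: integer residual = value minus rounded prediction."""
    return [x - _rounded_prediction(_history(spectrum, k, len(coeffs)), coeffs)
            for k, x in enumerate(spectrum)]


def reconstruct_spectrum(residual, coeffs):
    """Inverse filter: same predictor and same rounding, applied recursively."""
    spectrum = []
    for k, e in enumerate(residual):
        pred = _rounded_prediction(_history(spectrum, k, len(coeffs)), coeffs)
        spectrum.append(e + pred)
    return spectrum
```

Because the decoder rebuilds each spectral value before it is needed for the next prediction, both sides compute identical rounded predictions, so no error accumulates.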

To exploit the redundancy between two channels for data reduction, mid/side coding can also be used losslessly if a rounded rotation by an angle of π/4 is employed. Compared with the alternative of computing the sum and difference of the left and right channels of a stereo signal, the rounded rotation has the advantage of preserving energy. The use of such joint stereo coding techniques can be switched on or off for each band, as is also done in the MPEG AAC standard. Further rotation angles can also be considered in order to reduce the redundancy between two channels more flexibly.
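One way to realize a rounded rotation is to factor the rotation matrix into three lifting steps and round after each step, as sketched below. The decomposition used here is the standard lifting factorization of a plane rotation; the function names and the round-half-up rounding are assumptions made for this illustration, not details taken from the text.

```python
import math


def _r(v):
    """Assumed rounding function: round half up to an integer."""
    return math.floor(v + 0.5)


def rounded_rotation_ms(left, right, angle=math.pi / 4):
    """Integer mid/side via three rounded lifting steps; returns (mid, side)."""
    s = math.sin(angle)
    t = (math.cos(angle) - 1.0) / s
    x1 = left + _r(t * right)
    y1 = right + _r(s * x1)   # approximately (left + right) / sqrt(2) for pi/4
    x2 = x1 + _r(t * y1)      # approximately (left - right) / sqrt(2) for pi/4
    return y1, x2


def inverse_rounded_rotation_ms(mid, side, angle=math.pi / 4):
    """Undo the lifting steps in reverse order with the same rounding."""
    s = math.sin(angle)
    t = (math.cos(angle) - 1.0) / s
    x1 = side - _r(t * mid)
    right = mid - _r(s * x1)
    left = x1 - _r(t * right)
    return left, right
```

Each lifting step adds a rounded function of one component to the other, so it can be undone exactly by subtracting the same rounded quantity. Passing a different `angle` corresponds to the further rotation angles mentioned above.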

Claims

1. Device for encoding a discrete-time audio signal in order to obtain encoded audio data, having the following features:
means (52) for providing a quantization block of spectral values of the discrete-time audio signal quantized using a psychoacoustic model (54);
means (58) for inversely quantizing the quantization block and rounding the inverse quantized spectral values to obtain a rounding block of rounded inverse quantized spectral values;
means (56) for generating an integer block of integer spectral values using an integer transformation algorithm configured to generate the integer block of spectral values from a block of integer time discrete samples;
combining means (58) for forming a difference block which depends on a spectral value-wise difference between the rounding block and the integer block to obtain a difference block with difference spectral values; and
means (60) for processing the quantization block and the rounding block to produce encoded audio data comprising information about the quantization block and information about the difference block.

2. Device according to claim 1, in which the means (52) for providing is designed
to generate an MDCT block of MDCT spectral values from a time block of temporal audio signal values by means of an MDCT, and
to quantize the MDCT block using a psychoacoustic model to generate the quantization block that has quantized MDCT spectral values.

3. The apparatus of claim 2, wherein the means (56) for generating the integer block is designed to perform an IntMDCT on the time block to generate the integer block which has IntMDCT spectral values.

4. Device according to one of the preceding claims, in which the means (52) for providing is designed to calculate the quantization block using a floating-point transformation algorithm.

5. Device according to one of claims 1 to 3, wherein the means (52) for providing is designed to calculate the quantization block using the integer block generated by the means (56) for generating.

6. Device according to one of the preceding claims,
in which the means (60) for processing is designed to subject the quantization block to entropy coding (60a) in order to obtain an entropy-coded quantization block,
to subject the rounding block to entropy coding (60b) in order to obtain an entropy-coded rounding block, and
to convert the entropy-coded quantization block into a first scaling layer of a scaled data stream that represents the encoded audio data, and to convert the entropy-coded rounding block into a second scaling layer of the scaled data stream.

7. The device according to claim 6,
wherein the means (60) for processing is further configured to use one of a plurality of code tables depending on the quantized spectral values for the entropy coding of the quantization block, and
wherein the means (60) for processing is further configured to select one of a plurality of code tables for the entropy coding of the difference block depending on a property of a quantizer, which can be used in a quantization to generate the quantization block.

8. Device according to one of the preceding claims,
in which the means (52) for providing is designed to use one of a plurality of windows, depending on a property of the audio signal, for windowing a time block of audio signal values, and
in which the means (56) for generating is designed to make the same window selection for the integer transformation algorithm.

9. Device according to one of claims 1 to 8,
in which the means for generating is designed to use an integer transformation algorithm comprising the following steps:
Windowing the time-discrete samples with a window (w) having a length corresponding to 2N time-discrete samples, in order to provide windowed time-discrete samples for converting the time-discrete samples into a spectral representation by means of a transform that can generate N output values from N input values, the windowing comprising the following sub-steps:
Selecting (16) a discrete-time sample from one quarter of the window and a discrete-time sample from another quarter of the window to obtain a vector of discrete-time samples;
Applying to the vector a square rotation matrix whose dimension corresponds to the dimension of the vector, the rotation matrix being representable by a plurality of lifting matrices, a lifting matrix having only one element that depends on the window (w) and is not equal to 1 or 0, the sub-step of applying comprising the following sub-steps:
Multiplying (18) the vector by a lifting matrix to obtain a first result vector;
Rounding (20) a component of the first result vector with a rounding function (r) that maps a real number to an integer to obtain a rounded first result vector; and
sequentially performing the multiplying (22) and rounding (24) steps with a further lifting matrix until all lifting matrices have been processed, to obtain a rotated vector comprising an integer windowed sample from the quarter of the window and an integer windowed sample from the other quarter of the window, and
Performing the windowing step for all discrete-time samples of the remaining quarters of the window to obtain 2N filtered integer samples; and
Converting (14) N windowed integer samples into a spectral representation by an integer DCT for values with the filtered integer samples of the second quarter and the third quarter of the window to obtain N integer spectral values.

10. Device according to one of the preceding claims,
in which the means (52) for providing the quantization block is designed to carry out a prediction of spectral values over frequency, using a prediction filter, before a quantization step (52b), in order to obtain prediction residual spectral values which, after quantization, form the quantization block;
in which a prediction means is further provided, which is designed to carry out a prediction over frequency of the integer spectral values of the integer block, and in which a rounding means is further provided in order to round prediction residual spectral values, based on the integer spectral values, that represent the rounding block.

11. Device according to any one of the preceding claims,
in which the time-discrete audio signal has at least two channels,
in which the means (52) for providing is designed to perform mid/side coding with spectral values of the discrete-time audio signal in order to obtain the quantization block after quantization of mid/side spectral values, and
in which the means (56) for generating the integer block is designed to also carry out a mid/side coding which corresponds to the mid/side coding of the means (52) for providing.

12. Device according to one of the preceding claims, in which the means (60) for processing is designed to generate an MPEG-2 AAC data stream, additional information for the integer transformation algorithm being introduced in an Ancillary Data field.

13. A method for encoding a discrete-time audio signal to obtain encoded audio data, comprising the following steps:
Providing (52) a quantization block of spectral values of the time-discrete audio signal quantized using a psychoacoustic model (54);
inverse quantizing (58) the quantization block and rounding the inverse quantized spectral values to obtain a rounding block of rounded inverse quantized spectral values;
Generating (56) an integer block of integer spectral values using an integer transformation algorithm which is designed to generate the integer block of spectral values from a block of integer discrete-time samples;
Forming (58) a difference block dependent on a spectral value-wise difference between the rounding block and the integer block to obtain a difference block with difference spectral values; and
Processing (60) the quantization block and the rounding block to produce encoded audio data including information about the quantization block and information about the difference block.

14. Device for decoding coded audio data which have been generated from a time-discrete audio signal by providing (52) a quantization block of spectral values of the time-discrete audio signal quantized using a psychoacoustic model (54), by inversely quantizing (58) the quantization block and rounding the inverse quantized spectral values to obtain a rounding block of rounded inverse quantized spectral values, by generating (56) an integer block of integer spectral values using an integer transformation algorithm which is designed to generate the integer block of spectral values from a block of integer discrete-time samples, and by forming (58) a difference block which depends on a spectral value-wise difference between the rounding block and the integer block to obtain a difference block with difference spectral values, having the following features:
means (70) for processing the encoded audio data to obtain a quantization block and a difference block;
means (74) for inversely quantizing and rounding the quantization block to obtain an integer inverse quantized block;
means (78) for spectrally combining the integer quantization block and the difference block to obtain a combination block; and
means (82) for generating a temporal representation of the discrete-time audio signal using the combination block and using an integer transformation algorithm inverse to the integer transformation algorithm.

15. A method for decoding coded audio data which has been generated from a discrete-time audio signal by supplying, inverse quantizing, generating, forming and processing, comprising the following steps:
Processing (70) the encoded audio data to obtain a quantization block and a difference block;
inverse quantizing (74) the quantization block and rounding to obtain an integer inverse quantized block;
spectrally combining (78) the integer quantization block and the difference block to obtain a combination block; and
Generating (82) a temporal representation of the discrete-time audio signal using the combination block and using an integer transformation algorithm inverse to the integer transformation algorithm.

16. Computer program having program code for carrying out the method for encoding according to claim 13 when the program runs on a computer.

17. Computer program having program code for carrying out the method for decoding according to claim 15 when the program runs on a computer.
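As a rough illustration of how the encoding method of claim 13 and the decoding method of claim 15 fit together, the following sketch forms the difference block in the spectral domain and recombines it at the decoder. It rests on several assumptions that are not taken from the claims: a uniform quantizer with a single hypothetical step size `step` stands in for the psychoacoustically controlled quantizer, the floating-point and integer spectra are taken as given inputs, and the integer transform and its inverse are not implemented.

```python
def encode_block(float_spectrum, int_spectrum, step):
    """Return (quantization block, difference block) for one block of spectral values.

    `step` is a hypothetical uniform quantizer step size used only in this sketch.
    """
    # First layer: quantized spectral values (lossy).
    quant_block = [round(v / step) for v in float_spectrum]
    # Inverse quantize and round to integers (the rounding block).
    rounding_block = [round(q * step) for q in quant_block]
    # Second layer: spectral value-wise difference to the integer spectrum.
    diff_block = [i - r for i, r in zip(int_spectrum, rounding_block)]
    return quant_block, diff_block


def decode_block(quant_block, diff_block, step):
    """Recombine both layers into the integer spectrum (the combination block)."""
    rounding_block = [round(q * step) for q in quant_block]
    return [r + d for r, d in zip(rounding_block, diff_block)]
```

A decoder that receives only `quant_block` can still produce the lossy signal, whereas one that also receives `diff_block` recovers the integer spectral values exactly and could then apply the inverse integer transform to obtain the time samples.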