EP1525576B1 - Arrangement and method for the generation of a complex spectral representation of a time-discrete signal

Info

Publication number
EP1525576B1
EP1525576B1 EP03766165A EP03766165A EP1525576B1 EP 1525576 B1 EP1525576 B1 EP 1525576B1 EP 03766165 A EP03766165 A EP 03766165A EP 03766165 A EP03766165 A EP 03766165A EP 1525576 B1 EP1525576 B1 EP 1525576B1
Authority
EP
European Patent Office
Prior art keywords
spectral
real
block
coefficient
coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP03766165A
Other languages
German (de)
French (fr)
Other versions
EP1525576A1 (en
Inventor
Bernd Edler
Stefan Geyersberger
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Original Assignee
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Publication of EP1525576A1
Application granted
Publication of EP1525576B1
Anticipated expiration
Status: Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Definitions

  • the present invention relates to time-frequency conversion algorithms, and more particularly to such algorithms in conjunction with audio compression concepts.
  • a complex spectral coefficient can be represented by a first and a second partial spectral coefficient, the first partial spectral coefficient being the real part and the second partial spectral coefficient being the imaginary part, as desired.
  • the complex spectral coefficient may also be represented by the magnitude as the first partial spectral coefficient and the phase as the second partial spectral coefficient.
  • the input signal is first divided into blocks of a predetermined length by multiplication with temporally staggered window functions. Each of these blocks is then converted to a spectral representation by applying the DFT. If the blocks each contain L samples, i.e. if the window length is L, the output of the DFT can again be completely described by a total of L values (real and imaginary parts, or magnitude and phase values). For example, if the input signal is real, L/2 complex values result. With suitable window functions, the input signal can be reconstructed from this representation by means of an inverse DFT.
  • DFT discrete Fourier transformation
  • the use of non-overlapping window functions strongly limits the achievable quality of the spectral decomposition, in particular the separation of different frequency bands.
  • modulated filter banks which are characterized by the possibility of an efficient implementation.
  • MDCT modified discrete cosine transformation
  • the window length L may assume values between N and 2N - 1 due to different degrees of overlap.
  • Fig. 6 shows the decomposition of a time-discrete input signal x(n) into the spectral components u_{k,m}, where m represents the temporal block index, i.e. the time index after the sampling rate reduction, while k is the frequency index or subband index.
  • the sampling frequencies are the same in all subbands, i.e. the original sampling frequency is reduced by the factor N.
  • the filter bank with filters 60 and downstream downsampling elements 62 provides a uniform band division.
  • the above transformation rule may also differ from the above equation, for example when the sine function is used instead of the cosine function, or when "+N/2" is used instead of "-N/2". Use with the alternating MDCT/MDST mentioned later (using k instead of k + 1/2) is also conceivable.
  • h_P(n) represents the prototype impulse response.
  • h_k(n) is the filter impulse response for the filter associated with subband k.
  • n is the counting index of the time-discrete input signal x(n), while N indicates the number of spectral coefficients.
  • the output values of a real-valued transformation such as the MDCT, which is known not to be energy-preserving, are only of limited use for applications that require complex-valued spectral components. If, for example, the magnitudes of the real output values are used as an approximation for the magnitudes of complex-valued spectral components in the corresponding frequency ranges, strong fluctuations result even for sinusoidal input signals of constant amplitude. Such a procedure thus provides only poor approximations of short-term magnitude spectra of the input signal.
  • "A Scalable and Progressive Audio Codec", Vinton and Atlas, IEEE ICASSP 2001, 7-11 May 2001, Salt Lake City
  • the input signal is windowed by a Kaiser-Bessel window function to produce temporally successive blocks of samples.
  • the blocks of input values are then transformed either by a Modified Discrete Cosine Transform (MDCT) or by a Modified Discrete Sine Transform (MDST), depending on a shift index.
  • MDCT modified Discrete Cosine Transform
  • MDST Modified Discrete Sine Transform
  • two temporally adjacent blocks of spectral coefficients are combined into a single complex transformation such that the MDCT block represents the real parts of complex spectral coefficients, while the temporally consecutive MDST block represents the associated imaginary parts of the complex spectral coefficients.
  • the object of the present invention is to provide an improved concept for generating a complex spectral representation of a time-discrete signal.
  • this object is achieved by a device for generating a complex spectral representation according to claim 1, a method for generating a complex spectral representation according to claim 18, a device for encoding a discrete-time signal according to claim 19, a method for encoding a discrete-time signal according to claim 20, an apparatus for generating a real spectral representation according to claim 21, a method for generating a real spectral representation according to claim 22, or a computer program according to claim 23.
  • the present invention is based on the finding that a good approximation of a complex spectral representation of a time-discrete signal can be determined from a block-wise real-valued spectral representation of the time-discrete signal by calculating a first partial spectral coefficient and/or a second partial spectral coefficient through a combination of at least two real spectral coefficients.
  • for example, the real part and/or the imaginary part of an approximated complex spectral coefficient for a particular frequency index is obtained by combining two or more real spectral coefficients, preferably ones in temporal and/or frequency proximity to the complex spectral coefficient to be calculated.
  • the combination is a linear combination, wherein the real spectral coefficients to be combined can additionally be weighted with constant weighting factors before the linear combination, i.e. an addition or subtraction.
  • a linear combination is an addition or subtraction of different linear combination partners, which may or may not be weighted with weighting factors before the linear combination.
  • the weighting factors can be positive or negative real numbers including zero.
  • the two or more real spectral coefficients that are combined to obtain a complex partial spectral coefficient for a frequency index and a (temporal) block index are arranged in frequency and/or temporal proximity.
  • the real spectral coefficients are taken from the current (temporal) block at a frequency index higher or lower by 1.
  • the corresponding real spectral coefficients are located in temporal proximity, i.e. in the immediately preceding time block or the immediately following time block, with the same frequency index.
  • the real spectral coefficients are taken from the immediately preceding or immediately following temporal block, at a frequency index higher or lower by one than the frequency index of the partial spectral coefficient being calculated.
  • the combination rule for calculating a partial spectral coefficient varies depending on whether the frequency index is even or odd.
  • the frequency response - which usually has a bandpass character - should have a desired course for positive frequencies, and should be as small as possible or equal to 0 for negative frequencies.
  • Such a frequency response results from the inventive concept and is considered advantageous for many applications.
  • in preferred embodiments, the properties of this frequency response can be manipulated, e.g. by appropriately setting the weighting factors or by appropriately modifying the window functions of the first transformation that generates the real-valued spectral coefficients.
  • the system thus provides many degrees of freedom for adaptation to particular needs, in particular the possibility of combining not only two but more than two real spectral coefficients in order to achieve an even better approximation of a desired frequency response of the overall arrangement.
  • Fig. 1 shows an apparatus for generating a complex spectral representation of a time-discrete signal x (n).
  • the discrete-time signal x(n) is fed to a means 10 for generating a block-wise real-valued spectral representation of the discrete-time signal, the spectral representation having temporally consecutive blocks, each block having a set of spectral coefficients, as will be explained in more detail with reference to Figs. 2a to 2b.
  • at the output of the device 10 there is thus a sequence of consecutive blocks of spectral coefficients, which are real-valued spectral coefficients due to the property of the device 10.
  • this sequence of temporally successive blocks of spectral coefficients is fed to a post-processing means 12 to obtain a block-wise complex approximated spectral representation comprising successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, and wherein at least the first or the second partial spectral coefficient is determined by a combination of at least two real spectral coefficients.
  • Figs. 2a to 2c together show a sequence of blocks of real-valued spectral coefficients, as generated by the means 10 of Fig. 1.
  • m represents a block index while k represents a frequency index.
  • Fig. 2a shows a block of real-valued spectral coefficients plotted along the frequency axis at the time point or block index (m-1).
  • the block of spectral coefficients comprises spectral coefficients u_{i,m-1}, where i is a running index, while m-1 is the block index.
  • Fig. 2b shows the same situation, but now for the temporally following block m.
  • Fig. 2c shows the same situation again, but now for the block index (m+1). Figs. 2a, 2b and 2c thus form a sequence over time, which is symbolized by an arrow 20 in Figs. 2a to 2c.
  • Fig. 3 shows an alternative representation of the device for generating a complex spectral representation, wherein the discrete-time input signal x(n) is fed into the device 10 for generating a block-wise real spectral representation, which is denoted T_1 in Fig. 3. It should be noted that this is a first transformation of the time signal, which has been windowed into blocks, into a spectral representation at the output of the device 10.
  • Fig. 3 shows a snapshot at the time or block index m, and thus refers to Fig. 2b described above.
  • the output values of the device 10, i.e. the real-valued spectral coefficients, which may be MDCT coefficients, for example, are fed to the device 12 for post-processing in order to obtain on the output side a complex spectrum which has, for each frequency index k, a first partial spectral coefficient p_{k,m} and a second partial spectral coefficient q_{k,m}, where p_{k,m} is the real part and q_{k,m} is the imaginary part of the complex spectral coefficient for the frequency index k, and m denotes the block index.
  • in Fig. 3, the first transformation is designated T_1 and 10, respectively. From its output values, for example, a real part p and an imaginary part q are formed for a specific frequency index and a specific (temporal) block index. Alternatively, of course, magnitude and phase could be generated.
  • special phase relationships of the modulation functions can be exploited, which are the basis of a modulated filter bank.
  • the operation T_2 or 12, which follows the first transformation, is again an invertible, critically sampled transformation. This results in an overall system that also has the property of critical sampling and at the same time allows a reconstruction of the spectral components obtained.
  • T_2 is now a two-dimensional transformation, since in the preferred embodiment of the present invention both temporally adjacent and frequency-adjacent real-valued spectral coefficients are combined, i.e. its input values extend along the time and frequency axes, as shown in Figs. 2a to 2c. Since a real part and an imaginary part are formed by each transformation operation of the device 12, a value pair has to be calculated only for every second sampling position of the time/frequency plane in order to achieve critical sampling. In a preferred embodiment of the present invention, this is achieved by sampling rate reduction along the time axis, i.e. computation only for every second block of the first transformation T_1. Alternatively, this is achieved by sampling rate reduction along the frequency axis, i.e. computation only for every second subband i of the first transformation. As a further alternative, this is achieved in an offset manner, i.e. in the form of a checkerboard pattern in which every second block and every second band are used alternately.
  • the transformation coefficients of the second transformation, i.e. the weighting factors with which the output values of T_1 are each weighted before their summation, preferably fulfill the conditions for exact reconstruction according to the respective sampling scheme.
  • the system according to the invention contains a number of degrees of freedom, which can be used for optimizing the properties of the overall system, ie for optimizing the frequency response of the entire system as a complex filter bank.
  • Fig. 4 shows a first embodiment of the present invention for the detailed specification of the post-processing device 12. It is preferable to distinguish between an even frequency index k and an odd frequency index k+1.
  • for an even frequency index, that is, when p_{k,m} and q_{k,m} are to be calculated (m is the block index and k is the frequency index)
  • the real part p_{k,m} is formed by summing two temporally successive real-valued spectral coefficients.
  • p_{k,m} thus results from the summation of the spectral coefficients with the index k either from Figs. 2b and 2a or from Figs. 2c and 2b.
  • the associated imaginary part q_{k,m} is obtained according to the invention by summing two successive values with the frequency index k-1, again either from Figs. 2a, 2b (block m-1 and block m) or from Figs. 2b and 2c (block m and block m+1).
  • the real part p_{k+1,m} is calculated as the difference between two consecutive values, i.e. as the difference between the spectral coefficients with the index k+1 of Figs. 2a, 2b or 2b, 2c.
  • the associated imaginary part q_{k+1,m} results as the difference between two successive values with the frequency index k, i.e. as the difference between the real-valued spectral coefficients with the index k of Figs. 2a, 2b or 2b, 2c.
  • the transformation function shown in Fig. 4 is generally designated by the reference numeral 12a, wherein the transformation function has two transformation sub-rules h_L(m) and h_H(m), which, as shown in Fig. 4, are applied in pairs alternately to the output values of the device 10.
  • the first subfunction h_L(m) has the form {1, 1}
  • the second subfunction h_H(m) has the form {1, -1}.
  • the notation of the subfunctions h_L(m) and h_H(m) is intended to mean that a sum or difference of the corresponding spectral coefficients is to be formed from two (temporally) adjacent blocks.
  • the critical sampling is achieved by a temporal sampling rate reduction by a factor of 2, as shown symbolically by the device labeled 12b in Fig. 4. If orthogonality of the second transformation (12a, 12b) is desired, all the output values p, q can be normalized by multiplication by the factor 1/√2.
  • the second transformation (12a, 12b) connected downstream of the first transformation, which is, for example, an MDCT, in each case spans the two adjacent bands from which the real part p_{k,m} and the imaginary part q_{k,m} are formed for a frequency index k.
  • temporally successive real-valued spectral coefficients are taken into account in the combination, i.e. the summation or subtraction.
  • since the downstream transformation 12a, 12b does not include any degrees of freedom for optimizing the overall system in the form of adjustable weighting factors in the functions h_L and h_H, it is preferable to optimize the overall system by manipulating the window function of the first transformation, for example the MDCT, i.e. by changing it in comparison to a given known window function.
  • T_2^{-1} denotes the transformation rule inverse to the transformation rule T_2. If equations (1) to (4) are considered, it turns out that the real spectral coefficients u_{k,m-1} and u_{k,m} can be calculated from the real part p_{k,m} and the imaginary part q_{k+1,m} by solving the two equations (1) and (4) for the two unknowns u_{k,m-1} and u_{k,m}.
  • knowing the sequence of blocks of complex approximated spectral coefficients, it is possible to calculate back to the sequence of real spectral coefficients by applying the inverse combination rule T_2^{-1}.
  • the values of the coefficients a, b, and c can be used to optimize the overall system, again in order to achieve a desired frequency response of the overall arrangement, for example, as has been stated, a bandpass characteristic for positive frequencies and the greatest possible attenuation for negative frequencies.
  • weighting factors a, b, c are used to weight all the real spectral coefficients adjacent to the real spectral coefficient u_{k,m} in the time-frequency plane more or less strongly, as shown in equation (6).
  • the methods according to the invention can be implemented in hardware or in software.
  • the implementation may be on a digital storage medium, in particular a floppy disk or CD with electronically readable control signals, which interact with a programmable computer system such that the corresponding method is executed.
  • the invention thus also consists in a computer program product with program code stored on a machine-readable carrier for carrying out one or more of the inventive methods when the computer program product runs on a computer.
  • the invention is also a computer program having a program code for performing one or more of the methods when the computer program runs on a computer.
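
The sum/difference post-processing and its inverse described in the definitions above (sub-functions h_L = {1, 1} and h_H = {1, -1}, optional normalization by 1/√2) can be sketched in Python as follows. This is only an illustrative sketch, not the patent's reference implementation: the function names `combine` and `invert` are hypothetical, and the sign order of the difference (current block minus previous block) is an assumption, since the text above does not state it.

```python
import math

def combine(u_prev, u_curr, normalize=True):
    """Combine one real coefficient taken from two temporally adjacent blocks.

    u_prev stands for u_{k,m-1} and u_curr for u_{k,m}.
    Returns (p, q): the sum (h_L = {1, 1}) and the difference (h_H = {1, -1}).
    """
    scale = 1.0 / math.sqrt(2.0) if normalize else 1.0
    p = scale * (u_curr + u_prev)  # sum, used as p_{k,m}
    q = scale * (u_curr - u_prev)  # difference, used as q_{k+1,m}; sign order assumed
    return p, q

def invert(p, q, normalize=True):
    """Solve the two equations for the two unknowns u_{k,m-1} and u_{k,m}."""
    scale = 1.0 / math.sqrt(2.0) if normalize else 0.5
    u_curr = scale * (p + q)
    u_prev = scale * (p - q)
    return u_prev, u_curr

u_prev, u_curr = 0.3, -1.2
p, q = combine(u_prev, u_curr)
restored = invert(p, q)
```

With the 1/√2 normalization the pair (p, q) carries the same energy as the pair (u_prev, u_curr), which is what orthogonality of the second transformation means for this two-point case.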

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Spectrometry And Color Measurement (AREA)
  • Image Processing (AREA)
  • Ultra Sonic Diagnosis Equipment (AREA)

Abstract

A filter bank device for generating a complex spectral representation of a discrete-time signal includes a generator for generating a block-wise real spectral representation, which, for example, implements an MDCT, to obtain temporally successive blocks of real spectral coefficients. The output values of this spectral conversion device are fed to a post-processor for post-processing the block-wise real spectral representation to obtain an approximated complex spectral representation having successive blocks, each block having a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and by a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is determined by combining at least two real spectral coefficients. A good approximation for a complex spectral representation of the discrete-time signal is obtained by combining two real spectral coefficients, preferably by a weighted linear combination, wherein additionally more degrees of freedom for optimizing the entire system are available.

Description

The present invention relates to time-frequency conversion algorithms, and more particularly to such algorithms in conjunction with audio compression concepts.

For some coding applications for the purpose of data compression, and in particular for audio coding, a representation of real-valued discrete-time signals in the form of complex-valued spectral components is necessary. A complex spectral coefficient can be represented by a first and a second partial spectral coefficient, where, as desired, the first partial spectral coefficient is the real part and the second partial spectral coefficient is the imaginary part. Alternatively, the complex spectral coefficient may also be represented by the magnitude as the first partial spectral coefficient and the phase as the second partial spectral coefficient.
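
The two equivalent representations of a complex spectral coefficient can be illustrated with a short Python sketch using the standard `cmath` module. The helper names `as_real_imag` and `as_magnitude_phase` are hypothetical and only serve this illustration.

```python
import cmath

def as_real_imag(c):
    """First partial coefficient = real part, second = imaginary part."""
    return c.real, c.imag

def as_magnitude_phase(c):
    """First partial coefficient = magnitude, second = phase in radians."""
    return cmath.polar(c)

coefficient = complex(3.0, 4.0)
real_part, imag_part = as_real_imag(coefficient)
magnitude, phase = as_magnitude_phase(coefficient)
# Either pair describes the same coefficient; the polar pair converts back.
rebuilt = cmath.rect(magnitude, phase)
```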

Such complex spectral coefficients are also used in Vinton et al., "Scalable and progressive audio codec", 2001 IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 3277-3280, vol. 5.

In audio coding in particular, real-valued transformation methods are often used, such as the well-known MDCT described in "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation", J. Princen, A. Bradley, IEEE Trans. Acoust., Speech, and Signal Processing 34, pp. 1153-1161, 1986. There is a need for a complex spectrum, for example within the psychoacoustic model. In this regard, reference is made to the psychoacoustic model in Annex D.2.4 of the standard ISO/IEC 11172-3, which is also referred to as the MPEG1 standard. In certain applications, a complex discrete Fourier transform runs in parallel with the actual MDCT (MDCT = modified discrete cosine transformation) in order to calculate psychoacoustic parameters such as the psychoacoustic masking threshold.

In this discrete Fourier transformation (DFT), the input signal is first divided into blocks of a predetermined length by multiplication with temporally staggered window functions. Each of these blocks is then converted to a spectral representation by applying the DFT. If the blocks each contain L samples, i.e. if the window length is L, the output of the DFT can again be completely described by a total of L values (real and imaginary parts, or magnitude and phase values). For example, if the input signal is real, L/2 complex values result. With suitable window functions, the input signal can be reconstructed from this representation by means of an inverse DFT.
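
The count of L values for a real block can be checked with a small Python sketch. It uses a direct (slow) DFT from the standard library; the sine window, the block length of 8 and the test signal are assumptions chosen only for illustration. For a real input the bins satisfy X[L-k] = conj(X[k]), and bins 0 and L/2 are real, so only about L/2 complex values are independent.

```python
import cmath
import math

def dft(block):
    """Direct O(L^2) discrete Fourier transform of one block."""
    L = len(block)
    return [
        sum(block[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
        for k in range(L)
    ]

L = 8
# Sine window as an assumed window function; any window works for this check.
window = [math.sin(math.pi * (n + 0.5) / L) for n in range(L)]
x = [math.cos(2.0 * math.pi * 1.3 * n / L) for n in range(L)]  # real input block
X = dft([w * v for w, v in zip(window, x)])

# Conjugate symmetry: the upper half of the spectrum repeats the lower half.
symmetric = all(abs(X[k] - X[L - k].conjugate()) < 1e-9 for k in range(1, L))
```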

However, this approach is subject to some limitations. For example, critical sampling is only possible if successive windows do not overlap. Otherwise, with a time offset of N < L values, L values in the spectral representation would have to be transmitted for every N new input values of the DFT, which is undesirable particularly in data compression methods.
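
The overhead caused by overlapping DFT windows is simple arithmetic: L spectral values per N new input samples. A minimal Python sketch, with a hypothetical helper name and example sizes chosen only for illustration:

```python
def values_per_input_sample(window_length, hop):
    """Spectral values produced per new input sample for a block DFT.

    window_length is L, hop is the time offset N between successive windows.
    A ratio of 1.0 corresponds to critical sampling.
    """
    if not 0 < hop <= window_length:
        raise ValueError("hop must satisfy 0 < hop <= window_length")
    return window_length / hop

no_overlap = values_per_input_sample(1024, 1024)   # windows do not overlap
half_overlap = values_per_input_sample(1024, 512)  # 50 % overlap doubles the data
```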

However, the use of non-overlapping window functions strongly limits the achievable quality of the spectral decomposition, in particular the separation of different frequency bands.

Better band separation, on the other hand, can be achieved with real-valued transformations with overlapping window functions. A special class of these transformations are the so-called modulated filter banks, which are characterized by the possibility of an efficient implementation. Among these modulated filter banks, the modified discrete cosine transformation (MDCT) has become established as a special form, in which the window length L can assume values between N and 2N - 1 depending on the degree of overlap.

Fig. 6 shows the decomposition of a time-discrete input signal x(n) into the spectral components u_{k,m}, where m represents the temporal block index, i.e. the time index after the sampling rate reduction, while k is the frequency index or subband index. The sampling frequencies are the same in all subbands, i.e. the original sampling frequency is reduced by the factor N. The filter bank shown in Fig. 6, with filters 60 and downstream downsampling elements 62, provides a uniform band division.

In a modulated filter bank, the individual subband filters are obtained by multiplying a prototype impulse response h_P(n) by a subband-specific modulation function, the following rule being used for the MDCT and similar transforms:

$$h_k(n) = h_P(n)\,\cos\!\left(\frac{\pi}{N}\left(n - \frac{N}{2} + \frac{1}{2}\right)\left(k + \frac{1}{2}\right)\right)$$

The transformation rule may also deviate from the above equation, e.g. when the sine function is used instead of the cosine function, or when "+N/2" is used instead of "-N/2". Use with the alternating MDCT/MDST mentioned later (with k in place of k + 1/2) is also conceivable.

In the above equation, h_P(n) is the prototype impulse response, and h_k(n) is the impulse response of the filter associated with subband k. n is the sample index of the time-discrete input signal x(n), and N is the number of spectral coefficients.
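The modulation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sine-shaped prototype and the helper names `prototype_sine` and `subband_filter` are hypothetical and not taken from the patent, and the prototype length 2N - 1 is simply chosen from the range of window lengths mentioned earlier.

```python
import math

def prototype_sine(length):
    """Hypothetical prototype h_P(n): a sine-shaped window (an assumption)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def subband_filter(h_p, k, N):
    """h_k(n) = h_P(n) * cos(pi/N * (n - N/2 + 1/2) * (k + 1/2))."""
    return [
        h_p[n] * math.cos(math.pi / N * (n - N / 2 + 0.5) * (k + 0.5))
        for n in range(len(h_p))
    ]

N = 4                              # number of subbands / spectral coefficients
h_p = prototype_sine(2 * N - 1)    # assumed prototype length 2N - 1
bank = [subband_filter(h_p, k, N) for k in range(N)]  # one filter per subband
```

Every filter in `bank` shares the same prototype envelope; only the cosine modulation depends on the subband index k, which is what makes an efficient joint implementation possible.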

However, the output values of a real-valued transform such as the MDCT, which is known not to be energy-conserving, are only of limited use for applications that require complex-valued spectral components. If, for example, the magnitudes of the real output values are used as an approximation of the magnitudes of complex-valued spectral components in the corresponding frequency ranges, strong fluctuations result even for sinusoidal input signals of constant amplitude. Such a procedure therefore yields only poor approximations of short-time magnitude spectra of the input signal.
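The fluctuation described above can be reproduced with a small numerical experiment. The sketch below assumes a commonly used MDCT formulation with a 2N-sample sine window and a phase term of +N/2; the function name `mdct_coefficient`, the window and the tone parameters are illustrative assumptions, not the patent's implementation.

```python
import math

def mdct_coefficient(x, start, N, k):
    """One real MDCT coefficient of band k from 2N windowed samples.

    Assumed formulation: sine window, hop size N, phase offset N/2.
    """
    acc = 0.0
    for n in range(2 * N):
        w = math.sin(math.pi * (n + 0.5) / (2 * N))
        acc += w * x[start + n] * math.cos(
            math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
    return acc

N, k, blocks = 16, 4, 64
# Constant-amplitude tone slightly off the centre frequency of band k.
omega = math.pi * (k + 0.5) / N + 0.1 / N
x = [math.cos(omega * n + 0.3) for n in range((blocks + 1) * N)]

# Magnitude of the real coefficient in band k for each successive block.
mags = [abs(mdct_coefficient(x, m * N, N, k)) for m in range(blocks)]
print("smallest magnitude:", min(mags))
print("largest magnitude: ", max(mags))
```

Although the input amplitude never changes, the magnitude of the real coefficient depends on the phase of the tone relative to each block, so the values in `mags` swing between small and large values from block to block.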

The technical publication "A Scalable and Progressive Audio Codec", Vinton and Atlas, IEEE ICASSP 2001, 7-11 May 2001, Salt Lake City, presents an audio coder with a transform algorithm consisting of a base transform and a second transform. The input signal is windowed with a Kaiser-Bessel window function to produce temporally successive blocks of samples. The blocks of input values are then transformed either by a modified discrete cosine transform (MDCT) or by a modified discrete sine transform (MDST), depending on a shift index. This base transform process essentially corresponds to the TDAC filter bank described in the cited technical publication by Princen and Bradley. Two temporally adjacent blocks of spectral coefficients are then combined into a single complex transform such that the MDCT block represents the real parts of complex spectral coefficients, while the temporally following MDST block represents the associated imaginary parts. From this, a time-frequency distribution of the magnitude of the complex spectrum is generated, and the two-dimensional magnitude distribution is windowed over time in each frequency band, again with window functions overlapping by 50%. A magnitude matrix is then calculated by means of the second transform. The phase information is not subjected to the second transform.

The alternating use of the output values of an MDCT as real and imaginary parts is also introduced as "MDFT" in the technical publication "MDCT Filter Banks with Perfect Reconstruction", Karp and Fliege, Proc. IEEE ISCAS 1995, Seattle, WA.

It has been found that this approximation of a complex spectrum from a real-valued spectral representation of the time-discrete input signal is also problematic, in that no adequate magnitude representation can be obtained for tones of certain frequencies. Thus, with this transform as well, short-time magnitude spectra can be determined only to a limited extent.

The object of the present invention is to provide an improved concept for generating a complex spectral representation of a time-discrete signal.

This object is achieved by a device for generating a complex spectral representation according to claim 1, a method for generating a complex spectral representation according to claim 18, a device for encoding a time-discrete signal according to claim 19, a method for encoding a time-discrete signal according to claim 20, a device for generating a real spectral representation according to claim 21, a method for generating a real spectral representation according to claim 22, or a computer program according to claim 23.

The present invention is based on the finding that a good approximation of a spectral representation of a time-discrete signal can be determined from a block-wise real-valued spectral representation of that signal by calculating a first partial spectral coefficient and/or a second partial spectral coefficient through a combination of at least two real spectral coefficients. Thus, for example, the real part or the imaginary part of an approximated complex spectral coefficient for a particular frequency index is obtained by combining two or more real spectral coefficients, preferably in temporal and/or frequency proximity to the complex spectral coefficient to be calculated. Preferably, the combination is a linear combination, in which the real spectral coefficients to be combined may additionally be weighted with constant weighting factors before the linear combination, i.e. before an addition or subtraction.

It should be noted here that a linear combination is an addition or subtraction of different linear combination partners, which may or may not be weighted with weighting factors before the linear combination. The weighting factors can be positive or negative real numbers, including zero.

In a preferred embodiment of the present invention, the two or more real spectral coefficients that are combined to obtain a complex partial spectral coefficient for a frequency index and a (temporal) block index are located in frequency and/or temporal proximity. In frequency proximity are the real spectral coefficients of the current (temporal) block whose frequency index is higher by 1 or lower by 1. In temporal proximity are the corresponding real spectral coefficients with the same frequency index from the immediately preceding or immediately following temporal block. In both temporal and frequency proximity are the real spectral coefficients of the immediately preceding or immediately following temporal block whose frequency index is higher or lower by one than the frequency index of the partial spectral coefficient currently being calculated.
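A weighted linear combination over such a time/frequency neighbourhood might be sketched as follows. The function name `combine`, the offset-to-weight dictionary and the rule of skipping neighbours outside the available blocks or bands are assumptions made for illustration; the weights in the example are arbitrary.

```python
def combine(u, m, k, weights):
    """Weighted sum of real coefficients u[m + dm][k + dk] around (m, k).

    `weights` maps (dm, dk) offsets to real weighting factors. Offsets that
    fall outside the available blocks or bands are skipped (an assumption).
    """
    total = 0.0
    for (dm, dk), w in weights.items():
        mm, kk = m + dm, k + dk
        if 0 <= mm < len(u) and 0 <= kk < len(u[mm]):
            total += w * u[mm][kk]
    return total

# Three consecutive blocks (m-1, m, m+1) with four bands each; toy values.
u = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [9.0, 10.0, 11.0, 12.0]]

# Current coefficient plus its temporal predecessor, minus half of the
# lower-frequency neighbour in the current block (arbitrary weights).
weights = {(0, 0): 1.0, (-1, 0): 1.0, (0, -1): -0.5}
value = combine(u, m=1, k=2, weights=weights)   # 7.0 + 3.0 - 0.5 * 6.0
```

Setting a weight to zero removes the corresponding neighbour from the combination, which matches the remark that weighting factors may be any real number including zero.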

Preferably, the combination rule for calculating a partial spectral coefficient varies depending on whether the frequency index is even or odd.

According to the invention, it has been found that a combination of real spectral coefficients in temporal and/or frequency proximity to the complex spectral coefficient to be determined yields a good approximation of a desired frequency response of the overall arrangement comprising the means for generating a block-wise real-valued spectral representation and the means for post-processing the block-wise real-valued spectral representation. This frequency response, which usually has a bandpass character, should have a desired shape for positive frequencies and should be as small as possible, or equal to 0, for negative frequencies. Such a frequency response results from the inventive concept and is considered advantageous for many applications.

In preferred embodiments, the properties of this frequency response can be adjusted, e.g. by a suitable setting of the weighting factors or by a corresponding modification of the window functions of the first transform that generates the real-valued spectral coefficients. The system thus offers many degrees of freedom for adaptation to particular requirements. In particular, more than two real spectral coefficients may be combined in order to achieve an even better approximation of a desired frequency response of the overall arrangement.

Preferred embodiments of the present invention are explained in detail below with reference to the accompanying drawings, in which:

Fig. 1
shows a block diagram of the inventive device for generating a complex spectral representation;
Figs. 2a to 2c
show a representation of the real spectral coefficients adjacent to a partial spectral component for a complex spectral coefficient with frequency index k and block index m;
Fig. 3
shows a schematic representation of the calculation of complex subband signals with a real-valued transform T_1 and a post-processing transform T_2;
Fig. 4
shows a block diagram of the inventive device according to a preferred embodiment of the present invention with critical sampling;
Fig. 5
shows a block diagram of the inventive device according to a further embodiment of the present invention without critical sampling; and
Fig. 6
shows a known real-valued filter bank with uniform band division.

Fig. 1 shows a device for generating a complex spectral representation of a time-discrete signal x(n). The time-discrete signal x(n) is fed into a means 10 for generating a block-wise real-valued spectral representation of the time-discrete signal, the spectral representation comprising temporally successive blocks, each block comprising a set of spectral coefficients, as explained in more detail with reference to Figs. 2a to 2c. The output of the means 10 thus provides a sequence of temporally successive blocks of spectral coefficients which, owing to the nature of the means 10, are real-valued. This sequence of temporally successive blocks of spectral coefficients is fed into a means 12 for post-processing in order to obtain a block-wise complex approximated spectral representation comprising successive blocks, each block comprising a set of complex approximated spectral coefficients. A complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, at least one of which is determined by a combination of at least two real spectral coefficients.

Figs. 2a to 2c together show a sequence of blocks of magnitudes of real-valued spectral coefficients as generated by the means 10 of Fig. 1. m is a block index, and k is a frequency index. Fig. 2a shows a block of real-valued spectral coefficients, plotted along the frequency axis, at the time or block index (m-1). The block of spectral coefficients comprises spectral coefficients u_{i,m-1}, where i is a running index and m-1 is the block index. In particular, Fig. 2a shows a spectral line with the frequency index i = k as well as spectral components with the frequency indices i = (k-1) and i = (k+1).

Fig. 2b shows the same situation, but for the temporally following block m. Finally, Fig. 2c shows the same situation again, but for the block index (m+1). The sequence of Figs. 2a, 2b and 2c thus represents a progression in time, which is symbolized by an arrow 20 in Figs. 2a to 2c.

Fig. 3 shows an alternative representation of the device for generating a complex spectral representation, in which the time-discrete input signal x(n) is fed into the means 10 for generating a block-wise real spectral representation, denoted T_1 in Fig. 3. It should be noted that this is a first conversion of the time signal, which has been windowed so as to be available block by block, into a spectral representation at the output of the means 10. Fig. 3 shows a snapshot at the time or block index m and thus relates to Fig. 2b described above. The output values of the means 10, i.e. the real-valued spectral coefficients, which may for example be MDCT coefficients, are fed into the means 12 for post-processing in order to obtain at its output a complex spectrum comprising, for each frequency index k, a first partial spectral coefficient p_{k,m} and a second partial spectral coefficient q_{k,m}, where p_{k,m} is the real part and q_{k,m} is the imaginary part of the complex spectral coefficient for the frequency index k, and m denotes the block index.

According to the invention, real-valued transforms in the form of modulated filter banks are thus used for the actual spectral decomposition in order to generate complex-valued spectral components. Real spectral coefficients are taken from temporally successive and/or spectrally adjacent output values of the real-valued transform, which is denoted T_1 or 10 in Fig. 3. From these, by way of example, a real part p and an imaginary part q are formed for a particular frequency index and a particular (temporal) block index. Alternatively, magnitude and phase could of course be generated instead. In doing so, particular phase relationships of the modulation functions underlying a modulated filter bank can be exploited.
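The alternative of outputting magnitude and phase instead of real and imaginary parts amounts to a rectangular-to-polar conversion of each pair (p, q). A minimal sketch, with the hypothetical helper name `to_polar`:

```python
import math

def to_polar(p, q):
    """Convert a (real part, imaginary part) pair into (magnitude, phase)."""
    return math.hypot(p, q), math.atan2(q, p)

mag, phase = to_polar(3.0, 4.0)   # magnitude 5.0

# For an ideal complex coefficient A * exp(j * phi), the magnitude does not
# depend on the phase phi, unlike the magnitude of the real part alone.
A = 2.0
mags = [to_polar(A * math.cos(phi), A * math.sin(phi))[0]
        for phi in (0.0, 0.7, 1.4, 2.1)]
```

All entries of `mags` equal A up to rounding, which is the property that makes a complex representation attractive for short-time magnitude spectra.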

In a preferred embodiment, the operation T_2 or 12 that follows the first transform is itself an invertible, critically sampled transform. This yields an overall system that likewise has the property of critical sampling and at the same time permits reconstruction from the spectral components obtained.

T_2 is a two-dimensional transform, since in the preferred embodiment of the present invention both temporally adjacent and frequency-adjacent real-valued spectral coefficients are combined, i.e. its input values extend along both the time axis and the frequency axis, as illustrated in Figs. 2a to 2c. Since each transform operation of the means 12 produces one real part and one imaginary part, critical sampling requires a pair of values to be calculated only for every second sampling position of the time/frequency plane. In a preferred embodiment of the present invention, this is achieved by sampling rate reduction along the time axis, i.e. calculation only for every second block of the first transform T_1. Alternatively, this is achieved by sampling rate reduction along the frequency axis, i.e. calculation only for every second subband i of the first transform. As a further alternative, this is achieved in an offset manner, i.e. in a checkerboard pattern in which every second block and every second band are used alternately.
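The three sampling schemes can be compared with a short sketch that counts how many (p, q) pairs each scheme produces on a small time/frequency grid. The function names and the choice of which parity is retained are assumptions made for illustration.

```python
def time_decimated(m, k):
    """Compute a pair only for every second block (even m assumed)."""
    return m % 2 == 0

def freq_decimated(m, k):
    """Compute a pair only for every second subband (even k assumed)."""
    return k % 2 == 0

def checkerboard(m, k):
    """Compute a pair on alternating positions of the time/frequency plane."""
    return (m + k) % 2 == 0

def count_pairs(pattern, blocks, bands):
    """Number of positions (m, k) at which a (p, q) pair is computed."""
    return sum(1 for m in range(blocks) for k in range(bands)
               if pattern(m, k))

blocks, bands = 6, 8
total_real = blocks * bands          # real coefficients delivered by T_1
counts = {f.__name__: count_pairs(f, blocks, bands)
          for f in (time_decimated, freq_decimated, checkerboard)}
for name, pairs in counts.items():
    # Each pair carries two real values, so 2 * pairs should equal total_real.
    print(name, 2 * pairs, total_real)
```

In all three schemes the number of real values at the output of T_2 equals the number of real coefficients at its input, which is what critical sampling requires.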

The transform coefficients of the second transform, i.e. the weighting factors with which the output values of T_1 are each weighted before their summation, preferably satisfy the conditions for exact reconstruction according to the respective sampling scheme. The system according to the invention offers a number of degrees of freedom that can be used to optimize the properties of the overall system, i.e. to optimize the frequency response of the overall system as a complex filter bank.

It should also be noted that critical sampling is not mandatory for some applications. This may be the case, for example, when signals in an audio decoder are post-processed after decoding but before being transformed back into the time domain. In this case, there is greater freedom in the choice of the transform coefficients in T_2. This greater freedom is preferably used for a better optimization of the overall behavior.

A first embodiment of the present invention, giving the detailed rule for the means 12 for post-processing, is described below with reference to Fig. 4. It is preferred to distinguish between an even frequency index k and an odd frequency index k+1. In the case of an even frequency index, i.e. when p_{k,m} and q_{k,m} are to be calculated (m is the block index and k is the frequency index), the real part p_{k,m} according to the first embodiment is determined by summing two temporally successive real-valued spectral coefficients. p_{k,m} thus results from summing the spectral coefficients with the index k either from Figs. 2b and 2a or from Figs. 2c and 2b.

According to the invention, the associated imaginary part $q_{k,m}$ is obtained by summing two successive values with the frequency index k-1, again either from Figs. 2a and 2b (block m-1 and block m) or from Figs. 2b and 2c (block m and block m+1).

For an odd frequency index k+1, the real part $p_{k+1,m}$ is calculated as the difference of two successive values, i.e. as the difference between the spectral coefficients with index k+1 of Figs. 2a and 2b or of Figs. 2b and 2c. The associated imaginary part $q_{k+1,m}$ results as the difference of two successive values with the frequency index k, i.e. as the difference of the real-valued spectral coefficients with index k of Figs. 2a and 2b or of Figs. 2b and 2c.

This yields the transform function shown in Fig. 4, designated as a whole by reference numeral 12a. The transform function has two sub-rules $h_L(m)$ and $h_H(m)$ which, as shown in Fig. 4, are applied in pairs and alternately to the output values of the means 10. Specifically, the first sub-function $h_L(m)$ has the form {1, 1}, while the second sub-function $h_H(m)$ has the form {1, -1}. The notation of the sub-functions $h_L(m)$ and $h_H(m)$ means that a sum or a difference, respectively, of the corresponding spectral coefficients from two temporally adjacent blocks is to be formed.

Critical sampling is achieved by reducing the temporal sampling rate by a factor of 2, as shown symbolically by the means designated 12b in Fig. 4. If orthogonality of the second transform (12a, 12b) is desired, all output values p, q can be normalized by multiplying them by the factor 1/√2.
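One way to see why the factor 1/√2 gives an orthogonal mapping is to write the sum and difference of one temporally adjacent coefficient pair as a 2×2 matrix. The matrix name H below is introduced only for this illustration and does not appear in the source; the pairing of $p_{k,m}$ with $q_{k+1,m}$ follows the sum and difference rules described for Fig. 4.

```latex
\begin{pmatrix} p_{k,m} \\ q_{k+1,m} \end{pmatrix}
= H \begin{pmatrix} u_{k,m} \\ u_{k,m-1} \end{pmatrix},
\qquad
H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
H H^{T} = \frac{1}{2} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
```

Without the scaling, the rows {1, 1} and {1, -1} are still mutually orthogonal but each has norm √2, so the unnormalized mapping doubles the energy of the pair.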

The second transform (12a, 12b), which follows the first transform (for example an MDCT), in each case spans the two adjacent bands from which the real part $p_{k,m}$ and the imaginary part $q_{k,m}$ for a frequency index k are formed. In addition, as represented by the functions $h_L$ and $h_H$, temporally successive real-valued spectral coefficients are taken into account in the combination, i.e. in the summation or difference formation.

In the embodiment shown in Fig. 4, the downstream transform 12a, 12b offers no degrees of freedom for optimizing the overall system, in the sense of adjustable weighting factors contained in the functions $h_L$ and $h_H$. It is therefore preferred to optimize the overall system by manipulating the window function of the first transform, for example the MDCT, i.e. by changing it relative to a given known window function. This provides N/2 degrees of freedom for a frequency resolution of N subbands and a window length of L = 2N values.

In summary, the transform rule T2 shown in Fig. 4 reads as follows:

  • for even k:
    $$p_{k,m} = u_{k,m} + u_{k,m-1} \tag{1}$$
    $$q_{k,m} = u_{k-1,m} + u_{k-1,m-1} \tag{2}$$
  • for odd k+1:
    $$p_{k+1,m} = u_{k+1,m} - u_{k+1,m-1} \tag{3}$$
    $$q_{k+1,m} = u_{k,m} - u_{k,m-1} \tag{4}$$
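The rule of Fig. 4 can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the function name `t2_fig4`, the data layout `u[m][k]` (block index first), the choice to evaluate only odd block indices m as the factor-2 decimation, and the treatment of the out-of-range coefficient $u_{-1,m}$ as zero are all assumptions made here, since the text does not specify boundary handling.

```python
import math


def t2_fig4(u, normalize=False):
    """Apply the sum/difference rule of equations (1) to (4).

    u is a list of blocks; u[m][k] is the real spectral coefficient with
    block index m and frequency index k (hypothetical layout).
    Returns one block of complex values p + jq per pair of input blocks.
    """
    scale = 1.0 / math.sqrt(2.0) if normalize else 1.0
    n = len(u[0])
    out = []
    # Temporal decimation by 2: only every second block index is evaluated.
    for m in range(1, len(u), 2):
        prev, cur = u[m - 1], u[m]
        block = []
        for k in range(n):
            # Even index: sum (h_L = {1, 1}); odd index: difference (h_H = {1, -1}).
            sign = 1.0 if k % 2 == 0 else -1.0
            p = cur[k] + sign * prev[k]
            if k == 0:
                q = 0.0  # assumption: u[-1] does not exist and is taken as zero
            else:
                q = cur[k - 1] + sign * prev[k - 1]
            block.append(complex(scale * p, scale * q))
        out.append(block)
    return out


if __name__ == "__main__":
    blocks = [[1, 2, 3, 4], [5, 6, 7, 8]]
    print(t2_fig4(blocks))
```

For two input blocks of four coefficients each, the sketch produces one block of four complex values, which matches the count required for critical sampling (eight real inputs, eight real output components) apart from the boundary assumption at k = 0.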

To reverse the transform T2, as given by way of example for Fig. 4 in equations (1) to (4), a transform rule $T_2^{-1}$ inverse to the transform rule T2 is used. Considering equations (1) to (4) shows that the real spectral components $u_{k,m-1}$ and $u_{k,m}$ can be calculated from the real part $p_{k,m}$ and the imaginary part $q_{k+1,m}$, i.e. from equations (1) and (4), by solving these two equations in two unknowns for the sought real spectral coefficients $u_{k,m-1}$ and $u_{k,m}$. Using this inverse combination rule $T_2^{-1}$, the sequence of real spectral coefficients can be recovered from a known sequence of blocks of complex approximated spectral coefficients.
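Solving equations (1) and (4) for the two unknowns gives $u_{k,m} = (p_{k,m} + q_{k+1,m})/2$ and $u_{k,m-1} = (p_{k,m} - q_{k+1,m})/2$. A minimal Python sketch of this step follows; the function name `invert_pair` and the `normalized` flag (for inputs that were scaled by 1/√2) are hypothetical and only cover one even frequency index k.

```python
import math


def invert_pair(p_k, q_k1, normalized=False):
    """Recover (u_{k,m-1}, u_{k,m}) from p_{k,m} and q_{k+1,m}, k even.

    Solves  p = u_cur + u_prev  (equation 1)
            q = u_cur - u_prev  (equation 4)
    If normalized is True, the inputs are assumed to carry the factor 1/sqrt(2).
    """
    scale = math.sqrt(2.0) if normalized else 1.0
    p = p_k * scale
    q = q_k1 * scale
    u_cur = 0.5 * (p + q)
    u_prev = 0.5 * (p - q)
    return u_prev, u_cur


if __name__ == "__main__":
    # u_{k,m-1} = 3 and u_{k,m} = 7 give p = 10 and q = 4.
    print(invert_pair(10.0, 4.0))
```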

An alternative embodiment without critical sampling is described below with reference to Fig. 5. Here the output value $u_{k,m}$ of the m-th MDCT operation with the frequency index k is used directly to form the real part. The associated imaginary part is calculated as a weighted sum of the MDCT output values surrounding it in the time-frequency plane, namely $u_{k-1,m-1}$, $u_{k-1,m}$, $u_{k-1,m+1}$, $u_{k,m-1}$, $u_{k,m+1}$, $u_{k+1,m-1}$, $u_{k+1,m}$ and $u_{k+1,m+1}$. One possible combination of the corresponding filters according to Fig. 5 (in the example for odd k) reads as follows:

  • for the real part p:
    $$h_R(m) = \{0, 1, 0\}$$
  • for the imaginary part q:
    $$h_A(m) = \{a, -b, a\}, \quad h_B(m) = \{c, 0, -c\}, \quad h_C(m) = \{a, b, a\}$$

In the above expression, the values of the coefficients a, b and c can be used to optimize the overall system, i.e. again to achieve a desired frequency response of the overall arrangement. As explained above, a desirable response is, for example, a bandpass characteristic for positive frequencies and the greatest possible attenuation for negative frequencies.

Expressed as equations, the transform rule T2 shown in Fig. 5, which consists of the individual filters 50a, 50b, 50c, 50d and a summer 50e, reads as follows:

  • for odd k:
    $$p_{k,m} = u_{k,m} \tag{5}$$
    $$q_{k,m} = a\,u_{k-1,m+1} - b\,u_{k-1,m} + a\,u_{k-1,m-1} - c\,u_{k,m+1} + c\,u_{k,m-1} + a\,u_{k+1,m+1} + b\,u_{k+1,m} + a\,u_{k+1,m-1} \tag{6}$$

To calculate $q_{k,m}$, all real spectral coefficients adjacent to the real spectral coefficient $u_{k,m}$ in the time-frequency plane are thus used, weighted more or less strongly by the weighting factors a, b, c, as shown in equation (6).
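The weighted sum over the eight neighbours can be written out directly. The following Python sketch evaluates equations (5) and (6) for a single odd frequency index; the function name `coeff_fig5`, the layout `u[m][k]`, the zero value used for neighbours outside the available blocks or bands, and the numeric weights in the usage example are assumptions for illustration. The source gives no concrete values for a, b and c.

```python
def coeff_fig5(u, m, k, a, b, c):
    """Complex approximated coefficient p + jq for block m and odd index k.

    u[m][k] is the real spectral coefficient of block m, frequency index k.
    Neighbours outside the stored range are taken as zero (assumption).
    """
    n_blocks = len(u)
    n_bands = len(u[0])

    def get(mm, kk):
        if 0 <= mm < n_blocks and 0 <= kk < n_bands:
            return u[mm][kk]
        return 0.0

    # Equation (5): the real part is the MDCT output value itself.
    p = u[m][k]
    # Equation (6): weighted sum of the eight surrounding values.
    q = (a * get(m + 1, k - 1) - b * get(m, k - 1) + a * get(m - 1, k - 1)
         - c * get(m + 1, k) + c * get(m - 1, k)
         + a * get(m + 1, k + 1) + b * get(m, k + 1) + a * get(m - 1, k + 1))
    return complex(p, q)


if __name__ == "__main__":
    blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    # Placeholder weights, chosen only to make the arithmetic easy to follow.
    print(coeff_fig5(blocks, m=1, k=1, a=1, b=2, c=3))
```

Note that $u_{k,m}$ itself does not enter the sum for q, in line with the zero centre tap of $h_B(m)$.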

It should be noted that the same equations (4) to (6) can be used for an even k. In this case the weighting factors preferably have the same magnitudes but partly different signs.

To reverse the transform rule shown in Fig. 5, only a trivial operation is needed to determine $u_{k,m}$, since this value follows directly from equation (5). Because the system shown in Fig. 5 is not critically sampled, the real and imaginary parts carry redundant information. In the inverse transform rule $T_2^{-1}$ this shows in the fact that the real spectral coefficients can be calculated from the real parts alone. Equation (6) therefore does not have to be evaluated. In the embodiment shown in Fig. 5, the rule inverse to the transform rule is thus identical to, and given by, equation (5).
It should be noted that in the case described above, in which the complex approximated spectral representation is needed, for example, in a psychoacoustic model to set the quantizer step size in an encoder, a back-calculation from the complex approximated spectral representation to the real spectral representation is no longer required. Alternatively, however, there may be cases in which such an inversion is needed, i.e. in which the underlying real spectral representation has to be calculated again from the complex approximated spectral representation.

Depending on the circumstances, the methods according to the invention can be implemented in hardware or in software. The implementation can be on a digital storage medium, in particular a floppy disk or CD, with electronically readable control signals that interact with a programmable computer system such that the corresponding method is executed. In general, the invention thus also consists in a computer program product with program code stored on a machine-readable carrier for performing one or more of the methods according to the invention when the computer program product runs on a computer. In other words, the invention is also a computer program with program code for performing one or more of the methods when the computer program runs on a computer.

Claims (23)

  1. Device for generating a complex spectral representation of a discrete-time signal, comprising:
    means (10) for generating a block-wise real-valued spectral representation of the discrete-time signal, the spectral representation comprising temporally successive blocks, each block comprising a set of real spectral coefficients; and
    means (12) for post-processing the block-wise real-valued spectral representation to obtain a block-wise complex approximated spectral representation comprising successive blocks, each block comprising a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, wherein at least one of the first and the second partial spectral coefficients is to be determined by combining at least two temporally and/or frequency-adjacent real spectral coefficients.
  2. Device according to claim 1,
    wherein the first partial spectral coefficient is a real part of the complex approximated spectral coefficient and the second partial spectral coefficient is an imaginary part of the complex approximated spectral coefficient.
  3. Device according to claim 1 or 2,
    wherein the combination is a linear combination.
  4. Device according to one of the preceding claims,
    wherein the means (12) for post-processing is formed to combine a real spectral coefficient of the frequency and a real spectral coefficient of an adjacent higher or lower frequency for determining a complex spectral coefficient.
  5. Device according to one of the preceding claims,
    wherein the means (12) for post-processing is formed to combine a real spectral coefficient in a current block and a real spectral coefficient in a temporally preceding block or a temporally subsequent block for determining a complex spectral coefficient of a certain frequency.
  6. Device according to one of the preceding claims, formed to operate, in a critical sampling, such that a real spectral value is generated for each discrete-time sample value by the means (10) for generating a block-wise real spectral representation and that a complex spectral coefficient is generated for two real spectral coefficients.
  7. Device according to claim 6,
    wherein the means (12) for post-processing is formed to only be active for every second block of real-valued spectral coefficients to reduce a sampling rate or to be active for every second real spectral coefficient to reduce the sampling rate or to only be active for every second block or for every second real spectral coefficient alternatingly to reduce the sampling rate.
  8. Device according to one of the preceding claims,
    wherein the means (12) for post-processing is formed to sum two real spectral coefficients having the same frequency index from a current block and from a temporally preceding block for the first partial spectral coefficient having an even frequency index, and to sum two real spectral coefficients having a frequency index lower by 1 from the current block and the temporally preceding block for the second partial spectral coefficient having the even frequency index.
  9. Device according to one of the preceding claims,
    wherein the means (12) for post-processing is formed to form a difference of two real spectral coefficients having an odd frequency index from a current block and from a temporally preceding block for the first partial spectral coefficient having the odd frequency index, and to form a difference of two real spectral coefficients having a frequency index lower by 1 from the current block and the temporally preceding block for the second partial spectral coefficient.
  10. Device according to one of the preceding claims,
    wherein the means (12) for post-processing is formed to normalize the first and second partial spectral coefficients each by a factor of 1/√2.
  11. Device according to one of claims 1 to 7,
    wherein the means (12) for post-processing is formed to use a real spectral coefficient having a frequency index as the first partial spectral coefficient for the frequency index, and to use a weighted sum of the real spectral coefficients having adjacent frequency indices of a current block, from one or several preceding blocks or from one or several subsequent blocks for calculating the second partial spectral coefficient, at least two weighting factors being unequal to 0.
  12. Device according to claim 11,
    wherein the means (12) for post-processing is formed not to use the real spectral coefficient forming the first partial spectral coefficient for calculating the second partial spectral coefficient.
  13. Device according to claim 11 or 12,
    wherein the means for post-processing is formed to apply the following rule for calculating the second spectral coefficient:
    $$q_{k,m} = a\,u_{k-1,m+1} - b\,u_{k-1,m} + a\,u_{k-1,m-1} - c\,u_{k,m+1} + c\,u_{k,m-1} + a\,u_{k+1,m+1} + b\,u_{k+1,m} + a\,u_{k+1,m-1};$$

    a, b, c being positive or negative weighting factors, k-1 being a current frequency index k minus 1, m-1 being a current block index m minus 1, k+1 being a current frequency index k plus 1, m+1 being a current block index m plus 1, and $u_{k-1,m-1}$ being a real spectral coefficient of a temporally preceding block having a frequency index k-1, $u_{k-1,m}$ being a real spectral coefficient of a current block having a frequency index k-1, $u_{k-1,m+1}$ being a real spectral coefficient of a temporally subsequent block having a frequency index k-1, $u_{k,m-1}$ being a real spectral coefficient having the frequency index of k from the temporally preceding block, $u_{k,m+1}$ being a real spectral coefficient having the frequency index for the temporally subsequent block, $u_{k+1,m-1}$ being a real spectral coefficient having the frequency index k+1 from the temporally preceding block, $u_{k+1,m}$ being a real spectral coefficient for the frequency index k+1 from the current block, and $u_{k+1,m+1}$ being a real spectral coefficient having the frequency index k+1 from the temporally subsequent block.
  14. Device according to claim 13,
    wherein the signs from one or several weighting factors are different for even and odd frequency indices k.
  15. Device according to claim 13 or 14,
    wherein the weighting factors are adjusted to provide a desired frequency response for the device for generating a complex spectral representation.
  16. Device according to one of the preceding claims,
    wherein the means (10) for generating is formed to execute a modified discrete cosine transform.
  17. Device according to claim 16,
    wherein the means (10) for generating is formed to execute a modified discrete cosine transform with a window overlapping of 50%.
  18. Method for generating a complex spectral representation of a discrete-time signal, comprising the steps of:
    generating (10) a block-wise real-valued spectral representation of the discrete-time signal, the spectral representation comprising temporally successive blocks, each block comprising a set of real spectral coefficients; and
    post-processing (12) the block-wise real-valued spectral representation to obtain a block-wise complex approximated spectral representation comprising successive blocks, each block comprising a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is to be determined by combining at least two temporally and/or frequency-adjacent real spectral coefficients.
  19. Device for coding a discrete-time signal, comprising:
    means for generating a block-wise real-valued spectral representation of the discrete-time signal, the spectral representation comprising temporally successive blocks, each block comprising a set of real spectral coefficients;
    a psycho-acoustic module for calculating a psycho-acoustic masking threshold depending on the discrete-time signal;
    means for quantizing a block of real-valued spectral coefficients using the psycho-acoustic masking threshold,
    wherein the psycho-acoustic module comprises means (12) for post-processing the block-wise real spectral representation to obtain a block-wise complex approximated spectral representation comprising successive blocks, each block comprising a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is to be determined by combining at least two temporally and/or frequency-adjacent real spectral coefficients.
  20. Method for coding a discrete-time signal, comprising the steps of:
    generating a block-wise real-valued spectral representation of the discrete-time signal, the spectral representation comprising temporally successive blocks, each block comprising a set of real spectral coefficients;
    calculating a psycho-acoustic masking threshold depending on the discrete-time signal;
    quantizing a block of real-valued spectral coefficients using the psycho-acoustic masking threshold,
    wherein a step of post-processing (12) the block-wise real spectral representation is performed in the step of calculating to obtain a block-wise complex approximated spectral representation comprising successive blocks, each comprising a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, wherein at least one of the first and second partial spectral coefficients is to be determined by combining at least two temporally and/or frequency-adjacent real spectral coefficients.
  21. Device for generating a real spectral representation from a complex approximated spectral representation, the real spectral representation to be determined comprising temporally successive blocks, each block comprising a set of real spectral coefficients, the complex approximated spectral representation comprising temporally successive blocks, each block comprising a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, the complex approximated spectral coefficients having been calculated by a transform rule from the real spectral coefficients, the transform rule including a combination of at least two temporally and/or frequency-adjacent real spectral coefficients to calculate at least one of the first and second partial spectral coefficients of a complex approximated spectral coefficient, comprising:
    means for performing a combining rule inverse to the transform rule (T2) to calculate the real spectral coefficients from the complex approximated spectral coefficients.
  22. Method for generating a real spectral representation of a complex approximated spectral representation, the real spectral representation to be determined comprising temporally successive blocks, each block comprising a set of real spectral coefficients, the complex approximated spectral representation comprising temporally successive blocks, each block comprising a set of complex approximated spectral coefficients, wherein a complex approximated spectral coefficient can be represented by a first partial spectral coefficient and a second partial spectral coefficient, the complex approximated spectral coefficients having been calculated by a transform rule from the real spectral coefficients, the transform rule including a combination of at least two temporally and/or frequency-adjacent real spectral coefficients to calculate at least one of the first and second partial spectral coefficients of a complex approximated spectral coefficient, comprising the step of:
    performing a combination rule inverse to the transform rule (T2) to calculate the real spectral coefficients from the complex approximated spectral coefficients.
  23. Computer program product having a program code for performing the method according to claim 18, claim 20 or claim 22, when the program runs on a computer.
EP03766165A 2002-07-26 2003-07-14 Arrangement and method for the generation of a complex spectral representation of a time-discrete signal Expired - Lifetime EP1525576B1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
DE10234130A DE10234130B3 (en) 2002-07-26 2002-07-26 Device and method for generating a complex spectral representation of a discrete-time signal
DE10234130 2002-07-26
PCT/EP2003/007608 WO2004013839A1 (en) 2002-07-26 2003-07-14 Arrangement and method for the generation of a complex spectral representation of a time-discrete signal

Publications (2)

Publication Number Publication Date
EP1525576A1 EP1525576A1 (en) 2005-04-27
EP1525576B1 true EP1525576B1 (en) 2009-05-27

Family

ID=30469126

Family Applications (1)

Application Number Title Priority Date Filing Date
EP03766165A Expired - Lifetime EP1525576B1 (en) 2002-07-26 2003-07-14 Arrangement and method for the generation of a complex spectral representation of a time-discrete signal

Country Status (6)

Country Link
US (2) US7707030B2 (en)
EP (1) EP1525576B1 (en)
AT (1) ATE432524T1 (en)
AU (1) AU2003250945A1 (en)
DE (2) DE10234130B3 (en)
WO (1) WO2004013839A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI575962B (en) * 2012-02-24 2017-03-21 杜比國際公司 Low delay real-to-complex conversion in overlapping filter banks for partially complex processing

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6980933B2 (en) * 2004-01-27 2005-12-27 Dolby Laboratories Licensing Corporation Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
DE102004059979B4 (en) 2004-12-13 2007-11-22 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Device and method for calculating a signal energy of an information signal
KR100736607B1 (en) * 2005-03-31 2007-07-09 엘지전자 주식회사 audio coding method and apparatus using the same
DE102006047197B3 (en) * 2006-07-31 2008-01-31 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Device for processing realistic sub-band signal of multiple realistic sub-band signals, has weigher for weighing sub-band signal with weighing factor that is specified for sub-band signal around subband-signal to hold weight
DE102006051673A1 (en) 2006-11-02 2008-05-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for reworking spectral values and encoders and decoders for audio signals
EP2374211B1 (en) 2008-12-24 2012-04-04 Dolby Laboratories Licensing Corporation Audio signal loudness determination and modification in the frequency domain
KR101437896B1 (en) 2010-04-09 2014-09-16 돌비 인터네셔널 에이비 Mdct-based complex prediction stereo coding
EP2375409A1 (en) 2010-04-09 2011-10-12 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder, audio decoder and related methods for processing multi-channel audio signals using complex prediction
AU2011240239B2 (en) 2010-04-13 2014-06-26 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio or video encoder, audio or video decoder and related methods for processing multi-channel audio or video signals using a variable prediction direction
MX2013011131A (en) * 2011-03-28 2013-10-30 Dolby Lab Licensing Corp Reduced complexity transform for a low-frequency-effects channel.
CN103366749B (en) * 2012-03-28 2016-01-27 北京天籁传音数字技术有限公司 Sound codec device and method thereof
CN103366750B (en) * 2012-03-28 2015-10-21 北京天籁传音数字技术有限公司 Sound codec device and method thereof
US20140074614A1 (en) * 2012-09-12 2014-03-13 Globys, Inc. Time series-based entity behavior classification
US8804971B1 (en) 2013-04-30 2014-08-12 Dolby International Ab Hybrid encoding of higher frequency and downmixed low frequency content of multichannel audio
EP3067889A1 (en) 2015-03-09 2016-09-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method and apparatus for signal-adaptive transform kernel switching in audio coding
WO2019121980A1 (en) 2017-12-19 2019-06-27 Dolby International Ab Methods and apparatus systems for unified speech and audio decoding improvements
TWI812658B (en) 2017-12-19 2023-08-21 瑞典商都比國際公司 Methods, apparatus and systems for unified speech and audio decoding and encoding decorrelation filter improvements
EP3729427A1 (en) 2017-12-19 2020-10-28 Dolby International AB Methods and apparatus for unified speech and audio decoding qmf based harmonic transposer improvements
FR3087309B1 (en) * 2018-10-12 2021-08-06 Ateme OPTIMIZATION OF SUB-SAMPLING BEFORE THE CODING OF IMAGES IN COMPRESSION

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5727119A (en) * 1995-03-27 1998-03-10 Dolby Laboratories Licensing Corporation Method and apparatus for efficient implementation of single-sideband filter banks providing accurate measures of spectral magnitude and phase
FI100840B (en) * 1995-12-12 1998-02-27 Nokia Mobile Phones Ltd Noise attenuator and method for attenuating background noise from noisy speech and a mobile station
US5890106A (en) * 1996-03-19 1999-03-30 Dolby Laboratories Licensing Corporation Analysis-/synthesis-filtering system with efficient oddly-stacked singleband filter bank using time-domain aliasing cancellation
DE10236694A1 (en) * 2002-08-09 2004-02-26 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Equipment for scalable coding and decoding of spectral values of signal containing audio and/or video information by splitting signal binary spectral values into two partial scaling layers

Also Published As

Publication number Publication date
WO2004013839A1 (en) 2004-02-12
US8155954B2 (en) 2012-04-10
US20050197831A1 (en) 2005-09-08
DE10234130B3 (en) 2004-02-19
ATE432524T1 (en) 2009-06-15
DE50311552D1 (en) 2009-07-09
AU2003250945A1 (en) 2004-02-23
US7707030B2 (en) 2010-04-27
US20100161319A1 (en) 2010-06-24
EP1525576A1 (en) 2005-04-27

Similar Documents

Publication Publication Date Title
EP1525576B1 (en) Arrangement and method for the generation of a complex spectral representation of a time-discrete signal
DE60317722T2 (en) Method for reducing aliasing interference caused by the adjustment of the spectral envelope in real-valued filter banks
EP1741039B1 (en) Information signal processing by carrying out modification in the spectral/modulation spectral region representation
DE602006000399T2 (en) PARTLY COMPLEX MODULATED FILTER BANK
DE60024501T2 (en) Improvement of perceptual quality of SBR (Spectral Band Replication) and HFR (High Frequency Reconstruction) coding methods by adaptively adding a noise floor and limiting the noise substitution
EP1979901B1 (en) Method and arrangements for audio signal encoding
DE4316297C1 (en) Audio signal frequency analysis method - using window functions to provide sample signal blocks subjected to Fourier analysis to obtain respective coefficients.
DE60014363T2 (en) REDUCING QUANTIZATION-INDUCED BLOCK DISCONTINUITIES IN AN AUDIO ENCODER
EP1609084B1 (en) Device and method for conversion into a transformed representation or for inversely converting the transformed representation
EP1697931B1 (en) Device and method for determining an estimated value
DE102006047197B3 (en) Device for processing a real-valued sub-band signal of a plurality of real-valued sub-band signals, having a weighter for weighting the sub-band signal with a weighting factor determined for that sub-band signal to obtain a weighted sub-band signal
EP1647009B1 (en) Device and method for processing a signal
EP0200239B1 (en) Digital polyphase filter bank with maximum sampling-rate reduction
EP1654674B1 (en) Device and method for processing at least two input values
EP1697930A1 (en) Device and method for processing a multi-channel signal
EP0065210A2 (en) Electrical signal conditioning method with a digital filter device
EP1397799B1 (en) Method and device for processing time-discrete audio sampled values
EP1239455A2 (en) Method and system for implementing a Fourier transformation which is adapted to the transfer function of human sensory organs, and systems for noise reduction and speech recognition based thereon
EP0957471B1 (en) Measuring process for loudness quality assessment of audio signals
DE69823557T2 (en) FAST FREQUENCY TRANSFORMATION TECHNIQUE FOR TRANSFORM AUDIO CODERS
EP1755110A2 (en) Method and device for adaptive reduction of noise signals and background signals in a speech processing system
EP0608281B1 (en) Process for reducing frequency crosstalk during acoustic or optical signal transmission and/or recording
DE60210479T2 (en) AUDIO CODERS WITH IRREGULAR FILTER BANK
DE3732047C2 (en)
EP1538749A2 (en) Filterbank for spectrally modifying a digital signal and corresponding method

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20041230

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

DAX Request for extension of the european patent (deleted)
RBV Designated contracting states (corrected)

Designated state(s): AT CH DE FR GB LI

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWAN

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT CH DE FR GB LI

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REF Corresponds to:

Ref document number: 50311552

Country of ref document: DE

Date of ref document: 20090709

Kind code of ref document: P

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20100302

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 14

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 15

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 16

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20220725

Year of fee payment: 20

Ref country code: DE

Payment date: 20220621

Year of fee payment: 20

Ref country code: AT

Payment date: 20220718

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20220726

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: CH

Payment date: 20220727

Year of fee payment: 20

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230512

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 50311552

Country of ref document: DE

Ref country code: CH

Ref legal event code: PL

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20230713

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK07

Ref document number: 432524

Country of ref document: AT

Kind code of ref document: T

Effective date: 20230714

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20230713