WO2008071353A2 - Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream - Google Patents

Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream

Info

Publication number
WO2008071353A2
Authority
WO
WIPO (PCT)
Prior art keywords
domain
time
data
frequency
encoded
Prior art date
Application number
PCT/EP2007/010665
Other languages
French (fr)
Other versions
WO2008071353A3 (en)
Inventor
Ralf Geiger
Max Neuendorf
Yoshikazu Yokotani
Nikolaus Rettelbach
Juergen Herre
Stefan Geyersberger
Original Assignee
Fraunhofer-Gesellschaft Zur Förderung Der Angewandten Forschung E.V:
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US12/518,627 priority Critical patent/US8818796B2/en
Priority to CN2007800461881A priority patent/CN101589623B/en
Application filed by Fraunhofer-Gesellschaft Zur Förderung Der Angewandten Forschung E.V.
Priority to ES07856467T priority patent/ES2383217T3/en
Priority to MX2009006201A priority patent/MX2009006201A/en
Priority to AU2007331763A priority patent/AU2007331763B2/en
Priority to EP07856467A priority patent/EP2052548B1/en
Priority to BRPI0718738-6A priority patent/BRPI0718738B1/en
Priority to JP2009540636A priority patent/JP5171842B2/en
Priority to BR122019024992-0A priority patent/BR122019024992B1/en
Priority to AT07856467T priority patent/ATE547898T1/en
Priority to CA2672165A priority patent/CA2672165C/en
Priority to PL07856467T priority patent/PL2052548T3/en
Priority to TW096147145A priority patent/TWI363563B/en
Publication of WO2008071353A2 publication Critical patent/WO2008071353A2/en
Publication of WO2008071353A3 publication Critical patent/WO2008071353A3/en
Priority to IL198725A priority patent/IL198725A/en
Priority to HK09105016.0A priority patent/HK1126602A1/en
Priority to NO20092506A priority patent/NO342080B1/en
Priority to US13/924,441 priority patent/US8812305B2/en
Priority to US14/250,306 priority patent/US9043202B2/en
Priority to US14/637,256 priority patent/US9355647B2/en
Priority to US15/094,984 priority patent/US9653089B2/en
Priority to US15/595,170 priority patent/US10714110B2/en
Priority to US16/922,934 priority patent/US11581001B2/en
Priority to US18/152,721 priority patent/US11961530B2/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/7864Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using domain-transform features, e.g. DCT or wavelet transform coefficients
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/26Pre-filtering or post-filtering
    • G10L19/265Pre-filtering, e.g. high frequency emphasis prior to encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2368Multiplexing of audio and video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2383Channel coding or modulation of digital bit-stream, e.g. QPSK modulation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4341Demultiplexing of audio and video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/438Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving MPEG packets from an IP network
    • H04N21/4382Demodulation or channel decoding, e.g. QPSK demodulation

Definitions

  • the present invention is in the field of coding, where different characteristics of data to be encoded are utilized for coding rates, as for example in video and audio coding.
  • coding strategies can make use of characteristics of a data stream to be encoded.
  • perception models are used to compress source data with almost no noticeable loss of quality when replayed.
  • MDCT Modified Discrete Cosine Transform
  • ACELP Algebraic Code Excited Linear Prediction
  • speech coders use a predictive approach, and in this way may represent the audio/speech signal in the time domain.
  • Such speech coders can model the characteristics of the human speech production process, i.e. the human vocal tract and, consequently, achieve excellent performance for speech signals at low bit rates.
  • perceptual audio coders do not achieve the level of performance offered by speech coders for speech signals coded at low bit rates, and using speech coders to represent general audio signals/music results in significant quality impairments.
  • Conventional concepts provide a layered combination in which all partial coders, i.e. time-domain and frequency-domain encoders, are always active, and the final output signal is calculated by combining the contributions of the partial coders for a given processed time frame.
  • a popular example of layered coding is MPEG-4 scalable speech/audio coding with a speech coder as the base layer and a filterbank-based enhancement layer, cf. Bernhard Grill, Karlheinz Brandenburg, "A Two- or Three-Stage Bit-Rate Scalable Audio Coding System", Preprint Number 4132, 99th Convention of the AES (September 1995).
  • the MDCT has become a dominant filterbank for conventional perceptual audio coders because of its advantageous properties. For example, it can provide a smooth cross-fade between processing blocks. Even if a signal in each processing block is altered differently, for example due to quantization of spectral coefficients, no blocking artifacts due to abrupt transitions from block to block occur because of the windowed overlap/add operations.
  • the MDCT uses the concept of time-domain aliasing cancellation (TDAC).
  • the MDCT is a Fourier-related transform based on the type-IV discrete cosine transform, with the additional property of being lapped. It is designed to be performed in consecutive blocks of a larger data set, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to an energy-compaction quality of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid said artifacts stemming from the block boundaries.
  • the MDCT is a bit unusual compared to other Fourier-related transforms in that it has half as many outputs as inputs, instead of the same number. In particular, 2N real numbers are transformed into N real numbers, where N is a positive integer.
  • the inverse MDCT is also known as IMDCT. Because there are different numbers of inputs and outputs, at first glance it might seem that the MDCT should not be invertible. However, perfect invertibility is achieved by adding the overlapped IMDCTs of subsequent overlapping blocks, causing the errors to cancel and the original data to be retrieved, i.e. achieving TDAC.
  • the number of spectral values at the output of a filterbank is equal to the number of time-domain input values at its input which is also referred to as critical sampling.
  • An MDCT filterbank provides high frequency selectivity and enables a high coding gain.
  • the properties of overlapping of blocks and critical sampling can be achieved by utilizing the technique of time-domain aliasing cancellation, cf. J. Princen, A. Bradley, "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation", IEEE Trans. ASSP, ASSP-34(5): 1153-1161, 1986.
  • Fig. 4 illustrates these effects of an MDCT.
  • Fig. 4 shows an MDCT input signal, in terms of an impulse along a time axis 400 at the top.
  • the input signal 400 is then transformed by two consecutive windowing and MDCT blocks, where the windows 410 are illustrated underneath the input signal 400 in Fig. 4.
  • the back transformed individual windowed signals are displayed in Fig. 4 by the time lines 420 and 425.
  • the first block produces an aliasing component with positive sign 420
  • the second block produces an aliasing component with the same magnitude and a negative sign 425.
  • the aliasing components cancel each other after addition of the two output signals 420 and 425 as shown in the final output 430 at the bottom of Fig. 4.
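The cancellation described for Fig. 4 can be reproduced numerically. The following is a minimal sketch, not taken from the patent: it assumes a direct O(N^2) MDCT/IMDCT pair, a sine window (which satisfies the Princen-Bradley condition), and an IMDCT scaled by 2/N so that overlap/add returns the input at unit gain. The function names and the test signal are illustrative choices.

```python
import math

def mdct(block):
    """Direct MDCT: 2N real inputs -> N real outputs."""
    n_out = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_out)
    ]

def imdct(coeffs):
    """Direct IMDCT: N real inputs -> 2N time-aliased outputs (2/N scaling assumed)."""
    n_out = len(coeffs)
    return [
        2.0 / n_out * sum(c * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
                          for k, c in enumerate(coeffs))
        for n in range(2 * n_out)
    ]

def sine_window(length):
    """Symmetric sine window; w[n]^2 + w[n + N]^2 = 1 for length 2N."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def tdac_roundtrip(signal, n_out):
    """Window, MDCT, IMDCT, window and overlap/add blocks hopping by N samples."""
    win = sine_window(2 * n_out)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_out + 1, n_out):
        block = [signal[start + n] * win[n] for n in range(2 * n_out)]
        rec = imdct(mdct(block))
        for n in range(2 * n_out):
            out[start + n] += rec[n] * win[n]
    return out

N = 8
signal = [math.sin(0.3 * n) + 0.1 * n for n in range(5 * N)]
decoded = tdac_roundtrip(signal, N)

# Samples N .. 4N-1 are covered by two overlapping blocks, so aliasing cancels there.
interior_error = max(abs(decoded[n] - signal[n]) for n in range(N, 4 * N))
print(interior_error)
```

Each block on its own returns an aliased signal; only the sum of two neighbouring blocks is alias-free, which is why the first and last N samples of the sequence (covered by a single block) are excluded from the comparison.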
  • AMR-WB+ Extended Adaptive Multi-Rate - Wideband codec
  • AMR-WB Adaptive Multi-Rate Wideband
  • the ACELP model is a time-domain, predictive encoder, best suited for speech and transient signals.
  • the AMR-WB encoder is used in ACELP modes.
  • the TCX model is a transform based encoder, and is more appropriate for typical music samples.
  • the AMR-WB+ uses a discrete Fourier transform (DFT) for the transform coding mode TCX.
  • DFT discrete Fourier transform
  • TCX/ACELP coding modes
  • the DFT together with the windowing and overlap represents a filterbank that is not critically sampled.
  • Each TCX frame utilizes an overlap of 1/8 of the frame length, where the frame length equals the number of new input samples. Consequently, the corresponding length of the DFT is 9/8 of the frame length.
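As a small worked example of this arithmetic, the sketch below computes the DFT length and the resulting oversampling for one frame. The frame length of 1024 samples and the helper name `tcx_dft_length` are illustrative assumptions, not values taken from the text above.

```python
from fractions import Fraction

def tcx_dft_length(frame_length, overlap_fraction=Fraction(1, 8)):
    """DFT length = new input samples per frame + overlap into the next frame."""
    return frame_length + frame_length * overlap_fraction

frame = 1024                      # hypothetical number of new input samples per frame
dft_len = tcx_dft_length(frame)   # 1024 + 128 spectral values per frame
overhead = Fraction(dft_len, frame) - 1

print(dft_len, overhead)
```

A critically sampled filterbank such as the MDCT would produce exactly `frame` spectral values per frame, so the overhead of this DFT-based scheme relative to critical sampling is 1/8 of the frame length.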
  • AAC Advanced Audio Coding
  • the Dolby E codec is described in Fielder, Louis D.; Todd, Craig C., "The Design of a Video Friendly Audio Coding System for Distributing Applications", Paper Number 17-008, The AES 17th International Conference: High-Quality Audio Coding (March 1999) and Fielder, Louis D.; Davidson, Grant A., "Audio Coding Tools for Digital Television Distribution", Preprint Number 5104, 108th Convention of the AES (January 2000).
  • the Dolby E codec utilizes the MDCT filterbank. In the design of this coding, special focus was put on the possibility to perform editing in the coding domain. To achieve this, special alias-free windows are used. At the boundaries of these windows a smooth cross-fade or splicing of different signal portions is possible.
  • the object is achieved by an apparatus for decoding according to claim 1, a method for decoding according to claim 22, an apparatus for generating an encoded data stream according to claim 24 and a method for generating an encoded data stream according to claim 35.
  • the present invention is based on the finding that a more efficient encoding and decoding concept can be utilized by using combined time-domain and frequency-domain encoders, respectively decoders.
  • the problem of time aliasing can be efficiently combated by transforming time-domain data to the frequency domain in the decoder and by combining the resulting transformed frequency-domain data with the decoded frequency-domain data received.
  • Overheads can be reduced by adapting overlapping regions of overlap windows being applied to data segments to coding domain changes. Using windows with smaller overlapping regions can be beneficial when using time-domain encoding, respectively when switching from or to time-domain encoding.
  • Embodiments can provide a universal audio encoding and decoding concept that achieves improved performance for both types of input signals, such as speech signals and music signals.
  • Embodiments can benefit from combining multiple coding approaches, e.g. time-domain and frequency-domain coding concepts.
  • Embodiments can efficiently combine filterbank based and time-domain based coding concepts into a single scheme.
  • Embodiments may result in a combined codec which can, for example, switch between an audio codec for music-like audio content and a speech codec for speech-like content. Embodiments may utilize this switching frequently, especially for mixed content.
  • Embodiments of the present invention may provide the advantage that no switching artifacts occur.
  • the amount of additional transmit data, or additionally coded samples, for a switching process can be minimized in order to avoid a reduced efficiency during this phase of operation.
  • the concept of switched combination of partial coders is different from that of the layered combination in which always all partial coders are active.
  • Fig. 1a shows an embodiment of an apparatus for decoding
  • Fig. 1b shows another embodiment of an apparatus for decoding
  • Fig. 1c shows another embodiment of an apparatus for decoding
  • Fig. 1d shows another embodiment of an apparatus for decoding
  • Fig. 1e shows another embodiment of an apparatus for decoding
  • Fig. 1f shows another embodiment of an apparatus for decoding
  • Fig. 2a shows an embodiment of an apparatus for encoding
  • Fig. 2b shows another embodiment of an apparatus for encoding
  • Fig. 2c shows another embodiment of an apparatus for encoding
  • Fig. 3a illustrates overlapping regions when switching between frequency-domain and time-domain coding for the duration of one window
  • Fig. 3b illustrates the overlapping regions when switching between frequency-domain coding and time-domain coding for a duration of two windows
  • Fig. 3c illustrates multiple windows with different overlapping regions
  • Fig. 3d illustrates the utilization of windows with different overlapping regions in an embodiment
  • Fig. 4 illustrates time-domain aliasing cancellation when using MDCT.
  • Fig. 1a shows an apparatus 100 for decoding data segments representing a time-domain data stream, a data segment being encoded in a time domain or in a frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples.
  • This data stream could, for example, correspond to an audio stream, wherein some of the data blocks are encoded in the time domain and other ones are encoded in the frequency domain.
  • Data blocks or segments which have been encoded in the frequency domain may represent time-domain data samples of overlapping data blocks.
  • the apparatus 100 comprises a time-domain decoder 110 for decoding a data segment being encoded in the time domain. Furthermore, the apparatus 100 comprises a processor 120 for processing the data segment being encoded in the frequency domain and output data of the time-domain decoder 110 to obtain overlapping time-domain data blocks. Moreover, the apparatus 100 comprises an overlap/add-combiner 130 for combining the overlapping time-domain data blocks to obtain the decoded data segments of the time-domain data stream.
  • Fig. 1b shows another embodiment of the apparatus 100.
  • the processor 120 may comprise a frequency-domain decoder 122 for decoding data segments being encoded in the frequency domain to obtain frequency-domain data segments.
  • the processor 120 may comprise a time-domain to frequency-domain converter 124 for converting the output data of the time-domain decoder 110 to obtain converted frequency-domain data segments.
  • the processor 120 may comprise a frequency-domain combiner 126 for combining the frequency-domain segments and the converted frequency-domain data segments to obtain a frequency-domain data stream.
  • the processor 120 may further comprise a frequency-domain to time-domain converter 128 for converting the frequency-domain data stream to overlapping time-domain data blocks which can then be combined by the overlap/add-combiner 130.
  • Embodiments may utilize an MDCT filterbank, as for example, used in MPEG-4 AAC, without any modifications, especially without giving up the property of critical sampling. Embodiments may provide optimum coding efficiency. Embodiments may achieve a smooth transition to a time-domain codec compatible with the established MDCT windows while introducing no additional switching artifacts and only a minimal overhead.
  • Embodiments may keep the time-domain aliasing in the filterbank and intentionally introduce a corresponding time-domain aliasing into the signal portions coded by the time-domain codec.
  • resulting components of the time-domain aliasing can cancel each other out in the same way as they do for two consecutive frames of the MDCT spectra.
  • Fig. 1c illustrates another embodiment of an apparatus 100.
  • the frequency-domain decoder 122 can comprise a re-quantization stage 122a.
  • the time-domain to frequency-domain converter 124 can comprise a cosine modulated filterbank, an extended lapped transform, a low delay filterbank or a polyphase filterbank.
  • the embodiment shown in Fig. 1c illustrates that the time-domain to frequency-domain converter 124 can comprise an MDCT 124a.
  • Fig. 1c depicts that the frequency-domain combiner 126 may comprise an adder 126a.
  • the frequency-domain to time-domain converter 128 can comprise a cosine modulated filterbank, respectively an inverse MDCT 128a.
  • the data stream comprising time-domain encoded and frequency-domain encoded data segments may be generated by an encoder which will be further detailed below.
  • the switching between frequency-domain encoding and time-domain encoding can be achieved by encoding some portions of the input signal with a frequency-domain encoder and some input signal portions with a time-domain encoder.
  • the embodiment of the apparatus 100 depicted in Fig. 1c illustrates the principal structure of a corresponding apparatus 100 for decoding.
  • the re-quantization 122a and the inverse modified discrete cosine transform 128a can represent a frequency-domain decoder.
  • the time-domain output of the time-domain decoder 110 can be transformed by the forward MDCT 124a.
  • the time-domain decoder may utilize a prediction filter to decode the time-domain encoded data.
  • the embodiment shown in Fig. 1c also comprises an operation mode where both codecs can operate in parallel.
  • the processor 120 can be adapted for processing a data segment being encoded in parallel in the time domain and in the frequency domain. In this way the signal can partially be coded in the frequency domain and partially in the time domain, similar to a layered coding approach. The resulting signals are then added up in the frequency domain, cf. the frequency-domain combiner 126a.
  • embodiments may carry out a mode of operation which switches exclusively between the two codecs and keeps the number of samples for which both codecs are active to a minimum in order to obtain the best possible efficiency.
  • Fig. 1c illustrates an embodiment of an apparatus 100 illustrating this approach.
  • the apparatus 100 shown in Fig. 1d illustrates that the processor 120 may comprise a calculator 129 for calculating overlapping time-domain data blocks based on the output data of the time-domain decoder 110.
  • the processor 120 or the calculator 129 can be adapted for reproducing a property respectively an overlapping property of the frequency-domain to time-domain converter 128 based on the output data of the time-domain decoder 110, i.e.
  • the processor 120 or calculator 129 may reproduce an overlapping characteristic of time-domain data blocks similar to an overlapping characteristic produced by the frequency-domain to time-domain converter 128. Moreover, the processor 120 or calculator 129 can be adapted for reproducing time-domain aliasing similar to time-domain aliasing introduced by the frequency-domain to time-domain converter 128 based on the output data of the time-domain decoder 110.
  • the frequency-domain to time-domain converter 128 can then be adapted for converting the frequency-domain data segments provided by the frequency-domain decoder 122 to overlapping time-domain data blocks.
  • the overlap/add-combiner 130 can be adapted for combining data blocks provided by the frequency-domain to time-domain converter 128 and the calculator 129 to obtain the decoded data segments of the time-domain data stream.
  • the calculator 129 may comprise a time-domain aliasing stage 129a as it is illustrated in the embodiment shown in Fig. 1e.
  • the time-domain aliasing stage 129a can be adapted for time-aliasing output data of the time-domain decoder to obtain the overlapping time-domain data blocks.
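The interplay of converter 128, calculator 129 and overlap/add-combiner 130 at a switch from frequency-domain to time-domain coding can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the time-domain decoder output is modelled as the original signal (lossless), quantization is omitted, a sine window is used, and the helper names `mdct`, `imdct` and `time_alias` are hypothetical.

```python
import math

def mdct(block):
    """Direct MDCT: 2N inputs -> N outputs."""
    n_out = len(block) // 2
    return [sum(x * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
                for n, x in enumerate(block))
            for k in range(n_out)]

def imdct(coeffs):
    """Direct IMDCT: N inputs -> 2N aliased outputs (2/N scaling assumed)."""
    n_out = len(coeffs)
    return [2.0 / n_out * sum(c * math.cos(math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5))
                              for k, c in enumerate(coeffs))
            for n in range(2 * n_out)]

def time_alias(block):
    """Reproduce MDCT-style aliasing purely in the time domain (fold, then unfold)."""
    q = len(block) // 4
    a, b, c, d = (block[i * q:(i + 1) * q] for i in range(4))
    second = [b[i] - a[q - 1 - i] for i in range(q)]   # second quarter minus reversed first
    third = [c[i] + d[q - 1 - i] for i in range(q)]    # third quarter plus reversed fourth
    return [-v for v in reversed(second)] + second + third + third[::-1]

N = 8
win = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]
signal = [math.cos(0.4 * n) for n in range(3 * N)]

# Block covering samples 0 .. 2N-1: frequency-domain path (MDCT spectrum -> IMDCT).
fd_block = [signal[n] * win[n] for n in range(2 * N)]
fd_out = [v * w for v, w in zip(imdct(mdct(fd_block)), win)]

# Block covering samples N .. 3N-1: time-domain decoder output, aliased by the calculator.
td_block = [signal[N + n] * win[n] for n in range(2 * N)]
td_out = [v * w for v, w in zip(time_alias(td_block), win)]

# Overlap/add over the shared region N .. 2N-1.
overlap = [fd_out[N + n] + td_out[n] for n in range(N)]
switch_error = max(abs(overlap[n] - signal[N + n]) for n in range(N))
print(switch_error)
```

Because the calculator introduces the same aliasing the IMDCT would have produced, the aliasing terms of the two differently coded blocks cancel in the overlap region just as they do for two consecutive MDCT frames.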
  • a combination of the MDCT and the IMDCT can make the process in embodiments much simpler in both structure and computational complexity as only the process of time-domain aliasing (TDA) remains in embodiments.
  • TDA time-domain aliasing
  • This efficient process can be based on a number of observations.
  • the windowed MDCT of the input segments of 2N samples can be decomposed into three steps.
  • the input signal is multiplied by an analysis window.
  • the result is then folded down from 2N samples to N samples.
  • this process implies that the first quarter of the samples is combined, i.e. subtracted, in time-reversed order with the second quarter of the samples, and that the fourth quarter of the samples is combined, i.e. added, with the third quarter of the samples in time-reversed order.
  • the result is the time-aliased, down-sampled signal in the modified second and third quarter of the signal, comprising N samples.
  • the down-sampled signal is then transformed using an orthogonal DCT-like transform mapping N input to N output samples to form the final MDCT output.
  • the windowed IMDCT reconstruction of an input sequence of N spectral samples can likewise be decomposed into three steps.
  • the input sequence of N spectral samples is transformed using an orthogonal inverse DCT-like transform mapping N input to N output samples.
  • the result is unfolded from N to 2N samples by writing the inverse DCT transformed values into the second and third quarter of a 2N-sample output buffer, filling the first quarter with the time-reversed and inverted version of the second quarter, and the fourth quarter with a time-reversed version of the third quarter, respectively.
  • the resulting 2N samples are multiplied with the synthesis window to form the windowed IMDCT output.
  • a concatenation of the windowed MDCT and the windowed IMDCT may be efficiently carried out in embodiments by the sequence of the first and second steps of the windowed MDCT and the second and third steps of the windowed IMDCT.
  • the third step of the MDCT and the first step of the IMDCT can be omitted entirely in embodiments because they are inverse operations with respect to each other and thus cancel out.
  • the remaining steps can be carried out in the time domain only, and thus embodiments using this approach can be substantially low in computational complexity.
  • the first and second step of the MDCT and the second and third step of the IMDCT can be written as a multiplication with the following sparse 2Nx2N matrix.
  • the calculator 129 can be adapted for segmenting the output of the time-domain decoder 110 in calculator segments comprising 2N sequential samples, applying weights to the 2N samples according to an analysis windowing function, subtracting the first N/2 samples in reversed order from the second N/2 samples, and adding the last N/2 samples in reversed order to the third N/2 samples, inverting the second and third N/2 samples, replacing the first N/2 samples with the time-reversed and inverted version of the second N/2 samples, replacing the fourth N/2 samples with the time-reversed version of the third N/2 samples, and applying weights to the 2N samples according to a synthesis windowing function.
  • the overlap/add-combiner 130 can be adapted for applying weights according to a synthesis windowing function to overlapping time-domain data blocks provided by the frequency-domain to time-domain converter 128. Furthermore, the overlap/add-combiner 130 can be adapted for applying weights according to a synthesis windowing function being adapted to the size of an overlapping region of consecutive overlapping time-domain data blocks.
  • the calculator 129 may be adapted for applying weights to the 2N samples according to an analysis windowing function being adapted to the size of an overlapping region of consecutive overlapping time-domain data blocks and the calculator may be further adapted for applying weights to the 2N samples according to a synthesis window function being adapted to the size of the overlapping region.
  • the size of an overlapping region of two consecutive time-domain data blocks which are encoded in the frequency domain can be larger than the size of an overlapping region of two consecutive time-domain data blocks of which one is encoded in the frequency domain and one is encoded in the time domain.
  • the size of the data segments can be adapted to the size of the overlapping regions.
  • Embodiments may have an efficient implementation of a combined MDCT/IMDCT processing, i.e. a block TDA comprising the operations of analysis windowing, folding and unfolding, and synthesis windowing. Moreover, in embodiments some of these steps may be partially or fully combined in an actual implementation.
  • an apparatus 100 may further comprise a bypass 140 for the processor 120 and the overlap/add-combiner 130 being adapted for bypassing the processor 120 and the overlap/add-combiner 130 when non-overlapping consecutive time-domain data blocks occur in data segments, which are encoded in the time domain. If multiple data segments are encoded in the time domain, i.e. no conversion to the frequency domain may be necessary for decoding consecutive data segments, they may be transmitted without any overlapping. For these cases the embodiments as shown in Fig. 1f may bypass the processor 120 and the overlap/add-combiner 130. In embodiments the overlapping of blocks can be determined according to the AAC specifications.
  • the apparatus 200 comprises a segment processor 210 for providing data segments from the data stream, two consecutive data segments having a first or a second overlapping region, the second overlapping region being smaller than the first overlapping region.
  • the apparatus 200 further comprises a time-domain encoder 220 for encoding a data segment in the time domain and a frequency-domain encoder 230 for applying weights to samples of the time-domain data stream according to a first or a second windowing function to obtain a windowed data segment, the first and second windowing functions being adapted to the first and second overlapping regions and for encoding the windowed data segment in the frequency domain.
  • the apparatus 200 comprises a time-domain data analyzer 240 for determining a transition indication associated with a data segment and a controller 250 for controlling the apparatus such that for data segments having a first transition indication, output data of the time-domain encoder 220 is included in the encoded data stream and for data segments having a second transition indication, output data of the frequency-domain encoder 230 is included in the encoded data stream.
  • the time-domain data analyzer 240 may be adapted for determining the transition indication from the time-domain data stream or from data segments provided by the segment processor 210. These embodiments are indicated in Fig. 2b. In Fig. 2b it is illustrated that the time-domain data analyzer 240 may be coupled to the input of the segment processor 210 in order to determine the transition indication from the time-domain data stream. In another embodiment the time-domain data analyzer 240 may be coupled to the output of the segment processor 210 in order to determine the transition indication from the data segments. In embodiments the time-domain data analyzer 240 can be coupled directly to the segment processor 210 in order to determine the transition indication from data provided directly by the segment processor. These embodiments are indicated by the dotted lines in Fig. 2b.
  • the time-domain data analyzer 240 can be adapted for determining a transition measure, the transition measure being based on a level of transience in the time-domain data stream or the data segments wherein the transition indicator may indicate whether the level of transience exceeds a predetermined threshold.
  • Fig. 2c shows another embodiment of the apparatus 200.
  • the segment processor 210 can be adapted for providing data segments with the first and the second overlapping regions
  • the time-domain encoder 220 can be adapted for encoding all data segments
  • the frequency-domain encoder 230 may be adapted for encoding all windowed data segments
  • the controller 250 can be adapted for controlling the time-domain encoder 220 and the frequency-domain encoder 230 such that for data segments having a first transition indication, output data of the time-domain encoder 220 is included in the encoded data stream and for data segments having a second transition indication, output data of the frequency-domain encoder 230 is included in the encoded data stream.
  • both output data of the time-domain encoder 220 and the frequency-domain encoder 230 may be included in the encoded data stream.
  • the transition indicator may be indicating whether a data segment is rather associated or correlated with a speech signal or with a music signal.
  • the frequency-domain encoder 230 may be used for more music-like data segments and the time-domain encoder 220 may be used for more speech-like data segments.
  • parallel encoding may be utilized, e.g. for a speech-like audio signal having background music.
  • controller 250 may control the multiple components within the apparatus 200.
  • the different possibilities are indicated by dotted lines in Fig. 2c.
  • the controller 250 could be coupled to the time-domain encoder 220 and the frequency-domain encoder 230 in order to choose which encoder should produce an encoded output based on the transition indication.
  • the controller 250 may control a switch at the outputs of the time-domain encoder 220 and the frequency-domain encoder 230.
  • both the time-domain encoder 220 and the frequency-domain encoder 230 may encode all data segments and the controller 250 may be adapted for choosing via said switch which is coupled to the outputs of the encoders, which encoded data segment should be included in the encoded data stream, based on coding efficiency, respectively the transition indication.
  • the controller 250 can be adapted for controlling the segment processor 210 for providing the data segments either to the time-domain encoder 220 or the frequency-domain encoder 230.
  • the controller 250 may also control the segment processor 210 in order to set overlapping regions for a data segment.
  • the controller 250 may be adapted for controlling a switch between the segment processor 210 and the time-domain encoder 220, respectively the frequency-domain encoder 230.
  • the controller 250 could then influence the switch so as to direct data segments to either one of the encoders, respectively to both.
  • the controller 250 can be further adapted to set the windowing functions for the frequency-domain encoder 230 along with the overlapping regions and coding strategies.
  • the frequency-domain encoder 230 can be adapted for applying weights of window functions according to AAC specifications.
  • the frequency-domain encoder 230 can be adapted for converting a windowed data segment to the frequency domain to obtain a frequency-domain data segment.
  • the frequency domain encoder 230 can be adapted for quantizing the frequency-domain data segments and, furthermore, the frequency-domain encoder 230 may be adapted for evaluating the frequency-domain data segments according to a perceptual model.
  • the frequency-domain encoder 230 can be adapted for utilizing a cosine modulated filterbank, an extended lapped transform, a low-delay filterbank or a polyphase filterbank to obtain the frequency-domain data segments.
  • the frequency-domain encoder 230 may be adapted for utilizing an MDCT to obtain the frequency data segments.
  • the time-domain encoder 220 can be adapted for using a prediction model for encoding the data segments.
  • If an MDCT in the frequency-domain encoder 230 operates in a so-called long block mode, i.e. the regular mode of operation that is used for coding non-transient input signals (compare the AAC specifications), the overhead introduced by the switching process may be high. This can be true for the cases where only one frame, i.e. a length/framing rate of N samples, should be coded using the time-domain encoder 220 instead of the frequency-domain encoder 230.
  • Figs. 3a to 3d illustrate some conceivable overlapping regions of segments, respectively applicable windowing functions.
  • 2N samples may have to be coded with the time-domain encoder 220 in order to replace one block of frequency-domain encoded data.
  • Fig. 3a illustrates an example, where frequency-domain encoded data blocks use a solid line, and time-domain encoded data uses a dotted line. Underneath the windowing functions data segments are depicted which can be encoded in the frequency domain (solid boxes) or in the time domain (dotted boxes). This representation will be referred to in Figs. 3b to 3d as well.
  • Fig. 3a illustrates the case where data is encoded in the frequency domain, interrupted by one data segment which is encoded in the time domain, and the data segment after it is encoded in the frequency domain again.
  • If the time-domain encoded data segment in Fig. 3a has a size of 2N, then at its start and at its end it overlaps with the frequency-domain encoded data by N/2 samples.
  • the overhead for the time-domain encoded section stays at N samples.
  • Fig. 3b shows the overlap structure in case of two frames encoded with time-domain encoder 220. 3N samples have to be coded with the time-domain encoder 220 in this case.
  • This overhead can be reduced in embodiments by utilizing window switching, for example, according to the structure which is used in AAC.
  • Fig. 3c illustrates a typical sequence of Long, Start, 8Short and Stop windows, as they are used in AAC. From Fig. 3c it can be seen that the window sizes, the data segment sizes and, consequently, the size of the overlapping regions change with the different windows.
  • the sequence depicted in Fig. 3c is an example for the sequence mentioned above.
  • Embodiments should not be limited to windows of the size of AAC windows, however, embodiments take advantage of windows with different overlapping regions and also of windows of different durations.
  • transitions to and from short windows may utilize a reduced overlap. Such windows, as, for example, disclosed in Bernd Edler, "Codierung von Audiosignalen mit überlappender Transformation und adaptiven Fensterfunktionen", Frequenz, Vol. 43, No. 9, p. 252-256, September 1989 and Generic Coding of Moving Pictures and Associated Audio: Advanced Audio Coding, International Standard 13818-7, ISO/IEC JTC1/SC29/WG11 Moving Pictures Expert Group, 1997, may be used in embodiments to reduce the overhead for the transitions to and from the time-domain encoded regions, as it is illustrated in Fig. 3d.
  • Fig. 3d illustrates four data segments, of which the first two and the last one are encoded in the frequency domain and the third one is encoded in the time domain.
  • the transition may be based on Start and Stop windows identical to the ones used in AAC.
  • the corresponding windows for the transitions to and from the time-domain encoded regions are windows with only small regions of overlap.
  • the overhead i.e. the number of additional values to be transmitted due to the switching process decreases substantially.
  • the overhead may be N_ovl/2 for each transition with the window overlap of N_ovl samples.
  • Embodiments may utilize a filterbank in the frequency-domain encoder 230 as, for example, the widely used MDCT filterbank, however, other embodiments may also be used with frequency-domain codecs based on other cosine-modulated filterbanks.
  • This may comprise the derivatives of the MDCT, such as extended lapped transforms or low-delay filterbanks as well as polyphase filterbanks, such as, for example, the one used in MPEG-1 Layer 1/2/3 audio codecs.
  • efficient implementation of a forward/back filterbank operation may take into account a specific type of window and folding/unfolding used in the filterbank.
  • the analysis stage may be implemented efficiently by a preprocessing step and a block transform, i.e. DCT-like or DFT, for the modulation.
  • the corresponding synthesis stage can be implemented using the corresponding inverse transform and a post processing step.
  • Embodiments may only use the pre- and post processing steps for the time-domain encoded signal portions.
  • Embodiments of the present invention provide the advantage that a better code efficiency can be achieved, since switching between a time-domain encoder 220 and the frequency-domain encoder 230 can be done introducing very low overhead. In signal sections of subsequent time-domain encoding only, overlap may be omitted completely in embodiments.
  • Embodiments of the apparatus 100 enable the according decoding of the encoded data stream.
  • Embodiments therewith provide the advantage that a lower coding rate can be achieved for the same quality of, for example, an audio signal, respectively a higher quality can be achieved with the same coding rate, as the respective encoders can be adapted to the transience in the audio signal.
  • the inventive methods can be implemented in hardware or in software.
  • the implementation can be performed using a digital storage medium, in particular a disc, DVD or CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed.
  • the present invention is, therefore, a computer program product having a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer.
  • the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.
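The block TDA described above (analysis windowing, folding, unfolding, synthesis windowing) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the sine window, the function names block_tda and overlap_add, and the overall sign convention are choices made for this example and are not taken from the specification. In particular, the additional sign inversion of the second and third N/2 samples listed for the calculator 129 is omitted, so that overlap/adding consecutive outputs directly reproduces the input.

```python
import math

def sine_window(two_n):
    # Assumed window: sine window, which satisfies w[n]^2 + w[n+N]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / two_n) for n in range(two_n)]

def block_tda(block, window):
    """Analysis windowing, folding, unfolding and synthesis windowing of 2N samples."""
    q = len(block) // 4                                   # quarter length N/2
    x = [s * w for s, w in zip(block, window)]            # analysis window
    a, b, c, d = x[:q], x[q:2 * q], x[2 * q:3 * q], x[3 * q:]
    # Folding from 2N to N samples: the time-reversed first quarter is
    # subtracted from the second quarter, the time-reversed fourth quarter
    # is added to the third quarter.
    second = [b[i] - a[q - 1 - i] for i in range(q)]
    third = [c[i] + d[q - 1 - i] for i in range(q)]
    # Unfolding from N to 2N samples: the first quarter is the time-reversed,
    # inverted second quarter; the fourth is the time-reversed third quarter.
    first = [-v for v in reversed(second)]
    fourth = list(reversed(third))
    y = first + second + third + fourth
    return [s * w for s, w in zip(y, window)]             # synthesis window

def overlap_add(signal, n):
    """Apply block_tda to blocks of 2N samples with hop N and overlap/add them."""
    window = sine_window(2 * n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        for i, v in enumerate(block_tda(signal[start:start + 2 * n], window)):
            out[start + i] += v
    return out

signal = [math.sin(0.3 * i) + 0.1 * i for i in range(40)]
restored = overlap_add(signal, 8)
```

In this sketch every sample covered by two blocks (here indices 8 to 31) is restored, because the aliasing terms of neighbouring blocks cancel; the first and last N samples remain aliased since they are covered by one block only.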

Abstract

An apparatus for decoding data segments representing a time-domain data stream, a data segment being encoded in the time domain or in the frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples. The apparatus comprises a time-domain decoder for decoding a data segment being encoded in the time domain and a processor for processing the data segment being encoded in the frequency domain and output data of the time-domain decoder to obtain overlapping time-domain data blocks. The apparatus further comprises an overlap/add-combiner for combining the overlapping time-domain data blocks to obtain a decoded data segment of the time-domain data stream.

Description

Encoder, Decoder and Methods for Encoding and Decoding Data Segments representing a time-domain Data Stream
Description
The present invention is in the field of coding, where different characteristics of data to be encoded are utilized for coding rates, as for example in video and audio coding.
State of the art coding strategies can make use of characteristics of a data stream to be encoded. For example, in audio coding, perception models are used in order to compress source data with almost no noticeable loss of quality or degradation when replayed. Modern perceptual audio coding schemes, such as for example, MPEG-2/4 AAC (MPEG = Moving Pictures Expert Group, AAC = Advanced Audio Coding), cf. Generic Coding of Moving Pictures and Associated Audio: Advanced Audio Coding, International Standard 13818-7, ISO/IEC JTC1/SC29/WG11 Moving Pictures Expert Group, 1997, may use filter banks, such as for example the Modified Discrete Cosine Transform (MDCT), for representing the audio signal in the frequency domain.
In the frequency domain quantization of frequency coefficients can be carried out, according to a perceptual model. Such coders can provide excellent perceptual audio quality for general types of audio signals as, for example, music. On the other hand, modern speech coders, such as, for example, ACELP (ACELP = Algebraic Code Excited Linear Prediction), use a predictive approach, and in this way may represent the audio/speech signal in the time domain. Such speech coders can model the characteristics of the human speech production process, i.e. the human vocal tract and, consequently, achieve excellent performance for speech signals at low bit rates. Conversely, perceptual audio coders do not achieve the level of performance offered by speech coders for speech signals coded at low bit rates, and using speech coders to represent general audio signals/music results in significant quality impairments.
Conventional concepts provide a layered combination in which all partial coders are always active, i.e. time-domain and frequency-domain encoders, and the final output signal is calculated by combining the contributions of the partial coders for a given processed time frame. A popular example of layered coding is MPEG-4 scalable speech/audio coding with a speech coder as the base layer and a filterbank-based enhancement layer, cf. Bernhard Grill, Karlheinz Brandenburg, "A Two-or Three-Stage Bit-Rate Scalable Audio Coding System", Preprint Number 4132, 99th Convention of the AES (September 1995).
Conventional frequency-domain encoders can make use of MDCT filterbanks. The MDCT has become a dominant filterbank for conventional perceptual audio coders because of its advantageous properties. For example, it can provide a smooth cross-fade between processing blocks. Even if a signal in each processing block is altered differently, for example due to quantization of spectral coefficients, no blocking artifacts due to abrupt transitions from block to block occur because of the windowed overlap/add operations. The MDCT uses the concept of time-domain aliasing cancellation (TDAC).
The MDCT is a Fourier-related transform based on the type-IV discrete cosine transform, with the additional property of being lapped. It is designed to be performed in consecutive blocks of a larger data set, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to an energy-compaction quality of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid said artifacts stemming from the block boundaries. As a lapped transform, the MDCT is a bit unusual compared to other Fourier-related transforms in that it has half as many outputs as inputs, instead of the same number. In particular, 2N real numbers are transformed into N real numbers, where N is a positive integer.
The inverse MDCT is also known as IMDCT. Because there are different numbers of inputs and outputs, at first glance it might seem that the MDCT should not be invertible. However, perfect invertibility is achieved by adding the overlapped IMDCTs of subsequent overlapping blocks, causing the errors to cancel and the original data to be retrieved, i.e. achieving TDAC.
Therewith, the number of spectral values at the output of a filterbank is equal to the number of time-domain input values at its input which is also referred to as critical sampling.
An MDCT filterbank provides high frequency selectivity and enables a high coding gain. The properties of overlapping of blocks and critical sampling can be achieved by utilizing the technique of time-domain aliasing cancellation, cf. J. Princen, A. Bradley, "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation", IEEE Trans. ASSP, ASSP-34(5):1153-1161, 1986. Fig. 4 illustrates these effects of an MDCT. Fig. 4 shows an MDCT input signal, in terms of an impulse along a time axis 400 at the top. The input signal 400 is then transformed by two consecutive windowing and MDCT blocks, where the windows 410 are illustrated underneath the input signal 400 in Fig. 4. The back transformed individual windowed signals are displayed in Fig. 4 by the time lines 420 and 425.
After the inverse MDCT, the first block produces an aliasing component with positive sign 420, the second block produces an aliasing component with the same magnitude and a negative sign 425. The aliasing components cancel each other after addition of the two output signals 420 and 425 as shown in the final output 430 at the bottom of Fig. 4.
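The cancellation illustrated in Fig. 4 can be reproduced numerically with a minimal sketch. The direct O(N^2) evaluation of the transforms, the sine window, the 2/N scaling of the inverse transform and the block length N = 8 are assumptions chosen for this illustration only.

```python
import math

def mdct(x):
    """Forward MDCT: 2N real inputs are mapped to N real outputs."""
    n = len(x) // 2
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n)) for k in range(n)]

def imdct(spec):
    """Inverse MDCT: N spectral values give 2N time-aliased samples."""
    n = len(spec)
    return [2.0 / n * sum(spec[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n)) for i in range(2 * n)]

N = 8
w = [math.sin(math.pi * (i + 0.5) / (2 * N)) for i in range(2 * N)]  # sine window

signal = [0.0] * (3 * N)
signal[N + 2] = 1.0  # impulse inside the region shared by both blocks

def windowed_round_trip(block):
    # Analysis window, MDCT, IMDCT, synthesis window.
    spec = mdct([s * wi for s, wi in zip(block, w)])
    return [v * wi for v, wi in zip(imdct(spec), w)]

y1 = windowed_round_trip(signal[0:2 * N])   # first block
y2 = windowed_round_trip(signal[N:3 * N])   # second block, shifted by N

out = [0.0] * (3 * N)
for i in range(2 * N):
    out[i] += y1[i]
    out[N + i] += y2[i]
```

With the impulse at index N + 2, its mirror image appears at index N + 5 of the overlap region: y1 carries it with a positive sign and y2 with a negative sign of equal magnitude, so the sum in out contains only the original impulse.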
In "Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec", 3GPP TS 26.290V6.3.0, 2005-06, Technical Specification the AMR-WB+ (AMR-WB = Adaptive Multi-Rate Wideband) codec is specified. According to section 5.2, the encoding algorithm at the core of the AMR-WB+ codec is based on a hybrid ACELP/TCX (TCX = Transform coded Excitation) model. For every block of an input signal the encoder decides, either in an open loop or a closed loop mode which encoding model, i.e. ACELP or TCX, is best. The ACELP model is a time-domain, predictive encoder, best suited for speech and transient signals. The AMR-WB encoder is used in ACELP modes. Alternatively, the TCX model is a transform based encoder, and is more appropriate for typical music samples.
Specifically, the AMR-WB+ uses a discrete Fourier transform (DFT) for the transform coding mode TCX. In order to allow a smooth transition between adjacent blocks, a windowing and overlap is used. This windowing and overlap is necessary both for transitions between different coding modes (TCX/ACELP) and for consecutive TCX frames. Thus, the DFT together with the windowing and overlap represents a filterbank that is not critically sampled. The filterbank produces more frequency values than the number of new input samples, cf. Fig. 4 in 3GPP TS 26.290V6.3.0 (3GPP = Third Generation Partnership Project, TS = Technical Specification). Each TCX frame utilizes an overlap of 1/8 of the frame length which equals the number of new input samples. Consequently, the corresponding length of the DFT is 9/8 of the frame length.
Considering the non-critically sampled DFT filterbank in the TCX, i.e. the number of spectral values at the output of the filterbank is larger than the number of time-domain input values at its input, this frequency domain coding mode is different from audio codecs such as AAC (AAC = Advanced Audio Coding) which utilizes an MDCT, a critically sampled lapped transform.
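The difference in sampling can be expressed as a short calculation. The frame length of 1024 samples and the helper name transform_length are hypothetical and serve only to make the 9/8 ratio stated above concrete.

```python
from fractions import Fraction

def transform_length(new_samples, overlap_fraction):
    """Transform length when a frame of new samples is extended by an overlap."""
    return new_samples * (1 + overlap_fraction)

frame = 1024  # hypothetical number of new input samples per frame

# TCX: overlap of 1/8 of the frame length, so the DFT covers 9/8 of the frame.
tcx_length = transform_length(frame, Fraction(1, 8))

# MDCT: each block of 2N inputs advances by N samples and yields N outputs,
# so the number of spectral values equals the number of new input samples.
mdct_outputs = frame

oversampling = Fraction(tcx_length, mdct_outputs)
```

Under these assumptions the DFT-based filterbank produces 1152 values per 1024 new samples, whereas the critically sampled MDCT produces exactly 1024.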
The Dolby E codec is described in Fielder, Louis D.; Todd, Craig C., "The Design of a Video Friendly Audio Coding System for Distribution Applications", Paper Number 17-008, The AES 17th International Conference: High-Quality Audio Coding (August 1999) and Fielder, Louis D.; Davidson, Grant A., "Audio Coding Tools for Digital Television Distribution", Preprint Number 5104, 108th Convention of the AES (January 2000). The Dolby E codec utilizes the MDCT filterbank. In the design of this coding, special focus was put on the possibility to perform editing in the coding domain. To achieve this, special alias-free windows are used. At the boundaries of these windows a smooth cross-fade or splicing of different signal portions is possible. In the above-referenced documents it is, for example, outlined, cf. section 3 of "The Design of a Video Friendly Audio Coding System for Distribution Applications", that this would not be possible by simply using the usual MDCT windows which introduce time-domain aliasing. However, it is also described that the removal of aliasing comes at the cost of an increased number of transform coefficients, indicating that the resulting filterbank does not have the property of critical sampling anymore.
It is the object of the present invention to provide a more efficient concept for encoding and decoding data segments.
The object is achieved by an apparatus for decoding according to claim 1, a method for decoding according to claim 22, an apparatus for generating an encoded data stream according to claim 24 and a method for generating an encoded data stream according to claim 35. The present invention is based on the finding that a more efficient encoding and decoding concept can be utilized by using combined time-domain and frequency-domain encoders, respectively decoders. The problem of time aliasing can be efficiently combated by transforming time-domain data to the frequency domain in the decoder and by combining the resulting transformed frequency-domain data with the decoded frequency-domain data received. Overheads can be reduced by adapting overlapping regions of overlap windows being applied to data segments to coding domain changes. Using windows with smaller overlapping regions can be beneficial when using time-domain encoding, respectively when switching from or to time-domain encoding.
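The effect of the overlap size on the overhead can be estimated with a small calculation. It assumes, as stated for Figs. 3a, 3b and 3d, that each transition to or from time-domain coding costs half of the window overlap N_ovl in additional samples; the function names and the numeric values N = 1024 and N_ovl = 128 are hypothetical.

```python
def switching_overhead(n_ovl, transitions=2):
    """Additional samples caused by switching: N_ovl/2 per transition."""
    return transitions * n_ovl // 2

def time_domain_samples(frames, n, n_ovl):
    """Samples to be coded in the time domain to replace `frames` frames of N samples."""
    return frames * n + switching_overhead(n_ovl)

N = 1024  # hypothetical frame length

# Long windows with full overlap (N_ovl = N): overhead of N samples in total.
one_frame_long = time_domain_samples(1, N, N)     # 2N samples, as in Fig. 3a
two_frames_long = time_domain_samples(2, N, N)    # 3N samples, as in Fig. 3b

# Transition windows with a small overlap, here assumed to be 128 samples.
one_frame_short = time_domain_samples(1, N, 128)
```

Under these assumptions the overhead for a single time-domain coded frame drops from N = 1024 samples to 128 samples when the transition windows with the smaller overlap are used.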
Embodiments can provide a universal audio encoding and decoding concept that achieves improved performance for both types of input signals, such as speech signals and music signals. Embodiments can take advantage by combining multiple coding approaches, e.g. time-domain and frequency-domain coding concepts. Embodiments can efficiently combine filterbank based and time-domain based coding concepts into a single scheme. Embodiments may result in a combined codec which can, for example, be able to switch between an audio codec for music-like audio content and a speech codec for speech-like content. Embodiments may utilize this switching frequently, especially for mixed content.
Embodiments of the present invention may provide the advantage that no switching artifacts occur. In embodiments the amount of additional transmit data, or additionally coded samples, for a switching process can be minimized in order to avoid a reduced efficiency during this phase of operation. Therewith the concept of switched combination of partial coders is different from that of the layered combination in which all partial coders are always active. In the following, embodiments of the present invention will be described in detail using the accompanying Figures, in which
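A switching decision of this kind can be sketched with a simple threshold rule. The transition measure used here (peak-to-mean ratio of sub-block energies), the threshold value and the mapping of a high level of transience to the time-domain encoder are assumptions made purely for illustration; the specification does not prescribe a particular measure.

```python
import math

def transience(segment, parts=4):
    """Hypothetical transition measure: peak-to-mean ratio of sub-block energies."""
    size = len(segment) // parts
    energies = [sum(s * s for s in segment[i * size:(i + 1) * size])
                for i in range(parts)]
    mean = sum(energies) / parts
    return max(energies) / mean if mean > 0 else 1.0

def choose_encoder(segment, threshold=2.0):
    """Select the encoder whose output is included in the encoded data stream."""
    # Assumed rule: transience above the threshold selects time-domain coding.
    if transience(segment) > threshold:
        return "time-domain"
    return "frequency-domain"

steady = [math.sin(0.5 * i) for i in range(64)]   # stationary tone
click = [0.0] * 64
click[40] = 1.0                                   # isolated transient

decisions = [choose_encoder(steady), choose_encoder(click)]
```

In this sketch the stationary tone is assigned to the frequency-domain encoder and the isolated transient to the time-domain encoder.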
Fig. 1a shows an embodiment of an apparatus for decoding;
Fig. 1b shows another embodiment of an apparatus for decoding;
Fig. 1c shows another embodiment of an apparatus for decoding;
Fig. 1d shows another embodiment of an apparatus for decoding;
Fig. 1e shows another embodiment of an apparatus for decoding;
Fig. 1f shows another embodiment of an apparatus for decoding;
Fig. 2a shows an embodiment of an apparatus for encoding;
Fig. 2b shows another embodiment of an apparatus for encoding;
Fig. 2c shows another embodiment of an apparatus for encoding;
Fig. 3a illustrates overlapping regions when switching between frequency-domain and time-domain coding for the duration of one window;
Fig. 3b illustrates the overlapping regions when switching between frequency-domain coding and time-domain coding for a duration of two windows;
Fig. 3c illustrates multiple windows with different overlapping regions;
Fig. 3d illustrates the utilization of windows with different overlapping regions in an embodiment; and
Fig. 4 illustrates time-domain aliasing cancellation when using MDCT.
Fig. Ia shows an apparatus 100 for decoding data segments representing a time-domain data stream, a data segment being encoded in a time domain or in a frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples. This data stream could, for example, correspond to an audio stream, wherein some of the data blocks are encoded in the time domain and other ones are encoded in the frequency domain. Data blocks or segments which have been encoded in the frequency domain, may represent time-domain data samples of overlapping data blocks.
The apparatus 100 comprises a time-domain decoder 110 for decoding a data segment being encoded in the time domain. Furthermore, the apparatus 100 comprises a processor 120 for processing the data segment being encoded in the frequency domain and output data of the time-domain decoder 110 to obtain overlapping time-domain data blocks. Moreover, the apparatus 100 comprises an overlap/add- combiner 130 for combining the overlapping time-domain data blocks to obtain the decoded data segments of the time- domain data stream.
Fig. Ib shows another embodiment of the apparatus 100. In embodiments the processor 120 may comprise a frequency- domain decoder 122 for decoding data segments being encoded in the frequency domain to obtain frequency-domain data segments. Moreover, in embodiments the processor 120 may comprise a time-domain to frequency-domain converter 124 for converting the output data of the time-domain decoder 110 to obtain converted frequency-domain data segments.
Furthermore, in embodiments the processor 120 may comprise a frequency-domain combiner 126 for combining the frequency-domain segments and the converted frequency- domain data segments to obtain a frequency-domain data stream. The processor 120 may further comprise a frequency- domain to time-domain converter 128 for converting the frequency-domain data stream to overlapping time-domain data blocks which can then be combined by the overlap/add- combiner 130.
Embodiments may utilize an MDCT filterbank, as for example, used in MPEG-4 AAC, without any modifications, especially without giving up the property of critical sampling. Embodiments may provide optimum coding efficiency. Embodiments may achieve a smooth transition to a time- domain codec compatible with the established MDCT windows while introducing no additional switching artifacts and only a minimal overhead.
Embodiments may keep the time-domain aliasing in the filterbank and intentionally introduce a corresponding time-domain aliasing into the signal portions coded by the time-domain codec. Thus, resulting components of the time-domain aliasing can cancel each other out in the same way as they do for two consecutive frames of the MDCT spectra.
Fig. 1c illustrates another embodiment of an apparatus 100. According to Fig. 1c the frequency-domain decoder 122 can comprise a re-quantization stage 122a. Moreover, the time-domain to frequency-domain converter 124 can comprise a cosine modulated filterbank, an extended lapped transform, a low delay filterbank or a polyphase filterbank. The embodiment shown in Fig. 1c illustrates that the time-domain to frequency-domain converter 124 can comprise an MDCT 124a.
Furthermore, Fig. 1c depicts that the frequency-domain combiner 126 may comprise an adder 126a. As shown in Fig. 1c, the frequency-domain to time-domain converter 128 can comprise a cosine modulated filterbank or, respectively, an inverse MDCT 128a. The data stream comprising time-domain encoded and frequency-domain encoded data segments may be generated by an encoder which will be further detailed below. The switching between frequency-domain encoding and time-domain encoding can be achieved by encoding some portions of the input signal with a frequency-domain encoder and some input signal portions with a time-domain encoder. The embodiment of the apparatus 100 depicted in Fig. 1c illustrates the principal structure of a corresponding apparatus 100 for decoding. In other embodiments the re-quantization 122a and the inverse modified discrete cosine transform 128a can represent a frequency-domain decoder.
As indicated in Fig. 1c, for signal portions where the time-domain decoder 110 takes over, the time-domain output of the time-domain decoder 110 can be transformed by the forward MDCT 124a. The time-domain decoder may utilize a prediction filter to decode the time-domain encoded data. Some overlap in the input of the MDCT 124a and thus some overhead may be introduced here. In the following, embodiments will be described which reduce or minimize this overhead.
In principle, the embodiment shown in Fig. 1c also comprises an operation mode where both codecs can operate in parallel. In embodiments the processor 120 can be adapted for processing a data segment being encoded in parallel in the time domain and in the frequency domain. In this way the signal can partially be coded in the frequency domain and partially in the time domain, similar to a layered coding approach. The resulting signals are then added up in the frequency domain, compare the adder 126a of the frequency-domain combiner 126. Nevertheless, embodiments may carry out a mode of operation which switches exclusively between the two codecs and has only a preferably minimum number of samples where both codecs are active, in order to obtain the best possible efficiency.
In Fig. 1c, the output of the time-domain decoder 110 is transformed by the MDCT 124a, followed by the IMDCT 128a. In another embodiment, these two steps may be advantageously combined into a single step in order to reduce complexity. Fig. 1d illustrates an embodiment of an apparatus 100 illustrating this approach. The apparatus 100 shown in Fig. 1d illustrates that the processor 120 may comprise a calculator 129 for calculating overlapping time-domain data blocks based on the output data of the time-domain decoder 110. The processor 120 or the calculator 129 can be adapted for reproducing a property, in particular an overlapping property, of the frequency-domain to time-domain converter 128 based on the output data of the time-domain decoder 110, i.e. the processor 120 or calculator 129 may reproduce an overlapping characteristic of time-domain data blocks similar to an overlapping characteristic produced by the frequency-domain to time-domain converter 128. Moreover, the processor 120 or calculator 129 can be adapted for reproducing time-domain aliasing similar to time-domain aliasing introduced by the frequency-domain to time-domain converter 128 based on the output data of the time-domain decoder 110.
The frequency-domain to time-domain converter 128 can then be adapted for converting the frequency-domain data segments provided by the frequency-domain decoder 122 to overlapping time-domain data blocks. The overlap/add-combiner 130 can be adapted for combining data blocks provided by the frequency-domain to time-domain converter 128 and the calculator 129 to obtain the decoded data segments of the time-domain data stream.
The calculator 129 may comprise a time-domain aliasing stage 129a as illustrated in the embodiment shown in Fig. 1e. The time-domain aliasing stage 129a can be adapted for time-aliasing output data of the time-domain decoder to obtain the overlapping time-domain data blocks.
For the time-domain encoded data a combination of the MDCT and the IMDCT can make the process in embodiments much simpler in both structure and computational complexity as only the process of time-domain aliasing (TDA) remains in embodiments. This efficient process can be based on a number of observations. The windowed MDCT of the input segments of 2N samples can be decomposed into three steps.
First, the input signal is multiplied by an analysis window.
Second, the result is then folded down from 2N samples to N samples. For the MDCT, this process implies that the first quarter of the samples is combined, i.e. subtracted, in time-reversed order with the second quarter of the samples, and that the fourth quarter of the samples is combined, i.e. added, with the third quarter of the samples in time-reversed order. The result is the time-aliased, down-sampled signal in the modified second and third quarter of the signal, comprising N samples.
Third, the down-sampled signal is then transformed using an orthogonal DCT-like transform mapping N input to N output samples to form the final MDCT output.
The windowed IMDCT reconstruction of an input sequence of N spectral samples can likewise be decomposed into three steps. First, the input sequence of N spectral samples is transformed using an orthogonal inverse DCT-like transform mapping N input to N output samples.
Second, the result is unfolded from N to 2N samples by writing the inverse DCT transformed values into the second and third quarter of a 2N samples output buffer, filling the first quarter with the time-reversed and inverted version of the second quarter, and the fourth quarter with a time-reversed version of the third quarter, respectively.
Third, the resulting 2N samples are multiplied with the synthesis window to form the windowed IMDCT output.
Thus, a concatenation of the windowed MDCT and the windowed IMDCT may be efficiently carried out in embodiments by the sequence of the first and second steps of the windowed MDCT and the second and third steps of the windowed IMDCT. The third step of the MDCT and the first step of the IMDCT can be omitted entirely in embodiments because they are inverse operations with respect to each other and thus cancel out. The remaining steps can be carried out in the time domain only, and thus embodiments using this approach can be substantially low in computational complexity.
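The cancellation of the third MDCT step against the first IMDCT step can be checked numerically. The following is a minimal sketch under stated assumptions: the text only speaks of an orthogonal "DCT-like transform", so an orthonormally scaled DCT-IV is assumed here, and the function name `dct4` and the sample values are purely illustrative. With this scaling the transform is its own inverse, so applying it twice returns the folded block unchanged.

```python
import math

def dct4(v):
    """Orthonormal DCT-IV of N samples (assumed transform; direct O(N^2) form)."""
    n = len(v)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(v[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5))
                    for j in range(n))
        for k in range(n)
    ]

# A folded, time-aliased block of N = 4 samples (arbitrary example values).
folded = [1.0, -2.0, 0.5, 3.0]

# Forward transform (third MDCT step) followed by the inverse transform
# (first IMDCT step); for the orthonormal DCT-IV both are the same operation.
round_trip = dct4(dct4(folded))

# The two transform steps cancel up to floating-point rounding.
print(all(abs(r - f) < 1e-9 for r, f in zip(round_trip, folded)))
```

Since the pair of transforms reduces to the identity, only the windowing, folding and unfolding steps need to be computed for the time-domain encoded portions.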
For one block of MDCT and consecutive IMDCT, the first and second steps of the MDCT and the second and third steps of the IMDCT can be written as a multiplication with the following sparse 2Nx2N matrix.
[Equation image imgf000015_0001 (sparse 2Nx2N matrix) not reproduced]
In other words, the calculator 129 can be adapted for segmenting the output of the time-domain decoder 110 in calculator segments comprising 2N sequential samples, applying weights to the 2N samples according to an analysis windowing function, subtracting the first N/2 samples in reversed order from the second N/2 samples, adding the last N/2 samples in reversed order to the third N/2 samples, inverting the second and third N/2 samples, replacing the first N/2 samples with the time-reversed and inverted version of the second N/2 samples, replacing the fourth N/2 samples with the time-reversed version of the third N/2 samples, and applying weights to the 2N samples according to a synthesis windowing function.
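The sequence of operations of the calculator 129 can be sketched as follows. This is a minimal sketch and not the implementation of the embodiments: it assumes a sine window for both analysis and synthesis, blocks of 2N samples with a hop of N, and an even N; the function names `sine_window`, `fold`, `unfold`, `block_tda` and `overlap_add` are hypothetical. Folding and unfolding follow the second MDCT step and the second IMDCT step as described above. The intermediate sign inversion mentioned in the text is omitted, on the assumption that it belongs to the sign convention of the omitted transform step.

```python
import math

def sine_window(n):
    """Sine window of length 2N (assumed; satisfies w[i]^2 + w[i+N]^2 = 1)."""
    return [math.sin(math.pi * (i + 0.5) / (2 * n)) for i in range(2 * n)]

def fold(u):
    """Fold 2N samples to N: second' = b - rev(a), third' = c + rev(d)."""
    h = len(u) // 4
    a, b, c, d = u[:h], u[h:2 * h], u[2 * h:3 * h], u[3 * h:]
    second = [b[i] - a[h - 1 - i] for i in range(h)]
    third = [c[i] + d[h - 1 - i] for i in range(h)]
    return second + third

def unfold(v):
    """Unfold N samples to 2N: [-rev(second'), second', third', rev(third')]."""
    h = len(v) // 2
    second, third = v[:h], v[h:]
    first = [-second[h - 1 - i] for i in range(h)]
    fourth = [third[h - 1 - i] for i in range(h)]
    return first + second + third + fourth

def block_tda(x, window):
    """Analysis windowing, folding, unfolding and synthesis windowing."""
    u = [w * s for w, s in zip(window, x)]
    y = unfold(fold(u))
    return [w * s for w, s in zip(window, y)]

def overlap_add(signal, n):
    """Apply block_tda to blocks of 2N samples with hop N and overlap/add them."""
    window = sine_window(n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        block = block_tda(signal[start:start + 2 * n], window)
        for i, value in enumerate(block):
            out[start + i] += value
    return out

n = 4
signal = [float((7 * i) % 5 - 2) for i in range(4 * n)]
result = overlap_add(signal, n)

# Samples covered by two overlapping blocks are recovered: the aliasing cancels.
print(all(abs(result[i] - signal[i]) < 1e-9 for i in range(n, 3 * n)))
```

In this sketch the first and last N samples are covered by only one block and therefore remain time-aliased, which is why a neighbouring block carrying the matching aliasing is needed at each transition.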
In other embodiments the overlap/add-combiner 130 can be adapted for applying weights according to a synthesis windowing function to overlapping time-domain data blocks provided by the frequency-domain to time-domain converter 128. Furthermore, the overlap/add-combiner 130 can be adapted for applying weights according to a synthesis windowing function being adapted to the size of an overlapping region of consecutive overlapping time-domain data blocks.
The calculator 129 may be adapted for applying weights to the 2N samples according to an analysis windowing function being adapted to the size of an overlapping region of consecutive overlapping time-domain data blocks, and the calculator may be further adapted for applying weights to the 2N samples according to a synthesis window function being adapted to the size of the overlapping region.
In embodiments the size of an overlapping region of two consecutive time-domain data blocks which are encoded in the frequency domain can be larger than the size of an overlapping region of two consecutive time-domain data blocks of which one is encoded in the frequency domain and one is encoded in the time domain.
In embodiments, the size of the data segments can be adapted to the size of the overlapping regions. Embodiments may have an efficient implementation of a combined MDCT/IMDCT processing, i.e. a block TDA comprising the operations of analysis windowing, folding and unfolding, and synthesis windowing. Moreover, in embodiments some of these steps may be partially or fully combined in an actual implementation.
Another embodiment of an apparatus 100 as shown in Fig. 1f illustrates that an apparatus 100 may further comprise a bypass 140 for the processor 120 and the overlap/add-combiner 130, the bypass being adapted for bypassing the processor 120 and the overlap/add-combiner 130 when non-overlapping consecutive time-domain data blocks occur in data segments which are encoded in the time domain. If multiple data segments are encoded in the time domain, i.e. no conversion to the frequency domain may be necessary for decoding consecutive data segments, they may be transmitted without any overlapping. For these cases the embodiments as shown in Fig. 1f may bypass the processor 120 and the overlap/add-combiner 130. In embodiments the overlapping of blocks can be determined according to the AAC specifications.
Fig. 2a shows an embodiment of an apparatus 200 for generating an encoded data stream based on a time-domain data stream, the time-domain data stream having samples of a signal. The time-domain data stream could, for example, correspond to an audio signal, comprising speech sections and music sections or both at the same time. The apparatus 200 comprises a segment processor 210 for providing data segments from the data stream, two consecutive data segments having a first or a second overlapping region, the second overlapping region being smaller than the first overlapping region. The apparatus 200 further comprises a time-domain encoder 220 for encoding a data segment in the time domain and a frequency-domain encoder 230 for applying weights to samples of the time-domain data stream according to a first or a second windowing function to obtain a windowed data segment, the first and second windowing functions being adapted to the first and second overlapping regions, and for encoding the windowed data segment in the frequency domain.
Furthermore, the apparatus 200 comprises a time-domain data analyzer 240 for determining a transition indication associated with a data segment and a controller 250 for controlling the apparatus such that for data segments having a first transition indication, output data of the time-domain encoder 220 is included in the encoded data stream and for data segments having a second transition indication, output data of the frequency-domain encoder 230 is included in the encoded data stream.
In embodiments the time-domain data analyzer 240 may be adapted for determining the transition indication from the time-domain data stream or from data segments provided by the segment processor 210. These embodiments are indicated in Fig. 2b. In Fig. 2b it is illustrated that the time-domain data analyzer 240 may be coupled to the input of the segment processor 210 in order to determine the transition indication from the time-domain data stream. In another embodiment the time-domain data analyzer 240 may be coupled to the output of the segment processor 210 in order to determine the transition indication from the data segments. In embodiments the time-domain data analyzer 240 can be coupled directly to the segment processor 210 in order to determine the transition indication from data provided directly by the segment processor. These embodiments are indicated by the dotted lines in Fig. 2b.
In embodiments the time-domain data analyzer 240 can be adapted for determining a transition measure, the transition measure being based on a level of transience in the time-domain data stream or the data segments wherein the transition indicator may indicate whether the level of transience exceeds a predetermined threshold.
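The text does not specify how the level of transience is measured. As one hypothetical illustration only, the sketch below uses the ratio of the largest sub-block energy to the mean sub-block energy of a segment and compares it with a threshold. The measure itself, the function names `transience_level` and `transition_indicator`, the number of sub-blocks and the threshold value of 2.0 are all assumptions and are not taken from the embodiments.

```python
def transience_level(segment, sub_blocks=4):
    """Ratio of the largest sub-block energy to the mean sub-block energy."""
    size = len(segment) // sub_blocks
    energies = [
        sum(s * s for s in segment[i * size:(i + 1) * size])
        for i in range(sub_blocks)
    ]
    mean_energy = sum(energies) / sub_blocks
    return max(energies) / mean_energy if mean_energy > 0.0 else 0.0

def transition_indicator(segment, threshold=2.0):
    """True if the (assumed) transience measure exceeds the threshold."""
    return transience_level(segment) > threshold

steady = [1.0, -1.0] * 8           # constant energy over the whole segment
onset = [0.0] * 12 + [1.0] * 4     # energy concentrated at the segment end

print(transition_indicator(steady), transition_indicator(onset))
```

Which encoder a given indicator value selects is left open here, since that mapping is a matter for the controller 250.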
Fig. 2c shows another embodiment of the apparatus 200. In the embodiments shown in Fig. 2c the segment processor 210 can be adapted for providing data segments with the first and the second overlapping regions, the time-domain encoder 220 can be adapted for encoding all data segments, the frequency-domain encoder 230 may be adapted for encoding all windowed data segments and the controller 250 can be adapted for controlling the time-domain encoder 220 and the frequency-domain encoder 230 such that for data segments having a first transition indication, output data of the time-domain encoder 220 is included in the encoded data stream and for data segments having a second transition indication, output data of the frequency-domain encoder 230 is included in the encoded data stream. In other embodiments both output data of the time-domain encoder 220 and the frequency-domain encoder 230 may be included in the encoded data stream. The transition indicator may indicate whether a data segment is rather associated or correlated with a speech signal or with a music signal. In embodiments the frequency-domain encoder 230 may be used for more music-like data segments and the time-domain encoder 220 may be used for more speech-like data segments. In embodiments parallel encoding may be utilized, e.g. for a speech-like audio signal having background music.
In the embodiment depicted in Fig. 2c, multiple possibilities are conceivable for the controller 250 to control the multiple components within the apparatus 200. The different possibilities are indicated by dotted lines in Fig. 2c. For example, the controller 250 could be coupled to the time-domain encoder 220 and the frequency-domain encoder 230 in order to choose which encoder should produce an encoded output based on the transition indication. In another embodiment the controller 250 may control a switch at the outputs of the time-domain encoder 220 and the frequency-domain encoder 230.
In such an embodiment both the time-domain encoder 220 and the frequency-domain encoder 230 may encode all data segments and the controller 250 may be adapted for choosing, via said switch which is coupled to the outputs of the encoders, which encoded data segment should be included in the encoded data stream, based on coding efficiency or, respectively, the transition indication. In other embodiments the controller 250 can be adapted for controlling the segment processor 210 for providing the data segments either to the time-domain encoder 220 or the frequency-domain encoder 230. The controller 250 may also control the segment processor 210 in order to set overlapping regions for a data segment. In other embodiments the controller 250 may be adapted for controlling a switch between the segment processor 210 and the time-domain encoder 220 and the frequency-domain encoder 230, respectively. The controller 250 could then influence the switch so as to direct data segments to either one of the encoders or to both. The controller 250 can be further adapted to set the windowing functions for the frequency-domain encoder 230 along with the overlapping regions and coding strategies.
Moreover, in embodiments the frequency-domain encoder 230 can be adapted for applying weights of window functions according to AAC specifications. The frequency-domain encoder 230 can be adapted for converting a windowed data segment to the frequency domain to obtain a frequency-domain data segment. Moreover, the frequency-domain encoder 230 can be adapted for quantizing the frequency-domain data segments and, furthermore, the frequency-domain encoder 230 may be adapted for evaluating the frequency-domain data segments according to a perceptual model.
The frequency-domain encoder 230 can be adapted for utilizing a cosine modulated filterbank, an extended lapped transform, a low-delay filterbank or a polyphase filterbank to obtain the frequency-domain data segments.
The frequency-domain encoder 230 may be adapted for utilizing an MDCT to obtain the frequency-domain data segments. The time-domain encoder 220 can be adapted for using a prediction model for encoding the data segments.
In embodiments where an MDCT in the frequency-domain encoder 230 operates in a so-called long block mode, i.e. the regular mode of operation that is used for coding non-transient input signals (compare the AAC specifications), the overhead introduced by the switching process may be high. This can be true for the cases where only one frame, i.e. a length/framing rate of N samples, should be coded using the time-domain encoder 220 instead of the frequency-domain encoder 230.
Then all the input values for the MDCT may have to be encoded with the time-domain encoder 220, i.e. 2N samples are available at the output of the time-domain decoder 110. Thus, an overhead of N additional samples could be introduced. Figs. 3a to 3d illustrate some conceivable overlapping regions of segments and the respectively applicable windowing functions. 2N samples may have to be coded with the time-domain encoder 220 in order to replace one block of frequency-domain encoded data. Fig. 3a illustrates an example where frequency-domain encoded data blocks use a solid line and time-domain encoded data uses a dotted line. Underneath the windowing functions, data segments are depicted which can be encoded in the frequency domain (solid boxes) or in the time domain (dotted boxes). This representation will be referred to in Figs. 3b to 3d as well.
Fig. 3a illustrates the case where data is encoded in the frequency domain, interrupted by one data segment which is encoded in the time domain, and the data segment after it is encoded in the frequency domain again. In order to provide the time-domain data which is necessary to cancel the time-domain aliasing evoked by the frequency-domain encoder 230 when switching from the frequency domain to the time domain, an overlap of half a segment size is required; the same holds for switching back from the time domain to the frequency domain. Assuming that the time-domain encoded data segment in Fig. 3a has a size of 2N, then at its start and at its end it overlaps with the frequency-domain encoded data by N/2 samples.
In case more than one subsequent frame can be encoded using the time-domain encoder 220, the overhead for the time-domain encoded section stays at N samples. This is illustrated in Fig. 3b, where two consecutive frames are encoded in the time domain and the overlapping regions at the beginning and the end of the time-domain encoded section have the same overlap as explained with respect to Fig. 3a. Fig. 3b shows the overlap structure in case of two frames encoded with the time-domain encoder 220. 3N samples have to be coded with the time-domain encoder 220 in this case.
This overhead can be reduced in embodiments by utilizing window switching, for example, according to the structure which is used in AAC. Fig. 3c illustrates a typical sequence of Long, Start, 8 Short and Stop windows, as they are used in AAC. From Fig. 3c it can be seen that the window sizes, the data segment sizes and, consequently, the size of the overlapping regions change with the different windows. The sequence depicted in Fig. 3c is an example for the sequence mentioned above.
Embodiments should not be limited to windows of the size of AAC windows; however, embodiments take advantage of windows with different overlapping regions and also of windows of different durations. In embodiments, transitions to and from short windows with a reduced overlap, as for example disclosed in Bernd Edler, "Codierung von Audiosignalen mit überlappender Transformation und adaptiven Fensterfunktionen", Frequenz, Vol. 43, No. 9, p. 252-256, September 1989 and in Generic Coding of Moving Pictures and Associated Audio: Advanced Audio Coding, International Standard 13818-7, ISO/IEC JTC1/SC29/WG11 Moving Pictures Expert Group, 1997, may be used to reduce the overhead for the transitions to and from the time-domain encoded regions, as illustrated in Fig. 3d. Fig. 3d illustrates four data segments, of which the first two and the last one are encoded in the frequency domain and the third one is encoded in the time domain. When switching from the frequency domain to the time domain, different windows with a reduced overlapping size are used, thereby reducing the overhead.
In embodiments the transition may be based on Start and Stop windows identical to the ones used in AAC. The corresponding windows for the transitions to and from the time-domain encoded regions are windows with only small regions of overlap. As a consequence, the overhead, i.e. the number of additional values to be transmitted due to the switching process, decreases substantially. Generally, the overhead may be Novl/2 for each transition with a window overlap of Novl samples. Thus, a transition with the regular fully-overlapped window as in AAC with Novl = 1024 incurs an overhead of 1024/2 = 512 samples for the left transition, i.e. the fade-in window, and 1024/2 = 512 samples for the right transition, i.e. the fade-out window, resulting in a total overhead of 1024 (= N) samples. Choosing a reduced overlap window like the AAC Short block windows with Novl = 128 only results in an overall overhead of 128 samples.
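The overhead figures above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming one fade-in and one fade-out transition with the same window overlap and a frame length of N = 1024 samples; the function names `transition_overhead` and `time_domain_samples` are hypothetical.

```python
def transition_overhead(n_ovl):
    """Overhead in samples for one transition with a window overlap of n_ovl."""
    return n_ovl // 2

def time_domain_samples(frames, n, n_ovl):
    """Samples coded by the time-domain encoder for `frames` frames of n samples,
    including the overhead of one fade-in and one fade-out transition."""
    return frames * n + 2 * transition_overhead(n_ovl)

N = 1024

print(2 * transition_overhead(1024))     # 1024: full overlap, total overhead = N
print(2 * transition_overhead(128))      # 128: reduced overlap of short windows
print(time_domain_samples(1, N, 1024))   # 2048 = 2N, one frame as in Fig. 3a
print(time_domain_samples(2, N, 1024))   # 3072 = 3N, two frames as in Fig. 3b
print(time_domain_samples(1, N, 128))    # 1152: one frame with reduced overlap
```

Under these assumptions the total overhead is independent of the number of consecutive time-domain encoded frames and depends only on the overlap chosen at the two transitions.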
Embodiments may utilize a filterbank in the frequency-domain encoder 230 such as, for example, the widely used MDCT filterbank; however, other embodiments may also be used with frequency-domain codecs based on other cosine-modulated filterbanks. This may comprise the derivatives of the MDCT, such as extended lapped transforms or low-delay filterbanks, as well as polyphase filterbanks, such as, for example, the one used in MPEG-1 Layer 1/2/3 audio codecs. In embodiments an efficient implementation of a forward/back-filterbank operation may take into account a specific type of window and folding/unfolding used in the filterbank. For every type of modulated filterbank the analysis stage may be implemented efficiently by a preprocessing step and a block transform, i.e. DCT-like or DFT, for the modulation. In embodiments the corresponding synthesis stage can be implemented using the corresponding inverse transform and a post-processing step. Embodiments may only use the pre- and post-processing steps for the time-domain encoded signal portions.
Embodiments of the present invention provide the advantage that a better coding efficiency can be achieved, since switching between the time-domain encoder 220 and the frequency-domain encoder 230 can be done while introducing very low overhead. In signal sections of subsequent time-domain encoding only, overlap may be omitted completely in embodiments. Embodiments of the apparatus 100 enable the corresponding decoding of the encoded data stream.
Embodiments thereby provide the advantage that a lower coding rate can be achieved for the same quality of, for example, an audio signal or, respectively, a higher quality can be achieved with the same coding rate, as the respective encoders can be adapted to the transience in the audio signal.
Depending on certain implementation requirements of the inventive methods, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disc, DVD or CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed. Generally, the present invention is, therefore, a computer program product having a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.
Reference List
100 apparatus for decoding
110 time-domain decoder
120 processor
122 frequency-domain decoder
122a re-quantization
124 time-domain to frequency-domain converter
124a modified discrete cosine transform
126 frequency-domain combiner
126a adder
128 frequency-domain to time-domain converter
128a inverse modified discrete cosine transform
129 calculator
129a time-domain aliasing stage
130 overlap/add-combiner
200 apparatus for encoding
210 segment processor
220 time-domain encoder
230 frequency-domain encoder
240 time-domain data analyzer
250 controller
400 modified discrete cosine transform input
410 windows
420 inverse modified discrete cosine transform output first window
425 inverse modified discrete cosine transform output second window
430 final output

Claims
1. An apparatus for decoding data segments representing a time-domain data stream, a data segment being encoded in the time domain or in the frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples, the apparatus comprising:
a time-domain decoder for decoding a data segment being encoded in the time domain;
a processor for processing the data segments being encoded in the frequency domain and output data of the time-domain decoder to obtain overlapping time-domain data blocks; and
an overlap/add-combiner for combining the overlapping time-domain data blocks to obtain the decoded data segments of the time-domain data stream.
2. The apparatus of claim 1, wherein the processor comprises a frequency-domain decoder for decoding data segments being encoded in the frequency domain to obtain frequency-domain data segments.
3. The apparatus of claim 1, wherein the processor is adapted for processing a data segment being encoded in the time domain and in the frequency domain in parallel.
4. The apparatus of claim 2, wherein the processor comprises a time-domain to frequency-domain converter for converting the output data of the time-domain decoder to obtain converted frequency-domain data segments.
5. The apparatus of claim 4, wherein the processor comprises a frequency-domain combiner for combining the frequency-domain data segments and the converted frequency-domain data segments to obtain a frequency-domain data stream.
6. The apparatus of claim 5, wherein the processor comprises a frequency-domain to time-domain converter for converting the frequency-domain data stream to overlapping time-domain data blocks.
7. The apparatus of claim 2, wherein the frequency domain decoder further comprises a re-quantization stage.
8. The apparatus of claim 4, wherein the time-domain to frequency-domain converter comprises a cosine modulated filterbank, an extended lapped transform, a low-delay filterbank, a polyphase filterbank or a modified discrete cosine transform.
9. The apparatus of claim 5, wherein the frequency-domain combiner comprises an adder.
10. The apparatus of claim 6, wherein the frequency-domain to time-domain converter comprises a cosine modulated filterbank or an inverse modified discrete cosine transform.
11. The apparatus of claim 1, wherein the time-domain decoder is adapted for using a prediction filter to decode a data segment encoded in the time domain.
12. The apparatus of claim 1, wherein the processor comprises a calculator for calculating overlapping time-domain data blocks based on the output data of the time-domain decoder.
13. The apparatus of claim 12, wherein the calculator is adapted for reproducing an overlapping property of the frequency-domain to time-domain converter based on the output data of the time-domain decoder.
14. The apparatus of claim 13, wherein the calculator is adapted for reproducing a time-domain aliasing characteristic of the frequency-domain to time-domain converter based on the output data of the time-domain decoder.
15. The apparatus of claim 6, wherein the frequency-domain to time-domain converter is adapted for converting the frequency-domain data segments provided by the frequency-domain decoder to overlapping time-domain data blocks.
16. The apparatus of claim 15, wherein the overlap/add-combiner is adapted for combining the overlapping time-domain data blocks provided by the frequency-domain to time-domain converter and the calculator to obtain decoded data segments of the time-domain data stream.
17. The apparatus of claim 8, wherein the calculator comprises a time-domain aliasing stage for time-aliasing output data of the time-domain decoder to obtain the overlapping time-domain data blocks.
18. The apparatus of claim 12, wherein the calculator is adapted for
segmenting the output of the time-domain decoder in calculator segments comprising 2N sequential samples,
applying weights to the 2N samples according to an analysis window function,
subtracting the first N/2 samples in reversed order from the second N/2 samples,
adding the last N/2 samples in reversed order to the third N/2 samples,
inverting the second and third N/2 samples,
replacing the first N/2 samples with the time-reversed and inverted version of the second N/2 samples,
replacing the fourth N/2 samples with the time-reversed version of the third N/2 samples, and
applying weights to the 2N samples according to a synthesis windowing function.
19. The apparatus of claim 6, wherein the overlap/add-combiner is adapted for applying weights according to a synthesis windowing function to overlapping time-domain data blocks provided by the frequency-domain to time-domain converter.
20. The apparatus of claim 19, wherein the overlap/add-combiner is adapted for applying weights according to a synthesis windowing function being adapted to a size of an overlapping region of consecutive overlapping time-domain data blocks.
21. The apparatus of claim 20, wherein the calculator is adapted for applying weights to the 2N samples according to an analysis windowing function being adapted to a size of an overlapping region of consecutive overlapping time-domain data blocks and wherein the calculator is adapted for applying weights to the 2N samples according to a synthesis windowing function being adapted to the size of the overlapping region.
22. The apparatus of claim 1, wherein a size of an overlapping region of two consecutive time-domain data blocks which are encoded in the frequency domain is larger than a size of an overlapping region of two consecutive time-domain data blocks of which one being encoded in the frequency domain and one being encoded in the time domain.
23. The apparatus of claim 1, wherein the overlapping of data blocks is being determined according to the AAC- specifications .
24. The apparatus of claim 1, further comprising a bypass for the processor and the overlap/add-combiner, the bypass being adapted for bypassing the processor and the overlap/add-combiner when non-overlapping consecutive time-domain data blocks occur in data segments which are encoded in the time domain.
25. Method for decoding data segments representing a time-domain data stream, a data segment being encoded in the time domain or in the frequency domain, a data segment being encoded in the frequency domain having successive blocks of data representing successive and overlapping blocks of time-domain data samples, comprising the steps of:
decoding a data segment being encoded in the time domain;
processing the data segment being encoded in the frequency domain and output data of the time-domain decoder to obtain overlapping time-domain data blocks; and
combining the overlapping time-domain data blocks to obtain the decoded data segments of the time-domain data stream.
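The combining step of this method can be pictured as a plain overlap/add. The following is a minimal sketch under simplifying assumptions: the helper name `overlap_add` and its `hop` parameter are hypothetical, all blocks are assumed to have equal length and to be already windowed, and a single hop size is used even though the claims allow overlapping regions of different sizes.

```python
def overlap_add(blocks, hop):
    """Sum equally sized time-domain blocks whose start positions are `hop` samples apart."""
    if not blocks:
        return []
    block_len = len(blocks[0])
    if hop <= 0 or hop > block_len:
        raise ValueError("hop must be between 1 and the block length")

    out = [0.0] * (hop * (len(blocks) - 1) + block_len)
    for i, block in enumerate(blocks):
        if len(block) != block_len:
            raise ValueError("all blocks must have the same length")
        start = i * hop
        for j, value in enumerate(block):
            # Samples in the overlapping region receive contributions from two blocks.
            out[start + j] += value
    return out
```

Setting `hop` equal to the block length reduces the operation to simple concatenation, which corresponds to the non-overlapping case in which no combining is needed.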
26. Computer program having a program code for performing the method of claim 25, when the program code runs on a computer.
27. An apparatus for generating an encoded data stream based on a time-domain data stream, the time-domain data stream having samples of a signal, the apparatus comprising
a segment processor for providing data segments from the data stream, two consecutive data segments having a first or a second overlapping region, the second overlapping region being smaller than the first overlapping region;
a time-domain encoder for encoding a windowed data segment in the time domain;
a frequency-domain encoder for applying weights to samples of the time-domain data stream according to a first or second windowing function to obtain a windowed data segment, the first and second windowing functions being adapted to the first and second overlapping regions, the frequency-domain encoder being adapted for encoding a windowed data segment in the frequency domain;
a time-domain data analyzer for determining a transition indication associated with a data segment; and
a controller for controlling the apparatus such that for data segments having a first transition indication, output data of the time-domain encoder is included in the encoded data stream and for data segments having a second transition indication, output data of the frequency-domain encoder is included in the encoded data stream.
28. The apparatus of claim 27, wherein the time-domain data analyzer is adapted for determining the transition indication from the time-domain data stream, the data segments or from data directly provided by the segment processor.
29. The apparatus of claim 27, wherein the time-domain data analyzer is adapted for determining a transition measure, the transition measure being based on the level of transience in the time-domain data stream or the data segment and wherein the transition indicator indicates whether a level of transience exceeds a predetermined threshold.
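A threshold-based transition indication of this kind could look like the sketch below. The claims do not define the transition measure, so everything specific here is an assumption made for illustration: the energy-ratio measure, the names `transition_measure` and `select_encoder`, the number of sub-blocks, the threshold value, and the mapping of high transience to time-domain encoding.

```python
def transition_measure(segment, sub_blocks=4):
    """Largest energy rise between consecutive sub-blocks (hypothetical measure)."""
    size = len(segment) // sub_blocks
    if size == 0:
        raise ValueError("segment is too short for the requested number of sub-blocks")
    energies = [
        sum(v * v for v in segment[i * size:(i + 1) * size])
        for i in range(sub_blocks)
    ]
    eps = 1e-12  # avoids division by zero for silent sub-blocks
    return max(energies[i + 1] / (energies[i] + eps) for i in range(sub_blocks - 1))


def select_encoder(segment, threshold=8.0):
    """Return "time" when the measure exceeds the threshold, otherwise "frequency"."""
    return "time" if transition_measure(segment) > threshold else "frequency"
```

For example, a segment whose second half is ten times louder than its first half yields an energy ratio near 100 and is routed to the time-domain branch, whereas a steady segment yields a ratio near 1 and stays in the frequency-domain branch.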
30. The apparatus of claim 27, wherein the segment processor is adapted for providing data segments with the first and the second overlapping regions,
the time-domain encoder is adapted for encoding the data segments,
the frequency-domain encoder is adapted for encoding the windowed data segments, and
the controller is adapted for controlling the time-domain encoder and the frequency-domain encoder such that for data segments having a first transition indication, output data of the time-domain encoder is included in the encoded data stream and for windowed data segments having a second transition indication, output data of the frequency-domain encoder is included in the encoded data stream.
31. The apparatus of claim 27, wherein the controller is adapted for controlling the segment processor for providing the data segments either to the time-domain encoder or the frequency-domain encoder.
32. The apparatus of claim 27, wherein the frequency-domain encoder is adapted for applying weights of windowing functions according to the AAC specifications.
33. The apparatus of claim 27, wherein the frequency-domain encoder is adapted for converting a windowed data segment to the frequency domain to obtain a frequency-domain data segment.
34. The apparatus of claim 33, wherein the frequency-domain encoder is adapted for quantizing the frequency-domain data segment.
35. The apparatus of claim 34, wherein the frequency-domain encoder is adapted for evaluating the frequency-domain data segment according to a perceptual model.
36. The apparatus of claim 35, wherein the frequency-domain encoder is adapted for utilizing a cosine-modulated filterbank, an extended lapped transform, a low-delay filterbank or a polyphase filterbank to obtain the frequency-domain data segments.
37. The apparatus of claim 33, wherein the frequency-domain encoder is adapted for utilizing a modified discrete cosine transform to obtain the frequency-domain data segments.
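For orientation, the modified discrete cosine transform named in this claim maps a block of 2N time-domain samples to N coefficients. The sketch below evaluates the textbook definition directly, at O(N²) cost, together with an inverse scaled by 1/N. Windowing, quantization and fast algorithms are left out, and the function names `mdct` and `imdct` are illustrative, not taken from the application.

```python
import math


def mdct(block):
    """Direct MDCT: 2N input samples to N coefficients."""
    two_n = len(block)
    if two_n == 0 or two_n % 2 != 0:
        raise ValueError("block length must be a positive even number")
    n_coeffs = two_n // 2
    return [
        sum(
            block[n] * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """Direct inverse MDCT: N coefficients to 2N time-aliased samples (scaled by 1/N)."""
    n_coeffs = len(coeffs)
    return [
        sum(
            coeffs[k] * math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for k in range(n_coeffs)
        ) / n_coeffs
        for n in range(2 * n_coeffs)
    ]
```

Because 2N samples are reduced to N coefficients, `imdct(mdct(block))` does not return the block itself but a time-aliased version of it. The original samples are recovered only after suitably windowed, overlapping neighbouring blocks are added.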
38. The apparatus of claim 27, wherein the time-domain encoder is adapted for using a prediction filter for encoding the data segments.
39. Method for generating an encoded data stream based on a time-domain data stream, the time-domain data stream having samples of a signal, comprising the steps of
providing data segments from the data stream, two consecutive data segments having a first or a second overlapping region, the second overlapping region being smaller than the first overlapping region;
determining a transition indication associated with the data segments;
encoding a data segment in the time domain, and/or applying weights to samples of the time-domain data stream according to a first or a second windowing function to obtain a windowed data segment, the first and second windowing functions being adapted to the first and second overlapping regions, and encoding the windowed data segment in the frequency domain; and
controlling such that for data segments having a first transition indication, output data being encoded in the time domain is included in the encoded data stream and for data segments having a second transition indication, output data being encoded in the frequency domain is included in the encoded data stream.
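One way to picture windowing functions "adapted to" a larger or a smaller overlapping region is a window with sine-shaped slopes whose lengths equal the overlap sizes and a flat section in between. This is a simplified sketch, not the window family of the application or of AAC: the name `overlap_window` and the sine slope shape are assumptions.

```python
import math


def overlap_window(length, left_overlap, right_overlap):
    """Window with a rising slope, a flat middle and a falling slope (hypothetical shape)."""
    if left_overlap < 0 or right_overlap < 0 or left_overlap + right_overlap > length:
        raise ValueError("overlap sizes must be non-negative and fit into the window")
    rise = [math.sin(math.pi * (n + 0.5) / (2 * left_overlap)) for n in range(left_overlap)]
    fall = [math.cos(math.pi * (n + 0.5) / (2 * right_overlap)) for n in range(right_overlap)]
    flat = [1.0] * (length - left_overlap - right_overlap)
    return rise + flat + fall
```

When the falling slope of one window and the rising slope of the next share the same overlap size, their squared weights sum to one across the overlapping region, so applying the same slopes at analysis and synthesis preserves the signal level there. A segment next to a time-domain-coded segment would use the smaller overlap on that side.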
40. Computer program having a program code for performing the method of claim 39, when the program code runs on a computer.
PCT/EP2007/010665 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream WO2008071353A2 (en)

Priority Applications (23)

Application Number Priority Date Filing Date Title
PL07856467T PL2052548T3 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
CN2007800461881A CN101589623B (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
ES07856467T ES2383217T3 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time domain data stream
MX2009006201A MX2009006201A (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream.
AU2007331763A AU2007331763B2 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
EP07856467A EP2052548B1 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
BRPI0718738-6A BRPI0718738B1 (en) 2006-12-12 2007-12-07 ENCODER, DECODER AND METHODS FOR ENCODING AND DECODING DATA SEGMENTS REPRESENTING A TIME DOMAIN DATA STREAM
JP2009540636A JP5171842B2 (en) 2006-12-12 2007-12-07 Encoder, decoder and method for encoding and decoding representing a time-domain data stream
BR122019024992-0A BR122019024992B1 (en) 2006-12-12 2007-12-07 ENCODER, DECODER AND METHODS FOR ENCODING AND DECODING DATA SEGMENTS REPRESENTING A TIME DOMAIN DATA CHAIN
AT07856467T ATE547898T1 (en) 2006-12-12 2007-12-07 ENCODER, DECODER AND METHOD FOR ENCODING AND DECODING DATA SEGMENTS TO REPRESENT A TIME DOMAIN DATA STREAM
CA2672165A CA2672165C (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US12/518,627 US8818796B2 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
TW096147145A TWI363563B (en) 2006-12-12 2007-12-11 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
IL198725A IL198725A (en) 2006-12-12 2009-05-13 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
HK09105016.0A HK1126602A1 (en) 2006-12-12 2009-06-04 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
NO20092506A NO342080B1 (en) 2006-12-12 2009-07-02 Codes, decoders and methods for encoding and decoding data segments that represent a data stream in the time domain.
US13/924,441 US8812305B2 (en) 2006-12-12 2013-06-21 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US14/250,306 US9043202B2 (en) 2006-12-12 2014-04-10 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US14/637,256 US9355647B2 (en) 2006-12-12 2015-03-03 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US15/094,984 US9653089B2 (en) 2006-12-12 2016-04-08 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US15/595,170 US10714110B2 (en) 2006-12-12 2017-05-15 Decoding data segments representing a time-domain data stream
US16/922,934 US11581001B2 (en) 2006-12-12 2020-07-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US18/152,721 US11961530B2 (en) 2006-12-12 2023-01-10 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US86967006P 2006-12-12 2006-12-12
US60/869,670 2006-12-12

Related Child Applications (3)

Application Number Title Priority Date Filing Date
US12/518,627 A-371-Of-International US8818796B2 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US13/924,441 Continuation US8812305B2 (en) 2006-12-12 2013-06-21 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US13/924,441 Division US8812305B2 (en) 2006-12-12 2013-06-21 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream

Publications (2)

Publication Number Publication Date
WO2008071353A2 true WO2008071353A2 (en) 2008-06-19
WO2008071353A3 WO2008071353A3 (en) 2008-08-21

Family

ID=39410130

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2007/010665 WO2008071353A2 (en) 2006-12-12 2007-12-07 Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream

Country Status (20)

Country Link
US (8) US8818796B2 (en)
EP (1) EP2052548B1 (en)
JP (1) JP5171842B2 (en)
KR (1) KR101016224B1 (en)
CN (2) CN102395033B (en)
AT (1) ATE547898T1 (en)
AU (1) AU2007331763B2 (en)
BR (2) BR122019024992B1 (en)
CA (1) CA2672165C (en)
ES (1) ES2383217T3 (en)
HK (2) HK1126602A1 (en)
IL (1) IL198725A (en)
MX (1) MX2009006201A (en)
MY (1) MY148913A (en)
NO (1) NO342080B1 (en)
PL (1) PL2052548T3 (en)
RU (1) RU2444071C2 (en)
TW (1) TWI363563B (en)
WO (1) WO2008071353A2 (en)
ZA (1) ZA200903159B (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2144230A1 (en) * 2008-07-11 2010-01-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Low bitrate audio encoding/decoding scheme having cascaded switches
EP2144171A1 (en) * 2008-07-11 2010-01-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding frames of a sampled audio signal
WO2010003563A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding audio samples
WO2010003532A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme
WO2010003663A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding frames of sampled audio signals
EP2146344A1 (en) * 2008-07-17 2010-01-20 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoding/decoding scheme having a switchable bypass
WO2010040522A2 (en) * 2008-10-08 2010-04-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Multi-resolution switched audio encoding/decoding scheme
EP2214164A3 (en) * 2009-01-28 2011-01-26 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio decoder, audio encoder, methods for decoding and encoding an audio signal and computer program
WO2011034374A2 (en) * 2009-09-17 2011-03-24 Lg Electronics Inc. A method and an apparatus for processing an audio signal
US20110087494A1 (en) * 2009-10-09 2011-04-14 Samsung Electronics Co., Ltd. Apparatus and method of encoding audio signal by switching frequency domain transformation scheme and time domain transformation scheme
WO2010062123A3 (en) * 2008-11-26 2013-02-28 한국전자통신연구원 Unified speech/audio codec (usac) processing windows sequence based mode switching
JP2013508765A (en) * 2009-10-20 2013-03-07 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン Audio signal encoder, audio signal decoder, and audio signal encoding or decoding method using aliasing cancellation
JP2013508766A (en) * 2009-10-20 2013-03-07 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ Audio signal encoder, audio signal decoder, method for providing a coded representation of audio content, method for providing a decoded representation of audio content, and computer program for use in low-latency applications
KR101281661B1 (en) * 2008-07-11 2013-07-03 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Method and Discriminator for Classifying Different Segments of a Signal
US8595019B2 (en) 2008-07-11 2013-11-26 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio coder/decoder with predictive coding of synthesis filter and critically-sampled time aliasing of prediction domain frames
US8892427B2 (en) 2009-07-27 2014-11-18 Industry-Academic Cooperation Foundation, Yonsei University Method and an apparatus for processing an audio signal
CN104240713A (en) * 2008-09-18 2014-12-24 韩国电子通信研究院 Coding method and decoding method
US9224403B2 (en) 2010-07-02 2015-12-29 Dolby International Ab Selective bass post filter
EP3002751A1 (en) 2008-07-11 2016-04-06 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding audio samples
US9384748B2 (en) 2008-11-26 2016-07-05 Electronics And Telecommunications Research Institute Unified Speech/Audio Codec (USAC) processing windows sequence based mode switching
RU2604994C2 (en) * 2011-06-28 2016-12-20 Оранж Delay-optimised overlap transform, coding/decoding weighting windows
KR101848866B1 (en) * 2008-10-13 2018-04-13 한국전자통신연구원 Encoding and decoding apparatus for linear predictive coder residual signal of modified discrete cosine transform based unified speech and audio coding
US10142763B2 (en) 2013-11-27 2018-11-27 Dolby Laboratories Licensing Corporation Audio signal processing
RU2719285C1 (en) * 2016-07-29 2020-04-17 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Reduced overlapping of spectra in time domain for non-uniform filter banks, which use spectral analysis with subsequent partial synthesis

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8630863B2 (en) * 2007-04-24 2014-01-14 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding audio/speech signal
WO2008151137A2 (en) * 2007-06-01 2008-12-11 The Trustees Of Columbia University In The City Of New York Real-time time encoding and decoding machines
WO2009006405A1 (en) 2007-06-28 2009-01-08 The Trustees Of Columbia University In The City Of New York Multi-input multi-output time encoding and decoding machines
EP2077551B1 (en) 2008-01-04 2011-03-02 Dolby Sweden AB Audio encoder and decoder
FR2936898A1 (en) * 2008-10-08 2010-04-09 France Telecom CRITICAL SAMPLING CODING WITH PREDICTIVE ENCODER
WO2010044593A2 (en) 2008-10-13 2010-04-22 한국전자통신연구원 Lpc residual signal encoding/decoding apparatus of modified discrete cosine transform (mdct)-based unified voice/audio encoding device
KR101137652B1 (en) * 2009-10-14 2012-04-23 광운대학교 산학협력단 Unified speech/audio encoding and decoding apparatus and method for adjusting overlap area of window based on transition
BR122021002104B1 (en) * 2010-07-08 2021-11-03 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E. V. ENCODER USING FUTURE SERRATED CANCELLATION
KR101826331B1 (en) 2010-09-15 2018-03-22 삼성전자주식회사 Apparatus and method for encoding and decoding for high frequency bandwidth extension
CA2981539C (en) * 2010-12-29 2020-08-25 Samsung Electronics Co., Ltd. Apparatus and method for encoding/decoding for high-frequency bandwidth extension
US9807424B2 (en) 2011-01-10 2017-10-31 Qualcomm Incorporated Adaptive selection of region size for identification of samples in a transition zone for overlapped block motion compensation
WO2012109407A1 (en) 2011-02-09 2012-08-16 The Trustees Of Columbia University In The City Of New York Encoding and decoding machine with recurrent neural networks
RU2585999C2 (en) 2011-02-14 2016-06-10 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Generation of noise in audio codecs
BR112012029132B1 (en) * 2011-02-14 2021-10-05 Fraunhofer - Gesellschaft Zur Förderung Der Angewandten Forschung E.V REPRESENTATION OF INFORMATION SIGNAL USING OVERLAY TRANSFORMED
CN103534754B (en) 2011-02-14 2015-09-30 弗兰霍菲尔运输应用研究公司 The audio codec utilizing noise to synthesize during the inertia stage
TWI488176B (en) * 2011-02-14 2015-06-11 Fraunhofer Ges Forschung Encoding and decoding of pulse positions of tracks of an audio signal
CA2827249C (en) 2011-02-14 2016-08-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for processing a decoded audio signal in a spectral domain
KR101525185B1 (en) * 2011-02-14 2015-06-02 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Apparatus and method for coding a portion of an audio signal using a transient detection and a quality result
PL3239978T3 (en) 2011-02-14 2019-07-31 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoding and decoding of pulse positions of tracks of an audio signal
CN103503062B (en) 2011-02-14 2016-08-10 弗劳恩霍夫应用研究促进协会 For using the prediction part of alignment by audio-frequency signal coding and the apparatus and method of decoding
JP5625126B2 (en) 2011-02-14 2014-11-12 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン Linear prediction based coding scheme using spectral domain noise shaping
JP5849106B2 (en) 2011-02-14 2016-01-27 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン Apparatus and method for error concealment in low delay integrated speech and audio coding
US9286491B2 (en) 2012-06-07 2016-03-15 Amazon Technologies, Inc. Virtual service provider zones
US10084818B1 (en) 2012-06-07 2018-09-25 Amazon Technologies, Inc. Flexibly configurable data modification services
US10075471B2 (en) 2012-06-07 2018-09-11 Amazon Technologies, Inc. Data loss prevention techniques
US9590959B2 (en) 2013-02-12 2017-03-07 Amazon Technologies, Inc. Data security service
US10210341B2 (en) * 2013-02-12 2019-02-19 Amazon Technologies, Inc. Delayed data access
US9608813B1 (en) 2013-06-13 2017-03-28 Amazon Technologies, Inc. Key rotation techniques
US9367697B1 (en) 2013-02-12 2016-06-14 Amazon Technologies, Inc. Data security with a security module
US9547771B2 (en) 2013-02-12 2017-01-17 Amazon Technologies, Inc. Policy enforcement with associated data
US10467422B1 (en) 2013-02-12 2019-11-05 Amazon Technologies, Inc. Automatic key rotation
US9705674B2 (en) 2013-02-12 2017-07-11 Amazon Technologies, Inc. Federated key management
US9300464B1 (en) 2013-02-12 2016-03-29 Amazon Technologies, Inc. Probabilistic key rotation
US10211977B1 (en) 2013-02-12 2019-02-19 Amazon Technologies, Inc. Secure management of information using a security module
EP2959481B1 (en) 2013-02-20 2017-04-26 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating an encoded audio or image signal or for decoding an encoded audio or image signal in the presence of transients using a multi overlap portion
ES2693559T3 (en) * 2013-08-23 2018-12-12 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and procedure for processing an audio signal by an aliasing error signal
US9397835B1 (en) 2014-05-21 2016-07-19 Amazon Technologies, Inc. Web of trust management in a distributed system
US9438421B1 (en) 2014-06-27 2016-09-06 Amazon Technologies, Inc. Supporting a fixed transaction rate with a variably-backed logical cryptographic key
US10116418B2 (en) 2014-08-08 2018-10-30 University Of Florida Research Foundation, Incorporated Joint fountain coding and network coding for loss-tolerant information spreading
US9866392B1 (en) 2014-09-15 2018-01-09 Amazon Technologies, Inc. Distributed system web of trust provisioning
KR101626280B1 (en) * 2014-11-05 2016-06-01 주식회사 디오텍 Method and apparatus for removing of harmonics component of synthesized sound
US10469477B2 (en) 2015-03-31 2019-11-05 Amazon Technologies, Inc. Key export techniques
US10014906B2 (en) * 2015-09-25 2018-07-03 Microsemi Semiconductor (U.S.) Inc. Acoustic echo path change detection apparatus and method
WO2017050398A1 (en) * 2015-09-25 2017-03-30 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder and methods for signal-adaptive switching of the overlap ratio in audio transform coding
US10230388B2 (en) * 2015-12-30 2019-03-12 Northwestern University System and method for energy efficient time domain signal processing
WO2017161122A1 (en) * 2016-03-16 2017-09-21 University Of Florida Research Foundation, Incorporated System for live video streaming using delay-aware fountain codes
WO2017161124A1 (en) * 2016-03-16 2017-09-21 University Of Florida Research Foundation, Incorporated System for video streaming using delay-aware fountain codes
WO2018198454A1 (en) * 2017-04-28 2018-11-01 ソニー株式会社 Information processing device and information processing method
WO2020132142A1 (en) * 2018-12-18 2020-06-25 Northwestern University System and method for pipelined time-domain computing using time-domain flip-flops and its application in time-series analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5987407A (en) 1997-10-28 1999-11-16 America Online, Inc. Soft-clipping postprocessor scaling decoded audio signal frame saturation regions to approximate original waveform shape and maintain continuity
US6226608B1 (en) 1999-01-28 2001-05-01 Dolby Laboratories Licensing Corporation Data framing for adaptive-block-length coding system
EP1396844A1 (en) 2002-09-04 2004-03-10 Microsoft Corporation Unified lossy and lossless audio compression
US20050071402A1 (en) 2003-09-29 2005-03-31 Jeongnam Youn Method of making a window type decision based on MDCT data in audio encoding
US20060247928A1 (en) 2005-04-28 2006-11-02 James Stuart Jeremy Cowdery Method and system for operating audio encoders in parallel

Family Cites Families (87)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5109417A (en) * 1989-01-27 1992-04-28 Dolby Laboratories Licensing Corporation Low bit rate transform coder, decoder, and encoder/decoder for high-quality audio
US5230038A (en) * 1989-01-27 1993-07-20 Fielder Louis D Low bit rate transform coder, decoder, and encoder/decoder for high-quality audio
ES2109296T3 (en) * 1989-01-27 1998-01-16 Dolby Lab Licensing Corp ENCODER, DECODER AND ENCODER / DECODER WITH LOW FREQUENCY BITS TRANSFORMATION FOR HIGH QUALITY AUDIOINFORMATION.
DE3902948A1 (en) 1989-02-01 1990-08-09 Telefunken Fernseh & Rundfunk METHOD FOR TRANSMITTING A SIGNAL
CN1062963C (en) * 1990-04-12 2001-03-07 多尔拜实验特许公司 Adaptive-block-lenght, adaptive-transform, and adaptive-window transform coder, decoder, and encoder/decoder for high-quality audio
US5327518A (en) * 1991-08-22 1994-07-05 Georgia Tech Research Corporation Audio analysis/synthesis system
SG45281A1 (en) 1992-06-26 1998-01-16 Discovision Ass Method and arrangement for transformation of signals from a frequency to a time domain
US5570455A (en) * 1993-01-19 1996-10-29 Philosophers' Stone Llc Method and apparatus for encoding sequences of data
DE69428119T2 (en) * 1993-07-07 2002-03-21 Picturetel Corp REDUCING BACKGROUND NOISE FOR LANGUAGE ENHANCEMENT
US5651090A (en) * 1994-05-06 1997-07-22 Nippon Telegraph And Telephone Corporation Coding method and coder for coding input signals of plural channels using vector quantization, and decoding method and decoder therefor
US5615299A (en) * 1994-06-20 1997-03-25 International Business Machines Corporation Speech recognition using dynamic features
TW271524B (en) * 1994-08-05 1996-03-01 Qualcomm Inc
ATE191107T1 (en) * 1994-12-20 2000-04-15 Dolby Lab Licensing Corp METHOD AND APPARATUS FOR APPLYING WAVEFORM PREDICTION TO SUB-BANDS IN A PERCEPTIVE CODING SYSTEM
JP3158932B2 (en) 1995-01-27 2001-04-23 日本ビクター株式会社 Signal encoding device and signal decoding device
US5669484A (en) * 1996-01-24 1997-09-23 Paulson; Tom J. Protective cover for the mini-slide knob of dimmers with mini-slide knobs
US5809459A (en) 1996-05-21 1998-09-15 Motorola, Inc. Method and apparatus for speech excitation waveform coding using multiple error waveforms
US6134518A (en) * 1997-03-04 2000-10-17 International Business Machines Corporation Digital audio signal coding using a CELP coder and a transform coder
KR100261253B1 (en) 1997-04-02 2000-07-01 윤종용 Scalable audio encoder/decoder and audio encoding/decoding method
US6064954A (en) * 1997-04-03 2000-05-16 International Business Machines Corp. Digital audio signal coding
US6233550B1 (en) * 1997-08-29 2001-05-15 The Regents Of The University Of California Method and apparatus for hybrid coding of speech at 4kbps
RU2214047C2 (en) * 1997-11-19 2003-10-10 Самсунг Электроникс Ко., Лтд. Method and device for scalable audio-signal coding/decoding
US6249766B1 (en) * 1998-03-10 2001-06-19 Siemens Corporate Research, Inc. Real-time down-sampling system for digital audio waveform data
US6085163A (en) * 1998-03-13 2000-07-04 Todd; Craig Campbell Using time-aligned blocks of encoded audio in video/audio applications to facilitate audio switching
US6119080A (en) * 1998-06-17 2000-09-12 Formosoft International Inc. Unified recursive decomposition architecture for cosine modulated filter banks
US7272556B1 (en) * 1998-09-23 2007-09-18 Lucent Technologies Inc. Scalable and embedded codec for speech and audio signals
US6188987B1 (en) * 1998-11-17 2001-02-13 Dolby Laboratories Licensing Corporation Providing auxiliary information with frame-based encoded audio information
US6311154B1 (en) * 1998-12-30 2001-10-30 Nokia Mobile Phones Limited Adaptive windows for analysis-by-synthesis CELP-type speech coding
US6640209B1 (en) * 1999-02-26 2003-10-28 Qualcomm Incorporated Closed-loop multimode mixed-domain linear prediction (MDLP) speech coder
US6449592B1 (en) * 1999-02-26 2002-09-10 Qualcomm Incorporated Method and apparatus for tracking the phase of a quasi-periodic signal
US7020285B1 (en) * 1999-07-13 2006-03-28 Microsoft Corporation Stealthy audio watermarking
US6604070B1 (en) * 1999-09-22 2003-08-05 Conexant Systems, Inc. System of encoding and decoding speech signals
CA2310769C (en) * 1999-10-27 2013-05-28 Nielsen Media Research, Inc. Audio signature extraction and correlation
US6868377B1 (en) * 1999-11-23 2005-03-15 Creative Technology Ltd. Multiband phase-vocoder for the modification of audio or speech signals
FR2802329B1 (en) 1999-12-08 2003-03-28 France Telecom PROCESS FOR PROCESSING AT LEAST ONE AUDIO CODE BINARY FLOW ORGANIZED IN THE FORM OF FRAMES
JP3630609B2 (en) * 2000-03-29 2005-03-16 パイオニア株式会社 Audio information reproducing method and apparatus
US20020049586A1 (en) 2000-09-11 2002-04-25 Kousuke Nishio Audio encoder, audio decoder, and broadcasting system
US7010480B2 (en) * 2000-09-15 2006-03-07 Mindspeed Technologies, Inc. Controlling a weighting filter based on the spectral content of a speech signal
US7020605B2 (en) * 2000-09-15 2006-03-28 Mindspeed Technologies, Inc. Speech coding system with time-domain noise attenuation
US7472059B2 (en) * 2000-12-08 2008-12-30 Qualcomm Incorporated Method and apparatus for robust speech classification
US6738739B2 (en) * 2001-02-15 2004-05-18 Mindspeed Technologies, Inc. Voiced speech preprocessing employing waveform interpolation or a harmonic model
US7283954B2 (en) 2001-04-13 2007-10-16 Dolby Laboratories Licensing Corporation Comparing audio using characterizations based on auditory events
US6658383B2 (en) * 2001-06-26 2003-12-02 Microsoft Corporation Method for coding speech and music signals
US7333929B1 (en) * 2001-09-13 2008-02-19 Chmounk Dmitri V Modular scalable compressed audio data stream
JP3750583B2 (en) * 2001-10-22 2006-03-01 ソニー株式会社 Signal processing method and apparatus, and signal processing program
AU2003213439A1 (en) * 2002-03-08 2003-09-22 Nippon Telegraph And Telephone Corporation Digital signal encoding method, decoding method, encoding device, decoding device, digital signal encoding program, and decoding program
US7366659B2 (en) * 2002-06-07 2008-04-29 Lucent Technologies Inc. Methods and devices for selectively generating time-scaled sound signals
JP4022111B2 (en) * 2002-08-23 2007-12-12 株式会社エヌ・ティ・ティ・ドコモ Signal encoding apparatus and signal encoding method
US7295970B1 (en) * 2002-08-29 2007-11-13 At&T Corp Unsupervised speaker segmentation of multi-speaker speech data
JP4676140B2 (en) * 2002-09-04 2011-04-27 マイクロソフト コーポレーション Audio quantization and inverse quantization
JP3870880B2 (en) * 2002-09-04 2007-01-24 住友電装株式会社 Connection structure between conductor and pressure contact terminal
WO2004036549A1 (en) * 2002-10-14 2004-04-29 Koninklijke Philips Electronics N.V. Signal filtering
EP1576583A2 (en) * 2002-12-19 2005-09-21 Koninklijke Philips Electronics N.V. Sinusoid selection in audio encoding
US7876966B2 (en) * 2003-03-11 2011-01-25 Spyder Navigations L.L.C. Switching between coding schemes
JP2004302259A (en) 2003-03-31 2004-10-28 Matsushita Electric Ind Co Ltd Hierarchical encoding method and hierarchical decoding method for sound signal
RU2005135650A (en) 2003-04-17 2006-03-20 Конинклейке Филипс Электроникс Н.В. (Nl) AUDIO SYNTHESIS
ATE354160T1 (en) * 2003-10-30 2007-03-15 Koninkl Philips Electronics Nv AUDIO SIGNAL ENCODING OR DECODING
AU2003291862A1 (en) * 2003-12-01 2005-06-24 Aic A highly optimized method for modelling a windowed signal
FR2865310A1 (en) 2004-01-20 2005-07-22 France Telecom Sound signal partials restoration method for use in digital processing of sound signal, involves calculating shifted phase for frequencies estimated for missing peaks, and correcting each shifted phase using phase error
US7516064B2 (en) * 2004-02-19 2009-04-07 Dolby Laboratories Licensing Corporation Adaptive hybrid transform for signal analysis and synthesis
US7596486B2 (en) * 2004-05-19 2009-09-29 Nokia Corporation Encoding an audio signal using different audio coder modes
US7649988B2 (en) * 2004-06-15 2010-01-19 Acoustic Technologies, Inc. Comfort noise generator using modified Doblinger noise estimate
KR100608062B1 (en) * 2004-08-04 2006-08-02 삼성전자주식회사 Method and apparatus for decoding high frequency of audio data
KR20070068424A (en) * 2004-10-26 2007-06-29 마츠시타 덴끼 산교 가부시키가이샤 Sound encoding device and sound encoding method
GB2420846B (en) * 2004-12-04 2009-07-08 Ford Global Technologies Llc A cooling system for a motor vehicle engine
US20070147518A1 (en) * 2005-02-18 2007-06-28 Bruno Bessette Methods and devices for low-frequency emphasis during audio compression based on ACELP/TCX
CN101151658B (en) * 2005-03-30 2011-07-06 皇家飞利浦电子股份有限公司 Multichannel audio encoding and decoding method, encoder and decoder
US7571104B2 (en) * 2005-05-26 2009-08-04 Qnx Software Systems (Wavemakers), Inc. Dynamic real-time cross-fading of voice prompts
US7630882B2 (en) * 2005-07-15 2009-12-08 Microsoft Corporation Frequency segmentation to obtain bands for efficient coding of digital media
US7546240B2 (en) * 2005-07-15 2009-06-09 Microsoft Corporation Coding with improved time resolution for selected segments via adaptive block transformation of a group of samples from a subband decomposition
KR100643310B1 (en) * 2005-08-24 2006-11-10 삼성전자주식회사 Method and apparatus for disturbing voice data using disturbing signal which has similar formant with the voice signal
US7953605B2 (en) 2005-10-07 2011-05-31 Deepen Sinha Method and apparatus for audio encoding and decoding using wideband psychoacoustic modeling and bandwidth extension
KR100647336B1 (en) * 2005-11-08 2006-11-23 삼성전자주식회사 Apparatus and method for adaptive time/frequency-based encoding/decoding
CN1963917A (en) * 2005-11-11 2007-05-16 株式会社东芝 Method for estimating distinguish of voice, registering and validating authentication of speaker and apparatus thereof
US7805297B2 (en) * 2005-11-23 2010-09-28 Broadcom Corporation Classification-based frame loss concealment for audio signals
EP1855436A1 (en) 2006-05-12 2007-11-14 Deutsche Thomson-Brandt Gmbh Method and apparatus for encrypting encoded audio signal
US8010352B2 (en) * 2006-06-21 2011-08-30 Samsung Electronics Co., Ltd. Method and apparatus for adaptively encoding and decoding high frequency band
US8036903B2 (en) * 2006-10-18 2011-10-11 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Analysis filterbank, synthesis filterbank, encoder, de-coder, mixer and conferencing system
DE102006051673A1 (en) * 2006-11-02 2008-05-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for reworking spectral values and encoders and decoders for audio signals
KR101434198B1 (en) * 2006-11-17 2014-08-26 삼성전자주식회사 Method of decoding a signal
KR100964402B1 (en) * 2006-12-14 2010-06-17 삼성전자주식회사 Method and Apparatus for determining encoding mode of audio signal, and method and apparatus for encoding/decoding audio signal using it
KR101334366B1 (en) * 2006-12-28 2013-11-29 삼성전자주식회사 Method and apparatus for varying audio playback speed
KR100883656B1 (en) * 2006-12-28 2009-02-18 삼성전자주식회사 Method and apparatus for discriminating audio signal, and method and apparatus for encoding/decoding audio signal using it
KR101403340B1 (en) * 2007-08-02 2014-06-09 삼성전자주식회사 Method and apparatus for transcoding
US8050934B2 (en) * 2007-11-29 2011-11-01 Texas Instruments Incorporated Local pitch control based on seamless time scale modification and synchronized sampling rate conversion
KR101441896B1 (en) * 2008-01-29 2014-09-23 삼성전자주식회사 Method and apparatus for encoding/decoding audio signal using adaptive LPC coefficient interpolation
US8364481B2 (en) * 2008-07-02 2013-01-29 Google Inc. Speech recognition with parallel recognition tasks
EP2631906A1 (en) 2012-02-27 2013-08-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Phase coherence control for harmonic signals in perceptual audio codecs

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5987407A (en) 1997-10-28 1999-11-16 America Online, Inc. Soft-clipping postprocessor scaling decoded audio signal frame saturation regions to approximate original waveform shape and maintain continuity
US6226608B1 (en) 1999-01-28 2001-05-01 Dolby Laboratories Licensing Corporation Data framing for adaptive-block-length coding system
EP1396844A1 (en) 2002-09-04 2004-03-10 Microsoft Corporation Unified lossy and lossless audio compression
US20050071402A1 (en) 2003-09-29 2005-03-31 Jeongnam Youn Method of making a window type decision based on MDCT data in audio encoding
US20060247928A1 (en) 2005-04-28 2006-11-02 James Stuart Jeremy Cowdery Method and system for operating audio encoders in parallel

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
O.A. NIAMUT; R. HEUSDENS: "Optimal Time Segmentation for Overlap-Add Systems with Variable Amount of Window Overlap", IEEE SIGNAL PROCESSING LETTERS, vol. 12, no. 10, October 2005 (2005-10-01)

Cited By (109)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI459379B (en) * 2008-07-11 2014-11-01 Fraunhofer Ges Forschung Audio encoder and decoder for encoding and decoding audio samples
KR101250309B1 (en) 2008-07-11 2013-04-04 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme
WO2010003564A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V Low bitrate audio encoding/decoding scheme having cascaded switches
WO2010003563A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding audio samples
WO2010003532A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme
WO2010003491A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding frames of sampled audio signal
WO2010003663A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding frames of sampled audio signals
EP3002751A1 (en) 2008-07-11 2016-04-06 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding audio samples
EP2144171A1 (en) * 2008-07-11 2010-01-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding frames of a sampled audio signal
US8930198B2 (en) 2008-07-11 2015-01-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Low bitrate audio encoding/decoding scheme having cascaded switches
US8892449B2 (en) 2008-07-11 2014-11-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder/decoder with switching between first and second encoders/decoders using first and second framing rules
KR101281661B1 (en) * 2008-07-11 2013-07-03 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Method and Discriminator for Classifying Different Segments of a Signal
US11823690B2 (en) 2008-07-11 2023-11-21 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Low bitrate audio encoding/decoding scheme having cascaded switches
US11682404B2 (en) 2008-07-11 2023-06-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio decoding device and method with decoding branches for decoding audio signal encoded in a plurality of domains
US11676611B2 (en) 2008-07-11 2023-06-13 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio decoding device and method with decoding branches for decoding audio signal encoded in a plurality of domains
US8862480B2 (en) 2008-07-11 2014-10-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoding/decoding with aliasing switch for domain transforming of adjacent sub-blocks before and subsequent to windowing
CN102089758A (en) * 2008-07-11 2011-06-08 弗劳恩霍夫应用研究促进协会 Audio encoder and decoder for encoding and decoding frames of sampled audio signal
CN102089812A (en) * 2008-07-11 2011-06-08 弗劳恩霍夫应用研究促进协会 Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme
US11475902B2 (en) 2008-07-11 2022-10-18 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Low bitrate audio encoding/decoding scheme having cascaded switches
US8751246B2 (en) 2008-07-11 2014-06-10 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder and decoder for encoding frames of sampled audio signals
RU2515704C2 (en) * 2008-07-11 2014-05-20 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Audio encoder and audio decoder for encoding and decoding audio signal readings
KR101380297B1 (en) * 2008-07-11 2014-04-02 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Method and Discriminator for Classifying Different Segments of a Signal
AU2009267467B2 (en) * 2008-07-11 2012-03-01 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Low bitrate audio encoding/decoding scheme having cascaded switches
TWI426503B (en) * 2008-07-11 2014-02-11 Fraunhofer Ges Forschung Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme
US8595019B2 (en) 2008-07-11 2013-11-26 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio coder/decoder with predictive coding of synthesis filter and critically-sampled time aliasing of prediction domain frames
US10621996B2 (en) 2008-07-11 2020-04-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Low bitrate audio encoding/decoding scheme having cascaded switches
AU2009267518B2 (en) * 2008-07-11 2012-08-16 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for encoding/decoding an audio signal using an aliasing switch scheme
CN102105930B (en) * 2008-07-11 2012-10-03 弗朗霍夫应用科学研究促进协会 Audio encoder and decoder for encoding frames of sampled audio signals
AU2009267394B2 (en) * 2008-07-11 2012-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder and decoder for encoding frames of sampled audio signals
RU2498419C2 (en) * 2008-07-11 2013-11-10 Фраунхофер-Гезелльшафт цур Фёердерунг дер ангевандтен Audio encoder and audio decoder for encoding frames presented in form of audio signal samples
KR101224559B1 (en) 2008-07-11 2013-01-23 보이세지 코포레이션 Low Bitrate Audio Encoding/Decoding Scheme Having Cascaded Switches
KR101227729B1 (en) * 2008-07-11 2013-01-29 프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우 Audio encoder and decoder for encoding frames of sampled audio signals
KR101325335B1 (en) * 2008-07-11 2013-11-08 프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우 Audio encoder and decoder for encoding and decoding audio samples
US8571858B2 (en) 2008-07-11 2013-10-29 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Method and discriminator for classifying different segments of a signal
EP2144230A1 (en) * 2008-07-11 2010-01-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Low bitrate audio encoding/decoding scheme having cascaded switches
US10319384B2 (en) 2008-07-11 2019-06-11 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Low bitrate audio encoding/decoding scheme having cascaded switches
EP3002750A1 (en) 2008-07-11 2016-04-06 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding and decoding audio samples
CN102089811B (en) * 2008-07-11 2013-04-10 弗朗霍夫应用科学研究促进协会 Audio encoder and decoder for encoding and decoding audio samples
AU2009267466B2 (en) * 2008-07-11 2013-05-16 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder and decoder for encoding and decoding audio samples
RU2485606C2 (en) * 2008-07-11 2013-06-20 Франухофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Форшунг Е.Ф. Low bitrate audio encoding/decoding scheme using cascaded switches
US8959017B2 (en) 2008-07-17 2015-02-17 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoding/decoding scheme having a switchable bypass
AU2009270524B2 (en) * 2008-07-17 2012-03-15 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoding/decoding scheme having a switchable bypass
WO2010006717A1 (en) * 2008-07-17 2010-01-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoding/decoding scheme having a switchable bypass
EP2146344A1 (en) * 2008-07-17 2010-01-20 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoding/decoding scheme having a switchable bypass
KR101224884B1 (en) 2008-07-17 2013-02-06 보이세지 코포레이션 Audio encoding/decoding scheme having a switchable bypass
US8321210B2 (en) 2008-07-17 2012-11-27 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoding/decoding scheme having a switchable bypass
RU2483364C2 (en) * 2008-07-17 2013-05-27 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Audio encoding/decoding scheme having switchable bypass
CN104240713A (en) * 2008-09-18 2014-12-24 韩国电子通信研究院 Coding method and decoding method
WO2010040522A3 (en) * 2008-10-08 2010-09-02 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Multi-resolution switched audio encoding/decoding scheme
US9043215B2 (en) 2008-10-08 2015-05-26 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Multi-resolution switched audio encoding/decoding scheme
CN102177426B (en) * 2008-10-08 2014-11-05 弗兰霍菲尔运输应用研究公司 Multi-resolution switched audio encoding/decoding scheme
EP3640941A1 (en) * 2008-10-08 2020-04-22 Fraunhofer Gesellschaft zur Förderung der Angewand Multi-resolution switched audio encoding/decoding scheme
WO2010040522A2 (en) * 2008-10-08 2010-04-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e. V. Multi-resolution switched audio encoding/decoding scheme
US8447620B2 (en) 2008-10-08 2013-05-21 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Multi-resolution switched audio encoding/decoding scheme
JP2012505423A (en) * 2008-10-08 2012-03-01 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン Multi-resolution switching audio encoding and decoding scheme
KR20190026710A (en) * 2008-10-13 2019-03-13 한국전자통신연구원 Encoding and decoding apparatus for linear predictive coder residual signal of modified discrete cosine transform based unified speech and audio coding
KR101848866B1 (en) * 2008-10-13 2018-04-13 한국전자통신연구원 Encoding and decoding apparatus for linear predictive coder residual signal of modified discrete cosine transform based unified speech and audio coding
KR101956289B1 (en) 2008-10-13 2019-03-08 한국전자통신연구원 Encoding and decoding apparatus for linear predictive coder residual signal of modified discrete cosine transform based unified speech and audio coding
KR20180040543A (en) * 2008-10-13 2018-04-20 한국전자통신연구원 Encoding and decoding apparatus for linear predictive coder residual signal of modified discrete cosine transform based unified speech and audio coding
KR102002162B1 (en) 2008-10-13 2019-07-23 한국전자통신연구원 Encoding and decoding apparatus for linear predictive coder residual signal of modified discrete cosine transform based unified speech and audio coding
US11922962B2 (en) 2008-11-26 2024-03-05 Electronics And Telecommunications Research Institute Unified speech/audio codec (USAC) processing windows sequence based mode switching
CN104282313A (en) * 2008-11-26 2015-01-14 韩国电子通信研究院 Unified speech/audio codec (USAC) processing windows sequence based mode switching
US8954321B1 (en) 2008-11-26 2015-02-10 Electronics And Telecommunications Research Institute Unified speech/audio codec (USAC) processing windows sequence based mode switching
US10002619B2 (en) 2008-11-26 2018-06-19 Electronics And Telecommunications Research Institute Unified speech/audio codec (USAC) processing windows sequence based mode switching
US11430458B2 (en) 2008-11-26 2022-08-30 Electronics And Telecommunications Research Institute Unified speech/audio codec (USAC) processing windows sequence based mode switching
US10622001B2 (en) 2008-11-26 2020-04-14 Electronics And Telecommunications Research Institute Unified speech/audio codec (USAC) windows sequence based mode switching
EP3151241A1 (en) * 2008-11-26 2017-04-05 Electronics and Telecommunications Research Institute Unified speech/audio codec (usac) processing windows sequence based mode switching
US9384748B2 (en) 2008-11-26 2016-07-05 Electronics And Telecommunications Research Institute Unified Speech/Audio Codec (USAC) processing windows sequence based mode switching
WO2010062123A3 (en) * 2008-11-26 2013-02-28 한국전자통신연구원 Unified speech/audio codec (usac) processing windows sequence based mode switching
EP3252759A1 (en) * 2009-01-28 2017-12-06 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder, method for encoding an audio signal and computer program
US8457975B2 (en) 2009-01-28 2013-06-04 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio decoder, audio encoder, methods for decoding and encoding an audio signal and computer program
EP2214164A3 (en) * 2009-01-28 2011-01-26 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio decoder, audio encoder, methods for decoding and encoding an audio signal and computer program
US9082399B2 (en) 2009-07-27 2015-07-14 Industry-Academic Cooperation Foundation, Yonsei University Method and apparatus for processing an audio signal using window transitions for coding schemes
USRE48916E1 (en) 2009-07-27 2022-02-01 Dolby Laboratories Licensing Corporation Alias cancelling during audio coding mode transitions
USRE47536E1 (en) 2009-07-27 2019-07-23 Dolby Laboratories Licensing Corporation Alias cancelling during audio coding mode transitions
US8892427B2 (en) 2009-07-27 2014-11-18 Industry-Academic Cooperation Foundation, Yonsei University Method and an apparatus for processing an audio signal
USRE49813E1 (en) 2009-07-27 2024-01-23 Dolby Laboratories Licensing Corporation Alias cancelling during audio coding mode transitions
US9214160B2 (en) 2009-07-27 2015-12-15 Industry-Academic Cooperation Foundation, Yonsei University Alias cancelling during audio coding mode transitions
US9064490B2 (en) 2009-07-27 2015-06-23 Industry-Academic Cooperation Foundation, Yonsei University Method and apparatus for processing an audio signal using window transitions for coding schemes
WO2011034376A2 (en) * 2009-09-17 2011-03-24 Lg Electronics Inc. A method and an apparatus for processing an audio signal
CN102498515A (en) * 2009-09-17 2012-06-13 Lg电子株式会社 A method and an apparatus for processing an audio signal
WO2011034374A2 (en) * 2009-09-17 2011-03-24 Lg Electronics Inc. A method and an apparatus for processing an audio signal
US8930199B2 (en) 2009-09-17 2015-01-06 Industry-Academic Cooperation Foundation, Yonsei University Method and an apparatus for processing an audio signal
WO2011034377A2 (en) * 2009-09-17 2011-03-24 Lg Electronics Inc. A method and an apparatus for processing an audio signal
WO2011034374A3 (en) * 2009-09-17 2011-07-14 Lg Electronics Inc. A method and an apparatus for processing an audio signal
CN102498515B (en) * 2009-09-17 2014-06-18 延世大学工业学术合作社 A method and an apparatus for processing an audio signal
WO2011034375A3 (en) * 2009-09-17 2011-07-07 Lg Electronics Inc. A method and an apparatus for processing an audio signal
WO2011034377A3 (en) * 2009-09-17 2011-07-07 Lg Electronics Inc. A method and an apparatus for processing an audio signal
WO2011034376A3 (en) * 2009-09-17 2011-07-07 Lg Electronics Inc. A method and an apparatus for processing an audio signal
US20110087494A1 (en) * 2009-10-09 2011-04-14 Samsung Electronics Co., Ltd. Apparatus and method of encoding audio signal by switching frequency domain transformation scheme and time domain transformation scheme
JP2013508765A (en) * 2009-10-20 2013-03-07 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン Audio signal encoder, audio signal decoder, and audio signal encoding or decoding method using aliasing cancellation
JP2013508766A (en) * 2009-10-20 2013-03-07 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ Audio signal encoder, audio signal decoder, method for providing a coded representation of audio content, method for providing a decoded representation of audio content, and computer program for use in low-latency applications
US9343077B2 (en) 2010-07-02 2016-05-17 Dolby International Ab Pitch filter for audio signals
US10236010B2 (en) 2010-07-02 2019-03-19 Dolby International Ab Pitch filter for audio signals
US9224403B2 (en) 2010-07-02 2015-12-29 Dolby International Ab Selective bass post filter
US9595270B2 (en) 2010-07-02 2017-03-14 Dolby International Ab Selective post filter
US9558754B2 (en) 2010-07-02 2017-01-31 Dolby International Ab Audio encoder and decoder with pitch prediction
US10811024B2 (en) 2010-07-02 2020-10-20 Dolby International Ab Post filter for audio signals
US9858940B2 (en) 2010-07-02 2018-01-02 Dolby International Ab Pitch filter for audio signals
US11183200B2 (en) 2010-07-02 2021-11-23 Dolby International Ab Post filter for audio signals
US9552824B2 (en) 2010-07-02 2017-01-24 Dolby International Ab Post filter
US9830923B2 (en) 2010-07-02 2017-11-28 Dolby International Ab Selective bass post filter
US9558753B2 (en) 2010-07-02 2017-01-31 Dolby International Ab Pitch filter for audio signals
US9396736B2 (en) 2010-07-02 2016-07-19 Dolby International Ab Audio encoder and decoder with multiple coding modes
US11610595B2 (en) 2010-07-02 2023-03-21 Dolby International Ab Post filter for audio signals
RU2604994C2 (en) * 2011-06-28 2016-12-20 Оранж Delay-optimised overlap transform, coding/decoding weighting windows
US10142763B2 (en) 2013-11-27 2018-11-27 Dolby Laboratories Licensing Corporation Audio signal processing
US10978082B2 (en) 2016-07-29 2021-04-13 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Time domain aliasing reduction for non-uniform filterbanks which use spectral analysis followed by partial synthesis
RU2719285C1 (en) * 2016-07-29 2020-04-17 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Reduced overlapping of spectra in time domain for non-uniform filter banks, which use spectral analysis with subsequent partial synthesis

Also Published As

Publication number Publication date
US20200335117A1 (en) 2020-10-22
RU2444071C2 (en) 2012-02-27
ES2383217T3 (en) 2012-06-19
US20100138218A1 (en) 2010-06-03
CN101589623B (en) 2013-03-13
CA2672165A1 (en) 2008-06-19
EP2052548B1 (en) 2012-02-29
US10714110B2 (en) 2020-07-14
US9653089B2 (en) 2017-05-16
KR20090085655A (en) 2009-08-07
JP2010512550A (en) 2010-04-22
US20230154475A1 (en) 2023-05-18
HK1168706A1 (en) 2013-01-04
BR122019024992B1 (en) 2021-04-06
CA2672165C (en) 2014-07-29
RU2009117569A (en) 2011-01-20
US11961530B2 (en) 2024-04-16
CN102395033A (en) 2012-03-28
ATE547898T1 (en) 2012-03-15
US9043202B2 (en) 2015-05-26
AU2007331763A1 (en) 2008-06-19
TW200841743A (en) 2008-10-16
CN101589623A (en) 2009-11-25
NO342080B1 (en) 2018-03-19
MY148913A (en) 2013-06-14
US9355647B2 (en) 2016-05-31
BRPI0718738A2 (en) 2015-03-24
ZA200903159B (en) 2010-07-28
IL198725A (en) 2016-03-31
CN102395033B (en) 2014-08-27
US20140222442A1 (en) 2014-08-07
NO20092506L (en) 2009-09-10
MX2009006201A (en) 2009-06-22
IL198725A0 (en) 2010-02-17
US20170249952A1 (en) 2017-08-31
US20150179183A1 (en) 2015-06-25
KR101016224B1 (en) 2011-02-25
US20130282389A1 (en) 2013-10-24
WO2008071353A3 (en) 2008-08-21
US8818796B2 (en) 2014-08-26
US20160225383A1 (en) 2016-08-04
BRPI0718738A8 (en) 2018-10-16
TWI363563B (en) 2012-05-01
PL2052548T3 (en) 2012-08-31
EP2052548A2 (en) 2009-04-29
JP5171842B2 (en) 2013-03-27
BRPI0718738B1 (en) 2023-05-16
US11581001B2 (en) 2023-02-14
HK1126602A1 (en) 2009-09-04
US8812305B2 (en) 2014-08-19
AU2007331763B2 (en) 2011-06-30

Similar Documents

Publication Publication Date Title
US11961530B2 (en) Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US8862480B2 (en) Audio encoding/decoding with aliasing switch for domain transforming of adjacent sub-blocks before and subsequent to windowing
KR101227729B1 (en) Audio encoder and decoder for encoding frames of sampled audio signals
EA025020B1 (en) Audio decoder and decoding method using efficient downmixing
WO2013061584A1 (en) Hybrid sound-signal decoder, hybrid sound-signal encoder, sound-signal decoding method, and sound-signal encoding method

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase

Ref document number: 200780046188.1

Country of ref document: CN

WWE WIPO information: entry into national phase

Ref document number: 2007856467

Country of ref document: EP

121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 07856467

Country of ref document: EP

Kind code of ref document: A2

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE WIPO information: entry into national phase

Ref document number: 198725

Country of ref document: IL

WWE WIPO information: entry into national phase

Ref document number: 2007331763

Country of ref document: AU

WWE WIPO information: entry into national phase

Ref document number: 1020097011151

Country of ref document: KR

WWE WIPO information: entry into national phase

Ref document number: 2080/KOLNP/2009

Country of ref document: IN

WWE WIPO information: entry into national phase

Ref document number: 2672165

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2007331763

Country of ref document: AU

Date of ref document: 20071207

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: MX/A/2009/006201

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2009540636

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2009117569

Country of ref document: RU

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 12518627

Country of ref document: US

REG Reference to national code

Ref country code: BR

Ref legal event code: B01E

Ref document number: PI0718738

Country of ref document: BR

Free format text: Prove that the signatories of petitions DESP 018090030958, DESP 018090032932 and DESP 018090038520 have powers to act on behalf of the applicant, since, under Article 216 of Law 9.279/1996 of 14/05/1996 (LPI), "the acts provided for in this law shall be performed by the parties or by their duly qualified attorneys".

REG Reference to national code

Ref country code: BR

Ref legal event code: B01E

Ref document number: PI0718738

Country of ref document: BR

Free format text: Further to the requirement published in RPI 2257 on 08/04/2014, it is requested that the requirement be answered correctly, within 60 days, by means of the GRU code 207 specific to this type of service.

ENP Entry into the national phase

Ref document number: PI0718738

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20090608