WO2017220528A1 - Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain - Google Patents

Info

Publication number
WO2017220528A1
Authority
WO
WIPO (PCT)
Prior art keywords
frequency
audio signal
digital audio
frame
nyquist frequency
Prior art date
Application number
PCT/EP2017/065011
Other languages
English (en)
French (fr)
Inventor
Per Ekstrand
Robin Thesing
Lars Villemoes
Original Assignee
Dolby International Ab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby International Ab filed Critical Dolby International Ab
Priority to US16/307,624 priority Critical patent/US10770082B2/en
Priority to EP17730205.6A priority patent/EP3475944B1/en
Priority to JP2018567177A priority patent/JP6976277B2/ja
Priority to CN201780038374.4A priority patent/CN109328382B/zh
Publication of WO2017220528A1 publication Critical patent/WO2017220528A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G10L21/0388 Details of processing therefor

Definitions

  • the present invention relates to the field of audio coding.
  • it relates to transformation of a digital audio signal from a first frequency domain to a second frequency domain in an audio decoder.
  • a modified discrete cosine transform may be used for encoding the waveform of a digital audio signal prior to transmittal from the encoder to the decoder.
  • a quadrature mirror filter (QMF) bank may be used for high frequency and spatial synthesis of the digital audio signal in the decoder.
  • the digital audio signal has to be transformed from a first frequency domain associated with a first filter bank or transform to a second domain associated with a second filter bank or transform in the decoder.
  • HE-AAC High-Efficiency Advanced Audio Coding
  • Fig. 1 illustrates an audio decoder according to embodiments.
  • Fig. 2 is a flowchart of a method for transforming a digital audio signal from a first to a second frequency domain according to embodiments.
  • Fig. 3 illustrates the spectrum of a digital audio signal during different steps of the method of Fig. 2.
  • Fig. 4 illustrates a misalignment between windows of a first and a second filter bank.
  • Fig. 5 illustrates a sequence of frames of a digital audio signal.
  • Fig. 6 also illustrates a sequence of frames of a digital audio signal.
  • Fig. 7 illustrates a timing and buffer example according to an embodiment.
  • this object is achieved by a method in an audio decoder for transforming a digital audio signal from a first frequency domain to a second frequency domain, comprising:
  • if the frequency range is below the Nyquist frequency by more than a threshold amount, lowering the Nyquist frequency of the digital audio signal from its original value to a reduced value by removing spectral bands of the digital audio signal above the identified frequency range,
  • the digital audio signal has a sampling rate in the intermediate time domain which is reduced in relation to the original sampling rate by a sub-sampling factor defined by a ratio between the original value of the Nyquist frequency and the reduced value of the Nyquist frequency, and
  • a decision is taken on a frame-by-frame basis as to whether the Nyquist frequency should be reduced or not. For each frame, the decision is taken on basis of the frequency range of the digital audio signal in the frame. If the frequency range is below the Nyquist frequency by more than a threshold amount, i.e. if the digital audio signal is found to be band-limited in the frame, a decision is taken to reduce the Nyquist frequency. In this way the method may adapt to the frequency content in each frame of the digital audio signal.
  • the Nyquist frequency is reduced from its original value to a reduced value by removing spectral bands above the frequency range identified with respect to the frame.
  • computational complexity is reduced since the removed spectral bands are omitted in the process of transforming the digital audio signal from the first frequency domain to the second frequency domain via an intermediate time domain.
  • the size of the transforms may be reduced by the sub-sampling factor, thereby making the transformations less computationally demanding.
  • since the frequency range may vary between frames, and the reduced value of the Nyquist frequency depends on the frequency range, the method allows for different reduced values of the Nyquist frequency in different frames. In this way, the method may further adapt to variations in frequency contents between frames.
  • Reduction of the Nyquist frequency in the frequency domain corresponds to sub-sampling of the digital audio signal in the time domain.
  • the reduction of the Nyquist frequency thus has the effect that the digital audio signal will be sub-sampled when transformed to the time domain.
  • the factor by which the digital audio signal is sub-sampled in the time domain is given by the ratio between the original value of the Nyquist frequency and the reduced value of the Nyquist frequency.
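The relation between the Nyquist frequency, the sub-sampling factor, and the transform size can be illustrated with a short sketch. The function names, the 48 kHz sampling rate, and the transform size of 1024 are illustrative assumptions and are not taken from the disclosure.

```python
def sub_sampling_factor(original_nyquist_hz, reduced_nyquist_hz):
    """Ratio between the original and the reduced value of the Nyquist frequency."""
    if not 0 < reduced_nyquist_hz <= original_nyquist_hz:
        raise ValueError("reduced Nyquist frequency must be in (0, original]")
    return original_nyquist_hz / reduced_nyquist_hz


def reduced_transform_size(original_size, factor):
    """Transform size once the removed spectral bands are omitted."""
    size = original_size / factor
    if size != int(size):
        raise ValueError("factor does not divide the transform size evenly")
    return int(size)


original_rate_hz = 48000.0                  # hypothetical original sampling rate
original_nyquist_hz = original_rate_hz / 2  # Nyquist frequency is half the sampling rate
factor = sub_sampling_factor(original_nyquist_hz, 12000.0)
intermediate_rate_hz = original_rate_hz / factor  # rate in the intermediate time domain
print(factor, intermediate_rate_hz, reduced_transform_size(1024, factor))
```

With these example values the factor is 2, the intermediate time domain runs at 24 kHz, and a hypothetical transform of size 1024 shrinks to 512.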
  • the first frequency domain may generally be associated with a first time-to-frequency transform.
  • the second frequency domain may generally be associated with a second time-to-frequency transform.
  • the first frequency transform may be associated with a first filter bank and the second frequency domain may be associated with a second filter bank.
  • the digital audio signal is associated with a sampling rate.
  • the Nyquist frequency is half the sampling rate of the digital audio signal. This is the highest frequency of the original audio signal which may be represented in its digital version. The Nyquist frequency is thus the highest frequency on the frequency scale for the representation of the digital audio signal in the first frequency domain.
  • the digital audio signal may be received at the decoder in frames.
  • a frame of the digital audio signal represents a temporal portion of predefined duration of the digital audio signal.
  • by frequency range is typically meant the bandwidth or the highest frequency having non-zero spectral contents of the digital audio signal.
  • by spectral contents is generally meant the values or coefficients of the digital audio signal for the different spectral bands in a frequency domain representation of the digital audio signal.
  • by spectral band is meant a frequency interval in a frequency domain representation of the digital audio signal.
  • by frequency domain representation is typically meant the coefficients or subband samples constituting the output of a time-to-frequency domain transform or filter bank.
  • the terms transform and filter bank are used interchangeably in the present disclosure.
  • the reduced value of the Nyquist frequency may vary between frames. This means that the method may switch from one reduced value of the Nyquist frequency to another reduced value of the Nyquist frequency when going from one frame to the next frame.
  • the reduced value of the Nyquist frequency of a current frame may be set depending on the reduced value of the Nyquist frequency of a previous frame in relation to the frequency range of the current frame. For example, depending on whether the frequency range of the current frame is above or below the reduced value of the Nyquist frequency in a previous frame, the reduced value of the Nyquist frequency may be increased or decreased, respectively. This allows the decision on how to adjust the reduced value of the Nyquist frequency to be made in a sequential manner.
  • the reduced value of the Nyquist frequency of the current frame is set to be larger than the reduced value of the Nyquist frequency of the previous frame (i.e., the Nyquist frequency is increased) if the frequency range of the current frame exceeds the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount.
  • a threshold amount is set to zero, such that the reduced value of the Nyquist frequency is always increased if the bandwidth increases beyond the reduced value of the Nyquist frequency from a previous frame.
  • the method may decide to keep the reduced value of the Nyquist frequency from the preceding frame, since no (or little) artifacts would be introduced and/or little would be gained, in terms of computational complexity, by adjusting the reduced value of the Nyquist frequency. (In fact, a switch to another reduced value of the Nyquist frequency could in this situation, in the worst case, lead to an increase in computational complexity since re-sampling of the digital audio signal in the time domain would be needed as will be further explained below).
  • the reduced value of the Nyquist frequency of the current frame is set to be equal to the reduced value of the Nyquist frequency of the previous frame if a highest frequency of the frequency range of the current frame differs from the reduced value of the Nyquist frequency of the previous frame by no more than a threshold amount.
  • if the frequency range of the current frame is significantly lower (as defined by a threshold amount) than the reduced value of the Nyquist frequency of the preceding frame, it may be beneficial, for reasons of computational complexity, to decrease the reduced value of the Nyquist frequency when going from the preceding frame to the current frame (i.e., the Nyquist frequency is further decreased).
  • the reduced value of the Nyquist frequency of the current frame may be set to be lower than the reduced value of the Nyquist frequency of the previous frame if the frequency range of the current frame is below the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount.
  • the threshold amount may for example correspond to 20% of the reduced value of the Nyquist frequency of the previous frame.
  • the method always increases the reduced value of the Nyquist frequency from a previous to a current frame if the frequency range of the current frame exceeds the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount. This is done to avoid audible artifacts, such as those caused by limiting the spectral contents.
  • the reduced value of the Nyquist frequency of the current frame may further be set depending on the frequency range of a predefined number of previous frames. In this way, one may avoid situations in which the reduced value of the Nyquist frequency is unnecessarily adjusted in each and every frame.
  • the reduced value of the Nyquist frequency of the current frame may be set to be lower than the reduced value of the Nyquist frequency of the previous frame if, additionally, the absolute values of the differences between the frequency range of the current frame and each of a predefined number of previous frames are each no more than a threshold amount.
  • the reduced value of the Nyquist frequency of the current frame may be set to be lower than the reduced value of the Nyquist frequency of the previous frame if, additionally, the frequency range of each of a predefined number of previous frames is below the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount.
  • the threshold amounts referred to above may all be different and are typically pre-defined in the decoder.
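The frame-to-frame decision described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and parameter names are hypothetical, the new reduced value is simply set to the frequency range of the current frame, the increase threshold defaults to zero, and the decrease threshold defaults to 20% of the previous reduced value, as in the example given above.

```python
def next_reduced_nyquist(prev_reduced_hz, frame_range_hz,
                         increase_threshold_hz=0.0,
                         decrease_fraction=0.2,
                         recent_ranges_hz=()):
    """Pick the reduced Nyquist frequency for the current frame.

    prev_reduced_hz  -- reduced Nyquist frequency of the previous frame
    frame_range_hz   -- highest frequency with content in the current frame
    recent_ranges_hz -- frequency ranges of a number of previous frames
                        (optional history used to damp downward switches)
    """
    # Increase whenever the current frame needs more bandwidth, so that
    # no spectral content is cut off.
    if frame_range_hz - prev_reduced_hz > increase_threshold_hz:
        return frame_range_hz

    # Decrease only if the current frame, and each frame in the history,
    # is below the previous reduced value by more than the margin.
    margin_hz = decrease_fraction * prev_reduced_hz
    ranges = [frame_range_hz, *recent_ranges_hz]
    if all(prev_reduced_hz - r > margin_hz for r in ranges):
        return frame_range_hz

    # Otherwise keep the previous value and avoid a re-sampling transition.
    return prev_reduced_hz
```

For instance, with a previous reduced value of 12 kHz, a frame whose range is 11 kHz keeps 12 kHz (the difference is within the 20% margin), whereas a frame whose range is 15 kHz raises the value to 15 kHz.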
  • Adapting the reduced value of the Nyquist frequency (and thereby the sub-sampling ratio) from frame to frame poses a challenge to transforms that rely on time domain samples from previous frames. This is, in particular, the case if
  • transformation of the digital audio signal from the first frequency domain to the intermediate time domain or from the intermediate time domain to the second frequency domain requires intermediate time domain samples of the digital audio signal from a previous frame, in addition to intermediate time domain samples of the digital audio signal from a current frame.
  • the change of the transform size results in a change of the sampling rate of the intermediate time domain samples that are decoded from the current frame. These do not match the sampling rate of intermediate time domain samples from previous frames that are still stored in the system, and which need to be combined with the intermediate time domain samples of the current frame for further joint processing. According to example embodiments, this problem is solved by re-sampling the time domain samples from the previous frame(s).
  • the method may comprise checking if the reduced value of the Nyquist frequency is different in the current frame and the previous frame so as to identify if the intermediate time domain samples of the digital audio signal in the current and the previous frame have different sampling rates, and if so, re-sampling of the intermediate time domain samples of the previous frame such that the intermediate time domain samples in the current frame and the previous frame have the same sampling rate.
  • Re-sampling only happens in the transition frame(s), i.e. for adjacent frames being associated with different reduced values of the Nyquist frequency (i.e., different sub-sampling ratios). The re-sampling is no longer necessary when the switch to the new reduced value of the Nyquist frequency has been completed.
  • Sub-sampled operation of the transforms may introduce a temporal delay in the system.
  • the output signal of the decoder at sub-sampled operation (when the Nyquist frequency has been reduced) may be delayed with respect to the output signal of the decoder when operating at the original sampling rate. This is undesirable, since, optimally, one would like the output signal of the decoder to be the same regardless of whether the transforms operate at the original sampling rate or at a reduced sampling rate (i.e., regardless of whether the Nyquist frequency has its original value or a reduced value). Otherwise, there may be audible artifacts.
  • the temporal delay is due to a temporal misalignment of filters (sometimes referred herein as windows) of a first bank of filters used to transform the digital audio signal from the first frequency domain to the intermediate time domain, and filters of a second bank of filters used to transform the digital audio signal from the intermediate time domain to the second frequency domain.
  • the re-sampling of the intermediate time domain samples of the previous frame may comprise compensating for this temporal delay. If no such compensation is carried out there may be audible artifacts in the audio output of the decoder.
  • the temporal delay may be compensated for by temporally shifting the time domain samples of the previous frame by a delay value when re-sampling.
  • the temporal delay which is compensated for in the re-sampling of the intermediate time domain samples of the previous frame is given by a value dfmct, i which depends on a ratio qi between the sub-sampling factors of the current frame and the previous frame, respectively, according to
  • the re-sampling of the intermediate time domain samples of the previous frame(s) may be carried out in different ways. If a re-sampling of high quality is desired, interpolation and finite impulse response (FIR) filtering followed by decimation may be used. An alternative is to re-sample the intermediate time domain samples of the previous frame using interpolation, such as linear or cubic spline interpolation. This results in a lower quality but has a very low computational complexity.
  • by quality is in this context meant that the output signal of the decoder at sub-sampled operation of the transforms is similar to the output signal of the decoder when the transforms operate at the original sampling rate.
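The low-complexity alternative, re-sampling the stored samples of the previous frame by linear interpolation, could look roughly like the sketch below. The function name and the `delay_s` parameter are hypothetical; the delay value itself would come from the expression referred to above, which is not reproduced here. No anti-aliasing filter is applied, which matches the lower-quality, low-complexity variant rather than the FIR-based one.

```python
def resample_linear(samples, old_rate_hz, new_rate_hz, delay_s=0.0):
    """Re-sample `samples` from old_rate_hz to new_rate_hz by linear interpolation.

    delay_s shifts the read position in time (hypothetical parameter) so that
    a known temporal misalignment can be compensated while re-sampling.
    """
    if not samples:
        return []
    duration_s = len(samples) / old_rate_hz
    n_out = int(round(duration_s * new_rate_hz))
    last = len(samples) - 1
    out = []
    for n in range(n_out):
        # Position of output sample n on the old sampling grid.
        pos = (n / new_rate_hz + delay_s) * old_rate_hz
        pos = min(max(pos, 0.0), float(last))  # clamp at the buffer edges
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
    return out
```

In a decoder following the scheme above, such a routine would only be invoked in transition frames, that is, when the previous and the current frame have different reduced values of the Nyquist frequency.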
  • the first frequency domain may be associated with a first bank of synthesis filters having a first, predetermined, length.
  • the second frequency domain is associated with a second bank of analysis filters having a second, predetermined, length.
  • the first filter bank is associated with a first transform size being equal to the number of filters in the first filter bank, which in turn corresponds to the number of frequency bands, or channels, of the corresponding transform.
  • the second filter bank is associated with a second transform size being equal to the number of filters in the second filter bank, which in turn corresponds to the number of frequency bands, or channels, of the corresponding transform.
  • the first filter bank and the second filter bank are intended to work at the original sampling rate.
  • the first and the second filter bank are designed to transform the digital audio signal from the first frequency domain to the second frequency domain via an intermediate time domain, wherein the sampling rate in the intermediate time domain is the original sampling rate.
  • the transform sizes and the predetermined length of the filters are in this way associated with the original sampling rate (and the original value of the Nyquist frequency) of the digital audio signal.
  • the sampling rate is reduced by the sub-sampling factor.
  • the first and second filter banks which are associated with the original sampling frequency may be taken as a starting point for providing transforms or filter banks which operate at reduced sampling rates.
  • the reduction of the Nyquist frequency by removal of spectral bands implies that the sizes, i.e., the number of spectral bands or frequency channels, of the first and second filter banks may be reduced by the sub-sampling factor. This is possible since the removed spectral bands may be omitted in the process of transforming the digital audio signal from the first frequency domain to the second frequency domain via an intermediate time domain.
  • the step of transforming the digital audio signal from the first frequency domain to a second frequency domain via an intermediate time domain may comprise: reducing the length of the synthesis filters of the first bank by the sub-sampling factor and using the synthesis filters of reduced length when transforming the digital audio signal from the first frequency domain to the intermediate time domain, and/or reducing the length of the analysis filters of the second bank by the sub-sampling factor and using the analysis filters of reduced length when transforming the digital audio signal from the intermediate time domain to the second frequency domain.
  • the synthesis and analysis filters of the first and the second bank may be adapted to the reduced sampling rate corresponding to the reduced value of the Nyquist frequency.
  • the first and the second bank may be modulated filter banks.
  • the first filter bank may be associated with a first prototype filter from which the synthesis filters of the first bank may be derived.
  • the second filter bank may be associated with a second prototype filter from which the analysis filters of the second bank may be derived.
  • the lengths of the synthesis filters and the analysis filters may be reduced by first reducing the length of the respective prototype filters, and then deriving synthesis and analysis filter from the prototype filters of reduced length.
  • the filters may be downsampled in order to reduce their length.
  • the length of the synthesis filters of the first bank may be reduced by downsampling by the downsampling factor or by re-calculating the synthesis filters from a closed form expression describing the synthesis filters of the first bank.
  • the length of the analysis filters of the second bank may be reduced by downsampling by the downsampling factor or by re-calculating the analysis filters from a closed form expression describing the analysis filters of the second bank.
  • the length of the prototype filters may be reduced by the downsampling factor by downsampling or by re-calculation from a closed form expression.
  • the downsampling of the synthesis filters of the first bank and/or the analysis filters of the second bank may comprise compensating for a temporal delay being due to a temporal misalignment of the synthesis filters of the first bank, and the analysis filters of the second filter bank, as described above.
  • This temporal misalignment leads to a mismatch between the sub-sampled grids of the first and the second bank relative to the original sampling grid to be compensated for.
  • the temporal delay may be compensated for by temporally shifting the synthesis or analysis filter (or their prototype), as applicable, by a delay value when downsampling.
  • the temporal delay may be compensated for after transforming the digital audio signal to the second frequency domain.
  • the method may comprise applying a phase-shift to the digital audio signal after the step of
  • phase-shift depends on a temporal delay being due to a temporal misalignment of the synthesis filters of the first bank, and the analysis filters of the second filter bank.
  • the synthesis filters in the first bank and/or the analysis filters in the second bank may be downsampled using linear or cubic spline interpolation.
  • the first frequency domain may be a modified discrete cosine transform (MDCT) domain.
  • the second frequency domain may be a quadrature mirror filter (QMF) domain.
  • the frequency range (or rather its upper limit), i.e. the bandwidth, of the digital audio signal is typically determined as the highest frequency having a non-zero spectral content in the spectrum of the digital audio signal as represented in the first frequency domain.
  • the method may further comprise receiving parameters relating to the digital audio signal, wherein the frequency range is further identified based on the parameters.
  • the parameters may relate to a frequency threshold above which spectral contents of the digital audio signal will be reconstructed based on spectral contents below the frequency threshold (e.g. using high frequency reconstruction techniques, such as spectral band replication).
  • the frequency range (or rather the upper limit of the frequency range) may then be set to the frequency threshold.
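Identifying the frequency range from the spectrum, optionally overridden by a signalled frequency threshold, can be sketched as follows. The function name, the assumption of uniformly spaced spectral bands, and the use of the upper band edge as the reported frequency are illustrative choices, not details confirmed by the disclosure.

```python
def identify_frequency_range(spectrum, nyquist_hz, reconstruction_threshold_hz=None):
    """Return the upper limit (in Hz) of the frequency range of one frame.

    spectrum -- coefficients of the frame in the first frequency domain,
                assumed to cover 0..nyquist_hz in uniformly spaced bands
    reconstruction_threshold_hz -- optional signalled threshold above which
                content is reconstructed by the decoder (hypothetical name)
    """
    if reconstruction_threshold_hz is not None:
        # The upper limit of the range may be set to the signalled threshold.
        return reconstruction_threshold_hz

    band_width_hz = nyquist_hz / len(spectrum)
    highest_band = 0  # number of bands up to and including the last non-zero one
    for k, coeff in enumerate(spectrum):
        if coeff != 0:
            highest_band = k + 1
    return highest_band * band_width_hz
```

For a frame with 8 bands and a Nyquist frequency of 24 kHz in which band 3 (counting from zero) is the last non-zero band, the sketch reports 12 kHz.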
  • the reduced value of the Nyquist frequency may be selected to be equal to the highest frequency of the identified frequency range.
  • the step of lowering the Nyquist frequency of the digital audio signal from its original value to the reduced value comprises removing all spectral bands of the digital audio signal above the identified frequency range.
  • only a limited set of sub-sampling factors (and thereby a limited set of reduced values of the Nyquist frequency) may be supported. This limited set of sub-sampling factors is typically designed such that the sub-sampling factors result in transform sizes which can be implemented efficiently (e.g. power-of-two size FFTs).
  • the step of lowering the Nyquist frequency of the digital audio signal may therefore comprise: selecting, from a predefined set of values, a reduced value of the Nyquist frequency as the lowest value in the predefined set being above the identified frequency range, and removing spectral bands of the digital audio signal above the selected reduced value of the Nyquist frequency.
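Selecting the reduced value from a predefined set and removing the bands above it can be sketched as below. The set of supported sub-sampling factors (1, 2, 4, 8, 16) is an assumption chosen for illustration, as are the function names; a value equal to the identified frequency range is accepted here, which is also an assumption.

```python
def select_reduced_nyquist(frequency_range_hz, original_nyquist_hz,
                           factors=(1, 2, 4, 8, 16)):
    """Lowest supported Nyquist value that still covers the frequency range."""
    candidates = sorted(original_nyquist_hz / f for f in factors)
    for candidate in candidates:
        if candidate >= frequency_range_hz:
            return candidate
    return original_nyquist_hz  # nothing fits: keep the original value


def remove_upper_bands(spectrum, original_nyquist_hz, reduced_nyquist_hz):
    """Drop the spectral bands above the reduced Nyquist frequency."""
    keep = round(len(spectrum) * reduced_nyquist_hz / original_nyquist_hz)
    return spectrum[:keep]


spectrum = [0.9, 0.4, 0.1, 0.0] + [0.0] * 12   # 16 hypothetical bands
reduced = select_reduced_nyquist(5000.0, 24000.0)
print(reduced, len(remove_upper_bands(spectrum, 24000.0, reduced)))
```

Restricting the choice to such a set keeps the resulting transform sizes at values that can be implemented efficiently, at the cost of retaining a few empty bands.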
  • the decision on if and how to lower the Nyquist frequency is made on a channel basis. Specifically, the steps of identifying a frequency range of the digital audio signal and lowering the Nyquist frequency are performed for each audio channel, thereby allowing different audio channels to have different reduced values of the Nyquist frequency in the same frame.
  • a computer program product comprising a (non-transitory) computer-readable medium having computer code instructions stored thereon for carrying out the method of any one of the preceding claims when executed by a device having processing capability.
  • an audio decoder for transforming a digital audio signal from a first frequency domain to a second frequency domain, comprising:
  • a receiving component configured to receive subsequent frames of a digital audio signal being represented in a first frequency domain, the digital audio signal having a Nyquist frequency which is half of an original sampling rate of the digital audio signal
  • a transformation component configured to, for each frame of the digital audio signal:
  • identify a frequency range of the digital audio signal by analyzing spectral contents of the digital audio signal, and, if the frequency range is below the Nyquist frequency by more than a threshold amount, lower the Nyquist frequency of the digital audio signal from its original value to a reduced value by removing spectral bands of the digital audio signal above the identified frequency range,
  • transform the digital audio signal from the first frequency domain to a second frequency domain via an intermediate time domain, wherein the digital audio signal has a sampling rate in the intermediate time domain which is reduced in relation to the original sampling rate by a sub-sampling factor defined by a ratio between the original value of the Nyquist frequency and the reduced value of the Nyquist frequency, and
  • the second and the third aspects may generally have the same features and advantages as the first aspect.
  • Fig. 1 schematically illustrates an audio decoder 100.
  • the audio decoder 100 comprises a receiving component 110, a first transformation component 120, a signal processing component 130, and a second transformation component 140.
  • When in use, the receiving component 110 receives an (encoded) digital audio signal 102.
  • the digital audio signal 102 is received in temporally subsequent frames.
  • the digital audio signal 102 as received at the receiving component 110 is
  • the original sampling rate is the inverse of the temporal distance between subsequent temporal samples of the digital audio signal 102.
  • the digital audio signal 102 may comprise different audio channels. It is to be understood that the methods described herein may be applied to each of the audio channels of the digital audio signal 102 separately or in any combinations. For example, some audio channels may be parametrically coded such that spectral contents are added to higher frequencies by parametric tools which operate in the second frequency domain. When such parametric tools are in use, the bandwidth of the audio channel as represented in the first frequency domain is typically limited to half of the Nyquist frequency or lower, which allows cutting the transform size by a factor of two or more. As another example, the low frequency effects (LFE) audio channel is band-limited to a few hundred Hz by definition allowing for even more aggressive sub-sampling by a factor of 8 or even 16. Different audio channels may thus have different bandwidth properties. By treating the audio channels separately, different audio channels may be subject to sub-sampling by different factors in order to achieve maximum reduction of computational complexity.
  • the digital audio signal 102 as received at the decoder 100 is typically not represented in the time domain, but rather in a frequency domain.
  • the digital audio signal 102 may at the encoder have been transformed to a first frequency domain by application of a filter bank of analysis filters, such as an MDCT or another filter bank found suitable for that purpose.
  • the digital audio signal 102 is represented in a first frequency domain, i.e., as a collection of frequency domain samples which describe the spectral contents of the digital audio signal 102 for different frequency bands.
  • the maximum frequency of the representation of the digital audio signal 102 in the first frequency domain is given by the Nyquist frequency which is half of the original sampling rate of the digital audio signal 102.
  • the digital audio signal 102 is then passed along to the first transformation component 120 which is configured to transform the digital audio signal 102 from the first frequency domain representation to a second frequency domain representation.
  • the reason for transforming from one frequency domain representation to another is that the different frequency domain representations may be associated with different advantages.
  • the first frequency domain representation may be preferred for encoding the wave-form of the digital audio signal 102 and sending it from the encoder to the decoder 100, while a second frequency domain
  • the second frequency domain may be a QMF domain.
  • the digital audio signal 102 is then passed along from the first transformation component 120 to the signal processing component 130, where various processing of the digital audio signal 102 is carried out in the second frequency domain.
  • the signal processing component 130 may carry out parametric
  • the resulting signal from the signal processing component 130 is then transformed from the second frequency domain to the time domain by the second transformation component 140 in order to produce an output signal 104 for subsequent playback.
  • the general structure of the audio decoder 100 is similar to that of prior art decoders. However, the audio decoder 100 differs from prior art decoders in the functionality of the first transformation component 120. In order to reduce
  • the first transformation component 120 implements a method which adaptively, that is, on a frame-by-frame basis, allows the size of the transforms (from first frequency domain to time domain, and from time domain to second frequency domain) to vary. This is achieved by adapting the Nyquist frequency in each frame to the bandwidth of the digital audio signal 102 in the frame by omitting (typically empty) spectral bands of the digital audio signal 102 above the bandwidth. From a time domain perspective, this corresponds to sub-sampling the digital audio signal 102 and the transforms on a frame-by-frame basis.
  • In step S02 of Fig. 2, the transformation component 120 receives, from the receiving component 110 of decoder 100, a frame of the digital audio signal 102 represented in the first frequency domain.
  • the first digital audio signal 102 is given in the form of an MDCT spectrum.
  • the receiving component 110 has in turn received the frame of the digital audio signal 102 from an encoder.
  • the transformation component 120 identifies a frequency range of the digital audio signal 102.
  • the frequency range is identified by analyzing spectral contents of the digital audio signal 102. This is further illustrated in Fig. 3a, which illustrates a frame of the digital audio signal 102 represented in the first frequency domain. The dashed bins correspond to spectral bands having non-zero spectral contents.
  • the transformation component 120 may typically determine the frequency range as the bandwidth B of the digital audio signal 102, i.e., as the highest frequency having a non-zero spectral content in the spectrum.
  • the frequency range is further determined on the basis of received parameters which relate to the digital audio signal 102.
  • the parameters may relate to a frequency threshold above which spectral contents of the digital audio signal will be reconstructed, by the signal processing component 130, based on spectral contents below the frequency threshold (e.g. using high frequency reconstruction techniques, such as spectral band replication).
  • the frequency range (or rather the upper limit of the frequency range) may be set to the frequency threshold.
  • the parameters may relate to a frequency threshold above which spectral contents of one audio channel of the digital audio signal 102 will be reconstructed, by the signal processing component 130, based on spectral contents from another audio channel of the digital audio signal. In such cases, the frequency range (or rather the upper limit of the frequency range) may be set to that frequency threshold.
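The identification of the frequency range described above can be sketched as follows. This is a minimal illustration, not the decoder's actual implementation: the function names, the uniform bin width and the rule of taking the smaller of the measured bandwidth and an optional reconstruction threshold are all assumptions made for the example.

```python
def highest_nonzero_bin(spectrum, eps=0.0):
    """Return the index of the highest bin whose magnitude exceeds eps, or -1 if none."""
    for k in range(len(spectrum) - 1, -1, -1):
        if abs(spectrum[k]) > eps:
            return k
    return -1


def frequency_range_hz(spectrum, nyquist_hz, threshold_hz=None):
    """Estimate the upper limit of the frequency range of one frame.

    spectrum     : spectral coefficients of the frame (e.g. an MDCT spectrum)
    nyquist_hz   : original Nyquist frequency f_N
    threshold_hz : hypothetical signalled threshold above which content is
                   reconstructed later (assumption: it caps the range)
    """
    bin_width = nyquist_hz / len(spectrum)        # assumes uniformly spaced bins
    k = highest_nonzero_bin(spectrum)
    bandwidth = (k + 1) * bin_width               # upper edge of the highest non-empty bin
    if threshold_hz is not None:
        bandwidth = min(bandwidth, threshold_hz)  # assumption, see lead-in
    return bandwidth


frame = [1.0, 0.5, 0.0, 0.2, 0.0, 0.0, 0.0, 0.0]
bandwidth = frequency_range_hz(frame, 24000.0)
```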
  • In step S06, the transformation component 120 checks whether the frequency range is below the Nyquist frequency f_N by more than a predefined amount.
  • the transformation component 120 may first transform the audio signal 102 from the first frequency domain representation to an intermediate time domain representation by using a first bank of synthesis filters, such as an inverse MDCT filter bank.
  • the first filter bank is associated with a first (predetermined) transform size corresponding to the number of filters in the bank (this is the number of frequency sub-bands or channels of the transform).
  • the filters (sometimes referred to as windows) of the first bank have a predetermined length.
  • the second filter bank is associated with a second (predetermined) transform size corresponding to the number of filters in the bank (this is the number of frequency sub-bands or channels of the transform).
  • the filters (sometimes referred to as windows) of the second bank have a predetermined length.
  • the first and the second filter banks and the filters therein are thus intended to operate at the original sampling frequency.
  • the first bank may correspond to an MDCT transform of size 2048 with a filter length of 4096
  • the second bank may correspond to a QMF bank of size 64 with a filter length of 640.
  • the first and the second filter banks are modulated filter banks.
  • a modulated filter bank has a prototype filter from which the filters in the filter bank may be derived.
  • In step S14, the transformation component 120 returns to step S02 where a subsequent frame of the digital audio signal is received.
  • If it is instead found in step S06 that the frequency range is below the Nyquist frequency f_N by more than a predefined amount, the transformation component 120 proceeds to step S08.
  • the transformation component 120 sets a reduced value f_N,red of the Nyquist frequency.
  • the reduced value of the Nyquist frequency should be equal to, or above, the highest frequency in the frequency range.
  • the reduced value of the Nyquist frequency may be selected to be equal to the highest frequency of the identified frequency range, which in the example of Fig. 3a is the bandwidth B.
  • the limited set of reduced values is, for example, given in terms of the original Nyquist frequency divided by a set of sub-sampling factors.
  • the set of sub-sampling factors may comprise the sub-sampling factors 1, 4/3, 2, 4, 8 and 16.
  • the transformation component 120 may therefore select the largest possible sub-sampling factor from the set of sub-sampling factors which still gives a reduced value of the Nyquist frequency being above the identified frequency range of the digital audio signal 102.
  • the transformation component 120 may select the lowest value of the limited set of reduced values of the Nyquist frequency which exceeds the identified frequency range of the digital audio signal 102.
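The selection rule above can be sketched as follows, assuming the example factor set 1, 4/3, 2, 4, 8 and 16. The function name is hypothetical, and exact rational arithmetic is used only so that the factor 4/3 is represented without rounding.

```python
from fractions import Fraction

# Example set of sub-sampling factors taken from the text above.
SUB_SAMPLING_FACTORS = [Fraction(1), Fraction(4, 3), Fraction(2),
                        Fraction(4), Fraction(8), Fraction(16)]


def select_sub_sampling_factor(frequency_range_hz, nyquist_hz,
                               factors=SUB_SAMPLING_FACTORS):
    """Pick the largest factor q such that nyquist_hz / q still covers the range."""
    best = Fraction(1)  # factor 1 means "no sub-sampling"
    for q in factors:
        reduced_nyquist = Fraction(nyquist_hz) / q
        if reduced_nyquist >= frequency_range_hz and q > best:
            best = q
    return best


q = select_sub_sampling_factor(5000, 24000)  # 24000/4 = 6000 covers 5000, 24000/8 does not
reduced_nyquist_hz = Fraction(24000) / q
```

Choosing the largest admissible factor is the same as choosing the lowest reduced Nyquist value from the limited set that is not below the identified frequency range.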
  • the transformation component 120 may lower the value of the Nyquist frequency from its original value f_N to the reduced value f_N,red by removing spectral bands of the digital audio signal 102 above the identified frequency range. This is further illustrated in Fig. 3b, where spectral bands above the frequency range are removed such that the highest frequency in the spectrum becomes the reduced value f_N,red of the Nyquist frequency. From a time domain perspective, this
  • the transformation proceeds to transform the digital audio signal 102 from the first frequency domain (which e.g. is a MDCT domain) to a second frequency domain (which e.g. is a QMF domain) via an intermediate time domain.
  • Fig. 3c illustrates the digital audio signal 102 represented in a second (sub-sampled) frequency domain. Since the Nyquist frequency has been lowered, the transformation component 120 may work with reduced transform sizes. In particular, the transform sizes may be reduced by the sub-sampling factor compared to operation at the original sampling rate. In this way, the computational complexity is reduced.
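As a rough illustration of how the transform sizes shrink with the sub-sampling factor, the sketch below truncates a spectrum and computes reduced sizes. The sizes 2048 and 64 are the example values given earlier for the MDCT and QMF banks; the function names are hypothetical, and the sketch assumes the factor divides the full size exactly.

```python
from fractions import Fraction


def reduced_size(full_size, q):
    """Return full_size / q, requiring the result to be an integer.

    q should be an int or a Fraction (e.g. Fraction(4, 3)).
    """
    size = Fraction(full_size) / Fraction(q)
    if size.denominator != 1:
        raise ValueError("sub-sampling factor does not divide the transform size")
    return int(size)


def truncate_spectrum(spectrum, q):
    """Drop the spectral bands above the reduced Nyquist frequency."""
    keep = reduced_size(len(spectrum), q)
    return list(spectrum[:keep])


mdct_size = reduced_size(2048, Fraction(4, 3))  # 1536 coefficients per frame
qmf_size = reduced_size(64, 4)                  # 16 QMF channels
```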
  • the transformation component 120 may use a first filter bank of reduced transform size for transformation from the first frequency domain to the intermediate time domain, and a second filter bank of reduced transform size for transformation from the intermediate time domain to the second frequency domain.
  • the transformation component 120 may calculate and store filter banks intended to operate at different sampling rates, i.e. at different values of the sub-sampling factors. These filter banks may be re-used each time the different sub-sampling factors are selected. In this way computational complexity may be reduced.
  • the transformation component 120 only supports a limited set of sub-sampling factors. In this way the computational effort for calculating filters or transform windows of different sizes is minimized or completely eliminated by having pre-stored filter coefficients or windows in non-volatile memory.
  • the transformation component 120 may take the first and the second filter banks operating at the original sampling rate as a starting point.
  • the transform size needs to be reduced, meaning that the number of synthesis filters in the first filter bank of full size is reduced by the sub-sampling factor, and that the number of analysis filters in the second filter bank of full size is reduced by the sub-sampling factor.
  • the transform size reduction is achieved by removing filters from the first and second filter banks which correspond to spectral bands that were removed from the digital audio signal 102 in step S08.
  • the transformation component 120 may therefore reduce the length of the synthesis filters of the first bank, and the length of the analysis filters of the second bank by the sub-sampling factor.
  • these closed-form expressions may be used to re-calculate filters of reduced length.
  • the length of the filters may be reduced by downsampling by the sub-sampling factor.
  • the filters may be downsampled using interpolation, such as linear interpolation or cubic spline interpolation.
  • the derivation of first and second filter banks corresponding to a sub-sampling factor is facilitated when modulated filter banks are used.
  • the prototype filters of the first and the second filter banks of full size, respectively may, after modification, be used to derive corresponding first and second filter banks for sub-sampled operation.
  • the transformation component 120 may first reduce the length of the synthesis prototype filter of the first filter bank of full size by the sub-sampling factor by either downsampling by the sub-sampling factor or by recalculating a synthesis prototype filter of reduced length from a closed form
  • the synthesis prototype filter of reduced length may be used to derive the first filter bank of reduced transform size corresponding to the sub-sampling factor.
  • the sub-sampled operation of the transforms may introduce a temporal delay.
  • the first frequency domain representation is a MDCT and the second frequency domain representation is a QMF
  • Fig. 4a indicates the location of sample points relative to the MDCT window at the original sampling rate.
  • Fig. 4b shows the corresponding situation for the QMF window.
  • this represents an example of the relative timing scenario for the full band applications of MDCT synthesis followed by QMF analysis. It is desirable that the sub-sampled operation conforms to the same relative timing.
  • Fig. 4c indicates the location of the sample points relative to the MDCT window at the reduced sampling rate (as reduced by the sub-sampling factor of 2).
  • the optimal continuous time position of the QMF analysis window is unchanged and depicted by the dashed window shape in Fig. 4d. But, as the available
  • N is the length of the original prototype filter f,
  • q_2 is the sub-sampling factor, and
  • $m = \lfloor n \, q_2 + d_{\mathrm{fract},2} \rfloor$ is an integer ($\lfloor \cdot \rfloor$ is the floor operator, i.e. rounding down to the nearest integer).
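A prototype filter can be downsampled by linear interpolation roughly as sketched below. The interpolation formula itself is not preserved in the text above, so the form used here, a weighted sum of the two neighbouring taps f(m) and f(m+1) with weight equal to the fractional part of n·q_2 + d_fract, is an assumption, as are the function name and the treatment of taps outside the filter as zero. Any gain re-scaling of the shortened filter is left out.

```python
import math
from fractions import Fraction


def downsample_prototype(f, q2, d_fract=0.0):
    """Downsample prototype filter f by factor q2 using linear interpolation.

    f       : taps of the original prototype filter (length N)
    q2      : sub-sampling factor (int or Fraction, e.g. Fraction(4, 3))
    d_fract : fractional delay offset in original-rate samples (hypothetical)
    """
    n_out = int(len(f) / q2)           # reduced filter length N / q2
    out = []
    for n in range(n_out):
        x = n * q2 + d_fract           # position on the original sample grid
        m = math.floor(x)              # integer part, m = floor(n*q2 + d_fract)
        mu = float(x - m)              # fractional part used as interpolation weight
        a = f[m] if 0 <= m < len(f) else 0.0
        b = f[m + 1] if 0 <= m + 1 < len(f) else 0.0
        out.append((1.0 - mu) * a + mu * b)
    return out


ramp = [0, 1, 2, 3, 4, 5, 6, 7]
half = downsample_prototype(ramp, 2)                   # samples taken at 0, 2, 4, 6
three_quarters = downsample_prototype(ramp, Fraction(4, 3))
```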
  • Adaptation of the reduced Nyquist frequency (or equivalently, the sub-sampling ratio) from frame to frame poses a challenge to transforms that rely on time domain samples from previous frames. This is for instance the case for the MDCT transform and the QMF bank which may be used as the frequency domain
  • the transformation component 120 may re-sample the time domain samples from the previous frame(s). In more detail, the transformation component 120 may keep track of the, possibly reduced, value of the Nyquist frequency used in each frame. In particular, the transformation component 120 may check whether the value of the Nyquist frequency (the reduced value or the original value of the Nyquist frequency depending on whether or not a reduction has taken place in the frame) of the current frame and the previous frame are different. In this way, the transformation component 120 may identify if the current and the previous frame have different sampling rates.
  • the transformation component 120 may, in an analogous fashion, check if the value of the Nyquist frequency is different in the current frame and in any of the plurality of previous frames. If the transformation component 120 finds that the current and the previous frame (or any of a plurality of previous frames) have different values of the Nyquist frequency, it may proceed to re-sample the intermediate time domain samples of the previous frame (or of those previous frames which have a different value of the Nyquist frequency). The re-sampling is carried out such that the intermediate time domain samples of the current frame and the previous frame(s) have the same sampling rate.
  • This re-sampling may be achieved in different ways. For example, in order to have a re-sampling of high quality, traditional re-sampling using interpolation followed by low-pass filtering by a finite impulse response (FIR) filter, which in turn is followed by decimation, may be used. This is possible as long as the re-sampling concerns resampling by a rational factor (which is usually the case if the sub-sampling factors of the system are restricted to a limited set of integers or rational numbers as
  • the transformation component 120 may first interpolate by a factor of /, followed by FIR-filtering, and then decimate by a factor of / .
  • linear or cubic spline interpolation without subsequent filtering may be used. This may result in a lower quality (e.g. there may be problems with aliasing), but has the advantage of a very low computational complexity.
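The low-complexity option, re-sampling the previous frame's time domain samples by linear interpolation whenever the Nyquist value changes, might look like the sketch below. The function names and the clamping at the end of the buffer are assumptions for illustration; a higher-quality variant would instead interpolate, FIR-filter and decimate.

```python
def resample_linear(samples, ratio):
    """Re-sample by linear interpolation; ratio = new_rate / old_rate."""
    n_out = int(round(len(samples) * ratio))
    last = len(samples) - 1
    out = []
    for n in range(n_out):
        x = n / ratio                  # position on the old sample grid
        m = int(x)
        mu = x - m
        a = samples[min(m, last)]
        b = samples[min(m + 1, last)]  # clamp at the buffer end (assumption)
        out.append((1 - mu) * a + mu * b)
    return out


def align_history(history, prev_nyquist, cur_nyquist):
    """Bring the previous frame's samples to the current frame's sampling rate."""
    if prev_nyquist == cur_nyquist:
        return list(history)           # same rate, nothing to do
    return resample_linear(history, cur_nyquist / prev_nyquist)


aligned = align_history([0, 2, 4, 6], 24000, 12000)  # halve the sampling rate
```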
  • the temporal delay between the intermediate time domain samples of the current frame in relation to the intermediate time domain samples of the previous frame(s) is related to the ratio q_1 between the sub-sampling factors of the current frame and the previous frame.
  • the transformation component 120 may in step S12 proceed to restore the Nyquist frequency from its reduced value to the original value in the frame. This may be achieved by appending (empty) spectral bands to the digital audio signal in the second frequency domain above the reduced value of the Nyquist frequency f_N,red. This is further illustrated in Fig. 3d, where the empty spectral bands have been added to the frequency representation of the digital audio signal 102 in the second frequency domain such that the highest frequency represented is again given by the original value of the Nyquist frequency f_N.
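Restoring the original Nyquist frequency in step S12 amounts to zero-padding the band axis. The sketch below assumes one time slot of complex QMF samples stored as a Python list; the function name is hypothetical.

```python
def restore_nyquist(qmf_slot, full_num_bands):
    """Append empty (zero) bands so that the slot again has full_num_bands bands."""
    missing = full_num_bands - len(qmf_slot)
    if missing < 0:
        raise ValueError("slot already has more bands than the full-size bank")
    return list(qmf_slot) + [0j] * missing


# A slot from a 16-channel (factor 4) analysis bank restored to 64 bands.
slot = [complex(k, -k) for k in range(16)]
full = restore_nyquist(slot, 64)
```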
  • the transformation component 120 may take a decision to switch the value of the reduced Nyquist frequency when going from the previous frame to the current frame. This decision may be taken only on the basis of the spectral contents of the current frame. However, that may result in a jumping behavior of the reduced value of the Nyquist frequency, i.e., it may tend to change value very often.
  • the transformation component 120 may, when setting the reduced value of the Nyquist frequency of the current frame, in step S08, also take into account the reduced value of the Nyquist frequency of the previous frame in relation to the frequency range of the current frame. This is further illustrated in Figs 5 and 6.
  • Fig. 5 illustrates seven consecutive frames 501a, 501b, 501c, 501d, 501e, 501f, 501g.
  • Each frame 501a-g has a frequency range 502a-g (the dashed pattern of the frequency scale indicates non-zero spectral bands).
  • Frame 501a is associated with a reduced value of the Nyquist frequency 503a (labeled f_N,red).
  • the frequency range 502b of frame 501b is compared to the reduced value of the Nyquist frequency f_N,red of the previous frame 501a.
  • the frequency range 502b exceeds the reduced value of the Nyquist frequency 503a of the previous frame 501a by more than a threshold amount T1.
  • the reduced value of the Nyquist frequency 503b of frame 501b is set to be larger than the reduced value of the Nyquist frequency 503a of frame 501a.
  • the reduced value of the Nyquist frequency 503b is set to a value above the frequency range 502b of frame 501b.
  • When the transformation component 120 receives the subsequent frame 501c, it compares the frequency range 502c of frame 501c to the reduced value of the Nyquist frequency 503b of frame 501b. In this example, it will find that the frequency range 502c differs from the reduced value of the Nyquist frequency 503b by no more than a threshold amount T2. It will therefore decide to keep the reduced value of the Nyquist frequency 503b of frame 501b also in frame 501c.
  • the threshold amount T2 is typically larger than the threshold amount T1, meaning that the transformation component 120 is more prone to increase the reduced value of the Nyquist frequency (in order to avoid aliasing and a truncated bandwidth) than to decrease the reduced value of the Nyquist frequency (which may be beneficial for reducing computational complexity).
  • Upon receiving the next frame, frame 501d, the transformation component 120 compares the frequency range 502d to the reduced value of the Nyquist frequency 503b. It will then find that the frequency range 502d is below the reduced value of the Nyquist frequency 503b by more than the threshold amount T2, meaning that it could be beneficial to switch to a lower reduced value of the Nyquist frequency.
  • the transformation component 120 would therefore switch to a lower reduced value of the Nyquist frequency in frame 501d.
  • the transformation component 120 will also take the frequency range of a number of previous frames into account when setting the reduced value of the Nyquist frequency in frame 501d.
  • the transformation component 120 takes the frequency range of three preceding frames into account when setting the reduced value of the Nyquist frequency.
  • the number of previous frames is a parameter which may be predefined in or input to the system. The number of previous frames may typically be in the range 2-6 frames.
  • the transformation component 120 will check whether each of the frequency ranges 502c, 502b, 502a of the preceding frames 501c, 501b, 501a is below the reduced value of the Nyquist frequency 503b by more than the threshold amount T2. Since this is not satisfied in the present example, the transformation component 120 decides to keep the reduced value of the Nyquist frequency 503b also in frame 501d.
  • the transformation component 120 then repeats this procedure for frames 501e and 501f with the same outcome as for frame 501d, and the reduced value of the Nyquist frequency 503b is kept also in frames 501e and 501f.
  • the transformation component 120 will find that the frequency range 502g of frame 501g is below the reduced value of the Nyquist frequency 503b by more than the threshold amount T2, and, in addition, that each of the frequency ranges 502f, 502e, 502d of the three preceding frames 501f, 501e, 501d is also below the reduced value of the Nyquist frequency 503b by more than the threshold amount T2.
  • the transformation component 120 decides to switch to a new, lower, reduced value of the Nyquist frequency 503c. In this way, one may avoid switching the reduced value of the Nyquist frequency too often. For example, otherwise the reduced value of the Nyquist frequency would first have been decreased in frame 501d and then increased again in the following frame 501e.
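The decision logic of Fig. 5 can be summarized in a short sketch. The thresholds T1 and T2 and the three-frame look-back come from the description; the function name, the candidate list and the choice of the lowest candidate covering the current frequency range as the new value are assumptions for illustration.

```python
def choose_reduced_nyquist(cur_range, history_ranges, prev_nyquist,
                           candidates, t1, t2, num_prev=3):
    """Return the reduced Nyquist value for the current frame (Fig. 5 style).

    cur_range      : frequency range (upper limit) of the current frame
    history_ranges : frequency ranges of earlier frames, oldest first
    prev_nyquist   : reduced Nyquist value used in the previous frame
    candidates     : allowed Nyquist values (hypothetical limited set)
    """
    def lowest_candidate_covering(freq):
        fits = [c for c in candidates if c >= freq]
        return min(fits) if fits else max(candidates)

    # Increase readily: the range exceeds the previous value by more than T1.
    if cur_range - prev_nyquist > t1:
        return lowest_candidate_covering(cur_range)

    # Decrease only if the current frame AND the last num_prev frames are all
    # below the previous value by more than T2.
    if prev_nyquist - cur_range > t2:
        recent = history_ranges[-num_prev:]
        if len(recent) == num_prev and all(prev_nyquist - r > t2 for r in recent):
            return lowest_candidate_covering(cur_range)

    return prev_nyquist  # otherwise keep the previous value


allowed = [24000, 18000, 12000, 6000, 3000, 1500]
kept = choose_reduced_nyquist(5000, [11000, 11500, 11000], 12000, allowed, 0, 2000)
lowered = choose_reduced_nyquist(5000, [5000, 5500, 4000], 12000, allowed, 0, 2000)
```

In the first call the preceding frames still used most of the bandwidth, so the value is kept; in the second all four frames are well below it, so the value is lowered.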
  • Fig. 6 illustrates a variant which may be used as an alternative to, or in addition to, the embodiment of Fig. 5.
  • the embodiment of Fig. 6 differs from the embodiment of Fig. 5 in that the transformation component 120 uses another decision criterion when switching to a lower reduced value of the Nyquist frequency.
  • the processing of frames 501a, 501b, and 501c in the embodiments of Figs 5 and 6 is thus the same. However, this is not the case for frames 501d, 501e, 501f, and 501g.
  • Upon receiving frame 501d, the transformation component 120 finds that the frequency range 502d is below the reduced value of the Nyquist frequency 503b of the previous frame by more than the threshold amount T2. However, before deciding to switch to another, lower, reduced value of the Nyquist frequency, the transformation component will look at the frequency ranges of a number of preceding frames (in this case three preceding frames).
  • the transformation component 120 checks whether each of the frequency ranges 502c, 502b, 502a of the three preceding frames differs from the frequency range 502d of the current frame 501d by no more than a threshold amount T3 (which is typically smaller than T2). In the illustrated example, this is not the case, and the transformation component 120 therefore decides to keep the reduced value of the Nyquist frequency 503b of the previous frame 501c.
  • the transformation component 120 repeats these checks also for subsequent frames 501e and 501f with the same outcome, namely that the reduced value of the Nyquist frequency 503b is kept also in frames 501e and 501f.
  • the transformation component 120 will come to another conclusion. Firstly, it will find that the frequency range 502g is below the reduced value of the Nyquist frequency 503b by more than the threshold amount T2.
  • secondly, each of the frequency ranges 502f, 502e, 502d of the three preceding frames 501f, 501e, 501d differs from the frequency range 502g of the current frame 501g by no more than the threshold amount T3.
  • the transformation component 120 takes a decision to switch to a new, lower, reduced value of the Nyquist frequency 503c.
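The alternative criterion of Fig. 6 replaces the look-back test with a stability test on the frequency range itself. A minimal sketch, with a hypothetical function name and the additional assumption that at least one preceding frame must be available:

```python
def may_lower_fig6(cur_range, prev_ranges, prev_nyquist, t2, t3):
    """Return True if the Fig. 6 style criterion allows lowering the Nyquist value.

    Lowering is allowed when the current range is below the previous Nyquist
    value by more than T2 and each preceding range is within T3 of the
    current range, i.e. the bandwidth has been stable.
    """
    below = prev_nyquist - cur_range > t2
    stable = bool(prev_ranges) and all(abs(cur_range - r) <= t3 for r in prev_ranges)
    return below and stable


stable_case = may_lower_fig6(5000, [5200, 4900, 5100], 12000, 2000, 500)
jumpy_case = may_lower_fig6(5000, [11000, 5200, 4900], 12000, 2000, 500)
```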
  • Fig. 7 shows a timing and buffer view when switching from subsampling factor 1 (no subsampling) to sub-sampling by a factor 4 and then up to 4/3.
  • the height of the bars at the bottom of the figure indicates the amount of subsampling and hence the bandwidth of the subsampled system. Note that this example does not include the step of appending extra (empty) QMF bands above the current Nyquist frequency in order to restore the original bandwidth.
  • the downsampling of the windows and time domain (PCM) buffers is represented by dotted lines (with lower "dot-pitch" for higher degree of subsampling). They all represent the same absolute duration in time; only the sample rate and hence the bandwidth are different.
  • the history buffer of the QMF qmfBuffer (N-L samples), and the IMDCT overlap-add buffer mdctBuffer, are downsampled by a factor 4.
  • the result is stored in the dashed blocks and used by the IMDCT overlap-add process and the analysis QMF (M/4 channels) in frame n+1.
  • the transforms may run on the new subsampled rate until there is a need to increase the bandwidth in frame n+4.
  • the time domain buffers from frame n+3 (dashed blocks on the right) are upsampled by a factor 3.
  • the result is stored in the dotted blocks and is used in the IMDCT overlap-add process and in the analysis QMF bank using a 3/4-size filter bank in frame n+4. Again, the resulting QMF samples are shown as dotted bars at the bottom of the figure.
  • the re-sampling of the buffers can be made in one step since they are contiguous.
  • a re-sampling of high quality can be done by traditional re-sampling involving interpolation and FIR-filtering, followed by decimation.
  • An alternative is to use linear or higher order interpolation resulting in less quality of the re-sampling but having a very low computational complexity.
  • the buffers are re-sampled using linear interpolation.
  • the concatenated buffer h is subsequently interpolated as:
  • $\mathrm{mdctBuffer}(n) = h\left(n + (N - L)/q_1\right), \quad 0 \le n < \mathrm{frameLength}/q_1$
  • the systems and methods disclosed hereinabove may be implemented as software, firmware, hardware or a combination thereof.
  • the "components" referred to herein may be implemented as circuitry.
  • the division of tasks between functional units referred to in the above description does not necessarily correspond to the division into physical units; to the contrary, one physical component may have multiple functionalities, and one task may be carried out by several physical components in cooperation.
  • Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit.
  • Such software may be distributed on computer readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media).
  • Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Further, it is well known to the skilled person that communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
  • EEEs enumerated example embodiments
  • a method in an audio decoder for transforming a digital audio signal from a first frequency domain to a second frequency domain comprising:
  • the frequency range is below the Nyquist frequency by more than a threshold amount, lowering the Nyquist frequency of the digital audio signal from its original value to a reduced value by removing spectral bands of the digital audio signal above the identified frequency range,
  • the digital audio signal has a sampling rate in the intermediate time domain which is reduced in relation to the original sampling rate by a sub-sampling factor defined by a ratio between the original value of the Nyquist frequency and the reduced value of the Nyquist frequency, and
  • EEE 2 The method of EEE 1, wherein the reduced value of the Nyquist frequency of a current frame is set depending on the reduced value of the Nyquist frequency of a previous frame in relation to the frequency range of the current frame.
  • EEE 3 The method of EEE 2, wherein the reduced value of the Nyquist frequency of the current frame is set to be larger than the reduced value of the Nyquist frequency of the previous frame if the frequency range of the current frame exceeds the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount.
  • EEE 4 The method of EEE 2 or 3, wherein the reduced value of the Nyquist frequency of the current frame is set to be equal to the reduced value of the Nyquist frequency of the previous frame if a highest frequency of the frequency range of the current frame differs from the reduced value of the Nyquist frequency of the previous frame by no more than a threshold amount.
  • EEE 5 The method of any one of EEEs 2-4, wherein the reduced value of the Nyquist frequency of the current frame is set to be lower than the reduced value of the Nyquist frequency of the previous frame if the frequency range of the current frame is below the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount.
  • EEE 6 The method of any one of EEEs 2-5, wherein the reduced value of the Nyquist frequency of the current frame is further set depending on the frequency range of a predefined number of previous frames.
  • EEE 7 The method of EEE 6, wherein the reduced value of the Nyquist frequency of the current frame is set to be lower than the reduced value of the Nyquist frequency of the previous frame if, additionally, the absolute values of the differences between the frequency range of the current frame and each of a predefined number of previous frames are each no more than a threshold amount.
  • EEE 8 The method of EEE 6, wherein the reduced value of the Nyquist frequency of the current frame is set to be lower than the reduced value of the Nyquist frequency of the previous frame if, additionally, the frequency range of each of a predefined number of previous frames is below the reduced value of the Nyquist frequency of the previous frame by more than a threshold amount.
  • EEE 9 The method of any one of the preceding EEEs, wherein transformation of the digital audio signal from the first frequency domain to the intermediate time domain or from the intermediate time domain to the second frequency domain requires intermediate time domain samples of the digital audio signal from a previous frame, in addition to intermediate time domain samples of the digital audio signal from a current frame, the method further comprising:
  • EEE 10 The method of EEE 9, wherein the re-sampling comprises
  • EEE 12 The method of any one of EEEs 9-11, wherein the intermediate time domain samples of the previous frame are re-sampled using interpolation, such as linear or cubic spline interpolation.
  • EEE 13 The method of any one of EEEs 9-11, wherein the intermediate time domain samples of the previous frame are re-sampled using interpolation and FIR-filtering followed by decimation.
  • EEE 14 The method of any one of the preceding EEEs, wherein
  • the first frequency domain is associated with a first bank of synthesis filters having a first, predetermined, length,
  • the second frequency domain is associated with a second bank of analysis filters having a second, predetermined, length, and
  • the step of transforming the digital audio signal from the first frequency domain to a second frequency domain via an intermediate time domain comprises:
  • EEE 15 The method of EEE 14, wherein the length of the synthesis filters of the first bank is reduced by downsampling by the sub-sampling factor or by recalculating the synthesis filters from a closed form expression describing the synthesis filters of the first bank.
  • EEE 16 The method of EEE 14 or 15, wherein the length of the analysis filters of the second bank is reduced by downsampling by the sub-sampling factor or by recalculating the analysis filters from a closed form expression describing the analysis filters of the second bank.
  • EEE 17 The method of EEE 15 or 16, wherein the downsampling of the synthesis filters of the first bank and/or the analysis filters of the second bank comprises compensating for a temporal delay being due to a temporal misalignment of the synthesis filters of the first bank, and the analysis filters of the second filter bank.
  • EEE 18 The method of any one of EEEs 14-16, further comprising: applying a phase-shift to the digital audio signal after the step of transforming the digital audio signal from the first frequency domain to a second frequency domain via an intermediate time domain, wherein the phase-shift depends on a temporal delay being due to a temporal misalignment of the synthesis filters of the first bank, and the analysis filters of the second filter bank.
  • EEE 20 The method of any one of EEEs 15-19, wherein the synthesis filters in the first bank and/or the analysis filters in the second bank are downsampled using linear or cubic spline interpolation.
  • EEE 21 The method of any one of the preceding EEEs, wherein the first frequency domain is a modified discrete cosine transform (MDCT) domain, and the second frequency domain is a quadrature mirror filter (QMF) domain.
  • EEE 22 The method of any one of the preceding EEEs, further comprising receiving parameters relating to the digital audio signal, wherein the frequency range is further identified based on the parameters.
  • EEE 23 The method of any one of the preceding EEEs, wherein the step of lowering the Nyquist frequency of the digital audio signal further comprises:
  • EEE 24 The method of any one of the preceding EEEs, wherein the digital audio signal has a plurality of audio channels, and wherein the steps of identifying a frequency range of the digital audio signal and lowering the Nyquist frequency are performed for each audio channel, thereby allowing different audio channels to have different reduced values of the Nyquist frequency in the same frame.
  • EEE 25 A computer program product comprising a computer-readable medium having computer code instructions stored thereon for carrying out the method of any one of the preceding EEEs when executed by a device having processing capability.
  • An audio decoder for transforming a digital audio signal from a first frequency domain to a second frequency domain comprising:
  • a receiving component configured to receive subsequent frames of a digital audio signal being represented in a first frequency domain, the digital audio signal having a Nyquist frequency which is half of an original sampling rate of the digital audio signal
  • a transformation component configured to, for each frame of the digital audio signal:
  • if the frequency range is below the Nyquist frequency by more than a threshold amount, lower the Nyquist frequency of the digital audio signal from its original value to a reduced value by removing spectral bands of the digital audio signal above the identified frequency range,
  • transform the digital audio signal from the first frequency domain to a second frequency domain via an intermediate time domain, wherein the digital audio signal has a sampling rate in the intermediate time domain which is reduced in relation to the original sampling rate by a sub-sampling factor defined by a ratio between the original value of the Nyquist frequency and the reduced value of the Nyquist frequency, and
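
The per-frame steps recited above (identify the occupied frequency range, compare it with the Nyquist frequency, remove the upper spectral bands, and derive the sub-sampling factor) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `lower_nyquist`, the restriction to integer factors 2 and 4, the rule "highest non-zero coefficient marks the top of the range", and the use of NumPy are all assumptions made for illustration. Any rescaling needed by a shorter inverse transform is not covered.

```python
import numpy as np

def lower_nyquist(frame, sample_rate, threshold_hz, factors=(4, 2)):
    """Return (coefficients, sub_sampling_factor) for one frame.

    frame: 1-D array of spectral coefficients (e.g. MDCT bins) covering
    0 Hz up to the original Nyquist frequency. `factors` is a hypothetical
    set of allowed integer sub-sampling factors.
    """
    frame = np.asarray(frame, dtype=float)
    n_bins = frame.size
    nyquist = sample_rate / 2.0          # original Nyquist frequency
    bin_width = nyquist / n_bins

    # Identify the frequency range: upper edge of the highest non-zero bin
    # (an assumed criterion; the source only says the range is identified).
    nonzero = np.flatnonzero(frame)
    top_hz = (nonzero[-1] + 1) * bin_width if nonzero.size else 0.0

    # Keep the original Nyquist frequency unless the range is below it
    # by more than the threshold amount.
    if nyquist - top_hz <= threshold_hz:
        return frame, 1

    # Try the largest factor first; the reduced Nyquist frequency
    # (nyquist / k) must still cover the identified range.
    for k in sorted(factors, reverse=True):
        if n_bins % k == 0 and top_hz <= nyquist / k:
            # Remove the spectral bands above the reduced Nyquist frequency.
            return frame[: n_bins // k], k
    return frame, 1
```

With a 48 kHz signal and 1024 bins per frame, a frame whose content stops at bin 200 (about 4.7 kHz) would be truncated to 256 bins under these assumptions. The sub-sampling factor is then 24000 / 6000 = 4, so the intermediate time-domain signal would run at 12 kHz instead of 48 kHz.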

PCT/EP2017/065011 2016-06-22 2017-06-20 Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain WO2017220528A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US16/307,624 US10770082B2 (en) 2016-06-22 2017-06-20 Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain
EP17730205.6A EP3475944B1 (en) 2016-06-22 2017-06-20 Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain
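JP2018567177A JP6976277B2 (ja) 2016-06-22 2017-06-20 Audio decoder and method for transforming a digital audio signal from a first frequency domain to a second frequency domain
CN201780038374.4A CN109328382B (zh) 2016-06-22 2017-06-20 Audio decoder and method for transforming a digital audio signal from a first frequency domain to a second frequency domain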

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662353241P 2016-06-22 2016-06-22
EP16175715 2016-06-22
US62/353,241 2016-06-22
EP16175715.8 2016-06-22

Publications (1)

Publication Number Publication Date
WO2017220528A1 (en) 2017-12-28

Family

ID=56148309

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/065011 WO2017220528A1 (en) 2016-06-22 2017-06-20 Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain

Country Status (2)

Country Link
CN (1) CN109328382B (zh)
WO (1) WO2017220528A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781445A (zh) * 2019-10-11 2020-02-11 清华大学 (Tsinghua University) Incremental frequency-domain transform system and method for time-domain streaming data

Citations (2)

Publication number Priority date Publication date Assignee Title
WO2013124443A1 (en) * 2012-02-24 2013-08-29 Dolby International Ab Low delay real-to-complex conversion in overlapping filter banks for partially complex processing
US20160035329A1 (en) 2009-05-27 2016-02-04 Dolby International Ab Efficient Combined Harmonic Transposition

Family Cites Families (10)

Publication number Priority date Publication date Assignee Title
FR2802329B1 (fr) * 1999-12-08 2003-03-28 France Telecom Method for processing at least one coded audio bitstream organized in the form of frames
JP2004252068A (ja) * 2003-02-19 2004-09-09 Matsushita Electric Ind Co Ltd Apparatus and method for encoding a digital audio signal
JP4396683B2 (ja) * 2006-10-02 2010-01-13 カシオ計算機株式会社 (Casio Computer Co., Ltd.) Speech encoding device, speech encoding method, and program
CN102742267B (zh) * 2007-12-19 2015-05-27 杜比实验室特许公司 (Dolby Laboratories Licensing Corporation) Adaptive motion estimation
ATE500588T1 (de) * 2008-01-04 2011-03-15 Dolby Sweden Ab Audio encoder and decoder
EP2311034B1 (en) * 2008-07-11 2015-11-04 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder and decoder for encoding frames of sampled audio signals
US8352394B2 (en) * 2008-09-30 2013-01-08 Rockwell Automation Technologies, Inc. Validation of laboratory test data based on predicted values of property-of-interest
SG182466A1 (en) * 2010-01-12 2012-08-30 Fraunhofer Ges Forschung Audio encoder, audio decoder, method for encoding an audio information, method for decoding an audio information and computer program using a modification of a number representation of a numeric previous context value
ES2530957T3 (es) * 2010-10-06 2015-03-09 Fraunhofer Ges Forschung Apparatus and method for processing an audio signal and for providing a higher temporal granularity for a combined unified speech and audio codec (USAC)
EP2757558A1 (en) * 2013-01-18 2014-07-23 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Time domain level adjustment for audio signal decoding or encoding

Non-Patent Citations (1)

Title
HSU, HAN-WEN ET AL: "Audio Patch Method in MPEG-4 HE-AAC Decoder", AUDIO ENGINEERING SOCIETY, AES, 6221, 28 October 2004 (2004-10-28) - 31 October 2004 (2004-10-31), San Francisco, CA, USA, XP040372550 *

Also Published As

Publication number Publication date
CN109328382B (zh) 2023-06-16
CN109328382A (zh) 2019-02-12

Similar Documents

Publication Publication Date Title
US8700388B2 (en) Audio transform coding using pitch correction
JP5323180B2 (ja) Audio signal decoder, time-warp contour data generation device, method for generating a decoded audio signal, and computer program
KR101942913B1 (ko) Metadata-driven dynamic range control
EP3475944B1 (en) Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain
EP1895511A1 (en) Audio encoding apparatus, audio decoding apparatus and audio encoding information transmitting apparatus
EP2667508B1 (en) Method and apparatus for efficient frequency-domain implementation of time-varying filters
KR20190124331A (ko) Adaptations of analysis or synthesis weighting windows for transform coding or decoding
WO2016062869A1 (en) Encoding and decoding of audio signals
WO2017220528A1 (en) Audio decoder and method for transforming a digital audio signal from a first to a second frequency domain
JP2019522816A5 (zh)
WO1998035449A1 (en) Method and equipment for processing data
TWI625722B (zh) Apparatus and method for processing an encoded audio signal
EP3566473B1 (en) Integrated reconstruction and rendering of audio signals
JP2005057439A (ja) Band-division encoding/decoding method, and decoding device used in the method
WO2023118138A1 (en) Ivas spar filter bank in qmf domain
WO2018162472A1 (en) Integrated reconstruction and rendering of audio signals

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17730205; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2018567177; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
ENP Entry into the national phase. Ref document number: 2017730205; Country of ref document: EP; Effective date: 20190122