USRE46684E1  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients  Google Patents
 Publication number
 USRE46684E1
 Authority
 US
 Grant status
 Grant
 Patent type
 Prior art keywords
 spectral components
 spectral
 components
 source signal
 magnitude
 Prior art date
 Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
 Active
Classifications

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
 G06F17/10—Complex mathematical operations
 G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
 G06F17/147—Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform

 G—PHYSICS
 G10—MUSICAL INSTRUMENTS; ACOUSTICS
 G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
 G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
 G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
 G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
Abstract
Description
More than one Reissue Application has been filed for the reissue of U.S. Pat. No. 6,980,933, issued Dec. 27, 2005. The reissue applications are application Ser. Nos. 13/297,256, filed Nov. 15, 2011 (now U.S. Reissue Pat. No. Re. 44,126, Issued Apr. 2, 2013), and 11/963,680, filed Dec. 21, 2007, (now U.S. Reissue Pat. No. Re. 42,935, Issued Nov. 15, 2011). The present application claims the benefit as a Reissue Continuation Application of copending application Ser. No. 13/297,256, filed Nov. 15, 2011, which is a Reissue Continuation Application of application Ser. No. 11/963,680, filed Dec. 21, 2007, now U.S. Reissue Pat. No. Re. 42,935, issued Nov. 15, 2011, which is a Reissue Application of U.S. Pat. No. 6,980,933, issued Dec. 27, 2005, the entire contents of all of the foregoing are hereby incorporated by reference as if fully set forth herein, under 35 U.S.C. § 120. Applicants hereby rescind any disclaimer of claim scope in the parent application(s) or the prosecution history thereof and advise the USPTO that the claims in this application may be broader than any claim in the parent application(s).
The present invention provides an efficient process for accurately estimating spectral magnitude and phase from spectral information obtained from various types of analysis filter banks including those implemented by Modified Discrete Cosine Transforms and Modified Discrete Sine Transforms. These accurate estimates may be used in various signal processing applications such as audio coding and video coding.
In the following discussion more particular mention is made of audio coding applications using filter banks implemented by a particular Modified Discrete Cosine Transform; however, the present invention is also applicable to other applications and other filter bank implementations.
Many coding applications attempt to reduce the amount of information required to adequately represent a source signal. By reducing information capacity requirements, a signal representation can be transmitted over channels having lower bandwidth or stored on media using less space.
Coding can reduce the information capacity requirements of a source signal by eliminating either redundant components or irrelevant components in the signal. So-called perceptual coding methods and systems often use filter banks to reduce redundancy by decorrelating a source signal using a basis set of spectral components, and reduce irrelevancy by adaptive quantization of the spectral components according to psychoperceptual criteria. A coding process that adapts the quantizing resolution more coarsely can reduce information requirements to a greater extent but it also introduces higher levels of quantization error or “quantization noise” into the signal. Perceptual coding systems attempt to control the level of quantization noise so that the noise is “masked” or rendered imperceptible by other spectral content of the signal. These systems typically use perceptual models to predict the levels of quantization noise that can be masked by a given signal.
In perceptual audio coding systems, for example, quantization noise is often controlled by adapting quantizing resolutions according to predictions of audibility obtained from perceptual models based on psychoacoustic studies such as that described in E. Zwicker, Psychoacoustics, 1981. An example of a perceptual model that predicts the audibility of spectral components in a signal is discussed in M. Schroeder et al.; “Optimizing Digital Speech Coders by Exploiting Masking Properties of the Human Ear,” J. Acoust. Soc. Am., December 1979, pp. 1647-1652.
Spectral components that are deemed to be irrelevant because they are predicted to be imperceptible need not be included in the encoded signal. Other spectral components that are deemed to be relevant can be quantized using a quantizing resolution that is adapted to be fine enough to ensure the quantization noise is rendered just imperceptible by other spectral components in the source signal. Accurate predictions of perceptibility by a perceptual model allow a perceptual coding system to adapt the quantizing resolution more optimally, resulting in fewer audible artifacts.
A coding system using models known to provide inaccurate predictions of perceptibility cannot reliably ensure quantization noise is rendered imperceptible unless a finer quantizing resolution is used than would otherwise be required if a more accurate prediction was available. Many perceptual models such as that discussed by Schroeder, et al. are based on spectral component magnitude; therefore, accurate predictions by these models depend on accurate measures of spectral component magnitude.
Accurate measures of spectral component magnitude also influence the performance of other types of coding processes in addition to quantization. In two types of coding processes known as spectral regeneration and coupling, an encoder reduces information requirements of source signals by excluding selected spectral components from an encoded representation of the source signals and a decoder synthesizes substitutes for the missing spectral components. In spectral regeneration, the encoder generates a representation of a baseband portion of a source signal that excludes other portions of the spectrum. The decoder synthesizes the missing portions of the spectrum using the baseband portion and side information that conveys some measure of spectral level for the missing portions, and combines the two portions to obtain an imperfect replica of the original source signal. One example of an audio coding system that uses spectral regeneration is described in international patent application no. PCT/US03/08895 filed Mar. 21, 2003, publication no. WO 03/083034 published Oct. 9, 2003. In coupling, the encoder generates a composite representation of spectral components for multiple channels of source signals and the decoder synthesizes spectral components for multiple channels using the composite representation and side information that conveys some measure of spectral level for each source signal channel. One example of an audio coding system that uses coupling is described in the Advanced Television Systems Committee (ATSC) A/52A document entitled “Revision A to Digital Audio Compression (AC-3) Standard” published Aug. 20, 2001.
The performance of these coding systems can be improved if the decoder is able to synthesize spectral components that preserve the magnitudes of the corresponding spectral components in the original source signals. The performance of coupling also can be improved if accurate measures of phase are available so that distortions caused by coupling out-of-phase signals can be avoided or compensated.
Unfortunately, some coding systems use particular types of filter banks to derive an expression of spectral components that make it difficult to obtain accurate measures of spectral component magnitude or phase. Two common types of coding systems are referred to as subband coding and transform coding. Filter banks in both subband and transform coding systems may be implemented by a variety of signal processing techniques including various time-domain to frequency-domain transforms. See J. Tribolet et al., “Frequency Domain Coding of Speech,” IEEE Trans. Acoust., Speech, and Signal Proc., ASSP-27, October, 1979, pp. 512-530.
Some transforms such as the Discrete Fourier Transform (DFT) or its efficient implementation, the Fast Fourier Transform (FFT), provide a set of spectral components or transform coefficients from which spectral component magnitude and phase can be easily calculated. Spectral components of the DFT, for example, are multidimensional representations of a source signal. Specifically, the DFT, which may be used in audio coding and video coding applications, provides a set of complex-valued coefficients whose real and imaginary parts may be expressed as coordinates in a two-dimensional space. The magnitude of each spectral component provided by such a transform can be obtained easily from each component's coordinates in the multidimensional space using well known calculations.
Some transforms such as the Discrete Cosine Transform, however, provide spectral components that make it difficult to obtain an accurate measure of spectral component magnitude or phase. The spectral components of the DCT, for example, represent the spectral component of a source signal in only a subspace of the multidimensional space required to accurately convey spectral magnitude and phase. In typical audio coding and video coding applications, for example, a DCT provides a set of real-valued spectral components or transform coefficients that are expressed in a one-dimensional subspace of the two-dimensional real/imaginary space mentioned above. The magnitude of each spectral component provided by transforms like the DCT cannot be obtained easily from each component's coordinates in the relevant subspace.
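The "well known calculations" for a complex-valued transform can be illustrated with a minimal sketch. The block length, test tone and variable names below are assumptions chosen for illustration and do not come from the disclosure; NumPy's FFT stands in for a DFT filter bank.

```python
import numpy as np

N = 32                                    # hypothetical block length
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N + 0.5)   # test tone centred on bin 3, phase 0.5 rad

X = np.fft.fft(x)                         # complex-valued DFT coefficients
magnitude = np.abs(X)                     # sqrt(Re^2 + Im^2) for each coefficient
phase = np.angle(X)                       # atan2(Im, Re) for each coefficient

print(magnitude[3], phase[3])
```

Because each coefficient carries both coordinates of the two-dimensional real/imaginary space, magnitude and phase follow directly from a single block.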
This characteristic of the DCT is shared by a particular Modified Discrete Cosine Transform (MDCT), which is described in J. Princen et al., “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation,” ICASSP 1987 Conf. Proc., May 1987, pp. 2161-64. The MDCT and its complementary Inverse Modified Discrete Cosine Transform (IMDCT) have gained widespread usage in many coding systems because they permit implementation of a critically sampled analysis/synthesis filter bank system that provides for perfect reconstruction of overlapping segments of a source signal. Perfect reconstruction refers to the property of an analysis/synthesis filter bank pair to reconstruct perfectly a source signal in the absence of errors caused by finite precision arithmetic. Critical sampling refers to the property of an analysis filter bank to generate a number of spectral components that is no greater than the number of samples used to convey the source signal. These properties are very attractive in many coding applications because critical sampling reduces the number of spectral components that must be encoded and conveyed in an encoded signal.
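Both properties can be checked numerically. The following is a sketch under stated assumptions rather than the filter bank of any particular coder: the segment length N = 16, the sine window, the 4/N synthesis scale factor and all variable names are illustrative choices, and the phase term n0 = ½(N/2+1) is the value the disclosure attributes to the Princen paper.

```python
import numpy as np

N = 16                                   # segment length (assumed, multiple of 4)
hop = N // 2                             # segments overlap by one-half
n0 = (N / 2 + 1) / 2                     # phase term attributed to the Princen paper
n = np.arange(N)
k = np.arange(N // 2)                    # N/2 coefficients per N/2 new samples
C = np.cos(2 * np.pi / N * np.outer(k + 0.5, n + n0))
w = np.sin(np.pi * (n + 0.5) / N)        # sine window (assumed), used for analysis and synthesis

rng = np.random.default_rng(0)
x = rng.standard_normal(6 * hop)         # arbitrary real-valued source signal
num_blocks = len(x) // hop - 1

# Analysis: one MDCT block per hop.
coeffs = [C @ (w * x[i * hop : i * hop + N]) for i in range(num_blocks)]

# Synthesis: IMDCT, synthesis window, overlap-add.
y = np.zeros_like(x)
for i, X in enumerate(coeffs):
    y[i * hop : i * hop + N] += w * (4 / N) * (C.T @ X)

# Only samples covered by two overlapping segments are fully reconstructed.
interior = slice(hop, len(x) - hop)
max_err = np.max(np.abs(y[interior] - x[interior]))
print(max_err, num_blocks * len(k), len(x))
```

Each block contributes N/2 coefficients for N/2 new samples, so the coefficient count does not exceed the sample count, while the overlap-add cancels the time-domain aliasing to within rounding error.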
The concept of critical sampling deserves some comment. Although the DFT or the DCT, for example, generate one spectral component for each sample in a source signal segment, DFT and DCT analysis/synthesis systems in many coding applications do not provide critical sampling because the analysis transform is applied to a sequence of overlapping signal segments. The overlap allows use of nonrectangular shaped window functions that improve analysis filter bank frequency response characteristics and eliminate blocking artifacts; however, the overlap also prevents perfect reconstruction with critical sampling because the analysis filter bank must generate more coefficient values than the number of source signal samples. This loss of critical sampling increases the information requirements of the encoded signal.
As mentioned above, filter banks implemented by the MDCT and IMDCT are attractive in many coding systems because they provide perfect reconstruction of overlapping segments of a source signal with critical sampling. Unfortunately, these filter banks are similar to the DCT in that the spectral components of the MDCT represent the spectral component of a source signal in only a subspace of the multidimensional space required to accurately convey spectral magnitude and phase. Accurate measures of spectral magnitude or phase cannot be obtained easily from the spectral components or transform coefficients generated by the MDCT; therefore, the coding performance of many systems that use the MDCT filter bank is suboptimal because the prediction accuracy of perceptual models is degraded and the preservation of spectral component magnitudes by synthesizing processes is impaired.
Prior attempts to avoid this deficiency of various filter banks like the MDCT and DCT filter banks have not been satisfactory for a variety of reasons. One technique is disclosed in “ISO/IEC 11172-3: 1993 (E) Coding of Moving Pictures and Associated Audio for Digital Storage Media at Up to About 1.5 Mbit/s,” ISO/IEC JTC1/SC29/WG11, Part III Audio. According to this technique, a set of filter banks including several MDCT-based filter banks is used to generate spectral components for encoding and an additional FFT-based filter bank is used to derive accurate measures of spectral component magnitude. This technique is not attractive for at least two reasons: (1) considerable computational resources are required in the encoder to implement the additional FFT filter bank needed to derive the measures of magnitude, and (2) the processing to obtain accurate measures of magnitude is performed in the encoder; therefore, additional bandwidth is required by the encoded signal to convey these measures of spectral component magnitude to the decoder.
Another technique avoids incurring any additional bandwidth required to convey measures of spectral component magnitude by calculating these measures in the decoder. This is done by applying a synthesis filter bank to the decoded spectral components to recover a replica of the source signal, applying an analysis filter bank to the recovered signal to obtain a second set of spectral components in quadrature with the decoded spectral components, and calculating spectral component magnitude from the two sets of spectral components. This technique also is not attractive because considerable computational resources are required in the decoder to implement the analysis filter bank needed to obtain the second set of spectral components.
Yet another technique, described in S. Merdjani et al., “Direct Estimation of Frequency From MDCT-Encoded Files,” Proc. of the 6th Int. Conf. on Digital Audio Effects (DAFx-03), London, September 2003, estimates the frequency, magnitude and phase of a sinusoidal source signal from a “regularized spectrum” derived from MDCT coefficients. This technique overcomes the disadvantages mentioned above but it also is not satisfactory for typical coding applications because it is applicable only for a very simple source signal that has only one sinusoid.
Another technique, which is disclosed in U.S. patent application Ser. No. 09/948,053, publication number U.S. 2003/0093282 A1 published May 15, 2003, is able to derive DFT coefficients from MDCT coefficients; however, the disclosed technique does not obtain measures of magnitude or phase for spectral components represented by the MDCT coefficients themselves. Furthermore, the disclosed technique does not use measures of magnitude or phase to adapt processes for encoding or decoding information that represents the MDCT coefficients.
What is needed is a technique that provides accurate estimates of magnitude or phase from spectral components generated by analysis filter banks such as the MDCT that also avoids or overcomes deficiencies of known techniques.
The present invention overcomes the deficiencies of the prior art by receiving first spectral components that were generated by application of an analysis filterbank to a source signal conveying content intended for human perception, deriving one or more first intermediate components from at least some of the first spectral components, forming a combination of the one or more first intermediate components according to at least a portion of one or more impulse responses to obtain one or more second intermediate components, deriving second spectral components from the one or more second intermediate components, obtaining estimated measures of magnitude or phase using the first spectral components and the second spectral components, and applying an adaptive process to the first spectral components to generate processed information. The adaptive process adapts in response to the estimated measures of magnitude or phase.
The various features of the present invention and its preferred embodiments may be better understood by referring to the following discussion and the accompanying drawings in which like reference numerals refer to like elements in the several figures. The contents of the following discussion and the drawings are set forth as examples only and should not be understood to represent limitations upon the scope of the present invention.
The present invention allows accurate measures of magnitude or phase to be obtained from spectral components generated by analysis filter banks such as the Modified Discrete Cosine Transform (MDCT) mentioned above. Various aspects of the present invention may be used in a number of applications including audio and video coding.
The transmitter illustrated in
Aspects of the present invention are described below with reference to implementations closely related to the MDCT; however, the present invention is not limited to these particular implementations.
In this disclosure, terms like “encoder” and “encoding” are not intended to imply any particular type of information processing. For example, encoding is often used to reduce information capacity requirements; however, these terms in this disclosure do not necessarily refer to this type of processing. The encoder 5 may perform essentially any type of processing that is desired. In one implementation, encoded information is generated by quantizing spectral components according to a perceptual model. In another implementation, the encoder 5 applies a coupling process to multiple channels of spectral components to generate a composite representation. In yet another implementation, spectral components for a portion of a signal bandwidth are discarded and an estimate of the spectral envelope of the discarded portion is included in the encoded information. No particular type of encoding is important to the present invention.
The receiver illustrated in
In this disclosure, terms like “decoder” and “decoding” are not intended to imply any particular type of information processing. The decoder 25 may perform essentially any type of processing that is needed or desired. In one implementation that is inverse to an encoding process described above, quantized spectral components are decoded into dequantized spectral components. In another implementation, multiple channels of spectral components are synthesized from a composite representation of spectral components. In yet another implementation, the decoder 25 synthesizes missing portions of a signal bandwidth from spectral envelope information. No particular type of decoding is important to the present invention.
In one implementation by an Odd Discrete Fourier Transform (ODFT), the analysis filter bank 3 generates complex-valued coefficients or “spectral components” with real and imaginary parts that may be expressed in a two-dimensional space. This transform may be expressed as:
which may be separated into real and imaginary parts
X_{ODFT}(k)=Re[X_{ODFT}(k)]+j·Im[X_{ODFT}(k)] (2)
and rewritten as
where X_{ODFT}(k)=ODFT coefficient for spectral component k,
x(n)=source signal amplitude at time n;
Re[X]=real part of X; and
Im[X]=imaginary part of X.
The magnitude and phase of each spectral component k may be calculated as follows:
where Mag[X]=magnitude of X; and
Phs[X]=phase of X.
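The displayed forms of equations 1 and 3 to 5 are not reproduced in this text. The following is a reconstruction, not a quotation: it is written to agree with equations 2, 7 and 9 and with the symbol definitions given above, and it assumes a summation over one block of N samples and the phase term n_0 that is introduced later in the discussion.

```latex
\begin{align}
X_{ODFT}(k) &= \sum_{n=0}^{N-1} x(n)\,
  \exp\!\left[-j\,\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right] \tag{1}\\
X_{ODFT}(k) &= \sum_{n=0}^{N-1} x(n)\cos\!\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right]
  - j\sum_{n=0}^{N-1} x(n)\sin\!\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right] \tag{3}\\
\mathrm{Mag}\left[X_{ODFT}(k)\right] &= \sqrt{\mathrm{Re}\left[X_{ODFT}(k)\right]^{2}+\mathrm{Im}\left[X_{ODFT}(k)\right]^{2}} \tag{4}\\
\mathrm{Phs}\left[X_{ODFT}(k)\right] &= \arctan\!\left(\frac{\mathrm{Im}\left[X_{ODFT}(k)\right]}{\mathrm{Re}\left[X_{ODFT}(k)\right]}\right) \tag{5}
\end{align}
```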
Many coding applications implement the analysis filter bank 3 by applying the Modified Discrete Cosine Transform (MDCT) discussed above to overlapping segments of the source signal that are modulated by an analysis window function. This transform may be expressed as:
where X_{MDCT}(k)=MDCT coefficient for spectral component k. It may be seen that the spectral components that are generated by the MDCT are equivalent to the real part of the ODFT coefficients.
X_{MDCT}(k)=Re[X_{ODFT}(k)] (7)
A particular Modified Discrete Sine Transform (MDST) that generates coefficients representing spectral components in quadrature with the spectral components represented by coefficients of the MDCT may be expressed as:
where X_{MDST}(k)=MDST coefficient for spectral component k. It may be seen that the spectral components that are generated by the MDST are equivalent to the negative imaginary part of the ODFT coefficients.
X_{MDST}(k)=−Im[X_{ODFT}(k)] (9)
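The displayed forms of equations 6 and 8 are likewise missing from this text. Given equations 7 and 9, and assuming the same unwindowed N-sample block and phase term n_0 as in the ODFT, they would presumably read as follows; this is a reconstruction for the reader's convenience.

```latex
\begin{align}
X_{MDCT}(k) &= \sum_{n=0}^{N-1} x(n)\cos\!\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right] \tag{6}\\
X_{MDST}(k) &= \sum_{n=0}^{N-1} x(n)\sin\!\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right] \tag{8}
\end{align}
```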
Accurate measures of magnitude and phase cannot be calculated directly from MDCT coefficients but they can be calculated directly from a combination of MDCT and MDST coefficients, which can be seen by substituting equations 7 and 9 into equations 4 and 5:
Mag[X_{ODFT}(k)]=√(X_{MDCT}^{2}(k)+X_{MDST}^{2}(k)) (10)
Phs[X_{ODFT}(k)]=arctan[−X_{MDST}(k)/X_{MDCT}(k)] (11)
The Princen paper mentioned above indicates that a correct use of the MDCT requires the application of an analysis window function that satisfies certain design criteria. The expressions of transform equations in this section of the disclosure omit an explicit reference to any analysis window function, which implies a rectangular analysis window function that does not satisfy these criteria. This does not affect the validity of expressions 10 and 11.
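The relationships among the three transforms can be verified with a short numerical sketch. It assumes an unwindowed block of N = 16 random samples, the phase term n_0 = ½(N/2+1) attributed to the Princen paper, and illustrative variable names; `arctan2` is used in place of a plain arctangent so that the phase lands in the correct quadrant.

```python
import numpy as np

N = 16                                    # block length (assumed)
n0 = (N / 2 + 1) / 2                      # phase term attributed to the Princen paper
n = np.arange(N)
k = np.arange(N)
theta = 2 * np.pi / N * np.outer(k + 0.5, n + n0)

rng = np.random.default_rng(1)
x = rng.standard_normal(N)                # arbitrary real-valued block

X_odft = np.exp(-1j * theta) @ x          # complex ODFT coefficients
X_mdct = np.cos(theta) @ x                # unwindowed MDCT coefficients
X_mdst = np.sin(theta) @ x                # unwindowed MDST coefficients

mag = np.hypot(X_mdct, X_mdst)            # magnitude from the MDCT/MDST pair
phs = np.arctan2(-X_mdst, X_mdct)         # phase from the MDCT/MDST pair

print(np.max(np.abs(mag - np.abs(X_odft))))
```

The MDCT coefficients alone give only the real coordinate of each ODFT coefficient; the quadrature MDST coefficients supply the missing coordinate.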
Implementations of the present invention described below obtain measures of spectral component magnitude and phase from MDCT coefficients and from MDST coefficients derived from the MDCT coefficients. These implementations are described below following a discussion of the underlying mathematical basis.
This section discusses the derivation of an analytical expression for calculating exact MDST coefficients from MDCT coefficients. This expression is shown below in equations 41a and 41b. The derivations of simpler analytical expressions for two specific window functions are also discussed. Considerations for practical implementations are presented following a discussion of the derivations.
One implementation of the present invention discussed below is derived from a process for calculating exact MDST coefficients from MDCT coefficients. This process is equivalent to another process that applies an Inverse Modified Discrete Cosine Transform (IMDCT) synthesis filter bank to blocks of MDCT coefficients to generate windowed segments of time-domain samples, overlap-adds the windowed segments of samples to reconstruct a replica of the original source signal, and applies an MDST analysis filter bank to a segment of the recovered signal to generate the MDST coefficients.
Exact MDST coefficients cannot be calculated from a single segment of windowed samples that is recovered by applying the IMDCT synthesis filter bank to a single block of MDCT coefficients because the segment is modulated by an analysis window function and because the recovered samples contain time-domain aliasing. The exact MDST coefficients can be computed only with the additional knowledge of the MDCT coefficients for the preceding and subsequent segments. For example, in the case where the segments overlap one another by one-half the segment length, the effects of windowing and the time-domain aliasing for a given segment II can be canceled by applying the synthesis filter bank and associated synthesis window function to three blocks of MDCT coefficients representing three consecutive overlapping segments of the source signal, denoted as segment I, segment II and segment III. Each segment overlaps an adjacent segment by an amount equal to one-half of the segment length. Windowing effects and time-domain aliasing in the first half of segment II are canceled by an overlap-add with the second half of segment I, and these effects in the second half of segment II are canceled by an overlap-add with the first half of segment III.
The expression that calculates MDST coefficients from MDCT coefficients depends on the number of segments of the source signal, the overlap structure and length of these segments, and the choice of the analysis and synthesis window functions. None of these features are important in principle to the present invention. For ease of illustration, however, it is assumed in the examples discussed below that the three segments have the same length N, which is even, and overlap one another by an amount equal to one-half the segment length, that the analysis and synthesis window functions are identical to one another, that the same window functions are applied to all segments of the source signal, and that the window functions are such that their overlap-add properties satisfy the following criterion, which is required for perfect reconstruction of the source signal as explained in the Princen paper.
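The three-segment process just described can be sketched numerically. The sketch is illustrative only: it assumes N = 16, a sine window, all N coefficients of each block (including the redundant odd-symmetric half), a 2/N synthesis scale factor, and an MDST applied to the unwindowed recovered segment. None of these choices or names is prescribed by the disclosure.

```python
import numpy as np

N = 16
h = N // 2                                # one-half overlap
n0 = (N / 2 + 1) / 2
idx = np.arange(N)
theta = 2 * np.pi / N * np.outer(idx + 0.5, idx + n0)
C, S = np.cos(theta), np.sin(theta)       # MDCT and MDST kernels
w = np.sin(np.pi * (idx + 0.5) / N)       # sine window (assumed)

rng = np.random.default_rng(2)
x = rng.standard_normal(2 * N)            # spans segments I, II and III

# MDCT blocks for segments I, II, III, which start at 0, N/2 and N.
X = [C @ (w * x[i * h : i * h + N]) for i in range(3)]

# IMDCT plus synthesis window for each block.
xhat = [w * (2 / N) * (C.T @ Xi) for Xi in X]

# Overlap-add to recover segment II.
s = xhat[1].copy()
s[:h] += xhat[0][h:]                      # second half of segment I
s[h:] += xhat[2][:h]                      # first half of segment III

S_from_three = S @ s                      # MDST of the recovered segment
S_exact = S @ x[h : h + N]                # MDST of the true segment II
S_single = S @ xhat[1]                    # attempt using block II alone

print(np.max(np.abs(S_from_three - S_exact)))
```

In this sketch the three-block result matches the exact MDST coefficients to rounding error, whereas the single-block attempt does not, because the windowing and time-domain aliasing in block II are left uncancelled.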
where w(r)=analysis and synthesis window function; and
N=length of each source signal segment.
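The overlap-add criterion, restated later in the derivation as w(r)^{2}+w(r+N/2)^{2}=1, is easy to test for a candidate window. The helper name, tolerance and the three example windows below are assumptions for illustration; the constant value 1/√2 is used here as the rectangular window because it is the constant that meets the criterion.

```python
import numpy as np

def satisfies_overlap_add(w, tol=1e-12):
    """Return True if w(r)^2 + w(r + N/2)^2 == 1 for every r in [0, N/2 - 1]."""
    half = len(w) // 2
    return bool(np.all(np.abs(w[:half] ** 2 + w[half:] ** 2 - 1.0) < tol))

N = 16
r = np.arange(N)
sine = np.sin(np.pi * (r + 0.5) / N)                  # sine window
rect = np.full(N, 1 / np.sqrt(2))                     # constant (rectangular) window
hann = 0.5 - 0.5 * np.cos(2 * np.pi * (r + 0.5) / N)  # Hann-shaped window, for contrast

for name, win in [("sine", sine), ("rectangular", rect), ("hann", hann)]:
    print(name, satisfies_overlap_add(win))
```

A window that fails the test, such as the Hann-shaped example, leaves residual amplitude modulation after overlap-add when it is used for both analysis and synthesis.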

 The MDCT coefficients X_{i }for the source signal x(n) in each of the segments i may be expressed as:
The windowed time-domain samples {circumflex over (x)} that are obtained from an application of the IMDCT synthesis filter bank to each block of MDCT coefficients may be expressed as:
Samples s(r) of the source signal for segment II are reconstructed by overlapping and adding the three windowed segments as described above, thereby removing the time-domain aliasing from the source signal x. This may be expressed as:
A block of MDST coefficients S(k) may be calculated for segment II by applying an MDST analysis filter bank to the time-domain samples in the reconstructed segment II, which may be expressed as:
Using expression 18 to substitute for s(r), expression 19 can be rewritten as:
This equation can be rewritten in terms of the MDCT coefficients by using expressions 15-17 to substitute for the time-domain samples:
The remainder of this section of the disclosure shows how this equation can be simplified as shown below in equations 41a and 41b.
Using the trigonometric identity sin α·cos β=½[sin (α+β)+sin (α−β)] to gather terms and switching the order of summation, expression 21 can be rewritten as
This expression can be simplified by combining pairs of terms that are equal to each other. The first and second terms are equal to each other. The third and fourth terms are equal to each other. The fifth and sixth terms are equal to each other and the seventh and eighth terms are equal to each other. The equality between the third and fourth terms, for example, may be shown by proving the following lemma:
This lemma may be proven by rewriting the left-hand and right-hand sides of equation 23 as functions of p as follows:
where
The expression of G as a function of (p) can be rewritten as a function of (N−1−p) as follows:
It is known that MDCT coefficients are odd symmetric; therefore, X_{II}(N−1−p)=−X_{II}(p) for
By rewriting (k−(N−1−p)) as (k+1+p)−N, it may be seen that (k−(N−1−p))·(r+n_{0})=(k+1+p)·(r+n_{0})−N·(r+n_{0}). These two equalities allow expression 26 to be rewritten as:
Referring to the Princen paper, the value for n_{0} is ½(N/2+1), which is midway between two integers. Because r is an integer, it can be seen that the final term 2π(r+n_{0}) in the summand of expression 27 is equal to an odd integer multiple of π; therefore, expression 27 can be rewritten as
which proves the lemma shown in equation 23. The equality between the other pairs of terms in equation 22 can be shown in a similar manner.
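Two facts used in this proof can be checked numerically: that 2π(r+n_{0}) is an odd multiple of π, and that the MDCT coefficients are odd symmetric. The sketch assumes N = 16 (a multiple of four, so that n_{0} is midway between two integers), an unwindowed block and illustrative names.

```python
import numpy as np

N = 16                                    # assumed; a multiple of four
n0 = (N / 2 + 1) / 2                      # 4.5 here, midway between 4 and 5
r = np.arange(N)

# 2*pi*(r + n0) equals multiples*pi; every entry should be an odd integer.
multiples = 2 * (r + n0)

p = np.arange(N)
C = np.cos(2 * np.pi / N * np.outer(p + 0.5, r + n0))

rng = np.random.default_rng(3)
X = C @ rng.standard_normal(N)            # all N MDCT coefficients of one block

# Odd symmetry: X(N-1-p) = -X(p), so the reversed block cancels the original.
odd_symmetry_error = np.max(np.abs(X[::-1] + X))
print(multiples[:4], odd_symmetry_error)
```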
By omitting the first, third, fifth and seventh terms in expression 22 and doubling the second, fourth, sixth and eighth terms, equation 22 can be rewritten as follows after simplifying the second and eighth terms:
Using the following identities:
expression 29 can be rewritten as:
The inner summations of the third and fourth terms are changed so that their limits of summation are from r=0 to r=(N/2−1) by making the following substitutions:
This allows equation 31 to be rewritten as
Equation 32 can be simplified by using the restriction imposed on the window function mentioned above that is required for perfect reconstruction of the source signal. This restriction is w(r)^{2}+w(r+N/2)^{2}=1. With this restriction, equation 32 can be simplified to
Gathering terms, equation 33 can be rewritten as
Equation 34 can be simplified by recognizing the inner summation of the third term is equal to zero. This can be shown by proving two lemmas. One lemma postulates the following equality:
This equality may be proven by rewriting the summand into exponential form, rearranging, simplifying and combining terms as follows:
This may be proven by substituting n_{0 }for a in expression 35 to obtain the following:
By substituting (k−p) for q in expression 35 and using the preceding two lemmas, the inner summation of the third term in equation 34 may be shown to equal zero as follows:
Using this equality, equation 34 may be simplified to the following:
The MDST coefficients S(k) of a real-valued signal are symmetric according to the expression S(k)=S(N−1−k), for k ∈ [0, N−1]. Using this property, all even numbered coefficients can be expressed as S(2v)=S(N−1−2v)=S(N−2(v+1)+1), for
Because N and 2(v+1) are both even numbers, the quantity (N−2(v+1)+1) is an odd number. From this, it can be seen the even numbered coefficients can be expressed in terms of the odd numbered coefficients. Using this property of the coefficients, equation 38 can be rewritten as follows:
The second term in this equation is equal to zero for all even values of p. The second term needs to be evaluated only for odd values of p, or for p=2l+1 for
Equation 40 can be rewritten as a summation of two modified convolution operations of two functions h_{I,III} and h_{II} with two sets of intermediate spectral components m_{I,III} and m_{II} that are derived from the MDCT coefficients X_{I}, X_{II}, and X_{III} for three segments of the source signal as follows:
The results of the modified convolution operations depend on the properties of the functions h_{I,III} and h_{II}, which are impulse responses of hypothetical filters that are related to the combined effects of the IMDCT synthesis filter bank, the subsequent MDST analysis filter bank, and the analysis and synthesis window functions. The modified convolutions need to be evaluated only for even integers.
Each of the impulse responses is symmetric. It may be seen from inspection that h_{I,III}(τ)=h_{I,III}(−τ) and h_{II}(τ)=−h_{II}(−τ). These symmetry properties may be exploited in practical digital implementations to reduce the amount of memory needed to store a representation of each impulse response. An understanding of how the symmetry properties of the impulse responses interact with the symmetry properties of the intermediate spectral components m_{I,III }and m_{II }may also be exploited in practical implementations to reduce computational complexity.
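By way of illustration only, the coefficient symmetry S(k)=S(N−1−k) relied upon earlier in this derivation can be checked numerically. The sketch below assumes a direct-form MDST with kernel sin((2π/N)(k+½)(n+n_{0})) and n_{0}=(N/2+1)/2, with N a multiple of four; this definition, the function name mdst and the random test segment are assumptions made for the example and are not quoted from this section.

```python
import math
import random


def mdst(x):
    """Direct O(N^2) MDST of one N-sample segment (assumed definition)."""
    n_len = len(x)
    n0 = (n_len / 2 + 1) / 2  # assumed phase term n0 = (N/2 + 1) / 2
    return [
        sum(x[n] * math.sin(2 * math.pi / n_len * (k + 0.5) * (n + n0))
            for n in range(n_len))
        for k in range(n_len)
    ]


random.seed(0)
N = 16  # a multiple of four
segment = [random.uniform(-1.0, 1.0) for _ in range(N)]  # arbitrary real signal
S = mdst(segment)

# Largest deviation from the symmetry S(k) = S(N-1-k); rounding error only.
max_asymmetry = max(abs(S[k] - S[N - 1 - k]) for k in range(N))
print(max_asymmetry)
```

Under these assumptions the printed deviation is at the level of floating-point rounding error, which is why only half of the coefficients need to be computed and stored.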
The impulse responses h_{I,III}(τ) and h_{II}(τ) may be calculated from the summations shown above; however, it may be possible to simplify these calculations by deriving simpler analytical expressions for the impulse responses. Because the impulse responses depend on the window function w(r), the derivation of simpler analytical expressions requires additional specifications for the window function. An example of derivations of simpler analytical expressions for the impulse responses for two specific window functions, the rectangular and sine window functions, are discussed below.
The rectangular window function is not often used in coding applications because it has relatively poor frequency selectivity properties; however, its simplicity reduces the complexity of the analysis needed to derive a specific implementation. For this derivation, the window function
for r ∈ [0, N−1] is used. For this particular window function, the second term of equation 41a is equal to zero. The calculation of the MDST coefficients does not depend on the MDCT coefficients for the second segment. As a result, equation 41a may be rewritten as
If N is restricted to have a value that is a multiple of four, this equation can be simplified further by using another lemma that postulates the following equality:
This may be proven as follows:
By using the lemma shown in equation 35 with
expression 44 can be rewritten as
which can be simplified to obtain the following expression:
If q is an integer multiple of N such that q=mN, then the numerator and denominator of the quotient in expression 46 are both equal to zero, causing the value of the quotient to be indeterminate. L'Hospital's rule may be used to simplify the expression further. Differentiating the numerator and denominator with respect to q and substituting q=mN yields the expression
Because N is an integer multiple of four, the numerator is always equal to N and the denominator is equal to 2·(−1)^{m}=2·(−1)^{q/N}. This completes the proof of the lemma expressed by equation 43.
This equality may be used to obtain expressions for the impulse response h_{I,III}. Different cases are considered to evaluate the response h_{I,III}(τ). If τ is an integer multiple of N such that τ=mN then h_{I,III}(τ)=(−1)^{m}·N/4. The response equals zero for even values of τ other than an integer multiple of N because the numerator of the quotient in equation 46 is equal to zero. The value of the impulse response h_{I,III }for odd values of τ can be seen from inspection. The impulse response may be expressed as follows:
The impulse response h_{I,III }for a rectangular window function and N=128 is illustrated in
By substituting these expressions into equation 42, equations 41a and 41b can be rewritten as:
Using equations 49a and 49b, MDST coefficients for segment II can be calculated from the MDCT coefficients of segments I and III assuming the use of a rectangular window function. The computational complexity of this equation can be reduced by exploiting the fact that the impulse response h_{I,III}(τ) is equal to zero for many odd values of τ.
The sine window function has better frequency selectivity properties than the rectangular window function and is used in some practical coding systems. The following derivation uses a sine window function defined by the expression
w(r)=sin((π/N)(r+½)) (50)
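As an informal check, two products of this window that the derivations for the sine window simplify, w(r)w(r+N/2) and w(r)w(r), can be evaluated numerically against their closed forms ½ sin((2π/N)(r+½)) and ½−½ cos((2π/N)(r+½)). The sketch below is illustrative only; the segment length N=16 and the variable names are assumptions, and the third check (that the squared window values half a segment apart sum to one) is a well-known property of this window rather than a statement taken from this section.

```python
import math

N = 16  # illustrative even segment length (assumption)
half = N // 2


def w(r):
    """Sine window of equation 50."""
    return math.sin(math.pi / N * (r + 0.5))


# w(r) w(r + N/2) = 1/2 sin((2 pi / N)(r + 1/2)), for r in [0, N/2 - 1]
cross_err = max(
    abs(w(r) * w(r + half) - 0.5 * math.sin(2 * math.pi / N * (r + 0.5)))
    for r in range(half)
)

# w(r) w(r) = 1/2 - 1/2 cos((2 pi / N)(r + 1/2)), for r in [0, N - 1]
square_err = max(
    abs(w(r) ** 2 - (0.5 - 0.5 * math.cos(2 * math.pi / N * (r + 0.5))))
    for r in range(N)
)

# w(r)^2 + w(r + N/2)^2 = 1, for r in [0, N/2 - 1]
power_err = max(abs(w(r) ** 2 + w(r + half) ** 2 - 1.0) for r in range(half))

print(cross_err, square_err, power_err)
```

All three deviations are at the level of rounding error, because w(r+N/2)=cos((π/N)(r+½)).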
A simplified expression for the impulse response h_{I,III }may be derived by using a lemma that postulates the following:
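This lemma may be proven by first simplifying the expression for w(r)w(r+N/2). Because w(r+N/2)=sin((π/N)(r+½)+π/2)=cos((π/N)(r+½)), the product may be written as follows:
w(r)w(r+N/2)=sin((π/N)(r+½))·cos((π/N)(r+½))=½ sin((2π/N)(r+½)) (52)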
Substituting this simplified expression into equation 51 obtains the following:
Using the following trigonometric identity
sin u cos v=½[sin(u+v)+sin(u−v)] (54)
equation 53 can be rewritten as follows:
Equation 55 can be simplified by substitution in both terms of I(τ) according to equation 35, setting q=(τ+1) and
in the first term, and setting q=(−τ+1) and
in the second term. This yields the following:
Equation 58 is valid unless the denominator for either quotient is equal to zero. These special cases can be analyzed by inspecting equation 57 to identify the conditions under which either denominator is zero. It can be seen from equation 57 that singularities occur for τ=mN+1 and τ=mN−1, where m is an integer. The following assumes N is an integer multiple of four.
For τ=mN+1 equation 57 can be rewritten as:
The value of the quotient is indeterminate because the numerator and denominator are both equal to zero. L'Hospital's rule can be used to determine its value. Differentiating numerator and denominator with respect to m yields the following:
For τ=mN−1 equation 57 can be rewritten as:
The value of the quotient in this equation is indeterminate because the numerator and denominator are both equal to zero. L'Hospital's rule can be used to determine its value. Differentiating numerator and denominator with respect to m yields the following:
The lemma expressed by equation 51 is proven by combining equations 58, 60 and 62.
A simplified expression for the impulse response h_{II }may be derived by using a lemma that postulates the following:
The proof of this lemma is similar to the previous proof. This proof begins by simplifying the expression for w(r)w(r). Recall that sin^{2} α=½−½ cos (2α), so that:
w(r)w(r)=sin^{2}((π/N)(r+½))=½−½ cos((2π/N)(r+½)) (64)
Using this expression, equation 63 can be rewritten as:
From equation 37 and the associated lemma, it may be seen the first term in equation 65 is equal to zero. The second term may be simplified using the trigonometric identity cos u·sin v=½[ sin (u+v)−sin (u−v)], which obtains the following:
Referring to equation 66, its first term is equal to the negative of the first term in equation 55 and its second term is equal to the second term of equation 55. The lemma expressed in equation 63 may be proven in a manner similar to that used to prove the lemma expressed in equation 51. The principal difference in the proof is the singularity analyses of equation 59 and equation 61. For this proof, I(mN−1) is multiplied by an additional factor of −1; therefore,
Allowing for this difference along with the minus sign preceding the first term of equation 55, the lemma expressed in equation 63 is proven.
An exact expression for impulse response h_{II}(τ) is given by this lemma; however, it needs to be evaluated only for odd values of τ because the modified convolution of h_{II }in equation 41a is evaluated only for τ=(2v−(2l+1)). According to equation 63, h_{II}(τ)=0 for odd values of τ except for τ=mN+1 and τ=mN−1. Because h_{II}(τ) is nonzero for only two values of τ, this impulse response can be expressed as:
The impulse responses h_{I,III}(τ) and h_{II}(τ) for the sine window function and N=128 are illustrated in
Using the analytical expressions for the impulse responses h_{I,III }and h_{II }provided by equations 51 and 67, equations 41a and 41b can be rewritten as:
Using equations 68a and 68b, MDST coefficients for segment II can be calculated from the MDCT coefficients of segments I, II and III assuming the use of a sine window function. The computational complexity of this equation can be reduced further by exploiting the fact that the impulse response h_{I,III}(τ) is equal to zero for many odd values of τ.
Equations 41a and 41b express a calculation of exact MDST coefficients from MDCT coefficients for an arbitrary window function. Equations 49a and 49b express this calculation for a rectangular window function, and equations 68a and 68b express it for a sine window function. These calculations include operations that are similar to the convolution of impulse responses. The computational complexity of calculating the convolution-like operations can be reduced by excluding from the calculations those values of the impulse responses that are known to be zero.
The computational complexity can be reduced further by excluding from the calculations those portions of the full responses that are of lesser significance; however, this resulting calculation provides only an estimate of the MDST coefficients because an exact calculation is no longer possible. By controlling the amounts of the impulse responses that are excluded from the calculations, an appropriate balance between computational complexity and estimation accuracy can be achieved.
The impulse responses themselves are dependent on the shape of the window function that is assumed. As a result, the choice of window function affects the portions of the impulse responses that can be excluded from calculation without reducing coefficient estimation accuracy below some desired level.
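The trade-off described above may be sketched as follows. The example keeps only the nonzero taps of a response, or only its largest taps, and applies them with an ordinary convolution. Everything in it is hypothetical: the toy response h_toy, the input m_toy and the helper names are not taken from this disclosure, and an ordinary convolution stands in for the modified convolution operations of equations 41a and 41b.

```python
def significant_taps(h, keep):
    """Keep the `keep` largest-magnitude nonzero taps of h ({offset: value})."""
    nonzero = [(tau, v) for tau, v in h.items() if v != 0.0]
    nonzero.sort(key=lambda item: abs(item[1]), reverse=True)
    return dict(nonzero[:keep])


def sparse_filter(m, taps):
    """y[k] = sum over tau of taps[tau] * m[k - tau], skipping out-of-range indices."""
    n = len(m)
    return [
        sum(v * m[k - tau] for tau, v in taps.items() if 0 <= k - tau < n)
        for k in range(n)
    ]


# Toy response (NOT from the patent): zero at even offsets, decaying at odd ones.
h_toy = {tau: (1.0 / tau if tau % 2 else 0.0) for tau in range(-7, 8)}
m_toy = [1.0, -0.5, 0.25, 2.0, 0.0, -1.0, 0.5, 1.5]

exact = sparse_filter(m_toy, h_toy)                         # all 15 offsets
pruned = sparse_filter(m_toy, significant_taps(h_toy, 8))   # the 8 nonzero taps
approx = sparse_filter(m_toy, significant_taps(h_toy, 4))   # the 4 largest taps

# Dropping zero-valued taps changes nothing; dropping small taps gives an estimate.
pruning_error = max(abs(p - e) for p, e in zip(pruned, exact))
worst_error = max(abs(a - e) for a, e in zip(approx, exact))
print(pruning_error, worst_error)
```

In this sketch, excluding the zero-valued taps roughly halves the multiply-add count without changing the result, whereas excluding small nonzero taps reduces the count further at the cost of a nonzero error.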
An inspection of equation 49a for rectangular window functions shows the impulse response h_{I,III }is symmetric about τ=0 and decays moderately rapidly. An example of this impulse response for N=128 is shown in
An inspection of equation 68a for the sine window function shows the impulse response h_{I,III} is symmetric about τ=0 and decays more rapidly than the corresponding response for the rectangular window function. For the sine window function, the impulse response h_{II} is nonzero for only two values of τ. Examples of the impulse responses h_{I,III} and h_{II} for a sine window function and N=128 are shown in
Based on these observations, a modified form of equations 41a and 41b that provides an estimate of MDST coefficients for any analysis or synthesis window function may be expressed in terms of two filter structures as follows:
An example of a device 30 that estimates MDST coefficients according to equation 69 is illustrated by a schematic block diagram in
The magnitude and phase estimator 36 calculates measures of magnitude and phase from the calculated MDST coefficients and the MDCT coefficients received from the path 31 and passes these measures along the paths 38 and 39. The MDST coefficients may also be passed along the path 37. Measures of spectral magnitude and phase may be obtained by performing the calculations shown above in equations 10 and 11, for example. Other examples of measures that may be obtained include spectral flux, which may be obtained from the first derivative of spectral magnitude, and instantaneous frequency, which may be obtained from the first derivative of spectral phase.
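A minimal sketch of such an estimator is given below. It assumes that the measures take the usual form of a magnitude sqrt(X²+S²) and a phase obtained from the arctangent of S and X; the sign convention of the phase, the use of differences between successive segments as the first derivatives, and all function names are assumptions made for illustration.

```python
import math


def magnitude_and_phase(mdct, mdst):
    """Per-coefficient magnitude and phase from MDCT (X) and MDST (S) values."""
    mags = [math.hypot(x, s) for x, s in zip(mdct, mdst)]
    phases = [math.atan2(s, x) for x, s in zip(mdct, mdst)]  # assumed sign convention
    return mags, phases


def spectral_flux(prev_mags, cur_mags):
    """First difference of magnitude between successive segments."""
    return [c - p for p, c in zip(prev_mags, cur_mags)]


def phase_advance(prev_phases, cur_phases):
    """First difference of phase between segments, wrapped into [-pi, pi)."""
    return [
        (c - p + math.pi) % (2 * math.pi) - math.pi
        for p, c in zip(prev_phases, cur_phases)
    ]


mags, phases = magnitude_and_phase([3.0, 0.0], [4.0, -2.0])
print(mags)    # magnitudes of the two example coefficients
print(phases)
```

The wrapped phase difference, divided by the time between segments, gives one possible measure of instantaneous frequency.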
Referring to the impulse responses shown in
The number and choice of taps for each filter structure can be selected using any criteria that may be desired. For example, an inspection of two impulse responses h_{I,III }and h_{II }will reveal the portions of the responses that are more significant. Taps may be chosen for only the more significant portions. In addition, computational complexity may be reduced by obtaining only selected MDST coefficients such as the coefficients in one or more frequency ranges.
An adaptive implementation of the present invention may use larger portions of the impulse responses to estimate the MDST coefficients for spectral components that are judged to be perceptually more significant by a perceptual model. For example, a measure of perceptual significance for a spectral component could be derived from the amount by which the spectral component exceeds a perceptual masking threshold that is calculated by a perceptual model. Shorter portions of the impulse responses may be used to estimate MDST coefficients for perceptually less significant spectral components. Calculations needed to estimate MDST coefficients for the least significant spectral components can be avoided.
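One possible way to express such an adaptive rule is sketched below. The mapping from the amount by which a component exceeds its masking threshold to a number of filter taps, and the parameters max_taps and db_per_tap, are purely hypothetical choices made for illustration; this disclosure does not prescribe any particular mapping.

```python
def taps_for_component(level_db, mask_db, max_taps=16, db_per_tap=3.0):
    """Choose how many impulse-response taps to spend on one spectral component.

    level_db and mask_db are the component level and its masking threshold
    in dB. Components at or below the threshold get no taps, so their MDST
    estimate is skipped entirely.
    """
    excess = level_db - mask_db
    if excess <= 0.0:
        return 0
    return min(max_taps, 1 + int(excess / db_per_tap))


# Example: a masked component, a barely audible one, and a dominant one.
print(taps_for_component(40.0, 50.0))
print(taps_for_component(51.0, 50.0))
print(taps_for_component(100.0, 0.0))
```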
A nonadaptive implementation may obtain estimates of MDST coefficients in various frequency subbands of a signal using portions of the impulse responses whose lengths vary according to the perceptual significance of the subbands as determined previously by an analysis of exemplary signals. In many audio coding applications, spectral content in lower frequency subbands generally has greater perceptual significance than spectral content in higher frequency subbands. In these applications, for example, a nonadaptive implementation could estimate MDST coefficients in subbands using portions of the impulse responses whose length varies inversely with the frequency of the subbands.
The preceding disclosure sets forth examples that describe only a few implementations of the present invention. Principles of the present invention may be applied and implemented in a wide variety of ways. Additional considerations are discussed below.
The exemplary implementations described above are derived from the MDCT that is expressed in terms of the ODFT as applied to fixed-length segments of a source signal that overlap one another by half the segment length. A variation of the examples discussed above as well as a variation of the alternatives discussed below may be obtained by deriving implementations from the MDST that is expressed in terms of the ODFT.
Additional implementations of the present invention may be derived from expressions of other transforms including the DFT, the FFT and a generalized expression of the MDCT filter bank discussed in the Princen paper cited above. This generalized expression is described in U.S. Pat. No. 5,727,119 issued Mar. 10, 1998.
Implementations of the present invention also may be derived from expressions of transforms that are applied to varying-length signal segments and transforms that are applied to segments having no overlap or amounts of overlap other than half the segment length.
Some empirical results suggest that an implementation of the present invention with a specified level of computational complexity is often able to derive measures of spectral component magnitude that are more accurate for spectral components representing a band of spectral energy than they are for spectral components representing a single sinusoid or a few sinusoids that are isolated from one another in frequency. The process that estimates spectral component magnitude may be adapted in at least two ways to improve estimation accuracy for signals that have isolated spectral components.
One way to adapt the process is by adaptively increasing the length of the impulse responses for the two filter structures shown in equation 69 so that more accurate computations can be performed for a restricted set of MDST coefficients that are related to the one or more isolated spectral components.
Another way to adapt this process is by adaptively performing an alternate method for deriving spectral component magnitudes for isolated spectral components. The alternate method derives an additional set of spectral components from the MDCT coefficients, and the additional set of spectral components is used to obtain measures of magnitude and/or phase. This adaptation may be done by selecting the more appropriate method for segments of the source signal, and it may be done by using the more appropriate method for portions of the spectrum for a particular segment. A method that is described in the Merdjani paper cited above is one possible alternate method. If it is used, this method preferably is extended to provide magnitude estimates for more than a single sinusoid. This may be done by dynamically arranging MDCT coefficients into bands of frequencies in which each band has a single dominant spectral component and applying the Merdjani method to each band of coefficients.
The presence of a source signal that has one dominant spectral component or a few isolated dominant spectral components may be detected using a variety of techniques. One technique detects local maxima in MDCT coefficients having magnitudes that exceed the magnitudes of adjacent and nearby coefficients by some threshold amount and either counts the number of local maxima or determines the spectral distance between local maxima. Another technique determines the spectral shape of the source signal by calculating an approximate Spectral Flatness Measure (SFM) of the source signal. The SFM is described in N. Jayant et al., “Digital Coding of Waveforms,” Prentice-Hall, 1984, p. 57, and is defined as the ratio of the geometric mean to the arithmetic mean of samples of the power spectral density of a signal.
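Both detection techniques may be sketched in a few lines. In the example below, the SFM follows the definition given above, while the use of a small floor value to avoid taking the logarithm of zero, the peak criterion (a coefficient magnitude exceeding every neighbor within a given radius by a given ratio) and all parameter values are assumptions chosen for illustration.

```python
import math


def spectral_flatness(power, floor=1e-12):
    """Geometric mean over arithmetic mean of power samples (1.0 means flat)."""
    clipped = [max(p, floor) for p in power]  # floor avoids log(0)
    geometric = math.exp(sum(math.log(p) for p in clipped) / len(clipped))
    arithmetic = sum(clipped) / len(clipped)
    return geometric / arithmetic


def isolated_peaks(magnitudes, ratio=4.0, radius=2):
    """Indices whose magnitude exceeds all neighbors within `radius` by `ratio`."""
    peaks = []
    for k, value in enumerate(magnitudes):
        lo = max(0, k - radius)
        hi = min(len(magnitudes), k + radius + 1)
        neighbours = magnitudes[lo:k] + magnitudes[k + 1:hi]
        if neighbours and all(value > ratio * other for other in neighbours):
            peaks.append(k)
    return peaks


flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])
tonal = spectral_flatness([1e-6, 1e-6, 1.0, 1e-6])
peaks = isolated_peaks([0.1, 0.1, 5.0, 0.1, 0.1, 0.2, 4.0, 0.1])
print(flat, tonal, peaks)
```

A value of the measure near one suggests a band of spectral energy, whereas a value near zero, or a small number of widely spaced peaks, suggests isolated spectral components for which the adaptations described above may be appropriate.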
The present invention may be used advantageously in a wide variety of applications. Schematic block diagrams of a transmitter and a receiver incorporating various aspects of the present invention are shown in
The transmitter shown in
The receiver shown in
Devices that incorporate various aspects of the present invention may be implemented in a variety of ways including software for execution by a computer or some other apparatus that includes more specialized components such as digital signal processor (DSP) circuitry coupled to components similar to those found in a general-purpose computer.
In embodiments implemented in a general-purpose computer system, additional components may be included for interfacing to devices such as a keyboard or mouse and a display, and for controlling a storage device having a storage medium such as magnetic tape or disk, or an optical medium. The storage medium may be used to record programs of instructions for operating systems, utilities and applications, and may include embodiments of programs that implement various aspects of the present invention.
The functions required to practice various aspects of the present invention can be performed by components that are implemented in a wide variety of ways including discrete logic components, integrated circuits, one or more ASICs and/or program-controlled processors. The manner in which these components are implemented is not important to the present invention.
Software implementations of the present invention may be conveyed by a variety of machine-readable media such as baseband or modulated communication paths throughout the spectrum including from supersonic to ultraviolet frequencies, or storage media that convey information using essentially any recording technology including magnetic tape, cards or disk, optical cards or disc, and detectable markings on media like paper.
Claims (62)
Priority Applications (4)
Application Number  Priority Date  Filing Date  Title 

US10766681 US6980933B2 (en)  20040127  20040127  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
US11963680 USRE42935E1 (en)  20040127  20071221  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
US13297256 USRE44126E1 (en)  20040127  20111115  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
US13675998 USRE46684E1 (en)  20040127  20121113  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
Applications Claiming Priority (1)
Application Number  Priority Date  Filing Date  Title 

US13675998 USRE46684E1 (en)  20040127  20121113  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
Related Parent Applications (1)
Application Number  Title  Priority Date  Filing Date  

US10766681 Reissue US6980933B2 (en)  20040127  20040127  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
Publications (1)
Publication Number  Publication Date 

USRE46684E1 true USRE46684E1 (en)  20180123 
Family
ID=34795716
Family Applications (4)
Application Number  Title  Priority Date  Filing Date 

US10766681 Active 20240211 US6980933B2 (en)  20040127  20040127  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
US11963680 Active USRE42935E1 (en)  20040127  20071221  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
US13297256 Active USRE44126E1 (en)  20040127  20111115  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
US13675998 Active USRE46684E1 (en)  20040127  20121113  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
Country Status (9)
Country  Link 

US (4)  US6980933B2 (en) 
EP (1)  EP1709627B1 (en) 
JP (1)  JP4787176B2 (en) 
KR (1)  KR101184992B1 (en) 
CN (1)  CN1918633B (en) 
CA (1)  CA2553784C (en) 
DK (1)  DK1709627T3 (en) 
ES (1)  ES2375285T3 (en) 
WO (1)  WO2005073960A1 (en) 
Families Citing this family (23)
Publication number  Priority date  Publication date  Assignee  Title 

US6980933B2 (en)  20040127  20051227  Dolby Laboratories Licensing Corporation  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
KR20070001115A (en) *  20040128  20070103  코닌클리케 필립스 일렉트로닉스 엔.브이.  Audio signal decoding using complexvalued data 
US9055298B2 (en) *  20050715  20150609  Qualcomm Incorporated  Video encoding method enabling highly efficient partial decoding of H.264 and other transform coded information 
US20070118361A1 (en) *  20051007  20070524  Deepen Sinha  Window apparatus and method 
US8126706B2 (en) *  20051209  20120228  Acoustic Technologies, Inc.  Music detector for echo cancellation and noise reduction 
WO2007085275A1 (en) *  20060127  20070802  Coding Technologies Ab  Efficient filtering with a complex modulated filterbank 
CN101213423B (en) *  20060619  20100616  松下电器产业株式会社  Phase correction circuit of encoder signal 
US8214200B2 (en) *  20070314  20120703  Xfrm, Inc.  Fast MDCT (modified discrete cosine transform) approximation of a windowed sinusoid 
KR101597375B1 (en)  20071221  20160224  디티에스 엘엘씨  System for adjusting perceived loudness of audio signals 
KR101428487B1 (en) *  20080711  20140808  삼성전자주식회사  Method and apparatus for encoding and decoding multichannel 
CN101552006B (en)  20090512  20111228  武汉大学  The energy and phase adjusting method and apparatus domain windowed signal mdct 
US8805680B2 (en) *  20090519  20140812  Electronics And Telecommunications Research Institute  Method and apparatus for encoding and decoding audio signal using layered sinusoidal pulse coding 
CN101958119B (en) *  20090716  20120229  中兴通讯股份有限公司  Audiofrequency dropframe compensator and compensation method for modified discrete cosine transform domain 
US8538042B2 (en)  20090811  20130917  Dts Llc  System for increasing perceived loudness of speakers 
ES2507165T3 (en) *  20091021  20141014  Dolby International Ab  Oversampling filter bank combined reemisor 
EP2372704A1 (en) *  20100311  20111005  FraunhoferGesellschaft zur Förderung der Angewandten Forschung e.V.  Signal processor and method for processing a signal 
CN104851427B (en)  20100409  20180717  杜比国际公司  A decoding method and decoding system 
EP2375409A1 (en)  20100409  20111012  FraunhoferGesellschaft zur Förderung der angewandten Forschung e.V.  Audio encoder, audio decoder and related methods for processing multichannel audio signals using complex prediction 
US9135929B2 (en) *  20110428  20150915  Dolby International Ab  Efficient content classification and loudness estimation 
US9312829B2 (en)  20120412  20160412  Dts Llc  System for adjusting loudness of audio signals in real time 
KR101498113B1 (en) *  20131023  20150304  광주과학기술원  A apparatus and method extending bandwidth of sound signal 
EP2963645A1 (en) *  20140701  20160106  FraunhoferGesellschaft zur Förderung der angewandten Forschung e.V.  Calculator and method for determining phase correction data for an audio signal 
EP3067889A1 (en)  20150309  20160914  FraunhoferGesellschaft zur Förderung der angewandten Forschung e.V.  Method and apparatus for signaladaptive transform kernel switching in audio coding 
Citations (25)
Publication number  Priority date  Publication date  Assignee  Title 

US5285498A (en) *  19920302  19940208  At&T Bell Laboratories  Method and apparatus for coding audio signals based on perceptual model 
US5297236A (en) *  19890127  19940322  Dolby Laboratories Licensing Corporation  Low computationalcomplexity digital filter bank for encoder, decoder, and encoder/decoder 
US5451954A (en) *  19930804  19950919  Dolby Laboratories Licensing Corporation  Quantization noise suppression for encoder/decoder system 
US5592584A (en) *  19920302  19970107  Lucent Technologies Inc.  Method and apparatus for twocomponent signal compression 
US5627938A (en) *  19920302  19970506  Lucent Technologies Inc.  Rate loop processor for perceptual encoder/decoder 
US5682463A (en) *  19950206  19971028  Lucent Technologies Inc.  Perceptual audio compression based on loudness uncertainty 
US5699484A (en) *  19941220  19971216  Dolby Laboratories Licensing Corporation  Method and apparatus for applying linear prediction to critical band subbands of splitband perceptual coding systems 
US5699479A (en) *  19950206  19971216  Lucent Technologies Inc.  Tonality for perceptual audio compression based on loudness uncertainty 
US5727119A (en) *  19950327  19980310  Dolby Laboratories Licensing Corporation  Method and apparatus for efficient implementation of singlesideband filter banks providing accurate measures of spectral magnitude and phase 
US5945940A (en) *  19980312  19990831  Massachusetts Institute Of Technology  Coherent ultrawideband processing of sparse multisensor/multispectral radar measurements 
JP2000048481A (en)  19980729  20000218  Sony Corp  Signal processor, recording medium and signal processing method 
US6035177A (en) *  19960226  20000307  Donald W. Moses  Simultaneous transmission of ancillary and audio signals by means of perceptual coding 
US6131084A (en) *  19970314  20001010  Digital Voice Systems, Inc.  Dual subframe quantization of spectral magnitudes 
US6161089A (en) *  19970314  20001212  Digital Voice Systems, Inc.  Multisubframe quantization of spectral parameters 
US6182030B1 (en) *  19981218  20010130  Telefonaktiebolaget Lm Ericsson (Publ)  Enhanced coding to improve coded communication signals 
US6266644B1 (en) *  19980926  20010724  Liquid Audio, Inc.  Audio encoding apparatus and methods 
US6453289B1 (en) *  19980724  20020917  Hughes Electronics Corporation  Method of noise reduction for speech codecs 
US20030016772A1 (en) *  20010402  20030123  Per Ekstrand  Aliasing reduction using complexexponential modulated filterbanks 
US20030093282A1 (en)  20010905  20030515  Creative Technology Ltd.  Efficient system and method for converting between different transformdomain signal representations 
US6680972B1 (en) *  19970610  20040120  Coding Technologies Sweden Ab  Source coding enhancement using spectralband replication 
US20040071284A1 (en) *  20020816  20040415  Abutalebi Hamid Reza  Method and system for processing subband signals using adaptive filters 
US6847737B1 (en) *  19980313  20050125  University Of Houston System  Methods for performing DAF data filtering and padding 
US6862326B1 (en) *  20010220  20050301  Comsys Communication & Signal Processing Ltd.  Whitening matched filter for use in a communications receiver 
US20050197831A1 (en) *  20020726  20050908  Bernd Edler  Device and method for generating a complex spectral representation of a discretetime signal 
US6980933B2 (en)  20040127  20051227  Dolby Laboratories Licensing Corporation  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients 
Patent Citations (33)
Publication number  Priority date  Publication date  Assignee  Title 

US5297236A (en) *  19890127  19940322  Dolby Laboratories Licensing Corporation  Low computationalcomplexity digital filter bank for encoder, decoder, and encoder/decoder 
US5285498A (en) *  19920302  19940208  At&T Bell Laboratories  Method and apparatus for coding audio signals based on perceptual model 
US5481614A (en) *  19920302  19960102  At&T Corp.  Method and apparatus for coding audio signals based on perceptual model 
US5592584A (en) *  19920302  19970107  Lucent Technologies Inc.  Method and apparatus for twocomponent signal compression 
US5627938A (en) *  19920302  19970506  Lucent Technologies Inc.  Rate loop processor for perceptual encoder/decoder 
US5451954A (en) *  19930804  19950919  Dolby Laboratories Licensing Corporation  Quantization noise suppression for encoder/decoder system 
US5699484A (en) *  19941220  19971216  Dolby Laboratories Licensing Corporation  Method and apparatus for applying linear prediction to critical band subbands of splitband perceptual coding systems 
US5682463A (en) *  19950206  19971028  Lucent Technologies Inc.  Perceptual audio compression based on loudness uncertainty 
US5699479A (en) *  19950206  19971216  Lucent Technologies Inc.  Tonality for perceptual audio compression based on loudness uncertainty 
US5727119A (en) *  19950327  19980310  Dolby Laboratories Licensing Corporation  Method and apparatus for efficient implementation of singlesideband filter banks providing accurate measures of spectral magnitude and phase 
US6035177A (en) *  19960226  20000307  Donald W. Moses  Simultaneous transmission of ancillary and audio signals by means of perceptual coding 
US6161089A (en) *  1997-03-14  2000-12-12  Digital Voice Systems, Inc.  Multi-subframe quantization of spectral parameters
US6131084A (en) *  1997-03-14  2000-10-10  Digital Voice Systems, Inc.  Dual subframe quantization of spectral magnitudes
US6680972B1 (en) *  1997-06-10  2004-01-20  Coding Technologies Sweden Ab  Source coding enhancement using spectral-band replication
US20040078205A1 (en) *  1997-06-10  2004-04-22  Coding Technologies Sweden Ab  Source coding enhancement using spectral-band replication
US5945940A (en) *  1998-03-12  1999-08-31  Massachusetts Institute Of Technology  Coherent ultra-wideband processing of sparse multi-sensor/multi-spectral radar measurements
US6847737B1 (en) *  1998-03-13  2005-01-25  University Of Houston System  Methods for performing DAF data filtering and padding
US6453289B1 (en) *  1998-07-24  2002-09-17  Hughes Electronics Corporation  Method of noise reduction for speech codecs
JP2000048481A (en)  1998-07-29  2000-02-18  Sony Corp  Signal processor, recording medium and signal processing method
US6266644B1 (en) *  1998-09-26  2001-07-24  Liquid Audio, Inc.  Audio encoding apparatus and methods
US6182030B1 (en) *  1998-12-18  2001-01-30  Telefonaktiebolaget Lm Ericsson (Publ)  Enhanced coding to improve coded communication signals
US6862326B1 (en) *  2001-02-20  2005-03-01  Comsys Communication & Signal Processing Ltd.  Whitening matched filter for use in a communications receiver
US7242710B2 (en) *  2001-04-02  2007-07-10  Coding Technologies Ab  Aliasing reduction using complex-exponential modulated filterbanks
US20030016772A1 (en) *  2001-04-02  2003-01-23  Per Ekstrand  Aliasing reduction using complex-exponential modulated filterbanks
US20030093282A1 (en)  2001-09-05  2003-05-15  Creative Technology Ltd.  Efficient system and method for converting between different transform-domain signal representations
US8155954B2 (en) *  2002-07-26  2012-04-10  Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.  Device and method for generating a complex spectral representation of a discrete-time signal
US20050197831A1 (en) *  2002-07-26  2005-09-08  Bernd Edler  Device and method for generating a complex spectral representation of a discrete-time signal
US20100161319A1 (en) *  2002-07-26  2010-06-24  Bernd Edler  Device and method for generating a complex spectral representation of a discrete-time signal
US7707030B2 (en) *  2002-07-26  2010-04-27  Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.  Device and method for generating a complex spectral representation of a discrete-time signal
US20040071284A1 (en) *  2002-08-16  2004-04-15  Abutalebi Hamid Reza  Method and system for processing subband signals using adaptive filters
US7783032B2 (en) *  2002-08-16  2010-08-24  Semiconductor Components Industries, Llc  Method and system for processing subband signals using adaptive filters
USRE42935E1 (en)  2004-01-27  2011-11-15  Dolby Laboratories Licensing Corporation  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
US6980933B2 (en)  2004-01-27  2005-12-27  Dolby Laboratories Licensing Corporation  Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
Non-Patent Citations (15)
Title 

Bosi et al., "ISO/IEC MPEG-2 Advanced Audio Coding," J. Audio Eng. Soc., vol. 45, No. 10, Oct. 1997, pp. 789-814.
Daudet et al., "MDCT Analysis of Sinusoids and Applications to Coding Artifacts Reduction," Audio Eng. Soc. 114th Convention, Amsterdam, Mar. 2003, convention paper No. 5831, pp. 1-6.
Duhamel et al., "A Fast Algorithm for the Implementation of Filter Banks Based on Time Domain Aliasing Cancellation," Int. Conf. on Acoust., Speech and Sig. Proc., Toronto, May 1991, vol. 3, pp. 2209-2212.
Ferreira, "Accurate Estimation in the ODFT Domain of the Frequency, Phase and Magnitude of Stationary Sinusoids," IEEE Workshop on Appl. of Sig. Proc. to Audio and Acoust., Oct. 2001, pp. 47-50.
Ferreira, "Combined Spectral Envelope Normalization & Subtraction of Sinusoidal Components on the ODFT & MDCT Frequency Domains," Proc. 2001 IEEE Appl. of Sig. Proc. to Audio & Acoust., NY, Oct. 2001, pp. 51-54.
Kresch, R., et al., "Fast DCT Domain Filtering Using the DCT and the DCT," IEEE Transactions on Image Processing, IEEE Inc., New York, US, vol. 8, No. 6, Jun. 1999; pp. 821-833.
Kresch, R., et al., "Fast DCT Domain Filtering Using the DCT and the DST," IEEE Transactions on Image Processing, IEEE Inc., New York, US, vol. 8, No. 6, Jun. 1999; pp. 821-833.
Lanciani et al., "Subband-Domain Filtering of MPEG Audio Signals," Proc. of 1999 Int. Conf. on Acoust., Speech and Signal Proc., Phoenix, AZ, Mar. 1999, vol. 2, pp. 917-920.
Mathew, M., et al., "Modified mp3 encoder using complex modified discrete cosine transform," Multimedia and Expo. 2003, Proceedings, 2003 Int'l. Conference on Jul. 6-9, 2003, Piscataway, NJ, USA, IEEE, vol. 2, Jul. 6, 2003, pp. 709-712.
Merdjani et al., "Direct Estimation of Frequency From MDCT-Encoded Files," Proc. of 6th Int. Conf. on Digital Audio Effects (DAFx-03), London, Sep. 2003.
Princen et al., "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation," IEEE Trans. on Acoust., Speech, Signal Proc., vol. ASSP-34, 1986, pp. 1153-1161.
Wang et al., "Modified Discrete Cosine Transform-Its Implications for Audio Coding and Error Concealment," J. Audio Eng. Soc., vol. 51, No. 1/2, Jan./Feb. 2003, pp. 52-61.
Wang et al., "On the Relationship Between MDCT, SDFT and DFT," Proc. of 5th Int. Conf. on Sig. Proc., Beijing, Aug. 2000, vol. 1, pp. 44-47.
Wang et al., "Some Peculiar Properties of the MDCT," Proc. of 5th Int. Conf. on Sig. Proc., Beijing, Aug. 2000, vol. 1, pp. 61-64.
Wang et al., "Modified Discrete Cosine Transform—Its Implications for Audio Coding and Error Concealment," J. Audio Eng. Soc., vol. 51, No. 1/2, Jan./Feb. 2003, pp. 52-61.
Also Published As
Publication number  Publication date  Type 

USRE44126E1 (en)  2013-04-02  grant
JP4787176B2 (en)  2011-10-05  grant
JP2007524300A (en)  2007-08-23  application
EP1709627A1 (en)  2006-10-11  application
DK1709627T3 (en)  2012-02-13  grant
CN1918633B (en)  2011-01-05  grant
KR101184992B1 (en)  2012-10-02  grant
CA2553784C (en)  2013-07-30  grant
KR20060131797A (en)  2006-12-20  application
ES2375285T3 (en)  2012-02-28  grant
CN1918633A (en)  2007-02-21  application
EP1709627B1 (en)  2011-11-02  grant
US20050165587A1 (en)  2005-07-28  application
US6980933B2 (en)  2005-12-27  grant
USRE42935E1 (en)  2011-11-15  grant
WO2005073960A1 (en)  2005-08-11  application
CA2553784A1 (en)  2005-08-11  application
Similar Documents
Publication  Publication Date  Title 

Plumbley et al.  Sparse representations in audio and music: from coding to source separation  
US6226608B1 (en)  Data framing for adaptive-block-length coding system
Goodwin  Adaptive signal models: Theory, algorithms, and audio applications  
US5890125A (en)  Method and apparatus for encoding and decoding multiple audio channels at low bit rates using adaptive selection of encoding method  
USRE39080E1 (en)  Rate loop processor for perceptual encoder/decoder  
US20070100607A1 (en)  Time warped modified transform coding of audio signals  
US20070225971A1 (en)  Methods and devices for low-frequency emphasis during audio compression based on ACELP/TCX
US20020176353A1 (en)  Scalable and perceptually ranked signal coding and decoding  
US7356748B2 (en)  Partial spectral loss concealment in transform codecs  
US5394473A (en)  Adaptive-block-length, adaptive-transforn, and adaptive-window transform coder, decoder, and encoder/decoder for high-quality audio
EP0559383A1 (en)  A method and apparatus for coding audio signals based on perceptual model  
US6934677B2 (en)  Quantization matrices based on critical band pattern information for digital audio wherein quantization bands differ from critical bands  
Shlien  Guide to MPEG-1 audio standard
US6324560B1 (en)  Fast system and method for computing modulated lapped transforms  
US20070016404A1 (en)  Method and apparatus to extract important spectral component from audio signal and low bitrate audio signal coding and/or decoding method and apparatus using the same  
US6963842B2 (en)  Efficient system and method for converting between different transform-domain signal representations
US20110035212A1 (en)  Transform coding of speech and audio signals  
US7369989B2 (en)  Unified filter bank for audio coding  
US5357594A (en)  Encoding and decoding using specially designed pairs of analysis and synthesis windows  
US5222189A (en)  Low time-delay transform coder, decoder, and encoder/decoder for high-quality audio
US7275036B2 (en)  Apparatus and method for coding a time-discrete audio signal to obtain coded audio data and for decoding coded audio data
EP2743922A1 (en)  Method and apparatus for compressing and decompressing a higher order ambisonics representation for a sound field  
US6154762A (en)  Fast system and method for computing modulated lapped transforms  
US20090271204A1 (en)  Audio Compression  
US20030187663A1 (en)  Broadband frequency translation for high frequency regeneration 
Legal Events
Date  Code  Title  Description 

AS  Assignment 
Owner name: DOLBY LABORATORIES LICENSING CORPORATION, CALIFORN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CHENG, COREY;SMITHERS, MICHAEL;REEL/FRAME:029302/0292 Effective date: 2004-07-13

AS  Assignment 
Owner name: DOLBY LABORATORIES LICENSING CORPORATION, CALIFORN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LATHROP, DAVID N.;REEL/FRAME:044121/0515 Effective date: 2004-07-09