WO2001026095A1 - Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching - Google Patents

Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching

Info

Publication number
WO2001026095A1
Authority
WO
WIPO (PCT)
Prior art keywords
time
frequency
signal
resolution
spectral envelope
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/SE2000/001887
Other languages
English (en)
French (fr)
Inventor
Lars Gustaf Liljeryd
Kristofer KJÖRLING
Per Ekstrand
Fredrik Henn
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Coding Technologies Sweden AB
Original Assignee
Coding Technologies Sweden AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
Priority claimed from SE9903552A (SE9903552D0)
Priority to EP00968271A (EP1216474B1)
Priority to PT00968271T (PT1216474E)
Priority to DE60012198T (DE60012198T2)
Priority to AT00968271T (ATE271250T1)
Application filed by Coding Technologies Sweden AB
Priority to AU78212/00A (AU7821200A)
Priority to HK03101398.3A (HK1049401B)
Priority to JP2001528974A (JP4035631B2)
Priority to BRPI0014642A (BRPI0014642B1)
Publication of WO2001026095A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G10L19/025 Detection of transients or attacks for time/frequency resolution switching
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • G10L19/0208 Subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/035 Scalar quantisation
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Definitions

  • the present invention relates to a new method and apparatus for efficient coding of spectral envelopes in audio coding systems.
  • the method may be used both for natural audio coding and speech coding and is especially suited for coders using SBR [WO 98/57436] or other high frequency reconstruction methods.
  • Audio source coding techniques can be divided into two classes: natural audio coding and speech coding.
  • Natural audio coding is commonly used for music or arbitrary signals at medium bitrates, and generally offers wide audio bandwidth. Speech coders are basically limited to speech reproduction but can on the other hand be used at very low bitrates, albeit with low audio bandwidth.
  • the signal is generally separated into two major signal components, the "spectral envelope” and the corresponding "residual” signal.
  • the term "spectral envelope" refers to the coarse spectral distribution of the signal in a general sense, e.g.
  • filter coefficients in a linear prediction based coder or a set of time-frequency averages of subband samples in a subband coder.
  • residual refers to the fine spectral distribution in a general sense, e.g. the LPC error signal or subband samples normalized using the above time-frequency averages.
  • envelope data refers to the quantized and coded spectral envelope, and "residual data" to the quantized and coded residual. At medium and high bitrates, the residual data constitutes the main part of the bitstream. At very low bitrates, the envelope data constitutes a larger part of the bitstream. Hence, it is important to represent the spectral envelope compactly when using lower bitrates.
  • Prior art audio coders and most speech coders use constant length, relatively short, time segments in the generation of envelope data to achieve good temporal resolution.
  • this prevents optimal utilisation of the frequency domain masking known from psycho-acoustics.
  • modern audio coders employ adaptive window switching, i.e. they switch time segment lengths depending on the signal's statistics.
  • Clearly a minimum usage of the short segments is a prerequisite for maximum coding gain.
  • long transition windows are needed to alter the segment lengths, limiting the switching flexibility.
  • the spectral envelope is a function of two variables: time and frequency.
  • the encoding can be done by exploiting redundancy in either direction of the time/frequency plane.
  • coding of the spectral envelope is performed in the frequency direction, using delta coding (DPCM) or vector quantization (VQ).
  • the present invention provides a new method, and an apparatus for spectral envelope coding.
  • the coding scheme is designed to meet the special requirements of systems where the residual signal within certain frequency regions is excluded from the transmitted data. Examples are systems employing HFR (High Frequency Reconstruction), in particular SBR (Spectral Band Replication), or parametric coders.
  • HFR High Frequency Reconstruction
  • SBR Spectral Band Replication
  • non-uniform time and frequency sampling of the spectral envelope is obtained by adaptively grouping subband samples from a fixed size filterbank into frequency bands and time segments, each of which generates one envelope sample. This allows instantaneous selection of arbitrary time and frequency resolution within the limits of the filterbank. The system defaults to long time segments and high frequency resolution.
  • variable time/frequency resolution method is also applicable on envelope encoding based on prediction. Instead of grouping of subband samples, predictor coefficients are generated for time segments of varying lengths according to the system.
  • the invention describes two schemes for signalling of the time and frequency resolution used.
  • the first scheme allows arbitrary selection, by explicit signalling of time segment borders and frequency resolutions. In order to reduce the signalling overhead, four classes of granules are used, offering different cost/flexibility tradeoffs.
  • the second scheme exploits the property of typical programme material that transients are separated by at least a time T_nmin, in order to reduce the number of control bits further.
  • the position within the interval is encoded and sent to the decoder.
  • the encoder and decoder share rules that specify the time/frequency distribution of the spectral envelope samples, given a certain combination of subsequent control signals, ensuring an unambiguous decoding of the envelope data.
  • the present invention presents a new and efficient method for scalefactor redundancy coding.
  • a Dirac pulse in the time domain transforms to a constant in the frequency domain, and a Dirac pulse in the frequency domain, i.e. a single sinusoid, corresponds to a signal with constant magnitude in the time domain.
  • Figs. 1a - 1b illustrate uniform and non-uniform sampling in time of the spectral envelope, respectively.
  • Figs. 2a - 2b define, and illustrate usage of four classes of granules.
  • Figs. 3a - 3b are two examples of granules, and the corresponding control signals.
  • Figs. 4a - 4c illustrate the position signalling system.
  • Fig. 5 illustrates time/frequency switched delta coding.
  • Fig. 6 is a block diagram of an encoder using the envelope coding according to the invention.
  • Fig. 7 is a block diagram of a decoder using the envelope coding according to the invention.
  • Fig. 1 shows the time/frequency representation of a musical signal where sustained chords are combined with sharp transients with mainly high frequency contents.
  • In the lowband the chords have high power and the transient power is low, whereas the opposite is true in the highband.
  • the envelope data that is generated during time intervals where transients are present is dominated by the high intermittent transient power.
  • the spectral envelope of the transposed signal is estimated using the same instantaneous time/frequency resolution as used for the analysis of the original highband. An equalization of the transposed signal is then performed, based on dissimilarities in the spectral envelopes. E.g.
  • amplification factors in an envelope adjusting filterbank are calculated as the square root of the quotients between original signal and transposed signal average power.
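The gain rule stated above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `envelope_gains`, the list-based power vectors, and the `eps` guard against division by zero are all assumptions introduced here.

```python
import math

def envelope_gains(original_power, transposed_power, eps=1e-12):
    """Per-band amplitude gains: sqrt(original power / transposed power).

    Both arguments hold average power per envelope sample (same grid).
    eps is a hypothetical guard against empty (zero-power) bands.
    """
    if len(original_power) != len(transposed_power):
        raise ValueError("power vectors must have the same length")
    return [math.sqrt(p_orig / max(p_trans, eps))
            for p_orig, p_trans in zip(original_power, transposed_power)]

# Three bands: the first must be boosted, the last attenuated.
gains = envelope_gains([4.0, 1.0, 0.25], [1.0, 1.0, 1.0])
```

Because the gain is applied to amplitudes while the envelope stores power, the square root converts a power ratio into an amplitude ratio.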
  • a problem arises: The transposed signal has the same "chord-to-transient" power ratio as the lowband. The gains needed in order to adjust the transposed transients to the correct level thus cause the transposed chords to be amplified relative to the original highband level for the full duration of the envelope data containing transient energy. These momentarily too loud chord fragments are perceived as pre- and post-echoes to the transient, see Fig. 1a.
  • the solution is to maintain a low update rate during tonal passages, which make up the major parts of a typical programme material, and by means of a transient detector localize the transient positions, and update the envelope data close to the leading flanks, see Fig. 1b.
  • the update rate is momentarily increased in a time interval after the transient start. This eliminates gain-induced post-echoes.
  • the time segmenting during the decay is not as crucial as finding the start of the transient, as will be explained later.
  • a non-uniform sampling in time and frequency as outlined above is applicable both on filterbank- and linear prediction-based envelope coding. Different predictor orders may be used for transient and quasi-stationary (tonal) segments.
  • frequency resolution refers to a specific set of frequency bands, LPC coefficients or similar, used in the envelope estimate for a particular time segment.
  • high frequency resolution or high time resolution can be obtained instantaneously.
  • all practical codec bitstreams comprise data periods, each of which corresponds to a short time segment of the input signal.
  • the time segment associated with such a data period is hereinafter referred to as a "granule".
  • Typical coders use granules of fixed length. The presence of granule boundaries imposes constraints on the design of the time segments used for envelope estimation.
  • the algorithm that generates these time segments may state that a segment "border" is required at a particular location, and that the subsequent segment should have a certain length. However, if a granule boundary falls within this interval due to fixed length granules, the segment must be split into two parts. This has two implications: First, the number of segments to encode increases, possibly increasing the amount of data to transmit. Second, forced borders may generate segments that are too short for reliable average power estimates. In order to avoid those shortcomings, the present invention uses variable length granules. This requires look-ahead in the encoder, as well as extra buffering in the decoder.
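The forced split described above can be illustrated with a short sketch. The function name `split_at_granule_boundaries` and the convention that times are integer subgranule indices with an exclusive stop are assumptions made for this example only.

```python
def split_at_granule_boundaries(segment, granule_length):
    """Split a (start, stop) segment wherever a fixed granule boundary falls inside it.

    Times are integer subgranule indices, stop is exclusive, and granule
    boundaries sit at multiples of granule_length (an assumed convention).
    """
    start, stop = segment
    pieces = []
    while start < stop:
        next_boundary = (start // granule_length + 1) * granule_length
        end = min(stop, next_boundary)
        pieces.append((start, end))
        start = end
    return pieces

# A 6-step segment starting 2 steps before a boundary is cut into a very
# short piece and a remainder: the situation variable length granules avoid.
pieces = split_at_granule_boundaries((14, 20), 16)
```

The two-step piece `(14, 16)` is the kind of segment the text warns may be too short for a reliable average power estimate.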
  • the global grid denotes the time segments and the corresponding frequency resolutions to use for a particular signal
  • the local grid denotes the grid of one granule.
  • the grid must be signalled to the decoder for correct decoding of the envelope samples.
  • in low bitrate applications the number of bits for this "control signal" must be kept at a minimum.
  • Two signalling schemes are proposed in the present invention. Prior to describing them in detail, a "baseline system" and some design criteria are established.
  • Let the time quantization step for the spectral envelope be T_q.
  • Those steps may be viewed as "subgranules", which are grouped into the aforementioned time segments.
  • a granule comprises S subgranules, where S varies from granule to granule.
  • the number of possible segment combinations within a granule, ranging from one segment for the entire granule to S segments, is given by
  • An arbitrary subdivision of the granule can be signalled by S - 1 bits, representing the consecutive subgranules, stating whether a leading segment border is present at the corresponding subgranule or not. (The first and last granule borders need not be signalled here.) Since S is variable it must be signalled, and if this scheme is combined with a fixed length granule lowband codec, the position relative to the constant length granules must be signalled as well.
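The baseline signalling above (one flag bit per interior subgranule) can be sketched as below. The function names and the 0-based, exclusive-stop segment convention are assumptions for illustration. The count of 2^(S-1) subdivisions is simply what S - 1 independent bits imply; it is not quoted from the source, whose equation is not reproduced in this extract.

```python
def borders_to_bits(borders, num_subgranules):
    """Encode interior segment borders as S - 1 flag bits.

    borders: subgranule indices 1..S-1 at which a new segment starts.
    """
    bits = [0] * (num_subgranules - 1)
    for border in borders:
        if not 1 <= border <= num_subgranules - 1:
            raise ValueError("border outside the interior of the granule")
        bits[border - 1] = 1
    return bits

def bits_to_segments(bits):
    """Decode the flag bits back into (start, stop) segments, stop exclusive."""
    num_subgranules = len(bits) + 1
    starts = [0] + [i + 1 for i, bit in enumerate(bits) if bit]
    stops = starts[1:] + [num_subgranules]
    return list(zip(starts, stops))

def count_subdivisions(num_subgranules):
    """Number of distinct subdivisions expressible with S - 1 independent bits."""
    return 2 ** (num_subgranules - 1)

bits = borders_to_bits([2, 5], 6)
segments = bits_to_segments(bits)
```

For S = 6 this costs 5 control bits per granule regardless of content, which motivates the cheaper schemes discussed later in the text.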
  • the segment frequency resolutions can be signalled with dynamically allocated control bits, e.g.
  • the minimum time-span between consecutive transients in music programme material can be estimated in the following way:
  • the rhythmic "pulse" is described by a time signature expressed as a fraction A/B, where A denotes the number of "beats" per bar and 1/B is the type of note corresponding to one beat, for example a 1/4 note, commonly referred to as a quarter note.
  • t denote the tempo in Beats Per Minute (BPM)
  • BPM Beats Per Minute
  • The necessary time resolution T_q must also be established.
  • a transient signal has its main energy in the highband to be reconstructed. This means that the encoded spectral envelope must carry all the "timing" information. The desired timing precision thus determines the resolution needed for encoding of leading flanks.
  • T_q is much smaller than the minimum note period T_nmin, since small time deviations within the period clearly can be heard.
  • the transient has significant energy in the lowband.
  • the above described gain-induced pre-echoes must fall within the so called pre- or backward masking time T_m of the human auditory system in order to be inaudible.
  • T_q must satisfy two conditions:
  • T_m < T_nmin (otherwise the notes would be so fast that they could not be resolved) and according to ["Modeling the Additivity of Nonsimultaneous Masking", Hearing Res., vol. 80, pp. 105-118 (1994)], T_m amounts to 10-20 ms. Since T_nmin is in the 50 ms range, a reasonable selection of T_q according to Eq 3 means that the second condition is also met. Of course the precision of the transient detection in the encoder and the time resolution of the analysis/synthesis filterbank must also be considered when selecting T_q. Tracking of trailing flanks is less crucial, for several reasons: First, the note-off position has little or no effect on the perceived rhythm. Second, most instruments do not exhibit sharp trailing flanks, but rather a smooth decay curve, i.e. a well defined note-off time does not exist. Third, the post- or forward masking time is substantially longer than the pre-masking time.
  • both systems according to the present invention employ two time sampling modes; uniform and non-uniform sampling in time.
  • the uniform mode is used during quasi-stationary passages, whereby fixed length segments are used, and little extra signalling is required.
  • the system switches to non-uniform operation and granules of variable length are used, enabling a good fit to the ideal global grid.
  • the granules are divided into four classes, and the control signals are tailored towards the specific needs of each class.
  • the classes are defined in Fig. 2a.
  • Class “FixFix” corresponds to conventional constant length granules
  • Class “FixVar” has a movable stop boundary, which allows the granule length to vary.
  • Class "VarFix" has a variable start boundary, whereas the stop border is fixed.
  • the last class, "VarVar", has variable boundaries at both ends. All variable boundaries can be offset -a / +b versus the "nominal positions".
  • Fig 2b gives an example of a sequence of granules.
  • the system defaults to class FixFix.
  • a transient detector (or psycho-acoustical model) operates on a time region ahead of the current granule, as outlined in the figure.
  • a class FixVar granule is used - the system switches from uniform to non-uniform operation.
  • this granule is followed by a class VarFix granule, since transients most of the time are separated by a number of granules for all practical selections of granule lengths.
  • the VarVar class frames may be used.
  • Fig 3a is an example of a class FixVar - VarFix pair, and the corresponding control signal.
  • One transient is present, and the leading flank (quantized to T_q) is denoted by t.
  • the first part of the bitstream is the "class" signal. Since four classes are used, two bits are used for this signal.
  • the next signal describes the location of the variable boundary, expressed as the offset from the nominal position. This boundary is referred to as the "absolute border".
  • the segment borders within the granules are described by means of "relative borders": The absolute border is used as a reference, and the other borders are described as cumulative distances to the reference.
  • the number of relative borders is variable, and is signalled to the decoder, after the absolute border.
  • a zero number means that the granule comprises one time segment only.
  • the segment lengths are signalled in a reversed sequence, moving away from the absolute border at the end of the granule.
  • the length of the first segment in a FixVar granule is derived from the relative borders and the total length, and is not signalled.
  • Class VarFix relative border signals are inserted into the bitstream in a forward sequence, whereby the last segment length is excluded.
  • the bitstream signal order is identical to that of class FixVar, that is: [class, abs. border, number of rel. borders, rel. border 0, rel. border 1, ..., rel. border N-1]
  • the signals are shown in "clear text" instead of the actual binary code words sent in the bitstream.
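A decoder-side reading of the FixVar control signal might look like the sketch below. Several points are assumptions made for illustration: times are integer multiples of T_q, the relative borders are taken as segment lengths listed backwards from the absolute border, and the function name `decode_fixvar` and its arguments are hypothetical. The actual code word formats are not given in this extract.

```python
def decode_fixvar(start, nominal_stop, abs_offset, rel_lengths):
    """Rebuild the time segments of a class FixVar granule.

    start        : fixed start border of the granule (in T_q steps)
    nominal_stop : nominal position of the variable stop border
    abs_offset   : signalled offset of the absolute border from nominal
    rel_lengths  : segment lengths, walking backwards from the absolute border
    The first segment is not signalled; it fills the remaining span.
    """
    abs_border = nominal_stop + abs_offset
    borders = [abs_border]
    position = abs_border
    for length in rel_lengths:
        position -= length
        if position <= start:
            raise ValueError("relative borders exceed the granule")
        borders.append(position)
    borders.append(start)
    borders.reverse()
    return list(zip(borders[:-1], borders[1:]))

# Stop border moved 2 steps past nominal, two relative borders signalled.
segments = decode_fixvar(start=0, nominal_stop=16, abs_offset=2, rel_lengths=[4, 2])
```

With an empty `rel_lengths` the granule decodes to a single segment, matching the statement that a zero number of relative borders means one time segment only.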
  • Fig 3b shows an alternative coding of the signal.
  • the variable boundary offers versatility when grouping the segments at a given global grid.
  • some payload control can be performed at this level, e.g. to equalize the number of bits per granule. This may ease the operation of the lowband encoder.
  • Given enough look-ahead, a multipass encoding can be performed, and the optimum combination of local grids be used.
  • the absolute border, in addition to the above function, serves to align a group of borders around the transient with the precision T_q.
  • the highest precision is always available for coding of transient leading flanks, and a coarser resolution is used in the tracking of the decay.
  • the VarVar class frames use a combination of the FixVar and VarFix signalling, e.g. interleaved: [class, abs. bord. left, ditto right, num. rel. bord. left, ditto right, [rel. bord. left 0, ..., rel. bord. left N - 1], [ditto right]].
  • This class offers the greatest flexibility in the local grid selection, at the cost of an increased signalling overhead.
  • the FixFix class does not require other signals than the class signal per se, in which case for example two (equal length) segments are used. However, it is feasible to add a signal that enables selection within a set of predefined grids.
  • the spectral envelope can be calculated for two segments, and if the two envelopes do not differ more than a certain amount, only one set of envelope data is sent. So far, only the segmenting in time has been described. For many reasons, it may be desirable to signal to the decoder which of the borders corresponds to a transient leading edge. This can be accomplished by sending a "pointer" that points to the relevant border. The reference direction can follow that of the relative borders, and a zero value implies that no transient start is present within the current granule. Furthermore, the frequency resolution (number of power estimates or predictor order) used for the individual segments must also be defined. This can be signalled explicitly, as in the "baseline system", or implicitly, i.e. the resolution is coupled to the segment lengths, and possibly the pointer position.
  • the second system, hereinafter referred to as the "position-signalling system", is intended for very low bitrate applications.
  • the previously established design rules are used to a greater extent, in order to reduce the number of control signal bits even further.
  • a transient detector operating on intervals of length N, located Ni l ahead of the current granule, is employed, Fig. 4b
  • a flag associated with this region is set.
  • the transient detector has detected a transient in subgranule 2 at time n - 1, and a transient in subgranule 3 at time n.
  • These positions, pos(n - 1) and pos(n), as well as the corresponding flags, flag(n - 1) and flag(n), are used as input to the grid generation algorithm, and the corresponding local grid for granule n might be as shown in Fig. 4c.
  • subgranule 3 of the granule at time n - 1 is included in the time/frequency grid of granule n.
  • the only signals fed to the bitstream are flag(n) [1 bit] and pos(n) [ceil(log2(N)) bits].
  • the grid algorithm is also known by the decoder, hence those signals, together with the corresponding signals of the preceding granule n - 1, are sufficient for unambiguous reconstruction of the grid used by the encoder.
  • the position signal is obsolete, and can be replaced, for example by a 1 bit signal, stating whether one or two segments are used.
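The control overhead of the position-signalling system can be tallied with a small sketch. It assumes the position field occupies ceil(log2(N)) bits for N subgranules, and that the position is replaced by a single selector bit when no transient is flagged (the example the text gives). The function names are hypothetical.

```python
def position_bits(num_subgranules):
    """Bits needed to address one of N subgranules, i.e. ceil(log2(N)).

    Computed with integer arithmetic to avoid floating point rounding.
    """
    if num_subgranules < 1:
        raise ValueError("need at least one subgranule")
    return (num_subgranules - 1).bit_length()

def control_bits(num_subgranules, transient_flag):
    """Control bits per granule: 1 flag bit plus position or selector."""
    if transient_flag:
        return 1 + position_bits(num_subgranules)
    # Assumed alternative from the text: a 1-bit signal stating whether
    # one or two segments are used.
    return 1 + 1

bits_with_transient = control_bits(8, True)
bits_without_transient = control_bits(8, False)
```

Compared with the S - 1 border bits of the baseline system, the cost here grows only logarithmically with the number of subgranules.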
  • uniform mode operation is identical to that of the class signalling system.
  • This system may be viewed as a finite state machine, where the above described signals control the transitions from state to state, and the states define the local grids.
  • the states can be represented by tables, stored in both the encoder and the decoder. Since the grids are hard coded, the ability to adaptively alter the payload has been sacrificed. A reasonable approach is to keep the time/frequency data matrix size (e.g. number of power estimates) approximately constant. Assuming that the number of scalefactors or coefficients in a high resolution segment is two times that of a low resolution segment, one high resolution segment can be traded for two low resolution segments.
  • Time/Frequency Switched Scalefactor Encoding. Utilizing a time to frequency transform it can be shown that a pulse in the time domain corresponds to a flat spectrum in the frequency domain, and a "pulse" in the frequency domain, i.e. a single sinusoid, corresponds to a quasi-stationary signal in the time domain. In other words a signal usually shows more transient properties in one domain than the other. In a spectrogram, i.e. a time/frequency matrix display, this property is evident, and can advantageously be used when coding spectral envelopes.
  • a tonal stationary signal can have a very sparse spectrum not suitable for delta coding in the frequency-direction, but well suited for delta coding in the time-direction, and vice versa. This is displayed in Fig.
  • T/F-coding a time/frequency switching method, hereinafter referred to as T/F-coding:
  • the scalefactors are quantized and coded both in the time- and frequency-direction. For both cases, the required number of bits is calculated for a given coding error, or the error is calculated for a given number of bits. Based upon this, the most beneficial coding direction is selected.
  • DPCM and Huffman redundancy coding can be used. Two vectors are calculated, D_f and D_t:
  • Start values are transmitted whenever the spectral envelope is coded in the frequency direction but not when coded in the time direction since they are available at the decoder, through the previous envelope.
  • the proposed algorithm also requires extra information to be transmitted, namely a time/frequency flag indicating in which direction the spectral envelope was coded.
  • the T/F algorithm can advantageously be used with several different coding schemes of the scalefactor-envelope representation apart from DPCM and Huffman, such as ADPCM, LPC and vector quantisation.
  • the proposed T/F algorithm gives significant bitrate-reduction for the spectral-envelope data.
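The direction switch can be sketched as follows. This is an illustration under stated assumptions, not the patent's coder: `code_length` is a toy stand-in for Huffman code lengths, `START_VALUE_BITS` is an assumed fixed cost for the start value, and the two envelopes are assumed to share the same frequency resolution so that time differences are defined band by band.

```python
START_VALUE_BITS = 6  # assumed fixed-length code for the start value

def code_length(delta):
    """Toy proxy for a Huffman table: small differences are cheap."""
    return 1 + 2 * abs(delta)

def choose_direction(env, prev_env):
    """Delta-code quantized scalefactors in both directions, keep the cheaper.

    Returns (direction, payload, total_bits); total_bits includes the
    1-bit time/frequency flag.
    """
    if len(env) != len(prev_env):
        raise ValueError("envelopes must have the same number of bands")
    d_f = [env[k] - env[k - 1] for k in range(1, len(env))]   # along frequency
    d_t = [cur - prev for cur, prev in zip(env, prev_env)]    # along time
    bits_f = START_VALUE_BITS + sum(code_length(d) for d in d_f)
    bits_t = sum(code_length(d) for d in d_t)  # no start value needed
    if bits_t < bits_f:
        return "time", d_t, bits_t + 1
    return "freq", [env[0]] + d_f, bits_f + 1

# Sparse tonal spectrum that barely changes over time: time direction wins.
tonal = choose_direction([0, 9, 0, 8, 1], [0, 9, 0, 8, 0])
# Flat spectrum that jumps in level (transient): frequency direction wins.
transient = choose_direction([7, 7, 8, 7, 7], [1, 1, 1, 1, 1])
```

The time direction omits the start value because the decoder already holds the previous envelope, as the text notes.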
  • the analogue input signal is fed to an A/D-converter 601, forming a digital signal.
  • the digital audio signal is fed to a perceptual audio encoder 602, where source coding is performed.
  • the digital signal is fed to a transient detector 603 and to an analysis filterbank 604, which splits the signal into its spectral equivalents (subband signals).
  • the transient detector could operate on the subband signals from the analysis bank, but for generality purposes it is here assumed to operate on the digital time domain samples directly.
  • the transient detector divides the signal into granules and determines, according to the invention, whether subgranules within the granules are to be flagged as transient.
  • This information is sent to the envelope grouping block 605, which specifies the time/frequency grid to be used for the current granule.
  • the block combines the uniform sampled subband signals, to form the non-uniform sampled envelope values.
  • these values may represent the average power density of the grouped subband samples.
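The grouping step can be sketched as a mean-power computation over a time/frequency grid. The names `group_envelope`, `time_borders` and `band_borders` are hypothetical, and for brevity the sketch uses the same band borders for every time segment, whereas the text allows the frequency resolution to differ per segment.

```python
def group_envelope(subbands, time_borders, band_borders):
    """Average power of uniformly sampled subband signals over a grid.

    subbands[t][k] : subband sample at time slot t, filterbank channel k
    time_borders   : segment borders in time slots, e.g. [0, 2, 4]
    band_borders   : band borders in channels, e.g. [0, 2, 4]
    Returns envelope[segment][band] as mean power per grouped sample.
    """
    envelope = []
    for t0, t1 in zip(time_borders[:-1], time_borders[1:]):
        row = []
        for k0, k1 in zip(band_borders[:-1], band_borders[1:]):
            total = sum(abs(subbands[t][k]) ** 2
                        for t in range(t0, t1)
                        for k in range(k0, k1))
            row.append(total / ((t1 - t0) * (k1 - k0)))
        envelope.append(row)
    return envelope

# 4 time slots x 4 channels, grouped into 2 segments x 2 bands.
samples = [[1, 1, 2, 2],
           [1, 1, 2, 2],
           [3, 3, 0, 0],
           [3, 3, 0, 0]]
envelope = group_envelope(samples, [0, 2, 4], [0, 2, 4])
```

Using `abs(...) ** 2` keeps the sketch valid for complex-valued subband samples as well as real ones.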
  • the envelope values are, together with the grouping information, fed to the envelope encoder block 606.
  • This block decides in which direction (time or frequency) to encode the envelope values.
  • the resulting signals, the output from the audio encoder, the wideband envelope information, and the control signals are fed to the multiplexer 607, forming a serial bitstream that is transmitted or stored.
  • the decoder side of the invention is shown in Fig.
  • the demultiplexer 701 restores the signals and feeds the appropriate part to an audio decoder 702, which produces a low band digital audio signal.
  • the envelope information is fed from the demultiplexer to the envelope decoding block 703, which, by use of control data, determines in which direction the current envelope is coded and decodes the data.
  • the low band signal from the audio decoder is routed to the transposition module 704, which generates a replicated high band signal from the low band.
  • the high band signal is fed to an analysis filterbank 706, which is of the same type as on the encoder side.
  • the subband signals are combined in the scalefactor grouping unit 707.
  • the same type of combination and time/frequency dist ⁇ bution of the subband samples is adopted as on the encoder side.
  • the envelope information from the demultiplexer and the information from the scalefactor grouping unit is processed in the gam control module 708.
  • the module computes gam factors to be applied to the subband samples before recombination in the synthesis filterbank block 709.
  • the output from the synthesis filterbank is thus an envelope adjusted high band audio signal.
  • This signal is added to the output from the delay unit 705, which is fed with the low band audio signal. The delay compensates for the processing time of the high band signal.
  • the obtained digital wideband signal is converted to an analogue audio signal in the digital to analogue converter 710.
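
The envelope path described above (grouping in block 605, direction choice in block 606, gain computation in module 708) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the border-list representation of the time/frequency grid, the sum-of-absolute-deltas cost used to pick the coding direction, and the square-root gain rule are all assumptions made for illustration; the text only states that envelope values may represent average power, that a direction (time or frequency) is chosen, and that gain factors are computed.

```python
import math


def group_envelope(power, time_borders, freq_borders):
    """Average subband power over each cell of a time/frequency grid.

    power[t][k] is the power of subband k at time slot t (uniform sampling).
    The border lists are a hypothetical way to describe the grid.
    """
    envelope = []
    for t0, t1 in zip(time_borders, time_borders[1:]):
        row = []
        for k0, k1 in zip(freq_borders, freq_borders[1:]):
            cell = [power[t][k] for t in range(t0, t1) for k in range(k0, k1)]
            row.append(sum(cell) / len(cell))
        envelope.append(row)
    return envelope


def choose_direction(prev_levels, levels):
    """Pick delta coding along time or frequency (assumed cost: sum of |delta|).

    Both inputs are quantized envelope levels with the same frequency
    resolution. In the frequency direction the first level is kept as is.
    """
    d_freq = [levels[0]] + [b - a for a, b in zip(levels, levels[1:])]
    d_time = [b - a for a, b in zip(prev_levels, levels)]
    cost_freq = sum(abs(d) for d in d_freq)
    cost_time = sum(abs(d) for d in d_time)
    if cost_time < cost_freq:
        return "time", d_time
    return "freq", d_freq


def gain_factors(target_env, measured_env, floor=1e-12):
    """Amplitude gain per grid cell so the measured power matches the target.

    Assumed rule: gain = sqrt(target / measured), with a small floor on the
    measured power to avoid division by zero.
    """
    return [
        [math.sqrt(t / max(m, floor)) for t, m in zip(t_row, m_row)]
        for t_row, m_row in zip(target_env, measured_env)
    ]


# Four time slots by four subbands, grouped into a 2 x 2 grid.
power = [
    [1.0, 1.0, 4.0, 4.0],
    [1.0, 1.0, 4.0, 4.0],
    [9.0, 9.0, 16.0, 16.0],
    [9.0, 9.0, 16.0, 16.0],
]
print(group_envelope(power, [0, 2, 4], [0, 2, 4]))  # [[1.0, 4.0], [9.0, 16.0]]
print(choose_direction([10, 12], [10, 13]))         # ('time', [0, 1])
print(choose_direction([0, 0], [20, 21]))           # ('freq', [20, 1])
print(gain_factors([[4.0, 16.0]], [[1.0, 4.0]]))    # [[2.0, 2.0]]
```

In this sketch a stationary passage favours the time direction, because consecutive envelopes differ little, while an onset after silence favours the frequency direction. A real encoder would more likely compare actual code lengths, for example from Huffman tables, than this simple proxy.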

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Ultrasonic Diagnosis Equipment (AREA)
  • Stabilization Of Oscillator, Synchronisation, Frequency Synthesizers (AREA)
  • Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
  • Electrophonic Musical Instruments (AREA)
PCT/SE2000/001887 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching Ceased WO2001026095A1 (en)

Priority Applications (8)

Application Number Priority Date Filing Date Title
BRPI0014642A BRPI0014642B1 (pt) 1999-10-01 2000-09-29 codificação de envelope espectral usando resolução de tempo-frequência variável e mudança de tempo-frequência
JP2001528974A JP4035631B2 (ja) 1999-10-01 2000-09-29 可変時間/周波数分解能および時間/周波数切り替えを使用する効率的なスペクトルエンベロープ符号化
PT00968271T PT1216474E (pt) 1999-10-01 2000-09-29 Codificacao eficiente de envolvente especial utilizando resolucao tempo/frequencia variavel
DE60012198T DE60012198T2 (de) 1999-10-01 2000-09-29 Kodierung der hüllkurve des spektrums mittels variabler zeit/frequenz-auflösung
AT00968271T ATE271250T1 (de) 1999-10-01 2000-09-29 Kodierung der hüllkurve des spektrums mittels variabler zeit/frequenz-auflösung
EP00968271A EP1216474B1 (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution
AU78212/00A AU7821200A (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
HK03101398.3A HK1049401B (zh) 1999-10-01 2000-09-29 有效頻譜包絡編碼方法及其編解碼設備

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
SE9903552-9 1999-10-01
SE9903552A SE9903552D0 (sv) 1999-01-27 1999-10-01 Efficient spectral envelope coding using dynamic scalefactor grouping and time/frequency switching
PCT/SE2000/000158 WO2000045378A2 (en) 1999-01-27 2000-01-26 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
SEPCT/SE00/00158 2000-01-26

Publications (1)

Publication Number Publication Date
WO2001026095A1 true WO2001026095A1 (en) 2001-04-12

Family

ID=20417226

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2000/001887 Ceased WO2001026095A1 (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching

Country Status (14)

Country Link
US (3) US6978236B1
EP (1) EP1216474B1
JP (3) JP4035631B2
CN (1) CN1172293C
AT (1) ATE271250T1
AU (1) AU7821200A
BR (1) BRPI0014642B1
DE (1) DE60012198T2
DK (1) DK1216474T3
ES (1) ES2223591T3
HK (1) HK1049401B
PT (1) PT1216474E
RU (1) RU2236046C2
WO (1) WO2001026095A1

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004264814A (ja) * 2002-09-04 2004-09-24 Microsoft Corp 純可逆的音声圧縮における技術革新
WO2006000951A1 (en) * 2004-06-21 2006-01-05 Koninklijke Philips Electronics N.V. Method of audio encoding
JPWO2005036527A1 (ja) * 2003-10-07 2006-12-21 松下電器産業株式会社 スペクトル包絡線符号化のための時間境界及び周波数分解能の決定方法
US7246065B2 (en) 2002-01-30 2007-07-17 Matsushita Electric Industrial Co., Ltd. Band-division encoder utilizing a plurality of encoding units
WO2008046505A1 (de) * 2006-10-18 2008-04-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Kodierung eines informationssignals
US7668711B2 (en) 2004-04-23 2010-02-23 Panasonic Corporation Coding equipment
WO2010003546A3 (en) * 2008-07-11 2010-03-04 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E .V. An apparatus and a method for calculating a number of spectral envelopes
US7742927B2 (en) 2000-04-18 2010-06-22 France Telecom Spectral enhancing method and device
WO2011000780A1 (en) * 2009-06-29 2011-01-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Bandwidth extension encoder, bandwidth extension decoder and phase vocoder
US8041578B2 (en) 2006-10-18 2011-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8073050B2 (en) 2007-03-09 2011-12-06 Fujitsu Limited Encoding device and encoding method
US8108221B2 (en) 2002-09-04 2012-01-31 Microsoft Corporation Mixed lossless audio compression
US8126721B2 (en) 2006-10-18 2012-02-28 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8249882B2 (en) 2006-11-24 2012-08-21 Fujitsu Limited Decoding apparatus and decoding method
US8275626B2 (en) 2008-07-11 2012-09-25 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for decoding an encoded audio signal
US8386271B2 (en) 2008-03-25 2013-02-26 Microsoft Corporation Lossless and near lossless scalable audio codec
US8417532B2 (en) 2006-10-18 2013-04-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US9131290B2 (en) 2011-03-02 2015-09-08 Fujitsu Limited Audio coding device, audio coding method, and computer-readable recording medium storing audio coding computer program
RU2660633C2 (ru) * 2013-06-10 2018-07-06 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Устройство и способ для кодирования, обработки и декодирования огибающей аудиосигнала путем разделения огибающей аудиосигнала с использованием квантования и кодирования распределения
EP1869774B1 (en) * 2005-04-13 2019-06-05 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Adaptive grouping of parameters for enhanced coding efficiency

Families Citing this family (106)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002058052A1 (en) * 2001-01-19 2002-07-25 Koninklijke Philips Electronics N.V. Wideband signal transmission system
US7711123B2 (en) * 2001-04-13 2010-05-04 Dolby Laboratories Licensing Corporation Segmenting audio signals into auditory events
JP3469567B2 (ja) * 2001-09-03 2003-11-25 三菱電機株式会社 音響符号化装置、音響復号化装置、音響符号化方法及び音響復号化方法
DE60202881T2 (de) * 2001-11-29 2006-01-19 Coding Technologies Ab Wiederherstellung von hochfrequenzkomponenten
US20030187663A1 (en) 2002-03-28 2003-10-02 Truman Michael Mead Broadband frequency translation for high frequency regeneration
SE0301273D0 (sv) * 2003-04-30 2003-04-30 Coding Technologies Sweden Ab Advanced processing based on a complex-exponential-modulated filterbank and adaptive time signalling methods
DE602004032587D1 (de) * 2003-09-16 2011-06-16 Panasonic Corp Codierungsvorrichtung und Decodierungsvorrichtung
CN1875402B (zh) * 2003-10-30 2012-03-21 皇家飞利浦电子股份有限公司 音频信号编码或解码
KR20060132697A (ko) * 2004-02-16 2006-12-21 코닌클리케 필립스 일렉트로닉스 엔.브이. 트랜스코더 및 트랜스코딩 방법
JP4355745B2 (ja) * 2004-03-17 2009-11-04 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ オーディオ符号化
US7720230B2 (en) * 2004-10-20 2010-05-18 Agere Systems, Inc. Individual channel shaping for BCC schemes and the like
KR100657916B1 (ko) * 2004-12-01 2006-12-14 삼성전자주식회사 주파수 대역간의 유사도를 이용한 오디오 신호 처리 장치및 방법
KR100721537B1 (ko) * 2004-12-08 2007-05-23 한국전자통신연구원 광대역 음성 부호화기의 고대역 음성 부호화 장치 및 그방법
CN101107650B (zh) * 2005-01-14 2012-03-28 松下电器产业株式会社 语音切换装置及语音切换方法
US20060235683A1 (en) 2005-04-13 2006-10-19 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Lossless encoding of information with guaranteed maximum bitrate
US7788106B2 (en) * 2005-04-13 2010-08-31 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Entropy coding with compact codebooks
US8612236B2 (en) * 2005-04-28 2013-12-17 Siemens Aktiengesellschaft Method and device for noise suppression in a decoded audio signal
EP1742509B1 (en) * 2005-07-08 2013-08-14 Oticon A/S A system and method for eliminating feedback and noise in a hearing device
DE102005032724B4 (de) * 2005-07-13 2009-10-08 Siemens Ag Verfahren und Vorrichtung zur künstlichen Erweiterung der Bandbreite von Sprachsignalen
US8473298B2 (en) * 2005-11-01 2013-06-25 Apple Inc. Pre-resampling to achieve continuously variable analysis time/frequency resolution
JP4876574B2 (ja) 2005-12-26 2012-02-15 ソニー株式会社 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体
KR101364979B1 (ko) * 2006-02-24 2014-02-20 오렌지 신호 엔벨로프의 양자화 인덱스들의 이진 코딩 방법과 신호엔벨로프의 디코딩 방법, 및 대응하는 코딩 모듈과 디코딩모듈
US7590523B2 (en) * 2006-03-20 2009-09-15 Mindspeed Technologies, Inc. Speech post-processing using MDCT coefficients
US9159333B2 (en) 2006-06-21 2015-10-13 Samsung Electronics Co., Ltd. Method and apparatus for adaptively encoding and decoding high frequency band
JP5093514B2 (ja) 2006-07-07 2012-12-12 日本電気株式会社 オーディオ符号化装置、オーディオ符号化方法およびそのプログラム
JP4757158B2 (ja) * 2006-09-20 2011-08-24 富士通株式会社 音信号処理方法、音信号処理装置及びコンピュータプログラム
RU2426179C2 (ru) * 2006-10-10 2011-08-10 Квэлкомм Инкорпорейтед Способ и устройство для кодирования и декодирования аудиосигналов
JP4918841B2 (ja) * 2006-10-23 2012-04-18 富士通株式会社 符号化システム
US8295507B2 (en) 2006-11-09 2012-10-23 Sony Corporation Frequency band extending apparatus, frequency band extending method, player apparatus, playing method, program and recording medium
JP5141180B2 (ja) * 2006-11-09 2013-02-13 ソニー株式会社 周波数帯域拡大装置及び周波数帯域拡大方法、再生装置及び再生方法、並びに、プログラム及び記録媒体
US20080243518A1 (en) * 2006-11-16 2008-10-02 Alexey Oraevsky System And Method For Compressing And Reconstructing Audio Files
JP4967618B2 (ja) * 2006-11-24 2012-07-04 富士通株式会社 復号化装置および復号化方法
US20080208575A1 (en) * 2007-02-27 2008-08-28 Nokia Corporation Split-band encoding and decoding of an audio signal
JP4871894B2 (ja) * 2007-03-02 2012-02-08 パナソニック株式会社 符号化装置、復号装置、符号化方法および復号方法
WO2008114080A1 (en) * 2007-03-16 2008-09-25 Nokia Corporation Audio decoding
US8630863B2 (en) * 2007-04-24 2014-01-14 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding audio/speech signal
US20090006081A1 (en) * 2007-06-27 2009-01-01 Samsung Electronics Co., Ltd. Method, medium and apparatus for encoding and/or decoding signal
US8788264B2 (en) * 2007-06-27 2014-07-22 Nec Corporation Audio encoding method, audio decoding method, audio encoding device, audio decoding device, program, and audio encoding/decoding system
EP3288028B1 (en) * 2007-08-27 2019-07-03 Telefonaktiebolaget LM Ericsson (publ) Low-complexity spectral analysis/synthesis using selectable time resolution
EP2186090B1 (en) 2007-08-27 2016-12-21 Telefonaktiebolaget LM Ericsson (publ) Transient detector and method for supporting encoding of an audio signal
CN101471072B (zh) * 2007-12-27 2012-01-25 华为技术有限公司 高频重建方法、编码装置和解码装置
US9159325B2 (en) * 2007-12-31 2015-10-13 Adobe Systems Incorporated Pitch shifting frequencies
EP2242048B1 (en) * 2008-01-09 2017-06-14 LG Electronics Inc. Method and apparatus for identifying frame type
KR101413968B1 (ko) * 2008-01-29 2014-07-01 삼성전자주식회사 오디오 신호의 부호화, 복호화 방법 및 장치
KR101441897B1 (ko) * 2008-01-31 2014-09-23 삼성전자주식회사 잔차 신호 부호화 방법 및 장치와 잔차 신호 복호화 방법및 장치
KR101230479B1 (ko) * 2008-03-10 2013-02-06 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. 트랜지언트 이벤트를 갖는 오디오 신호를 조작하기 위한 장치 및 방법
EP2346029B1 (en) * 2008-07-11 2013-06-05 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder, method for encoding an audio signal and corresponding computer program
JP5244971B2 (ja) * 2008-07-11 2013-07-24 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン オーディオ信号合成器及びオーディオ信号符号器
MY154452A (en) * 2008-07-11 2015-06-15 Fraunhofer Ges Forschung An apparatus and a method for decoding an encoded audio signal
US8326640B2 (en) * 2008-08-26 2012-12-04 Broadcom Corporation Method and system for multi-band amplitude estimation and gain control in an audio CODEC
JP5555707B2 (ja) * 2008-10-08 2014-07-23 フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン マルチ分解能切替型のオーディオ符号化及び復号化スキーム
CN101751926B (zh) * 2008-12-10 2012-07-04 华为技术有限公司 信号编码、解码方法及装置、编解码系统
EP2360687A4 (en) * 2008-12-19 2012-07-11 Fujitsu Ltd LANGUAGE EXPANSION DEVICE AND LANGUAGE TREATMENT PROCESS
PL3992966T3 (pl) 2009-01-16 2023-03-20 Dolby International Ab Transpozycja harmonicznych rozszerzona o iloczyn wektorowy
MX2011007925A (es) * 2009-01-28 2011-08-17 Dten Forschung E V Fraunhofer Ges Zur Foeerderung Der Angewan Codificador de audio, decodificador de audio, información de audio codificada, métodos para la codificación y decodificación de una señal de audio y programa de computadora.
EP2214165A3 (en) * 2009-01-30 2010-09-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for manipulating an audio signal comprising a transient event
WO2010102446A1 (zh) * 2009-03-11 2010-09-16 华为技术有限公司 一种线性预测分析方法、装置及系统
BR122019023947B1 (pt) 2009-03-17 2021-04-06 Dolby International Ab Sistema codificador, sistema decodificador, método para codificar um sinal estéreo para um sinal de fluxo de bits e método para decodificar um sinal de fluxo de bits para um sinal estéreo
JP4932917B2 (ja) * 2009-04-03 2012-05-16 株式会社エヌ・ティ・ティ・ドコモ 音声復号装置、音声復号方法、及び音声復号プログラム
CN101866649B (zh) * 2009-04-15 2012-04-04 华为技术有限公司 语音编码处理方法与装置、语音解码处理方法与装置、通信系统
US11657788B2 (en) 2009-05-27 2023-05-23 Dolby International Ab Efficient combined harmonic transposition
TWI675367B (zh) 2009-05-27 2019-10-21 瑞典商杜比國際公司 從訊號的低頻成份產生該訊號之高頻成份的系統與方法,及其機上盒、電腦程式產品、軟體程式及儲存媒體
WO2011048010A1 (en) 2009-10-19 2011-04-28 Dolby International Ab Metadata time marking information for indicating a section of an audio object
BR122022013454B1 (pt) 2009-10-20 2023-05-16 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Codificador de áudio, decodificador de áudio, método para codificar uma informação de áudio, método para decodificar uma informação de áudio que utiliza uma detecção de um grupo de valores espectrais previamente decodificados
HUE071544T2 (hu) 2009-10-21 2025-09-28 Dolby Int Ab Túlmintavételezés kombinált transzponáló szûrõbankban
TWI484473B (zh) 2009-10-30 2015-05-11 Dolby Int Ab 用於從編碼位元串流擷取音訊訊號之節奏資訊、及估算音訊訊號之知覺顯著節奏的方法及系統
AU2011206677B9 (en) 2010-01-12 2014-12-11 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder, audio decoder, method for encoding and decoding an audio information, and computer program obtaining a context sub-region value on the basis of a norm of previously decoded spectral values
EP2372704A1 (en) * 2010-03-11 2011-10-05 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. Signal processor and method for processing a signal
JP5850216B2 (ja) * 2010-04-13 2016-02-03 ソニー株式会社 信号処理装置および方法、符号化装置および方法、復号装置および方法、並びにプログラム
US9047875B2 (en) * 2010-07-19 2015-06-02 Futurewei Technologies, Inc. Spectrum flatness control for bandwidth extension
JP5712293B2 (ja) * 2010-08-25 2015-05-07 インディアン インスティテュート オブ サイエンスIndian Institute Of Science 不均一な間隔の周波数での有限長シーケンスのスペクトルサンプルの決定
WO2012037515A1 (en) * 2010-09-17 2012-03-22 Xiph. Org. Methods and systems for adaptive time-frequency resolution in digital data coding
JP5707842B2 (ja) * 2010-10-15 2015-04-30 ソニー株式会社 符号化装置および方法、復号装置および方法、並びにプログラム
JP5724338B2 (ja) * 2010-12-03 2015-05-27 ソニー株式会社 符号化装置および符号化方法、復号装置および復号方法、並びにプログラム
WO2012122297A1 (en) 2011-03-07 2012-09-13 Xiph. Org. Methods and systems for avoiding partial collapse in multi-block audio coding
US8838442B2 (en) 2011-03-07 2014-09-16 Xiph.org Foundation Method and system for two-step spreading for tonal artifact avoidance in audio coding
WO2012122299A1 (en) 2011-03-07 2012-09-13 Xiph. Org. Bit allocation and partitioning in gain-shape vector quantization for audio coding
CN102800317B (zh) * 2011-05-25 2014-09-17 华为技术有限公司 信号分类方法及设备、编解码方法及设备
RU2464649C1 (ru) * 2011-06-01 2012-10-20 Корпорация "САМСУНГ ЭЛЕКТРОНИКС Ко., Лтд." Способ обработки звукового сигнала
JP5807453B2 (ja) * 2011-08-30 2015-11-10 富士通株式会社 符号化方法、符号化装置および符号化プログラム
EP2767977A4 (en) 2011-10-21 2015-04-29 Samsung Electronics Co Ltd METHOD AND DEVICE FOR LOSS-FREE ENERGY CODING, AUDIO CODING METHOD AND DEVICE, METHOD AND APPARATUS FOR LOSS-FREE ENERGY DECODING AND AUDIO CODING METHOD AND DEVICE
JP5997592B2 (ja) 2012-04-27 2016-09-28 株式会社Nttドコモ 音声復号装置
EP2682941A1 (de) * 2012-07-02 2014-01-08 Technische Universität Ilmenau Vorrichtung, Verfahren und Computerprogramm für frei wählbare Frequenzverschiebungen in der Subband-Domäne
EP2717261A1 (en) 2012-10-05 2014-04-09 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder and methods for backward compatible multi-resolution spatial-audio-object-coding
ES2659001T3 (es) 2013-01-29 2018-03-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Codificadores de audio, decodificadores de audio, sistemas, métodos y programas informáticos que utilizan una resolución temporal aumentada en la proximidad temporal de inicios o finales de fricativos o africados
CA2908625C (en) 2013-04-05 2017-10-03 Dolby International Ab Audio encoder and decoder
US10431243B2 (en) * 2013-04-11 2019-10-01 Nec Corporation Signal processing apparatus, signal processing method, signal processing program
KR101732059B1 (ko) 2013-05-15 2017-05-04 삼성전자주식회사 오디오 신호의 부호화, 복호화 방법 및 장치
MX353042B (es) * 2013-06-10 2017-12-18 Fraunhofer Ges Forschung Método y aparato para codificación, procesamiento y decodificación de envolvente de señal de audio mediante modelado de una representación de suma acumulativa que emplea cuantificación de distribución y codificación.
EP2830054A1 (en) * 2013-07-22 2015-01-28 Fraunhofer Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder, audio decoder and related methods using two-channel processing within an intelligent gap filling framework
EP2830058A1 (en) 2013-07-22 2015-01-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Frequency-domain audio coding supporting transform length switching
EP2830055A1 (en) * 2013-07-22 2015-01-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Context-based entropy coding of sample values of a spectral envelope
BR112016007515B1 (pt) * 2013-10-18 2021-11-16 Telefonaktiebolaget Lm Ericsson (Publ) Método de codificação de segmento de sinal de áudio, codificador de segmento de sinal de áudio, e, terminal de usuário.
US20150149157A1 (en) * 2013-11-22 2015-05-28 Qualcomm Incorporated Frequency domain gain shape estimation
WO2015124597A1 (en) 2014-02-18 2015-08-27 Dolby International Ab Estimating a tempo metric from an audio bit-stream
GB2528460B (en) 2014-07-21 2018-05-30 Gurulogic Microsystems Oy Encoder, decoder and method
WO2016024853A1 (ko) * 2014-08-15 2016-02-18 삼성전자 주식회사 음질 향상 방법 및 장치, 음성 복호화방법 및 장치와 이를 채용한 멀티미디어 기기
CN105261373B (zh) * 2015-09-16 2019-01-08 深圳广晟信源技术有限公司 用于带宽扩展编码的自适应栅格构造方法和装置
CN105280190B (zh) * 2015-09-16 2018-11-23 深圳广晟信源技术有限公司 带宽扩展编码和解码方法以及装置
JP6763194B2 (ja) * 2016-05-10 2020-09-30 株式会社Jvcケンウッド 符号化装置、復号装置、通信システム
EP3382700A1 (en) * 2017-03-31 2018-10-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for post-processing an audio signal using a transient location detection
CN110998722B (zh) * 2017-07-03 2023-11-10 杜比国际公司 低复杂性密集瞬态事件检测和译码
CN108828427B (zh) * 2018-03-19 2020-10-27 深圳市共进电子股份有限公司 信号完整性测试的判据查找方法、装置、设备及存储介质
CN111210832B (zh) * 2018-11-22 2024-06-04 广州广晟数码技术有限公司 基于频谱包络模板的带宽扩展音频编解码方法及装置
CN113571073A (zh) * 2020-04-28 2021-10-29 华为技术有限公司 一种线性预测编码参数的编码方法和编码装置
US20230162758A1 (en) * 2021-11-19 2023-05-25 Massachusetts Institute Of Technology Systems and methods for speech enhancement using attention masking and end to end neural networks

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5504832A (en) * 1991-12-24 1996-04-02 Nec Corporation Reduction of phase information in coding of speech
US5581653A (en) * 1993-08-31 1996-12-03 Dolby Laboratories Licensing Corporation Low bit-rate high-resolution spectral envelope coding for audio encoder and decoder
US5737718A (en) * 1994-06-13 1998-04-07 Sony Corporation Method, apparatus and recording medium for a coder with a spectral-shape-adaptive subband configuration
WO1998057436A2 (en) * 1997-06-10 1998-12-17 Lars Gustaf Liljeryd Source coding enhancement using spectral-band replication
US5852806A (en) * 1996-03-19 1998-12-22 Lucent Technologies Inc. Switched filterbank for use in audio signal coding

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6439897A (en) 1987-08-06 1989-02-10 Canon Kk Communication control unit
SG44675A1 (en) * 1990-03-09 1997-12-19 At & T Corp Hybrid perceptual audio coding
CN1062963C (zh) * 1990-04-12 2001-03-07 多尔拜实验特许公司 用于产生高质量声音信号的解码器和编码器
JP3088580B2 (ja) * 1993-02-19 2000-09-18 松下電器産業株式会社 変換符号化装置のブロックサイズ決定法
US5623577A (en) * 1993-07-16 1997-04-22 Dolby Laboratories Licensing Corporation Computationally efficient adaptive bit allocation for encoding method and apparatus with allowance for decoder spectral distortions
US6141353A (en) * 1994-09-15 2000-10-31 Oki Telecom, Inc. Subsequent frame variable data rate indication method for various variable data rate systems
US5682463A (en) * 1995-02-06 1997-10-28 Lucent Technologies Inc. Perceptual audio compression based on loudness uncertainty
JP3266819B2 (ja) * 1996-07-30 2002-03-18 株式会社エイ・ティ・アール人間情報通信研究所 周期信号変換方法、音変換方法および信号分析方法
JP3464371B2 (ja) 1996-11-15 2003-11-10 ノキア モービル フォーンズ リミテッド 不連続伝送中に快適雑音を発生させる改善された方法
SE9700772D0 (sv) * 1997-03-03 1997-03-03 Ericsson Telefon Ab L M A high resolution post processing method for a speech decoder
EP0878790A1 (en) 1997-05-15 1998-11-18 Hewlett-Packard Company Voice coding system and method
EP0915588A4 (en) * 1997-05-16 2004-10-27 Nippon Telegraph & Telephone METHOD, TRANSMITTER AND RECEIVER FOR TRANSMITTING FRAME WITH VARIABLE LENGTH
JP4216364B2 (ja) 1997-08-29 2009-01-28 株式会社東芝 音声符号化/復号化方法および音声信号の成分分離方法
DE19747132C2 (de) 1997-10-24 2002-11-28 Fraunhofer Ges Forschung Verfahren und Vorrichtungen zum Codieren von Audiosignalen sowie Verfahren und Vorrichtungen zum Decodieren eines Bitstroms
JP2000221988A (ja) * 1999-01-29 2000-08-11 Sony Corp データ処理装置、データ処理方法、プログラム提供媒体及び記録媒体
EP1047047B1 (en) * 1999-03-23 2005-02-02 Nippon Telegraph and Telephone Corporation Audio signal coding and decoding methods and apparatus and recording media with programs therefor
US6604070B1 (en) * 1999-09-22 2003-08-05 Conexant Systems, Inc. System of encoding and decoding speech signals

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOSI M. ET AL.: "Time versus Frequency Resolution in a Low-Rate, High Quality Audio Transform Coder", IEEE ASSP WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, FINAL PROGRAM AND PAPER SUMMARIES, 1991, pages 0-81 - 0-82, XP010255201 *
PRINCEN J. ET AL.: "Audio coding with signal adaptive filterbanks", INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP-95, vol. 5, 1995, pages 3071 - 3074, XP010151993 *

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7742927B2 (en) 2000-04-18 2010-06-22 France Telecom Spectral enhancing method and device
US8239208B2 (en) 2000-04-18 2012-08-07 France Telecom Sa Spectral enhancing method and device
US7246065B2 (en) 2002-01-30 2007-07-17 Matsushita Electric Industrial Co., Ltd. Band-division encoder utilizing a plurality of encoding units
US8108221B2 (en) 2002-09-04 2012-01-31 Microsoft Corporation Mixed lossless audio compression
US8630861B2 (en) 2002-09-04 2014-01-14 Microsoft Corporation Mixed lossless audio compression
JP2004264814A (ja) * 2002-09-04 2004-09-24 Microsoft Corp 純可逆的音声圧縮における技術革新
JP4767687B2 (ja) * 2003-10-07 2011-09-07 パナソニック株式会社 スペクトル包絡線符号化のための時間境界及び周波数分解能の決定方法
JPWO2005036527A1 (ja) * 2003-10-07 2006-12-21 松下電器産業株式会社 スペクトル包絡線符号化のための時間境界及び周波数分解能の決定方法
EP1672618A4 (en) * 2003-10-07 2008-06-25 Matsushita Electric Industrial Co Ltd METHOD OF DECISION OF THE TIME LIMIT FOR THE CODING OF THE SPECTRO-CASE AND FREQUENCY RESOLUTION
US7451091B2 (en) 2003-10-07 2008-11-11 Matsushita Electric Industrial Co., Ltd. Method for determining time borders and frequency resolutions for spectral envelope coding
US7668711B2 (en) 2004-04-23 2010-02-23 Panasonic Corporation Coding equipment
US8065139B2 (en) 2004-06-21 2011-11-22 Koninklijke Philips Electronics N.V. Method of audio encoding
JP2008503766A (ja) * 2004-06-21 2008-02-07 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ オーディオエンコードの方法
WO2006000951A1 (en) * 2004-06-21 2006-01-05 Koninklijke Philips Electronics N.V. Method of audio encoding
EP3503409A1 (en) * 2005-04-13 2019-06-26 Fraunhofer Gesellschaft zur Förderung der Angewand Adaptive grouping of parameters for enhanced coding efficiency
EP1869774B1 (en) * 2005-04-13 2019-06-05 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Adaptive grouping of parameters for enhanced coding efficiency
RU2413312C2 (ru) * 2006-10-18 2011-02-27 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Кодирование информационного сигнала
US8041578B2 (en) 2006-10-18 2011-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
NO341258B1 (no) * 2006-10-18 2017-09-25 Fraunhofer Ges Forschung Koding av et informasjonssignal
USRE50695E1 (en) 2006-10-18 2025-12-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8126721B2 (en) 2006-10-18 2012-02-28 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
USRE50768E1 (en) 2006-10-18 2026-01-27 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
USRE50801E1 (en) 2006-10-18 2026-02-17 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
AU2007312667B2 (en) * 2006-10-18 2010-09-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Coding of an information signal
USRE50654E1 (en) 2006-10-18 2025-11-04 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
WO2008046505A1 (de) * 2006-10-18 2008-04-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Kodierung eines informationssignals
US8417532B2 (en) 2006-10-18 2013-04-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
USRE50711E1 (en) 2006-10-18 2025-12-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8249882B2 (en) 2006-11-24 2012-08-21 Fujitsu Limited Decoding apparatus and decoding method
US8073050B2 (en) 2007-03-09 2011-12-06 Fujitsu Limited Encoding device and encoding method
US8386271B2 (en) 2008-03-25 2013-02-26 Microsoft Corporation Lossless and near lossless scalable audio codec
TWI415114B (zh) * 2008-07-11 2013-11-11 Fraunhofer Ges Forschung 用於計算頻譜包絡數目之裝置與方法
US8612214B2 (en) 2008-07-11 2013-12-17 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for generating bandwidth extension output data
KR101395252B1 (ko) * 2008-07-11 2014-05-15 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. 스펙트럼 포락선의 수를 산출하기 위한 장치 및 그 방법
KR101395250B1 (ko) * 2008-07-11 2014-05-15 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. 스펙트럼 포락선의 수를 산출하기 위한 장치 및 그 방법
KR101395257B1 (ko) * 2008-07-11 2014-05-15 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. 스펙트럼 포락선의 수를 산출하기 위한 장치 및 그 방법
US8296159B2 (en) 2008-07-11 2012-10-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for calculating a number of spectral envelopes
US8275626B2 (en) 2008-07-11 2012-09-25 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for decoding an encoded audio signal
WO2010003546A3 (en) * 2008-07-11 2010-03-04 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E .V. An apparatus and a method for calculating a number of spectral envelopes
WO2011000780A1 (en) * 2009-06-29 2011-01-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Bandwidth extension encoder, bandwidth extension decoder and phase vocoder
RU2563164C2 (ru) * 2009-06-29 2015-09-20 Фраунхофер-Гезелльшафт цур Фёердерунг дер ангевандтен Форшунг Е.Ф. Кодер расширения полосы пропускания, декодер расширения полосы пропускания и фазовый вокодер
KR101425157B1 (ko) * 2009-06-29 2014-08-01 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. 대역폭 확장 인코더, 대역폭 확장 디코더 및 위상 보코더
US8606586B2 (en) 2009-06-29 2013-12-10 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Bandwidth extension encoder for encoding an audio signal using a window controller
CN102473414B (zh) * 2009-06-29 2013-11-06 弗兰霍菲尔运输应用研究公司 带宽扩展编码器、带宽扩展解码器和相位声码器
CN102473414A (zh) * 2009-06-29 2012-05-23 弗兰霍菲尔运输应用研究公司 带宽扩展编码器、带宽扩展解码器和相位声码器
EP2273493A1 (en) * 2009-06-29 2011-01-12 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Bandwidth extension encoder, bandwidth extension decoder and phase vocoder
US9131290B2 (en) 2011-03-02 2015-09-08 Fujitsu Limited Audio coding device, audio coding method, and computer-readable recording medium storing audio coding computer program
RU2660633C2 (ru) * 2013-06-10 2018-07-06 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Устройство и способ для кодирования, обработки и декодирования огибающей аудиосигнала путем разделения огибающей аудиосигнала с использованием квантования и кодирования распределения

Also Published As

Publication number Publication date
JP4035631B2 (ja) 2008-01-23
PT1216474E (pt) 2004-11-30
DK1216474T3 (da) 2004-10-04
US20060031065A1 (en) 2006-02-09
JP2003529787A (ja) 2003-10-07
RU2236046C2 (ru) 2004-09-10
CN1172293C (zh) 2004-10-20
US7181389B2 (en) 2007-02-20
ATE271250T1 (de) 2004-07-15
AU7821200A (en) 2001-05-10
HK1049401B (zh) 2005-11-18
US20060031064A1 (en) 2006-02-09
EP1216474A1 (en) 2002-06-26
DE60012198D1 (de) 2004-08-19
JP4628921B2 (ja) 2011-02-09
DE60012198T2 (de) 2005-08-18
US7191121B2 (en) 2007-03-13
US6978236B1 (en) 2005-12-20
JP2006031053A (ja) 2006-02-02
JP2006065342A (ja) 2006-03-09
BR0014642A (pt) 2002-06-18
ES2223591T3 (es) 2005-03-01
JP4334526B2 (ja) 2009-09-30
CN1377499A (zh) 2002-10-30
BRPI0014642B1 (pt) 2016-04-26
EP1216474B1 (en) 2004-07-14
HK1049401A1 (en) 2003-05-09

Similar Documents

Publication Publication Date Title
EP1216474B1 (en) Efficient spectral envelope coding using variable time/frequency resolution
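KR100648760B1 (ko) Methods for improving high-frequency reconstruction techniques, and computer program recording medium storing a program for performing them
JP6368029B2 (ja) Noise signal processing method, noise signal generation method, encoder, decoder, and encoding and decoding system
CN105957532B (zh) Method and apparatus for encoding and decoding audio/speech signals
RU2752127C2 (ru) Improved quantizer
RU2740359C2 (ru) Audio encoding device and decoding device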
US9037454B2 (en) Efficient coding of overcomplete representations of audio using the modulated complex lapped transform (MCLT)
JP5719941B2 (ja) Efficient encoding/decoding of audio signals
WO2000045378A2 (en) Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
JP2008546021A (ja) Sub-band speech codec with multi-stage codebooks and redundant coding
WO2009059632A1 (en) An encoder
WO2025027078A1 (en) Coding and decoding audio signal
Ning Analysis and coding of high quality audio signals

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

ENP Entry into the national phase

Ref country code: US

Ref document number: 2001 763128

Date of ref document: 20010515

Kind code of ref document: A

Format of ref document f/p: F

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
WWE WIPO information: entry into national phase

Ref document number: 2000968271

Country of ref document: EP

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2001 528974

Kind code of ref document: A

Format of ref document f/p: F

WWE WIPO information: entry into national phase

Ref document number: 008136025

Country of ref document: CN

ENP Entry into the national phase

Ref country code: RU

Ref document number: 2002 2002111665

Kind code of ref document: A

Format of ref document f/p: F

WWP WIPO information: published in national office

Ref document number: 2000968271

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWG WIPO information: grant in national office

Ref document number: 2000968271

Country of ref document: EP

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (PCT application filed from 20040101)