EP2115741A1 - Improved coding/decoding of digital audio signals - Google Patents
Improved coding/decoding of digital audio signals
- Publication number
- EP2115741A1 (application EP08762010A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- subband
- band
- signal
- coding
- masking threshold
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/002—Dynamic bit allocation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
Definitions
- The present invention relates to the processing of sound data.
- This processing is adapted in particular to the transmission and / or storage of digital signals such as audio-frequency signals (speech, music, or other).
- waveform coding methods such as PCM ("Pulse Code Modulation") and ADPCM ("Adaptive Differential Pulse Code Modulation")
- PCM Pulse Code Modulation
- ADPCM Adaptive Differential Pulse Code Modulation
- CELP Code Excited Linear Prediction
- A sound signal such as a speech signal can be predicted from its recent past (for example from 8 to 12 samples at 8 kHz) using parameters evaluated over short windows (10 to 20 ms in this example).
- These short-term prediction parameters, representative of the transfer function of the vocal tract (for example to pronounce consonants), are obtained by LPC (Linear Prediction Coding) analysis methods.
- LPC Linear Prediction Coding
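The short-term prediction described above can be sketched with the classical autocorrelation method and the Levinson-Durbin recursion. This is only an illustrative sketch: the function names, the biased autocorrelation estimate and the sign convention (prediction written as a weighted sum of past samples) are assumptions, not details taken from the text.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) with x_hat[n] = sum(a[k-1] * x[n-k], k = 1..order)."""
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]                       # prediction error energy
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate frame (e.g. silence)
            break
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        refl = acc / err             # reflection coefficient
        new_a = a[:]
        new_a[i] = refl
        for k in range(1, i):
            new_a[k] = a[k] - refl * a[i - k]
        a = new_a
        err *= 1.0 - refl * refl
    return a[1:], err


# Example: a decaying exponential is predicted almost perfectly at order 1.
frame = [0.9 ** n for n in range(200)]
coeffs, residual = levinson_durbin(autocorr(frame, 2), 2)
```

For a 20 ms window at 8 kHz (160 samples), an order of 10 would fall in the "8 to 12 samples" range quoted above.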
- A longer-term correlation is also used to determine the periodicities of voiced sounds (e.g. vowels) due to the vibration of the vocal cords. This involves determining at least the fundamental frequency of the voiced signal, which typically varies from 60 Hz (deep voice) to 600 Hz (high voice) depending on the speaker.
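A minimal sketch of such a fundamental-frequency search, assuming a plain autocorrelation maximum over the lags that correspond to 60-600 Hz at an 8 kHz sampling rate. Real coders use more elaborate (open-loop and closed-loop) searches; the function name and defaults here are hypothetical.

```python
import math


def estimate_pitch_hz(frame, fs=8000, f_min=60.0, f_max=600.0):
    """Pick the lag with the largest autocorrelation in the pitch range."""
    lag_min = int(fs / f_max)                 # shortest period considered
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


# Synthetic "voiced" frame: a 200 Hz sinusoid, 50 ms long at 8 kHz.
frame = [math.sin(2 * math.pi * 200.0 * n / 8000) for n in range(400)]
pitch = estimate_pitch_hz(frame)
```

The pitch period (the winning lag, in samples) is the quantity that the long-term prediction parameters carry.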
- The LTP (long-term prediction) parameters, including the pitch period, represent the fundamental vibration of the speech signal (when it is voiced), while the LPC short-term prediction parameters represent the spectral envelope of this signal.
- All of these LPC and LTP parameters, resulting from speech coding, can be transmitted in blocks to a peer decoder via one or more telecommunication networks, in order to then restore the initial speech signal.
- In conventional speech coding, the encoder generates a fixed-rate bit stream. This rate constraint simplifies the implementation and use of the encoder and decoder. Examples of such systems are the ITU-T G.711 standard coding at 64 kbit/s, the ITU-T G.729 standard coding at 8 kbit/s, and the GSM-EFR coding at 12.2 kbit/s.
- In some applications (such as mobile telephony or VoIP, for "Voice over Internet Protocol"), it is preferable to generate a variable-rate bit stream, the rate values being taken from a predefined set. Such a coding technique, called "multi-rate", is therefore more flexible than a fixed-rate coding technique.
- Multi-rate coding techniques include: multi-mode coding controlled by the source and/or the channel, implemented in particular in the 3GPP AMR-NB, 3GPP AMR-WB and 3GPP2 VMR-WB coders; hierarchical (or "scalable") coding, which generates a so-called "hierarchical" bit stream because it comprises a core rate and one or more enhancement layers (the standardized G.722 coding at 48, 56 and 64 kbit/s is typically bit-rate scalable, while the ITU-T G.729.1 and MPEG-4 CELP codecs are scalable in both bit rate and bandwidth); and multiple-description coding, described in particular in: "A multiple description speech coder based on AMR-WB for mobile ad hoc networks", H. Dong, A. Gersho, J.D. Gibson, V. Cuperman, ICASSP, p. 277-280, vol. 1 (May 2004).
- Hierarchical coding, which can provide varied bit rates, is described below. It distributes the information relating to an audio signal to be coded into hierarchical subsets, so that this information can be used in order of importance in terms of audio rendering quality.
- The criterion used to determine the order is a criterion of optimizing (or rather of least degrading) the quality of the coded audio signal.
- Hierarchical coding is particularly suited to transmission over heterogeneous networks or having variable available rates over time, or to transmission to terminals with varying capacities.
- the bit stream includes a base layer and one or more enhancement layers.
- The base layer is generated by a low-rate (fixed) codec, known as the "core codec", which guarantees the minimum quality of the coding. This layer must be received by the decoder to maintain an acceptable level of quality. Enhancement layers are used to improve quality; however, they may not all be received by the decoder.
- the main advantage of hierarchical coding is that it allows an adaptation of the bit rate simply by "truncation of the bit stream".
- The number of layers (i.e. the number of possible truncations of the bit stream) defines the granularity of the coding.
- Bit-rate and bandwidth scalable coding techniques are described below, with a CELP-type core coder in the telephone band and one or more wideband enhancement layers.
- An example of such systems is given in the ITU-T G.729.1 standard, which is fine-grain scalable from 8 to 32 kbit/s.
- the G.729.1 coding / decoding algorithm is summarized below.
- The G.729.1 encoder is an extension of the ITU-T G.729 coder. It is a hierarchical encoder with a modified G.729 core, producing a signal whose band ranges from narrowband (50-4000 Hz) to wideband (50-7000 Hz) at a rate of 8 to 32 kbit/s for conversational services. This codec is compatible with existing VoIP equipment (most of which implements G.729). Finally, it should be noted that G.729.1 was approved in May 2006.
- the G.729.1 coder is shown schematically in FIG. 1.
- The wideband input signal $s_{WB}$, sampled at 16 kHz, is first split into two subbands by QMF ("Quadrature Mirror Filter") filtering.
- the low band (0-4000 Hz) is obtained by LP low-pass filtering (block 100) and decimation (block 101), and the high band (4000-8000 Hz) by HP high-pass filtering (block 102) and decimation (block 103).
- the LP and HP filters are of length 64.
- The low band is pre-processed by a high-pass filter eliminating the components below 50 Hz (block 104), to obtain the signal $s_{LB}$, before narrowband CELP coding (block 105) at 8 and 12 kbit/s.
- This high-pass filtering takes into account the fact that the useful band is defined as covering the range 50-7000 Hz.
- The narrowband CELP coding is a cascaded CELP coding comprising, as a first stage, a modified G.729 coding without pre-processing filter and, as a second stage, an additional fixed CELP codebook.
- The high band is first pre-processed (block 106) to compensate for the aliasing due to the high-pass filter (block 102) combined with the decimation (block 103).
- The high band is then filtered by a low-pass filter (block 107) eliminating the components between 3000 and 4000 Hz of the high band (i.e. the components between 7000 and 8000 Hz in the original signal) to obtain the signal $s_{HB}$.
- a band extension (block 108) is then performed.
- The error signal $d_{LB}$ of the low band is calculated (block 109) from the output of the CELP coder (block 105), and a predictive transform coding (for example of TDAC type, for "Time Domain Aliasing Cancellation", in the G.729.1 standard) is carried out at block 110.
- The TDAC coding is applied both to the error signal on the low band and to the filtered signal on the high band.
- Additional parameters can be transmitted by block 111 to a peer decoder, this block 111 performing a so-called "FEC" ("Frame Erasure Concealment") processing, in order to reconstruct any erased frames.
- the different bit streams generated by the coding blocks 105, 108, 110 and 111 are finally multiplexed and structured into a hierarchical bit stream in the multiplexing block 112.
- The coding is performed on blocks (or frames) of 20 ms, i.e. 320 samples per frame.
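- The G.729.1 codec therefore has a three-stage coding architecture comprising CELP coding (block 105), band extension (block 108) and TDAC predictive transform coding (block 110).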
- the homologous decoder according to the G.729.1 standard is illustrated in FIG. 2.
- the bits describing each 20 ms frame are demultiplexed in the block 200.
- the bit stream of the 8 and 12 kbit / s layers is used by the CELP decoder (block 201) to generate the narrow-band synthesis (0-4000 Hz).
- The portion of the bit stream associated with the 14 kbit/s layer is decoded by the band extension module (block 202).
- the portion of the bit stream associated with data rates greater than 14 kbit / s is decoded by the TDAC module (block 203).
- Pre-echo and post-echo processing is performed by blocks 204 and 207, together with enhancement (block 205) and post-processing of the low band (block 206).
- The wideband output signal $s_{WB}$, sampled at 16 kHz, is obtained via the QMF synthesis filter bank (blocks 209, 210, 211, 212 and 213) integrating the inverse aliasing (block 208).
- The TDAC-type transform coding in the G.729.1 encoder is illustrated in FIG. 3.
- The filter $W_{LB}(z)$ (block 300) is a perceptual weighting filter, with gain compensation, applied to the low-band error signal $d_{LB}$.
- MDCT transforms are then calculated (blocks 301 and 302) to obtain the MDCT spectrum $D_{LB}^{w}$ of the perceptually filtered difference signal, and the MDCT spectrum $S_{HB}$ of the original high-band signal.
- These MDCT transforms (blocks 301 and 302) apply to 20 ms of signal sampled at 8 kHz (160 coefficients).
- The spectrum $Y(k)$ from the merging block 303 thus comprises 2 x 160, i.e. 320 coefficients. It is defined as $Y(k) = D_{LB}^{w}(k)$ for $0 \le k < 160$ and $Y(k) = S_{HB}(k - 160)$ for $160 \le k < 320$.
- This spectrum is divided into eighteen subbands, a subband $j$ being assigned a number of coefficients denoted $nb\_coef(j)$.
- The subband splitting is specified in Table 1 below.
- A subband $j$ comprises the coefficients $Y(k)$ with $sb\_bound(j) \le k < sb\_bound(j+1)$.
- The spectral envelope $\{log\_rms(j)\}_{j=0,\dots,17}$ is calculated in block 304 according to the formula:
- This value $rms\_index(j)$ is transmitted to the bit allocation block 306.
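The subband split and the envelope computation can be sketched as follows. Neither Table 1 nor the envelope formula is reproduced in this text, so two things are assumptions here: the boundary table `SB_BOUND` (seventeen subbands of 16 coefficients followed by one of 8) and the envelope definition (base-2 logarithm of the RMS value in each subband, with a small hypothetical offset `eps` to avoid the logarithm of zero).

```python
import math

# Hypothetical boundaries: 17 subbands of 16 coefficients, then one of 8.
SB_BOUND = [16 * j for j in range(18)] + [280]


def nb_coef(j):
    """Number of coefficients in subband j."""
    return SB_BOUND[j + 1] - SB_BOUND[j]


def log_rms_envelope(Y, eps=2.0 ** -24):
    """Assumed envelope: log2 of the RMS of Y(k) in each subband."""
    env = []
    for j in range(len(SB_BOUND) - 1):
        band = Y[SB_BOUND[j]:SB_BOUND[j + 1]]   # sb_bound(j) <= k < sb_bound(j+1)
        mean_sq = sum(c * c for c in band) / len(band)
        env.append(0.5 * math.log2(mean_sq + eps))
    return env


# Example: a flat spectrum of amplitude 4 gives an envelope close to 2.
envelope = log_rms_envelope([4.0] * 320)
```

With these hypothetical boundaries, coefficients 280 to 319 fall outside every subband, which matches a useful band ending at 7000 Hz when each coefficient spans 25 Hz.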
- Two types of coding may be chosen according to a given criterion; more precisely, the values $rms\_index(j)$ may be coded either by "differential Huffman" coding or by natural binary coding.
- a bit (0 or 1) is transmitted to the decoder to indicate the encoding mode that has been chosen.
- the number of bits allocated to each sub-band for its quantization is determined in block 306 from the quantized spectral envelope from block 305.
- the allocation of the bits performed minimizes the squared error while respecting the constraint of a number of integer bits allocated per subband and a maximum number of bits not to be exceeded.
- the spectral content of the subbands is then encoded by spherical vector quantization (block 307).
- the step of TDAC-type transform decoding in the G.729.1 decoder is illustrated in FIG. 4.
- the decoded spectral envelope (block 401) makes it possible to recover the allocation.
- each of the subbands is found by inverse spherical vector quantization (block 403).
- the non-transmitted sub-bands, due to a lack of "budget" of bits, are extrapolated (block 404) from the MDCT transform of the signal at the output of the band extension block (block 202 of FIG. 2).
- The MDCT spectrum is split into two (block 407): the first 160 coefficients correspond to the spectrum $D_{LB}^{w}$ of the decoded, perceptually filtered low-band difference signal, and the following 160 coefficients correspond to the spectrum $S_{HB}$ of the decoded original high-band signal.
- These two spectra are transformed into time signals by an inverse MDCT transform (IMDCT).
- $W_{LB}(z)^{-1}$: the inverse perceptual weighting filter
- Table 2 Possible values of number of bits allocated in TDAC subbands.
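- $nbit(j) = \arg\min_{r \in R_{nb\_coef(j)}} \left| nb\_coef(j) \times (ip(j) - \lambda_{opt}) - r \right|$, where $R_{nb\_coef(j)}$ is the set of permitted bit counts (Table 2) and $ip(j)$ is the perceptual importance of subband $j$.
- $\lambda_{opt}$ is a parameter optimized by dichotomy (bisection).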
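The allocation rule and the bisection on the parameter can be sketched as below. The table of permitted bit counts is hypothetical (Table 2 is not reproduced here), and the sketch assumes that every permitted set contains 0 so that an all-zero allocation always fits the budget.

```python
def allocate_bits(ip, nb_coef, allowed, budget, iters=50):
    """Per-subband bit counts under a total budget.

    ip[j]      : perceptual importance of subband j
    nb_coef[j] : number of coefficients of subband j
    allowed[n] : permitted bit counts for a subband of n coefficients
                 (hypothetical stand-in for Table 2, must contain 0)
    """
    def alloc(lam):
        # nbit(j) = argmin over r of |nb_coef(j) * (ip(j) - lam) - r|
        return [min(allowed[n], key=lambda r: abs(n * (p - lam) - r))
                for p, n in zip(ip, nb_coef)]

    lo, hi = min(ip) - 64.0, max(ip) + 1.0    # alloc(hi) is all zeros
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(alloc(mid)) > budget:
            lo = mid                           # too many bits: raise lambda
        else:
            hi = mid                           # fits: try a smaller lambda
    return alloc(hi)


# Example with two 16-coefficient subbands and a 40-bit budget.
bits = allocate_bits([3.0, 1.0], [16, 16], {16: [0, 8, 16, 24, 32]}, 40)
```

Because the total number of bits can only decrease as the parameter grows, bisection finds the smallest value that respects the budget, which is one plausible reading of "optimized by dichotomy".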
- the TDAC coding uses the perceptual weighting W LB (z) filter in the low band (block 300), as indicated above.
- Perceptual weighting filtering shapes the coding noise.
- the principle of this filtering is to exploit the fact that it is possible to inject more noise in the frequency zones where the original signal has a high energy.
- The most common perceptual weighting filters used in narrowband CELP coding are of the form $\hat{A}(z/\gamma_1)/\hat{A}(z/\gamma_2)$, where $0 < \gamma_2 < \gamma_1 \le 1$ and $\hat{A}(z)$ represents a linear prediction (LPC) spectrum.
- The analysis-by-synthesis of CELP coding thus amounts to minimizing the quadratic error in a signal domain perceptually weighted by this type of filter.
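A filter of this form can be sketched by scaling the k-th LPC coefficient by the k-th power of each factor and running the resulting pole-zero filter. The convention A(z) = 1 + a1 z^-1 + ... and the default factor values are assumptions chosen for illustration only; the gain compensation of the G.729.1 filter is not modelled.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma), given A(z) = a[0] + a[1] z^-1 + ..., a[0] = 1."""
    return [ak * gamma ** k for k, ak in enumerate(a)]


def weighting_filter(x, a, gamma1=0.94, gamma2=0.6):
    """Apply W(z) = A(z/gamma1) / A(z/gamma2) to the samples x."""
    num = bandwidth_expand(a, gamma1)          # zeros
    den = bandwidth_expand(a, gamma2)          # poles
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


# Impulse response of the weighting filter for a first-order A(z).
h = weighting_filter([1.0, 0.0, 0.0], [1.0, -0.9])
```

Since the second factor is smaller than the first, the filter attenuates the formant regions of the error signal, so the coder tolerates more noise where the original signal has high energy.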
- the W LB (z) filter is defined as:
- The factor fac ensures a filter gain of 1 at the junction of the low and high bands (4 kHz). It is important to note that, in the G.729.1 TDAC coding, the coding is based on an energy criterion only.
- The TDAC encoder jointly processes: the difference signal between the original low band and the CELP synthesis, perceptually filtered by a gain-compensated filter of type $\hat{A}(z/\gamma_1)/\hat{A}(z/\gamma_2)$ (ensuring spectral continuity); and the high band, which contains the original high-band signal.
- the low band signal corresponds to the frequencies 50 Hz-4 kHz, while the high band signal corresponds to the frequencies 4-7 kHz.
- the joint coding of these two signals is carried out in the MDCT domain according to the criterion of the quadratic error.
- The high band is coded according to energy criteria, which is suboptimal (in the "perceptual" sense of the term). More generally, the case of multi-band coding may be considered, a perceptual weighting filter being applied to the signal of at least one band in the time domain, and the set of subbands being coded jointly by transform coding. If the perceptual weighting is to be applied in the frequency domain, the problem of continuity and homogeneity of the spectra between subbands then arises.
- the present invention improves the situation.
- The method comprises: a determination of at least one frequency masking threshold to be applied to the second subband, and a normalization of said masking threshold to provide spectral continuity between said first and second subbands.
- the present invention therefore proposes to calculate a frequency perceptual weighting, using a masking threshold, on only a part of the frequency band (at least on the "second subband” mentioned above) and to ensure spectral continuity with at least another frequency band (at least the aforementioned "first sub-band”) by normalizing the masking threshold on the spectrum covering these two frequency bands.
- The allocation of the bits for at least the second subband is furthermore determined according to a normalized masking curve calculation, applied at least to the second subband.
- the application of the invention makes it possible to assign the bits to the sub-bands which require the most bits according to a perceptual criterion.
- Perceptual frequency weighting is thus applied, using masking, to a part of the audio band, so as to improve the audio quality, in particular by optimizing the distribution of bits between subbands according to perceptual criteria.
- the transformed signal in the second subband is weighted by a factor proportional to the square root of the normalized masking threshold for the second subband.
- the normalized masking threshold is not used for the allocation of the bits to the subbands as in the first application mode above, but it can advantageously be used to directly weight the signal of the second sub-band at least in the transformed domain.
- The present invention is advantageously, but not exclusively, applied to a TDAC-type transform coding in a global encoder according to the G.729.1 standard, the first subband being included in a low frequency band, while the second subband is included in a high frequency band that can extend up to 7000 Hz, or even more (typically up to 14 kHz) with band extension.
- the application of the invention may then consist in providing a perceptual weighting for the high band while ensuring spectral continuity with the low band.
- the first subband then comprises a signal resulting from a core coding of the hierarchical coder
- the second subband comprises an original signal
- the signal from the core coding can be perceptually weighted and the implementation of the invention is advantageous in the sense that the entire spectral band can finally be perceptually weighted.
- The signal from the core coding may be a signal representative of a difference between an original signal and a synthesis of this original signal (called "difference signal" or "error signal").
- the present invention also relates to a decoding method, homologous to the coding method defined above, in which at least one first and one second subband, adjacent, are decoded by transform.
- the decoding method then comprises: a determination of at least one frequency masking threshold to be applied on the second subband, starting from a decoded spectral envelope, and a normalization of said masking threshold to provide spectral continuity between said first and second subbands.
- A first application mode of the decoding, homologous to the first application mode of the coding defined above, concerns the allocation of bits at decoding: a number of bits to be allocated to each subband is determined from a decoded spectral envelope.
- the bit allocation for the at least second subband is further determined according to a normalized masking curve calculation applied at least to the second subband.
- a second method of applying decoding within the meaning of the invention consists in weighting the transformed signal in the second subband by the square root of the normalized masking threshold. This embodiment will be described in detail with reference to FIG.
- FIG. 5 illustrates an advantageous spreading function for masking in the sense of the invention
- FIG. 6 illustrates, for comparison with FIG. 3, the structure of a TDAC encoding using a masking curve calculation 606 for the allocation of bits according to a first embodiment of the invention
- FIG. 7 illustrates, for comparison with FIG. 4, the structure of a homologous TDAC decoding of FIG. 6, using a curve calculation 702 according to the first embodiment of the invention
- FIG. 8 illustrates a normalization of the masking curve, in a first embodiment where the sampling frequency is 16 kHz and the masking of the invention applied for the high band 4-7 kHz
- FIG. 9A illustrates the structure of a modified TDAC encoding, with direct weighting of the signal in the high frequencies 4-7 kHz in a second embodiment of the invention, and coding of the normalized masking threshold
- FIG. 9B illustrates the structure of a TDAC encoding in a variant of the second application mode illustrated in FIG. 9A, here with a coding of the spectral envelope
- FIG. 10A illustrates the structure of a TDAC decoding homologous to FIG. 9A, according to the second embodiment of the invention
- FIG. 10B illustrates the structure of a TDAC decoding homologous to FIG. 9B, according to the second embodiment of the invention, here with a calculation of the masking threshold at decoding
- FIG. 11 illustrates the normalization of the masking curve in a second embodiment of the invention, where the sampling frequency is 32 kHz and the masking of the invention is applied to the high band widened from 4 to 14 kHz
- FIG. 12 illustrates the spectral power, at the output of the CELP coding, of the difference signal $D_{LB}$ (in solid lines) and of the original signal $S_{LB}$ (in dashed lines).
- the invention provides an improvement to the perceptual weighting performed in the transform coder by exploiting the masking effect known as "simultaneous masking" or "frequency masking".
- This property corresponds to the modification of the hearing threshold in the presence of a so-called "masking" sound. This phenomenon is typically observed when, for example, one tries to hold a conversation amid ambient noise, such as in the street, and the noise of a vehicle "masks" the voice of a speaker.
- an approximate masking threshold is calculated for each spectrum line. This threshold is the one above which the line concerned is supposed to be audible.
- The masking threshold is calculated from the convolution of the signal spectrum with a spreading function $B(\nu)$ modeling the masking effect of a sound (sinusoid or filtered white noise) by another sound (sinusoid or filtered white noise).
- An example of such a spreading function is shown in FIG. 5. This function is defined in a frequency domain whose unit is the Bark. This frequency scale is representative of the frequency sensitivity of the ear. A usual approximation of the conversion of a frequency $f$ in Hertz into "frequencies" denoted $\nu$ (in Barks) is given by the following relation:
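A widely used approximation of this kind is the Zwicker and Terhardt formula, sketched below. Whether the source uses exactly this form is an assumption, since the relation is not reproduced in the text; the function name is hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (assumed here)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))


# Bark positions of the band edges discussed in the text.
edges_bark = [hz_to_bark(f) for f in (50.0, 4000.0, 7000.0)]
```

The mapping is monotonic and compresses high frequencies, so the 4-7 kHz band spans fewer Barks than the 0-4 kHz band.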
- the calculation of the masking threshold is performed by subband rather than by line.
- the threshold thus obtained is used to perceptually weight each of the subbands.
- The allocation of the bits is thus performed not by minimizing the quadratic error but by minimizing the "coding noise-to-mask" ratio, the aim being to shape the coding noise so that it is inaudible (below the masking threshold).
- the spreading function may be a function of the level of the line and / or the frequency of the masking line. Detection of "peaks" can also be implemented.
- An application of the invention described hereinafter makes it possible to improve the TDAC coding of the encoder according to the G.729.1 standard, in particular by applying a perceptual weighting to the high band (4 to 7 kHz) while ensuring spectral continuity between the low and high bands for a satisfactory joint coding of these two bands.
- the input signal is sampled at 16 kHz, bandwidth 50 Hz to 7 kHz.
- The encoder always operates at the maximum rate of 32 kbit/s, while the decoder can receive the core (8 kbit/s) as well as one or more enhancement layers (12 to 32 kbit/s in steps of 2 kbit/s), as in G.729.1.
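Rate adaptation by truncation of the hierarchical bit stream can be sketched as follows, using the 20 ms frame length and the layer rates stated in the text. The function names are hypothetical, and the sketch treats a frame as an opaque byte string ordered from the core layer outwards.

```python
FRAME_MS = 20
# Core layer at 8 kbit/s, then enhancement layers from 12 to 32 kbit/s
# in steps of 2 kbit/s.
LAYER_RATES_KBPS = [8] + list(range(12, 33, 2))


def frame_bytes(rate_kbps):
    """Size of one 20 ms frame at the given rate (kbit/s x ms = bits)."""
    return rate_kbps * FRAME_MS // 8


def truncate_frame(frame, target_kbps):
    """Keep only the layers that fit in the target rate."""
    usable = max(r for r in LAYER_RATES_KBPS if r <= target_kbps)
    return frame[:frame_bytes(usable)]


full_frame = bytes(frame_bytes(32))             # frame coded at 32 kbit/s
reduced = truncate_frame(full_frame, 15)        # falls back to the 14 kbit/s layer
```

A target below 8 kbit/s raises a `ValueError` in this sketch, reflecting the fact that the core layer must always be received.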
- the coding and decoding have the same architecture as that shown in FIGS. 1 and 2. Here, only blocks 110 and 203 are modified as described in FIGS. 6 and 7.
- the modified TDAC coder is identical to that of FIG. 3, except that the allocation of bits following the quadratic error (block 306) is now replaced by a masking curve calculation and a modified bit allocation (blocks 606 and 607), the invention forming part of the calculation of the masking curve (block 606) and its use in the allocation of bits (block 607).
- the modified TDAC decoder is shown in FIG. 7 in this first embodiment.
- This decoder is identical to that of FIG. 4, except that the allocation of bits following the quadratic error (block 402) is replaced by a masking curve calculation and a modified bit allocation (blocks 702 and 703).
- the invention relates to blocks 702 and 703.
- The masking threshold $M(j)$ of subband $j$ is defined by the convolution of the energy envelope $\hat{\sigma}^2(j) = rms\_q(j)^2 \times nb\_coef(j)$ with a spreading function $B(\nu)$.
- this masking is performed only on the high band of the signal, with:
- the masking threshold $M(j)$ for a sub-band $j$ is therefore defined by a convolution between:
- An advantageous spreading function is that shown in FIG. 5. It is a triangular function whose first slope is +27 dB/Bark and whose second slope is -10 dB/Bark. This representation of the spreading function allows the following iterative calculation of the masking curve:
- the terms $\Delta_1(j)$ and $\Delta_2(j)$ can be pre-calculated and stored.
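The convolution of a sub-band energy envelope with a triangular spreading function can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses a direct double loop rather than the iterative calculation mentioned above, it takes the sub-band positions on the Bark scale as an input (the patent's actual values are not reproduced here), and the function names are hypothetical. The slopes of +27 dB/Bark and -10 dB/Bark come from the text; the sign convention (argument equal to the position of the masked sub-band minus the position of the masking sub-band) is an assumption.

```python
def spreading(v):
    """Triangular spreading function B(v) in the linear power domain.

    v is a distance in Bark (masked sub-band minus masking sub-band,
    an assumed convention). The rising side has a slope of +27 dB/Bark,
    the falling side a slope of -10 dB/Bark, and B(0) = 1.
    """
    slope_db_per_bark = 27.0 if v < 0 else -10.0
    return 10.0 ** (slope_db_per_bark * v / 10.0)


def masking_threshold(energy, bark):
    """Direct convolution M(j) = sum_k energy[k] * B(bark[j] - bark[k])."""
    return [
        sum(e_k * spreading(v_j - v_k) for e_k, v_k in zip(energy, bark))
        for v_j in bark
    ]


if __name__ == "__main__":
    # Hypothetical envelope: a single energetic sub-band at 1 Bark.
    print(masking_threshold([0.0, 4.0, 0.0], [0.0, 1.0, 2.0]))
```

With a single energetic sub-band, the threshold falls off steeply toward lower frequencies and more gently toward higher frequencies, which is the asymmetry the two slopes are meant to capture.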
- a first embodiment of the invention is described below for the allocation of bits in a hierarchical coder such as the G.729.1 encoder.
- the bit allocation criterion is based here on the signal-to-mask ratio. Since the low band is already perceptually filtered, the masking threshold is applied only to the high band. To ensure spectral continuity between the low-band spectrum and the high-band spectrum weighted by the masking threshold, and to avoid biasing the bit allocation, the masking threshold is normalized by its value on the last sub-band of the low band.
- $\mathrm{normfac} = \log_2\Big[\sum_{j} \sigma^2(j) \times B(v_9 - v_j)\Big]$
- $\mathrm{log\_mask}(j) = \log_2[M(j)] - \mathrm{normfac}$.
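The normalization step can be sketched in a few lines. This is an illustrative sketch only: the function name and argument layout are hypothetical, and the reference value is simply passed in rather than being computed from a particular sub-band. It shows that subtracting normfac in the log2 domain brings the threshold to 0 wherever it equals the reference value, which is what allows the high band to join the low band without a level jump.

```python
import math


def normalize_log_mask(mask, ref_value):
    """Return log2(M(j)) - normfac for each sub-band.

    mask      : linear masking-threshold values M(j) (must be > 0)
    ref_value : threshold value used as reference, e.g. the value on the
                last sub-band of the low band; normfac = log2(ref_value)
    """
    normfac = math.log2(ref_value)
    return [math.log2(m) - normfac for m in mask]


if __name__ == "__main__":
    # Hypothetical values: the second entry equals the reference.
    print(normalize_log_mask([8.0, 2.0], 2.0))
```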
- the second line of the brace in the calculation of the perceptual importance expresses the implementation of the invention in this first application, namely bit allocation in a transform coding used as an upper layer of a hierarchical coder.
- An illustration of the normalization of the masking threshold is given in FIG. 8, showing how the high band (4-7 kHz), to which the masking is applied, is joined to the low band (0-4 kHz).
- $\lambda_{opt}$ is obtained by bisection (dichotomy), as in the G.729.1 standard.
- in a variant, the normalization of the masking threshold can instead be carried out from its value on the first sub-band of the high band:
- $\mathrm{normfac} = \log_2\Big[\sum_{j} \sigma^2(j) \times B(v_{10} - v_j)\Big]$
- the masking threshold can be calculated over the entire frequency band, with:
- the masking threshold is then applied only to the high band after normalization of the masking threshold by its value on the last subband of the low band:
- $\mathrm{normfac} = \log_2\Big[\sum_{j} \sigma^2(j) \times B(v_{10} - v_j)\Big]$
- these relations giving the normalization factor normfac or the masking threshold $M(j)$ can be generalized to any number of sub-bands (a total other than eighteen), in the high band (a number other than eight) as in the low band (a number other than ten).
- in a second embodiment, the normalized masking threshold is not used to weight the energy in the definition of the perceptual importance, as in the first embodiment described above; instead, it directly weights the high-band signal before TDAC coding.
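Direct weighting of the high-band spectrum by a normalized masking threshold can be illustrated as below. The text states only that the threshold weights the signal before coding; the specific rule used here (dividing each coefficient of a sub-band by the square root of that sub-band's threshold, and multiplying back at the decoder) is an assumption chosen for illustration, and the function and parameter names are hypothetical.

```python
import math


def weight_subbands(spectrum, bands, mask, inverse=False):
    """Scale the coefficients of each sub-band by a gain derived from mask.

    spectrum : list of transform coefficients
    bands    : list of (start, stop) index pairs, one per sub-band
    mask     : normalized masking threshold per sub-band (linear, > 0)
    inverse  : False -> multiply by 1/sqrt(mask) (assumed encoder side)
               True  -> multiply by sqrt(mask)   (assumed decoder side)
    """
    out = list(spectrum)
    for (start, stop), m in zip(bands, mask):
        gain = math.sqrt(m) if inverse else 1.0 / math.sqrt(m)
        for k in range(start, stop):
            out[k] = spectrum[k] * gain
    return out


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]
    bands = [(0, 2), (2, 4)]
    mask = [4.0, 0.25]
    weighted = weight_subbands(y, bands, mask)
    restored = weight_subbands(weighted, bands, mask, inverse=True)
    print(weighted, restored)
```

Because the decoder must apply the inverse gain, it needs the same threshold values as the encoder, which is why the information representative of the weighting has to be transmitted or be recomputable from decoded data.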
- This second embodiment is illustrated in FIGS. 9A (for encoding) and 10A (for decoding).
- A variant of this second mode, which is the object of the present invention, in particular for the decoding performed, is illustrated in FIGS. 9B (for encoding) and 10B (for decoding).
- In FIGS. 9A and 9B, the spectrum Y(k) from block 903 is divided into eighteen sub-bands and the spectral envelope is calculated (block 904) as previously described.
- the masking threshold is calculated (block 905 of FIG. 9A and block 906b of FIG. 9B) from the unquantized spectral envelope.
- in the embodiment of FIG. 9A, the information representative of the weighting that is coded is the masking threshold $M(j)$ itself, rather than the spectral envelope.
- This coding is performed by algebraic quantization according to the squared error, as described in S. Ragot, B. Bessette and R. Lefebvre, "Low-complexity multi-rate lattice vector quantization with application to wideband TCX speech coding at 32 kbit/s", Proceedings ICASSP, Montreal (Canada), vol. 1, pp. 501-504 (2004), referred to below as Ragot et al.
- This gain-shape type quantization method is implemented in particular in the 3GPP AMR-WB+ standard.
- the corresponding decoder is shown in FIG. 10A.
- the scaling factors $\mathrm{sf\_q}(j)$, $j = 0, \ldots, 17$, are decoded in block 1001.
- the decoding of block 1002 is then carried out as described in Ragot et al., cited above.
- This second embodiment may be particularly advantageous in an implementation according to the 3GPP AMR-WB+ standard, which is presented as the preferred context of Ragot et al., cited above.
- in the variant of FIGS. 9B and 10B, the coded information remains the energy envelope (rather than the masking threshold itself, as in FIGS. 9A and 10A).
- the masking threshold is calculated and normalized (block 906b of FIG. 9B) from the coded spectral envelope (block 905b).
- the masking threshold is calculated and normalized (block 1011b of FIG. 10B) from the decoded spectral envelope (block 1001b), the decoding of the envelope making it possible to perform a level adjustment (block 1010b of FIG. 10B) from the quantized values rms_q(j).
- a masking threshold is calculated for each sub-band, at least for the sub-bands of the high frequency band, this masking threshold being normalized to ensure spectral continuity between the subbands concerned.
- the calculation of the masking threshold is particularly advantageous when the signal to be coded is not tonal, in the first embodiment as in the second embodiment described above.
- when the signal is tonal, applying the spreading function B(v) yields a masking threshold very close to the tone itself, only slightly more spread out in frequency.
- the allocation criterion minimizing the coding-noise-to-mask ratio then gives a rather poor bit allocation.
- the same is true for the direct weighting of the high band signal according to the second embodiment. It is therefore preferred, for a tonal signal, to use a bit allocation according to energy criteria.
- the invention is applied only if the signal to be encoded is not tonal.
- the bit relating to the mode of the coding of the spectral envelope indicates a "differential Huffman" mode or a "natural direct binary" mode.
- This mode bit can be interpreted as a tone detection since, in general, a tonal signal leads to envelope coding by the "natural direct binary" mode, while most non-tonal signals, having a more limited spectral dynamic range, lead to envelope coding by the "differential Huffman" mode.
- the module 904 of FIG. 9A can, by calculating the spectral envelope, determine whether the signal is tonal and, if so, bypass block 905.
- likewise, the module 904 can determine whether the signal is tonal and, if so, bypass block 907.
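The decision described above, using the envelope coding mode as a stand-in for tone detection, can be sketched as a simple selector. The mode labels and the function name are hypothetical identifiers introduced for this illustration; only the mapping (natural binary mode treated as tonal, differential Huffman mode treated as non-tonal) comes from the text.

```python
DIFFERENTIAL_HUFFMAN = "differential_huffman"  # hypothetical label
NATURAL_BINARY = "natural_binary"              # hypothetical label


def use_masking_threshold(envelope_mode):
    """Return True when masking-based processing should be applied.

    A natural binary envelope mode is read as 'tonal signal', in which
    case the masking blocks are bypassed and an energy criterion is used.
    """
    if envelope_mode == DIFFERENTIAL_HUFFMAN:
        return True   # non-tonal: apply the masking threshold
    if envelope_mode == NATURAL_BINARY:
        return False  # tonal: bypass, fall back to energy criteria
    raise ValueError(f"unknown envelope mode: {envelope_mode!r}")


if __name__ == "__main__":
    for mode in (DIFFERENTIAL_HUFFMAN, NATURAL_BINARY):
        print(mode, use_masking_threshold(mode))
```

Since the mode bit is already part of the bitstream, a decoder could make the same choice without any additional signalling.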
- FIG. 11 generalizes the normalization of the masking curve (described in FIG. 8) to the case of super-wideband coding.
- the signals in this embodiment are sampled at a frequency of 32 kHz (instead of 16 kHz) for a useful band of 50 Hz - 14 kHz.
- the masking curve $\log_2[M(j)]$ is then defined at least for the sub-bands ranging from 7 to 14 kHz.
- the spectrum covering the band 50 Hz - 14 kHz is coded by subbands and the allocation of bits to each subband is made from the spectral envelope as in the G.729.1 encoder.
- a partial masking threshold can be calculated as previously described.
- the normalization of the masking threshold is thus generalized to the case where the high band has more sub-bands or covers a wider frequency range than in the G.729.1 standard.
- a first transform T1 is applied to the time weighted difference signal.
- a second transform T2 is applied to the signal on the first high band between 4 and 7 kHz and a third transform T3 is applied to the signal on the second high band between 7 and 14 kHz.
- the invention is not limited to signals sampled at 16 kHz. Its implementation is also particularly advantageous for signals sampled at higher frequencies, such as for the extension of the G.729.1 encoder to signals sampled not at 16 kHz but at 32 kHz, as described above. If the TDAC coding is generalized to such a frequency band (50 Hz - 14 kHz instead of the current 50 Hz - 7 kHz), the advantage provided by the invention will be substantial.
- the invention also aims to improve the TDAC coding, in particular by applying a perceptual weighting of the extended high band (4-14 kHz) while ensuring spectral continuity between bands, this criterion being important for joint coding of the first, low band and of the second, high band extended up to 14 kHz.
- the hierarchical coder is implemented with a core coder in a first frequency band, and the error signal associated with this core coder is directly transformed, without perceptual weighting in this first frequency band, to be coded together with the transformed signal of a second frequency band.
- the original signal can be sampled at 16 kHz and decomposed into two frequency bands (from 0 to
- the encoder can typically be, in such an embodiment, an encoder according to the standard
- the transform coding is then performed on: the difference signal between the original signal and the G.711 synthesis in the first frequency band (0-4000 Hz), and the original signal, perceptually weighted in the frequency domain according to the invention, in a second frequency band (4000-8000 Hz).
- the perceptual weighting in the low band is not necessary for the application of the invention.
- the original signal is sampled at 32 kHz and decomposed into two frequency bands (0 to 8000 Hz and 8000 to 16000 Hz) by an appropriate filter bank, QMF type.
- the encoder can here be an encoder according to the G.722 standard (ADPCM compression in two sub-bands), and the transform coding is performed on: the difference signal between the original signal and the G.722 synthesis signal in the first frequency band (0-8000 Hz), and the original signal, again perceptually weighted according to the invention in a frequency domain restricted to the second frequency band (8000-16000 Hz).
- the present invention also relates to a first computer program, stored in a memory of an encoder of a telecommunication terminal and / or stored on a memory medium intended to cooperate with a reader of said encoder.
- This first program then comprises instructions for implementing the coding method defined above, when these instructions are executed by an encoder processor.
- the present invention also relates to an encoder comprising at least one memory storing this first computer program.
- FIGS. 6, 9A and 9B may constitute flowcharts of this first computer program, or further illustrate the structure of such an encoder, according to distinct embodiments and variants.
- the present invention also relates to a second computer program, stored in a memory of a decoder of a telecommunication terminal and / or stored on a storage medium intended to cooperate with a reader of said decoder.
- This second program then comprises instructions for implementing the decoding method defined above, when these instructions are executed by a processor of the decoder.
- the present invention also relates to a decoder comprising at least one memory storing this second computer program.
- FIGS. 7, 10A, 10B can constitute flowcharts of this second computer program, or further illustrate the structure of such a decoder, according to different embodiments and variants.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0700747A FR2912249A1 (fr) | 2007-02-02 | 2007-02-02 | Codage/decodage perfectionnes de signaux audionumeriques. |
PCT/FR2008/050150 WO2008104663A1 (fr) | 2007-02-02 | 2008-01-30 | Codage/decodage perfectionnes de signaux audionumeriques |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2115741A1 true EP2115741A1 (fr) | 2009-11-11 |
EP2115741B1 EP2115741B1 (fr) | 2010-07-07 |
Family
ID=38477199
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP08762010A Active EP2115741B1 (fr) | 2007-02-02 | 2008-01-30 | Codage/decodage perfectionnes de signaux audionumeriques |
Country Status (10)
Country | Link |
---|---|
US (1) | US8543389B2 (fr) |
EP (1) | EP2115741B1 (fr) |
JP (1) | JP5357055B2 (fr) |
KR (1) | KR101425944B1 (fr) |
CN (1) | CN101622661B (fr) |
AT (1) | ATE473504T1 (fr) |
DE (1) | DE602008001718D1 (fr) |
ES (1) | ES2347850T3 (fr) |
FR (1) | FR2912249A1 (fr) |
WO (1) | WO2008104663A1 (fr) |
Families Citing this family (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008022181A2 (fr) * | 2006-08-15 | 2008-02-21 | Broadcom Corporation | Mise à jour des états de décodeur après un masquage de perte de paquet |
EP2304723B1 (fr) * | 2008-07-11 | 2012-10-24 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Appareil et procédé de décodage d un signal audio encodé |
BRPI0910517B1 (pt) * | 2008-07-11 | 2022-08-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V | Um aparelho e um método para calcular um número de envelopes espectrais a serem obtidos por um codificador de replicação de banda espectral (sbr) |
US8515747B2 (en) * | 2008-09-06 | 2013-08-20 | Huawei Technologies Co., Ltd. | Spectrum harmonic/noise sharpness control |
WO2010028297A1 (fr) * | 2008-09-06 | 2010-03-11 | GH Innovation, Inc. | Extension sélective de bande passante |
WO2010028292A1 (fr) * | 2008-09-06 | 2010-03-11 | Huawei Technologies Co., Ltd. | Prédiction de fréquence adaptative |
US8577673B2 (en) * | 2008-09-15 | 2013-11-05 | Huawei Technologies Co., Ltd. | CELP post-processing for music signals |
WO2010031003A1 (fr) | 2008-09-15 | 2010-03-18 | Huawei Technologies Co., Ltd. | Addition d'une seconde couche d'amélioration à une couche centrale basée sur une prédiction linéaire à excitation par code |
CN102396024A (zh) * | 2009-02-16 | 2012-03-28 | 韩国电子通信研究院 | 使用自适应正弦波脉冲编码的用于音频信号的编码/解码方法及其设备 |
FR2947944A1 (fr) * | 2009-07-07 | 2011-01-14 | France Telecom | Codage/decodage perfectionne de signaux audionumeriques |
WO2011042464A1 (fr) * | 2009-10-08 | 2011-04-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Décodeur de signal audio multimode, codeur de signal audio multimode, procédés et programme informatique utilisant une mise en forme de bruit basée sur un codage à prédiction linéaire |
JP5565914B2 (ja) * | 2009-10-23 | 2014-08-06 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 符号化装置、復号装置およびこれらの方法 |
JP5598536B2 (ja) * | 2010-03-31 | 2014-10-01 | 富士通株式会社 | 帯域拡張装置および帯域拡張方法 |
US9443534B2 (en) * | 2010-04-14 | 2016-09-13 | Huawei Technologies Co., Ltd. | Bandwidth extension system and approach |
EP2562750B1 (fr) * | 2010-04-19 | 2020-06-10 | Panasonic Intellectual Property Corporation of America | Dispositif de codage, dispositif de décodage, procédé de codage et procédé de décodage |
US8600737B2 (en) | 2010-06-01 | 2013-12-03 | Qualcomm Incorporated | Systems, methods, apparatus, and computer program products for wideband speech coding |
KR101696632B1 (ko) | 2010-07-02 | 2017-01-16 | 돌비 인터네셔널 에이비 | 선택적인 베이스 포스트 필터 |
US9236063B2 (en) | 2010-07-30 | 2016-01-12 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for dynamic bit allocation |
US9208792B2 (en) | 2010-08-17 | 2015-12-08 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for noise injection |
KR101826331B1 (ko) * | 2010-09-15 | 2018-03-22 | 삼성전자주식회사 | 고주파수 대역폭 확장을 위한 부호화/복호화 장치 및 방법 |
EP2657933B1 (fr) * | 2010-12-29 | 2016-03-02 | Samsung Electronics Co., Ltd | Appareil de codage et appareil de décodage avec extension de largeur de bande |
WO2012144128A1 (fr) * | 2011-04-20 | 2012-10-26 | パナソニック株式会社 | Dispositif de codage vocal/audio, dispositif de décodage vocal/audio et leurs procédés |
US9173025B2 (en) | 2012-02-08 | 2015-10-27 | Dolby Laboratories Licensing Corporation | Combined suppression of noise, echo, and out-of-location signals |
US8712076B2 (en) | 2012-02-08 | 2014-04-29 | Dolby Laboratories Licensing Corporation | Post-processing including median filtering of noise suppression gains |
WO2013168414A1 (fr) * | 2012-05-11 | 2013-11-14 | パナソニック株式会社 | Codeur de signal audio hybride, décodeur de signal audio hybride, procédé de codage de signal audio et procédé de décodage de signal audio |
US9659567B2 (en) | 2013-01-08 | 2017-05-23 | Dolby International Ab | Model based prediction in a critically sampled filterbank |
CA2908625C (fr) * | 2013-04-05 | 2017-10-03 | Dolby International Ab | Codeur et decodeur audio |
CN104217727B (zh) * | 2013-05-31 | 2017-07-21 | 华为技术有限公司 | 信号解码方法及设备 |
US9418671B2 (en) * | 2013-08-15 | 2016-08-16 | Huawei Technologies Co., Ltd. | Adaptive high-pass post-filter |
CN108347689B (zh) * | 2013-10-22 | 2021-01-01 | 延世大学工业学术合作社 | 用于处理音频信号的方法和设备 |
KR101498113B1 (ko) * | 2013-10-23 | 2015-03-04 | 광주과학기술원 | 사운드 신호의 대역폭 확장 장치 및 방법 |
WO2015162500A2 (fr) * | 2014-03-24 | 2015-10-29 | 삼성전자 주식회사 | Procédé et dispositif de codage de bande haute et procédé et dispositif de décodage de bande haute |
ES2840349T3 (es) * | 2014-05-01 | 2021-07-06 | Nippon Telegraph & Telephone | Descodificación de una señal de sonido |
EP4293666A3 (fr) | 2014-07-28 | 2024-03-06 | Samsung Electronics Co., Ltd. | Procédé et appareil de codage de signal ainsi que procédé et appareil de décodage de signal |
WO2017033113A1 (fr) | 2015-08-21 | 2017-03-02 | Acerta Pharma B.V. | Associations thérapeutiques d'un inhibiteur de mek et d'un inhibiteur de btk |
US10628165B2 (en) * | 2017-08-17 | 2020-04-21 | Agora Lab, Inc. | Gain control for multiple description coding |
WO2019091576A1 (fr) | 2017-11-10 | 2019-05-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Codeurs audio, décodeurs audio, procédés et programmes informatiques adaptant un codage et un décodage de bits les moins significatifs |
EP3483878A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Décodeur audio supportant un ensemble de différents outils de dissimulation de pertes |
EP3483882A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Contrôle de la bande passante dans des codeurs et/ou des décodeurs |
EP3483886A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Sélection de délai tonal |
EP3483884A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Filtrage de signal |
EP3483880A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Mise en forme de bruit temporel |
EP3483883A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Codage et décodage de signaux audio avec postfiltrage séléctif |
EP3483879A1 (fr) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Fonction de fenêtrage d'analyse/de synthèse pour une transformation chevauchante modulée |
KR102189733B1 (ko) * | 2019-06-12 | 2020-12-11 | 주식회사 에이치알지 | 대동물의 섭취량을 측정하는 전자 장치 및 그 동작 방법 |
WO2024034389A1 (fr) * | 2022-08-09 | 2024-02-15 | ソニーグループ株式会社 | Dispositif de traitement de signal, procédé de traitement de signal et programme |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0695700A (ja) * | 1992-09-09 | 1994-04-08 | Toshiba Corp | 音声符号化方法及びその装置 |
US5632003A (en) * | 1993-07-16 | 1997-05-20 | Dolby Laboratories Licensing Corporation | Computationally efficient adaptive bit allocation for coding method and apparatus |
US5623577A (en) * | 1993-07-16 | 1997-04-22 | Dolby Laboratories Licensing Corporation | Computationally efficient adaptive bit allocation for encoding method and apparatus with allowance for decoder spectral distortions |
US5625743A (en) * | 1994-10-07 | 1997-04-29 | Motorola, Inc. | Determining a masking level for a subband in a subband audio encoder |
DE69620967T2 (de) * | 1995-09-19 | 2002-11-07 | At & T Corp | Synthese von Sprachsignalen in Abwesenheit kodierter Parameter |
US5790759A (en) * | 1995-09-19 | 1998-08-04 | Lucent Technologies Inc. | Perceptual noise masking measure based on synthesis filter frequency response |
DE69938016T2 (de) * | 1998-05-27 | 2008-05-15 | Microsoft Corp., Redmond | Verfahren und Vorrichtung zur Maskierung des Quantisierungsrauschens von Audiosignalen |
JP3515903B2 (ja) * | 1998-06-16 | 2004-04-05 | 松下電器産業株式会社 | オーディオ符号化のための動的ビット割り当て方法及び装置 |
US6363338B1 (en) * | 1999-04-12 | 2002-03-26 | Dolby Laboratories Licensing Corporation | Quantization in perceptual audio coders with compensation for synthesis filter noise spreading |
JP2003280697A (ja) * | 2002-03-22 | 2003-10-02 | Sanyo Electric Co Ltd | 音声圧縮方法および音声圧縮装置 |
WO2003091989A1 (fr) * | 2002-04-26 | 2003-11-06 | Matsushita Electric Industrial Co., Ltd. | Codeur, decodeur et procede de codage et de decodage |
FR2850781B1 (fr) * | 2003-01-30 | 2005-05-06 | Jean Luc Crebouw | Procede pour le traitement numerique differencie de la voix et de la musique, le filtrage du bruit, la creation d'effets speciaux et dispositif pour la mise en oeuvre dudit procede |
US7333930B2 (en) * | 2003-03-14 | 2008-02-19 | Agere Systems Inc. | Tonal analysis for perceptual audio coding using a compressed spectral representation |
KR20070084002A (ko) * | 2004-11-05 | 2007-08-24 | 마츠시타 덴끼 산교 가부시키가이샤 | 스케일러블 복호화 장치 및 스케일러블 부호화 장치 |
US7562021B2 (en) * | 2005-07-15 | 2009-07-14 | Microsoft Corporation | Modification of codewords in dictionary used for efficient coding of digital media spectral data |
EP2077551B1 (fr) * | 2008-01-04 | 2011-03-02 | Dolby Sweden AB | Encodeur audio et décodeur |
-
2007
- 2007-02-02 FR FR0700747A patent/FR2912249A1/fr not_active Withdrawn
-
2008
- 2008-01-30 CN CN2008800066533A patent/CN101622661B/zh active Active
- 2008-01-30 EP EP08762010A patent/EP2115741B1/fr active Active
- 2008-01-30 JP JP2009547737A patent/JP5357055B2/ja active Active
- 2008-01-30 ES ES08762010T patent/ES2347850T3/es active Active
- 2008-01-30 KR KR1020097016113A patent/KR101425944B1/ko active IP Right Grant
- 2008-01-30 US US12/524,774 patent/US8543389B2/en active Active
- 2008-01-30 AT AT08762010T patent/ATE473504T1/de not_active IP Right Cessation
- 2008-01-30 WO PCT/FR2008/050150 patent/WO2008104663A1/fr active Application Filing
- 2008-01-30 DE DE602008001718T patent/DE602008001718D1/de active Active
Non-Patent Citations (1)
Title |
---|
See references of WO2008104663A1 * |
Also Published As
Publication number | Publication date |
---|---|
KR20090104846A (ko) | 2009-10-06 |
WO2008104663A1 (fr) | 2008-09-04 |
CN101622661A (zh) | 2010-01-06 |
CN101622661B (zh) | 2012-05-23 |
KR101425944B1 (ko) | 2014-08-06 |
JP5357055B2 (ja) | 2013-12-04 |
FR2912249A1 (fr) | 2008-08-08 |
JP2010518422A (ja) | 2010-05-27 |
US8543389B2 (en) | 2013-09-24 |
EP2115741B1 (fr) | 2010-07-07 |
ES2347850T3 (es) | 2010-11-04 |
DE602008001718D1 (de) | 2010-08-19 |
US20100121646A1 (en) | 2010-05-13 |
ATE473504T1 (de) | 2010-07-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2115741B1 (fr) | Codage/decodage perfectionnes de signaux audionumeriques | |
EP2452337B1 (fr) | Allocation de bits dans un codage/décodage d'amélioration d'un codage/décodage hiérarchique de signaux audionumériques | |
EP2452336B1 (fr) | Codage/décodage perfectionne de signaux audionumériques | |
EP1989706B1 (fr) | Dispositif de ponderation perceptuelle en codage/decodage audio | |
JP5161069B2 (ja) | 広帯域音声符号化のためのシステム、方法、及び装置 | |
EP1905010B1 (fr) | Codage/décodage audio hiérarchique | |
EP1692689B1 (fr) | Procede de codage multiple optimise | |
WO2007107670A2 (fr) | Procede de post-traitement d'un signal dans un decodeur audio |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20090728 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
DAX | Request for extension of the european patent (deleted) | ||
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: GUILLAUME, CYRIL Inventor name: RAGOT, STEPHANE |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REF | Corresponds to: |
Ref document number: 602008001718 Country of ref document: DE Date of ref document: 20100819 Kind code of ref document: P |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: VDEP Effective date: 20100707 |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: FG2A Ref document number: 2347850 Country of ref document: ES Kind code of ref document: T3 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
LTIE | Lt: invalidation of european patent or patent extension |
Effective date: 20100707 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20101007 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FD4D |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20101007 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20101107 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20101008 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: IE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
26N | No opposition filed |
Effective date: 20110408 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602008001718 Country of ref document: DE Effective date: 20110408 |
|
BERE | Be: lapsed |
Owner name: FRANCE TELECOM Effective date: 20110131 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20110131 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20110131 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20120131 Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20120131 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20110130 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20100707 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100707 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 8 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 9 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 10 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20231219 Year of fee payment: 17 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20231219 Year of fee payment: 17 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: ES Payment date: 20240202 Year of fee payment: 17 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20231219 Year of fee payment: 17 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: IT Payment date: 20240102 Year of fee payment: 17 |