EP2656343A1 - Low-delay sound coding alternating predictive coding and transform coding - Google Patents
- Publication number
- EP2656343A1 (EP11815474.9A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- coding
- frame
- predictive
- decoding
- mdct
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
Definitions
- the present invention relates to the field of coding digital signals.
- the invention is advantageously applied to the coding of sounds with alternating speech and music.
- CELP Code Excited Linear Prediction
- transform coding techniques are preferred.
- CELP coders are predictive coders. They aim to model the production of speech from various elements: a short-term linear prediction to model the vocal tract, a long-term prediction to model the vibration of the vocal cords during voiced periods, and an excitation derived from a fixed dictionary (white noise, algebraic excitation) to represent the "innovation" that could not be modeled.
- critical-sampling transform is a transform for which the number of coefficients in the transformed domain is equal to the number of time samples analyzed.
- This technique is based on a CELP technology of the AMR-WB type, more specifically of the ACELP type (for "Algebraic Code Excited Linear Prediction"), and on transform coding based on an overlapping Fourier transform in a TCX-type model (for "Transform Coded eXcitation").
- ACELP coding and TCX coding are both linear predictive type techniques.
- The AMR-WB+ codec has been developed for the 3GPP PSS (for "Packet Switched Streaming"), MBMS (for "Multimedia Broadcast/Multicast Service") and MMS (for "Multimedia Messaging Service") services, in other words for broadcast and storage services, without strong constraints on algorithmic delay.
- This solution suffers from insufficient quality on music.
- This deficiency comes particularly from the transform coding.
- the overlapping Fourier transform is not a critically sampled transform and is therefore suboptimal.
- the windows used in this encoder are not optimal with respect to energy concentration: the frequency responses of these quasi-rectangular windows are suboptimal.
- RM0: Reference Model 0
- LPD Linear Predictive Domain
- wLPT weighted Linear Predictive Transform
- MDCT-type transform, unlike the AMR-WB+ codec
- FD mode, for "Frequency Domain"
- MDCT, for "Modified Discrete Cosine Transform"
- MPEG AAC (for "Advanced Audio Coding"), on 1024 samples.
- the major differences brought by the USAC RM0 coding for the mono part are the use of an MDCT-type critically decimated transform for transform coding and the quantization of the MDCT spectrum by scalar quantization with arithmetic coding.
- the acoustic band coded by the different modes depends on the selected mode, which is not the case in the AMR-WB+ codec, where the ACELP and TCX modes operate at the same internal sampling frequency.
- the mode decision in the USAC RM0 codec is performed in open loop for each frame of 1024 samples.
- a closed loop decision is made by executing the different coding modes in parallel and choosing a posteriori the mode that gives the best result according to a predefined criterion.
- in an open-loop decision, the decision is made a priori based on available data and observations, but without testing whether this decision is optimal or not.
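The difference between the two decision strategies can be sketched as follows. This is a minimal illustration, not the USAC or patent procedure: the `codecs` mapping, the `classifier` callable and the squared-error criterion are hypothetical stand-ins for the coding modes and for the "predefined criterion" mentioned above.

```python
from typing import Callable, Dict, Sequence

Frame = Sequence[float]
# Hypothetical: each "codec" encodes a frame and returns its local decoding.
Codec = Callable[[Frame], Frame]


def squared_error(original: Frame, decoded: Frame) -> float:
    """Illustrative criterion: sum of squared reconstruction errors."""
    return sum((x - y) ** 2 for x, y in zip(original, decoded))


def closed_loop_decision(frame: Frame, codecs: Dict[str, Codec]) -> str:
    """Run every mode, then pick a posteriori the one with the lowest error."""
    errors = {name: squared_error(frame, codec(frame)) for name, codec in codecs.items()}
    return min(errors, key=errors.get)


def open_loop_decision(frame: Frame, classifier: Callable[[Frame], str]) -> str:
    """Pick the mode a priori from signal observations; no mode is actually run."""
    return classifier(frame)
```

The closed-loop variant costs one full encode and decode per candidate mode, which is why an open-loop decision is attractive when complexity matters.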
- the MDCT window is divided into 4 adjacent portions of equal lengths M / 2, called "quarters".
- the signal is multiplied by the analysis window and folds are made: the first (windowed) quarter is folded (i.e. inverted in time and overlapped) onto the second quarter, and the fourth quarter is folded onto the third. Specifically, folding from one quarter to another is done in the following way: the first sample of the first quarter is added to (or subtracted from) the last sample of the second quarter, the second sample of the first quarter is added to (or subtracted from) the second-to-last sample of the second quarter, and so on, until the last sample of the first quarter is added to (or subtracted from) the first sample of the second quarter.
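The folding step can be sketched as below. The sample pairing follows the description above; the signs and the ordering of the two folded halves follow one common MDCT convention and are an assumption, since several variants exist. The function name `fold` is hypothetical.

```python
def fold(windowed_block):
    """Fold a windowed block of 2M samples into M samples.

    With quarters (a, b, c, d), this sketch returns
    (-reversed(c) - d, a - reversed(b)); the sign convention is an assumption.
    """
    if len(windowed_block) % 4 != 0:
        raise ValueError("block length must be a multiple of 4")
    q = len(windowed_block) // 4
    a = windowed_block[0:q]
    b = windowed_block[q:2 * q]
    c = windowed_block[2 * q:3 * q]
    d = windowed_block[3 * q:4 * q]
    # Third and fourth quarters: sample i of d is paired with sample q-1-i of c.
    cd_part = [-c[q - 1 - i] - d[i] for i in range(q)]
    # First and second quarters: sample i of a is paired with sample q-1-i of b.
    ab_part = [a[i] - b[q - 1 - i] for i in range(q)]
    return cd_part + ab_part
```

The output has half as many samples as the input block, which is what makes the transform critically sampled once consecutive blocks overlap by half.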
- the third and fourth quarters of the previous frame then become the first and second quarter of the current frame.
- the decoded version of these folded signals is thus obtained.
- Two consecutive frames contain the result of 2 different folds of the same quarters, that is to say, for each pair of samples, the result of 2 linear combinations with different but known weights is available: a system of equations is thus solved to obtain the decoded version of the input signal, and the time folding can be removed using 2 consecutive decoded frames.
- the systems of equations mentioned are generally solved by unfolding, multiplication by a suitably chosen synthesis window, then overlap-add of the common parts.
- This overlap-add also ensures a smooth transition (without discontinuity due to quantization errors) between two consecutive decoded frames; in effect, this operation behaves like a crossfade.
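The cancellation of time folding by overlap-add can be checked numerically with a small sketch. It uses the direct O(M²) cosine formula and a sine window, both chosen for brevity; these are assumptions for illustration and not the low-delay window or the fast transform of an actual codec. No quantization is applied, so the reconstruction is exact up to rounding.

```python
import math


def sine_window(M):
    """Sine window of length 2M; satisfies w[n]^2 + w[n+M]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]


def mdct(block, window):
    """Windowed MDCT: 2M input samples -> M coefficients."""
    M = len(block) // 2
    n0 = 0.5 + M / 2
    return [
        sum(window[n] * block[n] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
            for n in range(2 * M))
        for k in range(M)
    ]


def imdct(coeffs, window):
    """Inverse MDCT plus synthesis window: M coefficients -> 2M aliased samples."""
    M = len(coeffs)
    n0 = 0.5 + M / 2
    return [
        window[n] * (2.0 / M) * sum(
            coeffs[k] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
            for k in range(M))
        for n in range(2 * M)
    ]


def analysis_synthesis(signal, M):
    """Transform overlapping blocks (hop M) and rebuild the signal by overlap-add."""
    if len(signal) % M != 0:
        raise ValueError("signal length must be a multiple of M")
    w = sine_window(M)
    # Zero padding so that every signal sample is covered by two blocks.
    padded = [0.0] * M + list(signal) + [0.0] * M
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * M + 1, M):
        y = imdct(mdct(padded[start:start + 2 * M], w), w)
        for n in range(2 * M):
            out[start + n] += y[n]  # overlap-add cancels the folded terms
    return out[M:M + len(signal)]
```

A single decoded block on its own still contains the folded terms; only the sum of two consecutive blocks over their common part recovers the signal, which is exactly what is missing when the previous frame was not transform coded.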
- when the window is zero for every sample of the first quarter or of the fourth quarter, this is referred to as an MDCT transformation without time folding in that part of the window.
- in that case, the smooth transition is not ensured by the MDCT transformation; it must be obtained by other means, for example an external crossfade.
- implementation variants of the MDCT transformation exist, in particular in the definition of the DCT transform and in how the block to be transformed is temporally folded (for example, the signs applied to the quarters folded to the left and right can be reversed, or the second and third quarters can be folded onto the first and fourth quarters respectively), etc. These variants do not change the principle of MDCT analysis-synthesis: reduction of the sample block by windowing, temporal folding then transformation, and finally windowing, unfolding and overlap-add.
- a transition window for the FD mode is used with a left overlap of 128 samples, as shown in Figure 1.
- the temporal folding on this overlap area is canceled by introducing an "artificial" time folding on the right of the reconstructed ACELP frame.
- the MDCT window used for the transition has a size of 2304 samples and the DCT transformation operates on 1152 samples whereas normally FD mode frames are coded with a window size of 2048 samples and a DCT transformation of 1024 samples.
- the MDCT transformation of the normal FD mode is not directly usable for the transition window; the encoder must therefore also integrate a modified version of this transformation, which complicates the implementation of the transition for the FD mode.
- the present invention improves the situation.
- the present invention proposes a method for encoding a digital sound signal, comprising the steps of:
- the method is such that a first part of the current frame is coded by a predictive coding that is restricted with respect to the predictive coding of the preceding frame, by reusing at least one parameter of the predictive coding of the preceding frame and coding only the non-reused parameters of this first part of the current frame.
- a transition frame is thus provided.
- the fact that the first part of the current frame is also coded by predictive coding makes it possible to recover folding terms that could not be recovered by transform coding alone, since the transform coding memory for this transition frame is not available, the previous frame not having been transform coded.
- this frame part does not induce additional delay since this first part is at the beginning of the transition frame.
- this type of coding makes it possible to remain with a weighting window size of identical length for the transform coding whether for the coding of the transition frame or for the coding of the other frames coded by transform. The complexity of the coding method is therefore reduced.
- the restricted predictive coding uses a prediction filter copied from the preceding predictive coding frame.
- transform coding is generally selected when the coded segments are quasi-stationary.
- the spectral envelope parameter of the signal can be reused from one frame to another for a duration of a part of the frame, for example, a sub-frame, without having a significant impact on the quality of the coding.
- the use of the prediction filter used for the previous frame does not, therefore, affect the quality of the coding and makes it possible to dispense with additional bits for the transmission of its parameters.
- the restricted predictive coding further uses a decoded value of the pitch and / or its associated gain of the previous predictive coding frame.
- certain predictive coding parameters used for the restricted predictive coding are quantized in differential mode with respect to decoded parameters of the preceding predictive coding frame.
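The idea of reusing some parameters and transmitting the others, possibly differentially, can be sketched as follows. The field names, the choice of which parameters are reused, and the payload layout are hypothetical; the text only requires that at least one parameter of the previous frame be reused.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PredictiveParams:
    """Illustrative subset of decoded predictive-coding parameters."""
    lpc: Tuple[float, ...]  # short-term prediction filter coefficients
    pitch_lag: int          # long-term prediction delay, in samples
    pitch_gain: float


def encode_restricted(previous: PredictiveParams, pitch_lag: int,
                      pitch_gain: float) -> Dict[str, float]:
    """Return only what is transmitted for the first part of the transition frame.

    Assumption of this sketch: the prediction filter is reused (not sent)
    and the pitch lag is coded as a difference from the previous decoded value.
    """
    return {
        "pitch_lag_delta": pitch_lag - previous.pitch_lag,
        "pitch_gain": pitch_gain,
    }


def decode_restricted(previous: PredictiveParams,
                      payload: Dict[str, float]) -> PredictiveParams:
    """Rebuild the parameters from the previous frame and the received payload."""
    return PredictiveParams(
        lpc=previous.lpc,  # copied from the previous predictive frame
        pitch_lag=previous.pitch_lag + payload["pitch_lag_delta"],
        pitch_gain=payload["pitch_gain"],
    )
```

Because the filter coefficients never appear in the payload, no bits are spent on them for the transition part.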
- the method comprises a step of obtaining the reconstructed signals resulting from local predictive and transform coding and decoding of the first sub-frame of the current frame, and of combining these reconstructed signals by cross-fading.
- the coding transition in the current frame is smooth and does not induce troublesome artifacts.
- said crossfade of the reconstructed signals is performed on a portion of the first part of the current frame as a function of the shape of the transform coding weighting window.
- said crossfade of the reconstructed signals is performed on a portion of the first part of the current frame, said portion not containing temporal folding.
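A crossfade between the two reconstructed signals can be sketched as below. The linear weights are an assumption for illustration; the text ties the fade region to the shape of the transform weighting window without giving the weights in this passage.

```python
def crossfade(fade_out, fade_in):
    """Linearly crossfade two equal-length segments.

    fade_out: e.g. the signal from restricted predictive decoding.
    fade_in:  e.g. the signal from inverse transform decoding, taken on a
              portion without temporal folding.
    """
    n = len(fade_out)
    if n != len(fade_in):
        raise ValueError("segments must have the same length")
    if n < 2:
        raise ValueError("need at least two samples to crossfade")
    mixed = []
    for i in range(n):
        g = i / (n - 1)  # weight of the incoming signal, rising from 0 to 1
        mixed.append((1.0 - g) * fade_out[i] + g * fade_in[i])
    return mixed
```

The two weights sum to one at every sample, so a signal that is identical in both inputs passes through unchanged.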
- the transform coding uses a weighting window comprising a chosen number of successive weighting coefficients of zero value at the end and at the beginning of the window.
- the transform coding uses an asymmetric weighting window comprising a chosen number of successive weighting coefficients of zero value in at least one end of the window.
- the present invention also relates to a method for decoding a digital sound signal, comprising the steps of:
- the method is such that it further comprises a step of decoding by a predictive decoding restricted with respect to the predictive decoding of the previous frame of a first part of the current frame.
- the decoding method is the counterpart of the coding method and provides the same advantages as those described for the coding method.
- the decoding method comprises a step of combining by crossfading the decoded signals by inverse transform and by restricted predictive decoding for at least a portion of the first part of the current frame received and encoded according to a restricted predictive coding, reusing at least one parameter of the predictive decoding of the previous frame and decoding only the parameters received for this first part of the current frame.
- the restricted predictive decoding uses a prediction filter decoded and used by the predictive decoding of the previous frame.
- the restricted predictive decoding furthermore uses a decoded value of the pitch and / or its associated gain of the predictive decoding of the preceding frame.
- the present invention also relates to a digital sound signal encoder, comprising:
- a predictive coding module for coding a previous frame of samples of the digital signal
- the encoder further comprises a predictive coding module which is restricted in relation to the predictive coding of the preceding frame, for coding a first part of the current frame by reusing at least one parameter of the predictive coding of the previous frame and coding only the non-reused parameters of this first part of the current frame.
- the invention relates to a digital sound signal decoder, comprising:
- a predictive decoding module for decoding a previous frame of samples of the digital signal received and coded according to a predictive coding
- an inverse transform decoding module for decoding a current frame of samples of the digital signal received and coded according to transform coding.
- the decoder is such that it further comprises a predictive decoding module which is restricted compared with the predictive decoding of the preceding frame in order to decode a first part of the current frame received and coded according to a restricted predictive coding, by reusing at least one parameter of the predictive decoding of the previous frame and decoding only the parameters received for this first part of the current frame.
- the invention relates to a computer program comprising code instructions for implementing the steps of the encoding method as described above and / or decoding as described above, when these instructions are executed by a processor.
- the invention also relates to a storage means, readable by a processor, integrated or not integrated with the encoder or the decoder, possibly removable, storing a computer program implementing an encoding method and/or a decoding method as described previously.
- FIG. 1 illustrates an exemplary transition window of the state of the art for the transition between the CELP coding and the FD coding of the MPEG codec USAC, previously described;
- FIG. 2 illustrates in the form of a block diagram, an encoder and a coding method according to one embodiment of the invention
- FIG. 3a illustrates an example of a weighting window used for the transform coding of the invention
- FIG. 3b illustrates the method of transform coding used by the invention
- FIG. 4a illustrates the transition between a coded frame with a predictive coding and a transform coded frame according to an embodiment of the method of the invention
- FIGS. 4b, 4c and 4d illustrate the transition between a coded frame with a predictive coding and a transform coded frame according to two variant embodiments of the method of the invention
- FIG. 4e illustrates the transition between a coded frame with a predictive coding and a transform coded frame according to one of the variant embodiments of the method of the invention for the case where the MDCT transform uses asymmetric windows;
- FIG. 5 illustrates a decoder and a decoding method according to one embodiment of the invention
- FIGS. 6a and 6b illustrate in flowchart form the main steps of the coding method, respectively of the decoding method according to the invention.
- FIG. 7 illustrates a possible embodiment of a hardware encoder and a decoder according to the invention.
- FIG. 2 represents a multi-mode encoder CELP / MDCT in which the coding method according to the invention is implemented.
- This figure shows the coding steps performed for each signal frame.
- the input signal noted ⁇ ( ⁇ ')
- the frame length is 20 ms.
- the invention generalizes to cases where other sampling frequencies are used, for example for super-wideband signals sampled at 32 kHz, possibly with a split into two subbands in order to apply the invention in the low band.
- the frame length is here chosen to correspond to that of the mobile encoders such as 3GPP AMR and AMR-WB, however other lengths are also possible (example: 10 ms).
- This input signal is first filtered by a high-pass filter (block 200), in order to attenuate frequencies below 50 Hz and eliminate the DC component, and then downsampled to the internal frequency of 12.8 kHz (block 201) to obtain a frame of the signal, s (n) of 256 samples.
- the decimation filter (block 201) is produced at a low delay by means of a finite impulse response filter (typically of order 60).
- the current frame, s (n) of 256 samples, is encoded according to the preferred embodiment of the invention by a CELP coder inspired by the multi-rate ACELP coding (from 6.6 to 23.05 kbit / s) at 12.8 kHz described in 3GPP TS 26.190 or, equivalently, ITU-T G.722.2 - this algorithm is called AMR-WB (for "Adaptive MultiRate - WideBand").
- AMR-WB for "Adaptive MultiRate - WideBand"
- the successive frames of 20 ms contain 256 temporal samples at 12.8 kHz.
- the CELP coding (block 211) comprises several steps implemented in a manner similar to the ACELP coding of the AMR-WB standard; the main steps are given here as an example of embodiment:
- the CELP coder divides each 20 ms frame into 4 subframes of 5 ms, and the quantized LPC filter corresponds to the last (fourth) subframe.
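The frame and subframe sizes quoted above follow directly from the durations and the 12.8 kHz internal sampling frequency. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def samples_in(duration_ms: int, sample_rate_hz: int) -> int:
    """Number of samples in a segment of the given duration."""
    total = duration_ms * sample_rate_hz
    if total % 1000 != 0:
        raise ValueError("duration does not map to a whole number of samples")
    return total // 1000


INTERNAL_RATE_HZ = 12_800  # internal sampling frequency stated in the text

frame_len = samples_in(20, INTERNAL_RATE_HZ)    # 20 ms frame
subframe_len = samples_in(5, INTERNAL_RATE_HZ)  # 5 ms subframe
subframes_per_frame = frame_len // subframe_len
```

With these values a frame holds 256 samples and each of the 4 subframes holds 64.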
- This signal is finally de-emphasized (block 212) by the filter with transfer function $1/(1 - \alpha z^{-1})$ to obtain the decoded CELP signal
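Assuming the de-emphasis filter has the first-order transfer function $1/(1 - \alpha z^{-1})$, it reduces to the recursion y[n] = x[n] + α·y[n−1]. The sketch below illustrates this; the default value 0.68 is the pre-emphasis factor used in AMR-WB and is an assumption here, since the text does not state α.

```python
def deemphasize(signal, alpha=0.68, state=0.0):
    """Apply y[n] = x[n] + alpha * y[n-1].

    `state` is y[-1], the last output sample of the previous frame, so the
    filter can run continuously across frame boundaries.
    Returns the filtered frame and the updated state.
    """
    out = []
    prev = state
    for x in signal:
        prev = x + alpha * prev
        out.append(prev)
    return out, prev
```

Carrying the returned state into the next call keeps the filter memory consistent from frame to frame, which matters when the coding mode changes between frames.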
- the block 211 corresponds to the CELP coding at 8 kbit / s described in the ITU-T G.718 standard according to one of the four possible CELP coding modes: unvoiced mode (UC), voiced mode (VC), transition mode (TC) or generic mode (GC).
- UC unvoiced mode
- VC voiced mode
- TC transition mode
- GC generic mode
- another embodiment of the CELP coding is chosen, for example the ACELP coding in the mode interoperable with the AMR-WB coding of the ITU-T G.718 standard.
- the representation of the LPC coefficients in the form of ISF can be replaced by line spectral frequencies (LSF) or other equivalent representations.
- the block 211 delivers the coded CELP indices I_CELP to be multiplexed in the bitstream.
- the window w (n) is chosen in the preferred embodiment as a "low delay" window symmetrical with the form:
- This window applies to the current frame of 20 ms as well as to a future signal
- the MDCT coding is thus synchronized with the CELP coding insofar as the MDCT decoder can reconstruct the entire current frame by overlap-add, thanks to the overlap on the left and the intermediate "flat" part of the MDCT window, and it also has an overlap on the first 5 ms of the future frame. Note that, for this window, the current MDCT frame induces a temporal folding on the first part of the frame (in fact on the first 5 ms), where the overlap takes place.
- B_tot is the total bit budget allocated in each frame to the MDCT coding.
- the discrete spectrum S (k) is divided into subbands; then a spectral envelope, corresponding to the rms (for "root mean square", that is to say the square root of the average energy) per subband, is quantized in the logarithmic domain in steps of 3 dB and coded by entropy coding.
- the bit budget used by this envelope coding is noted here B_env; it is variable because of the entropy coding.
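The envelope computation described above can be sketched as follows: an rms value per subband, converted to decibels and rounded to the nearest 3 dB step. The band edges, the rounding rule and the level floor are assumptions of this sketch, and the entropy coding of the resulting indices is not shown.

```python
import math


def rms(band):
    """Root mean square: square root of the average energy of the band."""
    return math.sqrt(sum(x * x for x in band) / len(band))


def quantize_envelope(spectrum, band_edges, step_db=3.0, floor=1e-10):
    """Return one integer index per subband, in steps of `step_db` decibels.

    band_edges lists subband boundaries, e.g. [0, 4, 8] for two bands.
    `floor` avoids taking the logarithm of zero for an empty band.
    """
    indices = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        level_db = 20.0 * math.log10(max(rms(spectrum[lo:hi]), floor))
        indices.append(round(level_db / step_db))
    return indices


def dequantize_envelope(indices, step_db=3.0):
    """Map indices back to linear rms values."""
    return [10.0 ** (i * step_db / 20.0) for i in indices]
```

Since the indices are then entropy coded, the number of bits they consume varies from frame to frame.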
- a predetermined number of bits, denoted B_inj (a function of the budget B_tot), is reserved for the coding of noise injection levels, in order to "fill" with noise the coefficients coded at a value of zero and to hide "musical noise" artifacts that would otherwise be audible.
- the subbands of the spectrum S (k) are encoded by spherical vector quantization with the remaining budget of B_tot - B_env - B_inj bits. This quantization is not detailed, nor is the adaptive allocation of the bits by subband, since these details go beyond the scope of the invention.
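The budget left for the vector quantization is a simple subtraction, sketched below. The function name and the numeric values in the usage note are illustrative only; the text gives no figures for these budgets.

```python
def vq_bit_budget(b_tot: int, b_env: int, b_inj: int) -> int:
    """Bits left for quantizing the subbands: B_tot - B_env - B_inj."""
    remaining = b_tot - b_env - b_inj
    if remaining < 0:
        raise ValueError("envelope and noise-injection bits exceed the frame budget")
    return remaining
```

For example, with a hypothetical frame budget of 160 bits, 35 bits of envelope and 5 bits of noise injection, 120 bits remain. Because B_env varies with the entropy coding, this remainder has to be recomputed for every frame.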
- the block 221 delivers the coded MDCT indices I_MDCT to be multiplexed in the bitstream.
- the previous frame was coded by an MDCT mode.
- the memory (or states) necessary for the MDCT synthesis in the local (and remote) decoder is available, and the overlap-add operation used by the MDCT to cancel the temporal folding is possible.
- the MDCT frame is correctly decoded throughout the frame. This is the "normal" operation of the MDCT coding / decoding.
- the previous frame was coded by a CELP mode.
- the reconstruction of the frame at the decoder (local and remote) is not complete.
- the MDCT uses for the reconstruction an overlap-add operation between the current frame and the previous frame (with states stored in memory) to eliminate the temporal folding of the frame to be decoded, to avoid blocking effects, and to increase the frequency resolution by using windows longer than a frame.
- MDCT windows of sinusoidal type
- signal distortion due to time folding is stronger at the end of the window and near zero in the middle of the window.
- when the previous frame is of CELP type, the MDCT memory is not available because the last frame was not coded by MDCT transform.
- the folded area of the start of the frame corresponds to the area of the signal in the MDCT frame which is disturbed by the time folding inherent in the MDCT transformation.
- when the current frame is coded by the MDCT mode (blocks 220 to 223) and the previous frame has been coded by the CELP mode (blocks 210 to 212), a specific transition processing from CELP to MDCT is necessary.
- the first frame is coded by the CELP mode and can be completely reconstructed by the CELP decoder (local or remote).
- the second frame is coded by the MDCT mode; this second frame is considered to be the current frame.
- the overlap area on the left side of the MDCT window is problematic because the complementary part (with time folding) of this window is not available, since the previous frame has not been coded by MDCT. The folding in this left part of the MDCT window cannot be canceled.
- the coding method comprises a step of encoding a block of samples of length less than or equal to the length of the frame, chosen for example as an additional subframe of 5 ms, in the transform coded current frame (MDCT), representing the left folding area of the current frame, by a transition predictive coder or restricted predictive coding.
- MDCT transform coded current frame
- the type of coding in the frame preceding the transition MDCT frame could be another type of coding than the CELP coding, for example an ADPCM or a TCX coding.
- the invention applies in the general case where the previous frame has been coded by a coding that does not update the MDCT memories in the signal domain, and the invention involves coding a block of samples corresponding to a portion of the current frame by a transition coding using the information of the coding of the previous frame.
- the transition predictive coding is restricted compared to the predictive coding of the previous frame; it consists in using the stable parameters of the previous frame coded by a predictive coding and coding only a few minimal parameters for the additional subframe in the current transition frame.
- this restricted predictive coding reuses at least one parameter of the predictive coding of the preceding frame and therefore encodes only the parameters that are not reused. In this sense, one can speak of a restricted coding (by the restriction of the coded parameters).
- the dash-dot lines (alternating dots and dashes) correspond to the folding lines of the MDCT coding and to the unfolding lines of the MDCT decoding.
- the bold lines separate the frames arriving at the encoder; the encoding of a new frame can start once a frame thus defined is entirely available. It is important to note that these bold lines at the encoder do not correspond to the current frame but to the block of new samples arriving for each frame; the current frame is in fact delayed by 5 ms. At the bottom, the bold lines separate the decoded frames at the output of the decoder.
- the specific processing of the transition frame corresponds to blocks 230 to 232 and to block 240 of FIG. 2. This processing is performed when the previous mode, noted pre mode, that is to say the type of coding of the previous frame (CELP or MDCT), is of CELP type.
- the coding of the current transition frame between CELP coding and MDCT is based on several steps implemented by block 231:
- the window chosen for this coding is the window w(n) defined above, with an effective length of 25 ms.
- other window shapes that can replace w(n) in the transition MDCT frame (the first MDCT frame following a CELP frame) are illustrated in FIGS. 4b, 4c, 4d and 4e, with the same effective length, which may differ from 25 ms.
- the 20 ms of the current frame are placed at the beginning of the non-zero portion of the window, while the remaining 5 ms are the first 5 ms of the future frame ("lookahead").
- this restricted predictive coding comprises the following steps.
- the filter A(z) of the first subframe is, for example, obtained by copying the filter A(z) of the fourth subframe of the previous frame. This avoids computing this filter and saves the bits that its coding would otherwise take in the bit stream.
- the MDCT mode is generally selected in quasi-stationary segments where the coding in the frequency domain is more efficient than in the time domain.
- since this stationarity is normally already established, it can be assumed that some parameters, such as the spectral envelope, evolve very little from frame to frame.
- the quantized synthesis filter 1/A(z) transmitted during the previous frame, representing the spectral envelope of the signal, can be reused effectively.
- a bit is allocated to indicate whether or not the adaptive excitation v(n) was filtered by a low-pass filter with coefficients (0.18, 0.64, 0.18). However, the value of this bit could instead be taken from the last CELP frame.
- the search for the algebraic excitation of the sub-frame is performed in a closed loop only for this transition sub-frame, and the positions and signs of the excitation pulses are coded in the bit stream, again with a number of bits depending on the encoder bit rate.
- the gains g_p and g_c, associated respectively with the adaptive and algebraic excitation, are coded in the bit stream. The number of bits allocated to this coding depends on the encoder bit rate.
- block 231 also provides the parameters of the restricted predictive coding, I_TR, to be multiplexed in the bit stream. It is important to note that block 231 uses information, noted Mem. in the figure, from the coding (block 211) carried out in the frame preceding the transition frame. For example, the information includes the LPC and pitch parameters of the last subframe.
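The parameter reuse described above can be sketched as follows. This is only an illustration under stated assumptions, not the codec's actual data layout: the names `CelpSubframe`, `TransitionIndices` and `transition_subframe` are hypothetical, and the gains are shown as plain floats rather than quantization indices.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CelpSubframe:
    """Decoded parameters of one CELP subframe (hypothetical container)."""
    lpc: List[float]               # coefficients of A(z)
    pitch_lag: int                 # delay of the adaptive excitation
    lowpass_flag: bool             # True if v(n) was low-pass filtered
    pulses: List[Tuple[int, int]]  # (position, sign) of algebraic pulses
    g_p: float                     # adaptive-excitation gain
    g_c: float                     # algebraic-excitation gain


@dataclass
class TransitionIndices:
    """Assumed content of I_TR: only the parameters that are not reused."""
    pitch_lag: int
    lowpass_flag: bool
    pulses: List[Tuple[int, int]]
    g_p: float
    g_c: float


def transition_subframe(mem: CelpSubframe, i_tr: TransitionIndices) -> CelpSubframe:
    """Build the parameter set of the 5 ms transition subframe.

    A(z) is copied from `mem`, the last subframe of the previous CELP
    frame, so it costs no bits; the rest comes from the bit stream.
    """
    return CelpSubframe(
        lpc=list(mem.lpc),
        pitch_lag=i_tr.pitch_lag,
        lowpass_flag=i_tr.lowpass_flag,
        pulses=list(i_tr.pulses),
        g_p=i_tr.g_p,
        g_c=i_tr.g_c,
    )
```

The same function could serve at the encoder (block 231) and at the decoder, since both sides hold the memory of the previous CELP frame.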
- this cross-fade is performed on the first 5 ms in the following manner as illustrated in FIG. 4a:
- the cross-fade between the two signals here lasts 5 ms, but it can be shorter.
- if the CELP coder and the MDCT coder have perfect or almost perfect reconstruction, the cross-fade can even be omitted: the first 5 ms of the frame are coded perfectly (by the restricted CELP), and the next 15 ms are also coded perfectly (by the MDCT coder). Artifact attenuation by cross-fading is then theoretically no longer necessary. In this case, the signal s_MDCT(n) is written more simply:
- the window is replaced by a window, identical for analysis and synthesis, with a rectangular shape without folding on the left
- w(n) is not specified here for n < 0 and n > 255: for n < 0 the value of w(n) is zero, and for n > 255 the windows are determined by the MDCT analysis and synthesis windows used for "normal" MDCT coding.
- the crossfade in FIG. 4b is performed as follows:
- the window is replaced by a window, identical for analysis and synthesis, whose shape comprises a first part of zero value over 1.25 ms, then a sinusoidal rising edge over 2.5 ms, and a plateau of unit value over 1.25 ms:
- w(n) is not specified here for n < 0 and n > 255: for n < 0 the value of w(n) is zero, and for n > 255 the windows are determined by the MDCT analysis and synthesis windows used for "normal" MDCT coding.
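The window shape described above (zero for 1.25 ms, sinusoidal rise over 2.5 ms, unit plateau for 1.25 ms) can be sketched numerically. The 12.8 kHz sampling rate is taken from the text (20 ms frames of 256 samples); the function name `left_transition_window` and the exact phase of the sine samples are assumptions, since the source formula is not available.

```python
import math
from typing import List


def left_transition_window(fs_hz: int = 12800) -> List[float]:
    """Left 5 ms of a transition window: zeros, sine rise, unit plateau."""
    n_zero = fs_hz * 1250 // 1_000_000   # 1.25 ms of zeros
    n_rise = fs_hz * 2500 // 1_000_000   # 2.5 ms sinusoidal rising edge
    n_flat = fs_hz * 1250 // 1_000_000   # 1.25 ms plateau at 1
    # Quarter sine cycle sampled at half-sample offsets (assumed phase).
    rise = [math.sin(0.5 * math.pi * (k + 0.5) / n_rise) for k in range(n_rise)]
    return [0.0] * n_zero + rise + [1.0] * n_flat
```

At 12.8 kHz this gives 16 zero samples, 32 rising samples and 16 unit samples, that is the first 64 samples of the 256-sample frame.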
- the cross-fade in FIG. 4c is carried out as follows:
- for n < 0 and n > 255, the values are determined by the MDCT analysis and synthesis windows used for "normal" MDCT coding.
- [equation garbled in the source; the recoverable terms are a cosine weight applied to s_TR(n), added to s_MDCT(n), for n = 32, ..., 63]
- the cross-fades of Figures 4b to 4d could also be used in the configuration of Figure 4a.
- the advantage of doing so is that the cross-fade is performed on the part of the MDCT-decoded signal where the folding error is weakest. The structure shown in Figure 4a then comes closer to perfect reconstruction.
- the encoder operates with a closed-loop mode decision.
- the operation of the closed-loop decision (block 254) is not described in greater detail.
- the decision of block 254 is encoded (I_SEL) and multiplexed in the bit stream.
- the multiplexer 260 combines the coded decision I_SEL and the different bits coming from the coding modules in the bit stream bst as a function of the decision of the module 254:
- for a CELP frame the bits I_CELP are sent, for a purely MDCT frame the bits I_MDCT, and for a CELP-to-MDCT transition frame the bits I_TR and I_MDCT. It should be noted that the mode decision could also be made in open loop or specified externally to the encoder, without changing the nature of the invention.
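The selection made by the multiplexer can be illustrated with a short sketch. The function name `multiplex`, the representation of bit fields as lists of integers, and the order in which the fields are concatenated are all assumptions; the text only states which fields are sent for each kind of frame.

```python
from typing import Dict, List

Bits = List[int]


def multiplex(prev_mode: str, cur_mode: str, fields: Dict[str, Bits]) -> Bits:
    """Concatenate the fields sent for one frame (field order is assumed)."""
    bst = list(fields["I_SEL"])                 # mode decision, always sent
    if cur_mode == "CELP":
        bst += fields["I_CELP"]
    elif prev_mode == "CELP":                   # CELP-to-MDCT transition frame
        bst += fields["I_TR"] + fields["I_MDCT"]
    else:                                       # purely MDCT frame
        bst += fields["I_MDCT"]
    return bst
```

Note that the transition frame is recognised from the previous mode, so no dedicated frame type needs to be signalled in this sketch.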
- the decoder according to one embodiment of the invention is illustrated in FIG. 5.
- the demultiplexer (block 511) receives the bit stream bst and first extracts the mode index I_SEL. This index controls the operation of the decoder modules and the switch 509. If the I_SEL index indicates a CELP frame, the CELP decoder 501 is activated and decodes the CELP indices I_CELP. The reconstructed signal s_CELP(n) by the decoder
- the indices I_TR are also decoded by the module 505. It is important to note that block 505 uses information, noted Mem. in the figure, from the decoding (block 501) carried out in the frame preceding the transition frame. For example, the information includes the LPC and pitch parameters of the last subframe.
- the decoder reuses at least one predictive decoding parameter of the previous frame to decode a first portion of the transition frame. It also uses the parameters received for this first part, which are only those that are not reused.
- This processing (blocks 505 to 507) is performed when the previous mode, noted pre mode, that is to say the type of decoding of the previous frame (CELP or MDCT), is of CELP type.
- in a transition frame, the signals s_TR(n) and s_MDCT(n) are combined by block 507; typically a cross-fading operation, as previously described for the encoder implementing the invention, is carried out in the first part of the frame to obtain the signal s_MDCT(n).
- the reconstructed signal x(n) at 16 kHz is obtained by oversampling from 12.8 kHz to 16 kHz (block 510). This change of rate is considered to be performed using a polyphase finite impulse response filter (order 60).
- the samples corresponding to the first subframe of the current frame coded by transform coding are coded by a restricted predictive coder, either at the expense of the bits available to the transform coding (constant bit rate case) or by increasing the transmitted bit rate (variable bit rate case).
- the folded area is used only for cross-fading, which ensures a smooth and seamless transition between CELP reconstruction and MDCT reconstruction.
- this cross-fade can be performed on the second part of the folded area, where the folding effect is weaker.
- in the variant illustrated in FIG. 4a, increasing the bit rate does not converge towards perfect reconstruction, because part of the signal used for the cross-fade is disturbed by the time folding.
- this variant cannot be transparent, even though at low bit rate this disturbance is quite acceptable and generally almost inaudible compared with the intrinsic degradation of low-bit-rate coding.
- an MDCT transformation can be used without folding on the left, with a rectangular window starting in the middle of the subframe, on the folding line.
- during the first 2.5 ms the output is identical to the decoded signal of the restricted predictive coding; the transition is then made during the following 2.5 ms by gradually decreasing the weight of the CELP component and increasing the weight of the MDCT component, depending on the exact definition of the MDCT window.
- the transition is therefore made using the MDCT decoded signal without folding.
- rectangular windowing can cause block effects in the presence of MDCT coding noise.
- FIG. 4c illustrates another variant where the rising portion of the window (with time folding) on the left is shortened (for example to 2.5 ms), so that the first 5 ms of the signal reconstructed by the MDCT mode contain a portion (1.25 ms) without folding on the right of this first 5 ms subframe.
- the "flat" part (i.e. constant value of 1, without folding) of the MDCT window is thus extended to the left, into the sub-frame encoded by the restricted predictive coding, compared with the configuration of FIG. 4a.
- the cross-fade of these reconstructed signals is performed on the part of the window where the reconstructed signal resulting from the transform coding of the first part of the current frame has no time folding.
- the advantage of this variant over that illustrated in FIG. 4b is the better spectral property of the window used and the reduction of block effects, since there is no rectangular part.
- the variant of Figure 4b is an extreme case of the variant of Figure 4c where the rising portion of the window (with time folding) to the left is shortened to 0.
- the length of the rising part of the window (with time folding) on the left depends on the bit rate: for example, it is shortened as the bit rate increases.
- the weights of the crossfade used in this case can be adapted to the chosen window.
- low-delay MDCT windows have been represented; these include a chosen number of successive zero-valued weighting coefficients at the end and at the beginning of the window.
- the invention is also applicable for the case where conventional MDCT (sinusoidal) weighting windows are used.
- the cross-fade has been presented in the examples given above with linear weights. Other weight functions can obviously also be used, for example the rising edge of a sinusoidal function. In general, the weight of the other component is always chosen so that the sum of the two weights is equal to one.
- the cross-fade weights of the MDCT component can be integrated in the MDCT synthesis weighting window of the transition frame for all the variants presented, by multiplying the MDCT synthesis weighting window by the cross-fade weights, which reduces the computational complexity.
- the transition between the restricted predictive coding component and the transform coding component is then done by adding, on the one hand, the predictive coding component multiplied by the cross-fade weights and, on the other hand, the transform coding component thus obtained, without additional weighting by the weights.
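A minimal sketch of such a cross-fade is given here, assuming that the two reconstructed signals are available as lists of samples. The function names, the exact form of the linear and sinusoidal weights, and the `mdct_preweighted` flag (which models weights already folded into the synthesis window) are illustrative assumptions, not the formulas of the source.

```python
import math
from typing import List


def linear_weights(length: int) -> List[float]:
    """MDCT-component weights rising linearly from 0 towards 1 (assumed form)."""
    return [k / length for k in range(length)]


def sine_weights(length: int) -> List[float]:
    """Alternative: rising edge of a sinusoidal function."""
    return [math.sin(0.5 * math.pi * k / length) for k in range(length)]


def cross_fade(s_tr: List[float], s_mdct: List[float],
               w_mdct: List[float], mdct_preweighted: bool = False) -> List[float]:
    """Combine the two reconstructions over the first len(w_mdct) samples.

    The predictive component always gets the complementary weight 1 - w.
    With mdct_preweighted=True the MDCT samples are assumed to have been
    multiplied by w already (weights folded into the synthesis window).
    """
    out = list(s_mdct)
    for n, w in enumerate(w_mdct):
        mdct_part = s_mdct[n] if mdct_preweighted else w * s_mdct[n]
        out[n] = (1.0 - w) * s_tr[n] + mdct_part
    return out
```

Both call paths give the same output; the pre-weighted path simply skips one multiplication per sample, which is the complexity saving mentioned in the text.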
- the integration of the weights of the crossfade can be done in the analysis weighting window. This can advantageously be done in the variant of FIG. 4b since the cross-fade zone is entirely in the non-folding portion of the frame and the initial analysis weighting window was zero for the samples preceding the zone of folding.
- the rising portion of the analysis / transition synthesis weighting window is in the non-folding zone (after the folding line).
- this rising part is here defined as a quarter of a sinusoidal cycle, so that the combined effect of the analysis and synthesis windows implicitly gives cross-fade weights in the form of a sine squared.
- This rising part is used for both MDCT windowing and cross fading.
- the cross-fade weights for the restricted predictive coding component are complementary to the rising portion of the combined analysis/synthesis weighting windows, so that the sum of the two weights always gives 1 over the area where the cross-fade is performed.
- the cross-fade weights for the restricted predictive coding component are therefore in the form of a cosine squared (1 minus sine squared).
- the weights of the crossfade are integrated into both the analysis and synthesis weighting window of the transition frame.
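The complementarity of the sine-squared and cosine-squared weights can be checked numerically. In this sketch the function name `implicit_cross_fade_weights`, the half-sample phase of the quarter sine cycle and the length of 32 samples (2.5 ms at 12.8 kHz) used in the usage note are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def implicit_cross_fade_weights(length: int) -> Tuple[List[float], List[float]]:
    """Return (MDCT weights, predictive weights) over the rising part."""
    # Quarter sine cycle used in both analysis and synthesis (assumed phase).
    rise = [math.sin(0.5 * math.pi * (k + 0.5) / length) for k in range(length)]
    w_mdct = [r * r for r in rise]                 # analysis x synthesis = sin^2
    w_tr = [math.cos(0.5 * math.pi * (k + 0.5) / length) ** 2
            for k in range(length)]                # complement = cos^2
    return w_mdct, w_tr
```

Calling `implicit_cross_fade_weights(32)` returns two lists whose sample-by-sample sum is 1 up to rounding, so the MDCT component needs no explicit cross-fade multiplication.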
- the variant illustrated in FIG. 4d makes it possible to achieve perfect reconstruction at high bit rate because the cross-fade is performed on an area without time folding.
- the invention is also applicable to the case where MDCT windows are asymmetrical and in case the MDCT analysis and synthesis windows are not identical as in the ITU-T G.718 standard.
- such an example is given in Figure 4e.
- the left side of the transition MDCT window (in bold lines in the figure) and the weights of the crossfade are identical to those of FIG. 4d.
- the window and the crossfade corresponding to the other embodiments already presented could also be used in the left part of the transition window.
- the right part of the transition analysis window is identical to the right part of the MDCT analysis window normally used and, at the decoder, the right part of the transition MDCT synthesis window is identical to the right part of the MDCT synthesis window normally used.
- for the left side of the transition MDCT weighting window, the left part of one of the MDCT transition windows already presented in FIGS. 4a to 4d is used (in the example of FIG. 4e, that of FIG. 4d).
- the weights of the crossfade are chosen according to the window used, as detailed in the embodiments of the invention described above (for example in FIGS. 4a to 4d).
- the left half of the MDCT analysis weighting window used is chosen such that the right part of the zone corresponding to this half of the window comprises no time folding (for example according to one of the examples of FIGS. 4a to 4e) and the left half of the corresponding MDCT synthesis weighting window is chosen in such a way that after the combined effect of the analysis and synthesis windows this zone without folding has a weight 1 at least on the right side (without any attenuation).
- Figures 4a to 4e show examples of analysis and synthesis window pairs that verify these criteria.
- the left half of the transition MDCT weighting window is identical for analysis and synthesis, but this is not necessarily the case in all embodiments of the invention.
- the shape of the synthesis window in the area where the weight of the MDCT component in the cross-fade is zero is not important because these samples will not be used; it need not even be calculated.
- the contribution of the analysis and synthesis windows to the cross-fade weights can also be distributed unequally, which would give different analysis and synthesis windows in the left half of the transition MDCT weighting window.
- as for the right half of the transition analysis and synthesis windows, they are identical to those of the MDCT weighting windows normally used in areas coded only by transform coding.
- the cross-fade between the signal reconstructed by the restricted predictive decoder and the signal reconstructed by the transform decoder must be performed on a zone without temporal folding.
- the combined effect of the analysis and synthesis windows can implicitly integrate the weights of the cross-fade of the component reconstructed by the transform decoder.
- the MDCT mode is generally selected in quasi-stationary segments where the frequency domain coding is more efficient than in the time domain.
- the mode decision is taken in open loop or driven externally to the encoder, without guarantee that the hypothesis of stationarity is verified.
- the quantized synthesis filter 1 / A (z) transmitted during the previous frame, representing the spectral envelope of the signal can be reused in order to save bits for the MDCT coding.
- the last synthesis filter transmitted in the CELP mode (the closest to the signal to be coded) is used.
- the information used to code the signal in the transition frame is: the pitch (associated with the long-term excitation), the excitation vector (or innovation), and the gain(s) associated with the excitation.
- the decoded value of the pitch and/or of its associated gain in the last sub-frame can also be reused, because these parameters also change slowly in stationary zones. This further reduces the amount of information to be transmitted during a transition from CELP to MDCT.
- one of the desired properties of the transition from CELP to MDCT is that, at asymptotically high bit rate, when the CELP and MDCT coders have almost perfect reconstruction, the coding performed in the transition frame (MDCT frame following a CELP frame) must itself have almost perfect reconstruction.
- the variants illustrated in FIGS. 4b and 4c ensure almost perfect reconstruction at very high bit rate.
- the number of bits allocated to these parameters of the restricted predictive coding may be variable and proportional to the total bit rate.
- the MDCT coding principle is modified so that no time folding on the left is used in the MDCT window of the transition frame.
- This variant involves using a modified version of the DCT transformation at the heart of the MDCT transformation because the length of the folded signal is different, since the time folding (reducing the size of the block) is only performed on the right.
- the invention is described in FIGS. 4a to 4d for the simplified case of identical MDCT analysis and synthesis windows in each frame (except the transition frame) coded by the MDCT mode.
- the MDCT window may be asymmetrical as shown in Figure 4e.
- the MDCT coding may use window switching between at least one "long" window of typically 20-40 ms and a series of short windows of typically 5-10 ms.
- the invention provides for the transmission of at least one bit to indicate a transition mode different from the method described above, in order to keep more CELP parameters and/or CELP subframes to be coded in the transition frame from CELP to MDCT.
- a first bit may indicate whether, in the remainder of the bit stream, the LPC filter is coded or whether the last received version may be used at the decoder, and another bit may signal the same for the pitch value. Where the encoding of a parameter is deemed necessary, it can be done differentially with respect to the value transmitted in the last frame.
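The signalling just described (one flag per parameter, followed by a differential value only when the parameter is actually coded) can be sketched as below. The functions `encode_optional` and `decode_optional` are hypothetical, parameters are represented by integer quantization indices, and the stream is modelled as a list of fields rather than packed bits.

```python
from typing import Iterator, List


def encode_optional(value: int, last_value: int) -> List[int]:
    """Return [0] to reuse the last value, else [1, difference]."""
    if value == last_value:
        return [0]                       # flag only: decoder reuses last value
    return [1, value - last_value]       # flag + differential update


def decode_optional(fields: Iterator[int], last_value: int) -> int:
    """Inverse of encode_optional, reading from an iterator of fields."""
    if next(fields) == 0:
        return last_value
    return last_value + next(fields)
```

For example, an unchanged LPC index followed by a pitch index moving from 50 to 52 would be written as the fields `[0]` and `[1, 2]`.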
- the coding method according to the invention can be illustrated in flowchart form as shown in FIG. 6a.
- the current frame is a transition frame between predictive coding and transform coding.
- in step E602, a restricted predictive coding is applied to a first part of the current frame. This predictive coding is restricted compared to the predictive coding used for the previous frame.
- the MDCT coding of the current frame is performed in step E603, in parallel for the entire current frame. At the end of this transform coding step, the signal s MDCT (n) is obtained.
- the method comprises a step of cross-fading in step E604, after reconstruction of the signals, making it possible to perform a smooth transition between the predictive coding and the transform coding in the transition frame.
- a reconstructed signal s_MDCT(n) is obtained.
- when, at decoding, a previous frame has been decoded according to a predictive decoding method and the current frame is to be decoded according to a transform decoding method (verification in E605), the decoding method includes a step of decoding a first part of the current frame by restricted predictive decoding, in E606. It also comprises a step of transform decoding of the current frame, in E607.
- a step E608 is then performed, according to the embodiments described above, to combine the decoded signals obtained, respectively s_TR(n) and s_MDCT(n), by cross-fading over all or part of the current frame, and thus obtain the decoded signal s_MDCT(n) of the current frame.
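The decoding steps E605 to E608 can be summarised by a small dispatch routine. This is a sketch with stub decoders passed in as callables; the name `decode_frame`, the string mode labels and the callable signatures are assumptions made for illustration only.

```python
from typing import Callable, List

Signal = List[float]


def decode_frame(prev_mode: str, cur_mode: str,
                 decode_celp: Callable[[], Signal],
                 decode_restricted: Callable[[], Signal],
                 decode_mdct: Callable[[], Signal],
                 cross_fade: Callable[[Signal, Signal], Signal]) -> Signal:
    """Dispatch one frame following steps E605 to E608 (stub decoders)."""
    if cur_mode == "CELP":
        return decode_celp()
    s_mdct = decode_mdct()                    # E607: transform decoding
    if prev_mode == "CELP":                   # E605: transition frame?
        s_tr = decode_restricted()            # E606: restricted predictive decoding
        return cross_fade(s_tr, s_mdct)       # E608: combine the two signals
    return s_mdct
```

Only the transition frame runs both decoders; an MDCT frame that follows another MDCT frame returns the transform-decoded signal directly.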
- the invention has been presented in the specific case of a transition from CELP to MDCT. It is obvious that this invention also applies in the case where the CELP coding is replaced by another type of coding, such as ADPCM, TCX, and where a transition coding on a part of the transition frame is performed using the encoding information of the frame preceding the transition MDCT frame.
- this device DISP comprises an input E for receiving a digital signal SIG which, in the case of the encoder, is an input signal x(n') and, in the case of the decoder, the bit stream bst.
- the device also comprises a digital signal processor PROC adapted to perform coding/decoding operations, in particular on a signal from the input E.
- this processor is connected to one or more memory units MEM adapted to store information necessary for controlling the device for coding/decoding.
- these memory units include instructions for implementing the coding method described above and in particular for implementing the steps of coding a previous frame of samples of the digital signal according to a predictive coding and coding a current frame of samples of the digital signal according to a transform coding, such that a first part of the current frame is coded by a predictive coding restricted with respect to the predictive coding of the preceding frame, when the device is of encoder type.
- when the device is of decoder type, these memory units include instructions for implementing the decoding method described above and in particular for implementing the steps of predictive decoding of a previous frame of samples of the digital signal received and coded according to a predictive coding, inverse transform decoding of a current frame of samples of the digital signal received and coded by transform coding, and further a step of decoding a first part of the current frame by a predictive decoding restricted with respect to the predictive decoding of the previous frame.
- These memory units may also include calculation parameters or other information.
- a storage means readable by a processor, whether or not integrated into the encoder or decoder, possibly removable, stores a computer program implementing an encoding method and / or a decoding method according to the invention.
- Figures 6a and 6b may for example illustrate the algorithm of such a computer program.
- the processor is also adapted to store results in these memory units.
- the device comprises an output S connected to the processor for providing an output signal SIG* which, in the case of the encoder, is a signal in the form of a bit stream bst and, in the case of the decoder, an output signal x(n').
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1061203A FR2969805A1 (fr) | 2010-12-23 | 2010-12-23 | Codage bas retard alternant codage predictif et codage par transformee |
PCT/FR2011/053097 WO2012085451A1 (fr) | 2010-12-23 | 2011-12-20 | Codage de son à bas retard alternant codage prédictif et codage par transformée |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2656343A1 (fr) | 2013-10-30 |
EP2656343B1 (fr) | 2014-11-19 |
Family
ID=44059261
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP11815474.9A Active EP2656343B1 (fr) | 2010-12-23 | 2011-12-20 | Codage de son à bas retard alternant codage prédictif et codage par transformée |
Country Status (10)
Country | Link |
---|---|
US (1) | US9218817B2 (fr) |
EP (1) | EP2656343B1 (fr) |
JP (1) | JP5978227B2 (fr) |
KR (1) | KR101869395B1 (fr) |
CN (1) | CN103384900B (fr) |
BR (1) | BR112013016267B1 (fr) |
ES (1) | ES2529221T3 (fr) |
FR (1) | FR2969805A1 (fr) |
RU (1) | RU2584463C2 (fr) |
WO (1) | WO2012085451A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015197989A1 (fr) | 2014-06-27 | 2015-12-30 | Orange | Ré-échantillonnage par interpolation d'un signal audio pour un codage /décodage à bas retard |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4977157B2 (ja) | 2009-03-06 | 2012-07-18 | 株式会社エヌ・ティ・ティ・ドコモ | 音信号符号化方法、音信号復号方法、符号化装置、復号装置、音信号処理システム、音信号符号化プログラム、及び、音信号復号プログラム |
KR102053900B1 (ko) | 2011-05-13 | 2019-12-09 | 삼성전자주식회사 | 노이즈 필링방법, 오디오 복호화방법 및 장치, 그 기록매체 및 이를 채용하는 멀티미디어 기기 |
CN103548080B (zh) * | 2012-05-11 | 2017-03-08 | 松下电器产业株式会社 | 声音信号混合编码器、声音信号混合解码器、声音信号编码方法以及声音信号解码方法 |
KR101498113B1 (ko) * | 2013-10-23 | 2015-03-04 | 광주과학기술원 | 사운드 신호의 대역폭 확장 장치 및 방법 |
FR3013496A1 (fr) * | 2013-11-15 | 2015-05-22 | Orange | Transition d'un codage/decodage par transformee vers un codage/decodage predictif |
US9489955B2 (en) * | 2014-01-30 | 2016-11-08 | Qualcomm Incorporated | Indicating frame parameter reusability for coding vectors |
US10134403B2 (en) * | 2014-05-16 | 2018-11-20 | Qualcomm Incorporated | Crossfading between higher order ambisonic signals |
EP2980795A1 (fr) | 2014-07-28 | 2016-02-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Codage et décodage audio à l'aide d'un processeur de domaine fréquentiel, processeur de domaine temporel et processeur transversal pour l'initialisation du processeur de domaine temporel |
EP2980796A1 (fr) * | 2014-07-28 | 2016-02-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Procédé et appareil de traitement d'un signal audio, décodeur audio et codeur audio |
EP2980797A1 (fr) | 2014-07-28 | 2016-02-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Décodeur audio, procédé et programme d'ordinateur utilisant une réponse d'entrée zéro afin d'obtenir une transition lisse |
EP2980794A1 (fr) * | 2014-07-28 | 2016-02-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Codeur et décodeur audio utilisant un processeur du domaine fréquentiel et processeur de domaine temporel |
FR3024582A1 (fr) | 2014-07-29 | 2016-02-05 | Orange | Gestion de la perte de trame dans un contexte de transition fd/lpd |
FR3024581A1 (fr) * | 2014-07-29 | 2016-02-05 | Orange | Determination d'un budget de codage d'une trame de transition lpd/fd |
WO2016142002A1 (fr) * | 2015-03-09 | 2016-09-15 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Codeur audio, décodeur audio, procédé de codage de signal audio et procédé de décodage de signal audio codé |
CN114898761A (zh) * | 2017-08-10 | 2022-08-12 | 华为技术有限公司 | 立体声信号编解码方法及装置 |
CN110556118B (zh) * | 2018-05-31 | 2022-05-10 | 华为技术有限公司 | 立体声信号的编码方法和装置 |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5787387A (en) * | 1994-07-11 | 1998-07-28 | Voxware, Inc. | Harmonic adaptive speech coding method and system |
JP3317470B2 (ja) * | 1995-03-28 | 2002-08-26 | 日本電信電話株式会社 | 音響信号符号化方法、音響信号復号化方法 |
JP3653826B2 (ja) * | 1995-10-26 | 2005-06-02 | ソニー株式会社 | 音声復号化方法及び装置 |
US6134518A (en) * | 1997-03-04 | 2000-10-17 | International Business Machines Corporation | Digital audio signal coding using a CELP coder and a transform coder |
DE69926821T2 (de) * | 1998-01-22 | 2007-12-06 | Deutsche Telekom Ag | Verfahren zur signalgesteuerten Schaltung zwischen verschiedenen Audiokodierungssystemen |
US6959274B1 (en) * | 1999-09-22 | 2005-10-25 | Mindspeed Technologies, Inc. | Fixed rate speech compression system and method |
US7171355B1 (en) * | 2000-10-25 | 2007-01-30 | Broadcom Corporation | Method and apparatus for one-stage and two-stage noise feedback coding of speech and audio signals |
US6658383B2 (en) * | 2001-06-26 | 2003-12-02 | Microsoft Corporation | Method for coding speech and music signals |
JP3881943B2 (ja) * | 2002-09-06 | 2007-02-14 | 松下電器産業株式会社 | 音響符号化装置及び音響符号化方法 |
US7596486B2 (en) * | 2004-05-19 | 2009-09-29 | Nokia Corporation | Encoding an audio signal using different audio coder modes |
CN101308656A (zh) * | 2007-05-17 | 2008-11-19 | 展讯通信(上海)有限公司 | 音频暂态信号的编解码方法 |
PL2311034T3 (pl) * | 2008-07-11 | 2016-04-29 | Fraunhofer Ges Forschung | Koder i dekoder audio do kodowania ramek próbkowanego sygnału audio |
FR2936898A1 (fr) | 2008-10-08 | 2010-04-09 | France Telecom | Codage a echantillonnage critique avec codeur predictif |
RU2393548C1 (ru) * | 2008-11-28 | 2010-06-27 | Общество с ограниченной ответственностью "Конвент Люкс" | Устройство для изменения входящего голосового сигнала в выходящий голосовой сигнал в соответствии с целевым голосовым сигналом |
JP4977157B2 (ja) * | 2009-03-06 | 2012-07-18 | 株式会社エヌ・ティ・ティ・ドコモ | 音信号符号化方法、音信号復号方法、符号化装置、復号装置、音信号処理システム、音信号符号化プログラム、及び、音信号復号プログラム |
2010
- 2010-12-23 FR FR1061203A patent/FR2969805A1/fr not_active Withdrawn
2011
- 2011-12-20 BR BR112013016267-8A patent/BR112013016267B1/pt active IP Right Grant
- 2011-12-20 JP JP2013545471A patent/JP5978227B2/ja active Active
- 2011-12-20 RU RU2013134227/08A patent/RU2584463C2/ru active
- 2011-12-20 ES ES11815474.9T patent/ES2529221T3/es active Active
- 2011-12-20 WO PCT/FR2011/053097 patent/WO2012085451A1/fr active Application Filing
- 2011-12-20 CN CN201180068351.0A patent/CN103384900B/zh active Active
- 2011-12-20 KR KR1020137019387A patent/KR101869395B1/ko active IP Right Grant
- 2011-12-20 US US13/997,446 patent/US9218817B2/en active Active
- 2011-12-20 EP EP11815474.9A patent/EP2656343B1/fr active Active
Non-Patent Citations (1)
Title |
---|
See references of WO2012085451A1 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015197989A1 (fr) | 2014-06-27 | 2015-12-30 | Orange | Resampling of an audio signal by interpolation for low-delay coding/decoding |
EP4047492A1 (fr) | 2014-06-27 | 2022-08-24 | Orange | Resampling of an audio signal by interpolation for low-delay coding/decoding |
Also Published As
Publication number | Publication date |
---|---|
US9218817B2 (en) | 2015-12-22 |
CN103384900B (zh) | 2015-06-10 |
BR112013016267B1 (pt) | 2021-02-02 |
JP5978227B2 (ja) | 2016-08-24 |
RU2584463C2 (ru) | 2016-05-20 |
CN103384900A (zh) | 2013-11-06 |
RU2013134227A (ru) | 2015-01-27 |
US20130289981A1 (en) | 2013-10-31 |
BR112013016267A2 (pt) | 2018-07-03 |
EP2656343B1 (fr) | 2014-11-19 |
ES2529221T3 (es) | 2015-02-18 |
KR20130133816A (ko) | 2013-12-09 |
JP2014505272A (ja) | 2014-02-27 |
KR101869395B1 (ko) | 2018-06-20 |
WO2012085451A1 (fr) | 2012-06-28 |
FR2969805A1 (fr) | 2012-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2656343B1 (fr) | Low-delay sound coding alternating between predictive coding and transform coding | |
EP1907812B1 (fr) | Method for switching bit rate in bit-rate- and bandwidth-scalable audio decoding | |
EP1905010B1 (fr) | Hierarchical audio coding/decoding | |
EP2277172B1 (fr) | Concealment of transmission errors in a digital audio signal in a hierarchical decoding structure | |
EP1989706B1 (fr) | Perceptual weighting device for audio coding/decoding | |
EP3014611B1 (fr) | Improved frequency band extension in an audio-frequency signal decoder | |
EP2080195A1 (fr) | Synthesis of lost blocks of a digital audio signal, with pitch period correction | |
EP3103116B1 (fr) | Improved frequency band extension in an audio-frequency signal decoder | |
EP3175444B1 (fr) | Frame loss management in an FD/LPD transition context | |
EP3069340B1 (fr) | Transition from transform coding/decoding to predictive coding/decoding | |
WO2009089700A1 (fr) | Method and apparatus for updating the state of a synthesis filter | |
EP3175443B1 (fr) | Determination of a coding budget for an LPD/FD transition frame | |
WO2007107670A2 (fr) | Method for post-processing a signal in an audio decoder | |
Jax | Backwards Compatible Wideband Telephony |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20130715 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
DAX | Request for extension of the european patent (deleted) | ||
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R079 Ref document number: 602011011646 Country of ref document: DE Free format text: PREVIOUS MAIN CLASS: G10L0019140000 Ipc: G10L0019022000 |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G10L 19/12 20130101ALI20140430BHEP Ipc: G10L 19/022 20130101AFI20140430BHEP Ipc: G10L 19/02 20130101ALN20140430BHEP Ipc: G10L 19/04 20130101ALN20140430BHEP Ipc: G10L 19/18 20130101ALI20140430BHEP |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G10L 19/12 20130101ALI20140521BHEP Ipc: G10L 19/02 20130101ALN20140521BHEP Ipc: G10L 19/04 20130101ALN20140521BHEP Ipc: G10L 19/18 20130101ALI20140521BHEP Ipc: G10L 19/022 20130101AFI20140521BHEP |
|
INTG | Intention to grant announced |
Effective date: 20140603 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: REF Ref document number: 697423 Country of ref document: AT Kind code of ref document: T Effective date: 20141215 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D Free format text: LANGUAGE OF EP DOCUMENT: FRENCH |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602011011646 Country of ref document: DE Effective date: 20141231 |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: FG2A Ref document number: 2529221 Country of ref document: ES Kind code of ref document: T3 Effective date: 20150218 |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: VDEP Effective date: 20141119 |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 697423 Country of ref document: AT Kind code of ref document: T Effective date: 20141119 |
|
REG | Reference to a national code |
Ref country code: LT Ref legal event code: MG4D |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150219 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150319 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150319 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150220 Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20141231 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602011011646 Country of ref document: DE |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: MM4A |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20150820 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20141231 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20141231 Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20141220 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 5 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: MT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20111220 Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20141220 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 6 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 7 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20141119 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: IT Payment date: 20221122 Year of fee payment: 12 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20231124 Year of fee payment: 13 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20231122 Year of fee payment: 13 Ref country code: DE Payment date: 20231121 Year of fee payment: 13 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: ES Payment date: 20240102 Year of fee payment: 13 |