WO2010111841A1 - Prediction method and apparatus for frequency domain pulse decoding, and decoder

Prediction method and apparatus for frequency domain pulse decoding, and decoder

Info

Publication number
WO2010111841A1
WO2010111841A1 PCT/CN2009/071161 CN2009071161W WO2010111841A1 WO 2010111841 A1 WO2010111841 A1 WO 2010111841A1 CN 2009071161 W CN2009071161 W CN 2009071161W WO 2010111841 A1 WO2010111841 A1 WO 2010111841A1
Authority
WO
WIPO (PCT)
Prior art keywords
current frame
frame
block
spectrum
decoded
Prior art date
Application number
PCT/CN2009/071161
Other languages
English (en)
Chinese (zh)
Inventor
苗磊
刘泽新
齐峰岩
胡晨
陈龙吟
郎玥
吴文海
塔迪·哈维·米希尔
张清
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2009/071161 (WO2010111841A1)
Priority to CN2009801486921A (CN102246229B)
Publication of WO2010111841A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques

Definitions

  • consecutive frames are not guaranteed to have the spectral coefficients of the same frequency band encoded.
  • in a given frequency band, some frames yield decoded spectral coefficients while others can only be filled with zeros. This makes the spectral coefficients in that band discontinuous and degrades the auditory quality, especially for signals with strong harmonics.
  • intra-frame prediction is performed using few or no bits, so that undecoded spectral coefficients are predicted from the spectral coefficients of other frequency bands or frequency points.
  • the bandwidth extension (BWE) algorithm is currently used: the spectral coefficients of the high frequency band are predicted from the low-frequency spectral coefficients according to the correlation between high and low frequencies, which increases the bandwidth of the output signal and thereby improves its auditory quality.
  • the spectral energy predicted within a frame tends to be biased in some frequency bands, especially when both the low-frequency and the high-frequency harmonics are strong.
  • the predicted peak position often deviates considerably from the true peak position, which introduces additional noise into the audio signal and degrades its auditory quality.
  • the embodiments of the present invention provide a prediction method, a prediction apparatus, and a decoder for frequency domain pulse decoding, which can better improve the auditory quality of an audio output signal.
  • the method for predicting frequency domain pulse decoding includes the following steps: performing spectrum block division on a current frame and a previous frame according to a spectral coefficient of a previous frame;
  • a block dividing unit, configured to divide the current frame and the previous frame into spectrum blocks according to the spectral coefficients of the previous frame;
  • a determining unit, configured to determine, according to the correlation between the spectrum blocks of the current frame and of the previous frame divided by the block dividing unit, whether each spectrum block of the current frame requires inter-frame prediction;
  • a prediction unit, configured to predict, for each current-frame spectrum block that the determining unit has determined to require inter-frame prediction, the undecoded spectral coefficients in that block using the decoded spectral coefficients in the corresponding spectrum block of the previous frame and the decoded spectral coefficients of the current frame.
  • the decoder provided by the embodiment of the present invention includes the above prediction apparatus for frequency domain pulse decoding and a converter. The prediction apparatus is configured to predict, for each current-frame spectrum block determined to require inter-frame prediction, the undecoded spectral coefficients in that block using the decoded spectral coefficients in the corresponding spectrum block of the previous frame and the decoded spectral coefficients of the current frame; the converter is configured to perform a frequency-domain to time-domain transform on the current-frame spectral coefficients predicted by the prediction apparatus and to output a time-domain audio signal.
  • the current frame and the previous frame are first divided into spectrum blocks according to the spectral coefficients of the previous frame; then, according to the correlation between the spectrum blocks of the current frame and of the previous frame, it is determined whether each spectrum block of the current frame requires inter-frame prediction; finally, for each current-frame spectrum block that requires inter-frame prediction, the undecoded spectral coefficients in that block are predicted using the decoded spectral coefficients in the corresponding spectrum block of the previous frame and the decoded spectral coefficients of the current frame.
  • FIG. 1 is a flowchart of a method for predicting frequency domain pulse decoding according to Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of a method for predicting frequency domain pulse decoding according to Embodiment 2 of the present invention
  • FIG. 3 is an example diagram of the spectrum information of a current frame and a previous frame provided by an embodiment of the present invention
  • FIG. 4 is a block diagram of an algorithm structure according to Embodiment 2 of the present invention.
  • FIG. 5 is a structural block diagram of another algorithm according to Embodiment 2 of the present invention.
  • a method for predicting frequency domain pulse decoding includes the following steps: Step 11: divide the current frame and the previous frame into spectrum blocks according to the spectral coefficients of the previous frame; Step 12: according to the correlation between the spectrum blocks divided in the current frame and in the previous frame, determine whether each spectrum block of the current frame requires inter-frame prediction;
  • the predicted current frame may then be output and passed through the remaining decoding steps, after which the audio signal is output.
  • the method for predicting frequency domain pulse decoding provided by the embodiment of the present invention first divides the current frame and the previous frame into spectrum blocks according to the spectral coefficients of the previous frame, then determines, according to the correlation between the spectrum blocks of the current frame and of the previous frame, whether each spectrum block of the current frame requires inter-frame prediction, and finally, for each current-frame spectrum block that requires inter-frame prediction, predicts the undecoded spectral coefficients in that block using the decoded spectral coefficients in the corresponding spectrum block of the previous frame and the decoded spectral coefficients of the current frame.
  • as a result, the spectrum of the current frame is smoother, discontinuities are reduced, and the spectrum is closer to the real spectrum, which better improves the auditory quality of the audio output signal, especially of output signals with strong harmonics.
  • a method for predicting frequency domain pulse decoding includes the following steps: Step 21: for each frequency point at which the previous frame has a decoded spectral coefficient, divide one spectrum block in the current frame and in the previous frame, centered on that frequency point and covering the N frequency points before and after it, where N > 1.
  • a spectrum block consists of the spectral coefficients at L consecutive frequency points; one spectrum block is divided around each frequency point at which the previous frame has a decoded spectral coefficient, covering the N frequency points before and after it.
  • the spectrum blocks to be processed (indicated by the dashed boxes).
  • the number of spectrum blocks divided in the current frame and in the previous frame equals the number of spectral coefficients decoded in the previous frame, namely 4.
  • if the previous frame decodes two or more spectral coefficients within a range of N adjacent frequency points, only one spectrum block is divided in the current frame, centered on any one of the frequency points where those coefficients are located and covering the N frequency points before and after it.
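The block division of Step 21 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `divide_spectrum_blocks`, the `num_bins` parameter, the clipping at the frame edges, and the choice of the lowest frequency point of each cluster as the block center (the text allows any point of the cluster) are all hypothetical.

```python
from typing import List, Tuple


def divide_spectrum_blocks(
    prev_decoded_bins: List[int], n: int, num_bins: int
) -> List[Tuple[int, int]]:
    """Return one inclusive (start, end) bin range per spectrum block.

    prev_decoded_bins: frequency points with a decoded coefficient in the
    previous frame. n: half-width of a block. num_bins: frame length
    (assumed here only to clip blocks at the frame edges).
    """
    blocks: List[Tuple[int, int]] = []
    last_center = None
    for center in sorted(prev_decoded_bins):
        # Decoded points within n bins of the current center share its block
        # instead of opening a new one (the "only one block" case above).
        if last_center is not None and center - last_center <= n:
            continue
        blocks.append((max(0, center - n), min(num_bins - 1, center + n)))
        last_center = center
    return blocks


if __name__ == "__main__":
    # Frequency points 4, 27, 40, 50 with a half-width of 2, as in the example.
    print(divide_spectrum_blocks([4, 27, 40, 50], n=2, num_bins=64))
```

The same ranges are applied to both the current frame and the previous frame, since the division depends only on the previous frame's decoded positions.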
  • Step 22: count the positions at which the distance between the frequency points of decoded spectral coefficients in the current frame and in the previous frame is less than or equal to N. If this count is greater than or equal to M, the current frame requires inter-frame prediction; otherwise it does not. M is a preset value.
  • the value of M is determined by the number of decoded spectral coefficients: the more coefficients are decoded, the larger M is.
  • the processing of the current frame need not refer only to the previous frame; it may also refer to information from several preceding frames.
  • when a frame is lost, the spectral coefficients of the previous frame can all be set to 0, which prevents the first frame after the loss from being inter-frame predicted and thereby avoids adverse effects. Alternatively, the spectral coefficients of the previous frame can be kept unchanged when the frame is lost; the inter-frame prediction condition for the current frame (the number of positions at which decoded frequency points of the current frame and of the frame before the loss lie within a distance of N is greater than or equal to M) then ensures the robustness of the algorithm and avoids adverse effects.
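The frame-level test of Step 22 might look like the sketch below. The function names are hypothetical, the values of M used in the demonstration are illustrative only, and counting one match per previous-frame decoded point is an assumed reading of "number of positions".

```python
from typing import Iterable


def count_close_positions(
    cur_bins: Iterable[int], prev_bins: Iterable[int], n: int
) -> int:
    """Count previous-frame decoded points that have a current-frame decoded
    point within a distance of n bins (assumed counting convention)."""
    cur = list(cur_bins)
    return sum(1 for p in prev_bins if any(abs(p - c) <= n for c in cur))


def frame_needs_inter_prediction(
    cur_bins: Iterable[int], prev_bins: Iterable[int], n: int, m: int
) -> bool:
    """Frame-level decision: predict only if at least m positions match."""
    return count_close_positions(cur_bins, prev_bins, n) >= m


if __name__ == "__main__":
    cur = [4, 15, 40, 50]   # decoded points of the current frame
    prev = [4, 27, 40, 50]  # decoded points of the previous frame
    print(count_close_positions(cur, prev, n=2))
    print(frame_needs_inter_prediction(cur, prev, n=2, m=3))
```

Setting all previous-frame coefficients to zero after a frame loss makes `prev_bins` empty, so the count is 0 and the first frame after the loss is not predicted for any M of at least 1.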
  • Step 23: examine the divided spectrum blocks one by one to determine whether each requires inter-frame prediction. If a spectrum block of the current frame contains no decoded spectral coefficient while the corresponding spectrum block of the previous frame contains a decoded spectral coefficient, that spectrum block of the current frame is determined to require inter-frame prediction.
  • in the spectrum block centered on frequency point 27, the previous frame has a decoded spectral coefficient, while the current frame has no decoded spectral coefficient at the corresponding positions (27 ± 2). In the other spectrum blocks (those corresponding to frequency points 4, 40, and 50), the current frame has decoded spectral coefficients. Therefore, in the example of FIG. 3, the spectrum block corresponding to frequency point 27 is determined to require inter-frame prediction.
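The per-block test of Step 23 can be illustrated with a short sketch. The function name `blocks_to_predict` and the representation of blocks as inclusive (start, end) ranges are assumptions made for illustration; the frequency points are those of the example above.

```python
from typing import Iterable, List, Tuple


def blocks_to_predict(
    blocks: List[Tuple[int, int]],
    cur_bins: Iterable[int],
    prev_bins: Iterable[int],
) -> List[Tuple[int, int]]:
    """Return the blocks where the previous frame has a decoded coefficient
    and the current frame has none."""
    cur, prev = set(cur_bins), set(prev_bins)
    selected = []
    for start, end in blocks:
        bins = range(start, end + 1)
        cur_has = any(k in cur for k in bins)
        prev_has = any(k in prev for k in bins)
        if prev_has and not cur_has:
            selected.append((start, end))
    return selected


if __name__ == "__main__":
    blocks = [(2, 6), (25, 29), (38, 42), (48, 52)]  # points 4, 27, 40, 50, +/- 2
    print(blocks_to_predict(blocks, [4, 15, 40, 50], [4, 27, 40, 50]))
```

Because the decision is made per block, a current-frame coefficient anywhere inside a block suppresses prediction for that block, even if it is not at the same frequency point as the previous frame's coefficient.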
  • Step 24: for each spectrum block that requires inter-frame prediction, form a weighted sum of (a) the amplitude at the frequency point of the decoded spectral coefficient in the corresponding spectrum block of the previous frame and (b) the minimum amplitude of all spectral coefficients already decoded in the current frame, multiplied by a scaling factor between 0 and 1. The result is taken as the amplitude at the corresponding frequency point of the undecoded spectral coefficient in the current-frame spectrum block, and the sign of that coefficient is the same as the sign of the spectral coefficient at the corresponding frequency point of the previous frame.
  • the minimum amplitude; pre_spec denotes the amplitude at the frequency point of the decoded spectral coefficient in the corresponding spectrum block of the previous frame.
  • is a weighting coefficient, which can be selected according to actual conditions.
  • the predicted spectral coefficient at frequency point 27 is a weighted sum of the amplitude of the spectral coefficient at frequency point 27 of the previous frame and 0.8 times the minimum amplitude of all spectral coefficients decoded in the current frame.
  • the sign of the spectral coefficient at frequency point 27 of the current frame is the same as the sign at frequency point 27 of the previous frame.
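A possible reading of the Step 24 prediction is sketched below. The exact weighting formula is not recoverable from the text, so the complementary weights `w_prev` and `1 - w_prev` are an assumption, as are the function name `predict_block`, the parameter name `scale` (0.8 in the example above), and the convention that 0.0 marks an undecoded coefficient.

```python
import math
from typing import List, Tuple


def predict_block(
    cur_spec: List[float],
    prev_spec: List[float],
    block: Tuple[int, int],
    scale: float = 0.8,
    w_prev: float = 0.5,
) -> List[float]:
    """Return a copy of cur_spec with the undecoded bins of `block` predicted.

    A value of 0.0 marks a bin without a decoded spectral coefficient.
    """
    # Minimum amplitude over all coefficients decoded in the current frame.
    min_amp = min((abs(x) for x in cur_spec if x != 0.0), default=0.0)
    out = list(cur_spec)
    start, end = block
    for k in range(start, end + 1):
        if prev_spec[k] != 0.0 and cur_spec[k] == 0.0:
            # Assumed weighting: w_prev on the previous-frame amplitude,
            # (1 - w_prev) on the scaled minimum amplitude.
            amp = w_prev * abs(prev_spec[k]) + (1.0 - w_prev) * scale * min_amp
            # The sign follows the previous frame's coefficient.
            out[k] = math.copysign(amp, prev_spec[k])
    return out


if __name__ == "__main__":
    prev = [0.0] * 8
    prev[3] = -2.0
    cur = [4.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0]
    print(predict_block(cur, prev, (2, 4)))
```

The function leaves `cur_spec` untouched, which matches the later requirement that the coefficients before prediction be kept as the previous-frame information for the next frame.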
  • whether a spectrum block needs to be processed is determined per spectrum block rather than per frequency point. The previous frame may have a decoded spectral coefficient at some frequency point where the current frame has none, while the current frame has a decoded spectral coefficient at a nearby position; in that case no processing is performed, as shown in the figure.
  • the spectral coefficient at frequency point 27 is predicted for the current frame.
  • after prediction, the current frame has five non-zero spectral coefficients, at frequency points 4, 15, 27, 40, and 50.
  • the spectral coefficients of the current frame are output, and the spectral coefficients of the current frame before prediction (at frequency points 4, 15, 40, and 50) are saved as the previous-frame information for the next frame.
  • the current frame is first tentatively processed by the algorithm of the present invention: each spectrum block is processed if the previous frame has decoded spectral coefficients in that block and the current frame has none in the corresponding block.
  • i is greater than or equal to M. If so, the current frame satisfies the processing condition: the current-frame spectral coefficients after prediction are output, and the current-frame spectral coefficients before prediction are saved as the previous-frame spectral coefficients for the next frame. Otherwise, the current frame does not satisfy the processing condition: the current-frame spectral coefficients are restored to their values before prediction, and those values are saved as the previous-frame spectral coefficients for the next frame.
  • in this embodiment, the decision is made according to the energy of all decoded spectral coefficients in each spectrum block of the current frame and of the previous frame: if the number of spectrum blocks with equivalent energy is greater than or equal to a preset value, the current frame is determined to require inter-frame prediction.
  • the energy of a spectrum block can be expressed as the sum of the squares of the amplitudes of the spectral coefficients in the block, as the square root of that sum, or by the amplitudes of the spectral coefficients themselves.
  • a formula for calculating the energy of the spectral block based on the sum of the squares of the amplitudes of the spectral coefficients is:
  • energy is considered equivalent when the energy ratio between the previous-frame and current-frame spectrum blocks lies in the range [E, 1/E]; for example, E can be 0.8, and in general E can be chosen closer to 1 to ensure prediction accuracy.
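The energy-based frame decision can be sketched as follows, using the sum-of-squares energy and treating "equivalent" as the ratio lying between E and 1/E with E just below 1 (an assumed reading of the range given above). The function names, the `min_blocks` threshold value, and the rule of skipping blocks with zero energy in either frame are assumptions made for illustration.

```python
from typing import List, Tuple


def block_energy(spec: List[float], block: Tuple[int, int]) -> float:
    """Sum of squared amplitudes of the coefficients in an inclusive block."""
    start, end = block
    return sum(x * x for x in spec[start:end + 1])


def frame_needs_prediction_by_energy(
    cur_spec: List[float],
    prev_spec: List[float],
    blocks: List[Tuple[int, int]],
    e: float = 0.8,
    min_blocks: int = 2,
) -> bool:
    """True if at least min_blocks blocks have equivalent energy in both frames."""
    equivalent = 0
    for block in blocks:
        e_cur = block_energy(cur_spec, block)
        e_prev = block_energy(prev_spec, block)
        if e_cur == 0.0 or e_prev == 0.0:
            continue  # no decoded coefficients to compare in this block
        if e <= e_prev / e_cur <= 1.0 / e:
            equivalent += 1
    return equivalent >= min_blocks


if __name__ == "__main__":
    blocks = [(0, 1), (2, 3), (4, 5)]
    cur = [1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    prev = [1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 0.0, 0.0]
    print(frame_needs_prediction_by_energy(cur, prev, blocks))
```

Choosing `e` closer to 1 narrows the accepted ratio range, so fewer blocks count as equivalent and prediction is triggered more conservatively.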
  • a flow similar to the algorithm structure of FIG. 4 or FIG. 5 can be used to determine whether the current frame requires inter-frame prediction.
  • the specific processing algorithm is as follows:
  • the determining unit 70 is configured to determine, according to the spectrum blocks of the current frame and of the previous frame divided by the block dividing unit 71, whether each spectrum block of the current frame requires inter-frame prediction;
  • the first dividing module 711 is configured to divide, for each frequency point at which the previous frame has a decoded spectral coefficient, one spectrum block in the current frame and in the previous frame, centered on that frequency point and covering the N frequency points before and after it, where N > 1. In this case, the number of spectrum blocks divided in the current frame equals the number of spectral coefficients decoded in the previous frame; and/or,
  • a second dividing module 712, configured to, if the previous frame decodes two or more spectral coefficients within a range of N adjacent frequency points, select any one of the frequency points where those coefficients are located and divide one spectrum block in the current frame and in the previous frame, centered on the selected frequency point and covering the N frequency points before and after it, where N > 1. In this case, the divided spectrum blocks do not overlap, and the number of spectral coefficients decoded in the previous frame is larger than the number of spectrum blocks divided in the current frame.
  • the frame judging subunit 72 is configured to determine, according to the spectrum blocks of the current frame and of the previous frame divided by the block dividing unit 71, whether the current frame requires inter-frame prediction;
  • the block judging subunit 73 is configured to determine, for a current frame that the frame judging subunit 72 has determined to require inter-frame prediction, whether each spectrum block of the current frame requires inter-frame prediction.
  • the frame judging subunit 72 includes:
  • the energy judging module 722 is configured to compare the energy of the spectrum blocks containing decoded spectral coefficients in the current frame with the energy of the corresponding spectrum blocks in the previous frame and, if the number of spectrum blocks with equivalent energy is greater than or equal to a preset value, to determine that the current frame requires inter-frame prediction.
  • the block judging subunit 73 may be specifically configured to make the determination according to whether the corresponding spectrum blocks of the current frame and of the previous frame contain decoded spectral coefficients: if a spectrum block of the current frame contains no decoded spectral coefficient while the corresponding spectrum block of the previous frame contains one, that spectrum block of the current frame is determined to require inter-frame prediction.
  • the prediction unit 74 may be specifically configured to form a weighted sum of the amplitude at the frequency point of the decoded spectral coefficient in the corresponding spectrum block of the previous frame and the minimum amplitude of all spectral coefficients already decoded in the current frame multiplied by a scaling factor between 0 and 1. The result is used as the amplitude at the corresponding frequency point of the undecoded spectral coefficient in the current-frame spectrum block, and the sign of that coefficient is the same as the sign at the corresponding frequency point of the previous frame.
  • the spectrum of the current frame is smoother, discontinuities are reduced, and the signal is closer to the real spectrum, which better improves the auditory quality of the audio output signal, especially of output signals with strong harmonics.
  • the prediction apparatus 81 for frequency domain pulse decoding is configured to predict, for each current-frame spectrum block determined to require inter-frame prediction, the undecoded spectral coefficients in that block using the decoded spectral coefficients in the corresponding spectrum block of the previous frame and the decoded spectral coefficients of the current frame;
  • the converter 82 is configured to perform a frequency-domain to time-domain transform on the current-frame spectral coefficients predicted by the prediction apparatus 81 for frequency domain pulse decoding, and to output a time-domain audio signal.
  • for further details of the prediction apparatus 81 for frequency domain pulse decoding, refer to the prediction apparatus described in the foregoing method embodiments and apparatus Embodiment 4; details are not repeated here.
  • random access memory (RAM)
  • read-only memory (ROM)
  • electrically programmable ROM, electrically erasable programmable ROM
  • registers, hard disk, removable disk, CD-ROM, or any other form of storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to a prediction method and apparatus for frequency domain pulse decoding, and to a decoder. The method comprises: dividing a current frame and a previous frame into spectrum blocks according to a spectral coefficient of the previous frame (11); determining, according to the correlation between the spectrum blocks divided in the current frame and in the previous frame, whether inter-frame prediction is needed for each spectrum block divided in the current frame (12); and, for each current-frame spectrum block for which inter-frame prediction is determined to be needed, predicting the undecoded spectral coefficients of that block using the decoded spectral coefficients in the corresponding spectrum block of the previous frame and the decoded spectral coefficients of the current frame (13).
PCT/CN2009/071161 2009-04-03 2009-04-03 Prediction method and apparatus for frequency domain pulse decoding, and decoder WO2010111841A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2009/071161 WO2010111841A1 (fr) 2009-04-03 2009-04-03 Prediction method and apparatus for frequency domain pulse decoding, and decoder
CN2009801486921A CN102246229B (zh) 2009-04-03 2009-04-03 Prediction method and prediction apparatus for frequency domain pulse decoding, and decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2009/071161 WO2010111841A1 (fr) 2009-04-03 2009-04-03 Prediction method and apparatus for frequency domain pulse decoding, and decoder

Publications (1)

Publication Number Publication Date
WO2010111841A1 (fr) 2010-10-07

Family

ID=42827473

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/071161 WO2010111841A1 (fr) 2009-04-03 2009-04-03 Prediction method and apparatus for frequency domain pulse decoding, and decoder

Country Status (2)

Country Link
CN (1) CN102246229B (fr)
WO (1) WO2010111841A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003150191A (ja) * 2001-11-14 2003-05-23 Nippon Telegr & Teleph Corp <Ntt> 音声スペクトル推定方法、その装置、そのプログラムおよびその記録媒体
CN1504993A (zh) * 2002-11-29 2004-06-16 三星电子株式会社 用较少的计算量重构高频分量的声频解码方法和装置
US7003448B1 (en) * 1999-05-07 2006-02-21 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Method and device for error concealment in an encoded audio-signal and method and device for decoding an encoded audio signal
CN1813286A (zh) * 2004-01-23 2006-08-02 微软公司 使用广义感觉相似性对数字介质光谱数据的有效编码
US20070016415A1 (en) * 2005-07-15 2007-01-18 Microsoft Corporation Prediction of spectral coefficients in waveform coding and decoding

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3747492B2 (ja) * 1995-06-20 2006-02-22 ソニー株式会社 音声信号の再生方法及び再生装置

Also Published As

Publication number Publication date
CN102246229A (zh) 2011-11-16
CN102246229B (zh) 2013-03-27

Similar Documents

Publication Publication Date Title
US9779749B2 (en) Audio signal coding method and apparatus
WO2013060223A1 (fr) Procédé et appareil de compensation de perte de trames pour signal à trames de parole
JP5587405B2 (ja) スピーチフレーム内の情報のロスを防ぐためのシステムおよび方法
RU2665889C2 (ru) Выбор процедуры маскирования потери пакета
TWI332193B (en) Method and apparatus of processing time-varying signals coding and decoding and computer program product
JP2019207430A (ja) 複数のオーディオ信号の符号化
JP2019215545A (ja) 冗長フレーム情報を通信するシステムおよび方法
RU2765985C2 (ru) Классификация и кодирование аудиосигналов
WO2011110031A1 (fr) Procédé et dispositif destinés à coder un signal à haute fréquence et procédé et dispositif destinés à décoder un signal à haute fréquence
RU2628197C2 (ru) Маскирование ошибок в кадрах
JP2020204778A5 (fr)
WO2022012629A1 (fr) Procédé et appareil pour estimer le retard temporel d'un signal audio stéréo
WO2009109120A1 (fr) Procédé et dispositif d'encodage et de décodage de signal audio
WO2017044245A1 (fr) Classification et post-traitement de signal audio après passage su signal dans un décodeur
WO2010111876A1 (fr) Procédé et dispositif de débruitage de signaux et système de décodage de fréquence audio
JP2013084002A (ja) 音声コーデックの品質向上装置およびその方法
CN102610231B (zh) 一种带宽扩展方法及装置
RU2644078C1 (ru) Способ, устройство и система кодирования/декодирования
JP2006018023A (ja) オーディオ信号符号化装置、および符号化プログラム
WO2012159370A1 (fr) Procédé et dispositif d'amélioration vocale
WO2010111841A1 (fr) Procédé et appareil de prédiction pour décodage d&#39;impulsions dans le domaine fréquentiel et décodeur
WO2014000559A1 (fr) Procédé de traitement de signaux vocaux ou audio et appareil de codage associé
US20150334501A1 (en) Method and Apparatus for Generating Sideband Residual Signal
TW202103146A (zh) 語音編碼方法與電子裝置
WO2004112256A1 (fr) Dispositif de codage de donnees vocales

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200980148692.1

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09842498

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09842498

Country of ref document: EP

Kind code of ref document: A1