WO2014107950A1 - Audio signal encoding and decoding method, audio signal encoding and decoding apparatus - Google Patents

Audio signal encoding and decoding method, audio signal encoding and decoding apparatus

Info

Publication number
WO2014107950A1
WO2014107950A1 (application PCT/CN2013/079804)
Authority
WO
WIPO (PCT)
Prior art keywords
emphasis
signal
excitation signal
factor
voiced sound
Prior art date
Application number
PCT/CN2013/079804
Other languages
English (en)
French (fr)
Inventor
刘泽新 (Liu Zexin)
王宾 (Wang Bin)
苗磊 (Miao Lei)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to KR1020157013439A priority Critical patent/KR101736394B1/ko
Priority to JP2015543256A priority patent/JP6125031B2/ja
Priority to SG11201503286UA priority patent/SG11201503286UA/en
Priority to EP13871091.8A priority patent/EP2899721B1/en
Priority to KR1020177012597A priority patent/KR20170054580A/ko
Priority to EP18172248.9A priority patent/EP3467826A1/en
Priority to BR112015014956-1A priority patent/BR112015014956B1/pt
Publication of WO2014107950A1 publication Critical patent/WO2014107950A1/zh
Priority to US14/704,502 priority patent/US9805736B2/en
Priority to US15/717,952 priority patent/US10373629B2/en
Priority to US16/531,116 priority patent/US20190355378A1/en

Classifications

    • G10L21/038 — Speech enhancement, e.g. noise reduction or echo cancellation, using band spreading techniques
    • G10L21/0388 — Details of processing therefor
    • G10L19/08 — Determination or coding of the excitation function; determination or coding of the long-term prediction parameters
    • G10L19/265 — Pre-filtering, e.g. high frequency emphasis prior to encoding
    • G10L19/0204 — Speech or audio analysis-synthesis using spectral analysis with subband decomposition
    • G10L19/24 — Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • G10L25/93 — Discriminating between voiced and unvoiced parts of speech signals

Definitions

  • Audio signal encoding and decoding method; audio signal encoding and decoding device
  • TECHNICAL FIELD Embodiments of the present invention relate to the field of communication technologies and, more particularly, to an audio signal encoding method, an audio signal decoding method, an audio signal encoding device, an audio signal decoding device, a transmitter, a receiver, and a communication system.
  • The solution proposed for this problem is to use a band extension technique.
  • The band extension technique can be implemented in the time domain or the frequency domain; the present invention performs the band extension in the time domain.
  • The basic principle of band extension in the time domain is to apply two different processing methods to the low-band signal and the high-band signal.
  • The low-band signal is encoded at the encoding end by a suitable encoder as needed; at the decoding end, the decoder corresponding to that encoder is used to decode and recover the low-band signal.
  • For the high band, the low-frequency encoding parameters obtained by the encoder for the low-band signal are used to predict the high-band excitation signal; the high-band portion of the original signal is subjected to, for example, linear predictive coding (LPC, Linear Predictive Coding) analysis to obtain high-band LPC coefficients; the predicted high-band signal is obtained by passing the high-band excitation signal through a synthesis filter determined by the LPC coefficients; and the predicted high-band signal is then compared with the original signal to obtain the high-band gain parameter.
  • The high-band gain parameter and the LPC coefficients are transmitted to the decoding end to recover the high-band signal. At the decoding end, the low-frequency encoding parameters extracted during decoding of the low-band signal are used to recover the high-band excitation signal; a synthesis filter is generated using the LPC coefficients; the high-band excitation signal is passed through the synthesis filter to recover the predicted high-band signal, which is adjusted by the high-band gain adjustment parameter to obtain the final high-band signal; and the high-band signal and the low-band signal are merged to obtain the final output signal.
  • With the above technique, the high-band signal can be recovered under a certain rate condition, but the performance is not ideal.
  • Comparing the spectrum of the recovered output signal with the spectrum of the original signal shows that, for generally periodic voiced sounds, the recovered high-band signal often contains overly strong harmonic components, whereas the harmonics of the high-band signal in a real voice signal are not that strong; this difference makes the recovered signal sound distinctly mechanical.
  • Embodiments of the present invention are directed to improving the above-described techniques for band spreading in the time domain to reduce or even eliminate mechanical sound in the recovered signal.
  • Embodiments of the present invention provide an audio signal encoding method, an audio signal decoding method, an audio signal encoding device, an audio signal decoding device, a transmitter, a receiver, and a communication system, which are capable of reducing or even eliminating mechanical sound in the recovered signal, thereby improving encoding and decoding performance.
  • In a first aspect, an audio signal encoding method is provided, including: dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; encoding the low frequency band signal to obtain a low frequency encoding parameter; calculating a voiced sound factor according to the low frequency encoding parameter and predicting a high frequency band excitation signal according to the low frequency encoding parameter, the voiced sound factor being used to indicate the degree to which the high frequency band signal exhibits a voiced characteristic; weighting the high frequency band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal; and obtaining a high frequency encoding parameter based on the composite excitation signal and the high frequency band signal.
  • Weighting the high frequency band excitation signal and the random noise by the voiced sound factor to obtain the composite excitation signal may include: performing, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high-frequency portion, obtaining pre-emphasis noise; weighting the high frequency band excitation signal and the pre-emphasis noise by the voiced sound factor to generate a pre-emphasis excitation signal; and performing, with a de-emphasis factor, a de-emphasis operation on the pre-emphasis excitation signal to depress its high-frequency portion, obtaining the composite excitation signal.
  • The de-emphasis factor may be determined based on the pre-emphasis factor and the proportion of the pre-emphasis noise in the pre-emphasis excitation signal.
  • The low frequency encoding parameter may include a pitch period, and weighting the high frequency band excitation signal and the random noise by the voiced sound factor to obtain the composite excitation signal may include: correcting the voiced sound factor by using the pitch period; and weighting the high frequency band excitation signal and the random noise by the corrected voiced sound factor to obtain the composite excitation signal.
  • The low frequency encoding parameter may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period.
  • Predicting the high frequency band excitation signal according to the low frequency encoding parameter may include: correcting the voiced sound factor by using the pitch period; weighting the algebraic codebook and the random noise by the corrected voiced sound factor to obtain a weighted result; and adding the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.
  • voice_fac is the voiced sound factor
  • T0 is the pitch period
  • threshold_min and threshold_max are the preset minimum and maximum values of the pitch period, respectively
  • voice_fac_A is the corrected voiced sound factor.
  • The audio signal encoding method may further include: generating an encoded bit stream according to the low frequency encoding parameter and the high frequency encoding parameter, for transmission to the decoder.
  • In a second aspect, an audio signal decoding method is provided, including: distinguishing a low frequency encoding parameter and a high frequency encoding parameter from encoded information; decoding the low frequency encoding parameter to obtain a low frequency band signal; calculating a voiced sound factor according to the low frequency encoding parameter and predicting a high frequency band excitation signal according to the low frequency encoding parameter, the voiced sound factor being used to indicate the degree to which the high frequency band signal exhibits a voiced characteristic; weighting the high frequency band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal; obtaining the high frequency band signal based on the composite excitation signal and the high frequency encoding parameter; and combining the low frequency band signal and the high frequency band signal to obtain a final decoded signal.
  • Using the voiced sound factor to weight the high frequency band excitation signal and the random noise to obtain the composite excitation signal may include: performing, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high-frequency portion, obtaining pre-emphasis noise; weighting the high frequency band excitation signal and the pre-emphasis noise by the voiced sound factor to generate a pre-emphasis excitation signal; and performing, with a de-emphasis factor, a de-emphasis operation on the pre-emphasis excitation signal to depress its high-frequency portion, obtaining the composite excitation signal.
  • The de-emphasis factor may be determined based on the pre-emphasis factor and the proportion of the pre-emphasis noise in the pre-emphasis excitation signal.
  • The low frequency encoding parameter may include a pitch period, and weighting the high frequency band excitation signal and the random noise by the voiced sound factor to obtain the composite excitation signal may include: correcting the voiced sound factor by using the pitch period; and weighting the high frequency band excitation signal and the random noise by the corrected voiced sound factor to obtain the composite excitation signal.
  • The low frequency encoding parameter may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period.
  • Predicting the high frequency band excitation signal according to the low frequency encoding parameter may include: correcting the voiced sound factor by using the pitch period; weighting the algebraic codebook and the random noise by the corrected voiced sound factor to obtain a weighted result; and adding the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.
  • Correcting the voiced sound factor by using the pitch period is performed according to the following formula:
  • voice_fac is the voiced sound factor
  • T0 is the pitch period
  • threshold_min and threshold_max are the preset minimum and maximum values of the pitch period, respectively
  • voice_fac_A is the corrected voiced sound factor.
  • In a third aspect, an audio signal encoding apparatus is provided, including: a dividing unit, configured to divide a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; a low frequency encoding unit, configured to encode the low frequency band signal to obtain a low frequency encoding parameter; a calculating unit, configured to calculate a voiced sound factor according to the low frequency encoding parameter, the voiced sound factor being used to indicate the degree to which the high frequency band signal exhibits a voiced characteristic; a prediction unit, configured to predict a high frequency band excitation signal according to the low frequency encoding parameter; a synthesizing unit, configured to weight the high frequency band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal; and a high frequency encoding unit, configured to obtain a high frequency encoding parameter based on the composite excitation signal and the high frequency band signal.
  • The synthesizing unit may include: a pre-emphasis component, configured to perform, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high-frequency portion, obtaining pre-emphasis noise; a weighting component, configured to weight the high frequency band excitation signal and the pre-emphasis noise by the voiced sound factor to generate a pre-emphasis excitation signal; and a de-emphasis component, configured to perform, with a de-emphasis factor, a de-emphasis operation on the pre-emphasis excitation signal to depress its high-frequency portion, obtaining the composite excitation signal.
  • The de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasis noise in the pre-emphasis excitation signal.
  • The low frequency encoding parameter may include a pitch period, and the synthesizing unit may include: a first correcting component, configured to correct the voiced sound factor by using the pitch period; and a weighting component, configured to weight the high frequency band excitation signal and the random noise by the corrected voiced sound factor to obtain the composite excitation signal.
  • The low frequency encoding parameter may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period.
  • The prediction unit may include: a second correcting component, configured to correct the voiced sound factor by using the pitch period; and a predicting component, configured to weight the algebraic codebook and the random noise by the corrected voiced sound factor to obtain a weighted result, and to add the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.
  • At least one of the first correcting component and the second correcting component can correct the voiced sound factor according to the following formula:
  • voice_fac is the voiced sound factor
  • T0 is the pitch period
  • threshold_min and threshold_max are the preset minimum and maximum values of the pitch period, respectively
  • voice_fac_A is the corrected voiced sound factor.
  • The audio signal encoding apparatus may further include: a bit stream generating unit, configured to generate an encoded bit stream according to the low frequency encoding parameter and the high frequency encoding parameter, for transmission to the decoder.
  • In a fourth aspect, an audio signal decoding apparatus is provided, including: a distinguishing unit, configured to distinguish a low frequency encoding parameter and a high frequency encoding parameter from encoded information; a low frequency decoding unit, configured to decode the low frequency encoding parameter to obtain a low frequency band signal; a calculating unit, configured to calculate a voiced sound factor according to the low frequency encoding parameter, the voiced sound factor being used to indicate the degree to which the high frequency band signal exhibits a voiced characteristic; a prediction unit, configured to predict a high frequency band excitation signal according to the low frequency encoding parameter; a synthesizing unit, configured to weight the high frequency band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal; a high frequency decoding unit, configured to obtain the high frequency band signal based on the composite excitation signal and the high frequency encoding parameter; and a merging unit, configured to combine the low frequency band signal and the high frequency band signal to obtain a final decoded signal.
  • The synthesizing unit may include: a pre-emphasis component, configured to perform, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high-frequency portion, obtaining pre-emphasis noise; a weighting component, configured to weight the high frequency band excitation signal and the pre-emphasis noise by the voiced sound factor to generate a pre-emphasis excitation signal; and a de-emphasis component, configured to perform, with a de-emphasis factor, a de-emphasis operation on the pre-emphasis excitation signal to depress its high-frequency portion, obtaining the composite excitation signal.
  • The de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasis noise in the pre-emphasis excitation signal.
  • The low frequency encoding parameter may include a pitch period, and the synthesizing unit may include: a first correcting component, configured to correct the voiced sound factor by using the pitch period; and a weighting component, configured to weight the high frequency band excitation signal and the random noise by the corrected voiced sound factor to obtain the composite excitation signal.
  • The low frequency encoding parameter may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period.
  • The prediction unit may include: a second correcting component, configured to correct the voiced sound factor by using the pitch period; and a predicting component, configured to weight the algebraic codebook and the random noise by the corrected voiced sound factor to obtain a weighted result, and to add the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.
  • At least one of the first correcting component and the second correcting component can correct the voiced sound factor according to the foregoing formula.
  • In a fifth aspect, a transmitter is provided, comprising: the audio signal encoding device according to the third aspect; and a transmitting unit, configured to allocate bits to the high frequency encoding parameter and the low frequency encoding parameter generated by the audio signal encoding device, generate a bit stream, and transmit the bit stream.
  • In a sixth aspect, a receiver is provided, comprising: a receiving unit, configured to receive a bit stream and extract encoded information from the bit stream; and the audio signal decoding apparatus according to the fourth aspect.
  • a communication system comprising the transmitter of the fifth aspect or the receiver of the sixth aspect.
  • FIG. 1 is a flow chart schematically illustrating an audio signal encoding method according to an embodiment of the present invention;
  • FIG. 2 is a flow chart schematically illustrating an audio signal decoding method according to an embodiment of the present invention;
  • FIG. 3 is a block diagram schematically illustrating an audio signal encoding apparatus according to an embodiment of the present invention;
  • FIG. 4 is a block diagram schematically illustrating a prediction unit and a synthesizing unit in an audio signal encoding apparatus according to an embodiment of the present invention;
  • FIG. 5 is a block diagram schematically illustrating an audio signal decoding apparatus according to an embodiment of the present invention;
  • FIG. 6 is a block diagram schematically illustrating a transmitter according to an embodiment of the present invention;
  • FIG. 7 is a block diagram schematically illustrating a receiver according to an embodiment of the present invention;
  • FIG. 8 is a schematic block diagram of an apparatus according to another embodiment of the present invention.
  • Audio codecs are widely used in various electronic devices, such as mobile phones, wireless devices, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video players, camcorders, video recorders, and surveillance equipment.
  • An electronic device includes an audio encoder or an audio decoder to implement encoding and decoding of audio signals; the audio encoder or decoder can be implemented directly by a digital circuit or a chip such as a DSP (digital signal processor), or by software code that drives a processor to execute the flow in the software code.
  • The audio codec and codec method can also be applied to various communication systems, such as the Global System for Mobile Communications (GSM), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), General Packet Radio Service (GPRS), and Long Term Evolution (LTE).
  • FIG. 1 is a flow chart schematically illustrating an audio signal encoding method according to an embodiment of the present invention.
  • The audio signal encoding method includes: dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal (110); encoding the low frequency band signal to obtain a low frequency encoding parameter (120); calculating a voiced sound factor according to the low frequency encoding parameter and predicting a high frequency band excitation signal according to the low frequency encoding parameter, the voiced sound factor being used to indicate the degree to which the high frequency band signal exhibits a voiced characteristic (130); weighting the high frequency band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal (140); and obtaining the high frequency encoding parameter based on the composite excitation signal and the high frequency band signal (150).
  • In 110, the time domain signal to be encoded is divided into a low frequency band signal and a high frequency band signal.
  • The purpose of the division is to process the time domain signal in two ways, so that the low frequency band signal and the high frequency band signal can be processed separately. For example, a frequency threshold can be set: frequencies below the threshold belong to the low frequency band, and frequencies above the threshold belong to the high frequency band.
  • The frequency threshold may be set as needed, or other methods may be used to distinguish the low-band signal component from the high-band signal component in the signal, thereby achieving the division.
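As an illustration of the split-and-merge principle only (not the filter bank any particular codec uses), a crude two-band decomposition and its perfect-reconstruction inverse can be sketched as follows; the Haar-style averaging/differencing stands in for the sharper QMF analysis filters a real codec would employ:

```python
def split_bands(x):
    """Crude two-band split: the average of each sample pair forms the low
    band and the difference forms the high band, each at half the rate.
    A real codec would use sharper QMF filters; this only shows the idea."""
    low = [(x[2 * i] + x[2 * i + 1]) / 2.0 for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2.0 for i in range(len(x) // 2)]
    return low, high

def merge_bands(low, high):
    """Inverse of split_bands: perfectly reconstructs the original signal."""
    x = []
    for l, h in zip(low, high):
        x.extend([l + h, l - h])
    return x
```

Since l + h recovers the even sample and l - h the odd one, merging the two bands reconstructs the input exactly, mirroring the encoder-side division and decoder-side merging described above.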
  • In 120, the low frequency band signal is encoded to obtain a low frequency encoding parameter.
  • the low frequency band signal is processed into a low frequency encoding parameter such that the decoding end recovers the low frequency band signal in accordance with the low frequency encoding parameter.
  • the low frequency encoding parameter is a parameter required by the decoding end to recover the low frequency band signal.
  • For example, an ACELP (Algebraic Code Excited Linear Prediction) encoder may be used; the low frequency encoding parameters obtained in this case may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, a pitch period, and possibly other parameters.
  • the low frequency encoding parameters can be passed to a decoding end for recovering the low frequency band signal.
  • Specifically, only the algebraic codebook index and the adaptive codebook index may be transmitted, and the decoding end recovers the corresponding algebraic codebook and adaptive codebook according to the algebraic codebook index and the adaptive codebook index.
  • The low frequency band signal may be encoded by employing an appropriate coding technique; as the coding technique changes, the composition of the low frequency encoding parameters will also change.
  • an encoding technique using the ACELP algorithm will be described as an example.
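For concreteness, the ACELP-style low frequency parameter set described above can be grouped in a single structure; the field names below are illustrative, not taken from the patent text:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class LowFreqParams:
    """ACELP-style low frequency encoding parameters (illustrative names)."""
    algebraic_codebook: List[float]  # fixed codebook vector (FixCB)
    algebraic_gain: float            # algebraic codebook gain (gc)
    adaptive_codebook: List[float]   # adaptive codebook vector (AdpCB)
    adaptive_gain: float             # adaptive codebook gain (ga)
    pitch_period: int                # pitch period (T0)
```

In the steps below, the voiced sound factor and the high-band excitation signal are both derived from exactly these fields.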
  • In 130, a voiced sound factor is calculated according to the low frequency encoding parameter, and a high frequency band excitation signal is predicted according to the low frequency encoding parameter; the voiced sound factor is used to indicate the degree to which the high frequency band signal exhibits voiced characteristics. Thus, 130 obtains the voiced sound factor and the high frequency band excitation signal from the low frequency encoding parameter; the two represent different characteristics of the high frequency band signal, that is, through 130 the high frequency characteristics of the input signal are obtained for use in encoding the high frequency band signal.
  • Below, the coding technique using the ACELP algorithm is taken as an example to illustrate the calculation of the voiced sound factor and the high-band excitation signal.
  • voice_fac is the voiced sound factor
  • T0 is the pitch period
  • threshold_min and threshold_max are the preset minimum and maximum values of the pitch period, respectively
  • voice_fac_A is the corrected voiced sound factor.
  • Compared with the voiced sound factor, the corrected voiced sound factor can more accurately represent the degree to which the high frequency band signal exhibits voiced characteristics, which helps attenuate the mechanical sound introduced after a generally periodic voiced signal is band-extended.
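Formula (2) itself does not survive in this text, so the sketch below is only a hypothetical reading of the surrounding description: the pitch period T0 is clamped to [threshold_min, threshold_max] and mapped linearly onto a scale applied to voice_fac. The threshold defaults and the [0.5, 1.0] scale range are invented for illustration:

```python
def correct_voice_fac(voice_fac, t0, threshold_min=70, threshold_max=160):
    """Hypothetical pitch-period correction of the voicing factor.

    Clamps T0 to [threshold_min, threshold_max], maps it linearly onto
    [0.5, 1.0], and scales voice_fac by the result. The patent's real
    formula (2) is not reproduced in the source text."""
    t0 = min(max(t0, threshold_min), threshold_max)
    scale = 0.5 + 0.5 * (t0 - threshold_min) / (threshold_max - threshold_min)
    return voice_fac * scale
```

The clamping ensures the correction stays bounded for extreme pitch estimates, which is the one property the threshold_min/threshold_max variables clearly imply.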
  • the high-band excitation signal Ex can be calculated according to the following formula (3) or formula (4):
  • In formulas (3) and (4), FixCB is the algebraic codebook (fixed codebook), seed is the random noise, gc is the algebraic codebook gain, AdpCB is the adaptive codebook, and ga is the adaptive codebook gain. It can be seen that in formula (3) or (4), the algebraic codebook FixCB and the random noise seed are weighted by the voiced sound factor to obtain a weighted result, and the product of the weighted result and the algebraic codebook gain gc is added to the product of the adaptive codebook AdpCB and the adaptive codebook gain ga to obtain the high frequency band excitation signal Ex.
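The combination just described can be sketched directly; since formulas (3) and (4) are not reproduced in this text, the complementary (1 − voice_fac) weighting of the noise is an assumption:

```python
def predict_high_band_excitation(fix_cb, seed, adp_cb, gc, ga, voice_fac):
    """Predict the high-band excitation Ex: mix the algebraic codebook with
    random noise under the voicing factor (assumed complementary weights),
    scale by the algebraic codebook gain gc, and add ga * AdpCB."""
    ex = []
    for f, s, a in zip(fix_cb, seed, adp_cb):
        weighted = f * voice_fac + s * (1.0 - voice_fac)  # assumed weighting
        ex.append(weighted * gc + a * ga)
    return ex
```

With voice_fac close to 1 (strongly voiced), the codebook dominates the mix; close to 0, the noise dominates, which is the behavior the voicing factor is meant to control.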
• Optionally, the voiced sound factor voice_fac in formula (3) or (4) may be replaced with the corrected voiced sound factor voice_fac_A from formula (2) to more accurately represent the degree to which the high-band signal exhibits voiced characteristics, that is, to represent the high frequency band signal in the speech signal more realistically, thereby improving the encoding effect.
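The mixing just described can be sketched in Python. FixCB, AdpCB, gc, and ga are taken from the low frequency encoding parameters as the text states; the specific (voice_fac, 1 − voice_fac) split used for the codebook/noise weighting is an illustrative assumption, since formulas (3) and (4) themselves are not reproduced in this excerpt.

```python
import random

def high_band_excitation(fixcb, adpcb, gc, ga, voice_fac, noise=None):
    """Sketch of formulas (3)/(4): weight the algebraic codebook FixCB
    against random noise by the voiced factor, scale the weighted result
    by the algebraic codebook gain gc, and add the adaptive-codebook
    contribution ga*AdpCB. The exact weighting may differ from the patent."""
    if noise is None:
        rng = random.Random(0)  # deterministic noise for the sketch
        noise = [rng.uniform(-1.0, 1.0) for _ in fixcb]
    return [gc * (voice_fac * f + (1.0 - voice_fac) * n) + ga * a
            for f, a, n in zip(fixcb, adpcb, noise)]
```

For a fully voiced frame (voice_fac = 1) the noise term vanishes and the excitation reduces to gc·FixCB + ga·AdpCB.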
• In step 140, the high-band excitation signal and random noise are weighted by the voiced sound factor to obtain a composite excitation signal.
• Because the periodicity of the high-band excitation signal predicted according to the low-band coding parameters is too strong, the recovered audio signal would otherwise sound noticeably mechanical.
• Weighting the high-band excitation signal predicted according to the low-band signal by the voiced sound factor weakens the periodicity of the predicted excitation, thereby weakening the mechanical sound in the recovered audio signal.
  • the weighting can be achieved by taking appropriate weights as needed.
  • the synthetic excitation signal SEx can be obtained according to the following formula (5):
• In formula (5), Ex is the high-band excitation signal, seed is the random noise, voice_fac is the voiced sound factor, pow1 is the energy of the high-band excitation signal, and pow2 is the energy of the random noise.
• Similarly, the voiced sound factor voice_fac may be replaced with the corrected voiced sound factor voice_fac_A from formula (2) to more accurately represent the high frequency band signal in the speech signal, thereby improving the coding effect.
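Formula (5) is not reproduced in this excerpt. As a hedged illustration, the sketch below combines Ex and the noise with √(voice_fac) and √(1 − voice_fac) weights and matches the noise to the excitation's energy via √(pow1/pow2), using exactly the quantities the text names; the actual combination in the patent may differ.

```python
import math

def composite_excitation(ex, noise, voice_fac):
    """Plausible form of formula (5): energy-matched weighting of the
    high-band excitation against random noise by the voiced factor.
    The square-root weights are assumptions."""
    pow1 = sum(x * x for x in ex)     # energy of the high-band excitation
    pow2 = sum(n * n for n in noise)  # energy of the random noise
    energy_match = math.sqrt(pow1 / pow2) if pow2 > 0 else 0.0
    w_voiced = math.sqrt(voice_fac)
    w_noise = math.sqrt(1.0 - voice_fac)
    return [w_voiced * x + w_noise * energy_match * n
            for x, n in zip(ex, noise)]
```

At voice_fac = 1 the output is the excitation itself; at voice_fac = 0 it is the noise scaled to the excitation's energy.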
  • the random noise may be pre-emphasized in advance and de-emphasized after weighting.
• Specifically, step 140 may include: performing, using a pre-emphasis factor, a pre-emphasis operation that boosts the high-frequency portion of the random noise, to obtain pre-emphasis noise; weighting the high-band excitation signal and the pre-emphasis noise using the voiced sound factor to generate a pre-emphasis excitation signal; and performing, using a de-emphasis factor, a de-emphasis operation that depresses the high-frequency portion of the pre-emphasis excitation signal, to obtain the composite excitation signal.
• In a voiced signal, the noise component usually grows stronger from low frequency to high frequency.
  • the random noise is pre-emphasized to accurately represent the noise signal characteristics in the voiced sound, that is, to raise the high frequency portion of the noise and lower the low frequency portion thereof.
  • the pre-emphasis factor can be appropriately set based on the characteristics of the random noise to accurately represent the noise signal characteristics in the voiced sound.
• Specifically, the pre-emphasis excitation signal S(i) can be de-emphasized using the following formula (7), and the composite excitation signal is obtained after the de-emphasis. The de-emphasis factor β can be determined according to the following formula (8) or formula (9):
• weight1 = √(1 − voice_fac)
• weight2 = √(voice_fac)
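Formulas (6) and (7) are not reproduced in this excerpt. The sketch below assumes the common first-order forms for such filters, y(i) = x(i) − α·x(i−1) for pre-emphasis and y(i) = x(i) + β·y(i−1) for de-emphasis; the patent's actual coefficients and forms may differ.

```python
def pre_emphasize(noise, alpha):
    """Assumed form of formula (6): first-order high-pass,
    y(i) = x(i) - alpha * x(i-1), boosting the high-frequency portion."""
    out, prev = [], 0.0
    for x in noise:
        out.append(x - alpha * prev)
        prev = x
    return out

def de_emphasize(pe_ex, beta):
    """Assumed form of formula (7): first-order low-pass,
    y(i) = x(i) + beta * y(i-1), depressing the high-frequency portion."""
    out, prev = [], 0.0
    for x in pe_ex:
        prev = x + beta * prev
        out.append(prev)
    return out
```

A useful sanity check of these assumed forms: de-emphasizing with β equal to the pre-emphasis factor α exactly undoes the pre-emphasis.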
• The high frequency encoding parameters include a high band gain parameter and high band LPC coefficients: LPC analysis can be performed on the high band signal in the original signal to obtain the high band LPC coefficients, and the high band excitation signal is passed through a synthesis filter determined according to those LPC coefficients.
  • the audio signal encoding method 100 may further include: generating an encoded bit stream according to the low frequency encoding parameters and the high frequency encoding parameters for transmission to the decoding end.
• In the audio signal encoding method according to the embodiment of the present invention, the composite excitation signal is obtained by weighting the high-band excitation signal and the random noise with the voiced sound factor, so that the characteristics of the high-frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the encoding effect.
• The audio signal decoding method includes: distinguishing low frequency encoding parameters and high frequency encoding parameters from the encoded information (210); decoding the low frequency encoding parameters to obtain a low frequency band signal (220); calculating a voiced sound factor according to the low frequency encoding parameters, and predicting a high-band excitation signal according to the low frequency encoding parameters, the voiced sound factor being used to indicate the degree to which the high-band signal exhibits voiced characteristics (230); weighting the high-band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal (240); obtaining the high frequency band signal based on the composite excitation signal and the high frequency encoding parameters (250); and combining the low frequency band signal and the high frequency band signal to obtain the final decoded signal (260).
  • low frequency encoding parameters and high frequency encoding parameters are distinguished from the encoded information.
  • the low frequency encoding parameter and the high frequency encoding parameter are parameters transmitted from the encoding end for recovering the low frequency signal and the high frequency signal.
• The low frequency encoding parameters may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, a pitch period, and other parameters; the high frequency encoding parameters may include, for example, LPC coefficients, high band gain parameters, and other parameters.
  • the low frequency encoding parameters and high frequency encoding parameters may alternatively include other parameters.
  • the low frequency encoding parameters are decoded to obtain a low frequency band signal.
  • the specific decoding method corresponds to the encoding mode of the encoding end.
  • an ACELP decoder is employed in 220 to obtain a low-band signal.
  • a voiced sound factor is calculated based on the low frequency encoding parameters, and a high frequency band excitation signal is predicted based on the low frequency encoding parameter, the voiced sound factor being used to indicate the extent to which the high frequency band signal exhibits voiced characteristics.
• Step 230 is used to obtain the high frequency characteristics of the encoded signal based on the low frequency encoding parameters, and is thus used for decoding (or recovering) the high frequency band signal.
  • the decoding technique corresponding to the coding technique using the ACELP algorithm will be described below as an example.
• Specifically, the voiced sound factor voice_fac can be calculated according to the aforementioned formula (1); and, to better reflect the characteristics of the high-band signal, the voiced sound factor voice_fac can be corrected using the pitch period in the low frequency encoding parameters, as shown in the above formula (2), to obtain the corrected voiced sound factor voice_fac_A.
• Compared with the voiced sound factor, the modified voiced sound factor voice_fac_A can more accurately represent the degree to which the high-band signal exhibits voiced characteristics, thereby helping to weaken the mechanical sound introduced after the general period of the voiced signal is extended.
• The high-band excitation signal Ex can be calculated according to the aforementioned formula (3) or formula (4). That is, the algebraic codebook and the random noise are weighted by the voiced sound factor to obtain a weighted result, and the product of the weighted result and the algebraic codebook gain is added to the product of the adaptive codebook and the adaptive codebook gain to obtain the high-band excitation signal Ex. Similarly, the voiced sound factor voice_fac may be replaced with the corrected voiced sound factor voice_fac_A from formula (2) to further improve the decoding effect.
  • the high frequency band excitation signal and random noise are weighted by the voiced sound factor to obtain a composite excitation signal.
• The high-band excitation signal predicted according to the low-band coding parameters is weighted by the voiced sound factor to reduce the periodicity of the predicted excitation, thereby weakening the mechanical sound in the recovered audio signal.
• Specifically, the composite excitation signal SEx can be obtained according to the above formula (5), and the voiced sound factor voice_fac in formula (5) can be replaced with the corrected voiced sound factor voice_fac_A from formula (2).
  • the synthetic excitation signal can also be calculated in other ways as needed.
• Optionally, the random noise may be pre-emphasized in advance and de-emphasized after the weighting.
• Specifically, step 240 may include: performing, using a pre-emphasis factor α (for example, by formula (6)), a pre-emphasis operation that boosts the high-frequency portion of the random noise, to obtain pre-emphasis noise; weighting the high-band excitation signal and the pre-emphasis noise using the voiced sound factor to generate a pre-emphasis excitation signal; and performing, using a de-emphasis factor β, a de-emphasis operation on the pre-emphasis excitation signal that depresses its high frequency portion, to obtain the composite excitation signal.
• The pre-emphasis factor α can be preset as needed to accurately represent the characteristics of the noise signal in the voiced sound, that is, the high frequency portion of the noise signal is strong and the low frequency portion is weak. In addition, other types of noise can be used, in which case the pre-emphasis factor α is changed accordingly to represent the noise characteristics in a typical voiced sound.
• The de-emphasis factor β may be determined based on the pre-emphasis factor α and the proportion of the pre-emphasis noise in the pre-emphasis excitation signal. As an example, the de-emphasis factor β can be determined according to the aforementioned formula (8) or formula (9).
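Formulas (8) and (9) are not reproduced in this excerpt. As a hedged illustration of the dependency just described, the sketch below derives β from the pre-emphasis factor α and the share of pre-emphasized-noise energy in the pre-emphasis excitation signal; the exact expression in the patent may differ.

```python
def de_emphasis_factor(alpha, pow_noise, pow_excitation):
    """Plausible reading of formula (8)/(9): scale the pre-emphasis
    factor alpha by the energy share of the pre-emphasized noise within
    the pre-emphasis excitation signal. The exact form is an assumption."""
    total = pow_noise + pow_excitation
    return alpha * pow_noise / total if total > 0 else 0.0
```

With this form, β approaches α when the pre-emphasis excitation is dominated by noise and approaches zero when it is dominated by the predicted excitation.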
  • a high frequency band signal is obtained based on the composite excitation signal and high frequency encoding parameters.
• Step 250 is the inverse of the process, at the encoding end, of obtaining the high frequency encoding parameters based on the composite excitation signal and the high frequency band signal.
• Specifically, the high frequency encoding parameters include a high band gain parameter and high band LPC coefficients. A synthesis filter can be generated using the LPC coefficients in the high frequency encoding parameters, and the composite excitation signal obtained in 240 is passed through the synthesis filter to recover a predicted high band signal; the predicted high band signal is then adjusted by the high band gain adjustment parameter in the high frequency encoding parameters to obtain the final high band signal.
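The recovery step just described (all-pole synthesis filtering of the composite excitation followed by gain adjustment) can be sketched as follows; the coefficient convention A(z) = 1 − Σ aₖ·z⁻ᵏ and the single per-frame gain are illustrative assumptions.

```python
def lpc_synthesis(excitation, lpc_coeffs, gain):
    """Sketch of high-band recovery: drive an all-pole synthesis filter
    1/A(z) with the composite excitation, then apply the high band gain
    adjustment. Assumes A(z) = 1 - sum(a_k * z^-k)."""
    history = [0.0] * len(lpc_coeffs)  # past filter outputs, newest first
    out = []
    for e in excitation:
        y = e + sum(a * h for a, h in zip(lpc_coeffs, history))
        history = [y] + history[:-1]
        out.append(gain * y)
    return out
```

For example, a single-pole filter with coefficient 0.5 turns a unit impulse into a decaying exponential.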
• Step 250 may be implemented by various existing or future techniques, and the specific manner in which the high frequency band signal is obtained based on the composite excitation signal and the high frequency encoding parameters does not constitute a limitation of the present invention.
  • the low frequency band signal and the high frequency band signal are combined to obtain a final decoded signal.
  • This combination mode corresponds to the division mode in 110 of Fig. 1, thereby realizing decoding to obtain the final output signal.
• In the audio signal decoding method according to the embodiment of the present invention, the composite excitation signal is obtained by weighting the high-band excitation signal and the random noise with the voiced sound factor, so that the characteristics of the high-frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the decoding effect.
  • FIG. 3 is a block diagram schematically illustrating an audio signal encoding apparatus 300 according to an embodiment of the present invention.
• The audio signal encoding apparatus 300 includes: a dividing unit 310, configured to divide a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; a low frequency encoding unit 320, configured to encode the low frequency band signal to obtain low frequency encoding parameters; a calculating unit 330, configured to calculate a voiced sound factor according to the low frequency encoding parameters, where the voiced sound factor is used to indicate the degree to which the high frequency band signal exhibits voiced characteristics; a prediction unit 340, configured to predict a high-band excitation signal according to the low frequency encoding parameters; a synthesizing unit 350, configured to weight the high-band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal; and a high frequency encoding unit 360, configured to obtain high frequency encoding parameters based on the composite excitation signal and the high frequency band signal.
• After receiving the input time domain signal, the dividing unit 310 may adopt any existing or future division technique to achieve this division.
  • the meanings of the low frequency band and the high frequency band are relative.
  • a frequency threshold may be set, and a frequency lower than the frequency threshold is a low frequency band, and a frequency higher than the frequency threshold is a high frequency band.
  • the frequency threshold may be set as needed, or other methods may be used to distinguish the low-band signal component and the high-band signal component in the signal, thereby achieving division.
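As a toy illustration of dividing a signal into complementary low-band and high-band components (a real codec would typically use QMF analysis filters rather than the one-pole filter assumed here, and the smoothing coefficient is arbitrary):

```python
def split_bands(signal, alpha=0.9):
    """Illustrative band split: a one-pole low-pass extracts a low-band
    component, and the residual is taken as the high-band component, so
    that low + high reconstructs the input exactly."""
    low, prev = [], 0.0
    for x in signal:
        prev = (1 - alpha) * x + alpha * prev  # smoothed -> low band
        low.append(prev)
    high = [x - l for x, l in zip(signal, low)]  # residual -> high band
    return low, high
```

The split is complementary by construction, which mirrors the requirement that the decoder's merging step (260) recover the full-band signal from the two parts.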
• The low frequency encoding unit 320 may perform encoding by, for example, an ACELP encoder using the ACELP algorithm, and the low frequency encoding parameters obtained in this case may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and may also include other parameters.
  • the low frequency band signal can be encoded as appropriate by employing appropriate coding techniques; as the coding technique changes, the composition of the low frequency coding parameters will also change.
  • the obtained low frequency coding parameters are parameters required to recover the low frequency band signals, which are transmitted to the decoder for low frequency band signal recovery.
• The calculation unit 330 calculates, based on the low frequency encoding parameters, a parameter indicating a high frequency characteristic of the encoded signal, namely the voiced sound factor. Specifically, the calculation unit 330 calculates the voiced sound factor voice_fac from the low frequency encoding parameters obtained by the low frequency encoding unit 320, for example according to the aforementioned formula (1). The voiced sound factor is then used to obtain a composite excitation signal, which is transmitted to the high frequency encoding unit 360 for encoding the high frequency band signal.
• FIG. 4 is a block diagram schematically illustrating the prediction unit 340 and the synthesizing unit 350 in an audio signal encoding apparatus according to an embodiment of the present invention.
  • Prediction unit 340 may include only prediction component 460 in FIG. 4, or may include both second correction component 450 and prediction component 460 in FIG.
• The second correcting component 450 corrects the voiced sound factor voice_fac using the pitch period T0 in the low frequency encoding parameters, for example according to the above formula (2), to obtain the corrected voiced sound factor voice_fac_A2.
• The prediction component 460 calculates the high-band excitation signal Ex, for example according to the aforementioned formula (3) or formula (4), that is, using the corrected voiced sound factor voice_fac_A2 to weight the algebraic codebook in the low frequency encoding parameters and the random noise.
  • the prediction component 460 may also use the voiced sound factor voice_fac calculated by the calculation unit 330 to weight the algebraic codebook and the random noise in the low frequency encoding parameter to obtain a weighting result. In this case, the second correcting component 450 may be omitted. It is to be noted that the prediction component 460 can also calculate the high-band excitation signal Ex in other manners.
• The synthesizing unit 350 may include the pre-emphasis component 410, the weighting component 420, and the de-emphasis component 430 in FIG. 4; or may include the first correcting component 440 and the weighting component 420 in FIG. 4; or may further include the pre-emphasis component 410, the weighting component 420, the de-emphasis component 430, and the first correcting component 440 of FIG. 4.
• The pre-emphasis component 410 obtains the pre-emphasis noise PEnoise by performing, for example by formula (6), the pre-emphasis operation that boosts the high-frequency portion of the random noise using the pre-emphasis factor α.
  • This random noise can be the same as the random noise input to prediction component 460.
• The pre-emphasis factor α can be set in advance as needed to accurately represent the characteristics of the noise signal in the voiced sound, that is, the high frequency portion of the noise is strong and the low frequency portion is weak. When other types of noise are used, the pre-emphasis factor α is changed accordingly to represent the noise characteristics in a typical voiced sound.
• The weighting component 420 is configured to weight the high-band excitation signal Ex from the prediction component 460 and the pre-emphasis noise PEnoise from the pre-emphasis component 410 by the corrected voiced sound factor voice_fac_A1 to generate a pre-emphasis excitation signal PEEx. For example, the weighting component 420 may obtain the pre-emphasis excitation signal PEEx according to the above formula (5) (replacing the voiced sound factor voice_fac therein with the modified voiced sound factor voice_fac_A1), or may calculate it in other manners.
• The corrected voiced sound factor voice_fac_A1 is generated by the first correcting component 440, which corrects the voiced sound factor using the pitch period to obtain the corrected voiced sound factor.
• The correcting operation performed by the first correcting component 440 may be the same as or different from that of the second correcting component 450. That is, the first correcting component 440 may employ a formula other than the above formula (2) to correct the voiced sound factor voice_fac based on the pitch period.
• The de-emphasis component 430 obtains the composite excitation signal SEx by de-emphasizing the pre-emphasis excitation signal PEEx from the weighting component 420 using the de-emphasis factor β, for example according to formula (7).
• The de-emphasis factor β may be determined based on the pre-emphasis factor α and the proportion of the pre-emphasis noise in the pre-emphasis excitation signal; as an example, it can be determined according to the above formula (8) or formula (9).
• The voiced sound factor voice_fac output from the calculating unit 330 may be supplied to one or both of the weighting component 420 and the prediction component 460.
• Alternatively, the pre-emphasis component 410 and the de-emphasis component 430 may be omitted, in which case the weighting component 420 weights the high-band excitation signal Ex and the random noise by the corrected voiced sound factor (or the voiced sound factor voice_fac) to obtain the composite excitation signal.
  • the high frequency encoding unit 360 obtains high frequency encoding parameters based on the combined excitation signal SEx and the high frequency band signal from the dividing unit 310.
• Specifically, the high frequency encoding unit 360 performs LPC analysis on the high frequency band signal to obtain high band LPC coefficients, and the high-band excitation signal is passed through a synthesis filter determined according to the LPC coefficients to obtain a predicted high band signal. The predicted high band signal and the high band signal from the dividing unit 310 are then compared to obtain a high band gain adjustment parameter; the high band gain parameter and the LPC coefficients are components of the high frequency encoding parameters.
• The manner in which the high frequency encoding unit 360 obtains the high frequency encoding parameters from the composite excitation signal and the high frequency band signal does not constitute a limitation of the present invention. After the low frequency encoding parameters and the high frequency encoding parameters are obtained, the encoding of the signal is complete, and the parameters can be transmitted to the decoding end for recovery.
• Optionally, the audio signal encoding apparatus 300 may further include: a bit stream generating unit 370, configured to generate an encoded bit stream according to the low frequency encoding parameters and the high frequency encoding parameters, to be sent to the decoding end.
• In the audio signal encoding apparatus according to the embodiment of the present invention, the synthesizing unit 350 weights the high-band excitation signal and the random noise by the voiced sound factor to obtain a composite excitation signal, so that the characteristics of the high-frequency signal can be accurately characterized on the basis of the voiced signal, thereby improving the encoding effect.
  • FIG. 5 is a block diagram schematically illustrating an audio signal decoding apparatus 500 according to an embodiment of the present invention.
• The audio signal decoding apparatus 500 includes: a distinguishing unit 510, configured to distinguish low frequency encoding parameters and high frequency encoding parameters from the encoded information; a low frequency decoding unit 520, configured to decode the low frequency encoding parameters to obtain a low frequency band signal; a calculating unit 530, configured to calculate a voiced sound factor according to the low frequency encoding parameters, where the voiced sound factor is used to indicate the degree to which the high frequency band signal exhibits voiced characteristics; a prediction unit 540, configured to predict a high-band excitation signal according to the low frequency encoding parameters; a synthesizing unit 550, configured to weight the high-band excitation signal and random noise by the voiced sound factor to obtain a composite excitation signal; and a high frequency decoding unit 560, configured to obtain the high frequency band signal based on the composite excitation signal and the high frequency encoding parameters.
• After receiving the encoded signal, the distinguishing unit 510 supplies the low frequency encoding parameters in the encoded signal to the low frequency decoding unit 520, and supplies the high frequency encoding parameters in the encoded signal to the high frequency decoding unit 560.
  • the low frequency encoding parameter and the high frequency encoding parameter are parameters transmitted from the encoding end for recovering the low frequency signal and the high frequency signal.
• The low frequency encoding parameters may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, a pitch period, and other parameters; the high frequency encoding parameters may include, for example, LPC coefficients, high band gain parameters, and other parameters.
  • the low frequency decoding unit 520 decodes the low frequency encoding parameters to obtain a low frequency band signal.
  • the specific decoding method corresponds to the encoding mode of the encoding end.
• The low frequency decoding unit 520 also supplies low frequency encoding parameters such as the algebraic codebook, algebraic codebook gain, adaptive codebook, adaptive codebook gain, and pitch period to the calculating unit 530 and the prediction unit 540; alternatively, the calculating unit 530 and the prediction unit 540 can directly acquire the required low frequency encoding parameters from the distinguishing unit 510.
• The calculating unit 530 is configured to calculate a voiced sound factor according to the low frequency encoding parameters, where the voiced sound factor is used to indicate the degree to which the high frequency band signal exhibits voiced characteristics. Specifically, the calculating unit 530 can calculate the voiced sound factor voice_fac from the low frequency encoding parameters obtained by the low frequency decoding unit 520, for example according to the aforementioned formula (1). The voiced sound factor is then used to obtain a composite excitation signal, which is transmitted to the high frequency decoding unit 560 for obtaining the high frequency band signal.
• The prediction unit 540 and the synthesis unit 550 are the same as the prediction unit 340 and the synthesis unit 350 in the audio signal encoding apparatus 300 in FIG. 3, respectively, and their structure is therefore also shown in and described with reference to FIG. 4.
• In one implementation, the prediction unit 540 includes both the second correction component 450 and the prediction component 460; in another implementation, the prediction unit 540 includes only the prediction component 460.
• In one implementation, the synthesis unit 550 includes a pre-emphasis component 410, a weighting component 420, and a de-emphasis component 430; in another implementation, the synthesis unit 550 includes a first correction component 440 and a weighting component 420; in yet another implementation, the synthesis unit 550 includes the pre-emphasis component 410, the weighting component 420, the de-emphasis component 430, and the first correction component 440.
  • the high frequency decoding unit 560 obtains a high frequency band signal based on the combined excitation signal and high frequency encoding parameters.
  • the high frequency decoding unit 560 performs decoding using a decoding technique corresponding to the encoding technique of the high frequency encoding unit in the audio signal encoding device 300.
• Specifically, the high frequency decoding unit 560 generates a synthesis filter using the LPC coefficients in the high frequency encoding parameters, recovers a predicted high frequency band signal from the composite excitation signal supplied by the synthesis unit 550, and adjusts the predicted high band signal by the high band gain adjustment parameter in the high frequency encoding parameters to obtain the final high band signal.
  • the high frequency decoding unit 560 can be implemented in various existing or future technologies, and the specific decoding technique does not constitute a limitation of the present invention.
  • the merging unit 570 combines the low band signal and the high band signal to obtain a final decoded signal.
  • the merging unit 570 is combined in a manner corresponding to the division manner in which the dividing unit 310 in FIG. 3 performs the dividing operation, thereby implementing decoding to obtain a final output signal.
  • the composite excitation signal is obtained by weighting the high-band excitation signal and the random noise by using a voiced sound factor, and the characteristics of the high-frequency signal can be more accurately characterized based on the voiced signal. , thereby improving the decoding effect.
  • FIG. 6 is a block diagram that schematically illustrates a transmitter 600 in accordance with an embodiment of the present invention.
  • the transmitter 600 of Fig. 6 may include the audio signal encoding device 300 as shown in Fig. 3, and thus the repeated description is omitted as appropriate. Further, the transmitter 600 may further include a transmitting unit 610 for allocating bits for the high frequency encoding parameters and the low frequency encoding parameters generated by the audio signal encoding device 300 to generate a bit stream, and transmitting the bit stream.
  • FIG. 7 is a block diagram that schematically illustrates a receiver 700 in accordance with an embodiment of the present invention.
  • the receiver 700 of FIG. 7 may include the audio signal decoding device 500 as shown in FIG. 5, and thus the repeated description is omitted as appropriate.
  • the receiver 700 may further include a receiving unit 710, configured to receive an encoded signal for decoding the audio signal.
  • a communication system is also provided, which may include the transmitter 600 described in connection with FIG. 6 or the receiver 700 described in connection with FIG.
  • FIG. 8 is a schematic block diagram of an apparatus in accordance with another embodiment of the present invention.
  • the apparatus 800 of Figure 8 can be used to implement the various steps and methods of the above method embodiments.
  • Apparatus 800 is applicable to base stations or terminals in various communication systems.
  • the apparatus 800 includes a transmitting circuit 802, a receiving circuit 803, an encoding processor 804, a decoding processor 805, a processing unit 806, a memory 807, and an antenna 801.
  • the processing unit 806 controls the operation of the apparatus 800.
  • the processing unit 806 may also be referred to as a CPU (Central Processing Unit).
  • Memory 807 can include read only memory and random access memory and provides instructions and data to processing unit 806.
• A portion of the memory 807 may also include non-volatile random access memory (NVRAM).
• The device 800 may be embedded in, or may itself be, a wireless communication device such as a mobile telephone, and may also include a carrier that houses the transmitting circuit 802 and the receiving circuit 803 to allow data transmission and reception between the device 800 and a remote location. The transmitting circuit 802 and the receiving circuit 803 can be coupled to the antenna 801.
  • the various components of device 800 are coupled together by a bus system 809, which in addition to the data bus includes a power bus, a control bus, and a status signal bus. However, for clarity of description, various buses are labeled as bus system 809 in the figure.
  • the apparatus 800 can also include a processing unit 806 for processing signals, and further includes an encoding processor 804 and a decoding processor 805.
  • the audio signal encoding method disclosed in the foregoing embodiments of the present invention may be applied to or implemented by the encoding processor 804.
  • the audio signal decoding method disclosed in the foregoing embodiments of the present invention may be applied to or implemented by the decoding processor 805.
• The encoding processor 804 or the decoding processor 805 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above methods may be completed by an integrated logic circuit of hardware in the encoding processor 804 or the decoding processor 805, or by instructions in the form of software. These instructions can be executed and controlled by the processing unit 806.
• The above processors may be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention may be implemented or executed.
  • the general purpose processor may be a microprocessor, or the processor may be any conventional processor, decoder, or the like.
  • the steps of the method disclosed in the embodiment of the present invention may be directly implemented as a hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software modules can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 807, and the encoding processor 804 or the decoding processor 805 reads the information in the memory 807 and performs the steps of the above method in combination with the hardware thereof.
  • memory 807 can store the resulting low frequency encoding parameters for use by encoding processor 804 or decoding processor 805 in encoding or decoding.
  • the audio signal encoding apparatus 300 of FIG. 3 may be implemented by an encoding processor 804, and the audio signal decoding apparatus 500 of FIG. 5 may be implemented by a decoding processor 805.
  • the prediction unit and the synthesis unit of FIG. 4 may be implemented by the processor 806, or may be implemented by the encoding processor 804 or the decoding processor 805.
  • the transmitter 610 of FIG. 6 can be implemented by an encoding processor 804, a transmitting circuit 802, an antenna 801, and the like.
  • the receiver 710 of FIG. 7 can be implemented by an antenna 801, a receiving circuit 803, a decoding processor 805, and the like.
  • the above examples are merely illustrative and are not intended to limit the embodiments of the invention to such specific implementations.
  • the memory 807 stores instructions that cause the processor 806 and/or the encoding processor 804 to: divide the time domain signal to be encoded into a low frequency band signal and a high frequency band signal; encode the low frequency band signal to obtain low frequency encoding parameters; calculate a voiced sound factor according to the low frequency encoding parameters, and predict a high frequency band excitation signal according to the low frequency encoding parameters, wherein the voiced sound factor is used to indicate the degree to which the high frequency band signal exhibits voiced characteristics; weight the high frequency band excitation signal and random noise using the voiced sound factor to obtain a synthesized excitation signal; and obtain high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal.
  • the memory 807 stores instructions that cause the processor 806 or the decoding processor 805 to: distinguish low frequency encoding parameters and high frequency encoding parameters from the encoded information; decode the low frequency encoding parameters to obtain a low frequency band signal; calculate a voiced sound factor according to the low frequency encoding parameters, and predict a high frequency band excitation signal according to the low frequency encoding parameters, the voiced sound factor being used to indicate the degree to which the high frequency band signal exhibits voiced characteristics; weight the high frequency band excitation signal and the random noise using the voiced sound factor to obtain a synthesized excitation signal; obtain the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters; and combine the low frequency band signal and the high frequency band signal to obtain a final decoded signal.
  • the communication system or communication device may include some or all of the above-described audio signal encoding device 300, transmitter 610, audio signal decoding device 500, receiver 710, and the like.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • in actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the displayed components may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • the functions, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium.
  • the technical solution of the present invention essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Spectroscopy & Molecular Physics (AREA)

Abstract

An audio signal encoding and decoding method, an audio signal encoding and decoding apparatus, a transmitter, a receiver, and a communication system, which can improve encoding and/or decoding performance. The audio signal encoding method includes: dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal (110); encoding the low frequency band signal to obtain low frequency encoding parameters (120); calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics (130); weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal (140); and obtaining high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal (150). The encoding or decoding effect can thereby be improved.

Description

Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus

This application claims priority to Chinese Patent Application No. 201310010936.8, filed with the Chinese Patent Office on January 11, 2013 and entitled "Audio signal encoding and decoding method, audio signal encoding and decoding apparatus", the entire contents of which are incorporated herein by reference.

Technical Field

Embodiments of the present invention relate to the field of communication technologies, and more specifically to an audio signal encoding method, an audio signal decoding method, an audio signal encoding apparatus, an audio signal decoding apparatus, a transmitter, a receiver, and a communication system.

Background

With the continuous progress of communication technologies, users' demands on voice quality keep increasing. Generally, voice quality is improved by increasing its bandwidth. If a conventional encoding manner is used to encode the information with increased bandwidth, the bit rate rises greatly, which is hard to realize under the constraints of current network bandwidth. The problem is therefore to encode a signal of wider bandwidth at an unchanged, or only slightly changed, bit rate, and the solution proposed for this problem is band extension technology. Band extension may be performed in the time domain or the frequency domain; the present invention performs band extension in the time domain.

The basic principle of band extension in the time domain is to process the low frequency band signal and the high frequency band signal with two different methods. For the low frequency band signal of the original signal, the encoding side encodes it with a suitable encoder as needed, and the decoding side decodes and recovers the low frequency band signal with a decoder corresponding to that encoder. For the high frequency band signal, the encoding side predicts a high frequency band excitation signal from the low frequency encoding parameters obtained by the encoder for the low frequency band signal, performs, for example, Linear Predictive Coding (LPC) analysis on the high frequency band signal of the original signal to obtain high frequency band LPC coefficients, and passes the high frequency band excitation signal through a synthesis filter determined from the LPC coefficients to obtain a predicted high frequency band signal; it then compares the predicted high frequency band signal with the high frequency band signal of the original signal to obtain high frequency band gain adjustment parameters, and the high frequency band gain parameters and the LPC coefficients are transmitted to the decoding side to recover the high frequency band signal. The decoding side recovers the high frequency band excitation signal from the low frequency encoding parameters extracted while decoding the low frequency band signal, generates a synthesis filter from the LPC coefficients, passes the high frequency band excitation signal through the synthesis filter to recover the predicted high frequency band signal, adjusts it with the high frequency band gain adjustment parameters to obtain the final high frequency band signal, and combines the high frequency band signal and the low frequency band signal into the final output signal.

In the above technique of band extension in the time domain, the high frequency band signal is recovered under a given rate condition, but the performance is still imperfect. Comparing the spectrum of the recovered output signal with that of the original signal shows that, for voiced sounds of ordinary periodicity, the recovered high frequency band signal often contains overly strong harmonic components, whereas the harmonicity of the high frequency band signal in real speech is not that strong; this difference makes the recovered signal sound noticeably mechanical.

Embodiments of the present invention aim to improve the above technique of band extension in the time domain, so as to reduce or even eliminate the mechanical sound in the recovered signal.
Summary
Embodiments of the present invention provide an audio signal encoding method, an audio signal decoding method, an audio signal encoding apparatus, an audio signal decoding apparatus, a transmitter, a receiver, and a communication system, which can reduce or even eliminate the mechanical sound in the recovered signal, thereby improving encoding and decoding performance.

According to a first aspect, an audio signal encoding method is provided, including: dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; encoding the low frequency band signal to obtain low frequency encoding parameters; calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; and obtaining high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal.

With reference to the first aspect, in an implementation of the first aspect, the weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal may include: performing, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise; weighting the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and performing, with a de-emphasis factor, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion, to obtain the synthesized excitation signal.

With reference to the first aspect and the foregoing implementation, in another implementation of the first aspect, the de-emphasis factor may be determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.

With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the low frequency encoding parameters may include a pitch period, and the weighting the predicted high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal may include: correcting the voicing factor using the pitch period; and weighting the high frequency band excitation signal and the random noise using the corrected voicing factor to obtain the synthesized excitation signal.

With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the low frequency encoding parameters may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the predicting a high frequency band excitation signal according to the low frequency encoding parameters may include: correcting the voicing factor using the pitch period; weighting the algebraic codebook and random noise using the corrected voicing factor to obtain a weighted result; and adding the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.

With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the correcting the voicing factor using the pitch period may be performed according to the following formulas:

    voice_fac_A = voice_fac * γ
    γ = −a1 * T0 + b1,  if T0 ≤ threshold_min
    γ = a2 * T0 + b2,   if threshold_min < T0 ≤ threshold_max
    γ = 1,              if T0 > threshold_max

where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively the preset minimum and maximum of the pitch period, and voice_fac_A is the corrected voicing factor.

With reference to the first aspect and the foregoing implementations, in another implementation of the first aspect, the audio signal encoding method may further include: generating an encoded bitstream according to the low frequency encoding parameters and the high frequency encoding parameters, to be sent to a decoding side.
According to a second aspect, an audio signal decoding method is provided, including: distinguishing low frequency encoding parameters and high frequency encoding parameters from encoded information; decoding the low frequency encoding parameters to obtain a low frequency band signal; calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which a high frequency band signal exhibits voiced characteristics; weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; obtaining the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters; and combining the low frequency band signal and the high frequency band signal to obtain a final decoded signal.

With reference to the second aspect, in an implementation of the second aspect, the weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal may include: performing, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise; weighting the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and performing, with a de-emphasis factor, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion, to obtain the synthesized excitation signal.

With reference to the second aspect and the foregoing implementation, in another implementation of the second aspect, the de-emphasis factor may be determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.

With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the low frequency encoding parameters may include a pitch period, and the weighting the predicted high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal may include: correcting the voicing factor using the pitch period; and weighting the high frequency band excitation signal and the random noise using the corrected voicing factor to obtain the synthesized excitation signal.

With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the low frequency encoding parameters may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the predicting a high frequency band excitation signal according to the low frequency encoding parameters may include: correcting the voicing factor using the pitch period; weighting the algebraic codebook and random noise using the corrected voicing factor to obtain a weighted result; and adding the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.

With reference to the second aspect and the foregoing implementations, in another implementation of the second aspect, the correcting the voicing factor using the pitch period is performed according to the following formulas:

    voice_fac_A = voice_fac * γ
    γ = −a1 * T0 + b1,  if T0 ≤ threshold_min
    γ = a2 * T0 + b2,   if threshold_min < T0 ≤ threshold_max
    γ = 1,              if T0 > threshold_max

where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively the preset minimum and maximum of the pitch period, and voice_fac_A is the corrected voicing factor.
According to a third aspect, an audio signal encoding apparatus is provided, including: a dividing unit, configured to divide a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; a low frequency encoding unit, configured to encode the low frequency band signal to obtain low frequency encoding parameters; a calculating unit, configured to calculate a voicing factor according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; a predicting unit, configured to predict a high frequency band excitation signal according to the low frequency encoding parameters; a synthesizing unit, configured to weight the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; and a high frequency encoding unit, configured to obtain high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal.

With reference to the third aspect, in an implementation of the third aspect, the synthesizing unit may include: a pre-emphasis component, configured to perform, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise; a weighting component, configured to weight the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and a de-emphasis component, configured to perform, with a de-emphasis factor, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion, to obtain the synthesized excitation signal.

With reference to the third aspect and the foregoing implementation, in another implementation of the third aspect, the de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.

With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the low frequency encoding parameters may include a pitch period, and the synthesizing unit may include: a first correcting component, configured to correct the voicing factor using the pitch period; and a weighting component, configured to weight the high frequency band excitation signal and random noise using the corrected voicing factor to obtain the synthesized excitation signal.

With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the low frequency encoding parameters may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the predicting unit may include: a second correcting component, configured to correct the voicing factor using the pitch period; and a predicting component, configured to weight the algebraic codebook and random noise using the corrected voicing factor to obtain a weighted result, and add the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.

With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, at least one of the first correcting component and the second correcting component may correct the voicing factor according to the following formulas:

    voice_fac_A = voice_fac * γ
    γ = −a1 * T0 + b1,  if T0 ≤ threshold_min
    γ = a2 * T0 + b2,   if threshold_min < T0 ≤ threshold_max
    γ = 1,              if T0 > threshold_max

where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively the preset minimum and maximum of the pitch period, and voice_fac_A is the corrected voicing factor.

With reference to the third aspect and the foregoing implementations, in another implementation of the third aspect, the audio signal encoding apparatus may further include: a bitstream generating unit, configured to generate an encoded bitstream according to the low frequency encoding parameters and the high frequency encoding parameters, to be sent to a decoding side.
According to a fourth aspect, an audio signal decoding apparatus is provided, including: a distinguishing unit, configured to distinguish low frequency encoding parameters and high frequency encoding parameters from encoded information; a low frequency decoding unit, configured to decode the low frequency encoding parameters to obtain a low frequency band signal; a calculating unit, configured to calculate a voicing factor according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; a predicting unit, configured to predict a high frequency band excitation signal according to the low frequency encoding parameters; a synthesizing unit, configured to weight the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; a high frequency decoding unit, configured to obtain the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters; and a combining unit, configured to combine the low frequency band signal and the high frequency band signal to obtain a final decoded signal.

With reference to the fourth aspect, in an implementation of the fourth aspect, the synthesizing unit may include: a pre-emphasis component, configured to perform, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise; a weighting component, configured to weight the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and a de-emphasis component, configured to perform, with a de-emphasis factor, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion, to obtain the synthesized excitation signal.

With reference to the fourth aspect and the foregoing implementation, in another implementation of the fourth aspect, the de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.

With reference to the fourth aspect and the foregoing implementations, in another implementation of the fourth aspect, the low frequency encoding parameters may include a pitch period, and the synthesizing unit may include: a first correcting component, configured to correct the voicing factor using the pitch period; and a weighting component, configured to weight the high frequency band excitation signal and random noise using the corrected voicing factor to obtain the synthesized excitation signal.

With reference to the fourth aspect and the foregoing implementations, in another implementation of the fourth aspect, the low frequency encoding parameters may include an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the predicting unit may include: a second correcting component, configured to correct the voicing factor using the pitch period; and a predicting component, configured to weight the algebraic codebook and random noise using the corrected voicing factor to obtain a weighted result, and add the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.

With reference to the fourth aspect and the foregoing implementations, in another implementation of the fourth aspect, at least one of the first correcting component and the second correcting component may correct the voicing factor according to the following formulas:

    voice_fac_A = voice_fac * γ
    γ = −a1 * T0 + b1,  if T0 ≤ threshold_min
    γ = a2 * T0 + b2,   if threshold_min < T0 ≤ threshold_max
    γ = 1,              if T0 > threshold_max

where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively the preset minimum and maximum of the pitch period, and voice_fac_A is the corrected voicing factor.

According to a fifth aspect, a transmitter is provided, including: the audio signal encoding apparatus according to the third aspect; and a transmitting unit, configured to allocate bits to the high frequency encoding parameters and the low frequency encoding parameters generated by the audio signal encoding apparatus to generate a bitstream, and transmit the bitstream.

According to a sixth aspect, a receiver is provided, including: a receiving unit, configured to receive a bitstream and extract encoded information from the bitstream; and the audio signal decoding apparatus according to the fourth aspect.

According to a seventh aspect, a communication system is provided, including the transmitter according to the fifth aspect or the receiver according to the sixth aspect.

In the above technical solutions of the embodiments of the present invention, during encoding and decoding, a synthesized excitation signal is obtained by weighting the high frequency band excitation signal and random noise using the voicing factor, so that the characteristics of the high frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the encoding and decoding effect.

Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative effort.

Fig. 1 is a flowchart schematically illustrating an audio signal encoding method according to an embodiment of the present invention;
Fig. 2 is a flowchart schematically illustrating an audio signal decoding method according to an embodiment of the present invention;
Fig. 3 is a block diagram schematically illustrating an audio signal encoding apparatus according to an embodiment of the present invention;
Fig. 4 is a block diagram schematically illustrating a predicting unit and a synthesizing unit in an audio signal encoding apparatus according to an embodiment of the present invention;
Fig. 5 is a block diagram schematically illustrating an audio signal decoding apparatus according to an embodiment of the present invention;
Fig. 6 is a block diagram schematically illustrating a transmitter according to an embodiment of the present invention;
Fig. 7 is a block diagram schematically illustrating a receiver according to an embodiment of the present invention;
Fig. 8 is a schematic block diagram of an apparatus according to another embodiment of the present invention.

Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

In the field of digital signal processing, audio codecs are widely used in various electronic devices, for example: mobile telephones, wireless apparatuses, personal data assistants (PDAs), handheld or portable computers, GPS receivers/navigators, cameras, audio/video players, video cameras, video recorders, monitoring devices, and the like. Generally, such an electronic device includes an audio encoder or an audio decoder to encode and decode audio signals; the audio encoder or decoder may be implemented directly by a digital circuit or a chip, such as a DSP (digital signal processor), or by software code driving a processor to execute the procedures in the software code. In addition, audio codecs and coding/decoding methods may also be applied to various communication systems, such as GSM, Code Division Multiple Access (CDMA) systems, Wideband Code Division Multiple Access (WCDMA), General Packet Radio Service (GPRS), Long Term Evolution (LTE), and the like.
Fig. 1 is a flowchart schematically illustrating an audio signal encoding method according to an embodiment of the present invention. The audio signal encoding method includes: dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal (110); encoding the low frequency band signal to obtain low frequency encoding parameters (120); calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics (130); weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal (140); and obtaining high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal (150).

In 110, the time domain signal to be encoded is divided into a low frequency band signal and a high frequency band signal. This division allows the time domain signal to be processed in two paths, so that the low frequency band signal and the high frequency band signal can be processed separately. The meanings of low frequency band and high frequency band are relative; for example, a frequency threshold may be set, with frequencies below the threshold belonging to the low frequency band and frequencies above it belonging to the high frequency band. In practice, the frequency threshold may be set as needed, or other ways may be used to distinguish the low frequency band signal component and the high frequency band signal component in the signal, thereby achieving the division.

In 120, the low frequency band signal is encoded to obtain low frequency encoding parameters. Through the encoding, the low frequency band signal is processed into low frequency encoding parameters, so that the decoding side can recover the low frequency band signal from them. The low frequency encoding parameters are the parameters the decoding side needs to recover the low frequency band signal. As an example, an encoder using the Algebraic Code Excited Linear Prediction (ACELP) algorithm (an ACELP encoder) may be used; the low frequency encoding parameters obtained in that case may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and may further include other parameters. The low frequency encoding parameters may be transmitted to the decoding side for recovering the low frequency band signal. Moreover, when transmitting the algebraic codebook and the adaptive codebook from the encoding side to the decoding side, only an algebraic codebook index and an adaptive codebook index may be transmitted; the decoding side obtains the corresponding algebraic codebook and adaptive codebook from the indexes, thereby achieving the recovery. In practice, a suitable encoding technique may be adopted as needed to encode the low frequency band signal; when the encoding technique changes, the composition of the low frequency encoding parameters also changes. In the embodiments of the present invention, an encoding technique using the ACELP algorithm is taken as an example for description.
In 130, a voicing factor is calculated according to the low frequency encoding parameters, and a high frequency band excitation signal is predicted according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics. Thus, 130 is used to obtain, from the low frequency encoding parameters, the voicing factor and the high frequency band excitation signal, which represent different characteristics of the high frequency band signal; that is, through 130 the high frequency characteristics of the input signal are obtained for use in encoding the high frequency band signal. The calculation of the voicing factor and the high frequency band excitation signal is described below, taking an encoding technique using the ACELP algorithm as an example.

The voicing factor voice_fac may be calculated according to the following formula (1):

    voice_fac = a * voice_factor^2 + b * voice_factor + c,
    where voice_factor = (ener_adp − ener_cb) / (ener_adp + ener_cb)        formula (1)

where ener_adp is the energy of the adaptive codebook, ener_cb is the energy of the algebraic codebook, and a, b, c are preset values. The parameters a, b, c are set according to the following principles: the value of voice_fac lies between 0 and 1; and the linearly varying voice_factor is turned into the nonlinearly varying voice_fac, so as to better reflect the characteristics of the voicing factor voice_fac.

In addition, to make the voicing factor voice_fac better reflect the characteristics of the high frequency band signal, the voicing factor may further be corrected using the pitch period among the low frequency encoding parameters. As an example, the voicing factor voice_fac of formula (1) may be further modified according to the following formula (2):

    voice_fac_A = voice_fac * γ
    γ = −a1 * T0 + b1,  if T0 ≤ threshold_min
    γ = a2 * T0 + b2,   if threshold_min < T0 ≤ threshold_max               formula (2)
    γ = 1,              if T0 > threshold_max

where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively the preset minimum and maximum of the pitch period, and voice_fac_A is the corrected voicing factor. As an example, the parameters in formula (2) may take the following values: a1 = 0.0126, b1 = 1.23, a2 = 0.0087, b2 = 0, threshold_min = 57.75, threshold_max = 115.5. These parameter values are merely illustrative, and other values may be set as needed. Compared with the uncorrected voicing factor, the corrected voicing factor can more accurately indicate the degree to which the high frequency band signal exhibits voiced characteristics, which helps attenuate the mechanical sound introduced after band extension of voiced signals of ordinary periodicity.
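As an illustration only, the computation of the voicing factor in formula (1) and its pitch-period correction in formula (2) can be sketched in Python. The quadratic coefficients a, b, c below are hypothetical placeholders (the patent fixes only the principles — voice_fac in [0, 1] and a nonlinear mapping — not the values), while a1, b1, a2, b2 and the thresholds use the example values given above:

```python
def voice_factor(ener_adp, ener_cb, a=0.25, b=0.5, c=0.25):
    # Formula (1). a, b, c are hypothetical: chosen here so the quadratic
    # maps vf in [-1, 1] onto [0, 1]; the patent does not fix these values.
    vf = (ener_adp - ener_cb) / (ener_adp + ener_cb)
    return a * vf * vf + b * vf + c

def correct(voice_fac, T0, a1=0.0126, b1=1.23, a2=0.0087, b2=0.0,
            t_min=57.75, t_max=115.5):
    # Formula (2): piecewise-linear gamma over the pitch period T0,
    # using the example parameter values from the text.
    if T0 <= t_min:
        gamma = -a1 * T0 + b1
    elif T0 <= t_max:
        gamma = a2 * T0 + b2
    else:
        gamma = 1.0
    return voice_fac * gamma
```

With the example parameters, γ decreases from b1 toward the minimum pitch period, ramps back up to roughly 1 at the maximum, and stays at 1 beyond it.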
The high frequency band excitation signal Ex may be calculated according to the following formula (3) or formula (4):

    Ex = (FixCB + (1 − voice_fac) * seed) * gc + AdpCB * ga                 formula (3)

    Ex = (voice_fac * FixCB + (1 − voice_fac) * seed) * gc + AdpCB * ga     formula (4)

where FixCB is the algebraic codebook, seed is the random noise, gc is the algebraic codebook gain, AdpCB is the adaptive codebook, and ga is the adaptive codebook gain. It can be seen that in formula (3) or (4), the algebraic codebook FixCB and the random noise seed are weighted using the voicing factor to obtain a weighted result, and the product of the weighted result and the algebraic codebook gain gc is added to the product of the adaptive codebook AdpCB and the adaptive codebook gain ga to obtain the high frequency band excitation signal Ex. Alternatively, in formula (3) or (4), the voicing factor voice_fac may be replaced by the corrected voicing factor voice_fac_A of formula (2), so as to more accurately indicate the degree to which the high frequency band signal exhibits voiced characteristics, that is, to represent the high frequency band signal in the speech signal more faithfully, thereby improving the encoding effect.
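A minimal sketch of the formula (4)-style prediction, treating the codebook contributions as plain per-sample vectors (a simplification; in a real ACELP codec these are indexed excitation vectors):

```python
def predict_high_band_excitation(fix_cb, adp_cb, gc, ga, noise, vf):
    # Formula (4)-style prediction: the algebraic codebook and the random
    # noise are cross-faded by the (possibly corrected) voicing factor vf,
    # scaled by the algebraic-codebook gain gc, and the adaptive-codebook
    # contribution AdpCB * ga is added sample by sample.
    return [(vf * f + (1.0 - vf) * n) * gc + a * ga
            for f, a, n in zip(fix_cb, adp_cb, noise)]
```

With vf = 1 the noise branch vanishes and the result reduces to FixCB*gc + AdpCB*ga; with vf = 0 the algebraic codebook is fully replaced by noise.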
Note that the above ways of calculating the voicing factor and the high frequency band excitation signal are merely illustrative and are not intended to limit the embodiments of the present invention. In other encoding techniques that do not use the ACELP algorithm, other ways may be used to calculate the voicing factor and the high frequency band excitation signal.
In 140, the high frequency band excitation signal and random noise are weighted using the voicing factor to obtain a synthesized excitation signal. As described above, in the prior art, for voiced signals of ordinary periodicity, the periodicity of the high frequency band excitation signal predicted from the low frequency encoding parameters is too strong, which makes the recovered audio signal sound strongly mechanical. Through 140, the high frequency band excitation signal predicted from the low frequency band signal is weighted against noise using the voicing factor, which can weaken the periodicity of the predicted high frequency band excitation signal and thereby attenuate the mechanical sound in the recovered audio signal. Suitable weights may be adopted as needed to realize the weighting. As an example, the synthesized excitation signal SEx may be obtained according to the following formula (5):

    SEx = Ex * voice_fac + seed * (1 − voice_fac) * sqrt(pow1 / pow2)       formula (5)

where Ex is the high frequency band excitation signal, seed is the random noise, voice_fac is the voicing factor, pow1 is the energy of the high frequency band excitation signal, and pow2 is the energy of the random noise. Alternatively, in formula (5), the voicing factor voice_fac may be replaced by the corrected voicing factor voice_fac_A of formula (2), so as to represent the high frequency band signal in the speech signal more accurately, thereby improving the encoding effect. In the case of a1 = 0.0126, b1 = 1.23, a2 = 0.0087, b2 = 0, threshold_min = 57.75, threshold_max = 115.5 in formula (2), if the synthesized excitation signal SEx is obtained according to formula (5), the high frequency band excitation signal has a larger weight when the pitch period T0 is greater than the threshold threshold_max or less than the threshold threshold_min, and other high frequency band excitation signals have a smaller weight. Note that, as needed, ways other than formula (5) may also be used to calculate the synthesized excitation signal.
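The formula (5) weighting can be sketched as follows, under the reading that the noise branch is scaled by sqrt(pow1/pow2) so its level matches that of the excitation before mixing; plain sample lists stand in for real subframe buffers:

```python
import math

def synthesize_excitation(ex, noise, vf):
    # Formula (5)-style mix: excitation weighted by the voicing factor,
    # noise weighted by (1 - vf) and energy-matched to the excitation.
    pow1 = sum(x * x for x in ex)       # energy of the high-band excitation
    pow2 = sum(n * n for n in noise)    # energy of the random noise
    scale = math.sqrt(pow1 / pow2)
    return [x * vf + n * (1.0 - vf) * scale for x, n in zip(ex, noise)]
```

At vf = 1 the output is the excitation itself; at vf = 0 it is the noise rescaled to the excitation's energy.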
In addition, when weighting the high frequency band excitation signal and the random noise with the voicing factor, the random noise may first be pre-emphasized, and de-emphasis may be performed after the weighting. Specifically, 140 may include: performing, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise; weighting the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and performing, with a de-emphasis factor, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion, to obtain the synthesized excitation signal. For ordinary voiced sounds, the noise component usually grows stronger from low frequencies to high frequencies. On this basis, the pre-emphasis operation is performed on the random noise so as to accurately represent the noise signal characteristic in voiced sounds, that is, to raise the high frequency portion of the noise and lower its low frequency portion. As an example of the pre-emphasis operation, the following formula (6) may be used to pre-emphasize the random noise seed(n):

    seed(n) = seed(n) − α * seed(n−1)                                       formula (6)

where n = 1, 2, ..., N, α is the pre-emphasis factor, and 0 < α < 1. The pre-emphasis factor may be set appropriately based on the characteristics of the random noise, so as to accurately represent the noise signal characteristic in voiced sounds. When the pre-emphasis operation is performed by formula (6), the following formula (7) may be used to de-emphasize the pre-emphasized excitation signal S(n):

    S(n) = S(n) + β * S(n−1)                                                formula (7)

where n = 1, 2, ..., N, and β is the preset de-emphasis factor. Note that the pre-emphasis operation shown in formula (6) is merely illustrative, and other ways of pre-emphasis may be used in practice; moreover, when the adopted pre-emphasis operation changes, the de-emphasis operation must change correspondingly. The de-emphasis factor β may be determined based on the pre-emphasis factor α and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal. As an example, when the high frequency band excitation signal and the pre-emphasized noise are weighted with the voicing factor according to formula (5) (what is then obtained is a pre-emphasized excitation signal, and the synthesized excitation signal is obtained only after this pre-emphasized excitation signal is de-emphasized), the de-emphasis factor β may be determined according to the following formula (8) or formula (9):
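The pre-emphasis of formula (6) and the de-emphasis of formula (7) can be sketched as below. Note that with β = α the de-emphasis exactly inverts the pre-emphasis; the patent instead derives β from α and the proportion of pre-emphasized noise in the mixed signal, so β is at most α in general:

```python
def pre_emphasize(seed, alpha):
    # Formula (6): FIR high-pass boost, out[n] = seed[n] - alpha * seed[n-1];
    # the first sample is passed through unchanged.
    out = seed[:]
    for n in range(1, len(seed)):
        out[n] = seed[n] - alpha * seed[n - 1]
    return out

def de_emphasize(s, beta):
    # Formula (7): recursive (IIR) counterpart, out[n] = s[n] + beta * out[n-1].
    out = s[:]
    for n in range(1, len(s)):
        out[n] = s[n] + beta * out[n - 1]
    return out
```

Running de_emphasize(pre_emphasize(x, a), a) returns x unchanged, which is a quick sanity check of the pair.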
    β = α * weight1 / (weight1 + weight2),
    where weight1 = 1 − sqrt(1 − voice_fac), weight2 = sqrt(voice_fac)      formula (8)

    β = α * weight1 / (weight1 + weight2),
    where weight1 = 1 − voice_fac, weight2 = voice_fac                      formula (9)

In 150, high frequency encoding parameters are obtained based on the synthesized excitation signal and the high frequency band signal. As an example, the high frequency encoding parameters include high frequency band gain parameters and high frequency band LPC coefficients. LPC analysis may be performed on the high frequency band signal of the original signal to obtain the high frequency band LPC coefficients; the high frequency band excitation signal is passed through a synthesis filter determined from the LPC coefficients to obtain a predicted high frequency band signal; and the predicted high frequency band signal is then compared with the high frequency band signal of the original signal to obtain high frequency band gain adjustment parameters. The high frequency band gain parameters and the LPC coefficients are transmitted to the decoding side to recover the high frequency band signal. In addition, various existing or future techniques may be used to obtain the high frequency encoding parameters; the specific way of obtaining the high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal does not constitute a limitation on the present invention. After the low frequency encoding parameters and the high frequency encoding parameters are obtained, the encoding of the signal is accomplished, so that they can be transmitted to the decoding side for recovery.
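A sketch of determining the de-emphasis factor β from α and the voicing factor, following the formula (8)/(9) weightings as read above; the exact weight definitions should be treated as an assumption:

```python
import math

def de_emphasis_factor(alpha, voice_fac, use_sqrt_weights=True):
    # beta = alpha * weight1 / (weight1 + weight2); weight1 and weight2
    # track the proportions of pre-emphasized noise vs. excitation in the
    # pre-emphasized excitation signal. voice_fac must lie strictly
    # between 0 and 1 here to avoid a zero denominator.
    if use_sqrt_weights:                        # formula (8) reading
        w1 = 1.0 - math.sqrt(1.0 - voice_fac)
        w2 = math.sqrt(voice_fac)
    else:                                       # formula (9) reading
        w1 = 1.0 - voice_fac
        w2 = voice_fac
    return alpha * w1 / (w1 + w2)
```

Because weight1/(weight1 + weight2) never exceeds 1, the resulting β never exceeds the pre-emphasis factor α.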
After the low frequency encoding parameters and the high frequency encoding parameters are obtained, the audio signal encoding method 100 may further include: generating an encoded bitstream according to the low frequency encoding parameters and the high frequency encoding parameters, to be sent to the decoding side.

In the above audio signal encoding method of the embodiment of the present invention, a synthesized excitation signal is obtained by weighting the high frequency band excitation signal and random noise using the voicing factor, so that the characteristics of the high frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the encoding effect.
Fig. 2 is a flowchart schematically illustrating an audio signal decoding method 200 according to an embodiment of the present invention. The audio signal decoding method includes: distinguishing low frequency encoding parameters and high frequency encoding parameters from encoded information (210); decoding the low frequency encoding parameters to obtain a low frequency band signal (220); calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics (230); weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal (240); obtaining the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters (250); and combining the low frequency band signal and the high frequency band signal to obtain a final decoded signal (260).

In 210, low frequency encoding parameters and high frequency encoding parameters are distinguished from the encoded information. The low frequency encoding parameters and the high frequency encoding parameters are parameters transmitted from the encoding side for recovering the low frequency signal and the high frequency signal. The low frequency encoding parameters may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, a pitch period, and other parameters; the high frequency encoding parameters may include, for example, LPC coefficients, high frequency band gain parameters, and other parameters. In addition, depending on the encoding technique, the low frequency encoding parameters and the high frequency encoding parameters may alternatively include other parameters.

In 220, the low frequency encoding parameters are decoded to obtain a low frequency band signal. The specific decoding manner corresponds to the encoding manner of the encoding side. As an example, when an ACELP encoder using the ACELP algorithm is used for encoding at the encoding side, an ACELP decoder is used in 220 to obtain the low frequency band signal. In 230, a voicing factor is calculated according to the low frequency encoding parameters, and a high frequency band excitation signal is predicted according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics. Through 230, the high frequency characteristics of the encoded signal are obtained from the low frequency encoding parameters for use in decoding (or recovering) the high frequency band signal. The following description takes the decoding technique corresponding to the encoding technique using the ACELP algorithm as an example.

The voicing factor voice_fac may be calculated according to the foregoing formula (1), and, to better reflect the characteristics of the high frequency band signal, the voicing factor voice_fac may be corrected using the pitch period among the low frequency encoding parameters as shown in formula (2) above, to obtain the corrected voicing factor voice_fac_A. Compared with the uncorrected voicing factor voice_fac, the corrected voicing factor voice_fac_A can more accurately indicate the degree to which the high frequency band signal exhibits voiced characteristics, which helps attenuate the mechanical sound introduced after band extension of voiced signals of ordinary periodicity.

The high frequency band excitation signal Ex may be calculated according to the foregoing formula (3) or formula (4). That is, the algebraic codebook and the random noise are weighted using the voicing factor to obtain a weighted result, and the product of the weighted result and the algebraic codebook gain is added to the product of the adaptive codebook and the adaptive codebook gain to obtain the high frequency band excitation signal Ex. Similarly, the voicing factor voice_fac may be replaced by the corrected voicing factor voice_fac_A of formula (2) to further improve the decoding effect.

The above ways of calculating the voicing factor and the high frequency band excitation signal are merely illustrative and are not intended to limit the embodiments of the present invention. In other encoding techniques that do not use the ACELP algorithm, other ways may be used to calculate the voicing factor and the high frequency band excitation signal. For the description of 230, reference may be made to the description of 130 in connection with Fig. 1.
In 240, the high frequency band excitation signal and random noise are weighted using the voicing factor to obtain a synthesized excitation signal. Through 240, the high frequency band excitation signal predicted from the low frequency encoding parameters is weighted against noise using the voicing factor, which can weaken the periodicity of the predicted high frequency band excitation signal and thereby attenuate the mechanical sound in the recovered audio signal.

As an example, in 240, the synthesized excitation signal SEx may be obtained according to formula (5) above, and the voicing factor voice_fac in formula (5) may be replaced by the corrected voicing factor voice_fac_A of formula (2), so as to represent the high frequency band signal in the speech signal more accurately and thereby improve the decoding effect. As needed, other ways may also be used to calculate the synthesized excitation signal.
In addition, when weighting the high frequency band excitation signal and the random noise with the voicing factor voice_fac (or the corrected voicing factor voice_fac_A), the random noise may first be pre-emphasized, and de-emphasis may be performed after the weighting. Specifically, 240 may include: performing, with a pre-emphasis factor α, a pre-emphasis operation on the random noise to boost its high frequency portion (for example, by formula (6)), to obtain pre-emphasized noise; weighting the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and performing, with a de-emphasis factor β, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion (for example, by formula (7)), to obtain the synthesized excitation signal. The pre-emphasis factor α may be preset as needed to accurately represent the noise signal characteristic in voiced sounds, namely that the high frequency portion of the noise is large and the low frequency portion is small. Other types of noise may also be used, in which case the pre-emphasis factor α must change correspondingly to represent the noise characteristic in ordinary voiced sounds. The de-emphasis factor β may be determined based on the pre-emphasis factor α and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal. As an example, the de-emphasis factor β may be determined according to the foregoing formula (8) or formula (9).
For the description of 240, reference may be made to the description of 140 in connection with Fig. 1. In 250, the high frequency band signal is obtained based on the synthesized excitation signal and the high frequency encoding parameters. This 250 is realized inversely to the process at the encoding side of obtaining the high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal. As an example, the high frequency encoding parameters include high frequency band gain parameters and high frequency band LPC coefficients; a synthesis filter may be generated from the LPC coefficients among the high frequency encoding parameters, the synthesized excitation signal obtained in 240 is passed through the synthesis filter to recover the predicted high frequency band signal, and the predicted high frequency band signal is adjusted by the high frequency band gain adjustment parameters among the high frequency encoding parameters to obtain the final high frequency band signal. In addition, various existing or future techniques may be used to realize 250; the specific way of obtaining the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters does not constitute a limitation on the present invention.

In 260, the low frequency band signal and the high frequency band signal are combined to obtain the final decoded signal. The combining manner corresponds to the dividing manner in 110 of Fig. 1, thereby accomplishing the decoding to obtain the final output signal.

In the above audio signal decoding method of the embodiment of the present invention, a synthesized excitation signal is obtained by weighting the high frequency band excitation signal and random noise using the voicing factor, so that the characteristics of the high frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the decoding effect.
Fig. 3 is a block diagram schematically illustrating an audio signal encoding apparatus 300 according to an embodiment of the present invention. The audio signal encoding apparatus 300 includes: a dividing unit 310, configured to divide a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; a low frequency encoding unit 320, configured to encode the low frequency band signal to obtain low frequency encoding parameters; a calculating unit 330, configured to calculate a voicing factor according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; a predicting unit 340, configured to predict a high frequency band excitation signal according to the low frequency encoding parameters; a synthesizing unit 350, configured to weight the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; and a high frequency encoding unit 360, configured to obtain high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal. After receiving the input time domain signal, the dividing unit 310 may use any existing or future dividing technique to realize the division. The meanings of low frequency band and high frequency band are relative; for example, a frequency threshold may be set, with frequencies below the threshold belonging to the low frequency band and frequencies above it belonging to the high frequency band. In practice, the frequency threshold may be set as needed, or other ways may be used to distinguish the low frequency band signal component and the high frequency band signal component in the signal, thereby achieving the division.

The low frequency encoding unit 320 may, for example, use an ACELP encoder using the ACELP algorithm for encoding; the low frequency encoding parameters obtained in that case may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and may further include other parameters. In practice, a suitable encoding technique may be adopted as needed to encode the low frequency band signal; when the encoding technique changes, the composition of the low frequency encoding parameters also changes. The obtained low frequency encoding parameters are the parameters needed to recover the low frequency band signal, and are transmitted to the decoder for low frequency band signal recovery.

The calculating unit 330 calculates, according to the low frequency encoding parameters, a parameter representing the high frequency characteristics of the encoded signal, namely the voicing factor. Specifically, the calculating unit 330 calculates the voicing factor voice_fac according to the low frequency encoding parameters obtained by the low frequency encoding unit 320, for example according to the foregoing formula (1). The voicing factor is then used to obtain the synthesized excitation signal, which is transmitted to the high frequency encoding unit 360 for encoding the high frequency band signal. Fig. 4 is a block diagram schematically illustrating the predicting unit 340 and the synthesizing unit 350 in the audio signal encoding apparatus according to an embodiment of the present invention.
The predicting unit 340 may include only the predicting component 460 of Fig. 4, or may include both the second correcting component 450 and the predicting component 460 of Fig. 4.

To better reflect the characteristics of the high frequency band signal and thereby attenuate the mechanical sound introduced after band extension of voiced signals of ordinary periodicity, the second correcting component 450 corrects the voicing factor voice_fac using the pitch period T0 among the low frequency encoding parameters, for example as shown in formula (2) above, and obtains a corrected voicing factor voice_fac_A2. The predicting component 460 calculates the high frequency band excitation signal Ex, for example according to the foregoing formula (3) or formula (4); that is, it weights the algebraic codebook among the low frequency encoding parameters and random noise using the corrected voicing factor voice_fac_A2 to obtain a weighted result, and adds the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to obtain the high frequency band excitation signal Ex. The predicting component 460 may also weight the algebraic codebook among the low frequency encoding parameters and random noise using the voicing factor voice_fac calculated by the calculating unit 330, in which case the second correcting component 450 may be omitted. Note that the predicting component 460 may also calculate the high frequency band excitation signal Ex in other ways.
As an example, the synthesizing unit 350 may include the pre-emphasis component 410, the weighting component 420, and the de-emphasis component 430 of Fig. 4; or may include the first correcting component 440 and the weighting component 420 of Fig. 4; or may include the pre-emphasis component 410, the weighting component 420, the de-emphasis component 430, and the first correcting component 440 of Fig. 4.

The pre-emphasis component 410 performs, with the pre-emphasis factor α (for example, by formula (6)), a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise PEnoise. This random noise may be the same as the random noise input to the predicting component 460. The pre-emphasis factor α may be preset as needed to accurately represent the noise signal characteristic in voiced sounds, namely that the high frequency portion of the noise is large and the low frequency portion is small. When other types of noise are used, the pre-emphasis factor α must change correspondingly to represent the noise characteristic in ordinary voiced sounds.

The weighting component 420 is configured to weight the high frequency band excitation signal Ex from the predicting component 460 and the pre-emphasized noise PEnoise from the pre-emphasis component 410 using a corrected voicing factor voice_fac_A1, to generate a pre-emphasized excitation signal PEEx. As an example, the weighting component 420 may obtain the pre-emphasized excitation signal PEEx according to formula (5) above (replacing the voicing factor voice_fac therein with the corrected voicing factor voice_fac_A1), and may also calculate the pre-emphasized excitation signal in other ways. The corrected voicing factor voice_fac_A1 is generated by the first correcting component 440, which corrects the voicing factor using the pitch period to obtain the corrected voicing factor voice_fac_A1. The correcting operation performed by the first correcting component 440 may be the same as that of the second correcting component 450, or may differ from it. That is, the first correcting component 440 may use formulas other than formula (2) above to correct the voicing factor voice_fac based on the pitch period.

The de-emphasis component 430 performs, with the de-emphasis factor β (for example, by formula (7)), a de-emphasis operation on the pre-emphasized excitation signal PEEx from the weighting component 420 to suppress its high frequency portion, to obtain the synthesized excitation signal SEx. The de-emphasis factor β may be determined based on the pre-emphasis factor α and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal. As an example, the de-emphasis factor β may be determined according to the foregoing formula (8) or formula (9).

As described above, instead of the corrected voicing factor voice_fac_A1 or voice_fac_A2, the voicing factor voice_fac output from the calculating unit 330 may be provided to one or both of the weighting component 420 and the predicting component 460. In addition, the pre-emphasis component 410 and the de-emphasis component 430 may be removed, in which case the weighting component 420 weights the high frequency band excitation signal Ex and the random noise using the corrected voicing factor (or the voicing factor voice_fac) to obtain the synthesized excitation signal.
For the description of the predicting unit 340 or the synthesizing unit 350, reference may be made to the description of 130 and 140 in connection with Fig. 1.

The high frequency encoding unit 360 obtains the high frequency encoding parameters based on the synthesized excitation signal SEx and the high frequency band signal from the dividing unit 310. As an example, the high frequency encoding unit 360 performs LPC analysis on the high frequency band signal to obtain high frequency band LPC coefficients, passes the high frequency band excitation signal through a synthesis filter determined from the LPC coefficients to obtain a predicted high frequency band signal, and then compares the predicted high frequency band signal with the high frequency band signal from the dividing unit 310 to obtain high frequency band gain adjustment parameters; the high frequency band gain parameters and the LPC coefficients are components of the high frequency encoding parameters. In addition, the high frequency encoding unit 360 may use various existing or future techniques; the specific way of obtaining the high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal does not constitute a limitation on the present invention. After the low frequency encoding parameters and the high frequency encoding parameters are obtained, the encoding of the signal is accomplished, so that they can be transmitted to the decoding side for recovery.

Optionally, the audio signal encoding apparatus 300 may further include: a bitstream generating unit 370, configured to generate an encoded bitstream according to the low frequency encoding parameters and the high frequency encoding parameters, to be sent to the decoding side.

For the operations performed by the units of the audio signal encoding apparatus shown in Fig. 3, reference may be made to the description of the audio signal encoding method in connection with Fig. 1.

In the above audio signal encoding apparatus of the embodiment of the present invention, the synthesizing unit 350 obtains a synthesized excitation signal by weighting the high frequency band excitation signal and random noise using the voicing factor, so that the characteristics of the high frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the encoding effect.
Fig. 5 is a block diagram schematically illustrating an audio signal decoding apparatus 500 according to an embodiment of the present invention. The audio signal decoding apparatus 500 includes: a distinguishing unit 510, configured to distinguish low frequency encoding parameters and high frequency encoding parameters from encoded information; a low frequency decoding unit 520, configured to decode the low frequency encoding parameters to obtain a low frequency band signal; a calculating unit 530, configured to calculate a voicing factor according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; a predicting unit 540, configured to predict a high frequency band excitation signal according to the low frequency encoding parameters; a synthesizing unit 550, configured to weight the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; a high frequency decoding unit 560, configured to obtain the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters; and a combining unit 570, configured to combine the low frequency band signal and the high frequency band signal to obtain a final decoded signal.

After receiving the encoded signal, the distinguishing unit 510 provides the low frequency encoding parameters in the encoded signal to the low frequency decoding unit 520, and provides the high frequency encoding parameters in the encoded signal to the high frequency decoding unit 560. The low frequency encoding parameters and the high frequency encoding parameters are parameters transmitted from the encoding side for recovering the low frequency signal and the high frequency signal. The low frequency encoding parameters may include, for example, an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, a pitch period, and other parameters; the high frequency encoding parameters may include, for example, LPC coefficients, high frequency band gain parameters, and other parameters.

The low frequency decoding unit 520 decodes the low frequency encoding parameters to obtain a low frequency band signal. The specific decoding manner corresponds to the encoding manner of the encoding side. In addition, the low frequency decoding unit 520 provides low frequency encoding parameters such as the algebraic codebook, the algebraic codebook gain, the adaptive codebook, the adaptive codebook gain, and the pitch period to the calculating unit 530 and the predicting unit 540; the calculating unit 530 and the predicting unit 540 may also obtain the needed low frequency encoding parameters directly from the distinguishing unit 510.

The calculating unit 530 is configured to calculate a voicing factor according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics. Specifically, the calculating unit 530 may calculate the voicing factor voice_fac according to the low frequency encoding parameters obtained by the low frequency decoding unit 520, for example according to the foregoing formula (1). The voicing factor is then used to obtain the synthesized excitation signal, which is transmitted to the high frequency decoding unit 560 for obtaining the high frequency band signal.

The predicting unit 540 and the synthesizing unit 550 are respectively the same as the predicting unit 340 and the synthesizing unit 350 in the audio signal encoding apparatus 300 of Fig. 3, so their structures may also be seen in the illustration and description of Fig. 4. For example, in one implementation, the predicting unit 540 includes both the second correcting component 450 and the predicting component 460; in another implementation, the predicting unit 540 includes only the predicting component 460. As for the synthesizing unit 550, in one implementation it includes the pre-emphasis component 410, the weighting component 420, and the de-emphasis component 430; in another implementation it includes the first correcting component 440 and the weighting component 420; in yet another implementation it includes the pre-emphasis component 410, the weighting component 420, the de-emphasis component 430, and the first correcting component 440.

The high frequency decoding unit 560 obtains the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters, using a decoding technique corresponding to the encoding technique of the high frequency encoding unit in the audio signal encoding apparatus 300. As an example, the high frequency decoding unit 560 generates a synthesis filter from the LPC coefficients among the high frequency encoding parameters, passes the synthesized excitation signal from the synthesizing unit 550 through the synthesis filter to recover the predicted high frequency band signal, and adjusts the predicted high frequency band signal by the high frequency band gain adjustment parameters among the high frequency encoding parameters to obtain the final high frequency band signal. In addition, various existing or future techniques may be used to realize the high frequency decoding unit 560; the specific decoding technique does not constitute a limitation on the present invention.

The combining unit 570 combines the low frequency band signal and the high frequency band signal to obtain the final decoded signal. The combining manner of the combining unit 570 corresponds to the dividing manner of the dividing operation performed by the dividing unit 310 of Fig. 3, thereby accomplishing the decoding to obtain the final output signal.

In the above audio signal decoding apparatus of the embodiment of the present invention, a synthesized excitation signal is obtained by weighting the high frequency band excitation signal and random noise using the voicing factor, so that the characteristics of the high frequency signal can be characterized more accurately on the basis of the voiced signal, thereby improving the decoding effect.
Fig. 6 is a block diagram schematically illustrating a transmitter 600 according to an embodiment of the present invention. The transmitter 600 of Fig. 6 may include the audio signal encoding apparatus 300 shown in Fig. 3, so repeated description is appropriately omitted. In addition, the transmitter 600 may further include a transmitting unit 610, configured to allocate bits to the high frequency encoding parameters and the low frequency encoding parameters generated by the audio signal encoding apparatus 300 to generate a bitstream, and transmit the bitstream.

Fig. 7 is a block diagram schematically illustrating a receiver 700 according to an embodiment of the present invention. The receiver 700 of Fig. 7 may include the audio signal decoding apparatus 500 shown in Fig. 5, so repeated description is appropriately omitted. In addition, the receiver 700 may further include a receiving unit 710, configured to receive the encoded signal for the audio signal decoding apparatus 500. In another embodiment of the present invention, a communication system is further provided, which may include the transmitter 600 described in connection with Fig. 6 or the receiver 700 described in connection with Fig. 7.
Fig. 8 is a schematic block diagram of an apparatus according to another embodiment of the present invention. The apparatus 800 of Fig. 8 may be used to implement the steps and methods in the foregoing method embodiments, and may be applied to a base station or a terminal in various communication systems. In the embodiment of Fig. 8, the apparatus 800 includes a transmit circuit 802, a receive circuit 803, an encoding processor 804, a decoding processor 805, a processing unit 806, a memory 807, and an antenna 801. The processing unit 806 controls the operation of the apparatus 800 and may also be called a CPU (Central Processing Unit). The memory 807 may include a read-only memory and a random access memory, and provides instructions and data to the processing unit 806. A portion of the memory 807 may also include a non-volatile random access memory (NVRAM). In specific applications, the apparatus 800 may be embedded in, or may itself be, a wireless communication device such as a mobile telephone, and may further include a carrier housing the transmit circuit 802 and the receive circuit 803, to allow data transmission and reception between the apparatus 800 and a remote location. The transmit circuit 802 and the receive circuit 803 may be coupled to the antenna 801. The components of the apparatus 800 are coupled together by a bus system 809, which includes a power bus, a control bus, and a status signal bus in addition to the data bus; for clarity of description, however, the various buses are all labeled as the bus system 809 in the figure. The apparatus 800 may further include the processing unit 806 for processing signals, as well as the encoding processor 804 and the decoding processor 805.

The audio signal encoding method disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the encoding processor 804, and the audio signal decoding method disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the decoding processor 805. The encoding processor 804 or the decoding processor 805 may be an integrated circuit chip with signal processing capability. In an implementation process, the steps of the above methods may be completed by integrated hardware logic circuits in the encoding processor 804 or the decoding processor 805, or by instructions in the form of software; these instructions may be coordinated and controlled by the processor 806. For executing the methods disclosed in the embodiments of the present invention, the above decoding processor may be a general purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general purpose processor may be a microprocessor, or the processor may be any conventional processor, decoder, or the like. The steps of the methods disclosed in the embodiments of the present invention may be directly embodied as being completed by a hardware decoding processor, or completed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 807; the encoding processor 804 or the decoding processor 805 reads the information in the memory 807 and completes the steps of the above methods in combination with its hardware. For example, the memory 807 may store the obtained low frequency encoding parameters for use by the encoding processor 804 or the decoding processor 805 during encoding or decoding. For example, the audio signal encoding apparatus 300 of Fig. 3 may be implemented by the encoding processor 804, and the audio signal decoding apparatus 500 of Fig. 5 may be implemented by the decoding processor 805. In addition, the predicting unit and the synthesizing unit of Fig. 4 may be implemented by the processor 806, or by the encoding processor 804 or the decoding processor 805. Also, for example, the transmitter 610 of Fig. 6 may be implemented by the encoding processor 804, the transmit circuit 802, the antenna 801, and so on, and the receiver 710 of Fig. 7 may be implemented by the antenna 801, the receive circuit 803, the decoding processor 805, and so on. These examples, however, are merely illustrative and do not limit the embodiments of the present invention to such specific implementation forms.
Specifically, the memory 807 stores instructions that cause the processor 806 and/or the encoding processor 804 to perform the following operations: dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal; encoding the low frequency band signal to obtain low frequency encoding parameters; calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; and obtaining high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal. The memory 807 stores instructions that cause the processor 806 or the decoding processor 805 to perform the following operations: distinguishing low frequency encoding parameters and high frequency encoding parameters from encoded information; decoding the low frequency encoding parameters to obtain a low frequency band signal; calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, where the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics; weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; obtaining the high frequency band signal based on the synthesized excitation signal and the high frequency encoding parameters; and combining the low frequency band signal and the high frequency band signal to obtain a final decoded signal.

A communication system or communication apparatus according to the embodiments of the present invention may include some or all of the above audio signal encoding apparatus 300, transmitter 610, audio signal decoding apparatus 500, receiver 710, and the like.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the particular applications and design constraints of the technical solutions. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.

A person skilled in the art may clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for example, the division of the units is merely a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. The displayed components may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceived by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. An audio signal encoding method, comprising:
dividing a time domain signal to be encoded into a low frequency band signal and a high frequency band signal;
encoding the low frequency band signal to obtain low frequency encoding parameters;
calculating a voicing factor according to the low frequency encoding parameters, and predicting a high frequency band excitation signal according to the low frequency encoding parameters, wherein the voicing factor indicates the degree to which the high frequency band signal exhibits voiced characteristics;
weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; and
obtaining high frequency encoding parameters based on the synthesized excitation signal and the high frequency band signal.

2. The method according to claim 1, wherein the weighting the high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal comprises:
performing, with a pre-emphasis factor, a pre-emphasis operation on the random noise to boost its high frequency portion, to obtain pre-emphasized noise;
weighting the high frequency band excitation signal and the pre-emphasized noise using the voicing factor to generate a pre-emphasized excitation signal; and
performing, with a de-emphasis factor, a de-emphasis operation on the pre-emphasized excitation signal to suppress its high frequency portion, to obtain the synthesized excitation signal.

3. The method according to claim 2, wherein the de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.

4. The method according to claim 1, wherein the low frequency encoding parameters comprise a pitch period, and the weighting the predicted high frequency band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal comprises:
correcting the voicing factor using the pitch period; and
weighting the high frequency band excitation signal and the random noise using the corrected voicing factor to obtain the synthesized excitation signal.

5. The method according to any one of claims 1 to 4, wherein the low frequency encoding parameters comprise an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the predicting a high frequency band excitation signal according to the low frequency encoding parameters comprises:
correcting the voicing factor using the pitch period; and
weighting the algebraic codebook and random noise using the corrected voicing factor to obtain a weighted result, and adding the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain to predict the high frequency band excitation signal.

6. The method according to claim 4 or 5, wherein the correcting the voicing factor using the pitch period is performed according to the following formulas:

    voice_fac_A = voice_fac * γ
    γ = −a1 * T0 + b1,  if T0 ≤ threshold_min
    γ = a2 * T0 + b2,   if threshold_min < T0 ≤ threshold_max
    γ = 1,              if T0 > threshold_max

wherein voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively the preset minimum and maximum of the pitch period, and voice_fac_A is the corrected voicing factor.

7. The method according to claim 1, wherein the audio signal encoding method further comprises:
generating an encoded bitstream according to the low frequency encoding parameters and the high frequency encoding parameters, to be sent to a decoding side.
8. An audio signal decoding method, comprising:
distinguishing a low-band encoding parameter and a high-band encoding parameter from encoded information;
decoding the low-band encoding parameter to obtain a low-band signal;
calculating a voicing factor according to the low-band encoding parameter, and predicting a high-band excitation signal according to the low-band encoding parameter, wherein the voicing factor is used to indicate a degree to which a high-band signal exhibits a voiced characteristic;
weighting the high-band excitation signal and random noise by using the voicing factor, to obtain a synthesized excitation signal;
obtaining a high-band signal based on the synthesized excitation signal and the high-band encoding parameter; and
combining the low-band signal and the high-band signal to obtain a final decoded signal.
9. The method according to claim 8, wherein the weighting the high-band excitation signal and random noise by using the voicing factor to obtain a synthesized excitation signal comprises:
performing, by using a pre-emphasis factor, a pre-emphasis operation for boosting the high-frequency part of the random noise, to obtain pre-emphasized noise;
weighting the high-band excitation signal and the pre-emphasized noise by using the voicing factor, to generate a pre-emphasized excitation signal; and
performing, by using a de-emphasis factor, a de-emphasis operation for suppressing the high-frequency part of the pre-emphasized excitation signal, to obtain the synthesized excitation signal.
10. The method according to claim 9, wherein the de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.
11. The method according to claim 8, wherein the low-band encoding parameter comprises a pitch period, and the weighting the predicted high-band excitation signal and random noise by using the voicing factor to obtain a synthesized excitation signal comprises:
modifying the voicing factor by using the pitch period; and
weighting the high-band excitation signal and the random noise by using the modified voicing factor, to obtain the synthesized excitation signal.
12. The method according to any one of claims 8 to 10, wherein the low-band encoding parameter comprises an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the predicting a high-band excitation signal according to the low-band encoding parameter comprises:
modifying the voicing factor by using the pitch period; and
weighting the algebraic codebook and random noise by using the modified voicing factor to obtain a weighted result, and adding the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain, to predict the high-band excitation signal.
13. The method according to claim 11 or 12, wherein the modifying the voicing factor by using the pitch period is performed according to the following formulas:
voice_fac_A = voice_fac * γ
γ = a1 * T0 + b1,  when T0 ≤ threshold_min
γ = a2 * T0 + b2,  when threshold_min < T0 ≤ threshold_max
γ = 1,             when T0 > threshold_max
where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively a preset minimum value and maximum value of the pitch period, and voice_fac_A is the modified voicing factor.
14. An audio signal encoding apparatus, comprising:
a division unit, configured to divide a time-domain signal to be encoded into a low-band signal and a high-band signal;
a low-band encoding unit, configured to encode the low-band signal to obtain a low-band encoding parameter;
a calculation unit, configured to calculate a voicing factor according to the low-band encoding parameter, wherein the voicing factor is used to indicate a degree to which the high-band signal exhibits a voiced characteristic;
a prediction unit, configured to predict a high-band excitation signal according to the low-band encoding parameter;
a synthesis unit, configured to weight the high-band excitation signal and random noise by using the voicing factor, to obtain a synthesized excitation signal; and
a high-band encoding unit, configured to obtain a high-band encoding parameter based on the synthesized excitation signal and the high-band signal.
15. The apparatus according to claim 14, wherein the synthesis unit comprises:
a pre-emphasis component, configured to perform, by using a pre-emphasis factor, a pre-emphasis operation for boosting the high-frequency part of the random noise, to obtain pre-emphasized noise;
a weighting component, configured to weight the high-band excitation signal and the pre-emphasized noise by using the voicing factor, to generate a pre-emphasized excitation signal; and
a de-emphasis component, configured to perform, by using a de-emphasis factor, a de-emphasis operation for suppressing the high-frequency part of the pre-emphasized excitation signal, to obtain the synthesized excitation signal.
16. The apparatus according to claim 15, wherein the de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.
17. The apparatus according to claim 14, wherein the low-band encoding parameter comprises a pitch period, and the synthesis unit comprises:
a first modification component, configured to modify the voicing factor by using the pitch period; and
a weighting component, configured to weight the high-band excitation signal and random noise by using the modified voicing factor, to obtain the synthesized excitation signal.
18. The apparatus according to any one of claims 14 to 16, wherein the low-band encoding parameter comprises an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the prediction unit comprises:
a second modification component, configured to modify the voicing factor by using the pitch period; and
a prediction component, configured to weight the algebraic codebook and random noise by using the modified voicing factor to obtain a weighted result, and add the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain, to predict the high-band excitation signal.
19. The apparatus according to claim 17 or 18, wherein at least one of the first modification component and the second modification component modifies the voicing factor according to the following formulas:
voice_fac_A = voice_fac * γ
γ = a1 * T0 + b1,  when T0 ≤ threshold_min
γ = a2 * T0 + b2,  when threshold_min < T0 ≤ threshold_max
γ = 1,             when T0 > threshold_max
where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively a preset minimum value and maximum value of the pitch period, and voice_fac_A is the modified voicing factor.
20. The apparatus according to claim 14, wherein the audio signal encoding apparatus further comprises:
a bitstream generation unit, configured to generate an encoded bitstream according to the low-band encoding parameter and the high-band encoding parameter, for sending to a decoding end.
21. An audio signal decoding apparatus, comprising:
a distinguishing unit, configured to distinguish a low-band encoding parameter and a high-band encoding parameter from encoded information;
a low-band decoding unit, configured to decode the low-band encoding parameter to obtain a low-band signal;
a calculation unit, configured to calculate a voicing factor according to the low-band encoding parameter, wherein the voicing factor is used to indicate a degree to which a high-band signal exhibits a voiced characteristic;
a prediction unit, configured to predict a high-band excitation signal according to the low-band encoding parameter;
a synthesis unit, configured to weight the high-band excitation signal and random noise by using the voicing factor, to obtain a synthesized excitation signal;
a high-band decoding unit, configured to obtain a high-band signal based on the synthesized excitation signal and the high-band encoding parameter; and
a combining unit, configured to combine the low-band signal and the high-band signal to obtain a final decoded signal.
22. The apparatus according to claim 21, wherein the synthesis unit comprises:
a pre-emphasis component, configured to perform, by using a pre-emphasis factor, a pre-emphasis operation for boosting the high-frequency part of the random noise, to obtain pre-emphasized noise;
a weighting component, configured to weight the high-band excitation signal and the pre-emphasized noise by using the voicing factor, to generate a pre-emphasized excitation signal; and
a de-emphasis component, configured to perform, by using a de-emphasis factor, a de-emphasis operation for suppressing the high-frequency part of the pre-emphasized excitation signal, to obtain the synthesized excitation signal.
23. The apparatus according to claim 21, wherein the de-emphasis factor is determined based on the pre-emphasis factor and the proportion of the pre-emphasized noise in the pre-emphasized excitation signal.
24. The apparatus according to claim 21, wherein the low-band encoding parameter comprises a pitch period, and the synthesis unit comprises:
a first modification component, configured to modify the voicing factor by using the pitch period; and
a weighting component, configured to weight the high-band excitation signal and random noise by using the modified voicing factor, to obtain the synthesized excitation signal.
25. The apparatus according to any one of claims 21 to 23, wherein the low-band encoding parameter comprises an algebraic codebook, an algebraic codebook gain, an adaptive codebook, an adaptive codebook gain, and a pitch period, and the prediction unit comprises:
a second modification component, configured to modify the voicing factor by using the pitch period; and
a prediction component, configured to weight the algebraic codebook and random noise by using the modified voicing factor to obtain a weighted result, and add the product of the weighted result and the algebraic codebook gain to the product of the adaptive codebook and the adaptive codebook gain, to predict the high-band excitation signal.
26. The apparatus according to claim 24 or 25, wherein at least one of the first modification component and the second modification component modifies the voicing factor according to the following formulas:
voice_fac_A = voice_fac * γ
γ = a1 * T0 + b1,  when T0 ≤ threshold_min
γ = a2 * T0 + b2,  when threshold_min < T0 ≤ threshold_max
γ = 1,             when T0 > threshold_max
where voice_fac is the voicing factor, T0 is the pitch period, a1, a2, b1 > 0, b2 ≥ 0, threshold_min and threshold_max are respectively a preset minimum value and maximum value of the pitch period, and voice_fac_A is the modified voicing factor.
27. A transmitter, comprising:
the audio signal encoding apparatus according to claim 14; and
a transmitting unit, configured to allocate bits to the high-band encoding parameter and the low-band encoding parameter generated by the encoding apparatus to generate a bitstream, and to transmit the bitstream.
28. A receiver, comprising:
a receiving unit, configured to receive a bitstream and extract encoded information from the bitstream; and
the audio signal decoding apparatus according to claim 21.
29. A communication system, comprising the transmitter according to claim 27 or the receiver according to claim 28.
PCT/CN2013/079804 2013-01-11 2013-07-22 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus WO2014107950A1 (zh)

Priority Applications (10)

Application Number Priority Date Filing Date Title
KR1020157013439A KR101736394B1 (ko) 2013-01-11 2013-07-22 Audio signal encoding/decoding method and audio signal encoding/decoding apparatus
JP2015543256A JP6125031B2 (ja) 2013-01-11 2013-07-22 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
SG11201503286UA SG11201503286UA (en) 2013-01-11 2013-07-22 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
EP13871091.8A EP2899721B1 (en) 2013-01-11 2013-07-22 Audio signal encoding/decoding method and audio signal encoding/decoding device
KR1020177012597A KR20170054580A (ko) 2013-01-11 2013-07-22 오디오 신호 인코딩/디코딩 방법 및 오디오 신호 인코딩/디코딩 장치
EP18172248.9A EP3467826A1 (en) 2013-01-11 2013-07-22 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
BR112015014956-1A BR112015014956B1 (pt) 2013-01-11 2013-07-22 Audio signal encoding method, audio signal decoding method, audio signal encoding apparatus, and audio signal decoding apparatus
US14/704,502 US9805736B2 (en) 2013-01-11 2015-05-05 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
US15/717,952 US10373629B2 (en) 2013-01-11 2017-09-28 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
US16/531,116 US20190355378A1 (en) 2013-01-11 2019-08-04 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310010936.8A CN103928029B (zh) 2013-01-11 2013-01-11 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
CN201310010936.8 2013-01-11

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/704,502 Continuation US9805736B2 (en) 2013-01-11 2015-05-05 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus

Publications (1)

Publication Number Publication Date
WO2014107950A1 true WO2014107950A1 (zh) 2014-07-17

Family

ID=51146227

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/079804 WO2014107950A1 (zh) 2013-01-11 2013-07-22 Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus

Country Status (9)

Country Link
US (3) US9805736B2 (zh)
EP (2) EP2899721B1 (zh)
JP (2) JP6125031B2 (zh)
KR (2) KR20170054580A (zh)
CN (2) CN105976830B (zh)
BR (1) BR112015014956B1 (zh)
HK (1) HK1199539A1 (zh)
SG (1) SG11201503286UA (zh)
WO (1) WO2014107950A1 (zh)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
PL4231291T3 (pl) * 2008-12-15 2024-04-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio bandwidth extension decoder, corresponding method, and computer program
CN103426441B (zh) 2012-05-18 2016-03-02 华为技术有限公司 检测基音周期的正确性的方法和装置
CN105976830B (zh) * 2013-01-11 2019-09-20 华为技术有限公司 音频信号编码和解码方法、音频信号编码和解码装置
US9384746B2 (en) * 2013-10-14 2016-07-05 Qualcomm Incorporated Systems and methods of energy-scaled signal processing
WO2015079946A1 (ja) * 2013-11-29 2015-06-04 Sony Corporation Frequency band extension device and method, and program
CN105225671B (zh) 2014-06-26 2016-10-26 华为技术有限公司 编解码方法、装置及系统
US9837089B2 (en) * 2015-06-18 2017-12-05 Qualcomm Incorporated High-band signal generation
US10847170B2 (en) 2015-06-18 2020-11-24 Qualcomm Incorporated Device and method for generating a high-band signal from non-linearly processed sub-ranges
CN106328153B (zh) * 2016-08-24 2020-05-08 Qingdao Goertek Acoustic Technology Co., Ltd. Voice signal processing system and method for an electronic communication device, and electronic communication device
US10825467B2 (en) * 2017-04-21 2020-11-03 Qualcomm Incorporated Non-harmonic speech detection and bandwidth extension in a multi-source environment
CN113196387A (zh) * 2019-01-13 2021-07-30 Huawei Technologies Co., Ltd. High-resolution audio coding and decoding
CN112767954B (zh) * 2020-06-24 2024-06-14 Tencent Technology (Shenzhen) Co., Ltd. Audio encoding and decoding method, apparatus, medium, and electronic device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1484824A (zh) * 2000-10-18 2004-03-24 Nokia Corporation Method and system for estimating a simulated high-band signal in a speech modem
CN101083076A (zh) * 2006-06-03 2007-12-05 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding a signal using bandwidth extension technology
US20070299655A1 (en) * 2006-06-22 2007-12-27 Nokia Corporation Method, Apparatus and Computer Program Product for Providing Low Frequency Expansion of Speech
CN101188111A (zh) * 2006-11-24 2008-05-28 Fujitsu Limited Decoding apparatus and decoding method
WO2010070770A1 (ja) * 2008-12-19 2010-06-24 Fujitsu Limited Voice band extension apparatus and voice band extension method
CN102800317A (zh) * 2011-05-25 2012-11-28 Huawei Technologies Co., Ltd. Signal classification method and device, and encoding/decoding method and device

Family Cites Families (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02230300A (ja) * 1989-03-03 1990-09-12 Nec Corp Speech synthesizer
US5455888A * 1992-12-04 1995-10-03 Northern Telecom Limited Speech bandwidth extension method and apparatus
JPH0954600A (ja) * 1995-08-14 1997-02-25 Toshiba Corp Speech coding communication apparatus
EP0870246B1 (en) 1995-09-25 2007-06-06 Adobe Systems Incorporated Optimum access to electronic documents
CA2252170A1 (en) * 1998-10-27 2000-04-27 Bruno Bessette A method and device for high quality coding of wideband speech and audio signals
US7260523B2 (en) * 1999-12-21 2007-08-21 Texas Instruments Incorporated Sub-band speech coding system
WO2002029782A1 (en) * 2000-10-02 2002-04-11 The Regents Of The University Of California Perceptual harmonic cepstral coefficients as the front-end for speech recognition
US6615169B1 (en) * 2000-10-18 2003-09-02 Nokia Corporation High frequency enhancement layer coding in wideband speech codec
EP1383113A1 (fr) * 2002-07-17 2004-01-21 STMicroelectronics N.V. Method and device for wideband speech coding capable of independently controlling short-term and long-term distortions
EP1383109A1 (fr) * 2002-07-17 2004-01-21 STMicroelectronics N.V. Method and device for wideband speech coding
KR100503415B1 (ko) * 2002-12-09 2005-07-22 Electronics and Telecommunications Research Institute Transcoding apparatus between CELP-based codecs using bandwidth extension, and method thereof
WO2004084182A1 (en) * 2003-03-15 2004-09-30 Mindspeed Technologies, Inc. Decomposition of voiced speech for celp speech coding
EP2080194B1 (fr) * 2006-10-20 2011-12-07 France Telecom Attenuation of overvoicing, in particular for generating an excitation at a decoder in the absence of information
FR2907586A1 (fr) * 2006-10-20 2008-04-25 France Telecom Synthesis of lost blocks of a digital audio signal, with pitch period correction
KR101565919B1 (ko) * 2006-11-17 2015-11-05 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding a high frequency signal
KR101379263B1 (ko) * 2007-01-12 2014-03-28 Samsung Electronics Co., Ltd. Bandwidth extension decoding method and apparatus
CN101617362B (zh) * 2007-03-02 2012-07-18 Panasonic Corporation Speech decoding device and speech decoding method
CN101256771A (zh) * 2007-03-02 2008-09-03 Beijing University of Technology Embedded coding and decoding method, encoder, decoder, and system
CN101414462A (zh) * 2007-10-15 2009-04-22 Huawei Technologies Co., Ltd. Audio coding method, multipoint audio signal mixing control method, and corresponding devices
KR101373004B1 (ko) * 2007-10-30 2014-03-26 Samsung Electronics Co., Ltd. Apparatus and method for encoding and decoding a high frequency signal
US9177569B2 (en) * 2007-10-30 2015-11-03 Samsung Electronics Co., Ltd. Apparatus, medium and method to encode and decode high frequency signal
US8423371B2 (en) 2007-12-21 2013-04-16 Panasonic Corporation Audio encoder, decoder, and encoding method thereof
US8433582B2 (en) * 2008-02-01 2013-04-30 Motorola Mobility Llc Method and apparatus for estimating high-band energy in a bandwidth extension system
US20090201983A1 (en) * 2008-02-07 2009-08-13 Motorola, Inc. Method and apparatus for estimating high-band energy in a bandwidth extension system
KR100998396B1 (ko) * 2008-03-20 2010-12-03 Gwangju Institute of Science and Technology Frame loss concealment method, frame loss concealment apparatus, and voice transmitting/receiving apparatus
CN101572087B (zh) * 2008-04-30 2012-02-29 Beijing University of Technology Embedded speech or audio signal coding and decoding method and apparatus
US8463599B2 (en) * 2009-02-04 2013-06-11 Motorola Mobility Llc Bandwidth extension method and apparatus for a modified discrete cosine transform audio coder
US8718804B2 (en) * 2009-05-05 2014-05-06 Huawei Technologies Co., Ltd. System and method for correcting for lost data in a digital audio signal
CN101996640B (zh) * 2009-08-31 2012-04-04 Huawei Technologies Co., Ltd. Frequency band extension method and apparatus
MX2012004648A (es) * 2009-10-20 2012-05-29 Fraunhofer Ges Forschung Audio signal encoder, audio signal decoder, and method for encoding or decoding an audio signal using aliasing cancellation
US8484020B2 (en) * 2009-10-23 2013-07-09 Qualcomm Incorporated Determining an upperband signal from a narrowband signal
CN104221081B (zh) * 2011-11-02 2017-03-15 Telefonaktiebolaget LM Ericsson (publ) Generation of a high-band extension of a bandwidth-extended audio signal
CN105976830B (zh) 2013-01-11 2019-09-20 Huawei Technologies Co., Ltd. Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
US9728200B2 (en) * 2013-01-29 2017-08-08 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for adaptive formant sharpening in linear prediction coding
HUE054780T2 (hu) * 2013-03-04 2021-09-28 Voiceage Evs Llc Device and method for reducing quantization noise in a time-domain decoder
FR3008533A1 (fr) * 2013-07-12 2015-01-16 Orange Optimized scale factor for frequency band extension in an audio-frequency signal decoder
CN104517610B (zh) * 2013-09-26 2018-03-06 Huawei Technologies Co., Ltd. Frequency band extension method and apparatus
KR101940740B1 (ko) * 2013-10-31 2019-01-22 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio decoder and method for providing decoded audio information using an error concealment that modifies a time-domain excitation signal
US9697843B2 (en) * 2014-04-30 2017-07-04 Qualcomm Incorporated High band excitation signal generation


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2899721A4

Also Published As

Publication number Publication date
KR101736394B1 (ko) 2017-05-16
HK1199539A1 (zh) 2015-07-03
US20190355378A1 (en) 2019-11-21
CN103928029B (zh) 2017-02-08
US9805736B2 (en) 2017-10-31
EP2899721A4 (en) 2015-12-09
BR112015014956B1 (pt) 2021-11-30
EP3467826A1 (en) 2019-04-10
JP6364518B2 (ja) 2018-07-25
US20180018989A1 (en) 2018-01-18
KR20150070398A (ko) 2015-06-24
KR20170054580A (ko) 2017-05-17
EP2899721B1 (en) 2018-09-12
JP2016505873A (ja) 2016-02-25
BR112015014956A8 (pt) 2019-10-15
JP2017138616A (ja) 2017-08-10
CN105976830A (zh) 2016-09-28
CN105976830B (zh) 2019-09-20
SG11201503286UA (en) 2015-06-29
US20150235653A1 (en) 2015-08-20
CN103928029A (zh) 2014-07-16
JP6125031B2 (ja) 2017-05-10
EP2899721A1 (en) 2015-07-29
BR112015014956A2 (pt) 2017-07-11
US10373629B2 (en) 2019-08-06

Similar Documents

Publication Publication Date Title
WO2014107950A1 (zh) Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
JP6574820B2 (ja) Method for predicting a high frequency band signal, encoding device, and decoding device
JP6616470B2 (ja) Encoding method, decoding method, encoding device, and decoding device
US9892739B2 (en) Bandwidth extension audio decoding method and device for predicting spectral envelope
CN101836252A (zh) Method and apparatus for generating an enhancement layer within an audio coding system
US20200227061A1 (en) Signal codec device and method in communication system
WO2014117484A1 (zh) Prediction method and decoding device for a bandwidth extension band signal
KR20160124877A (ko) Voice frequency code stream decoding method and device
WO2023197809A1 (zh) High-frequency audio signal encoding/decoding method and related apparatus
JP6517300B2 (ja) Signal processing method and apparatus
WO2015000373A1 (zh) Signal encoding and decoding method and device
EP3595211B1 (en) Method for processing lost frame, and decoder

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13871091

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2013871091

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 20157013439

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2015543256

Country of ref document: JP

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112015014956

Country of ref document: BR

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 112015014956

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20150619