WO2012149843A1 - Audio signal encoding and decoding method and device - Google Patents

Audio signal encoding and decoding method and device

Info

Publication number
WO2012149843A1
Authority
WO
WIPO (PCT)
Prior art keywords
bandwidth
subband
factor
band
sub
Prior art date
Application number
PCT/CN2012/072778
Other languages
English (en)
French (fr)
Inventor
齐峰岩
刘泽新
苗磊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to KR1020167005104A priority Critical patent/KR101690121B1/ko
Priority to EP12731282.5A priority patent/EP2613315B1/en
Priority to EP16160249.5A priority patent/EP3174049B1/en
Priority to JP2014519382A priority patent/JP5986199B2/ja
Priority to ES12731282.5T priority patent/ES2612516T3/es
Priority to KR1020167035436A priority patent/KR101765740B1/ko
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to KR1020137032084A priority patent/KR101602408B1/ko
Priority to US13/532,237 priority patent/US9105263B2/en
Publication of WO2012149843A1 publication Critical patent/WO2012149843A1/zh
Priority to US14/789,755 priority patent/US9984697B2/en
Priority to US15/981,645 priority patent/US10546592B2/en
Priority to US16/731,897 priority patent/US11127409B2/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002 Dynamic bit allocation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/028 Noise substitution, i.e. substituting non-tonal spectral components by noisy source
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition

Definitions

  • Embodiments of the present invention relate to the field of audio codec technology, and more particularly, to an audio signal encoding and decoding method and apparatus.
  • The input time-domain signal is first transformed into the frequency domain, and the subband normalization factors, i.e., the envelope information of the spectrum, are extracted in the frequency domain.
  • The quantized subband normalization factors are then used to normalize the spectrum to obtain normalized spectral information. The bit allocation for each subband is then determined and the normalized spectrum is quantized, so that the audio signal is encoded as quantized envelope information and normalized spectrum information and output as a bitstream.
  • The decoding end performs the inverse of the encoding-end process.
  • The encoding end cannot encode all frequency bands, so the decoding end needs to use bandwidth extension technology to recover the frequency bands that were not encoded at the encoding end.
  • At the same time, because of quantizer limitations, the encoded subbands also contain many zero-valued frequency bins, so a noise-filling module is needed to improve performance.
  • The decoded subband normalization factors are applied to the decoded normalized spectral coefficients to obtain the reconstructed spectral coefficients, which are then inversely transformed to obtain the output time-domain audio signal.
  • The high-frequency harmonics are allocated a few scattered bits for encoding, but the distribution of these bits on the time axis is not continuous, so the high-frequency harmonics reconstructed during decoding are intermittent, excessive noise is introduced, and the reconstructed audio quality is poor.
  • Summary of the invention
  • Embodiments of the present invention provide an audio signal encoding and decoding method and device, which can improve audio quality.
  • An audio signal encoding method is provided, including: dividing a frequency band of an audio signal into a plurality of subbands, and quantizing a subband normalization factor of each subband; determining a signal bandwidth for bit allocation according to the quantized subband normalization factor, or according to the quantized subband normalization factor and code rate information; allocating bits to the subbands within the determined signal bandwidth; and encoding the spectral coefficients of the audio signal according to the bits allocated to each subband.
  • An audio signal decoding method is provided, including: obtaining a quantized subband normalization factor; determining a signal bandwidth for bit allocation according to the quantized subband normalization factor, or according to the quantized subband normalization factor and code rate information; allocating bits to the subbands within the determined signal bandwidth; decoding the normalized spectrum according to the bits allocated to each subband; performing noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum; and obtaining the spectral coefficients of the audio signal according to the normalized full-band spectrum and the subband normalization factor.
  • An audio signal encoding apparatus is provided, including: a quantization unit, configured to divide a frequency band of an audio signal into a plurality of subbands and quantize a subband normalization factor of each subband; a first determining unit, configured to determine a signal bandwidth for bit allocation according to the subband normalization factor quantized by the quantization unit, or according to the quantized subband normalization factor and code rate information; a first allocation unit, configured to allocate bits to the subbands within the signal bandwidth determined by the first determining unit; and an encoding unit, configured to encode the spectral coefficients of the audio signal according to the bits allocated by the first allocation unit to each subband.
  • An audio signal decoding apparatus is provided, including: an obtaining unit, configured to obtain a quantized subband normalization factor; a second determining unit, configured to determine a signal bandwidth for bit allocation according to the quantized subband normalization factor obtained by the obtaining unit, or according to the quantized subband normalization factor and code rate information; a second allocation unit, configured to allocate bits to the subbands within the signal bandwidth determined by the second determining unit; a decoding unit, configured to decode a normalized spectrum according to the bits allocated by the second allocation unit to each subband; an extension unit, configured to perform noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum; and a recovery unit, configured to obtain the spectral coefficients of the audio signal based on the normalized full-band spectrum and the subband normalization factor.
  • In the embodiments of the present invention, the quantized subband normalization factor or the code rate information is used to determine the signal bandwidth for bit allocation, so that the bits can be concentrated to effectively encode and decode the determined signal bandwidth, which improves audio quality.
  • FIG. 1 is a flowchart of an audio signal encoding method according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of an audio signal decoding method according to an embodiment of the present invention.
  • FIG. 3 is a block diagram of an audio signal encoding apparatus in accordance with one embodiment of the present invention.
  • FIG. 4 is a block diagram of an audio signal encoding apparatus according to another embodiment of the present invention.
  • FIG. 5 is a block diagram of an audio signal decoding apparatus in accordance with one embodiment of the present invention.
  • FIG. 6 is a block diagram of an audio signal decoding apparatus according to another embodiment of the present invention.
  • Mode for carrying out the invention
  • FIG. 1 is a flowchart of an audio signal encoding method according to an embodiment of the present invention.
  • 101. Divide a frequency band of the audio signal into a plurality of subbands, and quantize the subband normalization factor of each subband.
  • The MDCT is taken as an example for the description below.
  • The input audio signal is subjected to the MDCT to obtain frequency-domain coefficients.
  • The MDCT here may include several processes: windowing, time-domain aliasing, and a discrete cosine transform (DCT).
  • First, a sine window is applied to the input time-domain signal $x(n)$, $n = 0, \dots, 2L-1$, where $L$ is the frame length of the signal.
  • $n = L, \dots, 2L-1$
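As a rough illustration of the windowing and transform described above, the following sketch applies a sine window to a frame of 2L samples and computes a direct-form MDCT. The window expression sin((n + 0.5)π / 2L) and the direct-form summation are standard textbook forms assumed here, not taken from this document, which instead describes the transform as windowing, time-domain aliasing, and a DCT. The function names are hypothetical.

```python
import math

def sine_window(two_l):
    """Sine window of length 2L (assumed standard form, not from the source)."""
    return [math.sin((n + 0.5) * math.pi / two_l) for n in range(two_l)]

def mdct(frame):
    """Direct-form MDCT of a frame of 2L samples, returning L coefficients."""
    two_l = len(frame)
    half = two_l // 2  # L, the frame length
    window = sine_window(two_l)
    xw = [w * x for w, x in zip(window, frame)]  # windowed signal
    coeffs = []
    for k in range(half):
        acc = 0.0
        for n in range(two_l):
            phase = math.pi / half * (n + 0.5 + half / 2) * (k + 0.5)
            acc += xw[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs
```

The direct form costs O(L²) per frame and is only meant to make the input and output sizes concrete; a practical codec would use a fast algorithm.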
  • The frequency-domain envelope is then extracted from the MDCT coefficients and quantized.
  • The entire frequency band is divided into subbands of different frequency-domain resolutions, the normalization factor of each subband is extracted, and the subband normalization factors are quantized.
  • For example, for a frame length of 20 ms (640 samples), the frequency band corresponds to a bandwidth of 16 kHz.
  • The subband normalization factor can be defined as:
  • $s_p$ is the starting point of subband $p$,
  • $e_p$ is the ending point of subband $p$, and
  • $P$ is the total number of subbands.
  • After the normalization factor is obtained, it can be quantized in the log domain to obtain the quantized subband normalization factor wnorm.
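A minimal sketch of this step follows. The defining equation is not reproduced in this text, so the sketch assumes the common root-mean-square definition of a subband normalization factor and a log2-domain quantizer with a hypothetical step of 0.5; both are assumptions, as are the function names.

```python
import math

def subband_norms(coeffs, starts, ends):
    """RMS of the coefficients in each subband [s_p, e_p] (assumed definition)."""
    norms = []
    for s, e in zip(starts, ends):
        band = coeffs[s:e + 1]  # e_p is inclusive
        norms.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return norms

def quantize_log2(norm, step=0.5, floor=1e-9):
    """Quantize a normalization factor in the log2 domain (hypothetical step)."""
    return round(math.log2(max(norm, floor)) / step)
```

For example, `subband_norms([3, 4, 0, 0], [0, 2], [1, 3])` gives one factor per subband, and `quantize_log2` maps each factor to an integer index.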
  • The signal bandwidth for bit allocation, sfm_limit, may be defined as a partial bandwidth of the audio signal, such as a low-frequency partial bandwidth 0 to sfm_limit or an intermediate partial bandwidth.
  • A ratio factor fact can be determined based on the code rate information, the ratio factor fact being greater than 0 and less than or equal to 1.
  • The smaller the code rate, the smaller the ratio factor.
  • For example, the fact value corresponding to each code rate can be taken from Table 1 below.
  • fact can also be obtained from an equation, for example $fact = q \times (0.5 + bitrate\_value / 128000)$, where bitrate_value is the value of the code rate, such as 24000, and q is a correction factor.
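The equation above can be sketched as follows. Clamping the result to 1 is an assumption added here so that the result respects the stated range (greater than 0, at most 1); the function name and the default q = 1.0 are hypothetical.

```python
def ratio_factor(bitrate_value, q=1.0):
    """fact = q * (0.5 + bitrate_value / 128000), clamped to at most 1."""
    fact = q * (0.5 + bitrate_value / 128000)
    return min(fact, 1.0)  # clamp is an assumption, not stated in the source
```

With q = 1, a code rate of 24000 gives fact = 0.6875, and a lower code rate gives a smaller fact, consistent with the text.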
  • The partial bandwidth is then determined based on the ratio factor fact and the quantized subband normalization factor wnorm.
  • The spectral energy in each subband can be obtained according to the quantized subband normalization factor, and the spectral energy of the subbands is accumulated from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor fact; the bandwidth at and below the current subband is then taken as the partial bandwidth.
  • For example, the spectral energy of each subband can be obtained from the quantized subband normalization factor according to the following equation:
  • The subbands are then added one by one from low frequency to high frequency, accumulating the spectral energy energy_limit, and it is judged whether energy_limit > fact × energy_sum is satisfied. If not, accumulation continues with the next subband. If so, the current subband is the last subband of the defined partial bandwidth, and the current subband number sfm_limit is output to represent the defined partial bandwidth, i.e., 0 to sfm_limit.
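The accumulation described above can be sketched as follows. The per-subband energies are passed in directly, because the equation that derives them from wnorm is not reproduced in this text; the function name and the fallback to the last subband are assumptions.

```python
def partial_bandwidth(band_energy, fact):
    """Return sfm_limit, the index of the last subband of the partial bandwidth."""
    energy_sum = sum(band_energy)  # total spectral energy of all subbands
    energy_limit = 0.0
    for p, energy in enumerate(band_energy):
        energy_limit += energy  # accumulate from low to high frequency
        if energy_limit > fact * energy_sum:
            return p
    # Condition never met (e.g. fact == 1): fall back to the full band.
    return len(band_energy) - 1
```

A smaller fact stops the accumulation earlier, so bits are concentrated in a narrower low-frequency band.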
  • In the above example, the code rate information is used to determine the ratio factor fact.
  • Alternatively, fact can be determined from the subband normalization factors. For example, the harmonic level or the noise level noise_level of the audio signal is first obtained based on the subband normalization factors. In general, the larger the harmonic level of the audio signal, the smaller the noise level. The noise level is taken as an example for explanation.
  • The noise level noise_level can be obtained as follows.
  • Here, wnorm is the decoded subband normalization factor and sfm is the number of subbands in the entire frequency band.
  • When noise_level is large, fact is also large; when noise_level is small, fact is also small. If the harmonic level is used as the parameter, fact is smaller when the harmonic level is larger, and larger when the harmonic level is smaller.
  • Although the low-frequency partial bandwidth 0 to sfm_limit has been described above as an example, the embodiments of the present invention are not limited thereto.
  • The partial bandwidth may also take other forms as needed; for example, it may be the bandwidth between a certain non-zero low-frequency point and sfm_limit.
  • For bit allocation, the following iterative method can be used: a) find the subband with the largest wnorm value and allocate a certain number of bits to it; b) reduce the wnorm value of that subband correspondingly; c) repeat steps a) and b) until bit allocation is complete.
  • 104. Encode the spectral coefficients of the audio signal according to the bits allocated to each subband.
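The iterative allocation in steps a) to c) can be sketched as follows. The number of bits granted per iteration and the amount by which wnorm is reduced are not specified in this text, so `bits_per_step` and `decrement` are hypothetical parameters.

```python
def allocate_bits(wnorm, total_bits, bits_per_step=1, decrement=1):
    """Greedy allocation: repeatedly give bits to the subband with largest wnorm."""
    work = list(wnorm)  # working copy; the caller's list is not modified
    bits = [0] * len(work)
    remaining = total_bits
    while remaining >= bits_per_step:
        # a) find the subband with the largest (remaining) wnorm value
        p = max(range(len(work)), key=lambda i: work[i])
        bits[p] += bits_per_step
        # b) reduce that subband's wnorm so other subbands get a turn
        work[p] -= decrement
        remaining -= bits_per_step
    return bits
```

Passing only the wnorm values of subbands 0 to sfm_limit restricts the allocation to the determined signal bandwidth.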
  • A lattice vector quantization scheme, or another quantization scheme, can be used to encode the coefficients.
  • In the embodiments of the present invention, the quantized subband normalization factor or the code rate information is used to determine the signal bandwidth for bit allocation, so that the bits can be concentrated to effectively encode and decode the determined signal bandwidth, which improves audio quality.
  • For example, bit allocation is performed within the signal bandwidth 0 to sfm_limit.
  • By limiting the bit-allocation bandwidth to sfm_limit, the bits are concentrated at low bit rates so that the selected frequency band is coded more effectively, which also makes bandwidth extension of the uncoded frequency band more effective. This is mainly because, if the bit-allocation bandwidth is not limited, the high-frequency harmonics are allocated a few scattered bits for encoding, but the distribution of these bits on the time axis is not continuous, so the reconstructed high-frequency harmonics are intermittent. If the bit-allocation bandwidth is limited so that these scattered bits are concentrated in the low frequencies, the low-frequency signal is coded better, and the high-frequency harmonics are obtained from the low-frequency signal by bandwidth extension, which makes the high-frequency harmonic signal more continuous.
  • When bits are allocated within the determined signal bandwidth, the subband normalization factors of the subbands in the bandwidth may first be adjusted, so that more bits can be allocated to the higher-frequency subbands within the bandwidth.
  • The strength of the adjustment can be adaptive to the bit rate. The main consideration is that if the lower-frequency subbands in this bandwidth have larger energy and receive more bits, their quantization is already saturated with bits; the adjustment can then be used to increase the quantization bits of the middle and high frequencies in the band, so that more harmonics are coded, which also benefits bandwidth extension to higher frequencies.
  • For example, the subband normalization factor of the intermediate subband of the partial bandwidth is used as the subband normalization factor of each subband after the intermediate subband; that is, the normalization factor of subband sfm_limit/2 can be used as the subband normalization factor of each subband in the range sfm_limit/2 to sfm_limit. If sfm_limit/2 is not an integer, it can be rounded up or down. The adjusted subband normalization factors are then used when performing bit allocation.
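A minimal sketch of this adjustment, assuming sfm_limit/2 is rounded down and subband indices start at 0 (the function name is hypothetical):

```python
def adjust_norms(wnorm, sfm_limit):
    """Copy the factor of the intermediate subband to the subbands after it."""
    mid = sfm_limit // 2  # rounded down; rounding up is equally permitted
    adjusted = list(wnorm)
    for p in range(mid + 1, sfm_limit + 1):
        adjusted[p] = wnorm[mid]
    return adjusted
```

Subbands above sfm_limit are left unchanged because they lie outside the bit-allocation bandwidth.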
  • Audio signal frame classification can be further considered when applying the encoding and decoding method of the embodiments of the present invention.
  • The embodiments of the present invention can adopt different codec strategies for different classifications, thereby improving the coding and decoding quality of different signals.
  • The audio signal can be divided into various types, such as Noise, Harmonic, and Transient.
  • Noise-like signals are classified into Noise mode, in which the spectrum is relatively flat; signals with steep changes are classified into Transient mode, in which the spectrum is also flat; harmonic signals are classified into Harmonic mode, in which the spectrum varies greatly and contains more information.
  • Embodiments of the present invention may determine whether the frame of the audio signal belongs to a harmonic type or a non-harmonic type prior to step 101 of FIG. 1, and if the frame of the audio signal belongs to a harmonic type, the method of FIG. 1 is continued.
  • For a frame of the harmonic type, the signal bandwidth for bit allocation can be defined in accordance with the embodiment of FIG. 1, i.e., the signal bandwidth for bit allocation of the frame is limited to the partial bandwidth of the frame.
  • For a frame of the non-harmonic type, the signal bandwidth for bit allocation may be limited to a partial bandwidth according to the embodiment of FIG. 1, or it may be left unrestricted; for example, the bit-allocation bandwidth of such a frame is determined to be the full bandwidth of the frame.
  • Audio signal frames can be classified by peak-to-average ratio.
  • For example, the peak-to-average ratio of each subband among all or part of the subbands of the frame (e.g., some high-frequency subbands) is obtained.
  • The peak-to-average ratio refers to the ratio of the peak energy or amplitude of the subband to the average energy or amplitude of the subband.
  • The embodiments of the present invention are not limited to the example of classifying according to the peak-to-average ratio parameter; classification may also be based on other parameters.
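A sketch of classification by peak-to-average ratio follows. It uses amplitudes rather than energies, and a decision rule in which a frame is harmonic when enough subbands exceed a ratio threshold; the threshold values, the subband layout, and the function names are hypothetical.

```python
def peak_to_average(band):
    """Ratio of peak amplitude to average amplitude within one subband."""
    amps = [abs(c) for c in band]
    mean = sum(amps) / len(amps)
    return max(amps) / mean if mean > 0 else 0.0

def is_harmonic_frame(coeffs, bands, par_threshold, count_threshold):
    """Harmonic if enough subbands have a peak-to-average ratio above threshold."""
    count = sum(
        1 for s, e in bands
        if peak_to_average(coeffs[s:e + 1]) > par_threshold
    )
    return count >= count_threshold
```

Here `bands` is a list of inclusive (start, end) index pairs, which may cover all subbands or only some high-frequency subbands.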
  • By limiting the bit-allocation bandwidth sfm_limit, the bits are concentrated at low code rates so that the selected frequency band is coded more effectively, which also makes bandwidth extension of the uncoded frequency band more effective. This is mainly because, if the bit-allocation bandwidth is not limited, the high-frequency harmonics are allocated a few scattered bits for encoding, but the distribution of these bits on the time axis is not continuous, so the reconstructed high-frequency harmonics are intermittent. If the bit-allocation bandwidth is limited so that these scattered bits are concentrated in the low frequencies, the low-frequency signal is coded better, and the high-frequency harmonics are obtained from the low-frequency signal by bandwidth extension, so that the high-frequency harmonic signal is more continuous.
  • FIG. 2 is a flow chart of an audio signal decoding method according to an embodiment of the present invention.
  • The quantized subband normalization factor can be obtained by decoding the bitstream.
  • 202. Determine a signal bandwidth for bit allocation according to the quantized subband normalization factor, or according to the quantized subband normalization factor and the code rate information. 202 is similar to 102 in FIG. 1, so the description is not repeated.
  • In the embodiments of the present invention, the quantized subband normalization factor or the code rate information is used to determine the signal bandwidth for bit allocation, so that the bits can be concentrated to effectively encode and decode the determined signal bandwidth, which improves audio quality.
  • The embodiments of the present invention place no limitation on the order in which noise filling and bandwidth extension are executed in 205. Noise filling can be performed before bandwidth extension, or bandwidth extension can be performed before noise filling. In addition, bandwidth extension may be performed on one part of the frequency band while noise filling is performed on another part. These variations are all within the scope of embodiments of the invention.
  • Bandwidth extension can then be performed to obtain a normalized full-band spectrum. For example, based on the bit allocation of the current frame and of the N frames before it, the first frequency band is determined as the frequency band to be copied, where N is a positive integer. It is generally desirable to select a range of several consecutive subbands that have bits allocated as the first frequency band. Then, based on the spectral coefficients of the first frequency band, the spectral coefficients of the high frequency band are obtained.
  • For example, a correlation between the bits allocated in the current frame and the bits allocated in the previous N frames may be acquired, and the first frequency band is determined according to the acquired correlation.
  • Let the bits allocated in the current frame be R_current and the bits allocated in the previous frame be R_previous.
  • The obtained top_band may be taken as the upper limit of the first band and top_band/2 as the lower limit of the first band. If the difference between the lower limit of the first band of the previous frame and the lower limit of the first band of the current frame is less than 1 kHz, the lower limit of the first band of the previous frame may be taken as the lower limit of the first band of the current frame. This is mainly to ensure continuity of the first frequency band to be extended, thereby ensuring continuity of the extended high-frequency spectrum. R_current of the current frame is then cached and used as R_previous for the next frame. If top_band/2 is not an integer, it can be rounded up or down.
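The limit selection and continuity rule can be sketched as follows, taking top_band as an already computed subband index. The table `subband_start_hz`, which maps a subband index to its starting frequency in Hz, is a hypothetical helper introduced only so that the 1 kHz comparison can be expressed; the function name is also an assumption.

```python
def select_first_band(top_band, prev_lower, subband_start_hz, max_jump_hz=1000.0):
    """Return (lower, upper) subband limits of the first band to be copied."""
    lower = top_band // 2  # rounded down; rounding up is equally permitted
    if prev_lower is not None:
        jump = abs(subband_start_hz[prev_lower] - subband_start_hz[lower])
        if jump < max_jump_hz:
            lower = prev_lower  # reuse the previous lower limit for continuity
    return lower, top_band
```

The caller would keep the returned lower limit and pass it as `prev_lower` when processing the next frame.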
  • An example of performing noise filling first is described above.
  • The embodiments of the present invention are not limited thereto; bandwidth extension may be performed first, and background noise then filled in the extended full frequency band.
  • The noise filling method can be similar to the above example.
  • In addition, the background noise filled in the band range last_sfm to high_sfm can be further adjusted using the noise_level value estimated at the decoder.
  • For the calculation of noise_level, refer to equation (8) above.
  • noise_level is obtained from the decoded subband normalization factors and is used to distinguish the intensity level of the filled noise, so no coded bits need to be transmitted for it.
  • The background noise in the high frequency band can be adjusted using the obtained noise level as follows.
  • $y(k) = \bigl((1 - noise\_level) \cdot y_{norm}(k) + noise\_level \cdot noise\_CB(k)\bigr) \cdot wnorm \qquad (9)$
  • where $y_{norm}(k)$ is the normalized coefficient after decoding, and
  • $noise\_CB(k)$ is a noise codebook.
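Equation (9) can be sketched per subband as follows; the function name is hypothetical, and wnorm is taken here as the linear-domain normalization factor of the subband, which is an assumption.

```python
def fill_noise(y_norm, noise_cb, noise_level, wnorm):
    """Mix decoded normalized coefficients with codebook noise, then scale."""
    return [
        ((1 - noise_level) * yn + noise_level * nc) * wnorm
        for yn, nc in zip(y_norm, noise_cb)
    ]
```

With noise_level = 0 the output is the decoded spectrum scaled by wnorm, and with noise_level = 1 it is pure codebook noise scaled by wnorm.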
  • Embodiments of the present invention can also first adjust the spectral coefficients of the first frequency band and then use the adjusted spectral coefficients for bandwidth extension, to further improve the performance of the high frequency band.
  • A normalization length can be obtained according to the spectral flatness information and the high-band signal type; the spectral coefficients of the first frequency band are normalized using the obtained normalization length, and the normalized spectral coefficients of the first frequency band are used as the spectral coefficients of the high frequency band.
  • The above spectral flatness information may include: a peak-to-average ratio of each subband in the first frequency band, a correlation of the time-domain signal corresponding to the first frequency band, or a zero-crossing rate of the time-domain signal corresponding to the first frequency band.
  • The peak-to-average ratio is used as an example below, but the embodiments of the present invention are not limited thereto, and other spectral flatness information may be used for adjustment in a similar way.
  • The peak-to-average ratio is the ratio of the peak energy or amplitude of a subband to the average energy or amplitude of that subband.
  • First, the peak-to-average ratio of each subband in the first frequency band is obtained according to the spectral coefficients of the first frequency band; whether each subband is a harmonic subband is determined according to the peak-to-average ratio and the maximum peak value in the subband, and the number of harmonic subbands n_band is recorded.
  • The normalization length length_norm_harm is then adaptively determined according to n_band and the signal type of the high frequency band itself:
  • $length\_norm\_harm = \alpha \cdot \left(1 + \dfrac{n\_band}{\dots}\right)$
  • where $\alpha$ adapts to the signal type, for example a harmonic signal.
  • The spectral coefficients of the first frequency band can then be normalized using the obtained normalization length, and the normalized spectral coefficients of the first frequency band can be used as the spectral coefficients of the high frequency band.
  • The decoder can further take the audio signal frame classification into account.
  • The embodiments of the present invention can adopt different coding and decoding strategies for different classifications, thereby improving the coding and decoding quality of different signals.
  • The method of classifying audio signal frames is the same as at the encoding end and is therefore not described again.
  • Classification information indicating the frame type can be extracted from the bitstream.
  • For a frame of the harmonic type, the signal bandwidth for bit allocation can be defined in accordance with the embodiment of FIG. 2, i.e., the signal bandwidth for bit allocation of the frame is limited to a partial bandwidth of the frame.
  • For a frame of the non-harmonic type, the signal bandwidth for bit allocation may be limited to a partial bandwidth according to the embodiment of FIG. 2, or it may be left unrestricted as in the prior art; for example, the bit-allocation bandwidth of such a frame is determined to be the full bandwidth of the frame.
  • In this way, the embodiments of the present invention can improve the quality of harmonic signals without reducing the quality of non-harmonic signals.
  • FIG. 3 is a block diagram of an audio signal encoding apparatus in accordance with one embodiment of the present invention.
  • The audio signal encoding apparatus 30 of FIG. 3 includes a quantization unit 31, a first determining unit 32, a first allocation unit 33, and an encoding unit 34.
  • The quantization unit 31 divides the frequency band of the audio signal into a plurality of subbands and quantizes the subband normalization factor of each subband.
  • The first determining unit 32 determines the signal bandwidth for bit allocation based on the subband normalization factor quantized by the quantization unit 31, or based on the quantized subband normalization factor and code rate information.
  • The first allocation unit 33 allocates bits to the subbands within the signal bandwidth determined by the first determining unit 32.
  • The encoding unit 34 encodes the spectral coefficients of the audio signal based on the bits allocated by the first allocation unit 33 to each subband.
  • In the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factor or the code rate information, so that the bits can be concentrated to effectively encode the determined signal bandwidth, which improves audio quality.
  • FIG. 4 is a block diagram of an audio signal encoding apparatus according to another embodiment of the present invention.
  • In the audio signal encoding apparatus 40 of FIG. 4, parts that are the same as or similar to those of FIG. 3 are denoted by the same reference numerals.
  • The first determining unit 32 may define the signal bandwidth for bit allocation as a partial bandwidth of the audio signal.
  • The first determining unit 32 may include a first ratio factor determining module 321.
  • The first ratio factor determining module 321 can determine a ratio factor fact based on the code rate information, the ratio factor fact being greater than 0 and less than or equal to 1.
  • Alternatively, the first determining unit 32 may include a second ratio factor determining module 322 instead of the first ratio factor determining module 321.
  • The second ratio factor determining module 322 obtains the harmonic level or noise level of the audio signal based on the subband normalization factor, and determines the ratio factor fact based on the harmonic level or noise level.
  • The first determining unit 32 further includes a first bandwidth determining module 323. After the ratio factor fact is obtained, the first bandwidth determining module 323 can determine the partial bandwidth based on the ratio factor fact and the quantized subband normalization factor.
  • When determining the partial bandwidth, the first bandwidth determining module 323 acquires the spectral energy in each subband according to the quantized subband normalization factor and accumulates the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor fact; the bandwidth at and below the current subband is then taken as the partial bandwidth.
  • the audio signal encoding device 40 may further include a classifying unit 35 for classifying frames of the audio signal.
  • the classification unit 35 may determine that the frame of the audio signal belongs to a harmonic type or a non-harmonic type, and if the frame of the audio signal belongs to a harmonic type, the quantization unit 31 is triggered.
  • the type of the frame may be determined based on the peak-to-average ratio.
  • the classification unit 35 acquires the peak-to-average ratio of each subband in all or some of the subbands of the frame. When the number of subbands whose peak-to-average ratio is greater than a first threshold is greater than or equal to a second threshold, the frame is determined to be of the harmonic type; when that number is less than the second threshold, the frame is determined to be of the non-harmonic type.
  • the first determining unit 32 can limit the signal bandwidth of the bit allocation to the partial bandwidth of the frame for the frame belonging to the harmonic type.
  • the first allocation unit 33 may include a subband normalization factor adjustment module 331 and a bit allocation module 332.
  • Subband normalization factor adjustment module 331 adjusts the subband normalization factor of the subbands within the determined signal bandwidth, and bit allocation module 332 performs bit allocation based on the adjusted subband normalization factor.
  • the first allocation unit 33 may use the subband normalization factor of the intermediate subband of the partial bandwidth determined by the first determining unit 32 as the subband normalization factor of each subband after the intermediate subband.
  • the quantized subband normalization factor or the code rate information is used to determine the signal bandwidth for bit allocation, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
  • FIG. 5 is a block diagram of an audio signal decoding apparatus in accordance with one embodiment of the present invention.
  • the audio signal decoding apparatus 50 of Fig. 5 includes an obtaining unit 51, a second determining unit 52, a second allocation unit 53, a decoding unit 54, an extension unit 55, and a recovery unit 56.
  • the obtaining unit 51 obtains the quantized subband normalization factor.
  • the second determining unit 52 determines the signal bandwidth of the bit allocation according to the quantized subband normalization factor acquired by the obtaining unit 51, or according to the quantized subband normalization factor and the code rate information.
  • the second allocation unit 53 allocates bits to subbands within the signal bandwidth determined by the second determining unit 52.
  • the decoding unit 54 decodes the normalized spectrum based on the bits allocated by the second allocation unit 53 for each subband.
  • the extension unit 55 performs noise filling and bandwidth extension on the normalized spectrum decoded by the decoding unit 54 to obtain a normalized full-band spectrum.
  • the recovery unit 56 obtains the spectral coefficients of the audio signal based on the normalized full-band spectrum obtained by the extension unit 55 and the subband normalization factor.
  • the quantized subband normalization factor or the code rate information is used to determine the signal bandwidth for bit allocation, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
  • FIG. 6 is a block diagram of an audio signal decoding apparatus according to another embodiment of the present invention.
  • in the audio signal decoding device 60 of Fig. 6, portions that are the same as or similar to those of Fig. 5 are denoted by the same reference numerals.
  • the second determining unit 52 of the audio signal decoding device 60 can define the signal bandwidth of the bit allocation as a partial bandwidth of the audio signal.
  • the second determining unit 52 may include a third ratio factor determining unit 521 for determining a ratio factor fact based on the code rate information, the ratio factor fact being greater than 0 and less than or equal to 1.
  • alternatively, the second determining unit 52 may include a fourth ratio factor determining unit 522 for acquiring a harmonic level or a noise level of the audio signal according to the subband normalization factor, and determining the ratio factor fact according to the harmonic level or the noise level.
  • the second determining unit 52 further includes a second bandwidth determining module 523.
  • the second bandwidth determining module 523 can determine the partial bandwidth based on the ratio factor fact and the quantized subband normalization factor.
  • when determining the partial bandwidth, the second bandwidth determining module 523 acquires the spectral energy in each subband according to the quantized subband normalization factor and accumulates the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor fact; the bandwidth up to and including the current subband is taken as the partial bandwidth.
  • the extension unit 55 may include a first frequency band determining module 551 and a spectral coefficient acquisition module 552.
  • the first frequency band determining module 551 determines the first frequency band according to the bit allocation of the current frame and its previous N frames, where N is a positive integer.
  • the spectral coefficient obtaining module 552 obtains the spectral coefficients of the high frequency frequency band according to the spectral coefficients of the first frequency band.
  • the first frequency band determining module 551 may acquire the correlation between the bits allocated in the current frame and the bits allocated in the previous N frames, and determine the first frequency band according to the acquired correlation.
  • the audio signal decoding apparatus 60 may further include an adjustment unit 57 for obtaining a noise level based on the subband normalization factor and adjusting the background noise in the high frequency band using the obtained noise level.
  • the spectral coefficient acquisition module 552 can obtain a normalization length according to the spectral flatness information and the high-band signal type, normalize the spectral coefficients of the first frequency band using the obtained normalization length, and use the normalized spectral coefficients of the first frequency band as the spectral coefficients of the high frequency band.
  • the spectral flatness information may include: the peak-to-average ratio of each subband in the first frequency band, the correlation of the time-domain signal corresponding to the first frequency band, or the zero-crossing rate of the time-domain signal corresponding to the first frequency band.
  • the quantized subband normalization factor or the code rate information is used to determine the signal bandwidth for bit allocation, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
  • a codec system may include the above-described audio signal encoding device or audio signal decoding device.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a division by logical function.
  • in an actual implementation there may be other ways of dividing; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be electrical, mechanical or otherwise.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, i.e., may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of a software functional unit and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Mathematical Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Description

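Audio Signal Encoding and Decoding Method and Device
Technical Field
Embodiments of the present invention relate to the field of audio encoding and decoding technologies and, more specifically, to an audio signal encoding and decoding method and device.
Background of the Invention
Current communication transmission places increasing emphasis on audio quality, so encoding and decoding are required to improve music quality as much as possible while preserving speech quality. Because a music signal carries an extremely rich amount of information, the CELP (Code Excited Linear Prediction) coding mode of conventional speech coding cannot be used. Instead, transform coding is generally used to process the music signal in the frequency domain and improve its coding quality. How to encode the information efficiently with a limited number of coding bits has therefore become the main research topic in audio coding.
Current audio coding technologies generally use the FFT (Fast Fourier Transform) or the MDCT (Modified Discrete Cosine Transform) to convert a time-domain signal to the frequency domain, and then encode the frequency-domain signal. Because the limited quantization bits available at low bit rates cannot quantize the entire audio signal, BWE (Bandwidth Extension) and spectrum filling technologies are generally also used.
At the encoder, the input time-domain signal is first transformed to the frequency domain, and subband normalization factors, that is, the envelope information of the spectrum, are extracted in the frequency domain. The spectrum is then normalized using the quantized subband normalization factors to obtain normalized spectrum information. Next, the bit allocation of each subband is determined and the normalized spectrum is quantized. In this way the audio signal is encoded as quantized envelope information and normalized spectrum information, and a bit stream is output.
The decoder performs the inverse of the encoder process. In low-rate coding the encoder cannot encode all frequency bands, so the decoder needs bandwidth extension to recover the bands that the encoder did not encode. In addition, because of quantizer limitations, the encoded subbands may contain many zero-valued frequency bins, so a noise filling module is needed to improve performance. Finally, the decoded subband normalization factors are applied to the decoded normalized spectral coefficients to obtain the reconstructed spectral coefficients, and an inverse transform then yields the output time-domain audio signal.
During encoding, however, high-frequency harmonics are allocated a few scattered bits, and this allocation is not continuous along the time axis. As a result, the high-frequency harmonics reconstructed at the decoder are intermittent, which introduces excessive noise and degrades the quality of the reconstructed audio.
Summary of the Invention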
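Embodiments of the present invention provide an audio signal encoding and decoding method and device that can improve audio quality.
In one aspect, an audio signal encoding method is provided, including: dividing the frequency band of an audio signal into multiple subbands and quantizing the subband normalization factor of each subband; determining the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information; allocating bits to the subbands within the determined signal bandwidth; and encoding the spectral coefficients of the audio signal according to the bits allocated to each subband. In another aspect, an audio signal decoding method is provided, including: obtaining quantized subband normalization factors; determining the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information; allocating bits to the subbands within the determined signal bandwidth; decoding the normalized spectrum according to the bits allocated to each subband; performing noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum; and obtaining the spectral coefficients of the audio signal according to the normalized full-band spectrum and the subband normalization factors.
In another aspect, an audio signal encoding device is provided, including: a quantization unit, configured to divide the frequency band of an audio signal into multiple subbands and quantize the subband normalization factor of each subband; a first determining unit, configured to determine the signal bandwidth for bit allocation according to the subband normalization factors quantized by the quantization unit, or according to the quantized subband normalization factors and code rate information; a first allocation unit, configured to allocate bits to the subbands within the signal bandwidth determined by the first determining unit; and an encoding unit, configured to encode the spectral coefficients of the audio signal according to the bits allocated to each subband by the first allocation unit.
In another aspect, an audio signal decoding device is provided, including: an obtaining unit, configured to obtain quantized subband normalization factors; a second determining unit, configured to determine the signal bandwidth for bit allocation according to the quantized subband normalization factors obtained by the obtaining unit, or according to the quantized subband normalization factors and code rate information; a second allocation unit, configured to allocate bits to the subbands within the signal bandwidth determined by the second determining unit; a decoding unit, configured to decode the normalized spectrum according to the bits allocated to each subband by the second allocation unit; an extension unit, configured to perform noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum; and a recovery unit, configured to obtain the spectral coefficients of the audio signal according to the normalized full-band spectrum and the subband normalization factors.
In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an audio signal encoding method according to an embodiment of the present invention.
Fig. 2 is a flowchart of an audio signal decoding method according to an embodiment of the present invention.
Fig. 3 is a block diagram of an audio signal encoding device according to an embodiment of the present invention.
Fig. 4 is a block diagram of an audio signal encoding device according to another embodiment of the present invention.
Fig. 5 is a block diagram of an audio signal decoding device according to an embodiment of the present invention.
Fig. 6 is a block diagram of an audio signal decoding device according to another embodiment of the present invention.
Modes for Carrying Out the Invention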
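The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an audio signal encoding method according to an embodiment of the present invention.
101. Divide the frequency band of the audio signal into multiple subbands, and quantize the subband normalization factor of each subband.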
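The MDCT is used as an example below. First, the MDCT is applied to the input audio signal to obtain frequency-domain coefficients. The MDCT here may include windowing, time-domain aliasing, and a discrete DCT.
For example, a sine window is applied to the input time-domain signal x(n):
$$h(n) = \sin\left[\left(n + \tfrac{1}{2}\right)\frac{\pi}{2L}\right], \quad n = 0, \ldots, 2L-1 \tag{1}$$
where L is the frame length of the signal.
The windowed signal is:
$$x_w(n) = \begin{cases} h(n)\,x_{old}(n), & n = 0, \ldots, L-1 \\ h(n)\,x(n-L), & n = L, \ldots, 2L-1 \end{cases} \tag{2}$$
A time-domain aliasing operation is then performed:
[Equation (3): time-domain aliasing matrix applied to the windowed signal; original image imgf000007_0001 not reproduced]
Here I_{L/2} and J_{L/2} each denote a diagonal matrix of order L/2:
[Equation (4): definitions of I_{L/2} and J_{L/2}; original image imgf000007_0002 not reproduced]
A discrete DCT is applied to the time-domain aliased signal to obtain the frequency-domain MDCT coefficients:
[Equation (5): DCT of the time-domain aliased signal; original image imgf000007_0003 not reproduced]
The frequency-domain envelope is then extracted from the MDCT coefficients and quantized. The whole band is divided into subbands with different frequency resolutions, the normalization factor of each subband is extracted, and the subband normalization factors are quantized.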
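For example, for an audio signal sampled at 32 kHz, which corresponds to a band 16 kHz wide, with a frame length of 20 ms (640 samples), the subbands can be divided as shown in Table 1.
Table 1. Grouped subband division
Group | Coefficients per subband | Subbands in group | Total coefficients in group | Bandwidth (Hz) | Start frequency (Hz) | End frequency (Hz)
I | 8 | 16 | 128 | 3200 | 0 | 3200
II | 16 | 8 | 128 | 3200 | 3200 | 6400
III | 24 | 12 | 288 | 7200 | 6400 | 13600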
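The band is first divided into several groups, and the subbands are then refined within each group. The normalization factor of each subband can be defined as:
$$Norm(p) = \sqrt{\frac{1}{L_p}\sum_{k=s_p}^{e_p} y(k)^2}, \quad p = 0, \ldots, P-1 \tag{6}$$
where y(k) denotes the MDCT coefficients, L_p is the number of coefficients in the subband, s_p is the start point of the subband, e_p is the end point of the subband, and P is the total number of subbands.
After the normalization factors are obtained, they can be quantized in the logarithmic domain to obtain the quantized subband normalization factors wnorm.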
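The per-subband normalization factor and its log-domain quantization can be sketched as follows. This is a minimal illustration, not the codec's actual quantizer: the function names, the uniform log2 step `step`, and the `floor` guard are assumptions introduced here, and only the subband layout is taken from Table 1.

```python
import math

# Subband layout from Table 1: 16 subbands of 8 coefficients,
# 8 subbands of 16, and 12 subbands of 24 (544 coefficients in total).
TABLE1_WIDTHS = [8] * 16 + [16] * 8 + [24] * 12

def subband_norms(coeffs, widths):
    """Root-mean-square value of each subband; widths[p] plays the role of L_p."""
    norms, start = [], 0
    for width in widths:
        band = coeffs[start:start + width]
        norms.append(math.sqrt(sum(c * c for c in band) / width))
        start += width
    return norms

def quantize_log2(norm, step=0.5, floor=2.0 ** -20):
    """Uniform quantization of log2(norm).

    `step` and `floor` are illustrative assumptions, not values from the source.
    Returns the quantizer index and the reconstructed norm.
    """
    index = round(math.log2(max(norm, floor)) / step)
    return index, 2.0 ** (index * step)
```

Only the quantizer index would need to be transmitted; encoder and decoder can both rebuild the same quantized value from it.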
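102. Determine the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information.
Optionally, in one embodiment, the signal bandwidth for bit allocation, sfm_limit, may be limited to a partial bandwidth of the audio signal, for example the low-frequency partial bandwidth 0~sfm_limit or a partial bandwidth in the middle of the band.
In one example, when limiting the bit allocation bandwidth sfm_limit, a ratio factor fact may be determined according to the code rate information, where fact is greater than 0 and less than or equal to 1. In one embodiment, the lower the code rate, the smaller the ratio factor. For example, the fact value corresponding to each code rate can be obtained from Table 2.
Table 2. Code rates and corresponding fact values
Code rate | fact
24 kbps | 0.8
32 kbps | 0.9
48 kbps | 0.95
>64 kbps | 1
Alternatively, fact may be obtained from an equation, for example fact = q × (0.5 + bitrate_value/128000), where bitrate_value is the value of the code rate, such as 24000, and q is a correction factor. For example, q may be set to 1. The embodiments of the present invention are not limited to these specific numerical examples.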
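Both ways of deriving fact from the code rate can be sketched as follows. The function names are hypothetical. Treating rates between table rows by rounding up to the next row, returning 1 for every rate above 48 kbps, and clamping the formula's result to at most 1 are assumptions made here so that fact stays in the stated range; the source does not specify them.

```python
# Rows of Table 2 as (code rate in bit/s, fact).
FACT_TABLE = [(24000, 0.8), (32000, 0.9), (48000, 0.95)]

def fact_from_table(bitrate):
    """Look up fact; rates between rows use the next higher row (assumption)."""
    for limit, fact in FACT_TABLE:
        if bitrate <= limit:
            return fact
    return 1.0  # higher rates use the full bandwidth

def fact_from_formula(bitrate, q=1.0):
    """fact = q * (0.5 + bitrate / 128000), clamped to at most 1 (assumption)."""
    return min(q * (0.5 + bitrate / 128000), 1.0)
```

With q = 1 the formula gives 0.6875 at 24000 bit/s and reaches 1 at 64000 bit/s, so the two variants agree in trend but not in exact values.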
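The partial bandwidth is then determined according to the ratio factor fact and the quantized subband normalization factors wnorm. The spectral energy in each subband can be obtained from the quantized subband normalization factors, and the spectral energy of the subbands is accumulated from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor fact. The bandwidth up to and including the current subband is then taken as the partial bandwidth.
For example, a lowest accumulation frequency bin can first be set, and the sum energy_low of the spectral energy of the subbands below this bin is computed. The spectral energy can be obtained from the subband normalization factors according to the following equation:
$$energy\_low = \sum_{p=0}^{q} wnorm(p), \quad q \le P-1 \tag{7}$$
where q is the subband corresponding to the set lowest accumulation frequency bin.
Subbands continue to be added in the same way until the total spectral energy energy_sum of all subbands is obtained.
Starting from energy_low, subbands are added one at a time from low frequency to high frequency, accumulating the spectral energy energy_limit, and it is checked whether energy_limit > fact × energy_sum is satisfied. If not, the accumulation of subband spectral energy continues. If so, the current subband is the last subband of the limited partial bandwidth, and its index sfm_limit is output to represent the limited partial bandwidth, that is, 0~sfm_limit.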
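The accumulation procedure above can be sketched as follows. The function name and the `lowest` parameter (the subband q of the lowest accumulation bin) are illustrative. Stopping at the last subband when the threshold is never exceeded, as happens with fact = 1, is an assumption about the boundary case.

```python
def bandwidth_limit(wnorm, fact, lowest=0):
    """Return sfm_limit, the index of the last subband of the partial bandwidth.

    wnorm  : quantized subband normalization factors, used as subband energies
    fact   : ratio factor, 0 < fact <= 1
    lowest : subband of the lowest accumulation bin (q in equation (7))
    """
    energy_sum = sum(wnorm)                  # total energy of all subbands
    energy_limit = sum(wnorm[:lowest + 1])   # starts at energy_low
    sfm_limit = lowest
    # Add subbands from low to high until the threshold is exceeded.
    while energy_limit <= fact * energy_sum and sfm_limit < len(wnorm) - 1:
        sfm_limit += 1
        energy_limit += wnorm[sfm_limit]
    return sfm_limit
```

For wnorm = [4, 3, 2, 1] and fact = 0.8 the threshold is 8, which is first exceeded after the third subband, so the function returns index 2.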
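In the example above, the ratio factor fact is determined from the code rate information. In another example, fact can be determined from the subband normalization factors. For example, the harmonic level or the noise level noise_level of the audio signal is first obtained from the subband normalization factors. In general, the higher the harmonic level of an audio signal, the lower its noise level. The noise level is used as an example below. noise_level can be obtained according to the following equation:
$$noise\_level = \frac{\sum_{i=0}^{sfm-2} \left| wnorm(i+1) - wnorm(i) \right|}{\sum_{i=0}^{sfm-1} wnorm(i)} \tag{8}$$
where wnorm is the decoded subband normalization factor and sfm is the number of subbands in the whole band.
When noise_level is larger, fact is also larger; when noise_level is smaller, fact is also smaller. If the harmonic level is used as the parameter instead, fact is smaller when the harmonic level is higher and larger when the harmonic level is lower.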
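Equation (8) can be sketched as a ratio of two sums over the subband normalization factors. The function name is illustrative, and returning 0 for an all-zero envelope is an assumption added to avoid division by zero. The mapping from noise_level to fact is not given in the source, so it is not shown.

```python
def noise_level(wnorm):
    """Sum of absolute differences between adjacent norms over the sum of norms."""
    diff = sum(abs(wnorm[i + 1] - wnorm[i]) for i in range(len(wnorm) - 1))
    total = sum(wnorm)
    return diff / total if total > 0 else 0.0  # guard is an assumption
```

Because it uses only the decoded subband normalization factors, the same value can be computed at the decoder without any extra side information.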
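It should be noted that although the low-frequency partial bandwidth 0~sfm_limit is used as an example above, the embodiments of the present invention are not limited to this. As needed, the partial bandwidth may take other forms, for example the bandwidth between some non-zero low-frequency bin and sfm_limit. All such variations fall within the scope of the embodiments of the present invention.
103. Allocate bits to the subbands within the determined signal bandwidth.
Bits are allocated according to the wnorm values of the subbands within the determined signal bandwidth. The following iterative method can be used: a) find the subband with the largest wnorm value and allocate a certain number of bits to it; b) reduce the wnorm value of this subband accordingly; c) repeat steps a) and b) until all bits have been allocated.
104. Encode the spectral coefficients of the audio signal according to the bits allocated to each subband.
For example, the coefficients may be encoded using a lattice vector quantization scheme or another existing scheme for quantizing MDCT spectral coefficients.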
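The iterative allocation of step 103 (steps a to c) can be sketched as follows. The source only says that "a certain number" of bits is allocated and that wnorm is reduced "accordingly", so the fixed increment `step_bits` and the fixed reduction `norm_drop` are assumptions, as is treating wnorm as a log-domain value.

```python
def allocate_bits(wnorm, sfm_limit, total_bits, step_bits=1, norm_drop=1.0):
    """Greedy bit allocation restricted to subbands 0..sfm_limit.

    step_bits and norm_drop are illustrative assumptions.
    """
    norms = list(wnorm[:sfm_limit + 1])   # working copy, limited bandwidth only
    bits = [0] * len(wnorm)               # subbands above sfm_limit get no bits
    remaining = total_bits
    while remaining >= step_bits:
        # a) find the subband with the largest remaining norm
        p = max(range(len(norms)), key=norms.__getitem__)
        bits[p] += step_bits
        # b) reduce its norm so that other subbands get their turn
        norms[p] -= norm_drop
        remaining -= step_bits            # c) repeat until bits run out
    return bits
```

Subbands above sfm_limit receive no bits at all, which is the concentration effect the text describes; their content is later regenerated by bandwidth extension.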
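In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
For example, when the determined signal bandwidth is the low-frequency part 0~sfm_limit, bits are allocated within the signal bandwidth 0~sfm_limit. Limiting the bit allocation bandwidth sfm_limit makes it possible, at low code rates, to concentrate the bits on effectively encoding the selected band, and also makes bandwidth extension of the unencoded bands more effective. The main reason is that, without a limit on the bit allocation bandwidth, high-frequency harmonics are allocated a few scattered bits whose distribution is not continuous along the time axis, so the reconstructed high-frequency harmonics are intermittent. If the bit allocation bandwidth is limited, these scattered bits are concentrated on the low frequencies, so the low-frequency signal is encoded better, and the high-frequency harmonics are obtained by bandwidth extension from the low-frequency signal, which makes the high-frequency harmonic signal more continuous.
Optionally, in one embodiment, when bits are allocated in step 103 of Fig. 1 after the bit allocation bandwidth sfm_limit is determined, the subband normalization factors of the subbands within this bandwidth may first be adjusted so that the middle and high part of this bandwidth is allocated more bits. The strength of the adjustment may adapt to the code rate. The main consideration is that if the lower bands within this bandwidth have high energy and receive many bits, so that the bits needed for quantization are already saturated, this adjustment can increase the quantization bits for the middle and high frequencies within the bandwidth. More harmonics can then be encoded, which also benefits bandwidth extension to higher frequencies. For example, the subband normalization factor of the middle subband of the partial bandwidth is used as the subband normalization factor of each subband after the middle subband; that is, the normalization factor of subband sfm_limit/2 is used as the subband normalization factor of each subband in the range sfm_limit/2~sfm_limit. If sfm_limit/2 is not an integer, it may be rounded up or down. The adjusted subband normalization factors are then used for bit allocation.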
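The adjustment described above, where the middle subband's normalization factor replaces those of the subbands after it, can be sketched as follows. The function name is illustrative, and rounding sfm_limit/2 down is one of the two choices the text allows.

```python
def adjust_norms(wnorm, sfm_limit):
    """Copy the norm of the middle subband to subbands mid+1..sfm_limit."""
    mid = sfm_limit // 2          # rounded down; rounding up is also allowed
    adjusted = list(wnorm)        # leave the input untouched
    for p in range(mid + 1, sfm_limit + 1):
        adjusted[p] = wnorm[mid]
    return adjusted
```

Since the spectral envelope usually decays toward higher frequencies, this raises the norms of the upper half of the limited bandwidth, so a norm-driven allocation assigns it more bits.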
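In addition, according to another embodiment of the present invention, the classification of audio signal frames may be further considered when applying the encoding and decoding method of the embodiments. In this way, different encoding and decoding strategies can be used for different classes, which improves the encoding and decoding quality of different signals. For example, audio signals can be divided into several types such as Noise, Harmonic, and Transient. In general, noise-like signals are assigned to the Noise mode, in which the spectrum is relatively flat. Signals with abrupt changes in the time domain are assigned to the Transient mode, in which the spectrum is also relatively flat. Signals with strong harmonicity are assigned to the Harmonic mode, in which the spectrum varies more and contains more information.
The harmonic type and the non-harmonic type are used for the description below. Before step 101 of Fig. 1, it may be determined whether a frame of the audio signal is of the harmonic type or the non-harmonic type, and if the frame is of the harmonic type, the method of Fig. 1 continues. Specifically, for a frame of the harmonic type, the signal bandwidth for bit allocation may be limited according to the embodiment of Fig. 1, that is, limited to a partial bandwidth of the frame. For a frame of the non-harmonic type, the signal bandwidth for bit allocation may be limited to a partial bandwidth according to the embodiment of Fig. 1, or it may be left unlimited; for example, the bit allocation bandwidth of such a frame is determined to be the full bandwidth of the frame.
Audio signal frames can be classified according to the peak-to-average ratio. For example, the peak-to-average ratio of each subband is obtained for all or some of the subbands of the frame (for example, some high-frequency subbands). The peak-to-average ratio is the ratio of the peak energy or amplitude of the subband to the average energy or amplitude of the subband. When the number of subbands whose peak-to-average ratio is greater than a first threshold is greater than or equal to a second threshold, the frame is determined to be of the harmonic type; when that number is less than the second threshold, the frame is determined to be of the non-harmonic type. The first threshold and the second threshold can be set or changed as needed.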
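The peak-to-average classification can be sketched as follows. The function name and both threshold parameters are placeholders; the source states only that the thresholds can be set as needed. Energy is used here, although the text allows amplitude as well.

```python
def is_harmonic_frame(subbands, par_threshold, count_threshold):
    """Classify a frame from the spectral coefficients of selected subbands.

    subbands        : list of coefficient lists, one per examined subband
    par_threshold   : first threshold, applied to each peak-to-average ratio
    count_threshold : second threshold, applied to the number of peaky subbands
    """
    count = 0
    for band in subbands:
        energies = [c * c for c in band]
        mean = sum(energies) / len(energies)
        # Peak-to-average ratio of this subband, measured on energy.
        if mean > 0 and max(energies) / mean > par_threshold:
            count += 1
    return count >= count_threshold
```

A subband holding a single spectral line has a high ratio, while a flat, noise-like subband has a ratio close to 1.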
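However, the embodiments of the present invention are not limited to classification by the peak-to-average ratio; classification may also be based on other parameters.
Limiting the bit allocation bandwidth sfm_limit makes it possible, at low code rates, to concentrate the bits on effectively encoding the selected band, and also makes bandwidth extension of the unencoded bands more effective. The main reason is that, without a limit on the bit allocation bandwidth, high-frequency harmonics are allocated a few scattered bits whose distribution is not continuous along the time axis, so the reconstructed high-frequency harmonics are intermittent. If the bit allocation bandwidth is limited, these scattered bits are concentrated on the low frequencies, so the low-frequency signal is encoded better, and the high-frequency harmonics are obtained by bandwidth extension from the low-frequency signal, which makes the high-frequency harmonic signal more continuous.
The processing at the encoder is described above; the decoder performs the inverse process.
Fig. 2 is a flowchart of an audio signal decoding method according to an embodiment of the present invention.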
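201. Obtain the quantized subband normalization factors.
The quantized subband normalization factors can be obtained by decoding the bit stream.
202. Determine the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information. Step 202 is similar to step 102 in Fig. 1, so the description is not repeated.
203. Allocate bits to the subbands within the determined signal bandwidth. Step 203 is similar to step 103 in Fig. 1, so the description is not repeated.
204. Decode the normalized spectrum according to the bits allocated to each subband.
205. Perform noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum.
206. Obtain the spectral coefficients of the audio signal according to the normalized full-band spectrum and the subband normalization factors. For example, the normalized spectrum of each subband is multiplied by the subband normalization factor of that subband to recover the spectral coefficients of the audio signal.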
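The restoration in step 206 is a per-subband scaling, which can be sketched as follows. The function name and the `widths` argument (the number of coefficients in each subband) are illustrative.

```python
def restore_spectrum(normalized, widths, wnorm):
    """Multiply each subband of the normalized spectrum by its subband norm."""
    out, start = [], 0
    for width, norm in zip(widths, wnorm):
        out.extend(c * norm for c in normalized[start:start + width])
        start += width
    return out
```

The result is the input to the inverse transform that produces the time-domain signal.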
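In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
The embodiments of the present invention place no restriction on the order in which noise filling and bandwidth extension are performed in step 205. Noise filling may be performed before bandwidth extension, or bandwidth extension may be performed before noise filling. In addition, bandwidth extension may be performed first for some bands and noise filling first for other bands. All such variations fall within the scope of the embodiments of the present invention.
When subbands are encoded, many zero-valued frequency bins may occur because of quantizer limitations. Some noise can generally be filled in so that the reconstructed audio signal sounds more natural.
If noise filling is performed first, bandwidth extension can be applied to the noise-filled normalized spectrum to obtain the normalized full-band spectrum. For example, a first frequency band, which is the band to be copied, can be determined according to the bit allocation of the current frame and its previous N frames, where N is a positive integer. It is generally desirable to select several relatively contiguous subbands that have been allocated bits as the range of the first frequency band. The spectral coefficients of the high frequency band are then obtained from the spectral coefficients of the first frequency band.
Taking N=1 as an example, optionally, in one embodiment, the correlation between the bits allocated in the current frame and the bits allocated in the previous N frames can be obtained, and the first frequency band is determined according to the obtained correlation. For example, let the bits allocated in the current frame be R_current and the bits allocated in the previous frame be R_previous; multiplying them gives the correlation R_correlation between these bits.
After the correlation is obtained, a search is made from the highest band that has been allocated bits, last_sfm, toward lower frequencies for the first subband that satisfies R_correlation ≠ 0, which indicates that bits are allocated to this subband in both the current frame and the previous frame. Let the index of this subband be top_band.
In one embodiment, the obtained top_band can be used as the upper limit of the first frequency band and top_band/2 as its lower limit. If the difference between the lower limit of the first frequency band of the previous frame and that of the current frame is less than 1 kHz, the lower limit of the previous frame can be used as the lower limit of the current frame. This is mainly to ensure the continuity of the first frequency band used for extension, and thus the continuity of the extended high-frequency spectrum. R_current of the current frame is then buffered and used as R_previous of the next frame. If top_band/2 is not an integer, it may be rounded up or down.
During bandwidth extension, the spectral coefficients of the first frequency band top_band/2~top_band are copied to the high frequency band last_sfm~high_sfm.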
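The search for top_band and the resulting first frequency band can be sketched as follows for N = 1. The function names are illustrative, top_band/2 is rounded down, and the 1 kHz smoothing of the lower limit between frames is omitted to keep the sketch short. Returning None when no subband has bits in both frames is an assumption about a case the source does not address.

```python
def find_top_band(r_current, r_previous, last_sfm):
    """Search downward from last_sfm for the first subband with bits in both frames."""
    for p in range(last_sfm, -1, -1):
        # Product of the two allocations is the per-subband correlation.
        if r_current[p] * r_previous[p] != 0:
            return p
    return None  # no common subband found (boundary case, assumption)

def first_band(r_current, r_previous, last_sfm):
    """Return (lower, upper) subband indices of the band to be copied."""
    top_band = find_top_band(r_current, r_previous, last_sfm)
    if top_band is None:
        return None
    return top_band // 2, top_band  # top_band/2 rounded down
```

After each frame, the caller would keep r_current as the r_previous of the next frame, as the text describes.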
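An example in which noise filling is performed first is described above. The embodiments of the present invention are not limited to this; bandwidth extension may also be performed first, with background noise then filled in over the extended full band. The noise filling method can be similar to the example above.
In addition, for the high frequency band, for example the range last_sfm~high_sfm above, the noise_level value estimated at the decoder can be used to further adjust the background noise filled within the band last_sfm~high_sfm. noise_level can be calculated according to equation (8) above. Because noise_level is obtained from the decoded subband normalization factors and is used to distinguish the intensity level of the filled noise, no coding bits need to be transmitted for it.
The obtained noise level can be used to adjust the background noise in the high frequency band as follows:
$$y(k) = \big( (1 - noise\_level) \cdot ynorm(k) + noise\_level \cdot noise\_CB(k) \big) \cdot wnorm \tag{9}$$
where ynorm(k) is the decoded normalized coefficient and noise_CB(k) is the noise codebook.
In this way, the high-frequency harmonics are obtained by bandwidth extension from the low-frequency signal, which makes the high-frequency harmonic signal more continuous and ensures audio quality.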
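Equation (9) is a weighted mix of the copied coefficients and codebook noise, scaled by the subband norm. A minimal sketch for one subband follows; the function name is illustrative, and the noise codebook is passed in as a plain list because its contents are not specified in the source.

```python
def fill_high_band(ynorm, noise_cb, level, wnorm):
    """Apply equation (9) to the coefficients of one subband.

    ynorm    : decoded (or copied) normalized coefficients
    noise_cb : noise codebook entries of the same length
    level    : noise_level from equation (8)
    wnorm    : normalization factor of this subband
    """
    return [((1.0 - level) * y + level * n) * wnorm
            for y, n in zip(ynorm, noise_cb)]
```

With level = 0 the copied coefficients pass through unchanged apart from the scaling, and larger values move the band toward the codebook noise.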
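An example in which the spectral coefficients of the first frequency band are copied directly is given above. The spectral coefficients of the first frequency band may also be adjusted first, and the adjusted spectral coefficients then used for bandwidth extension, to further improve the performance of the high frequency band.
A normalization length can be obtained according to spectral flatness information and the high-band signal type, the spectral coefficients of the first frequency band are normalized using the obtained normalization length, and the normalized spectral coefficients of the first frequency band are used as the spectral coefficients of the high frequency band.
The spectral flatness information may include: the peak-to-average ratio of each subband in the first frequency band, the correlation of the time-domain signal corresponding to the first frequency band, or the zero-crossing rate of the time-domain signal corresponding to the first frequency band. The peak-to-average ratio is used as an example below, but the embodiments of the present invention are not limited to this, and other spectral flatness information can be used for the adjustment in a similar way. The peak-to-average ratio is the ratio of the peak energy or amplitude of a subband to the average energy or amplitude of that subband.
First, the peak-to-average ratio of each subband in the first frequency band is computed from the spectral coefficients of the first frequency band. Whether a subband is a harmonic subband is judged from its peak-to-average ratio and the largest peak within the subband, and the number of harmonic subbands, n_band, is counted. The normalization length length_norm_harm is then determined adaptively from n_band and the signal type of the high band itself:
$$length\_norm\_harm = \alpha \left( 1 + \frac{n\_band}{M} \right)$$
where M is the number of subbands in the first frequency band and α adapts to the signal type; for a harmonic signal, for example, α > 1.
The spectral coefficients of the first frequency band can then be normalized using the obtained normalization length, and the normalized spectral coefficients of the first frequency band are used as the spectral coefficients of the high frequency band.
One example of improving bandwidth extension performance is described above; other algorithms that improve bandwidth extension performance can also be used in the present invention.
In addition, as at the encoder, the decoder may further consider the classification of audio signal frames. In this way, different encoding and decoding strategies can be used for different classes, which improves the encoding and decoding quality of different signals. For the method of classifying audio signal frames, refer to the description of the encoder; it is not repeated here.
Classification information indicating the frame type can be extracted from the bit stream. For a frame of the harmonic type, the signal bandwidth for bit allocation may be limited according to the embodiment of Fig. 2, that is, limited to a partial bandwidth of the frame. For a frame of the non-harmonic type, the signal bandwidth for bit allocation may be limited to a partial bandwidth according to the embodiment of Fig. 2, or, as in the prior art, it may be left unlimited; for example, the bit allocation bandwidth of such a frame is determined to be the full bandwidth of the frame.
After the full-band spectral coefficients are obtained, the reconstructed time-domain audio signal is obtained through an inverse frequency-domain transform. The embodiments of the present invention can therefore improve the quality of harmonic signals without degrading the quality of non-harmonic signals.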
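Fig. 3 is a block diagram of an audio signal encoding device according to an embodiment of the present invention. The audio signal encoding device 30 of Fig. 3 includes a quantization unit 31, a first determining unit 32, a first allocation unit 33, and an encoding unit 34.
The quantization unit 31 divides the frequency band of the audio signal into multiple subbands and quantizes the subband normalization factor of each subband. The first determining unit 32 determines the signal bandwidth for bit allocation according to the subband normalization factors quantized by the quantization unit 31, or according to the quantized subband normalization factors and code rate information. The first allocation unit 33 allocates bits to the subbands within the signal bandwidth determined by the first determining unit 32. The encoding unit 34 encodes the spectral coefficients of the audio signal according to the bits allocated to each subband by the first allocation unit 33.
In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
Fig. 4 is a block diagram of an audio signal encoding device according to another embodiment of the present invention. In the audio signal encoding device 40 of Fig. 4, parts that are the same as or similar to those of Fig. 3 are denoted by the same reference numerals.
When determining the signal bandwidth for bit allocation, the first determining unit 32 may limit the signal bandwidth for bit allocation to a partial bandwidth of the audio signal. For example, as shown in Fig. 4, the first determining unit 32 may include a first ratio factor determining module 321. The first ratio factor determining module 321 may determine a ratio factor fact according to the code rate information, where fact is greater than 0 and less than or equal to 1. Alternatively, the first determining unit 32 may include a second ratio factor determining module 322 in place of the first ratio factor determining module 321. The second ratio factor determining module 322 obtains the harmonic level or the noise level of the audio signal according to the subband normalization factors, and determines the ratio factor fact according to the harmonic level or the noise level.
In addition, the first determining unit 32 further includes a first bandwidth determining module 323. After the ratio factor fact is obtained, the first bandwidth determining module 323 may determine the partial bandwidth according to the ratio factor fact and the quantized subband normalization factors.
Optionally, in one embodiment, when determining the partial bandwidth, the first bandwidth determining module 323 obtains the spectral energy in each subband according to the quantized subband normalization factors and accumulates the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor fact; the bandwidth up to and including the current subband is taken as the partial bandwidth.
When classification information is considered, the audio signal encoding device 40 may further include a classification unit 35, configured to classify frames of the audio signal. For example, the classification unit 35 may determine whether a frame of the audio signal is of the harmonic type or the non-harmonic type, and trigger the quantization unit 31 if the frame is of the harmonic type. In one embodiment, the frame type can be determined according to the peak-to-average ratio. For example, the classification unit 35 obtains the peak-to-average ratio of each subband in all or some of the subbands of the frame. When the number of subbands whose peak-to-average ratio is greater than a first threshold is greater than or equal to a second threshold, the frame is determined to be of the harmonic type; when that number is less than the second threshold, the frame is determined to be of the non-harmonic type. In this case, for a frame of the harmonic type, the first determining unit 32 may limit the signal bandwidth for bit allocation to a partial bandwidth of the frame.
Optionally, in another embodiment, the first allocation unit 33 may include a subband normalization factor adjustment module 331 and a bit allocation module 332. The subband normalization factor adjustment module 331 adjusts the subband normalization factors of the subbands within the determined signal bandwidth, and the bit allocation module 332 allocates bits according to the adjusted subband normalization factors. For example, the first allocation unit 33 may use the subband normalization factor of the middle subband of the partial bandwidth determined by the first determining unit 32 as the subband normalization factor of each subband after the middle subband.
In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.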
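Fig. 5 is a block diagram of an audio signal decoding device according to an embodiment of the present invention. The audio signal decoding device 50 of Fig. 5 includes an obtaining unit 51, a second determining unit 52, a second allocation unit 53, a decoding unit 54, an extension unit 55, and a recovery unit 56.
The obtaining unit 51 obtains the quantized subband normalization factors. The second determining unit 52 determines the signal bandwidth for bit allocation according to the quantized subband normalization factors obtained by the obtaining unit 51, or according to the quantized subband normalization factors and code rate information. The second allocation unit 53 allocates bits to the subbands within the signal bandwidth determined by the second determining unit 52. The decoding unit 54 decodes the normalized spectrum according to the bits allocated to each subband by the second allocation unit 53. The extension unit 55 performs noise filling and bandwidth extension on the normalized spectrum decoded by the decoding unit 54 to obtain a normalized full-band spectrum. The recovery unit 56 obtains the spectral coefficients of the audio signal according to the normalized full-band spectrum obtained by the extension unit 55 and the subband normalization factors.
In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
Fig. 6 is a block diagram of an audio signal decoding device according to another embodiment of the present invention. In the audio signal decoding device 60 of Fig. 6, parts that are the same as or similar to those of Fig. 5 are denoted by the same reference numerals.
As with the first determining unit 32 of Fig. 4, when determining the signal bandwidth for bit allocation, the second determining unit 52 of the audio signal decoding device 60 may limit the signal bandwidth for bit allocation to a partial bandwidth of the audio signal. For example, the second determining unit 52 may include a third ratio factor determining unit 521, configured to determine a ratio factor fact according to the code rate information, where fact is greater than 0 and less than or equal to 1. Alternatively, the second determining unit 52 may include a fourth ratio factor determining unit 522, configured to obtain the harmonic level or the noise level of the audio signal according to the subband normalization factors, and determine the ratio factor fact according to the harmonic level or the noise level.
In addition, the second determining unit 52 further includes a second bandwidth determining module 523. After the ratio factor fact is obtained, the second bandwidth determining module 523 may determine the partial bandwidth according to the ratio factor fact and the quantized subband normalization factors.
Optionally, in one embodiment, when determining the partial bandwidth, the second bandwidth determining module 523 obtains the spectral energy in each subband according to the quantized subband normalization factors and accumulates the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor fact; the bandwidth up to and including the current subband is taken as the partial bandwidth.
Optionally, in one embodiment, the extension unit 55 may include a first frequency band determining module 551 and a spectral coefficient obtaining module 552. The first frequency band determining module 551 determines the first frequency band according to the bit allocation of the current frame and its previous N frames, where N is a positive integer, and the spectral coefficient obtaining module 552 obtains the spectral coefficients of the high frequency band according to the spectral coefficients of the first frequency band. For example, when determining the first frequency band, the first frequency band determining module 551 may obtain the correlation between the bits allocated in the current frame and the bits allocated in the previous N frames, and determine the first frequency band according to the obtained correlation.
If the background noise needs to be adjusted, the audio signal decoding device 60 may further include an adjustment unit 57, configured to obtain the noise level according to the subband normalization factors and use the obtained noise level to adjust the background noise in the high frequency band.
Optionally, in another embodiment, the spectral coefficient obtaining module 552 may obtain a normalization length according to spectral flatness information and the high-band signal type, normalize the spectral coefficients of the first frequency band using the obtained normalization length, and use the normalized spectral coefficients of the first frequency band as the spectral coefficients of the high frequency band. The spectral flatness information may include: the peak-to-average ratio of each subband in the first frequency band, the correlation of the time-domain signal corresponding to the first frequency band, or the zero-crossing rate of the time-domain signal corresponding to the first frequency band.
In the encoding and decoding process of the embodiments of the present invention, the signal bandwidth for bit allocation is determined according to the quantized subband normalization factors or the code rate information, so that the bits can be concentrated on effectively encoding and decoding the determined signal bandwidth, which improves audio quality.
An encoding and decoding system according to the embodiments of the present invention may include the audio signal encoding device or the audio signal decoding device described above.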
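A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and in an actual implementation there may be other ways of dividing; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of a software functional unit and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing describes only specific implementations of the present invention, but the protection scope of the present invention is not limited to them. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims.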

Claims

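1. An audio signal encoding method, comprising:
dividing the frequency band of an audio signal into multiple subbands, and quantizing the subband normalization factor of each subband;
determining the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information;
allocating bits to the subbands within the determined signal bandwidth; and
encoding the spectral coefficients of the audio signal according to the bits allocated to each subband.
2. The method according to claim 1, wherein determining the signal bandwidth for bit allocation comprises:
limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal.
3. The method according to claim 2, wherein limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal comprises:
determining a ratio factor according to the code rate information, the ratio factor being greater than 0 and less than or equal to 1; and
determining the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
4. The method according to claim 2, wherein limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal comprises:
obtaining the harmonic level or the noise level of the audio signal according to the subband normalization factors;
determining a ratio factor according to the harmonic level or the noise level, the ratio factor being greater than 0 and less than or equal to 1; and
determining the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
5. The method according to claim 3 or 4, wherein determining the partial bandwidth according to the ratio factor and the quantized subband normalization factors comprises:
obtaining the spectral energy in each subband according to the quantized subband normalization factors; and
accumulating the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor, and taking the bandwidth below the current subband as the partial bandwidth.
6. The method according to any one of claims 1 to 4, wherein before dividing the frequency band of the audio signal into multiple subbands and quantizing the subband normalization factor of each subband, the method further comprises:
determining whether a frame of the audio signal is of a harmonic type or a non-harmonic type; and
if the frame of the audio signal is of the harmonic type, continuing to perform the method.
7. The method according to claim 6, wherein determining whether the frame of the audio signal is of the harmonic type or the non-harmonic type comprises:
obtaining the peak-to-average ratio of each subband in all or some of the subbands of the frame; and
when the number of subbands whose peak-to-average ratio is greater than a first threshold is greater than or equal to a second threshold, determining that the frame is of the harmonic type, and when the number of subbands whose peak-to-average ratio is greater than the first threshold is less than the second threshold, determining that the frame is of the non-harmonic type.
8. The method according to claim 6, wherein limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal comprises:
for a frame of the harmonic type, limiting the signal bandwidth for bit allocation to a partial bandwidth of the frame.
9. The method according to claim 1, wherein allocating bits to the subbands within the determined signal bandwidth comprises:
adjusting the subband normalization factors of the subbands within the determined signal bandwidth; and
allocating bits according to the adjusted subband normalization factors.
10. The method according to claim 9, wherein adjusting the subband normalization factors of the subbands within the determined signal bandwidth comprises:
using the subband normalization factor of the middle subband of the partial bandwidth as the subband normalization factor of each subband after the middle subband.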
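11. An audio signal decoding method, comprising:
obtaining quantized subband normalization factors;
determining the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information;
allocating bits to the subbands within the determined signal bandwidth;
decoding the normalized spectrum according to the bits allocated to each subband;
performing noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum; and
obtaining the spectral coefficients of the audio signal according to the normalized full-band spectrum and the subband normalization factors.
12. The method according to claim 11, wherein determining the signal bandwidth for bit allocation comprises:
limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal.
13. The method according to claim 12, wherein limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal comprises:
determining a ratio factor according to the code rate information, the ratio factor being greater than 0 and less than or equal to 1; and
determining the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
14. The method according to claim 12, wherein limiting the signal bandwidth for bit allocation to a partial bandwidth of the audio signal comprises:
obtaining the harmonic level or the noise level of the audio signal according to the subband normalization factors;
determining a ratio factor according to the harmonic level or the noise level, the ratio factor being greater than 0 and less than or equal to 1; and
determining the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
15. The method according to claim 13 or 14, wherein determining the partial bandwidth according to the ratio factor and the quantized subband normalization factors comprises:
obtaining the spectral energy in each subband according to the quantized subband normalization factors; and
accumulating the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor, and taking the bandwidth below the current subband as the partial bandwidth.
16. The method according to claim 11, wherein performing noise filling and bandwidth extension on the decoded normalized spectrum to obtain a normalized full-band spectrum comprises:
determining a first frequency band according to the bit allocation of the current frame and the previous N frames of the current frame, where N is a positive integer; and
obtaining the spectral coefficients of a high frequency band according to the spectral coefficients of the first frequency band.
17. The method according to claim 16, wherein determining the first frequency band according to the bit allocation of the current frame and the previous N frames of the current frame comprises:
obtaining the correlation between the bits allocated in the current frame and the bits allocated in the previous N frames; and
determining the first frequency band according to the obtained correlation.
18. The method according to claim 16, further comprising:
obtaining a noise level according to the subband normalization factors; and
adjusting the background noise in the high frequency band using the obtained noise level.
19. The method according to claim 16, wherein obtaining the spectral coefficients of the high frequency band according to the spectral coefficients of the first frequency band comprises:
obtaining a normalization length according to spectral flatness information and the high-band signal type;
normalizing the spectral coefficients of the first frequency band using the obtained normalization length; and
using the normalized spectral coefficients of the first frequency band as the spectral coefficients of the high frequency band.
20. The method according to claim 19, wherein the spectral flatness information comprises:
the peak-to-average ratio of each subband in the first frequency band, the correlation of the time-domain signal corresponding to the first frequency band, or the zero-crossing rate of the time-domain signal corresponding to the first frequency band.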
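21. An audio signal encoding device, comprising:
a quantization unit, configured to divide the frequency band of an audio signal into multiple subbands and quantize the subband normalization factor of each subband;
a first determining unit, configured to determine the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information;
a first allocation unit, configured to allocate bits to the subbands within the signal bandwidth determined by the first determining unit; and
an encoding unit, configured to encode the spectral coefficients of the audio signal according to the bits allocated to each subband by the first allocation unit.
22. The device according to claim 21, wherein the first determining unit is specifically configured to limit the signal bandwidth for bit allocation to a partial bandwidth of the audio signal.
23. The device according to claim 22, wherein the first determining unit comprises:
a first ratio factor determining module, configured to determine a ratio factor according to the code rate information, the ratio factor being greater than 0 and less than or equal to 1; and
a first bandwidth determining module, configured to determine the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
24. The device according to claim 22, wherein the first determining unit comprises:
a second ratio factor determining module, configured to obtain the harmonic level or the noise level of the audio signal according to the subband normalization factors, and determine a ratio factor according to the harmonic level or the noise level, the ratio factor being greater than 0 and less than or equal to 1; and
a first bandwidth determining module, configured to determine the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
25. The device according to claim 23 or 24, wherein the first bandwidth determining module is specifically configured to obtain the spectral energy in each subband according to the quantized subband normalization factors, accumulate the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor, and take the bandwidth below the current subband as the partial bandwidth.
26. The device according to claim 22, further comprising:
a classification unit, configured to determine whether a frame of the audio signal is of a harmonic type or a non-harmonic type, and trigger the quantization unit if the frame of the audio signal is of the harmonic type.
27. The device according to claim 21, wherein the first allocation unit comprises:
a subband normalization factor adjustment module, configured to adjust the subband normalization factors of the subbands within the determined signal bandwidth; and
a bit allocation module, configured to allocate bits according to the adjusted subband normalization factors.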
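28. An audio signal decoding device, comprising:
an obtaining unit, configured to obtain quantized subband normalization factors;
a second determining unit, configured to determine the signal bandwidth for bit allocation according to the quantized subband normalization factors, or according to the quantized subband normalization factors and code rate information;
a second allocation unit, configured to allocate bits to the subbands within the signal bandwidth determined by the second determining unit;
a decoding unit, configured to decode the normalized spectrum according to the bits allocated to each subband by the second allocation unit;
an extension unit, configured to perform noise filling and bandwidth extension on the normalized spectrum decoded by the decoding unit to obtain a normalized full-band spectrum; and
a recovery unit, configured to obtain the spectral coefficients of the audio signal according to the normalized full-band spectrum obtained by the extension unit and the subband normalization factors.
29. The device according to claim 28, wherein the second determining unit is specifically configured to limit the signal bandwidth for bit allocation to a partial bandwidth of the audio signal.
30. The device according to claim 29, wherein the second determining unit comprises:
a third ratio factor determining module, configured to determine a ratio factor according to the code rate information, the ratio factor being greater than 0 and less than or equal to 1; and
a second bandwidth determining module, configured to determine the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
31. The device according to claim 29, wherein the second determining unit comprises:
a fourth ratio factor determining module, configured to obtain the harmonic level or the noise level of the audio signal according to the subband normalization factors, and determine a ratio factor according to the harmonic level or the noise level, the ratio factor being greater than 0 and less than or equal to 1; and
a second bandwidth determining module, configured to determine the partial bandwidth according to the ratio factor and the quantized subband normalization factors.
32. The device according to claim 30 or 31, wherein the second bandwidth determining module is specifically configured to obtain the spectral energy in each subband according to the quantized subband normalization factors, accumulate the spectral energy of the subbands from low frequency to high frequency until the accumulated spectral energy is greater than the product of the total spectral energy of all subbands and the ratio factor, and take the bandwidth below the current subband as the partial bandwidth.
33. The device according to claim 28, wherein the extension unit comprises:
a first frequency band determining module, configured to determine a first frequency band according to the bit allocation of the current frame and the previous N frames of the current frame, where N is a positive integer; and
a spectral coefficient obtaining module, configured to obtain the spectral coefficients of a high frequency band according to the spectral coefficients of the first frequency band.
34. The device according to claim 33, further comprising:
an adjustment unit, configured to obtain a noise level according to the subband normalization factors, and adjust the background noise in the high frequency band using the obtained noise level.
35. The device according to claim 33, wherein the spectral coefficient obtaining module is specifically configured to obtain a normalization length according to spectral flatness information and the high-band signal type, normalize the spectral coefficients of the first frequency band using the obtained normalization length, and use the normalized spectral coefficients of the first frequency band as the spectral coefficients of the high frequency band.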
PCT/CN2012/072778 2011-07-13 2012-03-22 音频信号编解码方法和设备 WO2012149843A1 (zh)

Priority Applications (11)

Application Number Priority Date Filing Date Title
EP12731282.5A EP2613315B1 (en) 2011-07-13 2012-03-22 Method and device for coding an audio signal
EP16160249.5A EP3174049B1 (en) 2011-07-13 2012-03-22 Audio signal coding method and device
JP2014519382A JP5986199B2 (ja) 2011-07-13 2012-03-22 音声信号の符号化と復号化の方法および装置
ES12731282.5T ES2612516T3 (es) 2011-07-13 2012-03-22 Método y dispositivo de codificación y decodificación de señal de audio
KR1020167035436A KR101765740B1 (ko) 2011-07-13 2012-03-22 오디오 신호 코딩 및 디코딩 방법 및 장치
KR1020167005104A KR101690121B1 (ko) 2011-07-13 2012-03-22 오디오 신호 코딩 및 디코딩 방법 및 장치
KR1020137032084A KR101602408B1 (ko) 2011-07-13 2012-03-22 오디오 신호 코딩 및 디코딩 방법 및 장치
US13/532,237 US9105263B2 (en) 2011-07-13 2012-06-25 Audio signal coding and decoding method and device
US14/789,755 US9984697B2 (en) 2011-07-13 2015-07-01 Audio signal coding and decoding method and device
US15/981,645 US10546592B2 (en) 2011-07-13 2018-05-16 Audio signal coding and decoding method and device
US16/731,897 US11127409B2 (en) 2011-07-13 2019-12-31 Audio signal coding and decoding method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2011101960353A CN102208188B (zh) 2011-07-13 2011-07-13 音频信号编解码方法和设备
CN201110196035.3 2011-07-13

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/532,237 Continuation US9105263B2 (en) 2011-07-13 2012-06-25 Audio signal coding and decoding method and device

Publications (1)

Publication Number Publication Date
WO2012149843A1 true WO2012149843A1 (zh) 2012-11-08

Family

ID=44696990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/072778 WO2012149843A1 (zh) 2011-07-13 2012-03-22 音频信号编解码方法和设备

Country Status (8)

Country Link
US (4) US9105263B2 (zh)
EP (2) EP2613315B1 (zh)
JP (3) JP5986199B2 (zh)
KR (3) KR101765740B1 (zh)
CN (1) CN102208188B (zh)
ES (2) ES2612516T3 (zh)
PT (2) PT2613315T (zh)
WO (1) WO2012149843A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9830914B2 (en) 2012-12-06 2017-11-28 Huawei Technologies Co., Ltd. Method and device for decoding signal

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102208188B (zh) 2011-07-13 2013-04-17 华为技术有限公司 音频信号编解码方法和设备
CN106409299B (zh) 2012-03-29 2019-11-05 华为技术有限公司 信号编码和解码的方法和设备
KR20140130248A (ko) * 2012-03-29 2014-11-07 텔레폰악티에볼라겟엘엠에릭슨(펍) 하모닉 오디오 신호의 변환 인코딩/디코딩
CN103544957B (zh) * 2012-07-13 2017-04-12 华为技术有限公司 音频信号的比特分配的方法和装置
CN103778918B (zh) * 2012-10-26 2016-09-07 华为技术有限公司 音频信号的比特分配的方法和装置
JP6535466B2 (ja) * 2012-12-13 2019-06-26 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ 音声音響符号化装置、音声音響復号装置、音声音響符号化方法及び音声音響復号方法
CN103915097B (zh) * 2013-01-04 2017-03-22 中国移动通信集团公司 一种语音信号处理方法、装置和系统
BR112015017633B1 (pt) * 2013-01-29 2021-02-23 Fraunhofer-Gellschaft Zur Foerderung Der Angewandten Forschung E.V Conceito de preenchimento de ruído
EP3399763A1 (en) * 2013-05-24 2018-11-07 Immersion Corporation Method and system for haptic data encoding
CN104217727B (zh) 2013-05-31 2017-07-21 华为技术有限公司 信号解码方法及设备
US9489959B2 (en) 2013-06-11 2016-11-08 Panasonic Intellectual Property Corporation Of America Device and method for bandwidth extension for audio signals
CN104282308B (zh) 2013-07-04 2017-07-14 华为技术有限公司 Vector quantization method and apparatus for frequency-domain envelope
EP3614381A1 (en) 2013-09-16 2020-02-26 Samsung Electronics Co., Ltd. Signal encoding method and device and signal decoding method and device
KR102023138B1 (ko) 2013-12-02 2019-09-19 후아웨이 테크놀러지 컴퍼니 리미티드 Encoding method and apparatus
EP2881943A1 (en) * 2013-12-09 2015-06-10 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for decoding an encoded audio signal with low computational resources
KR102185478B1 (ko) * 2014-02-28 2020-12-02 프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우 Decoding device, encoding device, decoding method, and encoding method
WO2015136078A1 (en) 2014-03-14 2015-09-17 Telefonaktiebolaget L M Ericsson (Publ) Audio coding method and apparatus
WO2015162500A2 (ko) * 2014-03-24 2015-10-29 삼성전자 주식회사 High-band encoding method and apparatus, and high-band decoding method and apparatus
EP3128513B1 (en) * 2014-03-31 2019-05-15 Fraunhofer Gesellschaft zur Förderung der Angewand Encoder, decoder, encoding method, decoding method, and program
CN105336339B (zh) * 2014-06-03 2019-05-03 华为技术有限公司 Method and apparatus for processing speech/audio signal
CN104143335B (zh) 2014-07-28 2017-02-01 华为技术有限公司 Audio coding method and related apparatus
EP2980792A1 (en) * 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating an enhanced signal using independent noise-filling
JP2016038435A (ja) * 2014-08-06 2016-03-22 ソニー株式会社 Encoding device and method, decoding device and method, and program
EP3226243B1 (en) * 2014-11-27 2022-01-05 Nippon Telegraph and Telephone Corporation Encoding apparatus, decoding apparatus, and method and program for the same
KR101701623B1 (ko) * 2015-07-09 2017-02-13 라인 가부시키가이샤 System and method for concealing bandwidth reduction of VoIP call voice
EP3208800A1 (en) * 2016-02-17 2017-08-23 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for stereo filling in multichannel coding
EP3324406A1 (en) 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a variable threshold
EP3324407A1 (en) * 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a ratio as a separation characteristic
CN108630212B (zh) * 2018-04-03 2021-05-07 湖南商学院 Perceptual reconstruction method and apparatus for high-frequency excitation signal in non-blind bandwidth extension
GB2582749A (en) * 2019-03-28 2020-10-07 Nokia Technologies Oy Determination of the significance of spatial audio parameters and associated encoding
EP3751567B1 (en) 2019-06-10 2022-01-26 Axis AB A method, a computer program, an encoder and a monitoring device
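CN112289328A (zh) * 2020-10-28 2021-01-29 北京百瑞互联技术有限公司 Method and system for determining audio coding bit rate
CN112669860B (zh) * 2020-12-29 2022-12-09 北京百瑞互联技术有限公司 Method and apparatus for increasing the effective bandwidth of LC3 audio codec
CN113724716B (zh) * 2021-09-30 2024-02-23 北京达佳互联信息技术有限公司 Speech processing method and speech processing apparatus
WO2024080597A1 (ko) * 2022-10-12 2024-04-18 삼성전자주식회사 Electronic device, method, and non-transitory computer-readable storage medium for adaptively processing audio bitstream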

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
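JPH09153811A (ja) * 1995-11-30 1997-06-10 Hitachi Ltd Encoding/decoding method, encoding/decoding device, and video conference device using the same
JPH11234139A (ja) * 1998-02-18 1999-08-27 Fujitsu Ltd Speech encoding device
CN1255673A (zh) * 1998-11-27 2000-06-07 松下电器产业株式会社 Audio signal encoding device, wireless microphone, and audio signal decoding device
CN101325059A (zh) * 2007-06-15 2008-12-17 华为技术有限公司 Method and apparatus for transmitting and receiving speech coding and decoding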
US7580893B1 (en) * 1998-10-07 2009-08-25 Sony Corporation Acoustic signal coding method and apparatus, acoustic signal decoding method and apparatus, and acoustic signal recording medium
CN102208188A (zh) * 2011-07-13 2011-10-05 华为技术有限公司 Audio signal coding and decoding method and device

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
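DE69227570T2 (de) * 1991-09-30 1999-04-22 Sony Corp Method and arrangement for audio data compression
JP3173218B2 (ja) * 1993-05-10 2001-06-04 ソニー株式会社 Compressed data recording method and apparatus, compressed data reproducing method, and recording medium
JPH10240297A (ja) * 1996-12-27 1998-09-11 Mitsubishi Electric Corp Acoustic signal encoding device
JPH11195995A (ja) 1997-12-26 1999-07-21 Hitachi Ltd Image and audio compression/decompression device
JP2001134295A (ja) 1999-08-23 2001-05-18 Sony Corp Encoding device and encoding method, recording device and recording method, transmitting device and transmitting method, decoding device and encoding method, reproducing device and reproducing method, and recording medium
JP2001267928A (ja) 2000-03-17 2001-09-28 Casio Comput Co Ltd Audio data compression device and storage medium
JP4055336B2 (ja) 2000-07-05 2008-03-05 日本電気株式会社 Speech encoding device and speech encoding method used therein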
SE0004187D0 (sv) * 2000-11-15 2000-11-15 Coding Technologies Sweden Ab Enhancing the performance of coding systems that use high frequency reconstruction methods
JP3478267B2 (ja) 2000-12-20 2003-12-15 ヤマハ株式会社 Digital audio signal compression method and compression device
JP2003280695A (ja) 2002-03-19 2003-10-02 Sanyo Electric Co Ltd Audio compression method and audio compression device
FR2852172A1 (fr) * 2003-03-04 2004-09-10 France Telecom Method and device for spectral reconstruction of an audio signal
BRPI0410918A (pt) 2003-06-05 2006-06-27 Flexiped As Foot-support platform device for use in an apparatus for physical exercise, preventive exercise and rehabilitation, and physical exercise apparatus equipped with upwardly and downwardly movable bars
AU2004319555A1 (en) 2004-05-17 2005-11-24 Nokia Corporation Audio encoding with different coding models
KR100657916B1 (ko) 2004-12-01 2006-12-14 삼성전자주식회사 Apparatus and method for processing audio signal using similarity between frequency bands
US7715573B1 (en) * 2005-02-28 2010-05-11 Texas Instruments Incorporated Audio bandwidth expansion
KR100851970B1 (ko) 2005-07-15 2008-08-12 삼성전자주식회사 Method and apparatus for extracting important frequency components of audio signal, and low bit-rate audio signal encoding/decoding method and apparatus using the same
ATE547786T1 (de) 2007-03-30 2012-03-15 Panasonic Corp Encoding device and encoding method
EP2186087B1 (en) * 2007-08-27 2011-11-30 Telefonaktiebolaget L M Ericsson (PUBL) Improved transform coding of speech and audio signals
EP2571024B1 (en) * 2007-08-27 2014-10-22 Telefonaktiebolaget L M Ericsson AB (Publ) Adaptive transition frequency between noise fill and bandwidth extension
EP2224432B1 (en) 2007-12-21 2017-03-15 Panasonic Intellectual Property Corporation of America Encoder, decoder, and encoding method
RU2621965C2 (ru) * 2008-07-11 2017-06-08 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Форшунг Е.Ф. Time-warp activation signal transmitter, audio signal encoder, method for converting a time-warp activation signal, method for encoding an audio signal, and computer programs
US8463412B2 (en) * 2008-08-21 2013-06-11 Motorola Mobility Llc Method and apparatus to facilitate determining signal bounding frequencies
US20100223061A1 (en) * 2009-02-27 2010-09-02 Nokia Corporation Method and Apparatus for Audio Coding
JP5863765B2 (ja) 2010-03-31 2016-02-17 エレクトロニクス アンド テレコミュニケーションズ リサーチ インスチチュート Electronics And Telecommunications Research Institute Encoding method and apparatus, and decoding method and apparatus

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09153811A (ja) * 1995-11-30 1997-06-10 Hitachi Ltd Encoding/decoding method, encoding/decoding device, and video conference device using the same
JPH11234139A (ja) * 1998-02-18 1999-08-27 Fujitsu Ltd Speech encoding device
US7580893B1 (en) * 1998-10-07 2009-08-25 Sony Corporation Acoustic signal coding method and apparatus, acoustic signal decoding method and apparatus, and acoustic signal recording medium
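CN1255673A (zh) * 1998-11-27 2000-06-07 松下电器产业株式会社 Audio signal encoding device, wireless microphone, and audio signal decoding device
CN101325059A (zh) * 2007-06-15 2008-12-17 华为技术有限公司 Method and apparatus for transmitting and receiving speech coding and decoding
CN102208188A (zh) * 2011-07-13 2011-10-05 华为技术有限公司 Audio signal coding and decoding method and device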

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2613315A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9830914B2 (en) 2012-12-06 2017-11-28 Huawei Technologies Co., Ltd. Method and device for decoding signal
US10236002B2 (en) 2012-12-06 2019-03-19 Huawei Technologies Co., Ltd. Method and device for decoding signal
US10546589B2 (en) 2012-12-06 2020-01-28 Huawei Technologies Co., Ltd. Method and device for decoding signal
US10971162B2 (en) 2012-12-06 2021-04-06 Huawei Technologies Co., Ltd. Method and device for decoding signal
US11610592B2 (en) 2012-12-06 2023-03-21 Huawei Technologies Co., Ltd. Method and device for decoding signal

Also Published As

Publication number Publication date
US11127409B2 (en) 2021-09-21
US9984697B2 (en) 2018-05-29
US20200135219A1 (en) 2020-04-30
PT3174049T (pt) 2019-04-22
US20130018660A1 (en) 2013-01-17
EP3174049B1 (en) 2019-01-09
US10546592B2 (en) 2020-01-28
US9105263B2 (en) 2015-08-11
KR20160149326A (ko) 2016-12-27
KR101690121B1 (ko) 2016-12-27
JP6702593B2 (ja) 2020-06-03
JP2018106208A (ja) 2018-07-05
ES2718400T3 (es) 2019-07-01
JP5986199B2 (ja) 2016-09-06
EP2613315B1 (en) 2016-11-02
KR20140005358A (ko) 2014-01-14
JP2016218465A (ja) 2016-12-22
JP6321734B2 (ja) 2018-05-09
CN102208188B (zh) 2013-04-17
EP3174049A1 (en) 2017-05-31
US20150302860A1 (en) 2015-10-22
US20180261234A1 (en) 2018-09-13
EP2613315A1 (en) 2013-07-10
KR101765740B1 (ko) 2017-08-07
JP2014523549A (ja) 2014-09-11
EP2613315A4 (en) 2013-07-10
KR20160028511A (ko) 2016-03-11
ES2612516T3 (es) 2017-05-17
PT2613315T (pt) 2016-12-22
KR101602408B1 (ko) 2016-03-10
CN102208188A (zh) 2011-10-05

Similar Documents

Publication Publication Date Title
WO2012149843A1 (zh) Audio signal coding and decoding method and device
US10037766B2 (en) Apparatus and method for generating bandwith extension signal
CN1110145C (zh) Scalable speech encoding/decoding method and apparatus
EP1905007A1 (en) Method and apparatus to extract important spectral component from audio signal and low bit-rate audio signal coding and/or decoding method and apparatus using the same
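CN1262990C (zh) Audio coding method and apparatus using harmonic extraction
WO2014063489A1 (zh) Method and apparatus for bit allocation of audio signal
JP7144499B2 (ja) Signal processing method and apparatus
WO2014008786A1 (zh) Method and apparatus for bit allocation of audio signal
CN1677491A (zh) Enhanced audio encoding and decoding device and method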
US20130006644A1 (en) Method and device for spectral band replication, and method and system for audio decoding
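KR20170089982A (ko) Signal encoding and decoding method and apparatus
CN1460992A (zh) Low-delay, adaptive multi-resolution filter bank for perceptual audio encoding/decoding
CN101308657B (zh) Code stream synthesis method based on advanced audio coder
CN1471236A (zh) Signal-adaptive multi-resolution filter bank for perceptual audio coding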
Gunjal et al. Traditional Psychoacoustic Model and Daubechies Wavelets for Enhanced Speech Coder Performance
Ghahabi et al. Adaptive Variable Degree-k Zero-Trees for Re-Encoding of Perceptually Quantized Wavelet-Packet Transformed Audio and High Quality Speech
Ghahabi et al. Adaptive Variable Degree-𝑘 Zero-Trees for Re-Encoding of Perceptually Quantized Wavelet Packet Transformed Audio and High-Quality Speech

Legal Events

Date Code Title Description
REEP Request for entry into the european phase

Ref document number: 2012731282

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2012731282

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12731282

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20137032084

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2014519382

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE