WO2010048827A1 - Method and apparatus for encoding and decoding high-frequency-band signals - Google Patents

Method and apparatus for encoding and decoding high-frequency-band signals

Info

Publication number
WO2010048827A1
Authority
WO
WIPO (PCT)
Prior art keywords
high frequency
band
frequency
signal
frequency band
Prior art date
Application number
PCT/CN2009/073129
Other languages
English (en)
French (fr)
Inventor
刘泽新
苗磊
胡晨
肖玮
陈龙吟
塔迪·哈维·米希尔
张清
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2010048827A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • G10L19/0208 Subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • the present invention relates to the field of signal processing technologies, and more particularly to a method and apparatus for encoding and decoding high frequency band signals.
  • The high-frequency part of an audio or speech signal generally carries rich content, and the sound quality suffers if it is missing. Because of constraints such as sampling rate and bandwidth, priority is usually given to ensuring that enough bits are available to encode the low-frequency signal, and the high-frequency signal is not encoded directly. To limit the resulting loss of quality, the high-frequency signal is usually restored using a band extension technique.
  • Current band extension technology works as follows. At encoding, a gain parameter of the high-frequency signal is computed and sent to the decoding end. At decoding, the high-frequency signal is reconstructed from the high-frequency gain parameter and the decoded low-frequency signal; the reconstructed high-frequency signal and the decoded low-frequency signal are then combined into an audio signal whose frequency range includes the high frequency band, which improves its sound quality.
  • The inventors have found at least the following problem in the prior art: because the high-frequency-band signal is encoded by computing an average over a predetermined number of consecutive sampling points, the restored high-frequency signal is discontinuous. As a result, the restored high-frequency signal may not accurately reflect the characteristics of the original high-frequency signal, and the perceived sound quality is poorer.
  • Embodiments of the present invention provide a coding and decoding method and apparatus for a high-band signal, so that the decoded high-band signal can accurately reflect the characteristics of the original high-frequency signal and improve the auditory sound quality.
  • the embodiment of the present invention adopts the following technical solutions:
  • a method for encoding a high frequency band signal comprising:
  • combining the frequency-domain envelope information of the high frequency band and the flatness information of each subband into an encoded code stream.
  • a method for decoding a high frequency band signal comprising:
  • correcting the low-frequency-band signal with the smoothed spectral gain to obtain a high-frequency-band spectrum; and converting the high-frequency-band spectrum into a high-frequency-band time-domain signal.
  • An encoding device for a high frequency band signal comprising:
  • a first acquiring unit configured to obtain frequency-domain envelope information from the frequency-domain signal of the high frequency band;
  • a second acquiring unit configured to obtain flatness information of each subband of the high-frequency-band frequency-domain signal according to the subband spectral amplitudes of the high frequency band;
  • an output unit configured to synthesize the frequency-domain envelope information of the high frequency band and the flatness information of each subband into an encoded code stream.
  • a decoding device for a high frequency band signal comprising:
  • a decoding unit configured to decode the frequency-domain envelope information and flatness information of the high-frequency-band signal from the received encoded code stream;
  • a calculating unit configured to calculate the spectral gain of the high frequency band from the frequency-domain envelope information of the high-frequency-band signal and the decoded low-frequency-band signal;
  • a processing unit configured to smooth the spectral gain in the frequency domain according to the flatness information and the frequency-domain envelope information;
  • a first correcting unit configured to correct the low-frequency-band signal with the smoothed spectral gain to obtain a high-frequency-band spectrum;
  • a transform unit configured to convert the high-frequency-band spectrum into a high-frequency-band time-domain signal.
  • The encoding and decoding methods and apparatus for the high-band signal provided by the embodiments of the present invention obtain not only the frequency-domain envelope information but also the flatness information of each subband of the high-band signal in the frequency domain, which shows in more detail whether each subband of the high-band signal is flat.
  • At decoding, the flatness information and the frequency-domain envelope information guide the smoothing of the high-frequency gain; after smoothing, the high-frequency gain reflects the spectral characteristics of the high-band signal more accurately.
  • The high-frequency signal obtained by correcting the low-band signal with the smoothed high-frequency gain is therefore closer to the original high-frequency signal; it reflects the characteristics of the original high-frequency signal accurately and improves the perceived sound quality.
  • FIG. 1 is a flowchart of a method for encoding a high frequency band signal according to Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of a method for decoding a high frequency band signal according to Embodiment 1 of the present invention
  • FIG. 3 is a block diagram of an apparatus for encoding a high frequency band signal according to Embodiment 1 of the present invention.
  • FIG. 4 is a block diagram of a decoding apparatus for a high frequency band signal according to Embodiment 1 of the present invention.
  • FIG. 5 is a flowchart of a method for encoding a high frequency band signal according to Embodiment 2 of the present invention.
  • FIG. 6 is a flowchart of a method for decoding a high frequency band signal according to Embodiment 2 of the present invention.
  • FIG. 7 is a block diagram of a first implementation of an apparatus for encoding a high-band signal according to Embodiment 2 of the present invention
  • FIG. 8 is a block diagram of a second implementation of an apparatus for encoding a high-band signal according to Embodiment 2 of the present invention
  • FIG. 9 is a block diagram of a decoding apparatus for a high frequency band signal according to Embodiment 2 of the present invention.
  • This embodiment provides a coding method for a high frequency band signal. As shown in FIG. 1, the coding method includes the following steps:
  • The frequency-domain envelope information is obtained starting from the time-domain signal of the high frequency band. First, the time-domain envelope information of the high frequency band is obtained from the time-domain signal, for example (but not only) as follows: the energy of N consecutive sampling points is averaged (N is a positive integer greater than or equal to 1; in this embodiment N is 3), and the square root of the average is used as the time-domain envelope of those N sampling points. In other words, when there are many sampling points, every N consecutive samples are represented by a single value, and the values obtained once all sampling points have been processed form the time-domain envelope information of the whole frame.
  • Next, the high-frequency-band time-domain signal is converted into a high-frequency-band frequency-domain signal.
  • The frequency-domain envelope is computed because it represents the characteristics of the high-frequency-band signal better; the signal can later be restored indirectly from the frequency-domain envelope.
  • A discrete cosine transform is used to obtain the high-frequency-band frequency-domain signal.
  • Finally, the frequency-domain envelope information is calculated from the frequency-domain signal of the high frequency band, in a manner similar to the way the time-domain envelope information is obtained from the time-domain signal.
  • Obtaining the frequency domain envelope information of the high frequency band can also be as follows:
  • The time-domain signal of the high frequency band is transformed into a frequency-domain signal by a time-frequency transform, and the energy of N consecutive frequency-domain coefficients is averaged (N is a positive integer greater than or equal to 1).
  • The square root of the average is used as the frequency-domain envelope of those N coefficients. Every N consecutive frequency-domain coefficients yield one frequency-domain envelope value, so several frequency-domain envelope values are obtained for the whole frame.
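  • As a minimal sketch of the envelope calculation described above (not the patent's reference implementation), the following Python function returns the square root of the mean energy of each group of N consecutive coefficients. The function name `band_envelope`, the sample values, and the handling of a final group shorter than N are assumptions made for illustration.

```python
import math


def band_envelope(coeffs, n):
    """Return one envelope value per group of n consecutive coefficients.

    Each value is the square root of the mean energy of the group.
    Assumption: a trailing group shorter than n is averaged over its
    actual length.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    envelope = []
    for start in range(0, len(coeffs), n):
        group = coeffs[start:start + n]
        mean_energy = sum(c * c for c in group) / len(group)
        envelope.append(math.sqrt(mean_energy))
    return envelope


# Hypothetical coefficients, grouped two at a time.
print(band_envelope([3.0, 4.0, 0.0, 0.0, 1.0, 1.0], 2))
```

  • The same function applies to time-domain samples, since the text describes the time-domain envelope as being solved in a similar way.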
  • the time domain envelope of the current frame can also be solved, making the recovered high-band signal more accurate.
  • the method of solving the time domain envelope is similar to the method of solving the frequency domain envelope.
  • Solving the time-domain envelope processes M consecutive sample values of the time-domain signal (M is a positive integer greater than or equal to 1).
  • the embodiment further provides a decoding method for a high frequency band signal. As shown in FIG. 2, the decoding method includes the following steps:
  • The high-frequency gain is calculated from the frequency-domain envelope information and the decoded low-frequency-band signal, for example (but not only) with the following formula:
  • $\mathrm{gain}[i] = \dfrac{\mathrm{envH}[i]}{\mathrm{envL}[i]},\ 0 \le i < N$, where gain[i] represents the gain of the i-th subband, envH[i] represents the square root of the energy of the i-th subband of the high-band signal (i.e., its frequency-domain envelope information), and envL[i] represents the square root of the energy of the i-th subband of the low-band signal (i.e., its frequency-domain envelope information).
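  • The per-subband gain described above, the ratio of the high-band envelope to the low-band envelope, can be sketched as follows. The function name `spectral_gain` and the small floor `eps` that guards against division by zero are assumptions for illustration; the text does not say how a zero low-band envelope is handled.

```python
def spectral_gain(env_h, env_l, eps=1e-12):
    """Return gain[i] = envH[i] / envL[i] for every subband i.

    Assumption: envL[i] is floored at eps so that an empty low-band
    subband does not cause a division by zero.
    """
    if len(env_h) != len(env_l):
        raise ValueError("envelopes must have the same number of subbands")
    return [h / max(l, eps) for h, l in zip(env_h, env_l)]


# Hypothetical envelopes for two subbands.
print(spectral_gain([2.0, 1.0], [1.0, 4.0]))
```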
  • The flatness information and the frequency-domain envelope information are used to guide the smoothing of the high-frequency gain. That is, when the flatness information indicates that the original high-frequency signal has a subband that is not flat, and the average energy of the high band differs little between the previous and current frames while the average energy of the low band likewise differs little between the previous and current frames, the gain needs to be smoothed to obtain a high-frequency gain that is closer to that of the original high-frequency signal.
  • The spectrum signal of the low frequency band is corrected with the smoothed high-frequency gain to obtain the spectrum signal of the high frequency band. Because the high-frequency gain has been smoothed, the resulting high-band frequency-domain signal is smoother and reflects the characteristics of the original high-band signal better.
  • the high-band time domain signal can be further corrected by the time domain envelope to obtain a relatively complete high-band time domain signal.
  • the embodiment further provides an encoding apparatus for the high-band signal.
  • the encoding apparatus includes: a first acquiring unit 31, a second acquiring unit 32, and an output unit 33.
  • the first obtaining unit 31 is configured to obtain the frequency domain envelope information of the high frequency band.
  • the first acquiring unit 31 first obtains time-domain envelope information from the time-domain signal of the high frequency band, and then converts the time-domain signal of the high frequency band into a frequency-domain signal of the high frequency band, using a discrete cosine transform; finally, the frequency-domain envelope information is calculated from the frequency-domain signal of the high frequency band;
  • the second acquiring unit 32 is configured to obtain flatness information of each subband of the frequency-domain signal of the high frequency band according to the subband spectral amplitudes of the high frequency band; the flatness information indicates whether the subband is flat, so that the high-band spectral characteristics can be recovered more accurately from the flatness information during decoding;
  • the output unit 33 is configured to synthesize the frequency domain envelope information and the flatness information into an encoded code stream.
  • the embodiment further provides a decoding apparatus for a high-band signal.
  • the decoding apparatus includes: a decoding unit 41, a calculating unit 42, a processing unit 43, a first correcting unit 44, and a transform unit 45.
  • the decoding unit 41 is configured to decode the frequency-domain envelope information and flatness information of the high-frequency-band signal from the received encoded code stream; the calculating unit 42 is configured to calculate the high-frequency gain from the frequency-domain envelope information and the decoded low-frequency-band signal, for example as $\mathrm{gain}[i] = \mathrm{envH}[i] / \mathrm{envL}[i]$, where:
  • gain[i] represents the high-frequency gain of the i-th subband;
  • envH[i] represents the square root of the energy of the i-th subband of the high-band signal (i.e., its frequency-domain envelope information);
  • envL[i] represents the square root of the energy of the i-th subband of the low-band signal;
  • the processing unit 43 is configured to smooth the high-frequency gain according to the flatness information and the frequency-domain envelope information, so that the resulting gain is relatively close to the high-frequency gain of the original high-frequency signal;
  • the first correcting unit 44 is configured to correct the low-frequency-band signal with the smoothed high-frequency gain to obtain a frequency-domain signal of the high frequency band;
  • the transform unit 45 is configured to convert the frequency-domain signal of the high frequency band into a time-domain signal of the high frequency band. Because the high-frequency gain has been smoothed, it better matches the characteristics of the original high-frequency signal.
  • The encoding and decoding methods and apparatus for the high-band signal provided by this embodiment calculate not only the frequency-domain envelope information but also the flatness information of each subband of the frequency-domain signal of the high band, which shows in more detail whether each subband of the high-band signal is flat.
  • At decoding, the flatness information and the frequency-domain envelope information guide the smoothing of the high-frequency gain; after smoothing, the high-frequency gain reflects the characteristics of the high-band spectrum signal more accurately.
  • The smoothed high-frequency gain is used to correct the low-band spectrum signal to obtain the high-band spectrum signal, which reduces the auditory discomfort of the decoded high-band signal. The finally decoded high-band signal therefore reflects the characteristics of the original high-frequency signal accurately and improves the perceived sound quality.
  • This embodiment provides a method for encoding a high-band signal. As shown in FIG. 5, the encoding method includes the following steps:
  • The time-domain signal of the high frequency band is sampled at the sampling frequency, and the time-domain envelope information is calculated from the sampled high-band time-domain signal s(n) delayed by N samples. The calculation may be, but is not limited to, the following: the high-band time-domain signal s(n) is divided into several subframes of n sampling points each; the energy of the n consecutive sampling points in the i-th subframe is averaged, and the average is used as the power value of the subframe (i.e., its time-domain envelope information). The values obtained once all sampling points have been processed form the time-domain envelope information of the whole frame.
  • E (i) represents the time domain envelope of the i-th subframe
  • n represents the number of time-domain signal sample points of the i-th subframe.
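  • The formula that defines E(i) does not appear in this text. A form consistent with the description above, the mean energy of the n samples of the i-th subframe, would be the following; the indexing of s by i·n + j is an assumption:

```latex
E(i) = \frac{1}{n} \sum_{j=0}^{n-1} s^{2}(i \cdot n + j)
```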
  • To obtain the frequency-domain envelope, the time-domain signal of the high frequency band is first transformed into the frequency-domain signal of the high frequency band.
  • A modified discrete cosine transform (MDCT) is used to obtain the frequency-domain signal of the high frequency band.
  • To obtain the flatness information, the average of the spectral amplitudes in each subband is calculated and the maximum spectral amplitude in each subband is found; the average is then divided by the maximum to obtain a spectral amplitude ratio, which is compared with a predetermined ratio. The value of the predetermined ratio depends on the characteristics of the input signal; 0.1 to 0.2 is generally suitable. If the spectral amplitude ratio is less than the predetermined ratio, flatness information indicating "not flat" is output, generally represented by the value "1"; otherwise flatness information indicating "flat" is output, generally represented by the value "0".
  • bi denotes the start index of the spectrum of the i-th subband
  • ei denotes the end index of the spectrum of the i-th subband.
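  • The decision rule above (average amplitude divided by maximum amplitude, compared with a predetermined ratio) can be sketched as follows. This is an illustration under assumptions: the function name `flatness_flag`, the default threshold of 0.15 (picked from within the 0.1 to 0.2 range mentioned above), and the treatment of an all-zero subband as flat are not taken from the patent.

```python
def flatness_flag(amplitudes, threshold=0.15):
    """Return 1 if the subband is judged not flat, otherwise 0.

    The ratio is mean(|amplitude|) / max(|amplitude|); a small ratio
    means a few peaks dominate the subband.
    """
    magnitudes = [abs(a) for a in amplitudes]
    if not magnitudes:
        raise ValueError("subband must contain at least one amplitude")
    peak = max(magnitudes)
    if peak == 0.0:
        # Assumption: an all-zero subband is reported as flat.
        return 0
    ratio = (sum(magnitudes) / len(magnitudes)) / peak
    return 1 if ratio < threshold else 0


print(flatness_flag([1.0, 1.0, 1.0, 1.0]))   # equal amplitudes
print(flatness_flag([0.01] * 9 + [1.0]))     # one dominant peak
```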
  • the decoding method of the high frequency band signal in this embodiment includes the following steps:
  • After the encoded code stream is received, the frequency-domain envelope information, time-domain envelope information, and flatness information of the high-frequency-band signal are decoded from it.
  • Because this embodiment obtains the spectrum information of the high frequency band by modifying the low-frequency-band spectrum information, the spectral gain of the high-band signal relative to the low-band signal must be calculated; that is, the high-band spectral gain is calculated from the frequency-domain envelope information and the decoded low-band signal, for example (but not only) with the following formula:
  • $\mathrm{gain}[i] = \dfrac{\mathrm{envH}[i]}{\mathrm{envL}[i]},\ 0 \le i < N$, where gain[i] represents the high-frequency gain of the i-th subband, envH[i] represents the square root of the average energy of the i-th subband of the high-band signal (i.e., its frequency-domain envelope information), and envL[i] represents the square root of the average energy of the i-th subband of the low-band signal (i.e., its frequency-domain envelope information).
  • The high-frequency gain is to be smoothed according to the flatness information and the frequency-domain envelope information. It is first determined whether the flatness information and the frequency-domain envelope information satisfy the smoothing condition; if so, step 604 is performed, and if not, step 605 is performed.
  • In this way, the smoothing of the spectral gain in step 604 is applied only to signals whose high-band spectrum is highly harmonic.
  • Among the quantities compared is the sum of the frequency-domain envelopes envH[i] of the first N/2 subbands of the current frame.
  • When the condition holds, the high-band frequency-domain envelope differs little between the previous and current frames, i.e., the frequency-domain envelope information is continuous and stable. Checking this ensures that portions of the original high-frequency signal that are themselves not continuous and stable are not smoothed, so that the smoothed high-frequency gain corresponds to the original high-frequency signal as closely as possible.
  • the frequency domain envelope information of the corresponding low frequency band signal of the current frame and the previous frame is continuous and stable.
  • K4, K5, K6, and K7 are predetermined constants. The quantities compared are the sums of the frequency-domain envelopes of the N subbands of the current frame and of the previous frame of the low-band signal, and the sums of the frequency-domain envelopes of the first N/2 subbands of the current frame and of the previous frame of the low-band signal. Within this range, the frequency-domain envelope of the corresponding low-frequency signal differs little between the previous and current frames, i.e., the frequency-domain envelope information is continuous and stable.
  • This embodiment may also use, as gain smoothing conditions, other parameters of the high-band signal transmitted to the decoding end, other parameters of the decoded low-band signal, or relationships between these parameters, such as spectral energy or linear prediction coefficients.
  • the weighted average of the high frequency gain of each subband of the current frame and the high frequency gain of the corresponding subband of the previous frame is used as the high frequency gain of the subband corresponding to the current frame.
  • $\mathrm{gain}[i] = \alpha \cdot \mathrm{gain}[i] + \beta \cdot \mathrm{prev\_gain}[i]$, where gain[i] represents the high-frequency gain of the i-th subband in the current frame, prev_gain[i] represents the high-frequency gain of the i-th subband in the previous frame, and the coefficients α and β are constants.
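  • The inter-frame weighted average described above can be sketched as follows. The function name `smooth_gain_between_frames` and the default weights of 0.5 each are placeholders chosen for illustration; the text only states that the two coefficients are constants.

```python
def smooth_gain_between_frames(gain, prev_gain, alpha=0.5, beta=0.5):
    """Blend the current-frame gain of each subband with the previous frame's.

    alpha weights the current frame and beta the previous frame.
    Assumption: the default weights are placeholders, not patent values.
    """
    if len(gain) != len(prev_gain):
        raise ValueError("both frames must have the same number of subbands")
    return [alpha * g + beta * p for g, p in zip(gain, prev_gain)]


# Hypothetical gains for two subbands in consecutive frames.
print(smooth_gain_between_frames([2.0, 4.0], [4.0, 0.0]))
```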
  • M is the number of elements in the i-th sub-band
  • N is the number of sub-bands in the frame
  • i is the i-th sub-band
  • j is the j-th element in the i-th sub-band
  • the element indexed by [i × M + j] is the value of the j-th element in the i-th subband.
  • The formula means that the high-frequency gain of the first half of the first subband and of the second half of the last subband in each frame remains unchanged, while the spectral gain of every subband in between is smoothed with the spectral gain of its preceding and following subbands.
  • the embodiment uses a relationship between a high frequency signal and a low frequency signal to recover a high frequency signal.
  • the low frequency spectrum signal is multiplied by the corresponding high frequency band spectral gain to recover the high frequency spectrum signal.
  • the smoothed high frequency gain is used to correct the low frequency band signal to obtain a high frequency band spectrum signal.
  • Because the high-frequency gain has been smoothed, the resulting high-frequency-band frequency-domain signal is closer to the original high-frequency spectrum signal.
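  • The correction step, multiplying the low-band spectrum by the high-band spectral gain of the corresponding subband, can be sketched as follows. The function name `apply_subband_gain`, the equal-width subbands of `band_size` coefficients, and the one-to-one mapping of low-band coefficients onto high-band positions are assumptions made for illustration.

```python
def apply_subband_gain(low_spectrum, gains, band_size):
    """Scale each low-band coefficient by the gain of its subband.

    Assumption: subbands are contiguous and all band_size coefficients
    wide, so coefficient k belongs to subband k // band_size.
    """
    if band_size < 1:
        raise ValueError("band_size must be a positive integer")
    if len(low_spectrum) != len(gains) * band_size:
        raise ValueError("spectrum length must equal len(gains) * band_size")
    return [c * gains[k // band_size] for k, c in enumerate(low_spectrum)]


# Hypothetical spectrum of two subbands with two coefficients each.
print(apply_subband_gain([1.0, 2.0, 3.0, 4.0], [2.0, 0.5], 2))
```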
  • In this embodiment, the high-band frequency-domain signal is obtained by correcting the low-band frequency-domain signal, so the characteristics of the low-band spectrum signal directly affect the characteristics of the decoded high-band spectrum signal. To further improve the quality of the high-band spectrum signal, this embodiment uses the flatness information to perform spectral shaping on the low-band spectrum signal before the high-frequency gain correction.
  • the present embodiment provides an encoding apparatus for a high-band signal.
  • the encoding apparatus includes: a first acquiring unit 71, a second acquiring unit 72, and an output unit 73.
  • the first acquiring unit 71 is configured to obtain the time-domain envelope information of the high frequency band and then convert the time-domain signal of the high frequency band into the frequency-domain signal of the high frequency band, using a discrete cosine transform; finally, the frequency-domain envelope information is calculated from the frequency-domain signal of the high frequency band;
  • the second acquiring unit 72 is configured to obtain flatness information of each subband of the frequency-domain signal of the high frequency band according to the subband spectral amplitudes of the high frequency band; the flatness information indicates whether the subband is flat, so that the high-band spectral characteristics can be recovered more accurately from the flatness information during decoding;
  • the output unit 73 is configured to synthesize the time-domain and frequency-domain envelope information and the flatness information into an encoded code stream.
  • the foregoing second obtaining unit 72 can adopt the following two implementation manners:
  • the second obtaining unit 72 includes: a calculating module 721, a searching module 722, a dividing module 723, a determining module 724, and an output module 725.
  • the calculating module 721 is configured to calculate the average of the spectral amplitudes in each subband; the searching module 722 is configured to find the maximum spectral amplitude in each subband; the dividing module 723 is configured to divide the average by the maximum to obtain a spectral amplitude ratio; and the determining module 724 is configured to determine whether the spectral amplitude ratio is less than a predetermined ratio.
  • when the spectral amplitude ratio is less than the predetermined ratio, the output module 725 outputs flatness information indicating "not flat", generally represented by the value "1"; when the spectral amplitude ratio is not less than the predetermined ratio, the output module 725 outputs flatness information indicating "flat", generally represented by the value "0".
  • the second obtaining unit 72 includes: a calculating module 726, a dividing module 727, a determining module 728, and an output module 729.
  • the calculating module 726 is configured to calculate the geometric mean and the arithmetic mean of the spectral amplitudes in each subband; the dividing module 727 is configured to divide the geometric mean by the arithmetic mean to obtain a spectral amplitude ratio; the determining module 728 is configured to determine whether the spectral amplitude ratio is less than a predetermined ratio; when the spectral amplitude ratio is less than the predetermined ratio, the output module 729 outputs flatness information indicating "not flat", generally represented by the value "1"; when the ratio is not less than the predetermined ratio, the output module 729 outputs flatness information indicating "flat", generally represented by the value "0".
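  • The second implementation, which compares the ratio of geometric mean to arithmetic mean against a predetermined ratio, can be sketched as follows. The function name `flatness_flag_gm` and the handling of zero amplitudes are assumptions; the text gives no threshold value for this variant, so the caller must supply one.

```python
import math


def flatness_flag_gm(amplitudes, threshold):
    """Return 1 if geometric_mean / arithmetic_mean < threshold, else 0."""
    magnitudes = [abs(a) for a in amplitudes]
    if not magnitudes:
        raise ValueError("subband must contain at least one amplitude")
    arithmetic = sum(magnitudes) / len(magnitudes)
    if arithmetic == 0.0:
        # Assumption: an all-zero subband is reported as flat.
        return 0
    if min(magnitudes) == 0.0:
        # Any zero amplitude makes the geometric mean zero.
        geometric = 0.0
    else:
        # Geometric mean computed in the log domain to avoid overflow.
        log_sum = sum(math.log(m) for m in magnitudes)
        geometric = math.exp(log_sum / len(magnitudes))
    return 1 if geometric / arithmetic < threshold else 0


# The threshold 0.5 is a hypothetical value for demonstration only.
print(flatness_flag_gm([1.0, 1.0, 1.0, 1.0], 0.5))
print(flatness_flag_gm([0.001, 0.001, 0.001, 10.0], 0.5))
```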
  • the present embodiment provides a decoding apparatus for a high-band signal.
  • the apparatus includes: a decoding unit 91 and a calculating unit 92.
  • the decoding unit 91 is configured to decode the frequency-domain envelope information and flatness information of the high-frequency-band signal from the received encoded code stream; the calculating unit 92 is configured to calculate the high-frequency gain from the frequency-domain envelope information and the decoded low-frequency-band signal, for example as $\mathrm{gain}[i] = \dfrac{\mathrm{envH}[i]}{\mathrm{envL}[i]},\ 0 \le i < N$, where:
  • gain[i] represents the high-frequency gain of the i-th subband;
  • envH[i] represents the square root of the average energy of the i-th subband of the high-band signal (i.e., its frequency-domain envelope information);
  • envL[i] represents the square root of the average energy of the i-th subband of the low-band signal;
  • the processing unit 93 is configured to smooth the high-frequency gain according to the flatness information and the frequency-domain envelope information, so that it is relatively close to the high-frequency gain of the original high-frequency signal.
  • the processing unit 93 in the embodiment of the present invention is implemented as follows:
  • the processing unit 93 includes: a judging module 931 and a smoothing module 932.
  • the determining module 931 is configured to determine, according to the flatness information and the frequency-domain envelope information, whether the smoothing condition is satisfied, where the smoothing condition is the same as the one described in step 603; if the smoothing condition is satisfied,
  • the smoothing module 932 smooths the high-frequency gain; the weighting coefficients are constants, in general,
  • the intra-frame smoothing unit 98 in this embodiment smooths the spectral gain of consecutive subbands within a frame, whether or not the smoothing condition is satisfied.
  • this embodiment uses the spectral gain and the low-band spectrum signal to obtain the high-band spectrum signal.
  • in this embodiment, spectral shaping is performed on the low-band signal before it is corrected with the high-frequency gain, so the shaping unit 97 is configured to perform spectral shaping on the low-band signal using the flatness information.
  • the first correcting unit 94 in this embodiment corrects the shaped low-band spectrum signal with the smoothed high-frequency gain to obtain a frequency-domain signal of the high frequency band; the transform unit 95 is configured to transform the frequency-domain signal of the high frequency band into a time-domain signal of the high frequency band. Because the high-frequency gain has been smoothed and better matches the characteristics of the original high-frequency signal, the resulting high-band time-domain signal is more accurate, which reduces the auditory discomfort of the decoded high-band signal.
  • the second correcting unit 96 in this embodiment is for correcting the transformed time domain signal of the high frequency band by using the time domain envelope information.
  • the encoding and decoding method and device for the high-band signal provided by the embodiment of the present invention need not only calculate the time domain envelope information and the frequency domain envelope information, but also calculate the frequency domain signal of the high-band signal.
  • the corresponding flatness information is provided to show in more detail whether the high frequency signal has a flatness per subband.
  • the above flatness information and frequency domain envelope information can be utilized.
  • the high-frequency gain is smoothed, and the smoothed high-frequency gain is used to correct the low-band signal spectrum to obtain a high-band frequency-sense signal, so that the high-band signal more realistically reflects the original high-band signal characteristics and reduces the high-frequency. Uncomfortable with signal decoding.
  • the smoothing process described above includes smoothing processing between adjacent sub-bands in the frame, and further includes smoothing processing between adjacent frames in the case where the smoothing condition is satisfied, so that the high-frequency signal satisfying the smooth condition does not exhibit energy mutation and no
  • the continuous smooth phenomenon can improve the auditory quality of high frequency signals.
  • the low-band signal which is the recovered band signal in this embodiment is the shaped low-band signal, which attenuates the metal noise which occurs in the finally obtained high-frequency signal.
  • the embodiments of the present invention are mainly used in applications where high frequency band signals need to be processed, such as: audio signal processing.
  • audio signal processing such as: audio signal processing.
  • the present invention can be implemented by means of software plus necessary general hardware, and of course, by hardware, but in many cases, the former is a better implementation.
  • the technical solution of the present invention which is essential or contributes to the prior art, may be embodied in the form of a software product stored in a readable storage medium, such as a floppy disk of a computer.
  • a hard disk or optical disk or the like includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in various embodiments of the present invention.

Description

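Method and Device for Encoding and Decoding a High-Band Signal
This application claims priority to Chinese Patent Application No. 200810171591.3, filed with the Chinese Intellectual Property Office on October 29, 2008 and entitled "Method and Device for Encoding and Decoding a High-Band Signal", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of signal processing, and in particular to a method and device for encoding and decoding a high-band signal.
Background
The high-frequency part of an audio/speech signal generally carries rich content; without it, the sound quality of the audio signal suffers. Owing to constraints such as sampling rate and bandwidth, priority is usually given to ensuring that enough bits are available to encode the low-frequency signal, and the high-frequency part of the audio signal is not encoded directly. To limit the resulting loss of sound quality, bandwidth extension techniques are commonly used to restore the high-frequency signal.
Current bandwidth extension works as follows. At the encoder, a gain parameter of the high-frequency signal is computed and sent to the decoder. At the decoder, the high-frequency signal is reconstructed from that gain parameter and the decoded low-frequency signal, and the reconstructed high-frequency signal is then combined with the decoded low-frequency signal to produce an audio signal whose frequency range includes the high band, which improves the sound quality of the audio signal.
In implementing the above bandwidth extension technique, the inventors found at least the following problem in the prior art: because the high-band signal is encoded by averaging over a predetermined number of consecutive samples, the restored high-frequency signal is discontinuous and may fail to reflect the characteristics of the original high-frequency signal accurately, so the perceived sound quality is poor.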
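Summary of the Invention
Embodiments of the present invention provide a method and device for encoding and decoding a high-band signal, so that the decoded high-band signal reflects the characteristics of the original high-frequency signal more accurately and the perceived sound quality is improved.
To achieve this, the embodiments of the present invention adopt the following technical solutions:
A method for encoding a high-band signal, including:
obtaining frequency-domain envelope information from the frequency-domain signal of the high band;
obtaining flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band; and
combining the frequency-domain envelope information of the high band and the flatness information of each subband into an encoded bitstream.
A method for decoding a high-band signal, including:
decoding frequency-domain envelope information and flatness information of the high-band signal from a received encoded bitstream;
computing a spectral gain of the high band according to the frequency-domain envelope information of the high-band signal and the already decoded low-band signal;
smoothing the spectral gain in the frequency domain according to the flatness information and the frequency-domain envelope information;
correcting the low-band signal with the smoothed spectral gain to obtain a high-band spectrum; and
converting the high-band spectrum into a time-domain signal of the high band.
A device for encoding a high-band signal, including:
a first obtaining unit, configured to obtain frequency-domain envelope information from the frequency-domain signal of the high band;
a second obtaining unit, configured to obtain flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band; and
an output unit, configured to combine the frequency-domain envelope information of the high band and the flatness information of each subband into an encoded bitstream.
A device for decoding a high-band signal, including:
a decoding unit, configured to decode frequency-domain envelope information and flatness information of the high-band signal from a received encoded bitstream;
a computing unit, configured to compute a spectral gain of the high band according to the frequency-domain envelope information of the high-band signal and the already decoded low-band signal;
a processing unit, configured to smooth the spectral gain in the frequency domain according to the flatness information and the frequency-domain envelope information;
a first correcting unit, configured to correct the low-band signal with the smoothed spectral gain to obtain a high-band spectrum; and
a transform unit, configured to convert the high-band spectrum into a time-domain signal of the high band.
With the encoding and decoding method and device for a high-band signal provided by the embodiments of the present invention, encoding obtains not only the frequency-domain envelope information but also the flatness information of each subband of the high-band signal in the frequency domain, so as to indicate in finer detail whether each subband of the high-band signal is flat. During decoding, the flatness information and the frequency-domain envelope information guide the smoothing of the high-frequency gain. After smoothing, the high-frequency gain reflects the spectral characteristics of the high-band signal more faithfully, so the high-frequency signal obtained by correcting the low-band signal with the smoothed gain is closer to the original high-frequency signal, reflects its characteristics more accurately, and improves the perceived sound quality.
Brief Description of the Drawings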
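To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the method for encoding a high-band signal in Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the method for decoding a high-band signal in Embodiment 1 of the present invention;
FIG. 3 is a block diagram of the device for encoding a high-band signal in Embodiment 1 of the present invention;
FIG. 4 is a block diagram of the device for decoding a high-band signal in Embodiment 1 of the present invention;
FIG. 5 is a flowchart of the method for encoding a high-band signal in Embodiment 2 of the present invention;
FIG. 6 is a flowchart of the method for decoding a high-band signal in Embodiment 2 of the present invention;
FIG. 7 is a block diagram of a first implementation of the device for encoding a high-band signal in Embodiment 2 of the present invention;
FIG. 8 is a block diagram of a second implementation of the device for encoding a high-band signal in Embodiment 2 of the present invention;
FIG. 9 is a block diagram of the device for decoding a high-band signal in Embodiment 2 of the present invention.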
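Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment 1:
This embodiment provides a method for encoding a high-band signal. As shown in FIG. 1, the encoding method includes the following steps:
101. Obtain frequency-domain envelope information from the time-domain signal of the high band. The procedure is as follows. First, obtain the time-domain envelope information of the high band from the time-domain signal of the high band. This can be done, without limitation, as follows: average the energy of N consecutive samples (N is a positive integer greater than or equal to 1; N = 3 in this embodiment) and use the square root of that average as the time-domain envelope of those N samples. In other words, when there are many samples, every N consecutive samples are represented by the same value, and the information obtained once all samples have been processed is the time-domain envelope information of the whole frame.
Second, because N consecutive samples are represented by the same value when the time-domain envelope is computed, those values may carry some error. In this step the time-domain signal of the high band is transformed into a frequency-domain signal of the high band, and a frequency-domain envelope is computed to represent the characteristics of the high-band signal better; using this frequency-domain envelope later indirectly reduces the error. This embodiment uses the discrete cosine transform to obtain the frequency-domain signal of the high band.
Finally, compute the frequency-domain envelope information from the frequency-domain signal of the high band. This computation can be implemented in a way similar to the conversion of the time-domain signal into time-domain envelope information.
The frequency-domain envelope information of the high band can also be obtained as follows:
First, transform the time-domain signal of the high band into a frequency-domain signal by a time-frequency transform. Average the energy of N consecutive frequency-domain coefficients (N is a positive integer greater than or equal to 1) and use the square root of that average as the frequency-domain envelope of those N coefficients. Computing one envelope value for every N consecutive coefficients yields the multiple frequency-domain envelope values of the whole frame.
Second, if bits are available, the time-domain envelope of the current frame can also be computed so that the restored high-band signal is more accurate. The time-domain envelope is computed in the same way as the frequency-domain envelope, except that it operates on M consecutive samples of the time-domain signal (M is a positive integer greater than or equal to 1).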
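The envelope computation in step 101 (square root of the mean energy over each run of consecutive values) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `block_envelope` and the handling of a trailing partial block are assumptions introduced here, and the same routine is applied to either time-domain samples or frequency-domain coefficients.

```python
import math


def block_envelope(values, block_size):
    """Return one envelope value per block of `block_size` consecutive values.

    Each envelope value is the square root of the block's mean energy
    (mean of squared values). Assumption: a trailing partial block is
    averaged over its own length.
    """
    if block_size < 1:
        raise ValueError("block_size must be >= 1")
    envelope = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        mean_energy = sum(v * v for v in block) / len(block)
        envelope.append(math.sqrt(mean_energy))
    return envelope


# N = 3, as in Embodiment 1: six samples give two envelope values.
time_env = block_envelope([3.0, 4.0, 0.0, 1.0, 1.0, 1.0], block_size=3)
```

Passing frequency-domain coefficients instead of samples gives the frequency-domain envelope described in the alternative procedure.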
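102. Obtain flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band. The flatness information indicates whether the subband is flat in the high-band spectrum, so that the decoder can use it to restore the high-band spectrum more accurately.
103. Combine the frequency-domain envelope information and the flatness information into an encoded bitstream. This completes the encoding of the high-band signal.
This embodiment also provides a method for decoding a high-band signal. As shown in FIG. 2, the decoding method includes the following steps:
201. After the encoded bitstream is received, decode the frequency-domain envelope information and the flatness information of the high-band signal from it.
202. Because this embodiment derives the high-band signal by correcting the low-band signal, the high-frequency gain of the high-band signal relative to the low-band signal must be computed; that is, the high-frequency gain is computed from the frequency-domain envelope information and the already decoded low-band signal. The following formula can be used, without limitation:
$$ \mathrm{gain}[i] = \frac{\mathrm{envH}[i]}{\mathrm{envL}[i]}, \quad i = 0, 1, \ldots, N $$
where gain[i] is the gain of the i-th subband, envH[i] is the square root of the energy of the i-th subband of the high-band signal (that is, its frequency-domain envelope information), and envL[i] is the square root of the energy of the i-th subband of the low-band signal (that is, its frequency-domain envelope information).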
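As a hedged illustration of step 202, the per-subband gain is simply the ratio of the two envelopes. The function name `spectral_gain` and the small floor `eps` that guards against division by zero are assumptions added for this sketch; the source does not say how a zero low-band envelope is handled.

```python
def spectral_gain(env_high, env_low, eps=1e-12):
    """Per-subband gain[i] = envH[i] / envL[i].

    Assumption: `eps` floors the low-band envelope to avoid dividing by zero.
    """
    if len(env_high) != len(env_low):
        raise ValueError("envelopes must have the same number of subbands")
    return [h / max(l, eps) for h, l in zip(env_high, env_low)]


gains = spectral_gain([2.0, 1.0], [4.0, 0.5])
```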
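203. To make the characteristics of the restored high-band signal more accurate, this embodiment smooths the high-frequency gain under the guidance of the flatness information and the frequency-domain envelope information. That is, when the flatness information indicates that a subband of the original high-frequency signal is not flat, and both the high-band average energy and the low-band average energy change little between the previous and the current frame, the frequency-domain envelope information is smoothed to obtain a high-frequency gain that is closer to that of the original high-frequency signal.
204. Because little information about the high-frequency signal is encoded, the high-frequency signal must be restored using the relationship between the high-frequency and low-frequency signals. Generally, the high-frequency spectral signal is approximated by multiplying the low-frequency spectral signal by the corresponding gain. In this embodiment, the low-band spectral signal is corrected with the smoothed high-frequency gain to obtain the high-band spectral signal. Because the high-frequency gain has been smoothed, the resulting high-band frequency-domain signal is smoother and better reflects the characteristics of the original high-band signal.
205. Transform the frequency-domain signal of the high band into a time-domain signal of the high band. The resulting high-band time-domain signal is more accurate, which reduces the listening discomfort caused by the decoded high-band signal.
In actual decoding, when bits are still available, the high-band time-domain signal can be further corrected with the time-domain envelope to obtain a more complete high-band time-domain signal.
Corresponding to the encoding method described in FIG. 1, this embodiment also provides a device for encoding a high-band signal. As shown in FIG. 3, the encoding device includes a first obtaining unit 31, a second obtaining unit 32, and an output unit 33.
The first obtaining unit 31 is configured to obtain the frequency-domain envelope information of the high band. In a specific implementation, the first obtaining unit 31 first obtains the time-domain envelope information from the time-domain signal of the high band, then transforms the time-domain signal of the high band into a frequency-domain signal of the high band (this embodiment uses the discrete cosine transform), and finally computes the frequency-domain envelope information from the frequency-domain signal of the high band. The second obtaining unit 32 is configured to obtain the flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band; the flatness information indicates whether the subband is flat, so that the decoder can use it to restore the spectral characteristics of the high band more accurately. The output unit 33 is configured to combine the frequency-domain envelope information and the flatness information into an encoded bitstream.
Corresponding to the decoding method described in FIG. 2, this embodiment also provides a device for decoding a high-band signal. As shown in FIG. 4, the decoding device includes a decoding unit 41, a computing unit 42, a processing unit 43, a first correcting unit 44, and a transform unit 45.
The decoding unit 41 is configured to decode the frequency-domain envelope information and the flatness information of the high-band signal from the received encoded bitstream. The computing unit 42 is configured to compute the high-frequency gain from the frequency-domain envelope information and the already decoded low-band signal, for which the following formula can be used:
$$ \mathrm{gain}[i] = \frac{\mathrm{envH}[i]}{\mathrm{envL}[i]}, \quad i = 0, 1, \ldots, N $$
where gain[i] is the high-frequency gain of the i-th subband, envH[i] is the square root of the energy of the i-th subband of the high-band signal (that is, its frequency-domain envelope information), and envL[i] is the square root of the energy of the i-th subband of the low-band signal (that is, its frequency-domain envelope information). The processing unit 43 is configured to smooth the high-frequency gain under the guidance of the flatness information and the frequency-domain envelope information, which yields a high-frequency gain closer to that of the original high-frequency signal. The first correcting unit 44 is configured to correct the low-band signal with the smoothed high-frequency gain to obtain the frequency-domain signal of the high band. The transform unit 45 is configured to transform the frequency-domain signal of the high band into a time-domain signal of the high band. Because the high-frequency gain has been smoothed, it better matches the characteristics of the original high-frequency signal, which reduces the discomfort caused by the decoded high-band signal.
With the encoding and decoding method and device for a high-band signal provided by the embodiments of the present invention, encoding computes not only the frequency-domain envelope information but also the flatness information of each subband of the frequency-domain high-band signal, so as to indicate in finer detail whether each subband of the high-band signal is flat. During decoding, the flatness information and the frequency-domain envelope information guide the smoothing of the high-frequency gain. After smoothing, the high-frequency gain reflects the characteristics of the high-band spectral signal more faithfully, and the low-band spectral signal is corrected with the smoothed gain to obtain the high-band spectral signal. This reduces the listening discomfort caused by the decoded high-band signal: the finally decoded band signal reflects the characteristics of the original high-frequency signal more accurately, and the perceived sound quality is improved.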
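Embodiment 2:
This embodiment provides a method for encoding a high-band signal. As shown in FIG. 5, the encoding method includes the following steps:
501. Sample the time-domain signal of the high band at the sampling frequency, delay the sampled high-band time-domain signal s(n) by N samples, and then compute the time-domain envelope information. The computation can be done, without limitation, as follows: divide the high-band time-domain signal s(n) into several subframes of n samples each, average the energy of the n consecutive samples in the i-th subframe, and use that average as the power value of the subframe (that is, its time-domain envelope information). The information obtained once all samples have been processed is the complete time-domain envelope information of the frame.
The time-domain envelope information is computed with the following formula:
[Formula image imgf000010_0001: definition of E(i)]
where E(i) is the time-domain envelope of the i-th subframe and n is the number of time-domain samples in the i-th subframe.
502. In this step the time-domain signal of the high band is transformed into a frequency-domain signal of the high band so that the frequency-domain envelope can be computed. This embodiment uses the modified discrete cosine transform (MDCT) to obtain the frequency-domain signal of the high band.
503. Compute the frequency-domain envelope information from the frequency-domain signal of the high band. This computation can be implemented in a way similar to step 501.
504. Obtain the flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band. The flatness information indicates whether the signal spectrum is flat within a given frequency range. This embodiment provides the following two computation methods:
First method: in each subband, compute the average spectral magnitude and find the maximum spectral magnitude. Divide the average by the maximum to obtain the spectral magnitude ratio, and determine whether this ratio is smaller than a predetermined ratio. The predetermined ratio depends on the characteristics of the input signal; a value between 0.1 and 0.2 is generally suitable. If the spectral magnitude ratio is smaller than the predetermined ratio, output flatness information indicating "not flat", generally represented by the value "1"; otherwise output flatness information indicating "flat", generally represented by the value "0".
Second method: first compute the geometric mean and the arithmetic mean of the spectral magnitudes in each subband, then divide the geometric mean by the arithmetic mean to obtain the spectral magnitude ratio. The corresponding formula is:
$$ \mathrm{SFM}_i = \frac{\left( \prod_{k=b_i}^{e_i} |X[k]| \right)^{1/N}}{\frac{1}{N} \sum_{k=b_i}^{e_i} |X[k]|} $$
where X[k] is the spectral magnitude, N is the length of the i-th subband, b_i is the first spectral index of the i-th subband, and e_i is the last spectral index of the i-th subband. After the spectral magnitude ratio is obtained, determine whether it is smaller than a predetermined ratio. The predetermined ratio depends on the characteristics of the input signal; a value between 0.1 and 0.2 is generally suitable. If the spectral magnitude ratio is smaller than the predetermined ratio, output flatness information indicating "not flat", generally represented by the value "1"; otherwise output flatness information indicating "flat", generally represented by the value "0".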
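The two flatness tests of step 504 can be sketched as below. This is a minimal illustration under stated assumptions: the function names, the default threshold of 0.15 (picked from the 0.1 to 0.2 range the text suggests), and the treatment of all-zero subbands as flat are choices made here rather than details given in the source. Magnitudes are assumed to be non-negative.

```python
import math


def flatness_by_peak_ratio(magnitudes, threshold=0.15):
    """First method: mean / max of the subband magnitudes.

    Returns 1 ("not flat") if the ratio is below `threshold`, else 0 ("flat").
    """
    peak = max(magnitudes)
    if peak == 0.0:
        return 0  # assumption: an all-zero subband is reported as flat
    mean = sum(magnitudes) / len(magnitudes)
    return 1 if mean / peak < threshold else 0


def flatness_by_sfm(magnitudes, threshold=0.15):
    """Second method: geometric mean / arithmetic mean of the magnitudes."""
    arithmetic = sum(magnitudes) / len(magnitudes)
    if arithmetic == 0.0:
        return 0  # assumption: an all-zero subband is reported as flat
    if min(magnitudes) == 0.0:
        geometric = 0.0  # any zero magnitude drives the geometric mean to zero
    else:
        # Compute the geometric mean in the log domain to avoid underflow.
        geometric = math.exp(sum(math.log(m) for m in magnitudes) / len(magnitudes))
    return 1 if geometric / arithmetic < threshold else 0


peaky = [0.001] * 15 + [1.0]   # one dominant spectral line
flat = [1.0] * 16              # uniform magnitudes
```

Both tests return 1 for `peaky` and 0 for `flat` with the default threshold; the two methods can disagree on less extreme subbands because the geometric mean penalises small magnitudes differently from the peak ratio.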
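505. Combine the time-domain envelope information, the frequency-domain envelope information, and the flatness information into an encoded bitstream. This completes the encoding of the high-band signal.
During decoding, the high-band signal is derived by correcting the low-band signal. The decoding process of this embodiment therefore first applies the MDCT (modified discrete cosine transform) to the decoded low-band signal to obtain its spectral information, and only then decodes the high-band signal. As shown in FIG. 6, the method for decoding a high-band signal in this embodiment includes the following steps:
601. After the encoded bitstream is received, decode the frequency-domain envelope information, the time-domain envelope information, and the flatness information of the high-band signal from it.
602. Because this embodiment derives the spectral information of the high band by correcting the spectral information of the low band, the spectral gain of the high-band signal relative to the low-band signal must be computed; that is, the high-band spectral gain is computed from the frequency-domain envelope information and the already decoded low-band signal. The following formula can be used, without limitation:
$$ \mathrm{gain}[i] = \frac{\mathrm{envH}[i]}{\mathrm{envL}[i]}, \quad i = 0, 1, \ldots, N $$
where gain[i] is the gain of the i-th subband, envH[i] is the square root of the average energy of the i-th subband of the high-band signal (that is, its frequency-domain envelope information), and envL[i] is the square root of the average energy of the i-th subband of the low-band signal (that is, its frequency-domain envelope information).
603. To make the restored high-band spectral signal more accurate, this embodiment smooths the high-frequency gain under the guidance of the flatness information and the frequency-domain envelope information. First determine whether the flatness information and the frequency-domain envelope information satisfy the smoothing condition. If the smoothing condition is satisfied, perform step 604; otherwise perform step 605.
The smoothing condition may include, without limitation, the following:
1. The high-band signal of the current frame is harmonic, expressed as
$$ \mathrm{sum\_sharpness} > M $$
where sum_sharpness is the sum of the flatness information (sharpness) of all subbands of the current frame and M is a preset constant. For each subband, sharpness = 1 means that the subband is not flat and is strongly harmonic, and sharpness = 0 means that the spectrum of the subband is fairly flat and weakly harmonic. Therefore, sum_sharpness > M ensures that the high-frequency signal of the current frame is strongly harmonic, in which case the frequency-domain envelope information is smoothed once to obtain a high-frequency gain closer to that of the original high-frequency signal. With this harmonicity condition in place, this embodiment ensures that, when step 604 is performed, the spectral gain is smoothed only for signals whose high-frequency spectrum is strongly harmonic.
2. The frequency-domain envelope information of the high-band signal is continuous and steady between the current frame and the previous frame, expressed as
$$ K_1 < \frac{\mathrm{prev\_sum\_envH}}{\mathrm{sum\_envH}} < K_2 \;\;\&\&\;\; K_3 < \frac{\mathrm{prev\_sum\_envH\_fh}}{\mathrm{sum\_envH\_fh}} < K_4 $$
where K1, K2, K3, and K4 are predetermined constants. Assuming that a frame contains N subbands, prev_sum_envH is the sum of the frequency-domain envelopes of the N high-band subbands of the previous frame, that is, $\mathrm{prev\_sum\_envH} = \sum_{i=0}^{N-1} \mathrm{prev\_envH}[i]$; sum_envH is the sum of the frequency-domain envelopes of the N high-band subbands of the current frame, that is, $\mathrm{sum\_envH} = \sum_{i=0}^{N-1} \mathrm{envH}[i]$; prev_sum_envH_fh is the sum of the frequency-domain envelopes of the first N/2 subbands of the previous frame, that is, $\mathrm{prev\_sum\_envH\_fh} = \sum_{i=0}^{N/2-1} \mathrm{prev\_envH}[i]$; and sum_envH_fh is the sum of the frequency-domain envelopes of the first N/2 subbands of the current frame, that is, $\mathrm{sum\_envH\_fh} = \sum_{i=0}^{N/2-1} \mathrm{envH}[i]$.
When both ratios lie within their predetermined ranges, the high-band frequency-domain envelope differs little between the previous and the current frame, so the frequency-domain envelope information is continuous and steady. This prevents smoothing from being applied to parts of the original high-frequency signal that are themselves not continuous and steady, and keeps the smoothed high-frequency gain as close as possible to the original high-frequency signal.
3. The frequency-domain envelope information of the low-band signal is continuous and steady between the current frame and the previous frame, expressed as
$$ K_4 < \frac{\mathrm{prev\_sum\_envL}}{\mathrm{sum\_envL}} < K_5 \;\;\&\&\;\; K_6 < \frac{\mathrm{prev\_sum\_envL\_fh}}{\mathrm{sum\_envL\_fh}} < K_7 $$
where K4, K5, K6, and K7 are predetermined constants; sum_envL and prev_sum_envL are the sums of the frequency-domain envelopes of the N subbands of the low-band signal in the current frame and in the previous frame, respectively; and sum_envL_fh and prev_sum_envL_fh are the sums of the frequency-domain envelopes of the first N/2 subbands of the low-band signal in the current frame and in the previous frame, respectively. When both ratios lie within their predetermined ranges, the frequency-domain envelope of the low-frequency signal differs little between the previous and the current frame, so the frequency-domain envelope information is continuous and steady.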
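The smoothing conditions of step 603 can be sketched as small predicates. The function names and the bound tuples are hypothetical, and the source does not give values for M or the K constants, so the numbers in the usage lines are placeholders only. How the individual conditions are combined is left to the caller, since the text lists them as options.

```python
def is_harmonic(sharpness_flags, m):
    """Condition 1: the sum of the per-subband flatness flags exceeds M."""
    return sum(sharpness_flags) > m


def envelope_is_steady(prev_env, cur_env, full_bounds, half_bounds):
    """Conditions 2 and 3: both envelope ratios lie strictly inside their bounds.

    `full_bounds` and `half_bounds` are (lower, upper) pairs standing in for
    the K constants. The same test serves the high band and the low band.
    """
    half = len(cur_env) // 2
    cur_full, cur_half = sum(cur_env), sum(cur_env[:half])
    if cur_full <= 0.0 or cur_half <= 0.0:
        return False  # assumption: an empty current envelope is never "steady"
    full_ratio = sum(prev_env) / cur_full
    half_ratio = sum(prev_env[:half]) / cur_half
    return (full_bounds[0] < full_ratio < full_bounds[1]
            and half_bounds[0] < half_ratio < half_bounds[1])


# Placeholder thresholds, chosen only to exercise the functions.
harmonic = is_harmonic([1, 0, 1, 1], m=2)
steady = envelope_is_steady([1.0] * 4, [1.0] * 4, (0.5, 2.0), (0.5, 2.0))
```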
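Of course, this embodiment can also use other parameters of the high-band signal transmitted to the decoder, other parameters of the decoded low-band signal, or relationships between such parameters as gain smoothing conditions, for example spectral energy or linear prediction coefficients.
604. If the smoothing condition is satisfied, take the weighted average of the high-frequency gain of each subband of the current frame and the high-frequency gain of the corresponding subband of the previous frame as the high-frequency gain of that subband of the current frame:
$$ \mathrm{prev\_gain}[i] = \alpha \cdot \mathrm{gain}[i] + \beta \cdot \mathrm{prev\_gain}[i] $$
where gain[i] is the high-frequency gain of the i-th subband of the current frame, prev_gain[i] is the high-frequency gain of the i-th subband of the previous frame, and the coefficients α and β are constants.
In general, α and β satisfy α + β = 1.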
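A minimal sketch of the inter-frame smoothing in step 604, assuming α + β = 1 so that only α needs to be supplied. The function name and the default α = 0.5 are assumptions for illustration; the source does not state the value of either coefficient. Following the prose of step 604, the weighted average is returned as the gain of the current frame.

```python
def smooth_across_frames(gain, prev_gain, alpha=0.5):
    """Weighted average of current and previous per-subband gains.

    Assumption: beta = 1 - alpha, matching the relation alpha + beta = 1.
    """
    if len(gain) != len(prev_gain):
        raise ValueError("both frames must have the same number of subbands")
    beta = 1.0 - alpha
    return [alpha * g + beta * p for g, p in zip(gain, prev_gain)]


smoothed = smooth_across_frames([2.0, 4.0], [0.0, 2.0], alpha=0.5)
```

A larger α follows the current frame more closely, while a smaller α favours continuity with the previous frame.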
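605. Smooth the spectral gains of consecutive subbands within the frame. The formula for smoothing the high-frequency gain is as follows:
[Formula image imgf000013_0001: intra-frame smoothing of the high-frequency gain]
where M is the number of elements in the i-th subband, N is the number of subbands in the frame, i is the subband index, j is the index of an element within the i-th subband, and the entry at index i·M + j is the value of the j-th element of the i-th subband.
The formula means that the high-frequency gain of the first half of the elements of the first subband in each frame and of the second half of the elements of the last subband remains unchanged, while the spectral gain of every subband in between is smoothed with the spectral gains of the subbands before and after it.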
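The exact intra-frame smoothing formula is an image that is not available in this text, so the sketch below is not the patent's formula. It swaps in a simple linear cross-fade between neighbouring subband gains, chosen only because it reproduces the behaviour described above: the first half of the first subband and the second half of the last subband keep their gain, and all other elements are blended with the adjacent subband. The function name, the equal subband width, and the weighting scheme are all assumptions.

```python
def smooth_within_frame(gains, band_width):
    """Expand per-subband gains to per-element gains with a linear cross-fade.

    Assumption: every subband has `band_width` elements. Elements in the
    first half of a subband blend with the previous subband's gain, and
    elements in the second half blend with the next subband's gain.
    """
    n, m = len(gains), band_width
    half = m // 2
    out = []
    for i in range(n):
        for j in range(m):
            if j < half:
                neighbour = gains[i - 1] if i > 0 else gains[i]
                weight = 0.5 + 0.5 * (j + 0.5) / half
            else:
                neighbour = gains[i + 1] if i < n - 1 else gains[i]
                distance_to_edge = m - 1 - j
                weight = 0.5 + 0.5 * (distance_to_edge + 0.5) / (m - half)
            # `weight` is the share of the subband's own gain (0.5 to 1).
            out.append(weight * gains[i] + (1.0 - weight) * neighbour)
    return out


element_gains = smooth_within_frame([1.0, 3.0], band_width=4)
```

With two subbands of four elements, the gain ramps from the first subband's value to the second's across the shared boundary instead of jumping.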
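606. This embodiment restores the high-frequency signal using the relationship between the high-frequency and low-frequency signals. Generally, the high-frequency spectral signal is restored by multiplying the low-frequency spectral signal by the corresponding high-band spectral gain. In this embodiment, the low-band signal is corrected with the smoothed high-frequency gain to obtain the high-band spectral signal; because the high-frequency gain has been smoothed, the resulting high-band frequency-domain signal is closer to the original high-frequency spectral signal.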
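Step 606 amounts to scaling each low-band spectral coefficient by the gain of the subband it belongs to. The sketch below assumes equal-width subbands and one gain per subband; the function name and this layout are illustrative assumptions, not details confirmed by the source.

```python
def reconstruct_high_band(low_spectrum, gains, band_width):
    """Multiply each low-band coefficient by the gain of its subband.

    Assumption: subbands are contiguous and all `band_width` coefficients wide.
    """
    if len(low_spectrum) != len(gains) * band_width:
        raise ValueError("spectrum length must equal len(gains) * band_width")
    return [gains[k // band_width] * coeff for k, coeff in enumerate(low_spectrum)]


high_spectrum = reconstruct_high_band([1.0, -1.0, 2.0, 2.0], [0.5, 2.0], band_width=2)
```

The sign of each coefficient is preserved, so only the magnitude envelope of the copied low-band spectrum is reshaped.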
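607. Transform the frequency-domain signal of the high band into a time-domain signal of the high band, generally using the IMDCT. This operation is the inverse of the encoder's time-to-frequency transform, and the transforms used at the encoder and the decoder must match.
608. Correct the transformed time-domain signal of the high band using the time-domain envelope information.
In the above embodiment the low-band frequency-domain signal is corrected to obtain the high-band frequency-domain signal, so the characteristics of the low-band spectral signal directly affect those of the decoded high-band spectral signal. To further improve the quality of the high-band spectral signal, this embodiment also uses the flatness information to spectrally shape the low-band spectral signal before it is corrected with the high-frequency gain.
Corresponding to the encoding method shown in FIG. 5 of this embodiment, this embodiment provides a device for encoding a high-band signal. As shown in FIG. 7, the encoding device includes a first obtaining unit 71, a second obtaining unit 72, and an output unit 73.
The first obtaining unit 71 is configured to obtain the time-domain envelope information of the high band, then transform the time-domain signal of the high band into a frequency-domain signal of the high band (this embodiment uses the discrete cosine transform), and finally compute the frequency-domain envelope information from the frequency-domain signal of the high band. The second obtaining unit 72 is configured to obtain the flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band; the flatness information indicates whether the subband is flat, so that the decoder can use it to restore the spectral characteristics of the high band more accurately. The output unit 73 is configured to combine the time-/frequency-domain envelope information and the flatness information into an encoded bitstream.
The second obtaining unit 72 can be implemented in either of the following two ways:
First, as shown in FIG. 7, the second obtaining unit 72 includes a computing module 721, a searching module 722, a dividing module 723, a judging module 724, and an output module 725.
The computing module 721 is configured to compute the average spectral magnitude in each subband. The searching module 722 is configured to find the maximum spectral magnitude in each subband. The dividing module 723 is configured to divide the average by the maximum to obtain the spectral magnitude ratio. The judging module 724 is configured to determine whether the spectral magnitude ratio is smaller than the predetermined ratio. When the spectral magnitude ratio is smaller than the predetermined ratio, the output module 725 outputs flatness information indicating "not flat", generally represented by the value "1"; when the spectral magnitude ratio is not smaller than the predetermined ratio, the output module 725 outputs flatness information indicating "flat", generally represented by the value "0".
Second, as shown in FIG. 8, the second obtaining unit 72 includes a computing module 726, a dividing module 727, a judging module 728, and an output module 729.
The computing module 726 is configured to compute the geometric mean and the arithmetic mean of the spectral magnitudes in each subband. The dividing module 727 is configured to divide the geometric mean by the arithmetic mean to obtain the spectral magnitude ratio. The judging module 728 is configured to determine whether the spectral magnitude ratio is smaller than the predetermined ratio. When the spectral magnitude ratio is smaller than the predetermined ratio, the output module 729 outputs flatness information indicating "not flat", generally represented by the value "1"; when the spectral magnitude ratio is not smaller than the predetermined ratio, the output module 729 outputs flatness information indicating "flat", generally represented by the value "0".
Corresponding to the decoding method described in FIG. 6 of this embodiment, this embodiment provides a device for decoding a high-band signal. As shown in FIG. 9, the device includes a decoding unit 91, a computing unit 92, a processing unit 93, a first correcting unit 94, a transform unit 95, a second correcting unit 96, a shaping unit 97, and an intra-frame smoothing unit 98.
The decoding unit 91 is configured to decode the frequency-domain envelope information and the flatness information of the high-band signal from the received encoded bitstream. The computing unit 92 is configured to compute the high-frequency gain from the frequency-domain envelope information and the already decoded low-band signal, for which the following formula can be used:
$$ \mathrm{gain}[i] = \frac{\mathrm{envH}[i]}{\mathrm{envL}[i]}, \quad i = 0, 1, \ldots, N $$
where gain[i] is the high-frequency gain of the i-th subband, envH[i] is the square root of the average energy of the i-th subband of the high-band signal (that is, its frequency-domain envelope information), and envL[i] is the square root of the average energy of the i-th subband of the low-band signal (that is, its frequency-domain envelope information). The processing unit 93 is configured to smooth the high-frequency gain under the guidance of the flatness information and the frequency-domain envelope information, which yields a high-frequency gain closer to that of the original high-frequency signal. The processing unit 93 in this embodiment of the present invention is implemented as follows:
The processing unit 93 includes a judging module 931 and a smoothing module 932. The judging module 931 is configured to determine, according to the flatness information and the frequency-domain envelope information, whether the spectral gain satisfies the smoothing condition, which is the same as the smoothing condition described in step 603. If the smoothing condition is satisfied, the smoothing module 932 takes the weighted average of the high-frequency gain of each subband of the current frame and the high-frequency gain of the corresponding subband of the previous frame as the high-frequency gain of that subband of the current frame, using the formula
$$ \mathrm{prev\_gain}[i] = \alpha \cdot \mathrm{gain}[i] + \beta \cdot \mathrm{prev\_gain}[i] $$
where gain[i] is the high-frequency gain of the i-th subband of the current frame, prev_gain[i] is the high-frequency gain of the i-th subband of the previous frame, and the coefficients α and β are constants; in general, α and β satisfy α + β = 1.
Whether or not the smoothing condition is satisfied, the intra-frame smoothing unit 98 in this embodiment smooths the spectral gains of consecutive subbands within the frame.
After the spectral gain has been smoothed, this embodiment uses the spectral gain and the low-band spectral signal to obtain the high-band spectral signal. To bring the restored high-band spectral signal closer to the original high-band spectral signal, this embodiment performs spectral shaping before the low-band signal is corrected with the high-frequency gain, so the shaping unit 97 is configured to spectrally shape the low-band signal using the flatness information.
The first correcting unit 94 in this embodiment corrects the shaped low-band spectral signal with the smoothed high-frequency gain to obtain the frequency-domain signal of the high band; the transform unit 95 is configured to transform the frequency-domain signal of the high band into a time-domain signal of the high band. Because the high-frequency gain has been smoothed and better matches the characteristics of the original high-frequency signal, the resulting high-band time-domain signal is more accurate, which reduces the discomfort caused by the decoded high-band signal. The second correcting unit 96 in this embodiment is configured to correct the transformed time-domain signal of the high band using the time-domain envelope information.
With the encoding and decoding method and device for a high-band signal provided by the embodiments of the present invention, encoding computes not only the time-domain and frequency-domain envelope information but also the flatness information of each subband of the frequency-domain high-band signal, so as to indicate in finer detail whether each subband of the high-band signal is flat. During decoding, the flatness information and the frequency-domain envelope information guide the smoothing of the high-frequency gain, and the smoothed high-frequency gain is used to correct the low-band spectrum to obtain the high-band spectral signal, so that the high-band signal reflects the characteristics of the original high-band signal more faithfully and the discomfort caused by decoding the high-band signal is reduced.
The smoothing described above includes smoothing between adjacent subbands within a frame and, when the smoothing condition is satisfied, smoothing between adjacent frames, so that a high-frequency signal satisfying the smoothing condition shows no abrupt energy changes or discontinuities, which improves the perceived quality of the high-frequency signal.
In addition, the low-band signal used in this embodiment to restore the high-band signal is the shaped low-band signal, which attenuates the metallic noise that would otherwise appear in the final high-frequency signal.
The embodiments of the present invention are mainly used where high-band signals need to be processed, for example in audio signal processing.
From the above description of the implementations, a person skilled in the art can clearly understand that the present invention can be implemented in software plus the necessary general-purpose hardware, or of course in hardware alone, but in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the present invention that is essential, or that contributes over the prior art, may be embodied as a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, hard disk, or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The foregoing describes only specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.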

Claims

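1. A method for encoding a high-band signal, comprising:
obtaining frequency-domain envelope information of the high band;
obtaining flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band; and
combining the frequency-domain envelope information of the high band and the flatness information of each subband into an encoded bitstream.
2. The method for encoding a high-band signal according to claim 1, wherein the encoded bitstream further comprises time-domain envelope information obtained from the time-domain signal of the high band.
3. The method for encoding a high-band signal according to claim 1, wherein obtaining the flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band comprises:
finding the maximum spectral magnitude in each subband;
computing the average spectral magnitude of each subband;
dividing the average by the maximum to obtain a spectral magnitude ratio;
comparing the spectral magnitude ratio with a predetermined ratio; and
if the spectral magnitude ratio is smaller than the predetermined ratio, outputting flatness information indicating "not flat"; otherwise outputting flatness information indicating "flat".
4. The method for encoding a high-band signal according to claim 1, wherein obtaining the flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band comprises:
computing the geometric mean and the arithmetic mean of the spectral magnitudes in each subband;
dividing the geometric mean by the arithmetic mean to obtain a spectral magnitude ratio;
determining whether the spectral magnitude ratio is smaller than a predetermined ratio; and
if the spectral magnitude ratio is smaller than the predetermined ratio, outputting flatness information indicating "not flat"; otherwise outputting flatness information indicating "flat".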
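5. A method for decoding a high-band signal, comprising:
decoding frequency-domain envelope information and flatness information of the high-band signal from a received encoded bitstream;
computing a spectral gain of the high band according to the frequency-domain envelope information of the high-band signal and the already decoded low-band signal;
smoothing the spectral gain in the frequency domain according to the flatness information and the frequency-domain envelope information;
correcting the low-band signal with the smoothed spectral gain to obtain a high-band spectrum; and
converting the high-band spectrum into a time-domain signal of the high band.
6. The method for decoding a high-band signal according to claim 5, further comprising:
decoding time-domain envelope information of the high-band signal from the received encoded bitstream; and
correcting the transformed time-domain signal of the high band using the time-domain envelope information.
7. The method for decoding a high-band signal according to claim 5, further comprising: spectrally shaping the already decoded low-band signal using the flatness information;
wherein correcting the low-band signal with the smoothed spectral gain to obtain the high-band spectrum is: correcting the spectrally shaped low-band signal with the smoothed spectral gain to obtain the high-band spectrum.
8. The method for decoding a high-band signal according to claim 5, wherein smoothing the high-frequency gain under the guidance of the flatness information and the frequency-domain envelope information comprises:
determining, according to the flatness information and the frequency-domain envelope information, whether the high-frequency gain satisfies a smoothing condition; and
if the smoothing condition is satisfied, taking the weighted average of the high-frequency gain of each subband of the current frame and the high-frequency gain of the corresponding subband of the previous frame as the high-frequency gain of that subband of the current frame.
9. The method for decoding a high-band signal according to claim 8, wherein the smoothing condition comprises: the frequency-domain envelope information of the high-band signal being continuous and steady between the current frame and the previous frame; or the flatness information of the current frame indicating that the high-band signal is harmonic; or the frequency-domain envelope information of the low-band signal being continuous and steady between the current frame and the previous frame.
10. The method for decoding a high-band signal according to claim 5 or 8, further comprising, before the low-band signal is corrected with the smoothed spectral gain to obtain the high-band spectrum: smoothing the spectral gains of consecutive subbands within the frame.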
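11. A device for encoding a high-band signal, comprising:
a first obtaining unit, configured to obtain frequency-domain envelope information of the high band;
a second obtaining unit, configured to obtain flatness information of each subband of the frequency-domain signal of the high band according to the subband spectral magnitudes of the high band; and
an output unit, configured to combine the time-/frequency-domain envelope information of the high band and the flatness information of each subband into an encoded bitstream.
12. The device for encoding a high-band signal according to claim 11, wherein the second obtaining unit comprises:
a searching module, configured to find the maximum spectral magnitude in each subband;
a computing module, configured to compute the average spectral magnitude in each subband;
a dividing module, configured to divide the average by the maximum to obtain a spectral magnitude ratio;
a comparing module, configured to compare the spectral magnitude ratio with a predetermined ratio; and
an output module, configured to output flatness information indicating "not flat" when the spectral magnitude ratio is smaller than the predetermined ratio, and to output flatness information indicating "flat" when the spectral magnitude ratio is not smaller than the predetermined ratio.
13. The device for encoding a high-band signal according to claim 11, wherein the second obtaining unit comprises:
a computing module, configured to compute the geometric mean and the arithmetic mean of the spectral magnitudes in each subband;
a dividing module, configured to divide the geometric mean by the arithmetic mean to obtain a spectral magnitude ratio;
a judging module, configured to determine whether the spectral magnitude ratio is smaller than a predetermined ratio; and
an output module, configured to output flatness information indicating "not flat" when the spectral magnitude ratio is smaller than the predetermined ratio, and to output flatness information indicating "flat" when the spectral magnitude ratio is not smaller than the predetermined ratio.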
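14. A device for decoding a high-band signal, comprising:
a decoding unit, configured to decode frequency-domain envelope information and flatness information of the high-band signal from a received encoded bitstream;
a computing unit, configured to compute a spectral gain of the high band according to the frequency-domain envelope information of the high-band signal and the already decoded low-band signal;
a processing unit, configured to smooth the spectral gain in the frequency domain according to the flatness information and the frequency-domain envelope information;
a first correcting unit, configured to correct the low-band signal with the smoothed spectral gain to obtain a high-band spectrum; and
a transform unit, configured to convert the high-band spectrum into a time-domain signal of the high band.
15. The device for decoding a high-band signal according to claim 14, wherein the decoding unit is further configured to decode time-domain envelope information of the high-band signal from the received encoded bitstream;
and the device further comprises:
a second correcting unit, configured to correct the transformed time-domain signal of the high band using the time-domain envelope information.
16. The device for decoding a high-band signal according to claim 14, further comprising: a shaping unit, configured to spectrally shape the already decoded low-band signal using the flatness information;
wherein the first correcting unit corrects the spectrally shaped low-band signal with the smoothed spectral gain to obtain the high-band spectrum.
17. The device for decoding a high-band signal according to claim 14, wherein the processing unit comprises:
a judging module, configured to determine, according to the flatness information and the frequency-domain envelope information, whether the high-frequency gain satisfies a smoothing condition; and
a smoothing module, configured to, when the smoothing condition is satisfied, take the weighted average of the high-frequency gain of each subband of the current frame and the high-frequency gain of the corresponding subband of the previous frame as the high-frequency gain of that subband of the current frame.
18. The device for decoding a high-band signal according to claim 14 or 17, further comprising:
an intra-frame smoothing unit, configured to smooth the spectral gains of consecutive subbands within the frame.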
PCT/CN2009/073129 2008-10-29 2009-08-06 Method and device for encoding and decoding a high-band signal WO2010048827A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2008101715913A CN101727906B (zh) 2008-10-29 2008-10-29 Method and device for encoding and decoding a high-band signal
CN200810171591.3 2008-10-29

Publications (1)

Publication Number Publication Date
WO2010048827A1 true WO2010048827A1 (zh) 2010-05-06

Family

ID=42128225

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/073129 WO2010048827A1 (zh) 2008-10-29 2009-08-06 Method and device for encoding and decoding a high-band signal

Country Status (2)

Country Link
CN (1) CN101727906B (zh)
WO (1) WO2010048827A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
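TWI496138B (zh) * 2013-09-03 2015-08-11 Helios Semiconductor Inc Technique and system for encoding and decoding high-frequency sound signals
CN104681032B (zh) * 2013-11-28 2018-05-11 China Mobile Communications Corporation Voice communication method and device
CN105096957B (zh) 2014-04-29 2016-09-14 Huawei Technologies Co., Ltd. Method and device for processing signals
US10304472B2 (en) * 2014-07-28 2019-05-28 Nippon Telegraph And Telephone Corporation Method, device and recording medium for coding based on a selected coding processing
JP2016038435A (ja) * 2014-08-06 2016-03-22 Sony Corporation Encoding device and method, decoding device and method, and program
CN107545900B (zh) * 2017-08-16 2020-12-01 Guangzhou Guangsheng Digital Technology Co., Ltd. Method and apparatus for generating high-frequency sinusoidal signals in bandwidth extension encoding and decoding
CN112242145A (zh) * 2019-07-17 2021-01-19 Nanjing Institute of Advanced Artificial Intelligence Co., Ltd. Speech filtering method, apparatus, medium, and electronic device
CN113808597A (zh) * 2020-05-30 2021-12-17 Huawei Technologies Co., Ltd. Audio encoding method and audio encoding apparatus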

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
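CN1639770A (zh) * 2002-03-28 2005-07-13 Dolby Laboratories Licensing Corporation Reconstruction of the spectrum of an audio signal with incomplete spectrum based on frequency translation
CN1897467A (zh) * 2005-07-11 2007-01-17 Sony Corporation Signal encoding and signal decoding apparatus and method, program, and recording medium
CN1954363A (zh) * 2004-05-19 2007-04-25 Matsushita Electric Industrial Co., Ltd. Encoding device, decoding device, and methods thereof
CN101140759A (zh) * 2006-09-08 2008-03-12 Huawei Technologies Co., Ltd. Bandwidth extension method and system for speech or audio signals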

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8620673B2 (en) 2009-05-14 2013-12-31 Huawei Technologies Co., Ltd. Audio decoding method and audio decoder
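CN110556123A (zh) * 2019-09-18 2019-12-10 Tencent Technology (Shenzhen) Co., Ltd. Frequency band extension method and apparatus, electronic device, and computer-readable storage medium
CN110556123B (zh) * 2019-09-18 2024-01-19 Tencent Technology (Shenzhen) Co., Ltd. Frequency band extension method and apparatus, electronic device, and computer-readable storage medium
WO2023241240A1 (zh) * 2022-06-15 2023-12-21 Tencent Technology (Shenzhen) Co., Ltd. Audio processing method and apparatus, electronic device, computer-readable storage medium, and computer program product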

Also Published As

Publication number Publication date
CN101727906A (zh) 2010-06-09
CN101727906B (zh) 2012-02-01

Similar Documents

Publication Publication Date Title
WO2010048827A1 (zh) Method and device for encoding and decoding a high-band signal
US8473301B2 (en) Method and apparatus for audio decoding
US10762908B2 (en) Audio encoding device, method and program, and audio decoding device, method and program
KR101168645B1 (ko) Transient signal encoding method and device, transient signal decoding method and device, and transient signal processing system
US8214202B2 (en) Methods and arrangements for a speech/audio sender and receiver
WO2010075789A1 (zh) Signal processing method and device
US9443534B2 (en) Bandwidth extension system and approach
WO2015154397A1 (zh) Noise signal processing and generation method, codec, and codec system
US11011181B2 (en) Audio encoding/decoding based on an efficient representation of auto-regressive coefficients
WO2010022661A1 (zh) Audio encoding and decoding method, device, and system
JP2011504249A (ja) Signal processing method and device
US20090180531A1 (en) codec with plc capabilities
BR112014016153B1 (pt) Method for an encoder to process audio data, method for processing an audio signal, encoder, and decoder
WO2009109120A1 (zh) Method and device for encoding and decoding audio signals
US9076440B2 (en) Audio signal encoding device, method, and medium by correcting allowable error powers for a tonal frequency spectrum
WO2011047578A1 (zh) Bandwidth extension method and device
WO2023197809A1 (zh) Method for encoding and decoding high-frequency audio signals and related apparatus
JP6573887B2 (ja) Audio signal encoding method, decoding method, and device therefor
US20230298597A1 (en) Methods for phase ecu f0 interpolation split and related controller
US10332527B2 (en) Method and apparatus for encoding and decoding audio signal
WO2011144130A1 (zh) Bandwidth extension method and device
TWI587287B (zh) Apparatus and method for comfort noise generation mode selection
KR20150034507A (ko) Audio signal encoding method and apparatus
JP2000250597A (ja) LSP correction device, speech encoding device, and speech decoding device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 09823019; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase; Ref country code: DE
122 Ep: pct application non-entry in european phase; Ref document number: 09823019; Country of ref document: EP; Kind code of ref document: A1